llvm-project

Commit Graph

Author	SHA1	Message	Date
Diana Picus	fc19a8ff07	[ARM] GlobalISel: Legalize loading pointers Make it legal to load pointer values. Also check that pointers are assigned to the GPR reg bank by default. llvm-svn: 293886	2017-02-02 13:20:49 +00:00
Matthew Simpson	ba5cf9dfee	[LV] Move interleaved access helper functions to VectorUtils (NFC) This patch moves some helper functions related to interleaved access vectorization out of LoopVectorize.cpp and into VectorUtils.cpp. We would like to use these functions in a follow-on patch that improves interleaved load and store lowering in (ARM/AArch64)ISelLowering.cpp. One of the functions was already duplicated there and has been removed. Differential Revision: https://reviews.llvm.org/D29398 llvm-svn: 293788	2017-02-01 17:45:46 +00:00
Javed Absar	e5ad87e939	[ARM] Enable Cortex-M23 and Cortex-M33 support. Add both cores to the target parser and TableGen. Test that eabi attributes are set correctly for both cores. Additionally, test the absence and presence of MOVT in Cortex-M23 and Cortex-M33, respectively. Committed on behalf of Sanne Wouda. Reviewers : rengolin, olista01. Differential Revision: https://reviews.llvm.org/D29073 llvm-svn: 293761	2017-02-01 11:55:03 +00:00
Sam Parker	9bf658d5fe	[ARM] Avoid using ARM instructions in Thumb mode The Requires class overrides the target requirements of an instruction, rather than adding to them, so all ARM instructions need to include the IsARM predicate when they have overwitten requirements. This caused the swp and swpb instructions to be allowed in thumb mode assembly, and the ARM encoding of CDP to be selected in codegen (which is different for conditional instructions). Differential Revision: https://reviews.llvm.org/D29283 llvm-svn: 293634	2017-01-31 14:35:01 +00:00
Eugene Zelenko	342257ea92	[ARM] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 293578	2017-01-31 00:56:17 +00:00
Saleem Abdulrasool	5282eed06c	ARM: support `-mlong-calls` with AEABI TLS on ELF Support lowering AEABI TLS access (__aeabi_read_tp) with long calls. This requires adjusting the call sequence to use an indirect call to get full addressability. Resolves PR31769! llvm-svn: 293433	2017-01-29 16:46:22 +00:00
Matthias Braun	8c209aa877	Cleanup dump() functions. We had various variants of defining dump() functions in LLVM. Normalize them (this should just consistently implement the things discussed in http://lists.llvm.org/pipermail/cfe-dev/2014-January/034323.html For reference: - Public headers should just declare the dump() method but not use LLVM_DUMP_METHOD or #if !defined(NDEBUG) \|\| defined(LLVM_ENABLE_DUMP) - The definition of a dump method should look like this: #if !defined(NDEBUG) \|\| defined(LLVM_ENABLE_DUMP) LLVM_DUMP_METHOD void MyClass::dump() { // print stuff to dbgs()... } #endif llvm-svn: 293359	2017-01-28 02:02:38 +00:00
Eugene Zelenko	e79c077ef9	[ARM] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 293348	2017-01-27 23:58:02 +00:00
Saleem Abdulrasool	26c00e3700	ARM: fix vectorized division on WoA The Windows on ARM target uses custom division for normal division as the backend needs to insert division-by-zero checks. However, it is designed to only handle non-vectorized division. ARM has custom lowering for vectorized division as that can avoid loading registers with the values and invoke a division routine for each one, preferring to lower using NEON instructions. Fall back to the custom lowering for the NEON instructions if we encounter a vectorized division. Resolves PR31778! llvm-svn: 293259	2017-01-27 03:41:53 +00:00
Quentin Colombet	89dbea06f1	[ARM][LegalizerInfo] Specify the type of the opcode. This is to fix the win7 bot that does not seem to be very good at infering the type when it gets used in an initiliazer list. llvm-svn: 293248	2017-01-27 01:30:46 +00:00
Eugene Zelenko	e6cf4374b0	[ARM] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 293229	2017-01-26 23:40:06 +00:00
Nirav Dave	d32a421f75	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r293184 which is failing in LTO builds llvm-svn: 293188	2017-01-26 16:46:13 +00:00
Serge Rogatch	e09ba748cf	[XRay][Arm32] Reduce the portion of the stub and implement more staging for tail calls - in LLVM Summary: This patch provides more staging for tail calls in XRay Arm32 . When the logging part of XRay is ready for tail calls, its support in the core part of XRay Arm32 may be as easy as changing the number passed to the handler from 1 to 2. Coupled patch: - https://reviews.llvm.org/D28674 Reviewers: dberris, rengolin Reviewed By: dberris Subscribers: llvm-commits, iid_iunknown, aemerson, rengolin, dberris Differential Revision: https://reviews.llvm.org/D28673 llvm-svn: 293185	2017-01-26 16:17:03 +00:00
Nirav Dave	de6516c466	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. * Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly constructs wider loads, but then promotes them to float operations which appear but requires more expensive constant generation). Some minor peephole optimizations to deal with improved SubDAG shapes (listed below) Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests. 5. Remove Chain dependencies of Memory operations on CopyfromReg nodes as these are captured by data dependence 6. Forward loads-store values through tokenfactors containing {CopyToReg,CopyFromReg} Values. 7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll) 8. Store merging for the ARM target is restricted to 32-bit as some in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting: CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle llvm-svn: 293184	2017-01-26 16:02:24 +00:00
Diana Picus	278c722e6d	[ARM] GlobalISel: Load i1, i8 and i16 args from stack Add support for loading i1, i8 and i16 arguments from the stack, with or without the ABI extension flags. When the ABI extension flags are present, we load a 4-byte value, otherwise we preserve the size of the load and let the instruction selector replace it with a LDRB/LDRH. This generates the same thing as DAGISel. Differential Revision: https://reviews.llvm.org/D27803 llvm-svn: 293163	2017-01-26 09:20:47 +00:00
Jonas Paulsson	8e2f948ef0	[TargetTransformInfo] Refactor and improve getScalarizationOverhead() Refactoring to remove duplications of this method. New method getOperandsScalarizationOverhead() that looks at the present unique operands and add extract costs for them. Old behaviour was to just add extract costs for one operand of the type always, which still happens in getArithmeticInstrCost() if no operands are provided by the caller. This is a good start of improving on this, but there are more places that can be improved by using getOperandsScalarizationOverhead(). Review: Hal Finkel https://reviews.llvm.org/D29017 llvm-svn: 293155	2017-01-26 07:03:25 +00:00
Martin Bohme	8396e14e7f	[ARM] GlobalISel: Fix stack-use-after-scope bug. Summary: Lifetime extension wasn't triggered on the result of BuildMI because the reference was non-const. However, instead of adding a const, I've removed the reference entirely as RVO should kick in anyway. Reviewers: rovka, bkramer Reviewed By: bkramer Subscribers: aemerson, rengolin, dberris, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D29124 llvm-svn: 293059	2017-01-25 14:28:19 +00:00
Diana Picus	d83df5d372	[ARM] GlobalISel: Support i1 add and ABI extensions Add support for: * i1 add * i1 function arguments, if passed through registers * i1 returns, with ABI signext/zeroext Differential Revision: https://reviews.llvm.org/D27706 llvm-svn: 293035	2017-01-25 08:47:40 +00:00
Diana Picus	8b6c6bedcb	[ARM] GlobalISel: Support i8/i16 ABI extensions At the moment, this means supporting the signext/zeroext attribute on the return type of the function. For function arguments, signext/zeroext should be handled by the caller, so there's nothing for us to do until we start lowering calls. Note that this does not include support for other extensions (i8 to i16), those will be added later. Differential Revision: https://reviews.llvm.org/D27705 llvm-svn: 293034	2017-01-25 08:10:40 +00:00
Diana Picus	1d8eaf4387	[ARM] GlobalISel: Bail out on Thumb. NFC Thumb is not supported yet, so bail out early. llvm-svn: 293029	2017-01-25 07:08:53 +00:00
Javed Absar	00cce41752	[ARM] Classification Improvements to ARM Sched-Models. NFCI. This is a series of patches to enable adding of machine sched models for ARM processors easier and compact. They define new sched-readwrites for groups of ARM instructions. This has been missing so far, and as a consequence, machine scheduler models for individual sub-targets have tended to be larger than they needed to be. The current patch focuses on floating-point instructions. Reviewers: Diana Picus (rovka), Renato Golin (rengolin) Differential Revision: https://reviews.llvm.org/D28194 llvm-svn: 292825	2017-01-23 20:20:39 +00:00
Matthias Braun	856548a616	ARM: tLDR_postidx should be marked mayLoad This fixes -verify-machineinstrs complaints. llvm-svn: 292629	2017-01-20 18:30:28 +00:00
Sjoerd Meijer	2db2a947f6	[Thumb] Add support for tMUL in the compare instruction peephole optimizer. We also want to optimise tests like this: return a*b == 0. The MULS instruction is flag setting, so we don't need the CMP instruction but can instead branch on the result of the MULS. The generated instructions sequence for this example was: MULS, MOVS, MOVS, CMP. The MOVS instruction load the boolean values resulting from the select instruction, but these MOVS instructions are flag setting and were thus preventing this optimisation. Now we first reorder and move the MULS to before the CMP and generate sequence MOVS, MOVS, MULS, CMP so that the optimisation could trigger. Reordering of the MULS and MOVS is safe to do because the subsequent MOVS instructions just set the CPSR register and don't use it, i.e. the CPSR is dead. Differential Revision: https://reviews.llvm.org/D27990 llvm-svn: 292608	2017-01-20 13:10:12 +00:00
Diana Picus	bd66b7dc87	[ARM] Use helpers for adding pred / CC operands. NFC Hunt down some of the places where we use bare addReg(0) or addImm(AL).addReg(0) and replace with add(condCodeOp()) and add(predOps()). This should make it easier to understand what those operands represent (without having to look at the definition of the instruction that we're adding to). Differential Revision: https://reviews.llvm.org/D27984 llvm-svn: 292587	2017-01-20 08:15:24 +00:00
Serge Rogatch	f83d2a25bf	[XRay][Arm] Repair XRay table emission on Arm32 and add tests to identify such problem earlier Summary: Emission of XRay table was occasionally disabled for Arm32, but this bug was not then detected because earlier (also by mistake) testing of XRay was occasionally disabled on 32-bit Arm targets. This patch should fix that problem and detect such problems in the future. This patch is one of a series, see also - https://reviews.llvm.org/D28623 Reviewers: rengolin, dberris Reviewed By: dberris Subscribers: llvm-commits, aemerson, rengolin, dberris, iid_iunknown Differential Revision: https://reviews.llvm.org/D28624 llvm-svn: 292516	2017-01-19 20:24:23 +00:00
Daniel Sanders	d64d5024a4	Re-commit: [globalisel] Tablegen-erate current Register Bank Information Summary: Adds a RegisterBank tablegen class that can be used to declare the register banks and an associated tablegen pass to generate the necessary code. Changes since first commit attempt: * Added missing guards * Added more missing guards * Found and fixed a use-after-free bug involving Twine locals Reviewers: t.p.northover, ab, rovka, qcolombet Reviewed By: qcolombet Subscribers: aditya_nandakumar, rengolin, kristof.beyls, vkalintiris, mgorny, dberris, llvm-commits, rovka Differential Revision: https://reviews.llvm.org/D27338 llvm-svn: 292478	2017-01-19 11:15:55 +00:00
Chad Rosier	771db6f895	[Assembler] Fix crash when assembling .quad for AArch32. A 64-bit relocation does not exist in 32-bit ARMELF. Report an error instead of crashing. PR23870 Patch by Sanne Wouda (sanwou01). Differential Revision: https://reviews.llvm.org/D28851 llvm-svn: 292373	2017-01-18 15:02:54 +00:00
Florian Hahn	8485cecd3f	[thumb,framelowering] Reset NoVRegs in Thumb1FrameLowering::emitPrologue. Summary: In this function, virtual registers can be introduced (for example through calls to emitThumbRegPlusImmInReg). doScavengeFrameVirtualRegs will replace those virtual registers with concrete registers later on in PrologEpilogInserter, which sets NoVRegs again. This patch fixes the Codegen/Thumb/segmented-stacks.ll test case which failed with expensive checks. https://llvm.org/bugs/show_bug.cgi?id=27484 Reviewers: rnk, bkramer, olista01 Reviewed By: olista01 Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D28829 llvm-svn: 292372	2017-01-18 15:01:22 +00:00
Daniel Sanders	af76f989b5	Re-revert: [globalisel] Tablegen-erate current Register Bank Information More missing guards. My build didn't notice it due to a stale file left over from a Global ISel build. llvm-svn: 292369	2017-01-18 14:26:12 +00:00
Daniel Sanders	517b61cb69	Re-commit: [globalisel] Tablegen-erate current Register Bank Information Summary: Adds a RegisterBank tablegen class that can be used to declare the register banks and an associated tablegen pass to generate the necessary code. Changes since last commit: The new tablegen pass is now correctly guarded by LLVM_BUILD_GLOBAL_ISEL and this should fix the buildbots however it may not be the whole fix. The previous buildbot failures suggest there may be a memory bug lurking that I'm unable to reproduce (including when using asan) or spot in the source. If they re-occur on this commit then I'll need assistance from the bot owners to track it down. Reviewers: t.p.northover, ab, rovka, qcolombet Reviewed By: qcolombet Subscribers: aditya_nandakumar, rengolin, kristof.beyls, vkalintiris, mgorny, dberris, llvm-commits, rovka Differential Revision: https://reviews.llvm.org/D27338 llvm-svn: 292367	2017-01-18 14:17:50 +00:00
Sam Parker	df7c6ef96f	[ARM] Create objdump subtarget from build attrs Enable an ELFObjectFile to read the its arm build attributes to produce a target triple with a specific ARM architecture. llvm-objdump now uses this functionality to automatically produce a more accurate target. Differential Revision: https://reviews.llvm.org/D28769 llvm-svn: 292366	2017-01-18 13:52:12 +00:00
Renato Golin	03c5e69d07	Revert "[XRay][Arm] Repair XRay table emission on Arm32 and add tests to identify such problem earlier" This reverts commit r292210, as it broke the Thumb buldbot with: clang-5.0: error: the clang compiler does not support '-fxray-instrument on thumbv7-unknown-linux-gnueabihf'. llvm-svn: 292357	2017-01-18 09:08:43 +00:00
Tim Northover	d943354216	GlobalISel: correctly handle varargs Some platforms (notably iOS) use a different calling convention for unnamed vs named parameters in varargs functions, so we need to keep track of this information when translating calls. Since not many platforms are involved, the guts of the special handling is in the ValueHandler class (with a generic implementation that should work for most targets). llvm-svn: 292283	2017-01-17 22:30:10 +00:00
Serge Rogatch	50be6b45a9	[XRay][Arm] Repair XRay table emission on Arm32 and add tests to identify such problem earlier Summary: Emission of XRay table was occasionally disabled for Arm32, but this bug was not then detected because earlier (also by mistake) testing of XRay was occasionally disabled on 32-bit Arm targets. This patch should fix that problem and detect such problems in the future. This patch is one of a series, see also - https://reviews.llvm.org/D28623 Reviewers: rengolin, dberris Reviewed By: dberris Subscribers: llvm-commits, aemerson, rengolin, dberris, iid_iunknown Differential Revision: https://reviews.llvm.org/D28624 llvm-svn: 292210	2017-01-17 11:52:10 +00:00
Daniel Sanders	a83a1a69c5	Revert r292132: [globalisel] Tablegen-erate current Register Bank Information'... Several buildbots encountered a crash in tablegen when building this commit. Reverting while I investigate the cause. llvm-svn: 292136	2017-01-16 15:34:43 +00:00
Daniel Sanders	ab8194def0	[globalisel] Tablegen-erate current Register Bank Information Summary: Adds a RegisterBank tablegen class that can be used to declare the register banks and an associated tablegen pass to generate the necessary code. Reviewers: t.p.northover, ab, rovka, qcolombet Subscribers: aditya_nandakumar, rengolin, kristof.beyls, vkalintiris, mgorny, dberris, llvm-commits, rovka Differential Revision: https://reviews.llvm.org/D27338 llvm-svn: 292132	2017-01-16 15:20:43 +00:00
Ivan Krasin	1ed7896c1b	Revert r291903 and r291898. Reason: they break check-lld on the bots. Summary: Revert [ARM] Fix ubig32_t read in ARMAttributeParser Now using support functions to read data instead of trying to perform casts. =========================================================== Revert [ARM] Enable objdump to construct triple for ARM Now that The ARMAttributeParser has been moved into the library, it has been modified so that it can parse the attributes without printing them and stores them in a map. ELFObjectFile now queries the attributes to fill out the architecture details of a provided triple for 'arm' and 'thumb' targets. llvm-objdump uses this new functionality. Subscribers: llvm-commits, samparker, aemerson, mgorny Differential Revision: https://reviews.llvm.org/D28683 llvm-svn: 291911	2017-01-13 16:45:15 +00:00
Saleem Abdulrasool	6ef45916c6	ARM: match GCC's behaviour for builtins GCC changes the CC between the user-code and the builtins based on the value of `-target` rather than `-mfloat-abi`. When a HF target is used, the VFP variant of the AAPCS CC is used. Otherwise, the AAPCS variant is used. In all cases, the AEABI functions use the AAPCS CC. Adjust the calling convention based on the target. Resolves PR30543! llvm-svn: 291909	2017-01-13 16:25:33 +00:00
Benjamin Kramer	061f4a5fe6	Apply clang-tidy's performance-unnecessary-value-param to LLVM. With some minor manual fixes for using function_ref instead of std::function. No functional change intended. llvm-svn: 291904	2017-01-13 14:39:03 +00:00
Sam Parker	770ceb69ba	[ARM] Enable objdump to construct triple for ARM Now that The ARMAttributeParser has been moved into the library, it has been modified so that it can parse the attributes without printing them and stores them in a map. ELFObjectFile now queries the attributes to fill out the architecture details of a provided triple for 'arm' and 'thumb' targets. llvm-objdump uses this new functionality. Differential Revision: https://reviews.llvm.org/D28281 llvm-svn: 291898	2017-01-13 11:04:21 +00:00
Diana Picus	a2c59149e1	[ARM] CodeGen: Replace AddDefaultT1CC and AddNoT1CC. NFC For AddDefaultT1CC, we add a new helper t1CondCodeOp, which creates the appropriate register operand. For AddNoT1CC, we use the existing condCodeOp helper - we only had two uses of AddNoT1CC, so at this point it's probably not worth having yet another helper just for them. Differential Revision: https://reviews.llvm.org/D28603 llvm-svn: 291894	2017-01-13 10:37:37 +00:00
Diana Picus	8a73f5562f	[ARM] CodeGen: Remove AddDefaultCC. NFC. Replace all uses of AddDefaultCC with add(condCodeOp()). The transformation has been done automatically with a custom tool based on Clang AST Matchers + RefactoringTool. Differential Revision: https://reviews.llvm.org/D28557 llvm-svn: 291893	2017-01-13 10:18:01 +00:00
Diana Picus	116bbab4e4	[CodeGen] Rename MachineInstrBuilder::addOperand. NFC Rename from addOperand to just add, to match the other method that has been added to MachineInstrBuilder for adding more than just 1 operand. See https://reviews.llvm.org/D28057 for the whole discussion. Differential Revision: https://reviews.llvm.org/D28556 llvm-svn: 291891	2017-01-13 09:58:52 +00:00
Diana Picus	4f8c3e1882	[ARM] CodeGen: Remove AddDefaultPred. NFC. Replace all uses of AddDefaultPred with MachineInstrBuilder::add(predOps()). This makes the code building MachineInstrs more readable, because it allows us to write code like: MIB.addSomeOperand(blah) .add(predOps()) .addAnotherOperand(blahblah) instead of AddDefaultPred(MIB.addSomeOperand(blah)) .addAnotherOperand(blahblah) This commit also adds the predOps helper in the ARM backend, as well as the add method taking a variable number of operands to the MachineInstrBuilder. The transformation has been done mostly automatically with a custom tool based on Clang AST Matchers + RefactoringTool. Differential Revision: https://reviews.llvm.org/D28555 llvm-svn: 291890	2017-01-13 09:37:56 +00:00
Saleem Abdulrasool	555e5980a5	ARM: slightly more table driven libcall setup Switch some additional library call setup to be table driven. This makes it more immediately obvious what the library call looks like. This is important for ARM since the calling conventions for the builtins change based on the target/libcall name. NFC llvm-svn: 291789	2017-01-12 18:46:11 +00:00
Daniel Sanders	b7391dd3b4	[globalisel] Move as much RegisterBank initialization to the constructor as possible Summary: The register bank is now entirely initialized in the constructor. However, we still have the hardcoded number of register classes which will be dealt with in the TableGen patch (D27338) since we do not have access to this information to resolve this at this stage. The number of register classes is known to the TRI and to TableGen but the RegisterBank constructor is too early for the former and too late for the latter. This will be fixed when the data is tablegen-erated. Reviewers: t.p.northover, ab, rovka, qcolombet Subscribers: aditya_nandakumar, kristof.beyls, vkalintiris, llvm-commits, dberris Differential Revision: https://reviews.llvm.org/D27809 llvm-svn: 291770	2017-01-12 16:11:23 +00:00
Daniel Sanders	ae03595bfb	[globalisel] Initialize RegisterBanks with static data. Summary: Refactor the RegisterBank initialization to use static data. This requires GlobalISel implementations to rewrite calls to createRegisterBank() and addRegBankCoverage() into a call to setRegBankData(). Out of tree targets can use diff 4 of D27807 (https://reviews.llvm.org/D27807?id=84117) to have addRegBankCoverage() dump the register classes and other data that needs to be provided to setRegBankData(). This is the method that was used to generate the static data in this patch. Tablegen-eration of this static data will follow after some refactoring. Reviewers: t.p.northover, ab, rovka, qcolombet Subscribers: aditya_nandakumar, kristof.beyls, vkalintiris, llvm-commits, dberris Differential Revision: https://reviews.llvm.org/D27807 Differential Revision: https://reviews.llvm.org/D27808 llvm-svn: 291768	2017-01-12 15:32:10 +00:00
Eli Friedman	3a03742c37	[ARM] More aggressive matching for vpadd and vpaddl. The new matchers work after legalization to make them simpler, and to avoid blocking other optimizations. Differential Revision: https://reviews.llvm.org/D27779 llvm-svn: 291693	2017-01-11 19:33:38 +00:00
Mohammed Agabaria	2c96c43388	[X86] updating TTI costs for arithmetic instructions on X86\SLM arch. updated instructions: pmulld, pmullw, pmulhw, mulsd, mulps, mulpd, divss, divps, divsd, divpd, addpd and subpd. special optimization case which replaces pmulld with pmullw\pmulhw\pshuf seq. In case if the real operands bitwidth <= 16. Differential Revision: https://reviews.llvm.org/D28104 llvm-svn: 291657	2017-01-11 08:23:37 +00:00
Eugene Zelenko	c4ad1ce068	[Target] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 291641	2017-01-11 01:45:03 +00:00
Chad Rosier	d0114fc1dd	[ARM] Remove rbit intrinsics and autoupgrade to generic bitreverse. Testing already covered by CodeGen/ARM/rbit.ll llvm-svn: 291587	2017-01-10 19:23:51 +00:00
Mohammed Agabaria	23599ba794	Currently isLikelyComplexAddressComputation tries to figure out if the given stride seems to be 'complex' and need some extra cost for address computation handling. This code seems to be target dependent which may not be the same for all targets. Passed the decision whether the given stride is complex or not to the target by sending stride information via SCEV to getAddressComputationCost instead of 'IsComplex'. Specifically at X86 targets we dont see any significant address computation cost in case of the strided access in general. Differential Revision: https://reviews.llvm.org/D27518 llvm-svn: 291106	2017-01-05 14:03:41 +00:00
Dean Michael Berris	f7e7b938ea	[XRay] Merge instrumentation point table emission code into AsmPrinter. Summary: No need to have this per-architecture. While there, unify 32-bit ARM's behaviour with what changed elsewhere and start function names lowercase as per the coding standards. Individual entry emission code goes to the entry's own class. Fully tested on amd64, cross-builds on both ARMs and PowerPC. Reviewers: dberris Subscribers: aemerson, llvm-commits Differential Revision: https://reviews.llvm.org/D28209 llvm-svn: 290858	2017-01-03 04:30:21 +00:00
Aaron Ballman	58a61e723e	Caught a simple typo. I do not know of a way to test this, but it seems like an unlikely thing to regress in the future. llvm-svn: 290757	2016-12-30 15:57:56 +00:00
Eli Friedman	d03df8145f	[ARM] Implement isExtractSubvectorCheap. See https://reviews.llvm.org/D6678 for the history of isExtractSubvectorCheap. Essentially the same considerations apply to ARM. This temporarily breaks the formation of vpadd/vpaddl in certain cases; AddCombineToVPADDL essentially assumes that we won't form VUZP shuffles. See https://reviews.llvm.org/D27779 for followup fix. Differential Revision: https://reviews.llvm.org/D27774 llvm-svn: 290198	2016-12-20 20:05:07 +00:00
Daniel Jasper	24218d5993	Silence unused warning. llvm-svn: 290109	2016-12-19 14:24:22 +00:00
Diana Picus	97ae95c3a8	[ARM] GlobalISel: Lower i8 and i16 register args This allows lowering i8 and i16 arguments if they can fit in the registers. Note that the lowering is incomplete - ABI extensions are handled in a subsequent patch. (Last part of) Differential Revision: https://reviews.llvm.org/D27704 llvm-svn: 290106	2016-12-19 14:08:02 +00:00
Diana Picus	5a724452a0	[ARM] GlobalISel: Allow i8 and i16 adds Teach the instruction selector and legalizer that it's ok to have adds with 8 or 16-bit integers. This is the second part of https://reviews.llvm.org/D27704 llvm-svn: 290105	2016-12-19 14:07:56 +00:00
Diana Picus	36aa09fa3c	[ARM] GlobalISel: Select i8 and i16 copies Teach the instruction selector that it's ok to copy small values from physical registers. First part of https://reviews.llvm.org/D27704 llvm-svn: 290104	2016-12-19 14:07:50 +00:00
Diana Picus	1437f6d710	[ARM] GlobalISel: Lower more than 4 arguments This adds support for lowering more than 4 arguments (although still i32 only). It uses the handleAssignments / ValueHandler infrastructure extracted from the AArch64 backend in r288658. Differential Revision: https://reviews.llvm.org/D27195 llvm-svn: 290098	2016-12-19 11:55:41 +00:00
Diana Picus	519807f7be	[ARM] GlobalISel: Support loading from the stack Add support for selecting simple G_LOAD and G_FRAME_INDEX instructions (32-bit scalars only). This will be useful for functions that need to pass arguments on the stack. First part of https://reviews.llvm.org/D27195. llvm-svn: 290096	2016-12-19 11:26:31 +00:00
Eli Friedman	f624ec27b7	[ARM] Add ARMISD::VLD1DUP to match vld1_dup more consistently. Currently, there are substantial problems forming vld1_dup even if the VDUP survives legalization. The lack of an actual node leads to terrible results: not only can we not form post-increment vld1_dup instructions, but we form scalar pre-increment and post-increment loads which force the loaded value into a GPR. This patch fixes that by combining the vdup+load into an ARMISD node before DAGCombine messes it up. Also includes a crash fix for vld2_dup (see testcase @vld2dupi8_postinc_variable). Recommiting with fix to avoid forming vld1dup if the type of the load doesn't match the type of the vdup (see https://llvm.org/bugs/show_bug.cgi?id=31404). Differential Revision: https://reviews.llvm.org/D27694 llvm-svn: 289972	2016-12-16 18:44:08 +00:00
Benjamin Kramer	24bf8689dd	[GlobalISel] Silence unused variable warnings in Release builds. llvm-svn: 289941	2016-12-16 13:13:03 +00:00
Diana Picus	812caee65a	[ARM] GlobalISel: Select add i32, i32 Add the minimal support necessary to select a function that returns the sum of two i32 values. This includes some support for argument/return lowering of i32 values through registers, as well as the handling of copy and add instructions throughout the GlobalISel pipeline. Differential Revision: https://reviews.llvm.org/D26677 llvm-svn: 289940	2016-12-16 12:54:46 +00:00
Diana Picus	2af9c389bf	[ARM] Expose methods to get the CCAssignFn. NFCI Add two public methods to ARMTargetLowering: CCAssignFnForCall and CCAssignFnForReturn, which are just calling the already existing private method CCAssignFnForNode. These will come in handy for GlobalISel on ARM. We also replace all calls to CCAssignFnForNode in ARMISelLowering.cpp, because the new methods are friendlier to the reader. llvm-svn: 289932	2016-12-16 10:35:20 +00:00
Nico Weber	11c2e6ee3a	Revert 279703, it caused PR31404. llvm-svn: 289923	2016-12-16 04:51:25 +00:00
Ahmed Bougacha	5228603387	[GlobalISel] Drop workaround for Legalizer member/class sharing a name. NFC. MachineLegalizer used to be the name of both the class and the member, causing GCC errors. r276522 fixed that by renaming the member to just 'Legalizer'. The 'class' workaround isn't necessary anymore; drop it. llvm-svn: 289848	2016-12-15 18:45:30 +00:00
Sjoerd Meijer	96e10b5a9e	[Thumb] Teach ISel how to lower compares of AND bitmasks efficiently This is essentially a recommit of r285893, but with a correctness fix. The problem of the original commit was that this: bic r5, r7, #31 cbz r5, .LBB2_10 got rewritten into: lsrs r5, r7, #5 beq .LBB2_10 The result in destination register r5 is not the same and this is incorrect when r5 is not dead. So this fix includes checking the uses of the AND destination register. And also, compared to the original commit, some regression tests didn't need changing anymore because of this extra check. For completeness, this was the original commit message: For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)). 1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS. 2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS. 3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS). 4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask. 1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win. Differential Revision: https://reviews.llvm.org/D27761 llvm-svn: 289794	2016-12-15 09:38:59 +00:00
Prakhar Bahuguna	13e9921ccc	Fix for build warning in execute-only support llvm-svn: 289788	2016-12-15 08:42:04 +00:00
Prakhar Bahuguna	52a7dd7d78	[ARM] Implement execute-only support in CodeGen This implements execute-only support for ARM code generation, which prevents the compiler from generating data accesses to code sections. The following changes are involved: * Add the CodeGen option "-arm-execute-only" to the ARM code generator. * Add the clang flag "-mexecute-only" as well as the GCC-compatible alias "-mpure-code" to enable this option. * When enabled, literal pools are replaced with MOVW/MOVT instructions, with VMOV used in addition for floating-point literals. As the MOVT instruction is required, execute-only support is only available in Thumb mode for targets supporting ARMv8-M baseline or Thumb2. * Jump tables are placed in data sections when in execute-only mode. * The execute-only text section is assigned section ID 0, and is marked as unreadable with the SHF_ARM_PURECODE flag with symbol 'y'. This also overrides selection of ELF sections for globals. llvm-svn: 289784	2016-12-15 07:59:08 +00:00
Eli Friedman	cbed30c501	[ARM] Split 128-bit vectors in BUILD_VECTOR lowering Given that INSERT_VECTOR_ELT operates on D registers anyway, combining 64-bit vectors into a 128-bit vector is basically free. Therefore, try to split BUILD_VECTOR nodes before giving up and lowering them to a series of INSERT_VECTOR_ELT instructions. Sometimes this allows dramatically better lowerings; see testcases for examples. Inspired by similar code in the x86 backend for AVX. Differential Revision: https://reviews.llvm.org/D27624 llvm-svn: 289706	2016-12-14 20:44:38 +00:00
Eli Friedman	10576e73c9	[ARM] Add ARMISD::VLD1DUP to match vld1_dup more consistently. Currently, there are substantial problems forming vld1_dup even if the VDUP survives legalization. The lack of an actual node leads to terrible results: not only can we not form post-increment vld1_dup instructions, but we form scalar pre-increment and post-increment loads which force the loaded value into a GPR. This patch fixes that by combining the vdup+load into an ARMISD node before DAGCombine messes it up. Also includes a crash fix for vld2_dup (see testcase @vld2dupi8_postinc_variable). Differential Revision: https://reviews.llvm.org/D27694 llvm-svn: 289703	2016-12-14 20:25:26 +00:00
Stephan Bergmann	17c7f70362	Replace APFloatBase static fltSemantics data members with getter functions At least the plugin used by the LibreOffice build (<https://wiki.documentfoundation.org/Development/Clang_plugins>) indirectly uses those members (through inline functions in LLVM/Clang include files in turn using them), but they are not exported by utils/extract_symbols.py on Windows, and accessing data across DLL/EXE boundaries on Windows is generally problematic. Differential Revision: https://reviews.llvm.org/D26671 llvm-svn: 289647	2016-12-14 11:57:17 +00:00
Evandro Menezes	aeec780e42	Add support for Samsung Exynos M3 (NFC) llvm-svn: 289613	2016-12-13 23:31:41 +00:00
Alina Sbirlea	77c5eaaeda	Generalize strided store pattern in interleave access pass Summary: This patch aims to generalize matching of the strided store accesses to more general masks. The more general rule is to have consecutive accesses based on the stride: [x, y, ... z, x+1, y+1, ...z+1, x+2, y+2, ...z+2, ...] All elements in the masks need not form a contiguous space, there may be gaps. As before, undefs are allowed and filled in with adjacent element loads. Reviewers: HaoLiu, mssimpso Subscribers: mkuper, delena, llvm-commits Differential Revision: https://reviews.llvm.org/D23646 llvm-svn: 289573	2016-12-13 19:32:36 +00:00
Matthias Braun	0c989a893b	LivePhysReg: Use reference instead of pointer in init(); NFC llvm-svn: 289002	2016-12-08 00:15:51 +00:00
Oliver Stannard	870b5cad45	[ARM] Better error message for invalid flag-preserving Thumb1 insts When we see a non flag-setting instruction for which only the flag-setting version is available in Thumb1, we should give a better error message than "invalid instruction". Differential Revision: https://reviews.llvm.org/D27414 llvm-svn: 288805	2016-12-06 12:59:08 +00:00
Peter Collingbourne	ab85225be4	IR: Change the gep_type_iterator API to avoid always exposing the "current" type. Instead, expose whether the current type is an array or a struct, if an array what the upper bound is, and if a struct the struct type itself. This is in preparation for a later change which will make PointerType derive from Type rather than SequentialType. Differential Revision: https://reviews.llvm.org/D26594 llvm-svn: 288458	2016-12-02 02:24:42 +00:00
Oleg Ranevskyy	e2ae41519f	[ARM] Fix for 64-bit CAS expansion on ARM32 with -O0 Summary: This patch fixes comparison of 64-bit atomic with its expected value in CMP_SWAP_64 expansion. Currently, the low words are compared with CMP, while the high words are compared with SBC. SBC expects the carry flag to be set if CMP detects a difference. CMP might leave the carry unset for unequal arguments though if the first one is >= than the second. This might cause the comparison logic to detect false equality. Example of the broken C++ code: ``` std::atomic<long long> at(2); long long ll = 1; std::atomic_compare_exchange_strong(&at, &ll, 3); ``` Even though the atomic `at` and the expected value `ll` are not equal and `atomic_compare_exchange_strong` returns `false`, `at` is changed to 3. The patch replaces SBC with CMPEQ. Reviewers: t.p.northover Subscribers: aemerson, rengolin, llvm-commits, asl Differential Revision: https://reviews.llvm.org/D27315 llvm-svn: 288433	2016-12-01 22:58:35 +00:00
Matthias Braun	d0ee66c2e9	Move most EH from MachineModuleInfo to MachineFunction Recommitting r288293 with some extra fixes for GlobalISel code. Most of the exception handling members in MachineModuleInfo is actually per function data (talks about the "current function") so it is better to keep it at the function instead of the module. This is a necessary step to have machine module passes work properly. Also: - Rename TidyLandingPads() to tidyLandingPads() - Use doxygen member groups instead of "//===- EH ---"... so it is clear where a group ends. - I had to add an ugly const_cast at two places in the AsmPrinter because the available MachineFunction pointers are const, but the code wants to call tidyLandingPads() in between (markFunctionEnd()/endFunction()). Differential Revision: https://reviews.llvm.org/D27227 llvm-svn: 288405	2016-12-01 19:32:15 +00:00
Eric Christopher	e70b7c3dfb	Temporarily Revert "Move most EH from MachineModuleInfo to MachineFunction" This apprears to have broken the global isel bot: http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-globalisel_build/5174/console This reverts commit r288293. llvm-svn: 288322	2016-12-01 07:50:12 +00:00
Matthias Braun	ed14cb0604	Move most EH from MachineModuleInfo to MachineFunction Most of the exception handling members in MachineModuleInfo is actually per function data (talks about the "current function") so it is better to keep it at the function instead of the module. This is a necessary step to have machine module passes work properly. Also: - Rename TidyLandingPads() to tidyLandingPads() - Use doxygen member groups instead of "//===- EH ---"... so it is clear where a group ends. - I had to add an ugly const_cast at two places in the AsmPrinter because the available MachineFunction pointers are const, but the code wants to call tidyLandingPads() in between (markFunctionEnd()/endFunction()). Differential Revision: https://reviews.llvm.org/D27227 llvm-svn: 288293	2016-11-30 23:49:01 +00:00
Matthias Braun	f23ef437cc	Move FrameInstructions from MachineModuleInfo to MachineFunction This is per function data so it is better kept at the function instead of the module. This is a necessary step to have machine module passes work properly. Differential Revision: https://reviews.llvm.org/D27185 llvm-svn: 288291	2016-11-30 23:48:42 +00:00
Matthias Braun	c52fe2961c	Clarify rules for reserved regs, fix aarch64 ones. No test case necessary as the problematic condition is checked with the newly introduced assertAllSuperRegsMarked() function. Differential Revision: https://reviews.llvm.org/D26648 llvm-svn: 288277	2016-11-30 22:17:10 +00:00
Kuba Mracek	06995e866b	[xray] Add XRay support for Mach-O in CodeGen Currently, XRay only supports emitting the XRay table (xray_instr_map) on ELF binaries. Let's add Mach-O support. Differential Revision: https://reviews.llvm.org/D26983 llvm-svn: 287734	2016-11-23 02:07:04 +00:00
Tim Northover	b64fb453ea	CodeGen: simplify TargetMachine::getSymbol interface. NFC. No-one actually had a mangler handy when calling this function, and getSymbol itself went most of the way towards getting its own mangler (with a local TLOF variable) so forcing all callers to supply one was just extra complication. llvm-svn: 287645	2016-11-22 16:17:20 +00:00
Pablo Barrio	c41e856f53	[ARM] Relax restriction on variadic functions for tailcall optimization Summary: Variadic functions can be treated in the same way as normal functions with respect to the number and types of parameters. Reviewers: grosbach, olista01, t.p.northover, rengolin Subscribers: javed.absar, aemerson, llvm-commits Differential Revision: https://reviews.llvm.org/D26748 llvm-svn: 287219	2016-11-17 10:56:58 +00:00
Tim Northover	397f9d9d05	ARM: fix CodeGen for 64-bit shifts. One half of the shifts obviously needed conditional selection based on whether the shift amount is more than 32-bits, but leaving the other half as the natural shift isn't acceptable either: it's undefined behaviour to shift a 32-bit value by more than 31. llvm-svn: 287149	2016-11-16 20:54:28 +00:00
Tim Northover	ed55a05b01	GlobalISel: remove unused variable to silence warning. llvm-svn: 287027	2016-11-15 21:06:07 +00:00
Diana Picus	895c6aa6fd	[ARM] GlobalISel: Remove unused members. NFCI This silences some warnings that I didn't see with my host compiler. llvm-svn: 286981	2016-11-15 16:42:10 +00:00
Diana Picus	90f0a84943	[ARM] Make sure GlobalISel is only initialized once. NFCI Move some code inside the proper 'if' block to make sure it is only run once, when the subtarget is first created. Things can still break if we use different ARM target machines or if we have functions with different 'target-cpu' or 'target-features', we should fix that too in the future. llvm-svn: 286974	2016-11-15 15:38:15 +00:00
Javed Absar	f043dac25d	[ARM] Add machine scheduler for Cortex-R52 This patch adds the Sched Machine Model for Cortex-R52. Details of the pipeline and descriptions are in comments in file ARMScheduleR52.td included in this patch. Reviewers: rengolin, jmolloy Differential Revision: https://reviews.llvm.org/D26500 llvm-svn: 286949	2016-11-15 11:34:54 +00:00
Tim Northover	3d38c38826	ARM: try to fix GCC 4.8 compilation again after r286881. llvm-svn: 286882	2016-11-14 20:31:53 +00:00
Tim Northover	46a6f0fbf0	Recommit: ARM: sort register lists by encoding in push/pop instructions. For example we were producing push {r8, r10, r11, r4, r5, r7, lr} This is misleading (r4, r5 and r7 are actually pushed before the rest), and other components (stack folding recently) often forget to deal with the extra complexity coming from the different order, leading to miscompiles. Finally, we warn about our own code in -no-integrated-as mode without this, which is really not a good idea. Fixed usage of std::sort so that we (hopefully) use instantiations that actually exist in GCC 4.8. llvm-svn: 286881	2016-11-14 20:28:24 +00:00
Tim Northover	1b66f39cf2	Revert "ARM: sort register lists by encoding in push/pop instructions." This reverts commit 286866. It broke a bot, something to do with exactly which templates std::sort accepts. llvm-svn: 286867	2016-11-14 19:05:28 +00:00
Tim Northover	e908ea844c	ARM: sort register lists by encoding in push/pop instructions. For example we were producing push {r8, r10, r11, r4, r5, r7, lr} This is misleading (r4, r5 and r7 are actually pushed before the rest), and other components (stack folding recently) often forget to deal with the extra complexity coming from the different order, leading to miscompiles. Finally, we warn about our own code in -no-integrated-as mode without this, which is really not a good idea. llvm-svn: 286866	2016-11-14 19:02:17 +00:00
Diana Picus	22274934f4	[ARM] Add plumbing for GlobalISel Add GlobalISel skeleton, up to the point where we can select a ret void. llvm-svn: 286573	2016-11-11 08:27:37 +00:00
Davide Italiano	a22ddddfea	[Target] Rename X86/ARM Assembly printer to reflect reality. This shows up a lot profiling LTO testcases with -time-passes, so better have a non confusing name. llvm-svn: 286488	2016-11-10 18:39:31 +00:00
Oliver Stannard	18ca2adf2d	[ARM] Thumb2 LDR (literal) should accept PC as the destination The version of this instruction with the .w suffix already correctly accepts this, but the alias without the .w did not. Differential Revision: https://reviews.llvm.org/D26499 llvm-svn: 286446	2016-11-10 13:20:41 +00:00
James Molloy	b03e0879fc	[Thumb1] Move padding earlier when synthesizing TBBs off of the PC When the base register (register pointing to the jump table) is the PC, we expect the jump table to directly follow the jump sequence with no intervening padding. If there is intervening padding, the calculated offsets will not be correct. One solution would be to account for any padding in the emitted LDRB instruction, but at the moment we don't support emitting MCExprs for the load offset. In the meantime, it's correct and only a slight amount worse to just move the padding up, from just before the jump table to just before the jump instruction sequence. We can do that by emitting code alignment before the jump sequence, as we know the number of instructions in the sequence is always 4. llvm-svn: 286107	2016-11-07 13:38:21 +00:00
Saleem Abdulrasool	804e12eeb5	ARM: lower fpowi appropriately for Windows ARM This handles the last case of the builtin function calls that we would generate code which differed from Microsoft's ABI. Rather than generating a call to `__pow{d,s}i2` we now promote the parameter to a float or double and invoke `powf` or `pow` instead. Addresses PR30825! llvm-svn: 286082	2016-11-06 19:46:54 +00:00
Weiming Zhao	962eaaea9c	[Cortex-M0] Atomic lowering Summary: ARMv6m supports dmb etc fench instructions but not ldrex/strex etc. So for some atomic load/store, LLVM should inline instructions instead of lowering to __sync_ calls. Reviewers: rengolin, efriedma, t.p.northover, jmolloy Subscribers: efriedma, aemerson, llvm-commits Differential Revision: https://reviews.llvm.org/D26120 llvm-svn: 285969	2016-11-03 21:49:08 +00:00
Chandler Carruth	5589aa60c7	Remove a redundant condition found by PVS-Studio. Filed http://llvm.org/PR30897 to teach Clang to warn on this kind of stuff. llvm-svn: 285945	2016-11-03 17:42:02 +00:00
James Molloy	e7d97368f2	Revert "[Thumb] Teach ISel how to lower compares of AND bitmasks efficiently" This reverts commit r285893. It caused (probably) http://lab.llvm.org:8011/builders/clang-cmake-thumbv7-a15-full-sh/builds/83 . llvm-svn: 285912	2016-11-03 14:08:01 +00:00
James Molloy	b60d8b1987	[Thumb] Teach ISel how to lower compares of AND bitmasks efficiently This recommits r281323, which was backed out for two reasons. One, a selfhost failure, and two, it apparently caused Chromium failures. Actually, the latter was a red herring. The log has expired from the former, but I suspect that was a red herring too (actually caused by another problematic patch of mine). Therefore reapplying, and will watch the bots like a hawk. For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)). 1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS. 2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS. 3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS). 4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask. 1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win. llvm-svn: 285893	2016-11-03 10:18:20 +00:00
Nirav Dave	0a392a8e7f	[ARM][MC] Cleanup ARM Target Assembly Parser Summary: Correctly parse end-of-statement tokens and handle preprocessor end-of-line comments in ARM assembly processor. Reviewers: rnk, majnemer Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D26152 llvm-svn: 285830	2016-11-02 16:22:51 +00:00
Alex Bradbury	58eba09949	[TableGen] Move OperandMatchResultTy enum to MCTargetAsmParser.h As it stands, the OperandMatchResultTy is only included in the generated header if there is custom operand parsing. However, almost all backends make use of MatchOperand_Success and friends from OperandMatchResultTy for e.g. parseRegister. This is a pain when starting an AsmParser for a new backend that doesn't yet have custom operand parsing. Move the enum to MCTargetAsmParser.h. This patch is a prerequisite for D23563 Differential Revision: https://reviews.llvm.org/D23496 llvm-svn: 285705	2016-11-01 16:32:05 +00:00
James Molloy	70a3d6df52	[Thumb-1] Synthesize TBB/TBH instructions to make use of compressed jump tables [Reapplying r284580 and r285917 with fix and testing to ensure emitted jump tables for Thumb-1 have 4-byte alignment] The TBB and TBH instructions in Thumb-2 allow jump tables to be compressed into sequences of bytes or shorts respectively. These instructions do not exist in Thumb-1, however it is possible to synthesize them out of a sequence of other instructions. It turns out this sequence is so short that it's almost never a lose for performance and is ALWAYS a significant win for code size. TBB example: Before: lsls r0, r0, #2 After: add r0, pc adr r1, .LJTI0_0 ldrb r0, [r0, #6] ldr r0, [r0, r1] lsls r0, r0, #1 mov pc, r0 add pc, r0 => No change in prologue code size or dynamic instruction count. Jump table shrunk by a factor of 4. The only case that can increase dynamic instruction count is the TBH case: Before: lsls r0, r4, #2 After: lsls r4, r4, #1 adr r1, .LJTI0_0 add r4, pc ldr r0, [r0, r1] ldrh r4, [r4, #6] mov pc, r0 lsls r4, r4, #1 add pc, r4 => 1 more instruction in prologue. Jump table shrunk by a factor of 2. So there is an argument that this should be disabled when optimizing for performance (and a TBH needs to be generated). I'm not so sure about that in practice, because on small cores with Thumb-1 performance is often tied to code size. But I'm willing to turn it off when optimizing for performance if people want (also note that TBHs are fairly rare in practice!) llvm-svn: 285690	2016-11-01 13:37:41 +00:00
Saleem Abdulrasool	e1aa782bd0	CodeGen: further loosen -O0 CG for WoA division Generate the slowest possible codepath for noopt CodeGen. Even trying to be clever with the negated jump can cause out-of-range jumps. Use a wide branch instead. Although the code is modelled simplistically, the later optimizations would recombine the branching into `cbz` if possible. This re-enables the previous optimization as well as hopefully gives us working code in all cases. Addresses PR30356! llvm-svn: 285649	2016-10-31 22:12:37 +00:00
Saleem Abdulrasool	075d2e3c59	ARM: ensure that the Windows DBZ check is in range The Windows ARM target expects the compiler to emit a division-by-zero check. The check would use the form of: cmp r?, #0 cbz .Ltrap b .Lbody .Lbody: ... .Ltrap: udf #249 @ __brkdiv0 This works great most of the time. However, if the body of the function is greater than 127 bytes, the branch target limitation of cbz becomes an issue. This occurs in the unoptimized code generation cases sometimes (like in compiler-rt). Since this is a matter of correctness, possibly pay a small penalty instead. We now form this slightly differently: cbnz .Lbody udf #249 @ __brkdiv0 .Lbody: ... The positive case is through the branch instead of being the next instruction. However, because of the basic block layout, the negated branch is going to be a short distance always (2 bytes away, after the inserted __brkdiv0). The new t__brkdiv0 instruction is required to explicitly mark the instruction as a terminator as the generic UDF instruction is not a terminator. Addresses PR30532! llvm-svn: 285312	2016-10-27 16:59:22 +00:00
Sam Parker	e7d9505c08	[ARM] Predicate UMAAL selection on hasDSP. UMAAL is a DSP instruction and it is not available on thumbv7m (Cortex-M3) and thumbv6m (Cortex-M0+1) targets. Also fix wrong CHECK prefix in longMAC.ll test. Patch by Vadzim Dambrouski. Differential Revision: https://reviews.llvm.org/D25890 llvm-svn: 285278	2016-10-27 09:47:10 +00:00
Tim Northover	a9cc385664	ARM: don't rely on push/pop reglists being in order when folding SP adjust. It would be a very nice invariant to rely on, but unfortunately it doesn't necessarily hold (and the causes of mis-sorted reglists appear to be quite varied) so to be robust the frame lowering code can't assume that the first register in the list is also the first one that actually gets pushed. Should fix an issue where we were turning something like: push {r8, r4, r7, lr} sub sp, #24 into nonsense like: push {r2, r3, r4, r5, r6, r7, r8, r4, r7, lr} llvm-svn: 285232	2016-10-26 20:01:00 +00:00
Matthias Braun	8b38ffaa98	CodeGen/Passes: Pass MachineFunction as functor arg; NFC Passing a MachineFunction as argument is more natural and avoids an unnecessary round-trip through the logic determining the correct Subtarget because MachineFunction already has a reference anyway. llvm-svn: 285039	2016-10-24 23:23:02 +00:00
Eli Friedman	b37864b58d	Revert r284580+r284917. ("Synthesize TBB/TBH instructions") The optimization has correctness issues, so reverting for now to fix tests on thumb1 targets. llvm-svn: 284993	2016-10-24 17:20:50 +00:00
James Molloy	2bae8640d7	[ARM] Fix crash in ConstantIslands tPCRelJT may not be the first instruction in a block. Check that instead of dereferencing a broken iterator. llvm-svn: 284917	2016-10-22 09:58:37 +00:00
Benjamin Kramer	2a8bef8769	Do a sweep over move ctors and remove those that are identical to the default. All of these existed because MSVC 2013 was unable to synthesize default move ctors. We recently dropped support for it so all that error-prone boilerplate can go. No functionality change intended. llvm-svn: 284721	2016-10-20 12:20:28 +00:00
Sjoerd Meijer	2fc4cb6f72	Reapply r284571 (with the new tests fixed). llvm-svn: 284588	2016-10-19 13:43:02 +00:00
James Molloy	fbfd173447	[Thumb-1] Synthesize TBB/TBH instructions to make use of compressed jump tables The TBB and TBH instructions in Thumb-2 allow jump tables to be compressed into sequences of bytes or shorts respectively. These instructions do not exist in Thumb-1, however it is possible to synthesize them out of a sequence of other instructions. It turns out this sequence is so short that it's almost never a lose for performance and is ALWAYS a significant win for code size. TBB example: Before: lsls r0, r0, #2 After: add r0, pc adr r1, .LJTI0_0 ldrb r0, [r0, #6] ldr r0, [r0, r1] lsls r0, r0, #1 mov pc, r0 add pc, r0 => No change in prologue code size or dynamic instruction count. Jump table shrunk by a factor of 4. The only case that can increase dynamic instruction count is the TBH case: Before: lsls r0, r4, #2 After: lsls r4, r4, #1 adr r1, .LJTI0_0 add r4, pc ldr r0, [r0, r1] ldrh r4, [r4, #6] mov pc, r0 lsls r4, r4, #1 add pc, r4 => 1 more instruction in prologue. Jump table shrunk by a factor of 2. So there is an argument that this should be disabled when optimizing for performance (and a TBH needs to be generated). I'm not so sure about that in practice, because on small cores with Thumb-1 performance is often tied to code size. But I'm willing to turn it off when optimizing for performance if people want (also note that TBHs are fairly rare in practice!) llvm-svn: 284580	2016-10-19 12:06:49 +00:00
Sjoerd Meijer	3f5111d363	Revert of r284571 because of failing tests. llvm-svn: 284572	2016-10-19 07:45:48 +00:00
Sjoerd Meijer	a318779263	Checking FP function attribute values and adding more build attribute tests. This renames the function for checking FP function attribute values and also adds more build attribute tests (which are in separate files because build attributes are set per file). Differential Revision: https://reviews.llvm.org/D25625 llvm-svn: 284571	2016-10-19 07:25:06 +00:00
Eli Friedman	c0a717ba5b	Improve ARM lowering for "icmp <2 x i64> eq". The custom lowering is pretty straightforward: basically, just AND together the two halves of a <4 x i32> compare. Differential Revision: https://reviews.llvm.org/D25713 llvm-svn: 284536	2016-10-18 21:03:40 +00:00
Javed Absar	e7c338081a	[ARM] Assign cost of scaling for Cortex-R52 This patch assigns cost of the scaling used in addressing for Cortex-R52. On Cortex-R52 a negated register offset takes longer than a non-negated register offset, in a register-offset addressing mode. Differential Revision: http://reviews.llvm.org/D25670 Reviewer: jmolloy llvm-svn: 284460	2016-10-18 09:08:54 +00:00
Dean Michael Berris	156f6cafc2	[XRay] Support for for tail calls for ARM no-Thumb This patch adds simplified support for tail calls on ARM with XRay instrumentation. Known issue: compiled with generic flags: `-O3 -g -fxray-instrument -Wall -std=c++14 -ffunction-sections -fdata-sections` (this list doesn't include my specific flags like --target=armv7-linux-gnueabihf etc.), the following program #include <cstdio> #include <cassert> #include <xray/xray_interface.h> [[clang::xray_always_instrument]] void __attribute__ ((noinline)) fC() { std::printf("In fC()\n"); } [[clang::xray_always_instrument]] void __attribute__ ((noinline)) fB() { std::printf("In fB()\n"); fC(); } [[clang::xray_always_instrument]] void __attribute__ ((noinline)) fA() { std::printf("In fA()\n"); fB(); } // Avoid infinite recursion in case the logging function is instrumented (so calls logging // function again). [[clang::xray_never_instrument]] void simplyPrint(int32_t functionId, XRayEntryType xret) { printf("XRay: functionId=%d type=%d.\n", int(functionId), int(xret)); } int main(int argc, char* argv[]) { __xray_set_handler(simplyPrint); printf("Patching...\n"); __xray_patch(); fA(); printf("Unpatching...\n"); __xray_unpatch(); fA(); return 0; } gives the following output: Patching... XRay: functionId=3 type=0. In fA() XRay: functionId=3 type=1. XRay: functionId=2 type=0. In fB() XRay: functionId=2 type=1. XRay: functionId=1 type=0. XRay: functionId=1 type=1. In fC() Unpatching... In fA() In fB() In fC() So for function fC() the exit sled seems to be called too much before function exit: before printing In fC(). Debugging shows that the above happens because printf from fC is also called as a tail call. So first the exit sled of fC is executed, and only then printf is jumped into. So it seems we can't do anything about this with the current approach (i.e. within the simplification described in https://reviews.llvm.org/D23988 ). Differential Revision: https://reviews.llvm.org/D25030 llvm-svn: 284456	2016-10-18 05:54:15 +00:00
Davide Italiano	e9cdb24f67	[ArmFastISel] Kill dead code. NFCI. llvm-svn: 284320	2016-10-16 01:09:39 +00:00
Eric Christopher	445c952bd0	Tidy the calls to getCurrentSection().first -> getCurrentSectionOnly to help readability a bit. llvm-svn: 284202	2016-10-14 05:47:37 +00:00
Javed Absar	85874a9360	[ARM]: Assign cost of scaling used in addressing mode for ARM cores This patch assigns cost of the scaling used in addressing. On many ARM cores, a negated register offset takes longer than a non-negated register offset, in a register-offset addressing mode. For instance: LDR R0, [R1, R2 LSL #2] LDR R0, [R1, -R2 LSL #2] Above, (1) takes less cycles than (2). By assigning appropriate scaling factor cost, we enable the LLVM to make the right trade-offs in the optimization and code-selection phase. Differential Revision: http://reviews.llvm.org/D24857 Reviewers: jmolloy, rengolin llvm-svn: 284127	2016-10-13 14:57:43 +00:00
Reid Kleckner	bdfc05ff93	Re-land "[Thumb] Save/restore high registers in Thumb1 pro/epilogues" Reverts r283938 to reinstate r283867 with a fix. The original change had an ArrayRef referring to a destroyed temporary initializer list. Use plain C arrays instead. llvm-svn: 283942	2016-10-11 21:14:03 +00:00
Reid Kleckner	f4876beb2b	Revert "[Thumb] Save/restore high registers in Thumb1 pro/epilogues" This reverts r283867. This appears to be an infinite loop: while (HiRegToSave != AllHighRegs.end() && CopyReg != AllCopyRegs.end()) { if (HiRegsToSave.count(*HiRegToSave)) { ... CopyReg = findNextOrderedReg(++CopyReg, CopyRegs, AllCopyRegs.end()); HiRegToSave = findNextOrderedReg(++HiRegToSave, HiRegsToSave, AllHighRegs.end()); } } llvm-svn: 283938	2016-10-11 20:54:41 +00:00
NAKAMURA Takumi	e9587bd771	ARMMachineFunctionInfo.cpp: Add an initializer of ARMFunctionInfo::ReturnRegsCount in the explicit ctor. It caused crash since r283867. llvm-svn: 283909	2016-10-11 17:38:30 +00:00
NAKAMURA Takumi	e61de02016	Reformat. llvm-svn: 283908	2016-10-11 17:38:25 +00:00
Daniel Jasper	b617286089	Silence unused warning in non-assert builds. llvm-svn: 283899	2016-10-11 16:22:36 +00:00
Oliver Stannard	d2083fb356	[Thumb] Save/restore high registers in Thumb1 pro/epilogues The high registers are not allocatable in Thumb1 functions, but they could still be used by inline assembly, so we need to save and restore the callee-saved high registers (r8-r11) in the prologue and epilogue. This is complicated by the fact that the Thumb1 push and pop instructions cannot access these registers. Therefore, we have to move them down into low registers before pushing, and move them back after popping into low registers. In most functions, we will have low registers that are also being pushed/popped, which we can use as the temporary registers for saving/restoring the high registers. However, this is not guaranteed, so we may need to push some extra low registers to ensure that the high registers can be saved/restored. For correctness, it would be sufficient to use just one low register, but if we have enough low registers available then we only need one push/pop instruction, rather than one per high register. We can also use the argument/return registers when they are not live, and the link register when saving (but not restoring), reducing the number of extra registers we need to push. There are still a few extreme edge cases where we need two push/pop instructions, because not enough low registers can be made live in the prologue or epilogue. In addition to the regression tests included here, I've also tested this using a script to generate functions which clobber different combinations of registers, have different numbers of argument and return registers (including variadic arguments), allocate different fixed sized objects on the stack, and do or don't use variable sized allocas and the __builtin_return_address intrinsic (all of which affect the available registers in the prologue and epilogue). I ran these functions in a test harness which verifies that all of the callee-saved registers are correctly preserved. Differential Revision: https://reviews.llvm.org/D24228 llvm-svn: 283867	2016-10-11 10:12:25 +00:00
Oliver Stannard	50a74393c2	[ARM] Fix registers clobbered by SjLj EH on soft-float targets Currently, the Int_eh_sjlj_dispatchsetup intrinsic is marked as clobbering all registers, including floating-point registers that may not be present on the target. This is technically true, as we could get linked against code that does use the FP registers, but that will not actually work, as the soft-float code cannot save and restore the FP registers. SjLj exception handling can only work correctly if either all or none of the code is built for a target with FP registers. Therefore, we can assume that, when Int_eh_sjlj_dispatchsetup is compiled for a soft-float target, it is only going to be linked against other soft-float code, and so only clobbers the general-purpose registers. This allows us to check that no non-savable registers are clobbered when generating the prologue/epilogue. Differential Revision: https://reviews.llvm.org/D25180 llvm-svn: 283866	2016-10-11 10:06:59 +00:00
Peter Collingbourne	0da86301ad	Revert r283690, "MC: Remove unused entities." llvm-svn: 283814	2016-10-10 22:49:37 +00:00
Alexandros Lamprineas	20e9ddba73	[ARM] Fix invalid VLDM/VSTM access when targeting Big Endian with NEON The instructions VLDM/VSTM can only access word-aligned memory locations and produce alignment fault if the condition is not met. The compiler currently generates VLDM/VSTM for v2f64 load/store regardless the alignment of the memory access. Instead, if a v2f64 load/store is not word-aligned, the compiler should generate VLD1/VST1. For each non double-word-aligned VLD1/VST1, a VREV instruction should be generated when targeting Big Endian. Differential Revision: https://reviews.llvm.org/D25281 llvm-svn: 283763	2016-10-10 16:01:54 +00:00
Mehdi Amini	f42454b94b	Move the global variables representing each Target behind accessor function This avoids "static initialization order fiasco" Differential Revision: https://reviews.llvm.org/D25412 llvm-svn: 283702	2016-10-09 23:00:34 +00:00
Peter Collingbourne	cc723cccab	MC: Remove unused entities. llvm-svn: 283691	2016-10-09 04:39:13 +00:00
Peter Collingbourne	5c924d7117	Target: Remove unused entities. llvm-svn: 283690	2016-10-09 04:38:57 +00:00
Mehdi Amini	732afdd09a	Turn cl::values() (for enum) from a vararg function to using C++ variadic template The core of the change is supposed to be NFC, however it also fixes what I believe was an undefined behavior when calling: va_start(ValueArgs, Desc); with Desc being a StringRef. Differential Revision: https://reviews.llvm.org/D25342 llvm-svn: 283671	2016-10-08 19:41:06 +00:00
Javed Absar	9797989ca7	[ARM]: add missing switch case for cortex-r52 Adds a missing switch case for handling cortex-r52 in init-subtarget-features. llvm-svn: 283551	2016-10-07 13:41:55 +00:00
Martin Storsjo	04864f45b2	[ARM] Reapply: Use __rt_div functions for divrem on Windows Reapplying r283383 after revert in r283442. The additional fix is a getting rid of a stray space in a function name, in the refactoring part of the commit. This avoids falling back to calling out to the GCC rem functions (__moddi3, __umoddi3) when targeting Windows. The __rt_div functions have flipped the two arguments compared to the __aeabi_divmod functions. To match MSVC, we emit a check for division by zero before actually calling the library function (even if the library function itself also might do the same check). Not all calls to __rt_div functions for division are currently merged with calls to the same function with the same parameters for the remainder. This is more wasteful than a div + mls as before, but avoids calls to __moddi3. Differential Revision: https://reviews.llvm.org/D25332 llvm-svn: 283550	2016-10-07 13:28:53 +00:00
Javed Absar	fb4b6e8db9	[ARM]: Add Cortex-R52 target to LLVM This patch adds Cortex-R52, the new ARM real-time processor, to LLVM. Cortex-R52 implements the ARMv8-R architecture. llvm-svn: 283542	2016-10-07 12:06:40 +00:00
Oliver Stannard	4df1cc0b00	[ARM] Don't convert switches to lookup tables of pointers with ROPI/RWPI With the ROPI and RWPI relocation models we can't always have pointers to global data or functions in constant data, so don't try to convert switches into lookup tables if any value in the lookup table would require a relocation. We can still safely emit lookup tables of other values, such as simple constants. Differential Revision: https://reviews.llvm.org/D24462 llvm-svn: 283530	2016-10-07 08:48:24 +00:00
Mehdi Amini	68c6c8cd78	Use StringRef in ARMELFStreamer (NFC) llvm-svn: 283529	2016-10-07 08:48:07 +00:00
Mehdi Amini	a0016ec95f	Use StringReg in TargetParser APIs (NFC) llvm-svn: 283527	2016-10-07 08:37:29 +00:00
Diana Picus	6341e46cd1	Revert "[ARM] Use __rt_div functions for divrem on Windows" This reverts commit r283383 because it broke some of the bots: undefined reference to ` __aeabi_uldivmod' It affected (at least) clang-cmake-armv7-a15-selfhost, clang-cmake-armv7-a15-selfhost and clang-native-arm-lnt. llvm-svn: 283442	2016-10-06 11:24:29 +00:00
James Molloy	6215fad0e9	[ARM] Constant pool promotion - fix alignment calculation Global variables are GlobalValues, so they have explicit alignment. Querying DataLayout for the alignment was incorrect. Testcase added. llvm-svn: 283423	2016-10-06 07:56:00 +00:00
Martin Storsjo	f997759aef	[ARM] Use __rt_div functions for divrem on Windows This avoids falling back to calling out to the GCC rem functions (__moddi3, __umoddi3) when targeting Windows. The __rt_div functions have flipped the two arguments compared to the __aeabi_divmod functions. To match MSVC, we emit a check for division by zero before actually calling the library function (even if the library function itself also might do the same check). Not all calls to __rt_div functions for division are currently merged with calls to the same function with the same parameters for the remainder. This is more wasteful than a div + mls as before, but avoids calls to __moddi3. Differential Revision: https://reviews.llvm.org/D24076 llvm-svn: 283383	2016-10-05 21:08:02 +00:00
James Molloy	b7de497cb9	[Thumb] Don't try and emit LDRH/LDRB from the constant pool This is not a valid encoding - these instructions cannot do PC-relative addressing. The underlying problem here is of whitelist in ARMISelDAGToDAG that unwraps ARMISD::Wrappers during addressing-mode selection. This didn't realise TargetConstantPool was actually possible, so didn't handle it. llvm-svn: 283323	2016-10-05 14:52:13 +00:00
Mehdi Amini	5b00770c35	Use StringRef in ARMConstantPool APIs (NFC) llvm-svn: 283293	2016-10-05 01:41:06 +00:00
Sjoerd Meijer	535529b41c	Consistent fp denormal mode names. NFC. This fixes the inconsistency of the fp denormal option names: in LLVM this was DenormalType, but in Clang this is DenormalMode which seems better. Differential Revision: https://reviews.llvm.org/D24906 llvm-svn: 283192	2016-10-04 08:03:36 +00:00
Sjoerd Meijer	4dbe73c1ed	[ARM] Code size optimisation to lower udiv+urem to udiv+mls instead of a library call to __aeabi_uidivmod. This is an improved implementation of r280808, see also D24133, that got reverted because isel was stuck in a loop. That was caused by the optimisation incorrectly triggering on i64 ints, which shouldn't happen because there is no 64bit hwdiv support; that put isel's type legalization and this optimisation in a loop. A native ARM compiler and testing now shows that this is fixed. Patch mostly by Pablo Barrio. Differential Revision: https://reviews.llvm.org/D25077 llvm-svn: 283098	2016-10-03 10:12:32 +00:00
Mehdi Amini	48878ae579	Use StringRef in Datalayout API (NFC) llvm-svn: 283013	2016-10-01 05:57:55 +00:00
Mehdi Amini	217b246484	Revert "Use StringRef in Datalayout API (NFC)" This reverts commit r283009. Bots are broken. llvm-svn: 283011	2016-10-01 05:12:48 +00:00
Mehdi Amini	29baf9c0e1	Use StringRef in Datalayout API (NFC) llvm-svn: 283009	2016-10-01 04:17:59 +00:00
Mehdi Amini	117296c0a0	Use StringRef in Pass/PassManager APIs (NFC) llvm-svn: 283004	2016-10-01 02:56:57 +00:00
James Molloy	9abb2fa5bb	[ARM] Promote small global constants to constant pools If a constant is unamed_addr and is only used within one function, we can save on the code size and runtime cost of an indirection by changing the global's storage to inside the constant pool. For example, instead of: ldr r0, .CPI0 bl printf bx lr .CPI0: &format_string format_string: .asciz "hello, world!\n" We can emit: adr r0, .CPI0 bl printf bx lr .CPI0: .asciz "hello, world!\n" This can cause significant code size savings when many small strings are used in one function (4 bytes per string). This recommit contains fixes for a nasty bug related to fast-isel fallback - because fast-isel doesn't know about this optimization, if it runs and emits references to a string that we inline (because fast-isel fell back to SDAG) we will end up with an inlined string and also an out-of-line string, and we won't emit the out-of-line string, causing backend failures. It also contains fixes for emitting .text relocations which made the sanitizer bots unhappy. llvm-svn: 282387	2016-09-26 07:26:24 +00:00
James Molloy	85124c76fc	Revert "[ARM] Promote small global constants to constant pools" This reverts commit r282241. It caused http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/19882. llvm-svn: 282249	2016-09-23 13:35:43 +00:00
James Molloy	1ce54d6be2	[ARM] Promote small global constants to constant pools If a constant is unamed_addr and is only used within one function, we can save on the code size and runtime cost of an indirection by changing the global's storage to inside the constant pool. For example, instead of: ldr r0, .CPI0 bl printf bx lr .CPI0: &format_string format_string: .asciz "hello, world!\n" We can emit: adr r0, .CPI0 bl printf bx lr .CPI0: .asciz "hello, world!\n" This can cause significant code size savings when many small strings are used in one function (4 bytes per string). This recommit contains fixes for a nasty bug related to fast-isel fallback - because fast-isel doesn't know about this optimization, if it runs and emits references to a string that we inline (because fast-isel fell back to SDAG) we will end up with an inlined string and also an out-of-line string, and we won't emit the out-of-line string, causing backend failures. It also contains fixes for emitting .text relocations which made the sanitizer bots unhappy. llvm-svn: 282241	2016-09-23 12:15:58 +00:00
Nico Weber	903859c0e4	Revert r281715, it caused PR30475 llvm-svn: 282076	2016-09-21 15:33:24 +00:00
Oliver Stannard	e1f6dc59ce	[Thumb] Set correct initial mapping symbol for big-endian thumb The initial mapping symbol state is set from the triple, but we only checked for the little-endian thumb triple, so could end up with an ARM mapping symbol for big-endian thumb. Differential Revision: https://reviews.llvm.org/D24553 llvm-svn: 281894	2016-09-19 09:21:45 +00:00
Tim Northover	eaee28b5ca	ARM: check alignment before transforming ldr -> ldm (or similar). ldm and stm instructions always require 4-byte alignment on the pointer, but we weren't checking this before trying to reduce code-size by replacing a post-indexed load/store with them. Unfortunately, we were also dropping this incormation in DAG ISel too, but that's easy enough to fix. llvm-svn: 281893	2016-09-19 09:11:09 +00:00
Dean Michael Berris	4640154446	[XRay] ARM 32-bit no-Thumb support in LLVM This is a port of XRay to ARM 32-bit, without Thumb support yet. The XRay instrumentation support is moving up to AsmPrinter. This is one of 3 commits to different repositories of XRay ARM port. The other 2 are: https://reviews.llvm.org/D23932 (Clang test) https://reviews.llvm.org/D23933 (compiler-rt) Differential Revision: https://reviews.llvm.org/D23931 llvm-svn: 281878	2016-09-19 00:54:35 +00:00
Nirav Dave	2364748a49	Defer asm errors to post-statement failure Recommitting after fixing AsmParser initialization and X86 inline asm error cleanup. Allow errors to be deferred and emitted as part of clean up to simplify and shorten Assembly parser code. This will allow error messages to be emitted in helper functions and be modified by the caller which has better context. As part of this many minor cleanups to the Parser: * Unify parser cleanup on error * Add Workaround for incorrect return values in ParseDirective instances * Tighten checks on error-signifying return values for parser functions and fix in-tree TargetParsers to be more consistent with the changes. * Fix AArch64 test cases checking for spurious error messages that are now fixed. These changes should be backwards compatible with current Target Parsers so long as the error status are correctly returned in appropriate functions. Reviewers: rnk, majnemer Subscribers: aemerson, jyknight, llvm-commits Differential Revision: https://reviews.llvm.org/D24047 llvm-svn: 281762	2016-09-16 18:30:20 +00:00
Sjoerd Meijer	227825346e	Reverting r281719, this is causing buildbot failures and timeouts again. llvm-svn: 281722	2016-09-16 13:16:52 +00:00
Sjoerd Meijer	23385c87a4	This is an attempt to reapply r280808: [ARM] Lower UDIV+UREM to UDIV+MLS (and the same for SREM) This was causing buildbot failures earlier (time outs in the LNT suite). However, we haven't been able to reproduce this and are suspecting this was caused by another (reverted) patch. llvm-svn: 281719	2016-09-16 12:10:09 +00:00
James Molloy	0dc4708fca	[ARM] Promote small global constants to constant pools If a constant is unamed_addr and is only used within one function, we can save on the code size and runtime cost of an indirection by changing the global's storage to inside the constant pool. For example, instead of: ldr r0, .CPI0 bl printf bx lr .CPI0: &format_string format_string: .asciz "hello, world!\n" We can emit: adr r0, .CPI0 bl printf bx lr .CPI0: .asciz "hello, world!\n" This can cause significant code size savings when many small strings are used in one function (4 bytes per string). This recommit contains fixes for a nasty bug related to fast-isel fallback - because fast-isel doesn't know about this optimization, if it runs and emits references to a string that we inline (because fast-isel fell back to SDAG) we will end up with an inlined string and also an out-of-line string, and we won't emit the out-of-line string, causing backend failures. It also contains fixes for emitting .text relocations which made the sanitizer bots unhappy. llvm-svn: 281715	2016-09-16 10:17:04 +00:00
Eric Christopher	4367c7fb9a	Move the Mangler from the AsmPrinter down to TLOF and clean up the TLOF API accordingly. llvm-svn: 281708	2016-09-16 07:33:15 +00:00
Evgeniy Stepanov	a0601a40f7	Revert "[ARM] Promote small global constants to constant pools" This reverts r281604, which adds text relocations to ARM binaries. llvm-svn: 281645	2016-09-15 19:13:32 +00:00
James Molloy	fe7fd879d7	[ARM] Promote small global constants to constant pools If a constant is unamed_addr and is only used within one function, we can save on the code size and runtime cost of an indirection by changing the global's storage to inside the constant pool. For example, instead of: ldr r0, .CPI0 bl printf bx lr .CPI0: &format_string format_string: .asciz "hello, world!\n" We can emit: adr r0, .CPI0 bl printf bx lr .CPI0: .asciz "hello, world!\n" This can cause significant code size savings when many small strings are used in one function (4 bytes per string). This recommit contains fixes for a nasty bug related to fast-isel fallback - because fast-isel doesn't know about this optimization, if it runs and emits references to a string that we inline (because fast-isel fell back to SDAG) we will end up with an inlined string and also an out-of-line string, and we won't emit the out-of-line string, causing backend failures. llvm-svn: 281604	2016-09-15 12:30:27 +00:00
Matt Arsenault	1b9fc8ed65	Finish renaming remaining analyzeBranch functions llvm-svn: 281535	2016-09-14 20:43:16 +00:00
Evgeniy Stepanov	e97d3b90b9	Revert "[ARM] Promote small global constants to constant pools" Breaks Android tests by introducing text relocations to ARM binaries. http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/25362/steps/run%20asan%20lit%20tests%20%5Barm%2Fbullhead-userdebug%2FMTC20F%5D/logs/stdio llvm-svn: 281526	2016-09-14 20:02:30 +00:00
Matt Arsenault	e8e0f5cac6	Make analyzeBranch family of instruction names consistent analyzeBranch was renamed to use lowercase first, rename the related set to match. llvm-svn: 281506	2016-09-14 17:24:15 +00:00
Matt Arsenault	a2b036e88b	AArch64: Use TTI branch functions in branch relaxation The main change is to return the code size from InsertBranch/RemoveBranch. Patch mostly by Tim Northover llvm-svn: 281505	2016-09-14 17:23:48 +00:00
Sanjay Patel	284582b6d4	getValueType().getScalarSizeInBits() -> getScalarValueSizeInBits(), round 2 ; NFCI llvm-svn: 281498	2016-09-14 16:54:10 +00:00
Sanjay Patel	1ed771f5d7	getVectorElementType().getSizeInBits() -> getScalarSizeInBits() ; NFCI llvm-svn: 281495	2016-09-14 16:37:15 +00:00
Sanjay Patel	b1f0a0f4a8	getValueType().getSizeInBits() -> getValueSizeInBits() ; NFCI llvm-svn: 281493	2016-09-14 16:05:51 +00:00
Sanjay Patel	bd6fca1419	getScalarType().getSizeInBits() -> getScalarSizeInBits() ; NFCI llvm-svn: 281489	2016-09-14 15:21:00 +00:00
James Molloy	13065b00ba	[ARM] Promote small global constants to constant pools If a constant is unamed_addr and is only used within one function, we can save on the code size and runtime cost of an indirection by changing the global's storage to inside the constant pool. For example, instead of: ldr r0, .CPI0 bl printf bx lr .CPI0: &format_string format_string: .asciz "hello, world!\n" We can emit: adr r0, .CPI0 bl printf bx lr .CPI0: .asciz "hello, world!\n" This can cause significant code size savings when many small strings are used in one function (4 bytes per string). llvm-svn: 281484	2016-09-14 14:47:27 +00:00
James Molloy	9790d8f81d	Revert "[Thumb] Teach ISel how to lower compares of AND bitmasks efficiently" This reverts commit r281323. It caused chromium test failures and a selfhost failure. llvm-svn: 281451	2016-09-14 09:45:28 +00:00
Sjoerd Meijer	724023a1ec	This reapplies r281304. The issue was that I had missed to copy the new isAdd field in the tablegen data structure. llvm-svn: 281447	2016-09-14 08:20:03 +00:00
Nico Weber	e204c48d16	Revert r281336 (and r281337), it caused PR30372. llvm-svn: 281361	2016-09-13 18:17:00 +00:00
Nirav Dave	9fa8af2180	Defer asm errors to post-statement failure Recommitting after fixing AsmParser Initialization. Allow errors to be deferred and emitted as part of clean up to simplify and shorten Assembly parser code. This will allow error messages to be emitted in helper functions and be modified by the caller which has better context. As part of this many minor cleanups to the Parser: * Unify parser cleanup on error * Add Workaround for incorrect return values in ParseDirective instances * Tighten checks on error-signifying return values for parser functions and fix in-tree TargetParsers to be more consistent with the changes. * Fix AArch64 test cases checking for spurious error messages that are now fixed. These changes should be backwards compatible with current Target Parsers so long as the error status are correctly returned in appropriate functions. Reviewers: rnk, majnemer Subscribers: aemerson, jyknight, llvm-commits Differential Revision: https://reviews.llvm.org/D24047 llvm-svn: 281336	2016-09-13 13:55:06 +00:00
James Molloy	043d613791	Revert "[ARM] Promote small global constants to constant pools" This reverts commit r281314. Speculatively revert as it's possible this caused linker errors: http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/19656 llvm-svn: 281327	2016-09-13 12:45:51 +00:00
Pablo Barrio	bb6984d401	[ARM] Add ".code 32" to functions in the ARM instruction set Before, only Thumb functions were marked as ".code 16". These ".code x" directives are effective until the next directive of its kind is encountered. Therefore, in code with interleaved ARM and Thumb functions, it was possible to declare a function as ARM and end up with a Thumb function after assembly. A test has been added. An existing test has also been fixed to take this change into account. Reviewers: aschwaighofer, t.p.northover, jmolloy, rengolin Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D24337 llvm-svn: 281324	2016-09-13 12:18:15 +00:00
James Molloy	d246c598de	[Thumb] Teach ISel how to lower compares of AND bitmasks efficiently For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)). 1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS. 2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS. 3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS). 4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask. 1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win. llvm-svn: 281323	2016-09-13 12:12:32 +00:00
Peter Smith	85bbda191d	[ARM] Support ldr.w in pseudo instruction ldr rd,=immediate The changes made in r269352, r269353 and r269354 to support the transformation of the ldr rd,=immediate to mov introduced a regression from 3.8 (ldr.w rd, =immediate) not supported. This change puts support back in for ldr.w by means of a t2InstAlias for the .w form. The .w is ignored in ARM state and propagated to the ldr in Thumb2. llvm-svn: 281319	2016-09-13 11:15:51 +00:00
James Molloy	3e4bc66134	[ARM] Promote small global constants to constant pools If a constant is unamed_addr and is only used within one function, we can save on the code size and runtime cost of an indirection by changing the global's storage to inside the constant pool. For example, instead of: ldr r0, .CPI0 bl printf bx lr .CPI0: &format_string format_string: .asciz "hello, world!\n" We can emit: adr r0, .CPI0 bl printf bx lr .CPI0: .asciz "hello, world!\n" This can cause significant code size savings when many small strings are used in one function (4 bytes per string). llvm-svn: 281314	2016-09-13 10:28:11 +00:00
Sjoerd Meijer	520a18df9c	Revert of r281304 as it is causing build bot failures in hexagon hwloop regression tests. These tests pass locally; will be investigating where these differences come from. llvm-svn: 281306	2016-09-13 08:51:59 +00:00
Sjoerd Meijer	05453991fe	This adds a new field isAdd to MCInstrDesc. The ARM and Hexagon instruction descriptions now tag add instructions, and the Hexagon backend is using this to identify loop induction statements. Patch by Sam Parker and Sjoerd Meijer. Differential Revision: https://reviews.llvm.org/D23601 llvm-svn: 281304	2016-09-13 08:08:06 +00:00
Eric Christopher	04c7db31e8	Temporarily Revert "[MC] Defer asm errors to post-statement failure" as it's causing errors on the sanitizer bots. This reverts commit r281249. llvm-svn: 281280	2016-09-13 00:19:29 +00:00
Nico Weber	7c31d0ebc0	Revert r281215, it caused PR30358. llvm-svn: 281263	2016-09-12 21:40:50 +00:00
Nirav Dave	c0c0f7a196	[MC] Defer asm errors to post-statement failure Allow errors to be deferred and emitted as part of clean up to simplify and shorten Assembly parser code. This will allow error messages to be emitted in helper functions and be modified by the caller which has better context. As part of this many minor cleanups to the Parser: * Unify parser cleanup on error * Add Workaround for incorrect return values in ParseDirective instances * Tighten checks on error-signifying return values for parser functions and fix in-tree TargetParsers to be more consistent with the changes. * Fix AArch64 test cases checking for spurious error messages that are now fixed. These changes should be backwards compatible with current Target Parsers so long as the error status are correctly returned in appropriate functions. Reviewers: rnk, majnemer Subscribers: aemerson, jyknight, llvm-commits Differential Revision: https://reviews.llvm.org/D24047 llvm-svn: 281249	2016-09-12 20:03:02 +00:00
James Molloy	3d06ff22b7	Revert "[ARM] Promote small global constants to constant pools" This reverts commit r281213. It made a bot go bang: http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-full/builds/14625 llvm-svn: 281228	2016-09-12 16:18:23 +00:00
James Molloy	1e1b56bd48	[Thumb] Teach ISel how to lower compares of AND bitmasks efficiently For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)). 1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS. 2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS. 3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS). 4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask. 1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win. llvm-svn: 281215	2016-09-12 14:30:48 +00:00
James Molloy	8f82d45ff4	[ARM] Promote small global constants to constant pools If a constant is unamed_addr and is only used within one function, we can save on the code size and runtime cost of an indirection by changing the global's storage to inside the constant pool. For example, instead of: ldr r0, .CPI0 bl printf bx lr .CPI0: &format_string format_string: .asciz "hello, world!\n" We can emit: adr r0, .CPI0 bl printf bx lr .CPI0: .asciz "hello, world!\n" This can cause significant code size savings when many small strings are used in one function (4 bytes per string). llvm-svn: 281213	2016-09-12 13:42:16 +00:00
Duncan P. N. Exon Smith	1872096f1e	CodeGen: Give MachineBasicBlock::reverse_iterator a handle to the current MI Now that MachineBasicBlock::reverse_instr_iterator knows when it's at the end (since r281168 and r281170), implement MachineBasicBlock::reverse_iterator directly on top of an ilist::reverse_iterator by adding an IsReverse template parameter to MachineInstrBundleIterator. This replaces another hard-to-reason-about use of std::reverse_iterator on list iterators, matching the changes for ilist::reverse_iterator from r280032 (see the "out of scope" section at the end of that commit message). MachineBasicBlock::reverse_iterator now has a handle to the current node and has obvious invalidation semantics. r280032 has a more detailed explanation of how list-style reverse iterators (invalidated when the pointed-at node is deleted) are different from vector-style reverse iterators like std::reverse_iterator (invalidated on every operation). A great motivating example is this commit's changes to lib/CodeGen/DeadMachineInstructionElim.cpp. Note: If your out-of-tree backend deletes instructions while iterating on a MachineBasicBlock::reverse_iterator or converts between MachineBasicBlock::iterator and MachineBasicBlock::reverse_iterator, you'll need to update your code in similar ways to r280032. The following table might help: [Old] ==> [New] delete &RI, RE = end() delete &RI++ RI->erase(), RE = end() RI++->erase() reverse_iterator(I) std::prev(I).getReverse() reverse_iterator(I) ++I.getReverse() --reverse_iterator(I) I.getReverse() reverse_iterator(std::next(I)) I.getReverse() RI.base() std::prev(RI).getReverse() RI.base() ++RI.getReverse() --RI.base() RI.getReverse() std::next(RI).base() RI.getReverse() (For more details, have a look at r280032.) llvm-svn: 281172	2016-09-11 18:51:28 +00:00
Justin Lebar	adbf09e8cf	[CodeGen] Split out the notions of MI invariance and MI dereferenceability. Summary: An IR load can be invariant, dereferenceable, neither, or both. But currently, MI's notion of invariance is IR-invariant && IR-dereferenceable. This patch splits up the notions of invariance and dereferenceability at the MI level. It's NFC, so adds some probably-unnecessary "is-dereferenceable" checks, which we can remove later if desired. Reviewers: chandlerc, tstellarAMD Subscribers: jholewinski, arsenm, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D23371 llvm-svn: 281151	2016-09-11 01:38:58 +00:00
Saleem Abdulrasool	92e33a3ebc	ARM: move the builtins libcall CC setup Move the target specific setup into the target specific lowering setup. As pointed out by Anton, the initial change was moving this too high up the stack resulting in a violation of the layering (the target generic code path setup target specific bits). Sink this into the ARM specific setup. NFC. llvm-svn: 281088	2016-09-09 20:11:31 +00:00
James Molloy	57d9dfa9ac	[ARM] ADD with a negative offset can become SUB for free So model that directly in TTI::getIntImmCost(). llvm-svn: 281044	2016-09-09 13:35:36 +00:00
James Molloy	1454e90f86	[ARM] icmp %x, -C can be lowered to a simple ADDS or CMN Tell TargetTransformInfo about this so ConstantHoisting is informed. llvm-svn: 281043	2016-09-09 13:35:28 +00:00
James Molloy	4d86bed0bb	[Thumb] Select (CMPZ X, -C) -> (CMPZ (ADDS X, C), 0) The CMPZ #0 disappears during peepholing, leaving just a tADDi3, tADDi8 or t2ADDri. This avoids having to materialize the expensive negative constant in Thumb-1, and allows a shrinking from a 32-bit CMN to a 16-bit ADDS in Thumb-2. llvm-svn: 281040	2016-09-09 12:52:24 +00:00
James Molloy	0f41227b21	[Thumb1] Teach optimizeCompareInstr about thumb1 compares This avoids us doing a completely unneeded "cmp r0, #0" after a flag-setting instruction if we only care about the Z or C flags. Add LSL/LSR to the whitelist while we're here and add testing. This code could really do with a spring clean. llvm-svn: 281027	2016-09-09 09:51:06 +00:00
Renato Golin	049f387112	Revert "[XRay] ARM 32-bit no-Thumb support in LLVM" And associated commits, as they broke the Thumb bots. This reverts commit r280935. This reverts commit r280891. This reverts commit r280888. llvm-svn: 280967	2016-09-08 17:10:39 +00:00
Renato Golin	d257373887	[ARM XRay] Try to fix Thumb-only failure I mised the check that it had to support ARM to work. This commit tries to fix that, to make sure we don't emit ARM code in Thumb-only mode. llvm-svn: 280935	2016-09-08 13:45:10 +00:00
James Molloy	753c18f5c0	[Thumb1] AND with a constant operand can be converted into BIC So model the cost of materializing the constant operand C as the minimum of C and ~C. llvm-svn: 280929	2016-09-08 12:58:12 +00:00
James Molloy	7c7255e40b	[Thumb1] Fix cost calculation for complemented immediates Materializing something like "-3" can be done as 2 instructions: MOV r0, #3 MVN r0, r0 This has a cost of 2, not 3. It looks like we were already trying to detect this pattern in TII::getIntImmCost(), but were taking the complement of the zero-extended value instead of the sign-extended value which is unlikely to ever produce a number < 256. There were no tests failing after changing this... :/ llvm-svn: 280928	2016-09-08 12:58:04 +00:00
Pablo Barrio	2b7ed1339c	Revert "[ARM] Lower UDIV+UREM to UDIV+MLS (and the same for SREM)" This reverts commit r280808. It is possible that this change results in an infinite loop. This is causing timeouts in some tests on ARM, and a Chromebook bot is failing. llvm-svn: 280918	2016-09-08 10:05:57 +00:00
Dean Michael Berris	17d94e279e	[XRay] ARM 32-bit no-Thumb support in LLVM This is a port of XRay to ARM 32-bit, without Thumb support yet. The XRay instrumentation support is moving up to AsmPrinter. This is one of 3 commits to different repositories of XRay ARM port. The other 2 are: 1. https://reviews.llvm.org/D23932 (Clang test) 2. https://reviews.llvm.org/D23933 (compiler-rt) Differential Revision: https://reviews.llvm.org/D23931 llvm-svn: 280888	2016-09-08 00:19:04 +00:00
Pablo Barrio	fc752bb70a	[ARM] Lower UDIV+UREM to UDIV+MLS (and the same for SREM) Summary: This saves a library call to __aeabi_uidivmod. However, the processor must feature hardware division in order to benefit from the transformation. Reviewers: scott-0, jmolloy, compnerd, rengolin Subscribers: t.p.northover, compnerd, aemerson, rengolin, samparker, llvm-commits Differential Revision: https://reviews.llvm.org/D24133 llvm-svn: 280808	2016-09-07 12:49:15 +00:00
Saleem Abdulrasool	bfa25bd1ac	ARM: workaround bundled operation predication This is a Windows ARM specific issue. If the code path in the if conversion ends up using a relocation which will form a IMAGE_REL_ARM_MOV32T, we end up with a bundle to ensure that the mov.w/mov.t pair is not split up. This is normally fine, however, if the branch is also predicated, then we end up trying to predicate the bundle. For now, report a bundle as being unpredicatable. Although this is false, this would trigger a failure case previously anyways, so this is no worse. That is, there should not be any code which would previously have been if converted and predicated which would not be now. Under certain circumstances, it may be possible to "predicate the bundle". This would require scanning all bundle instructions, and ensure that the bundle contains only predicatable instructions, and converting the bundle into an IT block sequence. If the bundle is larger than the maximal IT block length (4 instructions), it would require materializing multiple IT blocks from the single bundle. llvm-svn: 280689	2016-09-06 04:00:12 +00:00
James Molloy	728cf85950	[Thumb1] Add relocations for fixups fixup_arm_thumb_{br,bcc} These need to be mapped through to R_ARM_THM_JUMP{11,8} respectively. Fixes PR30279. llvm-svn: 280651	2016-09-05 08:29:15 +00:00
Sjoerd Meijer	6c4140b6c0	Setting fp trapping mode and denormal type: this an improvement of r280246 and calculates compatibility of functions attributes in a better way. Differential Revision: https://reviews.llvm.org/D24070 llvm-svn: 280534	2016-09-02 19:51:34 +00:00
Sjoerd Meijer	46b5b88387	Clang patch r280064 introduced ways to set the FP exceptions and denormal types. This is the LLVM counterpart and it adds options that map onto FP exceptions and denormal build attributes allowing better fp math library selections. Differential Revision: https://reviews.llvm.org/D24070 llvm-svn: 280246	2016-08-31 14:17:38 +00:00
Pablo Barrio	b8ec630583	Handle empty functions with debug info in load/store opt pass Summary: In fuctions that contained debug info but were empty otherwise, the ARM load/store optimizer could abort. This was because function MergeReturnIntoLDM handled the special case where a Machine Basic BLock is empty by calling MBB.empty(). However, this returns false in presence of debug info, although the function should be considered empty in the eyes of the load/store optimizer. This has been fixed by handling the case where searching through the block finds only debug instructions. Reviewers: rengolin, dexonsmith, llvm-commits, jmolloy Subscribers: t.p.northover, aemerson, rengolin, samparker Differential Revision: https://reviews.llvm.org/D23847 llvm-svn: 279820	2016-08-26 13:00:39 +00:00
Tim Northover	3495647d0d	ARM: by default don't set the Thumb bit on MachO relocated values. Its existence is largely historical, apparently we tried to make ARM object files look maybe-almost-possibly runnable by putting our best guess at the actual value into relocated locations. Of course, the real linker then comes along and can completely change things. But it should only be there for word-sized and movw/movt relocations. It can't be encoded in branch relocations, and I've seen it mess up validity calculations twice in the last couple of weeks so the default is clearly problematic. llvm-svn: 279773	2016-08-25 20:41:30 +00:00
Matthias Braun	1eb473680a	MachineFunctionProperties/MIRParser: Rename AllVRegsAllocated->NoVRegs, compute it Rename AllVRegsAllocated to NoVRegs. This avoids the connotation of running after register and simply describes that no vregs are used in a machine function. With that we can simply compute the property and do not need to dump/parse it in .mir files. Differential Revision: http://reviews.llvm.org/D23850 llvm-svn: 279698	2016-08-25 01:27:13 +00:00
Tim Northover	9c3633f516	ARM: don't diagnose cbz/cbnz to Thumb functions. A branch-distance to a Thumb function shouldn't be forced to be odd for CBZ/CBNZ instructions because (assuming it's within range), it's going to be a valid, even offset. llvm-svn: 279665	2016-08-24 21:21:29 +00:00
Rafael Espindola	70c6a3976b	Use isTargetMachO instead of isTargetDarwin. llvm-svn: 279655	2016-08-24 19:02:29 +00:00
Oliver Stannard	9aa6f010a4	[ARM] Generate consistent frame records for Thumb2 There is not an official documented ABI for frame pointers in Thumb2, but we should try to emit something which is useful. We use r7 as the frame pointer for Thumb code, which currently means that if a function needs to save a high register (r8-r11), it will get pushed to the stack between the frame pointer (r7) and link register (r14). This means that while a stack unwinder can follow the chain of frame pointers up the stack, it cannot know the offset to lr, so does not know which functions correspond to the stack frames. To fix this, we need to push the callee-saved registers in two batches, with the first push saving the low registers, fp and lr, and the second push saving the high registers. This is already implemented, but previously only used for iOS. This patch turns it on for all Thumb2 targets when frame pointers are required by the ABI, and the frame pointer is r7 (Windows uses r11, so this isn't a problem there). If frame pointer elimination is enabled we still emit a single push/pop even if we need a frame pointer for other reasons, to avoid increasing code size. We must also ensure that lr is pushed to the stack when using a frame pointer, so that we end up with a complete frame record. Situations that could cause this were rare, because we already push lr in most situations so that we can return using the pop instruction. Differential Revision: https://reviews.llvm.org/D23516 llvm-svn: 279506	2016-08-23 09:19:22 +00:00
Duncan P. N. Exon Smith	8f44c98d04	ARM: Avoid dereferencing end() in ARMFrameLowering::emitEpilogue This fixes the crash from PR29072, where the MachineBasicBlock::iterator wasn't being properly checked against MachineBasicBlock::end() before iterating. This was another bug exposed by the new ilist::iterator::operator*() assertion from r279314. This testcase is poor quality. bugpoint couldn't reduce any further, and I haven't had time to dig into what's going on so I can't invent a better one. I didn't even get good CHECK lines in: this is just a crasher. I'm committing anyway since this is a real crash with an obvious fix, but I'll leave PR29072 open and ask an ARM maintainer to help improve the testcase. llvm-svn: 279391	2016-08-21 00:08:10 +00:00
Michael Kuperstein	2bc3d4d46c	[SelectionDAG] Rename fextend -> fpextend, fround -> fpround, frnd -> fround The names of the tablegen defs now match the names of the ISD nodes. This makes the world a slightly saner place, as previously "fround" matched ISD::FP_ROUND and not ISD::FROUND. Differential Revision: https://reviews.llvm.org/D23597 llvm-svn: 279129	2016-08-18 20:08:15 +00:00
Justin Bogner	cd1d5aaf2e	Replace a few more "fall through" comments with LLVM_FALLTHROUGH Follow up to r278902. I had missed "fall through", with a space. llvm-svn: 278970	2016-08-17 20:30:52 +00:00
Tim Northover	de3aea0412	GlobalISel: support irtranslation of icmp instructions. llvm-svn: 278969	2016-08-17 20:25:25 +00:00
Justin Bogner	b03fd12cef	Replace "fallthrough" comments with LLVM_FALLTHROUGH This is a mechanical change of comments in switches like fallthrough, fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead. llvm-svn: 278902	2016-08-17 05:10:15 +00:00
Zijiao Ma	53d55f45a1	Some places that could using TargetParser in LLVM. NFC. llvm-svn: 278888	2016-08-17 02:08:28 +00:00
Duncan P. N. Exon Smith	ec083b59ed	ARM: Avoid dereferencing end() in ARMFrameLowering::emitPrologue llvm::tryFoldSPUpdateIntoPushPop assumes its arguments are valid MachineInstrs. Update ARMFrameLowering::emitPrologue to respect that; when LastPush==end(), it can't possibly be a push instruction anyway. llvm-svn: 278880	2016-08-17 00:53:04 +00:00
Prakhar Bahuguna	a27c4a0e66	Correct the upper bound for a CBZ/CBNZ branch target. Summary: Fix for the upper bound check that was causing a build failure. Reviewers: olista01, rengolin, t.p.northover Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23501 llvm-svn: 278789	2016-08-16 10:41:56 +00:00
Prakhar Bahuguna	15ed7ec5aa	[Thumb] Validate branch target for CBZ/CBNZ instructions. Summary: The assembler currently does not check the branch target for CBZ/CBNZ instructions, which only permit branching forwards with a positive offset. This adds validation for the branch target to ensure negative PC-relative offsets are not encoded into the instruction, whether specified as a literal or as an assembler symbol. Reviewers: rengolin, t.p.northover Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D23312 llvm-svn: 278788	2016-08-16 10:41:52 +00:00
Matthias Braun	b948c52416	Revert "[Thumb] Validate branch target for CBZ/CBNZ instructions." This currently breaks the greendragon clang-stage1-configure-RA/ and brotli. It is probably just uncovering a pre-existing problem. Reverting temporarily to get the buildbots green again. A reduced testcase will follow shortly. This reverts commit r278659. llvm-svn: 278711	2016-08-15 18:50:13 +00:00
Prakhar Bahuguna	a305a435a6	[Thumb] Validate branch target for CBZ/CBNZ instructions. Summary: The assembler currently does not check the branch target for CBZ/CBNZ instructions, which only permit branching forwards with a positive offset. This adds validation for the branch target to ensure negative PC-relative offsets are not encoded into the instruction, whether specified as a literal or as an assembler symbol. Reviewers: rengolin, t.p.northover Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D23312 llvm-svn: 278659	2016-08-15 07:57:44 +00:00
David Majnemer	0da5afe717	Use the range variant of count_if instead of unpacking begin/end No functionality change is intended. llvm-svn: 278474	2016-08-12 04:32:29 +00:00
David Majnemer	42531260b3	Use the range variant of find/find_if instead of unpacking begin/end If the result of the find is only used to compare against end(), just use is_contained instead. No functionality change is intended. llvm-svn: 278469	2016-08-12 03:55:06 +00:00
David Majnemer	562e82945e	Use the range variant of find_if instead of unpacking begin/end No functionality change is intended. llvm-svn: 278443	2016-08-12 00:18:03 +00:00
David Majnemer	0d955d0bf5	Use the range variant of find instead of unpacking begin/end If the result of the find is only used to compare against end(), just use is_contained instead. No functionality change is intended. llvm-svn: 278433	2016-08-11 22:21:41 +00:00
David Majnemer	0a16c22846	Use range algorithms instead of unpacking begin/end No functionality change is intended. llvm-svn: 278417	2016-08-11 21:15:00 +00:00
Sam Parker	62965c96df	[ARM] Improve sxta{b\|h} and uxta{b\|h} tests Created a Thumb2 predicated pattern matcher that uses Thumb2 and HasT2ExtractPack and used it to redefine the patterns for sxta{b\|h} and uxta{b\|h}. Also used the similar patterns to fill in isel pattern gaps for the corresponding instructions in the ARM backend. The patch is mainly changes to tests since most of this functionality appears not to have been tested. Differential Revision: https://reviews.llvm.org/D23273 llvm-svn: 278207	2016-08-10 09:34:34 +00:00
Oliver Stannard	8331aaee8f	[ARM] Add support for embedded position-independent code This patch adds support for some new relocation models to the ARM backend: * Read-only position independence (ROPI): Code and read-only data is accessed PC-relative. The offsets between all code and RO data sections are known at static link time. This does not affect read-write data. * Read-write position independence (RWPI): Read-write data is accessed relative to the static base register (r9). The offsets between all writeable data sections are known at static link time. This does not affect read-only data. These two modes are independent (they specify how different objects should be addressed), so they can be used individually or together. They are otherwise the same as the "static" relocation model, and are not compatible with SysV-style PIC using a global offset table. These modes are normally used by bare-metal systems or systems with small real-time operating systems. They are designed to avoid the need for a dynamic linker, the only initialisation required is setting r9 to an appropriate value for RWPI code. I have only added support to SelectionDAG, not FastISel, because FastISel is currently disabled for bare-metal targets where these modes would be used. Differential Revision: https://reviews.llvm.org/D23195 llvm-svn: 278015	2016-08-08 15:28:31 +00:00
Benjamin Kramer	3f0c1e625d	[ARM] Don't copy MCInsts in loop. NFC. llvm-svn: 277924	2016-08-06 12:58:24 +00:00
Weiming Zhao	f68a6a720c	[ARM] Constant Materialize: imms with specific value can be encoded into mov.w Summary: Thumb2 supports encoding immediates with specific patterns into mov.w by splatting the low 8 bits into other bytes. I'm resubmitting this patch. The test case in the original commit r277610 does not specify triple, so builds with differnt default triple will have different output. This patch fixed trile as thumb-darwin-apple. Reviewers: john.brawn, jmolloy, bruno Subscribers: jmolloy, aemerson, rengolin, samparker, llvm-commits Differential Revision: https://reviews.llvm.org/D23090 llvm-svn: 277865	2016-08-05 20:58:29 +00:00
Bruno Cardoso Lopes	3fcf832cce	Revert "[ARM] Constant Materialize: imms with specific value can be encoded into mov.w" This reverts commit r277610 / d619aa8878c3dafcc0d29a46517f63ff3209fdd4. This make subtarget-no-movt.ll fail in http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/26892, llvm-svn: 277654	2016-08-03 21:26:21 +00:00
Weiming Zhao	57dc4cf0e1	[ARM] Constant Materialize: imms with specific value can be encoded into mov.w Summary: Thumb2 supports encoding immediates with specific patterns into mov.w by splatting the low 8 bits into other bytes. Reviewers: john.brawn, jmolloy Subscribers: jmolloy, aemerson, rengolin, samparker, llvm-commits Differential Revision: https://reviews.llvm.org/D23090 llvm-svn: 277610	2016-08-03 17:05:23 +00:00
Tim Northover	765777ce67	ARM: only form SMMLS when SUBE flags unused. In this particular example we wouldn't want the smmls anyway (the value is actually unused), but in general smmls does not provide the required flags register so if that SUBE result is used we can't replace it. llvm-svn: 277541	2016-08-02 23:12:36 +00:00
Sam Parker	18bc3a002e	[ARM] Improve smul* and smla* isel for Thumb2 Added (sra (shl x, 16), 16) to the sext_16_node PatLeaf for ARM to simplify some pattern matching. This has allowed several patterns for smul* and smla* to be removed as well as making it easier to add the matching for the corresponding instructions for Thumb2 targets. Also added two Pat classes that are predicated on Thumb2 with the hasDSP flag and UseMulOps flags. Updated the smul codegen test with the wider range of patterns plus the ThumbV6 and ThumbV6T2 targets. Differential Revision: https://reviews.llvm.org/D22908 llvm-svn: 277450	2016-08-02 12:44:27 +00:00
Bernard Ogden	849f737155	[ARM] Some saturation instructions not DSP-only Summary: Commit 276701 requires that targets have the DSP extensions to use certain saturating instructions. This requires some corrections. For ARM ISA the instructions in question are available in all v6* architectures. For Thumb2, the instructions in question are available from v6T2. SSAT and USAT are part of the base architecture while SSAT16 and USAT16 require the DSP extensions. Reviewers: rengolin Subscribers: aemerson, rengolin, samparker, llvm-commits Differential Revision: https://reviews.llvm.org/D23010 llvm-svn: 277439	2016-08-02 10:04:03 +00:00
Evandro Menezes	82e245a202	[AArch64] Add support for Samsung Exynos M2 (NFC). llvm-svn: 277364	2016-08-01 18:39:45 +00:00
Davide Italiano	3ebda7ed88	[ARMConstantIslandPass] Remove dead code. llvm-svn: 277283	2016-07-30 22:07:15 +00:00
Tim Northover	6b3bd61283	CodeGen: add new "intrinsic" MachineOperand kind. This will be used during GlobalISel, where we need a more robust and readable way to write tests than a simple immediate ID. llvm-svn: 277209	2016-07-29 20:32:59 +00:00
Sjoerd Meijer	a3de1262d7	Fix for commit rL277126 that broke a build. llvm-svn: 277129	2016-07-29 09:57:37 +00:00
Prakhar Bahuguna	d1233e857e	[Thumb] Emit Thumb move in both Thumb modes for struct_byval predicates Summary: The MOV/MOVT instructions being chosen for struct_byval predicates was conditional only on Thumb2, resulting in an ARM MOV/MOVT instruction being incorrectly emitted in Thumb1 mode. This is especially apparent with v8-m.base targets. This patch ensures that Thumb instructions are emitted in both Thumb modes. Reviewers: rengolin, t.p.northover Subscribers: llvm-commits, aemerson, rengolin Differential Revision: https://reviews.llvm.org/D22865 llvm-svn: 277128	2016-07-29 09:16:46 +00:00
Matthias Braun	941a705b7b	MachineFunction: Return reference for getFrameInfo(); NFC getFrameInfo() never returns nullptr so we should use a reference instead of a pointer. llvm-svn: 277017	2016-07-28 18:40:00 +00:00
Sjoerd Meijer	89217f8835	TargetInstrInfo: rename GetInstSizeInBytes to getInstSizeInBytes. NFC Differential Revision: https://reviews.llvm.org/D22925 llvm-svn: 276997	2016-07-28 16:32:22 +00:00
Renato Golin	80e58869f8	[ARM] Set a non-conflicting comment character for assembly in MSVC mode Currently, for ARMCOFFMCAsmInfoMicrosoft, no comment character is set, thus the idefault, '#', is used. The hash character doesn't work as comment character in ARM assembly, since '#' is used for immediate values. The comment character is set to ';', which is the comment character used by MS armasm.exe. (The microsoft armasm.exe uses a different directive syntax than what LLVM currently supports though, similar to ARM's armasm.) This allows inline assembly with immediate constants to be built (and brings the assembly output from clang -S closer to being possible to assemble). A test is added that verifies that ';' is correctly interpreted as comments in this mode, and verifies that assembling code that includes literal constants with a '#' works. Patch by Martin Storsjö. llvm-svn: 276859	2016-07-27 12:31:58 +00:00
Oliver Stannard	1c6e591457	[ARM] Improve error messages for .arch_extension directive - More informative message when extension name is not an identifier token. - Stop parsing directive if extension is unknown (avoid duplicate error messages). - Report unsupported extensions with a source location, rather than report_fatal_error. Differential Revision: https://reviews.llvm.org/D22806 llvm-svn: 276748	2016-07-26 14:24:43 +00:00
Oliver Stannard	2171828a49	[ARM] Implement -mimplicit-it assembler option This option, compatible with gas's -mimplicit-it, controls the generation/checking of implicit IT blocks in ARM/Thumb assembly. This option allows two behaviours that were not possible before: - When in ARM mode, emit a warning when assembling a conditional instruction that is not in an IT block. This is enabled with -mimplicit-it=never and -mimplicit-it=thumb. - When in Thumb mode, automatically generate IT instructions when an instruction with a condition code appears outside of an IT block. This is enabled with -mimplicit-it=thumb and -mimplicit-it=always. The default option is -mimplicit-it=arm, which matches the existing behaviour (allow conditional ARM instructions outside IT blocks without warning, and error if a conditional Thumb instruction is outside an IT block). The general strategy for generating IT blocks in Thumb mode is to keep a small list of instructions which should be in the IT block, and only emit them when we encounter something in the input which means we cannot continue the block. This could be caused by: - A non-predicable instruction - An instruction with a condition not compatible with the IT block - The IT block already contains 4 instructions - A branch-like instruction (including ALU instructions with the PC as the destination), which cannot appear in the middle of an IT block - A label (branching into an IT block is not legal) - A change of section, architecture, ISA, etc - The end of the assembly file. Some of these, such as change of section and end of file, are parsed outside of the ARM asm parser, so I've added a new virtual function to AsmParser to ensure any previously-parsed instructions have been emitted. The ARM implementation of this flushes the currently pending IT block. We now have to try instruction matching up to 3 times, because we cannot know if the current IT block is valid before matching, and instruction matching changes depending on the IT block state (due to the 16-bit ALU instructions, which set the flags iff not in an IT block). In the common case of not having an open implicit IT block and the instruction being matched not needing one, we still only have to run the matcher once. I've removed the ITState.FirstCond variable, because it does not store any information that isn't already represented by CurPosition. I've also updated the comment on CurPosition to accurately describe it's meaning (which this patch doesn't change). Differential Revision: https://reviews.llvm.org/D22760 llvm-svn: 276747	2016-07-26 14:19:47 +00:00
Renato Golin	32b165f561	[ARM] Saturation instructions are DSP-only The saturation instructions appeared in v6T2, with DSP extensions, but they were being accepted / generated on any, with the new introduction of the saturation detection in the back-end. This commit restricts the usage to DSP-enable only cores. Fixes PR28607. llvm-svn: 276701	2016-07-25 22:25:25 +00:00
Joel Jones	373d7d30dd	MC] Provide an MCTargetOptions to implementors of MCAsmBackendCtorTy, NFC Some targets, notably AArch64 for ILP32, have different relocation encodings based upon the ABI. This is an enabling change, so a future patch can use the ABIName from MCTargetOptions to chose which relocations to use. Tested using check-llvm. The corresponding change to clang is in: http://reviews.llvm.org/D16538 Patch by: Joel Jones Differential Revision: https://reviews.llvm.org/D16213 llvm-svn: 276654	2016-07-25 17:18:28 +00:00
Sam Parker	d5ca0a65b5	[ARM] Improve longMAC codegen test Added thumb targets and dataflow checks to the longMAC test. Differential Revision: https://reviews.llvm.org/D22684 llvm-svn: 276629	2016-07-25 10:11:00 +00:00
Sam Parker	73f5c785c1	[ARM] Small refactor of Thumb2 SMLA insts Follow up to r276624. Changes bits 22-20 to be parameters to instruction class. Differential Revision: https://reviews.llvm.org/D22562 llvm-svn: 276626	2016-07-25 09:29:24 +00:00
Sam Parker	68c71cd1e4	[ARM] Enable ISel of SMMLS for ARM and Thumb2 Use ISelDAGToDAG to recognise the SMMLS instruction pattern. Differential Revision: https://reviews.llvm.org/D22562 llvm-svn: 276624	2016-07-25 09:20:20 +00:00
Sjoerd Meijer	5c0ef83386	This refactoring of ARM machine block size computation creates two utility functions so that the size computation is available not only in ConstantIslands but in other passes as well. Differential Revision: https://reviews.llvm.org/D22640 llvm-svn: 276399	2016-07-22 08:39:12 +00:00
Diana Picus	f345d40ae2	[ARM] Skip inline asm memory operands in DAGToDAGISel Retry r275776 (no changes, we suspect the issue was with another commit). The current logic for handling inline asm operands in DAGToDAGISel interprets the operands by looking for constants, which should represent the flags describing the kind of operand we're dealing with (immediate, memory, register def etc). The operands representing actual data are skipped only if they are non-const, with the exception of immediate operands which are skipped explicitly when a flag describing an immediate is found. The oversight is that memory operands may be const too (e.g. for device drivers reading a fixed address), so we should explicitly skip the operand following a flag describing a memory operand. If we don't, we risk interpreting that constant as a flag, which is definitely not intended. Fixes PR26038 Differential Revision: https://reviews.llvm.org/D22103 llvm-svn: 276101	2016-07-20 09:48:24 +00:00
David Majnemer	5d26127752	Revert "Disable this-return argument forwarding on ARM/AArch64" Inference of the 'returned' attribute was fixed in r276008, lets try turning the backend support back on. This reverts commit r275677. llvm-svn: 276081	2016-07-20 04:13:01 +00:00
Tim Northover	554fbd05e8	ARM: move feature for Thumb2 pkhbt/pkhtb onto architectures. There's not much functional change, but it really is an architectural feature (on v6T2, v7A, v7R and v7EM) rather than something each CPU implements individually. The main functional change is the default behaviour you get when specifying only "-triple". llvm-svn: 276013	2016-07-19 19:49:13 +00:00
Sam Parker	6ca4bbb00d	[ARM] Refactor Thumb2 Mul and Mla instr descs Recommitting after r274347 was reverted. This patch introduces some classes to refactor the 3 and 4 register Thumb2 multiplication instruction descriptions, plus improved tests for some of those instructions. Differential Revision: https://reviews.llvm.org/D21929 llvm-svn: 275979	2016-07-19 14:44:05 +00:00
Peter Smith	cbcecca538	Add support for tlsldm assembler operator to ARM target The standard local dynamic model for TLS on ARM systems needs two relocations: - R_ARM_TLS_LDM32 (module idx) - R_ARM_TLS_LDO32 (offset of object from origin of module TLS block) In GNU style assembler we use symbol(tlsldm) and symbol(tlsldo) to produce these relocations. llvm-mc for ARM supports symbol(tlsldo) but does not support symbol(tlsldm). This patch wires up the existing symbol(tlsldm) to R_ARM_TLS_LDM32. TLS for ARM is defined in Addenda to, and Errata in, the ABI for the ARM Architecture Differential Revision: https://reviews.llvm.org/D22461 llvm-svn: 275977	2016-07-19 14:15:33 +00:00
Vitaly Buka	c93e10fcbb	Revert "[ARM] Skip inline asm memory operands in DAGToDAGISel" Breaks asan, see https://reviews.llvm.org/D22103 This reverts commit r275776. llvm-svn: 275890	2016-07-18 19:44:01 +00:00
Diana Picus	73ed44d328	[ARM] Skip inline asm memory operands in DAGToDAGISel The current logic for handling inline asm operands in DAGToDAGISel interprets the operands by looking for constants, which should represent the flags describing the kind of operand we're dealing with (immediate, memory, register def etc). The operands representing actual data are skipped only if they are non-const, with the exception of immediate operands which are skipped explicitly when a flag describing an immediate is found. The oversight is that memory operands may be const too (e.g. for device drivers reading a fixed address), so we should explicitly skip the operand following a flag describing a memory operand. If we don't, we risk interpreting that constant as a flag, which is definitely not intended. Fixes PR26038 Differential Revision: https://reviews.llvm.org/D22103 llvm-svn: 275776	2016-07-18 07:35:14 +00:00
Diana Picus	774d157a5d	[ARM] Honour ABI for rem under -O0 for EABI, GNUEABI, Android and Musl At higher optimization levels, we generate the libcall for DIVREM_Ix, which is fine: aeabi_{u\|i}divmod. At -O0 we generate the one for REM_Ix, which is the default {u}mod{q\|h\|s\|d}i3. This commit makes sure that we don't generate REM_Ix calls for ABIs that don't support them (i.e. where we need to use DIVREM_Ix instead). This is achieved by bailing out of FastISel, which can't handle non-double multi-reg returns, and letting the legalization infrastructure expand the REM_Ix calls. It also updates the divmod-eabi.ll test to run under -O0 as well, and adds some Windows checks to it to make sure we don't break things for it. Fixes PR27068 Differential Revision: https://reviews.llvm.org/D21926 llvm-svn: 275773	2016-07-18 06:48:25 +00:00
Hal Finkel	04b5330ccd	Disable this-return argument forwarding on ARM/AArch64 r275042 reverted function-attribute inference for the 'returned' attribute because the feature triggered self-hosting failures on ARM and AArch64. James Molloy determined that the this-return argument forwarding feature, which directly ties the returned input argument to the returned value, was the cause. It seems likely that this forwarding code contains, or triggers, a subtle bug. Disabling for now until we can track that down. llvm-svn: 275677	2016-07-16 07:07:29 +00:00
Matthias Braun	8f456fb18f	ARM: Initialize LoadStore passes in TargetMachine Initializing them in LLVMInitializeARMTarget() makes them visible early enough for "llc -run-pass usage". This required the pass to be renamed from "arm-load-store-opt" to "arm-ldst-opt", because there already exists an arm-load-store-opt cl::opt switch which would now clash with the passname getting added as a switch in opt. On the bright side the pass name now matches the DEBUG_TYPE name. Renamed "arm-prera-load-store-opt" to "arm-repra-ldst-opt" as well for consistency. llvm-svn: 275661	2016-07-16 02:24:10 +00:00
Justin Lebar	9c375817ac	[SelectionDAG] Get rid of bool parameters in SelectionDAG::getLoad, getStore, and friends. Summary: Instead, we take a single flags arg (a bitset). Also add a default 0 alignment, and change the order of arguments so the alignment comes before the flags. This greatly simplifies many callsites, and fixes a bug in AMDGPUISelLowering, wherein the order of the args to getLoad was inverted. It also greatly simplifies the process of adding another flag to getLoad. Reviewers: chandlerc, tstellarAMD Subscribers: jholewinski, arsenm, jyknight, dsanders, nemanjai, llvm-commits Differential Revision: http://reviews.llvm.org/D22249 llvm-svn: 275592	2016-07-15 18:27:10 +00:00
Justin Lebar	0af80cd6f0	[CodeGen] Take a MachineMemOperand::Flags in MachineFunction::getMachineMemOperand. Summary: Previously we took an unsigned. Hooray for type-safety. Reviewers: chandlerc Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D22282 llvm-svn: 275591	2016-07-15 18:26:59 +00:00
Jacques Pienaar	71c30a14b7	Rename AnalyzeBranch* to analyzeBranch*. Summary: NFC. Rename AnalyzeBranch/AnalyzeBranchPredicate to analyzeBranch/analyzeBranchPredicate to follow LLVM coding style and be consistent with TargetInstrInfo's analyzeCompare and analyzeSelect. Reviewers: tstellarAMD, mcrosier Subscribers: mcrosier, jholewinski, jfb, arsenm, dschuff, jyknight, dsanders, nemanjai Differential Revision: https://reviews.llvm.org/D22409 llvm-svn: 275564	2016-07-15 14:41:04 +00:00
James Molloy	8f16dffbb1	[ARM] Fix build after r275540 A rebase seemed so innocent before committing. Turns out someone changed a pointer to a reference in the mean time :( llvm-svn: 275541	2016-07-15 08:12:44 +00:00
James Molloy	b3326df56a	[Thumb-1] Select post-increment load and store where possible Thumb-1 doesn't have post-inc or pre-inc load or store instructions. However the LDM/STM instructions with writeback can function as post-inc load/store: ldm r0!, {r1} @ load from r0 into r1 and increment r0 by 4 Obviously, this only works if the post increment is 4. llvm-svn: 275540	2016-07-15 08:03:56 +00:00
James Molloy	2af08fa051	[ARM] Followup to r275537 addressing review comments Address Chad's comment in D22216 which I missed due to tunnel vision on the "LGTM" comment. llvm-svn: 275538	2016-07-15 07:57:35 +00:00
James Molloy	a454a11d60	[ARM] Prefer indirect calls in minsize mode ... When we emit several calls to the same function in the same basic block. An indirect call uses a "BLX r0" instruction which has a 16-bit encoding. If many calls are made to the same target, this can enable significant code size reductions. llvm-svn: 275537	2016-07-15 07:55:21 +00:00
Tim Northover	6003fb58df	ARM: fix vmov.i64 immediate validity check Typo meant we were only checking the low byte (repeatedly). llvm-svn: 275437	2016-07-14 17:04:34 +00:00
Sjoerd Meijer	38c2cd0c14	This implements a more optimal algorithm for selecting a base constant in constant hoisting. It not only takes into account the number of uses and the cost of expressions in which constants appear, but now also the resulting integer range of the offsets. Thus, the algorithm maximizes the number of uses within an integer range that will enable more efficient code generation. On ARM, for example, this will enable code size optimisations because less negative offsets will be created. Negative offsets/immediates are not supported by Thumb1 thus preventing more compact instruction encoding. Differential Revision: http://reviews.llvm.org/D21183 llvm-svn: 275382	2016-07-14 07:44:20 +00:00
Tim Northover	3e0361710a	ARM: validate immediate branch targets in AsmParser. Immediate branch targets aren't commonly used, but if they are we should make sure they can actually be encoded. This means they must be divisible by 2 when targeting Thumb mode, and by 4 when targeting ARM mode. Also do a little naming cleanup while I was changing everything around anyway. llvm-svn: 275116	2016-07-11 22:29:37 +00:00
Nirav Dave	8603062ee4	Fix branch relaxation in 16-bit mode. Thread through MCSubtargetInfo to relaxInstruction function allowing relaxation to generate jumps with 16-bit sized immediates in 16-bit mode. This fixes PR22097. Reviewers: dwmw2, tstellarAMD, craig.topper, jyknight Subscribers: jfb, arsenm, jyknight, llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D20830 llvm-svn: 275068	2016-07-11 14:23:53 +00:00
Benjamin Kramer	4d09892e9a	Give helper classes/functions internal linkage. NFC. llvm-svn: 275014	2016-07-10 11:28:51 +00:00
Duncan P. N. Exon Smith	29c524983b	ARM: Remove implicit iterator conversions, NFC Remove remaining implicit conversions from MachineInstrBundleIterator to MachineInstr* from the ARM backend. In most cases, I made them less attractive by preferring MachineInstr& or using a ranged-based for loop. Once all the backends are fixed I'll make the operator explicit so that this doesn't bitrot back. llvm-svn: 274920	2016-07-08 20:21:17 +00:00
Saleem Abdulrasool	eb059b0e0a	ARM: support high registers in __builtin_longjmp on WoA Windows on ARM uses a pure thumb-2 environment. This means that it can select a high register when doing a __builtin_longjmp. We would use a tLDRi which would truncate the register to a low register. Use a t2LDRi12 to get the full register file access. Tweak the code to just load into PC, as that is an interworking branch on all supported cores anyways. llvm-svn: 274815	2016-07-08 00:48:22 +00:00
Diana Picus	575f2bb287	[ARM] Do not test for CPUs, use SubtargetFeatures. Also remove 1 flag This is a follow-up for r273544. The end goal is to get rid of the isSwift / isCortexXY / isWhatever methods. This commit also removes a command line flag that isn't used in any of the tests: check-vmlx-hazards. It can be replaced easily with the mattr mechanism, since this is now a subtarget feature. There is still some work left regarding FeatureExpandMLx. In the past MLx expansion was enabled for subtargets with hasVFP2(), until r129775 [1] switched from that to isCortexA9, without too much justification. In spite of that, the code performing MLx expansion still contains calls to isSwift/isLikeA9, although the results of those are pretty clear given that we're only enabling it for the A9. We should try to enable it for all targets that have FeatureHasVMLxHazards, as it seems to be closely related to that behaviour, and if that is possible try to clean up the MLx expansion pass from all calls to isWhatever. This will require some performance testing, so it will be done in another patch. [1] http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20110418/119725.html Differential Revision: http://reviews.llvm.org/D21798 llvm-svn: 274742	2016-07-07 09:11:39 +00:00
Diana Picus	b772e409ba	[ARM] Do not test for CPUs, use SubtargetFeatures. Also remove 2 flags. This is a follow-up for r273544. The end goal is to get rid of the isSwift / isCortexXY / isWhatever methods. This commit also removes two command-line flags that weren't used in any of the tests: widen-vmovs and swift-partial-update-clearance. The former may be easily replaced with the mattr mechanism, but the latter may not (as it is a subtarget property, and not a proper feature). Differential Revision: http://reviews.llvm.org/D21797 llvm-svn: 274620	2016-07-06 11:22:11 +00:00
Diana Picus	4879b050cc	[ARM] Do not test for CPUs, use SubtargetFeatures (Part 3). NFCI This is a follow-up for r273544 and r273853. The end goal is to get rid of the isSwift / isCortexXY / isWhatever methods. This commit also marks them as obsolete. Differential Revision: http://reviews.llvm.org/D21796 llvm-svn: 274616	2016-07-06 09:22:23 +00:00
Saleem Abdulrasool	4d950ef892	ARM: fix `-mlong-calls` for WoA Not all code-paths set the relocation model to static for Windows. This currently breaks on Windows ARM with `-mlong-calls` when built with clang. Loosen the assertion to what it was previously. We would ideally ensure that all the configuration sets Windows to static relocation model. llvm-svn: 274570	2016-07-05 18:30:52 +00:00
James Molloy	ae5ff990ae	[Thumb] Reapply r272251 with a fix for PR28348 (mk 2) The important thing I was missing was ensuring newly added constants were kept in topological order. Repositioning the node is correct if the constant is newly added (so it has no topological ordering) but wrong if it already existed - positioning it next in the worklist would break the topological ordering. Original commit message: [Thumb] Select a BIC instead of AND if the immediate can be encoded more optimally negated If an immediate is only used in an AND node, it is possible that the immediate can be more optimally materialized when negated. If this is the case, we can negate the immediate and use a BIC instead; int i(int a) { return a & 0xfffffeec; } Used to produce: ldr r1, [CONSTPOOL] ands r0, r1 CONSTPOOL: 0xfffffeec And now produces: movs r1, #255 adds r1, #20 ; Less costly immediate generation bics r0, r1 llvm-svn: 274543	2016-07-05 12:37:13 +00:00
James Molloy	c3b4ed4a70	Revert "[Thumb] Reapply r272251 with a fix for PR28348" This reverts commit r274510 - it made green dragon unhappy. llvm-svn: 274512	2016-07-04 17:14:24 +00:00
James Molloy	9f019835ef	[Thumb] Reapply r272251 with a fix for PR28348 We were using DAG->getConstant instead of DAG->getTargetConstant. This meant that we could inadvertently increase the use count of a constant if stars aligned, which it did in this testcase. Increasing the use count of the constant could cause ISel to fall over (because DAGToDAG lowering assumed the constant had only one use!) Original commit message: [Thumb] Select a BIC instead of AND if the immediate can be encoded more optimally negated If an immediate is only used in an AND node, it is possible that the immediate can be more optimally materialized when negated. If this is the case, we can negate the immediate and use a BIC instead; int i(int a) { return a & 0xfffffeec; } Used to produce: ldr r1, [CONSTPOOL] ands r0, r1 CONSTPOOL: 0xfffffeec And now produces: movs r1, #255 adds r1, #20 ; Less costly immediate generation bics r0, r1 llvm-svn: 274510	2016-07-04 16:35:41 +00:00
Hans Wennborg	a3bb5f1594	Revert r274347 "[ARM] Refactor Thumb2 mul instruction descs" This caused PR28387: Assertion "#operands for dag node doesn't match .td file!" llvm-svn: 274367	2016-07-01 17:26:42 +00:00
Sam Parker	06692203ed	[ARM] Refactor Thumb2 mul instruction descs No functional changes. Just created wrapper classes around the 3 and 4 reg mult and mac instruction classes. Differential Revision: http://reviews.llvm.org/D21549 llvm-svn: 274347	2016-07-01 12:55:49 +00:00
Craig Topper	2bd8b4b180	[CodeGen,Target] Remove the version of DAG.getVectorShuffle that takes a pointer to a mask array. Convert all callers to use the ArrayRef version. No functional change intended. For the most part this simplifies all callers. There were two places in X86 that needed an explicit makeArrayRef to shorten a statically sized array. llvm-svn: 274337	2016-07-01 06:54:47 +00:00
Duncan P. N. Exon Smith	d26fdc83c9	CodeGen: Use MachineInstr& in LiveVariables API, NFC Change all the methods in LiveVariables that expect non-null MachineInstr* to take MachineInstr& and update the call sites. This clarifies the API, and designs away a class of iterator to pointer implicit conversions. llvm-svn: 274319	2016-07-01 01:51:32 +00:00
Duncan P. N. Exon Smith	e4f5e4f4d1	CodeGen: Use MachineInstr& in TargetLowering, NFC This is a mechanical change to make TargetLowering API take MachineInstr& (instead of MachineInstr), since the argument is expected to be a valid MachineInstr. In one case, changed a parameter from MachineInstr to MachineBasicBlock::iterator, since it was used as an insertion point. As a side effect, this removes a bunch of MachineInstr* to MachineBasicBlock::iterator implicit conversions, a necessary step toward fixing PR26753. llvm-svn: 274287	2016-06-30 22:52:52 +00:00
Rafael Espindola	d86e8bb0ed	Delete MCCodeGenInfo. MC doesn't really care about CodeGen stuff, so this was just complicating target initialization. llvm-svn: 274258	2016-06-30 18:25:11 +00:00
Rafael Espindola	db6bd02185	Delete unused includes. NFC. llvm-svn: 274225	2016-06-30 12:19:16 +00:00
Craig Topper	bc56e3ba53	Use ShuffleVectorSDNode::isSplat member method instead of static method isSplatMask where the mask came directly from getMask() on a shuffle node. llvm-svn: 274208	2016-06-30 04:38:51 +00:00
Duncan P. N. Exon Smith	9cfc75c214	CodeGen: Use MachineInstr& in TargetInstrInfo, NFC This is mostly a mechanical change to make TargetInstrInfo API take MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator) when the argument is expected to be a valid MachineInstr. This is a general API improvement. Although it would be possible to do this one function at a time, that would demand a quadratic amount of churn since many of these functions call each other. Instead I've done everything as a block and just updated what was necessary. This is mostly mechanical fixes: adding and removing `` and `&` operators. The only non-mechanical change is to split ARMBaseInstrInfo::getOperandLatencyImpl out from ARMBaseInstrInfo::getOperandLatency. Previously, the latter took a `MachineInstr` which it updated to the instruction bundle leader; now, the latter calls the former either with the same `MachineInstr&` or the bundle leader. As a side effect, this removes a bunch of MachineInstr* to MachineBasicBlock::iterator implicit conversions, a necessary step toward fixing PR26753. Note: I updated WebAssembly, Lanai, and AVR (despite being off-by-default) since it turned out to be easy. I couldn't run tests for AVR since llc doesn't link with it turned on. llvm-svn: 274189	2016-06-30 00:01:54 +00:00
Nico Weber	12fdf60b75	Revert r272251, it caused PR28348. llvm-svn: 274141	2016-06-29 17:33:41 +00:00
Weiming Zhao	5410edddb1	[ARM] Fix 28282: cost computation for constant hoisting Summary: This fixes bug: https://llvm.org/bugs/show_bug.cgi?id=28282 Currently the cost model of constant hoisting checks the bit width of the data type of the constants. However, the actual immediate value is small enough and not need to be hoisted. This patch checks for the actual bit width needed for the constant. Reviewers: t.p.northover, rengolin Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D21668 llvm-svn: 274073	2016-06-28 22:30:45 +00:00
Rafael Espindola	5ac8f5c379	Don't pass a Reloc::Model to GVIsIndirectSymbol. It already has access to it. While at it, rename it to isGVIndirectSymbol. llvm-svn: 274023	2016-06-28 15:38:13 +00:00
Rafael Espindola	82f4631c88	Don't pass Reloc::Model to places that already have it. NFC. llvm-svn: 274022	2016-06-28 15:18:26 +00:00
Nick Lewycky	9980075133	NFC. Fix popular typo in comment 'deferencing' --> 'dereferencing'. Bonus changes, * placement in X86ISelLowering and 'exerce' -> 'exercise' in test. llvm-svn: 273984	2016-06-28 01:45:05 +00:00
Rafael Espindola	3beef8d6db	Move shouldAssumeDSOLocal to Target. Should fix the shared library build. llvm-svn: 273958	2016-06-27 23:15:57 +00:00
Rafael Espindola	0db11db560	Move isPositionIndependent up to AsmPrinter. Use it in ppc too. llvm-svn: 273877	2016-06-27 14:19:45 +00:00
Diana Picus	eb1068a5fe	[ARM] Use member initializers in ARMSubtarget. NFCI Same as r273556, but with C++11 member initializers. Change suggested by Matthias Braun (see http://reviews.llvm.org/D21432). llvm-svn: 273873	2016-06-27 13:06:10 +00:00
Diana Picus	92423ce194	[ARM] Do not test for CPUs, use SubtargetFeatures (Part 2). NFCI This is a follow-up for r273544. The end goal is to get rid of the isSwift / isCortexXY / isWhatever methods. Since the ARM backend seems to have quite a lot of calls to these methods, I intend to submit 5-6 subtarget features at a time, instead of one big lump. Differential Revision: http://reviews.llvm.org/D21685 llvm-svn: 273853	2016-06-27 09:08:23 +00:00
Rafael Espindola	e715172791	Use isPositionIndependent. NFC. llvm-svn: 273829	2016-06-26 22:32:53 +00:00
Rafael Espindola	ae0d866f56	Refactor a duplicated predicate. NFC. llvm-svn: 273826	2016-06-26 22:13:55 +00:00
Rafael Espindola	a895a0cd01	Add support for musl-libc on ARM Linux. Patch by Lei Zhang! llvm-svn: 273726	2016-06-24 21:14:33 +00:00
Ahmed Bougacha	241e74cbc2	[ARM] Remove dead SDNodes. NFC. The opcodes are used, but only by DAG->DAG. llvm-svn: 273717	2016-06-24 20:38:00 +00:00
Ahmed Bougacha	f0b46ee0aa	[ARM] Use aapcs_vfp for ___truncdfhf2 on v7k. r215348 overrode the f16 libcalls to be soft-float, but v7k uses the default (hard-float) calling convention. llvm-svn: 273631	2016-06-24 00:08:01 +00:00
Pablo Barrio	7a64346533	[ARM] Lower (select_cc k k (select_cc ~k ~k x)) into (SSAT l_k x) Summary: SSAT saturates an integer, making sure that its value lies within an interval [-k, k]. Since the constant is given to SSAT as the number of bytes set to one, k + 1 must be a power of 2, otherwise the optimization is not possible. Also, the select_cc must use < and > respectively so that they define an interval. Reviewers: mcrosier, jmolloy, rengolin Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D21372 llvm-svn: 273581	2016-06-23 16:53:49 +00:00
Diana Picus	eb3dd14b95	[ARM] Use member initializers in ARMSubtarget. NFCI Move most of the initializations in ARMSubtarget::initializeEnvironment to member initializers. Change suggested by Matthias Braun (see http://reviews.llvm.org/D21432). llvm-svn: 273556	2016-06-23 12:04:33 +00:00
Diana Picus	c5baa43f53	[ARM] Do not test for CPUs, use SubtargetFeatures (Part 1). NFCI This is a cleanup commit similar to r271555, but for ARM. The end goal is to get rid of the isSwift / isCortexXY / isWhatever methods. Since the ARM backend seems to have quite a lot of calls to these methods, I intend to submit 5-6 subtarget features at a time, instead of one big lump. Differential Revision: http://reviews.llvm.org/D21432 llvm-svn: 273544	2016-06-23 07:47:35 +00:00
Krzysztof Parzyszek	e116d500a7	[SDAG] Remove FixedArgs parameter from CallLoweringInfo::setCallee The setCallee function will set the number of fixed arguments based on the size of the argument list. The FixedArgs parameter was often explicitly set to 0, leading to a lack of consistent value for non- vararg functions. Differential Revision: http://reviews.llvm.org/D20376 llvm-svn: 273403	2016-06-22 12:54:25 +00:00
Rafael Espindola	7b4ef068c6	Delete more dead code. Found by gcc 6. llvm-svn: 273322	2016-06-21 21:51:41 +00:00
Etienne Bergeron	715ec09dcf	fix indentation llvm-svn: 273274	2016-06-21 15:21:04 +00:00
Rafael Espindola	3d6a130fee	Define a isPositionIndependent helper for ARMAsmPrinter. NFC. llvm-svn: 273261	2016-06-21 14:21:53 +00:00
David Majnemer	e61e4bfd87	Replace silly uses of 'signed' with 'int' llvm-svn: 273244	2016-06-21 05:10:24 +00:00
Rafael Espindola	524bcbf1f3	Add a isPositionIndependent helper to ARMFastISel. NFC. llvm-svn: 273187	2016-06-20 19:00:05 +00:00
Rafael Espindola	959e9c8d01	Use shouldAssumeDSOLocal. With this ARM fast isel knows that PIE variable are not preemptable. llvm-svn: 273169	2016-06-20 17:45:33 +00:00
Rafael Espindola	9935766458	Simplify. NFC. llvm-svn: 273167	2016-06-20 17:00:13 +00:00
Sam Parker	d616cf07b2	[ARM] Enable isel of UMAAL TargetLowering and DAGToDAG are used to combine ADDC, ADDE and UMLAL dags into UMAAL. Selection is split into the two phases because it is easier to match the two patterns at those different times. Differential Revision: http://http://reviews.llvm.org/D21461 llvm-svn: 273165	2016-06-20 16:47:09 +00:00
Rafael Espindola	0f89833c31	Add a isPositionIndependent predicate. Reduces a bit of code duplication and clarify where we are interested just on position independence and no the location of the symbol. llvm-svn: 273164	2016-06-20 16:43:17 +00:00
Aaron Ballman	86100fc8be	Removing an unused switch statement that has only a default label. This happens to also eliminate an instance of switchception. NFC intended. llvm-svn: 273161	2016-06-20 15:37:15 +00:00
Tim Northover	28a9e7f4ba	ARM: take account of possible bundle when erasing an instruction. Fortunately this appears to be the only ARM-specific pass that runs while bundles might be in play, so no other cases need modifying. llvm-svn: 273029	2016-06-17 18:40:46 +00:00
Nirav Dave	fd91041ce1	Refactor and cleanup Assembly Parsing / Lexing Recommiting after fixing non-atomic insert to front of SmallVector in MCAsmLexer.h Add explicit Comment Token in Assembly Lexing for future support for outputting explicit comments from inline assembly. As part of this, CPPHash Directives are now explicitly distinguished from Hash line comments in Lexer. Line comments are recorded as EndOfStatement tokens, not Comment tokens to simplify compatibility with current TargetParsers. This slightly complicates comment output. This remove all lexing tasks out of the parser, does minor cleanup to remove extraneous newlines Asm Output, and some improvements white space handling. Reviewers: rtrieu, dwmw2, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20009 llvm-svn: 273007	2016-06-17 16:06:17 +00:00
Benjamin Kramer	f690da4864	[ARM] Strength reduce vectors to arrays. No functionality change intended. llvm-svn: 273001	2016-06-17 14:14:29 +00:00
Ranjeet Singh	39d2d097d6	[ARM] Add support for mrrc/mrrc2 intrinsics. Reapplying patch as it was reverted when it was first committed because of an assertion failure when the mrrc2 intrinsic was called in ARM mode. The failure was happening because the instruction was being built in ARMISelDAGToDAG.cpp and the tablegen description for mrrc2 instruction doesn't allow you to use a predicate. The ARM architecture manuals do say that mrrc2 in ARM mode can be predicated with AL in assembly but this has no effect on the encoding of the instruction as the top 4 bits will always be 1111 not 1110 which is the encoding for the condition AL. Differential Revision: http://reviews.llvm.org/D21408 llvm-svn: 272982	2016-06-17 00:52:41 +00:00
Nirav Dave	280ecf6ff0	Revert "Refactor and cleanup Assembly Parsing / Lexing" Reverting for unexpected crashes on various platforms. This reverts commit r272953. llvm-svn: 272957	2016-06-16 21:19:23 +00:00
Nirav Dave	c19c3260df	Refactor and cleanup Assembly Parsing / Lexing Add explicit Comment Token in Assembly Lexing for future support for outputting explicit comments from inline assembly. As part of this, CPPHash Directives are now explicitly distinguished from Hash line comments in Lexer. Line comments are recorded as EndOfStatement tokens, not Comment tokens to simplify compatibility with current TargetParsers. This slightly complicates comment output. This remove all lexing tasks out of the parser, does minor cleanup to remove extraneous newlines Asm Output, and some improvements white space handling. Reviewers: rtrieu, dwmw2, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20009 llvm-svn: 272953	2016-06-16 20:34:22 +00:00
Rafael Espindola	afade35003	Don't print (PLT) on arm. The R_ARM_PLT32 relocation is deprecated and is not produced by MC. This means that the code being deleted is dead from the .o point of view and was making the .s more confusing. llvm-svn: 272909	2016-06-16 16:09:53 +00:00
Rafael Espindola	9ba9c5bde5	Refactor duplicated code. NFC. llvm-svn: 272905	2016-06-16 15:44:06 +00:00
Rafael Espindola	c1d739f253	Refactor duplicated code. NFC. llvm-svn: 272904	2016-06-16 15:40:24 +00:00
Rafael Espindola	c24f0eeb8d	Refactor duplicated code. NFC. llvm-svn: 272903	2016-06-16 15:31:06 +00:00
Rafael Espindola	3888bdb022	Refactor duplicated code. NFC. llvm-svn: 272901	2016-06-16 15:22:01 +00:00
Ranjeet Singh	0db7be886e	Reverting r272778 because there's an assertion failure when running the test CodeGen/ARM/intrinsics-coprocessor.ll llvm-svn: 272791	2016-06-15 14:23:29 +00:00
Ranjeet Singh	351364fe76	[ARM] Add support for mrrc/mrrc2 intrinsics. Differential Revision: http://reviews.llvm.org/D21178 llvm-svn: 272778	2016-06-15 11:32:24 +00:00
James Molloy	65b6be1d3a	[Thumb] Fix off-by-one error in r272007 We can only generate immediates up to #510 with a MOV+ADD, not #511, because there's no such instruction as add #256. Found by Oliver Stannard and csmith! llvm-svn: 272665	2016-06-14 13:33:07 +00:00
Ranjeet Singh	933e1aa39f	[ARM] Reverting r272544 because clang patch needs to go in as soon as llvm patch has gone in because tests will start breaking in Clang. llvm-svn: 272546	2016-06-13 10:58:24 +00:00
Ranjeet Singh	8feacb330d	[ARM] Add mrrc/mrrc2 co-processor intrinsics MRRC/MRRC2 instruction writes to two registers. The intrinsic definition returns a single uint64_t to represent the write, this is a compact way of representing a write to two 32 bit registers, the alternative might have been two return a struct of 2 uint32_t's but this isn't as nice. Differential Revision: llvm-svn: 272544	2016-06-13 10:43:50 +00:00
Benjamin Kramer	d3f4c05aea	Move instances of std::function. Or replace with llvm::function_ref if it's never stored. NFC intended. llvm-svn: 272513	2016-06-12 16:13:55 +00:00
Benjamin Kramer	bdc4956bac	Pass DebugLoc and SDLoc by const ref. This used to be free, copying and moving DebugLocs became expensive after the metadata rewrite. Passing by reference eliminates a ton of track/untrack operations. No functionality change intended. llvm-svn: 272512	2016-06-12 15:39:02 +00:00
Roger Ferrer Ibanez	1efc17e1ec	test commit: remove trailing whitespaces in README.txt llvm-svn: 272380	2016-06-10 08:19:58 +00:00
James Molloy	a7dbf987b5	[Thumb] A branch is not part of an IT block ReplaceTailWithBranchTo assumed that if an instruction is predicated, it must be part of an IT block. This is not correct for conditional branches. No testcase as this was triggered by the reverted patch r272017 - test coverage will occur when that patch is re-reverted and there is no known way to trigger this in the meantime. llvm-svn: 272258	2016-06-09 11:51:29 +00:00
James Molloy	feb9f4243b	[Thumb] Select a BIC instead of AND if the immediate can be encoded more optimally negated If an immediate is only used in an AND node, it is possible that the immediate can be more optimally materialized when negated. If this is the case, we can negate the immediate and use a BIC instead; int i(int a) { return a & 0xfffffeec; } Used to produce: ldr r1, [CONSTPOOL] ands r0, r1 CONSTPOOL: 0xfffffeec And now produces: movs r1, #255 adds r1, #20 ; Less costly immediate generation bics r0, r1 llvm-svn: 272251	2016-06-09 07:39:08 +00:00
Oliver Stannard	b3378e2f3c	[ARM] MSR instructions implicitly set CPSR The MSR instructions can write to the CPSR, but we did not model this fact, so we could emit them in the middle of IT blocks, changing the condition flags for later instructions in the block. The tests use two calls to llvm.write_register.i32 because it is valid to use these instructions at the end of an IT block, which if conversion does do in some cases. With two calls, the first clobbers the flags, so a branch has to be used to make the second one conditional. Differential Revision: http://reviews.llvm.org/D21139 llvm-svn: 272154	2016-06-08 15:26:34 +00:00
Diana Picus	0781d10ac4	[ARM] Remove redundant check. NFC isSwift is tested earlier and known to be false when we reach this code. llvm-svn: 272127	2016-06-08 10:29:02 +00:00
Benjamin Kramer	46e38f3678	Avoid copies of std::strings and APInt/APFloats where we only read from it As suggested by clang-tidy's performance-unnecessary-copy-initialization. This can easily hit lifetime issues, so I audited every change and ran the tests under asan, which came back clean. llvm-svn: 272126	2016-06-08 10:01:20 +00:00
Oliver Stannard	8de5f24d10	[ARM] Accept conditional versions of BXNS and BLXNS These instructions end in "S" but are not flag-setting, so they need including in the list of special cases in the assembly parser. Differential Revision: http://reviews.llvm.org/D21077 llvm-svn: 272015	2016-06-07 14:58:48 +00:00
James Molloy	b101383fb5	[Thumb-1] Add optimized constant materialization for integers [256..512) We can materialize these integers using a MOV; ADDi8 pair. llvm-svn: 272007	2016-06-07 13:10:14 +00:00
James Molloy	53298a1808	[ARM] Shrink post-indexed LDR and STR to LDM/STM A Thumb-2 post-indexed LDR instruction such as: ldr.w r0, [r1], #4 Can be rewritten as: ldm.n r1!, {r0} LDMs can be more expensive than LDRs on some cores, so this has been enabled only in minsize mode. llvm-svn: 272002	2016-06-07 12:13:34 +00:00
James Molloy	75afc95112	[ARM] Transform LDMs into writeback form to save code size If we have an LDM that uses only low registers and doesn't write to its base register: ldm.w r0, {r1, r2, r3} And that base register is dead after the LDM, then we can convert it to writeback form and use a narrow encoding: ldm.n r0!, {r1, r2, r3} Obviously, this introduces a new register write and so can cause WAW hazards, so I've enabled it only in minsize mode. This is a code size trick that ARM Compiler 5 ("armcc") does that we don't. llvm-svn: 272000	2016-06-07 11:47:24 +00:00
Peter Smith	353a2286e2	[ARM] Incorrect relocation type for Thumb2 B<cond>.w The Thumb2 conditional branch B<cond>.W has a different encoding (T3) to the unconditional branch B.W (T4) as it needs to record <cond>. As the encoding is different the B<cond>.W is given a different relocation type. ELF for the ARM Architecture 4.6.1.6 (Table-13) states that R_ARM_THM_JUMP19 should be used for B<cond>.W. At present the MC layer is using the R_ARM_THM_JUMP24 from B.W. This change makes B<cond>.W use R_ARM_THM_JUMP19 and alters the existing test that checks for R_ARM_THM_JUMP24 to expect R_ARM_THM_JUMP19. llvm-svn: 271997	2016-06-07 10:34:33 +00:00
Saleem Abdulrasool	532dcbc2c5	ARM: correct TLS access on WoA TLS access requires an offset from the TLS index. The index itself is the section-relative distance of the symbol. For ARM, the relevant relocation (IMAGE_REL_ARM_SECREL) is applied as a constant. This means that the value may not be an immediate and must be lowered into a constant pool. This offset will not be base relocated. We were previously emitting the actual address of the symbol which would be base relocated and would therefore be the vaue offset by the ImageBase + TLS Offset. llvm-svn: 271974	2016-06-07 03:15:07 +00:00
Saleem Abdulrasool	ce4eee4951	ARM: clang-format a couple of switches, add comments clang-format a couple of switches in preparation for a future change. Add some enumeration comments llvm-svn: 271973	2016-06-07 03:15:01 +00:00
Saleem Abdulrasool	0cd0cb992a	ARM: normalise space in the patterns Just adjust the whitespace for the selection patterns. NFC. llvm-svn: 271972	2016-06-07 03:14:57 +00:00
Sjoerd Meijer	9bc93f6298	Code size optimisation: do not inline memcpy if this expansion results in more instructions than the libary call. Differential Revision: http://reviews.llvm.org/D20958 llvm-svn: 271678	2016-06-03 15:38:55 +00:00
Sjoerd Meijer	d906bf1369	RAS extensions are part of ARMv8.2-A. This change enables them by introducing a new instruction to ARM and AArch64 targets and several system registers. Patch by: Roger Ferrer Ibanez and Oliver Stannard Differential Revision: http://reviews.llvm.org/D20282 llvm-svn: 271670	2016-06-03 14:03:27 +00:00
Sjoerd Meijer	9da258d8e5	ARM target does not use printAliasInstr machinery which forces having special checks in ArmInstPrinter::printInstruction. This patch addresses this issue. Not all special checks could be removed: either they involve elaborated conditions under which the alias is emitted (e.g. ldm/stm on sp may be pop/push but only if the number of registers is >= 2) or the number of registers is multivalued (like happens again with ldm/stm) and they do not match the InstAlias pattern which assumes single-valued operands in the pattern. Patch by: Roger Ferrer Ibanez Differential Revision: http://reviews.llvm.org/D20237 llvm-svn: 271667	2016-06-03 13:19:43 +00:00
Sjoerd Meijer	0b7bb16e5b	This adds support for Cortex-A73 as an available target. Differential Revision: http://reviews.llvm.org/D20865 llvm-svn: 271508	2016-06-02 10:48:52 +00:00
Rafael Espindola	41410cc812	Avoid a load for local functions. llvm-svn: 271437	2016-06-01 21:57:11 +00:00
Oliver Stannard	92ca83cccd	[ARM] Add additional matching for UBFX instructions This adds an additional matcher to select UBFX(..) from SRL(AND(..)) in ARMISelDAGToDAG to help with code size. Patch by David Green. Differential Revision: http://reviews.llvm.org/D20667 llvm-svn: 271384	2016-06-01 12:01:01 +00:00
Matthias Braun	fe725c9241	ARM: Do not attempt to modify register class of physregs. Physregs have no associated register class, do not attempt to modify it in Thumb2InstrInfo::storeRegToStackSlot()/loadFromStackSlot(). llvm-svn: 271339	2016-05-31 21:39:12 +00:00
Rafael Espindola	7ad97b2fe4	Add a use of shouldAssumeDSOLocal to ARM. Now this code path knows about position independent executables. llvm-svn: 271290	2016-05-31 15:31:55 +00:00
Ranjeet Singh	16c24f4d6e	[ARM] Add backend support for load/store intrinsics. Added support to map intrinsics __builtin_arm_{ldc,ldcl,ldc2,ldc2l,stc,stcl,stc2,stc2l} to their ARM instructions. Differential Revision: http://reviews.llvm.org/D20564 llvm-svn: 271271	2016-05-31 12:39:30 +00:00
Rafael Espindola	fe796dca90	Fix default reloc model on ARM. llvm-svn: 271111	2016-05-28 10:41:15 +00:00
Renato Golin	9be88629d5	Revert "Revert "Map DynamicNoPIC to Static on non-darwin."" This reverts commit r271096, as reverting it broke even more buildbots! But that also means I'll break on ARM again... :( llvm-svn: 271099	2016-05-28 04:47:13 +00:00
Renato Golin	4f22c51b09	Revert "Map DynamicNoPIC to Static on non-darwin." This reverts commit r271052, as it broke some ARM buildbots. llvm-svn: 271096	2016-05-28 04:24:26 +00:00
Rafael Espindola	eece113105	Start using shouldAssumeDSOLocal on ARM. Given where this is used it should be a nop. llvm-svn: 271066	2016-05-27 22:41:51 +00:00
Rafael Espindola	f9bda6805b	Map DynamicNoPIC to Static on non-darwin. DynamicNoPIC was only every used on darwin. This maps it to static on ELF. It matches what is done on X86. llvm-svn: 271052	2016-05-27 21:44:18 +00:00
Ahmed Bougacha	655c2deaf6	[ARM] Remove tBLXr Pat made redundant by r269101. NFCI. llvm-svn: 271023	2016-05-27 17:58:03 +00:00
Benjamin Kramer	f6f815bf39	Use StringRef::startswith instead of find(...) == 0. It's faster and easier to read. llvm-svn: 271018	2016-05-27 16:54:57 +00:00
Benjamin Kramer	82de7d323d	Apply clang-tidy's misc-move-constructor-init throughout LLVM. No functionality change intended, maybe a tiny performance improvement. llvm-svn: 270997	2016-05-27 14:27:24 +00:00
Benjamin Kramer	4fed928f53	Avoid some copies by using const references. clang-tidy's performance-unnecessary-copy-initialization with some manual fixes. No functional changes intended. llvm-svn: 270988	2016-05-27 12:30:51 +00:00
Benjamin Kramer	3e9a5d3468	Apply clang-tidy's misc-static-assert where it makes sense. Also fold conditions into assert(0) where it makes sense. No functional change intended. llvm-svn: 270982	2016-05-27 11:36:04 +00:00
Ranjeet Singh	c520e93d9a	Test commit. llvm-svn: 270056	2016-05-19 12:44:39 +00:00
Rafael Espindola	8c34dd8257	Delete Reloc::Default. Having an enum member named Default is quite confusing: Is it distinct from the others? This patch removes that member and instead uses Optional<Reloc> in places where we have a user input that still hasn't been maped to the default value, which is now clear has no be one of the remaining 3 options. llvm-svn: 269988	2016-05-18 22:04:49 +00:00
Rafael Espindola	38af4d6347	Trivial cleanups. This just clang formats and cleans comments in an area I am about to post a patch for review. llvm-svn: 269946	2016-05-18 16:00:24 +00:00
Rafael Espindola	712f957cae	Simplify handling of hidden stub. Since r207518 they are printed exactly like non-hidden stubs on x86 and since r207517 on ARM. This means we can use a single set for all stubs in those platforms. llvm-svn: 269776	2016-05-17 16:01:32 +00:00
Renato Golin	57bfb69aa4	[ARM] ARM mov InstAlias for MOVW lacks HasV6T2 The movw instruction is only available in ARM state for V6T2 and above. The MOVi16 instruction has requirement HasV6T2 but the InstAlias for mov rd, imm where the operand is imm0_65535_expr:$imm does not. This means that movw can incorrectly be used in ARMv4 and ARMv5 by writing mov rd, 0x1234. The simple fix is to the requirement HasV6T2 to the InstAlias. Tests added to not-armv4.s. Patch by Peter Smith. llvm-svn: 269761	2016-05-17 13:05:28 +00:00
Saleem Abdulrasool	8df2f49889	ARM: support export directives for Windows It seems that cl will emit the export directives for Windows ARM targets. The fact that it did this had originally been missed and this functionality was never implemented. This makes it possible to rely solely on the source code for indicating what the exported interfaces are and brings us more compatibility with cl. llvm-svn: 269574	2016-05-14 18:58:34 +00:00
Tim Northover	f8b0a7af52	ARM: use callee-saved list in the order they're actually saved. When setting the frame pointer, the offset from SP is calculated based on the stack slot it gets allocated, but this slot is in turn based on the order of the CSR list so that list should match the order we actually save the registers in. Mostly it did, but in the edge-case of MachO AAPCS targets it was wrong. llvm-svn: 269459	2016-05-13 19:16:14 +00:00
Renato Golin	608cb5def6	[ARM] Support and tests for transform of LDR rt, = to MOV This change implements the transformation in processInstruction() for the LDR rt, =expression to MOV rt, expression when the expression can be evaluated and can fit into the immediate field of the MOV or a MVN. Across the ARM and Thumb instruction sets there are several cases to consider, each with a different range of representatble constants. In ARM we have: * Modified immediate (All ARM architectures) * MOVW (v6t2 and above) In Thumb we have: * Modified immediate (v6t2, v7m and v8m.mainline) * MOVW (v6t2, v7m, v8.mainline and v8m.baseline) * Narrow Thumb MOV that can be used in an IT block (non flag-setting) If the immediate fits any of the available alternatives then we make the transformation. Fixes 25722. Patch by Peter Smith. llvm-svn: 269354	2016-05-12 21:22:42 +00:00
Renato Golin	3f126138a1	[ARM] Delay ARM constant pool creation. NFC. This change adds a new constant pool kind to ARMOperand. When parsing the operand for =immediate we create an instance of this operand rather than creating a constant pool entry and rewriting the operand. As the new operand kind is only created for ldr rt,= we can make ldr rt,= an explicit pseudo instruction in ARM, Thumb and Thumb2 The pseudo instruction is expanded in processInstruction(). This creates the constant pool and transforms the pseudo instruction into a pc-relative ldr to the constant pool. There are no functional changes and no modifications needed to existing tests. Required by the patch that fixes PR25722. Patch by Peter Smith. llvm-svn: 269352	2016-05-12 21:22:31 +00:00
Renato Golin	f6ed8bbf46	[scan-build] fix warnings emitted on LLVM ARM code base Fix "Logic error" warnings of the type "Called C++ object pointer is null" reported by Clang Static Analyzer. Patch by Apelete Seketeli. llvm-svn: 269285	2016-05-12 12:33:33 +00:00
Justin Bogner	4557136645	SDAG: Implement Select instead of SelectImpl in ARMDAGToDAGISel This is a large change, but it's pretty mechanical: - Where we were returning a node before, call ReplaceNode instead. - Where we would return null to fall back to another selector, rename the method to try* and return a bool for success. - Where we were calling SelectNodeTo, just return afterwards. Part of llvm.org/pr26808. llvm-svn: 269258	2016-05-12 00:31:09 +00:00
Justin Bogner	ed4f37850e	SDAG: Clean up dangling nodes in ARMISelDAGToDAG::SelectImpl When we convert to the void Select interface, leaving unreferenced nodes around won't be allowed anymore. Part of llvm.org/pr26808. llvm-svn: 269256	2016-05-12 00:20:19 +00:00
Tim Northover	56048d5c2c	ARM: report an error when attempting to target a misalgined BLX The CodeGen problem was fixed in r269101, but we still miscompiled assembly that tried the same thing. llvm-svn: 269126	2016-05-10 21:48:48 +00:00
Tim Northover	b5ece527a1	ARM: stop emitting blx instructions for most calls on MachO. I'm really not sure why we were in the first place, it's the linker's job to convert between BL/BLX as necessary. Even worse, using BLX left Thumb calls that could be locally resolved completely unencodable since all offsets to BLX are multiples of 4. rdar://26182344 llvm-svn: 269101	2016-05-10 19:17:47 +00:00
Matthias Braun	31d19d43c7	CodeGen: Move TargetPassConfig from Passes.h to an own header; NFC Many files include Passes.h but only a fraction needs to know about the TargetPassConfig class. Move it into an own header. Also rename Passes.cpp to TargetPassConfig.cpp while we are at it. llvm-svn: 269011	2016-05-10 03:21:59 +00:00
Weiming Zhao	5b5501e817	[ARM] Fix Scavenger assert due to underestimated stack size (re-apply r268810 as it exposed an uninitialized variable in ARM MFI. Patch 268868 should fix that.) Summary: Currently, when checking if a stack is "BigStack" or not, it doesn't count into spills and arguments. Therefore, LLVM won't reserve spill slot for this actually "BigStack". This may cause scavenger failure. Reviewers: rengolin Subscribers: vitalybuka, aemerson, rengolin, tberghammer, danalbert, srhines, llvm-commits Differential Revision: http://reviews.llvm.org/D19896 llvm-svn: 268869	2016-05-08 05:11:54 +00:00
Weiming Zhao	453b79013e	Fix use-of-uninitialized-value of ARMMachineFunctionInfo Summary: Explicitly initialize ArgumentStackSize to prevent the msan failure. Reviewers: rengolin Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D20051 llvm-svn: 268868	2016-05-08 05:04:47 +00:00
Vitaly Buka	e81d96be6f	Revert r268810 becase it brakes msan bot. 16802==WARNING: MemorySanitizer: use-of-uninitialized-value lib/Target/ARM/ARMFrameLowering.cpp:1632 llvm-svn: 268833	2016-05-07 01:54:00 +00:00
Weiming Zhao	74f12d31c1	[ARM] Fix Scavenger assert due to underestimated stack size (this is resubmit of r268529 with minor refactoring. r268529 was reverted at r268536 due a memory sanitizer failure. I have not been able to reproduce that failure and I checked all the variable used in my change but I could not spot an issue. I did some refactoring and see if it will give a clearer hint) Summary: Currently, when checking if a stack is "BigStack" or not, it doesn't count into spills and arguments. Therefore, LLVM won't reserve spill slot for this actually "BigStack". This may cause scavenger failure. Reviewers: rengolin Subscribers: vitalybuka, aemerson, rengolin, tberghammer, danalbert, srhines, llvm-commits Differential Revision: http://reviews.llvm.org/D19896 llvm-svn: 268810	2016-05-06 22:20:13 +00:00
Justin Bogner	b012699741	SDAG: Rename Select->SelectImpl and repurpose Select as returning void This is a step towards removing the rampant undefined behaviour in SelectionDAG, which is a part of llvm.org/PR26808. We rename SelectionDAGISel::Select to SelectImpl and update targets to match, and then change Select to return void and consolidate the sketchy behaviour we're trying to get away from there. Next, we'll update backends to implement `void Select(...)` instead of SelectImpl and eventually drop the base Select implementation. llvm-svn: 268693	2016-05-05 23:19:08 +00:00
Tim Northover	df43264cf7	ARM: don't attempt to merge litpools referencing different PC-anchors. Given something like: ldr r0, .LCPI0_0 (== pc-rel var) add r0, pc ldr r1, .LCPI0_1 (== pc-rel var) add r1, pc we cannot combine the 2 ldr instructions and litpools because they get added to a different pc to form the correct address. I think the original logic came from a time when we fused the LDRpci/PICADD instructions into one pseudo-instruction so the PC was always immediately at-hand. That's no longer the case. Should fix general-dynamic TLS access on Linux, and quite possibly other -fPIC code that relies on litpools (e.g. v6m and -Oz compilations) though trivial tweaks of the .ll test didn't provoke anything. llvm-svn: 268662	2016-05-05 18:38:53 +00:00
Justin Bogner	8752be775c	ARM: Use a Handle to track SDNodes in case they're CSE'd. NFC The code here is recursively Select-ing a new Node to avoid issues where N is CSE'd during replaceDAGValue and stops being valid. We can accomplish the same goal in a more principled way by using a HandleSDNode. This is essentially a less dodgy fix for PR25733 than the original attempt back in r255120. llvm-svn: 268590	2016-05-05 01:43:49 +00:00
Vitaly Buka	6b5c89262a	Revert r268529 because it caused use-of-uninitialized-value Summary: This reverts commit d88cc0862bf7da64850b89e9bb5ea9f95e7f1184. #0 0xfed467 in llvm::ARMFrameLowering::determineCalleeSaves(llvm::MachineFunction&, llvm::BitVector&, llvm::RegScavenger) const /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/Target/ARM/ARMFrameLowering.cpp:1625:52 #1 0x330d4cc in (anonymous namespace)::PEI::runOnMachineFunction(llvm::MachineFunction&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/CodeGen/PrologEpilogInserter.cpp:186:3 #2 0x3193e12 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/CodeGen/MachineFunctionPass.cpp:60:13 #3 0x396237d in llvm::FPPassManager::runOnFunction(llvm::Function&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/IR/LegacyPassManager.cpp:1526:23 #4 0x3962a23 in llvm::FPPassManager::runOnModule(llvm::Module&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/IR/LegacyPassManager.cpp:1547:16 #5 0x3963d52 in runOnModule /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/IR/LegacyPassManager.cpp:1603:23 #6 0x3963d52 in llvm::legacy::PassManagerImpl::run(llvm::Module&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/IR/LegacyPassManager.cpp:1706 #7 0x6bb910 in compileModule(char*, llvm::LLVMContext&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/tools/llc/llc.cpp:412:5 #8 0x6b3c25 in main /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/tools/llc/llc.cpp:218:22 #9 0x7fd4a7d37ec4 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21ec4) #10 0x625c93 in _start (/mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm_build_msan/bin/llc+0x625c93) Reviewers: Subscribers: llvm-svn: 268536	2016-05-04 19:44:11 +00:00
Weiming Zhao	2373f769ce	[ARM] Fix Scavenger assert due to underestimated stack size Summary: Currently, when checking if a stack is "BigStack" or not, it doesn't count into spills and arguments. Therefore, LLVM won't reserve spill slot for this actually "BigStack". This may cause scavenger failure. Reviewers: rengolin Subscribers: aemerson, rengolin, tberghammer, danalbert, srhines, llvm-commits Differential Revision: http://reviews.llvm.org/D19896 llvm-svn: 268529	2016-05-04 18:19:33 +00:00
Matthias Braun	d1aabb2813	livePhysRegs: Pass MBB by reference in addLive{Ins\|Outs}(); NFC The block must no be nullptr for the addLiveIns()/addLiveOuts() function. llvm-svn: 268340	2016-05-03 00:24:32 +00:00
Matthias Braun	24f26e6d91	LivePhysRegs: Automatically determine presence of pristine regs. Remove the AddPristinesAndCSRs parameters from addLiveIns()/addLiveOuts(). We need to respect pristine registers after prologue epilogue insertion, Seeing that we got this wrong in at least two commits already, we should rather pay the small price to query MachineFrameInfo for it. There are three cases that did not set AddPristineAndCSRs to true even after register allocation: - ExecutionDepsFix: live-out registers are used as a hint that the register is used soon. This is not true for pristine registers so use the new addLiveOutsNoPristines() to maintain this behaviour. - SystemZShortenInst: Not setting AddPristineAndCSRs to true looks like a bug, should do the right thing automatically now. - StackMapLivenessAnalysis: Not adding pristine registers looks like a bug to me. Added a FIXME comment but maintain the current behaviour as a change may need to get coordinated with GC runtimes. llvm-svn: 268336	2016-05-03 00:08:46 +00:00
Tim Northover	c08db1840c	ARM: fix handling of SUB immediates in peephole opt. We were negating an immediate that was going to be used in a SUBri form unnecessarily. Since ADD/SUB are very similar we can do that, but we have to change the SUB to an ADD at the same time. This also applies to ADD, and allows us to handle a slightly larger range of immediates for those two operations. rdar://25992245 llvm-svn: 268276	2016-05-02 18:30:08 +00:00
Filipe Cabecinhas	0da9937517	Unify XDEBUG and EXPENSIVE_CHECKS (into the latter), and add an option to the cmake build to enable them. Summary: Historically, we had a switch in the Makefiles for turning on "expensive checks". This has never been ported to the cmake build, but the (dead-ish) code is still around. This will also make it easier to turn it on in buildbots. Reviewers: chandlerc Subscribers: jyknight, mzolotukhin, RKSimon, gberry, llvm-commits Differential Revision: http://reviews.llvm.org/D19723 llvm-svn: 268050	2016-04-29 15:22:48 +00:00
Craig Topper	33772c5375	[CodeGen] Default CTTZ_ZERO_UNDEF/CTLZ_ZERO_UNDEF to Expand in TargetLoweringBase. This is what the majority of the targets want and removes a bunch of code. Set it to Legal explicitly in the few cases where that's the desired behavior. llvm-svn: 267853	2016-04-28 03:34:31 +00:00
Ahmed Bougacha	65572afea8	[ARM] Set AddPristinesAndCSRs to expandCMP_SWAP LivePhysRegs. We run after PEI. Found via inspection; no obvious testcase. Follow-up to r266679. llvm-svn: 267781	2016-04-27 20:33:07 +00:00
Ahmed Bougacha	b4af107239	[ARM] Set correct successors in CMPXCHG pseudo expansion. transferSuccessors() would LoadCmpBB a successor of DoneBB, whereas it should be a successor of the original MBB. The testcase changes are caused by Thumb2SizeReduction, which was previously confused by the broken CFG. Follow-up to r266679. Unfortunately, it's tricky to catch this in the verifier. llvm-svn: 267778	2016-04-27 20:32:54 +00:00
Ahmed Bougacha	128f8732a5	[CodeGen] Add getBuildVector and getSplatBuildVector helpers. NFCI. Differential Revision: http://reviews.llvm.org/D17176 llvm-svn: 267606	2016-04-26 21:15:30 +00:00
Craig Topper	d8d6be4f99	[ARM] Expand vector ctlz_zero_undef so it becomes ctlz. The default is Legal, which results in 'Cannot select' errors. llvm-svn: 267521	2016-04-26 05:04:37 +00:00
Craig Topper	edb4a6ba98	[ARM] Expand v1i64 and v2i64 ctlz. The default is legal, which results in 'Cannot select' errors. llvm-svn: 267520	2016-04-26 05:04:33 +00:00
Andrew Kaylor	1aa3cf7d18	Reverting Thumb2SizeReduction opt bisect change to fix failing buildbots. llvm-svn: 267506	2016-04-26 00:56:36 +00:00
Junmo Park	3c65acf87e	Remove MinLatency in SchedMachineModel. NFC. Summary: We don't use MinLatency any more since r184032. Reviewers: atrick, hfinkel, mcrosier Differential Revision: http://reviews.llvm.org/D19474 llvm-svn: 267502	2016-04-26 00:37:46 +00:00
Andrew Kaylor	736efc894d	Fix build warning llvm-svn: 267487	2016-04-25 22:27:30 +00:00
Andrew Kaylor	a2b9111ef7	Add optimization bisect opt-in calls for ARM passes Differential Revision: http://reviews.llvm.org/D19449 llvm-svn: 267480	2016-04-25 22:01:04 +00:00
Tim Northover	5c3140f745	ARM: put extern __thread stubs in a special section. The linker needs to know that the symbols are thread-local to do its job properly. llvm-svn: 267473	2016-04-25 21:12:04 +00:00
Silviu Baranga	82d04260b7	[ARM] Add support for the X asm constraint Summary: This patch adds support for the X asm constraint. To do this, we lower the constraint to either a "w" or "r" constraint depending on the operand type (both constraints are supported on ARM). Fixes PR26493 Reviewers: t.p.northover, echristo, rengolin Subscribers: joker.eph, jgreenhalgh, aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D19061 llvm-svn: 267411	2016-04-25 14:29:18 +00:00
Saleem Abdulrasool	9611518646	ARM: fix __chkstk Frame Setup on WoA This corrects the MI annotations for the stack adjustment following the __chkstk invocation. We were marking the original SP usage as a Def rather than Kill. The (new) assigned value is the definition, the original reference is killed. Adjust the ISelLowering to mark Kills and FrameSetup as well. This partially resolves PR27480. llvm-svn: 267361	2016-04-24 20:12:48 +00:00
Peter Collingbourne	265ebd7d70	CodeGen: Use PLT relocations for relative references to unnamed_addr functions. The relative vtable ABI (PR26723) needs PLT relocations to refer to virtual functions defined in other DSOs. The unnamed_addr attribute means that the function's address is not significant, so we're allowed to substitute it with the address of a PLT entry. Also includes a bonus feature: addends for COFF image-relative references. Differential Revision: http://reviews.llvm.org/D17938 llvm-svn: 267211	2016-04-22 20:40:10 +00:00
David Majnemer	68318e0414	Fix some spelling mistakes llvm-svn: 267112	2016-04-22 06:37:48 +00:00
Saleem Abdulrasool	a028853540	ARM: restrict register class for WIN__DBZCHK WIN__DBZCHK will insert a CBZ instruction into the stream. This instruction reserves 3 bits for the condition register (rn). As such, we must ensure that we restrict the register to a low register. Use the tGPR class instead of GPR to ensure that this is properly constrained. In debug builds, we would attempt to use lr as a condition register which would silently get truncated with no hint that the register selection was incorrect. llvm-svn: 267080	2016-04-21 23:53:19 +00:00
Tim Northover	c52c74efdf	MachO: enable .data_region directives everywhere We'd disabled them on x86 because back in the early days some host tools couldn't handle the new load commands. This no longer holds: anyone capable of deploying Clang should be able to deploy its copies of ar/ranlib/etc. rdar://25254790 llvm-svn: 267075	2016-04-21 23:00:17 +00:00
Tim Northover	1ee27c74cb	ARM: fix assertion failure on -O0 cmpxchg. Because lowering of CMP_SWAP_64 occurs during type legalization, there can be i64 types produced by more than just a BUILD_PAIR or similar. My initial tests used just incoming function args. llvm-svn: 266828	2016-04-19 22:25:02 +00:00
Marcin Koscielnicki	3fdc257d6a	[AArch64] [ARM] Make a target-independent llvm.thread.pointer intrinsic. Both AArch64 and ARM support llvm.<arch>.thread.pointer intrinsics that just return the thread pointer. I have a pending patch that does the same for SystemZ (D19054), and there are many more targets that could benefit from one. This patch merges the ARM and AArch64 intrinsics into a single target independent one that will also be used by subsequent targets. Differential Revision: http://reviews.llvm.org/D19098 llvm-svn: 266818	2016-04-19 20:51:05 +00:00
Tim Northover	b629c77692	ARM: use a pseudo-instruction for cmpxchg at -O0. The fast register-allocator cannot cope with inter-block dependencies without spilling. This is fine for ldrex/strex loops coming from atomicrmw instructions where any value produced within a block is dead by the end, but not for cmpxchg. So we lower a cmpxchg at -O0 via a pseudo-inst that gets expanded after regalloc. Fortunately this is at -O0 so we don't have to care about performance. This simplifies the various axes of expansion considerably: we assume a strong seq_cst operation and ensure ordering via the always-present DMB instructions rather than v8 acquire/release instructions. Should fix the 32-bit part of PR25526. llvm-svn: 266679	2016-04-18 21:48:55 +00:00
Renato Golin	4b18a510a2	[ARM] AArch32 v8 NEON is still not IEEE-754 compliant llvm-svn: 266603	2016-04-18 12:06:47 +00:00
Mehdi Amini	b550cb1750	[NFC] Header cleanup Removed some unused headers, replaced some headers with forward class declarations. Found using simple scripts like this one: clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' \| xargs grep -L 'IndexedMap[<]' \| xargs grep -n --color=auto 'IndexedMap' Patch by Eugene Kosov <claprix@yandex.ru> Differential Revision: http://reviews.llvm.org/D19219 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266595	2016-04-18 09:17:29 +00:00
Tim Northover	903f81ba18	ARM: don't try to hoist constant RHS out of a division. Divisions by a constant can be converted into multiplies which are usually cheaper, but this isn't possible if the constant gets separated (particularly in loops). Fix this by telling ConstantHoisting that the immediate in a DIV is cheap. I considered making the check generic, but neither AArch64 (strangely) nor x86 showed any benefit on the tests I had. llvm-svn: 266464	2016-04-15 18:17:18 +00:00
Renato Golin	5cb666add7	[ARM] Adding IEEE-754 SIMD detection to loop vectorizer Some SIMD implementations are not IEEE-754 compliant, for example ARM's NEON. This patch teaches the loop vectorizer to only allow transformations of loops that either contain no floating-point operations or have enough allowance flags supporting lack of precision (ex. -ffast-math, Darwin). For that, the target description now has a method which tells us if the vectorizer is allowed to handle FP math without falling into unsafe representations, plus a check on every FP instruction in the candidate loop to check for the safety flags. This commit makes LLVM behave like GCC with respect to ARM NEON support, but it stops short of fixing the underlying problem: sub-normals. Neither GCC nor LLVM have a flag for allowing sub-normal operations. Before this patch, GCC only allows it using unsafe-math flags and LLVM allows it by default with no way to turn it off (short of not using NEON at all). As a first step, we push this change to make it safe and in sync with GCC. The second step is to discuss a new sub-normal's flag on both communitues and come up with a common solution. The third step is to improve the FastMath flags in LLVM to encode sub-normals and use those flags to restrict NEON FP. Fixes PR16275. llvm-svn: 266363	2016-04-14 20:42:18 +00:00
Matthias Braun	46b0f03e12	TargetLowering: Factor out common code for tail call eligibility checking; NFC llvm-svn: 266270	2016-04-14 01:10:42 +00:00
Tim Northover	5c02f9ad28	ARM: override cost function to re-enable ConstantHoisting (& fix it). At some point, ARM stopped getting any benefit from ConstantHoisting because the pass called a different variant of getIntImmCost. Reimplementing the correct variant revealed some problems, however: + ConstantHoisting was modifying switch statements. This is simply invalid, the cases must remain integer constants no matter the notional cost. + ConstantHoisting was mangling alloca instructions in the entry block. These should be handled by FrameLowering, so constants actually have a cost of 0. Worse, the resulting bitcasts meant they became dynamic allocas. rdar://25707382 llvm-svn: 266260	2016-04-13 23:08:27 +00:00
Matthias Braun	707e02c273	ARM: Use a callee save register for the swiftself parameter. It is very likely that the swiftself parameter is alive throughout most functions function so putting it into a callee save register should avoid spills for the callers with only a minimum amount of extra spills in the callees. Currently the generated code is correct but unnecessarily spills and reloads arguments passed in callee save registers, I will address this in upcoming patches. This also adds a missing check that for tail calls the preserved value of the caller must be the same as the callees parameter. Differential Revision: http://reviews.llvm.org/D18901 llvm-svn: 266253	2016-04-13 21:43:25 +00:00
Tim Northover	a6dea06fe3	ARM: use r7 as the frame-pointer on all MachO targets. This is better for a few reasons: + It matches the other tooling for iOS. + It matches EABI in more cases (i.e. Thumb-mode, and in practice we don't use ARM mode). + It leads to infinitesimally smaller code (0.2%, yay!). rdar://25369506 llvm-svn: 266003	2016-04-11 22:27:40 +00:00
Manman Ren	5751814eda	Swift Calling Convention: swifterror target support. Differential Revision: http://reviews.llvm.org/D18716 llvm-svn: 265997	2016-04-11 21:08:06 +00:00
Oliver Stannard	c869e9158d	[ARM] Avoid switching ARM/Thumb mode on .arch/.cpu directive When we see a .arch or .cpu directive, we should try to avoid switching ARM/Thumb mode if possible. If we do have to switch modes, we also need to emit the correct mapping symbol for the new ISA. We did not do this previously, so could emit ARM code with Thumb mapping symbols (or vice-versa). The GAS behaviour is to always stay in the same mode, and to emit an error on any instructions seen when the current mode is not available on the current target. We can't represent that situation easily (we assume that Thumb mode is available if ModeThumb is set), so we differ from the GAS behaviour when switching to a target that can't support the old mode. I've added a warning for when this implicit mode-switch occurs. Differential Revision: http://reviews.llvm.org/D18955 llvm-svn: 265936	2016-04-11 13:06:28 +00:00
Sam Parker	2d5126cdf5	[ARM] Enable SMLAW[B\|T] and SMLUW[B\|T] instruction selection Added ISelDAGToDAG functions to enable selection of the smlawb, smlawt, smulwb and smulwt instructions for the ARM backend. Also updated the smul CodeGen test and removed the smulw one. Differential Revision: http://reviews.llvm.org/D18892 llvm-svn: 265793	2016-04-08 16:02:53 +00:00
JF Bastien	800f87a871	NFC: make AtomicOrdering an enum class Summary: In the context of http://wg21.link/lwg2445 C++ uses the concept of 'stronger' ordering but doesn't define it properly. This should be fixed in C++17 barring a small question that's still open. The code currently plays fast and loose with the AtomicOrdering enum. Using an enum class is one step towards tightening things. I later also want to tighten related enums, such as clang's AtomicOrderingKind (which should be shared with LLVM as a 'C++ ABI' enum). This change touches a few lines of code which can be improved later, I'd like to keep it as NFC for now as it's already quite complex. I have related changes for clang. As a follow-up I'll add: bool operator<(AtomicOrdering, AtomicOrdering) = delete; bool operator>(AtomicOrdering, AtomicOrdering) = delete; bool operator<=(AtomicOrdering, AtomicOrdering) = delete; bool operator>=(AtomicOrdering, AtomicOrdering) = delete; This is separate so that clang and LLVM changes don't need to be in sync. Reviewers: jyknight, reames Subscribers: jyknight, llvm-commits Differential Revision: http://reviews.llvm.org/D18775 llvm-svn: 265602	2016-04-06 21:19:33 +00:00
Manman Ren	802cd6f9d7	Swift Calling Convention: swiftcc for ARM. Differential Revision: http://reviews.llvm.org/D18769 llvm-svn: 265482	2016-04-05 22:44:44 +00:00
Sam Parker	0d3a3a537c	[ARM] Cleanup of smul and smla instruction descriptions Removed the SDNode argument passed to the AI_smul and AI_smla multiclass definitions as they are always mul. Differential Revision: http://reviews.llvm.org/D18791 llvm-svn: 265409	2016-04-05 16:01:25 +00:00
Matthias Braun	870c34f0cf	ARM, AArch64, X86: Check preserved registers for tail calls. We can only perform a tail call to a callee that preserves all the registers that the caller needs to preserve. This situation happens with calling conventions like preserver_mostcc or cxx_fast_tls. It was explicitely handled for fast_tls and failing for preserve_most. This patch generalizes the check to any calling convention. Related to rdar://24207743 Differential Revision: http://reviews.llvm.org/D18680 llvm-svn: 265329	2016-04-04 18:56:13 +00:00
Derek Schuff	1dbf7a571f	Add MachineFunctionProperty checks for AllVRegsAllocated for target passes Summary: This adds the same checks that were added in r264593 to all target-specific passes that run after register allocation. Reviewers: qcolombet Subscribers: jyknight, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D18525 llvm-svn: 265313	2016-04-04 17:09:25 +00:00
James Y Knight	e6a4646372	Remove useless check for ThreadModel==Single in ARMISelLowering. NFC. ThreadModel::Single is already handled already by ARMPassConfig adding LowerAtomicPass to the pass list, which lowers all atomics to non-atomic ops and deletes fences. So by the time we get to ISel, there's no atomic fences left, so they don't need special handling. llvm-svn: 265178	2016-04-01 19:33:19 +00:00
James Molloy	b876c72bcc	Fix for pr24346: arm asm label calculation error in sub Some ARM instructions encode 32-bit immediates as a 8-bit integer (0-255) and a 4-bit rotation (0-30, even) in its least significant 12 bits. The original fixup, FK_Data_4, patches the instruction by the value bit-to-bit, regardless of the encoding. For example, assuming the label L1 and L2 are 0x0 and 0x104 respectively, the following instruction: add r0, r0, #(L2 - L1) ; expects 0x104, i.e., 260 would be assembled to the following, which adds 1 to r0, instead of 260: e2800104 add r0, r0, #4, 2 ; equivalently 1 The new fixup kind fixup_arm_mod_imm takes care of the encoding: e2800f41 add r0, r0, #260 Patch by Ting-Yuan Huang! llvm-svn: 265122	2016-04-01 09:40:47 +00:00
Benjamin Kramer	569efd2cfd	[ARM] Expand v1i64 and v2i64 ctpop. The default is legal, which results in 'Cannot select' errors. This is triggered during selfhost due to a recent cost model change. llvm-svn: 265040	2016-03-31 19:42:04 +00:00
Hans Wennborg	e1a2e90ffa	Change eliminateCallFramePseudoInstr() to return an iterator This will become necessary in a subsequent change to make this method merge adjacent stack adjustments, i.e. it might erase the previous and/or next instruction. It also greatly simplifies the calls to this function from Prolog- EpilogInserter. Previously, that had a bunch of logic to resume iteration after the call; now it just continues with the returned iterator. Note that this changes the behaviour of PEI a little. Previously, it attempted to re-visit the new instruction created by eliminateCallFramePseudoInstr(). That code was added in r36625, but I can't see any reason for it: the new instructions will obviously not be pseudo instructions, they will not have FrameIndex operands, and we have already accounted for the stack adjustment. Differential Revision: http://reviews.llvm.org/D18627 llvm-svn: 265036	2016-03-31 18:33:38 +00:00
Matthias Braun	8d41436004	CodeGen: Factor out code for tail call result compatibility check; NFC llvm-svn: 264959	2016-03-30 22:46:04 +00:00
Aaron Ballman	ef0fe1eed8	Silencing warnings from MSVC 2015 Update 2. All of these changes silence "C4334 '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)". NFC. llvm-svn: 264929	2016-03-30 21:30:00 +00:00
Nirav Dave	8dd66e5753	Remove HasFnAttribute guards to getFnAttribute calls These checks are redundant and can be removed Reviewers: hans Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D18564 llvm-svn: 264872	2016-03-30 15:41:12 +00:00
Manman Ren	f46262e0b7	Swift Calling Convention: add swiftself attribute. Differential Revision: http://reviews.llvm.org/D17866 llvm-svn: 264754	2016-03-29 17:37:21 +00:00
Saleem Abdulrasool	750a90df6a	ARM: maintain BB ordering when expanding WIN__DBZCHK It is possible to have a fallthrough MBB prior to MBB placement. The original addition of the BB would result in reordering the BB as not preceding the successor. Because of the fallthrough nature of the BB, we could end up executing incorrect code or even a constant pool island! Insert the spliced BB into the same location to avoid that. Thanks to Tim Northover for invaluable hints and Fiora for the discussion on what may have been occurring! llvm-svn: 264454	2016-03-25 19:48:06 +00:00
Saleem Abdulrasool	0dab98d926	ARM: fix optimised division on WoA We did not have an explicit branch to the continuation BB. When the check was hoisted, this could permit control follow to fall through into the division trap. Add the explicit branch to the continuation basic block to ensure that code execution is correct. llvm-svn: 264370	2016-03-25 00:34:11 +00:00
Artyom Skrobov	e6f1b7f094	Replace a string comparison in ARMSubtarget.h with a tablegen entry in ARM.td (NFC) Reviewers: rengolin, t.p.northover Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D18393 llvm-svn: 264165	2016-03-23 16:18:13 +00:00
Peter Collingbourne	86b9fbe980	ARM: Better codegen for 64-bit compares. This introduces a custom lowering for ISD::SETCCE (introduced in r253572) that allows us to emit a short code sequence for 64-bit compares. Before: push {r7, lr} cmp r0, r2 mov.w r0, #0 mov.w r12, #0 it hs movhs r0, #1 cmp r1, r3 it ge movge.w r12, #1 it eq moveq r12, r0 cmp.w r12, #0 bne .LBB1_2 @ BB#1: @ %bb1 bl f pop {r7, pc} .LBB1_2: @ %bb2 bl g pop {r7, pc} After: push {r7, lr} subs r0, r0, r2 sbcs.w r0, r1, r3 bge .LBB1_2 @ BB#1: @ %bb1 bl f pop {r7, pc} .LBB1_2: @ %bb2 bl g pop {r7, pc} Saves around 80KB in Chromium's libchrome.so. Some notes on this patch: - I don't much like the ARMISD::BRCOND and ARMISD::CMOV combines I introduced (nothing else needs them). However, they are necessary in order to avoid poor codegen, and they seem similar to existing combines in other backends (e.g. X86 combines (brcond (cmp (setcc Compare))) to (brcond Compare)). - No support for Thumb-1. This is in principle possible, but we'd need to implement ARMISD::SUBE for Thumb-1. Differential Revision: http://reviews.llvm.org/D15256 llvm-svn: 263962	2016-03-21 18:00:02 +00:00
Renato Golin	2b6b7ffd6c	[ARM] Add Cortex-A32 support Adding Cortex-A32 as an available target in the ARM backend. Patch by Sam Parker. llvm-svn: 263956	2016-03-21 17:29:01 +00:00
Manman Ren	a3a019cf90	[CXX_FAST_TLS] Fix issues in ARM. We need to be careful on which registers can be explicitly handled via copies. Prologue, Epilogue use physical registers and if one belongs to the set of CSRsViaCopy, it will no longer be CSRed, since PEI overwrites it after the explicit copies. llvm-svn: 263857	2016-03-18 23:44:37 +00:00
Manman Ren	4865d89653	[CXX_FAST_TLS] Disable tail call when calling conventions are mismatched. Since CXX_FAST_TLS has a bigger set of CSRs, we don't tail call when caller and callee have mismatched calling conventions. llvm-svn: 263856	2016-03-18 23:41:51 +00:00
Manman Ren	2828c57b6f	[CXX_FAST_TLS] fix issues with O0 on ARM, AArch64 and X86. Since at O0, explicit copies via SplitCSR may not be removed even if they are unnecessary, we choose not to use SplitCSR at O0. llvm-svn: 263855	2016-03-18 23:38:49 +00:00
Tim Northover	498c56c240	ARM: stop asserting on weird <3 x Ty> vectors in ISelLowering. llvm-svn: 263741	2016-03-17 20:10:28 +00:00
Saleem Abdulrasool	071a099102	ARM: Revert SVN r253865, 254158, fix windows division The two changes together weakened the test and caused a regression with division handling in MSVC mode. They were applied to avoid an assertion being triggered in the block frequency analysis. However, the underlying problem was simply being masked rather than solved properly. Address the actual underlying problem and revert the changes. Rather than analyze the cause of the assertion, the division failure was assumed to be an overflow. The underlying issue was a subtle bug in the BB construction in the emission of the div-by-zero check (WIN__DBZCHK). We did not construct the proper successor information in the basic blocks, nor did we update the PHIs associated with the basic block when we split them. This would result in assertions being triggered in the block frequency analysis pass. Although the original tests are being removed, the tests themselves performed very little in terms of validation but merely tested that we did not assert when generating code. Update this with new tests that actually ensure that we do not regress on the code generation. llvm-svn: 263714	2016-03-17 14:10:49 +00:00
James Y Knight	f44fc5219f	Tweak some atomics functions in preparation for larger changes; NFC. - Rename getATOMIC to getSYNC, as llvm will soon be able to emit both '__sync' libcalls and '__atomic' libcalls, and this function is for the '__sync' ones. - getInsertFencesForAtomic() has been replaced with shouldInsertFencesForAtomic(Instruction), so that the decision can be made per-instruction. This functionality will be used soon. - emitLeadingFence/emitTrailingFence are no longer called if shouldInsertFencesForAtomic returns false, and thus don't need to check the condition themselves. llvm-svn: 263665	2016-03-16 22:12:04 +00:00
Davide Italiano	dfdf278ebf	[MC] Rename TLSDESC as it's not ARM specific. Similarly to what was done for TLSCALL in r263515. llvm-svn: 263564	2016-03-15 17:29:52 +00:00
Davide Italiano	249c45d92e	[MC] Rename TLSCALL as it's not ARM specific. `MCSymbolRefExpr` variant kind for TLSCALL is prefixed with _ARM_ since this is how it was originally implemented. The X86_64 version is exactly the same so there's no reason to create a new variant, we can just rename the existing one to be machine-independent. This generalization is the first step to implement support for GNU2 TLS dialect in MC. Differential Revision: http://reviews.llvm.org/D18160 llvm-svn: 263515	2016-03-15 00:25:22 +00:00
Sanjay Patel	7506852709	[DAG] use !isUndef() ; NFCI llvm-svn: 263453	2016-03-14 18:09:43 +00:00
Sanjay Patel	5719584129	[DAG] use isUndef() ; NFCI llvm-svn: 263448	2016-03-14 17:28:46 +00:00
Peter Collingbourne	aba16fca5d	ARM: Support relative references using the PREL31 symbol variant. Differential Revision: http://reviews.llvm.org/D17937 llvm-svn: 263156	2016-03-10 19:30:18 +00:00
Alexandros Lamprineas	843164242e	[ARM] Cortex-R8 support This patch adds Cortex-R8 to Target Parser and TableGen. It also adds CodeGen tests for the build attributes. Patch by Pablo Barrio. Differential Revision: http://reviews.llvm.org/D17925 llvm-svn: 263132	2016-03-10 17:38:41 +00:00
Saleem Abdulrasool	1632fe1f77	ARM: follow up improvements for SVN r263118 The initial change was insufficiently complete for always getting the semantics of __builtin_longjmp correct. The builtin is translated into a `tInt_eh_sjlj_longjmp` DAG node. This node set R7 as clobbered. However, the code would then follow up with a clobber of R11. I had failed to notice the imp-def,kill on R7 in the isel. Unfortunately, it seems that it is not possible to conditionalise the Defs list via an !if. Instead, construct a new parallel WIN node and prefer that when targeting windows. This ensures that we now both correctly model the __builtin_longjmp as well as construct the frame in a more ABI conformant manner. llvm-svn: 263123	2016-03-10 16:26:37 +00:00
Saleem Abdulrasool	8b30f9854e	ARM: correct __builtin_longjmp on WoA WoA uses r11 as the FP even though it is a pure thumb-2 environment in contrast to AAPCS which states r7. This adjusts __builtin_longjmp to not clobber r7 and to properly restore the frame pointer on execution. llvm-svn: 263118	2016-03-10 15:11:09 +00:00
Roman Levenstein	2792b3f02f	Add support for a preserve_most calling convention to the AArch64 backend. This change adds a support for a preserve_most calling convention to the AArch64 backend, similar to how it was done for X86-64. There is also a subsequent patch on top of this one to add a tail-calls support for this calling convention. Differential Revision: http://reviews.llvm.org/D18016 llvm-svn: 263092	2016-03-10 04:35:09 +00:00
Artyom Skrobov	5ddea6a8e9	[ARM] Simplify ARMInstr*.td by getting rid of identity PatFrags (NFC) Reviewers: t.p.northover, grosbach, resistor Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D17636 llvm-svn: 262936	2016-03-08 16:23:54 +00:00
Renato Golin	175c6d6d95	[ARM] Merging 64-bit divmod lib calls into one When div+rem calls on the same arguments are found, the ARM back-end merges the two calls into one __aeabi_divmod call for up to 32-bits values. However, for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't merging the calls, and thus calling ldivmod twice and spilling the temporary results, which generated pretty bad code. This patch legalises 64-bit lib calls for divmod, so that now all the spilling and the second call are gone. It also relaxes the DivRem combiner a bit on the legal type check, since it was already checking for isLegalOrCustom on every value, so the extra check for isTypeLegal was redundant. Second attempt, creating TLI.isOperationCustom like isOperationExpand, to make sure we only emit valid types or the ones that were explicitly marked as custom. Now, passing check-all and test-suite on x86, ARM and AArch64. This patch fixes PR17193 (and a long time FIXME in the tests). llvm-svn: 262738	2016-03-04 19:19:36 +00:00
Renato Golin	3d78271eac	Revert "[ARM] Merging 64-bit divmod lib calls into one" This reverts commit r262507, which broke some ARM buildbots. llvm-svn: 262594	2016-03-03 08:57:44 +00:00
Renato Golin	93e42d9934	[ARM] Merging 64-bit divmod lib calls into one When div+rem calls on the same arguments are found, the ARM back-end merges the two calls into one __aeabi_divmod call for up to 32-bits values. However, for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't merging the calls, and thus calling ldivmod twice and spilling the temporary results, which generated pretty bad code. This patch legalises 64-bit lib calls for divmod, so that now all the spilling and the second call are gone. It also relaxes the DivRem combiner a bit on the legal type check, since it was already checking for isLegalOrCustom on every value, so the extra check for isTypeLegal was redundant. This patch fixes PR17193 (and a long time FIXME in the tests). llvm-svn: 262507	2016-03-02 19:35:45 +00:00
Matthias Braun	f290912d22	ARM: Introduce conservative load/store optimization mode Most of the time ARM has the CCR.UNALIGN_TRP bit set to false which means that unaligned loads/stores do not trap and even extensive testing will not catch these bugs. However the multi/double variants are not affected by this bit and will still trap. In effect a more aggressive load/store optimization will break existing (bad) code. These bugs do not necessarily manifest in the broken code where the misaligned pointer is formed but often later in perfectly legal code where it is accessed. This means recompiling system libraries (which have no alignment bugs) with a newer compiler will break existing applications (with alignment bugs) that worked before. So (under protest) I implemented this safe mode which limits the formation of multi/double operations to cases that are not affected by user code (stack operations like spills/reloads) or cases where the normal operations trap anyway (floating point load/stores). It is disabled by default. Differential Revision: http://reviews.llvm.org/D17015 llvm-svn: 262504	2016-03-02 19:20:00 +00:00
Matthias Braun	17cb57995e	TableGen: Check scheduling models for completeness TableGen checks at compiletime that for scheduling models with "CompleteModel = 1" one of the following holds: - Is marked with the hasNoSchedulingInfo flag - The instruction is a subclass of Sched - There are InstRW definitions in the scheduling model Typical steps necessary to complete a model: - Ensure all pseudo instructions that are expanded before machine scheduling (usually everything handled with EmitYYY() functions in XXXTargetLowering). - If a CPU does not support some instructions mark the corresponding resource unsupported: "WriteRes<WriteXXX, []> { let Unsupported = 1; }". - Add missing scheduling information. Differential Revision: http://reviews.llvm.org/D17747 llvm-svn: 262384	2016-03-01 20:03:21 +00:00
Duncan P. N. Exon Smith	fd8cc23220	CodeGen: Change MachineInstr to use MachineInstr&, NFC Change MachineInstr API to prefer MachineInstr& over MachineInstr* whenever the parameter is expected to be non-null. Slowly inching toward being able to fix PR26753. llvm-svn: 262149	2016-02-27 20:01:33 +00:00
Tim Northover	aa35bd26c7	ARM: disallow pc as a base register in Thumb2 memory ops. These should all be deferring to the "OP (literal)" variant according to the ARM ARM. llvm-svn: 261895	2016-02-25 16:54:52 +00:00
Tim Northover	b9edc5ca21	ARM: fix handling of movw/movt relocations with addend. We were emitting only one half of a the paired relocations needed for these instructions because we decided that an offset needed a scattered relocation. In fact, movw/movt relocations can be paired without being scattered. llvm-svn: 261679	2016-02-23 20:20:23 +00:00
Weiming Zhao	5e0c3ebe9e	Fix PR25339: ARM Constant Island Summary: Currently, the ARM Constant Island may not converge (or not converge quickly). This patch let it move to the closest water after the user if it doesn't converge after 15 iterations. This address https://llvm.org/bugs/show_bug.cgi?id=25339 Reviewers: t.p.northover, srhines, kristof.beyls, aadg, rengolin Subscribers: weimingz, aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D16890 llvm-svn: 261665	2016-02-23 18:39:19 +00:00
Junmo Park	453f4aa4dd	[ARM] fix initialization of PredictableSelectIsExpensive Summary: If we want classify OoO or not, using getSchedModel().isOutOfOrder() could be more proper way than using Subtarget->isLikeA9(). Reviewers: jmolloy, rengolin Differential Revision: http://reviews.llvm.org/D17433 llvm-svn: 261623	2016-02-23 09:56:58 +00:00
Duncan P. N. Exon Smith	6307eb5518	CodeGen: TII: Take MachineInstr& in predicate API, NFC Change TargetInstrInfo API to take `MachineInstr&` instead of `MachineInstr*` in the functions related to predicated instructions (I'll try to come back later and get some of the rest). All of these functions require non-null parameters already, so references are more clear. As a bonus, this happens to factor away a host of implicit iterator => pointer conversions. No functionality change intended. llvm-svn: 261605	2016-02-23 02:46:52 +00:00
Duncan P. N. Exon Smith	d84f600653	CodeGen: Bring back MachineBasicBlock::iterator::getInstrIterator()... This is a little embarrassing. When I reverted r261504 (getIterator() => getInstrIterator()) in r261567, I did a `git grep` to see if there were new calls to `getInstrIterator()` that I needed to migrate. There were 10-20 hits, and I blindly did a `sed ...` before calling `ninja check`. However, these were `MachineInstrBundleIterator::getInstrIterator()`, which predated r261567. Perhaps coincidentally, these had an identical name and return type. This commit undoes my careless sed and restores `MachineBasicBlock::iterator::getInstrIterator()`. llvm-svn: 261577	2016-02-22 21:30:15 +00:00
Duncan P. N. Exon Smith	c5b668deb8	Revert "CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFC" This reverts commit r261504, since it's not obvious the new name is better: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160222/334298.html I'll recommit if we get consensus that it's the right direction. llvm-svn: 261567	2016-02-22 20:49:58 +00:00
Duncan P. N. Exon Smith	dc0848c029	CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFC Delete MachineInstr::getIterator(), since the term "iterator" is overloaded when talking about MachineInstr. - Downcast to ilist_node in iplist::getNextNode() and getPrevNode() so that ilist_node::getIterator() is still available. - Add it back as MachineInstr::getInstrIterator(). This matches the naming in MachineBasicBlock. - Add MachineInstr::getBundleIterator(). This is explicitly called "bundle" (not matching MachineBasicBlock) to disintinguish it clearly from ilist_node::getIterator(). - Update all calls. Some of these I switched to `auto` to remove boiler-plate, since the new name is clear about the type. There was one call I updated that looked fishy, but it wasn't clear what the right answer was. This was in X86FrameLowering::inlineStackProbe(), added in r252578 in lib/Target/X86/X86FrameLowering.cpp. I opted to leave the behaviour unchanged, but I'll reply to the original commit on the list in a moment. llvm-svn: 261504	2016-02-21 22:58:35 +00:00
Duncan P. N. Exon Smith	e9bc579c37	ADT: Remove == and != comparisons between ilist iterators and pointers I missed == and != when I removed implicit conversions between iterators and pointers in r252380 since they were defined outside ilist_iterator. Since they depend on getNodePtrUnchecked(), they indirectly rely on UB. This commit removes all uses of these operators. (I'll delete the operators themselves in a separate commit so that it can be easily reverted if necessary.) There should be NFC here. llvm-svn: 261498	2016-02-21 20:39:50 +00:00
Junmo Park	1108ab059c	Minor code cleanups. NFC. llvm-svn: 261294	2016-02-19 01:46:04 +00:00
Ahmed Bougacha	93cff7fb82	[CodeGen] Document and use getConstant's splat-building feature. NFC. Differential Revision: http://reviews.llvm.org/D17229 llvm-svn: 260901	2016-02-15 18:07:29 +00:00
Ahmed Bougacha	f8dfb47c02	[CodeGen] Prefer "if (SDValue R = ...)" to "if (R.getNode())". NFCI. llvm-svn: 260316	2016-02-09 22:54:12 +00:00
Saleem Abdulrasool	f36005a358	ARM: support TLS for WoA Add support for TLS access for Windows on ARM. This generates a similar access to MSVC for ARM. The changes to the tablegen data is needed to support loading an external symbol global that is not for a call. The adjustments to the DAG to DAG transforms are needed to preserve the 32-bit move. llvm-svn: 259676	2016-02-03 18:21:59 +00:00
Renato Golin	6027dd38ef	[ARM] Move GNUEABI divmod to __aeabi_divmod* The GNU toolchain emits __aeabi_divmod for soft-divide on ARM cores which happens to be a lot faster than __divsi3/__modsi3 when the core has hardware divide instructions. Do the same here. Fixes PR26450. llvm-svn: 259657	2016-02-03 16:10:54 +00:00
Sjoerd Meijer	ffe19f5245	Removed FeatureVFPOnlySP from the Cortex-R7 processor model description and changed the regression test accordingly. The default configuration of a Cortex-R7 is to implement the VFPv3-D16 architecture and the feature line as it was is too restrictive. llvm-svn: 259480	2016-02-02 09:28:20 +00:00
Matthias Braun	b30f2f5141	Avoid overly large SmallPtrSet/SmallSet These sets perform linear searching in small mode so it is never a good idea to use SmallSize/N bigger than 32. llvm-svn: 259283	2016-01-30 01:24:31 +00:00
Yaron Keren	eb2a25467e	Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith r259192 post commit comment. clang part in r259232, this is the LLVM part of the patch. llvm-svn: 259240	2016-01-29 20:50:44 +00:00
Tim Northover	c4093c3ced	ARM: don't mangle DAG constant if it has more than one use The basic optimisation was to convert (mul $LHS, $complex_constant) into roughly "(shl (mul $LHS, $simple_constant), $simple_amt)" when it was expected to be cheaper. The original logic checks that the mul only has one use (since we're mangling $complex_constant), but when used in even more complex addressing modes there may be an outer addition that can pick up the wrong value too. I think the ARM addressing-mode problem is actually unreachable at the moment, but that depends on complex assessments of the profitability of pre-increment addressing modes so I've put a real check in there instead of an assertion. llvm-svn: 259228	2016-01-29 19:18:46 +00:00
Alexandros Lamprineas	8c26e7c647	[ARM] Emit trap instruction using .inst directive The trap instruction is emitted as a data-in-text rather than an instruction. This patch uses the .inst directive for emitting trap. Differential Revision: http://reviews.llvm.org/D16684 llvm-svn: 259182	2016-01-29 10:23:32 +00:00
Tim Northover	042a6c1fe1	ARMv7k: base ABI decision on v7k Arch rather than watchos OS. Various bits we want to use the new ABI actually compile with "-arch armv7k -miphoneos-version-min=9.0". Not ideal, but also not ridiculous given how slices work. llvm-svn: 258975	2016-01-27 19:32:29 +00:00
Benjamin Kramer	391be792f2	One more batch of self-containing headers. llvm-svn: 258974	2016-01-27 19:29:56 +00:00
Benjamin Kramer	b32a5042bd	Don't put classes in headers into anonymous namespaces. You want ODR violations? That's how you get ODR violations. llvm-svn: 258973	2016-01-27 19:29:42 +00:00
Benjamin Kramer	45275a4d3c	Make more headers self-contained. A lot of this comes from the new complete type requirement of DenseMap. llvm-svn: 258956	2016-01-27 18:03:37 +00:00
Benjamin Kramer	f9172fd4ac	Rename TargetSelectionDAGInfo into SelectionDAGTargetInfo and move it to CodeGen/ It's a SelectionDAG thing, not a Target thing. llvm-svn: 258939	2016-01-27 16:32:26 +00:00
Benjamin Kramer	b3e8a6d2b8	Move MCTargetAsmParser.h to llvm/MC/MCParser where it belongs. llvm-svn: 258917	2016-01-27 10:01:28 +00:00
Chris Bieneman	e49730d4ba	Remove autoconf support Summary: This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html "I felt a great disturbance in the [build system], as if millions of [makefiles] suddenly cried out in terror and were suddenly silenced. I fear something [amazing] has happened." - Obi Wan Kenobi Reviewers: chandlerc, grosbach, bob.wilson, tstellarAMD, echristo, whitequark Subscribers: chfast, simoncook, emaste, jholewinski, tberghammer, jfb, danalbert, srhines, arsenm, dschuff, jyknight, dsanders, joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D16471 llvm-svn: 258861	2016-01-26 21:29:08 +00:00
Benjamin Kramer	f57c1977c1	Reflect the MC/MCDisassembler split on the include/ level. No functional change, just moving code around. llvm-svn: 258818	2016-01-26 16:44:37 +00:00
Bradley Smith	d27a6a7072	[ARM] Add DSP build attribute and extension targeting This patch was originally committed as r257885, but was reverted due to windows failures. The cause of these failures has been fixed under r258677, hence re-committing the original patch. llvm-svn: 258683	2016-01-25 11:26:11 +00:00
Bradley Smith	f277c8a5ea	[ARM] Add new system registers to ARMv8-M Baseline/Mainline This patch was originally committed as r257884, but was reverted due to windows failures. The cause of these failures has been fixed under r258677, hence re-committing the original patch. llvm-svn: 258682	2016-01-25 11:25:36 +00:00
Bradley Smith	fed3e4ac00	[ARM] Add ARMv8-M security extension instructions to ARMv8-M Baseline/Mainline This patch was originally committed as r257883, but was reverted due to windows failures. The cause of these failures has been fixed under r258677, hence re-committing the original patch. llvm-svn: 258681	2016-01-25 11:24:47 +00:00
Oliver Stannard	65b85382f6	[ARM] Add ARMv8.2-A FP16 scalar instructions This was originally committed as r255762, but reverted as it broke windows bots. Re-commitiing the exact same patch, as the underlying cause was fixed by r258677. ARMv8.2-A adds 16-bit floating point versions of all existing VFP floating-point instructions. This is an optional extension, so all of these instructions require the FeatureFullFP16 subtarget feature. The assembly for these instructions uses S registers (AArch32 does not have H registers), but the instructions have ".f16" type specifiers rather than ".f32" or ".f64". The top 16 bits of each source register are ignored, and the top 16 bits of the destination register are set to zero. These instructions are mostly the same as the 32- and 64-bit versions, but they use coprocessor 9 rather than 10 and 11. Two new instructions, VMOVX and VINS, have been added to allow packing and extracting two 16-bit floats stored in the top and bottom halves of an S register. New fixup kinds have been added for the PC-relative load and store instructions, but no ELF relocations have been added as they have a range of 512 bytes. Differential Revision: http://reviews.llvm.org/D15038 llvm-svn: 258678	2016-01-25 10:26:26 +00:00
Oliver Stannard	9f68749eba	[ARM] Operands for PKHTB alias should be swapped When the shift immediate is zero, PKHTB is an alias for PKHBT, but the order of the input operands needs to be swapped. Differential Revision: http://reviews.llvm.org/D16288 llvm-svn: 258044	2016-01-18 11:56:35 +00:00
Manuel Jacob	190577ac81	[opaque pointer types] [NFC] CallSite: use getFunctionType() instead of going through PointerType::getElementType. Patch by Eduard Burtescu. Reviewers: dblaikie, mjacob Subscribers: dsanders, llvm-commits, dblaikie Differential Revision: http://reviews.llvm.org/D16273 llvm-svn: 258023	2016-01-17 22:37:39 +00:00
Manman Ren	e5f807f928	CXX_FAST_TLS calling convention: fix issue on ARM. When we have a single basic block, the explicit copy-back instructions should be inserted right before the terminator. Before this fix, they were wrongly placed at the beginning of the basic block. PR26136 llvm-svn: 257930	2016-01-15 20:24:11 +00:00
Reid Kleckner	d4a0d18899	Revert "[ARM] Add ARMv8-M security extension instructions to ARMv8-M Baseline/Mainline" This reverts commit r257883. Somehow this didn't make it into r257916. llvm-svn: 257919	2016-01-15 18:55:12 +00:00
Reid Kleckner	47f2452da8	# This is a combination of 2 commits. # The first commit's message is: Revert "[ARM] Add DSP build attribute and extension targeting" This reverts commit b11cc50c0b4a7c8cdb628abc50b7dc226ff583dc. # This is the 2nd commit message: Revert "[ARM] Add new system registers to ARMv8-M Baseline/Mainline" This reverts commit 837d08454e3e5beb8581951ac26b22fa07df3cd5. llvm-svn: 257916	2016-01-15 18:31:29 +00:00
Bradley Smith	48b93e1f21	[ARM] Add DSP build attribute and extension targeting llvm-svn: 257885	2016-01-15 10:28:25 +00:00
Bradley Smith	42f6e90a43	[ARM] Add new system registers to ARMv8-M Baseline/Mainline llvm-svn: 257884	2016-01-15 10:28:03 +00:00
Bradley Smith	618712df04	[ARM] Add ARMv8-M security extension instructions to ARMv8-M Baseline/Mainline llvm-svn: 257883	2016-01-15 10:27:14 +00:00
Bradley Smith	433c22e35c	[ARM] Add ARMv8-A semaphore/atomic instructions to ARMv8-M Baseline/Mainline llvm-svn: 257882	2016-01-15 10:26:51 +00:00
Bradley Smith	a1189106d5	[ARM] Add B.W and CBZ instructions to ARMv8-M Baseline llvm-svn: 257881	2016-01-15 10:26:17 +00:00
Bradley Smith	519563e371	[ARM] Add SDIV/UDIV instructions to ARMv8-M Baseline llvm-svn: 257880	2016-01-15 10:25:35 +00:00
Bradley Smith	d9a99ce53d	[ARM] Add MOVW/MOVT instructions to ARMv8-M Baseline/Mainline llvm-svn: 257879	2016-01-15 10:25:14 +00:00
Bradley Smith	e26f799422	[ARM] Add ARMv8-M Baseline/Mainline LLVM targeting llvm-svn: 257878	2016-01-15 10:24:39 +00:00
Bradley Smith	4c21cba72b	[ARM] Split out ARMv8-A semaphores and atomics and ARMv7 clrex as separate features llvm-svn: 257877	2016-01-15 10:23:46 +00:00
Rui Ueyama	da00f2fdf4	Update to use new name alignTo(). llvm-svn: 257804	2016-01-14 21:06:47 +00:00
Benjamin Kramer	fc1f7d893e	[ARM] Use the efficient version of BitVector::set and a static_assert. No functional change intended. llvm-svn: 257766	2016-01-14 14:33:04 +00:00
Rafael Espindola	8340f94df1	Convert a few assert failures into proper errors. Fixes PR25944. llvm-svn: 257697	2016-01-13 22:56:57 +00:00
Ana Pazos	359cab3bb3	Guard fabs to bfc convert with V6T2 flag Summary: BFC instructions are available in ARMv6T2 and above. Reviewers: t.p.northover Subscribers: aemerson Differential Revision: http://reviews.llvm.org/D16076 llvm-svn: 257546	2016-01-13 00:03:35 +00:00
Quentin Colombet	f8e3030794	[ARM] Mark VMOV with immediate: isAsCheapAsMove. VMOVs are not strictly speaking cheap, but they are as expensive as a vector copy (VORR), so we should prefer rematerialization over splitting when it applies. rdar://problem/23754176 llvm-svn: 257545	2016-01-13 00:02:40 +00:00
Keno Fischer	00021429d4	[ARM] Fix several state persistence bugs Summary: This fixes three bugs, in all of which state is not or incorrecly reset between objects (i.e. when reusing the same pass manager to create multiple object files): 1) AttributeSection needs to be reset to nullptr, because otherwise the backend will try to emit into the old object file's attribute section causing a segmentation fault. 2) MappingSymbolCounter needs to be reset, otherwise the second object file will start where the first one left off. 3) The MCStreamer base class resets the Streamer's e_flags settings. Since EF_ARM_EABI_VER5 is set on streamer creation, we need to set it again after the MCStreamer was rest. Also rename Reset (uppser case) to EHReset to avoid confusion with reset (lower case). Reviewers: rengolin Differential Revision: http://reviews.llvm.org/D15950 llvm-svn: 257473	2016-01-12 13:38:15 +00:00
Manman Ren	5e9e65e705	CXX_FAST_TLS calling convention: performance improvement for ARM. This is the same change on ARM as r255821 on AArch64. rdar://9001553 llvm-svn: 257424	2016-01-12 00:47:18 +00:00
Manman Ren	1602605bf8	CXX_FAST_TLS calling convention: Add support for ARM on Darwin. rdar://9001553 llvm-svn: 257417	2016-01-11 23:50:43 +00:00
Weiming Zhao	4b3b13d3bc	RBIT Instruction only available for ARMv6t2 and above. Summary: r255334 matches bit-reverse pattern in InstCombine and generates calls to Instrinsic::bitreverse. RBIT instruction is only available for ARMv6t2 and above. This patch has the intrinsic expanded during legalization for ARMv4 and ARMv5. Patch by Z. Zheng <zhaoshiz@codeaurora.org> Reviewers: apazos, jmolloy, weimingz Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D15932 llvm-svn: 257188	2016-01-08 18:43:41 +00:00
Weiming Zhao	48c033e021	Disable shrink-wrap for Thumb1 Summary: In ARMConstantIslandPass, which runs after Shrink Wrap pass, long jumps will be fixed up as BL (tBfar) which depends on spilling LR in epilogue. However, shrink-wrap may remove the LR, which causes issues when the function returns. Reviewers: qcolombet, rengolin Subscribers: aemerson, rengolin Differential Revision: http://reviews.llvm.org/D15984 llvm-svn: 257187	2016-01-08 18:37:43 +00:00
Eric Christopher	b793230797	Add some testing for thumb1 and thumb2 inline asm immediate constraints and fix a couple of bugs on inspection. Also fixes PR26061. llvm-svn: 257122	2016-01-08 00:34:44 +00:00
Tim Northover	bd41cf880c	ARM: support TLS accesses on Darwin platforms Darwin TLS accesses most closely resemble ELF's general-dynamic situation, since they have to be able to handle all possible situations. The descriptors and so on are obviously slightly different though. llvm-svn: 257039	2016-01-07 09:03:03 +00:00
Philip Reames	c86ed0055d	Extract helper function to merge MemoryOperand lists [NFC] In the discussion on http://reviews.llvm.org/D15730, Andy pointed out we had a utility function for merging MMO lists. Since it turned we actually had two copies and there's another review in progress (http://reviews.llvm.org/D15230) which needs the same, extract it into a utility function and clean up the interfaces to make it easier to use with a MachineInstBuilder. I introduced a pair here to track size and allocation together. I think we should probably move in the direction of the MachineOperandsRef helper class, but I'm leaving that for further work. I want to get the poison state introduced before I make major changes to the interface. Differential Revision: http://reviews.llvm.org/D15757 llvm-svn: 256909	2016-01-06 04:39:03 +00:00
MinSeong Kim	a7385ebf78	[AArch64] Add support for Samsung Exynos-M1 Adds core tuning support for new Samsung Exynos-M1 core (ARMv8-A). Differential Revision: http://reviews.llvm.org/D15663 llvm-svn: 256828	2016-01-05 12:51:59 +00:00
Craig Topper	e30b8ca149	Use std::is_sorted and std::none_of instead of manual loops. NFC llvm-svn: 256719	2016-01-03 19:43:40 +00:00
Artyom Skrobov	2aca0c622a	[Thumb] Fix assembler error 'cannot honor width suffix pop {lr}' Summary: * avoid generating POP {LR} in Thumb1 epilogues * combine MOV LR, Rx + BX LR -> BX Rx in a peephole optimization pass * combine POP {LR} + B + BX LR -> POP {PC} on v5T+ Test cases by Ana Pazos Differential Revision: http://reviews.llvm.org/D15707 llvm-svn: 256523	2015-12-28 21:40:45 +00:00
Roman Divacky	73fc84761f	Support clrex instruction on ARMv6k. Patch by Andrew Turner. llvm-svn: 256505	2015-12-28 17:47:23 +00:00
Craig Topper	daf2e3ff7a	Remove extra forward declarations and scrub includes for all in tree InstPrinters. NFC llvm-svn: 256427	2015-12-25 22:10:01 +00:00
David Majnemer	03e2cc3007	[MC, COFF] Support link /incremental conditionally Today, we always take into account the possibility that object files produced by MC may be consumed by an incremental linker. This results in us initialing fields which vary with time (TimeDateStamp) which harms hermetic builds (e.g. verifying a self-host went well) and produces sub-optimal code because we cannot assume anything about the relative position of functions within a section (call sites can get redirected through incremental linker thunks). Let's provide an MCTargetOption which controls this behavior so that we can disable this functionality if we know a-priori that the build will not rely on /incremental. llvm-svn: 256203	2015-12-21 22:09:27 +00:00
Adrian Prantl	5d9acc2443	Teach ARMLoadStoreOptimizer to ignore DBG_VALUE instructions when merging instructions. As noted in PR24563. rdar://problem/23963293 llvm-svn: 256183	2015-12-21 19:25:03 +00:00
Chad Rosier	353d71914a	Remove extra whitespace. NFC. llvm-svn: 256173	2015-12-21 18:08:05 +00:00
Weiming Zhao	613c6862fa	Fix mapping of @llvm.arm.ssat/usat intrinsics to ssat/usat instructions for Thumb2 Summary: r250697 fixed the mapping for ARM mode. We have to do the same for Thumb2 otherwise the same llvm.arm.ssat() will generate different saturating amount for ARM and Thumb. r250697: http://reviews.llvm.org/rL250697 Reviewers: rmaprath Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D15653 llvm-svn: 256115	2015-12-20 06:41:44 +00:00
Reid Kleckner	187d33ee74	Revert "[ARM] Add ARMv8.2-A FP16 scalar instructions" This reverts commit r255762. llvm-svn: 255806	2015-12-16 19:21:03 +00:00
Oliver Stannard	2de8c16913	[ARM] Add ARMv8.2-A FP16 vector instructions ARMv8.2-A adds 16-bit floating point versions of all existing SIMD floating-point instructions. This is an optional extension, so all of these instructions require the FeatureFullFP16 subtarget feature. Note that VFP without SIMD is not a valid combination for any version of ARMv8-A, but I have ensured that these instructions all depend on both FeatureNEON and FeatureFullFP16 for consistency. Differential Revision: http://reviews.llvm.org/D15039 llvm-svn: 255764	2015-12-16 12:37:39 +00:00
Oliver Stannard	48568cbe18	[ARM] Add ARMv8.2-A FP16 scalar instructions ARMv8.2-A adds 16-bit floating point versions of all existing VFP floating-point instructions. This is an optional extension, so all of these instructions require the FeatureFullFP16 subtarget feature. The assembly for these instructions uses S registers (AArch32 does not have H registers), but the instructions have ".f16" type specifiers rather than ".f32" or ".f64". The top 16 bits of each source register are ignored, and the top 16 bits of the destination register are set to zero. These instructions are mostly the same as the 32- and 64-bit versions, but they use coprocessor 9 rather than 10 and 11. Two new instructions, VMOVX and VINS, have been added to allow packing and extracting two 16-bit floats stored in the top and bottom halves of an S register. New fixup kinds have been added for the PC-relative load and store instructions, but no ELF relocations have been added as they have a range of 512 bytes. Differential Revision: http://reviews.llvm.org/D15038 llvm-svn: 255762	2015-12-16 11:35:44 +00:00
Cong Hou	c106989fd5	Normalize MBB's successors' probabilities in several locations. This patch adds some missing calls to MBB::normalizeSuccProbs() in several locations where it should be called. Those places are found by checking if the sum of successors' probabilities is approximate one in MachineBlockPlacement pass with some instrumented code (not in this patch). Differential revision: http://reviews.llvm.org/D15259 llvm-svn: 255455	2015-12-13 09:26:17 +00:00
Saleem Abdulrasool	778c268594	ARM: only emit EABI attributes on EABI targets EABI attributes should only be emitted on EABI targets. This prevents the emission of the optimization goals EABI attribute on Windows ARM. llvm-svn: 255448	2015-12-13 05:27:45 +00:00
Hal Finkel	cd8664c3c2	Revert r248483, r242546, r242545, and r242409 - absdiff intrinsics After much discussion, ending here: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151123/315620.html it has been decided that, instead of having the vectorizer directly generate special absdiff and horizontal-add intrinsics, we'll recognize the relevant reduction patterns during CodeGen. Accordingly, these intrinsics are not needed (the operations they represent can be pattern matched, as is already done in some backends). Thus, we're backing these out in favor of the current development work. r248483 - Codegen: Fix llvm.*absdiff semantic. r242546 - [ARM] Use [SU]ABSDIFF nodes instead of intrinsics for VABD/VABA r242545 - [AArch64] Use [SU]ABSDIFF nodes instead of intrinsics for ABD/ABA r242409 - [Codegen] Add intrinsics 'absdiff' and corresponding SDNodes for absolute difference operation llvm-svn: 255387	2015-12-11 23:11:52 +00:00
Matthias Braun	60d69e2865	CodeGen: Redo analyzePhysRegs() and computeRegisterLiveness() computeRegisterLiveness() was broken in that it reported dead for a register even if a subregister was alive. I assume this was because the results of analayzePhysRegs() are hard to understand with respect to subregisters. This commit: Changes the results of analyzePhysRegs (=struct PhysRegInfo) to be clearly understandable, also renames the fields to avoid silent breakage of third-party code (and improve the grammar). Fix all (two) users of computeRegisterLiveness() in llvm: By reenabling it and removing workarounds for the bug. This fixes http://llvm.org/PR24535 and http://llvm.org/PR25033 Differential Revision: http://reviews.llvm.org/D15320 llvm-svn: 255362	2015-12-11 19:42:09 +00:00
Tim Northover	d91d635b36	ARM: don't use a deleted node as the BaseReg in complex pattern. We mutated the DAG, which invalidated the node we were trying to use as a base register. Sometimes we got away with it, but other times the node really did get deleted before it was finished with. Should fix PR25733 llvm-svn: 255120	2015-12-09 15:54:50 +00:00
Ahmed Bougacha	97564c3a1b	[AArch64][ARM] Don't base interleaved op legality on type alloc size. Otherwise, we think that most types that look like they'd fit in a legal vector type are legal (so, basically, any vector type with a size between 33 and 128 bits, I think, since we use pow2 alignment; e.g., v2i25, v3f32, ...). DataLayout::getTypeAllocSize rounds up based on alignment. When checking for target intrinsic legality, that's not what we want: if rounding makes a difference, the type isn't legal, and the target intrinsics shouldn't be used, as they are always assumed legal. One could make the argument that alloc size is ultimately the most relevant here, since we're dealing with LD/ST intrinsics. That's only true if we did legalize them though; that's a problem for another day. Use DataLayout::getTypeSizeInBits instead of getTypeAllocSizeInBits. Type::getSizeInBits can't be used because that'd gratuitously break pointer vector support. Some of these uses are currently fine, because we only hit them when the type is already known legal (e.g., r114454). Update them for consistency. It's faster to avoid the rounding anyway! llvm-svn: 255089	2015-12-09 01:19:50 +00:00
Artyom Skrobov	0a37b80bcb	Fix ARMv4T (Thumb1) epilogue generation Summary: Before ARMv5T, Thumb1 code could not pop PC, as described at D14357 and D14986; so we need the special fixup in the epilogue. Reviewers: jroelofs, qcolombet Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D15126 llvm-svn: 255047	2015-12-08 19:59:01 +00:00
Renato Golin	412ee3d45d	[ARM] Allowing SP/PC for AND/BIC mod_imm_not AND/BIC instructions do accept SP/PC, so the register class should be more generic (rGPR -> GPR) to cope with that case. Adding more tests. llvm-svn: 255034	2015-12-08 18:10:58 +00:00
Artyom Skrobov	e9b3fb8603	[ARM] Generate ABI_optimization_goals build attribute, as described in the ARM ARM. Summary: This reverts r254234, and adds a simple fix for the annoying case of use-after-free. Reviewers: rengolin Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D15236 llvm-svn: 254912	2015-12-07 14:22:39 +00:00
Bradley Smith	d5a1f47a63	[ARM] Flag vcvt{t,b} with an f16 type specifier as part of the FP16 extension Additionally correct the Cortex-R7 definition to allow the FP16 feature. llvm-svn: 254900	2015-12-07 10:54:36 +00:00
Craig Topper	e5e035a3a8	Replace uint16_t with the MCPhysReg typedef in many places. A lot of physical register arrays already use this typedef. llvm-svn: 254843	2015-12-05 07:13:35 +00:00
Quentin Colombet	901f036353	[ARM] When a bitcast is about to be turned into a VMOVDRR, try to combine it with its source instead of forcing the values on GPRs. This improves the lowering of vector code when such bitcasts happen in the middle of vector computations. rdar://problem/23691584 llvm-svn: 254684	2015-12-04 01:53:14 +00:00
Tim Northover	f520eff782	AArch64: use ldxp/stxp pair to implement 128-bit atomic loads. The ARM ARM is clear that 128-bit loads are only guaranteed to have been atomic if there has been a corresponding successful stxp. It's less clear for AArch32, so I'm leaving that alone for now. llvm-svn: 254524	2015-12-02 18:12:57 +00:00
Christof Douma	8b5dc2c94e	[AArch64]: Add support for Cortex-A35 Adds support for the new Cortex-A35 ARMv8-A core. llvm-svn: 254503	2015-12-02 11:53:44 +00:00
Matthias Braun	b258d794dd	ARM: Change ArchCheck field to uint64_t The values in this field are compared against getAvailableFeatures() which returns an uint64_t. This was causing problems in an internal branch. llvm-svn: 254462	2015-12-01 21:48:52 +00:00
Artyom Skrobov	5d1f2524a0	Fix Thumb1 epilogue generation Summary: This had been broken for a very long time, but nobody noticed until D14357 enabled shrink-wrapping by default. Reviewers: jroelofs, qcolombet Subscribers: tyomitch, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D14986 llvm-svn: 254444	2015-12-01 19:25:11 +00:00
Oliver Stannard	4667071574	[ARM] Add ARMv8.2-A to TargetParser Add ARMv8.2-A to TargetParser, so that it can be used by the clang command-line options and the .arch directive. Most testing of this will be done in clang, checking that the command-line options that this enables work. Differential Revision: http://reviews.llvm.org/D15037 llvm-svn: 254400	2015-12-01 10:33:56 +00:00
Oliver Stannard	8addbf4350	[ARM] Add subtarget features for ARMv8.2-A This adds subtarget features for ARMv8.2-A, which builds on (and requires the features from) ARMv8.1-A. Most assembler-visible features of ARMv8.2-A are system instructions, and are all required parts of the architecture, so just depend on the HasV8_2aOps subtarget feature. There is also one large, optional feature, which adds 16-bit floating point versions of all existing floating-point instructions (VFP and SIMD), this is represented by the FeatureFullFP16 subtarget feature. Differential Revision: http://reviews.llvm.org/D15036 llvm-svn: 254399	2015-12-01 10:23:06 +00:00
Craig Topper	8072081b63	[ARM] Use range-based for loops to avoid the need for calculating an array size that I would have otherwise cconverted to array_lengthof. NFC llvm-svn: 254381	2015-12-01 06:13:01 +00:00
Cong Hou	d97c100dc4	Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces. (This is the second attempt to submit this patch. The first caused two assertion failures and was reverted. See https://llvm.org/bugs/show_bug.cgi?id=25687) The patch in http://reviews.llvm.org/D13745 is broken into four parts: 1. New interfaces without functional changes (http://reviews.llvm.org/D13908). 2. Use new interfaces in SelectionDAG, while in other passes treat probabilities as weights (http://reviews.llvm.org/D14361). 3. Use new interfaces in all other passes. 4. Remove old interfaces. This patch is 3+4 above. In this patch, MBB won't provide weight-based interfaces any more, which are totally replaced by probability-based ones. The interface addSuccessor() is redesigned so that the default probability is unknown. We allow unknown probabilities but don't allow using it together with known probabilities in successor list. That is to say, we either have a list of successors with all known probabilities, or all unknown probabilities. In the latter case, we assume each successor has 1/N probability where N is the number of successors. An assertion checks if the user is attempting to add a successor with the disallowed mixed use as stated above. This can help us catch many misuses. All uses of weight-based interfaces are now updated to use probability-based ones. Differential revision: http://reviews.llvm.org/D14973 llvm-svn: 254377	2015-12-01 05:29:22 +00:00
Hans Wennborg	1dbaf67537	Revert r254348: "Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces." and the follow-up r254356: "Fix a bug in MachineBlockPlacement that may cause assertion failure during BranchProbability construction." Asserts were firing in Chromium builds. See PR25687. llvm-svn: 254366	2015-12-01 03:49:42 +00:00
Cong Hou	fa1917c673	Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces. The patch in http://reviews.llvm.org/D13745 is broken into four parts: 1. New interfaces without functional changes (http://reviews.llvm.org/D13908). 2. Use new interfaces in SelectionDAG, while in other passes treat probabilities as weights (http://reviews.llvm.org/D14361). 3. Use new interfaces in all other passes. 4. Remove old interfaces. This patch is 3+4 above. In this patch, MBB won't provide weight-based interfaces any more, which are totally replaced by probability-based ones. The interface addSuccessor() is redesigned so that the default probability is unknown. We allow unknown probabilities but don't allow using it together with known probabilities in successor list. That is to say, we either have a list of successors with all known probabilities, or all unknown probabilities. In the latter case, we assume each successor has 1/N probability where N is the number of successors. An assertion checks if the user is attempting to add a successor with the disallowed mixed use as stated above. This can help us catch many misuses. All uses of weight-based interfaces are now updated to use probability-based ones. Differential revision: http://reviews.llvm.org/D14973 llvm-svn: 254348	2015-12-01 00:02:51 +00:00
Quentin Colombet	cdad10f333	[ARM] For old thumb ISA like v4t, we cannot use PC directly in pop. Fix the epilogue emission to account for that. llvm-svn: 254325	2015-11-30 20:37:58 +00:00
Renato Golin	5dbc8a5283	Revert "[ARM] Generate ABI_optimization_goals build attribute, as described in the ARM ARM." This reverts commit r254201 and r254202, as it broke test-suite, self-hosting and sanitizer tests on ARM buildbots. llvm-svn: 254234	2015-11-28 17:23:46 +00:00
Artyom Skrobov	f01a59f9fb	Follow-up fix for r254201 llvm-svn: 254202	2015-11-27 16:20:34 +00:00
Artyom Skrobov	b955b90509	[ARM] Generate ABI_optimization_goals build attribute, as described in the ARM ARM. Summary: Since this build attribute corresponds to a whole module, and different functions in a module may differ in the optimizations enabled for them, this attribute is emitted after all functions, and only in the case that the optimization goals for all functions match. Reviewers: logan, hans Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D14934 llvm-svn: 254201	2015-11-27 15:30:51 +00:00
Martell Malone	d12292480a	ARM: address WOA unsigned division overflow crash Building on r253865 the crash is not limited to signed overflows. Disable custom handling of unsigned 32-bit and 64-bit integer divide. Add test cases for both 32-bit and 64-bit unsigned integer overflow. llvm-svn: 254158	2015-11-26 15:34:03 +00:00
Artyom Skrobov	314ee04268	Expose isXxxConstant() functions from SelectionDAGNodes.h (NFC) Summary: Many target lowerings copy-paste the code to test SDValues for known constants. This code can instead be shared in SelectionDAG.cpp, and reused in the targets. Reviewers: MatzeB, andreadb, tstellarAMD Subscribers: arsenm, jyknight, llvm-commits Differential Revision: http://reviews.llvm.org/D14945 llvm-svn: 254085	2015-11-25 19:41:11 +00:00
Martell Malone	a6b867eb0d	ARM: address WoA division overflow crash Disable custom handling of signed 32-bit and 64-bit integer divide. Add test cases for both 32-bit and 64-bit integer overflow crashes. llvm-svn: 253865	2015-11-23 13:11:39 +00:00
Matthias Braun	5a1857b6eb	ARMLoadStoreOptimizer: Cleanup isMemoryOp(); NFC llvm-svn: 253757	2015-11-21 02:09:49 +00:00
Vinicius Tinti	67cf33d9ab	Test commit llvm-svn: 253737	2015-11-20 23:20:12 +00:00
Artyom Skrobov	91f339ab3f	Handle ARMv6-J as an alias, instead of fake architecture Summary: This follows D14577 to treat ARMv6-J as an alias for ARMv6, instead of an architecture in its own right. The functional change is that the default CPU when targeting ARMv6-J changes from arm1136j-s to arm1136jf-s, which is currently used as the default CPU for ARMv6; both are, in fact, ARMv6-J CPUs. The J-bit (Jazelle support) is irrelevant to LLVM, and it doesn't affect code generation, attributes, optimizations, or anything else, apart from selecting the default CPU. Reviewers: rengolin, logan, compnerd Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D14755 llvm-svn: 253675	2015-11-20 16:46:09 +00:00
Pete Cooper	67cf9a723b	Revert "Change memcpy/memset/memmove to have dest and source alignments." This reverts commit r253511. This likely broke the bots in http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202 http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787 llvm-svn: 253543	2015-11-19 05:56:52 +00:00
Pete Cooper	72bc23ef02	Change memcpy/memset/memmove to have dest and source alignments. Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html These intrinsics currently have an explicit alignment argument which is required to be a constant integer. It represents the alignment of the source and dest, and so must be the minimum of those. This change allows source and dest to each have their own alignments by using the alignment attribute on their arguments. The alignment argument itself is removed. There are a few places in the code for which the code needs to be checked by an expert as to whether using only src/dest alignment is safe. For those places, they currently take the minimum of src/dest alignments which matches the current behaviour. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false) will now read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false) For out of tree owners, I was able to strip alignment from calls using sed by replacing: (call.llvm\.memset.)i32\ [0-9]\,\ i1 false\) with: $1i1 false) and similarly for memmove and memcpy. I then added back in alignment to test cases which needed it. A similar commit will be made to clang which actually has many differences in alignment as now IRBuilder can generate different source/dest alignments on calls. In IRBuilder itself, a new argument was added. Instead of calling: CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, / isVolatile / false) you now call CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, / isVolatile */ false) There is a temporary class (IntegerAlignment) which takes the source alignment and rejects implicit conversion from bool. This is to prevent isVolatile here from passing its default parameter to the source alignment. Note, changes in future can now be made to codegen. I didn't change anything here, but this change should enable better memcpy code sequences. Reviewed by Hal Finkel. llvm-svn: 253511	2015-11-18 22:17:24 +00:00
Tim Northover	747ae9a7de	ARM: make sure backend is consistent about exception handling method. It turns out we decide whether to use SjLj exceptions or some alternative in two separate places in the backend, and they disagreed with each other. This led to inconsistent code and is generally a terrible idea. So make them consistent and add an assert that they do match (unfortunately MCAsmInfo isn't available in opt, so it can't be used to initialise the CodeGen version directly). llvm-svn: 253502	2015-11-18 21:10:39 +00:00
Rafael Espindola	449711cb36	Stop producing .data.rel sections. If a section is rw, it is irrelevant if the dynamic linker will write to it or not. It looks like llvm implemented this because gcc was doing it. It looks like gcc implemented this in the hope that it would put all the relocated items close together and speed up the dynamic linker. There are two problem with this: * It doesn't work. Both bfd and gold will map .data.rel to .data and concatenate the input sections in the order they are seen. * If we want a feature like that, it can be implemented directly in the linker since it knowns where the dynamic relocations are. llvm-svn: 253436	2015-11-18 06:02:15 +00:00
Quentin Colombet	8cb95b8e51	[ARM] Enable shrink-wrapping by default. Differential Revision: http://reviews.llvm.org/D14357 rdar://problem/21942589 llvm-svn: 253411	2015-11-18 00:40:54 +00:00
Charlie Turner	7968b981bf	[ARM] Don't pessimize i32 vselect. The underlying issues surrounding codegen for 32-bit vselects have been resolved. The pessimistic costs for 64-bit vselects remain due to the bad scalarization that is still happening there. I tested this on A57 in T32, A32 and A64 modes. I saw no regressions, and some improvements. From my benchmarks, I saw these improvements in A57 (T32) spec.cpu2000.ref.177_mesa 5.95% lnt.SingleSource/Benchmarks/Shootout/strcat 12.93% lnt.MultiSource/Benchmarks/MiBench/telecomm-CRC32/telecomm-CRC32 11.89% I also measured A57 A32, A53 T32 and A9 T32 and found no performance regressions. I see much bigger wins in third-party benchmarks with this change Differential Revision: http://reviews.llvm.org/D14743 llvm-svn: 253349	2015-11-17 17:25:15 +00:00
Bradley Smith	982a8888b8	[ARM] Default to ARMv4t in favour of adding Other to ARMArch llvm-svn: 253335	2015-11-17 13:38:29 +00:00
Charlie Turner	b4613c6973	[ARM] Match VABDL from log2 shuffles. Differential Revision: http://reviews.llvm.org/D14664 llvm-svn: 253334	2015-11-17 13:21:35 +00:00
Bradley Smith	4320205484	[ARM] Properly initialize ARMArch in the ARM subtarget llvm-svn: 253331	2015-11-17 11:57:33 +00:00
Oliver Stannard	9be59af3ab	[Assembler] Make fatal assembler errors non-fatal Currently, if the assembler encounters an error after parsing (such as an out-of-range fixup), it reports this as a fatal error, and so stops after the first error. However, for most of these there is an obvious way to recover after emitting the error, such as emitting the fixup with a value of zero. This means that we can report on all of the errors in a file, not just the first one. MCContext::reportError records the fact that an error was encountered, so we won't actually emit an object file with the incorrect contents. Differential Revision: http://reviews.llvm.org/D14717 llvm-svn: 253328	2015-11-17 10:00:43 +00:00
Rafael Espindola	65e4902156	Drop prelink support. The way prelink used to work was * The compiler decides if a given section only has relocations that are know to point to the same DSO. If so, it names it .data.rel.ro.local<something>. * The static linker puts all of these together. * The prelinker program assigns addresses to each library and resolves the local relocations. There are many problems with this: * It is incompatible with address space randomization. * The information passed by the compiler is redundant. The linker knows if a given relocation is in the same DSO or not. If could sort by that if so desired. * There are newer ways of speeding up DSO (gnu hash for example). * Even if we want to implement this again in the compiler, the previous implementation is pretty broken. It talks about relocations that are "resolved by the static linker". If they are resolved, there are none left for the prelinker. What one needs to track is if an expression will require only dynamic relocations that point to the same DSO. At this point it looks like the prelinker is an historical curiosity. For example, fedora has retired it because it failed to build for two releases (http://pkgs.fedoraproject.org/cgit/prelink.git/commit/?id=eb43100a8331d91c801ee3dcdb0a0bb9babfdc1f) This patch removes support for it. That is, it stops printing the ".local" sections. llvm-svn: 253280	2015-11-17 00:51:23 +00:00
Petr Pavlu	a770379524	[ARM] Prevent use of a value pointed by end() iterator when placing a jump table Function ARMConstantIslands::doInitialJumpTablePlacement() iterates over all basic blocks in a machine function. It calls `MI = MBB.getLastNonDebugInstr()` to get the last instruction in each block and then uses MI->getOpcode() to decide what to do. If getLastNonDebugInstr() returns MBB.end() (for example, when the block does not contain any instructions) then calling getOpcode() on this value is incorrect. Avoid this problem by checking the result of getLastNonDebugInstr(). Differential Revision: http://reviews.llvm.org/D14694 llvm-svn: 253222	2015-11-16 16:41:13 +00:00
Oliver Stannard	9327a7575b	[ARM,AArch64] Store source location of asm constant pool entries Storing the source location of the expression that created a constant pool entry allows us to emit better error messages if we later discover that the expression cannot be represented by a relocation. Differential Revision: http://reviews.llvm.org/D14646 llvm-svn: 253220	2015-11-16 16:25:47 +00:00
Oliver Stannard	09be060606	[ARM,AArch64] Store source location for values in assembly files The MCValue class can store a SMLoc to allow better error messages to be emitted if an error is detected after parsing. The ARM and AArch64 assembly parsers were not setting this, so error messages did not have source information. Differential Revision: http://reviews.llvm.org/D14645 llvm-svn: 253219	2015-11-16 16:22:47 +00:00
Artyom Skrobov	f187a65f99	Handle ARMv6KZ naming Summary: * ARMv6KZ is the "canonical" name, given in the ARMARM * ARMv6Z is an "official abbreviation" for it, mentioned in the ARMARM * ARMv6ZK is a popular misspelling, which we should support as an alias. The patch corrects the handling of the names. Functional changes: * ARMv6Z no longer treated as an architecture in its own right * ARMv6ZK renamed to ARMv6KZ, accepting ARMv6ZK as an alias * arm1176jz-s and arm1176jzf-s recognized as ARMv6ZK, instead of ARMv6K * default ARMv6K CPU changed to arm1176j-s Reviewers: rengolin, logan, compnerd Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D14568 llvm-svn: 253206	2015-11-16 14:05:32 +00:00
Bradley Smith	323fee105d	[ARM] Introduce subtarget features per ARM architecture. This allows for accurate architecture targeting as well as removing duplicate information (hardcoded feature strings) from MCTargetDesc. llvm-svn: 253196	2015-11-16 11:10:19 +00:00
James Molloy	2018091e87	Properly check if a CMPZ node is in fact comparing against zero This was left implicit and never ever checked, which means we could have a CMPZ against some non-zero value and we were carrying on with BFI conversion regardless. Caught by Oliver Stannard using csmith; regression test added. llvm-svn: 253195	2015-11-16 10:49:25 +00:00
Akira Hatanaka	b11ef0897c	Reduce the size of MCRelaxableFragment. MCRelaxableFragment previously kept a copy of MCSubtargetInfo and MCInst to enable re-encoding the MCInst later during relaxation. A copy of MCSubtargetInfo (instead of a reference or pointer) was needed because the feature bits could be modified by the parser. This commit replaces the MCSubtargetInfo copy in MCRelaxableFragment with a constant reference to MCSubtargetInfo. The copies of MCSubtargetInfo are kept in MCContext, and the target parsers are now responsible for asking MCContext to provide a copy whenever the feature bits of MCSubtargetInfo have to be toggled. With this patch, I saw a 4% reduction in peak memory usage when I compiled verify-uselistorder.lto.bc using llc. rdar://problem/21736951 Differential Revision: http://reviews.llvm.org/D14346 llvm-svn: 253127	2015-11-14 06:35:56 +00:00
Akira Hatanaka	bd9fc28444	[MCTargetAsmParser] Move the member varialbes that reference MCSubtargetInfo in the subclasses into MCTargetAsmParser and define a member function getSTI. This is done in preparation for making changes to shrink the size of MCRelaxableFragment. (see http://reviews.llvm.org/D14346). llvm-svn: 253124	2015-11-14 05:20:05 +00:00
James Molloy	b564098c62	[ARM] Replace ARMISD::RBIT with ISD::BITREVERSE ISD::BITREVERSE matches "rbit" completely, so remove ARMISD::RBIT and mark ISD::BITREVERSE as legal, adding a test for lowering. llvm-svn: 253047	2015-11-13 16:05:22 +00:00
Artyom Skrobov	2c2f378f8a	Cull non-standard variants of ARM architectures (NFC) Summary: This patch changes ARMV5, ARMV5E, ARMV6SM, ARMV6HL, ARMV7, ARMV7L, ARMV7HL, ARMV7EM to be treated as aliases for the corresponding standard architectures, instead of as actual architectures. Reviewers: rengolin Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D14577 llvm-svn: 252903	2015-11-12 15:51:41 +00:00
James Molloy	8e99e97f2a	[ARM] CMOV->BFI combining: handle both senses of CMPZ I completely misunderstood what ARMISD::CMPZ means. It's not "compare equal to zero", it's "compare, only setting the zero/Z flag". It can either be equal-to-zero or not-equal-to-zero, and we weren't checking what sense it was. If it's equal-to-zero, we can swap the operands around and pretend like it is not-equal-to-zero, which is both a bug fix and lets us handle more cases. llvm-svn: 252891	2015-11-12 13:49:17 +00:00
Renato Golin	93064025bd	Revert "[ARM] Enable shrink-wrapping by default." This reverts commit r252825, as it broke ASAN on ARM. Investigating... llvm-svn: 252889	2015-11-12 13:34:50 +00:00
Quentin Colombet	10f9813528	[ARM] Enable shrink-wrapping by default. Differential Revision: http://reviews.llvm.org/D14357 rdar://problem/21942589 llvm-svn: 252825	2015-11-11 23:31:46 +00:00
Diego Novillo	0767ae5896	Properly fix unused variable in disable-assert builds. I missed the side-effects of ParseBFI in my previous attempt (r252748). Thanks dblaikie for the suggestion of adding a void use of the unused variable instead. llvm-svn: 252751	2015-11-11 16:39:22 +00:00
Diego Novillo	29f88a2460	Remove unused variable in disable-assert builds. NFC. llvm-svn: 252748	2015-11-11 16:14:52 +00:00
James Molloy	ce12c92f66	[ARM] Combine BFIs together If we have a chain of BFIs, we may be able to combine several together into one merged BFI. We can do this if the "from" bits from one BFI OR'd with the "from" bits from the other BFI form a contiguous range, and the same with the "to" bits. llvm-svn: 252740	2015-11-11 15:40:40 +00:00
Sanjay Patel	af1b48bfdc	[ARM] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz() ARM V6T2 has instructions for efficient count-leading/trailing-zeros, so this should be considered a cheap operation (and therefore fair game for speculation) for any ARM V6T2 implementation. The net result of allowing this speculation for the regression tests in this patch is that we get this code: ctlz: clz r0, r0 bx lr cttz: rbit r0, r0 clz r0, r0 bx lr Instead of: ctlz: cmp r0, #0 moveq r0, #32 clzne r0, r0 bx lr cttz: cmp r0, #0 moveq r0, #32 rbitne r0, r0 clzne r0, r0 bx lr This will help solve a general speculation/despeculation problem noted in PR24818: https://llvm.org/bugs/show_bug.cgi?id=24818 Differential Revision: http://reviews.llvm.org/D14469 llvm-svn: 252639	2015-11-10 19:24:31 +00:00
James Molloy	9d55f19cfa	Reapply "[ARM] Combine CMOV into BFI where possible" Added fixes for stage2 failures: CMOV is not commutable; commuting the operands results in the condition being flipped! d'oh! Original commit message: If we have a CMOV, OR and AND combination such as: if (x & CN) y \|= CM; And: * CN is a single bit; * All bits covered by CM are known zero in y; Then we can convert this to a sequence of BFI instructions. This will always be a win if CM is a single bit, will always be no worse than the TST & OR sequence if CM is two bits, and for thumb will be no worse if CM is three bits (due to the extra IT instruction). llvm-svn: 252606	2015-11-10 14:22:05 +00:00
Akira Hatanaka	3bfc3e2d2a	[ARM] Handle t2ADDri in ARMAsmPrinter::EmitUnwindingInstruction. This fixes a bug in ARMAsmPrinter::EmitUnwindingInstruction where llvm_unreachable was reached because t2ADDri wasn't handled. Test case provided by Tim Northover. rdar://problem/23270609 http://reviews.llvm.org/D14518 llvm-svn: 252557	2015-11-10 00:10:41 +00:00
Renato Golin	6d435f12f0	[EABI] Add LLVM support for -meabi flag "GCC requires the freestanding environment provide memcpy, memmove, memset and memcmp": https://gcc.gnu.org/onlinedocs/gcc-5.2.0/gcc/Standards.html Hence in GNUEABI targets LLVM should not convert 'memops' to their equivalent '__aeabi_memops'. This convertion violates GCC contract. The -meabi flag controls whether or not LLVM will modify 'memops' in GNUEABI targets. Without -meabi: use the triple default EABI. With -meabi=default: use the triple default EABI. With -meabi=gnu: use 'memops'. With -meabi=4 or -meabi=5: use '__aeabi_memops'. With -meabi set to an unknown value: same as -meabi=default. Patch by Vinicius Tinti. llvm-svn: 252462	2015-11-09 12:40:30 +00:00
Renato Golin	1d8a2c952f	Revert "[ARM] Combine CMOV into BFI where possible" This reverts commit r252057, as it broke ARM self-hosting buildbots, probably due to a code-gen fault. llvm-svn: 252460	2015-11-09 12:19:10 +00:00
Colin LeMahieu	8a0453e23a	[AsmParser] Backends can parameterize ASM tokenization. llvm-svn: 252439	2015-11-09 00:31:07 +00:00
Joseph Tremoulet	f748c8937e	[WinEH] Update exception pointer registers Summary: The CLR's personality routine passes these in rdx/edx, not rax/eax. Make getExceptionPointerRegister a virtual method parameterized by personality function to allow making this distinction. Similarly make getExceptionSelectorRegister a virtual method parameterized by personality function, for symmetry. Reviewers: pgavlin, majnemer, rnk Subscribers: jyknight, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D14344 llvm-svn: 252383	2015-11-07 01:11:31 +00:00
Reid Kleckner	b8fd162fc5	[WinEH] Mark funclet entries and exits as clobbering all registers Summary: In this implementation, LiveIntervalAnalysis invents a few register masks on basic block boundaries that preserve no registers. The nice thing about this is that it prevents the prologue inserter from thinking it needs to spill all XMM CSRs, because it doesn't see any explicit physreg defs in the MI. Reviewers: MatzeB, qcolombet, JosephTremoulet, majnemer Subscribers: MatzeB, llvm-commits Differential Revision: http://reviews.llvm.org/D14407 llvm-svn: 252318	2015-11-06 17:06:38 +00:00
Tim Northover	775aaeb765	Remove windows line endings introduced by r252177. NFC. llvm-svn: 252217	2015-11-05 21:54:58 +00:00
Oleg Ranevskyy	057c5a6b2b	[DebugInfo] Fix ARM/AArch64 prologue_end position. Related to D11268. Summary: This review is related to another review request http://reviews.llvm.org/D11268, does the same and merely fixes a couple of issues with it. D11268 is quite old and has merge conflicts against the current trunk. This request - rebases D11268 onto the new trunk; - resolves the merge conflicts; - fixes the prologue_end tests, which do not pass due to the subprogram definitions not marked as distinct. Reviewers: echristo, rengolin, kubabrecka Subscribers: aemerson, rengolin, jyknight, dsanders, llvm-commits, asl Differential Revision: http://reviews.llvm.org/D14338 llvm-svn: 252177	2015-11-05 17:50:17 +00:00
James Molloy	bef6e43107	[ARM] Compute known bits for ARMISD::CMOV We can conservatively know that CMOV's known bits are the intersection of known bits for each of its operands. This helps PerformCMOVToBFICombine find more opportunities. I tried hard to create a testcase for this and failed - we have to sufficiently confuse DAG.computeKnownBits which can see through all the cheap tricks I tried to narrow my larger testcase down :( This code is actually exercised in CodeGen/ARM/bfi.ll, there's just no functional difference because DAG.computeKnownBits gets the right answer in that case. llvm-svn: 252168	2015-11-05 15:21:58 +00:00
Rafael Espindola	e61a902371	Go back to producing relocations for out of range symbols. This brings back the behavior from before r252090 for out of range symbols. Should bring some arm bots back. llvm-svn: 252119	2015-11-05 01:10:15 +00:00
Rafael Espindola	49b8548903	Slightly saner handling of thumb branches. The generic infrastructure already did a lot of work to decide if the fixup value is know or not. It doesn't make sense to reimplement a very basic case: same fragment. llvm-svn: 252090	2015-11-04 23:00:39 +00:00
James Molloy	e7d679cf4c	[ARM] Combine CMOV into BFI where possible If we have a CMOV, OR and AND combination such as: if (x & CN) y \|= CM; And: * CN is a single bit; * All bits covered by CM are known zero in y; Then we can convert this to a sequence of BFI instructions. This will always be a win if CM is a single bit, will always be no worse than the TST & OR sequence if CM is two bits, and for thumb will be no worse if CM is three bits (due to the extra IT instruction). llvm-svn: 252057	2015-11-04 16:55:07 +00:00
Tim Northover	155103ec18	WatchOS: update default CPU for triple after t2dsp -> dsp rename llvm-svn: 251814	2015-11-02 18:21:07 +00:00
Artyom Skrobov	0ff1ce4038	Recognize that ARM1176JZ[F]-S support TrustZone Summary: ARMv6KZ cores were set up incorrectly in ARM.td; also, the SMI mnemonic (the old name for SMC, as defined in ARMv6KZ) wasn't supported. Reviewers: jmolloy, rengolin Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D14154 llvm-svn: 251627	2015-10-29 13:56:19 +00:00
Tim Northover	f8e47e4868	ARM: add support for WatchOS's compact unwind information. llvm-svn: 251573	2015-10-28 22:56:36 +00:00
Tim Northover	8b40366b54	ARM: teach backend about WatchOS and TvOS libcalls. The most substantial changes are again for watchOS: libcalls are hard-float if needed and sincos has a different calling convention. llvm-svn: 251571	2015-10-28 22:51:16 +00:00
Tim Northover	e0ccdc6de9	ARM: add backend support for the ABI used in WatchOS At the LLVM level this ABI is essentially a minimal modification of AAPCS to support 16-byte alignment for vector types and the stack. llvm-svn: 251570	2015-10-28 22:46:43 +00:00
Tim Northover	2d4d161519	ARM: support .watchos_version_min and .tvos_version_min. These MachO file directives are used by linkers and other tools to provide compatibility information, much like the existing .ios_version_min and .macosx_version_min. llvm-svn: 251569	2015-10-28 22:36:05 +00:00
Artyom Skrobov	b43981076a	[ARM] Allow SP in rGPR, starting from ARMv8 Summary: This patch handles assembly and disassembly, but not codegen, as of yet. Additionally, it fixes a bug whereby SP and PC as shifted-reg operands were treated as predictable in ARMv7 Thumb; and it enables the tests for invalid and unpredictable instructions to run on both ARMv7 and ARMv8. Reviewers: jmolloy, rengolin Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D14141 llvm-svn: 251516	2015-10-28 13:58:36 +00:00
Craig Topper	4b27576001	Remove templates from CostTableLookup functions. All instantiations had the same type. This also lets us remove the versions of the functions that took a statically sized array as we can rely on ArrayRef implicit conversion now. llvm-svn: 251490	2015-10-28 04:02:12 +00:00
Charlie Turner	458e79b814	[ARM] Expand ROTL and ROTR of vector value types Summary: After D13851 landed, we saw backend crashes when compiling the reduced test case included in this patch. The right fix seems to be to allow these vector types for expansion in instruction selection. Reviewers: rengolin, t.p.northover Subscribers: RKSimon, t.p.northover, aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D14082 llvm-svn: 251401	2015-10-27 10:25:20 +00:00
Craig Topper	ee0c859788	Convert cost table lookup functions to return a pointer to the entry or nullptr instead of the index. This avoid mentioning the table name an extra time and allows the lookup to be done directly in the ifs by relying on the bool conversion of the pointer. While there make use of ArrayRef and std::find_if. llvm-svn: 251382	2015-10-27 04:14:24 +00:00
Tim Northover	939f089242	ARM: make sure VFP loads and stores are properly aligned. Both VLDRS and VLDRD fault if the memory is not 4 byte aligned, which wasn't really being checked before, leading to faults at runtime. llvm-svn: 251352	2015-10-26 21:32:53 +00:00
Peter Collingbourne	99fac80db2	ARM/ELF: Restore original (pre-r251322) logic for deciding whether to use GOT. Unbreaks linking with gold, which cannot resolve direct relocations referring to global symbols. llvm-svn: 251342	2015-10-26 20:46:44 +00:00
Peter Collingbourne	97aae40880	ARM/ELF: Better codegen for global variable addresses. In PIC mode we were previously computing global variable addresses (or GOT entry addresses) by adding the PC, the PC-relative GOT displacement and the GOT-relative symbol/GOT entry displacement. Because the latter two displacements are fixed, we ended up performing one more addition than necessary. This change causes us to compute addresses using a single PC-relative displacement, resulting in a shorter code sequence. This reduces code size by about 4% in a recent build of Chromium for Android. As a result of this change we no longer need to compute the GOT base address in the ARM backend, which allows us to remove the Global Base Reg pass and SDAG lowering for the GOT. We also now no longer use the GOT when addressing a symbol which is known to be defined in the same linkage unit. Specifically, the symbol must have either hidden visibility or a strong definition in the current module in order to not use the the GOT. This is a change from the previous behaviour where we would use the GOT to address externally visible symbols defined in the same module. I think the only cases where this could matter are cases involving symbol interposition, but we don't really support that well anyway. Differential Revision: http://reviews.llvm.org/D13650 llvm-svn: 251322	2015-10-26 18:23:16 +00:00
James Molloy	72222f5dca	[ARM] Handle the inline asm constraint type 'o' This means "memory with offset" and requires very little plumbing to get working. This fixes PR25317. llvm-svn: 251280	2015-10-26 10:04:52 +00:00
Benjamin Kramer	8ceb323bb4	Convert assert(false) into llvm_unreachable where it makes sense. llvm-svn: 251266	2015-10-25 22:28:27 +00:00
Artyom Skrobov	5a6e39454e	[ARM] Renaming +t2dsp feature into +dsp, as discussed on llvm-dev llvm-svn: 251125	2015-10-23 17:19:19 +00:00
Oleg Ranevskyy	6389dd9fa2	[ARM CodeGen] @llvm.debugtrap call may be removed when restoring callee saved registers Summary: When ARMFrameLowering::emitPopInst generates a "pop" instruction to restore the callee saved registers, it checks if the LR register is among them. If so, the function may decide to remove the basic block's terminator and replace it with a "pop" to the PC register instead of LR. This leads to a problem when the block's terminator is preceded by a "llvm.debugtrap" call. The MI iterator points to the trap in such a case, which is also a terminator. If the function decides to restore LR to PC, it erroneously removes the trap. Reviewers: asl, rengolin Subscribers: aemerson, jfb, rengolin, dschuff, llvm-commits Differential Revision: http://reviews.llvm.org/D13672 llvm-svn: 251123	2015-10-23 17:17:59 +00:00
Craig Topper	8fe40e0ed5	Change makeLibCall to take an ArrayRef<SDValue> instead of pointer and size. This removes the need to pass a hardcoded size in many places. NFC llvm-svn: 251032	2015-10-22 17:05:00 +00:00
Pete Cooper	b70b956c80	Add missing load/store flags to thumb2 instructions. These were the cause of a verifier error when building 7zip with -verify-machineinstrs. Running 'make check' with the verifier triggered the same error on the test here so i've updated the test to run the verifier on one of its runs instead of adding a new one. While looking at this code, there was a stale comment that these instructions were only used for disassembly. This probably used to be the case, but they are now used in the 'ARM load / store optimization pass' too. This reapplies r242300 which was reverted in r242428 due to bot failures. Ultimately those failures were spurious and completely unrelated to this commit. I reverted this at the time because it was thought to be at fault. llvm-svn: 250969	2015-10-22 01:48:57 +00:00
Artyom Skrobov	7fd67e25aa	Adding support for TargetLoweringBase::LibCall Summary: TargetLoweringBase::Expand is defined as "Try to expand this to other ops, otherwise use a libcall." For ISD::UDIV and ISD::SDIV, the choice between the two possibilities was defined in a rather convoluted way: - if DIVREM is legal, expand to DIVREM - if DIVREM has a custom lowering, expand to DIVREM - if DIVREM libcall is defined and a remainder from the same division is computed elsewhere, expand to a DIVREM libcall - else, expand to a DIV libcall This had the undesirable effect that if both DIV and DIVREM are implemented as libcalls, then ISD::UDIV and ISD::SDIV are expanded to the heavier DIVREM libcall, even when the remainder isn't used. The new code adds a new LegalizeAction, TargetLoweringBase::LibCall, so that backends can directly control whether they prefer an expansion or a conversion to a libcall. This makes the generic lowering code even more generic, allowing its reuse in a wider range of target-specific configurations. The useful effect is that ARM backend will now generate a call to __aeabi_{i,u}div rather than __aeabi_{i,u}divmod in cases where it doesn't need the remainder. There's no functional change outside the ARM backend. Reviewers: t.p.northover, rengolin Subscribers: t.p.northover, llvm-commits, aemerson Differential Revision: http://reviews.llvm.org/D13862 llvm-svn: 250826	2015-10-20 13:14:52 +00:00
Duncan P. N. Exon Smith	9f9559e807	ARM: Remove implicit ilist iterator conversions, NFC llvm-svn: 250759	2015-10-19 23:25:57 +00:00
Asiri Rathnayake	1040a53be3	Fix mapping of @llvm.arm.ssat/usat intrinsics to ssat/usat instructions The mapping of these two intrinsics in ARMInstrInfo.td had a small omission which lead to their operands not being validated/transformed before being lowered into usat and ssat instructions. This can cause incorrect instructions to be emitted. I've also added tests for the remaining two saturating arithmatic intrinsics @llvm.arm.qadd and @llvm.arm.qsub as they are missing codegen tests. llvm-svn: 250697	2015-10-19 11:44:24 +00:00
Craig Topper	2626094fa1	Make a bunch of static arrays const. llvm-svn: 250642	2015-10-18 05:15:34 +00:00
Craig Topper	a2d0635098	Remove unnecessary 'const' pointed out by David Blaikie. llvm-svn: 250619	2015-10-17 18:22:46 +00:00
Craig Topper	c177d9edb3	Use std::begin/end and std::is_sorted to simplify some code. NFC llvm-svn: 250614	2015-10-17 16:37:11 +00:00
Quentin Colombet	5084e44d71	[ARM] Make sure we do not dereference the end iterator when accessing debug information. Although the problem was always here, it would only be exposed when shrink-wrapping is enable. rdar://problem/23110493 llvm-svn: 250352	2015-10-15 00:41:26 +00:00
Christof Douma	f0765c4f5b	Test commit llvm-svn: 250154	2015-10-13 09:38:21 +00:00
James Molloy	fa4e994a7a	[ARM] Mark Swift MISched model as incomplete The Swift Machine Scheduler Model is incomplete. There are instructions missing which can trigger the "incomplete machine model" abort. This was observed when a downstream SchedMachineModel was added to the ARM target. Patch by Christof Douma! llvm-svn: 250033	2015-10-12 12:49:59 +00:00
Saleem Abdulrasool	1825fac3c9	ARM: tweak WoA frame lowering Accept r11 when targeting Windows on ARM rather than just low registers. Because we are in a thumb-2 only mode, this may be slightly more expensive in code size, but results in better code for the environment since it spills the frame register, which is generally desired for fast stack walking as per the ABI. llvm-svn: 249804	2015-10-09 03:19:03 +00:00
Evgeniy Stepanov	5fe279e727	Add Triple::isAndroid(). This is a simple refactoring that replaces Triple.getEnvironment() checks for Android with Triple.isAndroid(). llvm-svn: 249750	2015-10-08 21:21:24 +00:00
Chad Rosier	169865ffda	[ARM] Promote helper function to SelectionDAG. I'll be using the function in a similar combine for AArch64. The helper was also improved to handle undef values. Part of http://reviews.llvm.org/D13442 llvm-svn: 249572	2015-10-07 17:28:58 +00:00
Oliver Stannard	d3d114ba54	[ARM] Use correct half-precision functions in EABI mode The ARM RTABI defines the half- to single-precision float conversion functions with an __aeabi prefix, but libgcc only has them with a __gnu prefix. Therefore we need to emit the __aeabi version when compiling with an eabi or eabihf triple, and the __gnu version with a gnueabi or gnueabihf triple. llvm-svn: 249565	2015-10-07 16:58:49 +00:00
Chad Rosier	17436bf64e	[ARM] Prevent PerformVDIVCombine from combining a vcvt/vdiv with 8 lanes. This would result in a crash since the vcvt used does not support v8i32 types. llvm-svn: 249560	2015-10-07 16:15:40 +00:00
Jeroen Ketema	aebca09543	[ARM][AArch64] Only lower to interleaved load/store if the target has NEON Without an additional check for NEON, the compiler crashes during legalization of NEON ldN/stN. Differential Revision: http://reviews.llvm.org/D13508 llvm-svn: 249550	2015-10-07 14:53:29 +00:00
Chad Rosier	db71abf2d4	[ARM] Push more complex check down to reduce compile time. NFC. llvm-svn: 249547	2015-10-07 13:40:44 +00:00
Chad Rosier	dca46b426f	[ARM] Minor refactoring. NFC. llvm-svn: 249465	2015-10-06 20:58:42 +00:00
Chad Rosier	aed910b7d7	[ARM] Minor refactoring. NFC. llvm-svn: 249464	2015-10-06 20:51:26 +00:00
Chad Rosier	9df4aff86d	[ARM] Minor refactoring. NFC. llvm-svn: 249463	2015-10-06 20:45:45 +00:00
Chad Rosier	a087fd21da	[ARM] Minor refactoring to improve readability. NFC. llvm-svn: 249454	2015-10-06 20:23:42 +00:00
Scott Douglass	953f908173	[ARM] Modify codegen for memcpy intrinsic to prefer LDM/STM. We were previously codegen'ing memcpy as regular load/store operations and hoping that the register allocator would allocate registers in ascending order so that we could apply an LDM/STM combine after register allocation. According to the commit that first introduced this code (r37179), we planned to teach the register allocator to allocate the registers in ascending order. This never got implemented, and up to now we've been stuck with very poor codegen. A much simpler approach for achieving better codegen is to create MEMCPY pseudo instructions, attach scratch virtual registers to them and then, post register allocation, expand the MEMCPYs into LDM/STM pairs using the scratch registers. The register allocator will have picked arbitrary registers which we sort when expanding the MEMCPY. This approach also avoids the need to repeatedly calculate offsets which ultimately ought to be eliminated pre-RA in order to decrease register pressure. Fixes PR9199 and PR23768. [This is based on Peter Collingbourne's r238473 which was reverted.] Differential Revision: http://reviews.llvm.org/D13239 Change-Id: I727543c2e94136e0f80b8e22d5642d7b9ee5b458 Author: Peter Collingbourne <peter@pcc.me.uk> llvm-svn: 249322	2015-10-05 14:49:54 +00:00
Rafael Espindola	e3a20f57d9	Fix pr24486. This extends the work done in r233995 so that now getFragment (in addition to getSection) also works for variable symbols. With that the existing logic to decide if a-b can be computed works even if a or b are variables. Given that, the expression evaluation can avoid expanding variables as aggressively and that in turn lets the relocation code see the original variable. In order for this to work with the asm streamer, there is now a dummy fragment per section. It is used to assign a section to a symbol when no other fragment exists. This patch is a joint work by Maxim Ostapenko andy myself. llvm-svn: 249303	2015-10-05 12:07:05 +00:00
Roman Divacky	4b5507a037	Actually switch the arch when we see .arch. PR21695 llvm-svn: 249165	2015-10-02 18:25:25 +00:00
Tim Northover	8d67b8e053	ARM: diagnose invalid local fixups on Thumb1 We previously stopped producing Thumb2 relaxations when they weren't supported, but only diagnosed the case where an actual relocation was produced. We should also tell people if local symbols aren't going to work rather than silently overflowing. llvm-svn: 249164	2015-10-02 18:07:18 +00:00
Tim Northover	956b008db6	ARM: correctly align constant pool value on Thumb1 targets. Since we're using tLDRpci to access it, the constant pool's address must be 0 (mod 4). llvm-svn: 249163	2015-10-02 18:07:13 +00:00
Scott Douglass	290183d734	[ARM] More care with Thumb1 writeback in ARMLoadStoreOptimizer Differential Revision: http://reviews.llvm.org/D13240 llvm-svn: 249002	2015-10-01 11:56:19 +00:00
Artyom Skrobov	72ca6b8f3f	[ARM] Support for ARMv6-Z / ARMv6-ZK missing As Richard Barton observed at http://reviews.llvm.org/D12937#inline-107121 TargetParser in LLVM has insufficient support for ARMv6Z and ARMv6ZK. In particular, there were no tests for TrustZone being supported in these architectures. The patch clears a FIXME: left by Saleem Abdulrasool in r201471, and fixes his test case which hadn't really been testing what it was claiming to test. Differential Revision: http://reviews.llvm.org/D13236 llvm-svn: 248921	2015-09-30 17:25:52 +00:00
Jeroen Ketema	ab99b59e8c	[ARM][NEON] Use address space in vld([1234]\|[234]lane) and vst([1234]\|[234]lane) instructions This commit changes the interface of the vld[1234], vld[234]lane, and vst[1234], vst[234]lane ARM neon intrinsics and associates an address space with the pointer that these intrinsics take. This changes, e.g., <2 x i32> @llvm.arm.neon.vld1.v2i32(i8, i32) to <2 x i32> @llvm.arm.neon.vld1.v2i32.p0i8(i8, i32) This change ensures that address spaces are fully taken into account in the ARM target during lowering of interleaved loads and stores. Differential Revision: http://reviews.llvm.org/D12985 llvm-svn: 248887	2015-09-30 10:56:37 +00:00
Andrew Kaylor	16c4da03d5	Improved the interface of methods commuting operands, improved X86-FMA3 mem-folding&coalescing. Patch by Slava Klochkov (vyacheslav.n.klochkov@intel.com) Differential Revision: http://reviews.llvm.org/D11370 llvm-svn: 248735	2015-09-28 20:33:22 +00:00
Artyom Skrobov	ad8a0638f7	[ARM] Avoid redundant checks for isThumb1Only() after supportsTailCall() supportsTailCall() has two callers. Both of them double-check isThumb1Only(), and refuse to proceed with tail-calling in that case. Therefore, it makes sense to move this check to ARMSubtarget::initSubtargetFeatures, where SupportsTailCall is initialized; and to eliminate the extra checks at the call sites. Following a review comment, added an "assert(supportsTailCall())" in IsEligibleForTailCall. NFC. llvm-svn: 248703	2015-09-28 09:44:11 +00:00
Ahmed Bougacha	e81610fabb	[ARM] Don't generate clrex for pre-v7 targets. Since r248294, we emit clrex, but it doesn't exist on v6. llvm-svn: 248640	2015-09-26 00:14:02 +00:00
Saleem Abdulrasool	8e99f50768	ARM: make -Asserts,-Werror=unused-variable build happy The value was only used in an assertion. Sink the variable usage into the assertion. llvm-svn: 248562	2015-09-25 05:41:02 +00:00
Saleem Abdulrasool	fe83b50289	ARM: address WoA division limitation We now emit the compiler generated divide by zero check that was needed for the MSVC routines. We construct a psuedo-instruction for the DBZ check as the operation requires splitting up the BB. For the 64-bit operations, we need to custom expand the node as we need to insert the DBZ check and then emit the libcall to the appropriate name. Because this is target specific, it seemed better to reproduce the expansion operation from the target-agnostic type legalization rather than sink this there to avoid the duplication. The division library calls now match MSVC semantically. llvm-svn: 248561	2015-09-25 05:15:46 +00:00
Artyom Skrobov	cf296444ab	[ARM] Handle +t2dsp feature as an ArchExtKind in ARMTargetParser.def Currently, the availability of DSP instructions (ACLE 6.4.7) is handled in a hand-rolled tricky condition block in tools/clang/lib/Basic/Targets.cpp, with a FIXME: attached. This patch changes the handling of +t2dsp to be in line with other architecture extensions. Following a revert of r248152 and new review comments, this patch also includes renaming FeatureDSPThumb2 -> FeatureDSP, hasThumb2DSP() -> hasDSP(), etc. The spelling of "t2dsp" is preserved, pending a further investigation of its possible external usage. Differential Revision: http://reviews.llvm.org/D12937 llvm-svn: 248519	2015-09-24 17:31:16 +00:00
Tim Northover	beb5bccf88	ARM: fix folding stack adjustment (again again again...) This time, the issue is that we weren't accounting for the possibility that aligned DPRs could have been stored after the final "push" in a prologue. When that happened we effectively moved a "sub sp, #N" from below the aligned stores to above them, and everything went to pot. To make it worse, I'd actually committed something testing that we produced wrong code, so the test update is tiny. llvm-svn: 248437	2015-09-23 22:21:09 +00:00
Oliver Stannard	f2ed5c68d2	[ARM] Add option to force fast-isel The ARM backend has some logic that only allows the fast-isel to be enabled for subtargets where it is known to be stable. This adds a backend option to override this and force the fast-isel to be used for any target, to allow it to be tested. This is an ARM-specific option, because no other backend disables the fast-isel on a per-subtarget basis. llvm-svn: 248369	2015-09-23 09:19:54 +00:00
Ahmed Bougacha	81616a72ea	[ARM] Emit clrex in the expanded cmpxchg fail block. ARM counterpart to r248291: In the comparison failure block of a cmpxchg expansion, the initial ldrex/ldxr will not be followed by a matching strex/stxr. On ARM/AArch64, this unnecessarily ties up the execution monitor, which might have a negative performance impact on some uarchs. Instead, release the monitor in the failure block. The clrex instruction was designed for this: use it. Also see ARMARM v8-A B2.10.2: "Exclusive access instructions and Shareable memory locations". Differential Revision: http://reviews.llvm.org/D13033 llvm-svn: 248294	2015-09-22 17:22:58 +00:00
NAKAMURA Takumi	0a7d0ad95f	Untabify. llvm-svn: 248264	2015-09-22 11:15:07 +00:00
NAKAMURA Takumi	a9cb538a74	Reformat blank lines. llvm-svn: 248263	2015-09-22 11:14:39 +00:00
NAKAMURA Takumi	70ad98aca4	Reformat. llvm-svn: 248261	2015-09-22 11:13:55 +00:00
NAKAMURA Takumi	59a16a76be	ARMInstrInfo.cpp: Reformat. llvm-svn: 248260	2015-09-22 11:10:17 +00:00
Jeroen Ketema	41681a5329	[ARM] Do not scale vext with a factor The vext pseudo-instruction takes the number of elements that need to be extracted, not the number of bytes. Hence, use the number of elements directly instead of scaling them with a factor. Reviewers: Silviu Baranga, James Molloy (not reflected in the differential revision) Differential Revision: http://reviews.llvm.org/D12974 llvm-svn: 248208	2015-09-21 20:28:04 +00:00
James Molloy	e46da3849a	Revert "[ARM] Handle +t2dsp feature as an ArchExtKind in ARMTargetParser.def" This was committed without the code review (http://reviews.llvm.org/D12937) being approved. This reverts commit r248152. llvm-svn: 248174	2015-09-21 16:35:08 +00:00
Artyom Skrobov	79b0adaae4	[ARM] Handle +t2dsp feature as an ArchExtKind in ARMTargetParser.def Currently, the availability of DSP instructions (ACLE 6.4.7) is handled in a hand-rolled tricky condition block in tools/clang/lib/Basic/Targets.cpp, with a FIXME: attached. This patch changes the handling of +t2dsp to be in line with other architecture extensions. Following review comments, also updating the description of FeatureDSPThumb2 in ARM.td. Differential Revision: http://reviews.llvm.org/D12937 llvm-svn: 248152	2015-09-21 12:43:10 +00:00
Craig Topper	3c76c523e1	Cleanup places that passed SMLoc by const reference to pass it by value instead. NFC llvm-svn: 248135	2015-09-20 23:35:59 +00:00
Saleem Abdulrasool	4966f58ac2	ARM: cleanup formatting clang-format a line which was poorly formatted. NFC. llvm-svn: 248110	2015-09-20 03:19:09 +00:00
Bob Wilson	8823b84fae	NFC: Fix indentation and add braces to clarify nested of else-statement. llvm-svn: 248086	2015-09-19 06:20:59 +00:00
Eric Christopher	a835956bda	Limit the range of processors supported by ARM fast isel to v6 or later as that's all that is tested right now. Fixes PR24858. llvm-svn: 248027	2015-09-18 20:08:18 +00:00
Cong Hou	f9f9ffb98b	Scaling up values in ARMBaseInstrInfo::isProfitableToIfCvt() before they are scaled by a probability to avoid precision issue. In ARMBaseInstrInfo::isProfitableToIfCvt(), there is a simple cost model in which the number of cycles is scaled by a probability to estimate the cost. However, when the number of cycles is small (which is usually the case), there is a precision issue after the computation. To avoid this issue, this patch scales those cycles by 1024 (chosen to make the multiplication a litter faster) before they are scaled by the probability. Other variables are also scaled up for the final comparison. Differential Revision: http://reviews.llvm.org/D12742 llvm-svn: 248018	2015-09-18 18:19:40 +00:00
Eric Christopher	a4e5d3cf8e	constify the Function parameter to the TTI creation callback and propagate to all callers/users/etc. llvm-svn: 247864	2015-09-16 23:38:13 +00:00
Sanjay Patel	a260701bbb	propagate fast-math-flags on DAG nodes After D10403, we had FMF in the DAG but disabled by default. Nick reported no crashing errors after some stress testing, so I enabled them at r243687. However, Escha soon notified us of a bug not covered by any in-tree regression tests: if we don't propagate the flags, we may fail to CSE DAG nodes because differing FMF causes them to not match. There is one test case in this patch to prove that point. This patch hopes to fix or leave a 'TODO' for all of the in-tree places where we create nodes that are FMF-capable. I did this by putting an assert in SelectionDAG.getNode() to find any FMF-capable node that was being created without FMF ( D11807 ). I then ran all regression tests and test-suite and confirmed that everything passes. This patch exposes remaining work to get DAG FMF to be fully functional: (1) add the flags to non-binary nodes such as FCMP, FMA and FNEG; (2) add the flags to intrinsics; (3) use the flags as conditions for transforms rather than the current global settings. Differential Revision: http://reviews.llvm.org/D12095 llvm-svn: 247815	2015-09-16 16:31:21 +00:00
Chad Rosier	5d485db6b2	[ARM] Register ARMPreAllocLoadStoreOpt pass with LLVM pass manager. llvm-svn: 247791	2015-09-16 13:11:31 +00:00
Daniel Sanders	50f17235dd	Revert r247692: Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC. Eric has replied and has demanded the patch be reverted. llvm-svn: 247702	2015-09-15 16:17:27 +00:00
Daniel Sanders	153010c52d	Re-commit r247683: Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC. Summary: This is the first patch in the series to migrate Triple's (which are ambiguous) to TargetTuple's (which aren't). For the moment, TargetTuple simply passes all requests to the Triple object it holds. Once it has replaced Triple, it will start to implement the interface in a more suitable way. This change makes some changes to the public C++ API. In particular, InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer() now take TargetTuples instead of Triples. The other public C++ API's have been left as-is for the moment to reduce patch size. This commit also contains a trivial patch to clang to account for the C++ API change. Thanks go to Pavel Labath for fixing LLDB for me. Reviewers: rengolin Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10969 llvm-svn: 247692	2015-09-15 14:08:28 +00:00
Daniel Sanders	c40de48041	Revert r247684 - Replace Triple with a new TargetTuple ... LLDB needs to be updated in the same commit. llvm-svn: 247686	2015-09-15 13:46:21 +00:00
Daniel Sanders	18d4b0dab7	Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC. Summary: This is the first patch in the series to migrate Triple's (which are ambiguous) to TargetTuple's (which aren't). For the moment, TargetTuple simply passes all requests to the Triple object it holds. Once it has replaced Triple, it will start to implement the interface in a more suitable way. This change makes some changes to the public C++ API. In particular, InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer() now take TargetTuples instead of Triples. The other public C++ API's have been left as-is for the moment to reduce patch size. This commit also contains a trivial patch to clang to account for the C++ API change. Reviewers: rengolin Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10969 llvm-svn: 247683	2015-09-15 13:17:40 +00:00
Daniel Sanders	c8cd6e95d2	Fix namespace indentation and missing blank lines before 'public:' in *MCAsmInfo.h. NFC. This is to reduce noise in a following commit. Also fixes a couple missing spaces before the reference operator. llvm-svn: 247679	2015-09-15 12:27:06 +00:00
John Brawn	056e67865a	[ARM] Extract shifts out of multiply-by-constant Turning (op x (mul y k)) into (op x (lsl (mul y k>>n) n)) is beneficial when we can do the lsl as a shifted operand and the resulting multiply constant is simpler to generate. Do this by doing the transformation when trying to select a shifted operand, as that ensures that it actually turns out better (the alternative would be to do it in PreprocessISelDAG, but we don't know for sure there if extracting the shift would allow a shifted operand to be used). Differential Revision: http://reviews.llvm.org/D12196 llvm-svn: 247569	2015-09-14 15:19:41 +00:00
Ahmed Bougacha	5246867384	[CodeGen] Refactor TLI/AtomicExpand interface to make LLSC explicit. We used to have this magic "hasLoadLinkedStoreConditional()" callback, which really meant two things: - expand cmpxchg (to ll/sc). - expand atomic loads using ll/sc (rather than cmpxchg). Remove it, and, instead, introduce explicit callbacks: - bool shouldExpandAtomicCmpXchgInIR(inst) - AtomicExpansionKind shouldExpandAtomicLoadInIR(inst) Differential Revision: http://reviews.llvm.org/D12557 llvm-svn: 247429	2015-09-11 17:08:28 +00:00
Ahmed Bougacha	9d677131c4	[CodeGen] Rename AtomicRMWExpansionKind to AtomicExpansionKind. This lets us generalize its usage to the other atomic instructions. llvm-svn: 247428	2015-09-11 17:08:17 +00:00
Cong Hou	c536bd9e73	Pass BranchProbability/BlockMass by value instead of const& as they are small. NFC. llvm-svn: 247357	2015-09-10 23:10:42 +00:00
James Molloy	8c995a93ce	[ARM] Do not use vtrn for vectorshuffle if the order is reversed The tests in isVTRNMask and isVTRN_v_undef_Mask should also check that the elements of the upper and lower half of the vectorshuffle occur in the correct order when both halves are used. Without this test the code assumes that it is correct to use vector transpose (vtrn) for the masks <1, 1, 0, 0> and <1, 3, 0, 2>, among others, but the transpose actually incorrectly generates shuffles for <0, 0, 1, 1> and <0, 2, 1, 3> in this case. Patch by Jeroen Ketema! llvm-svn: 247254	2015-09-10 08:42:28 +00:00
Chandler Carruth	e4405e949f	[ADT] Switch a bunch of places in LLVM that were doing single-character splits to actually use the single character split routine which does less work, and in a debug build is substantially faster. llvm-svn: 247245	2015-09-10 06:12:31 +00:00
Matthias Braun	d9da162789	Save LaneMask with livein registers With subregister liveness enabled we can detect the case where only parts of a register are live in, this is expressed as a 32bit lanemask. The current code only keeps registers in the live-in list and therefore enumerated all subregisters affected by the lanemask. This turned out to be too conservative as the subregister may also cover additional parts of the lanemask which are not live. Expressing a given lanemask by enumerating a minimum set of subregisters is computationally expensive so the best solution is to simply change the live-in list to store the lanemasks as well. This will reduce memory usage for targets using subregister liveness and slightly increase it for other targets Differential Revision: http://reviews.llvm.org/D12442 llvm-svn: 247171	2015-09-09 18:08:03 +00:00
John Brawn	d8b405abf7	[ARM] Get rid of SelectT2ShifterOperandReg, NFC SelectT2ShifterOperandReg has identical behaviour to SelectImmShifterOperand, so get rid of it and use SelectImmShifterOperand instead. Differential Revision: http://reviews.llvm.org/D12195 llvm-svn: 246962	2015-09-07 11:45:18 +00:00
Ahmed Bougacha	699a9dd7c3	[ARM] Don't abort on variable-idx extractelt in ReconstructShuffle. The code introduced in r244314 assumed that EXTRACT_VECTOR_ELT only takes constant indices, but it does accept variables. Bail out for those: we can't use them, as the shuffles we want to reconstruct do require constant masks. llvm-svn: 246594	2015-09-01 21:56:00 +00:00
Silviu Baranga	e748c9ef55	[ARM] Turn on by default interleaved access vectorization Summary: This change turns on by default interleaved access vectorization on ARM, as it has shown to be beneficial on ARM. Reviewers: rengolin Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D12146 llvm-svn: 246541	2015-09-01 11:19:15 +00:00
Chandler Carruth	bb47b9a367	[Triple] Stop abusing a class to have only static methods and just use the namespace that we are already using for the enums that are produced by the parsing. llvm-svn: 246367	2015-08-30 02:09:48 +00:00
James Molloy	45ee9898ec	[ARM] Hoist fabs/fneg above a conversion to float. This is especially visible in softfp mode, for example in the implementation of libm fabs/fneg functions. If we have: %1 = vmovdrr r0, r1 %2 = fabs %1 then move the fabs before the vmovdrr: %1 = and r1, #0x7FFFFFFF %2 = vmovdrr r0, r1 This is never a lose, and could be a serious win because the vmovdrr may be followed by a vmovrrd, which would enable us to remove the conversion into FPRs completely. We already do this for f32, but not for f64. Tests are added for both. llvm-svn: 246360	2015-08-29 10:49:11 +00:00
Ahmed Bougacha	f9c19da03a	[CodeGen] Support (and default to) expanding READCYCLECOUNTER to 0. For targets that didn't support this, this will let us respect the langref instead of failing to select. Note that we don't need to change the 32-bit x86/PPC lowerings (to account for the result type/# difference) because they're both custom and bypass type legalization. llvm-svn: 246258	2015-08-28 01:49:59 +00:00
Reid Kleckner	0e2882345d	[WinEH] Add some support for code generating catchpad We can now run 32-bit programs with empty catch bodies. The next step is to change PEI so that we get funclet prologues and epilogues. llvm-svn: 246235	2015-08-27 23:27:47 +00:00
Cong Hou	b5ef475e5c	[ARM] Use BranchProbability::scale() to scale an integer with a probability in ARMBaseInstrInfo.cpp, Previously in isProfitableToIfCvt() in ARMBaseInstrInfo.cpp, the multiplication between an integer and a branch probability is done manually in an unsafe way that may lead to overflow. This patch corrects those cases by using BranchProbability's member function scale() to avoid overflow (which stores the intermediate result in int64). Differential Revision: http://reviews.llvm.org/D12295 llvm-svn: 246106	2015-08-26 23:17:52 +00:00
Matthias Braun	ccfc9c8d6d	FastISel: Use finishCondBranch() for ARM,Mips,PowerPC FastISel Note that after this change branch probabilities are preserved now. llvm-svn: 245998	2015-08-26 01:55:47 +00:00
Matthias Braun	b2b7ef1de8	MachineBasicBlock: Add liveins() method returning an iterator_range llvm-svn: 245895	2015-08-24 22:59:52 +00:00
Scott Douglass	bdef60462d	[ARM] Use AEABI helpers for i64 div and rem Differential Revision: http://reviews.llvm.org/D12232 llvm-svn: 245830	2015-08-24 09:17:18 +00:00
Scott Douglass	d2974a6afa	[ARM] Refactor LowerDivRem before adding LowerREM (nfc) Differential Revision: http://reviews.llvm.org/D12230 llvm-svn: 245829	2015-08-24 09:17:11 +00:00
Vedant Kumar	366dd9fd2b	[ARM] Fix MachO CPU Subtype selection Differential Revision: http://reviews.llvm.org/D12040 llvm-svn: 245744	2015-08-21 21:52:48 +00:00
James Molloy	bf17009a97	[ARM] Don't try and custom lower a vNi64 SETCC. It won't go well. We've already marked 64-bit SETCCs as non-Custom, but it's just possible that a SETCC has a legal result type but an illegal operand type. If this happens, bail out before we create unselectable nodes. Fixes PR24292. I tried to create a testcase but in 99% of cases we can't trigger this - not surprising that this bug has been latent since 2009. llvm-svn: 245577	2015-08-20 16:33:44 +00:00
Silviu Baranga	ad1b19fcb7	[ARM] Add instruction selection patterns for vmin/vmax Summary: The mid-end was generating vector smin/smax/umin/umax nodes, but we were using vbsl to generatate the code. This adds the vmin/vmax patterns and a test to check that we are now generating vmin/vmax instructions. Reviewers: rengolin, jmolloy Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D12105 llvm-svn: 245439	2015-08-19 14:11:27 +00:00
Sanjay Patel	1cd6d88e4d	use minSize wrapper; NFCI These were missed when other uses were switched over: http://llvm.org/viewvc/llvm-project?view=revision&revision=243994 llvm-svn: 245311	2015-08-18 16:44:23 +00:00
Guozhi Wei	f66d384443	Align SP adjustment in function getSPAdjust This commit adds a new function TargetFrameLowering::alignSPAdjust and calls it from TargetInstrInfo::getSPAdjust. It fixes PR24142. llvm-svn: 245253	2015-08-17 22:36:27 +00:00
James Molloy	974838f294	[ARM] Fix crash when targetting CPU without NEON We emulate a scalar vmin/vmax with NEON instructions as they don't exist in the VFP ISA. So only mark these as legal when NEON is available. Found here: https://code.google.com/p/chromium/issues/detail?id=521671 llvm-svn: 245231	2015-08-17 19:37:12 +00:00
Silviu Baranga	d5ac26937c	[CostModel][ARM] Increase cost of insert/extract operations Summary: This change limits the minimum cost of an insert/extract element operation to 2 in cases where this would result in mixing of NEON and VFP code. Reviewers: rengolin Subscribers: mssimpso, aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D12030 llvm-svn: 245225	2015-08-17 15:57:05 +00:00
James Molloy	c617be559a	Rip out hand-rolled matching code for VMIN, VMAX, VMINNM and VMAXNM This is no longer needed - SDAGBuilder will do this for us. llvm-svn: 245197	2015-08-17 07:13:15 +00:00
James Y Knight	5567bafe93	Remove redundant TargetFrameLowering::getFrameIndexOffset virtual function. This was the same as getFrameIndexReference, but without the FrameReg output. Differential Revision: http://reviews.llvm.org/D12042 llvm-svn: 245148	2015-08-15 02:32:35 +00:00
Renato Golin	980b6cc42b	Revert "[ARM] Fix MachO CPU Subtype selection" This reverts commit r245081, as it breaks many builds. llvm-svn: 245086	2015-08-14 19:35:47 +00:00
Vedant Kumar	2f079be789	[ARM] Fix MachO CPU Subtype selection This patch makes the Darwin ARM backend take advantage of TargetParser. It also teaches TargetParser about ARMV7K for the first time. This makes target triple parsing more consistent across llvm. Differential Revision: http://reviews.llvm.org/D11996 llvm-svn: 245081	2015-08-14 18:36:47 +00:00
Rafael Espindola	dbaf0498a9	Revert "Centralize the information about which object format we are using." This reverts commit r245047. It was failing on the darwin bots. The problem was that when running ./bin/llc -march=msp430 llc gets to if (TheTriple.getTriple().empty()) TheTriple.setTriple(sys::getDefaultTargetTriple()); Which means that we go with an arch of msp430 but a triple of x86_64-apple-darwin14.4.0 which fails badly. That code has to be updated to select a triple based on the value of march, but that is not a trivial fix. llvm-svn: 245062	2015-08-14 15:48:41 +00:00
Rafael Espindola	90eb70c8a7	Centralize the information about which object format we are using. Other than some places that were handling unknown as ELF, this should have no change. The test updates are because we were detecting arm-coff or x86_64-win64-coff as ELF targets before. It is not clear if the enum should live on the Triple. At least now it lives in a single location and should be easier to move somewhere else. llvm-svn: 245047	2015-08-14 13:31:17 +00:00
James Molloy	31117875c2	[ARM] FMINNAN/FMAXNAN of f64 are not legal. This was my error. We've got f32 marked as legal because they're simulated using a v2f32 instruction, but there's no equivalent for f64. This will get test coverage imminently when D12015 lands. llvm-svn: 244916	2015-08-13 17:28:26 +00:00
James Molloy	c71f78f49f	[ARM] Allow vmin/vmax of scalars to be emitted without UseNEONForFP. This overrides the default to more closely resemble the hand-crafted matching logic in ISelLowering. It makes sense, as there is no VFP equivalent of vmin or vmax, to use them when they're available even if in general VFP ops should be preferred. This should be NFC. llvm-svn: 244915	2015-08-13 17:28:20 +00:00
John Brawn	68acdcb435	[ARM] Reorganise and simplify thumb-1 load/store selection Other than PC-relative loads/store the patterns that match the various load/store addressing modes have the same complexity, so the order that they are matched is the order that they appear in the .td file. Rearrange the instruction definitions in ARMInstrThumb.td, and make use of AddedComplexity for PC-relative loads, so that the instruction matching order is the order that results in the simplest selection logic. This also makes register-offset load/store be selected when it should, as previously it was only selected for too-large immediate offsets. Differential Revision: http://reviews.llvm.org/D11800 llvm-svn: 244882	2015-08-13 10:48:22 +00:00
Alex Lorenz	e40c8a2b26	PseudoSourceValue: Replace global manager with a manager in a machine function. This commit removes the global manager variable which is responsible for storing and allocating pseudo source values and instead it introduces a new manager class named 'PseudoSourceValueManager'. Machine functions now own an instance of the pseudo source value manager class. This commit also modifies the 'get...' methods in the 'MachinePointerInfo' class to construct pseudo source values using the instance of the pseudo source value manager object from the machine function. This commit updates calls to the 'get...' methods from the 'MachinePointerInfo' class in a lot of different files because those calls now need to pass in a reference to a machine function to those methods. This change will make it easier to serialize pseudo source values as it will enable me to transform the mips specific MipsCallEntry PseudoSourceValue subclass into two target independent subclasses. Reviewers: Akira Hatanaka llvm-svn: 244693	2015-08-11 23:09:45 +00:00
James Molloy	d616c642bb	[ARM] Match fminnan/fmaxnan for vector vmin/vmax instead of an intrinsic Lower Intrinsic::arm_neon_vmins/vmaxs to fminnan/fmaxnan and match that instead. This is important because SDAG will soon be able to select FMINNAN itself, so we need a unified lowering path for intrinsics and SDAG. NFCI. llvm-svn: 244593	2015-08-11 12:06:28 +00:00
James Molloy	ee868b2a3e	[ARM] Match fminnum/fmaxnum for vector vminnm/vmaxnm instead of an intrinsic Lower the intrinsic to a FMINNUM/FMAXNUM node and select that instead. This is important because soon SDAG will be able to select FMINNUM/FMAXNUM itself, so we need an integrated lowering path between SDAG and intrinsics. NFCI. llvm-svn: 244592	2015-08-11 12:06:25 +00:00
James Molloy	ea3a687a33	[ARM] Replace ARMISD::VMINNM/VMAXNM with ISD::FMINNUM/FMAXNUM NFCI. This replaces another custom ISDNode with a generic equivalent. llvm-svn: 244591	2015-08-11 12:06:22 +00:00
James Molloy	db8ee4b5a9	[ARM] Replace ARMISD::FMIN/FMAX with the shiny new ISD::FMINNAN/FMAXNAN. NFCI. This removes a custom ISDNode. llvm-svn: 244590	2015-08-11 12:06:15 +00:00
Cameron Esfahani	f97999dc46	Explicitly clear the MI operand list when getInstruction() is called. Call MI.clear() within MCD::OPC_Decode case and inside of translateInstruction() for the X86 target. Remove now unnecessary MI.clear() from ARMDisassembler. Summary: Explicitly clear the MI operand list when getInstruction() is called. Reviewers: hfinkel, t.p.northover, hvarga, kparzysz, jyknight, qcolombet, uweigand Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11665 llvm-svn: 244557	2015-08-11 01:15:07 +00:00
Benjamin Kramer	df005cbe19	Fix some comment typos. llvm-svn: 244402	2015-08-08 18:27:36 +00:00
Chad Rosier	9659de379d	[ARM] Remove an unused reference to MachineRegisterInfo. NFC. llvm-svn: 244334	2015-08-07 17:02:29 +00:00
Silviu Baranga	a07090f7fa	Fix unused variable warning introduced in r244314 llvm-svn: 244315	2015-08-07 12:05:46 +00:00
Silviu Baranga	3e8e51c1a9	[ARM] Update ReconstructShuffle to handle mismatched types Summary: Port the ReconstructShuffle function from AArch64 to ARM to handle mismatched incoming types in the BUILD_VECTOR node. This fixes an outstanding FIXME in the ReconstructShuffle code. Reviewers: t.p.northover, rengolin Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D11720 llvm-svn: 244314	2015-08-07 11:40:46 +00:00
Pete Cooper	ebcd748927	Convert a bunch of loops to foreach. NFC. After r244074, we now have a successors() method to iterate over all the successors of a TerminatorInst. This commit changes a bunch of eligible loops to use it. llvm-svn: 244260	2015-08-06 20:22:46 +00:00
Chandler Carruth	93205eb966	[TTI] Make the cost APIs in TargetTransformInfo consistently use 'int' rather than 'unsigned' for their costs. For something like costs in particular there is a natural "negative" value, that of savings or saved cost. As a consequence, there is a lot of code that subtracts or creates negative values based on cost, all of which is prone to awkwardness or bugs when dealing with an unsigned type. Similarly, we never want these values to wrap, as that would cause Very Bad code generation (likely percieved as an infinite loop as we try to emit over 2^32 instructions or some such insanity). All around 'int' seems a much better fit for these basic metrics. I've added asserts to ensure that at least the TTI interface never returns negative numbers here. If we ever have a use case for negative numbers, we can remove this, but this way a bug where someone used '-1' to produce a 'very large' cost will be caught by the assert. This passes all tests, and is also UBSan clean. No functional change intended. Differential Revision: http://reviews.llvm.org/D11741 llvm-svn: 244080	2015-08-05 18:08:10 +00:00
Artyom Skrobov	6fbef2a780	ARMISelDAGToDAG.cpp had this self-contradictory code: return StringSwitch<int>(Flags) .Case("g", 0x1) .Case("nzcvq", 0x2) .Case("nzcvqg", 0x3) .Default(-1); ... // The _g and _nzcvqg versions are only valid if the DSP extension is // available. if (!Subtarget->hasThumb2DSP() && (Mask & 0x2)) return -1; ARMARM confirms that the comment is right, and the code was wrong. llvm-svn: 244029	2015-08-05 11:02:14 +00:00
Tanya Lattner	0d28f80bd1	Rename all references to old mailing lists to new lists.llvm.org address. llvm-svn: 243999	2015-08-05 03:51:17 +00:00
Sanjay Patel	924879ad2c	wrap OptSize and MinSize attributes for easier and consistent access (NFCI) Create wrapper methods in the Function class for the OptimizeForSize and MinSize attributes. We want to hide the logic of "or'ing" them together when optimizing just for size (-Os). Currently, we are not consistent about this and rely on a front-end to always set OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here that should be added as follow-on patches with regression tests. This patch is NFC-intended: it just replaces existing direct accesses of the attributes by the equivalent wrapper call. Differential Revision: http://reviews.llvm.org/D11734 llvm-svn: 243994	2015-08-04 15:49:57 +00:00
Saleem Abdulrasool	0a2672bb43	ARM: support windows division routines This adds the software division routines for the Windows RTABI. These are not expected to be used often though as most modern Windows ARM capable targets support hardware division. In the case that the target CPU doesnt support hardware division, this will be the fallback. llvm-svn: 243952	2015-08-04 03:57:56 +00:00
Saleem Abdulrasool	67697a7ea9	ARM: make Darwin libcall registration table driven (NFC) Make the libcall updating table driven similar to the approach that the Linux and Windows codepath does below. NFC. llvm-svn: 243951	2015-08-04 03:57:52 +00:00
Tim Northover	9c340ec6fd	ARM: remove horrible printf left over from debugging llvm-svn: 243907	2015-08-03 22:19:08 +00:00
Tim Northover	910dde7ab2	ARM: prefer allocating VFP regs at stride 4 on Darwin. This is necessary for WatchOS support, where the compact unwind format assumes this kind of layout. For now we only want this on Swift-like CPUs though, where it's been the Xcode behaviour for ages. Also, since it can expand the prologue we don't want it at -Oz. llvm-svn: 243884	2015-08-03 17:20:10 +00:00
John Brawn	f3324cf1a5	[ARM] Make GlobalMerge merge extern globals by default Enabling merging of extern globals appears to be generally either beneficial or harmless. On some benchmarks suites (on Cortex-M4F, Cortex-A9, and Cortex-A57) it gives improvements in the 1-5% range, but in the rest the overall effect is zero. Differential Revision: http://reviews.llvm.org/D10966 llvm-svn: 243874	2015-08-03 12:13:33 +00:00
James Molloy	6967e5e4a3	Be less conservative about forming IT blocks. In http://reviews.llvm.org/rL215382, IT forming was made more conservative under the belief that a flag-setting instruction was unpredictable inside an IT block on ARMv6M. But actually, ARMv6M doesn't even support IT blocks so that's impossible. In the ARMARM for v7M, v7AR and v8AR it states that the semantics of such an instruction changes inside an IT block - it doesn't set the flags. So actually it is fine to use one inside an IT block as long as the flags register is dead afterwards. This gives significant performance improvements in a variety of MPEG based workloads. Differential revision: http://reviews.llvm.org/D11680 llvm-svn: 243869	2015-08-03 09:24:48 +00:00
Craig Topper	e3dcce9700	De-constify pointers to Type since they can't be modified. NFC This was already done in most places a while ago. This just fixes the ones that crept in over time. llvm-svn: 243842	2015-08-01 22:20:21 +00:00
David Blaikie	a5fd382eb3	-Wdeprecated-clean: Fix cases of violating the rule of 5 in ways that are deprecated in C++11 Various targets use std::swap on specific MCAsmOperands (ARM and possibly Hexagon as well). It might be helpful to mark those subclasses as final, to ensure that the availability of move/copy operations can't lead to slicing. (same sort of requirements as the non-vitual dtor - protected or a final class) llvm-svn: 243820	2015-08-01 04:40:41 +00:00
Sumanth Gundapaneni	532a13691c	[ARM] Lower modulo operation to generate __aeabi_divmod on Android For a modulo (reminder) operation, clang -target armv7-none-linux-gnueabi generates "__modsi3" clang -target armv7-none-eabi generates "__aeabi_idivmod" clang -target armv7-linux-androideabi generates "__modsi3" Android bionic libc doesn't provide a __modsi3, instead it provides a "__aeabi_idivmod". This patch fixes the LLVM ARMISelLowering to generate the correct call when ever there is a modulo operation. Differential Revision: http://reviews.llvm.org/D11661 llvm-svn: 243717	2015-07-31 00:45:12 +00:00
Sanjay Patel	1166f2ff9f	fix memcpy/memset/memmove lowering when optimizing for size Fixing MinSize attribute handling was discussed in D11363. This is a prerequisite patch to doing that. The handling of OptSize when lowering mem* functions was broken on Darwin because it wants to ignore -Os for these cases, but the existing logic also made it ignore -Oz (MinSize). The Linux change demonstrates a widespread problem. The backend doesn't usually recognize the MinSize attribute by itself; it assumes that if the MinSize attribute exists, then the OptSize attribute must also exist. Fixing this more generally will be a follow-on patch or two. Differential Revision: http://reviews.llvm.org/D11568 llvm-svn: 243693	2015-07-30 21:41:50 +00:00
Nick Lewycky	c3890d2969	Fix typo "fuction" noticed in comments in AssumptionCache.h, and also all the other files that have the same typo. All comments, no functionality change! (Merely a "fuctionality" change.) Bonus change to remove emacs major mode marker from SystemZMachineFunctionInfo.cpp because emacs already knows it's C++ from the extension. Also fix typo "appeary" in AMDGPUMCAsmInfo.h. llvm-svn: 243585	2015-07-29 22:32:47 +00:00
Akira Hatanaka	2670f4a550	[ARM] Define subtarget feature strict-align. This commit defines subtarget feature strict-align and uses it instead of cl::opt -arm-strict-align to decide whether strict alignment should be forced. Also, remove the logic that was checking the OS and architecture as clang is now responsible for setting strict-align based on the command line options specified and the target architecute and OS. rdar://problem/21529937 http://reviews.llvm.org/D11470 llvm-svn: 243493	2015-07-28 22:44:28 +00:00
Chih-Hung Hsieh	1e859582d6	Implement target independent TLS compatible with glibc's emutls.c. The 'common' section TLS is not implemented. Current C/C++ TLS variables are not placed in common section. DWARF debug info to get the address of TLS variables is not generated yet. clang and driver changes in http://reviews.llvm.org/D10524 Added -femulated-tls flag to select the emulated TLS model, which will be used for old targets like Android that do not support ELF TLS models. Added TargetLowering::LowerToTLSEmulatedModel as a target-independent function to convert a SDNode of TLS variable address to a function call to __emutls_get_address. Added into lib/Target//ISelLowering.cpp to call LowerToTLSEmulatedModel for TLSModel::Emulated. Although all targets supporting ELF TLS models are enhanced, emulated TLS model has been tested only for Android ELF targets. Modified AsmPrinter.cpp to print the emutls_v.* and emutls_t.* variables for emulated TLS variables. Modified DwarfCompileUnit.cpp to skip some DIE for emulated TLS variabls. TODO: Add proper DIE for emulated TLS variables. Added new unit tests with emulated TLS. Differential Revision: http://reviews.llvm.org/D10522 llvm-svn: 243438	2015-07-28 16:24:05 +00:00
Alexandros Lamprineas	4ea707555a	- Added support for parsing HWDiv features using Target Parser. - Architecture extensions are represented as a bitmap. Phabricator: http://reviews.llvm.org/D11457 llvm-svn: 243335	2015-07-27 22:26:59 +00:00
Colin LeMahieu	fe2c8b8015	[llvm-mc] Pushing plumbing through for --fatal-warnings flag. llvm-svn: 243334	2015-07-27 21:56:53 +00:00
Silviu Baranga	7581d22512	[ARM/AArch64] Fix cost model for interleaved accesses Summary: Fix the cost of interleaved accesses for ARM/AArch64. We were calling getTypeAllocSize and using it to check the number of bits, when we should have called getTypeAllocSizeInBits instead. This would pottentially cause the vectorizer to generate loads/stores and shuffles which cannot be matched with an interleaved access instruction. No performance changes are expected for now since matching/generating interleaved accesses is still disabled by default. Reviewers: rengolin Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D11524 llvm-svn: 243270	2015-07-27 14:39:34 +00:00
Luke Cheeseman	4d45ff2b87	[ARM] - Fix lowering of shufflevectors in AArch32 Some shufflevectors are currently being incorrectly lowered in the AArch32 backend as the existing checks for detecting the NEON operations from the shufflevector instruction expects the shuffle mask and the vector operands to be of the same length. This is not always the case as the mask may be twice as long as the operand; here only the lower half of the shufflemask gets checked, so provided the lower half of the shufflemask looks like a vector transpose (or even is just all -1 for undef) then the intrinsics may get incorrectly lowered into a vector transpose (VTRN) instruction. This patch fixes this by accommodating for both cases and adds regression tests. Differential Revision: http://reviews.llvm.org/D11407 llvm-svn: 243103	2015-07-24 09:57:05 +00:00
Luke Cheeseman	b5c627aba8	When lowering vector shifts a check is performed to see if the value to shift by is an immediate, in this check the value is negated and stored in and int64_t. The value can be -2^63 yet the result cannot be stored in an int64_t and this gives some undefined behaviour causing failures. The negation is only necessary when the values is within a certain range and so it should not need to negate -2^63, this patch introduces this and also a regression test. Differential Revision: http://reviews.llvm.org/D11408 llvm-svn: 243100	2015-07-24 09:31:48 +00:00
David Gross	d9c1bc9955	[ARM] Register (existing) ARMLoadStoreOpt pass with LLVM pass manager. Summary: Among other things, this allows -print-after-all/-print-before-all to dump IR around this pass. Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D11373 llvm-svn: 243052	2015-07-23 22:12:46 +00:00
David Gross	2ad5d173ce	Test commit. llvm-svn: 243046	2015-07-23 21:46:09 +00:00
Quentin Colombet	48b772007f	[ARM] Make the frame lowering code ready for shrink-wrapping. Shrink-wrapping can now be tested on ARM with -enable-shrink-wrap. Related to <rdar://problem/20821730> llvm-svn: 242908	2015-07-22 16:34:37 +00:00
Akira Hatanaka	285815258c	[ARM] Define subtarget feature "reserve-r9", which is used to decide whether register r9 should be reserved. This recommits r242737, which broke bots because the number of subtarget features went over the limit of 64. This change is needed because we cannot use a backend option to set cl::opt "arm-reserve-r9" when doing LTO. Out-of-tree projects currently using cl::opt option "-arm-reserve-r9" to reserve r9 should make changes to add subtarget feature "reserve-r9" to the IR. rdar://problem/21529937 Differential Revision: http://reviews.llvm.org/D11320 llvm-svn: 242756	2015-07-21 01:42:02 +00:00
Matthias Braun	a50d2203fa	ARMLoadStoreOpt: Merge subs/adds into LDRD/STRD; Factor out common code Re-apply of r241928 which had to be reverted because of the r241926 revert. This commit factors out common code from MergeBaseUpdateLoadStore() and MergeBaseUpdateLSMultiple() and introduces a new function MergeBaseUpdateLSDouble() which merges adds/subs preceding/following a strd/ldrd instruction into an strd/ldrd instruction with writeback where possible. Differential Revision: http://reviews.llvm.org/D10676 llvm-svn: 242743	2015-07-21 00:19:01 +00:00
Matthias Braun	e40d89ef9b	ARMLoadStoreOptimizer: Create LDRD/STRD on thumb2 Re-apply r241926 with an additional check that r13 and r15 are not used for LDRD/STRD. See http://llvm.org/PR24190. This also already includes the fix from r241951. Differential Revision: http://reviews.llvm.org/D10623 llvm-svn: 242742	2015-07-21 00:18:59 +00:00
Akira Hatanaka	42427d2c38	Revert r242737. This caused builds to fail with the following error message: error:Too many subtarget features! Bump MAX_SUBTARGET_FEATURES. llvm-svn: 242740	2015-07-20 23:51:12 +00:00
Akira Hatanaka	7482d40cd5	[ARM] Define subtarget feature "reserve-r9", which is used to decide whether register r9 should be reserved. This change is needed because we cannot use a backend option to set cl::opt "arm-reserve-r9" when doing LTO. Out-of-tree projects currently using cl::opt option "-arm-reserve-r9" to reserve r9 should make changes to add subtarget feature "reserve-r9" to the IR. rdar://problem/21529937 Differential Revision: http://reviews.llvm.org/D11320 llvm-svn: 242737	2015-07-20 23:21:30 +00:00
Matthias Braun	731e359e70	Revert "ARMLoadStoreOptimizer: Create LDRD/STRD on thumb2" This reverts commit r241926. This caused http://llvm.org/PR24190 llvm-svn: 242735	2015-07-20 23:17:20 +00:00
Matthias Braun	84e289702a	Revert "ARMLoadStoreOpt: Merge subs/adds into LDRD/STRD; Factor out common code" This reverts commit r241928. This caused http://llvm.org/PR24190 llvm-svn: 242734	2015-07-20 23:17:16 +00:00
Matthias Braun	22f3960759	Revert "ARM: Use SpecificBumpPtrAllocator to fix leak introduced in r241920" This reverts commit r241951. It caused http://llvm.org/PR24190 llvm-svn: 242733	2015-07-20 23:17:14 +00:00
JF Bastien	e4d22d59d1	Targets: commonize some stack realignment code This patch does the following: * Fix FIXME on `needsStackRealignment`: it is now shared between multiple targets, implemented in `TargetRegisterInfo`, and isn't `virtual` anymore. This will break out-of-tree targets, silently if they used `virtual` and with a build error if they used `override`. * Factor out `canRealignStack` as a `virtual` function on `TargetRegisterInfo`, by default only looks for the `no-realign-stack` function attribute. Multiple targets duplicated the same `needsStackRealignment` code: - Aarch64. - ARM. - Mips almost: had extra `DEBUG` diagnostic, which the default implementation now has. - PowerPC. - WebAssembly. - x86 almost: has an extra `-force-align-stack` option, which the default implementation now has. The default implementation of `needsStackRealignment` used to just return `false`. My current patch changes the behavior by simply using the above shared behavior. This affects: - AMDGPU - BPF - CppBackend - MSP430 - NVPTX - Sparc - SystemZ - XCore - Out-of-tree targets This is a breaking change! `make check` passes. The only implementation of the `virtual` function (besides the slight different in x86) was Hexagon (which did `MF.getFrameInfo()->getMaxAlignment() > 8`), and potentially some out-of-tree targets. Hexagon now uses the default implementation. `needsStackRealignment` was being overwritten in `<Target>GenRegisterInfo.inc`, to return `false` as the default also did. That was odd and is now gone. Reviewers: sunfish Subscribers: aemerson, llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11160 llvm-svn: 242727	2015-07-20 22:51:32 +00:00
Quentin Colombet	71a71485f4	[ARM] Refactor the prologue/epilogue emission to be more robust. This is the first step toward supporting shrink-wrapping for this target. The changes could be summarized by these items: - Expand the tail-call return as part of the expand pseudo pass. - Get rid of the assumptions that the epilogue is the exit block: * Do not assume which registers are free in the epilogue. (This indirectly improve the lowering of the code for the segmented stacks, see the test cases.) * Take into account that the basic block can be empty. Related to <rdar://problem/20821730> llvm-svn: 242714	2015-07-20 21:42:14 +00:00
Matthias Braun	9e85980658	ARM: Enable MachineScheduler and disable PostRAScheduler for swift. Reapply r242500 now that the swift schedmodel includes LDRLIT. This is mostly done to disable the PostRAScheduler which optimizes for instruction latencies which isn't a good fit for out-of-order architectures. This also allows to leave out the itinerary table in swift in favor of the SchedModel ones. This change leads to performance improvements/regressions by as much as 10% in some benchmarks, in fact we loose 0.4% performance over the llvm-testsuite for reasons that appear to be unknown or out of the compilers control. rdar://20803802 documents the investigation of these effects. While it is probably a good idea to perform the same switch for the other ARM out-of-order CPUs, I limited this change to swift as I cannot perform the benchmark verification on the other CPUs. Differential Revision: http://reviews.llvm.org/D10513 llvm-svn: 242588	2015-07-17 23:18:30 +00:00
Matthias Braun	141d1c9d8f	ARM: Add scheduling information for LDRLIT instructions to swift scheduling model These pseudo instructions are only lowered after register allocation and are therefore still present when the machine scheduler runs. Add a run: line to a testcase that uses the uncommon flags necessary to actually produce a LDRLIT instruction on swift. llvm-svn: 242587	2015-07-17 23:18:26 +00:00
Adam Nemet	5a6d5bc17b	Revert "ARM: Enable MachineScheduler and disable PostRAScheduler for swift." This reverts commit r242500. It broke some internal tests and Matthias asked me to revert it while he is investigating. llvm-svn: 242553	2015-07-17 18:14:19 +00:00
James Molloy	a6702e2f14	[ARM] Use [SU]ABSDIFF nodes instead of intrinsics for VABD/VABA No functional change, but it preps codegen for the future when SABSDIFF will start getting generated in anger. llvm-svn: 242546	2015-07-17 17:10:55 +00:00
Matthias Braun	2d8315f806	ARM: Enable MachineScheduler and disable PostRAScheduler for swift. This is mostly done to disable the PostRAScheduler which optimizes for instruction latencies which isn't a good fit for out-of-order architectures. This also allows to leave out the itinerary table in swift in favor of the SchedModel ones. This change leads to performance improvements/regressions by as much as 10% in some benchmarks, in fact we loose 0.4% performance over the llvm-testsuite for reasons that appear to be unknown or out of the compilers control. rdar://20803802 documents the investigation of these effects. While it is probably a good idea to perform the same switch for the other ARM out-of-order CPUs, I limited this change to swift as I cannot perform the benchmark verification on the other CPUs. Differential Revision: http://reviews.llvm.org/D10513 llvm-svn: 242500	2015-07-17 01:44:31 +00:00
Matthias Braun	da3d0d7342	Arm: Don't define a label twice with two setjmps in a function. Constructing a name based on the function name didn't give us a unique symbol if we had more than one setjmp in a function. Using MCContext::createTempSymbol() always gives us a unique name. Differential Revision: http://reviews.llvm.org/D9314 llvm-svn: 242482	2015-07-16 22:34:20 +00:00
Matthias Braun	3cd00c1739	Fix __builtin_setjmp in combination with sjlj exception handling. llvm.eh.sjlj.setjmp was used as part of the SjLj exception handling style but is also used in clang to implement __builtin_setjmp. The ARM backend needs to output additional dispatch tables for the SjLj exception handling style, these tables however can't be emitted if llvm.eh.sjlj.setjmp is simply used for __builtin_setjmp and no actual landing pad blocks exist. To solve this issue a new llvm.eh.sjlj.setup_dispatch intrinsic is introduced which is used instead of llvm.eh.sjlj.setjmp in the SjLj exception handling lowering, so we can differentiate between the case where we actually need to setup a dispatch table and the case where we just need the __builtin_setjmp semantic. Differential Revision: http://reviews.llvm.org/D9313 llvm-svn: 242481	2015-07-16 22:34:16 +00:00
Pete Cooper	f4ce569deb	Revert "Add missing load/store flags to thumb2 instructions." This reverts commit r242300. This is causing buildbot failures which we are investigating. I'll reapply once we know whats going on, but for now want to get the bots green. llvm-svn: 242428	2015-07-16 18:38:13 +00:00
Mehdi Amini	bd7287ebe5	Move most user of TargetMachine::getDataLayout to the Module one Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. This patch is quite boring overall, except for some uglyness in ASMPrinter which has a getDataLayout function but has some clients that use it without a Module (llmv-dsymutil, llvm-dwarfdump), so some methods are taking a DataLayout as parameter. Reviewers: echristo Subscribers: yaron.keren, rafael, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11090 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 242386	2015-07-16 06:11:10 +00:00
Akira Hatanaka	024d91a00b	[ARM] Define a subtarget feature that is used to avoid using movt/movw pairs for 32-bit immediates. This change is needed to avoid emitting movt/movw pairs when doing LTO and do so on a per-function basis. Out-of-tree projects currently using cl::opt option -arm-use-movt=0 or false to avoid emitting movt/movw pairs should make changes to add subtarget feature "+no-movt" (see the changes made to clang in r242368). rdar://problem/21529937 Differential Revision: http://reviews.llvm.org/D11026 llvm-svn: 242369	2015-07-16 00:58:23 +00:00
Pete Cooper	e3c8161736	Clear kill flags in ARMLoadStoreOptimizer. The pass here was clearing kill flags on instructions which had their sources killed in the instruction being combined. But given that the new instruction is inserted after the existing ones, any existing instructions with kill flags will lead to the verifier complaining that we are reading an undefined physreg. For example, what we had prior to this optimization is t2STRi12 %R1, %SP, 12 t2STRi12 %R1<kill>, %SP, 16 t2STRi12 %R0<kill>, %SP, 8 and prior to this fix that would generate t2STRi12 %R1<kill>, %SP, 16 t2STRDi8 %R0<kill>, %R1, %SP, 8 This is clearly incorrect as it didn't clear the kill flag on R1 used with offset 16 because there was no kill flag on the instruction with offset 12. After this change we clear the kill flag on the offset 16 instruction because we know it will be used afterwards in the new instruction. I haven't provided a test case. I have a small test, but even it is very sensitive to register allocation order which isn't ideal. llvm-svn: 242359	2015-07-16 00:09:18 +00:00
Matthias Braun	5d1f12d1f5	TargetRegisterInfo: Provide a way to check assigned registers in getRegAllocationHints() Pass a const reference to LiveRegMatrix to getRegAllocationHints() because some targets can prodive better hints if they can test whether a physreg has been used for register allocation yet. llvm-svn: 242340	2015-07-15 22:16:00 +00:00
Pete Cooper	21ca199cea	Add missing load/store flags to thumb2 instructions. These were the cause of a verifier error when building 7zip with -verify-machineinstrs. Running 'make check' with the verifier triggered the same error on the test here so i've updated the test to run the verifier on one of its runs instead of adding a new one. While looking at this code, there was a stale comment that these instructions were only used for disassembly. This probably used to be the case, but they are now used in the 'ARM load / store optimization pass' too. llvm-svn: 242300	2015-07-15 16:36:38 +00:00
JF Bastien	c8f48c19d3	WebAssembly: fix build breakage. Summary: processFunctionBeforeCalleeSavedScan was renamed to determineCalleeSaves and now takes a BitVector parameter as of rL242165, reviewed in http://reviews.llvm.org/D10909 WebAssembly is still marked as experimental and therefore doesn't build by default. It does, however, grep by default! I notice that processFunctionBeforeCalleeSavedScan is still mentioned in a few comments and error messages, which I also fixed. Reviewers: qcolombet, sunfish Subscribers: jfb, dsanders, hfinkel, MatzeB, llvm-commits Differential Revision: http://reviews.llvm.org/D11199 llvm-svn: 242242	2015-07-14 23:06:07 +00:00
Matthias Braun	0256486532	PrologEpilogInserter: Rewrite API to determine callee save regsiters. This changes TargetFrameLowering::processFunctionBeforeCalleeSavedScan(): - Rename the function to determineCalleeSaves() - Pass a bitset of callee saved registers by reference, thus avoiding the function-global PhysRegUsed bitset in MachineRegisterInfo. - Without PhysRegUsed the implementation is fine tuned to not save physcial registers which are only read but never modified. Related to rdar://21539507 Differential Revision: http://reviews.llvm.org/D10909 llvm-svn: 242165	2015-07-14 17:17:13 +00:00
Hans Wennborg	61f9efe73b	ARMAsmParser: Take MCInst param by const-ref (Broken out from http://reviews.llvm.org/D11167) llvm-svn: 242160	2015-07-14 16:39:01 +00:00
Yaron Keren	d1ba2d9d8b	Generate correct asm info for mingw and cygwin ARM targets. http://reviews.llvm.org/D11075 Patch by Martell Malone Reviewed by Reid Kleckner llvm-svn: 242123	2015-07-14 05:51:05 +00:00
Logan Chien	0a43abc9f8	ARM: Fix cttz expansion on vector types. The 64/128-bit vector types are legal if NEON instructions are available. However, there was no matching patterns for @llvm.cttz.*() intrinsics and result in fatal error. This commit fixes the problem by lowering cttz to: a. ctpop((x & -x) - 1) b. width - ctlz(x & -x) - 1 llvm-svn: 242037	2015-07-13 15:37:30 +00:00
Scott Douglass	69bf1ce03a	[ARM] Handle commutativity when converting to tADDhirr in Thumb2 Also, run thumb_rewrite.s tests in Thumb2 now that they pass. Differential Revision: http://reviews.llvm.org/D11132 llvm-svn: 242036	2015-07-13 15:31:48 +00:00
Scott Douglass	d9d8d26458	[ARM] Add Thumb2 ADD with SP narrowing from 3 operand to 2 Differential Revision: http://reviews.llvm.org/D11131 llvm-svn: 242035	2015-07-13 15:31:40 +00:00
Scott Douglass	039f768c42	[ARM] Small refactor of tryConvertingToTwoOperandForm (nfc) Also, add more Thumb2 ADD tests requested during review of http://reviews.llvm.org/D11053. Differential Revision: http://reviews.llvm.org/D11130 llvm-svn: 242034	2015-07-13 15:31:33 +00:00
Aaron Ballman	6d8f785073	Removing several -Wunused-but-set-variable warnings; NFC intended. llvm-svn: 242028	2015-07-13 14:04:30 +00:00
Renato Golin	1ef7a0f7c0	[ARM] Add support for nest attribute using r12 Register r12 ('ip') is used by GCC for this purpose and hence is used here. As discussed on the GCC mailing list, the register choice is an ABI issue and so choosing the same register as GCC means __builtin_call_with_static_chain is compatible. A similar patch has just gone in the AArch64 backend, so this is just the ARM counterpart, following the same discussion. Patch by Stephen Cross. llvm-svn: 241996	2015-07-12 18:16:40 +00:00
Duncan P. N. Exon Smith	e463e470f8	MC: Only allow changing feature bits in MCSubtargetInfo Disallow all mutation of `MCSubtargetInfo` expect the feature bits. Besides deleting the assignment operators -- which were dead "code" -- this restricts `InitMCProcessorInfo()` to subclass initialization sequences, and exposes a new more limited function called `setDefaultFeatures()` for use by the ARMAsmParser `.cpu` directive. There's a small functional change here: ARMAsmParser used to adjust `MCSubtargetInfo::CPUSchedModel` as a side effect of calling `InitMCProcessorInfo()`, but I've removed that suspicious behaviour. Since the AsmParser shouldn't be doing any scheduling, there shouldn't be any observable change... llvm-svn: 241961	2015-07-10 22:52:15 +00:00
Duncan P. N. Exon Smith	754e21f244	MC: Remove MCSubtargetInfo() default constructor Force all creators of `MCSubtargetInfo` to immediately initialize it, merging the default constructor and the initializer into an initializing constructor. Besides cleaning up the code a little, this makes it clear that the initializer is never called again later. Out-of-tree backends need a trivial change: instead of calling: auto *X = new MCSubtargetInfo(); InitXYZMCSubtargetInfo(X, ...); return X; they should call: return createXYZMCSubtargetInfoImpl(...); There's no real functionality change here. llvm-svn: 241957	2015-07-10 22:43:42 +00:00
Duncan P. N. Exon Smith	bb57d73805	MC: Remove MCSubtargetInfo::InitCPUSched() Remove all calls to `MCSubtargetInfo::InitCPUSched()` and merge its body into the only relevant caller, `MCSubtargetInfo::InitMCProcessorInfo()`. We were only calling the former after explicitly calling the latter with the same CPU; it's confusing to have both methods exposed. Besides a minor (surely unmeasurable) speedup in ARM and X86 from avoiding running the logic twice, no functionality change. llvm-svn: 241956	2015-07-10 22:33:01 +00:00
Matthias Braun	e5a112f5e1	ARM: Use SpecificBumpPtrAllocator to fix leak introduced in r241920 llvm-svn: 241951	2015-07-10 22:23:57 +00:00
Matthias Braun	d9bd22b2c4	ARMLoadStoreOpt: Merge subs/adds into LDRD/STRD; Factor out common code This commit factors out common code from MergeBaseUpdateLoadStore() and MergeBaseUpdateLSMultiple() and introduces a new function MergeBaseUpdateLSDouble() which merges adds/subs preceding/following a strd/ldrd instruction into an strd/ldrd instruction with writeback where possible. Differential Revision: http://reviews.llvm.org/D10676 llvm-svn: 241928	2015-07-10 18:37:33 +00:00
Matthias Braun	e4ba6b8c24	ARMLoadStoreOptimizer: Create LDRD/STRD on thumb2 Differential Revision: http://reviews.llvm.org/D10623 llvm-svn: 241926	2015-07-10 18:28:49 +00:00
JF Bastien	b73a2ed20e	Target RegisterInfo: devirtualize TargetFrameLowering Summary: The target frame lowering's concrete type is always known in RegisterInfo, yet it's only sometimes devirtualized through a static_cast. This change adds an auto-generated static function <Target>GenRegisterInfo::getFrameLowering(const MachineFunction &MF) which does this devirtualization, and uses this function in all targets which can. This change was suggested by sunfish in D11070 for WebAssembly, I figure that I may as well improve the other targets while I'm here. Subscribers: sunfish, ted, llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11093 llvm-svn: 241921	2015-07-10 18:13:17 +00:00
Matthias Braun	a4a3182ded	ARMLoadStoreOptimizer: Rewrite LDM/STM matching logic. This improves the logic in several ways and is a preparation for followup patches: - First perform an analysis and create a list of merge candidates, then transform. This simplifies the code in that you have don't have to care to much anymore that you may be holding iterators to MachineInstrs that get removed. - Analyze/Transform basic blocks in reverse order. This allows to use LivePhysRegs to find free registers instead of the RegisterScavenger. The RegisterScavenger will become less precise in the future as it relies on the deprecated kill-flags. - Return the newly created node in MergeOps so there's no need to look around in the schedule to find it. - Rename some MBBI iterators to InsertBefore to make their role clear. - General code cleanup. Differential Revision: http://reviews.llvm.org/D10140 llvm-svn: 241920	2015-07-10 18:08:49 +00:00
Pat Gavlin	a717f255b6	Allow {e,r}bp as the target of {read,write}_register. This patch allows the read_register and write_register intrinsics to read/write the RBP/EBP registers on X86 iff the targeted register is the frame pointer for the containing function. Differential Revision: http://reviews.llvm.org/D10977 llvm-svn: 241827	2015-07-09 17:40:29 +00:00
Scott Douglass	8143bc25ee	[ARM] Thumb1 3 to 2 operand convertion for commutative operations Differential Revision: http://reviews.llvm.org/D11057 llvm-svn: 241802	2015-07-09 14:13:55 +00:00
Scott Douglass	2740a63725	[ARM] Don't be overzealous converting Thumb1 3 to 2 operands Differential Revision: http://reviews.llvm.org/D11056 llvm-svn: 241801	2015-07-09 14:13:48 +00:00
Scott Douglass	47a3fce461	[ARM] Add Thumb2 ADD with PC narrowing from 3 operand to 2 Differential Revision: http://reviews.llvm.org/D11055 llvm-svn: 241800	2015-07-09 14:13:41 +00:00
Scott Douglass	8c7803f4c1	[ARM] Refactor converting Thumb1 from 3 to 2 operand (nfc) Also adds some test cases. Differential Revision: http://reviews.llvm.org/D11054 llvm-svn: 241799	2015-07-09 14:13:34 +00:00
Mehdi Amini	157e5a6d10	Remove getDataLayout() from TargetSelectionDAGInfo (had no users) Summary: Remove empty subclass in the process. This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren, ted Differential Revision: http://reviews.llvm.org/D11045 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241780	2015-07-09 02:10:08 +00:00
Mehdi Amini	a749f2ad47	Remove getDataLayout() from TargetLowering Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: yaron.keren, rafael, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11042 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241779	2015-07-09 02:09:52 +00:00
Mehdi Amini	0cdec1e2ab	Make isLegalAddressingMode() taking DataLayout as an argument Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11040 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241778	2015-07-09 02:09:40 +00:00
Mehdi Amini	44ede33a69	Make TargetLowering::getPointerTy() taking DataLayout as an argument Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, ted, yaron.keren, rafael, llvm-commits Differential Revision: http://reviews.llvm.org/D11028 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241775	2015-07-09 02:09:04 +00:00
Mehdi Amini	5010ebf181	Make TargetTransformInfo keeping a reference to the Module DataLayout DataLayout is no longer optional. It was initialized with or without a DataLayout, and the DataLayout when supplied could have been the one from the TargetMachine. Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11021 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241774	2015-07-09 02:08:42 +00:00
Mehdi Amini	56228dabfa	Redirect DataLayout from TargetMachine to Module in ComputeValueVTs() Summary: Avoid using the TargetMachine owned DataLayout and use the Module owned one instead. This requires passing the DataLayout up the stack to ComputeValueVTs(). This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, yaron.keren, rafael, llvm-commits Differential Revision: http://reviews.llvm.org/D11019 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241773	2015-07-09 01:57:34 +00:00
Duncan P. N. Exon Smith	ad98745561	MC: Constify MCSubtargetInfo in getDeprecationInfo(), NFC There's no reason to be able to mutate `MCSubtargetInfo` in `getDeprecationInfo()`. Constify the reference. llvm-svn: 241693	2015-07-08 17:30:55 +00:00
Mehdi Amini	ffc1402fad	Remove IsLittleEndian from TargetLowering and redirect to DataLayout Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11017 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241655	2015-07-08 01:00:38 +00:00
Akira Hatanaka	1bc8af78f4	[ARM] Define a subtarget feature and use it to decide whether long calls should be emitted. This is needed to enable ARM long calls for LTO and enable and disable it on a per-function basis. Out-of-tree projects currently using EnableARMLongCalls to emit long calls should start passing "+long-calls" to the feature string (see the changes made to clang in r241565). rdar://problem/21529937 Differential Revision: http://reviews.llvm.org/D9364 llvm-svn: 241566	2015-07-07 06:54:42 +00:00
Daniel Sanders	f423f5627c	Change the last few internal StringRef triples into Triple objects. Summary: This concludes the patch series to eliminate StringRef forms of GNU triples from the internals of LLVM that began in r239036. At this point, the StringRef-form of GNU Triples should only be used in the public API (including IR serialization) and a couple objects that directly interact with the API (most notably the Module class). The next step is to replace these Triple objects with the TargetTuple object that will represent our authoratative/unambiguous internal equivalent to GNU Triples. Reviewers: rengolin Subscribers: llvm-commits, jholewinski, ted, rengolin Differential Revision: http://reviews.llvm.org/D10962 llvm-svn: 241472	2015-07-06 16:56:07 +00:00
Daniel Sanders	fbdab437f0	Where Triple has a suitable predicate, use it rather than the enum values. NFC. Reviewers: mcrosier Subscribers: llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10960 llvm-svn: 241469	2015-07-06 16:33:18 +00:00
Peter Collingbourne	6a9d1774d0	IR: Do not consider available_externally linkage to be linker-weak. From the linker's perspective, an available_externally global is equivalent to an external declaration (per isDeclarationForLinker()), so it is incorrect to consider it to be a weak definition. Also clean up some logic in the dead argument elimination pass and clarify its comments to better explain how its behavior depends on linkage, introduce GlobalValue::isStrongDefinitionForLinker() and start using it throughout the optimizers and backend. Differential Revision: http://reviews.llvm.org/D10941 llvm-svn: 241413	2015-07-05 20:52:35 +00:00
Benjamin Kramer	9bfb627a0e	[TargetLowering] StringRefize asm constraint getters. There is some functional change here because it changes target code from atoi(3) to StringRef::getAsInteger which has error checking. For valid constraints there should be no difference. llvm-svn: 241411	2015-07-05 19:29:18 +00:00
Ranjeet Singh	86ecbb7b54	Reverting r241058 because it's causing buildbot failures. llvm-svn: 241061	2015-06-30 12:32:53 +00:00
Ranjeet Singh	5b119091a1	There are a few places where subtarget features are still represented by uint64_t, this patch replaces these usages with the FeatureBitset (std::bitset) type. Differential Revision: http://reviews.llvm.org/D10542 llvm-svn: 241058	2015-06-30 11:30:42 +00:00
Tim Northover	83f0fbcc37	ARM: add correct kill flags when combining stm instructions When the store sequence being combined actually stores the base register, we should not mark it as killed until the end. rdar://21504262 llvm-svn: 241003	2015-06-29 21:42:16 +00:00
Javed Absar	d5526303b7	[ARM]: Extend -mfpu options for half-precision and vfpv3xd Some of the the permissible ARM -mfpu options, which are supported in GCC, are currently not present in llvm/clang.This patch adds the options: 'neon-fp16', 'vfpv3-fp16', 'vfpv3-d16-fp16', 'vfpv3xd' and 'vfpv3xd-fp16. These are related to half-precision floating-point and single precision. Reviewers: rengolin, ranjeet.singh Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10645 llvm-svn: 240930	2015-06-29 09:32:29 +00:00
Javed Absar	bced3032e0	[ARM] Cortex-R5 is not VFPOnlySP This patch fixes the error in ARM.td which stated that Cortex-R5 floating point unit can do only single precision, when it can do double as well. Reviewers: rengolin Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10769 llvm-svn: 240799	2015-06-26 17:42:37 +00:00
Javed Absar	99a9343ae6	[ARM] Cortex-R4F is not VFPOnlySP Cortex-R4F TRM states that fpu supports both single and double precision. This patch corrects the information in ARM.td file and corresponding test. Reviewers: rengolin Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10763 llvm-svn: 240776	2015-06-26 12:14:56 +00:00
Rafael Espindola	c5fb508c9d	Optimize the creation of mapping symbols. No need to create two symbols just to assign one to the other. llvm-svn: 240773	2015-06-26 11:31:13 +00:00
Hao Liu	2cd34bb585	[ARM] Lower interleaved memory accesses to vldN/vstN intrinsics. This patch also adds a function to calculate the cost of interleaved memory accesses. E.g. Lower an interleaved load: %wide.vec = load <8 x i32>, <8 x i32>* %ptr, align 4 %v0 = shuffle %wide.vec, undef, <0, 2, 4, 6> %v1 = shuffle %wide.vec, undef, <1, 3, 5, 7> into: %vld2 = { <4 x i32>, <4 x i32> } call llvm.arm.neon.vld2(%ptr, 4) %vec0 = extractelement { <4 x i32>, <4 x i32> } %vld2, i32 0 %vec1 = extractelement { <4 x i32>, <4 x i32> } %vld2, i32 1 E.g. Lower an interleaved store: %i.vec = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11> store <12 x i32> %i.vec, <12 x i32>* %ptr, align 4 into: %sub.v0 = shuffle <8 x i32> %v0, <8 x i32> v1, <0, 1, 2, 3> %sub.v1 = shuffle <8 x i32> %v0, <8 x i32> v1, <4, 5, 6, 7> %sub.v2 = shuffle <8 x i32> %v0, <8 x i32> v1, <8, 9, 10, 11> call void llvm.arm.neon.vst3(%ptr, %sub.v0, %sub.v1, %sub.v2, 4) Differential Revision: http://reviews.llvm.org/D10533 llvm-svn: 240755	2015-06-26 02:45:36 +00:00
Benjamin Kramer	e61cbd1f3a	Replace copy-pasted debug value skipping with MBB::getLastNonDebugInstr No functional change intended. llvm-svn: 240639	2015-06-25 13:28:24 +00:00
Matthias Braun	ba3ecc3c80	ARMLoadStoreOptimizer: Fix errata 602117 handling and make testcase actually test for it This fixes PR23912 Differential Revision: http://reviews.llvm.org/D10620 llvm-svn: 240582	2015-06-24 20:03:27 +00:00
John Brawn	d86e004b7e	[ARM] ARMLoadStoreOpt::UpdateBaseRegUses should stop on def When UpdateBaseRegUses sees an instruction that defines the base register it must stop, as the base register value it is updating is no longer live. Ideally we would already have seen the register be killed (which is already checked for), but the kill flags may be inaccurate and we have to account for this. Differential Revision: http://reviews.llvm.org/D10566 llvm-svn: 240424	2015-06-23 16:02:11 +00:00
Alexander Kornienko	f00654e31b	Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC) Apparently, the style needs to be agreed upon first. llvm-svn: 240390	2015-06-23 09:49:53 +00:00
Pete Cooper	80d21cb40d	Change .thumb_set to have the same error checks as .set. According to the documentation, .thumb_set is 'the equivalent of a .set directive'. We didn't have equivalent behaviour in terms of all the errors we could throw, for example, when a symbol is redefined. This change refactors parseAssignment so that it can be used by .set and .thumb_set and implements tests for .thumb_set for all the errors thrown by that method. Reviewed by Rafael Espíndola. llvm-svn: 240318	2015-06-22 19:35:57 +00:00
Alexander Kornienko	70bc5f1398	Fixed/added namespace ending comments using clang-tidy. NFC The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-,llvm-namespace-comment -header-filter='llvm/.\|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137	2015-06-19 15:57:42 +00:00
Ahmed Bougacha	9a9094260d	[ARM] Look through concat when lowering in-place shuffles (VZIP, ..) Currently, we canonicalize shuffles that produce a result larger than their operands with: shuffle(concat(v1, undef), concat(v2, undef)) -> shuffle(concat(v1, v2), undef) because we can access quad vectors (see PerformVECTOR_SHUFFLECombine). This is useful in the general case, but there are special cases where native shuffles produce larger results: the two-result ops. We can look through the concat when lowering them: shuffle(concat(v1, v2), undef) -> concat(VZIP(v1, v2):0, :1) This lets us generate the native shuffles instead of scalarizing to dozens of VMOVs. Differential Revision: http://reviews.llvm.org/D10424 llvm-svn: 240118	2015-06-19 02:32:35 +00:00
Ahmed Bougacha	2ffa91f908	[ARM] Factor out two-result shuffle matching. NFCI. In preparation for a future patch: makes it easier to do the same matching to generate different nodes, without duplication. llvm-svn: 240116	2015-06-19 02:25:01 +00:00
Eric Christopher	572e03a396	Fix "the the" in comments. llvm-svn: 240112	2015-06-19 01:53:21 +00:00
Daniel Sanders	c81f450f1a	Clean up redundant copies of Triple objects. NFC Summary: Reviewers: rengolin Reviewed By: rengolin Subscribers: llvm-commits, rengolin, jholewinski Differential Revision: http://reviews.llvm.org/D10382 llvm-svn: 239823	2015-06-16 15:44:21 +00:00
Matthias Braun	39a2afc941	Rename TargetSubtargetInfo::enablePostMachineScheduler() to enablePostRAScheduler() r213101 changed the behaviour of this method to not only affect the PostMachineScheduler scheduler but also the PostRAScheduler scheduler, renaming should make this fact clear. Also document that the preferred way is to specify this in the scheduling model instead of overriding this method. Differential Revision: http://reviews.llvm.org/D10427 llvm-svn: 239659	2015-06-13 03:42:16 +00:00
Matthias Braun	88e213159a	MachineLICM: Use TargetSchedModel instead of just itineraries This will use Itinieraries if available, but will also work if just a MCSchedModel is available. Differential Revision: http://reviews.llvm.org/D10428 llvm-svn: 239658	2015-06-13 03:42:11 +00:00
Daniel Sanders	3e5de88dac	Replace string GNU Triples with llvm::Triple in TargetMachine. NFC. Summary: For the moment, TargetMachine::getTargetTriple() still returns a StringRef. This continues the patch series to eliminate StringRef forms of GNU triples from the internals of LLVM that began in r239036. Reviewers: rengolin Reviewed By: rengolin Subscribers: ted, llvm-commits, rengolin, jholewinski Differential Revision: http://reviews.llvm.org/D10362 llvm-svn: 239554	2015-06-11 19:41:26 +00:00
Ahmed Bougacha	c88bf54366	[CodeGen] ArrayRef'ize cond/pred in various TII APIs. NFC. llvm-svn: 239553	2015-06-11 19:30:37 +00:00
Daniel Sanders	ed64d62c70	Replace string GNU Triples with llvm::Triple in computeDataLayout(). NFC. Summary: This continues the patch series to eliminate StringRef forms of GNU triples from the internals of LLVM that began in r239036. Reviewers: rengolin Reviewed By: rengolin Subscribers: llvm-commits, jfb, rengolin Differential Revision: http://reviews.llvm.org/D10361 llvm-svn: 239538	2015-06-11 15:34:59 +00:00
Reid Kleckner	c35e7f52ba	Revert "Move dllimport name mangling to IR mangler." This reverts commit r239437. This broke clang-cl self-hosts. We'd end up calling the __imp_ symbol directly instead of using it to do an indirect function call. llvm-svn: 239502	2015-06-11 01:31:48 +00:00
Daniel Sanders	a73f1fdb19	Replace string GNU Triples with llvm::Triple in MCSubtargetInfo and create*MCSubtargetInfo(). NFC. Summary: This continues the patch series to eliminate StringRef forms of GNU triples from the internals of LLVM that began in r239036. Reviewers: rafael Reviewed By: rafael Subscribers: rafael, ted, jfb, llvm-commits, rengolin, jholewinski Differential Revision: http://reviews.llvm.org/D10311 llvm-svn: 239467	2015-06-10 12:11:26 +00:00
Daniel Sanders	9aa7e38bf8	Replace string GNU Triples with llvm::Triple in create*MCRelocationInfo(). NFC. Summary: This continues the patch series to eliminate StringRef forms of GNU triples from the internals of LLVM that began in r239036. Reviewers: rafael Reviewed By: rafael Subscribers: rafael, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10307 llvm-svn: 239465	2015-06-10 10:54:40 +00:00
Daniel Sanders	418caf5002	Replace string GNU Triples with llvm::Triple in MCAsmBackend subclasses and create*AsmBackend(). NFC. Summary: This continues the patch series to eliminate StringRef forms of GNU triples from the internals of LLVM that began in r239036. Reviewers: echristo, rafael Reviewed By: rafael Subscribers: rafael, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10243 llvm-svn: 239464	2015-06-10 10:35:34 +00:00
Peter Collingbourne	9fe51fdf18	Move dllimport name mangling to IR mangler. This ensures that LTO clients see the correct external symbol name. Differential Revision: http://reviews.llvm.org/D10318 llvm-svn: 239437	2015-06-09 22:09:53 +00:00
Akira Hatanaka	d9699bc7bd	Remove DisableTailCalls from TargetOptions and the code in resetTargetOptions that was resetting it. Remove the uses of DisableTailCalls in subclasses of TargetLowering and use the value of function attribute "disable-tail-calls" instead. Also, unconditionally add pass TailCallElim to the pipeline and check the function attribute at the start of runOnFunction to disable the pass on a per-function basis. This is part of the work to remove TargetMachine::resetTargetOptions, and since DisableTailCalls was the last non-fast-math option that was being reset in that function, we should be able to remove the function entirely after the work to propagate IR-level fast-math flags to DAG nodes is completed. Out-of-tree users should remove the uses of DisableTailCalls and make changes to attach attribute "disable-tail-calls"="true" or "false" to the functions in the IR. rdar://problem/13752163 Differential Revision: http://reviews.llvm.org/D10099 llvm-svn: 239427	2015-06-09 19:07:19 +00:00
Aaron Ballman	3182ee92ba	Removing spurious semi colons; NFC. llvm-svn: 239399	2015-06-09 12:03:46 +00:00
Matt Arsenault	8b643559d4	MC: Add target hook to control symbol quoting llvm-svn: 239370	2015-06-09 00:31:39 +00:00
Akira Hatanaka	4a61619ff5	[ARM] Pass a callback to FunctionPass constructors to enable skipping execution on a per-function basis. Previously some of the passes were conditionally added to ARM's pass pipeline based on the target machine's subtarget. This patch makes changes to add those passes unconditionally and execute them conditonally based on the predicate functor passed to the pass constructors. This enables running different sets of passes for different functions in the module. rdar://problem/20542263 Differential Revision: http://reviews.llvm.org/D8717 llvm-svn: 239325	2015-06-08 18:50:43 +00:00
Pete Cooper	4915dd076f	Remove includes of MCMachOSymbolFlags.h after it was deleted llvm-svn: 239318	2015-06-08 17:25:57 +00:00
Peter Collingbourne	6679fc1a79	Revert r238473, "Thumb2: Modify codegen for memcpy intrinsic to prefer LDM/STM." as it caused miscompilations and assertion failures (PR23768, http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150601/280380.html). llvm-svn: 239169	2015-06-05 18:01:28 +00:00
Benjamin Kramer	113b2a943f	[ARM] Make helper function static. This one had a declaration but it differed from the definition so the declaration was actually dead. llvm-svn: 239157	2015-06-05 14:32:54 +00:00
John Brawn	985c04e8fa	[ARM] Add support for -sp- FPUs and FPU none to TargetParser These are added mainly for the benefit of clang, but this also means that they are now allowed in .fpu directives and we emit the correct .fpu directive when single-precision-only is used. Differential Revision: http://reviews.llvm.org/D10238 llvm-svn: 239151	2015-06-05 13:31:19 +00:00
John Brawn	d03d22922d	[ARM] Add knowledge of FPU subtarget features to TargetParser Add getFPUFeatures to TargetParser, which gets the list of subtarget features that are enabled/disabled for each FPU, and use it when handling the .fpu directive. No functional change in this commit, though clang will start behaving differently once it starts using this. Differential Revision: http://reviews.llvm.org/D10237 llvm-svn: 239150	2015-06-05 13:29:24 +00:00
Jim Grosbach	36e60e9127	MC: Clean up naming in MCObjectWriter. NFC. s/WriteObject/writeObject/ s/RecordRelocation/recordRelocation/ s/IsSymbolRefDifferenceFullyResolved/isSymbolRefDifferenceFullyResolved/ s/Write8/write8/ s/WriteLE16/writeLE16/ s/WriteLE32/writeLE32/ s/WriteLE64/writeLE64/ s/WriteBE16/writeBE16/ s/WriteBE32/writeBE32/ s/WriteBE64/writeBE64/ s/Write16/write16/ s/Write32/write32/ s/Write64/write64/ s/WriteZeroes/writeZeroes/ s/WriteBytes/writeBytes/ llvm-svn: 239108	2015-06-04 22:24:41 +00:00
Ahmed Bougacha	8207641251	[GlobalMerge] Take into account minsize on Global users' parents. Now that we can look at users, we can trivially do this: when we would have otherwise disabled GlobalMerge (currently -O<3), we can just run it for minsize functions, as it's usually a codesize win. Differential Revision: http://reviews.llvm.org/D10054 llvm-svn: 239087	2015-06-04 20:39:23 +00:00
Jim Grosbach	7c76b4cc6e	MC: Remove obsolete MachO UseAggressiveSymbolFolding. Fix the FIXME and remove this old as(1) compat option. It was useful for bringup of the integrated assembler to diff object files, but now it's just causing more relocations than strictly necessary to be generated. rdar://21201804 llvm-svn: 239084	2015-06-04 20:27:42 +00:00
Daniel Sanders	7813ae879e	Replace string GNU Triples with llvm::Triple in MCAsmInfo subclasses and create*AsmInfo(). NFC. Summary: This is the first of several patches to eliminate StringRef forms of GNU triples from the internals of LLVM. After this is complete, GNU triples will be replaced by a more authoratitive representation in the form of an LLVM TargetTuple. Reviewers: rengolin Reviewed By: rengolin Subscribers: ted, llvm-commits, rengolin, jholewinski Differential Revision: http://reviews.llvm.org/D10236 llvm-svn: 239036	2015-06-04 13:12:25 +00:00
Rafael Espindola	f8794ff29d	Remove MCELFSymbolFlags.h. It is now internal to MCSymbolELF. llvm-svn: 238996	2015-06-04 00:47:43 +00:00
Rafael Espindola	c73aed1cb3	Remove getOrCreateSymbolData. There is no MCSymbolData anymore. llvm-svn: 238952	2015-06-03 19:03:11 +00:00
Matthias Braun	125c9f5f7b	ARM: Thumb2 LDRD/STRD supports independent input/output regs The existing code would unnecessarily break LDRD/STRD apart with non-adjacent registers, on thumb2 this is not necessary. Ideally on thumb2 we shouldn't match for ldrd/strd pre-regalloc anymore as there is not reason to set register hints anymore, changing that is something for a future patch however. Differential Revision: http://reviews.llvm.org/D9694 Recommiting after the revert in r238821, the buildbot still failed with the patch removed so there seems to be another reason for the breakage. llvm-svn: 238935	2015-06-03 16:30:24 +00:00
Daniel Sanders	43a79bf694	[arm] Fix r238921. We must handle Constraint_i too. llvm-svn: 238925	2015-06-03 14:17:18 +00:00
Daniel Sanders	1f58ef71ea	[arm] Distinguish the /U[qytnms]/, 'Uv', 'Q', and 'm' inline assembly memory constraints. Summary: But still handle them the same way since I don't know how they differ on this target. Of these, /U[qytnms]/ do not have backend tests but are accepted by clang. No functional change intended. Reviewers: t.p.northover Reviewed By: t.p.northover Subscribers: t.p.northover, aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D8203 llvm-svn: 238921	2015-06-03 12:33:56 +00:00
Rafael Espindola	0ccf9b71f3	Pass a MCSymbolELF to a few ELF only functions. NFC. llvm-svn: 238868	2015-06-02 21:30:13 +00:00
Rafael Espindola	95fb9b93ed	Merge MCELF.h into MCSymbolELF.h. Now that we have a dedicated type for ELF symbol, these helper functions can become member function of MCSymbolELF. llvm-svn: 238864	2015-06-02 20:38:46 +00:00
Renato Golin	3a7bec86bd	Revert "ARM: Thumb2 LDRD/STRD supports independent input/output regs" This reverts commit r238795, as it broke the Thumb2 self-hosting buildbot. Since self-hosting issues with Clang are hard to investigate, I'm taking the liberty to revert now, so we can investigate it offline. llvm-svn: 238821	2015-06-02 11:47:30 +00:00
Matthias Braun	e20dc1cd3a	ARM: Thumb2 LDRD/STRD supports independent input/output regs The existing code would unnecessarily break LDRD/STRD apart with non-adjacent registers, on thumb2 this is not necessary. Ideally on thumb2 we shouldn't match for ldrd/strd pre-regalloc anymore as there is not reason to set register hints anymore, changing that is something for a future patch however. Differential Revision: http://reviews.llvm.org/D9694 llvm-svn: 238795	2015-06-01 23:27:08 +00:00
Matthias Braun	ec50fa6f8c	ARMLoadStoreOptimizer: Fix doxygen comments; NFC llvm-svn: 238784	2015-06-01 21:26:23 +00:00
Luke Cheeseman	85fd06d389	Re-commit of r238201 with fix for building with shared libraries. llvm-svn: 238739	2015-06-01 12:02:47 +00:00
Matt Arsenault	bd7d80a4a6	Add address space argument to isLegalAddressingMode This is important because of different addressing modes depending on the address space for GPU targets. This only adds the argument, and does not update any of the uses to provide the correct address space. llvm-svn: 238723	2015-06-01 05:31:59 +00:00
NAKAMURA Takumi	072a58a7fd	ARMConstantIslandPass.cpp: Prune an empty \brief. [-Wdocumentation] llvm-svn: 238697	2015-05-31 23:05:35 +00:00
Tim Northover	a603c4076c	ARM: recommit r237590: allow jump tables to be placed as constant islands. The original version didn't properly account for the base register being modified before the final jump, so caused miscompilations in Chromium and LLVM. I've fixed this and tested with an LLVM self-host (I don't have the means to build & test Chromium). The general idea remains the same: in pathological cases jump tables can be too far away from the instructions referencing them (like other constants) so they need to be movable. Should fix PR23627. llvm-svn: 238680	2015-05-31 19:22:07 +00:00
Renato Golin	5d78c9ce58	Comment change. NFC That comment misleads the current discussions in mentioned bug. Leave the discussions to the bug. Also, adding a future change FIXME. llvm-svn: 238653	2015-05-30 10:44:07 +00:00
Renato Golin	230d298320	[ARMTargetParser] Move IAS arch ext parser. NFC The plan was to move the whole table into the already existing ArchExtNames but some fields depend on a table-generated file, and we don't yet have this feature in the generic lib/Support side. Once the minimum target-specific table-generated files are available in a generic fashion to these libraries, we'll have to keep it in the ASM parser. llvm-svn: 238651	2015-05-30 10:30:02 +00:00
Jim Grosbach	13760bd152	MC: Clean up MCExpr naming. NFC. llvm-svn: 238634	2015-05-30 01:25:56 +00:00
Rafael Espindola	4d37b2a259	Remove getData. This completes the mechanical part of merging MCSymbol and MCSymbolData. llvm-svn: 238617	2015-05-29 21:45:01 +00:00
Rafael Espindola	beb6060a51	Remove the MCSymbolData typedef. The getData member function is next. llvm-svn: 238611	2015-05-29 20:41:47 +00:00
Rafael Espindola	b5d316bfc3	Rename getOrCreateSymbolData to registerSymbol and return void. Another step in merging MCSymbol and MCSymbolData. llvm-svn: 238607	2015-05-29 20:21:02 +00:00
Rafael Espindola	e3b2acf274	Pass MCSymbols to the helper functions in MCELF.h. llvm-svn: 238596	2015-05-29 18:47:23 +00:00
Rafael Espindola	ece40ca43d	Pass a MCSymbol to needsRelocateWithSymbol. llvm-svn: 238589	2015-05-29 18:26:09 +00:00
Matthias Braun	e41e146c16	CodeGen: Use mop_iterator instead of MIOperands/ConstMIOperands MIOperands/ConstMIOperands are classes iterating over the MachineOperand of a MachineInstr, however MachineInstr::mop_iterator does the same thing. I assume these two iterators exist to have a uniform interface to iterate over the operands of a machine instruction bundle and a single machine instruction. However in practice I find it more confusing to have 2 different iterator classes, so this patch transforms (nearly all) the code to use mop_iterators. The only exception being MIOperands::anlayzePhysReg() and MIOperands::analyzeVirtReg() still needing an equivalent, I leave that as an exercise for the next patch. Differential Revision: http://reviews.llvm.org/D9932 This version is slightly modified from the proposed revision in that it introduces MachineInstr::getOperandNo to avoid the extra counting variable in the few loops that previously used MIOperands::getOperandNo. llvm-svn: 238539	2015-05-29 02:56:46 +00:00
Rafael Espindola	3a5d3cce80	Remove a trivial forwarding function. NFC. llvm-svn: 238506	2015-05-28 21:36:02 +00:00
Peter Collingbourne	450fbee6b2	Thumb2: Modify codegen for memcpy intrinsic to prefer LDM/STM. We were previously codegen'ing these as regular load/store operations and hoping that the register allocator would allocate registers in ascending order so that we could apply an LDM/STM combine after register allocation. According to the commit that first introduced this code (r37179), we planned to teach the register allocator to allocate the registers in ascending order. This never got implemented, and up to now we've been stuck with very poor codegen. A much simpler approach for achiveing better codegen is to create LDM/STM instructions with identical sets of virtual registers, let the register allocator pick arbitrary registers and order register lists when printing an MCInst. This approach also avoids the need to repeatedly calculate offsets which ultimately ought to be eliminated pre-RA in order to decrease register pressure. This is implemented by lowering the memcpy intrinsic to a series of SD-only MCOPY pseudo-instructions which performs a memory copy using a given number of registers. During SD->MI lowering, we lower MCOPY to LDM/STM. This is a little unusual, but it avoids the need to encode register lists in the SD, and we can take advantage of SD use lists to decide whether to use the _UPD variant of the instructions. Fixes PR9199. Differential Revision: http://reviews.llvm.org/D9508 llvm-svn: 238473	2015-05-28 20:02:45 +00:00
Renato Golin	f7c0d5f247	ARMTargetParser: Normalising build attributes Now that most of the methods in Clang and LLVM that were parsing arch/cpu/fpu strings are using ARMTargetParser, it's time to make it a bit more conforming with what the ABI says. This commit adds some clarification on what build attributes are accepted and which are "non-standard". It also makes clear that the "defaultCPU" and "defaultArch" methods were really just build attribute getters. It also diverges from GCC's behaviour to say that armv2/armv3 are really an ARMv4 in the build attributes, when the ABI has a clear state for that: Pre-v4. llvm-svn: 238344	2015-05-27 18:15:37 +00:00
Rafael Espindola	f4a1365387	Use operator<< instead of print in a few more places. llvm-svn: 238315	2015-05-27 13:05:42 +00:00
Matthias Braun	aa9fa35555	ARMLoadStoreOptimizer: Code cleanup; NFC llvm-svn: 238289	2015-05-27 05:12:40 +00:00
Diego Novillo	bfecc06656	Revert "Re-commit changes in r237579 with fix for bug breaking windows builds." This reverts commit r238201 to fix linking problems in x86 Linux http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150525/278413.html llvm-svn: 238223	2015-05-26 17:45:38 +00:00
Luke Cheeseman	a5d053d6f4	Re-commit changes in r237579 with fix for bug breaking windows builds. llvm-svn: 238201	2015-05-26 13:40:31 +00:00
Luke Cheeseman	0af4f635f1	Test Commit llvm-svn: 238199	2015-05-26 13:10:35 +00:00
Michael Kuperstein	db0712f986	Use std::bitset for SubtargetFeatures. Previously, subtarget features were a bitfield with the underlying type being uint64_t. Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, switching the features to use a bitset. No functional change. The first several times this was committed (e.g. r229831, r233055), it caused several buildbot failures. Apparently the reason for most failures was both clang and gcc's inability to deal with large numbers (> 10K) of bitset constructor calls in tablegen-generated initializers of instruction info tables. This should now be fixed. llvm-svn: 238192	2015-05-26 10:47:10 +00:00
Rafael Espindola	61e724a8c5	Stop using MCSectionData in MCMachObjectWriter.h. llvm-svn: 238165	2015-05-26 01:15:30 +00:00
Rafael Espindola	079027ea90	Stop using MCSectionData in MCExpr.h. llvm-svn: 238163	2015-05-26 00:52:18 +00:00
Rafael Espindola	7549f87672	Return a MCSection from MCFragment::getParent(). Another step in merging MCSectionData and MCSection. llvm-svn: 238162	2015-05-26 00:36:57 +00:00
Rafael Espindola	6e6820a7e6	Stop forwarding getOrdinal and setOrdinal. llvm-svn: 238139	2015-05-25 14:12:48 +00:00
Benjamin Kramer	be48c40475	[AArch64] Clean up the ELF streamer a bit. llvm-svn: 238102	2015-05-23 16:39:10 +00:00
Akira Hatanaka	ddf76aa36f	Stop resetting NoFramePointerElim in TargetMachine::resetTargetOptions. This is part of the work to remove TargetMachine::resetTargetOptions. In this patch, instead of updating global variable NoFramePointerElim in resetTargetOptions, its use in DisableFramePointerElim is replaced with a call to TargetFrameLowering::noFramePointerElim. This function determines on a per-function basis if frame pointer elimination should be disabled. There is no change in functionality except that cl:opt option "disable-fp-elim" can now override function attribute "no-frame-pointer-elim". llvm-svn: 238080	2015-05-23 01:14:08 +00:00
Chad Rosier	67336305f5	Use new MachineInstr mayLoadOrStore() API. NFC. llvm-svn: 238044	2015-05-22 20:07:34 +00:00
John Brawn	c815a969c7	[ARM] Fix typo in subtarget feature list for 7em triple The list of subtarget features for the 7em triple contains 't2xtpk', which actually disables that subtarget feature. Correct that to '+t2xtpk' and test that the instructions enabled by that feature do actually work. Differential Revision: http://reviews.llvm.org/D9936 llvm-svn: 238022	2015-05-22 14:16:22 +00:00
Peter Collingbourne	7e814d100b	Revert r237590, "ARM: allow jump tables to be placed as constant islands." Caused a miscompile of the Android port of Chromium, details forthcoming. llvm-svn: 237972	2015-05-21 23:20:55 +00:00
Rafael Espindola	0709a7bd1a	Move alignment from MCSectionData to MCSection. This starts merging MCSection and MCSectionData. There are a few issues with the current split between MCSection and MCSectionData. * It optimizes the the not as important case. We want the production of .o files to be really fast, but the split puts the information used for .o emission in a separate data structure. * The ELF/COFF/MachO hierarchy is not represented in MCSectionData, leading to some ad-hoc ways to represent the various flags. * It makes it harder to remember where each item is. The attached patch starts merging the two by moving the alignment from MCSectionData to MCSection. Most of the patch is actually just dropping 'const', since MCSectionData is mutable, but MCSection was not. llvm-svn: 237936	2015-05-21 19:20:38 +00:00
Davide Italiano	141b2891cb	[Target/ARM] Only enable OptimizeBarrierPass at -O1 and above. Ideally this is going to be and LLVM IR pass (shared, among others with AArch64), but for the time being just enable it if consumers ask us for optimization and not unconditionally. Discussed with Tim Northover on IRC. llvm-svn: 237837	2015-05-20 21:40:38 +00:00
Matthias Braun	6091208331	ARM: Fix comment and make it slightly more readable llvm-svn: 237820	2015-05-20 18:40:06 +00:00
Duncan P. N. Exon Smith	08b8726de3	MC: Use MCSymbol in MachObjectWriter, NFC Replace uses of `MCSymbolData` with `MCSymbol` where both are needed, so we can remove the backpointer. llvm-svn: 237799	2015-05-20 15:16:14 +00:00
Duncan P. N. Exon Smith	99d8a8e8ac	MC: Take MCSymbol in MachObjectWriter::getSymbolAddress(), NFC Pass through an `MCSymbol` instead of an `MCSymbolData` so we can get rid of the back pointer. llvm-svn: 237750	2015-05-20 00:02:39 +00:00
Duncan P. N. Exon Smith	2a40483418	MC: Use MCSymbol in MCAsmLayout::getSymbolOffset(), NFC Continue to canonicalize on MCSymbol instead of MCSymbolData when both are needed. llvm-svn: 237749	2015-05-19 23:53:20 +00:00
Matthias Braun	07066cca20	MachineInstr: Remove unused parameter. llvm-svn: 237726	2015-05-19 21:22:20 +00:00
David Blaikie	ff6409d096	Simplify IRBuilder::CreateCall* by using ArrayRef+initializer_list/braced init only llvm-svn: 237624	2015-05-18 22:13:54 +00:00
Matthias Braun	fa3872e7ad	MachineInstr: Change return value of getOpcode() to unsigned. This was previously returning int. However there are no negative opcode numbers and more importantly this was needlessly different from MCInstrDesc::getOpcode() (which even is the value returned here) and SDValue::getOpcode()/SDNode::getOpcode(). llvm-svn: 237611	2015-05-18 20:27:55 +00:00
Jim Grosbach	6f482000e9	MC: Clean up method names in MCContext. The naming was a mish-mash of old and new style. Update to be consistent with the new. NFC. llvm-svn: 237594	2015-05-18 18:43:14 +00:00
Tim Northover	12c41af07c	ARM: allow jump tables to be placed as constant islands. Previously, they were forced to immediately follow the actual branch instruction. This was usually OK (the LEAs actually accessing them got emitted nearby, and weren't usually separated much afterwards). Unfortunately, a sufficiently nasty phi elimination dumps many instructions right before the basic block terminator, and this can increase the range too much. This patch frees them up to be placed as usual by the constant islands pass, and consequently has to slightly modify the form of TBB/TBH tables to refer to a PC-relative label at the final jump. The other jump table formats were already position-independent. rdar://20813304 llvm-svn: 237590	2015-05-18 17:10:40 +00:00
Oliver Stannard	6cb23465e0	Revert r237579, as it broke windows buildbots llvm-svn: 237583	2015-05-18 16:39:16 +00:00
Oliver Stannard	0c553afe6a	[LLVM - ARM/AArch64] Add ACLE special register intrinsics This patch implements LLVM support for the ACLE special register intrinsics in section 10.1, __arm_{w,r}sr{,p,64}. This patch is intended to lower the read/write_register instrinsics, used to implement the special register intrinsics in the clang patch for special register intrinsics (see http://reviews.llvm.org/D9697), to ARM specific instructions MRC,MCR,MSR etc. to allow reading an writing of coprocessor registers in AArch32 and AArch64. This is done by inspecting the register string passed to the intrinsic and then lowering to the appropriate instruction. Patch by Luke Cheeseman. Differential Revision: http://reviews.llvm.org/D9699 llvm-svn: 237579	2015-05-18 16:23:33 +00:00
Duncan P. N. Exon Smith	6e23e5a680	MC: Use MCSymbol in RelAndSymbol, NFC Switch from `MCSymbolData` to `MCSymbol`. llvm-svn: 237502	2015-05-16 01:14:19 +00:00
Jim Grosbach	4c98cf77d9	MC: MCCodeGenInfo naming update. NFC. s/InitMCCodeGenInfo/initMCCodeGenInfo/ llvm-svn: 237471	2015-05-15 19:13:31 +00:00
Jim Grosbach	91df21f740	MC: Update MCCodeEmitter naming. NFC. s/EncodeInstruction/encodeInstruction/ llvm-svn: 237469	2015-05-15 19:13:16 +00:00
Jim Grosbach	63661f8d73	MC: Update MCFixup naming. NFC. s/MCFixup::Create/MCFixup::create/ llvm-svn: 237468	2015-05-15 19:13:05 +00:00
Artyom Skrobov	a70dfe18d3	Re-apply r237247 - [AArch64] Codegen VMAX/VMIN for safe math cases No longer breaks SPEC2000/2006 llvm-svn: 237361	2015-05-14 12:59:46 +00:00
Tim Northover	b4c61f889f	ARM: remove possible vestiges of the legacy JIT??? There's no need to manually pass modifier strings around to tell an operand how to print now, that information is encoded in the operand itself since the MC layer came along. llvm-svn: 237295	2015-05-13 20:28:41 +00:00
Tim Northover	4998a47f73	ARM: remove custom jump table UID We were creating and propagating two separate indices for each jump table (from back in the mists of time). However, the generic index used by other backends is sufficient to emit a unique symbol so this was unneeded. llvm-svn: 237294	2015-05-13 20:28:38 +00:00
Tim Northover	688f7bb21a	ARM: refactor optimizeThumb2JumpTables. The previous logic mixed 2 separate questions: + Can we form a TBB/TBH instruction? + Can we remove the jump-table calculation before it? It then performed a bunch of random tests on the instructions earlier in the basic block, which were probably sufficient to answer 2 but only because of the very limited ways in which a t2BR_JT can actually be created. For example there's no reason to expect the LeaInst to define the same base register as the following indexing calulation. In practice this means we might have missed opportunities to form TBB/TBH, in theory you could end up misidentifying a sequence and removing the wrong LEA: %R1 = t2LEApcrelJT ... %R2 = t2LEApcrelJT ... <... using and killing %R2 ...> %R2 = t2ADDr %R1, $Ridx Before we would have looked for an LEA defining %R2 and found the wrong one. We just got lucky that jump table setup was (almost?) always confined to a single basic block and there was only one jump table per block. llvm-svn: 237293	2015-05-13 20:28:32 +00:00
Jim Grosbach	e9119e41ef	MC: Modernize MCOperand API naming. NFC. MCOperand::Create() methods renamed to MCOperand::create(). llvm-svn: 237275	2015-05-13 18:37:00 +00:00
Silviu Baranga	780a3b3be7	Revert r237247 - [AArch64] Codegen VMAX/VMIN.. as it is causing failures in SPEC2000/2006 llvm-svn: 237256	2015-05-13 14:03:18 +00:00
Artyom Skrobov	b526681e08	[AArch64] Codegen VMAX/VMIN for safe math cases llvm-svn: 237247	2015-05-13 12:01:09 +00:00
Michael Kuperstein	c3434b390d	Reverting r237234, "Use std::bitset for SubtargetFeatures" The buildbots are still not satisfied. MIPS and ARM are failing (even though at least MIPS was expected to pass). llvm-svn: 237245	2015-05-13 10:28:46 +00:00
Michael Kuperstein	aba4a34ef2	Use std::bitset for SubtargetFeatures Previously, subtarget features were a bitfield with the underlying type being uint64_t. Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, switching the features to use a bitset. No functional change. The first two times this was committed (r229831, r233055), it caused several buildbot failures. At least some of the ARM and MIPS ones were due to gcc/binutils issues, and should now be fixed. llvm-svn: 237234	2015-05-13 08:27:08 +00:00
Matthias Braun	b5424d043b	Revert "ARM: Remove Itineraries for swift CPU" Reverting until I figure out the new lit failures. This reverts commit r237179. llvm-svn: 237189	2015-05-12 21:28:39 +00:00
Matthias Braun	befa1380d2	ARM: Remove Itineraries for swift CPU They do more harm than good when used in the MachineScheduler as they tend to take preference to register pressure minimsation which is more important for swift. Differential Revision: http://reviews.llvm.org/D9718 llvm-svn: 237179	2015-05-12 21:07:54 +00:00
Douglas Katzman	03dfca04df	Strip trailing whitespace. NFC llvm-svn: 237165	2015-05-12 19:42:31 +00:00
John Brawn	70605f7d22	[ARM] Use AEABI aligned function variants AEABI defines aligned variants of memcpy etc. that can be faster than the default version due to not having to do alignment checks. When emitting target code for these functions make use of these aligned variants if possible. Also convert memset to memclr if possible. Differential Revision: http://reviews.llvm.org/D8060 llvm-svn: 237127	2015-05-12 13:13:38 +00:00
Renato Golin	35de35d03f	Change TargetParser enum names to avoid macro conflicts (llvm) sys/time.h on Solaris (and possibly other systems) defines "SEC" as "1" using a cpp macro. The result is that this fails to compile. Fixes https://llvm.org/PR23482 llvm-svn: 237112	2015-05-12 10:33:58 +00:00
Eric Christopher	824f42f209	Migrate existing backends that care about software floating point to use the information in the module rather than TargetOptions. We've had and clang has used the use-soft-float attribute for some time now so have the backends set a subtarget feature based on a particular function now that subtargets are created based on functions and function attributes. For the one middle end soft float check go ahead and create an overloadable TargetLowering::useSoftFloat function that just checks the TargetSubtargetInfo in all cases. Also remove the command line option that hard codes whether or not soft-float is set by using the attribute for all of the target specific test cases - for the generic just go ahead and add the attribute in the one case that showed up. llvm-svn: 237079	2015-05-12 01:26:05 +00:00
Davide Italiano	2c29cd697e	[Target/ARM] Remove unused 'private' from class. Differential Revision: http://reviews.llvm.org/D9611 Reviewed by: rengolin llvm-svn: 236918	2015-05-08 23:58:28 +00:00
Arnold Schwaighofer	f54b73d681	ScheduleDAGInstrs: In functions with tail calls PseudoSourceValues are not non-aliasing distinct objects The code that builds the dependence graph assumes that two PseudoSourceValues don't alias. In a tail calling function two FixedStackObjects might refer to the same location. Worse 'immutable' fixed stack objects like function arguments are not immutable and will be clobbered. Change this so that a load from a FixedStackObject is not invariant in a tail calling function and don't return a PseudoSourceValue for an instruction in tail calling functions when building the dependence graph so that we handle function arguments conservatively. Fix for PR23459. rdar://20740035 llvm-svn: 236916	2015-05-08 23:52:00 +00:00
Renato Golin	f5f373fcf1	TargetParser: FPU/ARCH/EXT parsing refactory - NFC This new class in a global context contain arch-specific knowledge in order to provide LLVM libraries, tools and projects with the ability to understand the architectures. For now, only FPU, ARCH and ARCH extensions on ARM are supported. Current behaviour it to parse from free-text to enum values and back, so that all users can share the same parser and codes. This simplifies a lot both the ASM/Obj streamers in the back-end (where this came from), and the front-end parsers for command line arguments (where this is going to be used next). The previous implementation, using .def/.h includes is deprecated due to its inflexibility to be built without the backend support and for being too cumbersome. As more architectures join this scheme, and as more features of such architectures are added (such as hardware features, type sizes, etc) into a full blown TargetDescription class, having a set of classes is the most sane implementation. The ultimate goal of this refactor both LLVM's and Clang's target description classes into one unique interface, so that we can de-duplicate and standardise the descriptions, as well as make it available for other front-ends, tools, etc. The FPU parsing for command line options in Clang has been converted to use this new library and a number of aliases were added for compatibility: * A bogus neon-vfpv3 alias (neon defaults to vfp3) * armv5/v6 * {fp4/fp5}-{sp/dp}-d16 Next steps: * Port Clang's ARCH/EXT parsing to use this library. * Create a TableGen back-end to generate this information. * Run this TableGen process regardless of which back-ends are built. * Expose more information and rename it to TargetDescription. * Continue re-factoring Clang to use as much of it as possible. llvm-svn: 236900	2015-05-08 21:04:27 +00:00
Matthias Braun	f45afee3dc	Fix typo. llvm-svn: 236785	2015-05-07 22:16:10 +00:00
Matthias Braun	d04893fa36	Change getTargetNodeName() to produce compiler warnings for missing cases, fix them llvm-svn: 236775	2015-05-07 21:33:59 +00:00
Wei Mi	062c74484d	[X86] Disable loop unrolling in loop vectorization pass when VF is 1. The patch disabled unrolling in loop vectorization pass when VF==1 on x86 architecture, by setting MaxInterleaveFactor to 1. Unrolling in loop vectorization pass may introduce the cost of overflow check, memory boundary check and extra prologue/epilogue code when regular unroller will unroll the loop another time. Disable it when VF==1 remove the unnecessary cost on x86. The same can be done for other platforms after verifying interleaving/memory bound checking to be not perf critical on those platforms. Differential Revision: http://reviews.llvm.org/D9515 llvm-svn: 236613	2015-05-06 17:12:25 +00:00
Pete Cooper	d927c6eaf8	[ARM] Fast-Isel was incorrectly selecting <2 x double> adds. With neon enabled, we reach SelectBinaryFPOp and are able to get registers for a <2 x double> add. However, we shouldn't actually attempt arithmetic on it as ARMIselLowering says "v2f64 is legal so that QR subregs can be extracted as f64 elements, but neither Neon nor VFP support any arithmetic operations on it." This commit disables SelectBinaryFPOp for any vector types. There's already a FIXME to try handle neon. Doing so would require fixing this conditional which isn't safe for vectors 'VT == MVT::f64 \|\| VT == MVT::i64' llvm-svn: 236609	2015-05-06 16:39:17 +00:00
Artyom Skrobov	3f8eae92a4	[ARM] generate VMAXNM/VMINNM for a compare followed by a select, in safe math mode too llvm-svn: 236590	2015-05-06 11:44:10 +00:00
Ahmed Bougacha	e8d0c4ccea	[ARM][FastISel] Use TST #1 instead of CMP #0 for select. Since r234249, i1 are sext instead of zext; because of that, doing "CMP rN, #0; IT EQ/NE" isn't correct anymore. "TST #1" is the conservatively correct alternative - the tradeoff being that it doesn't have a 16-bit encoding -, so use that instead. llvm-svn: 236569	2015-05-06 04:14:02 +00:00
Peter Collingbourne	85a0e23bc8	Thumb2SizeReduction: Check the correct set of registers for LDMIA. The register set for LDMIA begins at offset 3, not 4. We were previously missing the short encoding of this instruction in the case where the base register was the first register in the register set. Also clean up some dead code: - The isARMLowRegister check is redundant with what VerifyLowRegs does; replace with an assert. - Remove handling of LDMDB instruction, which has no short encoding (and does not appear in ReduceTable). Differential Revision: http://reviews.llvm.org/D9485 llvm-svn: 236535	2015-05-05 20:07:10 +00:00
Quentin Colombet	61b305edfd	[ShrinkWrap] Add (a simplified version) of shrink-wrapping. This patch introduces a new pass that computes the safe point to insert the prologue and epilogue of the function. The interest is to find safe points that are cheaper than the entry and exits blocks. As an example and to avoid regressions to be introduce, this patch also implements the required bits to enable the shrink-wrapping pass for AArch64. Context Currently we insert the prologue and epilogue of the method/function in the entry and exits blocks. Although this is correct, we can do a better job when those are not immediately required and insert them at less frequently executed places. The job of the shrink-wrapping pass is to identify such places. Motivating example Let us consider the following function that perform a call only in one branch of a if: define i32 @f(i32 %a, i32 %b) { %tmp = alloca i32, align 4 %tmp2 = icmp slt i32 %a, %b br i1 %tmp2, label %true, label %false true: store i32 %a, i32* %tmp, align 4 %tmp4 = call i32 @doSomething(i32 0, i32* %tmp) br label %false false: %tmp.0 = phi i32 [ %tmp4, %true ], [ %a, %0 ] ret i32 %tmp.0 } On AArch64 this code generates (removing the cfi directives to ease readabilities): _f: ; @f ; BB#0: stp x29, x30, [sp, #-16]! mov x29, sp sub sp, sp, #16 ; =16 cmp w0, w1 b.ge LBB0_2 ; BB#1: ; %true stur w0, [x29, #-4] sub x1, x29, #4 ; =4 mov w0, wzr bl _doSomething LBB0_2: ; %false mov sp, x29 ldp x29, x30, [sp], #16 ret With shrink-wrapping we could generate: _f: ; @f ; BB#0: cmp w0, w1 b.ge LBB0_2 ; BB#1: ; %true stp x29, x30, [sp, #-16]! mov x29, sp sub sp, sp, #16 ; =16 stur w0, [x29, #-4] sub x1, x29, #4 ; =4 mov w0, wzr bl _doSomething add sp, x29, #16 ; =16 ldp x29, x30, [sp], #16 LBB0_2: ; %false ret Therefore, we would pay the overhead of setting up/destroying the frame only if we actually do the call. Proposed Solution This patch introduces a new machine pass that perform the shrink-wrapping analysis (See the comments at the beginning of ShrinkWrap.cpp for more details). It then stores the safe save and restore point into the MachineFrameInfo attached to the MachineFunction. This information is then used by the PrologEpilogInserter (PEI) to place the related code at the right place. This pass runs right before the PEI. Unlike the original paper of Chow from PLDI’88, this implementation of shrink-wrapping does not use expensive data-flow analysis and does not need hack to properly avoid frequently executed point. Instead, it relies on dominance and loop properties. The pass is off by default and each target can opt-in by setting the EnableShrinkWrap boolean to true in their derived class of TargetPassConfig. This setting can also be overwritten on the command line by using -enable-shrink-wrap. Before you try out the pass for your target, make sure you properly fix your emitProlog/emitEpilog/adjustForXXX method to cope with basic blocks that are not necessarily the entry block. Design Decisions 1. ShrinkWrap is its own pass right now. It could frankly be merged into PEI but for debugging and clarity I thought it was best to have its own file. 2. Right now, we only support one save point and one restore point. At some point we can expand this to several save point and restore point, the impacted component would then be: - The pass itself: New algorithm needed. - MachineFrameInfo: Hold a list or set of Save/Restore point instead of one pointer. - PEI: Should loop over the save point and restore point. Anyhow, at least for this first iteration, I do not believe this is interesting to support the complex cases. We should revisit that when we motivating examples. Differential Revision: http://reviews.llvm.org/D9210 <rdar://problem/3201744> llvm-svn: 236507	2015-05-05 17:38:16 +00:00
Pete Cooper	4dddbcfbb1	[ARM] IT block insertion needs to update kill flags When forming an IT block from the first MOV here: %R2<def> = t2MOVr %R0, pred:1, pred:%CPSR, opt:%noreg %R3<def> = tMOVr %R0<kill>, pred:14, pred:%noreg the move in to R3 is moved out of the IT block so that later instructions on the same predicate can be inside this block, and we can share the IT instruction. However, when moving the R3 copy out of the IT block, we need to clear its kill flags for anything in use at this point in time, ie, R0 here. This appeases the machine verifier which thought that R0 wasn't defined when used. I have a test case, but its extremely register allocator specific. It would be too fragile to commit a test which depends on the register allocator here. llvm-svn: 236468	2015-05-04 22:44:47 +00:00
Pete Cooper	f68d5038e6	[ARM] Transfer the internal flag in thumb2 size reduction. Converting from t2LDRs to tLDRr caused the shift argument to drop the internal flag. This would then throw machine verifier errors. Unfortunately i'm having trouble reducing a test case. I'm going to keep trying, but so far its a scary combination of machine sinking, an 'and i1', loads feeding loads, and a bunch of code which shouldn't change IT block formation, but does. Its not useful to commit a test in that state as we have no way of knowing if it even hits this code reliably in future. rdar://problem/20752113 llvm-svn: 236333	2015-05-01 18:57:32 +00:00
Peter Collingbourne	d27d3a151f	ARM: Align functions containing Thumb-2 jump tables to 4 bytes. Functions with jump tables need an alignment of 4 because they use the ADR instruction, which aligns the PC to 4 bytes before adding an offset. Differential Revision: http://reviews.llvm.org/D9424 llvm-svn: 236327	2015-05-01 18:05:59 +00:00
Pete Cooper	2127b00cd5	[ARM] optimizeSelect should clear kill flags. If we move an instruction from one block down to a MOVC and predicate it, then the original instruction could be moved in to a loop. In this case, its invalid for any kill flags to remain on there. Fails with -verfy-machineinstrs. rdar://problem/20752113 llvm-svn: 236290	2015-04-30 23:57:47 +00:00
Pete Cooper	5111881cfc	Don't always apply kill flag in thumb2 ABS pseudo expansion. The expansion for t2ABS was always setting the kill flag on the rsb instruction. It should instead only be set on rsb if it was set on the original ABS instruction. rdar://problem/20752113 llvm-svn: 236272	2015-04-30 22:15:59 +00:00
Quentin Colombet	0a905042cd	[ARM] Do not generate invalid encoding for stack adjust, even if this is just temporary. Because of that: 1. The machine verifier was complaining on such code. 2. The generate code worked just because the thumb reduction size pass fixed the opcode. rdar://problem/20749824 llvm-svn: 236247	2015-04-30 18:52:49 +00:00
Tim Northover	5211715360	ARM: mark branch-like instructions with correct flags. There's probably no way to test BXJ, but if the compiler ever did emit it during CodeGen it would have to be a block terminator so "isBranch" is appropriate. BLX is more tricky. Clearly a call, but it affects surprisingly little. rdar://18719544 llvm-svn: 236140	2015-04-29 19:16:38 +00:00
Tim Northover	e18d662201	ARM: fix peephole optimisation of TST We were trying to look through COPY instructions, but only to the next instruction in a BB and incorrectly anyway. The cases where that would actually be a good idea are rare enough (and not even tested!) that it's not worth trying to get right. rdar://20721342 llvm-svn: 236050	2015-04-28 22:03:55 +00:00
Sergey Dmitrouk	842a51bad8	Reapply r235977 "[DebugInfo] Add debug locations to constant SD nodes" [DebugInfo] Add debug locations to constant SD nodes This adds debug location to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269). Can't guarantee that all locations are correct, but in a lot of cases choice is obvious, so most of them should be. At least all tests pass. Tests for these changes do not cover everything, instead just check it for SDNodes, ARM and AArch64 where it's easy to get incorrect locations on constants. This is not complete fix as FastISel contains workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). Although this is a bit different issue, not directly related to these changes. Differential Revision: http://reviews.llvm.org/D9084 llvm-svn: 235989	2015-04-28 14:05:47 +00:00
Daniel Jasper	48e93f7181	Revert "[DebugInfo] Add debug locations to constant SD nodes" This breaks a test: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/23870 llvm-svn: 235987	2015-04-28 13:38:35 +00:00
Sergey Dmitrouk	adb4c69d5c	[DebugInfo] Add debug locations to constant SD nodes This adds debug location to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269). Can't guarantee that all locations are correct, but in a lot of cases choice is obvious, so most of them should be. At least all tests pass. Tests for these changes do not cover everything, instead just check it for SDNodes, ARM and AArch64 where it's easy to get incorrect locations on constants. This is not complete fix as FastISel contains workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). Although this is a bit different issue, not directly related to these changes. Differential Revision: http://reviews.llvm.org/D9084 llvm-svn: 235977	2015-04-28 11:56:37 +00:00
Matthias Braun	eec4efcca5	Cleanup, remove unused return value llvm-svn: 235952	2015-04-28 00:37:05 +00:00
Benjamin Kramer	a44b37e676	[ARM] Simplify code. NFC. llvm-svn: 235803	2015-04-25 17:25:13 +00:00
Lang Hames	9ff69c8f4d	[AsmPrinter] Make AsmPrinter's OutStreamer member a unique_ptr. AsmPrinter owns the OutStreamer, so an owning pointer makes sense here. Using a reference for this is crufty. llvm-svn: 235752	2015-04-24 19:11:51 +00:00
Peter Collingbourne	167668f8c8	Thumb2: When applying branch optimizations, visit branches in reverse order. The order in which branches appear in ImmBranches is approximately their order within the function body. By visiting later branches first, we reduce the distance between earlier forward branches and their targets, making it more likely that the cbn?z optimization, which can only apply to forward branches, will succeed for those earlier branches. Differential Revision: http://reviews.llvm.org/D9185 llvm-svn: 235640	2015-04-23 20:31:35 +00:00
Peter Collingbourne	cfee5b04bc	ARM: When re-creating a branch via InsertBranch, preserve CPSR flags. In particular, this preserves the kill flag, which allows the Thumb2 cbn?z optimization to be applied in cases where a branch has been re-created after the live variables analysis pass, e.g. by the machine block placement pass. This appears to be low risk; a number of other targets seem to already be doing something similar, e.g. AArch64, PowerPC. Differential Revision: http://reviews.llvm.org/D9184 llvm-svn: 235639	2015-04-23 20:31:32 +00:00
Peter Collingbourne	6529523151	Thumb2: When optimizing for size, do not if-convert branches involving comparisons with zero. This allows the constant island pass to lower these branches to cbn?z instructions, resulting in a shorter instruction sequence. Differential Revision: http://reviews.llvm.org/D9183 llvm-svn: 235638	2015-04-23 20:31:30 +00:00
Peter Collingbourne	78f1ecc59c	ARM: When spilling extra registers for alignment, prefer low registers on all Thumb targets. This makes it more likely that we can use the 16-bit push and pop instructions on Thumb-2, saving around 4 bytes per function. Differential Revision: http://reviews.llvm.org/D9165 llvm-svn: 235637	2015-04-23 20:31:26 +00:00
Peter Collingbourne	1213918bf4	ARM: Only enforce 4-byte alignment on Thumb-2 functions with constant pools. This appears to have been introduced back in r76698 as part of an unrelated change. I can find no official ARM documentation stating that Thumb-2 functions require 4-byte alignment; in fact, ARM documentation appears to contradict this (see, e.g., ARM Architecture Reference Manual Thumb-2 Supplement, section 2.6.1: "Thumb-2 enforces 16-bit alignment on all instructions."). Also remove code that sets alignment for ARM functions, which is redundant with code in the MachineFunction constructor, and remove the hidden -arm-align-constant-islands flag, which has been enabled by default since r146739 (Dec 2011) and has probably received sufficient testing by now. Differential Revision: http://reviews.llvm.org/D9138 llvm-svn: 235636	2015-04-23 20:31:22 +00:00
Vladimir Sukharev	0e0f8d2c1f	[ARM] Add v8.1a "Privileged Access Never" extension Reviewers: jmolloy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8504 llvm-svn: 235087	2015-04-16 11:34:25 +00:00
Charlie Turner	6f13d0ca84	Fix BXJ is undefined in AArch32. BXJ was incorrectly said to be unsupported in ARMv8-A. It is not supported in the A64 instruction set, but it is supported in the T32 and A32 instruction sets, because it's listed as an instruction in the ARM ARM section F7.1.28. Using SP as an operand to BXJ changed from UNPREDICTABLE to PREDICTABLE in v8-A. This patch reflects that update as well. This was found by MCHammer. llvm-svn: 235024	2015-04-15 17:28:23 +00:00
Rafael Espindola	5560a4cfbd	Use raw_pwrite_stream in the object writer/streamer. The ELF object writer will take advantage of that in the next commit. llvm-svn: 234950	2015-04-14 22:14:34 +00:00
Alexander Kornienko	fb37cfa346	Refactor: Simplify boolean expressions in ARM target Simplify boolean expressions using `true` and `false` with `clang-tidy` http://reviews.llvm.org/D8524 Patch by Richard Thomson! llvm-svn: 234901	2015-04-14 15:32:58 +00:00
Alexander Kornienko	f817c1cb9a	Use 'override/final' instead of 'virtual' for overridden methods The patch is generated using clang-tidy misc-use-override check. This command was used: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py \ -checks='-*,misc-use-override' -header-filter='llvm\|clang' \ -j=32 -fix -format http://reviews.llvm.org/D8925 llvm-svn: 234679	2015-04-11 02:11:45 +00:00
Ahmed Bougacha	b96444efd1	[CodeGen] Split -enable-global-merge into ARM and AArch64 options. Currently, there's a single flag, checked by the pass itself. It can't force-enable the pass (and is on by default), because it might not even have been created, as that's the targets decision. Instead, have separate explicit flags, so that the decision is consistently made in the target. Keep the flag as a last-resort "force-disable GlobalMerge" for now, for backwards compatibility. llvm-svn: 234666	2015-04-11 00:06:36 +00:00
Rafael Espindola	49286e9f4a	clang-format bits of code to make a followup patch easy to read. llvm-svn: 234519	2015-04-09 18:32:58 +00:00
Rafael Espindola	df7305a438	Don't repeat name in comment. NFC. llvm-svn: 234506	2015-04-09 17:10:57 +00:00
Javed Absar	5c5e3c5e36	[ARM] support for Cortex-R4/R4F Currently, llvm (backend) doesn't know cortex-r4, even though it is the default target for armv7r. Using "--target=armv7r-arm-none-eabi" provokes 'cortex-r4' is not a recognized processor for this target' by llvm. This patch adds support for cortex-r4 and, very closely related, r4f. llvm-svn: 234486	2015-04-09 14:07:28 +00:00
Scott Douglass	7ad7792088	[ARM] make vminnm/vmaxnm work with ?le, ?ge and no-nans-fp-math Because -menable-no-nans causes fcmp conditions to be rewritten without 'o' or 'u' the recognition code in needs to cope. Also extended it to handle 'le' and 'ge. Differential Revision: http://reviews.llvm.org/D8725 llvm-svn: 234421	2015-04-08 17:18:28 +00:00
Sergey Dmitrouk	3cc62b3715	[ARM][Debug Info] Restore emitting of .cfi_def_cfa_offset for functions without stack frame Summary: Looks like new code from [[ http://reviews.llvm.org/rL222057 \| rL222057 ]] doesn't account for early `return` in `ARMFrameLowering::emitPrologue`, which leads to loosing `.cfi_def_cfa_offset` directive for functions without stack frame. Reviewers: echristo, rengolin, asl, t.p.northover Reviewed By: t.p.northover Subscribers: llvm-commits, rengolin, aemerson Differential Revision: http://reviews.llvm.org/D8606 llvm-svn: 234399	2015-04-08 10:10:12 +00:00
Ahmed Bougacha	273a9b4f03	[ARM] Mark a bunch of .td Operands with type _MEMORY. This shouldn't affect anything in-tree, as the OperandType users are mostly smart disassemblers and such; more information is helpful there. However, on the flip side, that + the fact that this is just hinting at the meaning of operands makes this not really test-worthy or testable. Differential Revision: http://reviews.llvm.org/D8620 llvm-svn: 234350	2015-04-07 20:31:16 +00:00
Rafael Espindola	b91455b5c0	Refactor a lot of duplicated code for stub output. This also moves it earlier so that it they are produced before we print an end symbol for the data section. llvm-svn: 234315	2015-04-07 13:42:44 +00:00
Aaron Ballman	ac33624075	Silencing several "enumeral and non-enumeral type in conditional expression" warnings; NFC. llvm-svn: 234314	2015-04-07 13:28:37 +00:00
Tim Northover	42335572bb	ARM: do not relax Thumb1 -> Thumb2 if only Thumb1 is available. After recognising that a certain narrow instruction might need a relocation to be represented, we used to unconditionally relax it to a Thumb2 instruction to permit this. Unfortunately, some CPUs (e.g. v6m) don't even have most Thumb2 instructions, so we end up emitting a completely invalid instruction. Theoretically, ELF does have relocations for these situations; but they are fairly unusable with such short ranges and the ABI document even says they're documented "for completeness". So an error is probably better there too. rdar://20391953 llvm-svn: 234195	2015-04-06 18:44:42 +00:00
Rafael Espindola	972756b741	Remove unnecessary uses of AliasedSymbol. As pr19627 points out, every use of AliasedSymbol is likely a bug. The main use was to avoid the oddity of a variable showing up as undefined. That was fixed in r233995, which made these calls nops. llvm-svn: 234169	2015-04-06 16:10:05 +00:00
Rafael Espindola	61e8ce36be	Store the sh_link of ARM_EXIDX directly in MCSectionELF. This avoids some pretty horrible and broken name based section handling. llvm-svn: 234142	2015-04-06 04:25:18 +00:00
Matthias Braun	6a42d7fd6b	ARM: Handle physreg targets in RegPair hints gracefully Register coalescing can change the target of a RegPair hint to a physreg, we should not crash on this. This also slightly improved the way ARMBaseRegisterInfo::updateRegAllocHint() works. llvm-svn: 233987	2015-04-03 00:18:38 +00:00
Vladimir Sukharev	2afdb32c06	[ARM] Rename v8.1a from "extension" to "architecture" v8.1a is renamed to architecture, following current entity naming approach. Excess generic cpu is removed. Intended use: "generic" cpu with "v8.1a" subtarget feature Reviewers: jmolloy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8767 llvm-svn: 233811	2015-04-01 14:54:56 +00:00
Eric Christopher	f8019408dc	Replace the MCSubtargetInfo parameter with a Triple when creating an MCInstPrinter. Update all callers and use where we wanted a Triple previously. llvm-svn: 233648	2015-03-31 00:10:04 +00:00
Eric Christopher	7099d51275	Remove unused MCSubtargetInfo argument from the ARM MCInstPrinter ctors. llvm-svn: 233609	2015-03-30 21:52:28 +00:00
Eric Christopher	c7c5592b7e	Remove unused Target argument from MCInstPrinter ctor functions. llvm-svn: 233607	2015-03-30 21:52:21 +00:00
Yaron Keren	075759aadd	Remove more superfluous .str() and replace std::string concatenation with Twine. Following r233392, http://llvm.org/viewvc/llvm-project?rev=233392&view=rev. llvm-svn: 233555	2015-03-30 15:42:36 +00:00
Akira Hatanaka	ee97475b2e	[ARM] Enable changing instprinter's behavior based on the per-function subtarget. llvm-svn: 233451	2015-03-27 23:41:42 +00:00
Akira Hatanaka	cfa1f619e2	clang-format ARMInstPrinter.{h,cpp} before I make changes to these files. llvm-svn: 233448	2015-03-27 23:24:22 +00:00
Akira Hatanaka	b46d0234a6	[MCInstPrinter] Enable MCInstPrinter to change its behavior based on the per-function subtarget. Currently, code-gen passes the default or generic subtarget to the constructors of MCInstPrinter subclasses (see LLVMTargetMachine::addPassesToEmitFile), which enables some targets (AArch64, ARM, and X86) to change their instprinter's behavior based on the subtarget feature bits. Since the backend can now use different subtargets for each function, instprinter has to be changed to use the per-function subtarget rather than the default subtarget. This patch takes the first step towards enabling instprinter to change its behavior based on the per-function subtarget. It adds a bit "PassSubtarget" to AsmWriter which tells table-gen to pass a reference to MCSubtargetInfo to the various print methods table-gen auto-generates. I will follow up with changes to instprinters of AArch64, ARM, and X86. llvm-svn: 233411	2015-03-27 20:36:02 +00:00
Derek Schuff	b051389f04	Use movw/movt instead of constant pool loads to lower byval parameter copies Summary: The ARM backend can use a loop to implement copying byval parameters before a call. In non-thumb2 mode it uses a constant pool load to materialize the trip count. For targets that need movt instead (e.g. Native Client), use the same code as in thumb2 mode to materialize the trip count. Reviewers: jfb, t.p.northover Differential Revision: http://reviews.llvm.org/D8442 llvm-svn: 233324	2015-03-26 22:11:00 +00:00
Renato Golin	4c8713969c	Adds an option to disable ARM ld/st optim pass Enabled by default, but it's useful when debugging with llc. Patch by Ranjeet Singh. llvm-svn: 233303	2015-03-26 18:38:04 +00:00

... 18 19 20 21 22 ...

9882 Commits