llvm-project

Commit Graph

Author	SHA1	Message	Date
Martin Storsjo	865d01a3cf	[AArch64] Support COFF linker directives This is similar to what was done for ARM in SVN r269574; the code and the test are straight copypaste to the corresponding AArch64 code and test directory. Differential revision: https://reviews.llvm.org/D37204 llvm-svn: 312223	2017-08-31 08:28:48 +00:00
Sjoerd Meijer	be5b60f735	[AArch64] allow v4f16 types when FullFP16 is supported Support for scalars was committed in r311154, this adds support for allowing v4f16 vector types (thus avoiding conversions from/to single precision for these types). Differential Revision: https://reviews.llvm.org/D37145 llvm-svn: 312104	2017-08-30 08:38:13 +00:00
Evandro Menezes	4976d6a0c6	[AArch64] Adjust the cost model for Exynos M1 and M2 Add new predicate to more accurately model the scheduling around branches and function calls and of loads and stores of pairs and integer multiplications. llvm-svn: 311944	2017-08-28 22:51:52 +00:00
Evandro Menezes	509516d200	[AArch64] Adjust the cost model for Exynos M1 and M2 Add new predicate to more accurately model the cost of arithmetic and logical operations shifted left. Differential revision: https://reviews.llvm.org/D37151 llvm-svn: 311943	2017-08-28 22:51:32 +00:00
Geoff Berry	40cdc0e053	[AArch64][Falkor] Avoid generating STRQro* instructions Summary: STRQro* instructions are slower than the alternative ADD/STRQui expanded instructions on Falkor, so avoid generating them unless we're optimizing for code size. Reviewers: t.p.northover, mcrosier Subscribers: aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D37020 llvm-svn: 311931	2017-08-28 20:48:43 +00:00
NAKAMURA Takumi	a1e97a77f5	Untabify. llvm-svn: 311875	2017-08-28 06:47:47 +00:00
Aditya Nandakumar	892979effc	[GISel]: Implement widenScalar for Legalizing G_PHI https://reviews.llvm.org/D37018 llvm-svn: 311763	2017-08-25 04:57:27 +00:00
Sjoerd Meijer	b0eb5fb317	[AArch64] Add FMOVH0: materialize 0 using zero register for f16 values Instead of loading 0 from a constant pool, it's of course much better to materialize it using an fmov and the zero register. Thanks to Ahmed Bougacha for the suggestion. Differential Revision: https://reviews.llvm.org/D37102 llvm-svn: 311662	2017-08-24 14:47:06 +00:00
Sjoerd Meijer	afc2cd3c9e	[AArch64] Custom lowering of copysign f16 This is a follow up patch of r311154 and introduces custom lowering of copysign f16 to avoid promotions to single precision types when the subtarget supports fullfp16. Differential Revision: https://reviews.llvm.org/D36893 llvm-svn: 311646	2017-08-24 09:21:10 +00:00
Sjoerd Meijer	046a969360	[AArch64] fix for fcos and frem f16 promotion Fix for copy-paste mistake in r311154; setOperationAction for fcos and frem f16 operands appeared twice (and it should be set to 'promote'). Differential Revision: https://reviews.llvm.org/D37071 llvm-svn: 311635	2017-08-24 07:43:52 +00:00
Geoff Berry	90bef32219	[AArch64][Falkor] Fix bug in Falkor HWPF tag collision avoidance LDPDi was incorrectly marked as ignoring the destination register in the prefetcher tag. llvm-svn: 311599	2017-08-23 21:11:28 +00:00
Aditya Nandakumar	efd8a84cd5	[GISEl]: Translate phi into G_PHI G_PHI has the same semantics as PHI but also has types. This lets us verify that the types in the G_PHI are consistent. This also allows specifying legalization actions for G_PHIs. https://reviews.llvm.org/D36990 llvm-svn: 311596	2017-08-23 20:45:48 +00:00
Krasimir Georgiev	3d55cef48b	[AArch64] Silence unused variable warning in opt mode after r311533 llvm-svn: 311535	2017-08-23 08:40:22 +00:00
Sjoerd Meijer	24c98189ed	[AArch64] ISel legalization debug messages. NFCI. Debugging AArch64 instruction legalization and custom lowering is really an unpleasant experience because it shows nodes that appear out of thin air. In commit r311444, some debug messages have been added to SelectionDAG, the target independent part, and this patch adds some AArch64 specific messages. Differential Revision: https://reviews.llvm.org/D36964 llvm-svn: 311533	2017-08-23 08:18:37 +00:00
Sam Parker	6dc3fcb1c6	[ARM][AArch64] v8.3-A Javascript Conversion Armv8.3-A adds instructions that convert a double-precision floating point number to a signed 32-bit integer with round towards zero, designed for improving Javascript performance. Differential Revision: https://reviews.llvm.org/D36785 llvm-svn: 311448	2017-08-22 11:08:21 +00:00
Sjoerd Meijer	b9de2b4871	[AArch64] Cleanup of HasFullFP16 argument. NFC. This is a clean up of commit r311154; it's not necessary to pass HasFullFP16 as an argument, instead just query the DAG. Differential Revision: https://reviews.llvm.org/D36978 llvm-svn: 311438	2017-08-22 09:21:08 +00:00
Alex Bradbury	080f6976c0	Use report_fatal_error for unsupported calling conventions The calling convention can be specified by the user in IR. Failing to support a particular calling convention isn't a programming error, and so relying on llvm_unreachable to catch and report an unsupported calling convention is not appropriate. Differential Revision: https://reviews.llvm.org/D36830 llvm-svn: 311435	2017-08-22 09:11:41 +00:00
Tim Northover	ef1fc5ae89	GlobalISel (AArch64): fix ABI at border between GPRs and SP. If a struct would end up half in GPRs and half on SP the ABI says it should actually go entirely on the stack. We were getting this wrong in GlobalISel before, causing compatibility issues. llvm-svn: 311388	2017-08-21 21:56:11 +00:00
Sam Parker	b252ffd2cc	[ARM][AArch64] Cortex-A75 and Cortex-A55 support This patch introduces support for Cortex-A75 and Cortex-A55, Arm's latest big.LITTLE A-class cores. They implement the ARMv8.2-A architecture, including the cryptography and RAS extensions, plus the optional dot product extension. They also implement the RCpc AArch64 extension from ARMv8.3-A. Cortex-A75: https://developer.arm.com/products/processors/cortex-a/cortex-a75 Cortex-A55: https://developer.arm.com/products/processors/cortex-a/cortex-a55 Differential Revision: https://reviews.llvm.org/D36667 llvm-svn: 311316	2017-08-21 08:43:06 +00:00
Benjamin Kramer	49a49fe816	Move helper classes into anonymous namespaces. No functionality change intended. llvm-svn: 311288	2017-08-20 13:03:48 +00:00
Sjoerd Meijer	ec9581e5e0	[AArch64] Do not promote f16 when subtarget HasFullFP16 Armv8.2-A adds FP16 support, i.e. f16 is not only a storage-only type, but it also supports performing data processing on 16-bit floating-point quantities. All the necessary (tablegen) groundwork of adding the ARMv8.2-A FP16 (scalar) instructions was done in D15014. To take advantage of this, this patch avoids promotion of f16 to f32 types when the subtarget supports FullFP16, which enables instruction selection of these FP16 instructions. Differential Revision: https://reviews.llvm.org/D36396 llvm-svn: 311154	2017-08-18 10:51:14 +00:00
Diana Picus	42ea77d5c2	Revert "GlobalISel (AArch64): fix ABI at border between GPRs and SP." This reverts commit e8fd20964798ca6d46d2729dd3a789707a6416da in an attempt to appease the GlobalISel buildbot, which fails in the test-suite with errors like fpcmp: files differ without tolerance allowance llvm-svn: 311151	2017-08-18 09:31:21 +00:00
Sam Parker	25efe769c0	[AArch64] Fix for buildbots, unused function Removing function declaration, my previous commit broke the bots. llvm-svn: 311150	2017-08-18 09:08:05 +00:00
Sam Parker	96f8959cfd	[AArch64] Remove DecodeAuthLoadWriteback The BaseAuthLoad instruction class was incorrectly passing an empty constraint string to its parent, so I have corrected this. This makes the DecodeAuthLoadWriteback function redundant, so I've also removed it. Differential Revision: https://reviews.llvm.org/D36741 llvm-svn: 311148	2017-08-18 08:39:54 +00:00
Tim Northover	48fff995d6	GlobalISel (AArch64): fix ABI at border between GPRs and SP. If a struct would end up half in GPRs and half on SP the ABI says it should actually go entirely on the stack. We were getting this wrong in GlobalISel before, causing compatibility issues. llvm-svn: 311137	2017-08-17 23:14:01 +00:00
Daniel Sanders	edd0784be6	Re-commit: [globalisel][tablegen] Support zero-instruction emission. Summary: Support the case where an operand of a pattern is also the whole of the result pattern. In this case the original result and all its uses must be replaced by the operand. However, register class restrictions can require a COPY. This patch handles both cases by always emitting the copy and leaving it for the register allocator to optimize. The previous commit failed on Windows machines due to a flaw in the sort predicate which allowed both A < B < C and B == C to be satisfied simultaneously. The cause of this was some sloppiness in the priority order of G_CONSTANT instructions compared to other instructions. These had equal priority because it makes no difference, however there were operands that had higher priority than G_CONSTANT but lower priority than any other instruction. As a result, a priority order between G_CONSTANT and other instructions must be enforced to ensure the predicate defines a strict weak order. Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Subscribers: javed.absar, kristof.beyls, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D36084 llvm-svn: 311076	2017-08-17 09:26:14 +00:00
Geoff Berry	40549ad1ac	[LoopDataPrefetch][AArch64FalkorHWPFFix] Preserve ScalarEvolution Summary: Mark LoopDataPrefetch and AArch64FalkorHWPFFix passes as preserving ScalarEvolution since they do not alter loop structure and should not alter any SCEV values (though LoopDataPrefetch may introduce new instructions that won't have cached SCEV values yet). This can result in slight code differences, mainly w.r.t. nsw/nuw flags on SCEVs, since these are computed somewhat lazily when a zext/sext instruction is encountered. As a result, passes after the modified passes may see SCEVs with more nsw/nuw flags present. Reviewers: sanjoy, anemet Subscribers: aemerson, rengolin, mzolotukhin, javed.absar, kristof.beyls, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D36716 llvm-svn: 311032	2017-08-16 19:03:16 +00:00
Quentin Colombet	61d71a138b	Reapply "[GlobalISel] Remove the GISelAccessor API." This reverts commit r310425, thus reapplying r310335 with a fix for link issue of the AArch64 unittests on Linux bots when BUILD_SHARED_LIBS is ON. Original commit message: [GlobalISel] Remove the GISelAccessor API. Its sole purpose was to avoid spreading around ifdefs related to building global-isel. Since r309990, GlobalISel is not optional anymore, thus, we can get rid of this mechanism altogether. NFC. ---- The fix for the link issue consists in adding the GlobalISel library to the list of dependencies for the AArch64 unittests. This dependency comes from the use of AArch64Subtarget that needs to know how to destruct the GISel related APIs when being destroyed. Thanks to Bill Seurer and Ahmed Bougacha for helping me reproduce and understand the problem. llvm-svn: 310969	2017-08-15 22:31:51 +00:00
Daniel Sanders	eb2f5f3256	Revert r310919 - [globalisel][tablegen] Support zero-instruction emission. As expected, this failed on the windows bots but the instrumentation showed something interesting. The ADD8ri and INC8r rules are never directly compared on the windows machines. That implies that the issue lies in transitivity of the Compare predicate. I believe I've already verified that but maybe I missed something. llvm-svn: 310922	2017-08-15 15:10:31 +00:00
Daniel Sanders	16e6dd3cd6	Re-commit with some instrumentation: [globalisel][tablegen] Support zero-instruction emission. Summary: Support the case where an operand of a pattern is also the whole of the result pattern. In this case the original result and all its uses must be replaced by the operand. However, register class restrictions can require a COPY. This patch handles both cases by always emitting the copy and leaving it for the register allocator to optimize. The previous commit failed on the windows bots and this one is likely to fail on those same bots. However, the added instrumentation should reveal a particular isHigherPriorityThan() evaluation which I'm expecting to expose that these machines are weighing priority of two rules differently from the non-windows machines. Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Subscribers: javed.absar, kristof.beyls, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D36084 llvm-svn: 310919	2017-08-15 13:50:09 +00:00
Sam Parker	647cce82a3	[AArch64] Remove unused MC function An unused function warning was raised in https://bugs.llvm.org/show_bug.cgi?id=34178. The offending function, in AArch64MCCodeEmitter.cpp, was committed by me last week. Differential Revision: https://reviews.llvm.org/D36665 llvm-svn: 310823	2017-08-14 09:16:13 +00:00
Martin Storsjo	2341319564	[COFF, ARM64] Use '//' as comment character in assembly files in GNU environments This allows using semicolons for bundling up more than one statement per line. This is used within the mingw-w64 project in some assembly files that contain code for multiple architectures. Differential Revision: https://reviews.llvm.org/D36366 llvm-svn: 310797	2017-08-13 19:42:05 +00:00
Daniel Sanders	e6c216ed5b	Revert r310716 (and r310735): [globalisel][tablegen] Support zero-instruction emission. Two of the Windows bots are failing test\CodeGen\X86\GlobalISel\select-inc.mir which should not have been affected by the change. Reverting while I investigate. Also reverted r310735 because it builds on r310716. llvm-svn: 310745	2017-08-11 19:19:21 +00:00
Daniel Sanders	1fb1ce0c87	[globalisel][tablegen] Support zero-instruction emission. Summary: Support the case where an operand of a pattern is also the whole of the result pattern. In this case the original result and all its uses must be replaced by the operand. However, register class restrictions can require a COPY. This patch handles both cases by always emitting the copy and leaving it for the register allocator to optimize. Depends on D35833 Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Subscribers: javed.absar, kristof.beyls, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D36084 llvm-svn: 310716	2017-08-11 15:40:32 +00:00
Sam Parker	6d42de7847	[AArch64] Enable ARMv8.3-A pointer authentication Add assembler and disassembler support for the ARMv8.3-A pointer authentication instructions. Differential Revision: https://reviews.llvm.org/D36517 llvm-svn: 310709	2017-08-11 13:14:00 +00:00
Jessica Paquette	6315d2d21d	[MachineOutliner] Add RegState::Define to LDRXpost in insertOutlinedCall This fixes a MachineVerifier failure in machine-outliner.mir. Not explicitly adding RegState::Define to the LR argument makes it unhappy because an explicit definition is marked as a use. Build failure: http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-expensive/7496/testReport/junit/LLVM/CodeGen_AArch64/machine_outliner_mir/ llvm-svn: 310671	2017-08-10 23:11:24 +00:00
Krzysztof Parzyszek	bea30c6286	Add "Restored" flag to CalleeSavedInfo The liveness-tracking code assumes that the registers that were saved in the function's prolog are live outside of the function. Specifically, that registers that were saved are also live-on-exit from the function. This isn't always the case as illustrated by the LR register on ARM. Differential Revision: https://reviews.llvm.org/D36160 llvm-svn: 310619	2017-08-10 16:17:32 +00:00
Sam Parker	71a474d563	[AArch64] Assembler support for v8.3 RCpc Added assembler and disassembler support for the new Release Consistent processor consistent instructions, introduced with ARM v8.3-A for AArch64. Differential Revision: https://reviews.llvm.org/D36522 llvm-svn: 310575	2017-08-10 09:52:55 +00:00
Sam Parker	9d95764c3b	[ARM][AArch64] ARMv8.3-A enablement The beta ARMv8.3 ISA specifications have been released for AArch64 and AArch32, these can be found at: https://developer.arm.com/products/architecture/a-profile/exploration-tools An introduction to this architecture update can be found at: https://community.arm.com/processors/b/blog/posts/armv8-a-architecture-2016-additions This patch is the first in a series which will add ARM v8.3-A support in LLVM and Clang. It adds the necessary changes that create targets for both the ARM and AArch64 backends. Differential Revision: https://reviews.llvm.org/D36514 llvm-svn: 310561	2017-08-10 09:41:00 +00:00
Sjoerd Meijer	7987633263	[AArch64] Assembler support for the ARMv8.2a dot product instructions Dot product is an optional ARMv8.2a extension, see also the public architecture specification here: https://developer.arm.com/products/architecture/a-profile/exploration-tools. This patch adds AArch64 assembler support for these dot product instructions. Differential Revision: https://reviews.llvm.org/D36515 llvm-svn: 310480	2017-08-09 14:59:54 +00:00
Quentin Colombet	8dd90fb54b	Revert "[GlobalISel] Remove the GISelAccessor API." This reverts commit r310115. It causes a linker failure for one of the AArch64 unittests on one of the Linux bots: http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/3429 : && /home/fedora/gcc/install/gcc-7.1.0/bin/g++ -fPIC -fvisibility-inlines-hidden -Werror=date-time -std=c++11 -Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wno-maybe-uninitialized -Wdelete-non-virtual-dtor -Wno-comment -ffunction-sections -fdata-sections -O2 -L/home/fedora/gcc/install/gcc-7.1.0/lib64 -Wl,-allow-shlib-undefined -Wl,-O3 -Wl,--gc-sections unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o -o unittests/Target/AArch64/AArch64Tests lib/libLLVMAArch64CodeGen.so.6.0.0svn lib/libLLVMAArch64Desc.so.6.0.0svn lib/libLLVMAArch64Info.so.6.0.0svn lib/libLLVMCodeGen.so.6.0.0svn lib/libLLVMCore.so.6.0.0svn lib/libLLVMMC.so.6.0.0svn lib/libLLVMMIRParser.so.6.0.0svn lib/libLLVMSelectionDAG.so.6.0.0svn lib/libLLVMTarget.so.6.0.0svn lib/libLLVMSupport.so.6.0.0svn -lpthread lib/libgtest_main.so.6.0.0svn lib/libgtest.so.6.0.0svn -lpthread -Wl,-rpath,/home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1/lib && : unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o:(.toc+0x0): undefined reference to `vtable for llvm::LegalizerInfo' unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o:(.toc+0x8): undefined reference to `vtable for llvm::RegisterBankInfo' The particularity of this bot is that it is built with BUILD_SHARED_LIBS=ON. However, I was not able to reproduce the problem so far. Reverting to unblock the bot. llvm-svn: 310425	2017-08-08 22:22:30 +00:00
Jessica Paquette	d36945bf3a	[MachineOutliner] Ensure AArch64 outliner doesn't mess with W30 or LR Before, the outliner would mark all instructions that read from/modify LR as illegal. This doesn't handle W30, which overlaps with LR. This shouldn't be outlined. This commit fixes that by making modifiesRegister() and readsRegister() look at W30 + take in a TRI argument. This makes sure that modifiesRegister() and readsRegister() won't outline either of W30 and LR. https://reviews.llvm.org/D36435 llvm-svn: 310422	2017-08-08 21:51:26 +00:00
Daniel Sanders	0554004698	[globalisel][tablegen] Add support for importing 'imm' operands. Summary: This patch enables the import of rules containing 'imm' operands that do not constrain the acceptable values using predicates. Support for ImmLeaf will arrive in a later patch. Depends on D35681 Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Reviewed By: rovka Subscribers: kristof.beyls, javed.absar, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D35833 llvm-svn: 310343	2017-08-08 10:44:31 +00:00
Joel Jones	60711ca253	[AArch64] LSE Atomics reorg - part 1 Add memory synchronization semantics to LSE Atomics. The memory semantics feature will be added in a subsequent patch. In this patch, several corrections were added to the existing LSE Atomics implementation, based on the ARM Errata D11904 from 05/12/2017. Patch by: steleman Differential Revision: https://reviews.llvm.org/D35319 llvm-svn: 310167	2017-08-05 04:30:55 +00:00
Quentin Colombet	c046208c52	[GlobalISel] Remove the GISelAccessor API. Its sole purpose was to avoid spreading around ifdefs related to building global-isel. Since r309990, GlobalISel is not optional anymore, thus, we can get rid of this mechanism all together. NFC. llvm-svn: 310115	2017-08-04 20:15:46 +00:00
Craig Topper	4e22ee6745	[ConstantInt] Use ConstantInt::getValue instead of Constant::getUniqueInteger in a few places where we obviously have a ConstantInt. NFC getUniqueInteger will ultimately call ConstantInt::getValue, but calling ConstantInt::getValue should be inlined. llvm-svn: 310069	2017-08-04 16:59:29 +00:00
Chad Rosier	14fc82a1df	[AArch64] Fix an assertion for pre-index generation with unscaled loads/stores. Differential Revision: https://reviews.llvm.org/D36248 PR34035 llvm-svn: 310066	2017-08-04 16:44:06 +00:00
Quentin Colombet	250e050a50	[GlobalISel] Make GlobalISel a non-optional library. With this change, the GlobalISel library always gets built. In particular, it is no longer possible to opt GlobalISel out of the build using the LLVM_BUILD_GLOBAL_ISEL variable. llvm-svn: 309990	2017-08-03 21:52:25 +00:00
Tim Northover	869fa74d4b	Revert "[AArch64] Simplify AES*Tied pseudo expansion (NFC)." This reverts commit r309821. My suggestion was wrong because it left the MachineOperands tied which confused the verifier. Since there's no easy way to untie operands, the original BuildMI solution is probably best. llvm-svn: 309962	2017-08-03 16:59:36 +00:00
Rafael Espindola	79e238afee	Delete Default and JITDefault code models IMHO it is an antipattern to have an enum value that is Default. At any given piece of code it is not clear if we have to handle Default or if it has already been mapped to a concrete value. In this case in particular, only the target can do the mapping and it is nice to make sure it is always done. This deletes the two default enum values of CodeModel and uses an explicit Optional<CodeModel> when it is possible that it is unspecified. llvm-svn: 309911	2017-08-03 02:16:21 +00:00
Florian Hahn	31f78fd0ae	[AArch64] Simplify AES*Tied pseudo expansion (NFC). Summary: Suggested by @t.p.northover in https://bugs.llvm.org/show_bug.cgi?id=34015. Reviewers: javed.absar, t.p.northover, rengolin Reviewed By: t.p.northover Subscribers: aemerson, kristof.beyls, llvm-commits, t.p.northover Differential Revision: https://reviews.llvm.org/D36223 llvm-svn: 309821	2017-08-02 15:17:19 +00:00
Haicheng Wu	50692a203c	[AArch64] Fix a typo in isExtFreeImpl() next => not Differential Revision: https://reviews.llvm.org/D36104 llvm-svn: 309748	2017-08-01 21:26:45 +00:00
Martin Storsjo	eacf4e408b	[AArch64] Rewrite stack frame handling for win64 vararg functions The previous attempt, which made do with a single offset in computeCalleeSaveRegisterPairs, wasn't quite enough. The previous attempt only worked as long as CombineSPBump == true (since the offset would be adjusted later in fixupCalleeSaveRestoreStackOffset). Instead include the size for the fixed stack area used for win64 varargs in calculations in emitPrologue/emitEpilogue. The stack consists mainly of three parts: AFI->getLocalStackSize(), AFI->getCalleeSavedStackSize(), and FixedObject. Most of the places in the code which previously used the CSStackSize now use PrologueSaveSize instead, which is the sum of the latter two, while some cases which need exactly the middle one use AFI->getCalleeSavedStackSize() explicitly instead of a local variable. In addition to moving the offsetting into emitPrologue/emitEpilogue (which fixes functions with CombineSPBump == false), also set the frame pointer to point to the right location, where the frame pointer and link register actually are stored. In addition to the prologue/epilogue, this also requires changes to resolveFrameIndexReference. Add tests for a function that keeps a frame pointer and another one that uses a VLA. Differential Revision: https://reviews.llvm.org/D35919 llvm-svn: 309744	2017-08-01 21:13:54 +00:00
Aditya Nandakumar	02c602e18c	[GISel]: Support Widening G_ICMP's destination operand. Updated AArch64 to widen destination to s32. https://reviews.llvm.org/D35737 Reviewed by Tim llvm-svn: 309579	2017-07-31 17:00:16 +00:00
Florian Hahn	f63a5e91db	[AArch64] Tie source and destination operands for AESMC/AESIMC. Summary: Most CPUs implementing AES fusion require instruction pairs of the form AESE Vn, _; AESMC Vn, Vn and AESD Vn, _; AESIMC Vn, Vn. The constraint is added to AES(I)MC instructions which use the result of an AES(E|D) instruction by using AES(I)MCTrr pseudo instructions, which constrain source and destination registers to be the same. A nice side effect of this change is that now all possible pairs are scheduled back-to-back on the exynos-m1 for the misched-fusion-aes.ll test case. I had to update aes_load_store. The version I added initially was very reduced and with the new constraint, AESE/AESMC could not be scheduled back-to-back. I updated the test to be more realistic and still expose the same scheduling problem as the initial test case. Reviewers: t.p.northover, rengolin, evandro, kristof.beyls, silviu.baranga Reviewed By: t.p.northover, evandro Subscribers: aemerson, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D35299 llvm-svn: 309495	2017-07-29 20:35:28 +00:00
Florian Hahn	2f86e3d494	[AArch64] Use 8 bytes as preferred function alignment on Cortex-A53. Summary: This change gives a 0.25% speedup on execution time, a 0.82% improvement in benchmark scores and a 0.20% increase in binary size on a Cortex-A53. These numbers are the geomean results on a wide range of benchmarks from the test-suite and a range of proprietary suites. Reviewers: t.p.northover, aadg, silviu.baranga, mcrosier, rengolin Reviewed By: rengolin Subscribers: grimar, davide, aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D35568 llvm-svn: 309494	2017-07-29 20:04:54 +00:00
Jessica Paquette	d87f54493d	[MachineOutliner] NFC: Change IsTailCall to a call class + frame class This commit - Removes IsTailCall and replaces it with a target-defined unsigned - Refactors getOutliningCallOverhead and getOutliningFrameOverhead so that they don't use IsTailCall - Adds a call class + frame class classification to OutlinedFunction and Candidate respectively This accomplishes a couple things. Firstly, we don't need the notion of tail call in the general outlining algorithm. Secondly, we now can have different "outlining classes" for each candidate within a set of candidates. This will make it easy to add new ways to outline sequences for certain targets and dynamically choose an appropriate cost model for a sequence depending on the context that that sequence lives in. Ultimately, this should get us closer to being able to do something like, say avoid saving the link register when outlining AArch64 instructions. llvm-svn: 309475	2017-07-29 02:55:46 +00:00
Tim Northover	a7f583e33b	GlobalISel: map 128-bit values to an FPR by default. Eventually we may want to allow a pair of GPRs but absolutely nothing in the entire world is ready for that yet. llvm-svn: 309404	2017-07-28 17:11:01 +00:00
Joel Jones	08e88e8df7	[AArch64] Standardize suffixes for LSE Atomics mnemonics (NFCI) This NFC changeset standardizes the suffixes used for LSE Atomics instructions. It changes the existing suffixes - 'b', 'h', 's', 'd' - to the existing standard 'B', 'H', 'W' and 'X'. This changeset is the result of the code review discussion for D35319. Patch by: steleman Differential Revision: https://reviews.llvm.org/D35927 llvm-svn: 309384	2017-07-28 14:09:24 +00:00
Jessica Paquette	809d708b8a	[MachineOutliner] NFC: Split up getOutliningBenefit This is some more cleanup in preparation for some actual functional changes. This splits getOutliningBenefit into two cost functions: getOutliningCallOverhead and getOutliningFrameOverhead. These functions return the number of instructions that would be required to call a specific function and the number of instructions that would be required to construct a frame for a specific function. The actual outlining benefit logic is moved into the outliner, which calls these functions. The goal of refactoring getOutliningBenefit is to: - Get us closer to getting rid of the IsTailCall flag - Further split up "target-specific" things and "general algorithm" things llvm-svn: 309356	2017-07-28 03:21:58 +00:00
Adrian Prantl	8f4b353ee1	Remove unused function from AArch64 backend (NFC) llvm-svn: 309336	2017-07-27 23:52:06 +00:00
Ahmed Bougacha	52cecb1f27	[AArch64] Remove outdated comment. NFC. There hasn't been a ternary since r231987. llvm-svn: 309324	2017-07-27 21:27:58 +00:00
Ahmed Bougacha	87807c5a86	[AArch64] Fix legality info passed to demanded bits for TBI opt. The (seldom-used) TBI-aware optimization had a typo lying dormant since it was first introduced, in r252573: when asking for demanded bits, it told TLI that it was running after legalize, where the opposite was true. This is an important piece of information, that the demanded bits analysis uses to make assumptions about the node. r301019 added such an assumption, which was broken by the TBI combine. Instead, pass the correct flags to TLO. llvm-svn: 309323	2017-07-27 21:27:25 +00:00
Florian Hahn	67ddd1d08f	[TargetParser] Use enum classes for various ARM kind enums. Summary: Using c++11 enum classes ensures that only valid enum values are used for ArchKind, ProfileKind, VersionKind and ISAKind. This removes the need for checks that the provided values map to a proper enum value, allows us to get rid of AK_LAST and prevents comparing values from different enums. It also removes a bunch of static_cast from unsigned to enum values and vice versa, at the cost of introducing static casts to access AArch64ARCHNames and ARMARCHNames by ArchKind. FPUKind and ArchExtKind are the only remaining old-style enum in TargetParser.h. I think it's beneficial to keep ArchExtKind as old-style enum, but FPUKind can be converted too, but this patch is quite big, so could do this in a follow-up patch. I could also split this patch up a bit, if people would prefer that. Reviewers: rengolin, javed.absar, chandlerc, rovka Reviewed By: rovka Subscribers: aemerson, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D35882 llvm-svn: 309287	2017-07-27 16:27:56 +00:00
Evandro Menezes	d192a8ae7d	[AArch64] Adjust the cost model for Exynos M1 and M2 Add the information for the scalar reciprocal square root approximation. llvm-svn: 309183	2017-07-26 21:28:15 +00:00
Peter Collingbourne	081ffe2ff2	Change CallLoweringInfo::CS to be an ImmutableCallSite instead of a pointer. NFCI. This was a use-after-free waiting to happen. llvm-svn: 309159	2017-07-26 19:15:29 +00:00
Martin Storsjo	0b7bf7a2e3	[COFF, ARM64] Fix symbol offsets in ADRP/ADD/LDR/STR relocations In COFF, a symbol offset can't be stored in the relocation (as is done in ELF or MachO), but is stored as the immediate in the instruction itself. The immediate in the ADRP thus is the symbol offset in bytes, not in pages. For the PAGEOFFSET_12A/L relocations, ignore any offset outside of the lowest 12 bits; they won't have any effect on the ADD/LDR/STR instruction itself but only on the associated ADRP. This is similar to how the same issue is handled for MOVW/MOVT instructions in ELF (see e.g. SVN r307713, and r307728 in lld). This fixes "fixup out of range" errors while building larger object files, where temporary symbols end up as a plain section symbol and an offset, and fixes any cases where the symbol offset means that the actual target ended up on a different page than the symbol itself. Differential Revision: https://reviews.llvm.org/D35791 llvm-svn: 309105	2017-07-26 11:19:17 +00:00
Zvi Rackover	1b73682243	TargetLowering: Change isShuffleMaskLegal's mask argument type to ArrayRef<int>. NFCI. Changing mask argument type from const SmallVectorImpl<int>& to ArrayRef<int>. This came up in D35700 where a mask is received as an ArrayRef<int> and we want to pass it to TargetLowering::isShuffleMaskLegal(). Also saves a few lines of code. llvm-svn: 309085	2017-07-26 08:06:58 +00:00
Eugene Zelenko	96d933da4f	[AArch64] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 309062	2017-07-25 23:51:02 +00:00
Eric Christopher	97ae58686f	Update the comments on default subtargets based on feedback. llvm-svn: 309041	2017-07-25 22:21:08 +00:00
Martin Storsjo	8cb3667541	[AArch64] Reserve a 16 byte aligned amount of fixed stack for win64 varargs Create a dummy 8 byte fixed object for the unused slot below the first stored vararg. Alternative ideas tested but skipped: One could try to align the whole fixed object to 16, but I haven't found how to add an offset to the stack frame used in LowerWin64_VASTART. If only the size of the fixed stack object size is padded but not the offset, via MFI.CreateFixedObject(alignTo(GPRSaveSize, 16), -(int)GPRSaveSize, false), PrologEpilogInserter crashes due to "Attempted to reset backwards range!". This fixes misconceptions about where registers are spilled, since AArch64FrameLowering.cpp assumes the offset from fixed objects is aligned to 16 bytes (and the Win64 case there already manually aligns the offset to 16 bytes). This fixes cases where local stack allocations could overwrite callee saved registers on the stack. Differential Revision: https://reviews.llvm.org/D35720 llvm-svn: 308950	2017-07-25 05:20:01 +00:00
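The padding rule in commit 8cb3667541 can be illustrated with a small Python sketch. It assumes eight 8-byte integer argument registers (x0 to x7) and that every register not consumed by a named argument is saved for va_start; the function names are hypothetical and the code does not reflect the actual AArch64ISelLowering implementation.

```python
def align_to(value, align):
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


def win64_vararg_save_area(num_fixed_gpr_args, num_gpr_arg_regs=8, reg_size=8):
    """Return (gpr_save_size, padding) in bytes.

    gpr_save_size covers the argument registers left over after the named
    arguments; padding is the dummy slot needed to keep the fixed stack
    area a multiple of 16 bytes.
    """
    gpr_save_size = reg_size * (num_gpr_arg_regs - num_fixed_gpr_args)
    padding = align_to(gpr_save_size, 16) - gpr_save_size
    return gpr_save_size, padding


if __name__ == "__main__":
    for fixed in range(9):
        print(fixed, win64_vararg_save_area(fixed))
```

With an odd number of saved registers the sketch yields the 8-byte dummy slot mentioned in the commit message; with an even number no padding is needed.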
Evandro Menezes	29ffb0e66a	[AArch64] Adjust the cost model for Exynos M1 and M2 Fine tune the resources in a couple of ASIMD loads. llvm-svn: 308904	2017-07-24 18:06:16 +00:00
Chad Rosier	9b2b4c961a	[AArch64] Redundant Copy Elimination - remove more zero copies. This patch removes unnecessary zero copies in BBs that are targets of b.eq/b.ne and we know the result of the compare instruction is zero. For example, BB#0: subs w0, w1, w2 str w0, [x1] b.ne .LBB0_2 BB#1: mov w0, wzr ; <-- redundant str w0, [x2] .LBB0_2 Differential Revision: https://reviews.llvm.org/D35075 llvm-svn: 308849	2017-07-23 16:38:08 +00:00
Jonas Paulsson	024e319489	[SystemZ, LoopStrengthReduce] This patch makes LSR generate better code for SystemZ in the cases of memory intrinsics, Load->Store pairs or comparison of immediate with memory. In order to achieve this, the following common code changes were made: * New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if LSR should do instruction-based addressing evaluations by calling isLegalAddressingMode() with the Instruction pointers. * In LoopStrengthReduce: handle address operands of memset, memmove and memcpy as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address, not just loads or stores. SystemZ changes: * isLSRCostLess() implemented with Insns first, and without ImmCost. * New function supportedAddressingMode() that is a helper for TTI methods looking at Instructions passed via pointers. Review: Ulrich Weigand, Quentin Colombet https://reviews.llvm.org/D35262 https://reviews.llvm.org/D35049 llvm-svn: 308729	2017-07-21 11:59:37 +00:00
Evandro Menezes	55459609c8	[AArch64] Adjust the cost model for Exynos M1 and M2 Add the cost for the EXT instructions and explicitly add the cost for a few instructions that were implied by the coarse model. llvm-svn: 308697	2017-07-20 23:41:50 +00:00
Tim Northover	7b6d66c0c9	Recommit: GlobalISel: select G_EXTRACT and G_INSERT instructions on AArch64. It revealed a bug in the Localizer pass which has now been fixed. This includes the fix for SUBREG_TO_REG committed separately last time. llvm-svn: 308688	2017-07-20 22:58:38 +00:00
Mandeep Singh Grang	d41ac895bb	[COFF, ARM64, CodeView] Add support to emit CodeView debug info for ARM64 COFF Reviewers: compnerd, ruiu, rnk, zturner Reviewed By: rnk Subscribers: majnemer, aemerson, aprantl, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D35518 llvm-svn: 308665	2017-07-20 20:20:00 +00:00
Diana Picus	7534b28291	Revert "GlobalISel: select G_EXTRACT and G_INSERT instructions on AArch64." This reverts commit 36c6a2ea9669bc3bb695928529a85d12d1d3e3f9 because it broke the test-suite on the GlobalISel bot. llvm-svn: 308603	2017-07-20 11:36:03 +00:00
Tim Northover	967d4aa7a0	GlobalISel: partially revert r308540. An unfinished and untested implementation of ISel for G_UNMERGE_VALUES crept in by mistake. llvm-svn: 308542	2017-07-19 22:11:08 +00:00
Tim Northover	0e0b3c97dd	GlobalISel: fix SUBREG_TO_REG implementation. The first argument needs to be an immediate rather than a register. Should fix some crashes in the verifier bot. llvm-svn: 308540	2017-07-19 22:08:08 +00:00
Martin Storsjo	b2e9fcfca4	[AArch64] Force relocations for all ADRP instructions This generalizes an existing fix from ELF to MachO and COFF. Test that an ADRP to a local symbol whose offset is known at assembly time still produces relocations, both for MachO and COFF. Test that an ADRP without a @page modifier on MachO fails (previously it didn't). Differential Revision: https://reviews.llvm.org/D35544 llvm-svn: 308518	2017-07-19 20:14:32 +00:00
Martin Storsjo	2ff5f5d681	[AArch64, COFF] Interpret .align as power of two for COFF as well Differential Revision: https://reviews.llvm.org/D35545 llvm-svn: 308517	2017-07-19 20:14:24 +00:00
Tim Northover	d59fbec8e2	GlobalISel: select G_EXTRACT and G_INSERT instructions on AArch64. llvm-svn: 308493	2017-07-19 16:47:07 +00:00
Evandro Menezes	e8411cba87	[AArch64] Adjust the feature set for Exynos M2 Add fusion of AES operations. llvm-svn: 308388	2017-07-18 22:51:25 +00:00
Mandeep Singh Grang	d857b4ca98	[COFF, ARM64] Reserve X18 register by default Reviewers: compnerd, rnk, ruiu, mstorsjo Reviewed By: mstorsjo Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D35531 llvm-svn: 308358	2017-07-18 20:41:33 +00:00
Geoff Berry	9962faed2b	[AArch64][Falkor] Avoid HW prefetcher tag collisions (step 2) Summary: Avoid HW prefetcher instruction tag collisions in loops by inserting MOVs to change the base address register of strided loads. Reviewers: t.p.northover, mcrosier Subscribers: aemerson, rengolin, javed.absar, kristof.beyls, hfinkel, llvm-commits Differential Revision: https://reviews.llvm.org/D35366 llvm-svn: 308324	2017-07-18 16:14:22 +00:00
Daniel Sanders	40b66d646e	[globalisel][tablegen] Enable the import of rules involving fma. Summary: G_FMA was recently added to GlobalISel which enables the import of rules involving fma. Add the mapping to allow it. Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Reviewed By: rovka Subscribers: kristof.beyls, javed.absar, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D35130 llvm-svn: 308308	2017-07-18 14:10:07 +00:00
Florian Hahn	3530094de6	[AArch64] Use 16 bytes as preferred function alignment on Cortex-A73. Summary: Using 16 byte alignment is beneficial on Cortex-A73, similar to Cortex-A72 (added in D34961). Reviewers: mcrosier, t.p.northover, aadg, silviu.baranga Reviewed By: t.p.northover Subscribers: aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D35493 llvm-svn: 308283	2017-07-18 09:31:18 +00:00
Mandeep Singh Grang	6d6f2fa198	[COFF, ARM64] Correct the data layout string for COFF ARM64 target llvm-svn: 308223	2017-07-17 21:25:19 +00:00
Geoff Berry	0cf9e702bf	[AArch64][Falkor] Address some stylistic review comments. NFC. llvm-svn: 308211	2017-07-17 20:19:05 +00:00
Martin Storsjo	2f24e93481	[AArch64] Extend CallingConv::X86_64_Win64 to AArch64 as well Rename the enum value from X86_64_Win64 to plain Win64. The symbol exposed in the textual IR is changed from 'x86_64_win64cc' to 'win64cc', but the numeric value is kept, keeping support for old bitcode. Differential Revision: https://reviews.llvm.org/D34474 llvm-svn: 308208	2017-07-17 20:05:19 +00:00
Nirav Dave	8d0ecbedbe	Avoid store merge to f128 in context of noimplicitfloat NFCI. Prevent store merge from merging stores into an invalid 128-bit store (realized as a f128 value in the context of the noimplicitfloat attribute). Previously, such stores were immediately split back into valid stores. llvm-svn: 308184	2017-07-17 15:09:47 +00:00
Mandeep Singh Grang	a210f1d7bf	[COFF, ARM64] Add initial relocation types Reviewers: compnerd, ruiu, rnk Reviewed By: compnerd Subscribers: mstorsjo, aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34857 llvm-svn: 308154	2017-07-17 00:05:32 +00:00
Hiroshi Inoue	a9ee279e70	fix typos in comments; NFC llvm-svn: 308126	2017-07-16 07:48:48 +00:00
Yi Kong	3b680d8d81	[AArch64] Avoid selecting XZR inline ASM memory operand Restricting register class to PointerRegClass for memory operands. Also fix the PointerRegClass for AArch64 from GPR64 to GPR64sp, since XZR cannot hold a memory pointer while SP can. Fixes PR33134. Differential Revision: https://reviews.llvm.org/D34999 llvm-svn: 308060	2017-07-14 21:46:16 +00:00
Geoff Berry	b1e8714af9	[AArch64][Falkor] Avoid HW prefetcher tag collisions (step 1) Summary: This patch is the first step in reducing HW prefetcher instruction tag collisions in inner loops for Falkor. It adds a pass that annotates IR loads with metadata to indicate that they are known to be strided loads, and adds a target lowering hook that translates this metadata to a target-specific MachineMemOperand flag. A follow on change will use this MachineMemOperand flag to re-write instructions to reduce tag collisions. Reviewers: mcrosier, t.p.northover Subscribers: aemerson, rengolin, mgorny, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34963 llvm-svn: 308059	2017-07-14 21:44:12 +00:00
Eric Christopher	4e332c7cf1	Add a set of comments explaining why getSubtargetImpl() is deleted on these targets. llvm-svn: 307999	2017-07-14 04:33:43 +00:00
Martin Storsjo	68266faa31	[AArch64] Implement support for windows style vararg functions Pass parameters properly in calls to such functions (pass all floats in integer registers), and handle va_start properly (allocate stack immediately below the arguments on the stack, to save the register arguments into a single continuous array). Differential Revision: https://reviews.llvm.org/D35006 llvm-svn: 307928	2017-07-13 17:03:12 +00:00
Sjoerd Meijer	fe3ff69faf	[AArch64] Enable the mnemonic spell checker The AsmParser mnemonic spell checker was introduced in r307148 and enabled only for ARM. This patch enables it for AArch64. Differential Revision: https://reviews.llvm.org/D35357 llvm-svn: 307918	2017-07-13 15:29:13 +00:00
Amara Emerson	9f3a245e76	[AArch64] Add an SVE target feature to the backend and TargetParser. The feature will be used properly once assembler/disassembler support begins to land. llvm-svn: 307917	2017-07-13 15:19:56 +00:00
Matthew Simpson	06e6a6bdff	[AArch64] Add preliminary support for ARMv8.1 SUB/AND atomics This patch is a follow-up to r305893 and adds preliminary support for the fetch_sub and fetch_and operations. llvm-svn: 307913	2017-07-13 15:01:23 +00:00
Geoff Berry	6748abe24d	[MIR] Add support for printing and parsing target MMO flags Summary: Add target hooks for printing and parsing target MMO flags. Targets may override getSerializableMachineMemOperandTargetFlags() to return a mapping from string to flag value for target MMO values that should be serialized/parsed in MIR output. Add implementation of this hook for AArch64 SuppressPair MMO flag. Reviewers: bogner, hfinkel, qcolombet, MatzeB Subscribers: mcrosier, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D34962 llvm-svn: 307877	2017-07-13 02:28:54 +00:00
Florian Hahn	15be1ac9ab	[AArch64] Only run macro fusion for CPUs with any fusion support. Reviewers: evandro, t.p.northover, javed.absar Reviewed By: evandro Subscribers: aemerson, rengolin, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34959 llvm-svn: 307851	2017-07-12 21:41:28 +00:00
Florian Hahn	f934addc09	[AArch64] Add AArch64Subtarget::isFusion function. Summary: isFusion returns true if the subtarget supports any kind of instruction fusion, similar to ARMSubtarget::isFusion. This was suggested in D34142. This changes the current behavior slightly, because the macro fusion mutation is now added to the PostRA MachineScheduler in case the subtarget supports any kind of fusion. I think that makes sense because if the PostRA MachineScheduler is run, there is potential that instructions scheduled back to back are re-scheduled. Reviewers: evandro, t.p.northover, joelkevinjones, joel_k_jones, steleman Reviewed By: joelkevinjones Subscribers: joel_k_jones, aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34958 llvm-svn: 307842	2017-07-12 20:53:22 +00:00
Justin Bogner	4fc696635d	GlobalISel: Handle selection of G_IMPLICIT_DEF in AArch64 A generic variant of IMPLICIT_DEF was added in r306875, but this survives to selection and hits a `Cannot Select`. Add handling that converts the note to a regular IMPLICIT_DEF. llvm-svn: 307817	2017-07-12 17:32:32 +00:00
Rafael Espindola	1beb702ba2	Fully fix the movw/movt addend. The issue is not if the value is pcrel. It is whether we have a relocation or not. If we have a relocation, the static linker will select the upper bits. If we don't have a relocation, we have to do it. llvm-svn: 307730	2017-07-11 23:18:25 +00:00
Florian Hahn	93cf9b4f91	[AArch64] Remove unused IsDarwin & IsNotDarwin predicates (NFCI). Reviewers: t.p.northover, rengolin Reviewed By: t.p.northover Subscribers: aemerson, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D35266 llvm-svn: 307706	2017-07-11 20:56:24 +00:00
Daniel Sanders	fe12c0fa56	[globalisel][tablegen] Correct matching of intrinsic ID's. TreePatternNode considers them to be plain integers but MachineInstr considers them to be a distinct kind of operand. The tweak to AArch64InstrInfo.td to produce a simple test case is a NFC for everything except GlobalISelEmitter (confirmed by diffing the tablegenerated files). GlobalISelEmitter is currently unable to infer the type of operands in the Dst pattern from the operands in the Src pattern. llvm-svn: 307634	2017-07-11 08:57:29 +00:00
Joel Jones	7466ccfc59	Doxygen formatting. NFCI llvm-svn: 307597	2017-07-10 22:11:50 +00:00
Simon Pilgrim	9e90152363	[AArch64] Fix -Wimplicit-fallthrough warnings. NFCI. Add breaks; this doesn't affect results, as GPR/FPU both check for 32/64 bit sizes, so they will still default to GenericOps in the same way. llvm-svn: 307484	2017-07-08 19:28:24 +00:00
Simon Pilgrim	cb07d67a5c	Fix some more -Wimplicit-fallthrough warnings. NFCI. llvm-svn: 307411	2017-07-07 16:40:06 +00:00
Simon Pilgrim	8b4dc53326	[AArch64] Fix -Wimplicit-fallthrough warnings. NFCI. llvm-svn: 307393	2017-07-07 13:03:28 +00:00
Florian Hahn	d4550baf3b	[AArch64] Use 16 bytes as preferred function alignment on Cortex-A57. Summary: This change gives a 0.89% speedup in execution time, a 0.94% improvement in benchmark scores and a 0.62% increase in binary size on a Cortex-A57. These numbers are the geomean results on a wide range of benchmarks from the test-suite, SPEC2000, SPEC2006 and a range of proprietary suites. The software optimization guide for the Cortex-A57 recommends 16 byte branch alignment. Reviewers: t.p.northover, mcrosier, javed.absar, kristof.beyls, sbaranga Reviewed By: kristof.beyls Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D34954 llvm-svn: 307389	2017-07-07 10:43:01 +00:00
Florian Hahn	e3666ec9d6	[AArch64] Use 16 bytes as preferred function alignment on Cortex-A72. Summary: This change gives a 0.34% speedup in execution time, a 0.61% improvement in benchmark scores and a 0.57% increase in binary size on a Cortex-A72. These numbers are the geomean results on a wide range of benchmarks from the test-suite, SPEC2000, SPEC2006 and a range of proprietary suites. The software optimization guide for the Cortex-A72 recommends 16 byte branch alignment. Reviewers: t.p.northover, kristof.beyls, rengolin, sbaranga, mcrosier, javed.absar Reviewed By: kristof.beyls Subscribers: llvm-commits, aemerson Differential Revision: https://reviews.llvm.org/D34961 llvm-svn: 307380	2017-07-07 10:15:49 +00:00
Matthias Braun	1b54aa5879	LiveRegUnits: Rename accumulateBackward()->accumulate() Contrary to the stepForward()/stepBackward() method accumulate() doesn't have a direction as defs, uses and clobbers all have the same effect. Also improve the documentation comment. llvm-svn: 307351	2017-07-07 03:02:17 +00:00
Martin Storsjo	68d0fcd7aa	[COFF, AArch64] Set the private label prefix to .L This fixes calls to external functions starting with a capital L, fixing errors like this: fatal error: error in backend: assembler label 'LocalFree' can not be undefined Differential Revision: https://reviews.llvm.org/D35079 llvm-svn: 307317	2017-07-06 21:08:34 +00:00
Aditya Nandakumar	1745121a45	[GISel]: Enhance the MachineIRBuilder API Allows the MachineIRBuilder APIs to directly create registers (based on LLT or TargetRegisterClass) as well as accept MachineInstrBuilders and implicitly converts to register(with getOperand(0).getReg()). Eg usage: LLT s32 = LLT::scalar(32); auto C32 = Builder.buildConstant(s32, 32); auto Tmp = Builder.buildInstr(TargetOpcode::G_SUB, s32, C32, OtherReg); auto Tmp2 = Builder.buildInstr(Opcode, DstReg, Builder.buildConstant(s32, 31)); .... Only a few methods added for now. Reviewed by Tim llvm-svn: 307302	2017-07-06 19:40:07 +00:00
Joel Jones	aff09bf052	Doxygen formatting. NFCI llvm-svn: 307263	2017-07-06 14:17:36 +00:00
Daniel Sanders	6ab0daade8	[globalisel][tablegen] Partially fix compile-time regressions by converting matcher to state-machine(s) Summary: Replace the matcher if-statements for each rule with a state-machine. This significantly reduces compile time, memory allocations, and cumulative memory allocation when compiling AArch64InstructionSelector.cpp.o after r303259 is recommitted. The following patches will expand on this further to fully fix the regressions. Reviewers: rovka, ab, t.p.northover, qcolombet, aditya_nandakumar Reviewed By: ab Subscribers: vitalybuka, aemerson, javed.absar, igorb, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D33758 llvm-svn: 307079	2017-07-04 14:35:06 +00:00
Hiroshi Inoue	ddb34d84c9	fix trivial typos in comments; NFC llvm-svn: 307004	2017-07-03 06:32:59 +00:00
Rafael Espindola	76287ab3a0	Rename and adjust processFixupValue. It was not processing any value. All that it ever did was force relocations, so name it shouldForceRelocation. llvm-svn: 306906	2017-06-30 22:47:27 +00:00
Tim Northover	ff5e7e1295	GlobalISel: add G_IMPLICIT_DEF instruction. It looks like there are two target-independent but not GISel instructions that need legalization, IMPLICIT_DEF and PHI. These are already anomalies since their operands have important LLTs attached, so to make things more uniform it seems like a good idea to add generic variants. Starting with G_IMPLICIT_DEF. llvm-svn: 306875	2017-06-30 20:27:36 +00:00
Eric Christopher	b4fb256574	Make 0 argument getSubtargetImpl functions for the X86, AArch64, and PPC targets deleted so that no one is tempted to use them. llvm-svn: 306864	2017-06-30 19:49:05 +00:00
Chad Rosier	4c1bc656d0	[AArch64] Silence an unused variable warning in Release builds. NFC. llvm-svn: 306738	2017-06-29 20:43:35 +00:00
Mandeep Singh Grang	6f61e237cc	[AArch64] Make assert messages uniform and general [NFC] Summary: Make assert messages related to Darwin, ELF and COFF uniform. Reviewers: rnk, ruiu, compnerd, t.p.northover Reviewed By: t.p.northover Subscribers: t.p.northover, aemerson, rengolin, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D34730 llvm-svn: 306589	2017-06-28 19:37:38 +00:00
Geoff Berry	0abd980680	[AArch64][Falkor] Attempt to fix Windows buildbots llvm-svn: 306588	2017-06-28 19:36:10 +00:00
Geoff Berry	378374d457	[AArch64][Falkor] Try to avoid exhausting HW prefetcher resources when unrolling. Reviewers: t.p.northover, mcrosier Subscribers: aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34533 llvm-svn: 306584	2017-06-28 18:53:09 +00:00
Geoff Berry	66d9bdbca8	[LoopUnroll] Pass SCEV to getUnrollingPreferences hook. NFCI. Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper Subscribers: jholewinski, arsenm, mzolotukhin, nemanjai, nhaehnle, javed.absar, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D34531 llvm-svn: 306554	2017-06-28 15:53:17 +00:00
Alexandros Lamprineas	c0432d86aa	[AArch64] AArch64CondBrTuningPass generates wrong branch instructions Some conditional branch instructions generated by this pass are checking the wrong condition code. The instructions TBZ and TBNZ are transformed into B.GE and B.LT instead of B.PL and B.MI respectively. They should only be checking the Negative bit. Differential Revision: https://reviews.llvm.org/D34743 llvm-svn: 306550	2017-06-28 15:09:11 +00:00
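The difference between the condition codes named in commit c0432d86aa shows up only when the subtraction overflows. The Python sketch below models a 32-bit SUBS and the standard AArch64 meanings of PL (N == 0) and GE (N == V); the function names are hypothetical and the code only illustrates the flag arithmetic, not the pass itself.

```python
MASK32 = 0xFFFFFFFF


def subs32(a, b):
    """Model a 32-bit SUBS: return (result, (N, Z, C, V))."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    n = result >> 31                       # sign bit of the result
    z = int(result == 0)
    c = int(a >= b)                        # carry set means no borrow
    v = ((a ^ b) & (a ^ result)) >> 31     # signed overflow
    return result, (n, z, c, v)


def tbz_bit31(value):
    """TBZ on bit 31 branches when the sign bit is clear."""
    return (value >> 31) & 1 == 0


def cond_pl(flags):
    """B.PL branches when N == 0."""
    return flags[0] == 0


def cond_ge(flags):
    """B.GE branches when N == V."""
    return flags[0] == flags[3]


if __name__ == "__main__":
    # INT_MIN - 1 overflows: the result is positive but V is set.
    result, flags = subs32(0x80000000, 1)
    print(tbz_bit31(result), cond_pl(flags), cond_ge(flags))
```

For INT_MIN minus 1, the original TBZ and B.PL both branch while B.GE does not, which matches the commit's statement that only the Negative bit should be checked.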
Rafael Espindola	9a450d9b29	Don't repeat name in comments. 80 columns. NFC. llvm-svn: 306548	2017-06-28 14:59:30 +00:00
Mandeep Singh Grang	0c72172e32	[COFF, ARM64] Add support for Windows ARM64 COFF format Summary: This is the llvm part of the initial implementation to support Windows ARM64 COFF format. I will gradually add more functionality in subsequent patches. Reviewers: ruiu, rnk, t.p.northover, compnerd Reviewed By: ruiu, compnerd Subscribers: aemerson, mgorny, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D34705 llvm-svn: 306490	2017-06-27 23:58:19 +00:00
Florian Hahn	2665febb54	[AArch64] Inline callee if its target-features are a subset of the caller Summary: Similar to X86, it should be safe to inline callees if their target-features are a subset of the caller. This change matches GCC's inlining behavior with respect to attributes [1]. [1] https://gcc.gnu.org/onlinedocs/gcc/AArch64-Function-Attributes.html#AArch64-Function-Attributes Reviewers: kristof.beyls, javed.absar, rengolin, t.p.northover Reviewed By: t.p.northover Subscribers: aemerson, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D34698 llvm-svn: 306478	2017-06-27 22:27:32 +00:00
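The inlining rule from commit 2665febb54 amounts to a subset test on target features. The following is a minimal Python sketch of that idea, using plain sets and a hypothetical `can_inline` helper; the real check lives in LLVM's C++ target transform code and operates on feature bitsets, so this stands in for the logic only.

```python
def can_inline(caller_features, callee_features):
    """Return True when every callee feature is also enabled in the caller."""
    return set(callee_features) <= set(caller_features)


if __name__ == "__main__":
    # Feature names are used purely as illustrative labels.
    print(can_inline({"neon", "crc", "lse"}, {"neon"}))   # callee needs less
    print(can_inline({"neon"}, {"neon", "sve"}))          # callee needs more
```

The subset direction matters: inlining a callee that requires a feature the caller lacks could place unsupported instructions into the caller's body.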
Rafael Espindola	650c96e0a7	clang-format a file. It had a few inconsistent indentations that made a followup patch hard to read. llvm-svn: 306474	2017-06-27 22:14:20 +00:00
Joel Jones	aea1c356e6	[AArch64] Performance enhancements for Cavium ThunderX2 T99 This patch enables significant performance enhancements to the Cavium ThunderX2T99 LLVM backend, as observed by running SPEC2K6, by adding more detailed scheduling information. Related Bugzilla bug: http://bugs.llvm.org/show_bug.cgi?id=32562 Patch by: steleman Differential Revision: https://reviews.llvm.org/D31801 llvm-svn: 306462	2017-06-27 20:44:55 +00:00
Matthew Simpson	0bd79f416a	[AArch64] Update successor probabilities after ccmp-conversion This patch modifies the conditional compares pass so that it keeps successor probabilities up-to-date after the conversion. Previously, successor probabilities were being normalized to a uniform distribution, even though they may have been heavily biased prior to the conversion (e.g., if one of the edges was the back edge of a loop). This loss of information affected passes later in the pipeline. Differential Revision: https://reviews.llvm.org/D34109 llvm-svn: 306412	2017-06-27 15:00:22 +00:00
Daniel Sanders	cc36dbf55d	[globalisel][tablegen] Add support for EXTRACT_SUBREG. Summary: After this patch, we finally have test cases that require multiple instruction emission. Depends on D33590 Reviewers: ab, qcolombet, t.p.northover, rovka, kristof.beyls Subscribers: javed.absar, llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D33596 llvm-svn: 306388	2017-06-27 10:11:39 +00:00
Dehao Chen	38f1bc7834	Fix the bug when handling shufflevector for aarch64. Summary: This Fixes https://bugs.llvm.org/show_bug.cgi?id=33600 Reviewers: mssimpso, davidxl, Carrot Reviewed By: mssimpso Subscribers: aemerson, rengolin, sanjoy, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D34641 llvm-svn: 306334	2017-06-26 21:33:51 +00:00
Tim Northover	c2d5e6d637	AArch64: legalize G_EXTRACT operations. This is the dual problem to legalizing G_INSERTs so most of the code and testing was cribbed from there. llvm-svn: 306328	2017-06-26 20:34:13 +00:00
Tim Northover	9ac3e42211	AArch64: remove all kill flags when extending register liveness. When we forward a stored value to a load and eliminate it entirely we need to make sure the liveness of the register is maintained all the way to its use. Previously we only cleared liveness on the store doing the forwarding, but there could be other killing uses in between. We already do the right thing when the load has to be converted into something else, it was just this one path that skipped it. llvm-svn: 306318	2017-06-26 18:49:25 +00:00
Hiroshi Inoue	a85d24b73d	fix trivial typos in comment, NFC llvm-svn: 306211	2017-06-24 16:00:26 +00:00
Rafael Espindola	6418856127	Simplify the processFixupValue interface. NFC. llvm-svn: 306202	2017-06-24 05:22:28 +00:00
Rafael Espindola	f351292141	Remove redundant argument. llvm-svn: 306189	2017-06-24 00:26:57 +00:00
Rafael Espindola	801b42de31	ARM: move some logic from processFixupValue to applyFixup. processFixupValue is called on every relaxation iteration. applyFixup is only called once at the very end. applyFixup is then the correct place to do last minute changes and value checks. While here, do proper range checks again for fixup_arm_thumb_bl. We used to do it, but dropped because of thumb2. We now do it again, but use the thumb2 range. llvm-svn: 306177	2017-06-23 22:52:36 +00:00
Geoff Berry	dd239718bd	[AArch64][Falkor] Remove some non-existent opcodes from sched detail regexes. NFC. llvm-svn: 306170	2017-06-23 21:59:09 +00:00
Chad Rosier	6db9ff64a8	[AArch64] Prefer Bcc to CBZ/CBNZ/TBZ/TBNZ when NZCV flags can be set for "free". This patch contains a pass that transforms CBZ/CBNZ/TBZ/TBNZ instructions into a conditional branch (Bcc), when the NZCV flags can be set for "free". This is preferred on targets that have more flexibility when scheduling Bcc instructions as compared to CBZ/CBNZ/TBZ/TBNZ (assuming all other variables are equal). This can reduce register pressure and is also the default behavior for GCC. A few examples: add w8, w0, w1 -> cmn w0, w1 ; CMN is an alias of ADDS. cbz w8, .LBB_2 -> b.eq .LBB0_2 ; single def/use of w8 removed. add w8, w0, w1 -> adds w8, w0, w1 ; w8 has multiple uses. cbz w8, .LBB1_2 -> b.eq .LBB1_2 sub w8, w0, w1 -> subs w8, w0, w1 ; w8 has multiple uses. tbz w8, #31, .LBB6_2 -> b.ge .LBB6_2 In looking at all current sub-target machine descriptions, this transformation appears to be either positive or neutral. Differential Revision: https://reviews.llvm.org/D34220. llvm-svn: 306144	2017-06-23 19:20:12 +00:00
Tim Northover	4b4eec7009	GlobalISel: remove G_SEQUENCE instruction. It was trying to do too many things. The basic lumping together of values for legalization purposes is now handled by G_MERGE_VALUES. More complex things involving gaps and odd sizes are handled by G_INSERT sequences. llvm-svn: 306120	2017-06-23 16:15:55 +00:00
Rafael Espindola	88d9e37ec8	Use a MutableArrayRef. NFC. llvm-svn: 305968	2017-06-21 23:06:53 +00:00
Christof Douma	c1c28051d2	[AARCH64][LSE] Preliminary support for ARMv8.1 LSE Atomics. Implemented support to AArch64 codegen for ARMv8.1 Large System Extensions atomic instructions. Where supported, these instructions can provide atomic operations with higher performance. Currently supported operations include: fetch_add, fetch_or, fetch_xor, fetch_smin, fetch_min/max (signed and unsigned), swap, and compare_exchange. This implementation implies sequential-consistency ordering, more relaxed ordering is under development. Subtarget->hasLSE is currently supported for Cavium ThunderX2T99. Patch by Ananth Jasty. Differential Revision: https://reviews.llvm.org/D33586 Change-Id: I82f6d3d64255622791ceb0715b7ab9f4dc4d4b2c llvm-svn: 305893	2017-06-21 10:58:31 +00:00
Florian Hahn	8552e591a1	[AArch64] Add early exit to promoteLoadFromStore. There should be at most a single kill flag for the promoted operand between the store/load pair. Discussed in https://reviews.llvm.org/D34402. llvm-svn: 305889	2017-06-21 09:51:52 +00:00
Florian Hahn	80e485179e	[AArch64] Preserve register flags when promoting a load from store. Summary: This patch updates promoteLoadFromStore to use the store MachineOperand as the source operand of the new instruction instead of creating a new register MachineOperand. This way, the existing register flags are preserved. This fixes PR33468 (https://bugs.llvm.org/show_bug.cgi?id=33468). Reviewers: MatzeB, t.p.northover, junbuml Reviewed By: MatzeB Subscribers: aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34402 llvm-svn: 305885	2017-06-21 08:47:23 +00:00
Rafael Espindola	3ac4c09daf	clang-format a region. It will make a followup patch easier to read. llvm-svn: 305865	2017-06-20 22:53:29 +00:00
Geoff Berry	5e46600e3a	[AArch64][Falkor] Fix MOVZ sched predicate to not assert on non-imm operands (e.g. blockaddress). llvm-svn: 305752	2017-06-19 21:57:44 +00:00
Geoff Berry	e9972cabbd	[AArch64][Kryo] Add missing write latency for LDAXP, LDXP second destination. Fixes PR33491 and PR33512. llvm-svn: 305751	2017-06-19 21:57:42 +00:00
Geoff Berry	3cc4b9f780	[AArch64][Falkor] Refine load/store increment latencies. Also fix LDXP & LDAXP write latency to avoid similar assert as PR33491 and PR33512. llvm-svn: 305750	2017-06-19 21:56:21 +00:00
Florian Hahn	fd44ca6c76	[AArch64] Fix order of checks in shouldScheduleAdjacent. We need to check the opcode of FirstMI before accessing the operands. This caused a buildbot failure during bootstrapping on AArch64. llvm-svn: 305694	2017-06-19 13:45:41 +00:00
Florian Hahn	5f746c8e27	Recommit rL305677: [CodeGen] Add generic MacroFusion pass Use llvm::make_unique to avoid ambiguity with MSVC. This patch adds a generic MacroFusion pass, that is used on X86 and AArch64, which both define target-specific shouldScheduleAdjacent functions. This generic pass should make it easier for other targets to implement macro fusion and I intend to add macro fusion for ARM shortly. Differential Revision: https://reviews.llvm.org/D34144 llvm-svn: 305690	2017-06-19 12:53:31 +00:00
Florian Hahn	e16d3106f3	Revert r305677 [CodeGen] Add generic MacroFusion pass. This causes Windows buildbot failures due to an ambiguous call. llvm-svn: 305681	2017-06-19 11:26:15 +00:00
Florian Hahn	ee1b096f8a	[CodeGen] Add generic MacroFusion pass. Summary: This patch adds a generic MacroFusion pass, that is used on X86 and AArch64, which both define target-specific shouldScheduleAdjacent functions. This generic pass should make it easier for other targets to implement macro fusion and I intend to add macro fusion for ARM shortly. Reviewers: craig.topper, evandro, t.p.northover, atrick, MatzeB Reviewed By: MatzeB Subscribers: atrick, aemerson, mgorny, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34144 llvm-svn: 305677	2017-06-19 10:51:38 +00:00
Nirav Dave	85e92223b4	[AArch64] Add indexed check to splitStores. NFC. Add explicit check for unhandled cases in preparation for delaying splitStores to post-legalization. llvm-svn: 305471	2017-06-15 14:47:44 +00:00
Florian Hahn	0a26d2c298	[AArch64] Enable FeatureFuseAES for the generic processor model. Summary: Scheduling AESE/AESMC and AESD/AESIMC instruction pairs back-to-back gives a double digit speedup on benchmarks using those instructions on Cortex-A processors. In GCC, this optimization is part of the generic processor model as well. This change should not have a major performance impact on processors that do not optimize AES instruction pairs, although I only had access to Cortex-A processors for benchmarking. Reviewers: rengolin, kristof.beyls, javed.absar, evandro, silviu.baranga, MatzeB, mcrosier, joelkevinjones, joel_k_jones, bmakam, t.p.northover Reviewed By: evandro Subscribers: sbaranga, aemerson, llvm-commits Differential Revision: https://reviews.llvm.org/D33836 llvm-svn: 305457	2017-06-15 09:31:23 +00:00
Geoff Berry	13d5dcb093	[AArch64][Falkor] Fix sched details for FDIV, FSQRT, SDIV, UDIV llvm-svn: 305310	2017-06-13 17:43:39 +00:00
Tim Northover	7a61316e89	AArch64: don't try to emit an add (shifted reg) for SP. The "Add/sub (shifted reg)" instructions use the 31 encoding for xzr and wzr rather than the SP, so we need to use different variants. Situations where this actually comes up are rare enough (see test-case) that I think falling back to DAG is fine. llvm-svn: 305230	2017-06-12 20:49:53 +00:00
Haicheng Wu	ef790ffd56	[Falkor] Enable SW Prefetch. SW prefetch is good for Falkor. Differential Revision: http://reviews.llvm.org/D34084 llvm-svn: 305199	2017-06-12 16:34:19 +00:00
Daniel Neilson	c0112ae8da	Const correctness for TTI::getRegisterBitWidth Summary: The method TargetTransformInfo::getRegisterBitWidth() is declared const, but the type erasing implementation classes (TargetTransformInfo::Concept & TargetTransformInfo::Model) that were introduced by Chandler in https://reviews.llvm.org/D7293 do not have the method declared const. This is an NFC to tidy up the const consistency between TTI and its implementation. Reviewers: chandlerc, rnk, reames Reviewed By: reames Subscribers: reames, jfb, arsenm, dschuff, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, llvm-commits Differential Revision: https://reviews.llvm.org/D33903 llvm-svn: 305189	2017-06-12 14:22:21 +00:00
I-Jui (Ray) Sung	21fde385fa	[AArch64] Add fallback in FastISel fp16 conversions Summary: - Fix assertion failures on F16 to/from int types in FastISel by falling back to regular ISel - Add a testcase of various conversion cases with FastISel (-O0) Reviewers: kristof.beyls, jmolloy, SjoerdMeijer Reviewed By: SjoerdMeijer Subscribers: SjoerdMeijer, llvm-commits, srhines, pirama, aemerson, rengolin, javed.absar, kristof.beyls Differential Revision: https://reviews.llvm.org/D33734 llvm-svn: 305127	2017-06-09 22:40:50 +00:00
Zachary Turner	264b5d9e88	Move Object format code to lib/BinaryFormat. This creates a new library called BinaryFormat that has all of the headers from llvm/Support containing structure and layout definitions for various types of binary formats like dwarf, coff, elf, etc as well as the code for identifying a file from its magic. Differential Revision: https://reviews.llvm.org/D33843 llvm-svn: 304864	2017-06-07 03:48:56 +00:00
Chandler Carruth	6bda14b313	Sort the remaining #include lines in include/... and lib/.... I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is entirely mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787	2017-06-06 11:49:48 +00:00
Mandeep Singh Grang	5e1697ef28	[llvm] Remove double semicolons Reviewers: craig.topper, arsenm, mehdi_amini Reviewed By: mehdi_amini Subscribers: mehdi_amini, wdng, nhaehnle, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33924 llvm-svn: 304767	2017-06-06 05:08:36 +00:00
Geoff Berry	57d8a417e7	[AArch64][Falkor] Model immediate forwarding. llvm-svn: 304552	2017-06-02 14:27:41 +00:00
Eugene Zelenko	7ea692373c	[CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 304495	2017-06-01 23:25:02 +00:00
Florian Hahn	ff25b6d8f6	[AArch64] Enable FeatureFuseAES on Cortex-A53. It improves performance on Cortex-A53. llvm-svn: 304307	2017-05-31 15:50:03 +00:00
Florian Hahn	064a2f9222	[AArch64] Enable FeatureFuseAES on Cortex-A73. It improves performance on Cortex-A73. llvm-svn: 304304	2017-05-31 15:25:25 +00:00
Abderrazek Zaafrani	855411566b	Add latency info for Exynos interleaved Load/Store instructions. llvm-svn: 304259	2017-05-31 00:20:55 +00:00
Matthias Braun	5e394c3d6f	TargetPassConfig: Keep a reference to an LLVMTargetMachine; NFC TargetPassConfig is not useful for targets that do not use the CodeGen library, so we may just as well store a pointer to an LLVMTargetMachine instead of just to a TargetMachine. While at it, also change the constructor to take a reference instead of a pointer as the TM must not be nullptr. llvm-svn: 304247	2017-05-30 21:36:41 +00:00
Craig Topper	f6d4dc5b4a	[SelectionDAG] Set ISD::FPOWI to Expand by default Summary: Currently FPOWI defaults to Legal and LegalizeDAG.cpp turns Legal into Expand for this opcode because Legal is a "lie". This patch changes the default for this opcode to Expand and removes the hack from LegalizeDAG.cpp. It also removes all the code in the targets that set this opcode to Expand themselves since they can just rely on the default. Reviewers: spatel, RKSimon, efriedma Reviewed By: RKSimon Subscribers: jfb, dschuff, sbc100, jgravelle-google, nemanjai, javed.absar, andrew.w.kaylor, llvm-commits Differential Revision: https://reviews.llvm.org/D33530 llvm-svn: 304215	2017-05-30 15:27:55 +00:00
Kristof Beyls	2af1e90eb2	Fix PR33031: correct the estimate of maximum offset for instructions spilling/filling the stack. llvm-svn: 304196	2017-05-30 06:58:41 +00:00
Geoff Berry	2739ebafb6	[AArch64][Falkor] Combine sched details files into one. NFC. llvm-svn: 304109	2017-05-28 22:20:44 +00:00
Geoff Berry	b542fb3817	[AArch64][Falkor] Fix some sched details. - Remove all uses of base sched model entries and set them all to Unsupported so all the opcodes are described in AArch64SchedFalkorDetails.td. - Remove entries for unsupported half-float opcodes. - Remove entries for unsupported LSE extension opcodes. - Add entry for MOVbaseTLS (and set Sched in base td file entry to WriteSys) and a few other pseudo ops. - Fix a few FP load/store with reg offset entries to use the LSLfast predicates. - Add Q size BIF/BIT/BSL entries. - Fix swapped Q/D sized CLS/CLZ/CNT/RBIT entries. - Fix pre/post increment address register latency (this operand is always dest 0). - Fix swapped FCVTHD/FCVTHS/FCVTDH/FCVTDS entries. - Fix XYZ resource over usage on LD[1-4] opcodes. llvm-svn: 304108	2017-05-28 21:48:31 +00:00
Matthias Braun	88c8c9847d	AArch64/PEI: Do not add reserved regs to liveins We do not track liveness for reserved registers. It is unnecessary to add them to block livein lists. llvm-svn: 304059	2017-05-27 03:38:02 +00:00
Quentin Colombet	7a43eddf28	[AArch64][GlobalISel] Add the Localizer pass for the O0 pipeline This should fix most of the issue we have right now with constants being spilled all over the place. llvm-svn: 304052	2017-05-27 01:34:07 +00:00
Matthias Braun	b4f74224ff	AArch64: Fix cmpxchg O0 expansion - Rewrite livein calculation to use the computeLiveIns() helper function. This is slightly less efficient but easier to reason about and doesn't unnecessarily add pristine and reserved registers[1] - Zero the status register at the beginning of the loop to make sure it has a defined value. - Remove kill flags of values that need to stay alive throughout the loop. [1] An upcoming commit of mine will tighten the MachineVerifier to catch these. llvm-svn: 304048	2017-05-26 23:48:59 +00:00
Matthias Braun	ac4307c41e	LivePhysRegs: Rework constructor + documentation; NFC - Take reference instead of pointer to a TRI that cannot be nullptr. - Improve documentation comments. llvm-svn: 304038	2017-05-26 21:51:00 +00:00
Nirav Dave	6ff50bf242	Fix signedness of constant. NFC. llvm-svn: 303980	2017-05-26 12:53:10 +00:00
Manoj Gupta	d536180fdc	[AArch64]: add 'a' inline asm operand modifier. Summary: This is used in the Linux kernel, and effectively just means "print an address". This brings back r193593. Reviewed by: Renato Golin Reviewers: t.p.northover, rengolin, richard.barton.arm, kristof.beyls Subscribers: aemerson, javed.absar, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D33558 llvm-svn: 303901	2017-05-25 19:07:57 +00:00
Nirav Dave	bb20b5d5c3	[AArch64] Prevent nested ADDs from address calc in splitStoreSplat. NFC In preparation for late-stage store merging. llvm-svn: 303800	2017-05-24 19:55:49 +00:00
Matthew Simpson	6349380fa4	Revert r291254: [AArch64] Reduce vector insert/extract cost for Falkor The default vector insert/extract cost is more profitable on Falkor than the reduced cost. llvm-svn: 303771	2017-05-24 16:48:39 +00:00
Geoff Berry	d6ac96f953	[AArch64][Falkor] Refine sched details for LSLfast/ASRfast. llvm-svn: 303682	2017-05-23 19:57:45 +00:00
Geoff Berry	e6366f505f	[AArch64][Falkor] Fix sched details for FMOV of WZR/XZR. llvm-svn: 303680	2017-05-23 19:54:28 +00:00
Florian Hahn	abb4218b98	[AArch64] Make instruction fusion more aggressive. Summary: This patch makes instruction fusion more aggressive by * adding artificial edges between the successors of FirstSU and SecondSU, similar to BaseMemOpClusterMutation::clusterNeighboringMemOps. * updating PostGenericScheduler::tryCandidate to keep clusters together, similar to GenericScheduler::tryCandidate. This change increases the number of AES instruction pairs generated on Cortex-A57 and Cortex-A72. This doesn't change code at all in most benchmarks or general code, but we've seen improvement on kernels using AESE/AESMC and AESD/AESIMC. Reviewers: evandro, kristof.beyls, t.p.northover, silviu.baranga, atrick, rengolin, MatzeB Reviewed By: evandro Subscribers: aemerson, rengolin, MatzeB, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33230 llvm-svn: 303618	2017-05-23 09:33:34 +00:00
Akira Hatanaka	e8ae3346a3	[AArch64] Fix PR33100. This commit fixes a bug introduced in r301019 where optimizeLogicalImm would replace a logical node's immediate operand that was CSE'd and was also an operand of another node. This commit fixes the bug by replacing the logical node instead of its immediate operand. rdar://problem/32295276 llvm-svn: 303607	2017-05-23 06:08:37 +00:00
Daniel Sanders	a1b2db7919	[globalisel][tablegen] Demote OptForSize/OptForMinSize/ForCodeSize to per-function predicates. Summary: This causes them to be re-computed more often than necessary but resolves objections that were raised post-commit on r301750. Reviewers: qcolombet, ab, t.p.northover, rovka, kristof.beyls Reviewed By: qcolombet Subscribers: igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D32861 llvm-svn: 303418	2017-05-19 11:08:33 +00:00
Francis Visoiu Mistrih	8b61764cbb	[LegacyPassManager] Remove TargetMachine constructors This provides a new way to access the TargetMachine through TargetPassConfig, as a dependency. The patterns replaced here are: * Passes handling a null TargetMachine call `getAnalysisIfAvailable<TargetPassConfig>`. * Passes not handling a null TargetMachine `addRequired<TargetPassConfig>` and call `getAnalysis<TargetPassConfig>`. * MachineFunctionPasses now use MF.getTarget(). * Remove all the TargetMachine constructors. * Remove INITIALIZE_TM_PASS. This fixes a crash when running `llc -start-before prologepilog`. PEI needs StackProtector, which gets constructed without a TargetMachine by the pass manager. The StackProtector pass doesn't handle the case where there is no TargetMachine, so it segfaults. Related to PR30324. Differential Revision: https://reviews.llvm.org/D33222 llvm-svn: 303360	2017-05-18 17:21:13 +00:00
Francis Visoiu Mistrih	b52e036600	BitVector: add iterators for set bits Differential revision: https://reviews.llvm.org/D32060 llvm-svn: 303227	2017-05-17 01:07:53 +00:00
Amara Emerson	c9916d7e97	Re-commit r302678, fixing PR33053. The issue was that the AArch64 TTI hook allowed unpacked integer cmp reductions which didn't have a lowering. llvm-svn: 303211	2017-05-16 21:29:22 +00:00
Chad Rosier	8b12a03215	Fix an improperly placed curly bracket. NFC. llvm-svn: 303165	2017-05-16 12:43:23 +00:00
Tim Northover	203c6f055d	AArch64: use linker-private symbols for globals in MachO. We don't use section-relative relocations on AArch64, so all symbols must be at least visible to the linker (i.e. properly global or l_whatever, but not L_whatever). llvm-svn: 303118	2017-05-15 21:51:38 +00:00
Adam Nemet	e29686e5c1	[SLP] Enable 64-bit wide vectorization on AArch64 ARM Neon has native support for half-sized vector registers (64 bits). This is beneficial for example for 2D and 3D graphics. This patch adds the option to lower MinVecRegSize from 128 via a TTI in the SLP Vectorizer. * Performance Analysis This change was motivated by some internal benchmarks but it is also beneficial on SPEC and the LLVM testsuite. The results are with -O3 and PGO. A negative percentage is an improvement. The testsuite was run with a sample size of 4. SPEC * CFP2006/482.sphinx3 -3.34% A pretty hot loop is SLP vectorized resulting in nice instruction reduction. This used to be a +22% regression before rL299482. * CFP2000/177.mesa -3.34% * CINT2000/256.bzip2 +6.97% My current plan is to extend the fix in rL299482 to i16 which brings the regression down to +2.5%. There are also other problems with the codegen in this loop so there is further room for improvement. ** LLVM testsuite * SingleSource/Benchmarks/Misc/ReedSolomon -10.75% There are multiple small SLP vectorizations outside the hot code. It's a bit surprising that it adds up to 10%. Some of this may be code-layout noise. * MultiSource/Benchmarks/VersaBench/beamformer/beamformer -8.40% The opt-viewer screenshot can be seen at F3218284. We start at a colder store but the tree leads us into the hottest loop. * MultiSource/Applications/lambda-0.1.3/lambda -2.68% * MultiSource/Benchmarks/Bullet/bullet -2.18% This is using 3D vectors. * SingleSource/Benchmarks/Shootout-C++/Shootout-C++-lists +6.67% Noise, binary is unchanged. * MultiSource/Benchmarks/Ptrdist/anagram/anagram +4.90% There is an additional SLP in the cold code. The test runs for ~1sec and prints out over 2000 lines. This is most likely noise. * MultiSource/Applications/aha/aha +1.63% * MultiSource/Applications/JM/lencod/lencod +1.41% * SingleSource/Benchmarks/Misc/richards_benchmark +1.15% Differential Revision: https://reviews.llvm.org/D31965 llvm-svn: 303116	2017-05-15 21:15:01 +00:00
Hans Wennborg	bd6e9e77a7	Revert r302678 "[AArch64] Enable use of reduction intrinsics." This caused PR33053. Original commit message: > The new experimental reduction intrinsics can now be used, so I'm enabling this > for AArch64. We will need this for SVE anyway, so it makes sense to do this for > NEON reductions as well. > > The existing code to match shufflevector patterns are replaced with a direct > lowering of the reductions to AArch64-specific nodes. Tests updated with the > new, simpler, representation. > > Differential Revision: https://reviews.llvm.org/D32247 llvm-svn: 303115	2017-05-15 20:59:32 +00:00
Tim Northover	8b96c7e9b5	AArch64: diagnose unrecognized features in .cpu directive. We were silently ignoring any features we couldn't match up, which led to errors in an inline asm block missing the conventional "\n\t". llvm-svn: 303108	2017-05-15 19:42:15 +00:00
Geoff Berry	e369653bf3	[AArch64][Falkor] Fix sched details for FMOV llvm-svn: 303099	2017-05-15 18:50:22 +00:00
Florian Hahn	af91e7e6d2	[AArch64] Enable FeatureFuseAES on Cortex-A72. This patch enables fusing dependent AESE/AESMC and AESD/AESIMC instruction pairs on Cortex-A72, as recommended in the Software Optimization Guide, section 4.10. llvm-svn: 303073	2017-05-15 15:15:22 +00:00
Geoff Berry	ddbbf6416c	[AArch64][Falkor] Refine modeling of multiply accumulate forwarding. llvm-svn: 302933	2017-05-12 18:57:10 +00:00
Chad Rosier	aeffffdb44	[AArch64][MachineCombine] Fold FNMUL+FSUB -> FNMADD. Differential Revision: http://reviews.llvm.org/D33101. llvm-svn: 302822	2017-05-11 20:07:24 +00:00
Quentin Colombet	307e29124c	[AArch64][RegisterBankInfo] Change the default mapping of fp stores. For stores, check if the stored value is defined by a floating point instruction and if yes, we return a default mapping with FPR instead of GPR. llvm-svn: 302679	2017-05-10 15:19:41 +00:00
Amara Emerson	816542ceb3	[AArch64] Enable use of reduction intrinsics. The new experimental reduction intrinsics can now be used, so I'm enabling this for AArch64. We will need this for SVE anyway, so it makes sense to do this for NEON reductions as well. The existing code to match shufflevector patterns are replaced with a direct lowering of the reductions to AArch64-specific nodes. Tests updated with the new, simpler, representation. Differential Revision: https://reviews.llvm.org/D32247 llvm-svn: 302678	2017-05-10 15:15:38 +00:00
Martin Storsjo	605b0466ea	[AArch64] Fix a comment to match the code. NFC. For the ELF case, the default/preferred form is the generic one, not the short one as used for Apple - fix the comment to say so. Currently it is a copy-paste typo. Make the comments on the darwin default a bit more verbose. Use enum names instead of literal 0/1 to further increase readability and reduce fragility. Differential Revision: https://reviews.llvm.org/D32963 llvm-svn: 302634	2017-05-10 10:51:32 +00:00
Amara Emerson	836b0f48c1	Add a late IR expansion pass for the experimental reduction intrinsics. This pass uses a new target hook to decide whether or not to expand a particular intrinsic to the shufflevector sequence. Differential Revision: https://reviews.llvm.org/D32245 llvm-svn: 302631	2017-05-10 09:42:49 +00:00
Matthew Simpson	78fd46b230	[AArch64] Consider widening instructions in cost calculations The AArch64 instruction set has a few "widening" instructions (e.g., uaddl, saddl, uaddw, etc.) that take one or more doubleword operands and produce quadword results. The operands are automatically sign- or zero-extended as appropriate. However, in LLVM IR, these extends are explicit. This patch updates TTI to consider these widening instructions as single operations whose cost is attached to the arithmetic instruction. It marks extends that are part of a widening operation "free" and applies a sub-target specified overhead (zero by default) to the arithmetic instructions. Differential Revision: https://reviews.llvm.org/D32706 llvm-svn: 302582	2017-05-09 20:18:12 +00:00
Serge Guelton	e38003f839	Suppress all uses of LLVM_END_WITH_NULL. NFC. Use variadic templates instead of relying on <cstdarg> + sentinel. This enforces better type checking and makes code more readable. Differential Revision: https://reviews.llvm.org/D32541 llvm-svn: 302571	2017-05-09 19:31:13 +00:00
Serge Pavlov	d526b13e61	Add extra operand to CALLSEQ_START to keep frame part set up previously Using arguments with attribute inalloca creates problems for verification of machine representation. This attribute instructs the backend that the argument is prepared in stack prior to CALLSEQ_START..CALLSEQ_END sequence (see http://llvm.org/docs/InAlloca.htm for details). Frame size stored in CALLSEQ_START in this case does not count the size of this argument. However CALLSEQ_END still keeps total frame size, as caller can be responsible for cleanup of entire frame. So CALLSEQ_START and CALLSEQ_END keep different frame size and the difference is treated by MachineVerifier as stack error. Currently there is no way to distinguish this case from actual errors. This patch adds additional argument to CALLSEQ_START and its target-specific counterparts to keep size of stack that is set up prior to the call frame sequence. This argument allows MachineVerifier to calculate actual frame size associated with frame setup instruction and correctly process the case of inalloca arguments. The changes made by the patch are: - Frame setup instructions get the second mandatory argument. It affects all targets that use frame pseudo instructions and touched many files although the changes are uniform. - Access to frame properties are implemented using special instructions rather than calls getOperand(N).getImm(). For X86 and ARM such replacement was made previously. - Changes that reflect appearance of additional argument of frame setup instruction. These involve proper instruction initialization and methods that access instruction arguments. - MachineVerifier retrieves frame size using method, which reports sum of frame parts initialized inside frame instruction pair and outside it. The patch implements approach proposed by Quentin Colombet in https://bugs.llvm.org/show_bug.cgi?id=27481#c1. It fixes 9 tests failed with machine verifier enabled and listed in PR27481. Differential Revision: https://reviews.llvm.org/D32394 llvm-svn: 302527	2017-05-09 13:35:13 +00:00
Quentin Colombet	55a72b3b05	[AArch64][RegisterBankInfo] Change the default mapping of fp loads. This fixes PR32550, in a way that does not imply running the greedy mode at O0. The fix consists in checking if a load is used by any floating point instruction and if yes, we return a default mapping with FPR instead of GPR. llvm-svn: 302453	2017-05-08 18:16:31 +00:00
Quentin Colombet	0e41a41b87	[AArch64][RegisterBankInfo] Fix mapping cost for GPR. In r292478, we changed the order of the enum that is referenced by PMI_FirstXXX. This had the side effect of changing the cost of the mapping of all the loads, instead of just the FPRs ones. Reinstate the higher cost for all but GPR loads. Note: This did not have any external visible effects: - For Fast mode, the cost would have been higher, but we don't care because we don't try to use alternative mappings. - For Greedy mode, the higher cost of the GPR loads, would have triggered the use of the supposedly alternative mapping, that would be in fact the same GPR mapping but with a lower cost. llvm-svn: 302452	2017-05-08 18:16:23 +00:00
Simon Pilgrim	7a28a3ac78	[AARCH64][NEON] Add support for ISD::ABS lowering Update int_aarch64_neon_abs intrinsic to use the ISD::ABS opcode directly Differential Revision: https://reviews.llvm.org/D32940 llvm-svn: 302415	2017-05-08 10:25:18 +00:00
Quentin Colombet	245994d968	[RegisterBankInfo] Uniquely allocate instruction mapping. This is a step toward having statically allocated instruction mapping. We are going to tablegen them eventually, so let us reflect that in the API. NFC. llvm-svn: 302316	2017-05-05 22:48:22 +00:00
Jun Bum Lim	94d42533eb	[AArch64] Remove AArch64AddressTypePromotion pass Summary: Remove the AArch64AddressTypePromotion pass as we migrated all transformations done in this pass into CGP in r299379. Reviewers: qcolombet, jmolloy, javed.absar, mcrosier Reviewed By: qcolombet Subscribers: aemerson, rengolin, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D31623 llvm-svn: 302245	2017-05-05 16:05:41 +00:00
Ahmed Bougacha	a1991bdde2	[AArch64] armv8-A doesn't have CRC. That's only a required extension as of v8.1a. Remove it from the "generic" CPU as well: it should only support the base ISA (and binutils agrees). Also unify the MC tests into crc.s and arm64-crc32.s llvm-svn: 302077	2017-05-03 20:33:52 +00:00
Reid Kleckner	a0b45f4bfc	[IR] Abstract away ArgNo+1 attribute indexing as much as possible Summary: Do three things to help with that: - Add AttributeList::FirstArgIndex, which is an enumerator currently set to 1. It allows us to change the indexing scheme with fewer changes. - Add addParamAttr/removeParamAttr. This just shortens addAttribute call sites that would otherwise need to spell out FirstArgIndex. - Remove some attribute-specific getters and setters from Function that take attribute list indices. Most of these were only used from BuildLibCalls, and doesNotAlias was only used to test or set if the return value is malloc-like. I'm happy to split the patch, but I think they are probably easier to review when taken together. This patch should be NFC, but it sets the stage to change the indexing scheme to this, which is more convenient when indexing into an array: 0: func attrs 1: retattrs 2...: arg attrs Reviewers: chandlerc, pete, javed.absar Subscribers: david2050, llvm-commits Differential Revision: https://reviews.llvm.org/D32811 llvm-svn: 302060	2017-05-03 18:17:31 +00:00
Joel Jones	6513405735	[AArch64] ILP32 Backend Relocation Support Remove "_NC" suffix and semantics from TLSDESC_LD{64,32}_LO12 and TLSDESC_ADD_LO12 relocations Rearrange ordering in AArch64.def to follow relocation encoding Fix name: R_AARCH64_P32_LD64_GOT_LO12_NC => R_AARCH64_P32_LD32_GOT_LO12_NC Add support for several "TLS", "TLSGD", and "TLSLD" relocations for ILP32 Fix return values from isNonILP32reloc Add implementations for R_AARCH64_ADR_PREL_PG_HI21_NC, R_AARCH64_P32_LD32_GOT_LO12_NC, R_AARCH64_P32_TLSIE_LD32_GOTTPREL_LO12_NC, R_AARCH64_P32_TLSDESC_LD32_LO12, R_AARCH64_LD64_GOT_LO12_NC, TLSLD_LDST128_DTPREL_LO12, TLSLD_LDST128_DTPREL_LO12_NC, TLSLE_LDST128_TPREL_LO12, TLSLE_LDST128_TPREL_LO12_NC Modify error messages to give name of equivalent relocation in the ABI not being used, along with better checking for non-existent requested relocations. Added assembler support for "pg_hi21_nc" Relocation definitions added without implementations: R_AARCH64_P32_TLSDESC_ADR_PREL21, R_AARCH64_P32_TLSGD_ADR_PREL21, R_AARCH64_P32_TLSGD_ADD_LO12_NC, R_AARCH64_P32_TLSLD_ADR_PREL21, R_AARCH64_P32_TLSLD_ADR_PAGE21, R_AARCH64_P32_TLSLD_ADD_LO12_NC, R_AARCH64_P32_TLSLD_LD_PREL19, R_AARCH64_P32_TLSDESC_LD_PREL19, R_AARCH64_P32_TLSGD_ADR_PAGE21, R_AARCH64_P32_TLS_DTPREL, R_AARCH64_P32_TLS_DTPMOD, R_AARCH64_P32_TLS_TPREL, R_AARCH64_P32_TLSDESC Fix encoding: R_AARCH64_P32_TLSDESC_ADR_PAGE21 Reviewers: Peter Smith Patch by: Joel Jones (jjones@cavium.com) Differential Revision: https://reviews.llvm.org/D32072 llvm-svn: 301980	2017-05-02 22:01:48 +00:00
Zachary Turner	a0aae2757d	Revert "Remove "_NC" suffix and semantics from TLSDESC_LD{64,32}_LO12 and" This reverts commit c08155afc5d3230792da2ad30a046a8617735a73. This is causing undefined symbol errors with some of the constants. llvm-svn: 301944	2017-05-02 17:51:27 +00:00
Joel Jones	705103e523	Remove "_NC" suffix and semantics from TLSDESC_LD{64,32}_LO12 and TLSDESC_ADD_LO12 relocations Rearrange ordering in AArch64.def to follow relocation encoding Fix name: R_AARCH64_P32_LD64_GOT_LO12_NC => R_AARCH64_P32_LD32_GOT_LO12_NC Add support for several "TLS", "TLSGD", and "TLSLD" relocations for ILP32 Fix return values from isNonILP32reloc Add implementations for R_AARCH64_ADR_PREL_PG_HI21_NC, R_AARCH64_P32_LD32_GOT_LO12_NC, R_AARCH64_P32_TLSIE_LD32_GOTTPREL_LO12_NC, R_AARCH64_P32_TLSDESC_LD32_LO12, R_AARCH64_LD64_GOT_LO12_NC, TLSLD_LDST128_DTPREL_LO12, TLSLD_LDST128_DTPREL_LO12_NC, TLSLE_LDST128_TPREL_LO12, TLSLE_LDST128_TPREL_LO12_NC Modify error messages to give name of equivalent relocation in the ABI not being used, along with better checking for non-existent requested relocations. Added assembler support for "pg_hi21_nc" Relocation definitions added without implementations: R_AARCH64_P32_TLSDESC_ADR_PREL21, R_AARCH64_P32_TLSGD_ADR_PREL21, R_AARCH64_P32_TLSGD_ADD_LO12_NC, R_AARCH64_P32_TLSLD_ADR_PREL21, R_AARCH64_P32_TLSLD_ADR_PAGE21, R_AARCH64_P32_TLSLD_ADD_LO12_NC, R_AARCH64_P32_TLSLD_LD_PREL19, R_AARCH64_P32_TLSDESC_LD_PREL19, R_AARCH64_P32_TLSGD_ADR_PAGE21, R_AARCH64_P32_TLS_DTPREL, R_AARCH64_P32_TLS_DTPMOD, R_AARCH64_P32_TLS_TPREL, R_AARCH64_P32_TLSDESC Fix encoding: R_AARCH64_P32_TLSDESC_ADR_PAGE21 Reviewers: Peter Smith Patch by: Joel Jones (jjones@cavium.com) Differential Revision: https://reviews.llvm.org/D32072 llvm-svn: 301939	2017-05-02 17:14:31 +00:00
Quentin Colombet	cdf8c81127	[AArch64] Move GISel accessor initialization from TargetMachine to Subtarget. NFC llvm-svn: 301841	2017-05-01 21:53:19 +00:00
Amara Emerson	d28f0cd448	Generalize the specialized flag-carrying SDNodes by moving flags into SDNode. This removes BinaryWithFlagsSDNode, and flags are now all passed by value. Differential Revision: https://reviews.llvm.org/D32527 llvm-svn: 301803	2017-05-01 15:17:51 +00:00
Daniel Sanders	e9fdba39e0	[globalisel][tablegen] Compute available feature bits correctly. Summary: Predicate<> now has a field to indicate how often it must be recomputed. Currently, there are two frequencies, per-module (RecomputePerFunction==0) and per-function (RecomputePerFunction==1). Per-function predicates are currently recomputed more frequently than necessary since the only predicate in this category is cheap to test. Per-module predicates are now computed in getSubtargetImpl() while per-function predicates are computed in selectImpl(). Tablegen now manages the PredicateBitset internally. It should only be necessary to add the required includes. Also fixed a problem revealed by the test case where constrainSelectedInstRegOperands() would attempt to tie operands that BuildMI had already tied. Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar Reviewed By: rovka Subscribers: kristof.beyls, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D32491 llvm-svn: 301750	2017-04-29 17:30:09 +00:00
Reid Kleckner	6652a52e2b	Use Argument::hasAttribute and AttributeList::ReturnIndex more This eliminates many extra 'Idx' induction variables in loops over arguments in CodeGen/ and Target/. It also reduces the number of places where we assume that ReturnIndex is 0 and that we should add one to argument numbers to get the corresponding attribute list index. NFC llvm-svn: 301666	2017-04-28 18:37:16 +00:00
Craig Topper	d0af7e8ab8	[SelectionDAG] Use KnownBits struct in DAG's computeKnownBits and simplifyDemandedBits This patch replaces the separate APInts for KnownZero/KnownOne with a single KnownBits struct. This is similar to what was done to ValueTracking's version recently. This is largely a mechanical transformation from KnownZero to Known.Zero. Differential Revision: https://reviews.llvm.org/D32569 llvm-svn: 301620	2017-04-28 05:31:46 +00:00
Craig Topper	24e71017aa	[APInt] Use inplace shift methods where possible. NFCI llvm-svn: 301612	2017-04-28 03:36:24 +00:00
Krzysztof Parzyszek	44e25f37ae	Move size and alignment information of regclass to TargetRegisterInfo 1. RegisterClass::getSize() is split into two functions: - TargetRegisterInfo::getRegSizeInBits(const TargetRegisterClass &RC) const; - TargetRegisterInfo::getSpillSize(const TargetRegisterClass &RC) const; 2. RegisterClass::getAlignment() is replaced by: - TargetRegisterInfo::getSpillAlignment(const TargetRegisterClass &RC) const; This will allow making those values depend on subtarget features in the future. Differential Revision: https://reviews.llvm.org/D31783 llvm-svn: 301221	2017-04-24 18:55:33 +00:00
Sjoerd Meijer	e5b8557d5b	[AArch64AsmParser] better diagnostic for isb Instruction isb takes as an operand either 'sy' or an immediate value. This improves the diagnostic when the string is not 'sy' and adds a test case for this which was missing. This also adds tests to check invalid inputs for dsb and dmb. Differential Revision: https://reviews.llvm.org/D32227 llvm-svn: 301165	2017-04-24 08:22:20 +00:00
Renato Golin	4abfb3d741	Revert "[APInt] Fix a few places that use APInt::getRawData to operate within the normal API." This reverts commit r301105, 4, 3 and 1, as a follow up of the previous revert, which broke even more bots. For reference: Revert "[APInt] Use operator<<= where possible. NFC" Revert "[APInt] Use operator<<= instead of shl where possible. NFC" Revert "[APInt] Use ashInPlace where possible." PR32754. llvm-svn: 301111	2017-04-23 12:15:30 +00:00
Craig Topper	5f68af0806	[APInt] Use operator<<= instead of shl where possible. NFC llvm-svn: 301103	2017-04-23 05:18:31 +00:00
Daniel Sanders	2deea1878e	[globalisel][tablegen] Revise API for ComplexPattern operands to improve flexibility. Summary: Some targets need to be able to do more complex rendering than just adding an operand or two to an instruction. For example, it may need to insert an instruction to extract a subreg first, or it may need to perform an operation on the operand. In SelectionDAG, targets would create SDNode's to achieve the desired effect during the complex pattern predicate. This worked because SelectionDAG had a form of garbage collection that would take care of SDNode's that were created but not used due to a later predicate rejecting a match. This doesn't translate well to GlobalISel and the churn was wasteful. The API changes in this patch enable GlobalISel to accomplish the same thing without the waste. The API is now: InstructionSelector::OptionalComplexRendererFn selectArithImmed(MachineOperand &Root) const; where Root is the root of the match. The return value can be omitted to indicate that the predicate failed to match, or a function with the signature ComplexRendererFn can be returned. For example: return OptionalComplexRendererFn( [=](MachineInstrBuilder &MIB) { MIB.addImm(Immed).addImm(ShVal); }); adds two immediate operands to the rendered instruction. Immed and ShVal are captured from the predicate function. As an added bonus, this also reduces the amount of information we need to provide to GIComplexOperandMatcher. Depends on D31418 Reviewers: aditya_nandakumar, t.p.northover, qcolombet, rovka, ab, javed.absar Reviewed By: ab Subscribers: dberris, kristof.beyls, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D31761 llvm-svn: 301079	2017-04-22 15:11:04 +00:00
Matthias Braun	d78597ec08	AArch64FrameLowering: Check if the ExtraCSSpill register is actually unused The code assumed that when saving an additional CSR register (ExtraCSSpill==true) we would have a free register throughout the function. This was not true if this CSR register is also used to pass values as in the swiftself case. rdar://31451816 llvm-svn: 301057	2017-04-21 22:42:08 +00:00
Hans Wennborg	9b9a5358dd	Re-commit r301040 "X86: Don't emit zero-byte functions on Windows" In addition to the original commit, tighten the condition for when to pad empty functions to COFF Windows. This avoids running into problems when targeting e.g. Win32 AMDGPU, which caused test failures when this was committed initially. llvm-svn: 301047	2017-04-21 21:48:41 +00:00
Hans Wennborg	04593000d8	Revert r301040 "X86: Don't emit zero-byte functions on Windows" This broke almost all bots. Reverting while fixing. llvm-svn: 301041	2017-04-21 21:10:37 +00:00
Hans Wennborg	cb3e810714	X86: Don't emit zero-byte functions on Windows Empty functions can lead to duplicate entries in the Guard CF Function Table of a binary due to multiple functions sharing the same RVA, causing the kernel to refuse to load that binary. We had a terrific bug due to this in Chromium. It turns out we were already doing this for Mach-O in certain situations. This patch expands the code for that in AsmPrinter::EmitFunctionBody() and renames TargetInstrInfo::getNoopForMachoTarget() to simply getNoop() since it seems it was used for not just Mach-O anyway. Differential Revision: https://reviews.llvm.org/D32330 llvm-svn: 301040	2017-04-21 20:58:12 +00:00
Akira Hatanaka	22e839f4b2	[AArch64] Improve code generation for logical instructions taking immediate operands. This commit adds an AArch64 dag-combine that optimizes code generation for logical instructions taking immediate operands. The optimization uses demanded bits to change a logical instruction's immediate operand so that the immediate can be folded into the immediate field of the instruction. This recommits r300932 and r300930, which was causing dag-combine to loop forever. The problem was that optimizeLogicalImm was returning true even when there was no change to the immediate node (which happened when the immediate was all zeros or ones), which caused dag-combine to push and pop the same node to the work list over and over again without making any progress. This commit fixes the bug by returning false early in optimizeLogicalImm if the immediate is all zeros or ones. Also, it changes the code to compare the immediate with 0 or Mask rather than calling countPopulation. rdar://problem/18231627 Differential Revision: https://reviews.llvm.org/D5591 llvm-svn: 301019	2017-04-21 18:53:12 +00:00
Joel Jones	a7c4a52188	[AArch64] Refactor instruction selection lowering for addresses. NFCI Factor out the common code used for generating addresses into common templated functions that call overloaded versions of a new function, getTargetNode. Tested with make check-llvm with targets AArch64. Differential Revision: https://reviews.llvm.org/D32169 llvm-svn: 301005	2017-04-21 17:31:03 +00:00
Daniel Sanders	e7b0d66080	[globalisel][tablegen] Import SelectionDAG's rule predicates and support the equivalent in GIRule. Summary: The SelectionDAG importer now imports rules with Predicates attached via Requires, PredicateControl, etc. These predicates are implemented as bitsets to allow multiple predicates to be tested together. However, unlike the MC layer subtarget features, each target only pays for its own predicates (e.g. AArch64 doesn't have 192 feature bits just because X86 needs a lot). Both AArch64 and X86 derive at least one predicate from the MachineFunction or Function so they must re-initialize AvailableFeatures before each function. They also declare locals in <Target>InstructionSelector so that computeAvailableFeatures() can use the code from SelectionDAG without modification. Reviewers: rovka, qcolombet, aditya_nandakumar, t.p.northover, ab Reviewed By: rovka Subscribers: aemerson, rengolin, dberris, kristof.beyls, llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D31418 llvm-svn: 300993	2017-04-21 15:59:56 +00:00
Chad Rosier	428556c536	[AArch64][Falkor] Refine modeling of store-release exclusive instructions. llvm-svn: 300987	2017-04-21 14:58:32 +00:00
Chad Rosier	d631b9e500	[AArch64][Falkor] Refine resource needs of STRQ with register offset. llvm-svn: 300984	2017-04-21 14:33:13 +00:00
Daniel Sanders	419efdd55b	Revert r300964 + r300970 - [globalisel][tablegen] Import SelectionDAG's rule predicates and support the equivalent in GIRule. It's causing llvm-clang-x86_64-expensive-checks-win to fail to compile and I haven't worked out why. Reverting to make it green while I figure it out. llvm-svn: 300978	2017-04-21 14:09:20 +00:00
Chad Rosier	537defeeb5	[AArch64][Falkor] Refine loads/stores that require an extra LD pipe. llvm-svn: 300976	2017-04-21 13:55:41 +00:00
Chad Rosier	bbcc828833	[AArch64][Falkor] Fix number of microops for WriteSTIdx missed in r300892. llvm-svn: 300975	2017-04-21 13:37:01 +00:00
Chad Rosier	4f2e9e237f	[AArch64] Fix a few missed pre/post-inc in Falkor. llvm-svn: 300974	2017-04-21 13:36:57 +00:00
Daniel Sanders	279d03527e	[globalisel][tablegen] Import SelectionDAG's rule predicates and support the equivalent in GIRule. Summary: The SelectionDAG importer now imports rules with Predicates attached via Requires, PredicateControl, etc. These predicates are implemented as bitsets to allow multiple predicates to be tested together. However, unlike the MC layer subtarget features, each target only pays for its own predicates (e.g. AArch64 doesn't have 192 feature bits just because X86 needs a lot). Both AArch64 and X86 derive at least one predicate from the MachineFunction or Function so they must re-initialize AvailableFeatures before each function. They also declare locals in <Target>InstructionSelector so that computeAvailableFeatures() can use the code from SelectionDAG without modification. Reviewers: rovka, qcolombet, aditya_nandakumar, t.p.northover, ab Reviewed By: rovka Subscribers: aemerson, rengolin, dberris, kristof.beyls, llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D31418 llvm-svn: 300964	2017-04-21 10:27:20 +00:00
Akira Hatanaka	78ccba6a20	Revert r300932 and r300930. It seems that r300930 was creating an infinite loop in dag-combine when compiling the following file: MultiSource/Benchmarks/MiBench/consumer-typeset/z21.c llvm-svn: 300940	2017-04-21 01:31:50 +00:00
Akira Hatanaka	e52caddae8	[AArch64] Use suffix ULL to shift a 64-bit value. llvm-svn: 300932	2017-04-21 00:35:27 +00:00
Akira Hatanaka	19077aaee0	[AArch64] Improve code generation for logical instructions taking immediate operands. This commit adds an AArch64 dag-combine that optimizes code generation for logical instructions taking immediate operands. The optimization uses demanded bits to change a logical instruction's immediate operand so that the immediate can be folded into the immediate field of the instruction. This recommits r300913, which broke bots because I didn't fix a call to ShrinkDemandedConstant in SIISelLowering.cpp after changing the APIs of TargetLoweringOpt and TargetLowering. rdar://problem/18231627 Differential Revision: https://reviews.llvm.org/D5591 llvm-svn: 300930	2017-04-21 00:05:16 +00:00
Akira Hatanaka	7b06cebe73	Revert "[AArch64] Improve code generation for logical instructions taking" This reverts r300913. This broke bots. llvm-svn: 300916	2017-04-20 23:03:30 +00:00
Akira Hatanaka	e327f09832	[AArch64] Improve code generation for logical instructions taking immediate operands. This commit adds an AArch64 dag-combine that optimizes code generation for logical instructions taking immediate operands. The optimization uses demanded bits to change a logical instruction's immediate operand so that the immediate can be folded into the immediate field of the instruction. rdar://problem/18231627 Differential Revision: https://reviews.llvm.org/D5591 llvm-svn: 300913	2017-04-20 22:47:56 +00:00
Tim Northover	100b7f6eae	AArch64: lower "fence singlethread" to a pure compiler barrier. Single-threaded fences aren't required to provide any synchronization with other processing elements so there's no need for a DMB. They should still be a barrier for compiler optimizations though. llvm-svn: 300905	2017-04-20 21:57:45 +00:00
Chad Rosier	4279c58ec4	[AArch64] Whitespace/ordering fixes for Falkor machine description. NFC. llvm-svn: 300893	2017-04-20 21:11:17 +00:00
Chad Rosier	a56bdbe62d	[AArch64] Refine Falkor machine description for pre/post-inc and stores. llvm-svn: 300892	2017-04-20 21:11:09 +00:00
Chad Rosier	9f25dd56a8	[AArch64] Improve scheduling of logical operations on Falkor. llvm-svn: 300871	2017-04-20 18:50:21 +00:00
John Brawn	5ca5daa6b9	[AArch64] Fix handling of zero immediate in fmov instructions Currently fmov #0 with a vector destination is handled incorrectly and results in fmov #-1.9375 being emitted but should instead give an error. This is due to the way we cope with fmov #0 with a scalar destination being an alias of fmov zr, so fix this by actually doing it through an alias. Differential Revision: https://reviews.llvm.org/D31949 llvm-svn: 300830	2017-04-20 10:13:54 +00:00
John Brawn	dcf037a6f0	[AArch64] Fix handling of integer fp immediates When an integer is used as an fp immediate we're failing to check the return value of getFP64Imm, so invalid values are silently permitted. Fix this by merging together the integer and real handling. llvm-svn: 300828	2017-04-20 10:10:10 +00:00
Aditya Nandakumar	75ad9ccbfa	[GISEL]: Move getConstantVReg to Utils NFCI llvm-svn: 300751	2017-04-19 20:48:50 +00:00
Kristof Beyls	0f36e68f62	[GlobalISel] Support vector-of-pointers in LLT This fixes PR32471. As comment 10 on that bug report highlights (https://bugs.llvm.org//show_bug.cgi?id=32471#c10), there are quite a few different defendable design tradeoffs that could be made, including not representing pointers at all in LLT. I decided to go for representing vector-of-pointer as a concept in LLT, while keeping the size of the LLT type 64 bits (this is an increase from 48 bits before). My rationale for keeping pointers explicit is that on some targets probably it's very handy to have the distinction between pointer and non-pointer (e.g. 68K has a different register bank for pointers IIRC). If we keep a scalar pointer, it probably is easiest to also have a vector-of-pointers to keep LLT relatively conceptually clean and orthogonal, while we don't have a very strong reason to break that orthogonality. Once we gain more experience on the use of LLT, we can of course reconsider this direction. Rejecting vector-of-pointer types in the IRTranslator is also an option to avoid the crash reported in PR32471, but that is only a very short-term solution; also needs quite a bit of code tweaks in places, and is probably fragile. Therefore I didn't consider this the best option. llvm-svn: 300664	2017-04-19 07:23:57 +00:00
Matt Arsenault	3138075dd4	DAG: Make mayBeEmittedAsTailCall parameter const llvm-svn: 300603	2017-04-18 21:16:46 +00:00
Craig Topper	fc947bcfba	[APInt] Use lshrInPlace to replace lshr where possible This patch uses lshrInPlace to replace code where the object that lshr is called on is being overwritten with the result. This adds an lshrInPlace(const APInt &) version as well. Differential Revision: https://reviews.llvm.org/D32155 llvm-svn: 300566	2017-04-18 17:14:21 +00:00
Kristof Beyls	a4e79cca77	Revert "[GlobalISel] Support vector-of-pointers in LLT" This reverts r300535 and r300537. The newly added tests in test/CodeGen/AArch64/GlobalISel/arm64-fallback.ll produce slightly different code between LLVM versions being built with different compilers. E.g., dependent on the compiler LLVM is built with, either one of the following can be produced: remark: <unknown>:0:0: unable to legalize instruction: %vreg0<def>(p0) = G_EXTRACT_VECTOR_ELT %vreg1, %vreg2; (in function: vector_of_pointers_extractelement) remark: <unknown>:0:0: unable to legalize instruction: %vreg2<def>(p0) = G_EXTRACT_VECTOR_ELT %vreg1, %vreg0; (in function: vector_of_pointers_extractelement) Non-determinism like this is clearly a bad thing, so reverting this until I can find and fix the root cause of the non-determinism. llvm-svn: 300538	2017-04-18 09:26:36 +00:00
Kristof Beyls	fb73eb0324	[GlobalISel] Support vector-of-pointers in LLT This fixes PR32471. As comment 10 on that bug report highlights (https://bugs.llvm.org//show_bug.cgi?id=32471#c10), there are quite a few different defendable design tradeoffs that could be made, including not representing pointers at all in LLT. I decided to go for representing vector-of-pointer as a concept in LLT, while keeping the size of the LLT type 64 bits (this is an increase from 48 bits before). My rationale for keeping pointers explicit is that on some targets probably it's very handy to have the distinction between pointer and non-pointer (e.g. 68K has a different register bank for pointers IIRC). If we keep a scalar pointer, it probably is easiest to also have a vector-of-pointers to keep LLT relatively conceptually clean and orthogonal, while we don't have a very strong reason to break that orthogonality. Once we gain more experience on the use of LLT, we can of course reconsider this direction. Rejecting vector-of-pointer types in the IRTranslator is also an option to avoid the crash reported in PR32471, but that is only a very short-term solution; also needs quite a bit of code tweaks in places, and is probably fragile. Therefore I didn't consider this the best option. llvm-svn: 300535	2017-04-18 08:12:45 +00:00
Davide Italiano	3e9986f368	[Target] Use hasOneUse() instead of getNumUses(). The latter does a linear scan over a linked list, and is therefore much more expensive. llvm-svn: 300518	2017-04-18 00:29:54 +00:00
Tim Northover	46e36f0953	AArch64: put nonlazybind special handling behind a flag for now. It's basically a terrible idea anyway but objc_msgSend gets emitted like that. We can decide on a better way to deal with it in the unlikely event that anyone actually uses it. llvm-svn: 300474	2017-04-17 18:18:47 +00:00
Konstantin Zhuravlyov	dc77b2e960	Distinguish between code pointer size and DataLayout::getPointerSize() in DWARF info generation llvm-svn: 300463	2017-04-17 17:41:25 +00:00
Tim Northover	879a0b2e1b	AArch64: support nonlazybind It's almost certainly not a good idea to actually use it in most cases (there's a pretty large code size overhead on AArch64), but we can't do those experiments until it's supported. llvm-svn: 300462	2017-04-17 17:27:56 +00:00
Andrew V. Tischenko	75745d0c3e	This patch closes PR#32216: Better testing of schedule model instruction latencies/throughputs. The details are here: https://reviews.llvm.org/D30941 llvm-svn: 300311	2017-04-14 07:44:23 +00:00
Adam Nemet	c5779460f4	[AArch64] Avoid partial register writes on lane 0 of BUILD_VECTOR for i8/i16/f16 This further improves Ahmed's change in rL299482. See the new comment for the rationale. The patch recovers most of the regression for bzip2 after D31965. We're down to +2.68% from +6.97%. Differential Revision: https://reviews.llvm.org/D32028 llvm-svn: 300276	2017-04-13 23:32:47 +00:00
Jonas Paulsson	fccc7d66c3	[SystemZ] TargetTransformInfo cost functions implemented. getArithmeticInstrCost(), getShuffleCost(), getCastInstrCost(), getCmpSelInstrCost(), getVectorInstrCost(), getMemoryOpCost(), getInterleavedMemoryOpCost() implemented. Interleaved access vectorization enabled. BasicTTIImpl::getCastInstrCost() improved to check for legal extending loads, in which case the cost of the z/sext instruction becomes 0. Review: Ulrich Weigand, Renato Golin. https://reviews.llvm.org/D29631 llvm-svn: 300052	2017-04-12 11:49:08 +00:00
Balaram Makam	c53c44cec4	[AArch64] Fix scheduling info for INS(vector, general) instruction. llvm-svn: 299994	2017-04-11 22:14:10 +00:00
Evandro Menezes	203eef0ed5	[AArch64] Simplify MacroFusion This patch assumes that the dependents to be scanned for the ExitSU are its predecessors; otherwise, the successors of the instr are scanned. Furthermore, sometimes the ExitSU was being fused twice, since it may be fused once when scanning the successors from the beginning of the BB and then again when scanning the predecessors of ExitSU. Thus, when scanning the successors of an instr, skip the ExitSU. llvm-svn: 299974	2017-04-11 19:13:11 +00:00
Matthew Simpson	1468d3e04e	[ARM/AArch64] Ensure valid vector element types for interleaved accesses This patch refactors and strengthens the type checks performed for interleaved accesses. The primary functional change is to ensure that the interleaved accesses have valid element types. The added test cases previously failed because the element type is f128. Differential Revision: https://reviews.llvm.org/D31817 llvm-svn: 299864	2017-04-10 18:34:37 +00:00
Balaram Makam	b4419f9d30	[AArch64] Refine Falkor Machine Model - Part 3 This concludes the refinements to Falkor Machine Model. It includes SchedPredicates for immediate zero and LSL Fast. Forwarding logic is also modeled for vector multiply and accumulate only. llvm-svn: 299810	2017-04-08 03:30:15 +00:00
Petr Hosek	c3a9e6db38	[AArch64] Allow global register asm("x18") or asm("w18") under -ffixed-x18 When using -ffixed-x18, the x18 (or w18) register can safely be used with the "global register variable" GCC extension, but the backend fails to recognize it. Patch by Roland McGrath. Differential Revision: https://reviews.llvm.org/D31793 llvm-svn: 299799	2017-04-07 20:41:58 +00:00
Daniel Sanders	0b5293f6ae	[globalisel][tablegen] Move <Target>InstructionSelector declarations to anonymous namespaces Summary: This resolves the issue of tablegen-erated includes in the headers for non-GlobalISel builds in a simpler way than before. Reviewers: qcolombet, ab Reviewed By: ab Subscribers: igorb, ab, mgorny, dberris, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30998 llvm-svn: 299637	2017-04-06 09:49:34 +00:00
James Molloy	9d42334e02	[AArch64] Crypto requires FP. So if FP is disabled, crypto should also be disabled. llvm-svn: 299531	2017-04-05 10:44:38 +00:00
Alex Bradbury	866113c2ea	Add MCContext argument to MCAsmBackend::applyFixup for error reporting A number of backends (AArch64, MIPS, ARM) have been using MCContext::reportError to report issues such as out-of-range fixup values in their TgtAsmBackend. This is great, but because MCContext couldn't easily be threaded through to the adjustFixupValue helper function from its usual callsite (applyFixup), these backends ended up adding an MCContext* argument and adding another call to applyFixup to processFixupValue. Adding an MCContext parameter to applyFixup makes this unnecessary, and even better - applyFixup can take a reference to MCContext rather than a potentially null pointer. Differential Revision: https://reviews.llvm.org/D30264 llvm-svn: 299529	2017-04-05 10:16:14 +00:00
Ahmed Bougacha	d3c03a5ddd	[AArch64] Avoid partial register deps on insertelt of load into lane 0. This improves upon r246462: that prevented FMOVs from being emitted for the cross-class INSERT_SUBREGs by disabling the formation of INSERT_SUBREGs of LOAD. But the ld1.s that we started selecting caused us to introduce partial dependencies on the vector register. Avoid that by using SCALAR_TO_VECTOR: it's a first-class citizen that is folded away by many patterns, including the scalar LDRS that we want in this case. Credit goes to Adam for finding the issue! llvm-svn: 299482	2017-04-04 22:55:53 +00:00
Balaram Makam	b3120b6d3f	[AArch64] Add missing schedinfo, check completeness for Falkor. llvm-svn: 299468	2017-04-04 21:15:53 +00:00
Petr Hosek	9eb0a1e09b	[AArch64][Fuchsia] Allow -mcmodel=kernel for --target=aarch64-fuchsia This mode is just like -mcmodel=small except that it moves the thread pointer from TPIDR_EL0 to TPIDR_EL1. Patch by Roland McGrath. Differential Revision: https://reviews.llvm.org/D31624 llvm-svn: 299462	2017-04-04 19:51:53 +00:00
Balaram Makam	7b5c098cfa	[AArch64] Refine Falkor Machine Model - Part 2 llvm-svn: 299456	2017-04-04 18:42:14 +00:00
Daniel Sanders	bee5739a7c	[tablegen][globalisel] Add support for nested instruction matching. Summary: Lift the restrictions that prevented the tree walking introduced in the previous change and add support for patterns like: (G_ADD (G_MUL (G_SEXT $src1), (G_SEXT $src2)), $src3) -> SMADDWrrr $dst, $src1, $src2, $src3 Also adds support for G_SEXT and G_ZEXT to support these cases. One particular aspect of this that I should draw attention to is that I've tried to be overly conservative in determining the safety of matches that involve non-adjacent instructions and multiple basic blocks. This is intended to be used as a cheap initial check and we may add a more expensive check in the future. The current rules are: * Reject if any instruction may load/store (we'd need to check for intervening memory operations). * Reject if any instruction has implicit operands. * Reject if any instruction has unmodelled side-effects. See isObviouslySafeToFold(). Reviewers: t.p.northover, javed.absar, qcolombet, aditya_nandakumar, ab, rovka Reviewed By: ab Subscribers: igorb, dberris, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30539 llvm-svn: 299430	2017-04-04 13:25:23 +00:00
Jun Bum Lim	dee5565869	[CodeGenPrep] move aarch64-type-promotion to CGP Summary: Move the aarch64-type-promotion pass within the existing type promotion framework in CGP. This change also supports forking sexts when a new sext is required for promotion. Note that this change is based on D27853 and I am submitting this out early to provide a better idea on D27853. Reviewers: jmolloy, mcrosier, javed.absar, qcolombet Reviewed By: qcolombet Subscribers: llvm-commits, aemerson, rengolin, mcrosier Differential Revision: https://reviews.llvm.org/D28680 llvm-svn: 299379	2017-04-03 19:20:07 +00:00
Quentin Colombet	35a47010b1	Revert "Instrument SDISel C++ patterns" This reverts commit r299284. Didn't intend to commit this :( llvm-svn: 299286	2017-04-01 01:26:17 +00:00
Quentin Colombet	b43da15602	Instrument SDISel C++ patterns llvm-svn: 299284	2017-04-01 01:21:32 +00:00
Eric Christopher	60a245e0ff	Reduce the number of times we query the subtarget for the same information. llvm-svn: 299278	2017-03-31 23:12:27 +00:00
Eric Christopher	cf965f2f03	Small cleanup to remove extraneous cast. llvm-svn: 299277	2017-03-31 23:12:24 +00:00
Balaram Makam	2aba753e84	[AArch64] Add new subtarget feature to fold LSL into address mode. Summary: This feature enables folding of logical shift operations of up to 3 places into addressing mode on Kryo and Falkor that have a fastpath LSL. Reviewers: mcrosier, rengolin, t.p.northover Subscribers: junbuml, gberry, llvm-commits, aemerson Differential Revision: https://reviews.llvm.org/D31113 llvm-svn: 299240	2017-03-31 18:16:53 +00:00
Simon Pilgrim	37b536e4b3	[DAGCombiner] Add vector demanded elements support to computeKnownBitsForTargetNode Follow up to D25691, this sets up the plumbing necessary to support vector demanded elements support in known bits calculations in target nodes. Differential Revision: https://reviews.llvm.org/D31249 llvm-svn: 299201	2017-03-31 11:24:16 +00:00
Davide Italiano	a0bd28c4d9	[AArch64ISelLowering] Remove `else` after `return` in LowerGlobalTLSAddress. llvm-svn: 299103	2017-03-30 19:52:31 +00:00
Davide Italiano	de05686ec6	[AArch64] Simplify isSignExtended()/isZeroExtended(). NFCI. llvm-svn: 299102	2017-03-30 19:46:18 +00:00
Sanne Wouda	d4658ee634	[AArch64] [Assembler] option to disable negative immediate conversions Summary: Similar to the ARM target in https://reviews.llvm.org/rL298380, this patch adds identical infrastructure for disabling negative immediate conversions, and converts the existing aliases to the new infrastructure. Reviewers: rengolin, javed.absar, olista01, SjoerdMeijer, samparker Reviewed By: samparker Subscribers: samparker, aemerson, llvm-commits Differential Revision: https://reviews.llvm.org/D31243 llvm-svn: 298908	2017-03-28 10:02:56 +00:00
Ahmed Bougacha	f0b22c471b	[GlobalISel][AArch64] Extract a variable out of an NDEBUG block. NFC. r298863 used PtrReg, but that's never defined in release builds. Fix it. llvm-svn: 298869	2017-03-27 18:14:20 +00:00
Ahmed Bougacha	f75782f9dc	[GlobalISel][AArch64] Fold FI into LDR/STR ui addressing mode. A majority of loads and stores at O0 access an alloca. It's trivial to fold the G_FRAME_INDEX into the instruction; do it. llvm-svn: 298864	2017-03-27 17:31:56 +00:00
Ahmed Bougacha	8a654085d0	[GlobalISel][AArch64] Fold G_GEP into LDR/STR ui addressing mode. We're not to the point of supporting the load/store patterns yet (because they extensively use PatFrags). But in the meantime, we can implement some of the simplest addressing modes. llvm-svn: 298863	2017-03-27 17:31:52 +00:00
Ahmed Bougacha	85a66a6d9f	[GlobalISel][AArch64] Select store of zero to WZR/XZR. These occur very frequently, and are quite trivial to catch. llvm-svn: 298862	2017-03-27 17:31:48 +00:00
Ahmed Bougacha	641cb203b6	[GlobalISel][AArch64] Select CBZ. CBZ/CBNZ represent a substantial portion of all conditional branches. Look through G_ICMP to select them. We can't use tablegen yet because the existing patterns match an AArch64ISD node. llvm-svn: 298856	2017-03-27 16:35:31 +00:00
Chad Rosier	862a41270f	[AArch64] Mark mrs of TPIDR_EL0 (thread pointer) as not having side effects. Among other things, this allows Machine LICM to hoist a costly 'mrs' instruction from within a loop. Differential Revision: http://reviews.llvm.org/D31151 llvm-svn: 298851	2017-03-27 15:52:38 +00:00
Davide Italiano	a2c4e4b929	[Target] Remove some code probably copy/pasted from another backend. llvm-svn: 298825	2017-03-26 21:45:04 +00:00
Davide Italiano	5c2aa5d3e4	[MachineScheduler] Reference the correct header. llvm-svn: 298823	2017-03-26 21:27:21 +00:00
Balaram Makam	cf0e5e1c62	[AArch64] Refine Falkor Machine Model - Part1 llvm-svn: 298768	2017-03-25 04:02:39 +00:00
Jessica Paquette	eac8633d6d	[Outliner] Revert r298734. When I tested r298734, I thought that red zones were enabled by default like in X86. Since red zones are behind a flag on AArch64 the testing wasn't true. llvm-svn: 298747	2017-03-24 23:00:21 +00:00
Jessica Paquette	167af85ec7	[Outliner] Remove no red zone requirement for AArch64 AArch64 doesn't require -mno-red-zone; stack fixups are sufficient here. This was unnecessarily copied over from the X86 target. (You can now outline with red zones! Yay!) Removing the requirement passes all Single/MultiSource tests. llvm-svn: 298734	2017-03-24 20:47:59 +00:00
Matt Arsenault	18bb24a1be	TTI: Split IsSimple in MemIntrinsicInfo All this did before was assert in EarlyCSE. llvm-svn: 298724	2017-03-24 18:56:43 +00:00
Davide Italiano	0145e751c4	[AArch64] Drive-by cleanup, make this code shorter. NFCI. llvm-svn: 298563	2017-03-22 23:37:58 +00:00
Reid Kleckner	b518054b87	Rename AttributeSet to AttributeList Summary: This class is a list of AttributeSetNodes corresponding to the function prototype of a call or function declaration. This class used to be called ParamAttrListPtr, then AttrListPtr, then AttributeSet. It is typically accessed by parameter and return value index, so "AttributeList" seems like a more intuitive name. Rename AttributeSetImpl to AttributeListImpl to follow suit. It's useful to rename this class so that we can rename AttributeSetNode to AttributeSet later. AttributeSet is the set of attributes that apply to a single function, argument, or return value. Reviewers: sanjoy, javed.absar, chandlerc, pete Reviewed By: pete Subscribers: pete, jholewinski, arsenm, dschuff, mehdi_amini, jfb, nhaehnle, sbc100, void, llvm-commits Differential Revision: https://reviews.llvm.org/D31102 llvm-svn: 298393	2017-03-21 16:57:19 +00:00
Jessica Paquette	02cbfb2926	[Outliner] ACTUALLY remove the errs output I don't know how to type. This fixes the last commit which would have made all of the overflows legal, and kept the screaming. llvm-svn: 298263	2017-03-20 16:25:04 +00:00
Jessica Paquette	5d59a4ee19	[Outliner] Remove output for offset range check Forgot to remove some output before committing last time. (Instruction fixups don't actually overflow anywhere in the test suite so far, so I missed it). To prevent the outliner from screaming "Overflow!" in the event that that does happen, this commit removes that output. llvm-svn: 298260	2017-03-20 15:51:45 +00:00
Diana Picus	d79253a9f7	[GlobalISel] Use the correct calling conv for calls This commit adds a parameter that lets us pass in the calling convention of the call to CallLowering::lowerCall. This allows us to handle situations where the calling convention of the callee is different from that of the caller. Differential Revision: https://reviews.llvm.org/D31039 llvm-svn: 298254	2017-03-20 14:40:18 +00:00
Nirav Dave	ac6081cb67	Make library calls sensitive to regparm module flag (Fixes PR3997). Reviewers: mkuper, rnk Subscribers: mehdi_amini, jyknight, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D27050 llvm-svn: 298179	2017-03-18 00:44:07 +00:00
Nirav Dave	6de2c77944	Capitalize ArgListEntry fields. NFC. llvm-svn: 298178	2017-03-18 00:43:57 +00:00
Jessica Paquette	ea8cc09be0	[Outliner] Add outliner for AArch64 This commit adds the necessary target hooks for outlining in AArch64. It also refactors the switch statement used in `getMemOpBaseRegImmOfsWidth` into a more general function, `getMemOpInfo`. This allows the outliner to share that code without copying and pasting it. The AArch64 outliner can be run using -mllvm -enable-machine-outliner, as with the X86-64 outliner. The test for this pass verifies that the outliner does, in fact, outline functions, fixes up the stack accesses properly, and can correctly generate a tail call. In the future, this test should be replaced with a MIR test, so that we can properly test immediate offset overflows in fixed-up instructions. llvm-svn: 298162	2017-03-17 22:26:55 +00:00
Chad Rosier	a69dcb6b66	[AArch64] Use alias analysis in the load/store optimization pass. This allows the optimization to rearrange loads and stores more aggressively. Differential Revision: http://reviews.llvm.org/D30903 llvm-svn: 298092	2017-03-17 14:19:55 +00:00
Reid Kleckner	45707d4d5a	Remove getArgumentList() in favor of arg_begin(), args(), etc Users often call getArgumentList().size(), which is a linear way to get the number of function arguments. arg_size(), on the other hand, is constant time. In general, the fact that arguments are stored in an iplist is an implementation detail, so I've removed it from the Function interface and moved all other users to the argument container APIs (arg_begin(), arg_end(), args(), arg_size()). Reviewed By: chandlerc Differential Revision: https://reviews.llvm.org/D31052 llvm-svn: 298010	2017-03-16 22:59:15 +00:00
Matthias Braun	e959544517	TargetInstrInfo: Provide default implementation of isTailCall(). In fact this default implementation should be the only implementation; keep it virtual for now to accommodate targets that don't model flags correctly. Differential Revision: https://reviews.llvm.org/D30747 llvm-svn: 297980	2017-03-16 20:02:30 +00:00
Daniel Sanders	0e64202871	[globalisel] Correct G_CONSTANT path of selectArithImmed() Earlier stages of GlobalISel always use ConstantInt in G_CONSTANT so that's what we should check for. This fixes a crash introduced in r297782. llvm-svn: 297968	2017-03-16 18:04:50 +00:00
Ahmed Bougacha	62cd73d989	[GlobalISel][AArch64] Select ADDXri. We're now able to select ADDWri thanks to the new complex pattern support. Extend that to ADDXri. llvm-svn: 297874	2017-03-15 19:20:59 +00:00
Daniel Sanders	a228df75c0	[globalisel] LLVM_BUILD_GLOBAL_ISEL=OFF should prevent GlobalISel instruction selector from being declared. llvm-svn: 297786	2017-03-14 22:09:29 +00:00
Daniel Sanders	8a4bae9993	[globalisel][tblgen] Add support for ComplexPatterns Summary: Adds a new kind of MachineOperand: MO_Placeholder. This operand must not appear in the MIR and only exists as a way of creating an 'uninitialized' operand until a matcher function overwrites it. Depends on D30046, D29712 Reviewers: t.p.northover, ab, rovka, aditya_nandakumar, javed.absar, qcolombet Reviewed By: qcolombet Subscribers: dberris, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D30089 llvm-svn: 297782	2017-03-14 21:32:08 +00:00
Nirav Dave	54e22f33d9	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Recommitting with compile time improvements Recommitting after fixup of 32-bit aliasing sign offset bug in DAGCombiner. * Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner as it separates non-interfering loads/stores from the store-merging logic. When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly construct wider loads, but then promote them to float operations which appear to require more expensive constant generation). Some minor peephole optimizations to deal with improved SubDAG shapes (listed below) Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests. 5. Remove Chain dependencies of Memory operations on CopyFromReg nodes as these are captured by data dependence 6. Forward load-store values through TokenFactors containing {CopyToReg,CopyFromReg} Values. 7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll) 8. Store merging for the ARM target is restricted to 32-bit as in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added. 
This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting: CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle llvm-svn: 297695	2017-03-14 00:34:14 +00:00
Balaram Makam	cacc08bb46	[AArch64] Map Sched Read/Write resources for Falkor. llvm-svn: 297611	2017-03-13 10:42:17 +00:00
Azharuddin Mohammed	473b75c3d5	Remove CRC32 instructions from AArch64InstrInfo::hasShiftedReg Summary: A53 scheduler causes an assertion failure on all CRC instructions: include/llvm/CodeGen/MachineInstr.h:280: const llvm::MachineOperand &llvm::MachineInstr::getOperand(unsigned int) const: Assertion `i < getNumOperands() && "getOperand() out of range!"' failed. The case statements corresponding to CRC instructions are incorrect and should be removed. Also adding a testcase while on this. Reviewers: t.p.northover, javed.absar, apazos, rengolin Reviewed By: rengolin Subscribers: evandro, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D30274 llvm-svn: 297582	2017-03-12 14:02:32 +00:00
Evandro Menezes	8f70e249a7	[AArch64, X86] Additional debug information for MacroFusion In order to make it easier to parse information about the performance of MacroFusion, this patch adds the function and the instruction names to the debug output of this pass. llvm-svn: 297504	2017-03-10 20:20:04 +00:00
Daniel Sanders	52b4ce727a	Recommit: [globalisel] Change LLT constructor string into an LLT-based object that knows how to generate it. Summary: This will allow future patches to inspect the details of the LLT. The implementation is now split between the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns. Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem. The problem with the previous commit appears to have been that TableGen was including CodeGen/LowLevelType.h instead of Support/LowLevelTypeImpl.h. Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30046 llvm-svn: 297241	2017-03-07 23:20:35 +00:00
Joel Jones	2852088126	[AArch64] Vulcan is now ThunderX2T99 Broadcom Vulcan is now Cavium ThunderX2T99. LLVM Bugzilla: http://bugs.llvm.org/show_bug.cgi?id=32113 Minor fixes for the alignments of loops and functions for ThunderX T81/T83/T88 (better performance). Patch was tested with SpecCPU2006. Patch by Stefan Teleman Differential Revision: https://reviews.llvm.org/D30510 llvm-svn: 297190	2017-03-07 19:42:40 +00:00
Daniel Sanders	8ebec37d26	Revert r297177: Change LLT constructor string into an LLT-based object ... More module problems. This time it only showed up in the stage 2 compile of clang-x86_64-linux-selfhost-modules-2 but not the stage 1 compile. Somehow, this change causes the build to need Attributes.gen before it's been generated. llvm-svn: 297188	2017-03-07 19:21:23 +00:00
Daniel Sanders	8612326a08	[globalisel] Change LLT constructor string into an LLT-based object that knows how to generate it. Summary: This will allow future patches to inspect the details of the LLT. The implementation is now split between the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns. Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem. Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30046 llvm-svn: 297177	2017-03-07 18:32:25 +00:00
Tim Northover	c2c545b8f7	GlobalISel: restrict G_EXTRACT instruction to just one operand. A bit more painful than G_INSERT because it was more widely used, but this should simplify the handling of extract operations in most locations. llvm-svn: 297100	2017-03-06 23:50:28 +00:00
Chad Rosier	9a70c7c02a	[AArch64][Redundant Copy Elim] Add support for CMN and shifted imm. This patch extends the current functionality of the AArch64 redundant copy elimination pass to handle CMN instructions as well as shifted immediates. Differential Revision: https://reviews.llvm.org/D30576. llvm-svn: 297078	2017-03-06 21:20:00 +00:00
Tim Northover	3e6a7afd81	GlobalISel: constrain G_INSERT to inserting just one value per instruction. It's much easier to reason about single-value inserts and no-one was actually using the variadic variants before. llvm-svn: 296923	2017-03-03 23:05:47 +00:00
Chandler Carruth	ce52b80744	[SDAG] Revert r296476 (and r296486, r296668, r296690). This patch causes compile times for some patterns to explode. I have a (large, unreduced) test case that slows down by more than 20x and several test cases slow down by 2x. I'm sending some of the test cases directly to Nirav and following up with more details in the review log, but this should unblock anyone else hitting this. llvm-svn: 296862	2017-03-03 10:02:25 +00:00
Sjoerd Meijer	69bccf96bd	[AArch64AsmParser] rewrite of function parseSysAlias This is a cleanup/rewrite of the parseSysAlias function. It was not using the tablegen instruction descriptions, but was “manually” matching the mnemonics and recreating the operands whereas all this information is already in tablegen; all this code has been replaced with calls to lookupXYZByName tablegen calls. Differential Revision: https://reviews.llvm.org/D30491 llvm-svn: 296857	2017-03-03 08:12:47 +00:00
Chad Rosier	ea25eca04a	[AArch64] Extend redundant copy elimination pass to handle non-zero stores. This patch extends the current functionality of the AArch64 redundant copy elimination pass to handle non-zero cases such as: BB#0: cmp x0, #1 b.eq .LBB0_1 .LBB0_1: orr x0, xzr, #0x1 ; <-- redundant copy; x0 known to hold #1. Differential Revision: https://reviews.llvm.org/D29344 llvm-svn: 296809	2017-03-02 20:48:11 +00:00
Tim Northover	e80d6d1360	GlobalISel: record correct stack usage for signext parameters. The CallingConv.td rules allocate 8 bytes for these kinds of arguments on AAPCS targets, but we were only recording the smaller amount. The difference is theoretical on AArch64 because we don't actually store more than the smaller amount, but it's still much better to have these two components in agreement. Based on Diana Picus's ARM equivalent patch (where it matters a lot more). llvm-svn: 296754	2017-03-02 15:34:18 +00:00
Matthew Simpson	aee9771ae2	[ARM/AArch64] Update costs for interleaved accesses with wide types After r296750, we're able to match interleaved accesses having types wider than 128 bits. This patch updates the associated TTI costs. Differential Revision: https://reviews.llvm.org/D29675 llvm-svn: 296751	2017-03-02 15:15:35 +00:00
Matthew Simpson	1bfa159db9	[ARM/AArch64] Support wide interleaved accesses This patch teaches (ARM|AArch64)ISelLowering.cpp to match illegal vector types to interleaved access intrinsics as long as the types are multiples of the vector register width. A "wide" access will now be mapped to multiple interleave intrinsics similar to the way in which non-interleaved accesses with illegal types are legalized into multiple accesses. I'll update the associated TTI costs (in getInterleavedMemoryOpCost) as a follow-on. Differential Revision: https://reviews.llvm.org/D29466 llvm-svn: 296750	2017-03-02 15:11:20 +00:00
Ahmed Bougacha	120ae22d70	[GlobalISel] Add a way for targets to enable GISel. Until now, we've had to use -global-isel to enable GISel. But using that on other targets that don't support it will result in an abort, as we can't build a full pipeline. Additionally, we want to experiment with enabling GISel by default for some targets: we can't just enable GISel by default, even among those targets that do have some support, because the level of support varies. This first step adds an override for the target to explicitly define its level of support. For AArch64, do that using a new command-line option (I know..): -aarch64-enable-global-isel-at-O=<N> Where N is the opt-level below which GISel should be used. Default that to -1, so that we still don't enable GISel anywhere. We're not there yet! While there, remove a couple LLVM_UNLIKELYs. Building the pipeline is such a cold path that in practice that shouldn't matter at all. llvm-svn: 296710	2017-03-01 23:33:08 +00:00
Daniel Sanders	983c9b98e9	Revert r296474 - [globalisel] Change LLT constructor string into an LLT subclass that knows how to generate it. There's a circular dependency that's only revealed when LLVM_ENABLE_MODULES=1. llvm-svn: 296478	2017-02-28 15:00:27 +00:00
Nirav Dave	f830dec3f2	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Recommitting after fixup of 32-bit aliasing sign offset bug in DAGCombiner. * Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner as it separates non-interfering loads/stores from the store-merging logic. When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly construct wider loads, but then promote them to float operations which appear to require more expensive constant generation). Some minor peephole optimizations to deal with improved SubDAG shapes (listed below) Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests. 5. Remove Chain dependencies of Memory operations on CopyFromReg nodes as these are captured by data dependence 6. Forward load-store values through TokenFactors containing {CopyToReg,CopyFromReg} Values. 7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll) 8. Store merging for the ARM target is restricted to 32-bit as in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. 
Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting: CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle llvm-svn: 296476	2017-02-28 14:24:15 +00:00
Daniel Sanders	a5afdefec6	[globalisel] Change LLT constructor string into an LLT subclass that knows how to generate it. Summary: This will allow future patches to inspect the details of the LLT. The implementation is now split between the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns. Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem. Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30046 llvm-svn: 296474	2017-02-28 14:21:31 +00:00
Sjoerd Meijer	32ecac7ac8	AArch64InstPrinter: rewrite of printSysAlias This is a cleanup/rewrite of the printSysAlias function. This was not using the tablegen instruction descriptions, but was "manually" decoding the instructions. This has been replaced with calls to lookup_XYZ_ByEncoding tablegen calls. This revealed several problems. First, instruction IVAU had the wrong encoding. This was cancelled out by the parser that incorrectly matched the wrong encoding. Second, instruction CVAP was missing from the SystemOperands tablegen descriptions, so this has been added. And third, the required target features were not captured in the tablegen descriptions, so support for this has also been added. Differential Revision: https://reviews.llvm.org/D30329 llvm-svn: 296343	2017-02-27 14:45:34 +00:00
Sjoerd Meijer	6d171006f4	AArch64AsmParser: don't try to parse “[1]” for non-vector register operands There are no instructions that have "[1]" as part of the assembly string; FMOVXDhighr is out of date. This removes dead code. Differential Revision: https://reviews.llvm.org/D30165 llvm-svn: 296327	2017-02-27 10:51:11 +00:00
Nirav Dave	73cd0194cf	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r296252 until 256-bit operations are more efficiently generated in X86. llvm-svn: 296279	2017-02-26 01:27:32 +00:00
Nirav Dave	beabf456df	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Recommitting after fixup of 32-bit aliasing sign offset bug in DAGCombiner. * Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner as it separates non-interfering loads/stores from the store-merging logic. When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly construct wider loads, but then promote them to float operations which appear to require more expensive constant generation). Some minor peephole optimizations to deal with improved SubDAG shapes (listed below) Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests. 5. Remove Chain dependencies of Memory operations on CopyFromReg nodes as these are captured by data dependence 6. Forward load-store values through TokenFactors containing {CopyToReg,CopyFromReg} Values. 7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll) 8. Store merging for the ARM target is restricted to 32-bit as in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. 
Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting: CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle llvm-svn: 296252	2017-02-25 11:43:58 +00:00
Junmo Park	7ff4c045eb	Minor code cleanup. NFC. llvm-svn: 296207	2017-02-25 00:08:53 +00:00
Daniel Sanders	066ebbfd46	[globalisel] Decouple src pattern operands from dst pattern operands. Summary: This isn't testable for AArch64 by itself so this patch also adds support for constant immediates in the pattern and physical register uses in the result. The new IntOperandMatcher matches the constant in patterns such as '(set $rd:GPR32, (G_XOR $rs:GPR32, -1))'. It's always safe to fold immediates into an instruction so this is the first rule that will match across multiple BB's. The Renderer hierarchy is responsible for adding operands to the result instruction. Renderers can copy operands (CopyRenderer) or add physical registers (in particular %wzr and %xzr) to the result instruction in any order (OperandMatchers now import the operand names from SelectionDAG to allow renderers to access any operand). This allows us to emit the result instruction for: %1 = G_XOR %0, -1 --> %1 = ORNWrr %wzr, %0 %1 = G_XOR -1, %0 --> %1 = ORNWrr %wzr, %0 although the latter is untested since the matcher/importer has not been taught about commutativity yet. Added BuildMIAction which can build new instructions and mutate them where possible. W.r.t the mutation aspect, MatchActions are now told the name of an instruction they can recycle and BuildMIAction will emit mutation code when the renderers are appropriate. They are appropriate when all operands are rendered using CopyRenderer and the indices are the same as the matcher. This currently assumes that all operands have at least one matcher. Finally, this change also fixes a crash in AArch64InstructionSelector::select() caused by an immediate operand passing isImm() rather than isCImm(). This was uncovered by the other changes and was detected by existing tests. 
Depends on D29711 Reviewers: t.p.northover, ab, qcolombet, rovka, aditya_nandakumar, javed.absar Reviewed By: rovka Subscribers: aemerson, dberris, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D29712 llvm-svn: 296131	2017-02-24 15:43:30 +00:00
Petr Hosek	a7d5916308	[Fuchsia] Use thread-pointer ABI slots for stack-protector and safe-stack The Fuchsia ABI defines slots from the thread pointer where the stack-guard value for stack-protector, and the unsafe stack pointer for safe-stack, are stored. This parallels the Android ABI support. Patch by Roland McGrath Differential Revision: https://reviews.llvm.org/D30237 llvm-svn: 296081	2017-02-24 03:10:10 +00:00
Geoff Berry	6bb79157dd	[AArch64] Extend AArch64RedundantCopyElimination to do simple copy propagation. Summary: Extend AArch64RedundantCopyElimination to catch cases where the register that is known to be zero is COPY'd in the predecessor block. Before this change, this pass would catch cases like: CBZW %W0, <BB#1> BB#1: %W0 = COPY %WZR // removed After this change, cases like the one below are also caught: %W0 = COPY %W1 CBZW %W1, <BB#1> BB#1: %W0 = COPY %WZR // removed This change results in a 4% increase in static copies removed by this pass when compiling the llvm test-suite. It also fixes regressions caused by doing post-RA copy propagation (a separate change to be put up for review shortly). Reviewers: junbuml, mcrosier, t.p.northover, qcolombet, MatzeB Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D30113 llvm-svn: 295863	2017-02-22 19:10:45 +00:00
Evandro Menezes	a8d3301ee1	[AArch64, X86] Add statistics for the MacroFusion pass llvm-svn: 295777	2017-02-21 22:16:13 +00:00
Evandro Menezes	b9b7f4b8d3	[AArch64, X86] Guard against both instrs being wild cards If both instrs are wild cards, the result can be a crash. llvm-svn: 295776	2017-02-21 22:16:11 +00:00
Geoff Berry	5d534b6a11	[CodeGenPrepare] Sink and duplicate more 'and' instructions. Summary: Rework the code that was sinking/duplicating (icmp and, 0) sequences into blocks where they were being used by conditional branches to form more tbz instructions on AArch64. The new code is more general in that it just looks for 'and's that have all icmp 0's as users, with a target hook used to select which subset of 'and' instructions to consider. This change also enables 'and' sinking for X86, where it is more widely beneficial than on AArch64. The 'and' sinking/duplicating code is moved into the optimizeInst phase of CodeGenPrepare, where it can take advantage of the fact that OptimizeCmpExpression has already sunk/duplicated any icmps into the blocks where they are used. One minor complication from this change is that optimizeLoadExt needed to be updated to always mark 'and's it has determined should be in the same block as their feeding load in the InsertedInsts set to avoid an infinite loop of hoisting and sinking the same 'and'. This change fixes a regression on X86 in the tsan runtime caused by moving GVNHoist to a later place in the optimization pipeline (see PR31382). Reviewers: t.p.northover, qcolombet, MatzeB Subscribers: aemerson, mcrosier, sebpop, llvm-commits Differential Revision: https://reviews.llvm.org/D28813 llvm-svn: 295746	2017-02-21 18:53:14 +00:00
Sjoerd Meijer	e22a79e898	AArch64AsmParser: tablegen the isBranchTarget helper functions Use tablegen to autogenerate isBranchTarget helper functions. This is a cleanup that removes almost identical functions that differ only in a few constants. Differential Revision: https://reviews.llvm.org/D30160 llvm-svn: 295649	2017-02-20 10:57:54 +00:00
Davide Italiano	1aef59eb44	[AArch64] Prefer static_cast<> to C-style cast. NFCI. llvm-svn: 295615	2017-02-19 21:31:14 +00:00
Simon Pilgrim	b092166a76	[AArch64] Fix enumeral/non-enumeral conditional expression warning. gcc only allows you to mix enums / ints if they have the same signedness. llvm-svn: 295577	2017-02-18 22:50:28 +00:00
Matthias Braun	d9a59a8df8	AArch64LoadStoreOptimizer: Correctly clear kill flags When promoting the Load of a Store-Load pair to a COPY all kill flags between the store and the load need to be cleared. rdar://30402435 Differential Revision: https://reviews.llvm.org/D30110 llvm-svn: 295512	2017-02-17 23:15:03 +00:00
Joel Jones	ab0f3b43e3	[AArch64] Add Cavium ThunderX support This set of patches adds support for Cavium ThunderX ARM64 processors: * ThunderX * ThunderX T81 * ThunderX T83 * ThunderX T88 Patch by Stefan Teleman Differential Revision: https://reviews.llvm.org/D28891 llvm-svn: 295475	2017-02-17 18:34:24 +00:00
Sjoerd Meijer	cb2d950214	[AArch64] AArch64AsmParser clean up of isImmediate functions. NFC Regression test neon-diagnostics.s needed changing because it now produces a more specific diagnostic about the immediate ranges. One change in the expected error message is not obvious, but there are multiple candidates and it happens to pick the immediate diagnostic. Differential Revision: https://reviews.llvm.org/D29939 llvm-svn: 295331	2017-02-16 15:52:22 +00:00
Tim Northover	9136617a3f	GlobalISel: legalize va_arg on AArch64. Uses a Custom implementation because the slot sizes being a multiple of the pointer size isn't really universal, even for the architectures that do have a simple "void *" va_list. llvm-svn: 295255	2017-02-15 23:22:50 +00:00
Ahmed Bougacha	f8acf568f1	[AArch64] Make am_ldrlit an iPTR - not OtherVT - operand. NFC-ish. am_ldrlit diverged from am_brcond in r207105, but kept the OtherVT operand type. It made sense for branch targets, as those are represented as MVT::Other in SDAG. But loads operate on pointers. This shouldn't have an observable effect on any in-tree code, but helps make the patterns consistent for external users. llvm-svn: 295229	2017-02-15 20:38:31 +00:00
Tim Northover	398c5f57f9	GlobalISel: deal with new G_PTR_MASK instruction on AArch64. It's just an AND-immediate instruction for us, surprisingly simple to select. llvm-svn: 295104	2017-02-14 20:56:29 +00:00
Tim Northover	48dfa1a6ed	GlobalISel: represent atomic loads & stores via the MachineMemOperand. Also make sure the AArch64 backend doesn't try to convert them into normal loads and stores. llvm-svn: 294993	2017-02-13 22:14:16 +00:00
Arnold Schwaighofer	26f016f143	SwiftCC: swifterror register cannot be the same as the base register Functions that have a dynamic alloca require a base register which is defined to be X19 on AArch64 and r6 on ARM. We have defined the swifterror register to be the same register. Use a different callee save register for swifterror instead: X21 on AArch64, R8 on ARM rdar://30433803 llvm-svn: 294551	2017-02-09 01:52:17 +00:00
Tim Northover	e041841811	GlobalISel: legalize G_FPOW to a libcall on AArch64. There's no instruction to implement it. llvm-svn: 294531	2017-02-08 23:23:39 +00:00
Arnold Schwaighofer	db7bbcbe78	[ARM/AArch ISel] SwiftCC: First parameters that are marked swiftself are not 'this returns' We mark X0 as preserved by a call that passes the returned parameter. x0 = ... fun(x0) // no implicit def of x0 This no longer is valid if we pass the parameter in a different register than the returned value as is the case with a swiftself parameter (passed in x20). x20 = ... fun(x20) // there should be an implicit def of x8 rdar://30425845 llvm-svn: 294527	2017-02-08 22:30:47 +00:00
Amara Emerson	c3a4b282bb	Revert r294437 as it broke an asan buildbot. llvm-svn: 294523	2017-02-08 21:41:16 +00:00
Tim Northover	9dd78f8a6d	GlobalISel: select G_[SU]MULH on AArch64. Hopefully this'll be nuked by tablegen pretty soon, but until then it's reasonably important for supporting C++ operator new[]. llvm-svn: 294520	2017-02-08 21:22:25 +00:00
Tim Northover	0a9b27933a	GlobalISel: expand mul-with-overflow into mul-hi on AArch64. AArch64 has specific instructions to multiply two numbers at double the width and produce the high part of the result. These can be used to implement LLVM's mul.with.overflow instructions fairly simply. Helps with C++ operator new[]. llvm-svn: 294519	2017-02-08 21:22:15 +00:00
Tim Northover	e9600d861c	GlobalISel: select G_VASTART on iOS AArch64. The AAPCS ABI is substantially more complicated so that's coming in a separate patch. For now we can generate correct code for iOS though. llvm-svn: 294493	2017-02-08 17:57:27 +00:00
Tim Northover	f19d467ff6	GlobalISel: translate @llvm.va_start intrinsic. Because we need to preserve the memory access being performed we need a separate instruction to represent this. llvm-svn: 294492	2017-02-08 17:57:20 +00:00
Amara Emerson	fecdb36f92	[AArch64][TableGen] Skip tied result operands for InstAlias This patch checks the number of operands in the resulting instruction instead of just the alias, then skips over tied operands when generating the printing method. This allows us to generate the preferred assembly syntax for the AArch64 'ins' instruction, which should always be displayed as 'mov' according to the ARMARM. Several unit tests have changed as a result, but only to reflect the preferred disassembly. Some other InstAlias patterns (movk/bic/orr) needed a slight adjustment to stop them becoming the default and breaking other unit tests. Patch by Graham Hunter. Differential Revision: https://reviews.llvm.org/D29219 llvm-svn: 294437	2017-02-08 11:28:08 +00:00
Tim Northover	868332d6bf	GlobalISel: legalize narrow G_SELECTS on AArch64. Otherwise there aren't any patterns to select them. llvm-svn: 294261	2017-02-06 23:41:27 +00:00
Tim Northover	6f2db57dae	GlobalISel: fall back gracefully when we can't map an operand's size. AArch64 was asserting when it was asked to provide a register-bank of a size it couldn't deal with (in this case an s128 IMPLICIT_DEF). But we want a robust fallback path so this isn't allowed. llvm-svn: 294248	2017-02-06 21:57:06 +00:00
Tim Northover	0e6afbdd77	GlobalISel: legalize G_INSERT instructions We don't handle all cases yet (see arm64-fallback.ll for an example), but this is enough to cover most common C++ code so it's a good place to start. llvm-svn: 294247	2017-02-06 21:56:47 +00:00
John Brawn	3a9c842a9d	[AArch64] Fix incorrect MachinePointerInfo in splitStoreSplat When splitting up one store into several in splitStoreSplat we have to make sure we get the MachinePointerInfo right, otherwise alias analysis thinks they all store to the same location. This can then cause invalid scheduling later on. Differential Revision: https://reviews.llvm.org/D29446 llvm-svn: 294203	2017-02-06 18:07:20 +00:00
Eugene Zelenko	939f6b0167	[AArch64] Fix some Include What You Use warnings; other minor fixes (NFC). This is preparation to reduce MCExpr.h dependencies. llvm-svn: 294053	2017-02-03 21:49:13 +00:00
Nirav Dave	93f9d5ce04	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r293893 which is miscompiling lua on ARM and bootstrapping for x86-windows. llvm-svn: 293915	2017-02-02 18:24:55 +00:00
Nirav Dave	4442667fc5	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Recommiting after fixing X86 inc/dec chain bug. * Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly constructs wider loads, but then promotes them to float operations which appear but requires more expensive constant generation). Some minor peephole optimizations to deal with improved SubDAG shapes (listed below) Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests. 5. Remove Chain dependencies of Memory operations on CopyfromReg nodes as these are captured by data dependence 6. Forward loads-store values through tokenfactors containing {CopyToReg,CopyFromReg} Values. 7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll) 8. Store merging for the ARM target is restricted to 32-bit as some in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. 
Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting: CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle llvm-svn: 293893	2017-02-02 14:39:42 +00:00
Matthias Braun	5b49f95592	AArch64RegisterInfo: Simplify getReservedReg(); NFC After marking a 32bit register and all its super registers the 64bit register does not need to be marked again. llvm-svn: 293855	2017-02-02 02:23:25 +00:00
Eugene Zelenko	c5eb8e29d0	[AArch64] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 293836	2017-02-01 22:56:06 +00:00
Matthew Simpson	ba5cf9dfee	[LV] Move interleaved access helper functions to VectorUtils (NFC) This patch moves some helper functions related to interleaved access vectorization out of LoopVectorize.cpp and into VectorUtils.cpp. We would like to use these functions in a follow-on patch that improves interleaved load and store lowering in (ARM/AArch64)ISelLowering.cpp. One of the functions was already duplicated there and has been removed. Differential Revision: https://reviews.llvm.org/D29398 llvm-svn: 293788	2017-02-01 17:45:46 +00:00
NAKAMURA Takumi	468487d71a	*MacroFusion.cpp: Suppress warnings to eliminate \param(s). [-Wdocumentation] llvm-svn: 293744	2017-02-01 07:30:46 +00:00
Evandro Menezes	455382ea22	[AArch64] Add new target feature to fuse literal generation This feature enables the fusion of such operations on Cortex A57, as recommended in its Software Optimisation Guide, sections 4.14 and 4.15. Differential revision: https://reviews.llvm.org/D28698 llvm-svn: 293739	2017-02-01 02:54:42 +00:00
Evandro Menezes	b21fb29c26	[AArch64] Add new subtarget feature to fuse AES crypto operations This feature enables the fusion of such operations on Cortex A57, as recommended in its Software Optimisation Guide, section 4.13, and on Exynos M1. Differential revision: https://reviews.llvm.org/D28491 llvm-svn: 293738	2017-02-01 02:54:39 +00:00
Evandro Menezes	94edf02923	[CodeGen] Move MacroFusion to the target This patch moves the class for scheduling adjacent instructions, MacroFusion, to the target. In AArch64, it also expands the fusion to all instructions pairs in a scheduling block, beyond just among the predecessors of the branch at the end. Differential revision: https://reviews.llvm.org/D28489 llvm-svn: 293737	2017-02-01 02:54:34 +00:00
Kristof Beyls	65a12c012f	[GlobalISel] Add support for indirectbr Differential Revision: https://reviews.llvm.org/D28079 llvm-svn: 293470	2017-01-30 09:13:18 +00:00
Quentin Colombet	24203cf997	[AArch64][LegalizerInfo] Specify the type of the opcode. This is an attempt to fix the win7 bot that does not seem to be very good at inferring the type when it gets used in an initializer list. llvm-svn: 293246	2017-01-27 01:13:30 +00:00
Quentin Colombet	e15e460c05	Revert "[AArch64][LegalizerInfo] Specify the type of the initialization list." This reverts commit r293238. Even with that the win7 bot is still failing: http://lab.llvm.org:8011/builders/lld-x86_64-win7/builds/3862 llvm-svn: 293245	2017-01-27 01:13:25 +00:00
Quentin Colombet	86fc8305ec	[AArch64][LegalizerInfo] Specify the type of the initialization list. This is an attempt to fix the win7 bot that does not seem to be very good at inferring the type. llvm-svn: 293238	2017-01-27 00:39:03 +00:00
Balaram Makam	b73d2962ba	[AArch64] Refine Kryo Machine Model Summary: Refine floating point SQRT and DIV with accurate latency information. Reviewers: mcrosier Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D29191 llvm-svn: 293204	2017-01-26 20:10:41 +00:00
Nirav Dave	d32a421f75	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r293184 which is failing in LTO builds llvm-svn: 293188	2017-01-26 16:46:13 +00:00
Nirav Dave	de6516c466	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. * Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly constructs wider loads, but then promotes them to float operations which appear but requires more expensive constant generation). Some minor peephole optimizations to deal with improved SubDAG shapes (listed below) Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests. 5. Remove Chain dependencies of Memory operations on CopyfromReg nodes as these are captured by data dependence 6. Forward loads-store values through tokenfactors containing {CopyToReg,CopyFromReg} Values. 7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll) 8. Store merging for the ARM target is restricted to 32-bit as some in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. 
Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting: CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle llvm-svn: 293184	2017-01-26 16:02:24 +00:00
Jonas Paulsson	8e2f948ef0	[TargetTransformInfo] Refactor and improve getScalarizationOverhead() Refactoring to remove duplications of this method. New method getOperandsScalarizationOverhead() that looks at the present unique operands and add extract costs for them. Old behaviour was to just add extract costs for one operand of the type always, which still happens in getArithmeticInstrCost() if no operands are provided by the caller. This is a good start of improving on this, but there are more places that can be improved by using getOperandsScalarizationOverhead(). Review: Hal Finkel https://reviews.llvm.org/D29017 llvm-svn: 293155	2017-01-26 07:03:25 +00:00
Serge Rogatch	bc2d34394d	[XRay][AArch64] More staging for tail call support in XRay on AArch64 - in LLVM Summary: This patch prepares more for tail call support in XRay. Until the logging part supports tail calls, this is just staging, so it seems LLVM part is mostly ready with this patch. Related: https://reviews.llvm.org/D28948 (compiler-rt) Reviewers: dberris, rengolin Reviewed By: dberris Subscribers: llvm-commits, iid_iunknown, aemerson Differential Revision: https://reviews.llvm.org/D28947 llvm-svn: 293080	2017-01-25 20:21:49 +00:00
Chad Rosier	072e70b365	[AArch64] Minor code refactoring. NFC. llvm-svn: 293063	2017-01-25 15:56:59 +00:00
Ahmed Bougacha	05a5f7dc0b	[GlobalISel] Generate selector for more integer binop patterns. This surprisingly isn't NFC because there are patterns to select GPR sub to SUBSWrr (rather than SUBWrr/rs); SUBS is later optimized to SUB if NZCV is dead. From ISel's perspective, both are fine. llvm-svn: 293010	2017-01-25 02:41:38 +00:00
Eugene Zelenko	11f6907f40	[AArch64] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 292996	2017-01-25 00:29:26 +00:00
Chad Rosier	8e11fbd15d	[AArch64] Fix typo. NFC. llvm-svn: 292959	2017-01-24 18:08:10 +00:00
Evandro Menezes	7784cacd91	[AArch64] Rename 'no-quad-ldst-pairs' to 'slow-paired-128' In order to follow the pattern of the existing 'slow-misaligned-128store' option, rename the option 'no-quad-ldst-pairs' to 'slow-paired-128'. llvm-svn: 292954	2017-01-24 17:34:31 +00:00
Ahmed Bougacha	b6137063eb	[AArch64][GlobalISel] Legalize narrow scalar fp->int conversions. Since we're now avoiding operations using narrow scalar integer types, we have to legalize the integer side of the FP conversions. This requires teaching the legalizer how to do that. llvm-svn: 292828	2017-01-23 21:10:14 +00:00
Ahmed Bougacha	cfb384d39d	[AArch64][GlobalISel] Legalize narrow scalar ops again. Since r279760, we've been marking as legal operations on narrow integer types that have wider legal equivalents (for instance, G_ADD s8). Compared to legalizing these operations, this reduced the amount of extends/truncates required, but was always a weird legalization decision made at selection time. So far, we haven't been able to formalize it in a way that permits the selector generated from SelectionDAG patterns to be sufficient. Using a wide instruction (say, s64), when a narrower instruction exists (s32) would introduce register class incompatibilities (when one narrow generic instruction is selected to the wider variant, but another is selected to the narrower variant). It's also impractical to limit which narrow operations are matched for which instruction, as restricting "narrow selection" to ranges of types clashes with potentially incompatible instruction predicates. Concerns were also raised regarding MIPS64's sign-extended register assumptions, as well as wrapping behavior. See discussions in https://reviews.llvm.org/D26878. Instead, legalize the operations. Should we ever revert to selecting these narrow operations, we should try to represent this more accurately: for instance, by separating a "concrete" type on operations, and an "underlying" type on vregs, we could move the "this narrow-looking op is really legal" decision to the legalizer, and let the selector use the "underlying" vreg type only, which would be guaranteed to map to a register class. In any case, we eventually should mitigate: - the performance impact by selecting no-op extract/truncates to COPYs (which we currently do), and the COPYs to register reuses (which we don't do yet). - the compile-time impact by optimizing away extract/truncate sequences in the legalizer. llvm-svn: 292827	2017-01-23 21:10:05 +00:00
Matthias Braun	28eae8f4e0	LiveRegUnits: Add accumulateBackward() function Re-Commit r292543 with a fix for the situation when the chain end is MBB.end(). This function can be used to accumulate the set of all read and modified register in a sequence of instructions. Use this code in AArch64A57FPLoadBalancing::scavengeRegister() to prove the concept. - The AArch64A57LoadBalancing code is using a backwards analysis now which is irrespective of kill flags. This is the main motivation for this change. Differential Revision: http://reviews.llvm.org/D22082 llvm-svn: 292705	2017-01-21 02:21:04 +00:00
Matthias Braun	2e8c11e4b3	AArch64LoadStoreOptimizer: Update kill flags when merging stores Kill flags need to be updated correctly when moving stores up/down to form store pair instructions. Those invalid flags have been ignored before but as of r290014 they are recognized when using -mllvm -verify-machineinstrs. Also simplifies test/CodeGen/AArch64/ldst-opt-dbg-limit.mir, renames it to ldst-opt.mir test and adds a new tests for this change. Differential Revision: https://reviews.llvm.org/D28875 llvm-svn: 292625	2017-01-20 18:04:27 +00:00
Matthias Braun	d9217c0b86	Revert "LiveRegUnits: Add accumulateBackward() function" This seems to be breaking some bots. This reverts commit r292543. llvm-svn: 292574	2017-01-20 03:58:42 +00:00
Ahmed Bougacha	d294823930	[AArch64][GlobalISel] Widen scalar int->fp conversions. It's incorrect to ignore the higher bits of the integer source. Teach the legalizer how to widen it. llvm-svn: 292563	2017-01-20 01:37:24 +00:00
Matthias Braun	3ffeb68869	LiveRegUnits: Add accumulateBackward() function This function can be used to accumulate the set of all read and modified register in a sequence of instructions. Use this code in AArch64A57FPLoadBalancing::scavengeRegister() to prove the concept. - The AArch64A57LoadBalancing code is using a backwards analysis now which is irrespective of kill flags. This is the main motivation for this change. Differential Revision: http://reviews.llvm.org/D22082 llvm-svn: 292543	2017-01-20 00:16:17 +00:00
Kristof Beyls	e9412b4d47	[GlobalISel] Pointers are legal operands for G_SELECT on AArch64 Differential Revision: https://reviews.llvm.org/D28805 llvm-svn: 292481	2017-01-19 13:32:14 +00:00
Daniel Sanders	d64d5024a4	Re-commit: [globalisel] Tablegen-erate current Register Bank Information Summary: Adds a RegisterBank tablegen class that can be used to declare the register banks and an associated tablegen pass to generate the necessary code. Changes since first commit attempt: * Added missing guards * Added more missing guards * Found and fixed a use-after-free bug involving Twine locals Reviewers: t.p.northover, ab, rovka, qcolombet Reviewed By: qcolombet Subscribers: aditya_nandakumar, rengolin, kristof.beyls, vkalintiris, mgorny, dberris, llvm-commits, rovka Differential Revision: https://reviews.llvm.org/D27338 llvm-svn: 292478	2017-01-19 11:15:55 +00:00
Evandro Menezes	7960b2e19a	[AArch64] Generate literals by the little end ARM seems to prefer that long literals be formed from their little end in order to promote the fusion of the instrs pairs MOV/MOVK and MOVK/MOVK on Cortex A57 and others (v. "Cortex A57 Software Optimisation Guide", section 4.14). Differential revision: https://reviews.llvm.org/D28697 llvm-svn: 292422	2017-01-18 18:57:08 +00:00
Daniel Sanders	af76f989b5	Re-revert: [globalisel] Tablegen-erate current Register Bank Information More missing guards. My build didn't notice it due to a stale file left over from a Global ISel build. llvm-svn: 292369	2017-01-18 14:26:12 +00:00
Daniel Sanders	517b61cb69	Re-commit: [globalisel] Tablegen-erate current Register Bank Information Summary: Adds a RegisterBank tablegen class that can be used to declare the register banks and an associated tablegen pass to generate the necessary code. Changes since last commit: The new tablegen pass is now correctly guarded by LLVM_BUILD_GLOBAL_ISEL and this should fix the buildbots however it may not be the whole fix. The previous buildbot failures suggest there may be a memory bug lurking that I'm unable to reproduce (including when using asan) or spot in the source. If they re-occur on this commit then I'll need assistance from the bot owners to track it down. Reviewers: t.p.northover, ab, rovka, qcolombet Reviewed By: qcolombet Subscribers: aditya_nandakumar, rengolin, kristof.beyls, vkalintiris, mgorny, dberris, llvm-commits, rovka Differential Revision: https://reviews.llvm.org/D27338 llvm-svn: 292367	2017-01-18 14:17:50 +00:00
Tim Northover	33a1a0b001	GlobalISel: fix comparison order for G_FCMP As with G_ICMP we'd written the CSET instructions backwards. llvm-svn: 292285	2017-01-17 23:04:01 +00:00
Tim Northover	509091f9e0	GlobalISel: add callseq instructions to record stack usage llvm-svn: 292284	2017-01-17 22:43:34 +00:00
Tim Northover	d943354216	GlobalISel: correctly handle varargs Some platforms (notably iOS) use a different calling convention for unnamed vs named parameters in varargs functions, so we need to keep track of this information when translating calls. Since not many platforms are involved, the guts of the special handling is in the ValueHandler class (with a generic implementation that should work for most targets). llvm-svn: 292283	2017-01-17 22:30:10 +00:00
Chad Rosier	58fb5f5e58	[AArch64] Falkor supports Rounding Double Multiply Add/Subtract instructions. Falkor only partially implements the ARMv8.1a extensions, so this patch refactors the support for the SQRDML[A\|S]H instruction into a separate feature. Differential Revision: https://reviews.llvm.org/D28681 llvm-svn: 292142	2017-01-16 16:28:43 +00:00
Daniel Sanders	a83a1a69c5	Revert r292132: [globalisel] Tablegen-erate current Register Bank Information'... Several buildbots encountered a crash in tablegen when building this commit. Reverting while I investigate the cause. llvm-svn: 292136	2017-01-16 15:34:43 +00:00
Daniel Sanders	ab8194def0	[globalisel] Tablegen-erate current Register Bank Information Summary: Adds a RegisterBank tablegen class that can be used to declare the register banks and an associated tablegen pass to generate the necessary code. Reviewers: t.p.northover, ab, rovka, qcolombet Subscribers: aditya_nandakumar, rengolin, kristof.beyls, vkalintiris, mgorny, dberris, llvm-commits, rovka Differential Revision: https://reviews.llvm.org/D27338 llvm-svn: 292132	2017-01-16 15:20:43 +00:00
Benjamin Kramer	061f4a5fe6	Apply clang-tidy's performance-unnecessary-value-param to LLVM. With some minor manual fixes for using function_ref instead of std::function. No functional change intended. llvm-svn: 291904	2017-01-13 14:39:03 +00:00
Daniel Sanders	d6a1831ea7	[globalisel][aarch64] Make getCopyMapping() take register banks ID's rather than IsGPR booleans Summary: This allows the function to handle architectures with more than two register banks. Depends on D27978 Reviewers: ab, t.p.northover, rovka, qcolombet Subscribers: aditya_nandakumar, kristof.beyls, aemerson, rengolin, vkalintiris, dberris, llvm-commits, rovka Differential Revision: https://reviews.llvm.org/D27339 llvm-svn: 291902	2017-01-13 14:16:33 +00:00
Daniel Sanders	21ac840fca	[aarch64][globalisel] Move getValueMapping/getCopyMapping to AArch64GenRegisterBankInfo. NFC. Summary: We did lose a little specificity in the assertion messages for the PartialMappingIdx enumerators in this change but this was necessary to avoid unnecessary use of 'public:' and we haven't lost anything that can't be discovered easily in lldb. Once this is tablegen-erated we could also safely remove the assertions. Depends on D27976 Reviewers: t.p.northover, ab, rovka, qcolombet Subscribers: aditya_nandakumar, aemerson, rengolin, vkalintiris, dberris, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D27978 llvm-svn: 291900	2017-01-13 11:50:34 +00:00
Daniel Sanders	f81cf47e65	[aarch64][globalisel] Refactor getRegBankBaseIdxOffset() to remove the power-of-2 assumption. NFC Summary: We don't exploit it yet though Depends on D27976 Reviewers: t.p.northover, ab, rovka, qcolombet Subscribers: aditya_nandakumar, aemerson, rengolin, vkalintiris, dberris, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D27977 llvm-svn: 291899	2017-01-13 11:23:37 +00:00
Daniel Sanders	438a1ecc2c	[aarch64][globalisel] Move data into <Target>GenRegisterBankInfo. NFC. Summary: Depends on D27809 Reviewers: t.p.northover, rovka, qcolombet, ab Subscribers: aditya_nandakumar, aemerson, rengolin, vkalintiris, dberris, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D27976 llvm-svn: 291897	2017-01-13 10:53:57 +00:00
Diana Picus	116bbab4e4	[CodeGen] Rename MachineInstrBuilder::addOperand. NFC Rename from addOperand to just add, to match the other method that has been added to MachineInstrBuilder for adding more than just 1 operand. See https://reviews.llvm.org/D28057 for the whole discussion. Differential Revision: https://reviews.llvm.org/D28556 llvm-svn: 291891	2017-01-13 09:58:52 +00:00
Daniel Sanders	b7391dd3b4	[globalisel] Move as much RegisterBank initialization to the constructor as possible Summary: The register bank is now entirely initialized in the constructor. However, we still have the hardcoded number of register classes which will be dealt with in the TableGen patch (D27338) since we do not have access to this information to resolve this at this stage. The number of register classes is known to the TRI and to TableGen but the RegisterBank constructor is too early for the former and too late for the latter. This will be fixed when the data is tablegen-erated. Reviewers: t.p.northover, ab, rovka, qcolombet Subscribers: aditya_nandakumar, kristof.beyls, vkalintiris, llvm-commits, dberris Differential Revision: https://reviews.llvm.org/D27809 llvm-svn: 291770	2017-01-12 16:11:23 +00:00
Daniel Sanders	ae03595bfb	[globalisel] Initialize RegisterBanks with static data. Summary: Refactor the RegisterBank initialization to use static data. This requires GlobalISel implementations to rewrite calls to createRegisterBank() and addRegBankCoverage() into a call to setRegBankData(). Out of tree targets can use diff 4 of D27807 (https://reviews.llvm.org/D27807?id=84117) to have addRegBankCoverage() dump the register classes and other data that needs to be provided to setRegBankData(). This is the method that was used to generate the static data in this patch. Tablegen-eration of this static data will follow after some refactoring. Reviewers: t.p.northover, ab, rovka, qcolombet Subscribers: aditya_nandakumar, kristof.beyls, vkalintiris, llvm-commits, dberris Differential Revision: https://reviews.llvm.org/D27807 Differential Revision: https://reviews.llvm.org/D27808 llvm-svn: 291768	2017-01-12 15:32:10 +00:00
Mohammed Agabaria	2c96c43388	[X86] updating TTI costs for arithmetic instructions on X86\SLM arch. updated instructions: pmulld, pmullw, pmulhw, mulsd, mulps, mulpd, divss, divps, divsd, divpd, addpd and subpd. special optimization case which replaces pmulld with pmullw\pmulhw\pshuf seq. In case if the real operands bitwidth <= 16. Differential Revision: https://reviews.llvm.org/D28104 llvm-svn: 291657	2017-01-11 08:23:37 +00:00
Evandro Menezes	330e1b8945	[AArch64] Consider all vector types for FeatureSlowMisaligned128Store The original code considered only v2i64 as slow for this feature. This patch considers all 128-bit long vector types as slow candidates. In internal tests, extending this feature to all 128-bit vector types resulted in an overall improvement of 1% on Exynos M1. Differential revision: https://reviews.llvm.org/D27998 llvm-svn: 291616	2017-01-10 23:42:21 +00:00
Chad Rosier	3daffbf6a8	[AArch64] Add support for lowering bitreverse to the rbit instruction. Differential Revision: https://reviews.llvm.org/D28379 llvm-svn: 291575	2017-01-10 17:20:33 +00:00
Matthias Braun	258b847c4f	AArch64CollectLOH: Rewrite as block-local analysis. Re-apply r288561: This time with a fix where the ADDs that are part of a 3 instruction LOH would not invalidate the "LastAdrp" state. This fixes http://llvm.org/PR31361 Previously this pass was using up to 5% compile time in some cases which is a bit much for what it is doing. The pass featured a full blown data-flow analysis which in the default configuration was restricted to a single block. This rewrites the pass under the assumption that we only ever work on a single block. This is done in a single pass maintaining a state machine per general purpose register to catch LOH patterns. Differential Revision: https://reviews.llvm.org/D27329 This reverts commit 9e6cedb0a4f14364d6511597a9160305e7d34493. llvm-svn: 291266	2017-01-06 19:22:01 +00:00
Chad Rosier	e177185e79	[AArch64] Reduce vector insert/extract cost for Falkor. Differential Revision: https://reviews.llvm.org/D28403 llvm-svn: 291254	2017-01-06 18:03:26 +00:00
Eugene Zelenko	049b017538	[AArch64, Lanai] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 291197	2017-01-06 00:30:53 +00:00
Logan Chien	ce542eefe3	Code cleanup: Remove tab indents. llvm-svn: 291193	2017-01-05 23:41:33 +00:00
Geoff Berry	d46b6e8096	[AArch64] Fold some filled/spilled subreg COPYs Summary: Extend AArch64 foldMemoryOperandImpl() to handle folding spills of subreg COPYs with read-undef defs like: %vreg0:sub_32<def,read-undef> = COPY %WZR; GPR64:%vreg0 by widening the spilled physical source reg and generating: STRXui %XZR <fi#0> as well as folding fills of similar COPYs like: %vreg0:sub_32<def,read-undef> = COPY %vreg1; GPR64:%vreg0, GPR32:%vreg1 by generating: %vreg0:sub_32<def,read-undef> = LDRWui <fi#0> Reviewers: MatzeB, qcolombet Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D27425 llvm-svn: 291180	2017-01-05 21:51:42 +00:00
Mohammed Agabaria	23599ba794	Currently isLikelyComplexAddressComputation tries to figure out if the given stride seems to be 'complex' and needs some extra cost for address computation handling. This code seems to be target dependent which may not be the same for all targets. Passed the decision whether the given stride is complex or not to the target by sending stride information via SCEV to getAddressComputationCost instead of 'IsComplex'. Specifically at X86 targets we don't see any significant address computation cost in case of the strided access in general. Differential Revision: https://reviews.llvm.org/D27518 llvm-svn: 291106	2017-01-05 14:03:41 +00:00
Kristof Beyls	2252440b81	[GlobalISel] Fix AArch64 ICMP instruction selection Differential Revision: https://reviews.llvm.org/D28175 llvm-svn: 291097	2017-01-05 10:16:08 +00:00
Chad Rosier	63687e40bc	[AArch64] Update the feature set for Qualcomm's Falkor CPU. llvm-svn: 291010	2017-01-04 21:26:23 +00:00
Nirav Dave	0f9d111f97	[AArch64] Fix over-eager early-exit in load-store combiner Fix early-exit analysis for memory operation pairing when operations are not emitted in ascending order. Reviewers: mcrosier, t.p.northover Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D28251 llvm-svn: 291008	2017-01-04 21:21:46 +00:00
Dean Michael Berris	f7e7b938ea	[XRay] Merge instrumentation point table emission code into AsmPrinter. Summary: No need to have this per-architecture. While there, unify 32-bit ARM's behaviour with what changed elsewhere and start function names lowercase as per the coding standards. Individual entry emission code goes to the entry's own class. Fully tested on amd64, cross-builds on both ARMs and PowerPC. Reviewers: dberris Subscribers: aemerson, llvm-commits Differential Revision: https://reviews.llvm.org/D28209 llvm-svn: 290858	2017-01-03 04:30:21 +00:00
Chad Rosier	2ff37b8615	[AArch64][AsmParser] Add support for parsing shift/extend operands with symbols. Differential Revision: https://reviews.llvm.org/D27953 llvm-svn: 290609	2016-12-27 16:58:09 +00:00
Renato Golin	21da340f7a	[AArch64] Cortex-A57 FDIV/FSQRT scheduling fix (W-unit) According to the Cortex-A57 doc, FDIV/FSQRT instructions should use F0 unit (W-unit in AArch64SchedA57.td, the same as cryptography instructions), not F1 unit (X-unit in td, like ASIMD absolute diff accum SABA/UABA). This patch changes FDIV/FSQRT scheduling declarations to use A57UnitW instead of A57UnitX. Also, latencies for those instructions are corrected. Patch by Andrew Zhogin. llvm-svn: 290426	2016-12-23 12:51:41 +00:00
Quentin Colombet	f38015e5fe	[AArch64][CallLowering] Constraint registers on target specific instruction The InstructionSelect pass will not look at target specific instructions since they are already selected. As a result, the operands of target specific instructions must be properly constrained, because it is not going to fix them. This fixes invalid register classes on call instruction. llvm-svn: 290377	2016-12-22 21:56:31 +00:00
Haicheng Wu	9ac20a1e10	[AArch64] Correct the check of signed 9-bit imm in getIndexedAddressParts(). -256 is a legal indexed address part. Differential Revision: https://reviews.llvm.org/D27537 llvm-svn: 290296	2016-12-22 01:39:24 +00:00
Ahmed Bougacha	36f7035bd7	[GlobalISel] Add basic Selector-emitter tblgen backend. This adds a basic tablegen backend that analyzes the SelectionDAG patterns to find simple ones that are eligible for GlobalISel-emission. That's similar to FastISel, with one notable difference: we're not fed ISD opcodes, so we need to map the SDNode operators to generic opcodes. That's done using GINodeEquiv in TargetGlobalISel.td. Otherwise, this is mostly boilerplate, and lots of filtering of any kind of "complicated" pattern. On AArch64, this is sufficient to match G_ADD up to s64 (to ADDWrr/ADDXrr) and G_BR (to B). Differential Revision: https://reviews.llvm.org/D26878 llvm-svn: 290284	2016-12-21 23:26:20 +00:00
Haicheng Wu	6bb0e39321	[AArch64] Remove a redundant check. NFC. The case AM.Scale == 0 is already handled by the code right above. Differential Revision: https://reviews.llvm.org/D28003 llvm-svn: 290275	2016-12-21 21:40:47 +00:00
Matthias Braun	15b56e6973	Revert "AArch64CollectLOH: Rewrite as block-local analysis." It is still breaking Chrome. http://llvm.org/PR31361 This reverts commit r290026. llvm-svn: 290047	2016-12-17 18:53:11 +00:00
Matthias Braun	e813cf457a	AArch64CollectLOH: Rewrite as block-local analysis. Re-apply r288561: Liveness tracking should be correct now after r290014. Previously this pass was using up to 5% compile time in some cases which is a bit much for what it is doing. The pass featured a full blown data-flow analysis which in the default configuration was restricted to a single block. This rewrites the pass under the assumption that we only ever work on a single block. This is done in a single pass maintaining a state machine per general purpose register to catch LOH patterns. Differential Revision: https://reviews.llvm.org/D27329 llvm-svn: 290026	2016-12-17 01:15:59 +00:00
Matthias Braun	76bb4139dc	AArch64: Enable post-ra liveness updates Differential Revision: https://reviews.llvm.org/D27559 llvm-svn: 290014	2016-12-16 23:55:43 +00:00
Evandro Menezes	1b48bac330	[AArch64] Add FeatureSlowMisaligned128Store to Exynos M1 and M2 This feature now gates such stores after r289845. Thus the Exynos processors now need this feature. llvm-svn: 289898	2016-12-16 00:18:00 +00:00
Ahmed Bougacha	5228603387	[GlobalISel] Drop workaround for Legalizer member/class sharing a name. NFC. MachineLegalizer used to be the name of both the class and the member, causing GCC errors. r276522 fixed that by renaming the member to just 'Legalizer'. The 'class' workaround isn't necessary anymore; drop it. llvm-svn: 289848	2016-12-15 18:45:30 +00:00
Matthew Simpson	2c8de192a1	[AArch64] Guard Misaligned 128-bit store penalty by subtarget feature This patch checks that the SlowMisaligned128Store subtarget feature is set when penalizing such stores in getMemoryOpCost. Differential Revision: https://reviews.llvm.org/D27677 llvm-svn: 289845	2016-12-15 18:36:59 +00:00
Ahmed Bougacha	2a26a5f1f0	[AArch64][GlobalISel] Remove redundant RBI comments. NFC. It's brittle, and Doxygen already picks the overridden method's comment anyway. llvm-svn: 289844	2016-12-15 18:22:15 +00:00
Stephan Bergmann	17c7f70362	Replace APFloatBase static fltSemantics data members with getter functions At least the plugin used by the LibreOffice build (<https://wiki.documentfoundation.org/Development/Clang_plugins>) indirectly uses those members (through inline functions in LLVM/Clang include files in turn using them), but they are not exported by utils/extract_symbols.py on Windows, and accessing data across DLL/EXE boundaries on Windows is generally problematic. Differential Revision: https://reviews.llvm.org/D26671 llvm-svn: 289647	2016-12-14 11:57:17 +00:00
Evandro Menezes	aeec780e42	Add support for Samsung Exynos M3 (NFC) llvm-svn: 289613	2016-12-13 23:31:41 +00:00
Alina Sbirlea	77c5eaaeda	Generalize strided store pattern in interleave access pass Summary: This patch aims to generalize matching of the strided store accesses to more general masks. The more general rule is to have consecutive accesses based on the stride: [x, y, ... z, x+1, y+1, ...z+1, x+2, y+2, ...z+2, ...] All elements in the masks need not form a contiguous space, there may be gaps. As before, undefs are allowed and filled in with adjacent element loads. Reviewers: HaoLiu, mssimpso Subscribers: mkuper, delena, llvm-commits Differential Revision: https://reviews.llvm.org/D23646 llvm-svn: 289573	2016-12-13 19:32:36 +00:00
Matthias Braun	fde00fc252	Revert "AArch64CollectLOH: Rewrite as block-local analysis." This is not always behaving as expected as it turns out block live-in lists are only correct most of the time. Still waiting for reviews on https://reviews.llvm.org/D27559 to have them correct all of the time. See also http://llvm.org/PR31361, rdar://25117107 This reverts commit r288567. This reverts commit r288561. llvm-svn: 289570	2016-12-13 19:08:17 +00:00
Tim Northover	fe7c59adb8	GlobalISel: fix GOT accesses on AArch64. We were using the correct pseudo-instruction, but because the operand's flags weren't set correctly we still ended up emitting incorrect relocations during MC lowering. llvm-svn: 289566	2016-12-13 18:25:38 +00:00
Diana Picus	2d9adbf524	[GlobalISel] Move extendRegister where it belongs. NFCI Apparently I missed this one when I moved ValueHandler back in r288658. Sorry! llvm-svn: 289528	2016-12-13 10:46:12 +00:00
Tim Northover	05cc4859ad	GlobalISel: simplify MachineIRBuilder interface. MachineIRBuilder had weird before/after and beginning/end flags for the insert point. Unfortunately the non-default means that instructions will be inserted in reverse order which is almost never what anyone wants. Really, I think we just want (like IRBuilder has) the ability to insert at any C++ iterator-style point (i.e. before any instruction or before MBB.end()). So this fixes MIRBuilders to behave like IRBuilders in this respect. llvm-svn: 288980	2016-12-07 21:05:38 +00:00
Haicheng Wu	f8b834049a	[AArch64] Correct the check of signed 9-bit imm in isLegalAddressingMode() In the addressing mode, signed 9-bit imm is [-256, 255], not [-512, 511]. Differential Revision: https://reviews.llvm.org/D27480 llvm-svn: 288876	2016-12-07 01:45:04 +00:00
Tim Northover	c1a23854f3	GlobalISel: handle G_SEQUENCE fallbacks gracefully. There were two problems: + AArch64 was reusing random data from its binary op tables, which is complete nonsense for G_SEQUENCE. + Even when AArch64 gave up and said it couldn't handle G_SEQUENCE, the generic code asserted. llvm-svn: 288836	2016-12-06 18:38:38 +00:00
Daniel Sanders	4fd1e7c628	[globalisel][aarch64] Fix unintended assumptions about PartialMappingIdx. NFC. Summary: This is NFC but prevents assertions when PartialMappingIdx is tablegen-erated. The assumptions were: 1) FirstGPR is 0 2) FirstGPR is the first of the First* enumerators. GPR32 is changed to 1 to demonstrate that assumption #1 is fixed. #2 will be covered by a subsequent patch that tablegen-erates information and swaps the order of GPR and FPR as a side effect. Depends on D27336 Reviewers: ab, t.p.northover, qcolombet Subscribers: aemerson, rengolin, vkalintiris, dberris, rovka, llvm-commits Differential Revision: https://reviews.llvm.org/D27337 llvm-svn: 288812	2016-12-06 14:39:57 +00:00
Daniel Sanders	21765cb15e	[globalisel][aarch64] Replace magic numbers with corresponding enumerators in ValMappings. NFC Reviewers: ab, t.p.northover, qcolombet Subscribers: aemerson, rengolin, vkalintiris, dberris, llvm-commits, rovka Differential Revision: https://reviews.llvm.org/D27336 llvm-svn: 288810	2016-12-06 13:55:01 +00:00
Daniel Sanders	605f8cd30d	[globalisel][aarch64] Correct argument names in comments. llvm-svn: 288809	2016-12-06 13:48:58 +00:00
Daniel Sanders	bfd5ff155a	[globalisel][aarch64] Prefix PartialMappingIdx enumerators with 'PMI_' to fit coding standards. This also stops things like 'None' polluting the llvm::AArch64 namespace. llvm-svn: 288799	2016-12-06 11:33:04 +00:00
Tim Northover	9267ac5d47	GlobalISel: make G_CONSTANT take a ConstantInt rather than int64_t. This makes it more similar to the floating-point constant, and also allows for larger constants to be translated later. There's no real functional change in this patch though, just syntax updates. llvm-svn: 288712	2016-12-05 21:47:07 +00:00
Tim Northover	d1fd383b28	GlobalISel: handle 1-element aggregates during ABI lowering. llvm-svn: 288706	2016-12-05 21:25:33 +00:00
Quentin Colombet	0e6cccfb53	[AArch64][RegisterBankInfo] Fix typo in the logic used in assert. Thanks to David Binderman <dcb314@hotmail.com> for bringing it to my attention. llvm-svn: 288688	2016-12-05 19:02:37 +00:00
Diana Picus	f11f042ecb	[GlobalISel] Extract handleAssignments out of AArch64CallLowering This function seems target-independent so far: all the target-specific behaviour is isolated in the CCAssignFn and the ValueHandler (which we're also extracting into the generic CallLowering). The intention is to use this in the ARM backend. Differential Revision: https://reviews.llvm.org/D27045 llvm-svn: 288658	2016-12-05 10:40:33 +00:00
Matthias Braun	1fbb0f6dd9	AArch64CollectLOH: Rewrite as block-local analysis. Previously this pass was using up to 5% compile time in some cases which is a bit much for what it is doing. The pass featured a full blown data-flow analysis which in the default configuration was restricted to a single block. This rewrites the pass under the assumption that we only ever work on a single block. This is done in a single pass maintaining a state machine per general purpose register to catch LOH patterns. Differential Revision: https://reviews.llvm.org/D27329 llvm-svn: 288561	2016-12-03 00:52:56 +00:00
Peter Collingbourne	ab85225be4	IR: Change the gep_type_iterator API to avoid always exposing the "current" type. Instead, expose whether the current type is an array or a struct, if an array what the upper bound is, and if a struct the struct type itself. This is in preparation for a later change which will make PointerType derive from Type rather than SequentialType. Differential Revision: https://reviews.llvm.org/D26594 llvm-svn: 288458	2016-12-02 02:24:42 +00:00
Geoff Berry	7ffce7be0c	[AArch64] Fold more spilled/refilled COPYs. Summary: Make AArch64InstrInfo::foldMemoryOperandImpl more general by folding all full COPYs between register classes of the same size that are either spilled or refilled. Reviewers: MatzeB, qcolombet Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D27271 llvm-svn: 288439	2016-12-01 23:43:55 +00:00
Tim Northover	5bb87b6769	AArch64: fix 128-bit cmpxchg at -O0 (again, again). This time the issue is fortunately just a simple mistake rather than a horrible design spectre. I thought SUBS/SBCS provided sufficient NZCV flags for comparing two 64-bit values, but they don't. The fix is slightly clunkier in AArch64 because we can't use conditional execution to emit a pair of CMPs. Traditionally an "icmp ne i128" would map to an EOR/EOR/ORR/CBNZ, but that uses more registers so it's easier to go with a CSET/CINC/CBNZ combination. Slightly less efficient, but this is -O0 anyway. Thanks to Anton Korobeynikov for pointing out the issue. llvm-svn: 288418	2016-12-01 21:31:59 +00:00
Matthias Braun	f23ef437cc	Move FrameInstructions from MachineModuleInfo to MachineFunction This is per function data so it is better kept at the function instead of the module. This is a necessary step to have machine module passes work properly. Differential Revision: https://reviews.llvm.org/D27185 llvm-svn: 288291	2016-11-30 23:48:42 +00:00
Joel Jones	75818bc8f7	[AArch64] Refactor LSE support as feature separate from V8.1a support. Summary: This is preparation for ThunderX processors that have Large System Extension (LSE) atomic instructions, but not the other instructions introduced by V8.1a. This will mimic changes to GCC as described here: https://gcc.gnu.org/ml/gcc-patches/2015-06/msg00388.html LSE instructions are: LD/ST<op>, CAS*, SWP Reviewers: t.p.northover, echristo, jmolloy, rengolin Subscribers: aemerson, mehdi_amini Differential Revision: https://reviews.llvm.org/D26621 llvm-svn: 288279	2016-11-30 22:25:24 +00:00
Matthias Braun	c52fe2961c	Clarify rules for reserved regs, fix aarch64 ones. No test case necessary as the problematic condition is checked with the newly introduced assertAllSuperRegsMarked() function. Differential Revision: https://reviews.llvm.org/D26648 llvm-svn: 288277	2016-11-30 22:17:10 +00:00
Silviu Baranga	aab65b155e	[AArch64] Fix useful bits detection for BFM instructions Summary: When computing useful bits for a BFM instruction, we need to take into consideration the case where both operands of the BFM are equal and provide data that we need to track. Not doing this can cause us to miss useful bits. Fixes PR31138 (https://llvm.org/bugs/show_bug.cgi?id=31138) Reviewers: t.p.northover, jmolloy Subscribers: evandro, gberry, srhines, pirama, mcrosier, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D27130 llvm-svn: 288253	2016-11-30 17:04:22 +00:00
Sanjay Patel	47f7f30df9	[AArch64] allow and-not-compare transform to form 'bics' This target hook was added with D19087: https://reviews.llvm.org/D19087 Differential Revision: https://reviews.llvm.org/D27221 llvm-svn: 288206	2016-11-29 22:28:58 +00:00
Chad Rosier	d34c26eb08	[AArch64] Add a basic SchedMachineModel for Falkor. Differential Revision: https://reviews.llvm.org/D26972 llvm-svn: 288194	2016-11-29 20:00:27 +00:00
Geoff Berry	7c078fc035	[AArch64] Fold spills of COPY of WZR/XZR Summary: In AArch64InstrInfo::foldMemoryOperandImpl, catch more cases where the COPY being spilled is copying from WZR/XZR, but the source register is not in the COPY destination register's regclass. For example, when spilling: %vreg0 = COPY %XZR ; %vreg0:GPR64common without this change, the code in TargetInstrInfo::foldMemoryOperand() and canFoldCopy() that normally handles cases like this would fail to optimize since %XZR is not in GPR64common. So the spill code generated would be: %vreg0 = COPY %XZR STR %vreg instead of the new code generated: STR %XZR Reviewers: qcolombet, MatzeB Subscribers: mcrosier, aemerson, t.p.northover, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D26976 llvm-svn: 288176	2016-11-29 18:28:32 +00:00
Matthias Braun	115efcd3d1	MachineScheduler: Export function to construct "default" scheduler. This makes the createGenericSchedLive() function that constructs the default scheduler available for the public API. This should help when you want to get a scheduler and the default list of DAG mutations. This also shrinks the list of default DAG mutations: {Load|Store}ClusterDAGMutation and MacroFusionDAGMutation are no longer added by default. Targets can easily add them if they need them. It also makes it easier for targets to add alternative/custom macrofusion or clustering mutations while staying with the default createGenericSchedLive(). It also saves the callback back and forth in TargetInstrInfo::enableClusterLoads()/enableClusterStores(). Differential Revision: https://reviews.llvm.org/D26986 llvm-svn: 288057	2016-11-28 20:11:54 +00:00
Kuba Mracek	06995e866b	[xray] Add XRay support for Mach-O in CodeGen Currently, XRay only supports emitting the XRay table (xray_instr_map) on ELF binaries. Let's add Mach-O support. Differential Revision: https://reviews.llvm.org/D26983 llvm-svn: 287734	2016-11-23 02:07:04 +00:00
Tim Northover	b64fb453ea	CodeGen: simplify TargetMachine::getSymbol interface. NFC. No-one actually had a mangler handy when calling this function, and getSymbol itself went most of the way towards getting its own mangler (with a local TLOF variable) so forcing all callers to supply one was just extra complication. llvm-svn: 287645	2016-11-22 16:17:20 +00:00
Chad Rosier	ecc77273a0	[AArch64] Set the max interleave factor for Falkor. llvm-svn: 287642	2016-11-22 14:25:02 +00:00
Chad Rosier	2abc29c593	[AArch64] Maximize 80-column. NFC. llvm-svn: 287640	2016-11-22 14:12:09 +00:00
Geoff Berry	e0bf52f394	[AArch64LoadStoreOptimizer] Don't treat write to XZR/WZR as a clobber. Summary: When searching for load/store instructions to pair/merge don't treat writes to WZR/XZR as clobbers since they don't change the value read from WZR/XZR (which is always 0). Reviewers: mcrosier, junbuml, jmolloy, t.p.northover Subscribers: aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D26921 llvm-svn: 287592	2016-11-21 22:51:10 +00:00
Dean Michael Berris	31761f300d	[XRay][AArch64] Implemented a test for the compile-time sleds emitted, and fixed a bug in the jump instruction This patch adds a test for the assembly code emitted with XRay instrumentation. It also fixes a bug where the operand of a jump instruction must be not the number of bytes to jump over, but rather the number of 4-byte instructions. Author: rSerge Reviewers: dberris, rengolin Differential Revision: https://reviews.llvm.org/D26805 llvm-svn: 287516	2016-11-21 03:01:43 +00:00
Benjamin Kramer	ffd3715d16	Give some helper classes/functions internal linkage. NFC. llvm-svn: 287462	2016-11-19 20:44:26 +00:00
Daniel Sanders	72db2a390a	Check that emitted instructions meet their predicates on all targets except ARM, Mips, and X86. Summary: * ARM is omitted from this patch because this check appears to expose bugs in this target. * Mips is omitted from this patch because this check either detects bugs or deliberate emission of instructions that don't satisfy their predicates. One deliberate use is the SYNC instruction where the version with an operand is correctly defined as requiring MIPS32 while the version without an operand is defined as an alias of 'SYNC 0' and requires MIPS2. * X86 is omitted from this patch because it doesn't use the tablegen-erated MCCodeEmitter infrastructure. Patches for ARM and Mips will follow. Depends on D25617 Reviewers: tstellarAMD, jmolloy Subscribers: wdng, jmolloy, aemerson, rengolin, arsenm, jyknight, nemanjai, nhaehnle, tstellarAMD, llvm-commits Differential Revision: https://reviews.llvm.org/D25618 llvm-svn: 287439	2016-11-19 13:05:44 +00:00
Dean Michael Berris	3234d3a4bd	[XRay] Support AArch64 in LLVM This patch adds XRay support in LLVM for AArch64 targets. This patch is one of a series: Clang: https://reviews.llvm.org/D26415 compiler-rt: https://reviews.llvm.org/D26413 Author: rSerge Reviewers: rengolin, dberris Subscribers: amehsan, aemerson, llvm-commits, iid_iunknown Differential Revision: https://reviews.llvm.org/D26412 llvm-svn: 287209	2016-11-17 05:15:37 +00:00
Chris Bieneman	05c279fc4b	[CMake] NFC. Updating CMake dependency specifications This patch updates a bunch of places where add_dependencies was being explicitly called to add dependencies on intrinsics_gen to instead use the DEPENDS named parameter. This cleanup is needed for a patch I'm working on to add a dependency debugging mode to the build system. llvm-svn: 287206	2016-11-17 04:36:50 +00:00
Geoff Berry	8301c645c8	[AArch64] Handle vector types in replaceZeroVectorStore. Summary: Extend replaceZeroVectorStore to handle more vector type stores, floating point zero vectors and set alignment more accurately on split stores. This is a follow-up change to r286875. This change fixes PR31038. Reviewers: MatzeB Subscribers: mcrosier, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D26682 llvm-svn: 287142	2016-11-16 19:35:19 +00:00
Matthias Braun	3d51cf0a2c	AArch64: Use DeadRegisterDefinitionsPass before regalloc. Doing this before register allocation reduces register pressure as we do not even have to allocate a register for those dead definitions. Differential Revision: https://reviews.llvm.org/D26111 llvm-svn: 287076	2016-11-16 03:38:27 +00:00
Chad Rosier	201fc1ed26	[AArch64] Add support for Qualcomm's Falkor CPU. Differential Revision: https://reviews.llvm.org/D26673 llvm-svn: 287036	2016-11-15 21:34:12 +00:00
Haicheng Wu	faee2b71a7	[AArch64] Lower multiplication by a constant int to shl+add+shl Lower a = b * C where C = (2^n + 1) * 2^m to add w0, w0, w0, lsl n lsl w0, w0, m Differential Revision: https://reviews.llvm.org/D229245 llvm-svn: 287019	2016-11-15 20:16:48 +00:00
Evandro Menezes	9fc54826e0	[AArch64] Compute the Newton series for reciprocals natively Implement the Newton series for square root, its reciprocal and reciprocal natively using the specialized instructions in AArch64 to perform each series iteration. Differential revision: https://reviews.llvm.org/D26518 llvm-svn: 286907	2016-11-14 23:29:01 +00:00
Geoff Berry	e8de67abad	[AArch64] Change some pointers to references. NFC. Follow-up change to r286875. llvm-svn: 286879	2016-11-14 19:59:11 +00:00
Geoff Berry	526c50588d	[AArch64] Split 0 vector stores into scalar store pairs. Summary: Replace a splat of zeros to a vector store by scalar stores of WZR/XZR. The load store optimizer pass will merge them to store pair stores. This should be better than a movi to create the vector zero followed by a vector store if the zero constant is not re-used, since one instruction and one register live range will be removed. For example, the final generated code should be: stp xzr, xzr, [x0] instead of: movi v0.2d, #0 str q0, [x0] Reviewers: t.p.northover, mcrosier, MatzeB, jmolloy Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D26561 llvm-svn: 286875	2016-11-14 19:39:04 +00:00
Geoff Berry	def4bfa9d9	[AArch64] Factor out transform code from split16BStore. NFC. llvm-svn: 286874	2016-11-14 19:39:00 +00:00
Diana Picus	bda7276120	GlobalISel: Fix indentation. NFC llvm-svn: 286808	2016-11-14 10:25:43 +00:00
Chad Rosier	8ade03463e	[AArch64] Update a FIXME comment to reflect current state. NFC. llvm-svn: 286625	2016-11-11 19:52:45 +00:00
Geoff Berry	25fa4999ff	[AArch64] Fix bugs in isel lowering replaceSplatVectorStore. Summary: Fix off-by-one indexing error in loop checking that inserted value was a splat vector. Add code to check that INSERT_VECTOR_ELT nodes constructing the splat vector have the expected constant index values. Reviewers: t.p.northover, jmolloy, mcrosier Subscribers: aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D26409 llvm-svn: 286616	2016-11-11 19:25:20 +00:00
Chad Rosier	d6e85ce3c3	[AArch64] Remove lots of redundant code. NFC. llvm-svn: 286606	2016-11-11 17:49:34 +00:00
Chad Rosier	31ee813068	[AArch64] Early return and minor renaming/refactoring to ease code review. NFC. llvm-svn: 286601	2016-11-11 17:07:37 +00:00
Chad Rosier	10c7aaaee9	[AArch64] Enable merging of adjacent zero stores for all subtargets. This optimization merges adjacent zero stores into a wider store. e.g., strh wzr, [x0] strh wzr, [x0, #2] ; becomes str wzr, [x0] e.g., str wzr, [x0] str wzr, [x0, #4] ; becomes str xzr, [x0] Previously, this was only enabled for Kryo and Cortex-A57. Differential Revision: https://reviews.llvm.org/D26396 llvm-svn: 286592	2016-11-11 14:10:12 +00:00
Evandro Menezes	21f9ce1a0d	[DAG Combiner] Fix the native computation of the Newton series for reciprocals The generic infrastructure to compute the Newton series for reciprocal and reciprocal square root was conceived to allow a target to compute the series itself. However, the original code did not properly consider this condition if returned by a target. This patch addresses the issues to allow a target to compute the series on its own. Differential revision: https://reviews.llvm.org/D22975 llvm-svn: 286523	2016-11-10 23:31:06 +00:00
Tim Northover	a9105be437	GlobalISel: translate invoke and landingpad instructions Pretty bare-bones support for exception handling (no weird MSVC stuff, no SjLj etc), but it should get things going. llvm-svn: 286407	2016-11-09 22:39:54 +00:00
Matthias Braun	c53cbbb1d1	AArch64DeadRegisterDefinitionsPass: Fix Changed flag Fix a bug in the calculation of the changed flag introduced in r285488. llvm-svn: 286293	2016-11-08 20:59:03 +00:00
Nirav Dave	e833c6c61a	[MC][AArch64] Cleanup end-of-line parsing in AArch64 AsmParser. Reviewers: t.p.northover, rengolin Subscribers: llvm-commits, aemerson Differential Revision: https://reviews.llvm.org/D26309 llvm-svn: 286265	2016-11-08 18:31:04 +00:00
Tim Northover	5f7dea85c2	GlobalISel: support selecting fpext/fptrunc instructions on AArch64. llvm-svn: 286253	2016-11-08 17:44:07 +00:00
Roger Ferrer Ibanez	80c0f33c29	[AArch64] Fix incorrect CSEL node created Under -enable-unsafe-fp-math, SELECT_CC lowering in AArch64 transforms floating point comparisons of the form "a == 0.0 ? 0.0 : x" to "a == 0.0 ? a : x". But it incorrectly assumes that 'x' and 'a' have the same type which can lead to a wrong CSEL node that crashes later due to nonsensical copies. Differential Revision: https://reviews.llvm.org/D26394 llvm-svn: 286231	2016-11-08 13:34:41 +00:00
Tim Northover	9ac0eba672	GlobalISel: support selecting G_SELECT on AArch64. llvm-svn: 286185	2016-11-08 00:45:29 +00:00
Tim Northover	7d88da6a46	GlobalISel: constrain PHI registers on AArch64. Self-referencing PHI nodes need their destination operands to be constrained because nothing else is likely to do so. For now we just pick a register class naively. Patch mostly by Ahmed again. llvm-svn: 286183	2016-11-08 00:34:06 +00:00
Sanjin Sijaric	6f020d91a1	[AArch64] Transfer memory operands when lowering vector load/store intrinsics Summary: Some vector loads and stores generated from AArch64 intrinsics alias each other unnecessarily, preventing better scheduling. We just need to transfer memory operands during lowering. Reviewers: mcrosier, t.p.northover, jmolloy Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D26313 llvm-svn: 286168	2016-11-07 22:39:02 +00:00
Davide Italiano	5df6066ec1	[AArch64] Remove dead store. Found by gcc7. llvm-svn: 286137	2016-11-07 19:11:25 +00:00
Amara Emerson	614b44bbe9	This patch adds support for 16 bit floating point registers to the inline asm register selection on AArch64. Without this patch, register allocation for the example below fails. define half @test(half %a1, half %a2) #0 { entry: %0 = tail call half asm "sqrshl ${0:h}, ${1:h}, ${2:h}", "=w,w,w" (half %a1, half %a2) #1 ret half %0 } Patch by Florian Hahn. Differential Revision: https://reviews.llvm.org/D25080 llvm-svn: 286111	2016-11-07 15:42:12 +00:00
Chad Rosier	d6daac4746	[AArch64] Removed the narrow load merging code in the ld/st optimizer. This feature has been disabled for some time now, so remove cruft. Differential Revision: https://reviews.llvm.org/D26248 llvm-svn: 286110	2016-11-07 15:27:22 +00:00
Peter Collingbourne	4e76019e34	Support: Remove MemoryObject and DataStreamer interfaces. These interfaces are no longer used. Differential Revision: https://reviews.llvm.org/D26222 llvm-svn: 285774	2016-11-02 00:08:37 +00:00
Alex Bradbury	58eba09949	[TableGen] Move OperandMatchResultTy enum to MCTargetAsmParser.h As it stands, the OperandMatchResultTy is only included in the generated header if there is custom operand parsing. However, almost all backends make use of MatchOperand_Success and friends from OperandMatchResultTy for e.g. parseRegister. This is a pain when starting an AsmParser for a new backend that doesn't yet have custom operand parsing. Move the enum to MCTargetAsmParser.h. This patch is a prerequisite for D23563 Differential Revision: https://reviews.llvm.org/D23496 llvm-svn: 285705	2016-11-01 16:32:05 +00:00
Tim Northover	037af52c8b	GlobalISel: allow truncating pointer casts on AArch64. llvm-svn: 285615	2016-10-31 18:31:09 +00:00
Tim Northover	cdf23f1d93	GlobalISel: translate stack protector intrinsics llvm-svn: 285614	2016-10-31 18:30:59 +00:00
Matthias Braun	7d78614ae9	AArch64DeadRegisterDefinitionsPass: Cleanup; NFC - Fix doxygen file comment - reduce indentation in loop - Factor out some common subexpressions - Move independent helper function out of class - Fix Changed flag (this is not strictly NFC but a bugfix, but the flag seems ignored anyway) llvm-svn: 285488	2016-10-29 01:03:41 +00:00
Evandro Menezes	ca8370396a	[AArch64] Create feature set for Samsung Exynos-M2 Since Exynos-M2 improved the FP square root unit a bit over the one in Exynos-M1, it does not benefit from using the Newton series for such operations. llvm-svn: 285246	2016-10-26 22:06:20 +00:00
Chad Rosier	0c621fda0d	[AArch64] Avoid materializing constant 1 when generating cneg instructions. Instead of cmp w0, #1 orr w8, wzr, #0x1 cneg w0, w8, ne we now generate cmp w0, #1 csinv w0, w0, wzr, eq PR28965 llvm-svn: 285217	2016-10-26 18:15:32 +00:00
Evandro Menezes	7696dc0685	[AArch64] Adjust the cost model for Exynos M1. Modify the maximum jump table size. llvm-svn: 285106	2016-10-25 20:05:42 +00:00
Evandro Menezes	eff2bd9d4f	[AArch64] Optionally use the Newton series for reciprocal estimation Add support for estimating the square root or its reciprocal and division or reciprocal using the combiner generic Newton series. Differential revision: https://reviews.llvm.org/D25291 llvm-svn: 284986	2016-10-24 16:14:58 +00:00
Joel Jones	504bf334b0	AArch64 ILP32 relocations for assembly and ELF Summary: Add relocations for AArch64 ILP32. Includes: - Addition of definitions for R_AARCH32_* - Definition of new -target-abi: ilp32 - Definition of data layout string - Tests for added relocations. Not comprehensive, but matches existing tests for 64-bit. Renames "CHECK-OBJ" to "CHECK-OBJ-LP64". - Tests for llvm-readobj Reviewers: zatrazz, peter.smith, echristo, t.p.northover Subscribers: aemerson, rengolin, mehdi_amini Differential Revision: https://reviews.llvm.org/D25159 llvm-svn: 284973	2016-10-24 13:37:13 +00:00
Abderrazek Zaafrani	9daf8110c8	Set the vectorizer MaxInterleaveFactor for Exynos. llvm-svn: 284839	2016-10-21 16:28:27 +00:00
Abderrazek Zaafrani	9f382f53d1	Test commit llvm-svn: 284832	2016-10-21 15:24:08 +00:00
Bjorn Pettersson	9fcd605d1e	[AArch64] Corrected spill size for DDD register class. NFCI Summary: The spill size was incorrectly set to 196 bits, which isn't a multiple of 8. This problem was detected when experimenting with asserts that the spill size should be a multiple of the byte size. New corrected value for the spill size is set to 192 bits. Note that tablegen (RegisterInfoEmitter) will divide the size set in the RegisterClass definition by 8. So this change should not have any impact on the tablegen output (trunc(192/8) == trunc(196/8) == 24 bytes). Reviewers: t.p.northover Subscribers: llvm-commits, aemerson, rengolin Differential Revision: https://reviews.llvm.org/D25818 llvm-svn: 284814	2016-10-21 09:53:42 +00:00
Benjamin Kramer	2a8bef8769	Do a sweep over move ctors and remove those that are identical to the default. All of these existed because MSVC 2013 was unable to synthesize default move ctors. We recently dropped support for it so all that error-prone boilerplate can go. No functionality change intended. llvm-svn: 284721	2016-10-20 12:20:28 +00:00
Evandro Menezes	ce8d60156c	[AArch64] Avoid materializing 0.0 when generating FP SELECT Transform `a == 0.0 ? 0.0 : x` to `a == 0.0 ? a : x` and `a != 0.0 ? x : 0.0` to `a != 0.0 ? x : a` to avoid materializing 0.0 for FCSEL, since it does not have to be materialized beforehand for FCMP, as it has a form that has 0.0 as an implicit operand. Differential Revision: https://reviews.llvm.org/D24808 llvm-svn: 284531	2016-10-18 20:37:35 +00:00
Tim Northover	55782222c0	GlobalISel: select small binary operations on AArch64. AArch64 actually supports many 8-bit operations under the definition used by GlobalISel: the designated information-carrying bits of a GPR32 get the right value if you just use the normal 32-bit instruction. llvm-svn: 284526	2016-10-18 20:03:48 +00:00
Tim Northover	4494d69862	GlobalISel: support floating-point constants on AArch64. Patch from Ahmed Bougacha. llvm-svn: 284523	2016-10-18 19:47:57 +00:00
Tim Northover	020d104496	GlobalISel: support wider range of load/store sizes in AArch64. llvm-svn: 284406	2016-10-17 18:36:53 +00:00
Tim Northover	69fa84a6e9	GlobalISel: rename legalizer components to match others. The previous names were both misleading (the MachineLegalizer actually contained the info tables) and inconsistent with the selector & translator (in having a "Machine") prefix. This should make everything sensible again. The only functional change is the name of a couple of command-line options. llvm-svn: 284287	2016-10-14 22:18:18 +00:00
Quentin Colombet	b3f5a8c644	[AArch64][RegisterBankInfo] Switch to fully static opds mapping for G_BITCAST. NFC. llvm-svn: 284146	2016-10-13 18:46:38 +00:00
Quentin Colombet	6b87a3109c	[AArch64][RegisterBankInfo] Provide alternative mappings for 64-bit load This allows RegBankSelect in greedy mode to get rid of some of the cross register bank copies when loads are involved in the chain of computation. llvm-svn: 284097	2016-10-13 01:01:23 +00:00
Quentin Colombet	cd80e97e88	[AArch64][RegisterBankInfo] Provide alternative mappings for G_BITCASTs. Thanks to this patch, RegBankSelect is able to get rid of some register bank copies as demonstrated in the test case. llvm-svn: 284094	2016-10-13 00:34:48 +00:00
Quentin Colombet	45c9c1432f	[AArch64][RegisterBankInfo] Describe cross regbank copies statically. NFC. llvm-svn: 284091	2016-10-13 00:12:06 +00:00
Quentin Colombet	9e64919b7c	[AArch64][RegisterBankInfo] Use static mapping for same bank G_BITCAST. NFC. llvm-svn: 284090	2016-10-13 00:12:04 +00:00
Quentin Colombet	db643d9091	[AArch64][MachineLegalizer] Mark more G_BITCAST as legal. Basically any vector type that fits in a 32-bit register is also valid as far as copies are concerned. llvm-svn: 284089	2016-10-13 00:12:01 +00:00
Quentin Colombet	f760799c40	[AArch64][RegisterBankInfo] Bump the cost of vector loads. This does not change anything yet, because we do not offer any alternative mapping. llvm-svn: 284088	2016-10-13 00:11:59 +00:00
Quentin Colombet	f35a8c5bdc	[AArch64][RegisterBankInfo] Use a proper cost for cross regbank G_BITCASTs. This does not change anything yet, because we do not offer any alternative mapping. llvm-svn: 284087	2016-10-13 00:11:57 +00:00
Quentin Colombet	27b40356f7	[AArch64][RegisterBankInfo] Provide more realistic copy costs. llvm-svn: 284086	2016-10-13 00:11:55 +00:00
Tim Northover	fb8d989818	GlobalISel: support G_TRUNC selection on AArch64. Ahmed's patch again. llvm-svn: 284075	2016-10-12 22:49:15 +00:00
Tim Northover	69271c64d5	GlobalISel: support int <-> float conversions on AArch64. More of Ahmed's work. llvm-svn: 284074	2016-10-12 22:49:11 +00:00
Tim Northover	7dd378dd08	GlobalISel: select G_FCMP instructions on AArch64. Another of Ahmed's patches. llvm-svn: 284073	2016-10-12 22:49:07 +00:00
Tim Northover	6c02ad5e4f	GlobalISel: support selection of G_ICMP on AArch64. Patch from Ahmed Bougacha again. llvm-svn: 284072	2016-10-12 22:49:04 +00:00
Tim Northover	5e3dbf326c	GlobalISel: select G_BRCOND instructions on AArch64. llvm-svn: 284071	2016-10-12 22:49:01 +00:00
Tim Northover	6aacd27cd7	GlobalISel: mark G_BRCOND on s1 as legal. It's going to be a TBNZ (at -O0) anyway, so the high bits don't matter. llvm-svn: 284070	2016-10-12 22:48:36 +00:00
Quentin Colombet	9de30faeac	[AArch64][InstructionSelector] Teach the selector about G_BITCAST. llvm-svn: 283973	2016-10-12 03:57:52 +00:00
Quentin Colombet	cb629a897c	[AArch64][InstructionSelector] Refactor the handling of copies. Although Copies are not specific to preISel, we still have to assign them a proper register class. However, given they are not constrained to anything we do not have to handle the source register at the copy. It will be properly mapped when reaching the related definition. In the process, the handling of G_ANYEXT is slightly modified as those end up being selected as copy. The difference is that when register sizes do not match on both sides, we need to insert SUBREG_TO_REG operation, otherwise the post RA copy expansion will not be happy! llvm-svn: 283972	2016-10-12 03:57:49 +00:00
Quentin Colombet	404e4350dc	[AArch64][MachineLegalizer] Mark more bitcasts as legal. Those are copies, we do not have to do any legalization action for them. llvm-svn: 283970	2016-10-12 03:57:43 +00:00
Tim Northover	c1d8c2bf8c	GlobalISel: support same-size casts on AArch64. Mostly Ahmed's work again, I'm just sprucing things up slightly before committing. llvm-svn: 283952	2016-10-11 22:29:23 +00:00
Tim Northover	3d38b3a4d1	GlobalISel: support selection of extend operations. Patch mostly by Ahmed Bougacha. llvm-svn: 283937	2016-10-11 20:50:21 +00:00
Diana Picus	c93518db8c	[AArch64] Allow label arithmetic with add/sub/cmp Allow instructions such as 'cmp w0, #(end - start)' by folding the expression into a constant. For ELF, we fold only if the symbols are in the same section. For MachO, we fold if the expression contains only symbols that are not linker visible. Fixes https://llvm.org/bugs/show_bug.cgi?id=18920 Differential Revision: https://reviews.llvm.org/D23834 llvm-svn: 283862	2016-10-11 09:17:47 +00:00
Quentin Colombet	d2623f8e38	[AArch64][InstructionSelector] Teach how to select FP load/store. This patch allows selecting 32 and 64-bit FP load and store. llvm-svn: 283832	2016-10-11 00:21:14 +00:00
Quentin Colombet	0e5312787e	[AArch64][InstructionSelector] Teach the selector how to handle vector OR. This only adds the support for 64-bit vector OR. Adding more sizes is not difficult, but it requires a bigger refactoring because ORs work on any size, not necessarily the ones that match the register width. Right now, this is not expressed in the legalization, so don't bother pushing the refactoring yet. llvm-svn: 283831	2016-10-11 00:21:11 +00:00
Quentin Colombet	d3126d5fb4	[AArch64][MachineLegalizer] Mark v2s32 G_LOAD as legal. Actually all 64-bit loads are legal, but right now the API does not offer a simple way to express that. llvm-svn: 283829	2016-10-11 00:21:08 +00:00
Peter Collingbourne	0da86301ad	Revert r283690, "MC: Remove unused entities." llvm-svn: 283814	2016-10-10 22:49:37 +00:00
Tim Northover	bdf1624367	GlobalISel: select G_GLOBAL_VALUE uses on AArch64. llvm-svn: 283809	2016-10-10 21:50:00 +00:00
Tim Northover	ad0acca544	GlobalISel: allow G_GLOBAL_VALUEs in AArch64 legalization. llvm-svn: 283808	2016-10-10 21:49:53 +00:00
Tim Northover	2fda4b08ae	GlobalISel: support selecting G_GEP instructions. They're basically just an alias for G_ADD on AArch64. llvm-svn: 283807	2016-10-10 21:49:49 +00:00
Tim Northover	4edc60d785	GlobalISel: support selecting constants on AArch64. llvm-svn: 283806	2016-10-10 21:49:42 +00:00
Mehdi Amini	f42454b94b	Move the global variables representing each Target behind accessor function This avoids "static initialization order fiasco" Differential Revision: https://reviews.llvm.org/D25412 llvm-svn: 283702	2016-10-09 23:00:34 +00:00
Peter Collingbourne	cc723cccab	MC: Remove unused entities. llvm-svn: 283691	2016-10-09 04:39:13 +00:00
Peter Collingbourne	5c924d7117	Target: Remove unused entities. llvm-svn: 283690	2016-10-09 04:38:57 +00:00
Mehdi Amini	732afdd09a	Turn cl::values() (for enum) from a vararg function to using C++ variadic template The core of the change is supposed to be NFC, however it also fixes what I believe was an undefined behavior when calling: va_start(ValueArgs, Desc); with Desc being a StringRef. Differential Revision: https://reviews.llvm.org/D25342 llvm-svn: 283671	2016-10-08 19:41:06 +00:00
Sebastian Pop	eb65d72d9c	[AArch64] Avoid generating indexed vector instructions for Exynos Avoid generating indexed vector instructions for Exynos. This is needed for fmla/fmls/fmul/fmulx. For example, the instruction fmla v0.4s, v1.4s, v2.s[1] is less efficient than the instructions dup v2.4s, v2.s[1] fmla v0.4s, v1.4s, v2.4s Patch written by Abderrazek Zaafrani. Differential Revision: https://reviews.llvm.org/D21571 llvm-svn: 283663	2016-10-08 12:30:07 +00:00
Mehdi Amini	a0016ec95f	Use StringRef in TargetParser APIs (NFC) llvm-svn: 283527	2016-10-07 08:37:29 +00:00
Matt Arsenault	36919a4f7c	Move AArch64BranchRelaxation to generic code llvm-svn: 283459	2016-10-06 15:38:53 +00:00
Matt Arsenault	0a3ea89e85	AArch64: Move remaining target specific BranchRelaxation bits to TII llvm-svn: 283458	2016-10-06 15:38:09 +00:00
Matthias Braun	46a5238682	AArch64: Macrofusion: Split features, add missing combinations. AArch64InstrInfo::shouldScheduleAdjacent() determines whether two instructions can benefit from macroop fusion on Apple CPUs. The list turned out to be incomplete: - the "rr" variants of the instructions were missing - even the "rs" variants can have shift value == 0 and behave like the "rr" variants This also splits the MacroOpFusion target feature into ArithmeticBccFusion and ArithmeticCbzFusion. Differential Revision: https://reviews.llvm.org/D25142 llvm-svn: 283243	2016-10-04 19:28:21 +00:00
Quentin Colombet	3a06701913	[AArch64][RegisterBankInfo] Add getSameKindofOperandsMapping. Refactor the code so that the same function can be used for all instructions with all the same operands for up to 3 operands. This is going to be useful for cast instructions. NFC. llvm-svn: 283144	2016-10-03 20:20:13 +00:00
Matthias Braun	a827ed8891	AArch64Subtarget: Remove unused CPUString field llvm-svn: 283142	2016-10-03 20:17:02 +00:00
Mehdi Amini	48878ae579	Use StringRef in Datalayout API (NFC) llvm-svn: 283013	2016-10-01 05:57:55 +00:00
Mehdi Amini	117296c0a0	Use StringRef in Pass/PassManager APIs (NFC) llvm-svn: 283004	2016-10-01 02:56:57 +00:00
Eric Christopher	98983d0aff	Remove TargetTriple from AArch64MCInstLower as it's used in few places and can be pulled from the TargetMachine. NFC. llvm-svn: 283000	2016-10-01 01:50:25 +00:00
Quentin Colombet	a6119958ff	[AArch64][RegisterBankInfo] Use the helper functions for the checks This makes sure the helper functions work as expected. NFC. llvm-svn: 282961	2016-09-30 21:46:21 +00:00
Quentin Colombet	7c3fa8e361	[AArch64][RegisterBankInfo] Rename getValueMappingIdx to getValueMapping We don't return index, we return the actual ValueMapping. NFC. llvm-svn: 282960	2016-09-30 21:46:19 +00:00
Quentin Colombet	b4afac7b32	[AArch64][RegisterBankInfo] Compress the ValueMapping table a bit. We don't need to have singleton ValueMapping on their own, we can just reuse one of the elements of the 3-ops mapping. This allows even more code sharing. NFC. llvm-svn: 282959	2016-09-30 21:46:17 +00:00
Quentin Colombet	7fc5fe41c5	[AArch64][RegisterBankInfo] Refactor the code to access AArch64::ValMapping Use a helper function to access ValMapping. This should make the code easier to understand and maintain. NFC. llvm-svn: 282958	2016-09-30 21:46:15 +00:00
Quentin Colombet	15dc25bb3d	[AArch64][RegisterBankInfo] Rename getRegBankIdx to getRegBankIdxOffset The function name did not make it clear that the returned value was an offset to apply to a register bank index. NFC. llvm-svn: 282957	2016-09-30 21:46:12 +00:00
Quentin Colombet	b2308987ab	[AArch64][RegisterBankInfo] Use the static opds mapping for alt mappings Avoid relying on the dynamically allocated operands mapping for the alternative mapping. NFC. llvm-svn: 282956	2016-09-30 21:45:56 +00:00
Quentin Colombet	4b36e0c409	[AArch64][RegisterBankInfo] Use static mapping for 3-operands instrs. This uses a TableGen'ed like structure for all 3-operands instrs. The output of the RegBankSelect pass should be identical but the RegisterBankInfo will do less dynamic allocations. llvm-svn: 282817	2016-09-30 00:10:00 +00:00
Quentin Colombet	fdd303afe2	[AArch64][RegisterBankInfo] Add static value mapping for 3-op instrs. This is the kind of input TableGen should generate at some point. NFC. llvm-svn: 282816	2016-09-30 00:09:58 +00:00
Quentin Colombet	eb8d3da9a0	[AArch64][RegisterBankInfo] Check the statically created ValueMapping. Make sure that the ValueMappings contain the value we expect at the indices we expect. NFC. llvm-svn: 282815	2016-09-30 00:09:43 +00:00
Lei Liu	361615cfd0	AArch64: Set shift bit of TLSLE HI12 add instruction Summary: AArch64 LLVM assembler emits add instruction without shift bit to calculate the higher 12-bit address of TLS variables in local exec model. This generates wrong code sequence to access TLS variables with thread offset larger than 0x1000. Reviewers: t.p.northover, peter.smith, rovka Subscribers: salim.nasser, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D24702 llvm-svn: 282661	2016-09-29 01:05:48 +00:00
Quentin Colombet	40cbc27ff3	[RegisterBankInfo] Uniquely generate OperandsMapping. This is a step toward statically allocating InstructionMapping. Like the previous few commits, the goal is to move toward a TableGen'ed like structure with no dynamic allocation at all. This should already improve compile time by getting rid of a bunch of memmove of SmallVectors. llvm-svn: 282643	2016-09-28 22:20:49 +00:00
Quentin Colombet	c0f11a9fb8	[AArch64][RegisterBankInfo] Switch to statically allocated ValueMapping. Another step toward TableGen'ed like structure for the RegisterBankInfo of AArch64. By doing this, we also save a bit of compile time for the exact same output. llvm-svn: 282550	2016-09-27 22:55:04 +00:00
Quentin Colombet	caae9cd246	[AArch64][RegisterBankInfo] Fix copy/paste in comments. NFC. llvm-svn: 282549	2016-09-27 22:54:57 +00:00
Geoff Berry	b124331db7	[TargetRegisterInfo, AArch64] Add target hook for isConstantPhysReg(). Summary: The current implementation of isConstantPhysReg() checks for defs of physical registers to determine if they are constant. Some architectures (e.g. AArch64 XZR/WZR) have registers that are constant and may be used as destinations to indicate the generated value is discarded, preventing isConstantPhysReg() from returning true. This change adds a TargetRegisterInfo hook that overrides the no defs check for cases such as this. Reviewers: MatzeB, qcolombet, t.p.northover, jmolloy Subscribers: junbuml, aemerson, mcrosier, rengolin Differential Revision: https://reviews.llvm.org/D24570 llvm-svn: 282543	2016-09-27 22:17:27 +00:00
Geoff Berry	256fcf975f	[AArch64] Improve add/sub/cmp isel of uxtw forms. Don't match the UXTW extended reg forms of ADD/ADDS/SUB/SUBS if the 32-bit to 64-bit zero-extend can be done for free by taking advantage of the 32-bit defining instruction zeroing the upper 32-bits of the X register destination. This enables better instruction selection in a few cases, such as: sub x0, xzr, x8 instead of: mov x8, xzr sub x0, x8, w9, uxtw madd x0, x1, x1, x8 instead of: mul x9, x1, x1 add x0, x9, w8, uxtw cmp x2, x8 instead of: sub x8, x2, w8, uxtw cmp x8, #0 add x0, x8, x1, lsl #3 instead of: lsl x9, x1, #3 add x0, x9, w8, uxtw Reviewers: t.p.northover, jmolloy Subscribers: mcrosier, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D24747 llvm-svn: 282413	2016-09-26 15:34:47 +00:00
Evandro Menezes	e45de8a5ec	Add support to optionally limit the size of jump tables. Many high-performance processors have a dedicated branch predictor for indirect branches, commonly used with jump tables. As sophisticated as such branch predictors are, they tend to have well defined limits beyond which their effectiveness is hampered or even nullified. One such limit is the number of possible destinations for a given indirect branch that such branch predictors can handle. This patch considers a limit that a target may set to the number of destination addresses in a jump table. Patch by: Evandro Menezes <e.menezes@samsung.com>, Aditya Kumar <aditya.k7@samsung.com>, Sebastian Pop <s.pop@samsung.com>. Differential revision: https://reviews.llvm.org/D21940 llvm-svn: 282412	2016-09-26 15:32:33 +00:00
Quentin Colombet	fd8c95adf4	[RegisterBankInfo] Uniquely generate ValueMapping. This is a step toward statically allocating ValueMapping. Like the previous few commits, the goal is to move toward a TableGen'ed like structure with no dynamic allocation at all. llvm-svn: 282324	2016-09-24 04:53:52 +00:00
Quentin Colombet	fd0ab5c660	[AArch64][RegisterBankInfo] Sanity check TableGen'ed like inputs. Make sure the entries written to mimic the behavior of TableGen are sane. llvm-svn: 282220	2016-09-23 00:59:07 +00:00
Quentin Colombet	5b16d931dc	[AArch64][RegisterBankInfo] Switch to TableGen'ed like PartialMapping. Statically instantiate the most common PartialMappings. This should be closer to what the code would look like when TableGen support is added for GlobalISel. As a side effect, this should improve compile time. llvm-svn: 282215	2016-09-23 00:14:36 +00:00
Quentin Colombet	0afa7d6b82	[RegisterBankInfo] Use array instead of SmallVector for BreakDown. This is another step toward TableGen'ed like structures. The BreakDown of the mapping of the value will be statically computed by TableGen, thus we only have to point to the right entry in the table instead of dynamically allocating the mapping for each instruction. We still support the dynamic allocation through a factory of PartialMapping to ease the bring-up of the targets while the TableGen backend is not available. llvm-svn: 282213	2016-09-23 00:14:30 +00:00
Tim Northover	a5e38fa00d	GlobalISel: handle stack-based parameters on AArch64. llvm-svn: 282153	2016-09-22 13:49:25 +00:00
Quentin Colombet	6a76323c64	[RegisterBankInfo] Move to statically allocated RegisterBank. This commit is basically the first step toward what RegisterBankInfo will look like when it gets TableGen'ed. It introduces a XXXGenRegisterBankInfo.def file that is what TableGen will issue at some point. Moreover, the RegBanks field in RegisterBankInfo changed to reflect the static (compile time) aspect of the information. llvm-svn: 282131	2016-09-22 02:10:37 +00:00
Tim Northover	9a46718378	GlobalISel: produce correct code for signext/zeroext ABI flags. We still don't really have an equivalent of "AssertXExt" in DAG, so we don't exploit the guarantees on the receiving side yet, but this should produce conservatively correct code on iOS ABIs. llvm-svn: 282069	2016-09-21 12:57:45 +00:00

2986 Commits