llvm-project

Commit Graph

Author	SHA1	Message	Date
Guillaume Chatelet	aff45e4b23	[LLVM][Alignment] Make functions using log of alignment explicit Summary: This patch renames functions that take or return alignment as log2; this will help with the transition to llvm::Align. The renaming makes it explicit that we deal with log(alignment) instead of a power of two alignment. A few renames uncovered dubious assignments: - `MirParser`/`MirPrinter` was expecting powers of two but `MachineFunction` and `MachineBasicBlock` were dealing with log2(align). This patch fixes it and updates the documentation. - `MachineBlockPlacement` exposes two flags (`align-all-blocks` and `align-all-nofallthru-blocks`) supposedly interpreted as power of two alignments; internally these values are interpreted as log2(align). This patch updates the documentation. - `MachineFunction` exposes `align-all-functions`, also interpreted as power of two alignment; internally this value is interpreted as log2(align). This patch updates the documentation. Reviewers: lattner, thegameg, courbet Subscribers: dschuff, arsenm, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, Jim, s.egerton, llvm-commits, courbet Tags: #llvm Differential Revision: https://reviews.llvm.org/D65945 llvm-svn: 371045	2019-09-05 10:00:22 +00:00
Jonas Devlieghere	0eaee545ee	[llvm] Migrate llvm::make_unique to std::make_unique Now that we've moved to C++14, we no longer need the llvm::make_unique implementation from STLExtras.h. This patch is a mechanical replacement of (hopefully) all the llvm::make_unique instances across the monorepo. llvm-svn: 369013	2019-08-15 15:54:37 +00:00
Amara Emerson	e14c91b71a	[GlobalISel] Make the InstructionSelector instance non-const, allowing state to be maintained. Currently we can't keep any state in the selector object that we get from subtarget. As a result we have to plumb through all our variables through multiple functions. This change makes it non-const and adds a virtual init() method to allow further state to be captured for each target. AArch64 makes use of this in this patch to cache a call to hasFnAttribute() which is expensive to call, and is used on each selection of G_BRCOND. Differential Revision: https://reviews.llvm.org/D65984 llvm-svn: 368652	2019-08-13 06:26:59 +00:00
Pablo Barrio	3cdd586be2	[AArch64] Set pref. func. align to 8 bytes on Neoverse E1 & Cortex-A65 Summary: The Arm Neoverse E1 and Cortex-A65 Software Optimization Guide [1][2], Section "4.7 Branch instruction alignment" state: "It is preferable for branch targets, including subroutine entry points, to be placed on aligned 64-bit boundaries to maximize instruction fetch efficiency." This patch sets the preferred function alignment on Neoverse E1 and Cortex-A65 to 2^3=8B. This was already the case in some Cortex-A CPUs such as Cortex-A53. [1] https://developer.arm.com/docs/swog466751/latest/arm-neoversetm-e1-core-software-optimization-guide [2] https://developer.arm.com/docs/swog010045/latest/arm-cortex-a65-core-software-optimization-guide Reviewers: dmgreen, fhahn, samparker Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65937 llvm-svn: 368431	2019-08-09 11:05:15 +00:00
Pablo Barrio	a8426b43f8	[AArch64] Set preferred function alignment to 16 bytes on Neoverse N1 Summary: The Arm Neoverse N1 Software Optimization Guide [1], Section "4.8 Branch instruction alignment" states: "Consider aligning subroutine entry points and branch targets to 32B boundaries, within the bounds of the code-density requirements of the program." This patch sets the preferred function alignment on Neoverse N1 to 2^4=16B. This was already the case in some of the latest Cortex-A CPUs. Benchmarking in previous Cortex-A CPUs suggested that 16B alignment is already better than the default. See commit d04ee305. The reason we don't set it to 32B right now (as the optimisation guide suggests) is that this will impact code size and perhaps the instruction cache performance. Therefore we need benchmark numbers first. I have also added testing for A75 and A76 that we were missing. [1] https://developer.arm.com/docs/swog309707/latest Reviewers: fhahn, greened, samparker, dmgreen Reviewed By: dmgreen Subscribers: dmgreen, javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65654 llvm-svn: 367894	2019-08-05 17:38:58 +00:00
Peter Collingbourne	09f39967a2	AArch64: Add a tagged-globals backend feature. This feature instructs the backend to allow locally defined global variable addresses to contain a pointer tag in bits 56-63 that will be ignored by the hardware (i.e. TBI), but may be used by an instrumentation pass such as HWASAN. It works by adding a MOVK instruction to the regular ADRP/ADD sequence that sets bits 48-63 to the corresponding bits of the global, with the linker bounds check disabled on the ADRP instruction to prevent the tag from causing a link failure. This implementation of the feature omits the MOVK when loading from or storing to a global, which is sufficient for TBI. If the same approach is extended to MTE, assuming that 0 is not configured as a catch-all tag, we will most likely also need the MOVK in this case in order to avoid a tag mismatch. Differential Revision: https://reviews.llvm.org/D65364 llvm-svn: 367475	2019-07-31 20:14:19 +00:00
Peter Collingbourne	33773d5cfc	SelectionDAG, MI, AArch64: Widen target flags fields/arguments from unsigned char to unsigned. This makes the field wider than MachineOperand::SubReg_TargetFlags so that we don't end up silently truncating any higher bits. We should still catch any bits truncated from the MachineOperand field as a consequence of the assertion in MachineOperand::setTargetFlags(). Differential Revision: https://reviews.llvm.org/D65465 llvm-svn: 367474	2019-07-31 20:14:09 +00:00
Pablo Barrio	275954539d	[ARM][AArch64] Support for Cortex-A65 & A65AE, Neoverse E1 & N1 Summary: Add support for Cortex-A65, Cortex-A65AE, Neoverse E1 and Neoverse N1. Neoverse E1 and Cortex-A65(&AE) only implement the AArch64 state of the Arm architecture. Neoverse N1 implements both AArch32 and AArch64. Cortex-A65: https://developer.arm.com/ip-products/processors/cortex-a/cortex-a65 Cortex-A65AE: https://developer.arm.com/ip-products/processors/cortex-a/cortex-a65ae Neoverse E1: https://developer.arm.com/ip-products/processors/neoverse/neoverse-e1 Neoverse N1: https://developer.arm.com/ip-products/processors/neoverse/neoverse-n1 Patch by Diogo Sampaio and Pablo Barrio Reviewers: samparker, LukeCheeseman, sbaranga, ostannard Reviewed By: ostannard Subscribers: ostannard, javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64406 llvm-svn: 367007	2019-07-25 10:59:45 +00:00
Luke Cheeseman	59f77e7891	[AArch64] Add support for Cortex-A76 and Cortex-A76AE - Add LLVM backend support for Cortex-A76 and Cortex-A76AE - Documentation can be found at https://developer.arm.com/products/processors/cortex-a/cortex-a76 llvm-svn: 354788	2019-02-25 15:08:27 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Evandro Menezes	b02ac8bd21	[AArch64] Refactor the scheduling predicates (1/3) (NFC) Refactor the scheduling predicates based on `MCInstPredicate`. In this case, `AArch64InstrInfo::isScaledAddr()` Differential revision: https://reviews.llvm.org/D54777 llvm-svn: 347597	2018-11-26 21:47:28 +00:00
Bryan Chan	123553921f	[AArch64] Support HiSilicon's TSV110 processor Reviewers: t.p.northover, SjoerdMeijer, kristof.beyls Reviewed By: kristof.beyls Subscribers: olista01, javed.absar, kristof.beyls, kristina, llvm-commits Differential Revision: https://reviews.llvm.org/D53908 llvm-svn: 346546	2018-11-09 19:32:08 +00:00
Evandro Menezes	3a06c46470	[AArch64] Sort switch cases (NFC) llvm-svn: 345786	2018-10-31 21:56:49 +00:00
Tri Vo	6c47c62588	[AArch64] Support adding X[8-15,18] registers as CSRs. Summary: Specifying X[8-15,18] registers as callee-saved is used to support CONFIG_ARM64_LSE_ATOMICS in Linux kernel. As part of this patch we: - use custom CSR list/mask when user specifies custom CSRs - update Machine Register Info's list of CSRs with additional custom CSRs in LowerCall and LowerFormalArguments. Reviewers: srhines, nickdesaulniers, efriedma, javed.absar Reviewed By: nickdesaulniers Subscribers: kristof.beyls, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D52216 llvm-svn: 342824	2018-09-22 22:17:50 +00:00
Calixte Denizet	7413a43886	Verify commit access in fixing typo llvm-svn: 342538	2018-09-19 11:26:20 +00:00
Nick Desaulniers	287a3be379	[AArch64] Support reserving x1-7 registers. Summary: Reserving registers x1-7 is used to support CONFIG_ARM64_LSE_ATOMICS in Linux kernel. This change adds support for reserving registers x1 through x7. Reviewers: javed.absar, phosek, srhines, nickdesaulniers, efriedma Reviewed By: nickdesaulniers, efriedma Subscribers: niravd, jfb, manojgupta, nickdesaulniers, jyknight, efriedma, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D48580 llvm-svn: 341706	2018-09-07 20:58:57 +00:00
Martin Storsjo	68df812cce	[MinGW] Move code for indicating "potentially not DSO local" into shouldAssumeDSOLocal. NFC. On Windows, if shouldAssumeDSOLocal returns false, it's either a dllimport reference, or a reference that we should treat as non-local and create a stub for. Clean up AArch64Subtarget::ClassifyGlobalReference a little while touching the flag handling relating to dllimport. Differential Revision: https://reviews.llvm.org/D51590 llvm-svn: 341402	2018-09-04 20:56:28 +00:00
Martin Storsjo	fed420d6b6	[MinGW] [AArch64] Add stubs for potential automatic dllimported variables The runtime pseudo relocations can't handle the AArch64 format PC relative addressing in adrp+add/ldr pairs. By using stubs, the potentially dllimported addresses can be touched up by the runtime pseudo relocation framework. Differential Revision: https://reviews.llvm.org/D51452 llvm-svn: 341401	2018-09-04 20:56:21 +00:00
David Green	9dd1d451d9	[AArch64] Add Tiny Code Model for AArch64 This adds the plumbing for the Tiny code model for the AArch64 backend. This, instead of loading addresses through the normal ADRP;ADD pair used in the Small model, uses a single ADR. The 21 bit range of an ADR means that the code and its statically defined symbols need to be within 1MB of each other. This makes it mostly interesting for embedded applications where we want to fit as much as we can in as small a space as possible. Differential Revision: https://reviews.llvm.org/D49673 llvm-svn: 340397	2018-08-22 11:31:39 +00:00
Peter Collingbourne	f11eb3ebe7	AArch64: Implement support for the shadowcallstack attribute. The implementation of shadow call stack on aarch64 is quite different to the implementation on x86_64. Instead of reserving a segment register for the shadow call stack, we reserve the platform register, x18. Any function that spills lr to sp also spills it to the shadow call stack, a pointer to which is stored in x18. Differential Revision: https://reviews.llvm.org/D45239 llvm-svn: 329236	2018-04-04 21:55:44 +00:00
Petr Hosek	934e5d5436	[AArch64] Reserve x18 register on Fuchsia This register is reserved as a platform register on Fuchsia. Differential Revision: https://reviews.llvm.org/D45105 llvm-svn: 328950	2018-04-01 23:44:04 +00:00
Martin Storsjo	708498a164	[AArch64] Properly handle dllimport of variables when using fast-isel Differential Revision: https://reviews.llvm.org/D42567 llvm-svn: 323810	2018-01-30 19:50:51 +00:00
Evandro Menezes	9f9daa1f14	[AArch64] Add pipeline model for Exynos M3 Add the scheduling and cost model for Exynos M3. Differential revision: https://reviews.llvm.org/D42387 llvm-svn: 323773	2018-01-30 15:40:16 +00:00
Matthias Braun	5c290dc206	AArch64: Fix emergency spillslot being out of reach for large callframes Re-commit of r322200: The testcase shouldn't hit machineverifiers anymore with r322917 in place. Large callframes (calls with several hundreds or thousands of parameters) could lead to situations in which the emergency spillslot is out of range to be addressed relative to the stack pointer. This commit forces the use of a frame pointer in the presence of large callframes. This commit does several things: - Compute max callframe size at the end of instruction selection. - Add mirFileLoaded target callback. Use it to compute the max callframe size after loading a .mir file when the size wasn't specified in the file. - Let TargetFrameLowering::hasFP() return true if there exists a callframe > 255 bytes. - Always place the emergency spillslot close to FP if we have a frame pointer. - Note that `useFPForScavengingIndex()` would previously return false when a base pointer was available leading to the emergency spillslot getting allocated late (that's the whole effect of this callback). Which made no sense to me so I took this case out: Even though the emergency spillslot is technically not referenced by FP in this case we still want it allocated early. Differential Revision: https://reviews.llvm.org/D40876 llvm-svn: 322919	2018-01-19 03:16:36 +00:00
Matthias Braun	e3a8db7ba1	Revert "AArch64: Fix emergency spillslot being out of reach for large callframes" Revert for now as the testcase is hitting a pre-existing verifier error that manifest as a failure when expensive checks are enabled (or -verify-machineinstrs) is used. This reverts commit r322200. llvm-svn: 322231	2018-01-10 22:36:28 +00:00
Matthias Braun	b42ffa1283	AArch64: Fix emergency spillslot being out of reach for large callframes Large callframes (calls with several hundreds or thousands of parameters) could lead to situations in which the emergency spillslot is out of range to be addressed relative to the stack pointer. This commit forces the use of a frame pointer in the presence of large callframes. This commit does several things: - Compute max callframe size at the end of instruction selection. - Add mirFileLoaded target callback. Use it to compute the max callframe size after loading a .mir file when the size wasn't specified in the file. - Let TargetFrameLowering::hasFP() return true if there exists a callframe > 255 bytes. - Always place the emergency spillslot close to FP if we have a frame pointer. - Note that `useFPForScavengingIndex()` would previously return false when a base pointer was available leading to the emergency spillslot getting allocated late (that's the whole effect of this callback). Which made no sense to me so I took this case out: Even though the emergency spillslot is technically not referenced by FP in this case we still want it allocated early. Differential Revision: https://reviews.llvm.org/D40876 llvm-svn: 322200	2018-01-10 18:16:24 +00:00
Matthias Braun	a92cecfbda	AArch64/X86: Factor out common bzero logic; NFC llvm-svn: 321035	2017-12-18 23:14:28 +00:00
Michael Zolotukhin	a859bd9ced	Remove redundant includes from lib/Target/AArch64. llvm-svn: 320634	2017-12-13 21:31:16 +00:00
Daniel Sanders	7fe7acc6b1	[aarch64][globalisel] Define G_ATOMIC_CMPXCHG and G_ATOMICRMW_* and make them legal The IRTranslator cannot generate these instructions at the moment so there's no issue with not having implemented ISel for them yet. D40092 will add G_ATOMIC_CMPXCHG_WITH_SUCCESS and G_ATOMICRMW_* to the IRTranslator and a further patch will add support for lowering G_ATOMIC_CMPXCHG_WITH_SUCCESS into G_ATOMIC_CMPXCHG with an external success check via the `Lower` action. The separation of G_ATOMIC_CMPXCHG_WITH_SUCCESS and G_ATOMIC_CMPXCHG is to import SelectionDAG rules while still supporting targets that prefer to custom lower the original LLVM-IR-like operation. llvm-svn: 319216	2017-11-28 20:21:15 +00:00
Chad Rosier	71070856e6	[AArch64] Add basic support for Qualcomm's Saphira CPU. llvm-svn: 314105	2017-09-25 14:05:00 +00:00
Sam Parker	b252ffd2cc	[ARM][AArch64] Cortex-A75 and Cortex-A55 support This patch introduces support for Cortex-A75 and Cortex-A55, Arm's latest big.LITTLE A-class cores. They implement the ARMv8.2-A architecture, including the cryptography and RAS extensions, plus the optional dot product extension. They also implement the RCpc AArch64 extension from ARMv8.3-A. Cortex-A75: https://developer.arm.com/products/processors/cortex-a/cortex-a75 Cortex-A55: https://developer.arm.com/products/processors/cortex-a/cortex-a55 Differential Revision: https://reviews.llvm.org/D36667 llvm-svn: 311316	2017-08-21 08:43:06 +00:00
Quentin Colombet	61d71a138b	Reapply "[GlobalISel] Remove the GISelAccessor API." This reverts commit r310425, thus reapplying r310335 with a fix for link issue of the AArch64 unittests on Linux bots when BUILD_SHARED_LIBS is ON. Original commit message: [GlobalISel] Remove the GISelAccessor API. Its sole purpose was to avoid spreading around ifdefs related to building global-isel. Since r309990, GlobalISel is not optional anymore, thus, we can get rid of this mechanism altogether. NFC. ---- The fix for the link issue consists in adding the GlobalISel library to the list of dependencies for the AArch64 unittests. This dependency comes from the use of AArch64Subtarget that needs to know how to destruct the GISel related APIs when being destroyed. Thanks to Bill Seurer and Ahmed Bougacha for helping me reproduce and understand the problem. llvm-svn: 310969	2017-08-15 22:31:51 +00:00
Quentin Colombet	8dd90fb54b	Revert "[GlobalISel] Remove the GISelAccessor API." This reverts commit r310115. It causes a linker failure for the one of the unittests of AArch64 on one of the linux bot: http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/3429 : && /home/fedora/gcc/install/gcc-7.1.0/bin/g++ -fPIC -fvisibility-inlines-hidden -Werror=date-time -std=c++11 -Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wno-maybe-uninitialized -Wdelete-non-virtual-dtor -Wno-comment -ffunction-sections -fdata-sections -O2 -L/home/fedora/gcc/install/gcc-7.1.0/lib64 -Wl,-allow-shlib-undefined -Wl,-O3 -Wl,--gc-sections unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o -o unittests/Target/AArch64/AArch64Tests lib/libLLVMAArch64CodeGen.so.6.0.0svn lib/libLLVMAArch64Desc.so.6.0.0svn lib/libLLVMAArch64Info.so.6.0.0svn lib/libLLVMCodeGen.so.6.0.0svn lib/libLLVMCore.so.6.0.0svn lib/libLLVMMC.so.6.0.0svn lib/libLLVMMIRParser.so.6.0.0svn lib/libLLVMSelectionDAG.so.6.0.0svn lib/libLLVMTarget.so.6.0.0svn lib/libLLVMSupport.so.6.0.0svn -lpthread lib/libgtest_main.so.6.0.0svn lib/libgtest.so.6.0.0svn -lpthread -Wl,-rpath,/home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1/lib && : unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o:(.toc+0x0): undefined reference to `vtable for llvm::LegalizerInfo' unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o:(.toc+0x8): undefined reference to `vtable for llvm::RegisterBankInfo' The particularity of this bot is that it is built with BUILD_SHARED_LIBS=ON However, I was not able to reproduce the problem so far. Reverting to unblock the bot. llvm-svn: 310425	2017-08-08 22:22:30 +00:00
Quentin Colombet	c046208c52	[GlobalISel] Remove the GISelAccessor API. Its sole purpose was to avoid spreading around ifdefs related to building global-isel. Since r309990, GlobalISel is not optional anymore, thus, we can get rid of this mechanism altogether. NFC. llvm-svn: 310115	2017-08-04 20:15:46 +00:00
Quentin Colombet	250e050a50	[GlobalISel] Make GlobalISel a non-optional library. With this change, the GlobalISel library gets always built. In particular, this is not possible to opt GlobalISel out of the build using the LLVM_BUILD_GLOBAL_ISEL variable any more. llvm-svn: 309990	2017-08-03 21:52:25 +00:00
Florian Hahn	2f86e3d494	[AArch64] Use 8 bytes as preferred function alignment on Cortex-A53. Summary: This change gives a 0.25% speedup on execution time, a 0.82% improvement in benchmark scores and a 0.20% increase in binary size on a Cortex-A53. These numbers are the geomean results on a wide range of benchmarks from the test-suite and a range of proprietary suites. Reviewers: t.p.northover, aadg, silviu.baranga, mcrosier, rengolin Reviewed By: rengolin Subscribers: grimar, davide, aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D35568 llvm-svn: 309494	2017-07-29 20:04:54 +00:00
Mandeep Singh Grang	d857b4ca98	[COFF, ARM64] Reserve X18 register by default Reviewers: compnerd, rnk, ruiu, mstorsjo Reviewed By: mstorsjo Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D35531 llvm-svn: 308358	2017-07-18 20:41:33 +00:00
Florian Hahn	3530094de6	[AArch64] Use 16 bytes as preferred function alignment on Cortex-A73. Summary: Using 16 byte alignment is beneficial on Cortex-A73, similar to Cortex-A72 (added in D34961). Reviewers: mcrosier, t.p.northover, aadg, silviu.baranga Reviewed By: t.p.northover Subscribers: aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D35493 llvm-svn: 308283	2017-07-18 09:31:18 +00:00
Florian Hahn	d4550baf3b	[AArch64] Use 16 bytes as preferred function alignment on Cortex-A57. Summary: This change gives a 0.89% speed on execution time, a 0.94% improvement in benchmark scores and a 0.62% increase in binary size on a Cortex-A57. These numbers are the geomean results on a wide range of benchmarks from the test-suite, SPEC2000, SPEC2006 and a range of proprietary suites. The software optimization guide for the Cortex-A57 recommends 16 byte branch alignment. Reviewers: t.p.northover, mcrosier, javed.absar, kristof.beyls, sbaranga Reviewed By: kristof.beyls Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D34954 llvm-svn: 307389	2017-07-07 10:43:01 +00:00
Florian Hahn	e3666ec9d6	[AArch64] Use 16 bytes as preferred function alignment on Cortex-A72. Summary: This change gives a 0.34% speed on execution time, a 0.61% improvement in benchmark scores and a 0.57% increase in binary size on a Cortex-A72. These numbers are the geomean results on a wide range of benchmarks from the test-suite, SPEC2000, SPEC2006 and a range of proprietary suites. The software optimization guide for the Cortex-A72 recommends 16 byte branch alignment. Reviewers: t.p.northover, kristof.beyls, rengolin, sbaranga, mcrosier, javed.absar Reviewed By: kristof.beyls Subscribers: llvm-commits, aemerson Differential Revision: https://reviews.llvm.org/D34961 llvm-svn: 307380	2017-07-07 10:15:49 +00:00
Haicheng Wu	ef790ffd56	[Falkor] Enable SW Prefetch. SW prefetch is good for Falkor. Differential Revision: http://reviews.llvm.org/D34084 llvm-svn: 305199	2017-06-12 16:34:19 +00:00
Matthew Simpson	6349380fa4	Revert r291254: [AArch64] Reduce vector insert/extract cost for Falkor The default vector insert/extract cost is more profitable on Falkor than the reduced cost. llvm-svn: 303771	2017-05-24 16:48:39 +00:00
Daniel Sanders	a1b2db7919	[globalisel][tablegen] Demote OptForSize/OptForMinSize/ForCodeSize to per-function predicates. Summary: This causes them to be re-computed more often than necessary but resolves objections that were raised post-commit on r301750. Reviewers: qcolombet, ab, t.p.northover, rovka, kristof.beyls Reviewed By: qcolombet Subscribers: igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D32861 llvm-svn: 303418	2017-05-19 11:08:33 +00:00
Adam Nemet	e29686e5c1	[SLP] Enable 64-bit wide vectorization on AArch64 ARM Neon has native support for half-sized vector registers (64 bits). This is beneficial for example for 2D and 3D graphics. This patch adds the option to lower MinVecRegSize from 128 via a TTI in the SLP Vectorizer. * Performance Analysis This change was motivated by some internal benchmarks but it is also beneficial on SPEC and the LLVM testsuite. The results are with -O3 and PGO. A negative percentage is an improvement. The testsuite was run with a sample size of 4. SPEC * CFP2006/482.sphinx3 -3.34% A pretty hot loop is SLP vectorized resulting in nice instruction reduction. This used to be a +22% regression before rL299482. * CFP2000/177.mesa -3.34% * CINT2000/256.bzip2 +6.97% My current plan is to extend the fix in rL299482 to i16 which brings the regression down to +2.5%. There are also other problems with the codegen in this loop so there is further room for improvement. ** LLVM testsuite * SingleSource/Benchmarks/Misc/ReedSolomon -10.75% There are multiple small SLP vectorizations outside the hot code. It's a bit surprising that it adds up to 10%. Some of this may be code-layout noise. * MultiSource/Benchmarks/VersaBench/beamformer/beamformer -8.40% The opt-viewer screenshot can be seen at F3218284. We start at a colder store but the tree leads us into the hottest loop. * MultiSource/Applications/lambda-0.1.3/lambda -2.68% * MultiSource/Benchmarks/Bullet/bullet -2.18% This is using 3D vectors. * SingleSource/Benchmarks/Shootout-C++/Shootout-C++-lists +6.67% Noise, binary is unchanged. * MultiSource/Benchmarks/Ptrdist/anagram/anagram +4.90% There is an additional SLP in the cold code. The test runs for ~1sec and prints out over 2000 lines. This is most likely noise. * MultiSource/Applications/aha/aha +1.63% * MultiSource/Applications/JM/lencod/lencod +1.41% * SingleSource/Benchmarks/Misc/richards_benchmark +1.15% Differential Revision: https://reviews.llvm.org/D31965 llvm-svn: 303116	2017-05-15 21:15:01 +00:00
Quentin Colombet	cdf8c81127	[AArch64] Move GISel accessor initialization from TargetMachine to Subtarget. NFC llvm-svn: 301841	2017-05-01 21:53:19 +00:00
Daniel Sanders	e9fdba39e0	[globalisel][tablegen] Compute available feature bits correctly. Summary: Predicate<> now has a field to indicate how often it must be recomputed. Currently, there are two frequencies, per-module (RecomputePerFunction==0) and per-function (RecomputePerFunction==1). Per-function predicates are currently recomputed more frequently than necessary since the only predicate in this category is cheap to test. Per-module predicates are now computed in getSubtargetImpl() while per-function predicates are computed in selectImpl(). Tablegen now manages the PredicateBitset internally. It should only be necessary to add the required includes. Also fixed a problem revealed by the test case where constrainSelectedInstRegOperands() would attempt to tie operands that BuildMI had already tied. Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar Reviewed By: rovka Subscribers: kristof.beyls, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D32491 llvm-svn: 301750	2017-04-29 17:30:09 +00:00
Tim Northover	46e36f0953	AArch64: put nonlazybind special handling behind a flag for now. It's basically a terrible idea anyway but objc_msgSend gets emitted like that. We can decide on a better way to deal with it in the unlikely event that anyone actually uses it. llvm-svn: 300474	2017-04-17 18:18:47 +00:00
Tim Northover	879a0b2e1b	AArch64: support nonlazybind It's almost certainly not a good idea to actually use it in most cases (there's a pretty large code size overhead on AArch64), but we can't do those experiments until it's supported. llvm-svn: 300462	2017-04-17 17:27:56 +00:00
Petr Hosek	9eb0a1e09b	[AArch64][Fuchsia] Allow -mcmodel=kernel for --target=aarch64-fuchsia This mode is just like -mcmodel=small except that it moves the thread pointer from TPIDR_EL0 to TPIDR_EL1. Patch by Roland McGrath. Differential Revision: https://reviews.llvm.org/D31624 llvm-svn: 299462	2017-04-04 19:51:53 +00:00
Joel Jones	2852088126	[AArch64] Vulcan is now ThunderXT99 Broadcom Vulcan is now Cavium ThunderX2T99. LLVM Bugzilla: http://bugs.llvm.org/show_bug.cgi?id=32113 Minor fixes for the alignments of loops and functions for ThunderX T81/T83/T88 (better performance). Patch was tested with SpecCPU2006. Patch by Stefan Teleman Differential Revision: https://reviews.llvm.org/D30510 llvm-svn: 297190	2017-03-07 19:42:40 +00:00
Joel Jones	ab0f3b43e3	[AArch64] Add Cavium ThunderX support This set of patches adds support for Cavium ThunderX ARM64 processors: * ThunderX * ThunderX T81 * ThunderX T83 * ThunderX T88 Patch by Stefan Teleman Differential Revision: https://reviews.llvm.org/D28891 llvm-svn: 295475	2017-02-17 18:34:24 +00:00
Chad Rosier	e177185e79	[AArch64] Reduce vector insert/extract cost for Falkor. Differential Revision: https://reviews.llvm.org/D28403 llvm-svn: 291254	2017-01-06 18:03:26 +00:00
Chad Rosier	ecc77273a0	[AArch64] Set the max interleave factor for Falkor. llvm-svn: 287642	2016-11-22 14:25:02 +00:00
Chad Rosier	201fc1ed26	[AArch64] Add support for Qualcomm's Falkor CPU. Differential Revision: https://reviews.llvm.org/D26673 llvm-svn: 287036	2016-11-15 21:34:12 +00:00
Evandro Menezes	7696dc0685	[AArch64] Adjust the cost model for Exynos M1. Modify the maximum jump table size. llvm-svn: 285106	2016-10-25 20:05:42 +00:00
Abderrazek Zaafrani	9daf8110c8	Set the vectorizer MaxInterleaveFactor for Exynos. llvm-svn: 284839	2016-10-21 16:28:27 +00:00
Tim Northover	69fa84a6e9	GlobalISel: rename legalizer components to match others. The previous names were both misleading (the MachineLegalizer actually contained the info tables) and inconsistent with the selector & translator (in having a "Machine") prefix. This should make everything sensible again. The only functional change is the name of a couple of command-line options. llvm-svn: 284287	2016-10-14 22:18:18 +00:00
Matthias Braun	a827ed8891	AArch64Subtarget: Remove unused CPUString field llvm-svn: 283142	2016-10-03 20:17:02 +00:00
Evandro Menezes	e45de8a5ec	Add support to optionally limit the size of jump tables. Many high-performance processors have a dedicated branch predictor for indirect branches, commonly used with jump tables. As sophisticated as such branch predictors are, they tend to have well defined limits beyond which their effectiveness is hampered or even nullified. One such limit is the number of possible destinations for a given indirect branches that such branch predictors can handle. This patch considers a limit that a target may set to the number of destination addresses in a jump table. Patch by: Evandro Menezes <e.menezes@samsung.com>, Aditya Kumar <aditya.k7@samsung.com>, Sebastian Pop <s.pop@samsung.com>. Differential revision: https://reviews.llvm.org/D21940 llvm-svn: 282412	2016-09-26 15:32:33 +00:00
Ahmed Bougacha	6756a2c953	[GlobalISel] Introduce an instruction selector. And implement it for AArch64, supporting x/w ADD/OR. Differential Revision: https://reviews.llvm.org/D22373 llvm-svn: 276875	2016-07-27 14:31:55 +00:00
Tim Northover	33b07d6725	GlobalISel: implement legalization pass, with just one transformation. This adds the actual MachineLegalizeHelper to do the work and a trivial pass wrapper that legalizes all instructions in a MachineFunction. Currently the only transformation supported is splitting up a vector G_ADD into one acting on smaller vectors. llvm-svn: 276461	2016-07-22 20:03:43 +00:00
Junmo Park	5e4bd2e7c4	Minor code cleanup. NFC. llvm-svn: 274702	2016-07-06 23:15:18 +00:00
Duncan P. N. Exon Smith	632987296f	Target: Remove unused arguments from overrideSchedPolicy, NFC TargetSubtargetInfo::overrideSchedPolicy takes two MachineInstr* arguments (begin and end) that invite implicit conversions from MachineInstrBundleIterator. One option would be to change their type to an iterator, but since they don't seem to have been used since the API was added in 2010, I'm deleting the dead code. llvm-svn: 274304	2016-07-01 00:23:27 +00:00
Rafael Espindola	db6bd02185	Delete unused includes. NFC. llvm-svn: 274225	2016-06-30 12:19:16 +00:00
Pankaj Gode	f4b25547cf	[AArch64] Add Broadcom Vulcan scheduling model. Adding scheduling model for new Broadcom Vulcan core (ARMv8.1A). Differential Revision: http://reviews.llvm.org/D21728 llvm-svn: 274213	2016-06-30 06:42:31 +00:00
Rafael Espindola	3beef8d6db	Move shouldAssumeDSOLocal to Target. Should fix the shared library build. llvm-svn: 273958	2016-06-27 23:15:57 +00:00
Haicheng Wu	a783bac50b	[Kryo] Enable loop prefetcher. Differential Revision: http://reviews.llvm.org/D21535 llvm-svn: 273329	2016-06-21 22:47:56 +00:00
Silviu Baranga	aee40fc61c	[AArch64] Restore codegen for AArch64 Cortex-A72/A73 after NFCI Summary: Code generation for Cortex-A72/Cortex-A73 was accidentally changed by r271555, which was a NFCI. The isCortexA57() predicate was not true for Cortex-A72/Cortex-A73 before r271555 (since it was checking the CPU string). Because Cortex-A72/Cortex-A73 inherit all features from Cortex-A57, all decisions previously guarded by isCortexA57() are now taken. This change restores the behaviour before r271555 by adding separate ProcA72/ProcA73, which have the required features to preserve code generation. Reviewers: kristof.beyls, aadg, mcrosier, rengolin Subscribers: mcrosier, llvm-commits, aemerson, t.p.northover, MatzeB, rengolin Differential Revision: http://reviews.llvm.org/D21182 llvm-svn: 273277	2016-06-21 15:53:54 +00:00
Pankaj Gode	0aab2e398a	[AARCH64] Add support for Broadcom Vulcan Adding core tuning support for new Broadcom Vulcan core (ARMv8.1A). Differential Revision: http://reviews.llvm.org/D21500 llvm-svn: 273148	2016-06-20 11:13:31 +00:00
Evandro Menezes	a3a0a60cff	[AArch64] Add preferred alignments for Exynos M1 Differential Revision: http://reviews.llvm.org/D21203 llvm-svn: 272400	2016-06-10 16:00:18 +00:00
Matthias Braun	651cff42c4	AArch64: Do not test for CPUs, use SubtargetFeatures Testing for specific CPUs has a number of problems, better use subtarget features: - When some tweak is added for a specific CPU it is often desirable for the next version of that CPU as well, yet we often forget to add it. - It is hard to keep track of checks scattered around the target code; Declaring all target specifics together with the CPU in the tablegen file is a clear representation. - Subtarget features can be tweaked from the command line. To discourage people from using CPU checks in the future I removed the isCortexXX(), isCyclone(), ... functions. I added a getProcFamily() function for exceptional circumstances but made it clear in the comment that usage is discouraged. Reformat feature list in AArch64.td to have 1 feature per line in alphabetical order to simplify merging and sorting for out of tree tweaks. No functional change intended. Differential Revision: http://reviews.llvm.org/D20762 llvm-svn: 271555	2016-06-02 18:03:53 +00:00
Rafael Espindola	4d29099f7f	Delete AArch64II::MO_CONSTPOOL. A constant pool holding the address of a variable is equivalent to a got entry. It produces exactly the same instruction sequence as a got use and unlike a got use this is not uniqued by the linker. llvm-svn: 271311	2016-05-31 18:31:14 +00:00
Matthias Braun	27b6692fe2	AArch64Subtarget: Use default member initializers llvm-svn: 271057	2016-05-27 22:14:09 +00:00
Rafael Espindola	a224de06bc	Use shouldAssumeDSOLocal on AArch64. This reduces code duplication and now AArch64 also handles PIE. llvm-svn: 270844	2016-05-26 12:42:55 +00:00
Rafael Espindola	6b93bf5783	Don't repeat name in comment and git-clang-format. llvm-svn: 270785	2016-05-25 22:44:06 +00:00
Rafael Espindola	6b4baa5f58	Sort includes. llvm-svn: 270769	2016-05-25 21:37:29 +00:00
Mehdi Amini	b550cb1750	[NFC] Header cleanup Removed some unused headers, replaced some headers with forward class declarations. Found using simple scripts like this one: clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' \| xargs grep -L 'IndexedMap[<]' \| xargs grep -n --color=auto 'IndexedMap' Patch by Eugene Kosov <claprix@yandex.ru> Differential Revision: http://reviews.llvm.org/D19219 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266595	2016-04-18 09:17:29 +00:00
Tom Stellard	cef0fe4245	[GlobalISel] Move GISelAccessor class into public headers Reviewers: qcolombet Subscribers: joker.eph, vkalintiris, llvm-commits Differential Revision: http://reviews.llvm.org/D19120 llvm-svn: 266348	2016-04-14 17:45:38 +00:00
Quentin Colombet	c17f744001	[AArch64] Teach the subtarget how to get to the RegisterBankInfo. Rework the access to GlobalISel APIs to contain how much of the APIs we need to access for the final executable to build when GlobalISel is not built. This prevents massive usage of ifdefs in various places. Now, all the GlobalISel ifdefs will be happening only in AArch64TargetMachine.cpp. llvm-svn: 265567	2016-04-06 17:26:03 +00:00
Quentin Colombet	ba2a01645b	[GlobalISel] Re-apply r260922-260923 with MSVC-friendly code. Original message: Get rid of the ifdefs in TargetLowering. Introduce a new API used only by GlobalISel: CallLowering. This API will contain target hooks dedicated to call lowering. llvm-svn: 260998	2016-02-16 19:26:02 +00:00
Aaron Ballman	fc64ef1a15	Reverting r260922-260923; they cause link failures with MSVC. http://lab.llvm.org:8011/builders/lldb-x86-windows-msvc2015/builds/15436/steps/build/logs/stdio http://bb.pgr.jp/builders/msbuild-llvmclang-x64-msc18-DA/builds/961/steps/build_llvm/logs/stdio llvm-svn: 260972	2016-02-16 15:29:06 +00:00
Quentin Colombet	1ce38545fb	[GlobalISel] Get rid of the ifdefs in TargetLowering. Introduce a new API used only by GlobalISel: CallLowering. This API will contain target hooks dedicated to call lowering. llvm-svn: 260922	2016-02-16 00:57:44 +00:00
Oliver Stannard	7cc0c4e675	[AArch64] Add subtarget features for ARMv8.2-A This adds subtarget features for ARMv8.2-A, which builds on (and requires the features from) ARMv8.1-A. Most assembler-visible features of ARMv8.2-A are system instructions, and are all required parts of the architecture, so just depend on the HasV8_2aOps subtarget feature. There is also one large, optional feature, which adds 16-bit floating point versions of all existing floating-point instructions (VFP and SIMD), this is represented by the FeatureFullFP16 subtarget feature. Differential Revision: http://reviews.llvm.org/D15013 llvm-svn: 254154	2015-11-26 15:23:32 +00:00
Justin Bogner	fff708db92	AArch64: Default AArch64Subtarget::ReserveX18 to true on darwin Darwin reserves x18, so it's never ABI compliant to generate code that uses it. Set the default value based on the OS part of the triple rather than forcing front-ends to set the +reserve-x18 target feature in order to build correct code for Darwin. This will make r243310 redundant, so I'll revert that shortly. llvm-svn: 253102	2015-11-13 23:05:46 +00:00
Tim Northover	339c83e27f	AArch64: add experimental support for address tagging. AArch64 has the ability to use the top 8-bits of an "address" for extra information, with the memory subsystem automatically masking them off for loads and stores. When that's happening, we can sometimes skip masks on memory operations in the compiler. However, this requires the host OS and support stack to preserve those bits so it can't be enabled everywhere. In principle iOS 8.0 and above do take the required precautions, but we'll put it under a flag for now. llvm-svn: 252573	2015-11-10 00:44:23 +00:00
Matthias Braun	d276de6db1	AArch64: Disable the latency heuristic It turned out not to improve any of our benchmarks but occasionally led to increased register pressure and spilling. Only enabling for the Cyclone CPU as the results on the cortex CPUs give mixed results. Differential Revision: http://reviews.llvm.org/D13708 llvm-svn: 251038	2015-10-22 18:07:38 +00:00
Daniel Sanders	50f17235dd	Revert r247692: Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC. Eric has replied and has demanded the patch be reverted. llvm-svn: 247702	2015-09-15 16:17:27 +00:00
Daniel Sanders	153010c52d	Re-commit r247683: Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC. Summary: This is the first patch in the series to migrate Triple's (which are ambiguous) to TargetTuple's (which aren't). For the moment, TargetTuple simply passes all requests to the Triple object it holds. Once it has replaced Triple, it will start to implement the interface in a more suitable way. This change makes some changes to the public C++ API. In particular, InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer() now take TargetTuples instead of Triples. The other public C++ API's have been left as-is for the moment to reduce patch size. This commit also contains a trivial patch to clang to account for the C++ API change. Thanks go to Pavel Labath for fixing LLDB for me. Reviewers: rengolin Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10969 llvm-svn: 247692	2015-09-15 14:08:28 +00:00
Daniel Sanders	c40de48041	Revert r247684 - Replace Triple with a new TargetTuple ... LLDB needs to be updated in the same commit. llvm-svn: 247686	2015-09-15 13:46:21 +00:00
Daniel Sanders	18d4b0dab7	Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC. Summary: This is the first patch in the series to migrate Triple's (which are ambiguous) to TargetTuple's (which aren't). For the moment, TargetTuple simply passes all requests to the Triple object it holds. Once it has replaced Triple, it will start to implement the interface in a more suitable way. This change makes some changes to the public C++ API. In particular, InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer() now take TargetTuples instead of Triples. The other public C++ API's have been left as-is for the moment to reduce patch size. This commit also contains a trivial patch to clang to account for the C++ API change. Reviewers: rengolin Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10969 llvm-svn: 247683	2015-09-15 13:17:40 +00:00
Ahmed Bougacha	b0ff6437cb	[AArch64] Lower READCYCLECOUNTER using MRS PMCCNTR_EL0. This matches the ARM behavior. In both cases, the register is part of the optional Performance Monitors extension, so, add the feature, and enable it for the A-class processors we support. Differential Revision: http://reviews.llvm.org/D12425 llvm-svn: 246555	2015-09-01 16:23:45 +00:00
Akira Hatanaka	f53b0403f8	[AArch64] Define subtarget feature strict-align. This commit defines subtarget feature strict-align and uses it instead of cl::opt -aarch64-strict-align to decide whether strict alignment should be forced. rdar://problem/21529937 llvm-svn: 243516	2015-07-29 14:17:26 +00:00
Akira Hatanaka	0d4c9ea6e0	[AArch64] Define subtarget feature "reserve-x18", which is used to decide whether register x18 should be reserved. This change is needed because we cannot use a backend option to set cl::opt "aarch64-reserve-x18" when doing LTO. Out-of-tree projects currently using cl::opt option "-aarch64-reserve-x18" to reserve x18 should make changes to add subtarget feature "reserve-x18" to the IR. rdar://problem/21529937 Differential Revision: http://reviews.llvm.org/D11463 llvm-svn: 243186	2015-07-25 00:18:31 +00:00
Mehdi Amini	157e5a6d10	Remove getDataLayout() from TargetSelectionDAGInfo (had no users) Summary: Remove empty subclass in the process. This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren, ted Differential Revision: http://reviews.llvm.org/D11045 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241780	2015-07-09 02:10:08 +00:00
Peter Collingbourne	6a9d1774d0	IR: Do not consider available_externally linkage to be linker-weak. From the linker's perspective, an available_externally global is equivalent to an external declaration (per isDeclarationForLinker()), so it is incorrect to consider it to be a weak definition. Also clean up some logic in the dead argument elimination pass and clarify its comments to better explain how its behavior depends on linkage, introduce GlobalValue::isStrongDefinitionForLinker() and start using it throughout the optimizers and backend. Differential Revision: http://reviews.llvm.org/D10941 llvm-svn: 241413	2015-07-05 20:52:35 +00:00
Daniel Sanders	a73f1fdb19	Replace string GNU Triples with llvm::Triple in MCSubtargetInfo and create*MCSubtargetInfo(). NFC. Summary: This continues the patch series to eliminate StringRef forms of GNU triples from the internals of LLVM that began in r239036. Reviewers: rafael Reviewed By: rafael Subscribers: rafael, ted, jfb, llvm-commits, rengolin, jholewinski Differential Revision: http://reviews.llvm.org/D10311 llvm-svn: 239467	2015-06-10 12:11:26 +00:00
Vladimir Sukharev	439328e172	[AArch64] Rename v8.1a from "extension" to "architecture" v8.1a is renamed to architecture, in line with the approach in the ARM backend. Excess generic cpu is removed. Intended use: "generic" cpu with "v8.1a" subtarget feature Reviewers: jmolloy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8766 llvm-svn: 233810	2015-04-01 14:49:29 +00:00
Vladimir Sukharev	c632cda8b2	[AArch64, ARM] Add v8.1a architecture and generic cpu New architecture and cpu added, following http://community.arm.com/groups/processors/blog/2014/12/02/the-armv8-a-architecture-and-its-ongoing-development Reviewers: t.p.northover Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8505 llvm-svn: 233290	2015-03-26 17:05:54 +00:00
Eric Christopher	a0de253d27	Revert "Migrate the AArch64 TargetRegisterInfo to its TargetMachine" as we don't necessarily need to do this yet - though we could move the base class to the TargetMachine as it isn't subtarget dependent. This reverts commit r232103. llvm-svn: 232665	2015-03-18 20:37:30 +00:00
Eric Christopher	1b585aeb8a	Migrate the AArch64 TargetRegisterInfo to its TargetMachine implementation. This requires a bit of scaffolding and a few fixups that'll go away once all of the ports have been migrated. llvm-svn: 232103	2015-03-12 21:04:46 +00:00

178 Commits