llvm-project

Commit Graph

Author	SHA1	Message	Date
Sjoerd Meijer	24f12711ae	[ARM] Add CLI support for Armv8.1-M and MVE Given the existing infrastructure in LLVM side for +fp and +fp.dp, this is more or less trivial, needing only one tiny source change and a couple of tests. Patch by Simon Tatham. Differential Revision: https://reviews.llvm.org/D60699 llvm-svn: 362096	2019-05-30 14:22:26 +00:00
Simon Tatham	760df47b77	[ARM] Replace fp-only-sp and d16 with fp64 and d32. Those two subtarget features were awkward because their semantics are reversed: each one indicates the _lack_ of support for something in the architecture, rather than the presence. As a consequence, you don't get the behavior you want if you combine two sets of feature bits. Each SubtargetFeature for an FP architecture version now comes in four versions, one for each combination of those options. So you can still say (for example) '+vfp2' in a feature string and it will mean what it's always meant, but there's a new string '+vfp2d16sp' meaning the version without those extra options. A lot of this change is just mechanically replacing positive checks for the old features with negative checks for the new ones. But one more interesting change is that I've rearranged getFPUFeatures() so that the main FPU feature is appended to the output list before rather than after the features derived from the Restriction field, so that -fp64 and -d32 can override defaults added by the main feature. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: srhines, javed.absar, eraman, kristof.beyls, hiraditya, zzheng, Petar.Avramovic, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D60691 llvm-svn: 361845	2019-05-28 16:13:20 +00:00
Craig Topper	a85c0fd918	[X86] Split multi-line chained assignments into single lines to avoid making clang-format create triangle shaped indentation. Simplify one if statement to remove a bunch of string matches. NFCI We had an if statement that checked over every avx512* feature to see if it should enabled avx512f. Since they are all prefixed with avx512 just check for that instead. llvm-svn: 361557	2019-05-23 21:34:36 +00:00
Thomas Lively	eafe8ef6f2	[WebAssembly] Add multivalue and tail-call target features Summary: These features will both be implemented soon, so I thought I would save time by adding the boilerplate for both of them at the same time. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D62047 llvm-svn: 361516	2019-05-23 17:26:47 +00:00
Fangrui Song	f69c992485	[PPC64] Fix PPC64TargetInfo ABI on clang side after D61950 llvm-svn: 361365	2019-05-22 09:26:46 +00:00
Fangrui Song	091aaa69d3	[PPC64] Fix PPC64TargetInfo after D61950 llvm-svn: 361363	2019-05-22 09:17:21 +00:00
Fangrui Song	1c61471ab1	[PPC64] Parse -elfv1 -elfv2 when specified on target triple Summary: For big-endian powerpc64, the default ABI is ELFv1. OpenPower ABI ELFv2 is supported when -mabi=elfv2 is specified. FreeBSD support for PowerPC64 ELFv2 ABI with LLVM is in progress[1]. This patch adds an alternative way to specify ELFv2 ABI on target triple [2]. The following results are expected: ELFv1 when using: -target powerpc64-unknown-freebsd12.0 -target powerpc64-unknown-freebsd12.0 -mabi=elfv1 -target powerpc64-unknown-freebsd12.0-elfv1 ELFv2 when using: -target powerpc64-unknown-freebsd12.0 -mabi=elfv2 -target powerpc64-unknown-freebsd12.0-elfv2 [1] https://wiki.freebsd.org/powerpc/llvm-elfv2 [2] https://clang.llvm.org/docs/CrossCompilation.html Patch by Alfredo Dal'Ava Júnior! Differential Revision: https://reviews.llvm.org/D61950 llvm-svn: 361355	2019-05-22 07:29:59 +00:00
Javed Absar	603a2bac05	[ARM][CMSE] Add commandline option and feature macro Defines macro ARM_FEATURE_CMSE to 1 for v8-M targets and introduces -mcmse option which for v8-M targets sets ARM_FEATURE_CMSE to 3. A diagnostic is produced when the option is given on architectures without support for Security Extensions. Reviewed By: dmgreen, snidertm Differential Revision: https://reviews.llvm.org/D59879 llvm-svn: 361261	2019-05-21 14:21:26 +00:00
Craig Topper	20040db9a6	[X86] Stop implicitly enabling avx512vl when avx512bf16 is enabled. Previously we were doing this so that the 256 bit selectw builtin could be used in the implementation of the 512->256 bit conversion intrinsic. After this commit we now use a masked convert builtin that will emit the intrinsic call and the 256-bit select from custom code in CGBuiltin. Then the header only needs to call that one intrinsic. llvm-svn: 360924	2019-05-16 18:28:17 +00:00
Xing Xue	6dc363ecc1	Add AIX Version Macros Summary: - This patch checks the AIX version and defines the appropriate macros. - Follow up to a comment on D59048. Author: andusy Reviewers: hubert.reinterpretcast, jasonliu, sfertile, xingxue Reviewed By: sfertile Subscribers: jsji, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D61530 llvm-svn: 360900	2019-05-16 14:22:37 +00:00
Stanislav Mekhanoshin	91792f1b93	[AMDGPU] gfx1010 clang target Differential Revision: https://reviews.llvm.org/D61875 llvm-svn: 360634	2019-05-13 23:15:59 +00:00
Akira Hatanaka	c39a243da6	Assume `__cxa_allocate_exception` returns an under-aligned memory on Darwin if the version of libc++abi isn't new enough to include the fix in r319123 This patch resurrects r264998, which was committed to work around a bug in libc++abi that was causing _cxa_allocate_exception to return a memory that wasn't double-word aligned. http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20160328/154332.html In addition, this patch makes clang issue a warning if the type of the thrown object requires an alignment that is larger than the minimum guaranteed by the target C++ runtime. rdar://problem/49864414 Differential Revision: https://reviews.llvm.org/D61667 llvm-svn: 360404	2019-05-10 02:16:37 +00:00
Eric Christopher	e2b7332d2d	Fix typo in risc-v register aliases. Patch by John. Differential Revision: https://reviews.llvm.org/D61464 llvm-svn: 360104	2019-05-07 00:45:47 +00:00
Luo, Yuanke	844f662932	Enable intrinsics of AVX512_BF16, which are supported for BFLOAT16 in Cooper Lake Summary: 1. Enable infrastructure of AVX512_BF16, which is supported for BFLOAT16 in Cooper Lake; 2. Enable intrinsics for VCVTNE2PS2BF16, VCVTNEPS2BF16 and DPBF16PS instructions, which are Vector Neural Network Instructions supporting BFLOAT16 inputs and conversion instructions from IEEE single precision. For more details about BF16 intrinsic, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference Patch by LiuTianle Reviewers: craig.topper, smaslov, LuoYuanke, wxiao3, annita.zhang, spatel, RKSimon Reviewed By: craig.topper Subscribers: mgorny, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D60552 llvm-svn: 360018	2019-05-06 08:25:11 +00:00
Tom Tan	b7c6d95af5	[COFF, ARM64] Align global symbol by size for ARM64 MSVC ABI According to alignment section in below ARM64 ABI document, MSVC could increase alignment of global data based on its total size. Clang doesn't do this. Compile the same symbol into different alignments by Clang and MSVC could cause link error because some instruction encodings, like 64-bit LDR/STR with immediate, require the target to be 8 bytes aligned, and linker could choose code stream with such LDR/STR instruction from MSVC and 4 bytes aligned data from Clang into final image, which actually cannot be linked together (see https://bugs.llvm.org/show_bug.cgi?id=41506 for more details). https://docs.microsoft.com/en-us/cpp/build/arm64-windows-abi-conventions?view=vs-2019#alignment Differential Revision: https://reviews.llvm.org/D61225 llvm-svn: 359744	2019-05-02 00:38:14 +00:00
Yaxun Liu	4469701207	AMDGPU: Enable _Float16 llvm-svn: 359594	2019-04-30 18:35:37 +00:00
Vitaly Buka	ae01981d03	[AArch64] Initialize HasMTE llvm-svn: 359366	2019-04-27 02:40:01 +00:00
Javed Absar	18b0c40bc5	[AArch64] Add support for MTE intrinsics This provides intrinsics support for Memory Tagging Extension (MTE), which was introduced with the Armv8.5-a architecture. These intrinsics are available when __ARM_FEATURE_MEMORY_TAGGING is defined. Each intrinsic is described in detail in the ACLE Q1 2019 documentation: https://developer.arm.com/docs/101028/latest Reviewed By: Tim Nortover, David Spickett Differential Revision: https://reviews.llvm.org/D60485 llvm-svn: 359348	2019-04-26 21:08:11 +00:00
Yonghong Song	51a4a0d68f	[BPF] do not generate predefined macro bpf "DefineStd(Builder, "bpf", Opts)" generates the following three macros: bpf __bpf __bpf__ and the macro "bpf" is due to the fact that the target language is C which allows GNU extensions. The name "bpf" could be easily used as variable name or type field name. For example, in current linux kernel, there are four places where bpf is used as a field name. If the corresponding types are included in bpf program, the compilation error will occur. This patch removed predefined macro "bpf" as well as "__bpf" which is rarely used if used at all. Signed-off-by: Yonghong Song <yhs@fb.com> Differential Revision: https://reviews.llvm.org/D61173 llvm-svn: 359310	2019-04-26 15:35:51 +00:00
Artem Belevich	5fe85a003f	[CUDA] Implemented _[bi]mma* builtins. These builtins provide access to the new integer and sub-integer variants of MMA (matrix multiply-accumulate) instructions provided by CUDA-10.x on sm_75 (AKA Turing) GPUs. Also added a feature for PTX 6.4. While Clang/LLVM does not generate any PTX instructions that need it, we still need to pass it through to ptxas in order to be able to compile code that uses the new 'mma' instruction as inline assembly (e.g used by NVIDIA's CUTLASS library https://github.com/NVIDIA/cutlass/blob/master/cutlass/arch/mma.h#L101) Differential Revision: https://reviews.llvm.org/D60279 llvm-svn: 359248	2019-04-25 22:28:09 +00:00
Stanislav Mekhanoshin	1d9f286ecb	[AMDGPU] rename vi-insts into gfx8-insts Differential Revision: https://reviews.llvm.org/D60293 llvm-svn: 357792	2019-04-05 18:25:00 +00:00
Alon Zakai	b4f9991f38	[WebAssembly] Add Emscripten OS definition + small_printf The Emscripten OS provides a definition of __EMSCRIPTEN__, and also that it supports iprintf optimizations. Also define small_printf optimizations, which is a printf with float support but not long double (which in wasm can be useful since long doubles are 128 bit and force linking of float128 emulation code). This part is based on sunfish's https://reviews.llvm.org/D57620 (which can't land yet since the WASI integration isn't ready yet). Differential Revision: https://reviews.llvm.org/D60167 llvm-svn: 357552	2019-04-03 01:08:35 +00:00
Simon Pilgrim	4bad9c2170	Fix Wimplicit-fallthrough warning introduced in rL357466. NFCI. llvm-svn: 357467	2019-04-02 11:25:38 +00:00
Strahinja Petrovic	4f839ac188	[PowerPC] Fix issue with inline asm - soft float mode This patch prevents floating point register constraints in soft float mode. Differential Revision: https://reviews.llvm.org/D59310 llvm-svn: 357466	2019-04-02 11:00:09 +00:00
Fangrui Song	75e74e077c	Range-style std::find{,_if} -> llvm::find{,_if}. NFC llvm-svn: 357359	2019-03-31 08:48:19 +00:00
Thomas Lively	5f0c4c67bb	[WebAssembly] Add mutable globals feature Summary: This feature is not actually used for anything in the WebAssembly backend, but adding it allows users to get it into the target features sections of their objects, which makes these objects future-compatible. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, jdoerfert, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D60013 llvm-svn: 357321	2019-03-29 22:00:18 +00:00
Craig Topper	7339e61b89	[X86] Correct the value of MaxAtomicInlineWidth for pre-586 cpus Use the new cx8 feature flag that was added to the backend to represent support for cmpxchg8b. Use this flag to set the MaxAtomicInlineWidth. This also assumes all the cmpxchg instructions are enabled for CK_Generic which is what cc1 defaults to when nothing is specified. Differential Revision: https://reviews.llvm.org/D59566 llvm-svn: 356709	2019-03-21 20:36:08 +00:00
Craig Topper	f2f139e9ef	[X86] Use the CPUKind enum from PROC_ALIAS to directly get the CPUKind in fillValidCPUList. We were using getCPUKind which translates the string to the enum also using PROC_ALIAS. This just cuts out the string compares. llvm-svn: 356686	2019-03-21 17:33:20 +00:00
Craig Topper	140f766f14	[X86] Remove getCPUKindCanonicalName which is unused. Differential Revision: https://reviews.llvm.org/D59578 llvm-svn: 356580	2019-03-20 17:26:51 +00:00
Craig Topper	dfa0fdbde0	[X86] Separate PentiumPro and i686. They aren't aliases in the backend. PentiumPro has HasNOPL set in the backend. i686 does not. Despite having a function that looks like it canonicalizes alias names. It doesn't seem to be called. So I don't think this is a functional change. But its good to be consistent between the backend and frontend. llvm-svn: 356537	2019-03-20 07:31:18 +00:00
Michael Liao	3c2aadbe67	[AMDGPU] Add the missing clang change of the experimental buffer fat pointer llvm-svn: 356385	2019-03-18 18:11:37 +00:00
Jason Liu	7f7867b05a	Reland the rest of "Add AIX Target Info" llvm-svn 356197 relanded previously failing test case max_align.c. This commit will reland the rest of llvm-svn 356060 commit. Differential Revision: https://reviews.llvm.org/D59048 llvm-svn: 356208	2019-03-14 21:54:30 +00:00
Craig Topper	bee966d163	[X86] Only define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 in 64-bit mode. Summary: This define should correspond to CMPXCHG16B being available which requires 64-bit mode. I checked and gcc also seems to only define this in 64-bit mode. Reviewers: RKSimon, spatel, efriedma, jyknight, jfb Reviewed By: jfb Subscribers: jfb, cfe-commits, llvm-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D59287 llvm-svn: 356118	2019-03-14 05:45:42 +00:00
Jason Liu	e62ccefe44	Revert "Add AIX Target Info" This reverts commit `4e192d0e1e`. The newly added test case max_align.c do not work on all platforms. original llvm-svn: 356060 llvm-svn: 356070	2019-03-13 17:57:23 +00:00
Jason Liu	4e192d0e1e	Add AIX Target Info Summary: A first pass over platform-specific properties of the C API/ABI on AIX for both 32-bit and 64-bit modes. This is a continuation of D18360 by Andrew Paprocki and further work by Wu Zhao. Patch by Andus Yu Reviewers: apaprocki, chandlerc, hubert.reinterpretcast, jasonliu, xingxue, sfertile Reviewed by: hubert.reinterpretcast, apaprocki, sfertile Differential Revision: https://reviews.llvm.org/D59048 llvm-svn: 356060	2019-03-13 16:02:26 +00:00
Craig Topper	d02c9f59ff	[X86] Remove 'cx16' from 'prescott' and 'yonah' as they are 32-bit only CPUs and cmpxchg16b requires 64-bit mode. llvm-svn: 356008	2019-03-13 05:14:52 +00:00
Michael Platings	308e82eceb	[IR][ARM] Add function pointer alignment to datalayout Use this feature to fix a bug on ARM where 4 byte alignment is incorrectly assumed. Differential Revision: https://reviews.llvm.org/D57335 llvm-svn: 355685	2019-03-08 10:44:06 +00:00
Mitch Phillips	92dd321a14	Rollback of rL355585. Introduces memory leak in FunctionTest.GetPointerAlignment that breaks sanitizer buildbots: ``` ================================================================= ==2453==ERROR: LeakSanitizer: detected memory leaks Direct leak of 128 byte(s) in 1 object(s) allocated from: #0 0x610428 in operator new(unsigned long) /b/sanitizer-x86_64-linux-bootstrap/build/llvm/projects/compiler-rt/lib/asan/asan_new_delete.cc:105 #1 0x16936bc in llvm::User::operator new(unsigned long) /b/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/IR/User.cpp:151:19 #2 0x7c3fe9 in Create /b/sanitizer-x86_64-linux-bootstrap/build/llvm/include/llvm/IR/Function.h:144:12 #3 0x7c3fe9 in (anonymous namespace)::FunctionTest_GetPointerAlignment_Test::TestBody() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/unittests/IR/FunctionTest.cpp:136 #4 0x1a836a0 in HandleExceptionsInMethodIfSupported<testing::Test, void> /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc #5 0x1a836a0 in testing::Test::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:2474 #6 0x1a85c55 in testing::TestInfo::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:2656:11 #7 0x1a870d0 in testing::TestCase::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:2774:28 #8 0x1aa5b84 in testing::internal::UnitTestImpl::RunAllTests() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:4649:43 #9 0x1aa4d30 in HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc #10 0x1aa4d30 in testing::UnitTest::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:4257 #11 0x1a6b656 in RUN_ALL_TESTS /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/include/gtest/gtest.h:2233:46 #12 0x1a6b656 in main /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/UnitTestMain/TestMain.cpp:50 #13 0x7f5af37a22e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0) Indirect leak of 40 byte(s) in 1 object(s) allocated from: #0 0x610428 in operator new(unsigned long) /b/sanitizer-x86_64-linux-bootstrap/build/llvm/projects/compiler-rt/lib/asan/asan_new_delete.cc:105 #1 0x151be6b in make_unique<llvm::ValueSymbolTable> /b/sanitizer-x86_64-linux-bootstrap/build/llvm/include/llvm/ADT/STLExtras.h:1349:29 #2 0x151be6b in llvm::Function::Function(llvm::FunctionType, llvm::GlobalValue::LinkageTypes, unsigned int, llvm::Twine const&, llvm::Module) /b/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/IR/Function.cpp:241 #3 0x7c4006 in Create /b/sanitizer-x86_64-linux-bootstrap/build/llvm/include/llvm/IR/Function.h:144:16 #4 0x7c4006 in (anonymous namespace)::FunctionTest_GetPointerAlignment_Test::TestBody() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/unittests/IR/FunctionTest.cpp:136 #5 0x1a836a0 in HandleExceptionsInMethodIfSupported<testing::Test, void> /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc #6 0x1a836a0 in testing::Test::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:2474 #7 0x1a85c55 in testing::TestInfo::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:2656:11 #8 0x1a870d0 in testing::TestCase::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:2774:28 #9 0x1aa5b84 in testing::internal::UnitTestImpl::RunAllTests() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:4649:43 #10 0x1aa4d30 in HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc #11 0x1aa4d30 in testing::UnitTest::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:4257 #12 0x1a6b656 in RUN_ALL_TESTS /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/include/gtest/gtest.h:2233:46 #13 0x1a6b656 in main /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/UnitTestMain/TestMain.cpp:50 #14 0x7f5af37a22e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0) SUMMARY: AddressSanitizer: 168 byte(s) leaked in 2 allocation(s). ``` See http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/11358/steps/check-llvm%20asan/logs/stdio for more information. Also introduces use-of-uninitialized-value in ConstantsTest.FoldGlobalVariablePtr: ``` ==7070==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x14e703c in User /b/sanitizer-x86_64-linux-fast/build/llvm/include/llvm/IR/User.h:79:5 #1 0x14e703c in Constant /b/sanitizer-x86_64-linux-fast/build/llvm/include/llvm/IR/Constant.h:44 #2 0x14e703c in llvm::GlobalValue::GlobalValue(llvm::Type, llvm::Value::ValueTy, llvm::Use, unsigned int, llvm::GlobalValue::LinkageTypes, llvm::Twine const&, unsigned int) /b/sanitizer-x86_64-linux-fast/build/llvm/include/llvm/IR/GlobalValue.h:78 #3 0x14e5467 in GlobalObject /b/sanitizer-x86_64-linux-fast/build/llvm/include/llvm/IR/GlobalObject.h:34:9 #4 0x14e5467 in llvm::GlobalVariable::GlobalVariable(llvm::Type, bool, llvm::GlobalValue::LinkageTypes, llvm::Constant, llvm::Twine const&, llvm::GlobalValue::ThreadLocalMode, unsigned int, bool) /b/sanitizer-x86_64-linux-fast/build/llvm/lib/IR/Globals.cpp:314 #5 0x6938f1 in llvm::(anonymous namespace)::ConstantsTest_FoldGlobalVariablePtr_Test::TestBody() /b/sanitizer-x86_64-linux-fast/build/llvm/unittests/IR/ConstantsTest.cpp:565:18 #6 0x1a240a1 in HandleExceptionsInMethodIfSupported<testing::Test, void> /b/sanitizer-x86_64-linux-fast/build/llvm/utils/unittest/googletest/src/gtest.cc #7 0x1a240a1 in testing::Test::Run() /b/sanitizer-x86_64-linux-fast/build/llvm/utils/unittest/googletest/src/gtest.cc:2474 #8 0x1a26d26 in testing::TestInfo::Run() /b/sanitizer-x86_64-linux-fast/build/llvm/utils/unittest/googletest/src/gtest.cc:2656:11 #9 0x1a2815f in testing::TestCase::Run() /b/sanitizer-x86_64-linux-fast/build/llvm/utils/unittest/googletest/src/gtest.cc:2774:28 #10 0x1a43de8 in testing::internal::UnitTestImpl::RunAllTests() /b/sanitizer-x86_64-linux-fast/build/llvm/utils/unittest/googletest/src/gtest.cc:4649:43 #11 0x1a42c47 in HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> /b/sanitizer-x86_64-linux-fast/build/llvm/utils/unittest/googletest/src/gtest.cc #12 0x1a42c47 in testing::UnitTest::Run() /b/sanitizer-x86_64-linux-fast/build/llvm/utils/unittest/googletest/src/gtest.cc:4257 #13 0x1a0dfba in RUN_ALL_TESTS /b/sanitizer-x86_64-linux-fast/build/llvm/utils/unittest/googletest/include/gtest/gtest.h:2233:46 #14 0x1a0dfba in main /b/sanitizer-x86_64-linux-fast/build/llvm/utils/unittest/UnitTestMain/TestMain.cpp:50 #15 0x7f2081c412e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0) #16 0x4dff49 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/unittests/IR/IRTests+0x4dff49) SUMMARY: MemorySanitizer: use-of-uninitialized-value /b/sanitizer-x86_64-linux-fast/build/llvm/include/llvm/IR/User.h:79:5 in User ``` See http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/30222/steps/check-llvm%20msan/logs/stdio for more information. llvm-svn: 355616	2019-03-07 18:13:39 +00:00
Michael Platings	fd4156ed4d	[IR][ARM] Add function pointer alignment to datalayout Use this feature to fix a bug on ARM where 4 byte alignment is incorrectly assumed. Differential Revision: https://reviews.llvm.org/D57335 llvm-svn: 355585	2019-03-07 09:15:23 +00:00
Mitch Phillips	318028f00f	Revert "[IR][ARM] Add function pointer alignment to datalayout" This reverts commit `2391bfca97`. This reverts rL355522 (https://reviews.llvm.org/D57335). Kills buildbots that use '-Werror' with the following error: /var/lib/buildbot/sanitizer-buildbot6/sanitizer-x86_64-linux-android/build/llvm/lib/IR/Value.cpp:657:7: error: default label in switch which covers all enumeration values [-Werror,-Wcovered-switch-default] See buildbots http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/30200/steps/check-llvm%20asan/logs/stdio for more information. llvm-svn: 355537	2019-03-06 19:17:18 +00:00
Michael Platings	2391bfca97	[IR][ARM] Add function pointer alignment to datalayout Use this feature to fix a bug on ARM where 4 byte alignment is incorrectly assumed. Differential Revision: https://reviews.llvm.org/D57335 llvm-svn: 355522	2019-03-06 17:24:11 +00:00
Ganesh Gopalasubramanian	4f171d2761	[X86] AMD znver2 enablement This patch enables the following 1) AMD family 17h "znver2" tune flag (-march, -mcpu). 2) ISAs that are enabled for "znver2" architecture. 3) For the time being, it uses the znver1 scheduler model. 4) Tests are updated. 5) This patch is the clang counterpart to D58343 Reviewers: craig.topper Tags: #clang Differential Revision: https://reviews.llvm.org/D58344 llvm-svn: 354899	2019-02-26 17:15:36 +00:00
Oliver Stannard	e3c8ce8b75	[ARM] Add pre-defined macros for ROPI and RWPI This adds ACLE-defined macros to test for code being compiled in the ROPI and RWPI position-independence modes. Differential revision: https://reviews.llvm.org/D23610 llvm-svn: 354265	2019-02-18 12:39:47 +00:00
Nirav Dave	91ecb69acd	[X86] Prevent clang clobber checking for asm flag constraints. Update getConstraintRegister as X86 Asm flag output constraints are no longer fully alphanumeric, llvm-svn: 354211	2019-02-17 03:53:23 +00:00
Nirav Dave	90868bb058	[X86] Add clang support for X86 flag output parameters. Summary: Add frontend support and expected flags for X86 inline assembly flag parameters. Reviewers: craig.topper, rnk, echristo Subscribers: eraman, nickdesaulniers, void, llvm-commits Differential Revision: https://reviews.llvm.org/D57394 llvm-svn: 354053	2019-02-14 19:27:25 +00:00
Hubert Tong	45195c873b	[PowerPC] Stop defining _ARCH_PWR6X on POWER7 and up Summary: The predefined macro `_ARCH_PWR6X` is associated with GCC's `-mcpu=power6x` option, which enables generation of P6 "raw mode" instructions such as `mftgpr`. Later POWER processors build upon the "architected mode", not the raw one. `_ARCH_PWR6X` should not be defined for these later processors. Fixes PR#40236. Reviewers: echristo, hfinkel, kbarton, nemanjai, wschmidt Reviewed By: hfinkel Subscribers: jsji, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D58128 llvm-svn: 353975	2019-02-13 20:17:13 +00:00
Simon Atanasyan	4c22a57414	[Headers][mips] Add `__attribute__((__mode__(__unwind_word__)))` to the _Unwind_Word / _Unwind_SWord definitions The rationale of this change is to fix _Unwind_Word / _Unwind_SWord definitions for MIPS N32 ABI. This ABI uses 32-bit pointers, but _Unwind_Word and _Unwind_SWord types are eight bytes long. # The __attribute__((__mode__(__unwind_word__))) is added to the type definitions. It makes them equal to the corresponding definitions used by GCC and allows to override types using `getUnwindWordWidth` function. # The `getUnwindWordWidth` virtual function override in the `MipsTargetInfo` class and provides correct type size values. Differential revision: https://reviews.llvm.org/D58165 llvm-svn: 353965	2019-02-13 18:27:09 +00:00
Brad Smith	09699a7603	long double is double on OpenBSD/NetBSD/PPC. Patch by George Koehler. llvm-svn: 353656	2019-02-11 02:53:16 +00:00
Stanislav Mekhanoshin	1607a37308	[AMDGPU] Split dot-insts feature Differential Revision: https://reviews.llvm.org/D57972 llvm-svn: 353588	2019-02-09 00:34:41 +00:00
Jiong Wang	862e7405e8	bpf: teach BPF driver about the new CPU "v3" This patch simply teach BPF driver about the new CPU "v3" introduced in LLVM backend. Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> llvm-svn: 353479	2019-02-07 22:51:56 +00:00
Heejin Ahn	bab8597916	[WebAssembly] Add atomics target option Reviewers: tlively Subscribers: dschuff, sbc100, jgravelle-google, sunfish, jfb, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D57798 llvm-svn: 353260	2019-02-06 01:41:26 +00:00
Alexey Bataev	1a9e05d7da	[DEBUG_INFO][NVPTX] Generate correct data about variable address class. Summary: Added ability to generate correct debug info data about the variable address class. Currently, for all the locals and globals the default values are used, ADDR_local_space(6) for locals and ADDR_global_space(5) for globals. The values are taken from the table in https://docs.nvidia.com/cuda/archive/10.0/ptx-writers-guide-to-interoperability/index.html#cuda-specific-dwarf. We need to emit correct data for address classes of, at least, shared and constant globals. Currently, all these variables are treated by the cuda-gdb debugger as the variables in the global address space and, thus, it require manual data type casting. Reviewers: echristo, probinson Subscribers: jholewinski, aprantl, cfe-commits Differential Revision: https://reviews.llvm.org/D57162 llvm-svn: 353204	2019-02-05 19:45:57 +00:00
Yaxun Liu	277e064bf5	Do not copy long double and 128-bit fp format from aux target for AMDGPU rC352620 caused regressions because it copied floating point format from aux target. floating point format decides whether extended long double is supported. It is x86_fp80 on x86 but IEEE double on amdgcn. Document usage of long doubel type in HIP programming guide https://github.com/ROCm-Developer-Tools/HIP/pull/890 Differential Revision: https://reviews.llvm.org/D57527 llvm-svn: 352801	2019-01-31 21:57:51 +00:00
Thomas Lively	88058d4e1e	[WebAssembly] Add bulk memory target feature Summary: Also clean up some preexisting target feature code. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, jfb Differential Revision: https://reviews.llvm.org/D57495 llvm-svn: 352793	2019-01-31 21:02:19 +00:00
Yaxun Liu	95f2ca541f	[HIP] Fix size_t for MSVC environment In 64 bit MSVC environment size_t is defined as unsigned long long. In single source language like HIP, data layout should be consistent in device and host compilation, therefore copy data layout controlling fields from Aux target for AMDGPU target. Differential Revision: https://reviews.llvm.org/D56318 llvm-svn: 352620	2019-01-30 12:26:54 +00:00
Erich Keane	1d1d438e8e	Disable _Float16 for non ARM/SPIR Targets As Discussed here: http://lists.llvm.org/pipermail/llvm-dev/2019-January/129543.html There are problems exposing the _Float16 type on architectures that haven't defined the ABI/ISel for the type yet, so we're temporarily disabling the type and making it opt-in. Differential Revision: https://reviews.llvm.org/D57188 Change-Id: I5db7366dedf1deb9485adb8948b1deb7e612a736 llvm-svn: 352221	2019-01-25 17:27:57 +00:00
Anton Korobeynikov	58f6bc509b	[MSP430] Ajust f32/f64 alignment according to MSP430 EABI Patch by Kristina Bessonova! Differential Revision: https://reviews.llvm.org/D57015 llvm-svn: 352177	2019-01-25 08:51:53 +00:00
Dan Gohman	c1eee1d659	[WebAssembly] Add a __wasi__ target macro This adds a `__wasi__` macro for the wasi OS, similar to `__linux__` etc. for other OS's. Differential Revision: https://reviews.llvm.org/D57155 llvm-svn: 352105	2019-01-24 21:05:11 +00:00
Dan Gohman	a957fa7e15	[WebAssembly] Support __float128 This enables support for the "__float128" keyword. Differential Revision: https://reviews.llvm.org/D57154 llvm-svn: 352100	2019-01-24 20:33:28 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Chandler Carruth	4a50956c07	Convert two more files that were using Windows line endings and remove a stray single '\r' from one file. These are the last line ending issues I can find in the files containing parts of LLVM's file headers. llvm-svn: 351634	2019-01-19 06:36:08 +00:00
Nirav Dave	7503316d10	Revert "Clang side support for @cc assembly operands." llvm-svn: 351561	2019-01-18 16:03:08 +00:00
Nirav Dave	d410e392cd	Clang side support for @cc assembly operands. llvm-svn: 351559	2019-01-18 15:57:23 +00:00
Craig Topper	5589738979	[Nios2] Remove Nios2 backend As mentioned here http://lists.llvm.org/pipermail/llvm-dev/2019-January/129121.html This backend is incomplete and has not been maintained in several months. Differential Revision: https://reviews.llvm.org/D56690 llvm-svn: 351230	2019-01-15 19:58:36 +00:00
Thomas Lively	b7b9fdc114	[WebAssembly] Add unimplemented-simd128 feature, gate builtins Summary: Depends on D56501. Also adds a macro define `__wasm_unimplemented_simd128__` for feature detection of unimplemented SIMD builtins. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, kristina, cfe-commits, rrwinterton llvm-svn: 350909	2019-01-10 23:49:00 +00:00
Stanislav Mekhanoshin	6332f4d0d4	[AMDGPU] Separate feature dot-insts Differential Revision: https://reviews.llvm.org/D56525 llvm-svn: 350794	2019-01-10 03:25:47 +00:00
Dan Albert	706b1f3aeb	Android is not GNU, so don't claim that it is. Reviewers: pirama, srhines Reviewed By: srhines Subscribers: kristina, cfe-commits Differential Revision: https://reviews.llvm.org/D55953 llvm-svn: 350664	2019-01-08 22:31:19 +00:00
Michal Gorny	5a409d0e30	Replace getOS() == llvm::Triple::BSD with isOSBSD() [NFCI] Replace multiple comparisons of getOS() value with FreeBSD, NetBSD, OpenBSD and DragonFly with matching isOS*BSD() methods. This should improve the consistency of coding style without changing the behavior. Direct getOS() comparisons were left whenever used in switch or switch- like context. Differential Revision: https://reviews.llvm.org/D55916 llvm-svn: 349752	2018-12-20 13:09:30 +00:00
Saleem Abdulrasool	d4f7d6a7a1	Basic: make `int_least64_t` and `int_fast64_t` match on Darwin The Darwin targets use `int64_t` and `uint64_t` to define the `int_least64_t` and `int_fast64_t` types. The underlying type is actually a `long long`. Match the types to allow the printf specifiers to work properly and have the compiler vended macros match the implementation on the target. llvm-svn: 348939	2018-12-12 17:05:20 +00:00
Richard Trieu	6368818fd5	Move CodeGenOptions from Frontend to Basic Basic uses CodeGenOptions and should not depend on Frontend. llvm-svn: 348827	2018-12-11 03:18:39 +00:00
Kang Zhang	9606d58a5f	[PowerPC] VSX register support for inline assembly Summary: The patch is to add the VSX register support for inline assembly. After this patch, we can use VSX register in inline assembly clobber list without error. Reviewed By: jsji, nemanjai Differential Revision: https://reviews.llvm.org/D55192 llvm-svn: 348572	2018-12-07 08:58:12 +00:00
Saleem Abdulrasool	f587857c88	ARM, AArch64: support `__attribute__((__swiftcall__))` Support the Swift calling convention on Windows ARM and AArch64. Both of these conform to the AAPCS, AAPCS64 calling convention, and LLVM has been adjusted to account for the register usage. Ensure that the frontend passes this into the backend. This allows the swift runtime to be built for Windows. llvm-svn: 348454	2018-12-06 03:28:37 +00:00
Krzysztof Parzyszek	85393b28f9	[Hexagon] Add support for Hexagon V66 llvm-svn: 348415	2018-12-05 21:38:35 +00:00
Kristina Brooks	1051bb7463	[Haiku] Support __float128 for x86 and x86_64 This patch addresses a compilation error with clang when running in Haiku being unable to compile code using float128 (throws compilation error such as 'float128 is not supported on this target'). Patch by kallisti5 (Alexander von Gluck IV) Differential Revision: https://reviews.llvm.org/D54901 llvm-svn: 348368	2018-12-05 15:05:06 +00:00
Ulrich Weigand	88e0660bf2	[SystemZ] Do not support __float128 As of rev. 268898, clang supports __float128 on SystemZ. This seems to have been in error. GCC has never supported __float128 on SystemZ, since the "long double" type on the platform is already IEEE-128. (GCC only supports __float128 on platforms where "long double" is some other data type.) For compatibility reasons this patch removes __float128 on SystemZ again. The test case is updated accordingly. llvm-svn: 348247	2018-12-04 10:51:36 +00:00
Kristina Brooks	77a4adc4f9	Add Hurd target to Clang driver (2/2) This adds Hurd toolchain support to Clang's driver in addition to handling translating the triple from Hurd-compatible form to the actual triple registered in LLVM. (Phabricator was stripping the empty files from the patch so I manually created them) Patch by sthibaul (Samuel Thibault) Differential Revision: https://reviews.llvm.org/D54379 llvm-svn: 347833	2018-11-29 03:49:14 +00:00
Tatyana Krasnukha	f8c264e02e	[clang][ARC] Add ARCTargetInfo Based-on-patch-by: Pete Couperus <petecoup@synopsys.com> Differential Revision: https://reviews.llvm.org/D53100 llvm-svn: 347699	2018-11-27 19:52:10 +00:00
Craig Topper	5bb1bf6ff5	[X86] Add -march=cascadelake support in clang. This is skylake-avx512 with the addition of avx512vnni ISA. Patch by Jianping Chen Differential Revision: https://reviews.llvm.org/D54792 llvm-svn: 347682	2018-11-27 18:05:14 +00:00
Sander de Smalen	44a2253a54	[AArch64] Add aarch64_vector_pcs function attribute to Clang This is the Clang patch to complement the following LLVM patches: https://reviews.llvm.org/D51477 https://reviews.llvm.org/D51479 More information describing the vector ABI and procedure call standard can be found here: https://developer.arm.com/products/software-development-tools/\ hpc/arm-compiler-for-hpc/vector-function-abi Patch by Kerry McLaughlin. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D54425 llvm-svn: 347571	2018-11-26 16:38:37 +00:00
Reid Kleckner	4dc0b1ac60	Fix clang -Wimplicit-fallthrough warnings across llvm, NFC This patch should not introduce any behavior changes. It consists of mostly one of two changes: 1. Replacing fall through comments with the LLVM_FALLTHROUGH macro 2. Inserting 'break' before falling through into a case block consisting of only 'break'. We were already using this warning with GCC, but its warning behaves slightly differently. In this patch, the following differences are relevant: 1. GCC recognizes comments that say "fall through" as annotations, clang doesn't 2. GCC doesn't warn on "case N: foo(); default: break;", clang does 3. GCC doesn't warn when the case contains a switch, but falls through the outer case. I will enable the warning separately in a follow-up patch so that it can be cleanly reverted if necessary. Reviewers: alexfh, rsmith, lattner, rtrieu, EricWF, bollu Differential Revision: https://reviews.llvm.org/D53950 llvm-svn: 345882	2018-11-01 19:54:45 +00:00
Reid Kleckner	14518f1d9b	Add LLVM_FALLTHROUGH annotation after switch This silences a -Wimplicit-fallthrough warning from clang. GCC does not appear to warn when the case body ends in a switch. This is a somewhat surprising but intended fallthrough that I pulled out from my mechanical patch. The code intends to handle 'Yi' and related constraints as the 'x' constraint. llvm-svn: 345873	2018-11-01 18:53:02 +00:00
Li Jia He	bbaedf2ba1	[Clang][PowerPC] Support constraint 'wi' in asm From the gcc manual, we can see that the specific limit of wi inline asm is “FP or VSX register to hold 64-bit integers for VSX insns or NO_REGS”. The link is https://gcc.gnu.org/onlinedocs/gcc-8.2.0/gcc/Machine-Constraints.html#Machine-Constraints. We should accept this constraint. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D53265 llvm-svn: 345809	2018-11-01 02:32:49 +00:00
Erik Pilkington	fa98390b3c	NFC: Remove the ObjC1/ObjC2 distinction from clang (and related projects) We haven't supported compiling ObjC1 for a long time (and never will again), so there isn't any reason to keep these separate. This patch replaces LangOpts::ObjC1 and LangOpts::ObjC2 with LangOpts::ObjC. Differential revision: https://reviews.llvm.org/D53547 llvm-svn: 345637	2018-10-30 20:31:30 +00:00
Bryan Chan	223307b3dc	[AArch64] Implement FP16FML intrinsics Generate the FP16FML intrinsics into arm_neon.h (AArch64 only for now). Add two new type modifiers to NeonEmitter to handle the new prototypes. Define __ARM_FEATURE_FP16FML when +fp16fml is enabled and guard the intrinsics with the macro in arm_neon.h. Based on a patch by Gao Yiling. Differential Revision: https://reviews.llvm.org/D53633 llvm-svn: 345344	2018-10-25 23:47:00 +00:00
Erich Keane	19a8adc9bd	Implement Function Multiversioning for Non-ELF Systems. Similar to how ICC handles CPU-Dispatch on Windows, this patch uses the resolver function directly to forward the call to the proper function. This is not nearly as efficient as IFuncs of course, but is still quite useful for large functions specifically developed for certain processors. This is unfortunately still limited to x86, since it depends on __builtin_cpu_supports and __builtin_cpu_is, which are x86 builtins. The naming for the resolver/forwarding function for cpu-dispatch was taken from ICC's implementation, which uses the unmodified name for this (no mangling additions). This is possible, since cpu-dispatch uses '.A' for the 'default' version. In 'target' multiversioning, this function keeps the '.resolver' extension in order to keep the default function keeping the default mangling. Change-Id: I4731555a39be26c7ad59a2d8fda6fa1a50f73284 Differential Revision: https://reviews.llvm.org/D53586 llvm-svn: 345298	2018-10-25 18:57:19 +00:00
Tim Renouf	632f35d495	Add gfx909 to GPU Arch Subscribers: jholewinski, cfe-commits Differential Revision: https://reviews.llvm.org/D53558 llvm-svn: 345198	2018-10-24 21:19:02 +00:00
Konstantin Zhuravlyov	06570954e2	AMDGPU: Handle gfx909 in AMDGPUTargetInfo::initFeatureMap + add required tests llvm-svn: 345181	2018-10-24 19:07:56 +00:00
Yaxun Liu	83b5f35d85	Add gfx904 and gfx906 to GPU Arch Differential Revision: https://reviews.llvm.org/D53472 llvm-svn: 344996	2018-10-23 02:05:31 +00:00
Craig Topper	9ad1e8a93b	[X86] Remove 'rtm' feature from KNL. I'm unsure if KNL has this feature, but the backend never thought it did, only clang did. The predefined-arch-macros test lost the check for __RTM__ on KNL when it was removed Skylake CPUs in r344117. I think we want to drop it from KNL for consistency with Skylake anyway regardless of how we got here. llvm-svn: 344978	2018-10-23 00:15:37 +00:00
Krzysztof Parzyszek	57e6706e56	[Hexagon] Remove support for V4 llvm-svn: 344786	2018-10-19 15:36:45 +00:00
Eli Friedman	39ceea326d	[AArch64] Define __ELF__ for aarch64-none-elf and other similar triples. "aarch64-none-elf" is commonly used for AArch64 baremetal toolchains. Differential Revision: https://reviews.llvm.org/D53348 llvm-svn: 344710	2018-10-17 21:07:11 +00:00
Simon Atanasyan	db81c7b9c9	[mips] Fix handling of GNUABIN32 environment in a target triple The `GNUABIN32` environment in a target triple implies using the N32 ABI. This patch adds support for this environment and switches on N32 ABI if necessary. Patch by Patch by YunQiang Su. Differential revision: https://reviews.llvm.org/D51464 llvm-svn: 344570	2018-10-15 22:43:23 +00:00
Craig Topper	153b53adfa	[X86] Remove FeatureRTM from Skylake processor list Summary: There are a LOT of Skylakes and later without TSX-NI. Examples: - SKL: https://ark.intel.com/products/136863/Intel-Core-i3-8121U-Processor-4M-Cache-up-to-3-20-GHz- - KBL: https://ark.intel.com/products/97540/Intel-Core-i7-7560U-Processor-4M-Cache-up-to-3-80-GHz- - KBL-R: https://ark.intel.com/products/149091/Intel-Core-i7-8565U-Processor-8M-Cache-up-to-4-60-GHz- - CNL: https://ark.intel.com/products/136863/Intel-Core-i3-8121U-Processor-4M-Cache-up-to-3_20-GHz This feature seems to be present only on high-end desktop and server chips (I can't find any SKX without). This commit leaves it disabled for all processors, but can be re-enabled for specific builds with -mrtm. Matches https://reviews.llvm.org/D53041 Patch by Thiago Macieira Reviewers: erichkeane, craig.topper Reviewed By: craig.topper Subscribers: lebedev.ri, cfe-commits Differential Revision: https://reviews.llvm.org/D53042 llvm-svn: 344117	2018-10-10 07:43:45 +00:00
Ali Tamur	bc1cd929bf	Introduce code_model macros Summary: gcc defines macros such as __code_model_small_ based on the user passed command line flag -mcmodel. clang accepts a flag with the same name and similar effects, but does not generate any macro that the user can use. This cl narrows the gap between gcc and clang behaviour. However, achieving full compatibility with gcc is not trivial: The set of valid values for mcmodel in gcc and clang are not equal. Also, gcc defines different macros for different architectures. In this cl, we only tackle an easy part of the problem and define the macro only for x64 architecture. When the user does not specify a mcmodel, the macro for small code model is produced, as is the case with gcc. Reviewers: compnerd, MaskRay Reviewed By: MaskRay Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D52920 llvm-svn: 344000	2018-10-08 22:25:20 +00:00
Craig Topper	6ad9220067	[X86] Add the movbe instruction intrinsics from icc. These intrinsics exist in icc. They can be found on the Intel Intrinsics Guide website. All the backend support is in place to pattern match a load+bswap or a bswap+store pattern to the MOVBE instructions. So we just need to get the frontend to emit the correct IR. The pointer arguments in icc are declared as void so I had to jump through a packed struct to forcing a specific alignment on the load/store. Same trick we use in the unaligned vector load/store intrinsics Differential Revision: https://reviews.llvm.org/D52586 llvm-svn: 343343	2018-09-28 17:09:51 +00:00
Sam Parker	d476cd304b	[ARM] Prevent DSP and SIM32 being set for v6m My previous change (rL340911) set the two features for architectures >= 6, which wrongly includes v6m. Now set to >= 6 and not Cortex-M. Differential Revision: https://reviews.llvm.org/D52644 llvm-svn: 343309	2018-09-28 10:18:02 +00:00
Oliver Stannard	a30b48d020	[ARM/AArch64][v8.5A] Add Armv8.5-A target This patch allows targetting Armv8.5-A from Clang. Most of the implementation is in TargetParser, so this is mostly just adding tests. Patch by Pablo Barrio! Differential revision: https://reviews.llvm.org/D52491 llvm-svn: 343111	2018-09-26 14:20:29 +00:00
Artem Belevich	44ecb0e3c2	[CUDA] Added basic support for compiling with CUDA-10.0 llvm-svn: 342924	2018-09-24 23:10:44 +00:00
Saleem Abdulrasool	6183c6316d	Basic: correct `__WINT_TYPE__` on Windows Windows uses `unsigned short` for `wint_t`. Correct the type definition as vended by the compiler. This type is defined in corecrt.h and is unconditionally typedef'ed. cl does not have an equivalent to `__WINT_TYPE__` which is why this was never detected. llvm-svn: 342557	2018-09-19 16:18:55 +00:00
Erich Keane	7582222691	Move AESNI generation to Skylake and Goldmont The instruction set first appeared with Westmere, but not all processors in that and the next few generations have the instructions. According to Wikipedia[1], the first generation in which all SKUs have AES instructions are Skylake and Goldmont. I can't find any Skylake, Kabylake, Kabylake-R or Cannon Lake currently listed at https://ark.intel.com that says "Intel® AES New Instructions" "No". This matches GCC commit https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01940.html [1] https://en.wikipedia.org/wiki/AES_instruction_set Patch By: thiagomacieira Differential Revision: https://reviews.llvm.org/D51510 llvm-svn: 341862	2018-09-10 21:12:21 +00:00
Sam Parker	96d4872899	[ARM] Set __ARM_FEATURE_SIMD32 for +dsp cores ARM_FEATURE_DSP is already set for targets with the +dsp feature. In the backend, this target feature is also used to represent the availability of the of the instructions that the ACLE guard through the __ARM_FEATURE_SIMD32 macro. We don't have any cores that implement one and not the other, so set this macro for cores later than V6 or for Cortex-M cores that the target parser, or user, reports that the 'dsp' instructions are supported. Differential Revision: https://reviews.llvm.org/D51093 llvm-svn: 340911	2018-08-29 10:39:03 +00:00
Hans Wennborg	b4278895a4	Revert r323281 "Adjust MaxAtomicInlineWidth for i386/i486 targets." As reported on http://lists.llvm.org/pipermail/cfe-dev/2018-August/058760.html, this broke i386-freebsd11 due to its lack of atomic 64 bit primitives. While that's not really this commit's fault, let's revert back to the old behaviour until this can be fixed. This means generating cmpxchg8b etc for i386 and i486 which don't technically support those, but that's been the behaviour for a long time, so a little longer probably doesn't hurt that much. > Adjust MaxAtomicInlineWidth for i386/i486 targets. > > This is to fix the bug reported in https://bugs.llvm.org/show_bug.cgi?id=34347#c6. > Currently, all MaxAtomicInlineWidth of x86-32 targets are set to 64. However, > i386 doesn't support any cmpxchg related instructions. i486 only supports cmpxchg. > So in this patch MaxAtomicInlineWidth is reset as follows: > For i386, the MaxAtomicInlineWidth should be 0 because no cmpxchg is supported. > For i486, the MaxAtomicInlineWidth should be 32 because it supports cmpxchg. > For others 32 bits x86 cpu, the MaxAtomicInlineWidth should be 64 because of cmpxchg8b. > > Differential Revision: https://reviews.llvm.org/D42154 llvm-svn: 340666	2018-08-24 22:46:33 +00:00
Chandler Carruth	ae0cafece8	[x86/retpoline] Split the LLVM concept of retpolines into separate subtarget features for indirect calls and indirect branches. This is in preparation for enabling only the call retpolines when using speculative load hardening. I've continued to use subtarget features for now as they continue to seem the best fit given the lack of other retpoline like constructs so far. The LLVM side is pretty simple. I'd like to eventually get rid of the old feature, but not sure what backwards compatibility issues that will cause. This does remove the "implies" from requesting an external thunk. This always seemed somewhat questionable and is now clearly not desirable -- you specify a thunk the same way no matter which set of things are getting retpolines. I really want to keep this nicely isolated from end users and just an LLVM implementation detail, so I've moved the `-mretpoline` flag in Clang to no longer rely on a specific subtarget feature by that name and instead to be directly handled. In some ways this is simpler, but in order to preserve existing behavior I've had to add some fallback code so that users who relied on merely passing -mretpoline-external-thunk continue to get the same behavior. We should eventually remove this I suspect (we have never tested that it works!) but I've not done that in this patch. Differential Revision: https://reviews.llvm.org/D51150 llvm-svn: 340515	2018-08-23 06:06:38 +00:00
Stefan Maksimovic	eb63256095	[clang][mips] Set __mips_fpr correctly for -mfpxx Set __mips_fpr to 0 if o32 ABI is used with either -mfpxx or none of -mfp32, -mfpxx, -mfp64 being specified. Introduce additional checks: -mfpxx is only to be used in conjunction with the o32 ABI. report an error when incompatible options are provided. Formerly no errors were raised when combining n32/n64 ABIs with -mfp32 and -mfpxx. There are other cases when __mips_fpr should be set to 0 that are not covered, ex. using o32 on a mips64 cpu which is valid but not supported in the backend as of yet. Differential Revision: https://reviews.llvm.org/D50557 llvm-svn: 340391	2018-08-22 09:26:25 +00:00
Matt Arsenault	b666e73dd9	AMDGPU: Move target code into TargetParser llvm-svn: 340292	2018-08-21 16:13:29 +00:00
Matt Arsenault	89e833c662	AMDGPU: Correct errors in device table llvm-svn: 339934	2018-08-16 20:19:47 +00:00
Matt Arsenault	45bc148093	AMDGPU: Fix enabling denormals by default on pre-VI targets Fast FMAF is not a sufficient condition to enable denormals. Before VI, enabling denormals caused F32 instructions to run at F64 speeds. llvm-svn: 339278	2018-08-08 17:48:37 +00:00
Matt Arsenault	31c895ecdf	AMDGPU: Add builtin for s_dcache_wb llvm-svn: 339110	2018-08-07 07:49:13 +00:00
Matt Arsenault	24f3924709	AMDGPU: Add builtin for s_dcache_inv_vol llvm-svn: 339109	2018-08-07 07:49:04 +00:00
Matt Arsenault	c65f966d76	Try to make builtin address space declarations not useless The way address space declarations for builtins currently work is nearly useless. The code assumes the address spaces used for builtins is a confusingly named "target address space" from user code using __attribute__((address_space(N))) that matches the builtin declaration. There's no way to use this to declare a builtin that returns a language specific address space. The terminology used is highly cofusing since it has nothing to do with the the address space selected by the target to use for a language address space. This feature is essentially unused as-is. AMDGPU and NVPTX are the only in-tree targets attempting to use this. The AMDGPU builtins certainly do not behave as intended (i.e. all of the builtins returning pointers can never compile because the numbered address space never matches the expected named address space). The NVPTX builtins are missing tests for some, and the others seem to rely on an implicit addrspacecast. Change the used address space for builtins based on a target hook to allow using a language address space for a builtin. This allows the same builtin declaration to be used for multiple languages with similarly purposed address spaces (e.g. the same AMDGPU builtin can be used in OpenCL and CUDA even though the constant address spaces are arbitarily different). This breaks the possibility of using arbitrary numbered address spaces alongside the named address spaces for builtins. If this is an issue we probably need to introduce another builtin declaration character to distinguish language address spaces from so-called "target address spaces". llvm-svn: 338707	2018-08-02 12:14:28 +00:00
Sjoerd Meijer	e6e4f3178c	[AArch64][ARM] Add Armv8.4-A tests This adds tests for Armv8.4-A, and also some v8.2 and v8.3 tests that were missing. Differential Revision: https://reviews.llvm.org/D50068 llvm-svn: 338525	2018-08-01 12:41:10 +00:00
Fangrui Song	6907ce2f8f	Remove trailing space sed -Ei 's/[[:space:]]+$//' include/*/.{def,h,td} lib/*/.{cpp,h} llvm-svn: 338291	2018-07-30 19:24:48 +00:00
Sanjin Sijaric	56391d6f84	[ARM64] [Windows] Follow MS X86_64 C++ ABI when passing structs Summary: Microsoft's C++ object model for ARM64 is the same as that for X86_64. For example, small structs with non-trivial copy constructors or virtual function tables are passed indirectly. Currently, they are passed in registers when compiled with clang. Reviewers: rnk, mstorsjo, TomTan, haripul, javed.absar Reviewed By: rnk, mstorsjo Subscribers: kristof.beyls, chrib, llvm-commits, cfe-commits Differential Revision: https://reviews.llvm.org/D49770 llvm-svn: 338076	2018-07-26 22:18:28 +00:00
Dan Gohman	df07a35912	[WebAssembly] Change size_t to `unsigned long`. Changing it to unsigned long (which is 32-bit on wasm32) makes it the same type as wasm64 (where unsigned long is 64-bit), which would eliminate the most common cause for mangled names being different between wasm32 and wasm64. For example, export lists containing symbol names could now often be the same between wasm32 and wasm64. Differential Revision: https://reviews.llvm.org/D40526 llvm-svn: 337783	2018-07-24 00:29:58 +00:00
Erik Pilkington	7adcf292a1	NFC: Add the emacs c++ mode hint "-- C++ --" to the headers that don't have it https://llvm.org/docs/CodingStandards.html#file-headers llvm-svn: 337780	2018-07-24 00:07:49 +00:00
Reid Kleckner	dbc390d0c5	[MS] Update _MSVC_LANG values for C++17 and C++2a Fixes PR38262 llvm-svn: 337715	2018-07-23 17:44:00 +00:00
Erich Keane	3efe00206f	Implement cpu_dispatch/cpu_specific Multiversioning As documented here: https://software.intel.com/en-us/node/682969 and https://software.intel.com/en-us/node/523346. cpu_dispatch multiversioning is an ICC feature that provides for function multiversioning. This feature is implemented with two attributes: First, cpu_specific, which specifies the individual function versions. Second, cpu_dispatch, which specifies the location of the resolver function and the list of resolvable functions. This is valuable since it provides a mechanism where the resolver's TU can be specified in one location, and the individual implementions each in their own translation units. The goal of this patch is to be source-compatible with ICC, so this implementation diverges from the ICC implementation in a few ways: 1- Linux x86/64 only: This implementation uses ifuncs in order to properly dispatch functions. This is is a valuable performance benefit over the ICC implementation. A future patch will be provided to enable this feature on Windows, but it will obviously more closely fit ICC's implementation. 2- CPU Identification functions: ICC uses a set of custom functions to identify the feature list of the host processor. This patch uses the cpu_supports functionality in order to better align with 'target' multiversioning. 1- cpu_dispatch function def/decl: ICC's cpu_dispatch requires that the function marked cpu_dispatch be an empty definition. This patch supports that as well, however declarations are also permitted, since the linker will solve the issue of multiple emissions. Differential Revision: https://reviews.llvm.org/D47474 llvm-svn: 337552	2018-07-20 14:13:28 +00:00
Martin Storsjo	17c0f721b9	[AArch64] Define TARGET_HEADER_BUILTIN Without it, the new intrinsics became available for all language variants. This was missed in SVN r337327. llvm-svn: 337352	2018-07-18 06:15:09 +00:00
Joerg Sonnenberger	b79e61f8b3	Always use __mcount on NetBSD. Some platforms don't provide _mcount. llvm-svn: 337277	2018-07-17 13:13:34 +00:00
Joerg Sonnenberger	68c0210fa6	By popular demand, switch in64_t on NetBSD/AArch64 and NetBSD/PowerPC64 to long for consistency with other 64bit platforms. llvm-svn: 337271	2018-07-17 12:33:19 +00:00
Krzysztof Parzyszek	762dee516c	[Hexagon] Diagnose intrinsics not supported by selected CPU/HVX llvm-svn: 336933	2018-07-12 18:54:04 +00:00
Alexander Richardson	742553da13	Use Triple::isMIPS() instead of enumerating all Triples. NFC Reviewed By: atanasyan Differential Revision: https://reviews.llvm.org/D48549 llvm-svn: 335495	2018-06-25 16:49:52 +00:00
Sjoerd Meijer	5c4c998a54	[SPIR] Prevent SPIR targets from using half conversion intrinsics The SPIR target currently allows for half precision floating point types to be emitted using the LLVM intrinsic functions which convert half types to floats and doubles. However, this is illegal in SPIR as the only intrinsic allowed by SPIR is memcpy, as per section 3 of the SPIR specification. Currently this is leading to an assert being hit in the Clang CodeGen when attempting to emit a constant or literal _Float16 type in a comparison operation on a SPIR or SPIR64 target. This assert stems from the CodeGen attempting to emit a constant half value as an integer because the backend has specified that it is using these half conversion intrinsics (which represents half as i16). This patch prevents SPIR targets from using these intrinsics by overloading the responsible target info method, marks SPIR targets as having a legal half type and provides additional regression testing for the _Float16 type on SPIR targets. Patch by: Stephen McGroarty Differential Revision: https://reviews.llvm.org/D48188 llvm-svn: 335111	2018-06-20 09:49:40 +00:00
Yonghong Song	6927cf0c2a	bpf: recognize target specific option -mattr=dwarfris in clang The following is the usage example with clang: bash-4.2$ clang -target bpf -O2 -g -c -Xclang -target-feature -Xclang +dwarfris t.c bash-4.2$ llvm-objdump -S -d t.o t.o: file format ELF64-BPF Disassembly of section .text: test: ; int test(void) { 0: b7 00 00 00 00 00 00 00 r0 = 0 ; return 0; 1: 95 00 00 00 00 00 00 00 exit bash-4.2$ cat t.c int test(void) { return 0; } bash-4.2$ Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 334839	2018-06-15 15:53:31 +00:00
Benjamin Kramer	ffe60e0403	[Basic] Fix -Wreorder warning Just use field initializers that don't suffer from this problem llvm-svn: 334619	2018-06-13 16:45:12 +00:00
Stefan Pintilie	a6ce3fe72b	[PowerPC] The __float128 type should only be available on Power9 Diasble the use of the type __float128 for PPC machines older than Power9. The use of -mfloat128 for PPC machine older than Power9 will result in an error. Differential Revision: https://reviews.llvm.org/D48088 llvm-svn: 334613	2018-06-13 16:05:05 +00:00
Reid Kleckner	89fbd55145	Revert r333791 "Cap "voluntary" vector alignment at 16 for all Darwin platforms." Adding __attribute__((aligned(32))) to __m256 breaks the implementation of _mm256_loadu_ps on Windows. On Windows, alignment attributes have higher precedence than packing attributes. We also might want to carefully consider the consequences of changing our vector typedefs, since many users copy them and invent their own new, non-Intel specific vector type names. llvm-svn: 333958	2018-06-04 21:39:20 +00:00
John McCall	280c656031	Cap "voluntary" vector alignment at 16 for all Darwin platforms. This fixes two major problems: - We were not capping vector alignment as desired on 32-bit ARM. - We were using different alignments based on the AVX settings on Intel, so we did not have a consistent ABI. This is an ABI break, but we think we can get away with it because vectors tend to be used mostly in inline code (which is why not having a consistent ABI has not proven disastrous on Intel). Intel's AVX types are specified as having 32-byte / 64-byte alignment, so align them explicitly instead of relying on the base ABI rule. Note that this sort of attribute is stripped from template arguments in template substitution, so there's a possibility that code templated over vectors will produce inadequately-aligned objects. The right long-term solution for this is for alignment attributes to be interpreted as true qualifiers and thus preserved in the canonical type. llvm-svn: 333791	2018-06-01 21:34:26 +00:00
Daniel Cederman	8cc53aecad	[Sparc] Add floating-point register names Reviewers: jyknight Reviewed By: jyknight Subscribers: eraman, fedor.sergeev, jrtc27, cfe-commits Differential Revision: https://reviews.llvm.org/D47137 llvm-svn: 333510	2018-05-30 06:02:18 +00:00
Bob Wilson	fa84fc916c	Support Swift calling convention for PPC64 targets This adds basic support for the Swift calling convention with PPC64 targets. Patch provided by Atul Sowani in bug report #37223 llvm-svn: 333316	2018-05-25 21:26:03 +00:00
Gabor Buella	078bb99a90	[x86] invpcid intrinsic An intrinsic for an old instruction, as described in the Intel SDM. Reviewers: craig.topper, rnk Reviewed By: craig.topper, rnk Differential Revision: https://reviews.llvm.org/D47142 llvm-svn: 333256	2018-05-25 06:34:42 +00:00
Alexander Ivchenko	0fb8c877c4	This patch aims to match the changes introduced in gcc by https://gcc.gnu.org/ml/gcc-cvs/2018-04/msg00534.html. The -mibt feature flag is being removed, and the -fcf-protection option now also defines a CET macro and causes errors when used on non-X86 targets, while X86 targets no longer check for -mibt and -mshstk to determine if -fcf-protection is supported. -mshstk is now used only to determine availability of shadow stack intrinsics. Comes with an LLVM patch (D46882). Patch by mike.dvoretsky Differential Revision: https://reviews.llvm.org/D46881 llvm-svn: 332704	2018-05-18 11:56:21 +00:00
Rainer Orth	877d15b396	[Solaris] Only define _REENTRANT if -pthread When looking at lib/Basic/Targets/OSTargets.h, I noticed that _REENTRANT is defined unconditionally on Solaris, unlike all other targets and what either Studio cc (only define it with -mt) or gcc (only define it with -pthread) do. This patch follows that lead. Differential Revision: https://reviews.llvm.org/D41241 llvm-svn: 332343	2018-05-15 11:36:00 +00:00
Gabor Buella	3a7571259e	[X86] ptwrite intrinsic Reviewers: craig.topper, RKSimon Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D46540 llvm-svn: 331962	2018-05-10 07:28:54 +00:00
Artem Belevich	679dafe69e	[CUDA] Added -f[no-]cuda-short-ptr option The option enables use of 32-bit pointers for accessing const/local/shared memory. The feature is disabled by default. Differential Revision: https://reviews.llvm.org/D46148 llvm-svn: 331938	2018-05-09 23:10:09 +00:00
Adrian Prantl	9fc8faf9e6	Remove \brief commands from doxygen comments. This is similar to the LLVM change https://reviews.llvm.org/D46290. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46320 llvm-svn: 331834	2018-05-09 01:00:01 +00:00
Gabor Buella	b0f310d51d	[x86] Introduce the pconfig intrinsic Reviewers: craig.topper, zvi Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D46431 llvm-svn: 331740	2018-05-08 06:49:41 +00:00
Gabor Buella	a51e0c2243	[X86] directstore and movdir64b intrinsics Reviewers: spatel, craig.topper, RKSimon Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D45984 llvm-svn: 331249	2018-05-01 10:05:42 +00:00
Matt Arsenault	d2da3c20d7	AMDGPU: Add Vega12 and Vega20 Changes by Matt Arsenault Konstantin Zhuravlyov llvm-svn: 331216	2018-04-30 19:08:27 +00:00
Mikhail Maltsev	89f7b46b7a	[Targets] Implement getConstraintRegister for ARM and AArch64 Summary: The getConstraintRegister method is used by semantic checking of inline assembly statements in order to diagnose conflicts between clobber list and input/output lists. Currently ARM and AArch64 don't override getConstraintRegister, so conflicts between registers assigned to variables in asm labels and clobber lists are not diagnosed. Such conflicts can cause assertion failures in the back end and even miscompilations. This patch implements getConstraintRegister for ARM and AArch64 targets. Since these targets don't have single-register constraints, the implementation is trivial and just returns the register specified in an asm label (if any). Reviewers: eli.friedman, javed.absar, thopre Reviewed By: thopre Subscribers: rengolin, eraman, rogfer01, myatsina, kristof.beyls, cfe-commits, chrib Differential Revision: https://reviews.llvm.org/D45965 llvm-svn: 331164	2018-04-30 09:11:08 +00:00
Oliver Stannard	39ee9de64c	[ARM] Add __ARM_FEATURE_DOTPROD pre-defined macro This adds a pre-defined macro to test if the compiler has support for the v8.2-A dot rpoduct intrinsics in AArch32 mode. The AAcrh64 equivalent has already been added by rL330229. The ACLE spec which describes this macro hasn't been published yet, but this is based on the final internal draft, and GCC has already implemented this. Differential revision: https://reviews.llvm.org/D46108 llvm-svn: 331038	2018-04-27 13:56:02 +00:00
Rainer Orth	f0f716df8e	[Solaris] __float128 is supported on Solaris/x86 When rebasing https://reviews.llvm.org/D40898 with GCC 5.4 on Solaris 11.4, I ran into a few instances of In file included from /vol/llvm/src/compiler-rt/local/test/asan/TestCases/Posix/asan-symbolize-sanity-test.cc:19: In file included from /usr/gcc/5/lib/gcc/x86_64-pc-solaris2.11/5.4.0/../../../../include/c++/5.4.0/string:40: In file included from /usr/gcc/5/lib/gcc/x86_64-pc-solaris2.11/5.4.0/../../../../include/c++/5.4.0/bits/char_traits.h:39: In file included from /usr/gcc/5/lib/gcc/x86_64-pc-solaris2.11/5.4.0/../../../../include/c++/5.4.0/bits/stl_algobase.h:64: In file included from /usr/gcc/5/lib/gcc/x86_64-pc-solaris2.11/5.4.0/../../../../include/c++/5.4.0/bits/stl_pair.h:59: In file included from /usr/gcc/5/lib/gcc/x86_64-pc-solaris2.11/5.4.0/../../../../include/c++/5.4.0/bits/move.h:57: /usr/gcc/5/lib/gcc/x86_64-pc-solaris2.11/5.4.0/../../../../include/c++/5.4.0/type_traits:311:39: error: __float128 is not supported on this target struct __is_floating_point_helper<__float128> ^ during make check-all. The line above is inside #if !defined(__STRICT_ANSI__) && defined(_GLIBCXX_USE_FLOAT128) template<> struct __is_floating_point_helper<__float128> : public true_type { }; #endif While the libstdc++ header indicates support for __float128, clang does not, but should. The following patch implements this and fixed those errors. Differential Revision: https://reviews.llvm.org/D41240 llvm-svn: 330572	2018-04-23 09:28:08 +00:00
Gabor Buella	eba6c42e66	[X86] WaitPKG intrinsics Reviewers: craig.topper, zvi Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D45254 llvm-svn: 330463	2018-04-20 18:44:33 +00:00
Junmo Park	4b9b9fb7a0	[AAch64] Add the __ARM_FEATURE_DOTPROD macro definition This matches what GCC does. https://github.com/gcc-mirror/gcc/blob/master/gcc/config/aarch64/aarch64-c.c Differential Revision: https://reviews.llvm.org/D45544 llvm-svn: 330229	2018-04-17 22:38:40 +00:00
Eli Friedman	642a5ee1c1	[ARM] Compute a target feature which corresponds to the ARM version. Currently, the interaction between the triple, the CPU, and the supported features is a mess: the driver edits the triple to indicate the supported architecture version, and the LLVM backend uses this to figure out what instructions are legal. This makes it difficult to understand what's happening, and makes it impossible to LTO together two modules with different computed architectures. Instead of relying on triple rewriting to get the correct target features, we should add the right target features explicitly. Differential Revision: https://reviews.llvm.org/D45240 llvm-svn: 330169	2018-04-16 23:52:58 +00:00
Gabor Buella	f594ce739b	[X86] Introduce archs: goldmont-plus & tremont Reviewers: craig.topper Reviewed By: craig.topper Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D45613 llvm-svn: 330110	2018-04-16 08:10:10 +00:00
Gabor Buella	b220dd2b6c	[X86] Introduce cldemote intrinsic Reviewers: craig.topper, zvi Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D45257 llvm-svn: 329993	2018-04-13 07:37:24 +00:00
Gabor Buella	a052016ef2	[x86] wbnoinvd intrinsic The WBNOINVD instruction writes back all modified cache lines in the processor’s internal cache to main memory but does not invalidate (flush) the internal caches. Reviewers: craig.topper, zvi, ashlykov Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D43817 llvm-svn: 329848	2018-04-11 20:09:09 +00:00
Artem Belevich	2f8efcf3ca	[NVPTX] Removed 'satom' feature which is no longer used. Differential Revision: https://reviews.llvm.org/D45061 llvm-svn: 329830	2018-04-11 17:51:33 +00:00
Artem Belevich	24e8a680e5	[NVPTX, CUDA] Improved feature constraints on NVPTX target builtins. When NVPTX TARGET_BUILTIN specifies sm_XX or ptxYY as required feature, consider those features available if we're compiling for GPU >= sm_XX or have enabled PTX version >= ptxYY. Differential Revision: https://reviews.llvm.org/D45061 llvm-svn: 329829	2018-04-11 17:51:19 +00:00
Yonghong Song	2ad75f7410	bpf: accept all asm register names Sometimes when people compile bpf programs with "clang ... -target bpf ...", the kernel header files may contain host arch inline assembly codes as in the patch https://patchwork.kernel.org/patch/10119683/ by Arnaldo Carvaldo de Melo. The current workaround in the above patch is to guard the inline assembly with "#ifndef __BPF__" marco. So when __BPF__ is defined, these macros will have no use. Such a method is not extensible. As a matter of fact, most of these inline assembly codes will be thrown away at the end of clang compilation. So for bpf target, this patch accepts all asm register names in clang AST stage. The name will be checked again during llc code generation if the inline assembly code is indeed for bpf programs. With this patch, the above "#ifndef __BPF__" is not needed any more in https://patchwork.kernel.org/patch/10119683/. Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 329823	2018-04-11 16:08:00 +00:00
Gabor Buella	8701b18a25	[X86] Split up -march=icelake to -client & -server Reviewers: craig.topper, zvi, echristo Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D45056 llvm-svn: 329741	2018-04-10 18:58:26 +00:00
Gabor Buella	e3330c1b61	[X86] Disable SGX for Skylake Server Reviewers: craig.topper, zvi, echristo Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D45058 llvm-svn: 329701	2018-04-10 14:04:21 +00:00
Yaxun Liu	bec8a66454	[CUDA] Revert defining __CUDA_ARCH__ for amdgcn targets amdgcn targets only support HIP, which does not define __CUDA_ARCH__. this is a partial unroll of r329232 / D45277. Differential Revision: https://reviews.llvm.org/D45387 llvm-svn: 329584	2018-04-09 15:43:01 +00:00
Alexander Kornienko	2a8c18d991	Fix typos in clang Found via codespell -q 3 -I ../clang-whitelist.txt Where whitelist consists of: archtype cas classs checkk compres definit frome iff inteval ith lod methode nd optin ot pres statics te thru Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few files that have dubious fixes reverted.) Differential revision: https://reviews.llvm.org/D44188 llvm-svn: 329399	2018-04-06 15:14:32 +00:00
Shiva Chen	4891dbf557	[PATCH] [RISCV] Extend getTargetDefines for RISCVTargetInfo Summary: This patch extend getTargetDefines and implement handleTargetFeatures and hasFeature. and define corresponding marco for those features. Reviewers: asb, apazos, eli.friedman Differential Revision: https://reviews.llvm.org/D44727 Patch by Kito Cheng. llvm-svn: 329278	2018-04-05 12:54:00 +00:00
Yaxun Liu	8a5fc15aa4	[CUDA] Add amdgpu sub archs Patch by Greg Rodgers. Revised and lit tests added by Yaxun Liu. Differential Revision: https://reviews.llvm.org/D45277 llvm-svn: 329232	2018-04-04 21:19:27 +00:00
Krzysztof Parzyszek	3163610010	[Hexagon] Remove -mhvx-double and the corresponding subtarget feature Specifying the HVX vector length should be done via the -mhvx-length option. llvm-svn: 329077	2018-04-03 15:59:10 +00:00
Saleem Abdulrasool	13aeee0d36	CodeGenCXX: support PreserveMostCC in MS ABI Microsoft has reserved 'U' for the PreserveMostCC which is used in the swift runtime. Add support for this. This allows the swift runtime to be built for Windows again. llvm-svn: 329025	2018-04-02 22:25:50 +00:00
Manoj Gupta	cb668d8512	[AArch64]: Add support for parsing rN registers. Summary: Allow rN registers to be simply parsed as correspoing xN registers. The "register ... asm("rN")" is an command to the compiler's register allocator, not an operand to any individual assembly instruction. GCC documents this syntax as "...the name of the register that should be used." This is needed to support the changes in Linux kernel (see https://lkml.org/lkml/2018/3/1/268 ) Note: This will add support only for the limited use case of register ... asm("rN"). Any other uses that make rN leak into assembly are not supported. Reviewers: kristof.beyls, rengolin, peter.smith, t.p.northover Reviewed By: peter.smith Subscribers: javed.absar, eraman, cfe-commits, srhines Differential Revision: https://reviews.llvm.org/D44815 llvm-svn: 328829	2018-03-29 21:11:15 +00:00
Akira Hatanaka	fcbe17c6be	[ObjC++] Make parameter passing and function return compatible with ObjC ObjC and ObjC++ pass non-trivial structs in a way that is incompatible with each other. For example: typedef struct { id f0; __weak id f1; } S; // this code is compiled in c++. extern "C" { void foo(S s); } void caller() { // the caller passes the parameter indirectly and destructs it. foo(S()); } // this function is compiled in c. // 'a' is passed directly and is destructed in the callee. void foo(S a) { } This patch fixes the incompatibility by passing and returning structs with __strong or weak fields using the C ABI in C++ mode. __strong and __weak fields in a struct do not cause the struct to be destructed in the caller and __strong fields do not cause the struct to be passed indirectly. Also, this patch fixes the microsoft ABI bug mentioned here: https://reviews.llvm.org/D41039?id=128767#inline-364710 rdar://problem/38887866 Differential Revision: https://reviews.llvm.org/D44908 llvm-svn: 328731	2018-03-28 21:13:14 +00:00
Matt Arsenault	b130ea5605	AMDGPU: Update datalayout for stack alignment llvm-svn: 328657	2018-03-27 19:26:51 +00:00
Yaxun Liu	ac1263cd54	[AMDGPU] Fix codegen for inline assembly Need to override convertConstraint to recognise amdgpu specific register names. Differential Revision: https://reviews.llvm.org/D44533 llvm-svn: 328359	2018-03-23 19:43:42 +00:00
Saleem Abdulrasool	29149d5cb7	Basic: support PreserveMost and PreserveAll on Windows ARM Do not ignore these calling conventions on Windows ARM. They are used by the swift runtime for certain calls. llvm-svn: 328007	2018-03-20 17:33:26 +00:00
Sjoerd Meijer	87793e7599	[ARM] Pass half or i16 types for NEON intrinsics For generating NEON intrinsics, this determines the NEON data type, and whether it should be a half type or an i16 type. I.e., we always pass a half type for AArch64, this hasn't changed, but now also for ARM but only when FullFP16 is enabled, and i16 otherwise. This is intended to be non-functional change, but together with the backend work in D44538 which adds support for f16 vectors, this enables adding the AArch32 FP16 (vector) intrinsics. Differential Revision: https://reviews.llvm.org/D44561 llvm-svn: 327836	2018-03-19 13:22:49 +00:00
Sjoerd Meijer	a7463df6e2	[ARM] ACLE FP16 feature test macros This is a partial recommit of r327189 that was reverted due to test issues. I.e., this recommits minimal functional change, the FP16 feature test macros, and adds tests that were missing in the original commit. llvm-svn: 327455	2018-03-13 22:11:06 +00:00
Sjoerd Meijer	95da875898	This reverts "r327189 - [ARM] Add ARMv8.2-A FP16 vector intrinsic" This is causing problems in testing, and PR36683 was raised. Reverting it until we have sorted out how to pass f16 vectors. llvm-svn: 327437	2018-03-13 19:38:56 +00:00
Abderrazek Zaafrani	5bd68cf742	[ARM] Add ARMv8.2-A FP16 vector intrinsic Add the fp16 neon vector intrinsic for ARM as described in the ARM ACLE document. Reviews in https://reviews.llvm.org/D43650 llvm-svn: 327189	2018-03-09 23:39:34 +00:00
Matthew Voss	0a6a701f36	Correct the alignment for the PS4 target https://reviews.llvm.org/D44218 llvm-svn: 326942	2018-03-07 20:48:16 +00:00
Yaxun Liu	1578a0a55d	[AMDGPU] Clean up old address space mapping and fix constant address space value Differential Revision: https://reviews.llvm.org/D43911 llvm-svn: 326725	2018-03-05 17:50:10 +00:00
Heejin Ahn	8b6af22e60	[WebAssembly] Add exception handling option Summary: Add exception handling option to clang. Reviewers: dschuff Subscribers: jfb, sbc100, jgravelle-google, sunfish, cfe-commits Differential Revision: https://reviews.llvm.org/D43681 llvm-svn: 326517	2018-03-02 00:39:16 +00:00
Konstantin Zhuravlyov	d6b3453bdb	AMDGPU: Define FP_FAST_FMA{F} macros for amdgcn - Expand GK_*s (i.e. GFX6 -> GFX600, GFX601, etc.) - This allows us to choose features correctly in some cases (for example, fast fmaf is available on gfx600, but not gfx601) - Move HasFMAF, HasFP64, HasLDEXPF to GPUInfo tables - Add HasFastFMA, HasFastFMAF to GPUInfo tables - Add missing tests llvm-svn: 326254	2018-02-27 21:48:05 +00:00
Mandeep Singh Grang	ac24bb53bb	[RISCV] Enable __int128_t and __uint128_t through clang flag Summary: If the flag -fforce-enable-int128 is passed, it will enable support for __int128_t and __uint128_t types. This flag can then be used to build compiler-rt for RISCV32. Reviewers: asb, kito-cheng, apazos, efriedma Reviewed By: asb, efriedma Subscribers: shiva0217, efriedma, jfb, dschuff, sdardis, sbc100, jgravelle-google, aheejin, rbar, johnrusso, simoncook, jordy.potman.lists, sabuasal, niosHD, cfe-commits Differential Revision: https://reviews.llvm.org/D43105 llvm-svn: 326045	2018-02-25 03:58:23 +00:00
Yonghong Song	8b1e93b7d6	bpf: Hook target feature "alu32" with LLVM LLVM has supported a new target feature "alu32" which could be enabled or disabled by "-mattr=[+\|-]alu32" when using llc. This patch link Clang with it, so it could be also done by passing related options to Clang, for example: -Xclang -target-feature -Xclang +alu32 Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 325996	2018-02-23 23:55:29 +00:00
Craig Topper	94a940d2b4	[X86] Disable CLWB in Cannon Lake Cannon Lake does not support CLWB, therefore it does not include all features listed under SKX. Patch by Gabor Buella Differential Revision: https://reviews.llvm.org/D43459 llvm-svn: 325655	2018-02-21 00:16:50 +00:00
Simon Dardis	0bc2d9b0c5	[mips] Spectre variant two mitigation for MIPSR2 This patch provides mitigation for CVE-2017-5715, Spectre variant two, which affects the P5600 and P6600. It provides the option -mindirect-jump=hazard, which instructs the LLVM backend to replace indirect branches with their hazard barrier variants. This option is accepted when targeting MIPS revision two or later. The migitation strategy suggested by MIPS for these processors is to use two hazard barrier instructions. 'jalr.hb' and 'jr.hb' are hazard barrier variants of the 'jalr' and 'jr' instructions respectively. These instructions impede the execution of instruction stream until architecturally defined hazards (changes to the instruction stream, privileged registers which may affect execution) are cleared. These instructions in MIPS' designs are not speculated past. These instructions are used with the option -mindirect-jump=hazard when branching indirectly and for indirect function calls. These instructions are defined by the MIPS32R2 ISA, so this mitigation method is not compatible with processors which implement an earlier revision of the MIPS ISA. Implementation note: I've opted to provide this as an -mindirect-jump={hazard,...} style option in case alternative mitigation methods are required for other implementations of the MIPS ISA in future, e.g. retpoline style solutions. Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D43487 llvm-svn: 325651	2018-02-21 00:05:05 +00:00
Dylan McKay	8ad69f2d1f	[AVR] Set the program address space in the data layout This is accompanied by r325481 in LLVM. llvm-svn: 325483	2018-02-19 10:46:16 +00:00
Dimitry Andric	2e3f23bbcc	[X86] Add 'sahf' CPU feature to frontend Summary: Make clang accept `-msahf` (and `-mno-sahf`) flags to activate the `+sahf` feature for the backend, for bug 36028 (Incorrect use of pushf/popf enables/disables interrupts on amd64 kernels). This was originally submitted in bug 36037 by Jonathan Looney <jonlooney@gmail.com>. As described there, GCC also uses `-msahf` for this feature, and the backend already recognizes the `+sahf` feature. All that is needed is to teach clang to pass this on to the backend. The mapping of feature support onto CPUs may not be complete; rather, it was chosen to match LLVM's idea of which CPUs support this feature (see lib/Target/X86/X86.td). I also updated the affected test case (CodeGen/attr-target-x86.c) to match the emitted output. Reviewers: craig.topper, coby, efriedma, rsmith Reviewed By: craig.topper Subscribers: emaste, cfe-commits Differential Revision: https://reviews.llvm.org/D43394 llvm-svn: 325446	2018-02-17 21:04:35 +00:00
Konstantin Zhuravlyov	cf71761495	Reapply r325193 llvm-svn: 325203	2018-02-15 02:37:04 +00:00
Konstantin Zhuravlyov	b7b86127f5	Revert r325193 as it breaks buildbots llvm-svn: 325200	2018-02-15 02:27:45 +00:00
Richard Smith	47c9b5d4d6	Add missing definition for class static after r325193. llvm-svn: 325195	2018-02-15 01:01:06 +00:00
Konstantin Zhuravlyov	5c9d4e7957	AMDGPU: Cleanup most of the macros - Insert __AMD__ macro - Insert __AMDGPU__ macro - Insert __devicename__ macro - Add missing tests for arch macros Differential Revision: https://reviews.llvm.org/D36802 llvm-svn: 325193	2018-02-15 00:20:26 +00:00
Yaxun Liu	651bd73c02	[AMDGPU] Change constant addr space to 4 Differential Revision: https://reviews.llvm.org/D43171 llvm-svn: 325031	2018-02-13 18:01:21 +00:00
Matt Arsenault	e7da136a74	AMDGPU: Update for datalayout change llvm-svn: 324748	2018-02-09 16:58:41 +00:00
Konstantin Zhuravlyov	76854e7daa	AMDGPU/GCN: Bring processors in sync with AMDGPUUsage - Remove gfx800 - Remove gfx804 - Remove gfx901 - Remove gfx903 Differential Revision: https://reviews.llvm.org/D40045 llvm-svn: 324714	2018-02-09 07:02:28 +00:00
Erich Keane	fa69c71dce	Fix UBSan issue with PPC::isValidCPUName Apparently storing the pointer to a StringLiteral as a StringRef caused this section of code to issue a ubsan warning. This will hopefully fix that. llvm-svn: 324687	2018-02-09 00:13:49 +00:00
Erich Keane	086331b4ff	Add size to constexpr Arrays What seems to be a bug in older versions of MSVC, constexpr member arrays with a redefinition (to force emission) require their initial definition to have the size between the brackets. llvm-svn: 324682	2018-02-08 23:49:40 +00:00
Erich Keane	e44bdb3f70	Add Rest of Targets Support to ValidCPUList (enabling march notes) A followup to: https://reviews.llvm.org/D42978 Most of the rest of the Targets were pretty rote, so this patch knocks them all out at once. Differential Revision: https://reviews.llvm.org/D43057 llvm-svn: 324676	2018-02-08 23:16:55 +00:00
Erich Keane	d45879d8ad	Add NVPTX Support to ValidCPUList (enabling march notes) A followup to: https://reviews.llvm.org/D42978 This patch adds NVPTX support for enabling the march notes. Differential Revision: https://reviews.llvm.org/D43045 llvm-svn: 324675	2018-02-08 23:16:00 +00:00
Erich Keane	d1d85f50d0	Add X86 Support to ValidCPUList (enabling march notes) A followup to: https://reviews.llvm.org/D42978 This patch adds X86 and X86_64 support for enabling the march notes. Differential Revision: https://reviews.llvm.org/D43041 llvm-svn: 324674	2018-02-08 23:15:02 +00:00
Erich Keane	3ec1743d0d	Make march/target-cpu print a note with the list of valid values for ARM When rejecting a march= or target-cpu command line parameter, the message is quite lacking. This patch adds a note that prints all possible values for the current target, if the target supports it. This adds support for the ARM/AArch64 targets (more to come!). Differential Revision: https://reviews.llvm.org/D42978 llvm-svn: 324673	2018-02-08 23:14:15 +00:00
Erich Keane	9176f669b4	[NFCi] Replace a couple of usages of const StringRef& with StringRef No sense passing these by reference when a copy is about as free, and saves on potential indirection later. llvm-svn: 324540	2018-02-07 23:04:38 +00:00
Walter Lee	637aafc451	[Myriad] Define __ma2x5x and __ma2x8x Summary: Add architecture defines for ma2x5x and ma2x8x. Reviewers: jyknight Subscribers: fedor.sergeev, MartinO Differential Revision: https://reviews.llvm.org/D42882 llvm-svn: 324420	2018-02-06 22:39:47 +00:00
Yaxun Liu	f5f45e5e63	[AMDGPU] Switch to the new addr space mapping by default This requires corresponding llvm change. Differential Revision: https://reviews.llvm.org/D40956 llvm-svn: 324102	2018-02-02 16:08:24 +00:00
Artem Belevich	fbc56a904f	[CUDA] Added partial support for CUDA-9.1 Clang can use CUDA-9.1 now, though new APIs (are not implemented yet. The major change is that headers in CUDA-9.1 went through substantial changes that started in CUDA-9.0 which required substantial changes in the cuda compatibility headers provided by clang. There are two major issues: * CUDA SDK no longer provides declarations for libdevice functions. * A lot of device-side functions have become nvcc's builtins and CUDA headers no longer contain their implementations. This patch changes the way CUDA headers are handled if we compile with CUDA 9.x. Both 9.0 and 9.1 are affected. * Clang provides its own declarations of libdevice functions. * For CUDA-9.x clang now provides implementation of device-side 'standard library' functions using libdevice. This patch should not affect compilation with CUDA-8. There may be some observable differences for CUDA-9.0, though they are not expected to affect functionality. Tested: CUDA test-suite tests for all supported combinations of: CUDA: 7.0,7.5,8.0,9.0,9.1 GPU: sm_20, sm_35, sm_60, sm_70 Differential Revision: https://reviews.llvm.org/D42513 llvm-svn: 323713	2018-01-30 00:00:12 +00:00
Craig Topper	ace5c37c57	[X86] Add 'rdrnd' feature to silvermont to match recent gcc bug fix. gcc recently fixed this bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83546 llvm-svn: 323552	2018-01-26 19:34:45 +00:00
Craig Topper	3672f00e01	[X86] Define __IBT__ when -mibt is specified. llvm-svn: 323543	2018-01-26 18:31:14 +00:00
Wei Mi	d1621699dc	Adjust MaxAtomicInlineWidth for i386/i486 targets. This is to fix the bug reported in https://bugs.llvm.org/show_bug.cgi?id=34347#c6. Currently, all MaxAtomicInlineWidth of x86-32 targets are set to 64. However, i386 doesn't support any cmpxchg related instructions. i486 only supports cmpxchg. So in this patch MaxAtomicInlineWidth is reset as follows: For i386, the MaxAtomicInlineWidth should be 0 because no cmpxchg is supported. For i486, the MaxAtomicInlineWidth should be 32 because it supports cmpxchg. For others 32 bits x86 cpu, the MaxAtomicInlineWidth should be 64 because of cmpxchg8b. Differential Revision: https://reviews.llvm.org/D42154 llvm-svn: 323281	2018-01-23 23:27:57 +00:00
Dan Gohman	59f16991b0	[WebAssembly] Factor out settings common to wasm32 and wasm64. NFC. MaxAtomicPromoteWidth and MaxAtomicInlineWidth are 64 on both wasm32 and wasm64, so they can be set in shared code. llvm-svn: 323253	2018-01-23 20:22:12 +00:00
Chandler Carruth	c58f2166ab	Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre.. Summary: First, we need to explain the core of the vulnerability. Note that this is a very incomplete description, please see the Project Zero blog post for details: https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html The basis for branch target injection is to direct speculative execution of the processor to some "gadget" of executable code by poisoning the prediction of indirect branches with the address of that gadget. The gadget in turn contains an operation that provides a side channel for reading data. Most commonly, this will look like a load of secret data followed by a branch on the loaded value and then a load of some predictable cache line. The attacker then uses timing of the processors cache to determine which direction the branch took in the speculative execution, and in turn what one bit of the loaded value was. Due to the nature of these timing side channels and the branch predictor on Intel processors, this allows an attacker to leak data only accessible to a privileged domain (like the kernel) back into an unprivileged domain. The goal is simple: avoid generating code which contains an indirect branch that could have its prediction poisoned by an attacker. In many cases, the compiler can simply use directed conditional branches and a small search tree. LLVM already has support for lowering switches in this way and the first step of this patch is to disable jump-table lowering of switches and introduce a pass to rewrite explicit indirectbr sequences into a switch over integers. However, there is no fully general alternative to indirect calls. We introduce a new construct we call a "retpoline" to implement indirect calls in a non-speculatable way. It can be thought of loosely as a trampoline for indirect calls which uses the RET instruction on x86. Further, we arrange for a specific call->ret sequence which ensures the processor predicts the return to go to a controlled, known location. The retpoline then "smashes" the return address pushed onto the stack by the call with the desired target of the original indirect call. The result is a predicted return to the next instruction after a call (which can be used to trap speculative execution within an infinite loop) and an actual indirect branch to an arbitrary address. On 64-bit x86 ABIs, this is especially easily done in the compiler by using a guaranteed scratch register to pass the target into this device. For 32-bit ABIs there isn't a guaranteed scratch register and so several different retpoline variants are introduced to use a scratch register if one is available in the calling convention and to otherwise use direct stack push/pop sequences to pass the target address. This "retpoline" mitigation is fully described in the following blog post: https://support.google.com/faqs/answer/7625886 We also support a target feature that disables emission of the retpoline thunk by the compiler to allow for custom thunks if users want them. These are particularly useful in environments like kernels that routinely do hot-patching on boot and want to hot-patch their thunk to different code sequences. They can write this custom thunk and use `-mretpoline-external-thunk` in addition to `-mretpoline`. In this case, on x86-64 thu thunk names must be: ``` __llvm_external_retpoline_r11 ``` or on 32-bit: ``` __llvm_external_retpoline_eax __llvm_external_retpoline_ecx __llvm_external_retpoline_edx __llvm_external_retpoline_push ``` And the target of the retpoline is passed in the named register, or in the case of the `push` suffix on the top of the stack via a `pushl` instruction. There is one other important source of indirect branches in x86 ELF binaries: the PLT. These patches also include support for LLD to generate PLT entries that perform a retpoline-style indirection. The only other indirect branches remaining that we are aware of are from precompiled runtimes (such as crt0.o and similar). The ones we have found are not really attackable, and so we have not focused on them here, but eventually these runtimes should also be replicated for retpoline-ed configurations for completeness. For kernels or other freestanding or fully static executables, the compiler switch `-mretpoline` is sufficient to fully mitigate this particular attack. For dynamic executables, you must compile all libraries with `-mretpoline` and additionally link the dynamic executable and all shared libraries with LLD and pass `-z retpolineplt` (or use similar functionality from some other linker). We strongly recommend also using `-z now` as non-lazy binding allows the retpoline-mitigated PLT to be substantially smaller. When manually apply similar transformations to `-mretpoline` to the Linux kernel we observed very small performance hits to applications running typical workloads, and relatively minor hits (approximately 2%) even for extremely syscall-heavy applications. This is largely due to the small number of indirect branches that occur in performance sensitive paths of the kernel. When using these patches on statically linked applications, especially C++ applications, you should expect to see a much more dramatic performance hit. For microbenchmarks that are switch, indirect-, or virtual-call heavy we have seen overheads ranging from 10% to 50%. However, real-world workloads exhibit substantially lower performance impact. Notably, techniques such as PGO and ThinLTO dramatically reduce the impact of hot indirect calls (by speculatively promoting them to direct calls) and allow optimized search trees to be used to lower switches. If you need to deploy these techniques in C++ applications, we strongly recommend that you ensure all hot call targets are statically linked (avoiding PLT indirection) and use both PGO and ThinLTO. Well tuned servers using all of these techniques saw 5% - 10% overhead from the use of retpoline. We will add detailed documentation covering these components in subsequent patches, but wanted to make the core functionality available as soon as possible. Happy for more code review, but we'd really like to get these patches landed and backported ASAP for obvious reasons. We're planning to backport this to both 6.0 and 5.0 release streams and get a 5.0 release with just this cherry picked ASAP for distros and vendors. This patch is the work of a number of people over the past month: Eric, Reid, Rui, and myself. I'm mailing it out as a single commit due to the time sensitive nature of landing this and the need to backport it. Huge thanks to everyone who helped out here, and everyone at Intel who helped out in discussions about how to craft this. Also, credit goes to Paul Turner (at Google, but not an LLVM contributor) for much of the underlying retpoline design. Reviewers: echristo, rnk, ruiu, craig.topper, DavidKreitzer Subscribers: sanjoy, emaste, mcrosier, mgorny, mehdi_amini, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D41723 llvm-svn: 323155	2018-01-22 22:05:25 +00:00

... 2 3 4 5 6 ...

452 Commits