llvm-project

Commit Graph

Author	SHA1	Message	Date
Florian Hahn	51bf4c0e6d	[clang] Add -ffinite-loops & -fno-finite-loops options. This patch adds 2 new options to control when Clang adds `mustprogress`: 1. -ffinite-loops: assume all loops are finite; mustprogress is added to all loops, regardless of the selected language standard. 2. -fno-finite-loops: assume no loop is finite; mustprogress is not added to any loop or function. We could add mustprogress to functions without loops, but we would have to detect that in Clang, which is probably not worth it. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D96419	2021-02-12 19:25:49 +00:00
Florian Hahn	fb4d8fe807	[clang] Update mustprogress tests. This unifies the positive and negative tests in a single file and manually adjusts the check lines to check for differences surgically.	2021-02-12 16:53:51 +00:00
James Y Knight	8043d5a964	NFC: update clang tests to check ordering and alignment for atomicrmw/cmpxchg. The ability to specify alignment was recently added, and it's an important property which we should ensure is set as expected by Clang. (Especially before making further changes to Clang's code in this area.) But, because it's on the end of the lines, the existing tests all ignore it. Therefore, update all the tests to also verify the expected alignment for atomicrmw and cmpxchg. While I was in there, I also updated uses of 'load atomic' and 'store atomic', and added the memory ordering, where that was missing.	2021-02-11 17:35:09 -05:00
Pengxuan Zheng	61cca0f2e5	[AArch64] Adding Neon Sm3 & Sm4 Intrinsics This adds SM3 and SM4 Intrinsics support for AArch64, specifically: vsm3ss1q_u32 vsm3tt1aq_u32 vsm3tt1bq_u32 vsm3tt2aq_u32 vsm3tt2bq_u32 vsm3partw1q_u32 vsm3partw2q_u32 vsm4eq_u32 vsm4ekeyq_u32 Reviewed By: labrinea Differential Revision: https://reviews.llvm.org/D95655	2021-02-11 14:20:20 -08:00
Douglas Yung	7b4832648a	NFCI. With the move to the new pass manager by default, sanitize-coverage.c is now passing on ARM. This change removes the XFAIL from the original test and duplicates the test into sanitize-coverage-old-pm.c which uses the old pass manager and has the corresponding XFAIL. This should fix the XPASS from this and similar runs: http://lab.llvm.org:8011/#/builders/60/builds/1875	2021-02-11 13:18:18 -08:00
Paul Robinson	5ea2d4fa48	Avoid conflicts between debug-info and pseudo-probe profiling After D93264, using both -fdebug-info-for-profiling and -fpseudo-probe-for-profiling will cause the compiler to crash. Diagnose these conflicting options in the driver. Also, the existing CodeGen test was using the driver when it should be running cc1. Differential Revision: https://reviews.llvm.org/D96354	2021-02-10 07:09:18 -08:00
Wang, Pengfei	dd2460ed5d	[X86] Always assign reassoc flag for intrinsics reduce_add/mul_ps/pd. Intrinsics reduce_add/mul_ps/pd have assumption that the elements in the vector are reassociable. So we need to always assign the reassoc flag when we call _mm_reduce_* intrinsics. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D96231	2021-02-09 21:14:06 +08:00
Petr Hosek	9fd9b5a9c9	Don't emit coverage mapping for excluded functions When a function or a file is excluded using -fprofile-list= option, don't emit coverage mapping as doing so confuses users since those functions would always have zero count. This also reduces the binary size considerably in cases where only a few functions or files are being instrumented. Differential Revision: https://reviews.llvm.org/D96000	2021-02-05 13:03:57 -08:00
Thomas Preud'homme	00a62547da	Stop trapping on sNaN in __builtin_isnan __builtin_isnan currently generates a floating-point compare operation which triggers a trap when faced with a signaling NaN in StrictFP mode. This commit uses integer operations instead to not generate any trap in such a case. Reviewed By: kpn Differential Revision: https://reviews.llvm.org/D95948	2021-02-05 18:28:48 +00:00
Krzysztof Parzyszek	bc097f645e	[Hexagon] Add clang builtin definitions for Hexagon V68	2021-02-04 09:54:52 -06:00
Kevin P. Neal	81b69879c9	[FPEnv][X86] Platform builtins edition: clang should get from the AST the metadata for constrained FP builtins Currently clang is not correctly retrieving from the AST the metadata for constrained FP builtins. This patch fixes that for the X86 specific builtins. Differential Revision: https://reviews.llvm.org/D94614	2021-02-03 11:49:17 -05:00
Hongtao Yu	3d89b3cbec	[CSSPGO] Introducing distribution factor for pseudo probe. Sample re-annotation is required in LTO time to achieve a reasonable post-inline profile quality. However, we have seen that such LTO-time re-annotation degrades profile quality. This is mainly caused by preLTO code duplication that is done by passes such as loop unrolling, jump threading, indirect call promotion etc, where samples corresponding to a source location are aggregated multiple times due to the duplicates. In this change we are introducing a concept of distribution factor for pseudo probes so that samples can be distributed for duplicated probes scaled by a factor. We hope that optimizations duplicating code well-maintain the branch frequency information (BFI) based on which probe distribution factors are calculated. Distribution factors are updated at the end of preLTO pipeline to reflect an estimated portion of the real execution count. This change also introduces a pseudo probe verifier that can be run after each IR pass to detect duplicated pseudo probes. A saturated distribution factor stands for 1.0. A pseudo probe will carry a factor with the value ranged from 0.0 to 1.0. A 64-bit integral distribution factor field that represents [0.0, 1.0] is associated to each block probe. Unfortunately this cannot be done for callsite probes due to the size limitation of a 32-bit Dwarf discriminator. A 7-bit distribution factor is used instead. Changes are also needed to the sample profile inliner to deal with prorated callsite counts. Call sites duplicated by PreLTO passes, when later on inlined in LTO time, should have the callees' probe prorated based on the Prelink-computed distribution factors. The distribution factors should also be taken into account when computing hotness for inline candidates. Also, Indirect call promotion results in multiple callsites. The original samples should be distributed across them. This is fixed by adjusting the callsites' distribution factors. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D93264	2021-02-02 11:55:01 -08:00
Fangrui Song	74c94b5d9c	[test] Default clang/test to FileCheck --allow-unused-prefixes=false	2021-02-02 11:22:46 -08:00
Zarko Todorovski	eb3426a528	[AIX] Improve option processing for mabi=vec-extabi and mabi=vec=defaul Opening this revision to better address comments by @hubert.reinterpretcast in https://reviews.llvm.org/rGcaaaebcde462 Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D95702	2021-02-02 10:59:21 -05:00
Nico Weber	f2b4cc91e0	Revert "[test] Default clang/test to FileCheck --allow-unused-prefixes=false" This reverts commit `80f539526e`. Many test failures on mac: http://45.33.8.238/macm1/2772/summary.html One on win: http://45.33.8.238/win/32442/summary.html	2021-02-02 07:38:44 -05:00
Fangrui Song	80f539526e	[test] Default clang/test to FileCheck --allow-unused-prefixes=false	2021-02-01 22:02:59 -08:00
Petr Hosek	0217f1c7a3	Make the profile-filter.c test compatible with 32-bit systems This addresses PR48930. Differential Revision: https://reviews.llvm.org/D95658	2021-01-29 09:58:32 -08:00
Abhina Sreeskantharajan	42a21778f6	[test] Use host platform specific error message substitution in lit tests On z/OS, the following error message is not matched correctly in lit tests. ``` EDC5129I No such file or directory. ``` This patch uses a lit config substitution to check for platform specific error messages. Reviewed By: muiez, jhenderson Differential Revision: https://reviews.llvm.org/D95246	2021-01-29 07:16:30 -05:00
Thomas Lively	4b68b64dcc	[WebAssembly] Prototype i8x16 to i32x4 widening instructions As proposed in https://github.com/WebAssembly/simd/pull/395 and matching the opcodes used in V8: https://chromium-review.googlesource.com/c/v8/v8/+/2617385/4/src/wasm/wasm-opcodes.h Differential Revision: https://reviews.llvm.org/D95557	2021-01-28 10:59:32 -08:00
James Y Knight	a7246ba02a	Itanium Mangling: In 'enable_if', omit X/E around <expr-primary>. The Clang enable_if extension is mangled as an <extended-qualifier>, which is supposed to contain <template-args>. However, we were unconditionally emitting X/E around its arguments, neglecting the fact that <expr-primary> should be emitted directly without the surrounding X/E. Differential Revision: https://reviews.llvm.org/D95488	2021-01-27 16:46:52 -05:00
Fangrui Song	3e80686186	[test] Fix clang/test/CodeGen tests	2021-01-27 10:55:27 -08:00
Freddy Ye	1edb76cc91	[X86] merge "={eax}" and "~{eax}" into "=&eax" for MSInlineASM Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D94466	2021-01-27 22:54:17 +08:00
Petr Hosek	bb9eb19829	Support for instrumenting only selected files or functions This change implements support for applying profile instrumentation only to selected files or functions. The implementation uses the sanitizer special case list format to select which files and functions to instrument, and relies on the new noprofile IR attribute to exclude functions from instrumentation. Differential Revision: https://reviews.llvm.org/D94820	2021-01-26 17:13:34 -08:00
Petr Hosek	1e634f3952	Revert "Support for instrumenting only selected files or functions" This reverts commit `4edf35f11a` because the test fails on Windows bots.	2021-01-26 12:25:28 -08:00
Fangrui Song	189f311130	CGDebugInfo CreatedLimitedType: Drop file/line for RecordType with invalid location For Clang synthesized `__va_list_tag` (`CreateX86_64ABIBuiltinVaListDecl`), its DW_AT_decl_file/DW_AT_decl_line are arbitrarily set from `CurLoc`. In a stage 2 `-DCMAKE_BUILD_TYPE=Debug` clang build, I observe that in driver.cpp, DW_AT_decl_file/DW_AT_decl_line may be set to an `#include` line (the transitively included file uses va_arg (`__builtin_va_arg`)). This seems arbitrary. Drop that. Reviewed By: #debug-info, dblaikie Differential Revision: https://reviews.llvm.org/D94735	2021-01-26 11:53:25 -08:00
Petr Hosek	4edf35f11a	Support for instrumenting only selected files or functions This change implements support for applying profile instrumentation only to selected files or functions. The implementation uses the sanitizer special case list format to select which files and functions to instrument, and relies on the new noprofile IR attribute to exclude functions from instrumentation. Differential Revision: https://reviews.llvm.org/D94820	2021-01-26 11:11:39 -08:00
Mircea Trofin	0c0d009a88	[NFC] Disallow unused prefixes under clang/test/CodeGen Differential Revision: https://reviews.llvm.org/D95417	2021-01-26 08:05:45 -08:00
Zarko Todorovski	028d7a3668	Remove requirement for -maltivec to be used when using -mabi=vec-extabi or -mabi=vec-default when not using vector code The previous implementation required that `-maltivec` be specified when using either `-mabi=vec-extabi` or `-mabi=vec-default`, this patch removes that requirement. Reviewed By: cebowleratibm Differential Revision: https://reviews.llvm.org/D94986	2021-01-26 07:58:01 -05:00
Abhina Sreeskantharajan	978444d531	Revert "[SystemZ][z/OS] Fix No such file or directory expression error" This reverts commit `06f8a49693`.	2021-01-25 08:29:38 -05:00
Jeroen Dobbelaere	2b9a834c43	[InlineFunction] Use llvm.experimental.noalias.scope.decl for noalias arguments. Insert a llvm.experimental.noalias.scope.decl intrinsic that identifies where a noalias argument was inlined. This patch includes some refactorings from D90104. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D93040	2021-01-23 12:10:57 +01:00
Thomas Lively	11802eced5	[WebAssembly] Prototype new f64x2 conversions As proposed in https://github.com/WebAssembly/simd/pull/383. Differential Revision: https://reviews.llvm.org/D95012	2021-01-20 11:28:06 -08:00
George Burgess IV	b270fd59f0	Revert "[clang] Change builtin object size when subobject is invalid" This reverts commit `275f30df8a`. As noted on the code review (https://reviews.llvm.org/D92892), this change causes us to reject valid code in a few cases. Reverting so we have more time to figure out what the right fix{es are, is} here.	2021-01-20 11:03:34 -08:00
Luo, Yuanke	7e1d2224b4	[X86][AMX] Fix the typo. The dpbsud should be dpbssd. Differential Revision: https://reviews.llvm.org/D94943	2021-01-19 16:57:34 +08:00
Florian Hahn	291ac7e622	[AArch64] Revert back to Intrinsic<> for TME instructions. This patch reverts back to Intrinsic for the instructions for the transactional memory extension, so nosync is not included.	2021-01-18 18:03:58 +00:00
Abhina Sreeskantharajan	689aaba7ac	[SystemZ][z/OS] Fix No such file or directory expression error matching in lit tests On z/OS, the following error message is not matched correctly in lit tests. This patch updates the CHECK expression to match successfully. ``` EDC5129I No such file or directory. ``` Reviewed By: muiez Differential Revision: https://reviews.llvm.org/D94239	2021-01-18 07:14:37 -05:00
Mircea Trofin	e8049dc3c8	[NewPM][Inliner] Move the 'always inliner' case in the same CGSCC pass as 'regular' inliner Expanding from D94808 - we ensure the same InlineAdvisor is used by both InlinerPass instances. The notion of mandatory inlining is moved into the core InlineAdvisor: advisors anyway have to handle that case, so this change also factors out that a bit better. Differential Revision: https://reviews.llvm.org/D94825	2021-01-15 17:59:38 -08:00
Qiu Chaofan	168be42083	[Clang] Mutate long-double math builtins into f128 under IEEE-quad Under -mabi=ieeelongdouble on PowerPC, IEEE-quad floating point semantic is used for long double. This patch mutates call to related builtins into f128 version on PowerPC. And in theory, this should be applied to other targets when their backend supports IEEE 128-bit style libcalls. GCC already has these mutations except nansl, which is not available on PowerPC along with other variants (nans, nansf). Reviewed By: RKSimon, nemanjai Differential Revision: https://reviews.llvm.org/D92080	2021-01-15 16:56:20 +08:00
Erich Keane	9e53c94d8d	[NFC] Update test to not check for 'opaque' in the file name. The intent presumably is to avoid generating 'opaque' in the IR, but the header contains the filename. Thus, having the workspace in a directory with opaque in it causes this test to fail. This just adds a 'CHECK' line on target-triple, which is the last line of the IR-header.	2021-01-14 11:24:06 -08:00
Lucas Prates	2b1e25befe	[AArch64] Adding ACLE intrinsics for the LS64 extension This introduces the ARMv8.7-A LS64 extension's intrinsics for 64 bytes atomic loads and stores: `__arm_ld64b`, `__arm_st64b`, `__arm_st64bv`, and `__arm_st64bv0`. These are selected into the LS64 instructions LD64B, ST64B, ST64BV and ST64BV0, respectively. Based on patches written by Simon Tatham. Reviewed By: tmatheson Differential Revision: https://reviews.llvm.org/D93232	2021-01-14 09:43:58 +00:00
Zequan Wu	e53bbd9951	[IR] move nomerge attribute from function declaration/definition to callsites Move nomerge attribute from function declaration/definition to callsites to allow virtual function calls attach the attribute. Differential Revision: https://reviews.llvm.org/D94537	2021-01-12 12:10:46 -08:00
Jan Svoboda	7ab803095a	[clang][cli] Remove -f[no-]trapping-math from -cc1 command line This patch removes the -f[no-]trapping-math flags from the -cc1 command line. These flags are ignored in the command line parser and their semantics is fully handled by -ffp-exception-mode. This patch does not remove -f[no-]trapping-math from the driver command line. The driver flags are being used and do affect compilation. Reviewed By: dexonsmith, SjoerdMeijer Differential Revision: https://reviews.llvm.org/D93395	2021-01-12 10:00:23 +01:00
Sriraman Tallam	d8c6d24359	-funique-internal-linkage-names appends a hex md5hash suffix to the symbol name which is not demangler friendly, convert it to decimal. Please see D93747 for more context which tries to make linkage names of internal linkage functions to be the uniqueified names. This causes a problem with gdb because breaking using the demangled function name will not work if the new uniqueified name cannot be demangled. The problem is the generated suffix which is a mix of integers and letters which do not demangle. The demangler accepts either all numbers or all letters. This patch simply converts the hash to decimal. There is no loss of uniqueness by doing this as the precision is maintained. The symbol names get longer by a few characters though. Differential Revision: https://reviews.llvm.org/D94154	2021-01-11 11:10:29 -08:00
Joe Ellis	8ea72b3887	[clang][AArch64][SVE] Avoid going through memory for coerced VLST return values VLST return values are coerced to VLATs in the function epilog for consistency with the VLAT ABI. Previously, this coercion was done through memory. It is preferable to use the llvm.experimental.vector.insert intrinsic to avoid going through memory here. Reviewed By: c-rhodes Differential Revision: https://reviews.llvm.org/D94290	2021-01-11 12:10:59 +00:00
Esme-Yi	ffa67873a3	[PowerPC] Add variants of 64-bit vector types for vec_sel. Summary: This patch added variants of vec_sel and fixed bugzilla 46770. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D94162	2021-01-11 03:52:16 +00:00
Fangrui Song	b41b743d46	[test] Improve weakref & weak_import tests	2021-01-09 23:56:55 -08:00
Fangrui Song	e2e82c9983	[CodeGenModule] Drop dso_local on function declarations for ELF -fno-pic -fno-direct-access-external-data ELF -fno-pic sets dso_local on a function declaration to allow direct accesses when taking its address (similar to a data symbol). The emitted code follows the traditional GCC/Clang -fno-pic behavior: an absolute relocation is produced. If the function is not defined in the executable, a canonical PLT entry will be needed at link time. This is similar to a copy relocation and is incompatible with (-Bsymbolic or --dynamic-list linked shared objects / protected symbols in a shared object). This patch gives -fno-pic code a way to avoid such a canonical PLT entry. The FIXME was about a generalization for -fpie -mpie-copy-relocations (now -fpie -fdirect-access-external-data). While we could set dso_local to avoid GOT when taking the address of a function declaration (there is an ignorable difference about R_386_PC32 vs R_386_PLT32 on i386), it likely does not provide any benefit and can just cause trouble, so we don't make the generalization.	2021-01-09 16:31:56 -08:00
Fangrui Song	38a716c30f	Make -fno-pic respect -fno-direct-access-external-data D92633 added -f[no-]direct-access-external-data to supersede -m[no-]pie-copy-relocations. (The option works for -fpie but is a no-op for -fno-pic and -fpic.) This patch makes -fno-pic -fno-direct-access-external-data drop dso_local from global variable declarations. This usually causes the backend to emit a GOT indirection for external data access. With a GOT relocation, the subsequent -no-pie link will not have copy relocation even if the data symbol turns out to be defined by a shared object. Differential Revision: https://reviews.llvm.org/D92714	2021-01-09 00:32:02 -08:00
Fangrui Song	1d3ebbf537	Add -f[no-]direct-access-external-data to supersede -mpie-copy-relocations GCC r218397 "x86-64: Optimize access to globals in PIE with copy reloc" made -fpie code emit R_X86_64_PC32 to reference external data symbols by default. Clang adopted -mpie-copy-relocations D19996 as a flexible alternative. The name -mpie-copy-relocations can be improved [1] and does not capture the idea that this option can apply to -fno-pic and -fpic [2], so this patch introduces -f[no-]direct-access-external-data and makes -mpie-copy-relocations their aliases for compatibility. [1] For ``` extern int var; int get() { return var; } ``` if var is defined in another translation unit in the link unit, there is no copy relocation. [2] -fno-pic -fno-direct-access-external-data is useful to avoid copy relocations. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65888 If a shared object is linked with -Bsymbolic or --dynamic-list and exports a data symbol, normally the data symbol cannot be accessed by -fno-pic code (because by default an absolute relocation is produced which will lead to a copy relocation). -fno-direct-access-external-data can prevent copy relocations. -fpic -fdirect-access-external-data can avoid GOT indirection. This is like the undefined counterpart of -fno-semantic-interposition. However, the user should define var in another translation unit and link with -Bsymbolic or --dynamic-list, otherwise the linker will error in a -shared link. Generally the user has better tools for their goal but I want to mention that this combination is valid. On COFF, the behavior is like always -fdirect-access-external-data. `__declspec(dllimport)` is needed to enable indirect access. There is currently no plan to affect non-ELF behaviors or -fpic behaviors. -fno-pic -fno-direct-access-external-data will be implemented in the subsequent patch. GCC feature request https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112 Reviewed By: tmsriram Differential Revision: https://reviews.llvm.org/D92633	2021-01-09 00:32:01 -08:00
Umesh Kalappa	33c8e16f66	PR47391: Canonicalize DIFiles Like @aprantl suggested, modify to use the canonicalized DIFile, if we don't know the loc info and filename for the compiler generated functions for example static initialization functions. Reviewed By: dblaikie, aprantl Differential Revision: https://reviews.llvm.org/D87147	2021-01-08 22:11:16 -08:00
Heejin Ahn	7be271537e	[WebAssembly] Rename wasm_rethrow_in_catch intrinsic/builtin `wasm_rethrow_in_catch` intrinsic and builtin are used in order to rethrow an exception when the exception is caught but there is no matching clause within the current `catch`. For example, ``` try { foo(); } catch (int n) { ... } ``` If the caught exception does not correspond to C++ `int` type, it should be rethrown. These intrinsic/builtin were renamed `rethrow_in_catch` because at the time I thought there would be another intrinsic for C++'s `throw` keyword, which rethrows an exception. It turned out that `throw` keyword doesn't require wasm's `rethrow` instruction, so we rename `rethrow_in_catch` to just `rethrow` here. Reviewed By: dschuff, tlively Differential Revision: https://reviews.llvm.org/D94038	2021-01-08 06:55:04 -08:00
Jeffrey T Mott	275f30df8a	[clang] Change builtin object size when subobject is invalid Motivating example: ``` struct { int v[10]; } t[10]; __builtin_object_size( &t[0].v[11], // access past end of subobject 1 // request remaining bytes of closest surrounding // subobject ); ``` In GCC, this returns 0. https://godbolt.org/z/7TeGs7 In current clang, however, this returns 356, the number of bytes remaining in the whole variable, as if the `type` was 0 instead of 1. https://godbolt.org/z/6Kffox This patch checks for the specific case where we're requesting a subobject's size (type 1) but the subobject is invalid. Differential Revision: https://reviews.llvm.org/D92892	2021-01-07 12:34:07 -08:00
Jeroen Dobbelaere	59fce6b066	[NFC] make clang/test/CodeGen/arm_neon_intrinsics.c resistant to function attribute id changes When introducing support for @llvm.experimental.noalias.scope.decl, this test started failing because it checks (for no good reason) for a function attribute id of '#8' which now becomes '#9' Reviewed By: pratlucas Differential Revision: https://reviews.llvm.org/D94233	2021-01-07 17:08:15 +00:00
Thomas Lively	497026c902	[WebAssembly] Prototype prefetch instructions As proposed in https://github.com/WebAssembly/simd/pull/352 and using the opcodes used in the V8 prototype: https://chromium-review.googlesource.com/c/v8/v8/+/2543167. These instructions are only usable via intrinsics and clang builtins to make them opt-in while they are being benchmarked. Differential Revision: https://reviews.llvm.org/D93883	2021-01-05 11:32:03 -08:00
Florian Hahn	51d5991f04	[Clang] Add AArch64 VCMLA LANE variants. This patch adds the LANE variants for VCMLA on AArch64 as defined in "Arm Neon Intrinsics Reference for ACLE Q3 2020" [1] This patch also updates `dup_typed` to accept constant type strings directly. Based on a patch by Tim Northover. [1] https://developer.arm.com/documentation/ihi0073/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D93014	2021-01-05 16:14:00 +00:00
Joe Ellis	3d5b18a3fd	[clang][AArch64][SVE] Avoid going through memory for coerced VLST arguments VLST arguments are coerced to VLATs at the function boundary for consistency with the VLAT ABI. They are then bitcast back to VLSTs in the function prolog. Previously, this conversion was done through memory. With the introduction of the llvm.vector.{insert,extract} intrinsic, we can avoid going through memory here. Depends on D92761 Differential Revision: https://reviews.llvm.org/D92762	2021-01-05 15:18:21 +00:00
Brandon Bergren	6cee9d0cf8	[PowerPC] Support powerpcle target in Clang [3/5] Add powerpcle support to clang. For FreeBSD, assume a freestanding environment for now, as we only need it in the first place to build loader, which runs in the OpenFirmware environment instead of the FreeBSD environment. For Linux, recognize glibc and musl environments to match current usage in Void Linux PPC. Adjust driver to match current binutils behavior regarding machine naming. Adjust and expand tests. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D93919	2021-01-02 12:17:58 -06:00
Fangrui Song	d1fd72343c	Refactor how -fno-semantic-interposition sets dso_local on default visibility external linkage definitions The idea is that the CC1 default for ELF should set dso_local on default visibility external linkage definitions in the default -mrelocation-model pic mode (-fpic/-fPIC) to match COFF/Mach-O and make output IR similar. The refactoring is made available by `2820a2ca3a`. Currently only x86 supports local aliases. We move the decision to the driver. There are three CC1 states: * -fsemantic-interposition: make some linkages interposable and make default visibility external linkage definitions dso_preemptable. * (default): selected if the target supports .Lfoo$local: make default visibility external linkage definitions dso_local * -fhalf-no-semantic-interposition: if neither option is set or the target does not support .Lfoo$local: like -fno-semantic-interposition but local aliases are not used. So references can be interposed if not optimized out. Add -fhalf-no-semantic-interposition to a few tests using the half-based semantic interposition behavior.	2020-12-31 13:59:45 -08:00
Fangrui Song	fd739804e0	[test] Add {{.}} to make ELF tests immune to dso_local/dso_preemptable/(none) differences For a default visibility external linkage definition, dso_local is set for ELF -fno-pic/-fpie and COFF and Mach-O. Since default clang -cc1 for ELF is similar to -fpic ("PIC Level" is not set), this nuance causes unneeded binary format differences. To make emitted IR similar, ELF -cc1 -fpic will default to -fno-semantic-interposition, which sets dso_local for default visibility external linkage definitions. To make this flip smooth and enable future (dso_local as definition default), this patch replaces (function) `define ` with `define{{.}} `, (variable/constant/alias) `= ` with `={{.}} `, or inserts appropriate `{{.}} `.	2020-12-31 00:27:11 -08:00
Fangrui Song	f2cc2669a0	[test] Fix -triple and delete UNSUPPORTED: system-windows	2020-12-31 00:13:34 -08:00
Luo, Yuanke	08665b1805	Support tilezero intrinsic and c interface for AMX. Differential Revision: https://reviews.llvm.org/D92837	2020-12-31 13:24:57 +08:00
Fangrui Song	6b3351792c	[test] Add {{.}} to make tests immune to dso_local/dso_preemptable/(none) differences For a definition (of most linkage types), dso_local is set for ELF -fno-pic/-fpie and COFF, but not for Mach-O. This nuance causes unneeded binary format differences. This patch replaces (function) `define ` with `define{{.}} `, (variable/constant/alias) `= ` with `={{.}} `, or inserts appropriate `{{.}} ` if there is an explicit linkage. * Clang will set dso_local for Mach-O, which is currently implied by TargetMachine.cpp. This will make COFF/Mach-O and executable ELF similar. * Eventually I hope we can make dso_local the textual LLVM IR default (write explicit "dso_preemptable" when applicable) and -fpic ELF will be similar to everything else. This patch helps move toward that goal.	2020-12-30 20:52:01 -08:00
Juneyoung Lee	420d046d6b	clang-format, address warnings	2020-12-30 23:05:07 +09:00
Juneyoung Lee	9b29610228	Use unary CreateShuffleVector if possible As mentioned in D93793, there are quite a few places where unary `IRBuilder::CreateShuffleVector(X, Mask)` can be used instead of `IRBuilder::CreateShuffleVector(X, Undef, Mask)`. Let's update them. Actually, it would have been more natural if the patches were made in this order: (1) let them use unary CreateShuffleVector first (2) update IRBuilder::CreateShuffleVector to use poison as a placeholder value (D93793) The order is swapped, but in terms of correctness it is still fine. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D93923	2020-12-30 22:36:08 +09:00
Fangrui Song	2820a2ca3a	Move -fno-semantic-interposition dso_local logic from TargetMachine to Clang CodeGenModule This simplifies TargetMachine::shouldAssumeDSOLocal and gives frontend the decision to use dso_local. For LLVM synthesized functions/globals, they may lose inferred dso_local but such optimizations are probably not very useful. Note: the hasComdat() condition in canBenefitFromLocalAlias (D77429) may be dead now. (llvm/CodeGen/X86/semantic-interposition-comdat.ll) (Investigate whether we need test coverage when Fuchsia C++ ABI is clearer)	2020-12-29 23:37:55 -08:00
Luo, Yuanke	981a0bd858	[X86] Add x86_amx type for intel AMX. The x86_amx is used for AMX intrinsics. <256 x i32> is bitcast to x86_amx when it is used by AMX intrinsics, and x86_amx is bitcast to <256 x i32> when it is used by load/store instruction. So amx intrinsics only operate on type x86_amx. It can help to separate amx intrinsics from llvm IR instructions (+-*/). Thank Craig for the idea. This patch depends on https://reviews.llvm.org/D87981. Differential Revision: https://reviews.llvm.org/D91927	2020-12-30 13:52:13 +08:00
Juneyoung Lee	278aa65cc4	[IR] Let IRBuilder's CreateVectorSplat/CreateShuffleVector use poison as placeholder This patch updates IRBuilder to create insertelement/shufflevector using poison as a placeholder. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D93793	2020-12-30 04:21:04 +09:00
Thomas Lively	5e09e9979b	[WebAssembly] Prototype extending pairwise add instructions As proposed in https://github.com/WebAssembly/simd/pull/380. This commit makes the new instructions available only via clang builtins and LLVM intrinsics to make their use opt-in while they are still being evaluated for inclusion in the SIMD proposal. Depends on D93771. Differential Revision: https://reviews.llvm.org/D93775	2020-12-28 14:11:14 -08:00
Juneyoung Lee	9d70dbdc2b	[InstCombine] use poison as placeholder for undemanded elems Currently undef is used as a don’t-care vector when constructing a vector using a series of insertelement. However, this is problematic because undef isn’t undefined enough. Especially, a sequence of insertelement can be optimized to shufflevector, but using undef as its placeholder makes shufflevector a poison-blocking instruction because undef cannot be optimized to poison. This makes a few straightforward optimizations incorrect, such as: ``` ; https://bugs.llvm.org/show_bug.cgi?id=44185 define <4 x float> @insert_not_undef_shuffle_translate_commute(float %x, <4 x float> %y, <4 x float> %q) { %xv = insertelement <4 x float> %q, float %x, i32 2 %r = shufflevector <4 x float> %y, <4 x float> %xv, <4 x i32> { 0, 6, 2, undef } ret <4 x float> %r ; %r[3] is undef } => define <4 x float> @insert_not_undef_shuffle_translate_commute(float %x, <4 x float> %y, <4 x float> %q) { %r = insertelement <4 x float> %y, float %x, i32 1 ret <4 x float> %r ; %r[3] = %y[3], incorrect if %y[3] = poison } Transformation doesn't verify! ERROR: Target is more poisonous than source ``` I’d like to suggest 1. Using poison as insertelement’s placeholder value (IRBuilder::CreateVectorSplat should be patched too) 2. Updating shufflevector’s semantics to return poison element if mask is undef Note that poison is currently lowered into UNDEF in SelDag, so codegen part is okay. m_Undef() matches PoisonValue as well, so existing optimizations will still fire. The only concern is hidden miscompilations that will go incorrect when poison constant is given. A conservative way is copying all tests having `insertelement undef` & replacing it with `insertelement poison` & run Alive2 on it, but it will create many tests and people won’t like it. :( Instead, I’ll simply locally maintain the tests and run Alive2. If there is any bug found, I’ll report it. Relevant links: https://bugs.llvm.org/show_bug.cgi?id=43958 , http://lists.llvm.org/pipermail/llvm-dev/2019-November/137242.html Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D93586	2020-12-28 08:58:15 +09:00
Sriraman Tallam	34e70d722d	Append ".__part." to every basic block section symbol. Every basic block section symbol created by -fbasic-block-sections will contain ".__part." to know that this symbol corresponds to a basic block fragment of the function. This patch solves two problems: a) Like D89617, we want function symbols with suffixes to be properly qualified so that external tools like profile aggregators know exactly what this symbol corresponds to. b) The current basic block naming just adds a ".N" to the symbol name where N is some integer. This collides with how clang creates __cxx_global_var_init.N. clang creates these symbol names to call constructor functions and basic block symbol naming should not use the same style. Fixed all the test cases and added an extra test for __cxx_global_var_init breakage. Differential Revision: https://reviews.llvm.org/D93082	2020-12-23 11:35:44 -08:00
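As a minimal sketch of how an external tool might use the new marker, the C helper below tests whether a symbol name contains `.__part.`. The helper name `is_bb_section_symbol` and the sample symbol names are hypothetical; only the `.__part.` marker itself comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Marker that, per the commit message, every basic block section symbol contains. */
#define BB_PART_MARKER ".__part."

/* Hypothetical helper (not an LLVM API): returns true if `symbol` contains
 * the marker with a non-empty function name in front of it. */
static bool is_bb_section_symbol(const char *symbol) {
    assert(symbol != NULL);
    const char *marker = strstr(symbol, BB_PART_MARKER);
    return marker != NULL && marker != symbol;
}
```

With this check, a name such as `foo.__part.1` is classified as a basic block fragment, while `__cxx_global_var_init.1` is not, which is the collision the commit set out to avoid.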
Arthur Eubanks	db1616c768	[test] Fix new-pass-manager-opt-bisect.c Requires x86 target to be registered.	2020-12-20 17:13:42 -08:00
Samuel Eubanks	47dbee6790	Make NPM OptBisectInstrumentation use global singleton OptBisect Currently there is an issue where the legacy pass manager uses a different OptBisect counter than the new pass manager. This fix makes the npm OptBisectInstrumentation use the global OptBisect. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D92897	2020-12-20 13:47:56 -08:00
Roman Lebedev	897c985e1e	[InstCombine] Canonicalize SPF to abs intrinsic This patch enables canonicalization of SPF_ABS and SPF_NABS to the abs intrinsic. This is a recommit, the original try was `05d4c4ebc2`, but it was reverted due to an apparent miscompile, which since then has just been fixed by the previous commit. Differential Revision: https://reviews.llvm.org/D87188	2020-12-18 21:18:14 +03:00
Kevin P. Neal	7fef551cb1	Revert "Revert "[FPEnv] Teach the IRBuilder about invoke's correct use of the strictfp attribute."" Similar to D69312, and documented in D69839, the IRBuilder needs to add the strictfp attribute to invoke instructions when constrained floating point is enabled. This is try 2, with the test corrected. Differential Revision: https://reviews.llvm.org/D93134	2020-12-18 12:42:06 -05:00
Rong Xu	3733463dbb	[IR][PGO] Add hot func attribute and use hot/cold attribute in func section Clang FE currently has hot/cold function attribute. But we only have cold function attribute in LLVM IR. This patch adds support of hot function attribute to LLVM IR. This attribute will be used in setting function section prefix/suffix. Currently the .hot and .unlikely suffixes are only added in PGO (Sample PGO) compilation (through isFunctionHotInCallGraph and isFunctionColdInCallGraph). This patch changes the behavior. The new behavior is: (1) If the user annotates a function as hot or isFunctionHotInCallGraph is true, this function will be marked as hot. Otherwise, (2) If the user annotates a function as cold or isFunctionColdInCallGraph is true, this function will be marked as cold. The changes are: (1) user annotated function attribute will be used in setting function section prefix/suffix. (2) hot attribute overwrites profile count based hotness. (3) profile count based hotness overwrites user annotated cold attribute. The intention for these changes is to provide the user a way to mark certain functions as hot in cases where it is hard for the training input to cover all the hot functions. Differential Revision: https://reviews.llvm.org/D92493	2020-12-17 18:41:12 -08:00
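The precedence rules in this commit message can be modeled as a small decision function. This is only a sketch of the described behavior, not the LLVM implementation: the enum, the function name `classify_function`, and the boolean parameters (standing in for the user attributes and for isFunctionHotInCallGraph / isFunctionColdInCallGraph) are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical result type: which section prefix/suffix a function would get. */
typedef enum { SECTION_NONE, SECTION_HOT, SECTION_COLD } section_kind;

/* Models rules (1) and (2) from the commit message:
 * hot (user attribute or profile) is checked first, then cold. */
static section_kind classify_function(bool user_hot, bool profile_hot,
                                      bool user_cold, bool profile_cold) {
    if (user_hot || profile_hot)
        return SECTION_HOT;
    if (user_cold || profile_cold)
        return SECTION_COLD;
    return SECTION_NONE;
}
```

Because the hot check runs first, a user `hot` attribute wins over a cold profile, and a hot profile wins over a user `cold` attribute, matching points (2) and (3) in the list of changes.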
Tom Stellard	3203143f13	CodeGen: Improve generated IR for __builtin_mul_overflow(uint, uint, int) Add a special case for handling __builtin_mul_overflow with unsigned inputs and a signed output to avoid emitting the __muloti4 library call on x86_64. __muloti4 is not implemented in libgcc, so avoiding this call fixes compilation of some programs that call __builtin_mul_overflow with these arguments. For example, this fixes the build of cpio with clang, which includes code from gnulib that calls __builtin_mul_overflow with these argument types. Reviewed By: vsk Differential Revision: https://reviews.llvm.org/D84405	2020-12-17 14:30:31 -08:00
Baptiste Saleil	c2892978e9	[PowerPC] Rename the vector pair intrinsics and builtins to replace the _mma_ prefix by _vsx_ On PPC, the vector pair instructions are independent from MMA. This patch renames the vector pair LLVM intrinsics and Clang builtins to replace the _mma_ prefix by _vsx_ in their names. We also move the vector pair type/intrinsic/builtin tests to their own files. Differential Revision: https://reviews.llvm.org/D91974	2020-12-17 13:19:27 -05:00
Tomas Matheson	f500662924	Detect section type conflicts between functions and variables If two variables are declared with __attribute__((section(name))) and the implicit section types (e.g. read only vs writeable) conflict, an error is raised. Extend this mechanism so that an error is raised if the section type implied by a function's __attribute__((section)) conflicts with that of another variable.	2020-12-17 11:43:47 -05:00
Zequan Wu	fb0f728805	[Clang] Make nomerge attribute a function attribute as well as a statement attribute. Differential Revision: https://reviews.llvm.org/D92800	2020-12-17 07:45:38 -08:00
Thomas Preud'homme	150fe05db4	[Test] Fix undef var in catch-undef-behavior.c Commit `9e52c43090` removed the directive defining LINE_1600 but left a string substitution to that variable in a CHECK-NOT directive. This will make that CHECK-NOT directive always fail to match, no matter the string. This commit follows the pattern done in `9e52c43090` of simplifying the CHECK-NOT to only look for the function name and the opening parenthesis, thereby not requiring the LINE_1600 variable. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D93350	2020-12-16 22:39:41 +00:00
Joe Ellis	dad07baf12	[clang][AArch64][SVE] Avoid going through memory for VLAT <-> VLST casts This change makes use of the llvm.vector.extract intrinsic to avoid going through memory when performing bitcasts between vector-length agnostic types and vector-length specific types. Depends on D91362 Reviewed By: c-rhodes Differential Revision: https://reviews.llvm.org/D92761	2020-12-16 12:24:32 +00:00
Qiu Chaofan	f141d1afc5	[NFC] Pre-commit test for long-double builtins This test reflects clang behavior on long-double type math library builtins under default or explicit 128-bit long-double options.	2020-12-16 17:19:54 +08:00
Johannes Doerfert	b9c77542e2	[Clang][Attr] Introduce the `assume` function attribute The `assume` attribute is a way to provide additional, arbitrary information to the optimizer. For now, assumptions are restricted to strings which will be accumulated for a function and emitted as comma separated string function attribute. The key of the LLVM-IR function attribute is `llvm.assume`. Similar to `llvm.assume` and `__builtin_assume`, the `assume` attribute provides a user defined assumption to the compiler. A follow up patch will introduce an LLVM-core API to query the assumptions attached to a function. We also expect to add more options, e.g., expression arguments, to the `assume` attribute later on. The `omp [begin] assumes` pragma will leverage this attribute and expose the functionality in the absence of OpenMP. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D91979	2020-12-15 16:51:34 -06:00
Kevin P. Neal	2ec5973fdd	Revert "[FPEnv] Teach the IRBuilder about invoke's correct use of the strictfp attribute." The test is busted on some hosts that aren't the one I'm using. This reverts commit `67a1ffd88a`.	2020-12-15 12:58:47 -05:00
Kevin P. Neal	67a1ffd88a	[FPEnv] Teach the IRBuilder about invoke's correct use of the strictfp attribute. Similar to D69312, and documented in D69839, the IRBuilder needs to add the strictfp attribute to invoke instructions when constrained floating point is enabled. Differential Revision: https://reviews.llvm.org/D93134	2020-12-15 12:38:10 -05:00
Joe Ellis	5a2a8369e8	[AArch64][NEON] Remove undocumented vceqz{,q}_p16, vml{a,s}q_n_f64 intrinsics Prior to this patch, Clang supported the following C/C++ intrinsics: vceqz_p16 vceqzq_p16 vmlaq_n_f64 vmlsq_n_f64 ... exposed through arm_neon.h. However, these intrinsics are not part of the ACLE, allowing developers to write code that is not compatible with other toolchains. This patch removes these intrinsics. There is a bug report capturing this issue here: https://bugs.llvm.org/show_bug.cgi?id=47471 Reviewed By: bsmith Differential Revision: https://reviews.llvm.org/D93206	2020-12-15 17:19:16 +00:00
Jan Svoboda	56c5548d7f	[clang][cli] Squash multiple cc1 -fxxx-exceptions flags into single -exception-model=xxx option This patch enables marshalling of the exception model options while enforcing their mutual exclusivity. The clang driver interface remains the same, this only affects the cc1 command line. Depends on D93215. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D93216	2020-12-15 10:15:58 +01:00
Gulfem Savrun Yeniceri	7c0e3a77bc	[clang][IR] Add support for leaf attribute This patch adds support for leaf attribute as an optimization hint in Clang/LLVM. Differential Revision: https://reviews.llvm.org/D90275	2020-12-14 14:48:17 -08:00
Philip Reames	3b3eb7f07f	Speculative fix for build bot failures (The clang build fails for me locally, so this is based on build bot output and a guess as to root cause.) `f5fe849` made the execution of LAA conditional, so I'm guessing that's the root cause.	2020-12-14 13:44:40 -08:00
Matt Arsenault	ef4da3c2ba	clang: Add byval on x86_intrcc parameter 0 This will allow removing the special case treatment of the parameter and avoid depending on the pointer's element type.	2020-12-14 16:34:37 -05:00
Simon Pilgrim	4855a1004d	[X86] Convert fadd/fmul _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506) Followup to D87604, having confirmed on PR47506 that we can use the llvm codegen expansion for fadd/fmul as well. Differential Revision: https://reviews.llvm.org/D92940	2020-12-13 15:37:35 +00:00
Alexey Bader	a500a43587	[CodeGen][AMDGPU] Fix ICE for static initializer IR generation Differential Revision: https://reviews.llvm.org/D92782	2020-12-12 23:26:54 +03:00
Nico Weber	a5c65de295	mac/arm: XFAIL the last 3 failing tests We should fix them, but let's XFAIL them for now so that we can start running check-clang on bots and lock in the passing tests. Part of 46644.	2020-12-12 15:09:17 -05:00
Melanie Blower	320af6b138	Create SPIRABIInfo to enable SPIR_FUNC calling convention. Background: Calls to library arithmetic functions for div are emitted by the compiler with the wrong “C” calling convention, whereas the library functions are declared with the `spir_function` calling convention. The InstCombine optimization replaces such calls with an “unreachable” instruction. It looks like clang lacks a SPIRABIInfo class, which should specify the default calling convention for “system” function calls. SPIR supports only the SPIR_FUNC and SPIR_KERNEL calling conventions. Reviewers: Erich Keane, Anastasia Differential Revision: https://reviews.llvm.org/D92721	2020-12-12 05:48:20 -08:00
Marco Elver	c28b18af19	[KernelAddressSanitizer] Fix globals exclusion for indirect aliases GlobalAlias::getAliasee() may not always point directly to a GlobalVariable. In such cases, try to find the canonical GlobalVariable that the alias refers to. Link: https://github.com/ClangBuiltLinux/linux/issues/1208 Reviewed By: dvyukov, nickdesaulniers Differential Revision: https://reviews.llvm.org/D92846	2020-12-11 12:20:40 +01:00
Florian Hahn	9c4cddb53a	[Clang] Add vcmla and rotated variants for Arm ACLE. This patch adds vcmla and the rotated variants as defined in "Arm Neon Intrinsics Reference for ACLE Q3 2020" [1] The _lane_ are still missing, but they can be added separately. This patch only adds the builtin mapping for AArch64. [1] https://developer.arm.com/documentation/ihi0073/latest Reviewed By: t.p.northover Differential Revision: https://reviews.llvm.org/D92930	2020-12-10 16:54:08 +00:00
Luo, Yuanke	f80b29878b	[X86] AMX programming model. This patch implements the amx programming model that was discussed in llvm-dev (http://lists.llvm.org/pipermail/llvm-dev/2020-August/144302.html). Thanks to Hal for the good suggestion in the RA. The fast RA is not in the patch yet. This patch implemented 7 components. 1. The c interface to end user. 2. The AMX intrinsics in LLVM IR. 3. Transform load/store <256 x i32> to AMX intrinsics or split the type into two <128 x i32>. 4. The Lowering from AMX intrinsics to AMX pseudo instruction. 5. Insert pseudo ldtilecfg and build the def-use between ldtilecfg to amx instruction. 6. The register allocation for tile register. 7. Morph AMX pseudo instruction to AMX real instruction. Change-Id: I935e1080916ffcb72af54c2c83faa8b2e97d5cb0 Differential Revision: https://reviews.llvm.org/D87981	2020-12-10 17:01:54 +08:00
Yuanfang Chen	fc3942526f	[NFCI] Add a missing triple in clang/test/CodeGen/ppc64le-varargs-f128.c	2020-12-09 18:17:34 -08:00
Kevin P. Neal	acd4950d4f	[FPEnv] Correct constrained metadata in fp16-ops-strict.c This test shows we're in some cases not getting strictfp information from the AST. Correct that. Differential Revision: https://reviews.llvm.org/D92596	2020-12-08 10:18:32 -05:00
Tim Northover	c5978f42ec	UBSAN: emit distinctive traps Sometimes people get minimal crash reports after a UBSAN incident. This change tags each trap with an integer representing the kind of failure encountered, which can aid in tracking down the root cause of the problem.	2020-12-08 10:28:26 +00:00
Luís Marques	3af354e863	[Clang][CodeGen][RISCV] Fix hard float ABI for struct with empty struct and complex Fixes bug 44904. Differential Revision: https://reviews.llvm.org/D91278	2020-12-08 09:19:05 +00:00
Luís Marques	fa8f5bfa4e	[Clang][CodeGen][RISCV] Fix hard float ABI test cases with empty struct The code seemed not to account for the field 1 offset. Differential Revision: https://reviews.llvm.org/D91270	2020-12-08 09:19:05 +00:00
Luís Marques	ca93f9abdc	[Clang][CodeGen][RISCV] Add hard float ABI tests with empty struct This patch adds tests that showcase a behavior that is currently buggy. Fix in a follow-up patch. Differential Revision: https://reviews.llvm.org/D91269	2020-12-08 09:19:05 +00:00
Qiu Chaofan	5e85a2ba16	[PowerPC] Implement intrinsic for DARN instruction Instruction darn was introduced in ISA 3.0. It means 'Deliver A Random Number'. The immediate number L means: - L=0, the number is 32-bit (higher 32-bits are all-zero) - L=1, the number is 'conditioned' (processed by hardware to reduce bias) - L=2, the number is not conditioned, directly from noise source GCC implements them in three separate intrinsics: __builtin_darn, __builtin_darn_32 and __builtin_darn_raw. This patch implements the same intrinsics. And this change also addresses Bugzilla PR39800. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D92465	2020-12-08 14:08:52 +08:00
Jennifer Yu	f8d5b49c78	Fix missing error for use of 128-bit integer inside SPIR64 device code. Emitting an error for the use of a 128-bit integer inside device code was already implemented in https://reviews.llvm.org/D74387. However, the error is not emitted for SPIR64, because for SPIR64, hasInt128Type returns true. hasInt128Type is also used to control generation of certain 128-bit predefined macros, to initialize predefined 128-bit integer types, and to build 128-bit ArithmeticTypes. Except for predefined macros, only the device target is considered; since the error is only emitted when a 128-bit integer is used inside device code, the host target (auxtarget) also needs to be considered. The change addresses: 1. (SPIR.h) Correct hasInt128Type() for SPIR targets. 2. Sema.cpp and SemaOverload.cpp: Add an additional check to consider the host target (auxtarget) when calling hasInt128Type, so that __int128_t and __int128() are allowed and do not cause an error when they are used outside device code. 3. SemaType.cpp: Add a check for SYCLIsDevice to delay the error message. The error will be emitted if a 128-bit integer is used in device code. Reviewed By: Johannes Doerfert and Aaron Ballman Differential Revision: https://reviews.llvm.org/D92439	2020-12-07 10:42:32 -08:00
Jinsong Ji	b49b8f096c	[PowerPC][Clang] Remove QPX support Clean up QPX code in clang missed in https://reviews.llvm.org/D83915 Reviewed By: #powerpc, steven.zhang Differential Revision: https://reviews.llvm.org/D92329	2020-12-07 10:15:39 -05:00
Benjamin Kramer	2a136a7a9c	[X86] Autodetect znver3	2020-12-05 19:08:20 +01:00
Fangrui Song	dec1bbb47c	Fix -allow-deprecated-dag-overlap in test/CodeGen/dso-local-executable.c	2020-12-03 21:24:38 -08:00
Qiu Chaofan	9378a366b2	[NFC] [Clang] Fix ppc64le vaarg OpenMP test in CodeGen Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D92544	2020-12-04 11:29:55 +08:00
Qiu Chaofan	222da77a82	[NFC] [Clang] Move ppc64le f128 vaargs OpenMP test This case for long-double semantics mismatch on OpenMP references %clang, which should be located in Driver directory.	2020-12-03 10:50:42 +08:00
Qiu Chaofan	3fca6a7844	[Clang] Don't adjust align for IBM extended double Commit `6b1341eb` fixed alignment for 128-bit FP types on PowerPC. However, the quadword alignment adjustment shouldn't be applied to IBM extended double (ppc_fp128 in IR) values. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D92278	2020-12-02 17:02:26 +08:00
Mircea Trofin	5fe10263ab	[llvm][inliner] Reuse the inliner pass to implement 'always inliner' Enable performing mandatory inlinings upfront, by reusing the same logic as the full inliner, instead of the AlwaysInliner. This has the following benefits: - reduce code duplication - one inliner codebase - open the opportunity to help the full inliner by performing additional function passes after the mandatory inlinings, but before the full inliner. Performing the mandatory inlinings first simplifies the problem the full inliner needs to solve: less call sites, more contextualization, and, depending on the additional function optimization passes run between the 2 inliners, higher accuracy of cost models / decision policies. Note that this patch does not yet enable much in terms of post-always inline function optimization. Differential Revision: https://reviews.llvm.org/D91567	2020-11-30 12:03:39 -08:00
Hongtao Yu	c083fededf	[CSSPGO] A Clang switch -fpseudo-probe-for-profiling for pseudo-probe instrumentation. This change introduces a new clang switch `-fpseudo-probe-for-profiling` to enable AutoFDO with pseudo instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story. One implication from pseudo-probe instrumentation is that the profile is now sensitive to CFG changes. We perform the pseudo instrumentation very early in the pre-LTO pipeline, before any CFG transformation. This ensures that the CFG instrumented and annotated is stable and optimization-resilient. The early instrumentation also allows the inliner to duplicate probes for inlined instances. When a probe along with the other instructions of a callee function are inlined into its caller function, the GUID of the callee function goes with the probe. This allows samples collected on inlined probes to be reported for the original callee function. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D86502	2020-11-30 10:16:54 -08:00
Kevin P. Neal	abfbc5579b	[FPEnv] clang should get from the AST the metadata for constrained FP builtins Currently clang is not correctly retrieving from the AST the metadata for constrained FP builtins. This patch fixes that for the non-target specific builtins. Differential Revision: https://reviews.llvm.org/D92122	2020-11-30 11:59:37 -05:00
Kazushi (Jam) Marukawa	33eac0f283	[VE] Specify vector alignments Specify alignments for all vector types. Update a regression test also. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92256	2020-11-30 22:09:21 +09:00
Zarko Todorovski	ff8e8c1b14	[AIX] Enabling vector type arguments and return for AIX This patch enables vector type arguments on AIX. All non-aggregate Altivec vector types are 16 bytes in size and are 16-byte aligned. Reviewed By: Xiangling_L Differential Revision: https://reviews.llvm.org/D92117	2020-11-27 09:55:52 -05:00
Reid Kleckner	1e843a987d	[MS] Add more 128bit cmpxchg intrinsics for AArch64 The MSVC STL requires this on ARM64. Requested in https://llvm.org/pr47099 Depends on D92061 Differential Revision: https://reviews.llvm.org/D92062	2020-11-25 12:07:28 -08:00
Reid Kleckner	3bd0672726	[MS] Fix double evaluation of MSVC builtin arguments This code got quite twisted because we consider some MSVC builtins to be target agnostic, and some to be target specific. Target specific intrinsics have a pattern of doing up-front argument evaluation, while general intrinsics do not evaluate their arguments up front. As we tried to share codepaths between the target-specific and target-agnostic handling, we ended up doing double evaluation. Instead, have each target handle MSVC intrinsics consistently before up front argument evaluation. This requires passing less data around and is more consistent with target independent intrinsic handling. See D50979 for past examples of this bug. I noticed this while looking into adding some more intrinsics. Differential Revision: https://reviews.llvm.org/D92061	2020-11-25 11:55:01 -08:00
Francesco Petrogalli	e592dde688	[clang][SVE] Activate macro `__ARM_FEATURE_SVE_VECTOR_OPERATORS`. The macro is emitted when targeting SVE code generation with the additional command line option `-msve-vector-bits=<N>`. The behavior implied by the macro is described in sections "3.7.3.3. Behavior specific to SVE vectors" of the SVE ACLE (Version 00bet6) that can be found at https://developer.arm.com/documentation/100987/latest Reviewed By: rengolin, rsandifo-arm Differential Revision: https://reviews.llvm.org/D90956	2020-11-25 10:16:43 +00:00
Zarko Todorovski	c92f29b05e	[AIX] Add mabi=vec-extabi options to enable the AIX extended and default vector ABIs. Added support for the options mabi=vec-extabi and mabi=vec-default which are analogous to qvecnvol and qnovecnvol when using XL on AIX. The extended Altivec ABI on AIX is enabled using mabi=vec-extabi in clang and vec-extabi in llc. Reviewed By: Xiangling_L, DiggerLin Differential Revision: https://reviews.llvm.org/D89684	2020-11-24 18:17:53 -05:00
Teresa Johnson	6e4c1cf293	[ThinLTO/WPD] Enable -wholeprogramdevirt-skip in ThinLTO backends Previously this option could be used to skip devirtualizations of the given functions in regular LTO and in the ThinLTO indexing step. This change allows them to be skipped in the backend as well, which is useful when debugging WPD in a distributed ThinLTO backend. Differential Revision: https://reviews.llvm.org/D91812	2020-11-24 09:35:07 -08:00
Hubert Tong	44174b3d51	[NFC][tests] Replace non-portable grep with FileCheck After commit `2482648a79`, a GNU grep option is just passed unconditionally to `grep` in general. This patch fixes the test for platforms where `grep` is not GNU grep.	2020-11-24 12:15:07 -05:00
Craig Topper	b3f1b19c9c	[AArch64] Update clang CodeGen tests I missed in `4252f7773a`. These tests invoke opt and llc even though they are in the frontend. We now do a better job of generating commuted patterns for fma so these tests now form fmls instead of fmla+fneg.	2020-11-23 11:10:27 -08:00
Haojian Wu	b1444edbf4	[AST] Build recovery expression by default for all languages. The dependency mechanism for C has been implemented, and we have rolled this out to all internal users without seeing crash issues, so we consider it stable enough. Differential Revision: https://reviews.llvm.org/D89046	2020-11-23 11:08:28 +01:00
Mircea Trofin	2482648a79	thinlto_embed_bitcode.ll: clarify grep should treat input as text The input to the test's use of grep should be treated as text, and that's not the case on certain Linux distros. Added --text.	2020-11-21 21:46:53 -08:00
Alex Richardson	51e09e1d5a	[AMDGPU] Set the default globals address space to 1 This will ensure that passes that add new global variables will create them in address space 1 once the passes have been updated to no longer default to the implicit address space zero. This also changes AutoUpgrade.cpp to add -G1 to the DataLayout if it wasn't already present, to ensure bitcode backwards compatibility. Reviewed by: arsenm Differential Revision: https://reviews.llvm.org/D84345	2020-11-20 15:46:53 +00:00
Simon Pilgrim	822c5c5084	[clang][CodeGen] Move WebAssembly specific tests to WebAssembly subtarget folder Minor cleanup to move more target specific tests out of the root codegen test folder	2020-11-20 12:03:28 +00:00
Simon Pilgrim	2f1fe9a3a6	[clang][CodeGen] Move riscv specific tests to RISCV subtarget folder Minor cleanup to move more target specific tests out of the root codegen test folder	2020-11-20 12:03:28 +00:00
Liu, Chen3	776f92e067	[X86] Add support for vex, vex2, vex3, and evex for MASM For MASM syntax, the prefixes are not enclosed in braces. The assembly code should look like: "evex vcvtps2pd xmm0, xmm1" Differential Revision: https://reviews.llvm.org/D90441	2020-11-20 16:20:19 +08:00
Xiangling Liao	17497ec514	[AIX][FE] Support constructor/destructor attribute Support attribute((constructor)) and attribute((destructor)) on AIX Differential Revision: https://reviews.llvm.org/D90892	2020-11-19 09:24:01 -05:00
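As a reminder of what these attributes do, here is a minimal C sketch in the GCC/Clang attribute syntax. The function and variable names are hypothetical, and nothing in it is specific to AIX; it only illustrates the general semantics the commit enables there.

```c
/* Counts constructor runs; expected to be 1 by the time main() starts. */
static int init_count = 0;

/* Runs before main() on toolchains that support the attribute. */
__attribute__((constructor)) static void on_program_start(void) {
    init_count++;
}

/* Runs after main() returns or exit() is called. */
__attribute__((destructor)) static void on_program_exit(void) {
    init_count = 0;
}

/* Accessor so callers can observe that initialization happened. */
static int get_init_count(void) {
    return init_count;
}
```

Calling `get_init_count()` from `main` should observe the value 1, since the constructor has already run by then.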
Qiu Chaofan	6b1341eb5b	[PowerPC] [Clang] Fix alignment of 128-bit float types According to ELF v2 ABI, both IEEE 128-bit and IBM extended floating point variables should be quad-word (16 bytes) aligned. Previously, only vector types are considered aligned as quad-word on PowerPC. This patch will fix incorrectness of IEEE 128-bit float argument in va_arg cases. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D91596	2020-11-19 14:22:14 +08:00
Arthur Eubanks	67f16e9e91	[NPM] Remove -enable-npm-optnone flag It has been on by default for a couple months without complaint. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D91743	2020-11-18 15:49:16 -08:00
Abhina Sreeskantharajan	057e6bb554	[SystemZ][NFC] Group SystemZ tests in SystemZ folder This patch creates a SystemZ folder in clang/test/CodeGen to contain systemz-related lit tests. Reviewed By: muiez Differential Revision: https://reviews.llvm.org/D91628	2020-11-18 11:49:15 -05:00
Florian Hahn	680931af27	[Matrix] Adjust matrix pointer type for inline asm arguments. Matrix types in memory are represented as arrays, but accessed through vector pointers, with the alignment specified on the access operation. For inline assembly, update pointer arguments to use vector pointers. Otherwise there will be a mis-match if the matrix is also an input-argument which is represented as vector. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D91631	2020-11-18 11:44:11 +00:00
Nick Desaulniers	f4c6080ab8	Revert "[IR] add fn attr for no_stack_protector; prevent inlining on mismatch" This reverts commit `b7926ce6d7`. Going with a simpler approach.	2020-11-17 17:27:14 -08:00
Florian Hahn	46846ac45b	[Matrix] Add inline assembly test case. This patch adds a new test case which uses a matrix value as memory inline assembly argument. Currently the pointer element type does not match the vector type.	2020-11-17 15:13:16 +00:00
CJ Johnson	69cd776e1e	[CodeGen] Apply 'nonnull' and 'dereferenceable(N)' to 'this' pointer arguments. * Adds 'nonnull' and 'dereferenceable(N)' to 'this' pointer arguments * Gates 'nonnull' on -f(no-)delete-null-pointer-checks * Introduces this-nonnull.cpp and microsoft-abi-this-nullable.cpp tests to explicitly test the behavior of this change * Refactors hundreds of over-constrained clang tests to permit these attributes, where needed * Updates Clang12 patch notes mentioning this change Reviewed-by: rsmith, jdoerfert Differential Revision: https://reviews.llvm.org/D17993	2020-11-16 17:39:17 -08:00
Yonghong Song	4369223ea7	BPF: make __builtin_btf_type_id() return 64bit int Linux kernel recently added support for kernel modules https://lore.kernel.org/bpf/20201110011932.3201430-5-andrii@kernel.org/ In such cases, a type id in the kernel needs to be presented as (btf id for modules, btf type id for this module). Change __builtin_btf_type_id() to return 64bit value so libbpf can do the above encoding. Differential Revision: https://reviews.llvm.org/D91489	2020-11-16 07:08:41 -08:00
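The pair "(btf id for modules, btf type id for this module)" can be carried in one 64-bit value. The sketch below shows one way to pack and unpack such a pair. The bit layout (module BTF id in the upper 32 bits, type id in the lower 32 bits) and all function names are assumptions made for illustration; the commit message does not specify the encoding.

```c
#include <stdint.h>

/* Assumed layout for illustration only:
 * upper 32 bits = BTF id of the module, lower 32 bits = BTF type id. */
static uint64_t pack_btf_type_id(uint32_t module_btf_id, uint32_t type_id) {
    return ((uint64_t)module_btf_id << 32) | (uint64_t)type_id;
}

/* Extracts the (assumed) module BTF id from the packed value. */
static uint32_t unpack_module_btf_id(uint64_t packed) {
    return (uint32_t)(packed >> 32);
}

/* Extracts the (assumed) per-module type id from the packed value. */
static uint32_t unpack_type_id(uint64_t packed) {
    return (uint32_t)(packed & 0xFFFFFFFFu);
}
```

A 32-bit return value would leave no room for the module identifier, which is why widening the builtin's result to 64 bits makes this kind of encoding possible.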
Florian Hahn	8dbe44cb29	Add pass to add !annotate metadata from @llvm.global.annotations. This patch adds a new pass to add !annotation metadata for entries in @llvm.global.annotations, which is generated using __attribute__((annotate("_name"))) on functions in Clang. This has been discussed on llvm-dev as part of RFC: Combining Annotation Metadata and Remarks http://lists.llvm.org/pipermail/llvm-dev/2020-November/146393.html Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D91195	2020-11-16 14:57:11 +00:00
Roman Lebedev	6861d938e5	Revert "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM" See discussion in https://bugs.llvm.org/show_bug.cgi?id=45073 / https://reviews.llvm.org/D66324#2334485 the implementation is known-broken for certain inputs, the bugreport was up for a significant amount of time, and there has been no activity to address it. Therefore, just completely rip out all of misexpect handling. I suspect, fixing it requires redesigning the internals of MD_misexpect. Should anyone commit to fixing the implementation problem, starting from clean slate may be better anyways. This reverts commit `7bdad08429`, and some of its follow-ups, that don't stand on their own.	2020-11-14 13:12:38 +03:00
Arthur Eubanks	52f05fb2c2	[MemProf][NewPM] Make memprof passes required Just like other sanitizers. Fixes check-memprof under NPM. Reviewed By: leonardchan Differential Revision: https://reviews.llvm.org/D91389	2020-11-13 15:15:27 -08:00
Arthur Eubanks	6e098189db	[DFSan][NewPM] Handle dfsan under NPM Make it required. Since it's a module pass, optnone won't test it, so extend the clang test to also use opt-bisect now that it's supported. 14/16 check-dfsan tests failed with NPM enabled, now all pass. Reviewed By: leonardchan Differential Revision: https://reviews.llvm.org/D91385	2020-11-13 13:41:38 -08:00
Heejin Ahn	902ea588ea	[WebAssembly] Rename atomic.notify and *.atomic.wait - atomic.notify -> memory.atomic.notify - i32.atomic.wait -> memory.atomic.wait32 - i64.atomic.wait -> memory.atomic.wait64 See https://github.com/WebAssembly/threads/pull/149. Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D91447	2020-11-13 12:04:48 -08:00
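For code that still carries the old spellings, the rename can be captured as a small lookup table. The sketch below copies the three name pairs from the commit message; the struct, table, and function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* One old/new instruction name pair from the commit message. */
struct wasm_rename {
    const char *old_name;
    const char *new_name;
};

static const struct wasm_rename WASM_ATOMIC_RENAMES[] = {
    {"atomic.notify",   "memory.atomic.notify"},
    {"i32.atomic.wait", "memory.atomic.wait32"},
    {"i64.atomic.wait", "memory.atomic.wait64"},
};

/* Hypothetical helper: returns the new name for a renamed instruction,
 * or NULL if `old_name` is not one of the three renamed instructions. */
static const char *renamed_wasm_atomic(const char *old_name) {
    size_t count = sizeof(WASM_ATOMIC_RENAMES) / sizeof(WASM_ATOMIC_RENAMES[0]);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(old_name, WASM_ATOMIC_RENAMES[i].old_name) == 0)
            return WASM_ATOMIC_RENAMES[i].new_name;
    }
    return NULL;
}
```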
Baptiste Saleil	3f78605a8c	[PowerPC] Add paired vector load and store builtins and intrinsics This patch adds the Clang builtins and LLVM intrinsics to load and store vector pairs. Differential Revision: https://reviews.llvm.org/D90799	2020-11-13 12:35:10 -06:00
Qiu Chaofan	2abc33683b	[PowerPC] [Clang] Define macros to identify quad-fp semantics We have option -mabi=ieeelongdouble to set current long double to IEEEquad semantics. Like what GCC does, we need to define __LONG_DOUBLE_IEEE128__ macro in this case, and __LONG_DOUBLE_IBM128__ if using PPCDoubleDouble. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D90208	2020-11-12 10:26:13 +08:00
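User code could branch on the two macros named in this commit roughly as follows. This is a sketch: the function name and the returned strings are hypothetical, and on targets that define neither macro the function simply falls through to the last branch.

```c
/* Reports which long double semantics the compiler advertises through the
 * macros named in the commit message. */
static const char *long_double_semantics(void) {
#if defined(__LONG_DOUBLE_IEEE128__)
    return "ieee128";   /* e.g. -mabi=ieeelongdouble on PowerPC */
#elif defined(__LONG_DOUBLE_IBM128__)
    return "ibm128";    /* PPCDoubleDouble (IBM extended double) */
#else
    return "other";     /* neither macro is defined on this target */
#endif
}
```

Because the selection happens in the preprocessor, the same source compiles on any target and the choice costs nothing at run time.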
Arthur Eubanks	b6ccff3d5f	[NewPM] Provide method to run all pipeline callbacks, used for -O0 Some targets may add required passes via TargetMachine::registerPassBuilderCallbacks(). We need to run those even under -O0. As an example, BPFTargetMachine adds BPFAbstractMemberAccessPass, a required pass. This also allows us to clean up BackendUtil.cpp (and out-of-tree Rust usage of the NPM) by allowing us to share added passes like coroutines and sanitizers between -O0 and other optimization levels. Since callbacks may end up not adding passes, we need to check if the pass managers are empty before adding them, so PassManager now has an isEmpty() function. For example, polly adds callbacks but doesn't always add passes in those callbacks, so this is necessary to keep -debug-pass-manager tests' output from changing depending on if polly is enabled or not. Tests are a continuation of those added in https://reviews.llvm.org/D89083. Reviewed By: asbirlea, Meinersbur Differential Revision: https://reviews.llvm.org/D89158	2020-11-11 15:10:27 -08:00
Kazushi (Jam) Marukawa	6e0ae20f3b	[VE] Support vector register in inline asm Support a vector register constraint in inline asm of clang. Add a regression test also. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D91251	2020-11-12 06:18:35 +09:00
Simon Pilgrim	3e5533bafd	[CodeGen] Remove unused check prefixes	2020-11-11 14:57:38 +00:00
Simon Pilgrim	8cb97fb9c9	[CodeGen] Fix check prefix mismatch on neon-immediate-ubsan.c tests Noticed while fixing unused prefix warnings.	2020-11-11 14:57:37 +00:00
Akira Hatanaka	d9258a21f0	Fix the data layout mangling specification for 'arm64-pc-win32-macho' rdar://problem/70410504	2020-11-10 18:52:12 -08:00
Qiu Chaofan	979a4d268a	[PowerPC] [Clang] Port SSE4.1-compatible insert intrinsics This patch adds three intrinsics compatible to x86's SSE 4.1 on PowerPC target, with tests: - _mm_insert_epi8 - _mm_insert_epi32 - _mm_insert_epi64 The intrinsics implementation is contributed by Paul Clarke. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D89242	2020-11-10 10:52:13 +08:00
Fangrui Song	e625f9c5d1	-fbasic-block-sections=list=: Suppress output if failed to open the file Reviewed By: tmsriram Differential Revision: https://reviews.llvm.org/D90815	2020-11-09 09:26:37 -08:00
Atmn Patel	fd3cad7a60	[clang] Fix ForStmt mustprogress handling D86841 had an error where for statements with no conditional were required to make progress. This is not true, this patch removes that line, and adds regression tests. Differential Revision: https://reviews.llvm.org/D91075	2020-11-09 11:38:06 -05:00
Arthur Eubanks	226e179f74	Revert "[NewPM] Provide method to run all pipeline callbacks, used for -O0" This reverts commit `ae38540042`. As well as some follow-up test fixes. The original change causes new-pass-manager.ll to fail when polly is enabled.	2020-11-08 00:32:35 -08:00
Atmn Patel	d3e75d31e3	Revert "[CodeGen] Fixes sanitizer test" This reverts commit `b1878b4641`. This does fix the test but it means that `ac73b73c16` is not implemented correctly. Reverting for now, and will be reverting the commit that causes this to fail.	2020-11-07 00:32:12 -05:00
Atmn Patel	b1878b4641	[CodeGen] Fixes sanitizer test By turning the loop into an infinite one, the loop can't be deleted anymore so the test will continue to pass.	2020-11-06 23:53:38 -05:00
Kevin P. Neal	2069403cdf	[FPEnv] Use strictfp metadata in casting nodes The strictfp metadata was added to the casting AST nodes in D85960, but we aren't using that metadata yet. This patch adds that support. In order to avoid lots of ad-hoc passing around of the strictfp bits I updated the IRBuilder when moving from a function that has the Expr* to a function that lacks it. I believe we should switch to this pattern to keep the strictfp support from being overly invasive. For the purpose of testing that we're picking up the right metadata, I also made my tests use a pragma to make the AST's strictfp metadata not match the global strictfp metadata. This exposes issues that we need to deal with in subsequent patches, and I believe this is the right method for most all of our clang strictfp tests. Differential Revision: https://reviews.llvm.org/D88913	2020-11-06 11:56:12 -05:00
David Spickett	aecd52b97b	[Clang][AArch64] Remove unused prefix in constrained rounding test This test was added in `7f38812d5b` and all the other tests make use of the COMMONIR check. So I think this was left in by mistake for this particular test. Reviewed By: kpn Differential Revision: https://reviews.llvm.org/D90921	2020-11-06 14:13:46 +00:00
Jan Ole Hüser	d2e7dca5ca	[CodeGen] Fix Bug 47499: __unaligned extension inconsistent behaviour with C and C++ For C++, the keyword __unaligned (a Microsoft extension) had no effect on pointers. The reason for the difference between C and C++: in C, the method getAsCXXRecordDecl() returns nullptr, which guarantees that hasUnaligned() is called. In C++, it is not guaranteed that hasUnaligned() is called and evaluated. Here are some links: The Bug: https://bugs.llvm.org/show_bug.cgi?id=47499 Thread on the cfe-dev mailing list: http://lists.llvm.org/pipermail/cfe-dev/2020-September/066783.html Diff that introduced the check hasUnaligned() in getNaturalTypeAlignment(): https://reviews.llvm.org/D30166 Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D90630	2020-11-05 12:57:17 -08:00
Albion Fung	1af037f643	[PowerPC] Correct cpsgn's behaviour on PowerPC to match that of the ABI This patch fixes the reversed behaviour exhibited by cpsgn on PPC. It now matches the ABI. Differential Revision: https://reviews.llvm.org/D84962	2020-11-05 15:35:14 -05:00
Fangrui Song	c6a384df1f	[Sema] Special case -Werror-implicit-function-declaration and reject other -Werror- This is the only -Werror- form warning option GCC supports (gcc/c-family/c.opt). Fortunately no other form is used anywhere.	2020-11-05 10:25:30 -08:00
Arthur Eubanks	5fd3193c88	[test] Add 'REQUIRES: bpf-registered-target' to bpf-O0.c	2020-11-04 23:19:14 -08:00
Arthur Eubanks	ae38540042	[NewPM] Provide method to run all pipeline callbacks, used for -O0 Some targets may add required passes via TargetMachine::registerPassBuilderCallbacks(). We need to run those even under -O0. As an example, BPFTargetMachine adds BPFAbstractMemberAccessPass, a required pass. This also allows us to clean up BackendUtil.cpp (and out-of-tree Rust usage of the NPM) by allowing us to share added passes like coroutines and sanitizers between -O0 and other optimization levels. Tests are a continuation of those added in https://reviews.llvm.org/D89083. In order to prevent TargetMachines from adding unnecessary optimization passes at -O0, TargetMachine::registerPassBuilderCallbacks() will be changed to take an OptimizationLevel, but that will be done separately. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D89158	2020-11-04 22:27:16 -08:00
Atmn Patel	ac73b73c16	[clang] Add mustprogress and llvm.loop.mustprogress attribute deduction Since C++11, the C++ standard has a forward progress guarantee [intro.progress], so all such functions must have the `mustprogress` requirement. In addition, from C11 and onwards, loops without a non-zero constant conditional or no conditional are also required to make progress (C11 6.8.5p6). This patch implements these attribute deductions so they can be used by the optimization passes. Differential Revision: https://reviews.llvm.org/D86841	2020-11-04 22:03:14 -05:00
Qiu Chaofan	7faf62a80b	[Clang] Add more fp128 math library function builtins Since glibc now supports math library functions for IEEE 128-bit floating point types on some platforms (like ppc64le), we can add the clang math builtins that were missing for this type. Reviewed By: bkramer Differential Revision: https://reviews.llvm.org/D90593	2020-11-04 17:58:42 +08:00
Baptiste Saleil	daa127d77e	[PowerPC] Add MMA builtin decoding and definitions Add MMA builtin decoding. These builtins use the new PowerPC-specific types __vector_pair and __vector_quad. So to avoid pervasive changes, we use custom type descriptors and custom decoding for these builtins. We also use custom code generation to expand builtin calls with pointers to simpler intrinsic calls with non-pointer types. Differential Revision: https://reviews.llvm.org/D81748	2020-11-03 15:08:46 -06:00
Serge Pavlov	ee63acc37e	Put back the test pragma-fp-exc.cpp This test was removed in `5963e028e7` because it failed on cores where support of constrained intrinsics was limited. Now this test is enabled only on x86.	2020-11-03 13:18:40 +07:00
Ben Dunbobbin	ff2e24a741	[PS4] Support dllimport/export attributes For PS4 development we support dllimport/export annotations in source code. This patch enables the dllimport/export attributes on PS4 by adding a new function to query the triple for whether dllimport/export are used and using that function to decide whether these attributes are supported. This replaces the current method of checking if the target is Windows. This means we can drop the use of "TargetArch" in the .td file (which is an improvement as dllimport/export support isn't really a function of the architecture). I have included a simple codegen test to show that the attributes are accepted and have an effect on codegen for PS4. I have also enabled the DLLExportStaticLocal and DLLImportStaticLocal attributes, which we support downstream. However, I am unable to write a test for these attributes until other patches for PS4 dllimport/export handling land upstream. Whilst writing this patch I noticed that, as these attributes are internal, they do not need to be target specific (when these attributes are added internally in Clang the target specific checks have already been run); however, I think leaving them target specific is fine because it isn't harmful and they "really are" target specific even if that has no functional impact. Differential Revision: https://reviews.llvm.org/D90442	2020-11-02 14:25:34 +00:00
Teresa Johnson	95824be18f	[MemProf] Fix test failure on windows Fix failure in new test from 0949f96dc6521be80ebb8ebc1e1c506165c22aac: Don't match exact file path separator. Should fix: http://lab.llvm.org:8011/#/builders/119/builds/437/steps/9/logs/FAIL__Clang__memory-profile-filename_c	2020-11-01 19:06:50 -08:00
Teresa Johnson	0949f96dc6	[MemProf] Pass down memory profile name with optional path from clang Similar to -fprofile-generate=, add -fmemory-profile= which takes a directory path. This is passed down to LLVM via a new module flag metadata. LLVM in turn provides this name to the runtime via the new __memprof_profile_filename variable. Additionally, always pass a default filename (in $cwd if a directory name is not specified via the = form of the option). This is also consistent with the behavior of the PGO instrumentation. Since the memory profiles will generally be fairly large, it doesn't make sense to dump them to stderr. Also, importantly, the memory profiles will eventually be dumped in a compact binary format, which is another reason why it does not make sense to send these to stderr by default. Change the existing memprof tests to specify log_path=stderr when that was being relied on. Depends on D89086. Differential Revision: https://reviews.llvm.org/D89087	2020-11-01 17:38:23 -08:00
Serge Pavlov	5963e028e7	Temporarily remove test CodeGen/pragma-fp-exc This test fails on buildbots where CPU architecture does not fully support constrained intrinsics.	2020-10-31 19:48:44 +07:00
Serge Pavlov	6021cbea4d	Add option 'exceptions' to pragma clang fp Pragma 'clang fp' is extended to support a new option, 'exceptions'. It allows specifying floating point exception behavior more flexibly. Differential Revision: https://reviews.llvm.org/D89849	2020-10-31 17:36:12 +07:00
Arthur Eubanks	5c31b8b94f	Revert "Use uint64_t for branch weights instead of uint32_t" This reverts commit `10f2a0d662`. More uint64_t overflows.	2020-10-31 00:25:32 -07:00
Liu, Chen3	756f597841	[X86] Support Intel avxvnni This patch mainly made the following changes: 1. Support AVX-VNNI instructions; 2. Introduce ExplicitVEXPrefix flag so that vpdpbusd/vpdpbusds/vpdpwssd/vpdpwssds instructions only use vex-encoding when the user explicitly adds the {vex} prefix. Differential Revision: https://reviews.llvm.org/D89105	2020-10-31 12:39:51 +08:00
Thomas Lively	a787e09779	[WebAssembly] Prototype i64x2.bitmask As proposed in https://github.com/WebAssembly/simd/pull/368. Differential Revision: https://reviews.llvm.org/D90514	2020-10-30 17:23:30 -07:00
Thomas Lively	0a512a555a	[WebAssembly] Prototype i64x2.eq As proposed in https://github.com/WebAssembly/simd/pull/381. Since it is still in the prototyping phase, it is only accessible via a target builtin function and a target intrinsic. Depends on D90504. Differential Revision: https://reviews.llvm.org/D90508	2020-10-30 16:38:15 -07:00
Thomas Lively	1cb0b56607	[WebAssembly] Prototype i64x2.widen_{low,high}_i32x4_{s,u} As proposed in https://github.com/WebAssembly/simd/pull/290. As usual, these instructions are available only via builtin functions and intrinsics while they are in the prototyping stage. Differential Revision: https://reviews.llvm.org/D90504	2020-10-30 15:44:04 -07:00
Arthur Eubanks	10f2a0d662	Use uint64_t for branch weights instead of uint32_t CallInst::updateProfWeight() creates branch_weights with i64 instead of i32. To be more consistent everywhere and remove lots of casts from uint64_t to uint32_t, use i64 for branch_weights. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D88609	2020-10-30 10:03:46 -07:00
Simon Pilgrim	973317cc5e	[CodeGen][X86] Remove unused check-prefix in constrained fma tests	2020-10-30 16:23:08 +00:00
Simon Pilgrim	365f46efeb	[CodeGen][X86] Remove unused check-prefix in movdir tests	2020-10-30 16:23:08 +00:00
Simon Pilgrim	c44846f537	[CodeGen][X86] Cleanup + fix unused check-prefixes in bmi tests	2020-10-30 16:13:54 +00:00
Simon Pilgrim	fe3d765ac7	[CodeGen][X86] Tidyup CHECKs on bitscan tests	2020-10-30 16:13:52 +00:00
Simon Pilgrim	5cdd470504	[CodeGen][X86] Remove unused check-prefix in bitscan tests	2020-10-30 16:13:50 +00:00
Simon Pilgrim	0ff9d8c8ba	[CodeGen][X86] Remove unused check-prefix in bswap tests	2020-10-30 16:13:49 +00:00
Simon Pilgrim	d7389f05ee	[CodeGen][X86] Cleanup + remove unused check-prefixes in avx union tests	2020-10-30 16:13:47 +00:00
Simon Pilgrim	bbe055dd73	[CodeGen][X86] Remove unused check-prefix in amx inline asm tests	2020-10-30 16:13:45 +00:00
Melanie Blower	71bf9f07d5	[clang] add fexperimental-strict-floating-point to test cases that fail on arm and aarch not sure this will work due to commit rG13bfd89c4962	2020-10-30 07:30:06 -07:00
David Sherwood	cea69fa4dc	[SVE] Add fatal error for unnamed SVE variadic arguments We don't currently support passing unnamed variadic SVE arguments so I've added a fatal error if we hit such cases to prevent any silent ABI issues in future. Differential Revision: https://reviews.llvm.org/D90230	2020-10-30 13:35:47 +00:00
Liu, Chen3	00090a2b82	Support complex target features combinations This patch mainly does two things: 1. Adds support for parentheses, making the combination of target features more diverse; 2. Makes the priority of ',' higher than that of '|' by default. This requires some changes to the PTX builtin functions. Differential Revision: https://reviews.llvm.org/D89184	2020-10-30 10:32:53 +08:00
Thomas Lively	be6f50798e	[WebAssembly] Implement SIMD signselect instructions As proposed in https://github.com/WebAssembly/simd/pull/124, using the opcodes adopted by V8 in https://chromium-review.googlesource.com/c/v8/v8/+/2486235/2/src/wasm/wasm-opcodes.h. Uses new builtin functions and a new target intrinsic exclusively to ensure that the new instructions are only emitted when a user explicitly opts in to using them since they are still in the prototyping and evaluation phase. Differential Revision: https://reviews.llvm.org/D90357	2020-10-29 11:06:20 -07:00
Mircea Trofin	13aee94bc7	[ThinLTO] Fix empty .llvmcmd sections When passing -lto-embed-bitcode=post-merge-pre-opt, we were getting empty .llvmcmd sections. It turns out that is because the CodeGenOptions::CmdArgs field was only populated when clang saw -fembed-bitcode={all|marker}. This patch always populates the CodeGenOptions::CmdArgs. The overhead of carrying through in memory in all cases is likely negligible in the grand scheme of things, and it keeps the using code simple. Differential Revision: https://reviews.llvm.org/D90366	2020-10-29 09:57:42 -07:00
Serge Pavlov	08bb5d9196	[FPEnv] Tests for rounding properties of constant evaluation These are moved from D88498. Differential Revision: https://reviews.llvm.org/D90026	2020-10-29 13:53:13 +07:00
Mircea Trofin	735ab4be35	[ThinLTO] Fix .llvmcmd emission llvm::EmbedBitcodeInModule needs (what used to be called) EmbedMarker set, in order to emit .llvmcmd. EmbedMarker is really about embedding the command line, so renamed the parameter accordingly, too. This was not caught at test because the check-prefix was incorrect, but FileCheck does not report that when multiple prefixes are provided. A separate patch will address that. Differential Revision: https://reviews.llvm.org/D90278	2020-10-28 17:45:30 -07:00
Baptiste Saleil	40dd4d5233	[Clang][PowerPC] Add __vector_pair and __vector_quad types Define the __vector_pair and __vector_quad types that are used to manipulate the new accumulator registers introduced by MMA on PowerPC. Because these two types are specific to PowerPC, they are defined in a separate new file so it will be easier to add other PowerPC specific types if we need to in the future. Differential Revision: https://reviews.llvm.org/D81508	2020-10-28 13:19:20 -05:00
Thomas Lively	5b464f2aa5	[WebAssembly] Fix incorrectly named target builtin Rename __builtin_wasm_q15mulr_saturate_s_i8x16 to __builtin_wasm_q15mulr_saturate_s_i16x8, fixing the implied lane interpretation of the result.	2020-10-28 10:22:43 -07:00
Thomas Lively	31e944556f	[WebAssembly] Prototype extending multiplication SIMD instructions As proposed in https://github.com/WebAssembly/simd/pull/376. This commit implements new builtin functions and intrinsics for these instructions, but does not yet add them to wasm_simd128.h because they have not yet been merged to the proposal. These are the first instructions with opcodes greater than 0xff, so this commit updates the MC layer and disassembler to handle that correctly. Differential Revision: https://reviews.llvm.org/D90253	2020-10-28 09:38:59 -07:00
Nico Weber	2a4e704c92	Revert "Use uint64_t for branch weights instead of uint32_t" This reverts commit `e5766f25c6`. Makes clang assert when building Chromium, see https://crbug.com/1142813 for a repro.	2020-10-27 09:26:21 -04:00
Arthur Eubanks	e5766f25c6	Use uint64_t for branch weights instead of uint32_t CallInst::updateProfWeight() creates branch_weights with i64 instead of i32. To be more consistent everywhere and remove lots of casts from uint64_t to uint32_t, use i64 for branch_weights. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D88609	2020-10-26 20:24:04 -07:00
Kiran Chandramohan	c551ba0e90	Run test only if X86 target is available This fixes failures in AArch64 buildbots by running the clang/test/CodeGen/X86/att-inline-asm-prefix.c only when the X86 target is available.	2020-10-26 21:28:59 +00:00
Sriraman Tallam	ad1b9daa4b	Prepend "__uniq" to symbol names hash with -funique-internal-linkage-names. Prepend the module name hash with a fixed string ".__uniq." which helps tools that consume sampled profiles and attribute it to functions to understand that this symbol belongs to a unique internal linkage type symbol. Symbols with suffixes can result from various optimizations in the compiler. Function Multiversioning, function splitting, parameter constant propagation, unique internal linkage names. External tools like sampled profile aggregators combine profiles from multiple runs of a binary. They use various heuristics with symbols that have suffixes to try and attribute the profile to the right function instance. For instance multi-versioned symbols like foo.avx, foo.sse4.2, etc even though different should be attributed to the same source function if a single function is versioned, using attribute target_clones (supported in GCC but yet to land in LLVM). Similarly, functions that are split (split part having a .cold suffix) could have profiles for both the original and split symbols but would be aggregated and attributed to the original function that was split. Unique internal linkage functions however have different source instances and the aggregator must not put them together but attribute it to the appropriate function instance. To be sure that we are dealing with a symbol of a unique internal linkage function, we would like to prepend the hash with a known string ".__uniq." which these tools can check to understand the suffix type. Differential Revision: https://reviews.llvm.org/D89617	2020-10-26 14:24:28 -07:00
Zequan Wu	e56e7bd469	Revert "Revert "Ensure that checkInitIsICE is called exactly once for every variable"" This reverts commit `a2ac64dd90`.	2020-10-26 12:08:57 -07:00
Zequan Wu	a2ac64dd90	Revert "Ensure that checkInitIsICE is called exactly once for every variable" This is causing `Assertion Result && "Could not evaluate expression"' failed` at https://bugs.chromium.org/p/chromium/issues/detail?id=1142009 This reverts commit `76c0092665`.	2020-10-26 11:59:55 -07:00
Nick Desaulniers	c8f84bd094	[Clang][CodeGen] fix failed assertion Ensure we can emit symbol aliases via function attribute even when function signatures contain incomplete types. Via bugreport: https://reviews.llvm.org/D66492#2350947 Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D90073	2020-10-26 11:37:55 -07:00
Tyker	d3205bbca3	[Annotation] Allows annotation to carry some additional constant arguments. This allows using annotation in many more contexts than it currently supports, especially when used with templates or constexpr. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D88645	2020-10-26 10:50:05 +01:00
Liu, Chen3	180548c5c7	[X86] VEX/EVEX prefix doesn't work for inline assembly. For now, we lose the encoding information when using inline assembly. The encoding for the inline assembly stays at the default even if we add the vex/evex prefix. Differential Revision: https://reviews.llvm.org/D90009	2020-10-26 08:37:45 +08:00
Melanie Blower	2e204e2391	[clang] Enable support for #pragma STDC FENV_ACCESS Reviewers: rjmccall, rsmith, sepavloff Differential Revision: https://reviews.llvm.org/D87528	2020-10-25 06:46:25 -07:00
Benjamin Kramer	39a0d6889d	[X86] Add a stub for Intel's alderlake. No scheduling, no autodetection.	2020-10-24 19:01:22 +02:00
Nick Desaulniers	b7926ce6d7	[IR] add fn attr for no_stack_protector; prevent inlining on mismatch It's currently ambiguous in IR whether the source language explicitly did not want a stack protector (in C, via function attribute no_stack_protector) or doesn't care for any given function. It's common that code that manipulates the stack via inline assembly or that has to set up its own stack canary (such as the Linux kernel) would like to avoid stack protectors in certain functions. In this case, we've been bitten by numerous bugs where a callee with a stack protector is inlined into an __attribute__((__no_stack_protector__)) caller, which generally breaks the caller's assumptions about not having a stack protector. LTO exacerbates the issue. While developers can avoid this by putting all no_stack_protector functions in one translation unit together and compiling those with -fno-stack-protector, it's generally not very ergonomic or as ergonomic as a function attribute, and still doesn't work for LTO. See also: https://lore.kernel.org/linux-pm/20200915172658.1432732-1-rkir@google.com/ https://lore.kernel.org/lkml/20200918201436.2932360-30-samitolvanen@google.com/T/#u Typically, when inlining a callee into a caller, the caller will be upgraded in its level of stack protection (see adjustCallerSSPLevel()). By adding an explicit attribute in the IR when the function attribute is used in the source language, we can now identify such cases and prevent inlining. Block inlining when the callee and caller differ in the case that one contains `nossp` when the other has `ssp`, `sspstrong`, or `sspreq`. Fixes pr/47479. Reviewed By: void Differential Revision: https://reviews.llvm.org/D87956	2020-10-23 11:55:39 -07:00
Xiangling Liao	05bef88eb3	[AIX] Let alloca return 16 bytes alignment On AIX, to support vector types, which should always be 16 bytes aligned, we set alloca to return 16 bytes aligned memory space. Differential Revision: https://reviews.llvm.org/D89910	2020-10-23 14:41:32 -04:00
Tianqing Wang	be39a6fe6f	[X86] Add User Interrupts(UINTR) instructions For more details about these instructions, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D89301	2020-10-22 17:33:07 +08:00
Sriraman Tallam	eef2e67d23	Simple fix to basic-block-sections to replace emit-obj with emit-llvm emit-obj is unnecessary here and further wasn't redirected to /dev/null.	2020-10-21 13:52:33 -07:00
David Zarzycki	87f6de72bc	[clang testing] Fix a read-only source build system failure	2020-10-21 08:08:03 -04:00
Florian Hahn	c50f0d239d	[Clang] Update newpm pipeline test in clang after D87322. This fixes a test failure because a LLVM pipeline test file in clang/ did not get updated in `88241ffb56`.	2020-10-21 12:59:50 +01:00
Jonas Paulsson	42a82862b6	Reapply "[clang] Improve handling of physical registers in inline assembly operands." Earlyclobbers are now excepted from this change (original commit: `c78da03`). Review: Ulrich Weigand, Nick Desaulniers Differential Revision: https://reviews.llvm.org/D87279	2020-10-21 10:53:40 +02:00
Fangrui Song	829b9f6606	[test] Fix -fbasic-block-sections= test on Windows after D89500	2020-10-20 18:31:28 -07:00
Sriraman Tallam	f88785460e	Improve file doesn't exist error with -fbasic-block-sections= With -fbasic-block-sections=, let the front-end handle the case where the file doesn't exist. The driver only checks if the option syntax is right. Differential Revision: https://reviews.llvm.org/D89500	2020-10-20 16:41:56 -07:00
sstefan1	fbfb1c7909	[IR] Make nosync, nofree and willreturn default for intrinsics. D70365 allows us to make attributes default. This is a follow up to actually make nosync, nofree and willreturn default. The approach we chose, for now, is to opt-in to default attributes to avoid introducing problems to target specific intrinsics. Intrinsics with default attributes can be created using `DefaultAttrsIntrinsic` class.	2020-10-20 11:57:19 +02:00
Fangrui Song	545c687c4b	[gcov] Unify driver and CC1 option names for -ftest-coverage & -fprofile-arcs No need to use -femit-coverage-notes and -femit-coverage-data.	2020-10-19 22:19:00 -07:00
Richard Smith	76c0092665	Ensure that checkInitIsICE is called exactly once for every variable for which it matters. This is a step towards separating checking for a constant initializer (in which std::is_constant_evaluated returns true) and any other evaluation of a variable initializer (in which it returns false).	2020-10-19 19:04:04 -07:00
Douglas Yung	774ab60125	Add option to use older clang ABI behavior when passing certain union types as function arguments Recently commit D78699 (commit `26cfb6e562`), fixed clang's behavior with respect to passing a union type through a register to correctly follow the ABI. However, this is an ABI breaking change with earlier versions of the clang compiler, so we should add an -fclang-abi-compat option to address this. Additionally, the PS4 ABI requires the older behavior, so that is added as well. This change adds a Ver11 value to the ClangABI enum that when it is set (or the target is the PS4 triple), we skip the ABI fix introduced in D78699. Differential Revision: https://reviews.llvm.org/D89747	2020-10-19 18:17:34 -07:00
Hans Wennborg	0628bea513	Revert "[PM/CC1] Add -f[no-]split-cold-code CC1 option to toggle splitting" This broke Chromium's PGO build, it seems because hot-cold-splitting got turned on unintentionally. See comment on the code review for repro etc. > This patch adds -f[no-]split-cold-code CC1 options to clang. This allows > the splitting pass to be toggled on/off. The current method of passing > `-mllvm -hot-cold-split=true` to clang isn't ideal as it may not compose > correctly (say, with `-O0` or `-Oz`). > > To implement the -fsplit-cold-code option, an attribute is applied to > functions to indicate that they may be considered for splitting. This > removes some complexity from the old/new PM pipeline builders, and > behaves as expected when LTO is enabled. > > Co-authored by: Saleem Abdulrasool <compnerd@compnerd.org> > Differential Revision: https://reviews.llvm.org/D57265 > Reviewed By: Aditya Kumar, Vedant Kumar > Reviewers: Teresa Johnson, Aditya Kumar, Fedor Sergeev, Philip Pfaffe, Vedant Kumar This reverts commit `273c299d5d`.	2020-10-19 12:31:14 +02:00
Albion Fung	d30155feaa	[PowerPC] Implementation of 128-bit Binary Vector Rotate builtins This patch implements 128-bit Binary Vector Rotate builtins for PowerPC10. Differential Revision: https://reviews.llvm.org/D86819	2020-10-16 18:03:22 -04:00
Richard Smith	552c6c2328	PR44406: Follow behavior of array bound constant folding in more recent versions of GCC. Old GCC used to aggressively fold VLAs to constant-bound arrays at block scope in GNU mode. That's non-conforming, and more modern versions of GCC only do this at file scope. Update Clang to do the same. Also promote the warning for this from off-by-default to on-by-default in all cases; more recent versions of GCC likewise warn on this by default. This is still slightly more permissive than GCC, as pointed out in PR44406, as we still fold VLAs to constant arrays in structs, but that seems justifiable given that we don't support VLA-in-struct (and don't intend to ever support it), but GCC does. Differential Revision: https://reviews.llvm.org/D89523	2020-10-16 14:34:35 -07:00
Matt Arsenault	0a7cd99a70	Reapply "OpaquePtr: Add type to sret attribute" This reverts commit `eb9f7c28e5`. Previously this was incorrectly handling linking of the contained type, so this merges the fixes from D88973.	2020-10-16 11:05:02 -04:00
Florian Hahn	51ff04567b	Recommit "[DSE] Switch to MemorySSA-backed DSE by default." After investigation by @asbirlea, the issue that caused the revert appears to be an issue in the original source, rather than a problem with the compiler. This patch enables MemorySSA DSE again. This reverts commit `915310bf14`.	2020-10-16 09:02:53 +01:00
Vedant Kumar	273c299d5d	[PM/CC1] Add -f[no-]split-cold-code CC1 option to toggle splitting This patch adds -f[no-]split-cold-code CC1 options to clang. This allows the splitting pass to be toggled on/off. The current method of passing `-mllvm -hot-cold-split=true` to clang isn't ideal as it may not compose correctly (say, with `-O0` or `-Oz`). To implement the -fsplit-cold-code option, an attribute is applied to functions to indicate that they may be considered for splitting. This removes some complexity from the old/new PM pipeline builders, and behaves as expected when LTO is enabled. Co-authored by: Saleem Abdulrasool <compnerd@compnerd.org> Differential Revision: https://reviews.llvm.org/D57265 Reviewed By: Aditya Kumar, Vedant Kumar Reviewers: Teresa Johnson, Aditya Kumar, Fedor Sergeev, Philip Pfaffe, Vedant Kumar	2020-10-15 23:13:33 +00:00
Fangrui Song	5a338599fb	[CGBuiltin] Respect asm labels and redefine_extname for builtins with specialized emitting rL131311 added `asm()` support for builtin functions, but `asm()` for builtins with specialized emitting (e.g. memcpy, various math functions) still do not work. This patch makes these functions work for `asm()` and `#pragma redefine_extname`. glibc uses `asm()` to redirect internal libc function calls to hidden aliases. Limitation: such a function is a builtin in clang, but will not be recognized as a libcall in optimization passes because Clang does not annotate the renamed function as a libcall. In GCC -O1 or above, `abs` can be optimized out but we can't. Additionally, we cannot redirect `__builtin_sin` to `real_sin` in the following example: double sin(double x) asm("real_sin"); double f(double d) { return __builtin_sin(d); } --- According to @rsmith, the following three statements cannot be simultaneously true: (1) The frontend function foo has known, builtin semantics X. (2) The symbol foo has known, builtin semantics X. (3) It's not correct to lower a call to the frontend function foo to the symbol foo. People do want (1) (if it is profitable to expand a memcpy, do it). This also means that people do not want to add -fno-builtin-memcpy. People do want (3): that is why they use asm("__GI_memcpy") in the first place. So unfortunately we make a compromise by not refuting (2) (see the limitation above). For most libcalls, there is a small loss because compilers don't synthesize them. For the few glibc cares about, it uses `asm("memcpy = __GI_memcpy");` to make the assembly level redirection. (Changing function names (e.g. `__memcpy`) is a hit to ergonomics which is not acceptable). Reviewed By: rsmith Differential Revision: https://reviews.llvm.org/D88712	2020-10-15 15:14:38 -07:00
Thomas Lively	1992e30c2d	[WebAssembly] Prototype i8x16.popcnt As proposed at https://github.com/WebAssembly/simd/pull/379. Use a target builtin and intrinsic rather than normal codegen patterns to make the instruction opt-in until it is merged to the proposal and stabilized in engines. Differential Revision: https://reviews.llvm.org/D89446	2020-10-15 21:18:22 +00:00
Thomas Lively	3f738d1f5e	Reland "[WebAssembly] v128.load{8,16,32,64}_lane instructions" This reverts commit `7c8385a352` with a typing fix to an instruction selection pattern.	2020-10-15 19:32:34 +00:00
Thomas Lively	7c8385a352	Revert "[WebAssembly] v128.load{8,16,32,64}_lane instructions" This reverts commit `7c6bfd90ab`.	2020-10-15 15:49:36 +00:00
Thomas Lively	7c6bfd90ab	[WebAssembly] v128.load{8,16,32,64}_lane instructions Prototype the newly proposed load_lane instructions, as specified in https://github.com/WebAssembly/simd/pull/350. Since these instructions are not available to origin trial users on Chrome stable, make them opt-in by only selecting them from intrinsics rather than normal ISel patterns. Since we only need rough prototypes to measure performance right now, this commit does not implement all the load and store patterns that would be necessary to make full use of the offset immediate. However, the full suite of offset tests is included to make it easy to track improvements in the future. Since these are the first instructions to have a memarg immediate as well as an additional immediate, the disassembler needed some additional hacks to be able to parse them correctly. Making that code more principled is left as future work. Differential Revision: https://reviews.llvm.org/D89366	2020-10-15 15:33:10 +00:00
Simon Pilgrim	d7fa9030d4	[CodeGen][X86] Emit fshl/fshr ir intrinsics for shiftleft128/shiftright128 ms intrinsics Now that funnel shift handling is pretty good, we can use the intrinsics directly and avoid a lot of zext/trunc issues. https://godbolt.org/z/YqhnnM Differential Revision: https://reviews.llvm.org/D89405	2020-10-15 10:22:41 +01:00
Simon Pilgrim	b967b9a711	[CodeGen] Move x86 specific ms intrinsic tests into x86 target subfolder. NFCI.	2020-10-14 17:37:26 +01:00
Jonas Paulsson	625fa47617	Revert "[clang] Improve handling of physical registers in inline assembly operands." This reverts commit `c78da03778`. Temporarily reverted due to https://bugs.llvm.org/show_bug.cgi?id=47837.	2020-10-14 08:42:51 +02:00
Liu, Chen3	bd05afcb3f	[X86][NFC] Fix RUN line bug in the testcase The testcase added in D78699 doesn't work because of the wrong RUN line in the testcase. Differential Revision: https://reviews.llvm.org/D89361	2020-10-14 12:40:34 +08:00
Jonas Paulsson	c78da03778	[clang] Improve handling of physical registers in inline assembly operands. Change EmitAsmStmt() to - Not tie physregs with the "+r" constraint, but instead add the hard register as an input constraint. This makes "+r" and "=r":"r" look the same in the output. Background: Macro intensive user code may contain inline assembly statements with multiple operands constrained to the same physreg. Such a case (with the operand constraints "+r" : "r") currently triggers the TwoAddressInstructionPass assertion against any extra use of a tied register. Furthermore, TwoAddress will insert a COPY to that physreg even though isel has already done so (for the non-tied use), which may lead to a second redundant instruction currently. A simple fix for this is to not emit tied physreg uses in the first place for the "+r" constraint, which is what this patch does. - Give an error on multiple outputs to the same physical register. This should be reported and this is also what GCC does. Review: Ulrich Weigand, Aaron Ballman, Jennifer Yu, Craig Topper Differential Revision: https://reviews.llvm.org/D87279	2020-10-13 15:09:52 +02:00
Ties Stuij	208987844f	[ARM] Follow AAPCS standard for volatile bit-fields access width This patch resumes the work of D16586. According to the AAPCS, volatile bit-fields should be accessed using containers of the width of their declarative type. In such a case: ``` struct S1 { short a : 1; } ``` should be accessed using loads and stores of the width (sizeof(short)), whereas the compiler currently loads only the minimum required width (char in this case). However, as discussed in D16586, that could overwrite non-volatile bit-fields, which conflicted with C and C++ object models by creating data race conditions that are not part of the bit-field, e.g. ``` struct S2 { short a; int b : 16; } ``` Accessing `S2.b` would also access `S2.a`. The AAPCS Release 2020Q2 (https://documentation-service.arm.com/static/5efb7fbedbdee951c1ccf186?token=) section 8.1 Data Types, page 36, "Volatile bit-fields - preserving number and width of container accesses" has been updated to avoid conflict with the C++ Memory Model. Now it reads in the note: ``` This ABI does not place any restrictions on the access widths of bit-fields where the container overlaps with a non-bit-field member or where the container overlaps with any zero length bit-field placed between two other bit-fields. This is because the C/C++ memory model defines these as being separate memory locations, which can be accessed by two threads simultaneously. For this reason, compilers must be permitted to use a narrower memory access width (including splitting the access into multiple instructions) to avoid writing to a different memory location. For example, in struct S { int a:24; char b; }; a write to a must not also write to the location occupied by b, this requires at least two memory accesses in all current Arm architectures. In the same way, in struct S { int a:24; int:0; int b:8; };, writes to a or b must not overwrite each other. 
``` I've updated the patch D16586 to follow this behavior by verifying that we only change volatile bit-field access when: - it won't overlap with any other non-bit-field member - we only access memory inside the bounds of the record - we avoid overlapping zero-length bit-fields. The number of memory accesses should also be preserved; that will be implemented by D67399. Reviewed By: ostannard Differential Revision: https://reviews.llvm.org/D72932	2020-10-13 10:31:48 +01:00
Simon Pilgrim	6c23cbc560	[X86] Convert integer _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506) Emit the equivalent integer reduction intrinsics in IR instead of expanding to shuffle+arithmetic sequences. The fadd/fmul reductions might be trickier as they assume a similar bisection reduction while the generic intrinsics assume a sequential reduction (intel docs are ambiguous on the correct approach) - I'm not sure if we want to always tag them with reassoc? Anyway, that issue can wait until a separate fp patch along with the fmin/fmax reductions. Differential Revision: https://reviews.llvm.org/D87604	2020-10-13 09:28:39 +01:00
Wang, Pengfei	412cdcf2ed	[X86] Add HRESET instruction. For more details about these instructions, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D89102	2020-10-13 08:47:26 +08:00
Fangrui Song	012dd42e02	[X86] Support -march=x86-64-v[234] PR47686. These micro-architecture levels are defined in the x86-64 psABI: https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9 GCC 11 will support these levels. Note, -mtune=x86-64-v[234] are invalid and __builtin_cpu_is cannot be used on them. Reviewed By: craig.topper, RKSimon Differential Revision: https://reviews.llvm.org/D89197	2020-10-12 10:29:46 -07:00
Roman Lebedev	544a6aa267	[InstCombine] combineLoadToOperationType(): don't fold int<->ptr cast into load And another step towards transforms not introducing inttoptr and/or ptrtoint casts that weren't there already. As we've been establishing (see D88788/D88789), if there is an int<->ptr cast, it basically must stay as-is, we can't do much with it. I've looked, and the main source of new such casts being introduced, as far as i can tell, is this transform, which, ironically, tries to reduce the count of casts. On vanilla llvm test-suite + RawSpeed, @ `-O3`, this results in 33.58% fewer `IntToPtr`s (19014 -> 12629) and 76.20% more `PtrToInt`s (18589 -> 32753), which is an increase of +20.69% in total. However just on RawSpeed, where i know there are basically no `IntToPtr`s in the original source code, this results in 99.27% fewer `IntToPtr`s (2724 -> 20) and 82.92% more `PtrToInt`s (4513 -> 8255), which is again an increase of 14.34% in total. To me this does seem like a step in the right direction, we end up with strictly fewer `IntToPtr`s, but strictly more `PtrToInt`s, which seems like a reasonable trade-off. See https://reviews.llvm.org/D88860 / https://reviews.llvm.org/D88995 for some more discussion on the subject. (Eventually, `CastInst::isNoopCast()`/`CastInst::isEliminableCastPair` should be taught about this, yes) Reviewed By: nlopes, nikic Differential Revision: https://reviews.llvm.org/D88979	2020-10-11 20:24:28 +03:00
Thomas Lively	d8f58bf53a	[WebAssembly] Prototype i16x8.q15mulr_sat_s This saturating, rounding, Q-format multiplication instruction is proposed in https://github.com/WebAssembly/simd/pull/365. Differential Revision: https://reviews.llvm.org/D88968	2020-10-09 21:17:53 +00:00
Scott Linder	40cef5a00e	[clang] Add a test for CGDebugInfo treatment of blocks There doesn't seem to be a direct test of this, and I'm planning to make future changes which will affect it. I'm not particularly familiar with the blocks extension, so suggestions for better tests are welcome. Differential Revision: https://reviews.llvm.org/D88754	2020-10-09 19:03:21 +00:00
Liu, Chen3	26cfb6e562	[X86] Passing union type through register For example: union M256 { double d; __m256 m; }; extern void foo1(union M256 A); union M256 m1; void test() { foo1(m1); } clang will pass m1 through stack which does not follow the ABI. Differential Revision: https://reviews.llvm.org/D78699	2020-10-09 11:24:29 +08:00
Arthur Eubanks	afff74e5c2	[HWAsan][NewPM] Handle hwasan like other sanitizers Move it as an EP callback (-O[123]) or in addSanitizersAtO0. This makes it not run in ThinLTO pre-link (like the other sanitizers), so don't check LTO runs in hwasan-new-pm.c. Changing its position also seems to change the generated IR. I think we just need to make sure the pass runs. Reviewed By: leonardchan Differential Revision: https://reviews.llvm.org/D88936	2020-10-08 14:43:21 -07:00
David Green	a15bd0bfc2	[AIX] Add REQUIRES for powerpc test. NFC	2020-10-08 18:40:09 +01:00
diggerlin	92bca12843	[AIX] add new option -mignore-xcoff-visibility SUMMARY: In the IBM compiler xlclang, there is an option -fnovisibility which suppresses visibility. For more details see: https://www.ibm.com/support/knowledgecenter/SSGH3R_16.1.0/com.ibm.xlcpp161.aix.doc/compiler_ref/opt_visibility.html. We need to add the option -mignore-xcoff-visibility for compatibility with the IBM AIX OS (as the option is enabled by default in AIX). With this option, llvm does not emit any visibility attribute to ASM or XCOFF object files. The option only works on the AIX OS; on non-AIX OSes, using the option reports an unsupported option error. On AIX: 1.1 the option -mignore-xcoff-visibility is enabled by default if neither -fvisibility=* nor -mignore-xcoff-visibility is given explicitly in the clang command. 1.2 if -fvisibility=* is given explicitly but -mignore-xcoff-visibility is not, visibility attributes are generated. 1.3 if both -fvisibility=* and -mignore-xcoff-visibility are given explicitly in the clang command, the option "-mignore-xcoff-visibility" wins and the visibility attribute is not emitted. The option -mignore-xcoff-visibility has no effect on visibility attributes when compiling with the -emit-llvm option to generate LLVM IR. Reviewer: daltenty,Jason Liu Differential Revision: https://reviews.llvm.org/D87451	2020-10-08 09:34:58 -04:00
Simon Pilgrim	42d91438ad	[CodeGen][X86] Cleanup labels on some sse/avx intrinsics tests. NFCI. Add some missing CHECK-LABEL lines. Remove leading '@' so it'll be possible to match against c and c++ builds in a future patch.	2020-10-07 19:33:14 +01:00
Fanbo Meng	9908ee5670	[SystemZ][z/OS] Add test of zero length bitfield type size larger than target zero length bitfield boundary Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D88963	2020-10-07 11:34:13 -04:00
Fanbo Meng	43cd0a98d1	[SystemZ][z/OS] Set default alignment rules for z/OS target Update RUN line to fix lit failure Differential Revision: https://reviews.llvm.org/D88845	2020-10-06 14:21:21 -04:00
Fanbo Meng	c781dc74a8	[SystemZ][z/OS] Set default alignment rules for z/OS target Set the default alignment control variables for z/OS target and add test case for alignment rules on z/OS. Reviewed By: abhina.sreeskantharajan Differential Revision: https://reviews.llvm.org/D88845	2020-10-06 13:16:15 -04:00
David Spickett	f0a78bdfdc	[AArch64] Correct parameter type for unsigned Neon scalar shift intrinsics In the following intrinsics the shift amount (parameter 2) should be signed. vqshlb_u8 vqshlh_u16 vqshls_u32 vqshld_u64 vqrshlb_u8 vqrshlh_u16 vqrshls_u32 vqrshld_u64 vshld_u64 vrshld_u64 See https://developer.arm.com/documentation/ihi0073/latest Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D88013	2020-10-06 11:34:58 +01:00
Roman Lebedev	e00f189d39	[InstCombine] Revert rL226781 "Teach InstCombine to canonicalize loads which are only ever stored to always use a legal integer type if one is available." (PR47592) (it was introduced in https://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html) This canonicalization seems dubious. Most importantly, while it does not create `inttoptr` casts by itself, it may cause them to appear later, see e.g. D88788. I think it's pretty obvious that it is an undesirable outcome, by now we've established that seemingly no-op `inttoptr`/`ptrtoint` casts are not no-op, and we are no longer eager to look past them. Which e.g. means that given ``` %a = load i32 %b = inttoptr %a %c = inttoptr %a ``` we likely won't be able to tell that `%b` and `%c` are the same thing. As we can see in D88789 / D88788 / D88806 / D75505, we can't really teach SCEV about this (not without the https://bugs.llvm.org/show_bug.cgi?id=47592 at least). And we can't recover the situation post-inlining in instcombine. So it really does look like this fold is actively breaking otherwise-good IR, in a way that is not recoverable. And that means, this fold isn't helpful in exposing the passes that are otherwise unaware of these patterns it produces. Thusly, i propose to simply not perform such a canonicalization. The original motivational RFC does not state what larger problem that canonicalization was trying to solve, so i'm not sure how this plays out in the larger picture. On vanilla llvm test-suite + RawSpeed, this results in an increase of asm instructions and final object size by ~+0.05%, and decreases the final count of bitcasts by -4.79% (-28990), of ptrtoint casts by -15.41% (-3423), and of inttoptr casts by -25.59% (-6919, sic). Overall, there are -0.04% fewer IR blocks and -0.39% fewer instructions. See https://bugs.llvm.org/show_bug.cgi?id=47592 Differential Revision: https://reviews.llvm.org/D88789	2020-10-06 00:00:30 +03:00
Yuanfang Chen	2c94d88e07	[NewPM] collapsing nested pass managers of the same type This is one of the reasons for extra invalidations in D84959. In practice, I don't think we have use cases needing this. This simplifies the pipeline a bit and prunes corner cases when considering invalidations. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D85676	2020-10-04 15:57:13 -07:00
Craig Topper	a02b449bb1	[X86] Sync AESENC/DEC Key Locker builtins with gcc. For the wide builtins, pass a single input and output pointer to the builtins. Emit the GEPs and input loads from CGBuiltin.	2020-10-04 12:09:41 -07:00
Craig Topper	230c57b0bd	[X86] Synchronize the encodekey builtins with gcc. Don't assume void* is 16 byte aligned. We were taking multiple pointer arguments in the builtin. gcc accepts a single void*. The cast from void* to _m128i* caused the IR generation to assume the pointer was aligned. Instead make the builtin take a single void*, emit i8 GEPs to adjust then cast to <2 x i64>* and perform a store with align of 1.	2020-10-04 12:09:35 -07:00
Roman Lebedev	aaae13d0c2	[NFC][clang][codegen] Autogenerate a few ARM SVE tests that are being affected by an upcoming patch	2020-10-04 19:54:09 +03:00
Esme-Yi	e3475f5b91	[PowerPC] Add builtins for xvtdiv(dp|sp) and xvtsqrt(dp|sp). Summary: This patch implements the builtins for xvtdivdp, xvtdivsp, xvtsqrtdp, xvtsqrtsp. The instructions correspond to the following builtins: int vec_test_swdiv(vector double v1, vector double v2); int vec_test_swdivs(vector float v1, vector float v2); int vec_test_swsqrt(vector double v1); int vec_test_swsqrts(vector float v1); This patch depends on D88274, which fixes the bug in copying from CRRC to GPRC/G8RC. Reviewed By: steven.zhang, amyk Differential Revision: https://reviews.llvm.org/D88278	2020-10-04 16:24:20 +00:00
Arthur Eubanks	eb55735073	Reland [AlwaysInliner] Update BFI when inlining Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D88324	2020-10-02 10:46:57 -07:00
Sanjay Patel	149f5b573c	[APFloat] convert SNaN to QNaN in convert() and raise Invalid signal This is an alternate fix (see D87835) for a bug where a NaN constant gets wrongly transformed into Infinity via truncation. In this patch, we uniformly convert any SNaN to QNaN while raising 'invalid op'. But we don't have a way to directly specify a 32-bit SNaN value in LLVM IR, so those are always encoded/decoded by calling convert from/to 64-bit hex. See D88664 for a clang fix needed to allow this change. Differential Revision: https://reviews.llvm.org/D88238	2020-10-01 14:37:38 -04:00
Sanjay Patel	81921ebc43	[CodeGen] improve coverage for float (32-bit) type of NAN; NFC Goes with D88238	2020-09-30 15:10:25 -04:00
Sanjay Patel	187686bea3	[CodeGen] add test for NAN creation; NFC This goes with the APFloat change proposed in D88238. This is copied from the MIPS-specific test in builtin-nan-legacy.c to verify that the normal behavior is correct on other targets without the complication of an inverted quiet bit.	2020-09-30 13:22:12 -04:00
Xiangling Liao	3a7487f903	[FE] Use preferred alignment instead of ABI alignment for complete object when applicable On some targets, preferred alignment is larger than ABI alignment in some cases. For example, on AIX we have special power alignment rules which would cause that. Previously, to support those cases, we added a “PreferredAlignment” field in the `RecordLayout` to store the AIX special alignment values, as the community suggested. However, that patch alone is not enough. There are places in Clang where `PreferredAlignment` should have been used instead of ABI-specified alignment. This patch is aimed at fixing those spots. Differential Revision: https://reviews.llvm.org/D86790	2020-09-30 10:48:28 -04:00
Xiang1 Zhang	413577a879	[X86] Support Intel Key Locker Key Locker provides a mechanism to encrypt and decrypt data with an AES key without having access to the raw key value by converting AES keys into “handles”. These handles can be used to perform the same encryption and decryption operations as the original AES keys, but they only work on the current system and only until they are revoked. If software revokes Key Locker handles (e.g., on a reboot), then any previous handles can no longer be used. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D88398	2020-09-30 18:08:45 +08:00
Richard Smith	1c604a9f5f	Recognize setjmp and friends as builtins even if jmp_buf is not declared yet. This happens in glibc's headers. It's important that we recognize these functions so that we can mark them as returns_twice. Differential Revision: https://reviews.llvm.org/D88518	2020-09-29 15:53:17 -07:00
Fangrui Song	3681be876f	Add -fprofile-update={atomic,prefer-atomic,single} GCC 7 introduced -fprofile-update={atomic,prefer-atomic} (prefer-atomic is for best efforts (some targets do not support atomics)) to increment counters atomically, which is exactly what we have done with -fprofile-instr-generate (D50867) and -fprofile-arcs (`b5ef137c11`). This patch adds the option to clang to surface the internal options at driver level. GCC 7 also turned on -fprofile-update=prefer-atomic when -pthread is specified, but it has performance regression (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89307). So we don't follow suit. Differential Revision: https://reviews.llvm.org/D87737	2020-09-29 10:43:23 -07:00
Tres Popp	eb9f7c28e5	Revert "OpaquePtr: Add type to sret attribute" This reverts commit `55c4ff91bd`. Issues were introduced as discussed in https://reviews.llvm.org/D88241 where this change made previous bugs in the linker and BitCodeWriter visible.	2020-09-29 10:31:04 +02:00
Yonghong Song	54d9f743c8	BPF: move AbstractMemberAccess and PreserveDIType passes to EP_EarlyAsPossible Move abstractMemberAccess and PreserveDIType passes as early as possible, right after clang code generation. Currently, the compiler may transform the code p1 = llvm.bpf.builtin.preserve.struct.access(base, 0, 0); p2 = llvm.bpf.builtin.preserve.struct.access(p1, 1, 2); a = llvm.bpf.builtin.preserve_field_info(p2, EXIST); if (a) { p1 = llvm.bpf.builtin.preserve.struct.access(base, 0, 0); p2 = llvm.bpf.builtin.preserve.struct.access(p1, 1, 2); bpf_probe_read(buf, buf_size, p2); } to p1 = llvm.bpf.builtin.preserve.struct.access(base, 0, 0); p2 = llvm.bpf.builtin.preserve.struct.access(p1, 1, 2); a = llvm.bpf.builtin.preserve_field_info(p2, EXIST); if (a) { bpf_probe_read(buf, buf_size, p2); } and eventually the assembly code looks like reloc_exist = 1; reloc_member_offset = 10; //calculate member offset from base p2 = base + reloc_member_offset; if (reloc_exist) { bpf_probe_read(bpf, buf_size, p2); } If, during libbpf relocation resolution, reloc_exist is actually resolved to 0 (not exist), the reloc_member_offset relocation cannot be resolved and will be patched with an illegal instruction. This will cause verifier failure. This patch attempts to address this issue by doing chaining analysis and replacing chains with special globals right after clang code gen. This will remove the cse possibility described above. The IR typically looks like %6 = load @llvm.sk_buff:0:50$0:0:0:2:0 %7 = bitcast %struct.sk_buff* %2 to i8* %8 = getelementptr i8, i8* %7, %6 for a particular address computation relocation. But this transformation has another consequence: code sinking may happen like below: PHI = <possibly different @preserve__access_globals> %7 = bitcast %struct.sk_buff %2 to i8* %8 = getelementptr i8, i8* %7, %6 For such cases, we will not be able to generate relocations since multiple relocations are merged into one. 
This patch introduced a passthrough builtin to prevent such optimization. It looks like inline assembly has more impact on optimization, e.g., inlining. Using passthrough has less impact on optimizations. A new IR pass is introduced at the beginning of target-dependent IR optimization, which does: - report fatal error if any reloc global in PHI nodes - remove all bpf passthrough builtin functions Changes for existing CORE tests: - for clang tests, add "-Xclang -disable-llvm-passes" flags to avoid builtin->reloc_global transformation so the test is still able to check correctness for clang generated IR. - for llvm CodeGen/BPF tests, add "opt -O2 <ir_file> | llvm-dis" command before "llc" command since "opt" is needed to call newly-placed builtin->reloc_global transformation. Add target triple in the IR file since "opt" requires it. - Since target triple is added in IR file, if a test may produce different results for different endianness, two tests will be created, one for bpfeb and another for bpfel, e.g., some tests for relocation of lshift/rshift of bitfields. - field-reloc-bitfield-1.ll has different relocations compared to the old code. This is because for the structure in the test, new code returns struct layout alignment 4 while old code is 8. Align 8 is more precise and permits double load. With align 4, the new mechanism uses 4-byte load, thus generating different relocations. - test intrinsic-transforms.ll is removed. This is used to test cse on intrinsics so we do not lose metadata. Now metadata is attached to global and not instruction, it won't get lost with cse. Differential Revision: https://reviews.llvm.org/D87153	2020-09-28 16:56:22 -07:00
Craig Topper	288c5776c9	[X86] Use inlineasm flag output for the _bittest* intrinsics. Instead of explicitly emitting a setc in the inline asm instructions, we can use flag output. This allows the backend to use the flag directly if it is needed by a branch. Previously we needed a test instruction to convert the register back to a flag. If the flag can't be used directly, the backend will emit a setcc. Differential Revision: https://reviews.llvm.org/D87888	2020-09-28 13:33:22 -07:00
Baptiste Saleil	0156914275	[PowerPC] Legalize v256i1 and v512i1 and implement load and store of these types This patch legalizes the v256i1 and v512i1 types that will be used for MMA. It implements loads and stores of these types. v256i1 is a pair of VSX registers, so for this type, we load/store the two underlying registers. v512i1 is used for MMA accumulators. So in addition to loading and storing the 4 associated VSX registers, we generate instructions to prime (copy the VSX registers to the accumulator) after loading and unprime (copy the accumulator back to the VSX registers) before storing. This patch also adds the UACC register class that is necessary to implement the loads and stores. This class represents accumulators in their unprimed form and allows the distinction between primed and unprimed accumulators to avoid invalid copies of the VSX registers associated with primed accumulators. Differential Revision: https://reviews.llvm.org/D84968	2020-09-28 14:39:37 -05:00
Michael Liao	5dbf80cad9	[clang][codegen] Annotate `correctly-rounded-divide-sqrt-fp-math` fn-attr for OpenCL only. - `-cl-fp32-correctly-rounded-divide-sqrt` is an OpenCL-specific option and `correctly-rounded-divide-sqrt-fp-math` should be added for OpenCL at most. Differential revision: https://reviews.llvm.org/D88303	2020-09-28 11:40:32 -04:00
Florian Hahn	915310bf14	Revert "[DSE] Switch to MemorySSA-backed DSE by default." There appears to be a mis-compile with MemorySSA-backed DSE in combination with llvm.lifetime.end. It currently appears like DSE is doing the right thing and the llvm.lifetime.end markers are incorrect. The reverted patch uncovers the mis-compile. This patch temporarily switches back to the legacy DSE implementation, while we investigate. This reverts commit `9d172c8e9c`.	2020-09-26 18:35:27 +01:00
Matt Arsenault	55c4ff91bd	OpaquePtr: Add type to sret attribute Make the corresponding change that was made for byval in `b7141207a4`. Like byval, this requires a bulk update of the test IR tests to include the type before this can be mandatory.	2020-09-25 14:07:30 -04:00
Chris Bowler	f330d9f163	[PPC] [AIX] Implement calling convention IR for C99 complex types on AIX Add AIX calling convention logic to Clang for C99 complex types on AIX Differential Revision: https://reviews.llvm.org/D88130	2020-09-25 07:43:31 -04:00
Momchil Velikov	a88c722e68	[AArch64] PAC/BTI code generation for LLVM generated functions PAC/BTI-related codegen in the AArch64 backend is controlled by a set of LLVM IR function attributes, added to the function by Clang, based on command-line options and GCC-style function attributes. However, functions generated in the LLVM middle end (for example, asan.module.ctor or __llvm_gcov_write_out) do not get any attributes and the backend incorrectly does not do any PAC/BTI code generation. This patch records the default state of PAC/BTI codegen in a set of LLVM IR module-level attributes, based on command-line options: * "sign-return-address", with non-zero value means generate code to sign return addresses (PAC-RET), zero value means disable PAC-RET. * "sign-return-address-all", with non-zero value means enable PAC-RET for all functions, zero value means enable PAC-RET only for functions which spill LR. * "sign-return-address-with-bkey", with non-zero value means use B-key for signing, zero value means use A-key. This set of attributes is always added for AArch64 targets (as opposed, for example, to interpreting a missing attribute as having a value 0) in order to be able to check for conflicts when combining module attributes during LTO. Module-level attributes are overridden by function level attributes. All the decision making about whether or not to generate PAC and/or BTI code is factored out into AArch64FunctionInfo; there shouldn't be any places left, other than AArch64FunctionInfo, which directly examine PAC/BTI attributes, except AArch64AsmPrinter.cpp, which is/will-be handled by a separate patch. Differential Revision: https://reviews.llvm.org/D85649	2020-09-25 11:47:14 +01:00
Chris Bowler	64b8a633a8	[NFC] [PPC] Add PowerPC expected IR tests for C99 complex Adding this test so that I can extend it in a follow on patch with expected IR for AIX when I implement complex handling in AIXABIInfo. Reviewed By: daltenty, ZarkoCA Differential Revision: https://reviews.llvm.org/D88105	2020-09-24 23:28:40 -04:00
Ian Levesque	6f7fbdd285	[xray] Function coverage groups Add the ability to selectively instrument a subset of functions by dividing the functions into N logical groups and then selecting a group to cover. By selecting different groups over time you could cover the entire application incrementally with lower overhead than instrumenting the entire application at once. Differential Revision: https://reviews.llvm.org/D87953	2020-09-24 22:09:53 -04:00
Amy Kwan	6b136b19cb	[Power10] Implement custom codegen for the vec_replace_elt and vec_replace_unaligned builtins. This patch implements custom codegen for the vec_replace_elt and vec_replace_unaligned builtins. These builtins map to the @llvm.ppc.altivec.vinsw and @llvm.ppc.altivec.vinsd intrinsics depending on the arguments. The main motivation for doing custom codegen for these intrinsics is that there are float and double versions of the builtin. Normally, converting the float to an integer would be done via fptoui in the IR. This is incorrect as fptoui truncates the value and we must ensure the value is not truncated. Therefore, we provide custom codegen to utilize bitcast instead as bitcasts do not truncate. Differential Revision: https://reviews.llvm.org/D83500	2020-09-23 22:55:25 -05:00
Craig Topper	d9717d8ee7	[X86] Add a memory clobber to the bittest intrinsic inline asm. Get default clobbers from the target I believe the inline asm emitted here should have a memory clobber since it writes to memory. It was also missing the dirflag clobber that we use by default along with flags and fpsr. To avoid missing defaults in the future, get the default list from the target Differential Revision: https://reviews.llvm.org/D88121	2020-09-23 14:54:39 -07:00
Amy Kwan	2e7117f847	[PowerPC] Implement the 128-bit vec_[all|any]_[eq | ne | lt | gt | le | ge] builtins in Clang/LLVM This patch implements the vec_[all|any]_[eq | ne | lt | gt | le | ge] builtins for vector signed/unsigned __int128. Differential Revision: https://reviews.llvm.org/D87910	2020-09-23 16:49:40 -04:00
Albion Fung	88cdbeab41	[PowerPC] Implement Vector signed/unsigned __int128 overloads for the comparison builtins This patch implements Vector signed/unsigned __int128 overloads for the comparison builtins. Differential Revision: https://reviews.llvm.org/D87804	2020-09-23 16:49:40 -04:00
Sriraman Tallam	7d0bbe4090	Re-apply https://reviews.llvm.org/D87921 , was reverted to triage a PPC bot failure. D87921 was reverted in commit `b89059a313` as it was causing an unknown llvm PPC bot failure. Reapplying the patch after confirming that this is not responsible. Build bot failure: https://reviews.llvm.org/D87921#2286644 which caused the revert. The wrong placement of add pass with optimizations led to -funique-internal-linkage-names being disabled. Fixed the placement of the MPM.addpass for UniqueInternalLinkageNames to make it work correctly with -O2 and new pass manager. Updated the tests to explicitly check O0 and O1. Differential Revision: https://reviews.llvm.org/D87921	2020-09-23 10:28:40 -07:00
Albion Fung	d7eb917a7c	[PowerPC] Implementation of 128-bit Binary Vector Mod and Sign Extend builtins This patch implements 128-bit Binary Vector Mod and Sign Extend builtins for PowerPC10. Differential: https://reviews.llvm.org/D87394#inline-815858	2020-09-23 01:18:14 -05:00
Mircea Trofin	cf112382dd	[ThinLTO] Option to bypass function importing. This completes the circle, complementing -lto-embed-bitcode (specifically, post-merge-pre-opt). Using -thinlto-assume-merged skips function importing. The index file is still needed for the other data it contains. Differential Revision: https://reviews.llvm.org/D87949	2020-09-22 13:12:11 -07:00
Sriraman Tallam	b89059a313	Revert "The wrong placement of add pass with optimizations led to -funique-internal-linkage-names being disabled." This reverts commit `6950db36d3`.	2020-09-22 12:32:43 -07:00
Amy Kwan	079757b551	[PowerPC] Implement Vector String Isolate Builtins in Clang/LLVM This patch implements the vector string isolate (predicate and non-predicate versions) builtins. The predicate builtins are custom selected within PPCISelDAGToDAG. Differential Revision: https://reviews.llvm.org/D87671	2020-09-22 11:31:44 -05:00
Amy Kwan	b3147058de	[PowerPC] Implement the 128-bit Vector Divide Extended Builtins in Clang/LLVM This patch implements the 128-bit vector divide extended builtins in Clang/LLVM. These builtins map to the vdivesq and vdiveuq instructions respectively. Differential Revision: https://reviews.llvm.org/D87729	2020-09-22 11:31:44 -05:00
Abhina Sreeskantharajan	0fb97fd6a4	[SystemZ][z/OS] Set default wchar_t type for zOS Set the default wchar_t type on z/OS, and make unsigned the default. Reviewed By: hubert.reinterpretcast, fanbo-meng Differential Revision: https://reviews.llvm.org/D87624	2020-09-22 08:03:03 -04:00
David Spickett	f93514545c	[AArch64] Fix return type of Neon scalar comparison intrinsics The following should have unsigned return types but were signed: vceqd_s64 vceqzd_s64 vcged_s64 vcgezd_s64 vcgtd_s64 vcgtzd_s64 vcled_s64 vclezd_s64 vcltd_s64 vcltzd_s64 vtstd_s64 See https://developer.arm.com/documentation/ihi0073/latest Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D88009	2020-09-22 08:53:24 +01:00
Sriraman Tallam	6950db36d3	The wrong placement of add pass with optimizations led to -funique-internal-linkage-names being disabled. Fixed the placement of the MPM.addpass for UniqueInternalLinkageNames to make it work correctly with -O2 and new pass manager. Updated the tests to explicitly check O0 and O2. Previously, the addPass was placed before BackendUtil.cpp#L1373 which is wrong as MPM gets assigned at this point and any additions to the pass vector before this is wrong. This change just moves it after MPM is assigned and places it at a point where O0 and O0+ can share it. Differential Revision: https://reviews.llvm.org/D87921	2020-09-21 10:00:12 -07:00
David Spickett	349af80542	[clang][AArch64] Correct return type of Neon vqmovun intrinsics Neon intrinsics vqmovunh_s16, vqmovuns_s32, vqmovund_s64 should have unsigned return types. See https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics?search=vqmovun Fixes https://bugs.llvm.org/show_bug.cgi?id=46840 Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D85118	2020-09-21 09:21:51 +01:00
Amy Kwan	37e7673c21	[PowerPC] Implement Move to VSR Mask builtins in LLVM/Clang This patch implements the vec_gen[b\|h\|w\|d\|q]m function prototypes in altivec.h in order to utilize the move to VSR with mask instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D82725	2020-09-18 18:16:14 -05:00
Florian Hahn	9d172c8e9c	Recommit "[DSE] Switch to MemorySSA-backed DSE by default." This switches to using DSE + MemorySSA by default again, after fixing the issues reported after the first commit. Notable fixes `fc82006331`, `a0017c2bc2`. This reverts commit `3a59628f3c`.	2020-09-18 11:05:00 +01:00
Nikita Popov	13e19d2e7c	Revert "[InstCombine] Canonicalize SPF_ABS to abs intrinc" This reverts commit `05d4c4ebc2`. mstorsjo reports a miscompile after this change in https://reviews.llvm.org/D87188#2281093. Reverting until I can investigate this.	2020-09-18 09:38:26 +02:00
Amy Kwan	2c3bc918db	[PowerPC] Implement Vector Count Mask Bits builtins in LLVM/Clang This patch implements the vec_cntm function prototypes in altivec.h in order to utilize the vector count mask bits instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D82726	2020-09-17 18:20:53 -05:00
Zhaoshi Zheng	1c466477ad	[RISCV] Support Shadow Call Stack Currently assume x18 is used as pointer to shadow call stack. User shall pass flags: "-fsanitize=shadow-call-stack -ffixed-x18" Runtime support is needed to set up x18. If SCS is desired, all parts of the program should be built with -ffixed-x18 to maintain interoperability. There's no particular reason that we must use x18 as SCS pointer. Any register may be used, as long as it does not have designated purpose already, like RA or passing call arguments. Differential Revision: https://reviews.llvm.org/D84414	2020-09-17 16:02:35 -07:00
Nikita Popov	05d4c4ebc2	[InstCombine] Canonicalize SPF_ABS to abs intrinc Enable canonicalization of SPF_ABS and SPF_NABS to the abs intrinsic. To be conservative, the one-use check on the comparison is retained, this may be relaxed if all goes well. It's pretty likely that this will uncover places that are missing handling for the abs() intrinsic. Please report any seen performance regressions. Differential Revision: https://reviews.llvm.org/D87188	2020-09-17 22:28:34 +02:00
Raul Tambre	e09107ab80	[Sema] Introduce BuiltinAttr, per-declaration builtin-ness Instead of relying on whether a certain identifier is a builtin, introduce BuiltinAttr to specify a declaration as having builtin semantics. This fixes incompatible redeclarations of builtins, as reverting the identifier as being builtin due to one incompatible redeclaration would have broken the rest of the builtin calls. Mostly-compatible redeclarations of builtins also no longer have builtin semantics. They don't call the builtin nor inherit their attributes. A long-standing FIXME regarding builtins inside a namespace enclosed in extern "C" not being recognized is also addressed. Due to the more correct handling, attributes for builtin functions are added in more places, resulting in more useful warnings. Tests are updated to reflect that. Intrinsics without an inline definition in intrin.h had `inline` and `static` removed as they had no effect and caused them to no longer be recognized as builtins otherwise. A pthread_create() related test is XFAIL-ed, as it relied on it being recognized as a builtin based on its name. The builtin declaration syntax is too restrictive and doesn't allow custom structs, function pointers, etc. It seems to be the only case and fixing this would require reworking the current builtin syntax, so this seems acceptable. Fixes PR45410. Reviewed By: rsmith, yutsumi Differential Revision: https://reviews.llvm.org/D77491	2020-09-17 19:28:57 +03:00
Cullen Rhodes	9218f92838	[clang][aarch64] ACLE: Support implicit casts between GNU and SVE vectors This patch adds support for implicit casting between GNU vectors and SVE vectors when `__ARM_FEATURE_SVE_BITS==N`, as defined by the Arm C Language Extensions (ACLE, version 00bet5, section 3.7.3.3) for SVE [1]. This behavior makes it possible to use GNU vectors with ACLE functions that operate on VLAT. For example: typedef int8_t vec __attribute__((vector_size(32))); vec f(vec x) { return svasrd_x(svptrue_b8(), x, 1); } Tests are also added for implicit casting between GNU and fixed-length SVE vectors created by the 'arm_sve_vector_bits' attribute. This behavior makes it possible to use VLST with existing interfaces that operate on GNUT. For example: typedef int8_t vec1 __attribute__((vector_size(32))); void f(vec1); #if __ARM_FEATURE_SVE_BITS==256 && __ARM_FEATURE_SVE_VECTOR_OPERATORS typedef svint8_t vec2 __attribute__((arm_sve_vector_bits(256))); void g(vec2 x) { f(x); } // OK #endif The `__ARM_FEATURE_SVE_VECTOR_OPERATORS` feature macro indicates interoperability with the GNU vector extension. This is the first patch providing support for this feature, which once complete will be enabled by the `-msve-vector-bits` flag, as the `__ARM_FEATURE_SVE_BITS` feature currently is. [1] https://developer.arm.com/documentation/100987/latest Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D87607	2020-09-17 09:35:30 +00:00
Mircea Trofin	8ea7ef8eda	[ThinLTO] Relax thinlto_embed_bitcode.ll check Fixes fuchsia test [1] - the thinlto annotations may not always be there. [1] http://lab.llvm.org:8011/builders/fuchsia-x86_64-linux/builds/11312	2020-09-15 22:42:22 -07:00
Mircea Trofin	61fc10d6a5	[ThinLTO] add post-thinlto-merge option to -lto-embed-bitcode This will embed bitcode after (Thin)LTO merge, but before optimizations. In the case the thinlto backend is called from clang, the .llvmcmd section is also produced. Doing so in the case where the caller is the linker doesn't yet have a motivation, and would require plumbing through command line args. Differential Revision: https://reviews.llvm.org/D87636	2020-09-15 15:56:11 -07:00
Albion Fung	05aa997d51	[PowerPC] Implement __int128 vector divide operations This patch implements __int128 vector divide operations for ISA3.1. Differential Revision: https://reviews.llvm.org/D85453	2020-09-15 15:19:35 -04:00
Florian Hahn	3a59628f3c	Revert "[DSE] Switch to MemorySSA-backed DSE by default." This reverts commit `fb109c42d9`. Temporarily revert due to a mis-compile pointed out at D87163.	2020-09-15 18:07:56 +01:00
Simon Pilgrim	9eab73fa17	[X86] Update SSE/AVX integer MINMAX intrinsics to emit llvm.smax.* etc. (PR46851) We're now getting close to having the necessary analysis/combines etc. for the new generic llvm smax/smin/umax/umin intrinsics. This patch updates the SSE/AVX integer MINMAX intrinsics to emit the generic equivalents instead of the icmp+select code pattern. Differential Revision: https://reviews.llvm.org/D87603	2020-09-15 11:19:08 +01:00
Rahman Lavaee	7841e21c98	Let -basic-block-sections=labels emit basicblock metadata in a new .bb_addr_map section, instead of emitting special unary-encoded symbols. This patch introduces the new .bb_addr_map section feature which allows us to emit the bits needed for mapping binary profiles to basic blocks into a separate section. The format of the emitted data is represented as follows. It includes a header for every function: \| Address of the function \| -> 8 bytes (pointer size) \| Number of basic blocks in this function (>0) \| -> ULEB128 The header is followed by a BB record for every basic block. These records are ordered in the same order as MachineBasicBlocks are placed in the function. Each BB Info is structured as follows: \| Offset of the basic block relative to function begin \| -> ULEB128 \| Binary size of the basic block \| -> ULEB128 \| BB metadata \| -> ULEB128 [ MBB.isReturn() OR MBB.hasTailCall() << 1 OR MBB.isEHPad() << 2 ] The new feature will replace the existing "BB labels" functionality with -basic-block-sections=labels. The .bb_addr_map section scrubs the specially-encoded BB symbols from the binary and makes it friendly to profilers and debuggers. Furthermore, the new feature reduces the binary size overhead from 70% bloat to only 12%. For more information and results please refer to the RFC: https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html Reviewed By: MaskRay, snehasish Differential Revision: https://reviews.llvm.org/D85408	2020-09-14 10:16:44 -07:00
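The record layout described in the .bb_addr_map entry above can be sketched as below. This is an illustration of the described format only, not LLVM's emitter: the function names are hypothetical, and little-endian encoding of the 8-byte function address is an assumption.

```python
def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer as ULEB128 (7 payload bits per byte)."""
    if value < 0:
        raise ValueError("ULEB128 encodes non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # last byte, high bit clear
            return bytes(out)


def bb_metadata(is_return: bool, has_tail_call: bool, is_eh_pad: bool) -> int:
    """Pack the three flags as isReturn | hasTailCall << 1 | isEHPad << 2."""
    return int(is_return) | (int(has_tail_call) << 1) | (int(is_eh_pad) << 2)


def encode_function(address: int, blocks, pointer_size: int = 8) -> bytes:
    """Header (address, block count) followed by one record per basic block.

    blocks is a list of (offset, size, metadata) tuples in layout order.
    """
    data = bytearray(address.to_bytes(pointer_size, "little"))  # assumed endianness
    data += encode_uleb128(len(blocks))
    for offset, size, metadata in blocks:
        data += encode_uleb128(offset)
        data += encode_uleb128(size)
        data += encode_uleb128(metadata)
    return bytes(data)


# Hypothetical two-block function: the second block returns.
entry = encode_function(
    0x401000,
    [(0, 16, bb_metadata(False, False, False)),
     (16, 200, bb_metadata(True, False, False))],
)
```

Small offsets and sizes fit in a single byte each, which is one reason a ULEB128-based section can stay compact compared with per-block symbols.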
Simon Pilgrim	4232bccfb4	[CodeGen][X86] Regenerate minmax reduction sequence tests to match arithmetic tests. avx512-reduceIntrin.c wasn't bothering with the exhaustive alloca/store/load/bitcast checks and avx512-reduceMinMaxIntrin.c shouldn't need to either. This makes it a lot easier to maintain as the update script still doesn't work properly on x86 targets	2020-09-14 10:27:51 +01:00
Fangrui Song	63182c2ac0	[gcov] Add spanning tree optimization gcov is an "Edge Profiling with Edge Counters" application according to Optimally Profiling and Tracing Programs (1994). The minimum number of counters necessary is \|E\|-(\|V\|-1). The unmeasured edges form a spanning tree. Both GCC --coverage and clang -fprofile-generate leverage this optimization. This patch implements the optimization for clang --coverage. The produced .gcda files are much smaller now.	2020-09-13 00:07:31 -07:00
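The counter bound quoted in the entry above, |E|-(|V|-1), can be checked with a small sketch. It treats the control-flow graph as undirected, grows a spanning tree with union-find, and instruments only the edges left outside the tree. The graph and helper name are hypothetical, and the sketch assumes a connected graph; it is not the pass's actual algorithm.

```python
def split_edges(num_vertices, edges):
    """Partition edges into spanning-tree edges (unmeasured) and
    the remaining edges, which each need a counter."""
    parent = list(range(num_vertices))

    def find(x):
        # Find the set representative, halving paths along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree, instrumented = [], []
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            instrumented.append((u, v))  # closes a cycle: needs a counter
        else:
            parent[root_u] = root_v      # joins two components: tree edge
            tree.append((u, v))
    return tree, instrumented


# Hypothetical diamond-shaped CFG with a back edge from exit (3) to entry (0).
NUM_VERTICES = 4
EDGES = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 0)]
tree_edges, counter_edges = split_edges(NUM_VERTICES, EDGES)
```

Here five edges over four vertices leave two edges to instrument; the counts on the three tree edges can be recovered afterwards from flow conservation at each vertex.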
Fangrui Song	f086e85eea	[gcov] Assign names to some types and loaded values used in @__llvm_internal* This makes the generated IR much more readable.	2020-09-12 22:42:37 -07:00
Fangrui Song	d6fadc49e3	[gcov] Process .gcda immediately after the accompanying .gcno instead of doing all .gcda after all .gcno i.e. change the work flow from * .gcno for function A * .gcno for function B * .gcno for function C * .gcda for function A * .gcda for function B * .gcda for function C to * .gcno for function A * .gcda for function A * .gcno for function B * .gcda for function B * .gcno for function C * .gcda for function C Currently there is duplicate logic in .gcno & .gcda processing: how functions are filtered, which edges are instrumented, etc. This refactor enables simplification. Since we always process .gcno, in -fprofile-arcs -fno-test-coverage mode, __llvm_internal_gcov_emit_function_args.0 will have non-zero checksums.	2020-09-12 13:53:03 -07:00
Florian Hahn	a874d63344	[Clang] Add option to allow marking pass-by-value args as noalias. After the recent discussion on cfe-dev 'Can indirect class parameters be noalias?' [1], it seems like using noalias is problematic for current C++, but should be allowed for C-only code. This patch introduces a new option to let the user indicate that it is safe to mark indirect class parameters as noalias. Note that this also applies to external callers, e.g. it might not be safe to use this flag for C functions that are called by C++ functions. In targets that allocate indirect arguments in the called function, this enables more aggressive optimizations with respect to memory operations and brings a ~1% - 2% codesize reduction for some programs. [1] : http://lists.llvm.org/pipermail/cfe-dev/2020-July/066353.html Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D85473	2020-09-12 14:56:13 +01:00
Tyker	78de7297ab	Reland [AssumeBundles] Use operand bundles to encode alignment assumptions NOTE: There is a mailing list discussion on this: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html Complementary to the assumption outliner prototype in D71692, this patch shows how we could simplify the code emitted for an alignment assumption. The generated code is smaller, less fragile, and it makes it easier to recognize the additional use as an "assumption use". As mentioned in D71692 and on the mailing list, we could adopt this scheme, and similar schemes for other patterns, without adopting the assumption outlining.	2020-09-12 15:36:06 +02:00
David Green	ab2ed8bce9	[SVE] Regenerate sve vector bits tests. NFC	2020-09-11 18:51:57 +01:00
Qiu Chaofan	8ecc8520bc	[FPEnv] [Clang] Enable constrained FP support for PowerPC `d4ce862f` introduced HasStrictFP to disable generating constrained FP operations for platforms lacking support. Since work for enabling constrained FP on PowerPC is almost done, we'd like to enable it. Reviewed By: kpn, steven.zhang Differential Revision: https://reviews.llvm.org/D87223	2020-09-12 00:39:52 +08:00
Cullen Rhodes	002f5ab3b1	[clang][aarch64] Fix ILP32 ABI for arm_sve_vector_bits The element types of scalable vectors are defined in terms of stdint types in the ACLE. This patch fixes the mapping to builtin types for the ILP32 ABI when creating VLS types with the arm_sve_vector_bits, where the mapping is as follows: int32_t -> LongTy int64_t -> LongLongTy uint32_t -> UnsignedLongTy uint64_t -> UnsignedLongLongTy This is implemented by leveraging getBuiltinVectorTypeInfo which is target agnostic since it calls ASTContext::getIntTypeForBitwidth for integer types. The element type for svfloat16_t is changed from Float16Ty to HalfTy when creating VLS types since this is what is used elsewhere. For more information, see: https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#types-varying-by-data-model https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#appendix-support-for-scalable-vectors Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D87358	2020-09-11 09:46:35 +00:00
Florian Hahn	fb109c42d9	[DSE] Switch to MemorySSA-backed DSE by default. The tests have been updated and I plan to move them from the MSSA directory up. Some end-to-end tests needed small adjustments. One difference to the legacy DSE is that legacy DSE also deletes trivially dead instructions that are unrelated to memory operations. Because MemorySSA-backed DSE just walks the MemorySSA, we only visit/check memory instructions. But removing unrelated dead instructions is not really DSE's job and other passes will clean up. One noteworthy change is in llvm/test/Transforms/Coroutines/ArgAddr.ll, but I think this comes down to legacy DSE not handling instructions that may throw correctly in that case. To cover this with MemorySSA-backed DSE, we need an update to llvm.coro.begin to treat its return value as belonging to the same underlying object as the passed pointer. There are some minor cases MemorySSA-backed DSE currently misses, e.g. related to atomic operations, but I think those can be implemented after the switch. This has been discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2020-August/144417.html For the MultiSource/SPEC2000/SPEC2006 the number of eliminated stores goes from ~17500 (legacy DSE) to ~26300 (MemorySSA-backed). More numbers and details in the thread on llvm-dev. Impact on CTMark: ``` Legacy Pass Manager exec instrs size-text O3 + 0.60% - 0.27% ReleaseThinLTO + 1.00% - 0.42% ReleaseLTO-g. + 0.77% - 0.33% RelThinLTO (link only) + 0.87% - 0.42% RelLO-g (link only) + 0.78% - 0.33% ``` http://llvm-compile-time-tracker.com/compare.php?from=3f22e96d95c71ded906c67067d75278efb0a2525&to=ae8be4642533ff03803967ee9d7017c0d73b0ee0&stat=instructions ``` New Pass Manager exec instrs. size-text O3 + 0.95% - 0.25% ReleaseThinLTO + 1.34% - 0.41% ReleaseLTO-g. 
+ 1.71% - 0.35% RelThinLTO (link only) + 0.96% - 0.41% RelLO-g (link only) + 2.21% - 0.35% ``` http://195.201.131.214:8000/compare.php?from=3f22e96d95c71ded906c67067d75278efb0a2525&to=ae8be4642533ff03803967ee9d7017c0d73b0ee0&stat=instructions Reviewed By: asbirlea, xbolva00, nikic Differential Revision: https://reviews.llvm.org/D87163	2020-09-10 22:24:32 +01:00
Simon Pilgrim	2239882f7d	[CodeGen][X86] Move x86 builtin intrinsic/codegen tests into X86 subfolder. There are still plenty of tests that specify x86 as a triple but most shouldn't be doing anything very target specific - we can move any ones that I have missed on a case by case basis.	2020-09-10 12:58:21 +01:00
Simon Pilgrim	576bd52f77	[Codegen][X86] Move AMX specific codegen tests into X86 subfolder.	2020-09-10 12:38:23 +01:00
Qiu Chaofan	88ff4d2ca1	[PowerPC] Fix STRICT_FRINT/STRICT_FNEARBYINT lowering In the standard C library, both rint and nearbyint return the rounded result in the current rounding mode, but nearbyint never raises the inexact exception. On PowerPC, x(v\|s)r(d\|s)pic may modify FPSCR XX, raising the inexact exception. So we can't select constrained fnearbyint into xvrdpic. One exception here is xsrqpi, which will not raise the inexact exception, so fnearbyint f128 is okay here. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D87220	2020-09-09 22:40:58 +08:00
Ties Stuij	d6f3f61231	Revert "[ARM] Follow AACPS standard for volatile bit-fields access width" This reverts commit `514df1b2bb`. Some of the buildbots got llvm-lit errors on CodeGen/volatile.c	2020-09-08 18:46:27 +01:00
Ties Stuij	514df1b2bb	[ARM] Follow AACPS standard for volatile bit-fields access width This patch resumes the work of D16586. According to the AAPCS, volatile bit-fields should be accessed using containers of the width of their declarative type. In such case: ``` struct S1 { short a : 1; } ``` should be accessed using load and stores of the width (sizeof(short)), where now the compiler does only load the minimum required width (char in this case). However, as discussed in D16586, that could overwrite non-volatile bit-fields, which conflicted with C and C++ object models by creating data race conditions that are not part of the bit-field, e.g. ``` struct S2 { short a; int b : 16; } ``` Accessing `S2.b` would also access `S2.a`. The AAPCS Release 2020Q2 (https://documentation-service.arm.com/static/5efb7fbedbdee951c1ccf186?token=) section 8.1 Data Types, page 36, "Volatile bit-fields - preserving number and width of container accesses" has been updated to avoid conflict with the C++ Memory Model. Now it reads in the note: ``` This ABI does not place any restrictions on the access widths of bit-fields where the container overlaps with a non-bit-field member or where the container overlaps with any zero length bit-field placed between two other bit-fields. This is because the C/C++ memory model defines these as being separate memory locations, which can be accessed by two threads simultaneously. For this reason, compilers must be permitted to use a narrower memory access width (including splitting the access into multiple instructions) to avoid writing to a different memory location. For example, in struct S { int a:24; char b; }; a write to a must not also write to the location occupied by b, this requires at least two memory accesses in all current Arm architectures. In the same way, in struct S { int a:24; int:0; int b:8; };, writes to a or b must not overwrite each other. 
``` Patch D16586 was updated to follow such behavior by verifying that we only change volatile bit-field access when: - it won't overlap with any other non-bit-field member - we only access memory inside the bounds of the record - avoid overlapping zero-length bit-fields. Regarding the number of memory accesses, that should be preserved, that will be implemented by D67399. Differential Revision: https://reviews.llvm.org/D72932 The following people contributed to this patch: - Diogo Sampaio - Ties Stuij	2020-09-08 17:49:49 +01:00
Simon Pilgrim	ae85da86ad	[Codegen][X86] Begin moving X86 specific codegen tests into X86 subfolder. Discussed with @craig.topper and @spatel - this is to try and tidy up the codegen folder and move the x86 specific tests (as opposed to general tests that just happen to use x86 triples) into subfolders. It's up to other targets if they follow suit. It also helps speed up test iterations as using wildcards on lit commands often misses some filenames.	2020-09-08 13:01:24 +01:00
Simon Pilgrim	2853ae3c1b	[X86] Update SSE/AVX ABS intrinsics to emit llvm.abs.* (PR46851) We're now getting close to having the necessary analysis/combines etc. for the new generic llvm.abs.* intrinsics. This patch updates the SSE/AVX ABS vector intrinsics to emit the generic equivalents instead of the icmp+sub+select code pattern. Differential Revision: https://reviews.llvm.org/D87101	2020-09-07 13:54:12 +01:00
Amy Kwan	efa57f9a7a	[PowerPC] Implement Vector Expand Mask builtins in LLVM/Clang This patch implements the vec_expandm function prototypes in altivec.h in order to utilize the vector expand with mask instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D82727	2020-09-06 17:13:21 -05:00
David Green	667e800bb3	[ARM] Remove -O3 from mve intrinsic tests. NFC	2020-09-06 13:19:55 +01:00
Nemanja Ivanovic	2d652949be	[PowerPC] Provide vec_cmpne on pre-Power9 architectures in altivec.h These overloads are listed in appendix A of the ELFv2 ABI specification without a requirement for ISA 3.0. So these need to be available on all Altivec-capable architectures. The implementation in altivec.h erroneously had them guarded for Power9 due to the availability of the VCMPNE[BHW] instructions. However these need to be implemented in terms of the VCMPEQ[BHW] instructions on older architectures. Fixes: https://bugs.llvm.org/show_bug.cgi?id=47423	2020-09-04 21:48:38 -04:00
Nemanja Ivanovic	54205f0bd2	[PowerPC] Allow const pointers for load builtins in altivec.h The load builtins in altivec.h do not have const in the signature for the pointer parameter. This prevents using them for loading from constant pointers. A notable case for such a use is Eigen. This patch simply adds the missing const. Fixes: https://bugs.llvm.org/show_bug.cgi?id=47408	2020-09-04 13:56:39 -04:00
Cullen Rhodes	f9091e56d3	[clang][aarch64] Drop experimental from __ARM_FEATURE_SVE_BITS macro The __ARM_FEATURE_SVE_BITS feature macro is specified in the Arm C Language Extensions (ACLE) for SVE [1] (version 00bet5). From the spec, where __ARM_FEATURE_SVE_BITS==N: When N is nonzero, indicates that the implementation is generating code for an N-bit SVE target and that the arm_sve_vector_bits(N) attribute is available. This was defined in D83550 as __ARM_FEATURE_SVE_BITS_EXPERIMENTAL and enabled under the -msve-vector-bits flag to simplify initial tests. This patch drops _EXPERIMENTAL now there is support for the feature. [1] https://developer.arm.com/documentation/100987/latest Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D86720	2020-09-03 09:39:37 +00:00
Albion Fung	5d1fe3f903	[PowerPC] Implemented Vector Multiply Builtins This patch implements the builtins for Vector Multiply Builtins (vmulxxd family of instructions), and adds the appropriate test cases for these builtins. The builtins utilize the vector multiply instructions introduced with ISA 3.1. Differential Revision: https://reviews.llvm.org/D83955	2020-09-02 14:16:21 -05:00
Amy Kwan	0c2d872d5d	[PowerPC] Implement builtins for xvcvspbf16 and xvcvbf16spn This patch adds the builtin implementation for the xvcvspbf16 and xvcvbf16spn instructions. Differential Revision: https://reviews.llvm.org/D86795	2020-09-01 17:16:43 -05:00
Arnold Schwaighofer	41634497d4	Teach the swift calling convention about _Atomic types rdar://67351073 Differential Revision: https://reviews.llvm.org/D86218	2020-08-31 07:07:25 -07:00
Dimitry Andric	fc2dac4116	[PPC] Fix platform definitions when compiling FreeBSD powerpc64 as LE As a prerequisite to doing experimental builds of pieces of FreeBSD PowerPC64 as little-endian, allow actually targeting it. This is needed so basic platform definitions are pulled in. Without it, the compiler will only run freestanding. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D73425	2020-08-29 12:03:20 +02:00
Fangrui Song	b5ef137c11	[gcov] Increment counters with atomicrmw if -fsanitize=thread Without this patch, `clang --coverage -fsanitize=thread` may fail spuriously because non-atomic counter increments can be detected as data races.	2020-08-28 16:32:35 -07:00
Albion Fung	331dcc43ea	[PowerPC] Implemented Vector Load with Zero and Signed Extend Builtins This patch implements the builtins for Vector Load with Zero and Signed Extend Builtins (lxvr_x for b, h, w, d), and adds the appropriate test cases for these builtins. The builtins utilize the vector load instructions introduced with ISA 3.1. Differential Revision: https://reviews.llvm.org/D82502#inline-797941	2020-08-28 11:28:58 -05:00
Cullen Rhodes	2ddf795e8c	Reland "[CodeGen][AArch64] Support arm_sve_vector_bits attribute" This relands D85743 with a fix for test CodeGen/attr-arm-sve-vector-bits-call.c that disables the new pass manager with '-fno-experimental-new-pass-manager'. Test was failing due to IR differences with the new pass manager which broke the Fuchsia builder [1]. Reverted in `2e7041f`. [1] http://lab.llvm.org:8011/builders/fuchsia-x86_64-linux/builds/10375 Original summary: This patch implements codegen for the 'arm_sve_vector_bits' type attribute, defined by the Arm C Language Extensions (ACLE) for SVE [1]. The purpose of this attribute is to define vector-length-specific (VLS) versions of existing vector-length-agnostic (VLA) types. VLSTs are represented as VectorType in the AST and fixed-length vectors in the IR everywhere except in function args/return. Implemented in this patch is codegen support for the following: * Implicit casting between VLA <-> VLS types. * Coercion of VLS types in function args/return. * Mangling of VLS types. Casting is handled by the CK_BitCast operation, which has been extended to support the two new vector kinds for fixed-length SVE predicate and data vectors, where the cast is implemented through memory rather than a bitcast which is unsupported. Implementing this as a normal bitcast would require relaxing checks in LLVM to allow bitcasting between scalable and fixed types. Another option was adding target-specific intrinsics, although codegen support would need to be added for these intrinsics. Given this, casting through memory seemed like the best approach as it's supported today and existing optimisations may remove unnecessary loads/stores, although there is room for improvement here. Coercion of VLSTs in function args/return from fixed to scalable is implemented through the AArch64 ABI in TargetInfo. The VLA and VLS types are defined by the ACLE to map to the same machine-level SVE vectors. 
VLS types are mangled in the same way as: __SVE_VLS<typename, unsigned> where the first argument is the underlying variable-length type and the second argument is the SVE vector length in bits. For example: #if __ARM_FEATURE_SVE_BITS==512 // Mangled as 9__SVE_VLSIu11__SVInt32_tLj512EE typedef svint32_t vec __attribute__((arm_sve_vector_bits(512))); // Mangled as 9__SVE_VLSIu10__SVBool_tLj512EE typedef svbool_t pred __attribute__((arm_sve_vector_bits(512))); #endif The latest ACLE specification (00bet5) does not contain details of this mangling scheme, it will be specified in the next revision. The mangling scheme is otherwise defined in the appendices to the Procedure Call Standard for the Arm Architecture, see [2] for more information. [1] https://developer.arm.com/documentation/100987/latest [2] https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#appendix-c-mangling Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D85743	2020-08-28 15:57:09 +00:00
JF Bastien	82d29b397b	Add an unsigned shift base sanitizer It's not undefined behavior for an unsigned left shift to overflow (i.e. to shift bits out), but it has been the source of bugs and exploits in certain codebases in the past. As we do in other parts of UBSan, this patch adds a dynamic checker which acts beyond UBSan and checks other sources of errors. The option is enabled as part of -fsanitize=integer. The flag is named: -fsanitize=unsigned-shift-base This matches shift-base and shift-exponent flags. <rdar://problem/46129047> Differential Revision: https://reviews.llvm.org/D86000	2020-08-27 19:50:10 -07:00
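The condition that the unsigned-shift-base entry above describes, an unsigned left shift that pushes set bits out of the value, can be modelled with a short sketch. The function name is hypothetical and a fixed 32-bit width is assumed; this models the arithmetic condition only, not the instrumentation Clang emits.

```python
def unsigned_shift_loses_bits(value: int, shift: int, width: int = 32) -> bool:
    """Return True if value << shift drops set bits from a width-bit unsigned value."""
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in the given width")
    if not 0 <= shift < width:
        # An out-of-range shift amount is a separate problem (shift-exponent).
        raise ValueError("shift amount out of range for the given width")
    # Any bit at position >= width after the shift would be discarded.
    return (value << shift) >> width != 0


lossy = unsigned_shift_loses_bits(0x80000000, 1)  # top bit is shifted out
safe = unsigned_shift_loses_bits(0x0000FFFF, 16)  # all set bits still fit
```

The result stays well defined in C and C++ either way; the check only reports that information was silently discarded.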
Cullen Rhodes	2e7041fdc2	Revert "[CodeGen][AArch64] Support arm_sve_vector_bits attribute" Test CodeGen/attr-arm-sve-vector-bits-call.c is failing on some builders [1][2]. Reverting whilst I investigate. [1] http://lab.llvm.org:8011/builders/fuchsia-x86_64-linux/builds/10375 [2] https://luci-milo.appspot.com/p/fuchsia/builders/ci/clang-linux-x64/b8870800848452818112 This reverts commit `42587345a3`.	2020-08-27 21:31:05 +00:00
Mikhail Maltsev	ae1396c7d4	[ARM][BFloat16] Change types of some Arm and AArch64 bf16 intrinsics This patch adjusts the following ARM/AArch64 LLVM IR intrinsics: - neon_bfmmla - neon_bfmlalb - neon_bfmlalt so that they take and return bf16 and float types. Previously these intrinsics used <8 x i8> and <4 x i8> vectors (a rudiment from implementation lacking bf16 IR type). The neon_vbfdot[q] intrinsics are adjusted similarly. This change required some additional selection patterns for vbfdot itself and also for vector shuffles (in a previous patch) because of SelectionDAG transformations kicking in and mangling the original code. This patch makes the generated IR cleaner (less useless bitcasts are produced), but it does not affect the final assembly. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D86146	2020-08-27 18:43:16 +01:00
Cullen Rhodes	42587345a3	[CodeGen][AArch64] Support arm_sve_vector_bits attribute This patch implements codegen for the 'arm_sve_vector_bits' type attribute, defined by the Arm C Language Extensions (ACLE) for SVE [1]. The purpose of this attribute is to define vector-length-specific (VLS) versions of existing vector-length-agnostic (VLA) types. VLSTs are represented as VectorType in the AST and fixed-length vectors in the IR everywhere except in function args/return. Implemented in this patch is codegen support for the following: * Implicit casting between VLA <-> VLS types. * Coercion of VLS types in function args/return. * Mangling of VLS types. Casting is handled by the CK_BitCast operation, which has been extended to support the two new vector kinds for fixed-length SVE predicate and data vectors, where the cast is implemented through memory rather than a bitcast which is unsupported. Implementing this as a normal bitcast would require relaxing checks in LLVM to allow bitcasting between scalable and fixed types. Another option was adding target-specific intrinsics, although codegen support would need to be added for these intrinsics. Given this, casting through memory seemed like the best approach as it's supported today and existing optimisations may remove unnecessary loads/stores, although there is room for improvement here. Coercion of VLSTs in function args/return from fixed to scalable is implemented through the AArch64 ABI in TargetInfo. The VLA and VLS types are defined by the ACLE to map to the same machine-level SVE vectors. VLS types are mangled in the same way as: __SVE_VLS<typename, unsigned> where the first argument is the underlying variable-length type and the second argument is the SVE vector length in bits. 
For example: #if __ARM_FEATURE_SVE_BITS==512 // Mangled as 9__SVE_VLSIu11__SVInt32_tLj512EE typedef svint32_t vec __attribute__((arm_sve_vector_bits(512))); // Mangled as 9__SVE_VLSIu10__SVBool_tLj512EE typedef svbool_t pred __attribute__((arm_sve_vector_bits(512))); #endif The latest ACLE specification (00bet5) does not contain details of this mangling scheme, it will be specified in the next revision. The mangling scheme is otherwise defined in the appendices to the Procedure Call Standard for the Arm Architecture, see [2] for more information. [1] https://developer.arm.com/documentation/100987/latest [2] https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#appendix-c-mangling Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D85743	2020-08-27 15:11:58 +00:00
Sander de Smalen	4e9b66de3f	[AArch64][SVE] Add missing debug info for ACLE types. This patch adds type information for SVE ACLE vector types, by describing them as vectors, with a lower bound of 0, and an upper bound described by a DWARF expression using the AArch64 Vector Granule register (VG), which contains the runtime multiple of 64bit granules in an SVE vector. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D86101	2020-08-27 10:56:42 +01:00
Amy Kwan	76b0f99ea8	[PowerPC] Implement Vector Multiply High/Divide Extended Builtins in LLVM/Clang This patch implements the function prototypes vec_mulh and vec_dive in order to utilize the vector multiply high (vmulh[s|u][w|d]) and vector divide extended (vdive[s|u][w|d]) instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D82609	2020-08-26 23:14:34 -05:00
Freddy Ye	e02d081f2b	[X86] Support -march=sapphirerapids Support -march=sapphirerapids for x86. Compared with Icelake Server, it includes 14 more new features. They are amxtile, amxint8, amxbf16, avx512bf16, avx512vp2intersect, cldemote, enqcmd, movdir64b, movdiri, ptwrite, serialize, shstk, tsxldtrk, waitpkg. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D86503	2020-08-25 14:21:21 +08:00
Simon Pilgrim	a1dc3d241b	[X86] Enable constexpr on ROTL/ROTR intrinsics (PR31446) This enables constexpr rotate intrinsics defined in ia32intrin.h, including the MS specific builtins.	2020-08-23 16:11:58 +01:00
Simon Pilgrim	f8e0e5db48	[X86] Enable constexpr on _cast fp<-> uint intrinsics (PR31446) As suggested by @rsmith on PR47267, by replacing the builtin_memcpy bitcast pattern with builtin_bit_cast we can use _castf32_u32, _castu32_f32, _castf64_u64 and _castu64_f64 inside constant expressions (constexpr). Although __builtin_bit_cast was added for c++20 it works on all clang c/c++ modes. Differential Revision: https://reviews.llvm.org/D86398	2020-08-23 10:27:46 +01:00
Florian Hahn	bc72a3ab94	[Constants] Handle FNeg in getWithOperands. Currently ConstantExpr::getWithOperands does not handle FNeg and subsequently treats FNeg as binary operator, leading to an assertion failure or segmentation fault if built without assertions. Originally I reproduced this with llvm-dis on a bitcode file, which I unfortunately cannot share and also cannot really reduce. But PR45426 describes the same issue and has a reproducer with Clang, so I'll go with that. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D86274	2020-08-21 16:50:56 +01:00
Simon Pilgrim	9ffc412e1a	[X86] Enable constexpr on BITSCAN intrinsics (PR31446) This enables constexpr BSF/BSR intrinsics defined in ia32intrin.h	2020-08-21 11:44:20 +01:00
Simon Pilgrim	c8e6bf0a65	[X86] Enable constexpr on BSWAP intrinsics (PR31446) This enables constexpr BSWAP intrinsics defined in ia32intrin.h	2020-08-21 10:55:15 +01:00
Simon Pilgrim	c6863a4ab8	[X86] Enable constexpr on POPCNT intrinsics (PR31446) Followup to D86229, this enables constexpr on the alternative (which fall back to generic code) POPCNT intrinsics defined in ia32intrin.h	2020-08-21 10:20:37 +01:00
Qiu Chaofan	91039784b3	[PowerPC] Add readflm/setflm intrinsics to Clang Commit `dbcfbffc` adds ppc.readflm and ppc.setflm intrinsics to read or write FPSCR register. This patch adds them to Clang. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D85874	2020-08-21 15:12:19 +08:00
Simon Pilgrim	cff0db0876	[X86] Enable constexpr on POPCNT intrinsics (PR31446) This is a first step patch to enable constexpr support and testing to a large number of x86 intrinsics. All I've done here is provide a DEFAULT_FN_ATTRS_CONSTEXPR variant to our existing DEFAULT_FN_ATTRS tag approach that adds constexpr on c++ builds. The clang cuda headers do something similar. I've started with POPCNT mainly as they're tiny and are wrappers to generic __builtin_* intrinsics which already act as constexpr. Differential Revision: https://reviews.llvm.org/D86229	2020-08-20 21:38:04 +01:00
Craig Topper	724f570ad2	[X86] Add support 'tune' in target attribute This adds parsing and codegen support for tune in target attribute. I've implemented this so that arch in the target attribute implicitly disables tune from the command line. I'm not sure what gcc does here. But since -march implies -mtune, I assume 'arch' in the target attribute implies tune in the target attribute. Differential Revision: https://reviews.llvm.org/D86187	2020-08-19 15:58:19 -07:00
Craig Topper	4a36711439	[X86] Add mtune command line test cases that should have gone with `4cbceb74bb`	2020-08-19 15:58:06 -07:00
Eli Friedman	673dbe1b5e	[clang codegen] Use IR "align" attribute for static array arguments. Without the "align" attribute, marking the argument dereferenceable is basically useless. See also D80166. Fixes https://bugs.llvm.org/show_bug.cgi?id=46876 . Differential Revision: https://reviews.llvm.org/D84992	2020-08-18 12:51:16 -07:00
Amy Kwan	c7ec3a7e33	[PowerPC] Implement Vector Extract Mask builtins in LLVM/Clang This patch implements the vec_extractm function prototypes in altivec.h in order to utilize the vector extract with mask instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D82675	2020-08-17 21:14:17 -05:00
Alexandre Ganea	98e01f56b0	Revert "Re-Re-land: [CodeView] Add full repro to LF_BUILDINFO record" This reverts commit `a3036b3863`. As requested in: https://reviews.llvm.org/D80833#2221866 Bug report: https://crbug.com/1117026	2020-08-17 15:49:18 -04:00
Arthur Eubanks	b0ceff94d6	[test] Fix aggregate-assign-call.c in preparation for -enable-npm-optnone Pin the test to use -enable-npm-optnone. Before, optnone wasn't implemented under NPM, so the LPM and NPM runs produced different IR. Now with -enable-npm-optnone, that is no longer necessary. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D86008	2020-08-17 10:06:40 -07:00
Arthur Eubanks	a397319509	[test] Fix thinlto-debug-pm.c in preparation for -enable-npm-optnone This fails due to the clang invocation running at -O0, producing an optnone function. Then even with -O2 in the later invocations, LoopVectorizePass doesn't run on the optnone function. So split this into an -O0 run and an -O2 run. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D86011	2020-08-17 10:06:15 -07:00
Mark de Wever	fef2607124	[Sema] Use the proper cast for a fixed bool enum. When casting an enumerate with a fixed bool type the casting should use an IntegralToBoolean instead of an IntegralCast as is required per Core Issue 2338. Fixes PR47055: Incorrect codegen for enum with bool underlying type Differential Revision: https://reviews.llvm.org/D85612	2020-08-16 18:40:08 +02:00
Arthur Eubanks	e6ea8779c2	[NewPM][optnone] Mark various passes as required This was done by turning on -enable-npm-optnone and fixing failures. That will be enabled in a follow-up change for ease of reverting. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D85457	2020-08-14 15:51:59 -07:00
Gui Andrade	909a851dbf	[CGAtomic] Mark atomic libcall functions `nounwind` These functions won't ever unwind. This is useful for MemorySanitizer as it simplifies handling __atomic_load in particular. Differential Revision: https://reviews.llvm.org/D85573	2020-08-14 07:46:43 +00:00
Albion Fung	3136cbe29e	[PowerPC] Implement Vector Shift Builtins This patch implements the builtins for the vector shifts (shl, srl, sra), and adds the appropriate test cases for these builtins. The builtins utilize the vector shift instructions introduced within ISA 3.1. Differential Revision: https://reviews.llvm.org/D83338	2020-08-12 18:26:58 -05:00
Craig Topper	a7a06ded8b	Recommit "[InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms" and its follow up patches This recommits the following patches now that D85684 has landed `1cf6f210a2` [IR] Disable select ? C : undef -> C fold in ConstantFoldSelectInstruction unless we know C isn't poison. `469da663f2` [InstSimplify] Re-enable select ?, undef, X -> X transform when X is provably not poison `122b0640fc` [InstSimplify] Don't fold vectors of partial undef in SimplifySelectInst if the non-undef element value might produce poison `ac0af12ed2` [InstSimplify] Add test cases for opportunities to fold select ?, X, undef -> X when we can prove X isn't poison `9b1e95329a` [InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms	2020-08-12 10:45:27 -07:00
Erich Keane	aa4bc1cb79	Limit Max Vector alignment on COFF targets to 8192. COFF targets have a max object alignment of 8192, so trying to create one with a larger size results in an unreachable in WinCOFFObjectWriter. The reproducer I have uses thread local storage; however, other alignments are likely affected as well. This patch sets the MaxVectorAlign for COFF to 8192. Additionally, though there is no longer a way to reproduce that I could find, it correctly sets the MaxTLSAlign for COFF to that value as well, so that if anyone comes up with a situation where this is true, it will cause an error. Differential Revision: https://reviews.llvm.org/D85543	2020-08-12 06:35:35 -07:00
Wang, Pengfei	9512525947	[X86][FPEnv] Teach X86 mask compare intrinsics to respect strict FP semantics. When we use mask compare intrinsics under strict FP option, the masked elements shouldn't raise any exception. So, we can't replace the intrinsic with a full compare + "and" operation. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D85385	2020-08-11 10:28:41 +08:00
Albion Fung	ed66df6705	test commit	2020-08-10 21:18:36 -04:00
Nick Desaulniers	4f2ad15db5	[Clang] implement -fno-eliminate-unused-debug-types Fixes pr/11710. Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Resubmit after breaking Windows and OSX builds. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D80242	2020-08-10 15:08:48 -07:00
Alexandre Ganea	a3036b3863	Re-Re-land: [CodeView] Add full repro to LF_BUILDINFO record This patch adds the missing information to the LF_BUILDINFO record, which allows for rebuilding a .CPP without any external dependency but the .OBJ itself (other than the compiler). Some external tools that we are using (Recode, Live++) are extracting the information to reproduce a build without any knowledge of the build system. The LF_BUILDINFO stores a full path to the compiler, the PWD (CWD at program startup), a relative or absolute path to the TU, and the full CC1 command line. The command line needs to be freestanding (not depend on any environment variables). In the same way, MSVC doesn't store the provided command-line, but an expanded version (somehow their equivalent of CC1) which is also freestanding. For more information see PR36198 and D43002. Differential Revision: https://reviews.llvm.org/D80833	2020-08-10 13:36:30 -04:00
Krzysztof Parzyszek	7406eb4f6a	[Hexagon] Avoid creating an empty target feature If the CPU string is empty, the target feature map may end up having an empty string inserted to it. The symptom of the problem is a warning message: '+' is not a recognized feature for this target (ignoring feature) Also, the target-features attribute in the module will have an empty string in it.	2020-08-10 10:37:24 -05:00
Juneyoung Lee	ef018cb65c	[BuildLibCalls] Add noundef to standard I/O functions This patch adds noundef to return value and arguments of standard I/O functions. With this patch, passing undef or poison to the functions becomes undefined behavior in LLVM IR. Since undef/poison is lowered from operations having UB in C/C++, passing undef to them was already UB in source. With this patch, the functions cannot return undef or poison anymore as well. According to C17 standard, ungetc/ungetwc/fgetpos/ftell can generate unspecified value; 3.19.3 says unspecified value is a valid value of the relevant type, and using unspecified value is unspecified behavior, which is not UB, so it cannot be undef (using undef is UB when e.g. it is used at branch condition). — The value of the file position indicator after a successful call to the ungetc function for a text stream, or the ungetwc function for any stream, until all pushed-back characters are read or discarded (7.21.7.10, 7.29.3.10). — The details of the value stored by the fgetpos function (7.21.9.1). — The details of the value returned by the ftell function for a text stream (7.21.9.4). In the long run, most of the functions listed in BuildLibCalls should have noundefs; to remove redundant diffs which will anyway disappear in the future, I added noundef to a few more non-I/O functions as well. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D85345	2020-08-10 10:58:25 +09:00
Nick Desaulniers	abb9bf4bcf	Revert "[Clang] implement -fno-eliminate-unused-debug-types" This reverts commit `e486921fd6`. Breaks windows builds and osx builds.	2020-08-07 16:11:41 -07:00
Nick Desaulniers	73413d266a	Revert "fix windows build for D80242" This reverts commit `cbd8ec9370`.	2020-08-07 16:11:26 -07:00
Nick Desaulniers	cbd8ec9370	fix windows build for D80242	2020-08-07 14:59:35 -07:00
Nick Desaulniers	e486921fd6	[Clang] implement -fno-eliminate-unused-debug-types Fixes pr/11710. Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D80242	2020-08-07 14:13:48 -07:00
biplmish	cce1b0e891	[PowerPC] Implement Vector Extract Low/High Order Builtins in LLVM/Clang This patch implements the function prototypes vec_extractl and vec_extracth in altivec.h to utilize the vector extract double element instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D84622	2020-08-07 01:02:29 -05:00
Erich Keane	2143a90b34	Fix _ExtInt(1) to be a i1 in memory. The _ExtInt(1) in getTypeForMem was hitting the bool logic for expanding to an 8 bit value. The result was an assert, or store i1 %0, i8* %2, align 1 since the parameter IS an i1. This patch changes the 'forMem' test to exclude ext-int from the bool test.	2020-08-05 10:54:51 -07:00
Yonghong Song	00602ee7ef	BPF: simplify IR generation for __builtin_btf_type_id() This patch simplified IR generation for __builtin_btf_type_id(). For __builtin_btf_type_id(obj, flag), previously IR builtin looks like if (obj is a lvalue) llvm.bpf.btf.type.id(obj.ptr, 1, flag) !type else llvm.bpf.btf.type.id(obj, 0, flag) !type The purpose of the 2nd argument is to differentiate __builtin_btf_type_id(obj, flag) where obj is a lvalue vs. __builtin_btf_type_id(obj.ptr, flag) Note that obj or obj.ptr is never used by the backend and the `obj` argument is only used to derive the type. This code sequence is subject to potential llvm CSE when - obj is the same, e.g., nullptr - flag is the same - metadata type is different, e.g., typedef of struct "s" and struct "s". In the above, we don't want CSE since their metadata is different. This patch changes the IR builtin to llvm.bpf.btf.type.id(seq_num, flag) !type and seq_num is always increasing. This will prevent potential llvm CSE. Also report an error if the type name is empty for remote relocation since remote relocation needs non-empty type name to do relocation against vmlinux. Differential Revision: https://reviews.llvm.org/D85174	2020-08-04 16:29:42 -07:00
Dan Gohman	47f7174ffa	[WebAssembly] Use "signed char" instead of "char" in SIMD intrinsics. This allows people to use `int8_t` instead of `char`, -funsigned-char, and generally decouples SIMD from the specialness of `char`. And it makes intrinsics like `__builtin_wasm_add_saturate_s_i8x16` and `__builtin_wasm_add_saturate_u_i8x16` use signed and unsigned element types, respectively. Differential Revision: https://reviews.llvm.org/D85074	2020-08-04 12:48:40 -07:00
Thorsten Schuett	e18c6ef6b4	[clang] improve diagnostics for misaligned and large atomics "Listing the alignment and access size (== expected alignment) in the warning seems like a good idea." solves PR 46947 struct Foo { struct Bar { void * a; void * b; }; Bar bar; }; struct ThirtyTwo { struct Large { void * a; void * b; void * c; void * d; }; Large bar; }; void braz(Foo foo, ThirtyTwo braz) { Foo::Bar bar; __atomic_load(&foo->bar, &bar, __ATOMIC_RELAXED); ThirtyTwo::Large foobar; __atomic_load(&braz->bar, &foobar, __ATOMIC_RELAXED); } repro.cpp:21:3: warning: misaligned atomic operation may incur significant performance penalty; the expected (16 bytes) exceeds the actual alignment (8 bytes) [-Watomic-alignment] __atomic_load(&foo->bar, &bar, __ATOMIC_RELAXED); ^ repro.cpp:24:3: warning: misaligned atomic operation may incur significant performance penalty; the expected (32 bytes) exceeds the actual alignment (8 bytes) [-Watomic-alignment] __atomic_load(&braz->bar, &foobar, __ATOMIC_RELAXED); ^ repro.cpp:24:3: warning: large atomic operation may incur significant performance penalty; the access size (32 bytes) exceeds the max lock-free size (16 bytes) [-Watomic-alignment] 3 warnings generated. Differential Revision: https://reviews.llvm.org/D85102	2020-08-04 11:10:29 -07:00
Yonghong Song	6d67506964	[clang][BPF] support type exist/size and enum exist/value relocations This patch added the following additional compile-once run-everywhere (CO-RE) relocations: - existence/size of typedef, struct/union or enum type - enum value and enum value existence These additional relocations will make CO-RE bpf programs more adaptive for potential kernel internal data structure changes. For existence/size relocations, the following two code patterns are supported: 1. uint32_t __builtin_preserve_type_info((<type> )0, flag); 2. <type> var; uint32_t __builtin_preserve_field_info(var, flag); flag = 0 for existence relocation and flag = 1 for size relocation. For enum value existence and enum value relocations, the following code pattern is supported: uint64_t __builtin_preserve_enum_value((<enum_type> )<enum_value>, flag); flag = 0 means existence relocation and flag = 1 means enum value relocation. In the above <enum_type> can be an enum type or a typedef to enum type. The <enum_value> needs to be an enumerator value from the same enum type. The return type is uint64_t to permit potential 64bit enumerator values. Differential Revision: https://reviews.llvm.org/D83242	2020-08-04 08:39:53 -07:00
Kazushi (Jam) Marukawa	045e79e77c	[VE] Extend integer arguments and return values smaller than 64 bits In order to follow NEC Aurora SX VE ABI correctly, change to sign/zero extend integer arguments and return values smaller than 64 bits in clang. Also update regression test. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D85071	2020-08-04 08:07:05 +09:00
Thomas Lively	cb32792210	[WebAssembly] Implement prototype v128.load{32,64}_zero instructions Specified in https://github.com/WebAssembly/simd/pull/237, these instructions load the first vector lane from memory and zero the other lanes. Since these instructions are not officially part of the SIMD proposal, they are only available on an opt-in basis via LLVM intrinsics and clang builtin functions. If these instructions are merged to the proposal, this implementation will change so that the instructions will be generated from normal IR. At that point the intrinsics and builtin functions would be removed. This PR also changes the opcodes for the experimental f32x4.qfm{a,s} instructions because their opcodes conflicted with those of the v128.load{32,64}_zero instructions. The new opcodes were chosen to match those used in V8. Differential Revision: https://reviews.llvm.org/D84820	2020-08-03 13:54:00 -07:00
Florian Hahn	00a0282ff8	[Clang] Remove run-lines which use opt to run -ipconstprop. ipconstprop is going to get removed and checking opt with specific passes makes the tests more fragile. The tests retain the important checks that !callback metadata is created correctly.	2020-08-02 21:47:32 +01:00
Nikita Popov	25af353b0e	[NewPM][LVI] Abandon LVI after CVP As mentioned on D70376, LVI can currently cause performance issues when running under NewPM. The problem is that, unlike the legacy pass manager, NewPM will not immediately discard the LVI analysis if the following pass does not need it. This is a problem, because LVI has a high memory requirement, and mass invalidation of LVI values is very inefficient. LVI should only be alive during passes that actively interact with it. This patch addresses the issue by explicitly abandoning LVI after CVP, which gets us back to the LegacyPM behavior. Differential Revision: https://reviews.llvm.org/D84959	2020-08-01 23:47:46 +02:00
Amy Kwan	c4e5743232	[PowerPC] Implement low-order Vector Modulus Builtins, and add Vector Multiply/Divide/Modulus Builtins Tests Power10 introduces new instructions for vector multiply, divide and modulus. These instructions can be exploited by the builtin functions: vec_mul, vec_div, and vec_mod, respectively. This patch adds the function prototype, vec_mod, as vec_mul and vec_div have been previously implemented in altivec.h. This patch also adds the following front end tests: vec_mul for v2i64 vec_div for v4i32 and v2i64 vec_mod for v4i32 and v2i64 Differential Revision: https://reviews.llvm.org/D82576	2020-07-31 10:58:07 -05:00
Arthur Eubanks	c03d3aca7d	[test] Fix thinlto-distributed-newpm.ll Broken by https://reviews.llvm.org/D84981.	2020-07-30 20:09:34 -07:00
Eli Friedman	8dfb5d767e	[clang codegen][AArch64] Use llvm.aarch64.neon.fcvtzs/u where it's necessary fptosi/fptoui have similar, but not identical, semantics. In particular, the behavior on overflow is different. Fixes https://bugs.llvm.org/show_bug.cgi?id=46844 for 64-bit. (The corresponding patch for 32-bit is more involved because the equivalent intrinsics don't exist, as far as I can tell.) Differential Revision: https://reviews.llvm.org/D84703	2020-07-30 15:41:54 -07:00
Yuanfang Chen	555cf42f38	[NewPM][PassInstrument] Add PrintPass callback to StandardInstrumentations Problem: Right now, our "Running pass" is not accurate when passes are wrapped in an adaptor because the adaptor is never skipped and a pass could be skipped. The other problem is that "Running pass" for an adaptor is before any "Running pass" of passes/analyses it depends on. (for example, FunctionToLoopPassAdaptor). So the order of printing is not the actual order. Solution: Doing things like PassManager::Debuglogging is very intrusive because we need to specify Debuglogging whenever an adaptor is created. (Actually, right now we're not specifying Debuglogging for some sub-PassManagers. Check PassBuilder) This patch moves debug logging for passes into a PassInstrument callback. We can be sure that all running passes are logged and in the correct order. This could also be used to implement hierarchy pass logging in legacy PM. We could also move logging of pass manager to this if we want. The test fixes look messy. They include these changes: - Remove PassInstrumentationAnalysis - Remove PassAdaptor - If a PassAdaptor is for a real pass, the pass is added - Pass reorder (to the correct order), related to PassAdaptor - Add missing passes (due to Debuglogging not passed down) Reviewed By: asbirlea, aeubanks Differential Revision: https://reviews.llvm.org/D84774	2020-07-30 10:07:57 -07:00
Hiroshi Yamauchi	3d6f53018f	[PGO] Include the mem ops into the function hash. To avoid hash collisions when the only difference is in mem ops.	2020-07-30 09:26:20 -07:00
Hiroshi Yamauchi	ae7589e1f1	Revert "[PGO] Include the mem ops into the function hash." This reverts commit `120e66b341`. Due to a buildbot failure.	2020-07-29 15:04:57 -07:00
Hiroshi Yamauchi	120e66b341	[PGO] Include the mem ops into the function hash. To avoid hash collisions when the only difference is in mem ops. Differential Revision: https://reviews.llvm.org/D84782	2020-07-29 13:59:40 -07:00
Victor Campos	71bf6dd682	[Driver][ARM] Fix testcase that should only run on ARM Fix testcase introduced in `d1a3396bfb`.	2020-07-29 14:35:14 +01:00
Victor Campos	d1a3396bfb	[Driver][ARM] Disable unsupported features when nofp arch extension is used A list of target features is disabled when there is no hardware floating-point support. This is the case when one of the following options is passed to clang: - -mfloat-abi=soft - -mfpu=none This option list is missing, however, the extension "+nofp" that can be specified in -march flags, such as "-march=armv8-a+nofp". This patch also disables unsupported target features when nofp is passed to -march. Differential Revision: https://reviews.llvm.org/D82948	2020-07-29 14:13:22 +01:00
Thomas Lively	11bb7eef41	[WebAssembly] Remove intrinsics for SIMD widening ops Instead, pattern match extends of extract_subvectors to generate widening operations. Since extract_subvector is not a legal node, this is implemented via a custom combine that recognizes extract_subvector nodes before they are legalized. The combine produces custom ISD nodes that are later pattern matched directly, just like the intrinsic was. Also removes the clang builtins for these operations since the instructions can now be generated from portable code sequences. Differential Revision: https://reviews.llvm.org/D84556	2020-07-28 18:25:55 -07:00
JF Bastien	389f009c57	[NFC] Sema: use checkArgCount instead of custom checking As requested in D79279. Differential Revision: https://reviews.llvm.org/D84666	2020-07-28 13:41:06 -07:00
biplmish	0eff8b3865	[PowerPC] Cleanup p10vector clang test Remove the duplicate LE test, correct the labels and remove common tests for vec_splat builtin. Differential Revision: https://reviews.llvm.org/D84382	2020-07-26 21:23:00 -05:00
Amy Kwan	74790a5dde	[PowerPC] Implement Truncate and Store VSX Vector Builtins This patch implements the `vec_xst_trunc` function in altivec.h in order to utilize the Store VSX Vector Rightmost [byte | half | word | doubleword] Indexed instructions introduced in Power10. Differential Revision: https://reviews.llvm.org/D82467	2020-07-24 19:22:39 -05:00
Ulrich Weigand	7f003957bf	[SystemZ] Implement __builtin_eh_return_data_regno Implement __builtin_eh_return_data_regno for SystemZ. Match behavior of GCC. Author: slavek-kucera Differential Revision: https://reviews.llvm.org/D84341	2020-07-24 10:28:06 +02:00
Amy Kwan	5f11027395	[PowerPC][Power10] Fix vinsvlx instructions to have i32 arguments. Previously, the vinsvlx instructions were incorrectly defined with i64 as the second argument. This patch fixes this issue by correcting the second argument of the vins*vlx instructions/intrinsics to be i32. Differential Revision: https://reviews.llvm.org/D84277	2020-07-22 17:58:14 -05:00
Richard Smith	6c18f7db73	For PR46800, implement the GCC __builtin_complex builtin. glibc's implementation of the CMPLX macro uses it (with -fgnuc-version set to 4.7 or later).	2020-07-22 13:43:10 -07:00
David Blaikie	b198de67e0	Merge some of the PCH object support with modular codegen I was trying to pick this up a bit when reviewing D48426 (& perhaps D69778) - in any case, looks like D48426 added a module level flag that might not be needed. The D48426 implementation worked by setting a module level flag, then when code generating contents from the PCH a special case in ASTContext::DeclMustBeEmitted would be used to delay emitting the definition of these functions if they came from a Module with this flag. This strategy is similar to the one initially implemented for modular codegen that was removed in D29901 in favor of the modular decls list and a bit on each decl to specify whether it's homed to a module. One major difference between PCH object support and modular code generation, other than the specific list of decls that are homed, is the compilation model: MSVC PCH modules are built into the object file for some other source file (when compiling that source file /Yc is specified to say "this compilation is where the PCH is homed"), whereas modular code generation invokes a separate compilation for the PCH alone. So the current modular code generation test to decide if a decl should be emitted "is the module where this decl is serialized the current main file" has to be extended (as Lubos did in D69778) to also test the command line flag -building-pch-with-obj. Otherwise the whole thing is basically streamlined down to the modular code generation path. This even offers one extra material improvement compared to the existing divergent implementation: Homed functions are not emitted into object files that use the pch. Instead at -O0 they are not emitted into the IR at all, and at -O1 they are emitted using available_externally (existing functionality implemented for modular code generation). The pch-codegen test has been updated to reflect this new behavior. 
[If possible: I'd love it if we could not have the extra MSVC-style way of accessing dllexport-pch-homing, and just do it the modular codegen way, but I understand that it might be a limitation of existing build systems. @hans / @thakis: Do either of you know if it'd be practical to move to something more similar to .pcm handling, where the pch itself is passed to the compilation, rather than homed as a side effect of compiling some other source file?] Reviewers: llunak, hans Differential Revision: https://reviews.llvm.org/D83652	2020-07-22 12:46:12 -07:00
Amy Kwan	08b4a50e39	[PowerPC][Power10] Fix the Test LSB by Byte (xvtlsbb) Builtins Implementation The implementation of the xvtlsbb builtins/intrinsics was not correct as the intrinsics previously used i1 as an argument type. This patch changes the i1 argument type used in these intrinsics to be i32 instead, as having the second argument as an i1 can lead to issues in the backend. Differential Revision: https://reviews.llvm.org/D84291	2020-07-22 13:27:05 -05:00
Sebastian Neubauer	b99898c1e9	Fix target specific InstCombine A clang arm test was failing if clang is compiled without arm support. Regression was introduced in `2a6c871596`	2020-07-22 17:00:46 +02:00
Sebastian Neubauer	2a6c871596	[InstCombine] Move target-specific inst combining For a long time, the InstCombine pass handled target specific intrinsics. Having target specific code in general passes was noted as an area for improvement for a long time. D81728 moves most target specific code out of the InstCombine pass. Applying the target specific combinations in an extra pass would probably result in inferior optimizations compared to the current fixed-point iteration, therefore the InstCombine pass resorts to newly introduced functions in the TargetTransformInfo when it encounters unknown intrinsics. The patch should not have any effect on generated code (under the assumption that code never uses intrinsics from a foreign target). This introduces three new functions: TargetTransformInfo::instCombineIntrinsic TargetTransformInfo::simplifyDemandedUseBitsIntrinsic TargetTransformInfo::simplifyDemandedVectorEltsIntrinsic A few target specific parts are left in the InstCombine folder, where it makes sense to share code. The largest left-over part in InstCombineCalls.cpp is the code shared between arm and aarch64. This allows moving about 3000 lines out from InstCombine to the targets. Differential Revision: https://reviews.llvm.org/D81728	2020-07-22 15:59:49 +02:00
Sjoerd Meijer	5567c62afa	[Matrix] Add LowerMatrixIntrinsics to the NPM Pass LowerMatrixIntrinsics wasn't yet running under the new pass manager, and this adds LowerMatrixIntrinsics to the pipeline (to the same place as where it is running in the old PM). Differential Revision: https://reviews.llvm.org/D84180	2020-07-22 09:47:53 +01:00
Wang, Pengfei	18581fd2c4	[CFE] Add nomerge function attribute to inline assembly. Sometimes we also want to avoid merging inline assembly. This patch adds the nomerge function attribute to inline assembly. Reviewed By: zequanwu Differential Revision: https://reviews.llvm.org/D84225	2020-07-22 08:22:58 +08:00
Akira Hatanaka	73bc23ff86	Fix the data layout mangling specification for 'i686-pc-macho' Use 'o' for the mangling specification instead of 'e'. This fixes an error in the backend caused by a mismatch between the data layouts generated by the backend and the frontend. rdar://problem/64168540	2020-07-21 12:58:17 -07:00
Elvina Yakubova	b36a3e6140	[llvm-readobj] Update tests because of changes in llvm-readobj behavior This patch updates tests using llvm-readobj and llvm-readelf, because soon reading from stdin will be achievable only via a '-' as described here: https://bugs.llvm.org/show_bug.cgi?id=46400. Patch with changes to llvm-readobj behavior is here: https://reviews.llvm.org/D83704 Differential Revision: https://reviews.llvm.org/D83912 Reviewed by: jhenderson, MaskRay, grimar	2020-07-20 10:39:04 +01:00
Fangrui Song	5809a32e7c	[gcov] Add __gcov_dump/__gcov_reset and delete __gcov_flush GCC r187297 (2012-05) introduced `__gcov_dump` and `__gcov_reset`. `__gcov_flush = __gcov_dump + __gcov_reset` The resolution to https://gcc.gnu.org/PR93623 ("No need to dump gcdas when forking" target GCC 11.0) removed the unuseful and undocumented __gcov_flush. Close PR38064. Reviewed By: calixte, serge-sans-paille Differential Revision: https://reviews.llvm.org/D83149	2020-07-18 15:07:46 -07:00
Sjoerd Meijer	c2d69d8d62	Remove clang matrix lowering test for now as it is still failing under the NPM.	2020-07-17 22:42:12 +01:00
Eric Christopher	7bfaa40086	Temporarily Revert "[AssumeBundles] Use operand bundles to encode alignment assumptions" due to the performance bugs filed in https://bugs.llvm.org/show_bug.cgi?id=46753. An SROA change soon may obviate some of these problems. This reverts commit `8d09f20798`.	2020-07-16 11:54:04 -07:00
Sjoerd Meijer	0160ad802e	And now really disable that test.	2020-07-16 16:14:47 +01:00
Sjoerd Meijer	31248b4785	Last attempt for rG3a624c327add: one test fails with the NPM, so disable that one for now.	2020-07-16 16:12:47 +01:00
Sjoerd Meijer	a7a07a8d63	Follow up of rG3a624c327add: pacify buildbot, add "REQUIRES: aarch64" to test	2020-07-16 15:38:36 +01:00
Sjoerd Meijer	3a624c327a	[Matrix] Add the matrix test from D83570. NFC.	2020-07-16 15:19:45 +01:00
Amy Kwan	fc55308628	[PowerPC][Power10] Fix VINS* (vector insert byte/half/word) instructions to have i32 arguments. Previously, the vins* intrinsic was incorrectly defined to have its second and third arguments as i64. This patch fixes the second and third arguments of the vins* instruction and intrinsic to be i32 instead. Differential Revision: https://reviews.llvm.org/D83497	2020-07-16 00:30:24 -05:00
Craig Topper	00f3579aea	Revert "[InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms" and subsequent patches This reverts most of the following patches due to reports of miscompiles. I've left the added test cases with comments updated to be FIXMEs. `1cf6f210a2` [IR] Disable select ? C : undef -> C fold in ConstantFoldSelectInstruction unless we know C isn't poison. `469da663f2` [InstSimplify] Re-enable select ?, undef, X -> X transform when X is provably not poison `122b0640fc` [InstSimplify] Don't fold vectors of partial undef in SimplifySelectInst if the non-undef element value might produce poison `ac0af12ed2` [InstSimplify] Add test cases for opportunities to fold select ?, X, undef -> X when we can prove X isn't poison `9b1e95329a` [InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms	2020-07-15 22:02:33 -07:00
Qiu Chaofan	ef30a00a57	[NFC] Add float aggregate ABI test for PowerPC `4c5a93bd` landed adjustment to handle C++20 no_unique_address attribute correctly, clang treats empty members in aggregate type differently if having this attribute. This commit adds necessary test for PowerPC target to reflect this change.	2020-07-16 00:11:09 +08:00
Florian Hahn	c872e809d1	[Matrix] Only pass vector arg as overloaded type in MatrixBuilder. In `2b3c505`, the pointer arguments for the matrix load and store intrinsics were changed to always be the element type of the vector argument. This patch updates the MatrixBuilder to not add the pointer type to the overloaded types and adjusts the clang/mlir tests. This should fix a few build failures on GreenDragon, including http://green.lab.llvm.org/green/job/test-suite-verify-machineinstrs-x86_64-O0-g/7891/	2020-07-15 10:42:24 +01:00
Tim Northover	5165b2b5fd	AArch64+ARM: make LLVM consider system registers volatile. Some of the system registers readable on AArch64 and ARM platforms return different values with each read (for example a timer counter), these shouldn't be hoisted outside loops or otherwise interfered with, but the normal @llvm.read_register intrinsic is only considered to read memory. This introduces a separate @llvm.read_volatile_register intrinsic and maps all system-registers on ARM platforms to use it for the __builtin_arm_rsr calls. Registers declared with asm("r9") or similar are unaffected.	2020-07-15 09:47:36 +01:00
Amy Kwan	62f5ba624b	[PowerPC][Power10] Implement Test LSB by Byte Builtins in LLVM/Clang This patch implements builtins for the Test LSB by Byte instruction introduced in Power10. Differential Revision: https://reviews.llvm.org/D82431	2020-07-13 22:47:47 -05:00
Tyker	8d09f20798	[AssumeBundles] Use operand bundles to encode alignment assumptions Summary: NOTE: There is a mailing list discussion on this: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html Complementary to the assumption outliner prototype in D71692, this patch shows how we could simplify the code emitted for an alignment assumption. The generated code is smaller, less fragile, and it makes it easier to recognize the additional use as an "assumption use". As mentioned in D71692 and on the mailing list, we could adopt this scheme, and similar schemes for other patterns, without adopting the assumption outlining. Reviewers: hfinkel, xbolva00, lebedev.ri, nikic, rjmccall, spatel, jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: thopre, yamauchi, kuter, fhahn, merge_guards_bot, hiraditya, bollu, rkruppe, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71739	2020-07-14 01:05:58 +02:00
Ten Tzen	66f1dcd872	[Windows SEH] Fix the frame-ptr of a nested-filter within a _finally This change fixed a SEH bug (exposed by test58 & test61 in MSVC test xcpt4u.c); when an Except-filter is located inside a finally, the frame-pointer generated today via intrinsic @llvm.eh.recoverfp is the frame-pointer of the immediate parent _finally, not the frame-ptr of outermost host function. The fix is to retrieve the Establisher's frame-pointer that was previously saved in parent's frame. The prolog of a filter inside a _finally should be like code below: %0 = call i8* @llvm.eh.recoverfp(i8* bitcast (@"?fin$0@0@main@@"), i8* %frame_pointer) %1 = call i8* @llvm.localrecover(i8* bitcast (@"?fin$0@0@main@@"), i8* %0, i32 0) %2 = bitcast i8* %1 to i8** %3 = load i8*, i8** %2, align 8 Differential Revision: https://reviews.llvm.org/D77982	2020-07-12 01:37:56 -07:00
Alexandre Ganea	b71499ac9e	Revert "Re-land [CodeView] Add full repro to LF_BUILDINFO record" This reverts commit `add59ecb34` and `41d2813a5f`.	2020-07-10 19:46:16 -04:00
Alexandre Ganea	41d2813a5f	[PDB] Attempt fix for debug-info-codeview-buildinfo.c test This is a bit of a shot in the dark, as it doesn't occur on my Windows 10 machines, nor on x64 Linux Ubuntu 18.04. This patch tries to fix the following kind of error: - http://lab.llvm.org:8011/builders/clang-ppc64le-linux/builds/31511/steps/cmake%20stage%201/logs/stdio - http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/25150/steps/ninja%20check%201/logs/FAIL%3A%20Clang%3A%3Adebug-info-codeview-buildinfo.c - http://lab.llvm.org:8011/builders/fuchsia-x86_64-linux/builds/7947/steps/check/logs/stdio	2020-07-10 18:52:52 -04:00
Aaron Ballman	006c49d890	Change behavior with zero-sized static array extents Clang previously diagnosed this code by default: void f(int a[static 0]); saying that "static has no effect on zero-length arrays", which was accurate. However, static array extents require that the caller of the function pass a nonnull pointer to an array of at least that number of elements, but it can pass more (see C17 6.7.6.3p6). Given that we allow zero-sized arrays as a GNU extension and that it's valid to pass more elements than specified by the static array extent, we now support zero-sized static array extents with the usual semantics because it can be useful in cases like: void my_bzero(char p[static 0], int n); my_bzero(&c+1, 0); //ok my_bzero(t+k,n-k); //ok, pattern from actual code	2020-07-10 15:58:11 -04:00
Alexandre Ganea	add59ecb34	Re-land [CodeView] Add full repro to LF_BUILDINFO record This patch adds some missing information to the LF_BUILDINFO which allows for rebuilding an .OBJ without any external dependency but the .OBJ itself (other than the compiler executable). Some tools need this information to reproduce a build without any knowledge of the build system. The LF_BUILDINFO therefore stores a full path to the compiler, the PWD (which is the CWD at program startup), a relative or absolute path to the TU, and the full CC1 command line. The command line needs to be freestanding (not depend on any environment variable). In the same way, MSVC doesn't store the provided command-line, but an expanded version (somehow their equivalent of CC1) which is also freestanding. For more information see PR36198 and D43002. Differential Revision: https://reviews.llvm.org/D80833	2020-07-10 13:59:28 -04:00
Kevin P. Neal	523a8513f8	[FPEnv][Clang][Driver] Disable constrained floating point on targets lacking support. Use the new -fexperimental-strict-floating-point flag in more cases to fix the arm and aarch64 bots. Differential Revision: https://reviews.llvm.org/D80952	2020-07-10 10:34:15 -04:00
Kevin P. Neal	d4ce862f2a	Reland "[FPEnv][Clang][Driver] Disable constrained floating point on targets lacking support." We currently have strict floating point/constrained floating point enabled for all targets. Constrained SDAG nodes get converted to the regular ones before reaching the target layer. In theory this should be fine. However, the changes are exposed to users through multiple clang options already in use in the field, and the changes are _completely_ _untested_ on almost all of our targets. Bugs have already been found, like "https://bugs.llvm.org/show_bug.cgi?id=45274". This patch disables constrained floating point options in clang everywhere except X86 and SystemZ. A warning will be printed when this happens. Use the new -fexperimental-strict-floating-point flag to force allowing strict floating point on hosts that aren't already marked as supporting it (X86 and SystemZ). Differential Revision: https://reviews.llvm.org/D80952	2020-07-10 08:49:45 -04:00
Ulrich Weigand	4c5a93bd58	[ABI] Handle C++20 [[no_unique_address]] attribute Many platform ABIs have special support for passing aggregates that either just contain a single member of floating-point type, or else a homogeneous set of members of the same floating-point type. When making this determination, any extra "empty" members of the aggregate type will typically be ignored. However, in C++ (at least in all prior versions), no data member would actually count as empty, even if its type is an empty record -- it would still be considered to take up at least one byte of space, and therefore make those ABI special cases not apply. This is now changing in C++20, which introduced the [[no_unique_address]] attribute. Members of empty record type, if they also carry this attribute, now do not take up any space in the type, and therefore the ABI special cases for single-element or homogeneous aggregates should apply. The C++ Itanium ABI has been updated accordingly, and GCC 10 has added support for this new case. This patch now adds support to LLVM. This is cross-platform; it affects all platforms that use the single-element or homogeneous aggregate ABI special case and implement this using any of the following common subroutines in lib/CodeGen/TargetInfo.cpp: isEmptyField isEmptyRecord isSingleElementStruct isHomogeneousAggregate	2020-07-10 14:01:05 +02:00
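The layout effect that the `[[no_unique_address]]` commit above relies on can be observed from plain C++. This is only an illustrative sketch: the type names `Empty`, `Plain`, and `Tagged` and the two helper functions are hypothetical, and the exact sizes depend on the platform ABI (some ABIs may ignore the attribute), so the code reports sizes instead of assuming a particular layout.

```cpp
#include <cstddef>

// Hypothetical types, used only for illustration.
struct Empty {};

// Without the attribute, the empty member still occupies at least one byte,
// so the aggregate is larger than a bare float.
struct Plain {
    Empty e;
    float f;
};

// With the C++20 attribute, the empty member is allowed to occupy no storage,
// which is what lets an ABI treat the aggregate like a single float.
struct Tagged {
    [[no_unique_address]] Empty e;
    float f;
};

// Helpers that expose the sizes chosen by the compiler in use.
std::size_t plainSize() { return sizeof(Plain); }
std::size_t taggedSize() { return sizeof(Tagged); }
```

On an ABI that honours the attribute, `taggedSize()` would typically equal `sizeof(float)`, while `plainSize()` is always larger.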
Anatoly Trosinenko	67422e4294	[MSP430] Align the _Complex ABI with current msp430-gcc Assembler output is checked against msp430-gcc 9.2.0.50 from TI. Reviewed By: asl Differential Revision: https://reviews.llvm.org/D82646	2020-07-09 18:28:48 +03:00
Craig Topper	9b1e95329a	[InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms As noted here https://lists.llvm.org/pipermail/llvm-dev/2016-October/106182.html and by alive2, this transform isn't valid. If X is poison this potentially propagates poison when it shouldn't. This same transform still exists in DAGCombiner. Differential Revision: https://reviews.llvm.org/D83360	2020-07-08 12:53:05 -07:00
Craig Topper	82206e7fb4	[X86] Enabled a bunch of 64-bit Interlocked* functions intrinsics on 32-bit Windows to match recent MSVC This enables _InterlockedAnd64/_InterlockedOr64/_InterlockedXor64/_InterlockedDecrement64/_InterlockedIncrement64/_InterlockedExchange64/_InterlockedExchangeAdd64/_InterlockedExchangeSub64 on 32-bit Windows The backend already knows how to expand these to a loop using cmpxchg8b on 32-bit targets. Fixes PR46595 Differential Revision: https://reviews.llvm.org/D83254	2020-07-08 10:39:56 -07:00
Ulrich Weigand	80a1b95b8e	[SystemZ ABI] Allow class types in GetSingleElementType The SystemZ ABI specifies that aggregate types with just a single member of floating-point type shall be passed as if they were just a scalar of that type. This applies to both struct and class types (but not unions). However, the current ABI support code in clang only checks this case for struct types, which means that for class types, generated code does not adhere to the platform ABI. Fixed by accepting both struct and class types in the SystemZABIInfo::GetSingleElementType routine.	2020-07-07 19:56:19 +02:00
Jennifer Yu	6cf0dac1ca	Correctly generate invert xor value for Binary Atomics of int size > 64 When using __sync_nand_and_fetch with __int128, a problem is found that the wrong value for the 'invert' value gets emitted to the xor in case where the int size is greater than 64 bits. This is because llvm::ConstantInt::get zero-extends values wider than 64 bits, so instead of the -1 that we require, it ends up producing 18446744073709551615. This patch replaces the call to llvm::ConstantInt::get with the call to llvm::Constant::getAllOnesValue which works for all integer types. Reviewers: jfp, erichkeane, rjmccall, hfinkel Differential Revision: https://reviews.llvm.org/D82832	2020-07-07 10:20:14 -07:00
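The arithmetic behind the commit above can be sketched without LLVM. The following is a simplified model, not the Clang code: `U128` is a hypothetical two-word stand-in for a 128-bit integer, and the function names are invented for illustration. It shows that a nand computed as `(a & b) ^ mask` only inverts the low 64 bits when the mask is a zero-extended 64-bit -1.

```cpp
#include <cstdint>

// Hypothetical two-word stand-in for a 128-bit integer.
struct U128 {
    std::uint64_t lo;
    std::uint64_t hi;
};

// Models a mask built from a 64-bit -1 that is then zero-extended:
// the upper word stays 0, i.e. the value is 18446744073709551615.
U128 zeroExtendedMinusOne() { return {~std::uint64_t{0}, 0}; }

// Models a true all-ones value of the full 128-bit width.
U128 allOnes() { return {~std::uint64_t{0}, ~std::uint64_t{0}}; }

// nand expressed as an AND followed by an XOR with the inversion mask.
U128 nandWithMask(U128 a, U128 b, U128 mask) {
    return {(a.lo & b.lo) ^ mask.lo, (a.hi & b.hi) ^ mask.hi};
}
```

With the zero-extended mask the upper word of the result is just `a.hi & b.hi`, not its complement, which corresponds to the wrong value described in the commit message.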
David Sherwood	9a1a7d888b	[SVE] Add more warnings checks to clang and LLVM SVE tests There are now more SVE tests in LLVM and Clang that do not emit warnings related to invalid use of EVT::getVectorNumElements() and VectorType::getNumElements(). For these tests I have added additional checks that there are no warnings in order to prevent any future regressions. Differential Revision: https://reviews.llvm.org/D82943	2020-07-07 09:33:20 +01:00
Xiang1 Zhang	939d8309db	[X86-64] Support Intel AMX Intrinsic INTEL ADVANCED MATRIX EXTENSIONS (AMX). AMX is a new programming paradigm: it has a set of 2-dimensional registers (TILES) representing sub-arrays from a larger 2-dimensional memory image, and it operates on TILES. These intrinsics take direct TMM register numbers as their parameters. Spec can be found in Chapter 3 here https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D83111	2020-07-07 10:13:40 +08:00
Biplob Mishra	0c6b6e28e7	[PowerPC] Implement Vector Splat Immediate Builtins in Clang Implements builtins for the following prototypes: vector signed int vec_splati (const signed int); vector float vec_splati (const float); vector double vec_splatid (const float); vector signed int vec_splati_ins (vector signed int, const unsigned int, const signed int); vector unsigned int vec_splati_ins (vector unsigned int, const unsigned int, const unsigned int); vector float vec_splati_ins (vector float, const unsigned int, const float); Differential Revision: https://reviews.llvm.org/D82520	2020-07-06 20:29:33 -05:00
Wouter van Oortmerssen	16d83c395a	[WebAssembly] Added 64-bit memory.grow/size/copy/fill This covers both the existing memory functions as well as the new bulk memory proposal. Added new test files since changes were also required in the inputs. Also removes unused init/drop intrinsics rather than trying to make them work for 64-bit. Differential Revision: https://reviews.llvm.org/D82821	2020-07-06 12:49:50 -07:00
Kevin P. Neal	916e2ca997	Revert "[FPEnv][Clang][Driver] Disable constrained floating point on targets lacking support." My mistake, I had a blocking reviewer. This reverts commit `39d2ae0afb`. This reverts commit `bfdafa32a0`. This reverts commit `2b35511350`. Differential Revision: https://reviews.llvm.org/D80952	2020-07-06 14:57:45 -04:00
Kevin P. Neal	2b35511350	[FPEnv][Clang][Driver] Failing tests are now expected failures only on PowerPC Mark these tests as only failing on PowerPC. Avoids unexpected passes on other bots. Fingers crossed. Differential Revision: https://reviews.llvm.org/D80952	2020-07-06 14:44:06 -04:00
Kevin P. Neal	bfdafa32a0	[FPEnv][Clang][Driver] Failing tests are now expected failures. These are now expected failures on PowerPC. They can be reenabled when PowerPC is ready. Differential Revision: https://reviews.llvm.org/D80952	2020-07-06 14:20:49 -04:00
Kevin P. Neal	39d2ae0afb	[FPEnv][Clang][Driver] Disable constrained floating point on targets lacking support. We currently have strict floating point/constrained floating point enabled for all targets. Constrained SDAG nodes get converted to the regular ones before reaching the target layer. In theory this should be fine. However, the changes are exposed to users through multiple clang options already in use in the field, and the changes are _completely_ _untested_ on almost all of our targets. Bugs have already been found, like "https://bugs.llvm.org/show_bug.cgi?id=45274". This patch disables constrained floating point options in clang everywhere except X86 and SystemZ. A warning will be printed when this happens. Differential Revision: https://reviews.llvm.org/D80952	2020-07-06 13:32:49 -04:00
Kazushi (Jam) Marukawa	df3bda047d	[VE] Correct stack alignment Summary: Change stack alignment from 64 bits to 128 bits to follow ABI correctly. And add a regression test for datalayout. Reviewers: simoll, k-ishizaka Reviewed By: simoll Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #llvm, #ve, #clang Differential Revision: https://reviews.llvm.org/D83173	2020-07-06 17:25:29 +09:00
Fangrui Song	b0b5162fc2	[Driver] Pass -gno-column-info instead of -dwarf-column-info Making -g[no-]column-info opt out reduces the length of a typical CC1 command line. Additionally, in a non-debug compile, we won't see -dwarf-column-info.	2020-07-05 11:50:38 -07:00
Kai Luo	68e07da3e5	[clang][PowerPC] Enable -fstack-clash-protection option for ppc64 Differential Revision: https://reviews.llvm.org/D81355	2020-07-05 03:43:56 +00:00
Roman Lebedev	7ea46aee36	Revert "[AssumeBundles] Use operand bundles to encode alignment assumptions" Assume bundle can have more than one entry with the same name, but at least AlignmentFromAssumptionsPass::extractAlignmentInfo() uses getOperandBundle("align"), which internally assumes that it isn't the case, and happily crashes otherwise. Minimal reduced reproducer: run `opt -alignment-from-assumptions` on target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" %0 = type { i64, %1, i8, i64, %2, i32, %3, i8 } %1 = type opaque %2 = type { i8, i8, i16 } %3 = type { i32, i32, i32, i32 } ; Function Attrs: nounwind define i32 @f(%0* noalias nocapture readonly %arg, %0* noalias %arg1) local_unnamed_addr #0 { bb: call void @llvm.assume(i1 true) [ "align"(%0* %arg, i64 8), "align"(%0* %arg1, i64 8) ] ret i32 0 } ; Function Attrs: nounwind willreturn declare void @llvm.assume(i1) #1 attributes #0 = { nounwind "reciprocal-estimates"="none" } attributes #1 = { nounwind willreturn } This is what we'd have with -mllvm -enable-knowledge-retention This reverts commit `c95ffadb24`.	2020-07-04 23:49:23 +03:00
Roman Lebedev	7fed3cfadb	[clang] Fix two tests that are affected by llvm opt change	2020-07-04 18:26:22 +03:00
Biplob Mishra	0939e04e41	[PowerPC] Implement Vector Insert Builtins in LLVM/Clang Implements vec_insertl() and vec_inserth(). Differential Revision: https://reviews.llvm.org/D82365	2020-07-03 15:30:41 -05:00
Biplob Mishra	ca464639a1	[PowerPC] Implement Vector Blend Builtins in LLVM/Clang Implements vec_blendv() Differential Revision: https://reviews.llvm.org/D82774	2020-07-02 16:52:52 -05:00
Biplob Mishra	286073484f	[PowerPC]Implement Vector Permute Extended Builtin Implements vector permute builtin: vec_permx() Differential Revision: https://reviews.llvm.org/D82869	2020-07-02 14:53:18 -05:00
Sander de Smalen	f255656a97	[SVE] ACLE: Fix builtins for svdup_lane_bf16 and svcvtnt_bf16_f32_x bfloat16 variants of svdup_lane were missing, and svcvtnt_bf16_x was implemented incorrectly (it takes an operand for the inactive lanes) Reviewers: fpetrogalli, efriedma Reviewed By: fpetrogalli Tags: #clang Differential Revision: https://reviews.llvm.org/D82908	2020-07-02 09:57:34 +01:00
Biplob Mishra	88874f0746	[PowerPC]Implement Vector Shift Double Bit Immediate Builtins Implement Vector Shift Double Bit Immediate Builtins in LLVM/Clang. * vec_sldb (); * vec_srdb (); Differential Revision: https://reviews.llvm.org/D82440	2020-07-01 20:34:53 -05:00
Erich Keane	19c35526d9	Limit x86 test to require target to fix buildbot (from `2831a317b`) The modification of the features apparently requires the backend to be instantiated, so make sure this is required to fix the ARM build bots.	2020-07-01 07:35:39 -07:00
Erich Keane	2831a317b6	Implement AVX ABI Warning/error The x86-64 "avx" feature changes how >128 bit vector types are passed, instead of being passed in separate 128 bit registers, they can be passed in 256 bit registers. "avx512f" does the same thing, except it switches from 256 bit registers to 512 bit registers. The result of both of these is an ABI incompatibility between functions compiled with and without these features. This patch implements a warning/error pair upon an attempt to call a function that would run afoul of this. First, if a function is called that would have its ABI changed, we issue a warning. Second, if said call is made in a situation where the caller and callee are known to have different calling conventions (such as the case of 'target'), we instead issue an error. Differential Revision: https://reviews.llvm.org/D82562	2020-07-01 07:14:31 -07:00
Hans Wennborg	a8e582c830	[ThinLTO] Always parse module level inline asm with At&t dialect (PR46503) clang-cl passes -x86-asm-syntax=intel to the cc1 invocation so that assembly listings produced by the /FA flag are printed in Intel dialect. That flag however should not affect the parsing of inline assembly in the program. (See r322652) When compiling normally, AsmPrinter::emitInlineAsm is used for assembling and defaults to At&t dialect. However, when compiling for ThinLTO, the code which parses module level inline asm to find symbols for the symbol table was failing to set the dialect. This patch fixes that. (See the bug for more details.) Differential revision: https://reviews.llvm.org/D82862	2020-07-01 09:43:45 +02:00
Francesco Petrogalli	d54e4dded7	[sve][acle] Enable feature macros for SVE ACLE extensions. Summary: The following feature macros have been added: __ARM_FEATURE_SVE_BF16 __ARM_FEATURE_SVE_MATMUL_INT8 __ARM_FEATURE_SVE_MATMUL_FP32 __ARM_FEATURE_SVE_MATMUL_FP64 The driver has been updated to enable them according to the value of the target feature passed at command line. The SVE ACLE tests using the macros have been modified to work with the target feature instead of passing the macro at command line. Reviewers: sdesmalen, efriedma, c-rhodes, kmclaughlin, SjoerdMeijer, rengolin Subscribers: tschuett, kristof.beyls, rkruppe, psnobl, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D82623	2020-06-30 18:33:03 +00:00
David Sherwood	c02332a693	[CodeGen] Fix warning in getNode for EXTRACT_SUBVECTOR Fix a warning in getNode() when extracting a subvector from a concat vector. We can simply replace the call to getVectorNumElements with getVectorMinNumElements as this follows the defined behaviour for EXTRACT_SUBVECTOR. Differential Revision: https://reviews.llvm.org/D82746	2020-06-30 08:11:41 +01:00
Cullen Rhodes	e73c3bb06b	[AArch64][SVE] Add bfloat16 to outstanding tuple vector intrinsics Summary: * svget2/3/4 * svset2/3/4 * svcreate2/3/4 * svundef/2/3/4 Reviewers: sdesmalen, kmclaughlin, fpetrogalli, efriedma Reviewed By: fpetrogalli Differential Revision: https://reviews.llvm.org/D82665	2020-06-29 17:00:58 +00:00
Cullen Rhodes	1ef75f53e9	[AArch64][SVE] clang: Add missing svbfloat16_t tests Summary: Patch adds tests for mangling of svbfloat16_t and several other type related tests. Reviewers: sdesmalen, kmclaughlin, fpetrogalli, efriedma Reviewed By: sdesmalen, fpetrogalli Differential Revision: https://reviews.llvm.org/D82668	2020-06-29 16:48:53 +00:00
Francesco Petrogalli	67e4330fac	[sve][acle] Implement some of the C intrinsics for brain float. Summary: The following intrinsics have been extended to support brain float types: svbfloat16_t svclasta[_bf16](svbool_t pg, svbfloat16_t fallback, svbfloat16_t data) bfloat16_t svclasta[_n_bf16](svbool_t pg, bfloat16_t fallback, svbfloat16_t data) bfloat16_t svlasta[_bf16](svbool_t pg, svbfloat16_t op) svbfloat16_t svclastb[_bf16](svbool_t pg, svbfloat16_t fallback, svbfloat16_t data) bfloat16_t svclastb[_n_bf16](svbool_t pg, bfloat16_t fallback, svbfloat16_t data) bfloat16_t svlastb[_bf16](svbool_t pg, svbfloat16_t op) svbfloat16_t svdup[_n]_bf16(bfloat16_t op) svbfloat16_t svdup[_n]_bf16_m(svbfloat16_t inactive, svbool_t pg, bfloat16_t op) svbfloat16_t svdup[_n]_bf16_x(svbool_t pg, bfloat16_t op) svbfloat16_t svdup[_n]_bf16_z(svbool_t pg, bfloat16_t op) svbfloat16_t svdupq[_n]_bf16(bfloat16_t x0, bfloat16_t x1, bfloat16_t x2, bfloat16_t x3, bfloat16_t x4, bfloat16_t x5, bfloat16_t x6, bfloat16_t x7) svbfloat16_t svdupq_lane[_bf16](svbfloat16_t data, uint64_t index) svbfloat16_t svinsr[_n_bf16](svbfloat16_t op1, bfloat16_t op2) Reviewers: sdesmalen, kmclaughlin, c-rhodes, ctetreau, efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D82345	2020-06-29 16:09:08 +00:00
Cullen Rhodes	d5fc592b7c	[AArch64][SVE] Add bfloat16 support to svext intrinsic Reviewers: sdesmalen, kmclaughlin, efriedma, david-arm, fpetrogalli Reviewed By: sdesmalen, fpetrogalli Differential Revision: https://reviews.llvm.org/D82391	2020-06-29 11:08:38 +00:00
Melanie Blower	f4aaed3bf1	Reland D81869 "Modify FPFeatures to use delta not absolute settings" This reverts commit `defd43a5b3`. with correction to solve msan report To solve https://bugs.llvm.org/show_bug.cgi?id=46166 where the floating point settings in PCH files aren't compatible, rewrite FPFeatures to use a delta in the settings rather than absolute settings. With this patch, these floating point options can be benign. Reviewers: rjmccall Differential Revision: https://reviews.llvm.org/D81869	2020-06-27 01:34:57 -07:00
Craig Topper	d298acde82	[X86] Don't disable xsave when avx is disabled. Implicitly enable xsave when avx is enabled and xsave wasn't explicitly disabled CPUs with avx always have xsave, but some CPUs without avx also have xsave. So we shouldn't disable xsave just because avx is disabled. This would prevent xsave from being enabled with -march=native on CPUs with xsave and not avx. But we also don't want -mavx -mno-avx to leave xsave enabled. So only enable xsave if avx is enabled after processing all features. I thought about just not turning xsave on with avx at all, but there might be someone out there depending on it.	2020-06-26 16:45:44 -07:00
Francesco Petrogalli	ddbdff3acc	[sve][acle] Recommit https://reviews.llvm.org/D82501 The original patch was reverted in `ff5ccf258e` as it was missing the C tests, which were accidentally left out. This patch is a NFC of https://reviews.llvm.org/D82501, together with the SVE ACLE tests for the C intrinsics of svreinterpret for brain float types.	2020-06-26 20:45:29 +00:00
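The rule described in the xsave commit above (process every feature flag first, then implicitly enable xsave only if avx ended up enabled and xsave was not explicitly disabled) can be sketched as follows. This is a hypothetical model rather than Clang's implementation: the `Features` struct, the `resolveFeatures` function, and the `+name` / `-name` flag spelling are assumptions made for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical model of the final feature state; not Clang's data structure.
struct Features {
    bool avx = false;
    bool xsave = false;
};

// Flags are spelled "+name" or "-name" and processed left to right,
// mirroring the order of options on a command line.
Features resolveFeatures(const std::vector<std::string>& flags) {
    Features f;
    bool xsaveExplicitlyDisabled = false;
    for (const std::string& flag : flags) {
        if (flag.size() < 2) continue;  // ignore malformed entries
        const bool enable = flag[0] == '+';
        const std::string name = flag.substr(1);
        if (name == "avx") {
            f.avx = enable;  // disabling avx leaves xsave untouched
        } else if (name == "xsave") {
            f.xsave = enable;
            xsaveExplicitlyDisabled = !enable;
        }
    }
    // Implicit enable happens only after all flags have been processed.
    if (f.avx && !xsaveExplicitlyDisabled) f.xsave = true;
    return f;
}
```

In this model, `{"+avx", "-avx"}` leaves xsave off, while `{"+xsave", "-avx"}` keeps it on, matching the two cases the commit message calls out.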
Melanie Blower	defd43a5b3	Revert "Revert "Revert "Modify FPFeatures to use delta not absolute settings""" This reverts commit `9518763d71`. Memory sanitizer fails in CGFPOptionsRAII::CGFPOptionsRAII dtor	2020-06-26 08:47:04 -07:00
Kevin P. Neal	e91c4b2af2	[NFC] Eliminate an unneeded -vv used in test development.	2020-06-26 11:09:16 -04:00
Melanie Blower	9518763d71	Revert "Revert "Modify FPFeatures to use delta not absolute settings"" This reverts commit `b55d723ed6`. Reapply Modify FPFeatures to use delta not absolute settings To solve https://bugs.llvm.org/show_bug.cgi?id=46166 where the floating point settings in PCH files aren't compatible, rewrite FPFeatures to use a delta in the settings rather than absolute settings. With this patch, these floating point options can be benign. Reviewers: rjmccall Differential Revision: https://reviews.llvm.org/D81869	2020-06-26 08:00:08 -07:00
Melanie Blower	b55d723ed6	Revert "Modify FPFeatures to use delta not absolute settings" This reverts commit `3a748cbf86`. I'm reverting this commit because I forgot to format the commit message properly. Sorry for the thrash.	2020-06-26 07:52:57 -07:00
Melanie Blower	3a748cbf86	Modify FPFeatures to use delta not absolute settings	2020-06-26 07:41:09 -07:00
Anatoly Trosinenko	cb56fa2196	[MSP430] Update register names When writing a unit test on replacing standard epilogue sequences with `BR __mspabi_func_epilog_<N>`, by manually asm-clobbering `rN` - `r10` for N = 4..10, everything worked well except for seeming inability to clobber r4. The problem was that MSP430 code generator of LLVM used an obsolete name FP for that register. Things were worse because when `llc` read an unknown register name, it silently ignored it. That is, I cannot use `fp` register name from the C code because Clang does not accept it (exactly like GCC). But the accepted name `r4` is not recognised by `llc` (it can be used in listings passed to `llvm-mc` and even `fp` is replaced with `r4` by `llvm-mc`). So I can specify any of `fp` or `r4` for the string literal of `asm(...)` but nothing in the clobber list. This patch replaces `MSP430::FP` with `MSP430::R4` in the backend code (even [MSP430 EABI](http://www.ti.com/lit/an/slaa534/slaa534.pdf) doesn't mention FP as a register name). The R0 - R3 registers, on the other hand, are left as is in the backend code (after all, they have some special meaning on the ISA level). It is just ensured clang is renaming them as expected by the downstream tools. There is probably not much sense in marking them clobbered but rename them //just in case// for use in potentially different contexts. Differential Revision: https://reviews.llvm.org/D82184	2020-06-26 15:32:07 +03:00
Cullen Rhodes	d45cf9105b	[AArch64][SVE2] Guard while intrinsics on scalar bfloat feature macro Summary: `svwhilerw_bf16` and `svwhilewr_bf16` intrinsics use the scalar `bfloat16_t` type which is predicated on `__ARM_FEATURE_BF16_SCALAR_ARITHMETIC`. This patch changes the feature guard from `__ARM_FEATURE_SVE_BF16` to the scalar bfloat feature macro. The verify tests for `+bf16` are also removed in this patch. The purpose of these checks was to match the SVE2 ACLE tests that look for an implicit declaration warning if the feature isn't set. They worked when the intrinsics were guarded on `__ARM_FEATURE_SVE_BF16` as the `bfloat16_t` was guarded on a different macro, but with both the type and intrinsic guarded on the same macro an earlier error is triggered in the ACLE regarding the type and we don't get a warning as we do for SVE2. Reviewers: sdesmalen, fpetrogalli, kmclaughlin, rengolin, efriedma Reviewed By: sdesmalen, fpetrogalli Differential Revision: https://reviews.llvm.org/D82578	2020-06-26 10:25:42 +00:00
Kerry McLaughlin	edcfef8fee	[AArch64][SVE] Add bfloat16 support to store intrinsics Summary: Bfloat16 support added for the following intrinsics: - ST1 - STNT1 Reviewers: sdesmalen, c-rhodes, fpetrogalli, efriedma, stuij, david-arm Reviewed By: fpetrogalli Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, danielkiss, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D82448	2020-06-26 11:05:56 +01:00
David Sherwood	ae47d158a0	Remove "rm -f" workaround in acle_sve_adda.c	2020-06-26 08:16:40 +01:00
Amy Kwan	e0c02dc980	[PowerPC][Power10] Implement centrifuge, vector gather every nth bit, vector evaluate Builtins in LLVM/Clang This patch implements builtins for the following prototypes: unsigned long long __builtin_cfuged (unsigned long long, unsigned long long); vector unsigned long long vec_cfuge (vector unsigned long long, vector unsigned long long); unsigned long long vec_gnb (vector unsigned __int128, const unsigned int); vector unsigned char vec_ternarylogic (vector unsigned char, vector unsigned char, vector unsigned char, const unsigned int); vector unsigned short vec_ternarylogic (vector unsigned short, vector unsigned short, vector unsigned short, const unsigned int); vector unsigned int vec_ternarylogic (vector unsigned int, vector unsigned int, vector unsigned int, const unsigned int); vector unsigned long long vec_ternarylogic (vector unsigned long long, vector unsigned long long, vector unsigned long long, const unsigned int); vector unsigned __int128 vec_ternarylogic (vector unsigned __int128, vector unsigned __int128, vector unsigned __int128, const unsigned int); Differential Revision: https://reviews.llvm.org/D80970	2020-06-25 21:34:41 -05:00
Francesco Petrogalli	7200fa38a9	[sve][acle] Add some C intrinsics for brain float types. Summary: The following intrinsics have been added: svuint16_t svcnt[_bf16]_m(svuint16_t inactive, svbool_t pg, svbfloat16_t op) svuint16_t svcnt[_bf16]_x(svbool_t pg, svbfloat16_t op) svuint16_t svcnt[_bf16]_z(svbool_t pg, svbfloat16_t op) svbfloat16_t svtbl[_bf16](svbfloat16_t data, svuint16_t indices) svbfloat16_t svtbl2[_bf16](svbfloat16x2_t data, svuint16_t indices) svbfloat16_t svtbx[_bf16](svbfloat16_t fallback, svbfloat16_t data, svuint16_t indices) Reviewers: c-rhodes, kmclaughlin, efriedma, sdesmalen, ctetreau Subscribers: tschuett, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D82429	2020-06-25 16:31:01 +00:00
Andrew Wock	15edd7aaa7	[FPEnv] PowerPC-specific builtin constrained FP enablement This change enables PowerPC compiler builtins to generate constrained floating point operations when clang is instructed to do so. A couple of possibly unexpected backend divergences between constrained floating point and regular behavior are highlighted under the test tag FIXME-CHECK. This may be something for those on the PPC backend to look at. Patch by: Drew Wock <drew.wock@sas.com> Differential Revision: https://reviews.llvm.org/D82020	2020-06-25 11:42:58 -04:00
Tyker	c95ffadb24	[AssumeBundles] Use operand bundles to encode alignment assumptions Summary: NOTE: There is a mailing list discussion on this: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html Complementary to the assumption outliner prototype in D71692, this patch shows how we could simplify the code emitted for an alignment assumption. The generated code is smaller, less fragile, and it makes it easier to recognize the additional use as an "assumption use". As mentioned in D71692 and on the mailing list, we could adopt this scheme, and similar schemes for other patterns, without adopting the assumption outlining. Reviewers: hfinkel, xbolva00, lebedev.ri, nikic, rjmccall, spatel, jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: yamauchi, kuter, fhahn, merge_guards_bot, hiraditya, bollu, rkruppe, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71739	2020-06-25 12:59:44 +02:00
Tyker	8938a6c9ed	[NFC] update test to make diff of the following commit clear	2020-06-25 12:59:44 +02:00
Sander de Smalen	fabe67728e	[AArch64][SVE] Enable __ARM_FEATURE_SVE macros. This patch enables the following macros when their corresponding target attributes are set: __ARM_FEATURE_SVE (+sve) __ARM_FEATURE_SVE2 (+sve2) __ARM_FEATURE_SVE2_AES (+sve2-aes) __ARM_FEATURE_SVE2_BITPERM (+sve2-bitperm) __ARM_FEATURE_SVE2_SHA3 (+sve2-sha3) __ARM_FEATURE_SVE2_SM4 (+sve2-sm4) This implies that the base SVE and SVE2 ACLE (00bet2) are now feature complete, meaning that all intrinsics are implemented in LLVM and Clang. Disclaimer: To implement the ACLE we have had to fix up many parts of LLVM to make it support scalable vectors. We have also used many target-specific intrinsics to reduce reliance on parts of LLVM where we know scalable vectors may not yet be handled properly (e.g. some transformation might drop the 'scalable' flag on a vector type). While we've done a best effort with the limited testing that is available to us, we're still working to improve the stability of the implementation. Additionally, Clang may print warnings that code may have miscompiled. We find this often to be a false alarm where the wrong interfaces have been used in LLVM and where resulting code is not actually incorrect. However, this warrants a bug report and investigation. If you find any bugs or issues, please raise them on bugs.llvm.org and let us know! Reviewers: rengolin, efriedma, david-arm, SjoerdMeijer Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D81725	2020-06-25 08:14:19 +01:00
Amy Kwan	d82f26cc4b	[PowerPC][Power10] Implement Count Leading/Trailing Zeroes Builtins under bit Mask in LLVM/Clang This patch implements builtins for the following prototypes: unsigned long long __builtin_cntlzdm (unsigned long long, unsigned long long) unsigned long long __builtin_cnttzdm (unsigned long long, unsigned long long) vector unsigned long long vec_cntlzm (vector unsigned long long, vector unsigned long long) vector unsigned long long vec_cnttzm (vector unsigned long long, vector unsigned long long) Differential Revision: https://reviews.llvm.org/D80941	2020-06-24 16:03:45 -05:00
Nigel Perks	dc3f8913d2	Fix crash on XCore on unused inline in EmitTargetMetadata EmitTargetMetadata passed to emitTargetMD a null pointer as returned from GetGlobalValue, for an unused inline function which has been removed from the module at that point. A FIXME in CodeGenModule.cpp commented that the calling code in EmitTargetMetadata should be moved into the one target that needs it (XCore). A review comment agreed. So the calling loop has been moved into the XCore subclass. The check for null is done in that loop. Differential Revision: https://reviews.llvm.org/D77068	2020-06-24 12:48:17 -07:00
Cullen Rhodes	05e10ee0ae	[AArch64][SVE2] Add bfloat16 support to whilerw/whilewr intrinsics Reviewed By: fpetrogalli Differential Revision: https://reviews.llvm.org/D82399	2020-06-24 10:06:31 +00:00
Cullen Rhodes	fd2c4b8999	[AArch64][SVE] Add bfloat16 support to svlen intrinsic Reviewed By: fpetrogalli Differential Revision: https://reviews.llvm.org/D82186	2020-06-24 10:05:51 +00:00
Cullen Rhodes	26502ad609	[AArch64][SVE] Add bfloat16 support to perm and select intrinsics Summary: Added for following intrinsics: * zip1, zip2, zip1q, zip2q * trn1, trn2, trn1q, trn2q * uzp1, uzp2, uzp1q, uzp2q * splice * rev * sel Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D82182	2020-06-24 10:04:51 +00:00
Kerry McLaughlin	3d6cab271c	[AArch64][SVE] Add bfloat16 support to load intrinsics Summary: Bfloat16 support added for the following intrinsics: - LD1 - LD1RQ - LDNT1 - LDNF1 - LDFF1 Reviewers: sdesmalen, c-rhodes, efriedma, stuij, fpetrogalli, david-arm Reviewed By: fpetrogalli Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, danielkiss, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D82298	2020-06-24 10:32:19 +01:00
Kazushi (Jam) Marukawa	96d4ccf00c	[VE] Clang toolchain for VE Summary: This patch enables compilation of C code for the VE target with Clang. Differential Revision: https://reviews.llvm.org/D79411	2020-06-24 10:12:09 +02:00
Zhi Zhuang	47fb21d2ea	fix test failure for clang/test/CodeGen/builtin-expect-with-probability.cpp Fix test case added by D79830 Rewrite the test case so that it does the same thing as builtin-expect.c (tests the generated LLVM intrinsic instead of the branch weights). It currently passes with the "-disable-llvm-passes" option. Differential Revision: https://reviews.llvm.org/D82403	2020-06-23 13:34:35 -07:00
Erich Keane	79ceda2e39	Fix test added by D79830 This clang test unfortunately depends on the actions of the optimizer, which some of the buildbots hit. This patch makes it so it cannot ignore the return value of 'f', so it won't do away with the implementation.	2020-06-23 08:39:25 -07:00
Mikhail Maltsev	3f353a2e5a	[BFloat] Add convert/copy intrinsic support This patch is part of a series implementing the Bfloat16 extension of the Armv8.6-a architecture, as detailed here: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a Specifically it adds intrinsic support in clang and llvm for Arm and AArch64. The bfloat type, and its properties are specified in the Arm Architecture Reference Manual: https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile The following people contributed to this patch: - Alexandros Lamprineas - Luke Cheeseman - Mikhail Maltsev - Momchil Velikov - Luke Geeson Differential Revision: https://reviews.llvm.org/D80928	2020-06-23 14:27:05 +00:00
Mikhail Maltsev	9c579540ff	[ARM] BFloat MatMul Intrinsics&CodeGen Summary: This patch adds support for BFloat Matrix Multiplication Intrinsics and Code Generation from __bf16 to AArch32. This includes IR intrinsics. Tests are provided as needed. This patch is part of a series implementing the Bfloat16 extension of the Armv8.6-a architecture, as detailed here: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a The bfloat type and its properties are specified in the Arm Architecture Reference Manual: https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile The following people contributed to this patch: - Luke Geeson - Momchil Velikov - Mikhail Maltsev - Luke Cheeseman - Simon Tatham Reviewers: stuij, t.p.northover, SjoerdMeijer, sdesmalen, fpetrogalli, LukeGeeson, simon_tatham, dmgreen, MarkMurrayARM Reviewed By: MarkMurrayARM Subscribers: MarkMurrayARM, danielkiss, kristof.beyls, hiraditya, cfe-commits, llvm-commits, chill, miyuki Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D81740	2020-06-23 12:06:37 +00:00
Sander de Smalen	121e585ec8	[AArch64][SVE] ACLE: Add bfloat16 to struct load/stores. This patch contains: - Support in LLVM CodeGen for bfloat16 types for ld2/3/4 and st2/3/4. - New bfloat16 ACLE builtins for svld(2\|3\|4)[_vnum] and svst(2\|3\|4)[_vnum] Reviewers: stuij, efriedma, c-rhodes, fpetrogalli Reviewed By: fpetrogalli Tags: #clang, #lldb, #llvm Differential Revision: https://reviews.llvm.org/D82187	2020-06-23 12:12:35 +01:00
Cullen Rhodes	c8fae2bb4a	[AArch64][SVE] Guard svbfloat16_t with feature macro in ACLE Summary: svbfloat16_t should only be defined if the __ARM_FEATURE_SVE_BF16 feature macro is enabled, similar to the scalar bfloat16_t type. Also, arm_bf16.h should be included in arm_sve.h when __ARM_FEATURE_BF16_SCALAR_ARITHMETIC is defined. Patch also contains a fix for ld1ro intrinsic which should be guarded on __ARM_FEATURE_SVE_BF16 rather than __ARM_FEATURE_BF16_SCALAR_ARITHMETIC, and a fix for bfmmla test which was missing __ARM_FEATURE_BF16_SCALAR_ARITHMETIC and -target-feature +bf16 in the RUN line. Reviewed By: fpetrogalli Differential Revision: https://reviews.llvm.org/D82178	2020-06-23 10:24:10 +00:00
Amy Kwan	19df9e2959	[PowerPC][Power10] Implement VSX PCV Generate Operations in LLVM/Clang This patch implements builtins for the following prototypes for the VSX Permute Control Vector Generate with Mask Instructions: vector unsigned char vec_genpcvm (vector unsigned char, const int); vector unsigned short vec_genpcvm (vector unsigned short, const int); vector unsigned int vec_genpcvm (vector unsigned int, const int); vector unsigned long long vec_genpcvm (vector unsigned long long, const int); Differential Revision: https://reviews.llvm.org/D81774	2020-06-22 21:09:34 -05:00
Mikhail Maltsev	3a4feb1d53	[ARM][BFloat] Implement bf16 get/set_lane without casts to i16 vectors Currently, in order to extract an element from a bf16 vector, we cast the vector to an i16 vector, perform the extraction, and cast the result to bfloat. This behavior was copied from the old fp16 implementation. The goal of this patch is to achieve optimal code generation for lane copying intrinsics in a subsequent patch (LLVM fails to fold certain combinations of bitcast, insertelement, extractelement and shufflevector instructions leading to the generation of suboptimal code). Differential Revision: https://reviews.llvm.org/D82206	2020-06-22 17:35:43 +00:00
Zhi Zhuang	37fb860301	Add support of __builtin_expect_with_probability Add a new builtin-function __builtin_expect_with_probability and intrinsic llvm.expect.with.probability. The interface is __builtin_expect_with_probability(long expr, long expected, double probability). It is mainly the same as __builtin_expect besides one more argument indicating the probability of expression equal to expected value. The probability should be a constant floating-point expression and be in range [0.0, 1.0] inclusive. It is similar to builtin-expect-with-probability function in GCC built-in functions. Differential Revision: https://reviews.llvm.org/D79830	2020-06-22 10:21:28 -07:00
Francesco Petrogalli	ef597eda8e	[sve][acle] Add SVE BFloat16 extensions. Summary: List of intrinsics: svfloat32_t svbfdot[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3) svfloat32_t svbfdot[_n_f32](svfloat32_t op1, svbfloat16_t op2, bfloat16_t op3) svfloat32_t svbfdot_lane[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3, uint64_t imm_index) svfloat32_t svbfmmla[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3) svfloat32_t svbfmlalb[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3) svfloat32_t svbfmlalb[_n_f32](svfloat32_t op1, svbfloat16_t op2, bfloat16_t op3) svfloat32_t svbfmlalb_lane[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3, uint64_t imm_index) svfloat32_t svbfmlalt[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3) svfloat32_t svbfmlalt[_n_f32](svfloat32_t op1, svbfloat16_t op2, bfloat16_t op3) svfloat32_t svbfmlalt_lane[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3, uint64_t imm_index) svbfloat16_t svcvt_bf16[_f32]_m(svbfloat16_t inactive, svbool_t pg, svfloat32_t op) svbfloat16_t svcvt_bf16[_f32]_x(svbool_t pg, svfloat32_t op) svbfloat16_t svcvt_bf16[_f32]_z(svbool_t pg, svfloat32_t op) svbfloat16_t svcvtnt_bf16[_f32]_m(svbfloat16_t even, svbool_t pg, svfloat32_t op) svbfloat16_t svcvtnt_bf16[_f32]_x(svbfloat16_t even, svbool_t pg, svfloat32_t op) For reference, see section 7.2 of "Arm C Language Extensions for SVE - Version 00bet4" Reviewers: sdesmalen, ctetreau, efriedma, david-arm, rengolin Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D82141	2020-06-22 16:53:02 +00:00
Anton Korobeynikov	6cb80fbe40	Revert "[MSP430] Update register names" This reverts commit `8f6620f663`.	2020-06-22 13:37:22 +03:00
Anatoly Trosinenko	8f6620f663	[MSP430] Update register names When writing a unit test on replacing standard epilogue sequences with `BR __mspabi_func_epilog_<N>`, by manually asm-clobbering `rN` - `r10` for N = 4..10, everything worked well except for seeming inability to clobber r4. The problem was that MSP430 code generator of LLVM used an obsolete name FP for that register. Things were worse because when `llc` read an unknown register name, it silently ignored it. Differential Revision: https://reviews.llvm.org/D82184	2020-06-22 13:24:03 +03:00
Craig Topper	1d4c87335d	[X86] Assign a feature priority to 'tigerlake' so it won't assert when used with function multiversioning Also test cooperlake since it was also just added to function multiversioning when it was enabled for __builtin_cpu_is.	2020-06-21 13:24:58 -07:00
Craig Topper	42c176c328	[X86] Add 'cooperlake' and 'tigerlake' to __builtin_cpu_is. Cooperlake can be detected by compiler-rt now, but not libgcc yet. Tigerlake can't be detected by either. Both names are accepted by gcc. Hopefully the detection code will be in place soon.	2020-06-21 13:03:18 -07:00
Amy Kwan	cc95635b1b	[PowerPC][Power10] Implement Vector Clear Left/Rightmost Bytes Builtins in LLVM/Clang This patch implements builtins for the following prototypes: ``` vector signed char vec_clrl (vector signed char a, unsigned int n); vector unsigned char vec_clrl (vector unsigned char a, unsigned int n); vector signed char vec_clrr (vector signed char a, unsigned int n); vector signed char vec_clrr (vector unsigned char a, unsigned int n); ``` Differential Revision: https://reviews.llvm.org/D81707	2020-06-20 18:29:16 -05:00
Xiangling Liao	22337bfe7d	[AIX][Frontend] Static init implementation for AIX considering no priority 1. Provides no priority support && disables three priority related attributes: init_priority, ctor attr, dtor attr; 2. '-qunique' in XL compiler equivalent behavior of emitting sinit and sterm functions name using getUniqueModuleId() util function in LLVM (currently no support for InternalLinkage and WeakODRLinkage symbols); 3. Add testcases to emit IR sample with __sinit80000000, __dtor, and __sterm80000000; 4. Temporarily side-steps the need to implement the functionality of llvm.global_ctors and llvm.global_dtors arrays. The uses of that functionality in this patch (with respect to the name of the functions involved) are not representative of how the functionality will be used once implemented. Differential Revision: https://reviews.llvm.org/D74166	2020-06-19 08:27:07 -04:00
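
The xsave/avx rule described in commit `d298acde82` above can be sketched as a small feature-resolution function. This is a hypothetical model, not Clang's actual implementation: the function name `xsave_enabled`, the `+feature`/`-feature` string encoding, and the last-one-wins handling of repeated xsave flags are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of the rule in the commit message (not Clang code):
 *  - "-avx" never turns xsave off by itself;
 *  - xsave is implied by avx only if avx is still enabled after all
 *    features have been processed and xsave was not explicitly disabled.
 * Assumption: for repeated "+xsave"/"-xsave" flags, the last one wins.
 */
bool xsave_enabled(const char *const *features, size_t n)
{
    bool avx = false;
    bool xsave = false;
    bool xsave_explicitly_off = false;

    assert(features != NULL || n == 0);

    for (size_t i = 0; i < n; ++i) {
        const char *f = features[i];
        if (strcmp(f, "+avx") == 0) {
            avx = true;                 /* implication is decided later */
        } else if (strcmp(f, "-avx") == 0) {
            avx = false;                /* leaves xsave untouched */
        } else if (strcmp(f, "+xsave") == 0) {
            xsave = true;
            xsave_explicitly_off = false;
        } else if (strcmp(f, "-xsave") == 0) {
            xsave = false;
            xsave_explicitly_off = true;
        }
        /* any other feature string is ignored by this sketch */
    }

    /* Only after every feature has been seen: avx implies xsave. */
    if (avx && !xsave_explicitly_off)
        xsave = true;

    return xsave;
}
```

Under these assumptions, the list `+avx`, `-avx` yields false, while `+xsave`, `-avx` yields true, which matches the two cases the commit message calls out.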

6983 Commits