llvm-project

Author	SHA1	Message	Date
Asaf Badouh	d4a0d9a78c	[X86][AVX512] Fix DAG and add intrinsics for fixupimm. Covers all widths and types (pd/ps/sd/ss) of the fixupimm instruction and intrinsics. Differential Revision: http://reviews.llvm.org/D16313 llvm-svn: 258124	2016-01-19 14:21:39 +00:00
Simon Pilgrim	3e5fb61978	[X86][AVX2] Broadcast subvectors AVX2 can only broadcast from the zero'th element of a vector, but if the broadcastable element is the zero'th element of a 128-bit subvector it's advantageous to extract the subvector, broadcast from that and avoid the loading of shuffle mask data that would be needed for VPERMPS/VPERMD. The only exception is when the source type is 4f64 or 4i64, which can use the immediate shuffle VPERMPD/VPERMQ directly. Differential Revision: http://reviews.llvm.org/D16050 llvm-svn: 258081	2016-01-18 20:59:04 +00:00
Igor Breger	239fda676c	AVX512: Masked store intrinsic implementation. Implemented intrinsics for the following store instructions: VMOVDQU8/16/32/64, VMOVDQA32/64, VMOVAPS/PD, VMOVUPS/PD. Differential Revision: http://reviews.llvm.org/D16271 llvm-svn: 258047	2016-01-18 13:52:57 +00:00
Elena Demikhovsky	9242ea87d6	Added Cannonlake processor to X86 Target Differential Revision: http://reviews.llvm.org/D16289 llvm-svn: 258046	2016-01-18 13:00:31 +00:00
Igor Breger	dd6522c653	AVX512: Change v8i1 bitconvert GR8 pattern, remove unnecessary movzbl instruction. Code example, previous implementation: movzbl %dil, %eax; kmovw %eax, %k0. New code: kmovw %edi, %k0. Differential Revision: http://reviews.llvm.org/D16287 llvm-svn: 258045	2016-01-18 12:02:45 +00:00
Michael Zuckerman	97b6a6923e	[AVX512] Adding AVXVBMI feature flag. The feature flag is for the VPERMB, VPERMI2B, VPERMT2B and VPMULTISHIFTQB instructions. More about the instructions can be found in: https://software.intel.com/sites/default/files/managed/07/b7/319433-023.pdf Differential Revision: http://reviews.llvm.org/D16190 llvm-svn: 258012	2016-01-17 13:42:12 +00:00
Igor Breger	e1f273d900	AVX512: Use MemIntrinsicSDNode to implement load/store intrinsic. Differential Revision: http://reviews.llvm.org/D16184 llvm-svn: 258009	2016-01-17 12:10:24 +00:00
Michael Zuckerman	ac1b238b0a	[AVX512] Adding VPERMW/D/Q VPERMPS/D Intrinsics Differential Revision: http://reviews.llvm.org/D16189 llvm-svn: 258008	2016-01-17 11:33:29 +00:00
Michael Zuckerman	ede597c753	[AVX512] Adding VPERMQ VPERMPD Intrinsics Differential Revision: http://reviews.llvm.org/D16194 llvm-svn: 258006	2016-01-17 08:32:14 +00:00
Simon Pilgrim	20f31fa31a	[X86][AVX] Enable extraction of upper 128-bit subvectors for 'half undef' shuffle lowering Added support for the extraction of the upper 128-bit subvectors for lower/upper half undef shuffles if it would reduce the number of extractions/insertions or avoid loads of AVX2 permps/permd shuffle masks. Minor follow up to D15477. llvm-svn: 258000	2016-01-16 22:30:20 +00:00
Manman Ren	53a54c41d7	CXX_FAST_TLS calling convention: fix issue on x86-64. %RBP can't be handled explicitly. We generate the following code: pushq %rbp; movq %rsp, %rbp; ...; movq %rbx, (%rbp) ## 8-byte Spill, where %rbp will be overwritten by the spilled value. The fix is to let PEI handle %RBP. PR26136 llvm-svn: 257997	2016-01-16 16:39:46 +00:00
NAKAMURA Takumi	33ff1dda6a	[Cygwin] Use -femulated-tls by default since r257718 introduced the new pass. FIXME: Add more targets to use emutls into clang/test/Driver/emulated-tls.cpp. FIXME: Add cygwin tests into llvm/test/CodeGen/X86. Work in progress. llvm-svn: 257984	2016-01-16 03:44:52 +00:00
Kevin B. Smith	c831a08fbf	[X86]: Make param names in header and body match for isCalleePop. Differential Revision: http://reviews.llvm.org/D16246 llvm-svn: 257965	2016-01-16 00:08:36 +00:00
Manman Ren	4fe01bd8f9	CXX_FAST_TLS calling convention: fix issue on X86-64. When we have a single basic block, the explicit copy-back instructions should be inserted right before the terminator. Before this fix, they were wrongly placed at the beginning of the basic block. I will commit fixes to other platforms as well. PR26136 llvm-svn: 257925	2016-01-15 19:35:42 +00:00
Pete Cooper	835594e627	Delete MCRelocationInfo::createExprForRelocation. This method has no callers. Also remove X86ELFRelocationInfo.cpp and X86MachORelocationInfo.cpp which only existed to provide an implementation of that method. Ok'd by Rafael and Jim. llvm-svn: 257859	2016-01-15 02:24:12 +00:00
Rui Ueyama	da00f2fdf4	Update to use new name alignTo(). llvm-svn: 257804	2016-01-14 21:06:47 +00:00
Igor Breger	fc96331d88	AVX512: VMOVDQA32/64 (load) intrinsic implementation. Differential Revision: http://reviews.llvm.org/D16142 llvm-svn: 257749	2016-01-14 07:56:04 +00:00
David Majnemer	3463e696fb	[X86] Don't alter HasOpaqueSPAdjustment after we've relied on it We rely on HasOpaqueSPAdjustment not changing after we've calculated things based on it. Things like whether or not we can use 'rep;movs' to copy bytes around, that sort of thing. If it changes, invariants in the backend will quietly break. This situation arose when we had a call to memcpy and a COPY of the FLAGS register where we would attempt to reference local variables using %esi, a register that was clobbered by the 'rep;movs'. This fixes PR26124. llvm-svn: 257730	2016-01-14 01:20:03 +00:00
Rafael Espindola	8340f94df1	Convert a few assert failures into proper errors. Fixes PR25944. llvm-svn: 257697	2016-01-13 22:56:57 +00:00
Michael Zuckerman	6b35f460ac	Fixing warning by adding the X86ISD::VROTRI case. Differential Revision: http://reviews.llvm.org/D16052 llvm-svn: 257607	2016-01-13 15:48:42 +00:00
Michael Zuckerman	0e31b22487	[AVX512] Adding PMOVSXBD/W/Q, PMOVSXDQ and PMOVSXWD/Q Intrinsics. Differential Revision: http://reviews.llvm.org/D16111 llvm-svn: 257604	2016-01-13 14:59:19 +00:00
Michael Zuckerman	43cea85db9	[AVX512] Adding PMOVZXBD/W/Q, PMOVZXDQ and PMOVZXWD/Q Intrinsics Differential Revision: http://reviews.llvm.org/D16071 llvm-svn: 257601	2016-01-13 14:25:21 +00:00
Michael Zuckerman	298a680c80	[AVX512] Adding PRORQ, PRORD, PRORLVQ and PRORLVD Intrinsics Differential Revision: http://reviews.llvm.org/D16052 llvm-svn: 257594	2016-01-13 12:39:33 +00:00
Andrey Turetskiy	1ce2c9973f	LEA code size optimization pass (Part 2): Remove redundant LEA instructions. Make the x86 OptimizeLEAs pass remove an LEA instruction if there is another LEA (in the same basic block) which calculates an address differing only by a displacement. Works only for -Oz. Differential Revision: http://reviews.llvm.org/D13295 llvm-svn: 257589	2016-01-13 11:30:44 +00:00
Michael Zuckerman	2ddcbcf464	[AVX512] adding PROLQ and PROLD Intrinsics Differential Revision: http://reviews.llvm.org/D16048 llvm-svn: 257523	2016-01-12 21:19:17 +00:00
Andrey Turetskiy	fed110f646	Test commit access - tiny comment and code style fix. llvm-svn: 257472	2016-01-12 13:34:11 +00:00
Robert Lougher	6abd69a60b	The isel pattern that selects the memory-register form of VCVTPH2PS (64 to 128-bit) matches against the pattern fragment 'vzmovl_v2i64' (a zero-extended 64-bit load). However, a change in r248784 teaches the instruction combiner that only the lower 64 bits of the input to a 128-bit vcvtph2ps are used. This means the instruction combiner will ordinarily optimize away the upper 64-bit insertelement instruction in the zero-extension and so we no longer select the memory-register form. To fix this a new pattern has been added. Differential Revision: http://reviews.llvm.org/D16067 llvm-svn: 257470	2016-01-12 11:48:25 +00:00
Igor Breger	ea8e8e9f97	AVX512: VMOVAPS/PD and VMOVUPS/PD (load) intrinsic implementation. Differential Revision: http://reviews.llvm.org/D16042 llvm-svn: 257463	2016-01-12 10:02:32 +00:00
Manman Ren	ed967f3752	CXX_FAST_TLS calling convention: performance improvement for x86-64. This is the same change on x86-64 as r255821 on AArch64. rdar://9001553 llvm-svn: 257428	2016-01-12 01:08:46 +00:00
Alexey Bataev	28f0c5efec	[X86] Reduce complexity of the LEA optimization pass, by Andrey Turetskiy. In the OptimizeLEA pass, keep instructions' positions in the basic block saved and use them to calculate the distance between two instructions instead of std::distance. This reduces the complexity of the pass from O(n^3) to O(n^2) and thus the compile time. Differential Revision: http://reviews.llvm.org/D15692 llvm-svn: 257328	2016-01-11 11:52:29 +00:00
Craig Topper	9d2cab7742	[AVX-512] Remove another extra space from the Intel syntax asm strings. llvm-svn: 257304	2016-01-11 01:03:40 +00:00
Craig Topper	9feea57844	[AVX-512] Remove more superfluous spaces from asm strings. llvm-svn: 257301	2016-01-11 00:44:58 +00:00
Craig Topper	156622ad9d	[AVX-512] Remove unused Round and Itinerary from the maskable_cmp multiclasses. They weren't used and there were extra spaces in the asm string to prepare for the concatenations of the round string that wasn't ever used. llvm-svn: 257300	2016-01-11 00:44:56 +00:00
Craig Topper	bfe13ff6ca	[AVX-512] Make spacing between comma and {sae} operand consistent in asm strings. llvm-svn: 257299	2016-01-11 00:44:52 +00:00
Craig Topper	5be407ab27	[X86] Remove extra spaces from MPX instruction asm strings. llvm-svn: 257298	2016-01-11 00:44:46 +00:00
Elena Demikhovsky	542dfcf44c	Optimized instruction sequence for sitofp operation on X86-32. Optimized sitofp i64 %x to double. The current sequence: movl %ecx, 8(%esp); movl %edx, 12(%esp); fildll 8(%esp) is replaced with: movd %ecx, %xmm0; movd %edx, %xmm1; punpckldq %xmm1, %xmm0; movq %xmm0, 8(%esp). Differential Revision: http://reviews.llvm.org/D15946 llvm-svn: 257285	2016-01-10 09:41:22 +00:00
Michael Zuckerman	885f61c534	[AVX512] add PRORVQ and PRORVD Intrinsics Differential Revision: http://reviews.llvm.org/D15955 llvm-svn: 257283	2016-01-10 09:16:41 +00:00
Simon Pilgrim	c7bebcbfd8	[X86][AVX] Match broadcast loads through a bitcast AVX1 v8i32/v4i64 shuffles are bitcast to v8f32/v4f64; this patch peeks through any bitcast to check for a load node to allow broadcasts to occur. This is a re-commit of r257055 after r257264 fixed 32-bit broadcast loads of i64 scalars. llvm-svn: 257266	2016-01-09 20:59:39 +00:00
Simon Pilgrim	2e7a1849c9	[X86][AVX] Add support for i64 broadcast loads on 32-bit targets Added 32-bit AVX1/AVX2 broadcast tests. llvm-svn: 257264	2016-01-09 19:59:27 +00:00
Craig Topper	048e700828	[AVX-512] Remove superfluous spaces from some asm strings. llvm-svn: 257150	2016-01-08 06:09:20 +00:00
Craig Topper	04493fda81	[X86] Don't print the aliased version of CVTSD2SI64rm. This appears to be a mistake I made years ago. llvm-svn: 257149	2016-01-08 06:09:18 +00:00
Craig Topper	29510c0430	[X86] Use \t instead of space after mnemonics in a bunch of InstAliases for consistency. llvm-svn: 257148	2016-01-08 06:09:13 +00:00
Michael Zuckerman	3aca221b31	[AVX512] add PSLLW and PSLLV Intrinsic Differential Revision: http://reviews.llvm.org/D15889 llvm-svn: 257070	2016-01-07 16:02:51 +00:00
Nico Weber	4324b9b236	Revert r257055, it caused PR26064. llvm-svn: 257066	2016-01-07 15:01:46 +00:00
Michael Zuckerman	354152d590	[AVX512] add PSRAV Intrinsic Differential Revision: http://reviews.llvm.org/D15856 llvm-svn: 257063	2016-01-07 14:42:20 +00:00
Michael Zuckerman	a6df006b50	[AVX512] add PSHUFHW and PSHUFLW Intrinsic Differential Revision: http://reviews.llvm.org/D15925 llvm-svn: 257056	2016-01-07 12:35:43 +00:00
Simon Pilgrim	bcc11a059e	[X86][AVX] Match broadcast loads through a bitcast AVX1 v8i32/v4i64 shuffles are bitcast to v8f32/v4f64; this patch peeks through bitcasts to check for a load node to allow broadcasts to occur. Follow up to D15310 llvm-svn: 257055	2016-01-07 11:34:27 +00:00
Simon Pilgrim	83e44c66ae	[X86][SSE] Add INSERTPS as a target shuffle Follow up to D15378, added INSERTPS to the list of decodable target shuffles and enabled XFormVExtractWithShuffleIntoLoad to handle target shuffles with SentinelZero and tested this with INSERTPS. llvm-svn: 257046	2016-01-07 10:24:19 +00:00
Michael Zuckerman	4a1566827d	[AVX512] add PSHUFD Intrinsic Differential Revision: http://reviews.llvm.org/D15934 llvm-svn: 257044	2016-01-07 09:24:12 +00:00
Craig Topper	68cffb17a0	[X86] Remove superfluous mayLoad flag. The pattern already implies it. llvm-svn: 257035	2016-01-07 06:42:10 +00:00

12607 Commits