llvm-project

Commit Graph

Author	SHA1	Message	Date
James Molloy	043d613791	Revert "[ARM] Promote small global constants to constant pools" This reverts commit r281314. Speculatively revert as it's possible this caused linker errors: http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/19656 llvm-svn: 281327	2016-09-13 12:45:51 +00:00
Pablo Barrio	bb6984d401	[ARM] Add ".code 32" to functions in the ARM instruction set Before, only Thumb functions were marked as ".code 16". These ".code x" directives are effective until the next directive of its kind is encountered. Therefore, in code with interleaved ARM and Thumb functions, it was possible to declare a function as ARM and end up with a Thumb function after assembly. A test has been added. An existing test has also been fixed to take this change into account. Reviewers: aschwaighofer, t.p.northover, jmolloy, rengolin Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D24337 llvm-svn: 281324	2016-09-13 12:18:15 +00:00
James Molloy	d246c598de	[Thumb] Teach ISel how to lower compares of AND bitmasks efficiently For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)). 1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS. 2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS. 3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS). 4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask. 1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win. llvm-svn: 281323	2016-09-13 12:12:32 +00:00
James Molloy	3e4bc66134	[ARM] Promote small global constants to constant pools If a constant is unamed_addr and is only used within one function, we can save on the code size and runtime cost of an indirection by changing the global's storage to inside the constant pool. For example, instead of: ldr r0, .CPI0 bl printf bx lr .CPI0: &format_string format_string: .asciz "hello, world!\n" We can emit: adr r0, .CPI0 bl printf bx lr .CPI0: .asciz "hello, world!\n" This can cause significant code size savings when many small strings are used in one function (4 bytes per string). llvm-svn: 281314	2016-09-13 10:28:11 +00:00
Eric Liu	882dc72b38	[WebAssembly] Trying to fix broken tests in CodeGen/WebAssembly caused by r281285. Reviewers: bkramer, ddcc, dschuff, sunfish Subscribers: jfb, llvm-commits, dschuff Differential Revision: https://reviews.llvm.org/D24497 llvm-svn: 281312	2016-09-13 10:05:44 +00:00
Ayman Musa	0c2da88f82	Remove MVT:i1 xor instruction before SELECT. (Performance improvement). Differential Revision: https://reviews.llvm.org/D23764 llvm-svn: 281308	2016-09-13 09:12:45 +00:00
Sjoerd Meijer	520a18df9c	Revert of r281304 as it is causing build bot failures in hexagon hwloop regression tests. These tests pass locally; will be investigating where these differences come from. llvm-svn: 281306	2016-09-13 08:51:59 +00:00
Sjoerd Meijer	05453991fe	This adds a new field isAdd to MCInstrDesc. The ARM and Hexagon instruction descriptions now tag add instructions, and the Hexagon backend is using this to identify loop induction statements. Patch by Sam Parker and Sjoerd Meijer. Differential Revision: https://reviews.llvm.org/D23601 llvm-svn: 281304	2016-09-13 08:08:06 +00:00
Elena Demikhovsky	b906df9fe5	AVX-512: Fix for PR28175 - Scalar code optimization. Optimized (truncate (assertzext x) to i1) and anyext i1 to i8/16/32. Optimization of this patterns is a one more step towards i1 optimization on AVX-512. Differential Revision: https://reviews.llvm.org/D24456 llvm-svn: 281302	2016-09-13 07:57:00 +00:00
Diana Picus	4b97288184	[AArch64] Support stackmap/patchpoint in getInstSizeInBytes We currently return 4 for stackmaps and patchpoints, which is very optimistic and can in rare cases cause the branch relaxation pass to fail to relax certain branches. This patch causes getInstSizeInBytes to return a pessimistic estimate of the size as the number of bytes requested in the stackmap/patchpoint. In the future, we could provide a more accurate estimate by sharing some of the logic in AArch64::LowerSTACKMAP/PATCHPOINT. Fixes part of https://llvm.org/bugs/show_bug.cgi?id=28750 Differential Revision: https://reviews.llvm.org/D24073 llvm-svn: 281301	2016-09-13 07:45:17 +00:00
Craig Topper	4619c9e6a8	[X86] Remove masked shufpd/shufps intrinsics and autoupgrade to native vector shuffles. They were removed from clang previously but accidentally left in the backend. llvm-svn: 281300	2016-09-13 07:40:53 +00:00
Peter Collingbourne	d4135bbc30	DebugInfo: New metadata representation for global variables. This patch reverses the edge from DIGlobalVariable to GlobalVariable. This will allow us to more easily preserve debug info metadata when manipulating global variables. Fixes PR30362. A program for upgrading test cases is attached to that bug. Differential Revision: http://reviews.llvm.org/D20147 llvm-svn: 281284	2016-09-13 01:12:59 +00:00
Hans Wennborg	8a42d4b9cc	X86: Conditional tail calls should not have isBarrier = 1 That confuses e.g. machine basic block placement, which then doesn't realize that control can fall through a block that ends with a conditional tail call. Instead, isBranch=1 should be set. Also, mark EFLAGS as used by these instructions. llvm-svn: 281281	2016-09-13 00:21:32 +00:00
Nico Weber	7c31d0ebc0	Revert r281215, it caused PR30358. llvm-svn: 281263	2016-09-12 21:40:50 +00:00
Dehao Chen	9bbb941acf	Lower consecutive select instructions correctly. Summary: If consecutive select instructions are lowered separately in CGP, it will introduce redundant condition check and branches that cannot be removed by later optimization phases. This patch lowers all consecutive select instructions at the same to to avoid inefficent code as demonstrated in https://llvm.org/bugs/show_bug.cgi?id=29095 Reviewers: davidxl Subscribers: vsk, llvm-commits Differential Revision: https://reviews.llvm.org/D24147 llvm-svn: 281252	2016-09-12 20:23:28 +00:00
Elena Demikhovsky	57c602aad3	AVX-512: Added a test for -O0 mode. NFC. llvm-svn: 281246	2016-09-12 19:03:21 +00:00
Elena Demikhovsky	5730bf0429	AVX-512: Simplified masked_gather_scatter test. NFC. llvm-svn: 281244	2016-09-12 18:50:47 +00:00
Nicolai Haehnle	e58e0e3fe3	AMDGPU: Do not clobber SCC in SIWholeQuadMode Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D22198 llvm-svn: 281230	2016-09-12 16:25:20 +00:00
James Molloy	3d06ff22b7	Revert "[ARM] Promote small global constants to constant pools" This reverts commit r281213. It made a bot go bang: http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-full/builds/14625 llvm-svn: 281228	2016-09-12 16:18:23 +00:00
Ahmed Bougacha	b678219aa6	[BranchFolding] Unique added live-ins after hoisting code. We're not supposed to have duplicate live-ins. llvm-svn: 281224	2016-09-12 16:05:31 +00:00
Ahmed Bougacha	45bfa8772f	[X86] Copy imp-uses when folding tailcall into conditional branch. r280832 added 32-bit support for emitting conditional tail-calls, but dropped imp-used parameter registers. This went unnoticed until r281113, which added 64-bit support, as this is only exposed with parameter passing via registers. Don't drop the imp-used parameters. llvm-svn: 281223	2016-09-12 16:05:27 +00:00
Igor Breger	a3e36da6f2	add select i1 test, reproduser pr30249. llvm-svn: 281218	2016-09-12 15:27:02 +00:00
James Molloy	1e1b56bd48	[Thumb] Teach ISel how to lower compares of AND bitmasks efficiently For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)). 1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS. 2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS. 3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS). 4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask. 1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win. llvm-svn: 281215	2016-09-12 14:30:48 +00:00
James Molloy	8f82d45ff4	[ARM] Promote small global constants to constant pools If a constant is unamed_addr and is only used within one function, we can save on the code size and runtime cost of an indirection by changing the global's storage to inside the constant pool. For example, instead of: ldr r0, .CPI0 bl printf bx lr .CPI0: &format_string format_string: .asciz "hello, world!\n" We can emit: adr r0, .CPI0 bl printf bx lr .CPI0: .asciz "hello, world!\n" This can cause significant code size savings when many small strings are used in one function (4 bytes per string). llvm-svn: 281213	2016-09-12 13:42:16 +00:00
Pablo Barrio	0bebc38abb	Fix the Thumb test for vfloat intrinsics Summary: This test was not testing the intrinsics. A function like this: define %v4f32 @test_v4f32.floor(%v4f32 %a){ ... %1 = call %v4f32 @llvm.floor.v4f32(%v4f32 %a) ... } is transformed into the following assembly: _test_v4f32.floor: @ @test_v4f32.floor ... bl _floorf ... In each function tested, there are two CHECK: one that checked for the label and another one for the intrinsic that should be used inside the function (in our case, "floor"). However, although the first CHECK was matching the label, the second was not matching the intrinsic, but the second "floor" in the same line as the label. This is fixed by making the first CHECK match the entire line. Reviewers: jmolloy, rengolin Subscribers: rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D24398 llvm-svn: 281211	2016-09-12 13:14:14 +00:00
Tim Northover	032548fc5e	GlobalISel: support translation of global addresses. llvm-svn: 281207	2016-09-12 12:10:41 +00:00
Tim Northover	a7653b3919	GlobalISel: translate GEP instructions. Unlike SDag, we use a separate G_GEP instruction (much simplified, only taking a single byte offset) to preserve the pointer type information through selection. llvm-svn: 281205	2016-09-12 11:20:22 +00:00
Tim Northover	d28d3cc079	GlobalISel: disambiguate types when printing MIR Some generic instructions have multiple types. While in theory these always be discovered by inspecting the single definition of each generic vreg, in practice those definitions won't always be local and traipsing through a big function to find them will not be fun. So this changes MIRPrinter to print out the type of uses as well as defs, if they're known to be different or not known to be the same. On the parsing side, we're a little more flexible: provided each register is given a type in at least one place it's mentioned (and all types are consistent) we accept the MIR. This doesn't introduce ambiguity but makes writing tests manually a bit less painful. llvm-svn: 281204	2016-09-12 11:20:10 +00:00
Elena Demikhovsky	de1b494555	AVX-512: Added a test case that should be optimized in the future. NFC. llvm-svn: 281196	2016-09-12 06:26:03 +00:00
NAKAMURA Takumi	cf6aaa9e1a	llvm/test/CodeGen/AMDGPU/infinite-loop-evergreen.ll REQUIRES +Asserts. This might not crash with -Asserts. I saw it caused infinite loop in the codegen. llvm-svn: 281190	2016-09-12 04:27:28 +00:00
James Molloy	3e1ce05752	[AArch64] Fixup test after r281160 How I missed this locally is beyond me. I suspect llc didn't recompile. This is just changing the CHECK line back to what it was before r280364. llvm-svn: 281161	2016-09-11 08:24:04 +00:00
Craig Topper	3639cda748	[AVX-512] Add test cases to demonstrate opportunities for commuting vpternlog. Commuting will be added in a future commit. llvm-svn: 281157	2016-09-11 05:33:43 +00:00
Craig Topper	fb4564cf21	[AVX-512] Add VPTERNLOG to load folding tables. llvm-svn: 281156	2016-09-11 05:33:40 +00:00
Craig Topper	2c86705755	[X86] Side effecting asm in AVX512 integer stack folding test should return 2 x i64 not 8 x i64. llvm-svn: 281155	2016-09-11 05:33:38 +00:00
Justin Lebar	6d6b11a4a6	[NVPTX] Use ldg for explicitly invariant loads. Summary: With this change (plus some changes to prevent !invariant from being clobbered within llvm), clang will be able to model the __ldg CUDA builtin as an invariant load, rather than as a target-specific llvm intrinsic. This will let the optimizer play with these loads -- specifically, we should be able to vectorize them in the load-store vectorizer. Reviewers: tra Subscribers: jholewinski, hfinkel, llvm-commits, chandlerc Differential Revision: https://reviews.llvm.org/D23477 llvm-svn: 281152	2016-09-11 01:39:04 +00:00
Arnold Schwaighofer	6c57f4f56d	It should also be legal to pass a swifterror parameter to a call as a swifterror argument. rdar://28233388 llvm-svn: 281147	2016-09-10 19:42:53 +00:00
Arnold Schwaighofer	112ff66505	We also need to pass swifterror in R12 under swiftcc not only under ccc rdar://28190687 llvm-svn: 281138	2016-09-10 14:16:55 +00:00
Matt Arsenault	124384f08d	AMDGPU: Fix immediate folding logic when shrinking instructions If the literal is being folded into src0, it doesn't matter if it's an SGPR because it's being replaced with the literal. Also fixes initially selecting 32-bit versions of some instructions which also confused commuting. llvm-svn: 281117	2016-09-09 23:32:53 +00:00
Hans Wennborg	6ecf619be9	X86: Fold tail calls into conditional branches also for 64-bit (PR26302) This extends the optimization in r280832 to also work for 64-bit. The only quirk is that we can't do this for 64-bit Windows (yet). Differential Revision: https://reviews.llvm.org/D24423 llvm-svn: 281113	2016-09-09 22:37:27 +00:00
Matt Arsenault	0efdd06b22	AMDGPU: Run LoadStoreVectorizer pass by default llvm-svn: 281112	2016-09-09 22:29:28 +00:00
Simon Pilgrim	a3d1e03cd7	[X86][XOP] Fix VPERMIL2PD mask creation on 32-bit targets Use getConstVector helper to correctly create v2i64/v4i64 constants on 32-bit targets llvm-svn: 281105	2016-09-09 21:47:21 +00:00
Michael Kuperstein	b1153848ff	[X86] Regenerate test. NFC. llvm-svn: 281099	2016-09-09 21:36:17 +00:00
Arnold Schwaighofer	7d7b4b4014	Create phi nodes for swifterror values at the end of the phi instructions list ISel makes assumption about the order of phi nodes. rdar://28190150 llvm-svn: 281095	2016-09-09 21:18:47 +00:00
Justin Lebar	b5e884976b	[NVPTX] Implement llvm.fabs.f32, llvm.max.f32, etc. Summary: Previously these only worked via NVPTX-specific intrinsics. This change will allow us to convert these target-specific intrinsics into the general LLVM versions, allowing existing LLVM passes to reason about their behavior. It also gets us some minor codegen improvements as-is, from situations where we canonicalize code into one of these llvm intrinsics. Reviewers: majnemer Subscribers: llvm-commits, jholewinski, tra Differential Revision: https://reviews.llvm.org/D24300 llvm-svn: 281092	2016-09-09 21:07:26 +00:00
Wei Ding	06f8d39424	AMDGPU : Fix mqsad_u32_u8 instruction incorrect data type. Differential Revision: http://reviews.llvm.org/D23700 llvm-svn: 281081	2016-09-09 19:31:51 +00:00
Tom Stellard	b2869eb6e9	AMDGPU/SI: Make sure llvm.amdgcn.implicitarg.ptr() is 8-byte aligned for HSA Reviewers: arsenm Subscribers: arsenm, wdng, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D24405 llvm-svn: 281080	2016-09-09 19:28:00 +00:00
Chris Dewhurst	c59f7c745b	[Sparc][LEON] Removed the parts of the errata fixes implemented using inline assembly as this is not the desired behaviour for end-users. Small change to a unit test to implement this without requiring the inline assembly. llvm-svn: 281047	2016-09-09 14:16:51 +00:00
James Molloy	57d9dfa9ac	[ARM] ADD with a negative offset can become SUB for free So model that directly in TTI::getIntImmCost(). llvm-svn: 281044	2016-09-09 13:35:36 +00:00
James Molloy	1454e90f86	[ARM] icmp %x, -C can be lowered to a simple ADDS or CMN Tell TargetTransformInfo about this so ConstantHoisting is informed. llvm-svn: 281043	2016-09-09 13:35:28 +00:00
Simon Pilgrim	153b408433	[SelectionDAG] Ensure DAG::getZeroExtendInReg is called with a scalar type Fixes issue with rL280927 identified by Mikael Holmén llvm-svn: 281042	2016-09-09 13:31:52 +00:00

1 2 3 4 5 ...

17365 Commits