llvm-project

Commit Graph

Author	SHA1	Message	Date
Michael Zolotukhin	47eef7a3c9	[Tests] Slightly reduce test LoopUnroll/pr18861.ll. llvm-svn: 249172	2015-10-02 19:21:43 +00:00
Dan Gohman	72f1692a2c	[WebAssembly] Add a memory_size intrinsic. llvm-svn: 249171	2015-10-02 19:21:15 +00:00
Matt Arsenault	d092a068ba	AMDGPU/SI: Add verifier check for exec reads Make sure we aren't accidentally not setting these in the instruction definitions. llvm-svn: 249170	2015-10-02 18:58:37 +00:00
Matt Arsenault	3d3b4d0c2c	Add way to test for generic TargetOpcodes The alternative would be to add a bit to the target's InstrFlags but that seems like a waste of a bit. llvm-svn: 249169	2015-10-02 18:58:33 +00:00
Sanjoy Das	7d910f2b11	[SCEV] Try to prove predicates by splitting them Summary: This change teaches SCEV that to prove `A u< B` it is sufficient to prove each of these facts individually: - B >= 0 - A s< B - A >= 0 In practice, SCEV sometimes finds it easier to prove these facts individually than to prove `A u< B` as one atomic step. Reviewers: reames, atrick, nlewycky, hfinkel Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D13042 llvm-svn: 249168	2015-10-02 18:50:30 +00:00
Roman Divacky	4b5507a037	Actually switch the arch when we see .arch. PR21695 llvm-svn: 249165	2015-10-02 18:25:25 +00:00
Tim Northover	8d67b8e053	ARM: diagnose invalid local fixups on Thumb1 We previously stopped producing Thumb2 relaxations when they weren't supported, but only diagnosed the case where an actual relocation was produced. We should also tell people if local symbols aren't going to work rather than silently overflowing. llvm-svn: 249164	2015-10-02 18:07:18 +00:00
Tim Northover	956b008db6	ARM: correctly align constant pool value on Thumb1 targets. Since we're using tLDRpci to access it, the constant pool's address must be 0 (mod 4). llvm-svn: 249163	2015-10-02 18:07:13 +00:00
Hal Finkel	942e949f0d	[lit] Raise the default soft process limit when possible It is common to have a default soft process limit, at least on some families of Linux distributions, of 1024. This is normally more than enough, but if you have many cores, and you're running tests that create many threads, this can become a problem. My POWER7 development machine has 48 cores, and when running the lld regression tests, which often want to create up to 48 threads, I run into problems. lit, by default, will want to run 48 tests in parallel, and 48*48 > 1024, and so many tests fail like this: terminate called after throwing an instance of 'std::system_error' what(): Resource temporarily unavailable or lit fails like this when launching a test: OSError: [Errno 11] Resource temporarily unavailable lit can easily detect this situation and attempt to repair it before launching tests (by raising the soft process limit to something that will allow ncpus^2 threads to be created), and should do so to prevent spurious test failures. This is the follow-up to this thread: http://lists.llvm.org/pipermail/llvm-dev/2015-October/090942.html llvm-svn: 249161	2015-10-02 17:50:28 +00:00
Chad Rosier	1f385618c0	[ARM] Typo. NFC. llvm-svn: 249153	2015-10-02 16:42:59 +00:00
Andrea Di Biagio	77f62652c1	Reapply r249121: "[FastISel][x86] Teach how to select SSE2/AVX bitcasts between 128/256-bit vector types." This patch teaches FastISel the following two things: 1) On SSE2, no instructions are needed for bitcasts between 128-bit vector types; 2) On AVX, no instructions are needed for bitcasts between 256-bit vector types. Example: %1 = bitcast <4 x i32> %V to <2 x i64> Before (-fast-isel -fast-isel-abort=1): FastISel miss: %1 = bitcast <4 x i32> %V to <2 x i64> Now we don't fall back to SelectionDAG and we correctly fold that computation by propagating the register associated with %V. Originally reviewed here: http://reviews.llvm.org/D13347 llvm-svn: 249147	2015-10-02 16:08:05 +00:00
Andrea Di Biagio	45874e67a1	Revert: [FastISel][x86] Teach how to select SSE2/AVX bitcasts between 128/256-bit vector types. r249121 caused a Clang test failure (avx2-builtins.c). Revert r249121 while I keep investigating why that test failed. llvm-svn: 249124	2015-10-02 13:06:19 +00:00
Zoran Jovanovic	9ffdfa5986	[mips][microMIPS] Fix an issue with selecting sqrt instruction in LLVM backend Differential Revision: http://reviews.llvm.org/D13235 llvm-svn: 249123	2015-10-02 13:06:02 +00:00
Andrea Di Biagio	cb33456122	[FastISel][x86] Teach how to select SSE2/AVX bitcasts between 128/256-bit vector types. This patch teaches FastISel the following two things: 1) On SSE2, no instructions are needed for bitcasts between 128-bit vector types; 2) On AVX, no instructions are needed for bitcasts between 256-bit vector types. Example: %1 = bitcast <4 x i32> %V to <2 x i64> Before (-fast-isel -fast-isel-abort=1): FastISel miss: %1 = bitcast <4 x i32> %V to <2 x i64> Now we don't fall back to SelectionDAG and we correctly fold that computation by propagating the register associated with %V. Differential Revision: http://reviews.llvm.org/D13347 llvm-svn: 249121	2015-10-02 12:45:37 +00:00
Richard Smith	0ff249b906	DenseMap: we're trying to call the reserved global placement allocation function here; use "::new" to avoid accidentally picking up a class-specific operator new. llvm-svn: 249112	2015-10-02 00:46:33 +00:00
Adrian Prantl	42562c38f5	dsymutil: Also ignore the ByteSize when building the DeclContext cache for clang modules. Forward decls of ObjC interfaces don't have a bytesize. llvm-svn: 249110	2015-10-02 00:27:08 +00:00
Ivan Krasin	95e82d5b48	[LibFuzzer] test_single_input option to run a single test case. -test_single_input flag specifies a file name with test data. Review URL: http://reviews.llvm.org/D13359 Patch by Mike Aizatsky! llvm-svn: 249096	2015-10-01 23:23:06 +00:00
Bruno Cardoso Lopes	b491a2d641	[SimplifyLibCalls] Fix instruction misplacement in string/memory libcall optimization When trying to optimize fortified library functions, use the right location to insert new instructions in order to preserve correct def-use order. This fixes an issue where a misplaced instruction definition would end up after one of its uses following a RAUW, forming invalid IR. This behavior was introduced by r227250. Differential Revision: http://reviews.llvm.org/D13301 rdar://problem/22802369 llvm-svn: 249092	2015-10-01 22:43:53 +00:00
Matt Arsenault	b733f00510	AMDGPU: Fix unused variable warning in release build llvm-svn: 249091	2015-10-01 22:40:35 +00:00
Colin LeMahieu	665c9be489	[Hexagon] XFAILing test while diagnosing backend error. llvm-svn: 249088	2015-10-01 22:14:05 +00:00
Matt Arsenault	b87fc22915	AMDGPU: Move SIFixSGPRLiveRanges to be a regalloc pass Replace LiveInterval usage with LiveVariables. LiveIntervals computes far more information than is needed for this pass which just needs to find if an SGPR is live out of the defining block. LiveIntervals are not usually available that early, requiring computing them twice which is very expensive. The extra run of LiveIntervals/LiveVariables/SlotIndexes was costing in total about 5% of compile time. Continuing to use LiveIntervals is problematic. It seems there is an option (early-live-intervals) to run the analysis about where it should go to avoid recomputing LiveVariables, but it seems to be completely broken with subreg liveness enabled. There are also problems from trying to recompute LiveIntervals since this seems to undo LiveVariables and clearing kill flags, causing TwoAddressInstructions to make bad decisions. Insert the pass right after live variables and preserve it. The tricky case to worry about might be phis since LiveVariables doesn't count a register as live out if in the successor block it is only used in a phi, but I don't think this is a concern right now because SIFixSGPRCopies replaces SGPR phis. llvm-svn: 249087	2015-10-01 22:10:03 +00:00
Joerg Sonnenberger	c8d50d6347	Fix relocation used for GOT references in non-PIC mode. Fix relocations for "set" pseudo op in PIC mode. Differential Revision: http://reviews.llvm.org/D13173 llvm-svn: 249086	2015-10-01 22:08:20 +00:00
Davide Italiano	f070688ecf	[PATCH] D13360: [llvm-objdump] Teach -d about AArch64 mapping symbols AArch64 uses $d* and $x* to interleave between text and data. llvm-objdump didn't know about this so it ended up printing garbage. This patch is a first step towards a solution of the problem. Differential Revision: http://reviews.llvm.org/D13360 llvm-svn: 249083	2015-10-01 21:57:09 +00:00
Matt Arsenault	d2c7589f93	AMDGPU: Merge if and switch llvm-svn: 249082	2015-10-01 21:51:59 +00:00
Matt Arsenault	db7f0ef367	AMDGPU: Remove dead code There's no point in checking VReg_1 because all uses of it should already have been removed by SILowerI1Copies. llvm-svn: 249081	2015-10-01 21:51:57 +00:00
Matt Arsenault	d1d499aa56	AMDGPU: Make SIInsertWaits about a factor of 4 faster This was the slowest target custom pass and was spending 80% of the time in getMinimalPhysRegClass which was called for every register operand. Try to use the statically known register class when possible from the instruction's MCOperandInfo. There are a few pseudo instructions which are not well behaved with unknown register classes which still require the expensive physical register class search. There are a few other possibilities for making this even faster, such as not inspecting implicit operands. For now those are checked because it is technically possible to have a scalar load into exec or vcc which can be implicitly used. llvm-svn: 249079	2015-10-01 21:43:15 +00:00
Reid Kleckner	fc64fae6e3	[WinEH] Emit __C_specific_handler tables for the new IR We emit denormalized tables, where every range of invokes in the same state gets a complete list of EH action entries. This is significantly simpler than trying to infer the correct nested scoping structure from the MI. Fortunately, for SEH, the nesting structure is really just a size optimization. With this, some basic __try / __except examples work. llvm-svn: 249078	2015-10-01 21:38:24 +00:00
Colin LeMahieu	f92c175bdd	[Hexagon] XFAILing test while diagnosing backend error. llvm-svn: 249075	2015-10-01 21:19:03 +00:00
Tom Stellard	e9f8b24985	AMDGPU/SI: Remove assert from AMDGPUOpenCLImageTypeLowering pass Summary: Instead of asserting when the kernel metadata is different than we expect, we should just skip lowering that function. This fixes assertion failures with OpenCL argument metadata from older LLVM releases. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13356 llvm-svn: 249073	2015-10-01 21:16:05 +00:00
David Majnemer	4600c06434	[WinEH] Stop BranchFolding from merging across funclets BranchFolding would merge two funclets together, this is not OK. Disable this and strengthen the assertion in FuncletLayout. llvm-svn: 249069	2015-10-01 21:04:13 +00:00
Jonathan Roelofs	86cbf543e0	Kill another reference to in-source builds llvm-svn: 249067	2015-10-01 20:53:59 +00:00
David Majnemer	f828a0ccc7	[WinEH] Make FuncletLayout more robust against catchret Catchret transfers control from a catch funclet to an earlier funclet. However, it is not completely clear which funclet the catchret target is part of. Make this clear by stapling the catchret target's funclet membership onto the CATCHRET SDAG node. llvm-svn: 249052	2015-10-01 18:44:59 +00:00
Chad Rosier	f11d040f01	[AArch64] Deprecate a command-line option used for testing. Support for pairing unscaled loads and stores has been enabled since the original ARM64 port. This feature is no longer experimental, AFAICT. llvm-svn: 249049	2015-10-01 18:17:12 +00:00
Jonas Paulsson	12629324a4	[SystemZ] Add some generic (floating point support) load instructions. Add generic instructions for load complement, load negative and load positive for fp32 and fp64, and let isel prefer them. They do not clobber CC, and so give the scheduler more freedom. The SystemZElimCompare pass will convert them to the CC-setting variants when it can. Regression tests updated to expect the new opcodes in places where the old ones were used. New test case SystemZ/fp-cmp-05.ll checks that SystemZElimCompare.cpp can handle the new opcodes. README.txt updated (bullet removed). Note that fp128 is not yet handled, because it is relatively rare, and is a bit trickier, because of the fact that l.dfr would operate on the sign bit of one of the subregisters of a fp128, but we would not want to copy the other sub-reg in case src and dst regs are not the same. Reviewed by Ulrich Weigand. llvm-svn: 249046	2015-10-01 18:12:28 +00:00
Rafael Espindola	e883514736	Fix printing of 64 bit values and make test more strict. llvm-svn: 249043	2015-10-01 17:57:31 +00:00
Tom Stellard	e0e582c9aa	AMDGPU: Add MEM_RAT STORE_TYPED. v2: Add test (Matt). Fix capitalization of isEOP (Matt). Move pattern to class parameter (Matt). Make the instruction available to Cayman (Matt). Change name from MEM_RAT WRITE_TYPED to MEM_RAT STORE_TYPED. Patch by: Zoltan Gilian llvm-svn: 249042	2015-10-01 17:51:34 +00:00
Tom Stellard	c0f0fba2c4	AMDGPU: Factor out EOP query. v2: Fix brace placement and capitalization (Matt). Patch by: Zoltan Gilian llvm-svn: 249041	2015-10-01 17:51:29 +00:00
NAKAMURA Takumi	096492a07b	Reformat. llvm-svn: 249033	2015-10-01 17:01:03 +00:00
NAKAMURA Takumi	1ed20db720	Revert r248959, "[WinEH] Emit int3 after noreturn calls on Win64" It broke; LLVM :: CodeGen__Generic__2009-11-16-BadKillsCrash.ll llvm-svn: 249032	2015-10-01 17:00:56 +00:00
Rafael Espindola	812f57e6dc	Use more strict types. NFC. On 32 bit ELF these are 32 bit values. llvm-svn: 249022	2015-10-01 15:22:42 +00:00
Arnaud A. de Grandmaison	849f3bf8c9	[InstCombine] Remove trivially empty lifetime start/end ranges. Summary: Some passes may open up opportunities for optimizations, leaving empty lifetime start/end ranges. For example, with the following code: void foo(char *, char *); void bar(int Size, bool flag) { for (int i = 0; i < Size; ++i) { char text[1]; char buff[1]; if (flag) foo(text, buff); // BBFoo } } the loop unswitch pass will create 2 versions of the loop, one with flag==true, and the other one with flag==false, but always leaving the BBFoo basic block, with lifetime ranges covering the scope of the for loop. Simplify CFG will then remove BBFoo in the case where flag==false, but will leave the lifetime markers. This patch teaches InstCombine to remove trivially empty lifetime marker ranges, that is ranges ending right after they were started (ignoring debug info or other lifetime markers in the range). This fixes PR24598: excessive compile time after r234581. Reviewers: reames, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13305 llvm-svn: 249018	2015-10-01 14:54:31 +00:00
Ulrich Weigand	cf1670a095	[SystemZ] Add assembly instructions for obtaining clock values as well as CPU features Provide assembler support for STCK, STCKF, STCKE, and STFLE. Author: joncmu Differential Revision: http://reviews.llvm.org/D13299 llvm-svn: 249015	2015-10-01 14:43:48 +00:00
Chad Rosier	b7c5b91068	[AArch64] Hoist commonly failing check. NFC. llvm-svn: 249011	2015-10-01 13:43:05 +00:00
Chad Rosier	0b15e7c618	[AArch64] Rename variable to improve readability. NFC. llvm-svn: 249008	2015-10-01 13:33:31 +00:00
Chad Rosier	7a83d770ae	[AArch64] Update comment to reflect reality. llvm-svn: 249007	2015-10-01 13:09:44 +00:00
Zoran Jovanovic	2960f3a346	[mips][microMIPS] Implement CACHEE, WRPGPR and WSBH instructions Differential Revision: http://reviews.llvm.org/D10337 llvm-svn: 249004	2015-10-01 12:49:27 +00:00
Scott Douglass	290183d734	[ARM] More care with Thumb1 writeback in ARMLoadStoreOptimizer Differential Revision: http://reviews.llvm.org/D13240 llvm-svn: 249002	2015-10-01 11:56:19 +00:00
Jingyue Wu	df1a1b113b	[NaryReassociate] SeenExprs records WeakVH Summary: The instructions SeenExprs records may be deleted during rewriting. FindClosestMatchingDominator should ignore these deleted instructions. Fixes PR24301. Reviewers: grosser Subscribers: grosser, llvm-commits Differential Revision: http://reviews.llvm.org/D13315 llvm-svn: 248983	2015-10-01 03:51:44 +00:00
Keno Fischer	17433bd102	Fix performance problem in long-running SectionMemoryManagers Summary: Without this patch, the memory manager would call `mprotect` on every memory region it ever allocated whenever it wanted to finalize memory (i.e. not just the ones it just allocated). This caused terrible performance problems for long running memory managers. In one particular compile heavy julia benchmark, we were spending 50% of time in `mprotect` if running under MCJIT. Fix this by splitting allocated memory blocks into those on which memory permissions have been set and those on which they haven't and only running `mprotect` on the latter. Reviewers: lhames Subscribers: reames, llvm-commits Differential Revision: http://reviews.llvm.org/D13156 llvm-svn: 248981	2015-10-01 02:45:07 +00:00
Tom Stellard	1f0e7bbc5b	AMDGPU/SI: Re-order PreloadedValue enum and number entries based on init order Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D12451 llvm-svn: 248978	2015-10-01 02:02:46 +00:00
Davide Italiano	c50ae36509	[llvm-objdump] Fix time of check to time of use bug. There's already a test that covers this situation, so we should be fine. llvm-svn: 248976	2015-10-01 01:02:37 +00:00
David Blaikie	74d8806635	Revert "Enable -Wdeprecated in the cmake build now that LLVM (& Clang, Polly, and LLD) are -Wdeprecated clean" This reverts commit r248963. Seems there's some standard libraries (and libcxxabi implementations) that aren't -Wdeprecated clean... hrm. llvm-svn: 248972	2015-10-01 00:44:21 +00:00
Dehao Chen	7c41dd6498	Update sample profile propagation algorithm. http://reviews.llvm.org/D13218 llvm-svn: 248968	2015-10-01 00:26:56 +00:00
Ahmed Bougacha	23a0d1a1d6	[X86] Don't custom-lower vNi32 uint_to_fp when unsafe-fp-math. The custom code produces incorrect results if later reassociated. Since r221657, on x86, vNi32 uitofp is lowered using an optimized sequence: movdqa LCPI0_0(%rip), %xmm1 ## xmm1 = [65535, ...] pand %xmm0, %xmm1 por LCPI0_1(%rip), %xmm1 ## [0x4b000000, ...] psrld $16, %xmm0 por LCPI0_2(%rip), %xmm0 ## [0x53000000, ...] addps LCPI0_3(%rip), %xmm0 ## [float -5.497642e+11, ...] addps %xmm1, %xmm0 Since r240361, the machine combiner opportunistically reassociates 2-instruction sequences (with -ffast-math). In the new code sequence, the ADDPS' are eligible. In isolation, for simple examples (without reassociable users), this makes no performance difference (the goal being to enable reassociation of longer chains). In the trivial example (just one uitofp), the reassociation doesn't happen, because (I think) it would require the emission of a separate movaps for a constantpool load (instead of folding it into addps). However, when we have multiple uitofp sequences, and the constantpool loads are CSE'd earlier, the machine combiner can do the reassociation. When the ADDPS' are reassociated, the resulting sequence isn't correct anymore, as we'd be adding large (2^39) constants with comparatively smaller values (~2^23). Given that two of the three inputs are powers of 2 larger than 2^16, and that ulp(2^39) == 2^(39-24) == 2^15, the reassociated chain will produce 0 for any input in [0, 2^14[. In my testing, it also produces wrong results for 99.5% of [0, 2^32[. Avoid this by disabling the new lowering when -ffast-math. It does mean that we'll get slower code than without it, but at least we won't get egregiously incorrect code. One might argue that, considering -ffast-math is all but meaningless, uitofp producing wrong results isn't a compiler bug. But it really is. Fixes PR24512. ...though this is really more of a workaround. Ideally, we'd have some sort of Machine FMF, but that's a problem that's not worth tackling until we do more with machine IR. llvm-svn: 248965	2015-10-01 00:11:07 +00:00
David Blaikie	3830d68bc1	Enable -Wdeprecated in the cmake build now that LLVM (& Clang, Polly, and LLD) are -Wdeprecated clean This particularly helps enforce the C++ Rule of 5 (for new move ops this is already an error, but for a type only using C++98 features (copy ctor/assign, dtor) it is only deprecated, not invalid) Applying the flag for any GCC compatible compiler - GCC doesn't warn on the Rule of 5 cases that C++11 deprecates, but it doesn't have other false positives so far as I could see (compiling with GCC 4.8 didn't produce any -Wdeprecated warnings I could spot). Reviewers: aaron.ballman Differential Revision: http://reviews.llvm.org/D13314 llvm-svn: 248963	2015-09-30 23:36:12 +00:00
Reid Kleckner	6dec87a8a0	[WinEH] Emit int3 after noreturn calls on Win64 The Win64 unwinder disassembles forwards from each PC to try to determine if this PC is in an epilogue. If so, it skips calling the EH personality function for that frame. Typically, this means you cannot catch an exception in the same frame that you threw it, because 'throw' calls a noreturn runtime function. Previously we avoided this problem with the TrapUnreachable TargetOption, but that's a much bigger hammer than we need. All we need is a 1 byte non-epilogue instruction right after the call. Instead, what we got was an unconditional branch to a shared block containing the ud2, potentially 7 bytes instead of 1. So, this reverts r206684, which added TrapUnreachable, and replaces it with something better. The new code pattern matches for invoke/call followed by unreachable and inserts an int3 into the DAG. To be 100% watertight, we would need to insert SEH_Epilogue instructions into all basic blocks ending in a call with no terminators or successors, but in practice this is unlikely to come up. llvm-svn: 248959	2015-09-30 23:09:23 +00:00
Hal Finkel	39ac97fa3a	[PowerPC] undef Relocation names in PowerPC.def glibc's PowerPC /usr/include/asm/sigcontext.h, has this: #ifdef __powerpc64__ #include <asm/elf.h> #endif and that contains defines of all of the relocation symbols, like this: #define R_PPC_NONE 0 and if that file is included prior to including include/llvm/Support/ELFRelocs/PowerPC.def, which we cannot in general prevent, the result will fail. As it turns out, this happens when compiling lld/unittests/DriverTests/GnuLdDriverTest.cpp under PPC64/Linux, because: lld/include/lld/ReaderWriter/ELFLinkingContext.h includes lld/unittests/DriverTests/DriverTest.h which includes utils/unittest/googletest/include/gtest/gtest.h which includes utils/unittest/googletest/include/gtest/internal/gtest-internal.h which includes /usr/include/sys/wait.h which includes /usr/include/signal.h which includes /usr/include/bits/sigcontext.h which includes /usr/include/asm/sigcontext.h which includes /usr/include/asm/elf.h the test could be fixed to include ReaderWriter/ELFLinkingContext.h before including unittests/DriverTests/DriverTest.h, but dealing with this in the *.def files is a more-general solution that localizes the fix to the headers instead of requiring changes to an unbounded number of other source files (both in-tree and external). llvm-svn: 248957	2015-09-30 22:34:35 +00:00
Sanjay Patel	a114a10bbe	[x86] enable machine combiner reassociations for 256-bit vector logical integer insts llvm-svn: 248955	2015-09-30 22:25:55 +00:00
Kostya Serebryany	3287d7a6ed	[libFuzzer] Marking exported symbols as visible. Patch by Mike Aizatsky llvm-svn: 248954	2015-09-30 22:22:37 +00:00
Chad Rosier	4c5a4646bf	[AArch64] Remove an unnecessary run line and other cleanup. NFC. Unscaled load/store combining has been enabled since the initial ARM64 port. No need for a redundant run. Also, add CHECK-LABEL directives. llvm-svn: 248945	2015-09-30 21:10:02 +00:00
Michael Zolotukhin	fc783e91e0	[SLP] Don't vectorize loads of non-packed types (like i1, i2). Summary: Given an array of i2 elements, 4 consecutive scalar loads will be lowered to i8-sized loads and thus will access 4 consecutive bytes in memory. If we vectorize these loads into a single <4 x i2> load, it'll access only 1 byte in memory. Hence, we should prohibit vectorization in such cases. PS: Initial patch was proposed by Arnold. Reviewers: aschwaighofer, nadav, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13277 llvm-svn: 248943	2015-09-30 21:05:43 +00:00
David Blaikie	757908e545	Fix -Wsign-compare warning llvm-svn: 248942	2015-09-30 20:37:48 +00:00
Evgeniy Stepanov	422a61306e	Move dw_op_minus test to DebugInfo/X86. The test requires X86 target support, and checks the actual debug info contents, including register numbers which would be different on other platforms. llvm-svn: 248938	2015-09-30 20:23:24 +00:00
Evgeniy Stepanov	f608111d1b	Fix debug info with SafeStack. llvm-svn: 248933	2015-09-30 19:55:43 +00:00
Chad Rosier	11c825f7db	[AArch64] Remove an unnecessary restriction on pre-index instructions. Previously, the index was constrained to the size of the memory operation for no apparent reason. This change removes that constraint so that we can form pre-index instructions with any valid offset. llvm-svn: 248931	2015-09-30 19:44:40 +00:00
Fiona Glaser	b0c6d9174e	DeadCodeElimination: rewrite to be faster Same strategy as simplifyInstructionsInBlock. ~1/3 less time on my test suite. This pass doesn't have many in-tree users, but getting rid of an O(N^2) worst case and making it cleaner should at least make it a viable alternative to ADCE, since it's now consistently somewhat faster. llvm-svn: 248927	2015-09-30 17:49:49 +00:00
Hal Finkel	4c45775880	[PowerPC] Disable shrink wrapping Shrink wrapping is causing a self-hosting failure on PPC64/Linux. Disable for now until the problem can be fixed. llvm-svn: 248924	2015-09-30 17:29:03 +00:00
Erik Eckstein	91c49810f2	SLPVectorizer: add a test to check if the minimum region size works. This is an addition to rL248917. llvm-svn: 248923	2015-09-30 17:28:19 +00:00
Artyom Skrobov	72ca6b8f3f	[ARM] Support for ARMv6-Z / ARMv6-ZK missing As Richard Barton observed at http://reviews.llvm.org/D12937#inline-107121 TargetParser in LLVM has insufficient support for ARMv6Z and ARMv6ZK. In particular, there were no tests for TrustZone being supported in these architectures. The patch clears a FIXME: left by Saleem Abdulrasool in r201471, and fixes his test case which hadn't really been testing what it was claiming to test. Differential Revision: http://reviews.llvm.org/D13236 llvm-svn: 248921	2015-09-30 17:25:52 +00:00
Erik Eckstein	848c1aa452	SLPVectorizer: limit the scheduling region size per basic block. Usually large blocks are not a problem. But if a large block (> 10k instructions) contains many (potential) chains of vector instructions, and those chains are spread over a wide range of instructions, then scheduling becomes a compile time problem. This change introduces a limit for the accumulated scheduling region size of a block. For real-world functions this limit will never be exceeded (it's about 10x larger than the maximum value seen in the test-suite and external test suite). llvm-svn: 248917	2015-09-30 17:00:44 +00:00
Chad Rosier	4f04e2ec87	[AArch64] Use helper function to improve readability. NFC. llvm-svn: 248914	2015-09-30 16:50:41 +00:00
Andrea Di Biagio	0594e2a1e9	[InstCombine] Teach how to convert SSSE3/AVX2 byte shuffles to builtin shuffles if the shuffle mask is constant. This patch teaches InstCombiner how to convert a SSSE3/AVX2 byte shuffle to a builtin shuffle if the mask is constant. Converting byte shuffle intrinsic calls to builtin shuffles can help finding more opportunities for combining shuffles later on in selection dag. We may end up with byte shuffles with constant masks as the result of inlining. Differential Revision: http://reviews.llvm.org/D13252 llvm-svn: 248913	2015-09-30 16:44:39 +00:00
John Brawn	c11ef2a89c	[CMake] Make the bindir and libdir arguments to set_output_directory optional Building a plugin against an installed LLVM toolchain using add_llvm_loadable_module (in the documented manner) doesn't work, as nothing sets the *_OUTPUT_INTDIR variables, causing an error when set_output_directory is called. Making those arguments optional (causing the default output directory to be used) fixes this. Differential Revision: http://reviews.llvm.org/D13215 llvm-svn: 248911	2015-09-30 15:20:51 +00:00
Teresa Johnson	eaa3d2a63c	Add support for sub-byte aligned writes to lib/Support/Endian.h Summary: As per Duncan's review for D12536, I extracted the sub-byte bit aligned reading and writing code into lib/Support, and generalized it. Added calls from BackpatchWord. Also added unittests. Reviewers: dexonsmith Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13189 llvm-svn: 248897	2015-09-30 13:20:37 +00:00
Artur Pilipenko	029d8531e6	Refactor computeKnownBits alignment handling code Reviewed By: reames, hfinkel Differential Revision: http://reviews.llvm.org/D12958 llvm-svn: 248892	2015-09-30 11:55:45 +00:00
Jeroen Ketema	ab99b59e8c	[ARM][NEON] Use address space in vld([1234]|[234]lane) and vst([1234]|[234]lane) instructions This commit changes the interface of the vld[1234], vld[234]lane, and vst[1234], vst[234]lane ARM neon intrinsics and associates an address space with the pointer that these intrinsics take. This changes, e.g., <2 x i32> @llvm.arm.neon.vld1.v2i32(i8*, i32) to <2 x i32> @llvm.arm.neon.vld1.v2i32.p0i8(i8*, i32) This change ensures that address spaces are fully taken into account in the ARM target during lowering of interleaved loads and stores. Differential Revision: http://reviews.llvm.org/D12985 llvm-svn: 248887	2015-09-30 10:56:37 +00:00
John Brawn	8a3ec2aec2	[CMake] Adjust the variables set by LLVMConfig.cmake When using LLVMConfig.cmake from an installed toolchain in order to build a loadable pass using add_llvm_loadable_module LLVM_ENABLE_PLUGINS and LLVM_PLUGIN_EXT must be set. Also make LLVM_DEFINITIONS be set to what it actually is. Differential Revision: http://reviews.llvm.org/D13214 llvm-svn: 248884	2015-09-30 10:34:06 +00:00
Simon Pilgrim	3d11c994f7	[X86][XOP] Added support for the lowering of 128-bit vector shifts to XOP shift instructions The XOP shifts just have logical/arithmetic versions and the left/right shifts are controlled by whether the value is positive/negative. Because of this I've added new X86ISD nodes instead of trying to force them to use the existing shift nodes. Additionally Excavator cores (bdver4) support XOP and AVX2 - meaning that it should use the AVX2 shifts when it can and fall back to XOP in other cases. Differential Revision: http://reviews.llvm.org/D8690 llvm-svn: 248878	2015-09-30 08:17:50 +00:00
Justin Bogner	75df7187f3	InstrProf: Don't call std::unique twice here llvm-svn: 248872	2015-09-30 02:02:08 +00:00
Dehao Chen	aae9e1f2bd	Add unittest for new sample profile format. http://reviews.llvm.org/D13145 llvm-svn: 248870	2015-09-30 01:05:37 +00:00
Dehao Chen	6722688eaa	http://reviews.llvm.org/D13145 Support hierarchical sample profile format. llvm-svn: 248865	2015-09-30 00:42:46 +00:00
Evgeniy Stepanov	d3f544f271	[safestack] Fix a stupid mix-up in the direct-tls code path. llvm-svn: 248863	2015-09-30 00:01:47 +00:00
Justin Bogner	10c7e148c4	InstrProf: Add a missing const_cast from r248833 llvm-svn: 248859	2015-09-29 23:42:47 +00:00
Marek Olsak	d1a69a2839	AMDGPU/SI: Don't set DATA_FORMAT if ADD_TID_ENABLE is set to prevent setting a huge stride, because DATA_FORMAT has a different meaning if ADD_TID_ENABLE is set. This is a candidate for stable llvm 3.7. Tested-and-Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 248858	2015-09-29 23:37:32 +00:00
Reid Kleckner	a13dfd539b	[WinEH] Setup RBP correctly in Win64 funclet prologues Previously local variable captures just didn't work in 64-bit. Now we can access local variables more or less correctly. llvm-svn: 248857	2015-09-29 23:32:01 +00:00
David Majnemer	91b0ab9172	[WinEH] Ensure that funclets obey the x64 ABI The x64 ABI requires that epilogues do not contain code other than stack adjustments and some limited control flow. However, we'd insert code to initialize the return address after stack adjustments. Instead, insert EAX/RAX with the current value before we create the stack adjustments in the epilogue. llvm-svn: 248839	2015-09-29 22:33:36 +00:00
Justin Bogner	9e9a057a9b	InstrProf: Support for value profiling in the indexed profile format Add support to the indexed instrprof reader and writer for the format that will be used for value profiling. Patch by Betul Buyukkurt, with minor modifications. llvm-svn: 248833	2015-09-29 22:13:58 +00:00
Maksim Panchenko	cce239c45d	HHVM calling conventions. HHVM calling convention, hhvmcc, is used by HHVM JIT for functions in translated cache. We currently support LLVM back end to generate code for X86-64 and may support other architectures in the future. In HHVM calling convention any GP register could be used to pass and return values, with the exception of R12 which is reserved for thread-local area and is callee-saved. Other than R12, we always pass RBX and RBP as args, which are our virtual machine's stack pointer and frame pointer respectively. When we enter translation cache via hhvmcc function, we expect the stack to be aligned at 16 bytes, i.e. skewed by 8 bytes as opposed to standard ABI alignment. This affects stack object alignment and stack adjustments for function calls. One extra calling convention, hhvm_ccc, is used to call C++ helpers from HHVM's translation cache. It is almost identical to standard C calling convention with an exception of first argument which is passed in RBP (before we use RDI, RSI, etc.) Differential Revision: http://reviews.llvm.org/D12681 llvm-svn: 248832	2015-09-29 22:09:16 +00:00
Chad Rosier	1769d8505f	Fix test from r248825. llvm-svn: 248827	2015-09-29 20:50:15 +00:00
Chad Rosier	4315012769	[AArch64] Add support for pre- and post-index LDPSWs. llvm-svn: 248825	2015-09-29 20:39:55 +00:00
David Majnemer	a80c151286	[WinEH] Teach AsmPrinter about funclets Summary: Funclets have been turned into functions by the time they hit the object file. Make sure that they have decent names for the symbol table and CFI directives explaining how to reason about their prologues. Differential Revision: http://reviews.llvm.org/D13261 llvm-svn: 248824	2015-09-29 20:12:33 +00:00
Zachary Turner	4dddcc64d3	[llvm-pdbdump] Add include-only filters. PDB files have a lot of noise in them, with hundreds (or thousands) of symbols from system libraries and compiler generated types. If you're only looking for a specific type, this can be problematic. This CL allows you to display only types, variables, or compilands matching a particular pattern. These filters can even be combined with exclude filters. Include-only filters are given priority, so that first the set of items to display is limited only to those that match the include filters, and then the set of exclude filters is applied to those. If there are no include filters specified, then it means "display everything". llvm-svn: 248822	2015-09-29 19:49:06 +00:00
Cong Hou	166e08542e	Rename some function arguments in MachineBasicBlock.cpp/h by turning the first letter into upper case. NFC. llvm-svn: 248821	2015-09-29 19:46:09 +00:00
Dehao Chen	8e7df83e6a	http://reviews.llvm.org/D13231 Change lookup functions to const functions. llvm-svn: 248818	2015-09-29 18:28:15 +00:00
Chad Rosier	dabe2534ed	[AArch64] Add integer pre- and post-index halfword/byte loads and stores. llvm-svn: 248817	2015-09-29 18:26:15 +00:00
Dehao Chen	028e122ca9	Revert r248810 which breaks tests. llvm-svn: 248814	2015-09-29 18:18:49 +00:00
Hans Wennborg	cc9deb4801	Fix Clang-tidy modernize-use-nullptr warnings in examples and include directories; other minor cleanups. Patch by Eugene Zelenko! Differential Revision: http://reviews.llvm.org/D13172 llvm-svn: 248811	2015-09-29 18:02:48 +00:00
Dehao Chen	410a25aa7a	http://reviews.llvm.org/D13231 Change lookup functions to const functions. llvm-svn: 248810	2015-09-29 17:59:58 +00:00
Nemanja Ivanovic	2c84b29464	Addition of interfaces the BE to conform to Table A-2 of ELF V2 ABI V1.1 This patch corresponds to review: http://reviews.llvm.org/D13191 Back end portion of the fifth round of additions to altivec.h. llvm-svn: 248809	2015-09-29 17:41:53 +00:00
Chad Rosier	32d4d37e61	[AArch64] Scale offsets by the size of the memory operation. NFC. The immediate in the load/store should be scaled by the size of the memory operation, not the size of the register being loaded/stored. This change gets us one step closer to forming LDPSW instructions. This change also enables pre- and post-indexing for halfword and byte loads and stores. llvm-svn: 248804	2015-09-29 16:07:32 +00:00

122165 Commits