llvm-project

Commit Graph

Author	SHA1	Message	Date
JF Bastien	3428ed4f53	WebAssembly: don't omit dead vregs from locals Summary: This is a temporary hack until we get around to remapping the vreg numbers to local numbers. Dead vregs cause bad numbering and make consumers sad. We could also just look at debug info and use named locals instead, but vregs have to work properly anyways so there! Reviewers: binji, sunfish Subscribers: jfb, llvm-commits, dschuff Differential Revision: http://reviews.llvm.org/D13839 llvm-svn: 250594	2015-10-17 00:25:38 +00:00
JF Bastien	4f43e80ece	WebAssembly: fix the syntax for comparisons Summary: It has also slightly changed. Reviewers: binji Subscribers: jfb, dschuff, llvm-commits, sunfish Differential Revision: http://reviews.llvm.org/D13837 llvm-svn: 250591	2015-10-17 00:12:29 +00:00
Matthias Braun	96e411b90c	RegisterPressure: Hide non-const iterators of PressureDiff It is too easy to accidentally violate the ordering requirements when modifying the PressureDiff entries through iterators. llvm-svn: 250590	2015-10-17 00:08:48 +00:00
Joseph Tremoulet	55b51e9dcc	[WinEH] Fix eh.exceptionpointer intrinsic lowering Summary: Some shared code for handling eh.exceptionpointer and eh.exceptioncode needs to not share the part that truncates to 32 bits, which is intended just for exception codes. Reviewers: rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13747 llvm-svn: 250588	2015-10-17 00:08:08 +00:00
Reid Kleckner	28e490342b	[WinEH] Fix stack alignment in funclets and ParentFrameOffset calculation Our previous value of "16 + 8 + MaxCallFrameSize" for ParentFrameOffset is incorrect when CSRs are involved. We were supposed to have a test case to catch this, but it wasn't very rigorous. The main effect here is that calling _CxxThrowException inside a catchpad doesn't immediately crash on MOVAPS when you have an odd number of CSRs. llvm-svn: 250583	2015-10-16 23:43:27 +00:00
Matthias Braun	fdee8ec2bd	RegisterPressure: Use range based for, cleanup llvm-svn: 250579	2015-10-16 23:25:09 +00:00
Kostya Serebryany	d6edce97fb	[libFuzzer] print a stack trace on timeout llvm-svn: 250571	2015-10-16 23:04:31 +00:00
Benjamin Kramer	b43d33bf0f	Revert "This is a follow-up to the discussion in D12882." Breaks clang selfhost, see PR25222. This reverts commits r250527 and r250528. llvm-svn: 250570	2015-10-16 23:00:29 +00:00
Kostya Serebryany	a9da9b48ef	[libFuzzer] reduce the size of artifacts printed on the screen llvm-svn: 250565	2015-10-16 22:47:20 +00:00
Kostya Serebryany	b91c62b1f3	[libFuzzer] When -test_single_input crashes the test it is not necessary to write crash-file because input is already known to the user. Patch by Mike Aizatsky llvm-svn: 250564	2015-10-16 22:41:47 +00:00
Sanjay Patel	bbd524496c	[x86] promote 'add nsw' to a wider type to allow more combines The motivation for this patch starts with PR20134: https://llvm.org/bugs/show_bug.cgi?id=20134 void foo(int *a, int i) { a[i] = a[i+1] + a[i+2]; } It seems better to produce this (14 bytes): movslq %esi, %rsi movl 0x4(%rdi,%rsi,4), %eax addl 0x8(%rdi,%rsi,4), %eax movl %eax, (%rdi,%rsi,4) Rather than this (22 bytes): leal 0x1(%rsi), %eax cltq leal 0x2(%rsi), %ecx movslq %ecx, %rcx movl (%rdi,%rcx,4), %ecx addl (%rdi,%rax,4), %ecx movslq %esi, %rax movl %ecx, (%rdi,%rax,4) The most basic problem (the first test case in the patch combines constants) should also be fixed in InstCombine, but it gets more complicated after that because we need to consider architecture and micro-architecture. For example, AArch64 may not see any benefit from the more general transform because the ISA solves the sexting in hardware. Some x86 chips may not want to replace 2 ADD insts with 1 LEA, and there's an attribute for that: FeatureSlowLEA. But I suspect that doesn't go far enough or maybe it's not getting used when it should; I'm also not sure if FeatureSlowLEA should also mean "slow complex addressing mode". I see no perf differences on test-suite with this change running on AMD Jaguar, but I see small code size improvements when building clang and the LLVM tools with the patched compiler. A more general solution to the sext(add nsw(x, C)) problem that works for multiple targets is available in CodeGenPrepare, but it may take quite a bit more work to get that to fire on all of the test cases that this patch takes care of. Differential Revision: http://reviews.llvm.org/D13757 llvm-svn: 250560	2015-10-16 22:14:12 +00:00
Jim Grosbach	0fdd572763	MC: Don't crash after issuing a diagnostic. Crashing is bad, m'kay? Fixing a 4 year old bug of my own creation. Adding the testcase now which I should have added then which would have long since caught this. The problem is that printMessage() will display the diagnostic but not set HadError to true, resulting in the assembler continuing on its way and trying to create relocations for things that may not allow them or otherwise get itself into trouble. Using the Error() helper function here rather than calling printMessage() directly resolves this. rdar://23133240 llvm-svn: 250557	2015-10-16 22:07:59 +00:00
Joseph Tremoulet	d11a998e81	[WinEH] Fix CatchRetSuccessorColorMap accounting Summary: We now use the block for the catchpad itself, rather than its normal successor, as the funclet entry. Putting the normal successor in the map leads downstream funclet membership computations to erroneous results. Reviewers: majnemer, rnk Subscribers: rnk, llvm-commits Differential Revision: http://reviews.llvm.org/D13798 llvm-svn: 250552	2015-10-16 21:22:54 +00:00
Andrew Kaylor	09b39acc03	Fix assertion failure with fp128 to unsigned i64 conversion Patch by Mitch Bodart Differential Revision: http://reviews.llvm.org/D13780 llvm-svn: 250550	2015-10-16 20:39:20 +00:00
Krzysztof Parzyszek	a7c5f0409c	[Hexagon] Split double registers llvm-svn: 250549	2015-10-16 20:38:54 +00:00
David Majnemer	e696583dba	[WinEH] Remove dead code/includes from WinEHPrepare No functionality change is intended. llvm-svn: 250545	2015-10-16 19:59:52 +00:00
Krzysztof Parzyszek	aec39c68ae	[Hexagon] Delete lib/Target/Hexagon/HexagonRemoveSZExtArgs.cpp llvm-svn: 250543	2015-10-16 19:51:53 +00:00
Krzysztof Parzyszek	5b7dd0cdf9	[Hexagon] Merge adjacent stores llvm-svn: 250542	2015-10-16 19:43:56 +00:00
Diego Novillo	b93483dbce	Sample profiles - Re-arrange binary format to emit head samples only on top functions. The number of samples collected at the head of a function only makes sense for top-level functions (i.e., those actually called as opposed to being inlined inside another). Head samples essentially count the time spent inside the function's prologue. This clearly doesn't make sense for inlined functions, so we were always emitting 0 in those. llvm-svn: 250539	2015-10-16 18:54:35 +00:00
JF Bastien	6126d2b883	WebAssembly: fix load/store syntax Summary: The syntax has changed a bit recently. Reviewers: binji Subscribers: llvm-commits, jfb, sunfish, dschuff Differential Revision: http://reviews.llvm.org/D13821 llvm-svn: 250535	2015-10-16 18:24:42 +00:00
Joseph Tremoulet	53e9cbd95a	[WinEH] Fix endpad coloring/numbering Summary: When a cleanup's cleanupendpad or cleanupret targets a catchendpad, stop trying to propagate the cleanup's parent's color to the catchendpad, since what's needed is the cleanup's grandparent's color and the catchendpad will get that color from the catchpad linkage already. We already had this exclusion for invokes, but were missing it for cleanupendpad/cleanupret. Also add a missing line that tags cleanupendpads' states in the EHPadStateMap, without which lowering invokes that target cleanupendpads which unwind to other handlers (and so don't have the -1 state) will fail. This fixes the reduced IR repro in PR25163. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13797 llvm-svn: 250534	2015-10-16 18:08:16 +00:00
Sanjay Patel	374dd8d88e	This is a follow-up to the discussion in D12882. Ideally, we would like SimplifyCFG to be able to form select instructions even when the operands are expensive (as defined by the TTI cost model) because that may expose further optimizations. However, we would then like a later pass like CodeGenPrepare to undo that transformation if the target would likely benefit from not speculatively executing an expensive op (this patch). Once we have this safety mechanism in place, we can adjust SimplifyCFG to restore its select-formation behavior that changed with r248439. Differential Revision: http://reviews.llvm.org/D13297 llvm-svn: 250527	2015-10-16 16:54:30 +00:00
JF Bastien	53bd975033	WebAssembly: relooper analysis pass Summary: Make the relooper an analysis pass, to convert CFG to AST. Reviewers: sunfish Subscribers: jfb, dschuff Differential Revision: http://reviews.llvm.org/D12744 llvm-svn: 250524	2015-10-16 16:35:49 +00:00
Charlie Turner	434d4599d4	[AArch64] Implement vector splitting on UADDV. Summary: Fixes PR25056. Reviewers: mcrosier, junbuml, jmolloy Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D13466 llvm-svn: 250520	2015-10-16 15:38:25 +00:00
Hrvoje Varga	3c88fbd367	[mips][microMIPS] Implement LB, LBE, LBU and LBUE instructions Differential Revision: http://reviews.llvm.org/D11633 llvm-svn: 250511	2015-10-16 12:24:58 +00:00
Pawel Bylica	7187e4bba9	Use Windows Vista API to get the user's home directory Summary: This patch replaces usage of deprecated SHGetFolderPathW with SHGetKnownFolderPath. The usage of SHGetKnownFolderPath is wrapped to allow queries for other "known" folders in the near future. Reviewers: aaron.ballman, gbedwell Subscribers: chapuni, llvm-commits Differential Revision: http://reviews.llvm.org/D13753 llvm-svn: 250501	2015-10-16 09:08:59 +00:00
Craig Topper	09b6598572	[X86] Add fxsr feature flag for fxsave/fxrestore instructions. llvm-svn: 250497	2015-10-16 06:03:09 +00:00
Dylan McKay	b1d469c657	Initial migration of AVR backend This patch adds the underlying infrastructure for an AVR backend to be included into LLVM. It is the first of a series of patches aimed at moving the out-of-tree AVR backend into the tree. It consists of adding a new `Triple` target 'avr'. llvm-svn: 250492	2015-10-16 03:10:30 +00:00
Sanjoy Das	58fae7cf6b	[RS4GC] Don't propagate call attrs related to patchable statepoints The `"statepoint-id"` and `"statepoint-num-patch-bytes"` attributes are used solely to determine properties of the `gc.statepoint` being created. Once the `gc.statepoint` is in place, these should be removed. llvm-svn: 250491	2015-10-16 02:41:23 +00:00
Sanjoy Das	810a59d037	[RS4GC] Bring legalizeCallAttributes up to LLVM coding style; NFC llvm-svn: 250490	2015-10-16 02:41:11 +00:00
Sanjoy Das	25ec1a3e60	[RS4GC] Use "deopt" operand bundles Summary: This is a step towards using operand bundles to carry deopt state till RewriteStatepointsForGC. The change adds a flag to RewriteStatepointsForGC that teaches it to pick up deopt state from a `"deopt"` operand bundle attached to the `call` or `invoke` it is wrapping. The command line flag added, `-rs4gc-use-deopt-bundles`, will only exist for a short while. Once we are able to pipe deopt bundle state through the full optimization pipeline without problems, we will "constant fold" `-rs4gc-use-deopt-bundles` to `true`. Reviewers: swaroop.sridhar, reames Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D13372 llvm-svn: 250489	2015-10-16 02:41:00 +00:00
Sanjoy Das	7360f30852	[IndVars] Rename getExtend; NFC Rename `IndVarSimplify::getExtend` to `IndVarSimplify::createExtendInst` to make it obvious that it creates `llvm::Instruction` s. llvm-svn: 250484	2015-10-16 01:00:50 +00:00
Sanjoy Das	37e87c2023	[IndVars] Have `cloneArithmeticIVUser` guess better Summary: `cloneArithmeticIVUser` currently trips over expressions like `add %iv, -1` when `%iv` is being zero extended -- it tries to construct the widened use as `add %iv.zext, zext(-1)` and (correctly) fails to prove equivalence to `zext(add %iv, -1)` (here the SCEV for `%iv` is `{1,+,1}`). This change teaches `IndVars` to try sign extending the non-IV operand if that makes the newly constructed IV use equivalent to the widened narrow IV use. Reviewers: atrick, hfinkel, reames Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D13717 llvm-svn: 250483	2015-10-16 01:00:47 +00:00
Sanjoy Das	472840a3d3	[IndVars] Extract out a few local variables; NFC llvm-svn: 250482	2015-10-16 01:00:44 +00:00
Sanjoy Das	1fd184e5a2	[IndVars] Split `WidenIV::cloneIVUser`; NFC Summary: This NFC splitting is intended to make a later diff easier to follow. It just tail duplicates `cloneIVUser` into `cloneArithmeticIVUser` and `cloneBitwiseIVUser`. Reviewers: atrick, hfinkel, reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13716 llvm-svn: 250481	2015-10-16 01:00:39 +00:00
JF Bastien	1d20a5e9e8	WebAssembly: update syntax Summary: Follow the same syntax as for the spec repo. Both have evolved slightly independently and need to converge again. This, along with wasmate changes, allows me to do the following: echo "int add(int a, int b) { return a + b; }" > add.c ./out/bin/clang -O2 -S --target=wasm32-unknown-unknown add.c -o add.wack ./experimental/prototype-wasmate/wasmate.py add.wack > add.wast ./sexpr-wasm-prototype/out/sexpr-wasm add.wast -o add.wasm ./sexpr-wasm-prototype/third_party/v8-native-prototype/v8/v8/out/Release/d8 -e "print(WASM.instantiateModule(readbuffer('add.wasm'), {print:print}).add(42, 1337));" As you'd expect, the d8 shell prints out the right value. Reviewers: sunfish Subscribers: jfb, llvm-commits, dschuff Differential Revision: http://reviews.llvm.org/D13712 llvm-svn: 250480	2015-10-16 00:53:49 +00:00
Evgeniy Stepanov	9addbc9fc1	Revert "[safestack] Fast access to the unsafe stack pointer on AArch64/Android." Breaks the hexagon buildbot. llvm-svn: 250461	2015-10-15 21:26:49 +00:00
Adrian Prantl	96b1551d53	Replace a forward declaration with an #include. When building with modules the forward-declared inner class DebugLocStream::ListBuilder causes clang to fall over. llvm-svn: 250459	2015-10-15 20:58:55 +00:00
Evgeniy Stepanov	142947e9f0	[safestack] Fast access to the unsafe stack pointer on AArch64/Android. Android libc provides a fixed TLS slot for the unsafe stack pointer, and this change implements direct access to that slot on AArch64 via __builtin_thread_pointer() + offset. This change also moves more code into TargetLowering and its target-specific subclasses to get rid of target-specific codegen in SafeStackPass. This change does not touch the ARM backend because ARM lowers __builtin_thread_pointer as aeabi_read_tp, which is not available on Android. llvm-svn: 250456	2015-10-15 20:50:16 +00:00
Adrian Prantl	d8596384e7	Add a missing include of cstddef needed for size_t. llvm-svn: 250446	2015-10-15 19:41:54 +00:00
JF Bastien	2cdd5e4710	x86: preserve flags when folding atomic operations D4796 taught LLVM to fold some atomic integer operations into a single instruction. The pattern was unaware that the instructions clobbered flags. I fixed some of this issue in D13680 but had missed INC/DEC. This patch adds the missing EFLAGS definition. llvm-svn: 250438	2015-10-15 18:24:52 +00:00
Benjamin Kramer	bacc7ba7aa	[SelectionDAG] Remove dead code. NFC. Carefully selected parts without deleting graph stuff and dumping methods. llvm-svn: 250434	2015-10-15 17:54:06 +00:00
Benjamin Kramer	7fa42c8a8c	[AsmPrinter] Prune dead code. NFC. I left all (dead) print and dump methods in place. llvm-svn: 250433	2015-10-15 17:16:32 +00:00
Philip Reames	a956cc7f08	Revert 250343 and 250344 Turns out this approach is buggy. In discussion about follow on work, Sanjoy pointed out that we could be subject to circular logic problems. Consider: if (i u< L) leave() if ((i + 1) u< L) leave() print(a[i] + a[i+1]) If we know that L is less than UINT_MAX, we could possibly prove (in a control dependent way) that i + 1 does not overflow. This gives us: if (i u< L) leave() if ((i +nuw 1) u< L) leave() print(a[i] + a[i+1]) If we now do the transform this patch proposed, we end up with: if ((i +nuw 1) u< L) leave_appropriately() print(a[i] + a[i+1]) That would be a miscompile when i==-1. The problem here is that the control dependent nuw bits got used to prove something about the first condition. That's obviously invalid. This won't happen today, but since I plan to enhance LVI/CVP with exactly that transform at some point in the not too distant future... llvm-svn: 250430	2015-10-15 16:51:00 +00:00
JF Bastien	5b327712b0	x86 FP atomic codegen: don't drop globals, stack Summary: x86 codegen is clever about generating good code for relaxed floating-point operations, but it was being silly when globals and immediates were involved, forgetting where the global was and loading/storing from/to the wrong place. The same applied to hard-coded address immediates. Don't let it forget about the displacement. This fixes https://llvm.org/bugs/show_bug.cgi?id=25171 A very similar bug when doing floating-point atomics to the stack is also fixed by this patch. This fixes https://llvm.org/bugs/show_bug.cgi?id=25144 Reviewers: pete Subscribers: llvm-commits, majnemer, rsmith Differential Revision: http://reviews.llvm.org/D13749 llvm-svn: 250429	2015-10-15 16:46:29 +00:00
Diego Novillo	38be33302c	Sample Profiles - Adjust integer types. Mostly NFC. This adjusts all integers in the reader/writer to reflect the types stored on profile files. They should all be unsigned 32-bit or 64-bit values. Changed all associated internal types to be uint32_t or uint64_t. The only place that needed some adjustments is in the sample profile transformation. Although the weights read from the profile are 64-bit values, the internal API for branch weights only accepts 32-bit values. The pass now saturates weights that overflow uint32_t. llvm-svn: 250427	2015-10-15 16:36:21 +00:00
Tim Northover	0515291c52	Prevent assertion with "llc -debug" and anonymous symbols. llvm-svn: 250425	2015-10-15 16:18:27 +00:00
Benjamin Kramer	6db3338cb1	[ScalarOpts] Remove dead code. Does not touch debug dumpers. NFC. llvm-svn: 250417	2015-10-15 15:08:58 +00:00
Manman Ren	72d44b1b09	Recommit r250345, it was reverted in r250366 to investigate a bot failure. Our internal bot is still red after r250366. llvm-svn: 250415	2015-10-15 14:59:40 +00:00
Daniel Sanders	6394ee598e	[mips][ias] Implement ulh macro. Summary: This macro is needed to prevent test/CodeGen/Mips/2008-08-01-AsmInline.ll from failing after the integrated assembler is enabled by default. Reviewers: vkalintiris Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D13654 llvm-svn: 250414	2015-10-15 14:52:58 +00:00
Pawel Bylica	6b129bd464	Require Windows API of version 6.1 (Windows 7). llvm-svn: 250413	2015-10-15 14:50:31 +00:00
Benjamin Kramer	c5275bdec1	[NVPTX] Remove dead code. I left helpers that look useful for debugging alone. NFC. llvm-svn: 250410	2015-10-15 14:45:41 +00:00
Daniel Sanders	8008de5551	[mips][mips16] MIPS16 is not a CPU/Architecture but is an ASE. Summary: The -mcpu=mips16 option caused the Integrated Assembler to crash because it couldn't figure out the architecture revision number to write to the .MIPS.abiflags section. This CPU definition has been removed because, like microMIPS, MIPS16 is an ASE to a base architecture. Reviewers: vkalintiris Subscribers: rkotler, llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D13656 llvm-svn: 250407	2015-10-15 14:34:23 +00:00
Benjamin Kramer	5dfcda73d5	[X86] Rip out orphaned method declarations and other dead code. NFC. llvm-svn: 250406	2015-10-15 14:09:59 +00:00
Aaron Ballman	58f413c518	Silencing a -Wtype-limits warning; an unsigned value will always be >= 0; NFC. llvm-svn: 250404	2015-10-15 13:55:43 +00:00
Igor Breger	d7bae451de	AVX512: Implemented DAG lowering for shuff64x2/shufi64x2 instructions ( shuffle packed values at 128-bit granularity ) Differential Revision: http://reviews.llvm.org/D13648 llvm-svn: 250400	2015-10-15 13:29:07 +00:00
Igor Breger	b4bb190eed	AVX512: Implemented encoding and intrinsics for vpternlogd/q. Differential Revision: http://reviews.llvm.org/D13768 llvm-svn: 250396	2015-10-15 12:33:24 +00:00
Elena Demikhovsky	ecff21b297	AVX-512: Fixed a bug in shuffle lowering 32-bit mode AVX-512 bit shuffle fails on 32 bit since we create a vector of 64-bit constants. I split 8x64-bit const vector to 16x32 on 32-bit mode. Differential Revision: http://reviews.llvm.org/D13644 llvm-svn: 250390	2015-10-15 11:35:33 +00:00
Artyom Skrobov	63471330d2	Don't pretend AMDGPU backend knows how to custom-lower UDIVREM for vector types; it can't Reviewers: arsenm, jvesely, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13734 llvm-svn: 250384	2015-10-15 09:18:47 +00:00
Zlatko Buljan	54b1eb4c73	[mips][microMIPS] Implement DPA.W.PH, DPAQ_S.W.PH, DPAQ_SA.L.W, DPAQX_S.W.PH, DPAQX_SA.W.PH, DPAU.H.QBL, DPAU.H.QBR and DPAX.W.PH instructions Differential Revision: http://reviews.llvm.org/D13376 llvm-svn: 250382	2015-10-15 08:59:45 +00:00
Hrvoje Varga	3a3c4b8a39	[mips][microMIPS] Implement BREAK16, LI16, MOVE16, SDBBP16, SUBU16 and XOR16 instructions Differential Revision: http://reviews.llvm.org/D11292#inline-103143 llvm-svn: 250381	2015-10-15 08:39:07 +00:00
Hrvoje Varga	3ef4dd7bc8	[mips][microMIPS] Implement LLE and SCE instructions Differential Revision: http://reviews.llvm.org/D11630 llvm-svn: 250379	2015-10-15 08:11:50 +00:00
Hrvoje Varga	a766eff5a0	[mips][microMIPS] Implement LWLE, LWRE, SWLE and SWRE instructions Differential Revision: http://reviews.llvm.org/D11631 llvm-svn: 250377	2015-10-15 07:23:06 +00:00
Eric Christopher	bdafb3cd1c	Remove DIFile from createSubroutineType. Patch by Amaury Sechet with a small modification by me. llvm-svn: 250374	2015-10-15 06:56:10 +00:00
Lang Hames	86a4593dd2	[RuntimeDyld] Don't try to get the contents of sections that don't have any (e.g. bss sections). MachO and ELF have been silently letting this pass, but COFFObjectFile contains an assertion to catch this kind of (ab)use of the getSectionContents, and this was causing the JIT to crash on COFF objects with BSS sections. This patch should fix that. llvm-svn: 250371	2015-10-15 06:41:45 +00:00
Akira Hatanaka	8ad7399f8e	[MachO] Stop generating coal sections. Recommit r250342: move coal-sections-powerpc.s to subdirectory for powerpc. Some background on why we don't have to use coal sections anymore: Long ago when C++ was new and "weak" had not been standardized, an attempt was made in cctools to support C++ inlines that can be coalesced by putting them into their own section (TEXT/textcoal_nt instead of TEXT/text). The current macho linker supports the weak-def bit on any symbol to allow it to be coalesced, but the compiler still puts weak-def functions/data into alternate section names, which the linker must map back to the base section name. This patch makes changes that are necessary to prevent the compiler from using the "coal" sections and have it use the non-coal sections instead when the target architecture is not powerpc: TEXT/textcoal_nt instead use TEXT/text TEXT/const_coal instead use TEXT/const DATA/datacoal_nt instead use DATA/data If the target is powerpc, we continue to use the coal sections since anyone targeting powerpc is probably using an old linker that doesn't have support for the weak-def bits. Also, have the assembler issue a warning if it encounters a coal section in the assembly file and inform the users to use the non-coal sections instead. rdar://problem/14265330 Differential Revision: http://reviews.llvm.org/D13188 llvm-svn: 250370	2015-10-15 05:28:38 +00:00
Hrvoje Varga	8c9526400e	Test commit. llvm-svn: 250367	2015-10-15 05:20:51 +00:00
Manman Ren	f5499fd9d5	Temporarily revert r250345 to sort out bot failure. With r250345 and r250343, we start to observe the following failure when bootstrap clang with lto and pgo: PHI node entries do not match predecessors! %.sroa.029.3.i = phi %"class.llvm::SDNode.13298"* [ null, %30953 ], [ null, %31017 ], [ null, %30998 ], [ null, %_ZN4llvm8dyn_castINS_14ConstantSDNodeENS_7SDValueEEENS_10cast_rettyIT_T0_E8ret_typeERS5_.exit.i.1804 ], [ null, %30975 ], [ null, %30991 ], [ null, %_ZNK4llvm3EVT13getScalarTypeEv.exit.i.1812 ], [ %..sroa.029.0.i, %_ZN4llvm11SmallVectorIiLj8EED1Ev.exit.i.1826 ], !dbg !451895 label %30998 label %_ZNK4llvm3EVTeqES0_.exit19.thread.i LLVM ERROR: Broken function found, compilation aborted! I will re-commit this if the bot does not recover. llvm-svn: 250366	2015-10-15 04:58:24 +00:00
Craig Topper	fd2cc7cd8a	Add XSAVE/XSAVEOPT to KNL processor. llvm-svn: 250362	2015-10-15 03:56:54 +00:00
David Majnemer	6e08126f31	[llvm-pdbdump] Provide a mechanism to dump the raw contents of a PDB A PDB can be thought of as a very simple file system. It is occasionally illuminating to see the contents of the underlying files. Differential Revision: http://reviews.llvm.org/D13674 llvm-svn: 250356	2015-10-15 01:27:19 +00:00
Richard Smith	93e9bb0864	Fix -Wmismatched-tags error in modules build by removing unused forward declaration. llvm-svn: 250355	2015-10-15 01:15:26 +00:00
Quentin Colombet	5084e44d71	[ARM] Make sure we do not dereference the end iterator when accessing debug information. Although the problem was always here, it would only be exposed when shrink-wrapping is enabled. rdar://problem/23110493 llvm-svn: 250352	2015-10-15 00:41:26 +00:00
Akira Hatanaka	276332b47f	Revert r250349. Test case coal-sections-powerpc.s is still failing on some buildbots. llvm-svn: 250351	2015-10-15 00:11:03 +00:00
Akira Hatanaka	1cea644114	[MachO] Stop generating coal sections. Recommit r250342: add -arch=ppc32 to the RUN lines of powerpc tests. Some background on why we don't have to use coal sections anymore: Long ago when C++ was new and "weak" had not been standardized, an attempt was made in cctools to support C++ inlines that can be coalesced by putting them into their own section (TEXT/textcoal_nt instead of TEXT/text). The current macho linker supports the weak-def bit on any symbol to allow it to be coalesced, but the compiler still puts weak-def functions/data into alternate section names, which the linker must map back to the base section name. This patch makes changes that are necessary to prevent the compiler from using the "coal" sections and have it use the non-coal sections instead when the target architecture is not powerpc: TEXT/textcoal_nt instead use TEXT/text TEXT/const_coal instead use TEXT/const DATA/datacoal_nt instead use DATA/data If the target is powerpc, we continue to use the coal sections since anyone targeting powerpc is probably using an old linker that doesn't have support for the weak-def bits. Also, have the assembler issue a warning if it encounters a coal section in the assembly file and inform the users to use the non-coal sections instead. rdar://problem/14265330 Differential Revision: http://reviews.llvm.org/D13188 llvm-svn: 250349	2015-10-14 23:48:10 +00:00
Akira Hatanaka	d58d347e42	Revert r250342. Investigate why coal-sections-powerpc.s is failing on some buildbots. llvm-svn: 250346	2015-10-14 23:29:10 +00:00
Cong Hou	b74d3b3b86	Update the branch weight metadata in JumpThreading pass. Currently in JumpThreading pass, the branch weight metadata is not updated after CFG modification. Consider the jump threading on PredBB, BB, and SuccBB. After jump threading, the weight on BB->SuccBB should be adjusted as some of it is contributed by the edge PredBB->BB, which doesn't exist anymore. This patch tries to update the edge weight in metadata on BB->SuccBB by scaling it by 1 - Freq(PredBB->BB) / Freq(BB->SuccBB). This is the third attempt to submit this patch, while the first two led to failures in some FDO tests. After investigation, it is the edge weight normalization that caused those failures. In this patch the edge weight normalization is fixed so that there is no zero weight in the output and the sum of all weights can fit in 32-bit integer. Several unit tests are added. Differential revision: http://reviews.llvm.org/D10979 llvm-svn: 250345	2015-10-14 23:14:17 +00:00
Philip Reames	b42db21de8	[SimplifyCFG] Speculatively flatten CFG based on profiling metadata If we have a series of branches which are all unlikely to fail, we can possibly combine them into a single check on the fastpath combined with a bit of dispatch logic on the slowpath. We don't want to do this unconditionally since it requires speculating instructions past a branch, but if the profiling metadata on the branch indicates profitability, this can reduce the number of checks needed along the fast path. The canonical example this is trying to handle is removing the second bounds check implied by the Java code: a[i] + a[i+1]. Note that it can currently only do so for really simple conditions and the values of a[i] can't be used anywhere except in the addition. (i.e. the load has to have been sunk already and not prevent speculation.) I plan on extending this transform over the next few days to handle alternate sequences. Differential Revision: http://reviews.llvm.org/D13070 llvm-svn: 250343	2015-10-14 22:46:19 +00:00
Akira Hatanaka	c078ae3e4f	[MachO] Stop generating coal sections. Some background on why we don't have to use coal sections anymore: Long ago when C++ was new and "weak" had not been standardized, an attempt was made in cctools to support C++ inlines that can be coalesced by putting them into their own section (TEXT/textcoal_nt instead of TEXT/text). The current macho linker supports the weak-def bit on any symbol to allow it to be coalesced, but the compiler still puts weak-def functions/data into alternate section names, which the linker must map back to the base section name. This patch makes changes that are necessary to prevent the compiler from using the "coal" sections and have it use the non-coal sections instead when the target architecture is not powerpc: TEXT/textcoal_nt instead use TEXT/text TEXT/const_coal instead use TEXT/const DATA/datacoal_nt instead use DATA/data If the target is powerpc, we continue to use the coal sections since anyone targeting powerpc is probably using an old linker that doesn't have support for the weak-def bits. Also, have the assembler issue a warning if it encounters a coal section in the assembly file and inform the users to use the non-coal sections instead. rdar://problem/14265330 Differential Revision: http://reviews.llvm.org/D13188 llvm-svn: 250342	2015-10-14 22:45:36 +00:00
Philip Reames	ddcf6b35a2	Tighten known bits for ctpop based on zero input bits This is a cleaned up patch from the one written by John Regehr based on the findings of the Souper superoptimizer. The basic idea here is that input bits that are known zero reduce the maximum count that the intrinsic could return. We know that the number of bits required to represent a particular count is at most log2(N)+1. Differential Revision: http://reviews.llvm.org/D13253 llvm-svn: 250338	2015-10-14 22:42:12 +00:00
Bill Schmidt	048cc97fb1	[PowerPC] Fix invalid lxvdsx optimization (PR25157) PR25157 identifies a bug where a load plus a vector shuffle is incorrectly converted into an LXVDSX instruction. That optimization is only valid if the load is of a doubleword, and in the noted case, it was not. This corrects that problem. Joint patch with Eric Schweitz, who provided the bugpoint-reduced test case. llvm-svn: 250324	2015-10-14 20:45:00 +00:00
Chen Li	567aa7ab30	[LoopUnswitch] Correct misleading comments. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13738 llvm-svn: 250317	2015-10-14 19:47:43 +00:00
Diego Novillo	bb5605ca3a	Sample profiles - Add documentation for binary profile encoding. NFC. This adds documentation for the binary profile encoding and moves the documentation for the text encoding into the header file SampleProfReader.h. llvm-svn: 250309	2015-10-14 18:36:30 +00:00
Artyom Skrobov	4bca0bb010	A doccomment for CombineTo, and some NFC refactorings Summary: Caching SDLoc(N), instead of recreating it in every single function call, keeps the code denser, and allows to unwrap long lines. Reviewers: sunfish, atrick, sdmitrouk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13726 llvm-svn: 250305	2015-10-14 17:18:35 +00:00
Artyom Skrobov	a5b9ad22b3	Merge DAGCombiner::visitSREM and DAGCombiner::visitUREM (NFC) Summary: The two implementations had more code in common than not. Reviewers: sunfish, MatzeB, sdmitrouk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13724 llvm-svn: 250302	2015-10-14 16:54:14 +00:00
Andrea Di Biagio	c47edbef4c	[x86][FastISel] Teach how to select nontemporal stores. This patch teaches x86 fast-isel how to select nontemporal stores. On x86, we can use MOVNTI for nontemporal stores of doublewords/quadwords. Instructions (V)MOVNTPS/PD/DQ can be used for SSE2/AVX aligned nontemporal vector stores. Before this patch, fast-isel always selected 'movd/movq' instead of 'movnti' for doubleword/quadword nontemporal stores. In the case of nontemporal stores of aligned vectors, fast-isel always selected movaps/movapd/movdqa instead of movntps/movntpd/movntdq. With this patch, if we use SSE2/AVX intrinsics for nontemporal stores we now always get the expected (V)MOVNT instructions. The lack of fast-isel support for nontemporal stores was spotted when analyzing the -O0 codegen for nontemporal stores. Differential Revision: http://reviews.llvm.org/D13698 llvm-svn: 250285	2015-10-14 10:03:13 +00:00
Craig Topper	b84b12699f	[X86] Update CPU detection to only enable XSAVE features if the OS has enabled them and the saving of YMM state. This seems to be consistent with gcc behavior. llvm-svn: 250269	2015-10-14 05:37:42 +00:00
Craig Topper	0ee356951a	[X86] Add XSAVE feature flags to their various processors. llvm-svn: 250268	2015-10-14 05:37:38 +00:00
Craig Topper	1129a00abf	Use range-based for loops. NFC llvm-svn: 250266	2015-10-14 04:36:00 +00:00
Manman Ren	2c8e16d507	Revert r250204 and r250240 due to bot failure. We failed to build PGO-ed clang. llvm-svn: 250264	2015-10-14 03:04:03 +00:00
Evgeniy Stepanov	ebd3f44f93	[msan] Fix crash on multiplication by a non-integer constant. Fixes PR25160. llvm-svn: 250260	2015-10-14 00:21:13 +00:00
Kostya Serebryany	5cb86d5a40	[asan] Disabling speculative loads under asan. Patch by Mike Aizatsky llvm-svn: 250259	2015-10-14 00:21:05 +00:00
Richard Smith	e7dc8bf9c2	Rename one of our two llvm::GCOVOptions classes to llvm::GCOV::Options. We used to get away with this because llvm/Support/GCOV.h was an implementation detail of the llvm-gcov tool, but it's now being used by FDO. llvm-svn: 250258	2015-10-14 00:04:19 +00:00
Sanjoy Das	16e7ff171b	[SCEV] Use `SCEV::isAllOnesValue` directly; NFC. Instead of `dyn_cast`ing to `SCEVConstant` and checking the contained `ConstantInteger`. llvm-svn: 250251	2015-10-13 23:28:31 +00:00
Diego Novillo	43396fa8db	Sample profile reader - remove dead code. NFC. This removes old remnants from the gcov reader. I missed these when I re-wrote it recently. llvm-svn: 250242	2015-10-13 22:48:48 +00:00
Diego Novillo	760c5a8f45	Sample profiles - Add a name table to the binary encoding. Binary encoded profiles used to encode all function names inline at every reference. This is clearly suboptimal in terms of space. This patch fixes this by adding a name table to the header of the file. llvm-svn: 250241	2015-10-13 22:48:46 +00:00
David Majnemer	eba62796cb	[InlineFunction] Correctly inline TerminatePadInst We forgot to append the terminatepad's arguments which resulted in us treating the old terminatepad as an argument to the new terminatepad causing us to crash immediately. Instead, add the old terminatepad's arguments to the new terminatepad. This fixes PR25155. llvm-svn: 250234	2015-10-13 22:08:17 +00:00
Dan Gohman	ac93f649fa	[WebAssembly] Remove a TODO comment which is no longer needed. NFC. llvm-svn: 250233	2015-10-13 22:06:40 +00:00
Chad Rosier	7f08d80595	Typo. llvm-svn: 250224	2015-10-13 20:59:16 +00:00
Kevin Enderby	1c1add44b6	Tweak to r250117 and change to use ErrorOr and drop isSizeValid for ArchiveMemberHeader, suggestion by Rafael Espíndola. Also, the clang-x86-win2008-selfhost bot still does not like the malformed-machos 00000031.a test, so removing it for now. All the other bots are fine with it however. llvm-svn: 250222	2015-10-13 20:48:04 +00:00
Joseph Tremoulet	28c89bbb36	[WinEH] Add CoreCLR EH table emission Summary: Emit the handler and clause locations immediately after the standard xdata. Clauses are emitted in the same order and format used to communicate them to the CLR Execution Engine. Add a lit test to verify correct table generation on a small but interesting example function. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: pgavlin, AndyAyers, llvm-commits Differential Revision: http://reviews.llvm.org/D13451 llvm-svn: 250219	2015-10-13 20:18:27 +00:00
Duncan P. N. Exon Smith	a73371a9b7	AMDGPU: Remove implicit ilist iterator conversions, NFC One of the changes in lib/Target/AMDGPU/AMDGPUMCInstLower.cpp was a new one. Previously, bundle iterators and single-instruction iterators could be compared to each other (comparing on underlying pointers). I changed a comparison from using `MBB->end()` to using `MBB->instr_end()`, since both end iterators should point at the same place anyway. I don't think the implicit conversion between the two iterator types is a good idea since it's fairly easy to accidentally compare to the wrong thing (they aren't always end iterators). Otherwise I would have just added the conversion. Even with that, there should be no functionality change here. llvm-svn: 250218	2015-10-13 20:07:10 +00:00
Duncan P. N. Exon Smith	d3b9df02b3	AArch64: Remove implicit ilist iterator conversions, NFC llvm-svn: 250216	2015-10-13 20:02:15 +00:00
Duncan P. N. Exon Smith	e400a7d412	SelectionDAG: Remove implicit ilist iterator conversions, NFC llvm-svn: 250214	2015-10-13 19:47:46 +00:00
Duncan P. N. Exon Smith	be4d8cba1c	Scalar: Remove remaining ilist iterator implicit conversions Remove remaining `ilist_iterator` implicit conversions from LLVMScalarOpts. This change exposed some scary behaviour in lib/Transforms/Scalar/SCCP.cpp around line 1770. This patch changes a call from `Function::begin()` to `&Function::front()`, since the return was immediately being passed into another function that takes a `Function`. `Function::front()` started to assert, since the function was empty. Note that `Function::end()` does not point at a legal `Function` -- it points at an `ilist_half_node` -- so the other function was getting garbage before. (I added the missing check for `Function::isDeclaration()`.) Otherwise, no functionality change intended. llvm-svn: 250211	2015-10-13 19:26:58 +00:00
Akira Hatanaka	5a4e4f8d8a	[AArch64] Check the size of the vector before accessing its elements. This fixes an assert in AArch64AsmParser::MatchAndEmitInstruction. rdar://problem/23081753 llvm-svn: 250207	2015-10-13 18:55:34 +00:00
Cong Hou	7ab123a5cf	Update the branch weight metadata in JumpThreading pass. Currently in JumpThreading pass, the branch weight metadata is not updated after CFG modification. Consider the jump threading on PredBB, BB, and SuccBB. After jump threading, the weight on BB->SuccBB should be adjusted as some of it is contributed by the edge PredBB->BB, which doesn't exist anymore. This patch tries to update the edge weight in metadata on BB->SuccBB by scaling it by 1 - Freq(PredBB->BB) / Freq(BB->SuccBB). Differential revision: http://reviews.llvm.org/D10979 llvm-svn: 250204	2015-10-13 18:43:10 +00:00
Xinliang David Li	3dd8817d84	[PGO]: Eliminate calls to __llvm_profile_register_function for Linux. On Linux, the profile runtime can use __start_SECTNAME and __stop_SECTNAME symbols defined by the linker to locate the start and end location of a named section (with C name). This eliminates the need for instrumented binary to call __llvm_profile_register_function during start-up time. llvm-svn: 250199	2015-10-13 18:39:48 +00:00
Duncan P. N. Exon Smith	3a9c9e3dcd	Scalar: Remove some implicit ilist iterator conversions, NFC Remove some of the implicit ilist iterator conversions in LLVMScalarOpts. More to go. llvm-svn: 250197	2015-10-13 18:26:00 +00:00
Duncan P. N. Exon Smith	4ead920ce5	ExecutionEngine: Remove implicit ilist iterator conversions, NFC llvm-svn: 250193	2015-10-13 18:11:02 +00:00
Duncan P. N. Exon Smith	1275bffa96	OrcJIT: Remove implicit ilist iterator conversions, NFC llvm-svn: 250192	2015-10-13 18:10:59 +00:00
Duncan P. N. Exon Smith	1732340bfa	IPO: Remove implicit ilist iterator conversions, NFC llvm-svn: 250187	2015-10-13 17:51:03 +00:00
Duncan P. N. Exon Smith	e82c286fba	Instrumentation: Remove ilist iterator implicit conversions, NFC llvm-svn: 250186	2015-10-13 17:39:10 +00:00
Duncan P. N. Exon Smith	c8f02e7540	Interpreter: Remove implicit ilist iterator conversions, NFC llvm-svn: 250185	2015-10-13 17:33:41 +00:00
Duncan P. N. Exon Smith	9f8aaf21ba	InstCombine: Remove ilist iterator implicit conversions, NFC Stop relying on implicit conversions of ilist iterators in LLVMInstCombine. No functionality change intended. llvm-svn: 250183	2015-10-13 16:59:33 +00:00
Duncan P. N. Exon Smith	fb1743a32e	BitcodeReader: Remove ilist iterator implicit conversions, NFC Get LLVMBitReader building without relying on `ilist_iterator` implicit conversions. llvm-svn: 250181	2015-10-13 16:48:55 +00:00
Joseph Tremoulet	1e2f062ec5	[WinEH] Iterate state changes instead of invokes Summary: Add an iterator that can walk across blocks and which visits the state transitions rather than state ranges, with explicit transitions to -1 indicating the presence of top-level calls that may throw and cause the current function to unwind to caller. This will simplify code that needs to identify nested try regions. Refactor SEH and C++EH table generation to use the new InvokeStateChangeIterator, and remove the InvokeLabelIterator they were using. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13623 llvm-svn: 250179	2015-10-13 16:44:30 +00:00
Xinliang David Li	c758387e9b	Fix a couple of comments; NFC llvm-svn: 250177	2015-10-13 16:35:59 +00:00
Sanjay Patel	85030aa1bd	function names should start with a lower case letter; NFC llvm-svn: 250174	2015-10-13 16:23:00 +00:00
Sanjay Patel	b5723d0dbd	don't repeat function/class/variable names in comments; NFC llvm-svn: 250162	2015-10-13 15:12:27 +00:00
Simon Pilgrim	3c2b30f8ba	[InstCombine][SSE4A] Remove broken INSERTQI range combining optimization As discussed in D13348 - the INSERTQI range combining code is wrong in that it confuses the insertion bit index with an extraction bit index. The remaining legal combines are very unlikely (especially once we've converted to shuffles in D13348) so I'm removing the optimization. llvm-svn: 250160	2015-10-13 14:48:54 +00:00
James Molloy	4015925606	[GlobalsAA] Turn GlobalsAA on again by default Now that all the known faults with GlobalsAA have been fixed, flip the big switch on -enable-non-lto-gmr again. Feel free to pester me with any more bugs found, and don't hesitate to flip the switch back off. llvm-svn: 250157	2015-10-13 10:43:57 +00:00
James Molloy	860507f838	[GlobalsAA] Don't assume anything about functions that may be overridden Weak linkage and friends allow a symbol to be overridden outside the code generator's model, so GlobalsAA shouldn't assume that anything it can compute about such a symbol is valid. llvm-svn: 250156	2015-10-13 10:43:33 +00:00
Christof Douma	f0765c4f5b	Test commit llvm-svn: 250154	2015-10-13 09:38:21 +00:00
Sanjoy Das	b873cbe5c9	[IndVars] NFC Cleanup. - Rename methods according to the LLVM Coding Style - Merge adjacent anonymous namespace block - Use `auto` in two places llvm-svn: 250152	2015-10-13 07:17:38 +00:00
Michael Kuperstein	af22dafc8b	Fix line-ending issue. NFC. llvm-svn: 250151	2015-10-13 06:22:30 +00:00
Craig Topper	24b56a62bb	[X86] Mark the AAD and AAM aliases as not valid in 64-bit mode. llvm-svn: 250148	2015-10-13 05:12:07 +00:00
Craig Topper	4f76372afc	[X86] Change all the i8imm operands in XOP instructions to u8imm so the parser will check the size. llvm-svn: 250147	2015-10-13 05:06:25 +00:00
Manman Ren	9f824dab1d	Revert 250089 due to bot failure. It failed when building clang itself with PGO. llvm-svn: 250145	2015-10-13 03:38:02 +00:00
Duncan P. N. Exon Smith	584af871cc	BitcodeWriter: Stop using implicit ilist iterator conversion, NFC Now LLVMBitWriter compiles without implicit ilist iterator conversions. In these cases, the cleanest thing was to switch to range-based for loops. Since there wasn't much noise I converted sub-loops and parent loops as a drive-by. llvm-svn: 250144	2015-10-13 03:26:19 +00:00
Sanjoy Das	1ed6910338	[SCEV] Put some utilities in the ScalarEvolution class In a later commit, `SplitBinaryAdd` will be used outside `IsConstDiff`, so lift that out. And lift out `IsConstDiff` as `computeConstantDifference` to keep things clean and to avoid playing C++ access specifier games. NFC. llvm-svn: 250143	2015-10-13 02:53:27 +00:00
Duncan P. N. Exon Smith	5b4c837c58	TransformUtils: Remove implicit ilist iterator conversions, NFC Continuing the work from last week to remove implicit ilist iterator conversions. First related commit was probably r249767, with some more motivation in r249925. This edition gets LLVMTransformUtils compiling without the implicit conversions. No functional change intended. llvm-svn: 250142	2015-10-13 02:39:05 +00:00
Matt Arsenault	e5d9515fb7	DAGCombiner: Don't stop finding better chain on 2 aliases The comment says this was stopped because it was unlikely to be profitable. This is not true if you want to combine vector loads with multiple components. For a simple case that looks like t0 = load t0 ... t1 = load t0 ... t2 = load t0 ... t3 = load t0 ... t4 = store t0:1, t0:1 t5 = store t4, t1:0 t6 = store t5, t2:0 t7 = store t6, t3:0 We want to get all of these stores onto a chain that is a TokenFactor of these N loads. This mostly solves the AMDGPU merge-stores.ll regressions with -combiner-alias-analysis for merging vector stores of vector loads. llvm-svn: 250138	2015-10-13 00:49:00 +00:00
JF Bastien	986ed68eed	x86: preserve flags when folding atomic operations Summary: D4796 taught LLVM to fold some atomic integer operations into a single instruction. The pattern was unaware that the instructions clobbered flags. This patch adds the missing EFLAGS definition. Floating point operations don't set flags, the subsequent fadd optimization is therefore correct. The same applies for surrounding load/store optimizations. Reviewers: rsmith, rtrieu Subscribers: llvm-commits, reames, morisset Differential Revision: http://reviews.llvm.org/D13680 llvm-svn: 250135	2015-10-13 00:28:47 +00:00
Matt Arsenault	f0d9e47da2	AMDGPU: Refactor isVGPRToSGPRCopy It should now correctly handle physical registers and make it easier to identify the other direction. llvm-svn: 250132	2015-10-13 00:07:54 +00:00
Matt Arsenault	61dc235f20	DAGCombiner: Combine extract_vector_elt from build_vector This basic combine was surprisingly missing. AMDGPU legalizes many operations in terms of 32-bit vector components, so not doing this results in many extra copies and subregister extracts that need to be cleaned up later. InstCombine already does this for the hasOneUse case. The target hook is to fix a handful of tests which break (e.g. ARM/vmov.ll) which turn from a vector materialize repeated immediate instruction to a constant vector load with more scalar copies from it. llvm-svn: 250129	2015-10-12 23:59:50 +00:00
Cong Hou	bf22f5063a	Assign correct edge weights to unwind destinations when lowering invoke statement. When lowering an invoke statement, all unwind destinations are directly added as successors of the call site block, and the weights of those new edges are not assigned properly. Instead, the default weight of 16 is used for those edges. This patch calculates the proper edge weights for those edges when collecting all unwind destinations. Differential revision: http://reviews.llvm.org/D13354 llvm-svn: 250119	2015-10-12 23:02:58 +00:00
Simon Pilgrim	c8832fc233	[SelectionDAG] Add common vector constant folding helper function We have a number of functions that implement constant folding of vectors (unary and binary ops) in near identical manners (and the differences don't appear to be critical). This patch introduces a common implementation (SelectionDAG::FoldConstantVectorArithmetic) and calls this in both the unary and binary op cases. After this initial patch I intend to begin enabling vector constant folding for a wider number of opcodes in SelectionDAG::getNode(). Differential Revision: http://reviews.llvm.org/D13665 llvm-svn: 250118	2015-10-12 23:00:11 +00:00
Kevin Enderby	903955451e	Fixed bugs in llvm-objdump while parsing Mach-O files from malformed archives that caused aborts. This was because the characters of the ‘Size’ field in the archive header did not contain decimal characters. rdar://22983603 llvm-svn: 250117	2015-10-12 22:04:54 +00:00
Cong Hou	3320bcd815	Update the branch weight metadata in JumpThreading pass. In JumpThreading pass, the branch weight metadata is not updated after CFG modification. Consider the jump threading on PredBB, BB, and SuccBB. After jump threading, the weight on BB->SuccBB should be adjusted as some of it is contributed by the edge PredBB->BB, which doesn't exist anymore. This patch tries to update the edge weight in metadata on BB->SuccBB by scaling it by 1 - Freq(PredBB->BB) / Freq(BB->SuccBB). Differential revision: http://reviews.llvm.org/D10979 llvm-svn: 250089	2015-10-12 19:44:08 +00:00
Reid Kleckner	4a5f35c0ae	Make Win64 localescape offsets FP relative instead of SP relative We made them SP relative back in March (r233137) because that's the value the runtime passes to EH functions. With the new cleanuppad IR, funclets adjust their frame argument from SP to FP, so our offsets should now be FP-relative. llvm-svn: 250088	2015-10-12 19:43:34 +00:00
Andrea Di Biagio	b0fe4eb199	[x86] Fix wrong lowering of vsetcc nodes (PR25080). Function LowerVSETCC (in X86ISelLowering.cpp) worked under the wrong assumption that for non-AVX512 targets, the source type and destination type of a type-legalized setcc node were always the same type. This assumption was unfortunately incorrect; the type legalizer is not always able to promote the return type of a setcc to the same type as the first operand of a setcc. In the case of a vsetcc node, the legalizer firstly checks if the first input operand has a legal type. If so, then it promotes the return type of the vsetcc to that same type. Otherwise, the return type is promoted to the 'next legal type', which, for vectors of MVT::i1 is always a 128-bit integer vector type. Example (-mattr=+avx): %0 = trunc <8 x i32> %a to <8 x i23> %1 = icmp eq <8 x i23> %0, zeroinitializer The initial selection dag for the code above is: v8i1 = setcc t5, t7, seteq:ch t5: v8i23 = truncate t2 t2: v8i32,ch = CopyFromReg t0, Register:v8i32 %vreg1 t7: v8i32 = build_vector of all zeroes. The type legalizer would firstly check if 't5' has a legal type. If so, then it would reuse that same type to promote the return type of the setcc node. Unfortunately 't5' is of illegal type v8i23, and therefore it cannot be used to promote the return type of the setcc node. Consequently, the setcc return type is promoted to v8i16. Later on, 't5' is promoted to v8i32 thus leading to the following dag node: v8i16 = setcc t32, t25, seteq:ch where t32 and t25 are now values of type v8i32. Before this patch, function LowerVSETCC would have wrongly expanded the setcc to a single X86ISD::PCMPEQ. Surprisingly, ISel was still able to match an instruction. In our case, ISel would have matched a VPCMPEQWrr: t37: v8i16 = X86ISD::VPCMPEQWrr t36, t25 However, t36 and t25 are both VR256, while the result type is instead of class VR128. 
This inconsistency ended up causing the insertion of COPY instructions like this: %vreg7<def> = COPY %vreg3; VR128:%vreg7 VR256:%vreg3 Which is an invalid full copy (not a sub register copy). Eventually, the backend would have hit an UNREACHABLE "Cannot emit physreg copy instruction" in the attempt to expand the malformed pseudo COPY instructions. This patch fixes the problem adding the missing logic in LowerVSETCC to handle the corner case of a setcc with 128-bit return type and 256-bit operand type. This problem was originally reported by Dimitry as PR25080. It has been latent for a very long time. I have added the minimal reproducible from that bugzilla as test setcc-lowering.ll. Differential Revision: http://reviews.llvm.org/D13660 llvm-svn: 250085	2015-10-12 19:22:30 +00:00
Cong Hou	61e13de408	Add - and -= operators to BlockFrequency using saturating arithmetic. llvm-svn: 250077	2015-10-12 18:34:00 +00:00
Sanjay Patel	0dc91b3143	combine predicates; NFCI llvm-svn: 250075	2015-10-12 18:15:08 +00:00
Cong Hou	90c6cf8e7d	Turn const/const& into value type for BlockFrequency in functions of this class. Also fix a naming issue. NFC. llvm-svn: 250074	2015-10-12 18:14:15 +00:00
Matt Arsenault	8c0ef8b36d	AMDGPU: Register some more passes so -print-before works llvm-svn: 250071	2015-10-12 17:43:59 +00:00
Matt Arsenault	07a72bad0b	Enable verifier after PeepholeOptimizer No tests fail with this enabled so I assume it was an accident that it isn't enabled now. llvm-svn: 250070	2015-10-12 17:43:56 +00:00
Reid Kleckner	9abb3c06a6	Don't call PrepareEHLandingPad on non EH pads This was a minor bug in r249492. Calling PrepareEHLandingPad on a non-landingpad was a no-op, but it attempted to get the generic pointer register class, which apparently doesn't exist for some targets. llvm-svn: 250068	2015-10-12 17:42:32 +00:00
David Majnemer	99c1d13e52	[WinEH] Remove CatchObjRecoverIdx CatchObjRecoverIdx was used for the old scheme, it is no longer relevant. llvm-svn: 250065	2015-10-12 16:44:22 +00:00
Sanjay Patel	b814ef1ad6	fix typos; NFC llvm-svn: 250059	2015-10-12 16:09:59 +00:00
Zoran Jovanovic	2e386d3d07	[mips][micromips] Initial support for micromips DSP instructions and addu.qb implementation Differential Revision: http://reviews.llvm.org/D12798 llvm-svn: 250058	2015-10-12 16:07:25 +00:00
Oliver Stannard	cca893ffac	[Debug] Look through bitcasts to find argument registers On targets where f32 is not legal, we have to look through a BITCAST SDNode to find the register that an argument is stored in when emitting debug info, or we will not be able to emit a DW_AT_location for it. Differential Revision: http://reviews.llvm.org/D13005 llvm-svn: 250056	2015-10-12 15:52:36 +00:00
Vasileios Kalintiris	2a95f82859	[mips][FastISel] Clang-format switch statement. NFC. llvm-svn: 250053	2015-10-12 15:39:41 +00:00
Sanjay Patel	53d1d8b731	fix capitalization; NFC llvm-svn: 250049	2015-10-12 15:24:01 +00:00
Greg Bedwell	7f68a71669	Fix rename() sometimes failing if another process uses openFileForRead() On Windows, fs::rename() could fail if another process was reading the file at the same time using fs::openFileForRead(). In most cases the user wouldn't notice as fs::rename() will continue to retry for 2000ms. Typically this is enough for the read to complete and a retry to succeed, but if the disk is being hit too hard then the response time might be longer than the retry time and the rename would fail with a permission error. Add FILE_SHARE_DELETE to the sharing flags for CreateFileW() in fs::openFileForRead() and try ReplaceFileW() prior to MoveFileExW() in fs::rename(). Based on an initial patch by Edd Dawson! Differential Revision: http://reviews.llvm.org/D13647 llvm-svn: 250046	2015-10-12 15:11:47 +00:00
Daniel Sanders	b1ef88c172	[mips][ias] Implement macro expansion when bcc has an immediate where a register belongs. Summary: Fixes PR24915. Reviewers: vkalintiris Subscribers: emaste, seanbruno, llvm-commits Differential Revision: http://reviews.llvm.org/D13533 llvm-svn: 250042	2015-10-12 14:24:05 +00:00
Daniel Sanders	2a5ce1ace0	[mips] Clean up most macro expansions to use the emit*() functions. Reviewers: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13591 llvm-svn: 250040	2015-10-12 14:09:12 +00:00
Daniel Sanders	2fb8564d99	[mips] Handle undef when extracting subregs from FP64 registers. Summary: This removes unnecessary instructions when extracting from an undefined register and also fixes a crash for O32 when passing undef to a double argument held in integer registers. Reviewers: vkalintiris Subscribers: llvm-commits, zoran.jovanovic, petarj Differential Revision: http://reviews.llvm.org/D13467 llvm-svn: 250039	2015-10-12 13:55:44 +00:00
Oliver Stannard	939724cd02	GlobalOpt does not treat externally_initialized globals correctly GlobalOpt currently merges stores into the initialisers of internal, externally_initialized globals, but should not do so as the value of the global may change between the initialiser and any code in the module being run. llvm-svn: 250035	2015-10-12 13:20:52 +00:00
James Molloy	fa4e994a7a	[ARM] Mark Swift MISched model as incomplete The Swift Machine Scheduler Model is incomplete. There are instructions missing which can trigger the "incomplete machine model" abort. This was observed when a downstream SchedMachineModel was added to the ARM target. Patch by Christof Douma! llvm-svn: 250033	2015-10-12 12:49:59 +00:00
James Molloy	55d633bd60	[LoopVectorize] Shrink integer operations into the smallest type possible C semantics force sub-int-sized values (e.g. i8, i16) to be promoted to int type (e.g. i32) whenever arithmetic is performed on them. For targets with native i8 or i16 operations, usually InstCombine can shrink the arithmetic type down again. However InstCombine refuses to create illegal types, so for targets without i8 or i16 registers, the lengthening and shrinking remains. Most SIMD ISAs (e.g. NEON) however support vectors of i8 or i16 even when their scalar equivalents do not, so during vectorization it is important to remove these lengthens and truncates when deciding the profitability of vectorization. The algorithm this uses starts at truncs and icmps, trawling their use-def chains until they terminate or instructions outside the loop are found (or unsafe instructions like inttoptr casts are found). If the use-def chains starting from different root instructions (truncs/icmps) meet, they are unioned. The demanded bits of each node in the graph are ORed together to form an overall mask of the demanded bits in the entire graph. The minimum bitwidth that graph can be truncated to is the bitwidth minus the number of leading zeroes in the overall mask. The intention is that this algorithm should "first do no harm", so it will never insert extra cast instructions. This is why the use-def graphs are unioned, so that subgraphs with different minimum bitwidths do not need casts inserted between them. This algorithm works hard to reduce compile time impact. DemandedBits are only queried if there are extends of illegal types and if a truncate to an illegal type is seen. In the general case, this results in a simple linear scan of the instructions in the loop. No non-noise compile time impact was seen on a clang bootstrap build. llvm-svn: 250032	2015-10-12 12:34:45 +00:00
Amjad Aboud	1db6d7af46	[X86] Add XSAVE intrinsic family Add intrinsics for the XSAVE instructions (XSAVE/XSAVE64/XRSTOR/XRSTOR64) XSAVEOPT instructions (XSAVEOPT/XSAVEOPT64) XSAVEC instructions (XSAVEC/XSAVEC64) XSAVES instructions (XSAVES/XSAVES64/XRSTORS/XRSTORS64) Differential Revision: http://reviews.llvm.org/D13012 llvm-svn: 250029	2015-10-12 11:47:46 +00:00
Andrea Di Biagio	a0922ed8fe	[x86] PR24562: fix incorrect folding of PSHUFB nodes with a mask where all indices have the most significant bit set. This patch fixes a problem in function 'combineX86ShuffleChain' that causes a chain of shuffles to be wrongly folded away when the combined shuffle mask has only one element. We may end up with a combined shuffle mask of one element as a result of multiple calls to function 'canWidenShuffleElements()'. Function canWidenShuffleElements attempts to simplify a shuffle mask by widening the size of the elements being shuffled. For every pair of shuffle indices, function canWidenShuffleElements checks if indices refer to adjacent elements. If all pairs refer to "adjacent" elements then the shuffle mask is safely widened. As a consequence of widening, we end up with a new shuffle mask which is half the size of the original shuffle mask. The byte shuffle (pshufb) from test pr24562.ll has a mask of all SM_SentinelZero indices. Function canWidenShuffleElements would combine each pair of SM_SentinelZero indices into a single SM_SentinelZero index. So, in a logarithmic number of steps (4 in this case), the pshufb mask is simplified to a mask with only one index which is equal to SM_SentinelZero. Before this patch, function combineX86ShuffleChain wrongly assumed that a mask of size one is always equivalent to an identity mask. So, the entire shuffle chain was just folded away as the combined shuffle mask was treated as a no-op mask. With this patch we now check if the only element of a combined shuffle mask is SM_SentinelZero. In that case, we propagate a zero vector. Differential Revision: http://reviews.llvm.org/D13364 llvm-svn: 250027	2015-10-12 11:25:41 +00:00
Zlatko Buljan	d76b666a06	Test commit llvm-svn: 250026	2015-10-12 11:19:40 +00:00
Tobias Grosser	374bce0c22	SCEV: Allow simple AddRec * Parameter products in delinearization This patch also allows the -delinearize pass to delinearize expressions that do not have an outermost SCEVAddRec expression. The SCEV::delinearize infrastructure allowed this since r240952, but the -delinearize pass was not updated yet. llvm-svn: 250018	2015-10-12 08:02:00 +00:00
Craig Topper	8d2e6bc25b	[X86] Use u8imm for the immediate type for all shift and rotate instructions. This way the assembler will perform range checking. Believe this matches gas behavior. llvm-svn: 250016	2015-10-12 06:23:10 +00:00
Craig Topper	d6b661dbf0	[X86] Add support to assembler and MCInst lowering to use the other vmovq %xmmX, %xmmX encoding if it would be a shorter VEX encoding. llvm-svn: 250014	2015-10-12 04:57:59 +00:00
Craig Topper	635e05df0a	[X86] Cleanup formatting a bit. NFC llvm-svn: 250013	2015-10-12 04:27:17 +00:00
Craig Topper	5be914eda1	[X86] Change the immediate for IN/OUT instructions to u8imm so the assembly parser will check the size. llvm-svn: 250012	2015-10-12 04:17:55 +00:00
Craig Topper	95fffba227	[X86] Add some instruction aliases to get the assembly parser table to favor arithmetic instructions with 8-bit immediates over the forms that implicitly use the ax/eax/rax. This allows us to remove the explicit code for working around the existing priority llvm-svn: 250011	2015-10-12 03:39:57 +00:00
Craig Topper	fcc34bdee0	[X86] Fix CMP and TEST with al/ax/eax/rax to not mark EFLAGS as a use or al/ax/eax/rax as a def. Probably doesn't have a functional effect since these aren't used in isel. llvm-svn: 249994	2015-10-11 19:54:02 +00:00
Simon Pilgrim	d45c88bbb5	[DAGCombiner] Improved FMA combine support for vectors Enabled constant canonicalization for all constants. Improved combining of constant vectors. llvm-svn: 249993	2015-10-11 19:48:12 +00:00
Craig Topper	87990ee4ec	[X86] Remove special validation for INT immediate operand from AsmParser. Instead mark its operand type as u8imm which will cause it to fail to match. This is more consistent with other instruction behavior. This also fixes a bug where negative immediates below -128 were not being reported as errors. llvm-svn: 249989	2015-10-11 18:27:24 +00:00
Craig Topper	a71630729d	[X86] Simplify immediate range checking code. llvm-svn: 249979	2015-10-11 16:38:14 +00:00
Simon Pilgrim	5eac2607b9	[DAGCombiner] Tidyup FMINNUM/FMAXNUM constant folding Enable constant folding for vector splats as well as scalars. Enable constant canonicalization for all scalar and vector constants. llvm-svn: 249978	2015-10-11 16:02:28 +00:00
Simon Pilgrim	1d1c56e2df	[InstCombine][X86][XOP] Combine XOP integer vector comparisons to native IR We now have lowering support for XOP PCOM/PCOMU instructions. llvm-svn: 249977	2015-10-11 14:38:34 +00:00
Simon Pilgrim	52d47e5704	[X86][XOP] Added support for the lowering of 128-bit vector integer comparisons to XOP PCOM/PCOMU instructions. The XOP vector integer comparisons can deal with all signed/unsigned comparison cases directly and can be easily commuted as well (D7646). llvm-svn: 249976	2015-10-11 14:15:17 +00:00
Nathan Slingerland	5e896ce2d1	[ProfileData] Test commit for slingn This is a test of the LLVM commit system. In the event of a real commit there would be some useful code changes. llvm-svn: 249972	2015-10-11 13:30:56 +00:00
Craig Topper	55b1f29203	Change isUIntN/isIntN calls with constant N to use the template version. NFC llvm-svn: 249952	2015-10-10 20:17:07 +00:00
Teresa Johnson	1493ad9c24	Fix PR25101 - Handle anonymous functions without VST entries Summary: The change to use the VST function entries for lazy deserialization did not handle the case of anonymous functions without aliases. In that case we must fall back to scanning the function blocks as there is no VST entry. Reviewers: dexonsmith, joker.eph, davidxl Subscribers: tstellarAMD, llvm-commits Differential Revision: http://reviews.llvm.org/D13596 llvm-svn: 249947	2015-10-10 14:18:36 +00:00
Jonas Paulsson	63a2b6862e	[SystemZ] Fixes in the backend I/R. expandPostRAPseudo(): STX -> 2 * STD: The first STD should not have the kill flag set for the address. SystemZElimCompare: BRC -> BRCT conversion: Don't forget to remove the CC<use,kill> operand. Needed to make SystemZ/asm-17.ll pass with -verify-machineinstrs, which now runs with this flag. Reviewed by Ulrich Weigand. llvm-svn: 249945	2015-10-10 07:14:24 +00:00
Sanjoy Das	cc16ccc1ab	[IndVars] Use `auto`; NFC llvm-svn: 249944	2015-10-10 06:33:33 +00:00
Craig Topper	84008481e4	Use range-based for loops. NFC llvm-svn: 249943	2015-10-10 05:38:14 +00:00
Keno Fischer	2cd66e9270	[RuntimeDyld] Fix performance problem in resolveRelocations with many sections Summary: Rather than just iterating over all sections and checking whether we have relocations for them, iterate over the relocation map instead. This showed up heavily in an artificial julia benchmark that does lots of compilation. On that particular benchmark, this patch gives ~15% performance improvements. As far as I can tell the primary reason why the original loop was so expensive is that Relocations[i] actually constructs a relocationList (allocating memory & doing lots of other unnecessary computing) if none is found. Reviewers: lhames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13545 llvm-svn: 249942	2015-10-10 05:37:02 +00:00
Craig Topper	7143d8001a	Use range-based for loops. NFC. llvm-svn: 249941	2015-10-10 05:25:06 +00:00
Craig Topper	7d5b23101c	Use emplace_back instead of a constructor call and push_back. NFC llvm-svn: 249940	2015-10-10 05:25:02 +00:00
Duncan P. N. Exon Smith	5a82c916b0	Analysis: Remove implicit ilist iterator conversions Remove implicit ilist iterator conversions from LLVMAnalysis. I came across something really scary in `llvm::isKnownNotFullPoison()` which relied on `Instruction::getNextNode()` being completely broken (not surprising, but scary nevertheless). This function is documented (and coded to) return `nullptr` when it gets to the sentinel, but with an `ilist_half_node` as a sentinel, the sentinel check looks into some other memory and we don't recognize we've hit the end. Rooting out these scary cases is the reason I'm removing the implicit conversions before doing anything else with `ilist`; I'm not at all surprised that clients rely on badness. I found another scary case -- this time, not relying on badness, just bad (but I guess getting lucky so far) -- in `ObjectSizeOffsetEvaluator::compute_()`. Here, we save out the insertion point, do some things, and then restore it. Previously, we let the iterator auto-convert to `Instruction*`, and then set it back using the `Instruction*` version: Instruction *PrevInsertPoint = Builder.GetInsertPoint(); /* Logic that may change insert point */ if (PrevInsertPoint) Builder.SetInsertPoint(PrevInsertPoint); The check for `PrevInsertPoint` doesn't protect correctly against bad accesses. If the insertion point has been set to the end of a basic block (i.e., `SetInsertPoint(SomeBB)`), then `GetInsertPoint()` returns an iterator pointing at the list sentinel. The version of `SetInsertPoint()` that's getting called will then call `PrevInsertPoint->getParent()`, which explodes horribly. The only reason this hasn't blown up is that it's fairly unlikely the builder is adding to the end of the block; usually, we're adding instructions somewhere before the terminator. llvm-svn: 249925	2015-10-10 00:53:03 +00:00
Duncan P. N. Exon Smith	a5f45da27e	MC: Remove implicit ilist iterator conversions, NFC llvm-svn: 249922	2015-10-10 00:13:11 +00:00
David Majnemer	bfa5b98201	[WinEH] Remove more dead code wineh-parent is dead, so is ValueOrMBB. llvm-svn: 249920	2015-10-10 00:04:29 +00:00
Reid Kleckner	14e773500e	[WinEH] Delete the old landingpad implementation of Windows EH The new implementation works at least as well as the old implementation did. Also delete the associated preparation tests. They don't exercise interesting corner cases of the new implementation. All the codegen tests of the EH tables have already been ported. llvm-svn: 249918	2015-10-09 23:34:53 +00:00
Reid Kleckner	eb7cd6c889	[SEH] Update SEH codegen tests to use the new IR Also Fix a buglet where SEH tables had ranges that spanned funclets. The remaining tests using the old landingpad IR are preparation tests, and will be deleted along with the old preparation. llvm-svn: 249917	2015-10-09 23:05:54 +00:00
Duncan P. N. Exon Smith	f1ff53ecc2	CodeGen: Remove implicit ilist iterator conversions, NFC Finish removing implicit ilist iterator conversions from LLVMCodeGen. I'm sure there are lots more of these in lib/CodeGen/*/. llvm-svn: 249915	2015-10-09 22:56:24 +00:00
David Majnemer	35d27b21a1	[WinEH] Insert the catchpad return before CSR restoration x64 catchpads use rax to inform the unwinder where control should go next. However, we must initialize rax before the epilogue sequence so as to not perturb the unwinder. llvm-svn: 249910	2015-10-09 22:18:45 +00:00
James Y Knight	692e037499	Fix assert when emitting llvm.pow.f86. This occurred due to introducing the invalid i64 type after type legalization had already finished, in an attempt to work around bitcast f64 -> v2i32 not doing constant folding. The right thing is to actually fix bitcast, but that has other complications. So, for now, just get rid of the broken workaround, and check in a test-case showing that it doesn't crash, with TODOs for emitting proper code. llvm-svn: 249908	2015-10-09 21:36:19 +00:00
Reid Kleckner	e1c8a7f9c7	[SEH] Fix _except_handler4 table base states We got them right for the old IR, but not with funclets. Port the old test to the new IR and fix the code. llvm-svn: 249906	2015-10-09 21:27:28 +00:00
Duncan P. N. Exon Smith	6e98cd32dc	CodeGen: Avoid more ilist iterator implicit conversions, NFC llvm-svn: 249903	2015-10-09 21:08:19 +00:00
Duncan P. N. Exon Smith	1ff409802d	CodeGen: Use range-based for in PostRAScheduler, NFC llvm-svn: 249901	2015-10-09 21:05:00 +00:00
Reid Kleckner	d880dc7509	[SEH] Remember to emit the last invoke range for SEH This wasn't very observable in execution tests, because usually there is an invoke in the catchpad that unwinds to the catchendpad but never actually throws. llvm-svn: 249898	2015-10-09 20:39:39 +00:00
Owen Anderson	97ca0f3f2c	Generalize convergent check to handle invokes as well as calls. llvm-svn: 249892	2015-10-09 20:17:46 +00:00
James Y Knight	5b8217bc05	Fix assert in X86 backend. When running combine on an extract_vector_elt, it wants to look through a bitcast to check if the argument to the bitcast was itself an extract_vector_elt with particular operands. However, it called getOperand() on the argument to the bitcast before checking that the opcode was EXTRACT_VECTOR_ELT, assert-failing if there were zero operands for the actual opcode. Fix, and add trivial test. llvm-svn: 249891	2015-10-09 20:10:14 +00:00
Chad Rosier	47eba05b47	Revert "Simplify code. NFC." This reverts commit r248610. llvm-svn: 249887	2015-10-09 19:48:48 +00:00
Duncan P. N. Exon Smith	5ec1568c9c	CodeGen: Continue removing ilist iterator implicit conversions llvm-svn: 249884	2015-10-09 19:40:45 +00:00
Duncan P. N. Exon Smith	6ac07fd228	CodeGen: Remove implicit iterator conversions from MBB.cpp Remove implicit ilist iterator conversions from MachineBasicBlock.cpp. I've also added an overload of `splice()` that takes a pointer, since it's a natural API. This is similar to the overloads I added for `remove()` and `erase()` in r249867. llvm-svn: 249883	2015-10-09 19:36:12 +00:00
Duncan P. N. Exon Smith	0ac8eb9171	CodeGen: Avoid ilist iterator implicit conversions in a few more places, NFC llvm-svn: 249880	2015-10-09 19:23:20 +00:00
Duncan P. N. Exon Smith	5ae5939fa1	CodeGen: Remove more ilist iterator implicit conversions, NFC llvm-svn: 249879	2015-10-09 19:13:58 +00:00
Duncan P. N. Exon Smith	6c64aeb065	CodeGen: Use range-based for in IntrinsicLowering::AddPrototypes, NFC This happens to avoid a host of implicit ilist iterator conversions. llvm-svn: 249877	2015-10-09 19:07:41 +00:00
Duncan P. N. Exon Smith	530d040bd9	CodeGen: Use range-based for in GlobalMerge, NFC llvm-svn: 249876	2015-10-09 18:57:47 +00:00
Duncan P. N. Exon Smith	d83547a16e	CodeGen: Remove a few more ilist iterator implicit conversions, NFC llvm-svn: 249875	2015-10-09 18:44:40 +00:00
Owen Anderson	2c9978b12b	Teach LoopUnswitch not to perform non-trivial unswitching on loops containing convergent operations. Doing so could cause the post-unswitching convergent ops to be control-dependent on the unswitch condition where they were not before. This check could be refined to allow unswitching where the convergent operation was already control-dependent on the unswitch condition. llvm-svn: 249874	2015-10-09 18:40:20 +00:00
Duncan P. N. Exon Smith	980f8f2639	CodeGen: Remove implicit conversions from Analysis and BranchFolding Remove a few more implicit ilist iterator conversions, this time from Analysis.cpp and BranchFolding.cpp. I added a few overloads for `remove()` and `erase()`, which quite naturally take pointers as well as iterators as parameters. This will reduce the churn at least in the short term, but I don't really have a problem with these existing for longer. llvm-svn: 249867	2015-10-09 18:23:49 +00:00
Owen Anderson	d95b08a0a7	Refine the definition of convergent to only disallow the addition of new control dependencies. This covers the common case of operations that cannot be sunk. Operations that cannot be hoisted should already be handled properly via the safe-to-speculate rules and mechanisms. llvm-svn: 249865	2015-10-09 18:06:13 +00:00
Sanjay Patel	9fbe22bac6	fix typos; NFC llvm-svn: 249863	2015-10-09 18:01:03 +00:00
Diego Novillo	a7f1e8ef83	Add inline stack streaming to binary sample profiles. With this patch we can now read and write inline stacks in sample profiles to the binary encoded profiles. In a subsequent patch, I will add a string table to the binary encoding. Right now function names are emitted as strings every time we find them. This is too bloated and will produce large files in applications with lots of inlining. llvm-svn: 249861	2015-10-09 17:54:24 +00:00
Dan Gohman	ee1588ce96	[WebAssembly] Rename floating-point operators to match their spec names. llvm-svn: 249859	2015-10-09 17:50:00 +00:00
Artur Pilipenko	cca800207a	Add verification for align, dereferenceable, dereferenceable_or_null load metadata Reviewed By: reames Differential Revision: http://reviews.llvm.org/D13428 llvm-svn: 249856	2015-10-09 17:41:29 +00:00
Keno Fischer	21a7f23666	Clear SectionSymbols in MCContext::Reset This was just forgotten when SectionSymbols was introduced and could cause corruption if the MCContext was reused after Reset. Reviewers: rafael Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13547 llvm-svn: 249854	2015-10-09 17:24:54 +00:00
Duncan P. N. Exon Smith	769e1a972d	AArch64: Make getNextNode() cleanup in r249764 more clear After r249764, if you didn't see the full context, it looked like `std::next(I)` would get the same result as `++MachineBasicBlock::iterator(I)`. However, `I` is a `MachineInstr*` (not a `MachineBasicBlock::iterator`). Use the `getIterator()` helper I added later (r249782) to make this code more clear. llvm-svn: 249852	2015-10-09 16:54:54 +00:00
Duncan P. N. Exon Smith	8f11e1a713	CodeGen: Start removing implicit conversions to/from list iterators, NFC Start removing implicit conversions to/from list iterators in CodeGen, ala r249782 for IR. A lot more to go after this. llvm-svn: 249851	2015-10-09 16:54:49 +00:00
Dehao Chen	41dc5a6e86	Make HeaderLineno a local variable. http://reviews.llvm.org/D13576 As we are using hierarchical profile, there is no need to keep HeaderLineno a member variable. This is because each level of the inline stack will have its own header lineno. One should use the header lineno of its own inline stack level instead of the actual symbol. llvm-svn: 249848	2015-10-09 16:50:16 +00:00
Artur Pilipenko	ffd132878a	ValueTracking: use getAlignment in isAligned Reviewed By: reames Differential Revision: http://reviews.llvm.org/D13517 llvm-svn: 249841	2015-10-09 15:58:26 +00:00
Jun Bum Lim	0aace13d18	Improve ISel across lane float min/max reduction In vectorized float min/max reduction code, the final "reduce" step is sub-optimal. In AArch64, this change will combine : svn0 = vector_shuffle t0, undef<2,3,u,u> fmin = fminnum t0,svn0 svn1 = vector_shuffle fmin, undef<1,u,u,u> cc = setcc fmin, svn1, ole n0 = extract_vector_elt cc, #0 n1 = extract_vector_elt fmin, #0 n2 = extract_vector_elt fmin, #1 result = select n0, n1,n2 into : result = llvm.aarch64.neon.fminnmv t0 This change extends r247575. llvm-svn: 249834	2015-10-09 14:11:25 +00:00
Jonas Paulsson	ee3685fd45	[SystemZ] Remove unused code in SystemZElimCompare.cpp The Reference IndirectDef and IndirectUse members were unused and therefore removed. llvm-svn: 249824	2015-10-09 11:27:44 +00:00
Nemanja Ivanovic	d389657399	Vector element extraction without stack operations on Power 8 This patch corresponds to review: http://reviews.llvm.org/D12032 This patch builds onto the patch that provided scalar to vector conversions without stack operations (D11471). Included in this patch: - Vector element extraction for all vector types with constant element number - Vector element extraction for v16i8 and v8i16 with variable element number - Removal of some unnecessary COPY_TO_REGCLASS operations that ended up unnecessarily moving things around between registers Not included in this patch (will be in upcoming patch): - Vector element extraction for v4i32, v4f32, v2i64 and v2f64 with variable element number - Vector element insertion for variable/constant element number Testing is provided for all extractions. The extractions that are not implemented yet are just placeholders. llvm-svn: 249822	2015-10-09 11:12:18 +00:00
Andrea Di Biagio	99493df257	[MemCpyOpt] Fix wrong merging of adjacent nontemporal stores into memset calls. Pass MemCpyOpt doesn't check if a store instruction is nontemporal. As a consequence, adjacent nontemporal stores are always merged into a memset call. Example: ;;; define void @foo(<4 x float>* nocapture %p) { entry: store <4 x float> zeroinitializer, <4 x float>* %p, align 16, !nontemporal !0 %p1 = getelementptr inbounds <4 x float>, <4 x float>* %p, i64 1 store <4 x float> zeroinitializer, <4 x float>* %p1, align 16, !nontemporal !0 ret void } !0 = !{i32 1} ;;; In this example, the two nontemporal stores are combined to a memset of zero which does not preserve the nontemporal hint. Later on the backend (tested on a x86-64 corei7) expands that memset call into a sequence of two normal 16-byte aligned vector stores. opt -memcpyopt example.ll -S -o - | llc -mcpu=corei7 -o - Before: xorps %xmm0, %xmm0 movaps %xmm0, 16(%rdi) movaps %xmm0, (%rdi) With this patch, we no longer merge nontemporal stores into calls to memset. In this example, llc correctly expands the two stores into two movntps: xorps %xmm0, %xmm0 movntps %xmm0, 16(%rdi) movntps %xmm0, (%rdi) In theory, we could extend the usage of !nontemporal metadata to memcpy/memset calls. However a change like that would only have the effect of forcing the backend to expand !nontemporal memsets back to sequences of store instructions. A memset library call would not have exactly the same semantics as a builtin !nontemporal memset call. So, SelectionDAG will have to conservatively expand it back to a sequence of !nontemporal stores (effectively undoing the merging). Differential Revision: http://reviews.llvm.org/D13519 llvm-svn: 249820	2015-10-09 10:53:41 +00:00
Arnaud A. de Grandmaison	859b2ac07d	[EarlyCSE] Address post commit review for r249523. llvm-svn: 249814	2015-10-09 09:23:01 +00:00
Jonas Paulsson	5b3bab40b2	[SystemZ] Remove superfluous braces in SystemZShortenInst.cpp llvm-svn: 249812	2015-10-09 07:19:20 +00:00
Jonas Paulsson	18d877f79b	[SystemZ] Minor bugfixes. LLCH, LLHH and CLIH had the wrong register classes for the def-operand. Tie operands if changing opcode to an instruction with tied ops. Comment typo fix. These fixes were needed in order to make regression test case SystemZ/asm-18.ll pass with -verify-machineinstrs (not used by default). Reviewed by Ulrich Weigand. llvm-svn: 249811	2015-10-09 07:19:16 +00:00
Jonas Paulsson	0a9049ba82	[SystemZ] Bugfix in SystemZAsmParser.cpp. Let parseRegister() allow RegFP Group if expecting RegV Group, since the %f register prefix yields the FP group even while used with vector instructions. Reviewed by Ulrich Weigand. llvm-svn: 249810	2015-10-09 07:19:12 +00:00
Kostya Serebryany	e95022ac14	[libFuzzer] don't print large artifacts to stderr llvm-svn: 249808	2015-10-09 04:03:14 +00:00
Kostya Serebryany	bd5d1cdbb9	[libFuzzer] add -artifact_prefix flag llvm-svn: 249807	2015-10-09 03:57:59 +00:00
Saleem Abdulrasool	1825fac3c9	ARM: tweak WoA frame lowering Accept r11 when targeting Windows on ARM rather than just low registers. Because we are in a thumb-2 only mode, this may be slightly more expensive in code size, but results in better code for the environment since it spills the frame register, which is generally desired for fast stack walking as per the ABI. llvm-svn: 249804	2015-10-09 03:19:03 +00:00
Sanjoy Das	648956118b	[SCEV] Call `StrengthenNoWrapFlags` after `GroupByComplexity`; NFCI The current implementation of `StrengthenNoWrapFlags` is agnostic to the order of `Ops`, so this commit should not change anything semantic. An upcoming change will make `StrengthenNoWrapFlags` sensitive to the order of `Ops`. llvm-svn: 249802	2015-10-09 02:44:45 +00:00
Reid Kleckner	ae44e871cd	Revert "Revert "Revert r248959, "[WinEH] Emit int3 after noreturn calls on Win64""" This reverts commit r249794. Apparently my checkouts are full of unexpected surprises today. llvm-svn: 249796	2015-10-09 01:13:17 +00:00
Reid Kleckner	b510401785	Revert "Revert r248959, "[WinEH] Emit int3 after noreturn calls on Win64"" This reverts commit r249032. TODO write commit msg llvm-svn: 249794	2015-10-09 01:11:37 +00:00
Joseph Tremoulet	676e5cf07f	[WinEH] Fix cleanup state numbering Summary: - Recurse from cleanupendpads to their cleanuppads, to make sure the cleanuppad is visited if it has a cleanupendpad but no cleanupret. - Check for and avoid double-processing cleanuppads, to allow for them to have multiple cleanuprets (plus cleanupendpads). - Update Cxx state numbering to visit toplevel cleanupendpads and to recurse from cleanupendpads to their preds, to ensure we number any funclets in inlined cleanups. SEH state numbering already did this. Reviewers: rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13374 llvm-svn: 249792	2015-10-09 00:46:08 +00:00
Reid Kleckner	ebef256269	[SEH] Fix llvm.eh.exceptioncode fast register allocation assertion I called the wrong MachineBasicBlock::addLiveIn() overload. llvm-svn: 249786	2015-10-09 00:15:13 +00:00
Reid Kleckner	21427ada3e	Address review comments, remove error case and return 0 instead as required by tests llvm-svn: 249785	2015-10-09 00:15:08 +00:00
Reid Kleckner	e94fef7b3d	[llvm-symbolizer] Make --relative-address work with DWARF contexts Summary: Previously the relative address flag only affected PDB debug info. Now both DIContext implementations always expect to be passed virtual addresses. llvm-symbolizer is now responsible for adding ImageBase to module offsets when --relative-address is passed. Reviewers: zturner Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12883 llvm-svn: 249784	2015-10-09 00:15:01 +00:00
Duncan P. N. Exon Smith	52888a6738	IR: Remove implicit iterator conversions from lib/IR, NFC Stop converting implicitly between iterators and pointers/references in lib/IR. For convenience, I've added a `getIterator()` accessor to `ilist_node` so that callers don't need to know how to spell the iterator class (i.e., they can use `X.getIterator()` instead of `Function::iterator(X)`). I'll eventually disallow these implicit conversions entirely, but there's a lot of code, so it doesn't make sense to do it all in one patch. One library or so at a time. Why? To root out cases of `getNextNode()` and `getPrevNode()` being used in iterator logic. The design of `ilist` makes that invalid when the current node could be at the back of the list, but it happens to "work" right now because of a bug where those functions never return `nullptr` if you're using a half-node sentinel. Before I can fix the function, I have to remove uses of it that rely on it misbehaving. (Maybe the function should just be deleted anyway? But I don't want deleting it -- potentially a huge project -- to block fixing ilist/iplist.) llvm-svn: 249782	2015-10-08 23:49:46 +00:00
Sanjoy Das	3c520a1272	[RS4GC] Refactoring to make a later change easier, NFCI Summary: These non-semantic changes will help make a later change adding support for deopt operand bundles more streamlined. Reviewers: reames, swaroop.sridhar Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D13491 llvm-svn: 249779	2015-10-08 23:18:38 +00:00
Sanjoy Das	4fd3d400fa	[IRBuilder] Change the `gc.statepoint` creation interface This is to enable me to address review for D13491 -- `Flags` is a bitfield of `StatepointFlags`, not an individual item out of the enum, so it should be represented as an `uint32_t`. llvm-svn: 249778	2015-10-08 23:18:33 +00:00
Sanjoy Das	c21a05a3a4	[PlaceSafepoints] Extract out `callsGCLeafFunction`, NFC Summary: This will be used in a later change to RewriteStatepointsForGC. Reviewers: reames, swaroop.sridhar Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13490 llvm-svn: 249777	2015-10-08 23:18:30 +00:00
Sanjoy Das	1ede5367ba	[RS4GC] Don't copy ADTs unnecessarily, NFCI Summary: Use `const auto &` instead of `auto` in `makeStatepointExplicit`. Reviewers: reames, swaroop.sridhar Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13454 llvm-svn: 249776	2015-10-08 23:18:22 +00:00
Kevin Enderby	46e642f8c5	Fix a bug in llvm-objdump’s printing of Objective-C meta data from malformed Mach-O files that caused a crash because a section header had a size that extended past the end of the file. rdar://22983603 llvm-svn: 249768	2015-10-08 22:50:55 +00:00
Duncan P. N. Exon Smith	6eeaff169d	Support: Stop relying on iterator auto-conversion, NFC Stop relying on ilist implicit conversions from `value_type&` to `iterator` in YAMLParser.cpp. I eventually want to outlaw this entirely. It encourages `getNextNode()` and `getPrevNode()` in iterator logic, which is extremely fragile (and relies on them never returning `nullptr`). FTR, there's nothing nefarious going on in this case, it was just easy to clean up since the callers really wanted iterators to begin with. llvm-svn: 249767	2015-10-08 22:47:55 +00:00
Duncan P. N. Exon Smith	d389165c14	AArch64: Stop using MachineInstr::getNextNode() Stop using `getNextNode()` to get an insertion point (at least, in this one place). Instead, use iterator logic directly. The `getNextNode()` interface isn't actually supposed to work for creating iterators; it's supposed to return `nullptr` (not a real iterator) if this is the last node. It's currently broken and will "happen" to work, but if we ever fix the function, we'll get some strange failures in places like this. llvm-svn: 249764	2015-10-08 22:43:26 +00:00
Duncan P. N. Exon Smith	ece61624b1	MC: Stop using Fragment::getNextNode() Stop using `getNextNode()` to get an iterator to a fragment (at least, in this one place). Instead, use iterator logic directly. The `getNextNode()` interface isn't actually supposed to work for creating iterators; it's supposed to return `nullptr` (not a real iterator) if this is the last node. It's currently broken and will "happen" to work, but if we ever fix the function, we'll get some strange failures in places like this. llvm-svn: 249763	2015-10-08 22:36:08 +00:00
Duncan P. N. Exon Smith	a3da44882f	PowerPC: Don't use getNextNode() for insertion point Stop using `getNextNode()` to create an insertion point for machine instructions (at least, in this one place). Instead, use an iterator. As a drive-by, clean up dump statements to use iterator logic. The `getNextNode()` interface isn't actually supposed to work for insertion points; it's supposed to return `nullptr` if this is the last node. It's currently broken and will "happen" to work, but if we ever fix the function, we'll get some strange failures. llvm-svn: 249758	2015-10-08 22:20:37 +00:00
Evgeniy Stepanov	d12212bc8c	New MSan mapping layout (llvm part). This is an implementation of https://github.com/google/sanitizers/issues/579 It has a number of advantages over the current mapping: * Works for non-PIE executables. * Does not require ASLR; as a consequence, debugging MSan programs in gdb no longer requires "set disable-randomization off". * Supports linux kernels >=4.1.2. * The code is marginally faster and smaller. This is an ABI break. We never really promised ABI stability, but this patch includes a courtesy escape hatch: a compile-time macro that reverts back to the old mapping layout. llvm-svn: 249753	2015-10-08 21:35:26 +00:00
Evgeniy Stepanov	5fe279e727	Add Triple::isAndroid(). This is a simple refactoring that replaces Triple.getEnvironment() checks for Android with Triple.isAndroid(). llvm-svn: 249750	2015-10-08 21:21:24 +00:00
Eric Christopher	11e5983658	Move the MMX subtarget feature out of the SSE set of features and into its own variable. This is needed so that we can explicitly turn off MMX without turning off SSE and also so that we can diagnose feature set incompatibilities that involve MMX without SSE. Rationale: // sse3 __m128d test_mm_addsub_pd(__m128d A, __m128d B) { return _mm_addsub_pd(A, B); } // mmx void shift(__m64 a, __m64 b, int c) { _mm_slli_pi16(a, c); _mm_slli_pi32(a, c); _mm_slli_si64(a, c); _mm_srli_pi16(a, c); _mm_srli_pi32(a, c); _mm_srli_si64(a, c); _mm_srai_pi16(a, c); _mm_srai_pi32(a, c); } clang -msse3 -mno-mmx file.c -c For this code we should be able to explicitly turn off MMX without affecting the compilation of the SSE3 function and then diagnose and error on compiling the MMX function. This matches the existing gcc behavior and follows the spirit of the SSE/MMX separation in llvm where we can (and do) turn off MMX code generation except in the presence of intrinsics. Updated a couple of tests, but primarily tested with a couple of tests for turning on only mmx and only sse. This is paired with a patch to clang to take advantage of this behavior. llvm-svn: 249731	2015-10-08 20:10:06 +00:00
Diego Novillo	aae1ed8e08	Re-apply r249644: Handle inline stacks in gcov-encoded sample profiles. This fixes memory allocation problems by making the merge operation keep the profile readers around until the merged profile has been emitted. This is needed to prevent the inlined function names from disappearing from the function profiles. Since all the names are kept as references, once the reader disappears, the names are also deallocated. Additionally, XFAIL on big-endian architectures. The test case uses a gcov file generated on a little-endian system. llvm-svn: 249724	2015-10-08 19:40:37 +00:00
Alexei Starovoitov	87f83e6926	[bpf] Do not expand UNDEF SDNode during insn selection lowering o Before this patch, BPF backend will expand UNDEF node to i64 constant 0. o For second pass of dag combiner, legalizer will run through each to-be-processed dag node. o If any new SDNode is generated and has an undef operand, dag combiner will put undef node, newly-generated constant-0 node, and any node which uses these nodes in the working list. o During this process, it is possible undef operand is generated again, and this will form an infinite loop for dag combiner pass2. o This patch allows UNDEF to be a legal type. Signed-off-by: Yonghong Song <yhs@plumgrid.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> llvm-svn: 249718	2015-10-08 18:52:40 +00:00
Sanjoy Das	413dbbb1c2	[SCEV] Bring some methods up to coding style; NFC - Start methods with lower case - Reflow a comment - Delete header comment repeated in .cpp file llvm-svn: 249716	2015-10-08 18:46:59 +00:00
Reid Kleckner	b2244cb8f0	[WinEH] Relax assertion in the presence of stack realignment The code is correct as is, but we should test it. llvm-svn: 249715	2015-10-08 18:41:52 +00:00
Sanjoy Das	3bf22b1883	[SCEV] Remove comment repeated in cpp file; NFC llvm-svn: 249713	2015-10-08 18:28:42 +00:00
Sanjoy Das	dd70996a5c	[SCEV] Pick backedge values for phi nodes correctly Summary: `getConstantEvolutionLoopExitValue` and `ComputeExitCountExhaustively` assumed all phi nodes in the loop header have the same order of incoming values. This is not correct, and this commit changes `getConstantEvolutionLoopExitValue` and `ComputeExitCountExhaustively` to look up the backedge value of a phi node using the loop's latch block. Unfortunately, there is still some code duplication between `getConstantEvolutionLoopExitValue` and `ComputeExitCountExhaustively`. At some point in the future we should extract out a helper class / method that can evolve constant evolution phi nodes across iterations. Fixes 25060. Thanks to Mattias Eriksson for the spot-on analysis! Depends on D13457. Reviewers: atrick, hfinkel Subscribers: materi, llvm-commits Differential Revision: http://reviews.llvm.org/D13458 llvm-svn: 249712	2015-10-08 18:28:36 +00:00
Rafael Espindola	483ad20009	Handle Archive::getNumberOfSymbols being called in an archive with no symbols. No change in llvm, but will be tested from lld. llvm-svn: 249709	2015-10-08 18:06:20 +00:00
Ulrich Weigand	f4d14f781f	[SystemZ] Fix another assertion failure in tryBuildVectorShuffle This fixes yet another scenario where tryBuildVectorShuffle would attempt to create a BUILD_VECTOR node with an invalid combination of types. This can happen if the incoming BUILD_VECTOR has elements of a type different from the vector element type, which is allowed in certain cases as long as they are all the same type. When one of these elements is used in the residual vector, and UNDEF elements are added to fill up the residual vector, those UNDEFs then have to use the type of the original element, not the vector element type, or else the resulting BUILD_VECTOR will have an invalid type combination. llvm-svn: 249706	2015-10-08 17:46:59 +00:00
Sanjay Patel	f61a08fbf1	[InstCombine] transform masking off of an FP sign bit into a fabs() intrinsic call (PR24886) This is a partial fix for PR24886: https://llvm.org/bugs/show_bug.cgi?id=24886 Without this IR transform, the backend (x86 at least) was producing inefficient code. This patch is making 2 assumptions: 1. The canonical form of a fabs() operation is, in fact, the LLVM fabs() intrinsic. 2. The high bit of an FP value is always the sign bit; as noted in the bug report, this isn't specified by the LangRef. Differential Revision: http://reviews.llvm.org/D13076 llvm-svn: 249702	2015-10-08 17:09:31 +00:00
Sanjay Patel	9115cf8c9d	[ValueTracking] teach computeKnownBits that a fabs() clears sign bits This was requested in D13076: if we're going to canonicalize to fabs(), ValueTracking should know that fabs() clears sign bits. In this patch (as in D13076), we're not handling vectors yet even though computeKnownBits' fabs() case itself should be vector-ready via the splat in this patch. Fixing this will require follow-on patches to correct other logic that uses 'getScalarType'. Differential Revision: http://reviews.llvm.org/D13222 llvm-svn: 249701	2015-10-08 16:56:55 +00:00
George Rimar	87780300f6	Windows: Fixed sys::findProgramByName to work with files containing a dot in their name. The problem was in the SearchPathW function, which does not attach an extension if the file already has one. That does not work for executables like ld.lld2, for example, which require an .exe extension, but SearchPath thinks that the extension is "lld2". The solution was to add the extension manually. Differential Revision: http://reviews.llvm.org/D13536 llvm-svn: 249696	2015-10-08 16:03:19 +00:00
Frederic Riss	263b772bda	[X86] Disable X86CallFrameOptimization on Darwin in presence of EH We emit 1 compact unwind encoding per function, and this can’t represent the varying stack pointer that will be generated by X86CallFrameOptimization. Disable the optimization on Darwin. (It might be possible to split the function into multiple ranges and emit 1 compact unwind info per range. The compact unwind emission code isn’t ready for that and this kind of info certainly isn’t tested/used anywhere. It might be worth exploring this path if we want to get the space savings at some point though) llvm-svn: 249694	2015-10-08 15:45:08 +00:00
Teresa Johnson	ca6b64ff04	Fix combined function index abbrev (NFC) Removed an unused abbrev op in the VST_CODE_COMBINED_FNENTRY abbrev. I noticed while writing/testing an array string dumper for llvm-bcanalyze that the combined function's VST entry abbrevs contained an old field that I am not using. Everything was working fine since the bitcode writer and reader were in sync on how the record fields were actually being set up and interpreted. llvm-svn: 249691	2015-10-08 13:52:56 +00:00
Igor Breger	defab3c1ef	AVX512: vpextrb/w/d/q and vpinsrb/w/d/q implementation. These instructions don't have intrinsics. Added tests for lowering and encoding. Differential Revision: http://reviews.llvm.org/D12317 llvm-svn: 249688	2015-10-08 12:55:01 +00:00
James Molloy	e9d50dc9f7	Compute demanded bits for icmp instructions Instead of bailing out when we see an icmp, we can instead at least say that if the upper bits of both operands are known zero, they are not demanded. This doesn't help with signed comparisons, but it's at least better than bailing out. llvm-svn: 249687	2015-10-08 12:40:06 +00:00
James Molloy	bcd7f0ac98	Treat Mul just like Add and Subtract Like adds and subtracts, muls ripple only to the left so we can use the same logic. While we're here, add a print method to DemandedBits so it can be used with -analyze, which we'll use in the testcase. llvm-svn: 249686	2015-10-08 12:39:59 +00:00
James Molloy	ab9fdb9226	Make demanded bits lazy The algorithm itself is still eager, but it doesn't get run until a query function is called. This greatly reduces the compile-time impact of requiring DemandedBits when at runtime it is not often used. NFCI. llvm-svn: 249685	2015-10-08 12:39:50 +00:00
Michael Kuperstein	04e79329d0	[X86] Fix wrong treatment of multi-lane blends in BUILD_VECTORtoBlendMask() This fixes two separate bugs: 1) The mask for the high lane was not set correctly. That fixes PR24532. 2) The transformation should bail out if it believes it involves more than 2 lanes, as it does not currently do anything sensible in this case. Differential Revision: http://reviews.llvm.org/D13505 llvm-svn: 249669	2015-10-08 08:13:02 +00:00
Michael Kuperstein	2b3c16ca17	Do not assert on first non-prologue instruction being a CFI directive. llvm-svn: 249668	2015-10-08 07:48:49 +00:00
Jonas Paulsson	5d3fbd3733	[SystemZ] SystemZElimCompare pass improved. Compare elimination extended to recognize load-and-test instructions used for comparison and eliminate them the same way as with compare instructions. Test case fp-cmp-05.ll updated to expect optimized results now also for z13. The order of instruction shortening and compare elimination passes have been changed so that opcodes do not have to be handled in both passes. Reviewed by Ulrich Weigand. llvm-svn: 249666	2015-10-08 07:40:23 +00:00
Jonas Paulsson	29d9d8d955	[SystemZ] Bugfix: check CC reg liveness in SystemZShortenInst. The following instruction shortening transformations would introduce a definition of the CC reg, so therefore liveness of CC reg must be checked: WFADB -> ADBR WFSDB -> SDBR Also add the CC reg implicit def operand to the MI in case of change of opcode. Reviewed by Ulrich Weigand. llvm-svn: 249665	2015-10-08 07:40:19 +00:00
Jonas Paulsson	7c5ce10a07	[SystemZ] Use load-and-test for fp compare with 0 if vector support is present. Since the LTxBRCompare instructions can't be used with vector registers, a normal load-and-test instruction (with a modelled def operand) is used instead. Reviewed by Ulrich Weigand. llvm-svn: 249664	2015-10-08 07:40:16 +00:00
Jonas Paulsson	2c96dd64fc	[SystemZ] More minor fixing in SystemZElimCompare.cpp Don't use subreg indices since they are not used after regalloc. Reviewed by Ulrich Weigand. llvm-svn: 249663	2015-10-08 07:40:11 +00:00
Jonas Paulsson	9e1f3bd1bd	[SystemZ] Minor fixes in SystemZElimCompare.cpp Reviewed by Ulrich Weigand. llvm-svn: 249662	2015-10-08 07:39:55 +00:00
Craig Topper	da5168b7ce	Use range-based for loops. NFC. llvm-svn: 249659	2015-10-08 06:06:42 +00:00
Sanjoy Das	10dffcb36b	[SCEV] Check `Pred` first in isKnownPredicateViaSplitting Comparing `Pred` with `ICmpInst::ICMP_ULT` is cheaper than a memory access -- do that check before loading / storing `ProvingSplitPredicate`. llvm-svn: 249654	2015-10-08 03:46:00 +00:00
Sanjoy Das	1195dbee66	[SCEV] Use `auto *` instead of `auto`; NFCI (As prescribed by the coding style document) llvm-svn: 249653	2015-10-08 03:45:58 +00:00
Diego Novillo	a082040ded	Revert "Handle inline stacks in gcov-encoded sample profiles." This reverts commit r249644. The buildbots are failing the new test I added. Investigating. llvm-svn: 249648	2015-10-08 01:17:26 +00:00
Kostya Serebryany	3b804877fd	[libFuzzer] fix 32-bit build llvm-svn: 249646	2015-10-08 00:59:25 +00:00
Diego Novillo	b7fca57493	Handle inline stacks in gcov-encoded sample profiles. This patch adds support for reading sample profiles with inline stacks. Inline stacks in a profile are generated when the sampled binary has samples in inlined functions. For instance, if main() calls foo() and foo() calls bar(), and bar() is inlined into foo() and foo() inlined into main(), the profile may look something like: main total:364084 head:0 [ ... ] 2.3: _Z3fool total:243786 1: 60149 1.2: 38568 1.4: 46511 1.7: _Z3bari total:98558 1.1: 52672 1.2: 45886 At line 2, discriminator 3, main() calls foo(). In turn, foo() calls bar() at line 1, discriminator 7. In the textual format, this stacking of inline calls is represented with indentation. With this change, LLVM can now read sample profile files generated by the create_gcov tool from https://github.com/google/autofdo. llvm-svn: 249644	2015-10-08 00:39:11 +00:00
Justin Bogner	468c998031	CodeGen: print and verify after TargetPassConfig::insertPass by default In r224059, we started verifying after addPass, but missed doing so on insertPass. There isn't a good reason for the discrepancy, and skipping the verifier in these cases causes bugs. This also exposes a verifier error that was introduced in r249087, but the verifier doesn't run until after the register coalescer, when the issue happens to have been resolved. I've skipped the verifier after SIFixSGPRLiveRangesID to avoid the failures for now and will follow up with Matt for a proper fix. llvm-svn: 249643	2015-10-08 00:36:22 +00:00
Reid Kleckner	97797419e6	[WinEH] Fix 32-bit funclet epilogues in the presence of dynamic allocas In particular, passing non-trivially copyable objects by value on win32 uses a dynamic alloca (inalloca). We would clobber ESP in the epilogue and end up returning to outer space. llvm-svn: 249637	2015-10-07 23:55:01 +00:00
David Majnemer	6af5f82c20	[WinEH] Refer to filter funclets using their symbol-table symbol The relocation for the filter funclet will be against a symbol table entry for a function instead of the section, making it easier to understand what is going on. llvm-svn: 249621	2015-10-07 21:34:00 +00:00
Sanjoy Das	40bdd041db	[RS4GC] Use AssertingVH for RematerializedValueMapTy, NFCI Reviewers: reames, swaroop.sridhar Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13489 llvm-svn: 249620	2015-10-07 21:32:35 +00:00
Reid Kleckner	70bf6bb5e6	[WinEH] Undo the effect of r249578 for 32-bit The __CxxFrameHandler3 tables for 32-bit are supposed to hold stack offsets relative to EBP, not ESP. I blindly updated the win-catchpad.ll test case, and immediately noticed that 32-bit catching stopped working. While I'm at it, move the frame index to frame offset WinEH table logic out of PEI. PEI shouldn't have to know about WinEHFuncInfo. I realized we can calculate frame index offsets just fine from the table printer. llvm-svn: 249618	2015-10-07 21:13:15 +00:00
David Majnemer	c289c9ff55	[WinEH] Remove unreachable blocks before preparation We remove unreachable blocks because it is pointless to consider them for coloring. However, we still had stale pointers to these blocks in some data structures after we removed them from the function. Instead, remove the unreachable blocks before attempting to do anything with the function. This fixes PR25099. llvm-svn: 249617	2015-10-07 21:08:25 +00:00
Rafael Espindola	284093033f	git-clang-format r249548. Sorry for missing this the first time. llvm-svn: 249610	2015-10-07 20:32:24 +00:00
Vasileios Kalintiris	b876b58d38	[mips][FastISel] Factor out common code from switch statement. NFC llvm-svn: 249603	2015-10-07 20:06:30 +00:00
Duncan P. N. Exon Smith	37bf678a0d	IR: Create SymbolTableList wrapper around iplist, NFC Create `SymbolTableList`, a wrapper around `iplist` for lists that automatically manage a symbol table. This commit reduces a ton of code duplication between the six traits classes that were used previously. As a drive by, reduce the number of template parameters from 2 to 1 by using a SymbolTableListParentType metafunction (I originally had this as a separate commit, but it touched most of the same lines so I squashed them). I'm in the process of trying to remove the UB in `createSentinel()` (see the FIXMEs I added for `ilist_embedded_sentinel_traits` and `ilist_half_embedded_sentinel_traits`). My eventual goal is to separate the list logic into a base class layer that knows nothing about (and isn't templated on) the downcasted nodes -- removing the need to invoke UB -- but for now I'm just trying to get a handle on all the current use cases (and cleaning things up as I see them). Besides these six SymbolTable lists, there are two others that use the addNode/removeNode/transferNodes() hooks: the `MachineInstruction` and `MachineBasicBlock` lists. Ideally there'll be a way to factor these hooks out of the low-level API entirely, but I'm not quite there yet. llvm-svn: 249602	2015-10-07 20:05:10 +00:00
Sanjoy Das	af6980c70a	[IRBuilder] Add gc.statepoint related methods to IRBuilder Summary: This adds some more routines to `IRBuilder` around creating calls and invokes to `gc.statepoint`. These will be used later. Reviewers: reames, swaroop.sridhar Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D13371 llvm-svn: 249596	2015-10-07 19:52:12 +00:00
Vasileios Kalintiris	6ae1b35cda	[mips][FastISel] Use ternary operator to select opcode. NFC llvm-svn: 249594	2015-10-07 19:43:31 +00:00
Joseph Tremoulet	39234fc67e	[WinEH] Set NoModuleLevelChanges in clone flags Summary: This is necessary to keep the cloner from making bogus copies of debug metadata attached to the IR it is cloning. Also, avoid running RemapInstruction over all instructions in the common case that no cloning was performed. Reviewers: rnk, andrew.w.kaylor, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13514 llvm-svn: 249591	2015-10-07 19:29:56 +00:00
Rafael Espindola	4264e2d531	Use SpecificBumpPtrAllocator to simplify the MCSection destruction. llvm-svn: 249589	2015-10-07 19:08:19 +00:00
Mehdi Amini	044cb34bdc	Revert "Revert "This patch builds on top of D13378 to handle constant condition."" This reverts commit r249528 and reapplies r249431. The fix for the fallout has been committed in r249575. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 249581	2015-10-07 18:14:25 +00:00
Vasileios Kalintiris	daad571ba4	[mips][FastISel] Simple refactoring of MipsFastISel::emitLogicalOP(). NFC. llvm-svn: 249580	2015-10-07 18:14:24 +00:00
Chad Rosier	7c6ac2b8f9	[AArch64] Fold a floating-point divide by power of two into fp conversion. Part of http://reviews.llvm.org/D13442 llvm-svn: 249579	2015-10-07 17:51:37 +00:00
Reid Kleckner	33bd2d99d8	[WinEH] Fix two minor issues in __CxxFrameHandler3 tables There was an off-by-one bug in ip2state tables which manifested when one call immediately preceded the try-range of the next. The return address of the previous call would appear to be within the try range of the next scope, resulting in extra destructors or catches running. We also computed the wrong offset for catch parameter stack objects. The offset should be from RSP, not from RBP. llvm-svn: 249578	2015-10-07 17:49:32 +00:00
Matt Arsenault	fc0ad42516	AMDGPU: Fix missing implicit m0 uses on movrel instructions llvm-svn: 249577	2015-10-07 17:46:32 +00:00
Chad Rosier	fa30c9b436	[AArch64] Fold a floating-point multiply by power of two into fp conversion. Part of http://reviews.llvm.org/D13442 llvm-svn: 249576	2015-10-07 17:39:18 +00:00
Sanjoy Das	0015e5a088	[IndVars] Preserve LCSSA in `eliminateIdentitySCEV` Summary: After r249211, SCEV can see through some LCSSA phis. Add a `replacementPreservesLCSSAForm` check before replacing uses of these phi nodes with a simplified use of the induction variable to avoid breaking LCSSA. Fixes 25047. Depends on D13460. Reviewers: atrick, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13461 llvm-svn: 249575	2015-10-07 17:38:31 +00:00
Sanjoy Das	4493b40002	[SCEV] Use some C++11'ism, NFC Summary: Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13457 llvm-svn: 249574	2015-10-07 17:38:25 +00:00
Chad Rosier	169865ffda	[ARM] Promote helper function to SelectionDAG. I'll be using the function in a similar combine for AArch64. The helper was also improved to handle undef values. Part of http://reviews.llvm.org/D13442 llvm-svn: 249572	2015-10-07 17:28:58 +00:00
Kevin B. Smith	9c7408807f	Test commit access. Fixed comment to have correct input parameter name and period termination. llvm-svn: 249571	2015-10-07 17:24:25 +00:00
Joseph Tremoulet	bde46c5642	[WinEH] Update CoreCLR EH for catchpad MBBs Summary: Set the pad MBB as a funclet entry for CoreCLR as well as MSVCCXX, and update state numbering to put the catchpad block rather than its normal successor into the unwind map. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13492 llvm-svn: 249569	2015-10-07 17:16:25 +00:00
Oliver Stannard	d3d114ba54	[ARM] Use correct half-precision functions in EABI mode The ARM RTABI defines the half- to single-precision float conversion functions with an __aeabi prefix, but libgcc only has them with a __gnu prefix. Therefore we need to emit the __aeabi version when compiling with an eabi or eabihf triple, and the __gnu version with a gnueabi or gnueabihf triple. llvm-svn: 249565	2015-10-07 16:58:49 +00:00
Chad Rosier	17436bf64e	[ARM] Prevent PerformVDIVCombine from combining a vcvt/vdiv with 8 lanes. This would result in a crash since the vcvt used does not support v8i32 types. llvm-svn: 249560	2015-10-07 16:15:40 +00:00
Artur Pilipenko	d94903c9f8	Teach computeKnownBits to use new align attribute/metadata Reviewed By: reames Differential Revision: http://reviews.llvm.org/D13470 llvm-svn: 249557	2015-10-07 16:01:18 +00:00
Jeroen Ketema	aebca09543	[ARM][AArch64] Only lower to interleaved load/store if the target has NEON Without an additional check for NEON, the compiler crashes during legalization of NEON ldN/stN. Differential Revision: http://reviews.llvm.org/D13508 llvm-svn: 249550	2015-10-07 14:53:29 +00:00
Rafael Espindola	30d77777e7	Use non virtual destructors for sections. llvm-svn: 249548	2015-10-07 13:46:06 +00:00
Chad Rosier	db71abf2d4	[ARM] Push more complex check down to reduce compile time. NFC. llvm-svn: 249547	2015-10-07 13:40:44 +00:00
Rafael Espindola	665b0d3a4e	Don't repeat names in comments and don't indent in namespaces. NFC. llvm-svn: 249546	2015-10-07 13:38:49 +00:00
Scott Egerton	9004cc7942	Revert: r249536 - Testing commit access with a trivial whitespace change. llvm-svn: 249537	2015-10-07 10:57:06 +00:00
Scott Egerton	be6b54b691	Testing commit access with a trivial whitespace change. llvm-svn: 249536	2015-10-07 10:49:49 +00:00
James Molloy	47efaeb36e	Revert "This patch builds on top of D13378 to handle constant condition." This reverts commit r249431. This caused failures in sqlite3: http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/14453 llvm-svn: 249528	2015-10-07 09:03:34 +00:00
Arnaud A. de Grandmaison	a6178a179d	[EarlyCSE] Fix handling of target memory intrinsics for CSE'ing loads. Summary: Some target intrinsics can access multiple elements, using the pointer as a base address (e.g. AArch64 ld4). When trying to CSE such instructions, it must be checked that the available value comes from a compatible instruction, because the pointer is not enough to discriminate whether the value is correct. Reviewers: ssijaric Subscribers: mcrosier, llvm-commits, aemerson Differential Revision: http://reviews.llvm.org/D13475 llvm-svn: 249523	2015-10-07 07:41:29 +00:00
Michael Kuperstein	259f1508f0	[X86] Emit .cfi_escape GNU_ARGS_SIZE when adjusting the stack before calls When outgoing function arguments are passed using push instructions, and EH is enabled, we may need to indicate to the stack unwinder that the stack pointer was adjusted before the call. This should fix the exception handling issues in PR24792. Differential Revision: http://reviews.llvm.org/D13132 llvm-svn: 249522	2015-10-07 07:01:31 +00:00
Igor Breger	1a6fd1cc0f	AVX512: Change encoding of vpshuflw and vpshufhw instructions. Implement WIG as W0 and not W1, like all other instructions have been implemented. Add encoding tests. Differential Revision: http://reviews.llvm.org/D13471 llvm-svn: 249521	2015-10-07 06:31:18 +00:00
Sanjoy Das	60bf3db17f	[RS4GC] Remove an unnecessary assert & related variables I don't think this assert adds much value, and removing it and related variables avoids an "unused variable" warning in release builds. llvm-svn: 249511	2015-10-07 02:39:27 +00:00
Sanjoy Das	b40bd1a93f	[RS4GC] Cosmetic cleanup, NFC Summary: A series of cosmetic cleanup changes to RewriteStatepointsForGC: - Rename variables to LLVM style - Remove some redundant asserts - Remove an unused `Pass *` parameter - Remove unnecessary variables - Use C++11 idioms where applicable - Pass CallSite by value, not reference Reviewers: reames, swaroop.sridhar Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D13370 llvm-svn: 249508	2015-10-07 02:39:18 +00:00
Matt Arsenault	10e6a61892	AMDGPU: Add comment for VOP2b operand class Because of the constant bus requirement, it is never legal to use a literal constant for these instructions despite the encoding allowing it. This was already doing the right thing, but note why. llvm-svn: 249500	2015-10-07 01:36:00 +00:00
Matt Arsenault	187276fa94	AMDGPU: Properly register passes llvm-svn: 249495	2015-10-07 00:42:53 +00:00
Matt Arsenault	284192730a	AMDGPU: Use explicit register size indirect pseudos This stops using an unknown reg class operand. Currently build_vector selection has a broken-looking check where it tries to use a VGPR reg class and an SGPR one if it sees an SGPR use. When the source operand has an explicit VGPR class, illegal copies will be inserted that SIFixSGPRCopies will take care of normally later, which will allow removing the weird check of build_vector users. Without this, when removed v_movrels_b32 would still be emitted even though all of the values were only stored in SGPRs. llvm-svn: 249494	2015-10-07 00:42:51 +00:00
Matt Arsenault	922b7bf808	AMDGPU: Remove inferRegClassFromUses / inferRegClassFromDefs I'm not sure why this would be necessary, and no tests fail with them removed. Looking at the uses is suspect as well because the use reg classes will likely change when the users are moved as a result of moving this instruction. llvm-svn: 249493	2015-10-07 00:42:31 +00:00
Reid Kleckner	72ba70418f	[SEH] Add llvm.eh.exceptioncode intrinsic This will support the Clang __exception_code intrinsic. llvm-svn: 249492	2015-10-07 00:27:33 +00:00
Hans Wennborg	f1f36517b7	InstCombine: Fold comparisons between unguessable allocas and other pointers This will allow us to optimize code such as: int f(int *p) { int x; return p == &x; } as well as: int allocate(void); int f() { int x; int *p = allocate(); return p == &x; } The folding can only be done under certain circumstances. Even though p and &x cannot alias, the comparison must still return true if the pointer representations are equal. If a user successfully generates a p that's a correct guess for &x, comparison should return true even though p is an invalid pointer. This patch argues that if the address of the alloca isn't observable outside the function, the function can act as-if the address is impossible to guess from the outside. The tricky part is keeping the act consistent: if we fold p == &x to false in one place, we must make sure to fold any other comparisons based on those pointers similarly. To ensure that, we only fold when &x is involved exactly once in comparison instructions. Differential Revision: http://reviews.llvm.org/D13358 llvm-svn: 249490	2015-10-07 00:20:07 +00:00
David Blaikie	c9ad9191a7	DebugInfo: Include the decl_line/decl_file in subprogram definitions if they differ from those in the declaration This is handy for some AutoFDO stuff, and seems like a minor improvement to correctness (otherwise a debug info consumer might think the decl line/file of the def was the same as that of the declaration - though what a consumer might use that for, I'm not sure - maybe "list <func>" would've misbehaved with the old behavior?) and at a minor cost (in my experiment, with fission, without type units, without compression, 0.01% growth in debug info in the executable/objects, 0.02% growth in the .dwo files). llvm-svn: 249487	2015-10-07 00:04:16 +00:00
David Majnemer	7735a6d07a	[WinEH] Create a separate MBB for funclet prologues Our current emission strategy is to emit the funclet prologue in the CatchPad's normal destination. This is problematic because intra-funclet control flow to the normal destination is not erroneous and results in us reevaluating the prologue if said control flow is taken. Instead, use the CatchPad's location for the funclet prologue. This correctly models our desire to have unwind edges evaluate the prologue but edges to the normal destination result in typical control flow. Differential Revision: http://reviews.llvm.org/D13424 llvm-svn: 249483	2015-10-06 23:31:59 +00:00
Hans Wennborg	083ca9bb32	Fix Clang-tidy modernize-use-nullptr warnings in source directories and generated files; other minor cleanups. Patch by Eugene Zelenko! Differential Revision: http://reviews.llvm.org/D13321 llvm-svn: 249482	2015-10-06 23:24:35 +00:00
Lang Hames	44780acd91	[Orc] Teach the CompileOnDemand layer to clone aliases. This allows modules containing aliases to be lazily jit'd. Previously these failed with missing symbol errors because the aliases weren't cloned from the original module. llvm-svn: 249481	2015-10-06 22:55:05 +00:00
Duncan P. N. Exon Smith	55a0c43f8e	IR: Use auto for iterators, NFC llvm-svn: 249480	2015-10-06 22:37:47 +00:00
Duncan P. N. Exon Smith	d146e7d93e	IR: Remove unnecessary TraitsClass typedef, NFC No classes are specializing the symbol table traits, so no need to look through a typedef for class API. Make a few more functions private since only SymbolTableListTraits should be using them. llvm-svn: 249476	2015-10-06 22:14:06 +00:00
Sanjoy Das	5c8bead46d	[IndVars] Don't break dominance in `eliminateIdentitySCEV` Summary: After r249211, `getSCEV(X) == getSCEV(Y)` does not guarantee that X and Y are related in the dominator tree, even if X is an operand to Y (I've included a toy example in comments, and a real example as a test case). This commit changes `SimplifyIndVar` to require a `DominatorTree`. I don't think this is a problem because `ScalarEvolution` requires it anyway. Fixes PR25051. Depends on D13459. Reviewers: atrick, hfinkel Subscribers: joker.eph, llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D13460 llvm-svn: 249471	2015-10-06 21:44:49 +00:00
Sanjoy Das	088bb0ea9f	[IndVars] Extract out eliminateIdentitySCEV, NFC Summary: Reflow a comment while at it. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13459 llvm-svn: 249470	2015-10-06 21:44:39 +00:00
Duncan P. N. Exon Smith	c77e92da92	IR: Remove unnecessary specialization of getSymTab(), NFC The only specializations of `getSymTab()` were identical to the default defined in `SymbolTableListTraits::getSymTab()`. Remove the specializations, and stop treating it like a configuration point. Just to be sure no one else accesses this, make it private. llvm-svn: 249469	2015-10-06 21:31:07 +00:00
Tom Stellard	0fbf899c0f	AMDGPU/SI: Remove calling convention assertion from LowerFormalArguments() Summary: We currently ignore the calling convention, so there is no real reason to assert on the calling convention of functions. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13367 llvm-svn: 249468	2015-10-06 21:16:34 +00:00
Chad Rosier	dca46b426f	[ARM] Minor refactoring. NFC. llvm-svn: 249465	2015-10-06 20:58:42 +00:00
Chad Rosier	aed910b7d7	[ARM] Minor refactoring. NFC. llvm-svn: 249464	2015-10-06 20:51:26 +00:00
Chad Rosier	9df4aff86d	[ARM] Minor refactoring. NFC. llvm-svn: 249463	2015-10-06 20:45:45 +00:00
Vedant Kumar	1ab5ea564f	[Function] Clean up {prefix,prologue} data routines (NFC) Factor out some common code used to get+set function prefix/prologue data. This may come in handy if we ever decide to store personality functions in the same way we store prefix/prologue data. Differential Revision: http://reviews.llvm.org/D13120 Reviewed-by: bogner llvm-svn: 249460	2015-10-06 20:31:57 +00:00
Joseph Tremoulet	7f8c1165cd	[WinEH] Implement state numbering for CoreCLR Summary: Assign one state number per handler/funclet, tracking parent state, handler type, and catch type token. State numbers are arranged such that ancestors have lower state numbers than their descendants. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: pgavlin, AndyAyers, llvm-commits Differential Revision: http://reviews.llvm.org/D13450 llvm-svn: 249457	2015-10-06 20:30:33 +00:00
Joseph Tremoulet	2afea5438f	[WinEH] Recognize CoreCLR personality function Summary: - Add CoreCLR to if/else ladders and switches as appropriate. - Rename isMSVCEHPersonality to isFuncletEHPersonality to better reflect what it captures. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: pgavlin, AndyAyers, llvm-commits Differential Revision: http://reviews.llvm.org/D13449 llvm-svn: 249455	2015-10-06 20:28:16 +00:00
Chad Rosier	a087fd21da	[ARM] Minor refactoring to improve readability. NFC. llvm-svn: 249454	2015-10-06 20:23:42 +00:00
Philip Reames	675418ebc0	Extend known bits to understand @llvm.bswap This is a cleaned up patch from the one written by John Regehr based on the findings of the Souper superoptimizer. When writing tests, I was surprised to find that instsimplify apparently doesn't know how to collapse bit test sequences based purely on known bits. This required me to split my tests across both instsimplify and instcombine. Differential Revision: http://reviews.llvm.org/D13250 llvm-svn: 249453	2015-10-06 20:20:45 +00:00
Philip Reames	600a91580f	Fix pr25040 - Handle vectors of i1s in recently added implication code As mentioned in the bug, I'd missed the presence of a getScalarType in the caller of the new implies method. As a result, when we ended up with an implication over two vectors, we'd trip an assert and crash. Differential Revision: http://reviews.llvm.org/D13441 llvm-svn: 249442	2015-10-06 19:00:02 +00:00
Krzysztof Parzyszek	8d2b2cfa29	[Hexagon] Remove ZeroOrMore from option flags llvm-svn: 249438	2015-10-06 18:29:36 +00:00
Mehdi Amini	cf2513b352	This patch builds on top of D13378 to handle constant condition. With this patch, clang -O3 optimizes correctly (providing > 1000x speedup on this artificial benchmark): for (a=0; a<n; a++) for (b=0; b<n; b++) for (c=0; c<n; c++) for (d=0; d<n; d++) for (e=0; e<n; e++) for (f=0; f<n; f++) x++; From test-suite/SingleSource/Benchmarks/Shootout/nestedloop.c Reviewers: sanjoyd Differential Revision: http://reviews.llvm.org/D13390 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 249431	2015-10-06 17:19:20 +00:00
Tom Stellard	88e0b25181	AMDGPU/SI: Add 64-bit versions of v_nop and v_clrexcp Summary: The assembly printing of these is still missing the encoding size suffix, but this will be fixed in a later commit. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13436 llvm-svn: 249424	2015-10-06 15:57:53 +00:00
Krzysztof Parzyszek	fb33824efd	[Hexagon] Add an early if-conversion pass llvm-svn: 249423	2015-10-06 15:49:14 +00:00
Daniel Sanders	1b3341724c	[mips][microMIPS] Fix an issue with selecting sqrt instruction in LLVM backend Summary: This fixes 7 tests during a fast LLVM test-suite run: * MultiSource/Benchmarks/McCat/18-imp/imp * MultiSource/Applications/oggenc/oggenc * MultiSource/Benchmarks/MallocBench/gs/gs * MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan * MultiSource/Benchmarks/VersaBench/beamformer/beamformer * MultiSource/Benchmarks/MiBench/consumer-lame/consumer-lame * MultiSource/Benchmarks/Bullet/bullet Error message was in the form of: fatal error: error in backend: Cannot select: 0x95c3288: f32 = fsqrt 0x95c0190 [ORD=9] [ID=18] 0x95c0190: f32 = fadd 0x95bef30, 0x95c4d00 [ORD=8] [ID=17] 0x95bef30: f32 = fmul 0x95c4988, 0x95c4988 [ORD=5] [ID=16] ... There was a problem with selecting the sqrt instruction in the LLVM backend. To fix the issue, changes are made in the TableGen definition for the sqrt instruction in MipsInstrFPU.td, and a new test file sqrt.ll is added to the LLVM regression tests. Patch by Zlatko Buljan Reviewers: zoran.jovanovic, hvarga, dsanders Subscribers: llvm-commits, petarj Differential Revision: http://reviews.llvm.org/D13235 llvm-svn: 249416	2015-10-06 15:17:25 +00:00
Daniel Sanders	add9057fa7	Revert r249123 - [mips][microMIPS] Fix an issue with selecting sqrt instruction in LLVM backend The author was not credited and most of the commit message is missing. Will re-commit with this fixed. llvm-svn: 249415	2015-10-06 15:13:16 +00:00
Arnaud A. de Grandmaison	6fd488b156	[EarlyCSE] Constify ParseMemoryInst methods (NFC). llvm-svn: 249400	2015-10-06 13:35:30 +00:00
Filipe Cabecinhas	b70fd8719e	Make sure the CastInst is valid before trying to create it Bug found with afl-fuzz. llvm-svn: 249396	2015-10-06 12:37:54 +00:00
Andrea Di Biagio	40f59e4466	[InstCombine] Teach SimplifyDemandedVectorElts how to handle ConstantVector select masks with ConstantExpr elements (PR24922) If the mask of a select instruction is a ConstantVector, method SimplifyDemandedVectorElts iterates over the mask elements to identify which values are selected from the select inputs. Before this patch, method SimplifyDemandedVectorElts always used method Constant::isNullValue() to check if a value in the mask was zero. Unfortunately that method always returns false when called on a ConstantExpr. This patch fixes the problem in SimplifyDemandedVectorElts by adding an explicit check for ConstantExpr values. Now, if a value in the mask is a ConstantExpr, we avoid calling isNullValue() on it. Fixes PR24922. Differential Revision: http://reviews.llvm.org/D13219 llvm-svn: 249390	2015-10-06 10:34:53 +00:00
Craig Topper	2c4068f409	[TwoAddressInstructionPass] When looking for a 3 addr conversion after commuting, make sure regB has been updated to take into account the commute. llvm-svn: 249378	2015-10-06 05:39:59 +00:00
Alexei Starovoitov	4e01a38da0	[bpf] Avoid extra pointer arithmetic for stack access For the program like below struct key_t { int pid; char name[16]; }; extern void test1(char *); int test() { struct key_t key = {}; test1(key.name); return 0; } For key.name, the llc/bpf may generate the below code: R1 = R10 // R10 is the frame pointer R1 += -24 // framepointer adjustment R1 |= 4 // R1 is then used as the first parameter of test1 OR operation is not recognized by in-kernel verifier. This patch introduces an intermediate FI_ri instruction and generates the following code that can be properly verified: R1 = R10 R1 += -20 Patch by Yonghong Song <yhs@plumgrid.com> llvm-svn: 249371	2015-10-06 04:00:53 +00:00
Craig Topper	79dd1bf094	[X86] Teach constant hoisting that ANDs with 64-bit immediates in the range 0x80000000-0xffffffff can be handled cheaply and don't need to be hoisted. Most importantly, this keeps constant hoisting from preventing instruction selections ability to turn an AND with 0xffffffff into a move into a 32-bit subregister. llvm-svn: 249370	2015-10-06 02:50:24 +00:00
Craig Topper	d69d495333	[X86] Remove unnecessary AddComplexity directive. The instruction is already wrapped in the equivalent earlier. NFC llvm-svn: 249369	2015-10-06 02:50:21 +00:00
Dan Gohman	e51c058ecc	[WebAssembly] Switch to a more traditional assembly syntax This new syntax is built around putting each instruction on its own line in a "mnemonic op, op, op" like syntax. It also uses conventional data section directives like ".byte" and so on rather than requiring everything to be in hierarchical S-expression format. This is a more natural syntax for a ".s" file format from the perspective of LLVM MC and related tools, while remaining easy to translate into other forms as needed. llvm-svn: 249364	2015-10-06 00:27:55 +00:00
Benjamin Kramer	808d2a070d	Move helper classes into an anonymous namespace. NFC. llvm-svn: 249356	2015-10-05 21:20:26 +00:00
Diego Novillo	91cbed84d9	Remove AutoFDO profile handling for GCC's LIPO. NFC. Given the work we are doing on ThinLTO, we will never need to support module groups and working sets in GCC's implementation of LIPO. These are currently dead code, and will continue to be so. llvm-svn: 249351	2015-10-05 21:08:05 +00:00
David Majnemer	e4f9b09b51	[WinEH] Update CATCHRET's operand to match its successor The CATCHRET operand did not match the MachineFunction's CFG. This mismatch happened because FrameLowering created a new MachineBasicBlock and updated the CFG but forgot to update the CATCHRET operand. Let's make sure this doesn't happen again by strengthening the funclet membership analysis: it can now reason about the membership of all basic blocks, not just those inside of funclets. llvm-svn: 249344	2015-10-05 20:09:16 +00:00
Jakub Staszak	225d3ab801	Simplify code. No functionality change. llvm-svn: 249335	2015-10-05 18:53:30 +00:00
Evgeniy Stepanov	670abcfd78	[msan] Correct a typo in poison stack pattern command line description. Patch by Jon Eyolfson. llvm-svn: 249331	2015-10-05 18:01:17 +00:00
Tom Stellard	d585cd85a3	AMDGPU/SI: Add a helper for creating aliases for the _e32 instructions Summary: We are currently only using these aliases for VOPC instructions, but this helper will make it easier to use them everywhere. These aliases allow for the automatic matching of instructions with forced 32-bit encoding. Eventually, we should be able to remove the custom C++ logic we have for this in the assembler. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13396 llvm-svn: 249330	2015-10-05 17:57:39 +00:00
Arnold Schwaighofer	0591c5d719	MergeFunctions: Clear GlobalNumbers ValueMap Otherwise, the map will observe changes as long as MergeFunctions is alive. This is bad because follow-up passes could replace-all-uses-with on the key of an entry in the map. The value handle callback of ValueMap however asserts that the key type matches. rdar://22971893 llvm-svn: 249327	2015-10-05 17:26:36 +00:00
Scott Douglass	953f908173	[ARM] Modify codegen for memcpy intrinsic to prefer LDM/STM. We were previously codegen'ing memcpy as regular load/store operations and hoping that the register allocator would allocate registers in ascending order so that we could apply an LDM/STM combine after register allocation. According to the commit that first introduced this code (r37179), we planned to teach the register allocator to allocate the registers in ascending order. This never got implemented, and up to now we've been stuck with very poor codegen. A much simpler approach for achieving better codegen is to create MEMCPY pseudo instructions, attach scratch virtual registers to them and then, post register allocation, expand the MEMCPYs into LDM/STM pairs using the scratch registers. The register allocator will have picked arbitrary registers which we sort when expanding the MEMCPY. This approach also avoids the need to repeatedly calculate offsets which ultimately ought to be eliminated pre-RA in order to decrease register pressure. Fixes PR9199 and PR23768. [This is based on Peter Collingbourne's r238473 which was reverted.] Differential Revision: http://reviews.llvm.org/D13239 Change-Id: I727543c2e94136e0f80b8e22d5642d7b9ee5b458 Author: Peter Collingbourne <peter@pcc.me.uk> llvm-svn: 249322	2015-10-05 14:49:54 +00:00
Zoran Jovanovic	5a8dffc618	[mips][microMIPS] Implement JALRC16, JRCADDIUSP and JRC16 instructions Differential Revision: http://reviews.llvm.org/D11219 llvm-svn: 249317	2015-10-05 14:00:09 +00:00
Alexandros Lamprineas	1bab191f25	[MC layer][AArch64] llvm-mc accepts 4-bit immediate values for "msr pan, #imm", while only 1-bit immediate values should be valid. Changed encoding and decoding for msr pstate instructions. Differential Revision: http://reviews.llvm.org/D13011 llvm-svn: 249313	2015-10-05 13:42:31 +00:00
Daniel Sanders	d5a89418c5	[mips] Changed the way symbols are handled in dla and la instructions to allow simple expressions. Summary: An instruction like "(d)la $5, symbol+8" previously would have crashed the assembler as it contains an expression. This is now fixed. A few tests cases have also been changed to reflect these changes, however these should only be syntax changes. Some new test cases have also been added. Patch by Scott Egerton. Reviewers: vkalintiris, dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12760 llvm-svn: 249311	2015-10-05 13:19:29 +00:00
Benjamin Kramer	ae1d59967d	[Support] Add a version of fs::make_absolute with a custom CWD. This will be used soon from clang. llvm-svn: 249309	2015-10-05 13:02:43 +00:00
Rafael Espindola	e3a20f57d9	Fix pr24486. This extends the work done in r233995 so that now getFragment (in addition to getSection) also works for variable symbols. With that the existing logic to decide if a-b can be computed works even if a or b are variables. Given that, the expression evaluation can avoid expanding variables as aggressively and that in turn lets the relocation code see the original variable. In order for this to work with the asm streamer, there is now a dummy fragment per section. It is used to assign a section to a symbol when no other fragment exists. This patch is a joint work by Maxim Ostapenko and myself. llvm-svn: 249303	2015-10-05 12:07:05 +00:00
David Majnemer	429c8eda22	[SelectionDAGBuilder] Remove dead code We already check for LandingPadInst two lines above. llvm-svn: 249280	2015-10-04 18:44:47 +00:00
Teresa Johnson	19f517a7d7	Remove unused private field introduced by r249270. llvm-svn: 249277	2015-10-04 15:00:55 +00:00
Teresa Johnson	403a787e03	Support for function summary index bitcode sections and files. Summary: The bitcode format is described in this document: https://drive.google.com/file/d/0B036uwnWM6RWdnBLakxmeDdOeXc/view For more info on ThinLTO see: https://sites.google.com/site/llvmthinlto The first customer is ThinLTO, however the data structures are designed and named more generally based on prior feedback. There are a few comments regarding how certain interfaces are used by ThinLTO, and the options added here to gold currently have ThinLTO-specific names as the behavior they provoke is currently ThinLTO-specific. This patch includes support for generating per-module function indexes, the combined index file via the gold plugin, and several tests (more are included with the associated clang patch D11908). Reviewers: dexonsmith, davidxl, joker.eph Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13107 llvm-svn: 249270	2015-10-04 14:33:43 +00:00
Joerg Sonnenberger	726e624c0c	[SPARCv9] Add support for the rdpr/wrpr instructions. llvm-svn: 249262	2015-10-04 09:11:22 +00:00
Igor Breger	78741a1b1e	AVX512: Implemented encoding and intrinsics for VPERMILPS/PD instructions. Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D12690 llvm-svn: 249261	2015-10-04 07:20:41 +00:00
David Majnemer	161935520d	[WinEH] Permit branch folding in the face of funclets Track which basic blocks belong to which funclets. Permit branch folding to fire but only if it can prove that doing so will not cause code in one funclet to be reused in another. llvm-svn: 249257	2015-10-04 02:22:52 +00:00
Jeroen Ketema	321fc30afc	Fix typo in README llvm-svn: 249253	2015-10-04 00:46:16 +00:00
Simon Pilgrim	dde63374c5	[DAGCombiner] Generalize FADD constant combines to work with vectors Updated the FADD combines to work with vectors as well as scalars. Differential Revision: http://reviews.llvm.org/D13416 llvm-svn: 249251	2015-10-03 22:06:06 +00:00
Sanjay Patel	acd4baefca	include equal sign in debug equations; NFC llvm-svn: 249248	2015-10-03 20:45:01 +00:00
Simon Pilgrim	bc707d04a4	[X86] Lower SEXTLOAD using SIGN_EXTEND_VECTOR_INREG. NCI. The custom lowering in LowerExtendedLoad is doing the equivalent shuffle, so make use of existing lowering code to reduce duplication. llvm-svn: 249243	2015-10-03 18:55:43 +00:00
Rafael Espindola	28de224002	Move registerSection out of line and reduce #includes. NFC. llvm-svn: 249241	2015-10-03 18:28:40 +00:00
Simon Pilgrim	a38d76a087	[DAGCombiner] Merge SIGN_EXTEND_INREG vector constant folding methods. NCI. visitSIGN_EXTEND_INREG calls SelectionDAG::getNode to constant fold scalar constants but handles vector constants itself, despite getNode being capable of dealing with them. This required a minor change to the getNode implementation to actually deal with cases where the scalars of a BUILD_VECTOR were wider integers than the vector type - which was the only extra ability of the visitSIGN_EXTEND_INREG implementation. No codegen intended and all existing tests remain the same. llvm-svn: 249236	2015-10-03 16:26:52 +00:00
Kostya Serebryany	c8cd29fb7e	[libFuzzer] trying to fix at-exit hang llvm-svn: 249231	2015-10-03 07:02:05 +00:00
Dan Gohman	dc51b96b7f	[WebAssembly] Implement the remaining conversion operations. This is a temporary assembly syntax that will likely evolve along with broader upcoming syntax changes. llvm-svn: 249225	2015-10-03 02:10:28 +00:00
Rafael Espindola	81413c0ca0	Use early return. NFC. llvm-svn: 249224	2015-10-03 00:57:12 +00:00
Sanjoy Das	1cd930b05f	Try to appease MSVC, NFCI. This time by lifting the lambda's in `createNodeFromSelectLikePHI` to the file scope. Looks like there are differences in capture rules between clang and MSVC? llvm-svn: 249222	2015-10-03 00:34:19 +00:00
Tom Stellard	dc9088a10e	AMDGPU/SI: Remove unused tablegen multiclass Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13395 llvm-svn: 249221	2015-10-03 00:29:50 +00:00
Rafael Espindola	2f834372f4	Disallow assigning symbol a null section. They are constructed without one and they can't go back, so this was effectively dead code. llvm-svn: 249220	2015-10-03 00:18:14 +00:00
Sanjoy Das	21ea9bdc46	Try to appease the MSVC bots, NFCI. llvm-svn: 249219	2015-10-03 00:03:15 +00:00
Dan Gohman	6a050f30de	[WebAssembly] Rename setlocal to set_local to match the spec. llvm-svn: 249218	2015-10-03 00:01:53 +00:00
Sanjoy Das	5b92acea2b	Try to appease the MSVC bots, NFC. llvm-svn: 249216	2015-10-02 23:43:32 +00:00
Kostya Serebryany	20bb5e71b2	[libFuzzer] make LLVMFuzzerTestOneInput (the fuzzer target function) return int instead of void. The actual return value is not yet used (and expected to be 0). This change is API breaking, so the fuzzers will need to be updated. llvm-svn: 249214	2015-10-02 23:34:06 +00:00
Sanjoy Das	55015d210f	[SCEV] Recognize simple br-phi patterns Summary: Teach SCEV to match patterns like ``` br %cond, label %left, label %right left: br label %merge right: br label %merge merge: V = phi [ %x, %left ], [ %y, %right ] ``` as "select %cond, %x, %y". Before this SCEV would match PHI nodes exclusively to add recurrences. This addresses PR25005. Reviewers: joker.eph, joker-eph, atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13378 llvm-svn: 249211	2015-10-02 23:09:44 +00:00
Piotr Padlewski	dc9b2cfc50	invariant.group handling in GVN The most important part required to make clang devirtualization work ( ͡°͜ʖ ͡°). The code is able to find non local dependencies, but unfortunately because the caller can only handle local dependencies, I had to add some restrictions to look for dependencies only in the same BB. http://reviews.llvm.org/D12992 llvm-svn: 249196	2015-10-02 22:12:22 +00:00
Kostya Serebryany	65d0a1458f	[libFuzzer] remove experimental flag and functionality llvm-svn: 249194	2015-10-02 22:00:32 +00:00
Dan Gohman	e3e4a5ff52	[WebAssembly] Fix CFG stackification of nested loops. llvm-svn: 249187	2015-10-02 21:11:36 +00:00
Dan Gohman	9cc692b06e	[WebAssembly] Support calls marked as "tail", fastcc, and coldcc. llvm-svn: 249184	2015-10-02 20:54:23 +00:00
Richard Trieu	e0129e474d	Call the correct overload. Call the correct overload so a string literal does not get converted to a bool. Also fix the test case to match the names given. llvm-svn: 249183	2015-10-02 20:52:14 +00:00
Kostya Serebryany	b85db178a0	[libFuzzer] add a flag -max_total_time llvm-svn: 249181	2015-10-02 20:47:55 +00:00
Dan Gohman	baba8c648b	[WebAssembly] Add a resize_memory intrinsic. llvm-svn: 249178	2015-10-02 20:10:26 +00:00
Sanjoy Das	d0671346ae	[SCEV] Refactor out a createNodeForSelect Summary: We will shortly re-use this for select-like br-phi pairs. Reviewers: atrick, joker-eph, joker.eph Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D13377 llvm-svn: 249177	2015-10-02 19:39:59 +00:00
Dan Gohman	72f1692a2c	[WebAssembly] Add a memory_size intrinsic. llvm-svn: 249171	2015-10-02 19:21:15 +00:00
Matt Arsenault	d092a068ba	AMDGPU/SI: Add verifier check for exec reads Make sure we aren't accidentally not setting these in the instruction definitions. llvm-svn: 249170	2015-10-02 18:58:37 +00:00
Sanjoy Das	7d910f2b11	[SCEV] Try to prove predicates by splitting them Summary: This change teaches SCEV that to prove `A u< B` it is sufficient to prove each of these facts individually: - B >= 0 - A s< B - A >= 0 In practice, SCEV sometimes finds it easier to prove these facts individually than to prove `A u< B` as one atomic step. Reviewers: reames, atrick, nlewycky, hfinkel Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D13042 llvm-svn: 249168	2015-10-02 18:50:30 +00:00
Roman Divacky	4b5507a037	Actually switch the arch when we see .arch. PR21695 llvm-svn: 249165	2015-10-02 18:25:25 +00:00
Tim Northover	8d67b8e053	ARM: diagnose invalid local fixups on Thumb1 We previously stopped producing Thumb2 relaxations when they weren't supported, but only diagnosed the case where an actual relocation was produced. We should also tell people if local symbols aren't going to work rather than silently overflowing. llvm-svn: 249164	2015-10-02 18:07:18 +00:00
Tim Northover	956b008db6	ARM: correctly align constant pool value on Thumb1 targets. Since we're using tLDRpci to access it, the constant pool's address must be 0 (mod 4). llvm-svn: 249163	2015-10-02 18:07:13 +00:00
Chad Rosier	1f385618c0	[ARM] Typo. NFC. llvm-svn: 249153	2015-10-02 16:42:59 +00:00
Andrea Di Biagio	77f62652c1	Reapply r249121 : "[FastISel][x86] Teach how to select SSE2/AVX bitcasts between 128/256-bit vector types." This patch teaches FastIsel the following two things: 1) On SSE2, no instructions are needed for bitcasts between 128-bit vector types; 2) On AVX, no instructions are needed for bitcasts between 256-bit vector types. Example: %1 = bitcast <4 x i32> %V to <2 x i64> Before (-fast-isel -fast-isel-abort=1): FastIsel miss: %1 = bitcast <4 x i32> %V to <2 x i64> Now we don't fall back to SelectionDAG and we correctly fold that computation propagating the register associated to %V. Originally reviewed here: http://reviews.llvm.org/D13347 llvm-svn: 249147	2015-10-02 16:08:05 +00:00
Andrea Di Biagio	45874e67a1	Revert: [FastISel][x86] Teach how to select SSE2/AVX bitcasts between 128/256-bit vector types. r249121 caused a Clang test failure (avx2-buitins.c). Revert r249121 while I keep investigating on the reason why that test failed. llvm-svn: 249124	2015-10-02 13:06:19 +00:00
Zoran Jovanovic	9ffdfa5986	[mips][microMIPS] Fix an issue with selecting sqrt instruction in LLVM backend Differential Revision: http://reviews.llvm.org/D13235 llvm-svn: 249123	2015-10-02 13:06:02 +00:00
Andrea Di Biagio	cb33456122	[FastISel][x86] Teach how to select SSE2/AVX bitcasts between 128/256-bit vector types. This patch teaches FastIsel the following two things: 1) On SSE2, no instructions are needed for bitcasts between 128-bit vector types; 2) On AVX, no instructions are needed for bitcasts between 256-bit vector types. Example: %1 = bitcast <4 x i32> %V to <2 x i64> Before (-fast-isel -fast-isel-abort=1): FastIsel miss: %1 = bitcast <4 x i32> %V to <2 x i64> Now we don't fall back to SelectionDAG and we correctly fold that computation propagating the register associated to %V. Differential Revision: http://reviews.llvm.org/D13347 llvm-svn: 249121	2015-10-02 12:45:37 +00:00
Ivan Krasin	95e82d5b48	[LibFuzzer] test_single_input option to run a single test case. -test_single_input flag specifies a file name with test data. Review URL: http://reviews.llvm.org/D13359 Patch by Mike Aizatsky! llvm-svn: 249096	2015-10-01 23:23:06 +00:00
Bruno Cardoso Lopes	b491a2d641	[SimplifyLibCalls] Fix instruction misplacement in string/memory libcall optimization When trying to optimize fortified library functions use the right location to insert new instructions in order to preserve correct def-use order. This fixes an issue where a misplaced instruction definition would happen to be after one of its use after a RAUW, forming invalid IR. This behavior was introduced by r227250. Differential Revision: http://reviews.llvm.org/D13301 rdar://problem/22802369 llvm-svn: 249092	2015-10-01 22:43:53 +00:00
Matt Arsenault	b733f00510	AMDGPU: Fix unused variable warning in release build llvm-svn: 249091	2015-10-01 22:40:35 +00:00
Matt Arsenault	b87fc22915	AMDGPU: Move SIFixSGPRLiveRanges to be a regalloc pass Replace LiveInterval usage with LiveVariables. LiveIntervals computes far more information than is needed for this pass which just needs to find if an SGPR is live out of the defining block. LiveIntervals are not usually available that early, requiring computing them twice which is very expensive. The extra run of LiveIntervals/LiveVariables/SlotIndexes was costing in total about 5% of compile time. Continuing to use LiveIntervals is problematic. It seems there is an option (early-live-intervals) to run the analysis about where it should go to avoid recomputing LiveVariables, but it seems to be completely broken with subreg liveness enabled. There are also problems from trying to recompute LiveIntervals since this seems to undo LiveVariables and clearing kill flags, causing TwoAddressInstructions to make bad decisions. Insert the pass right after live variables and preserve it. The tricky case to worry about might be phis since LiveVariables doesn't count a register as live out if in the successor block it is only used in a phi, but I don't think this is a concern right now because SIFixSGPRCopies replaces SGPR phis. llvm-svn: 249087	2015-10-01 22:10:03 +00:00
Joerg Sonnenberger	c8d50d6347	Fix relocation used for GOT references in non-PIC mode. Fix relocations for "set" pseudo op in PIC mode. Differential Revision: http://reviews.llvm.org/D13173 llvm-svn: 249086	2015-10-01 22:08:20 +00:00
Matt Arsenault	d2c7589f93	AMDGPU: Merge if and switch llvm-svn: 249082	2015-10-01 21:51:59 +00:00
Matt Arsenault	db7f0ef367	AMDGPU: Remove dead code There's no point in checking VReg_1 because all uses of it should already have been removed by SILowerI1Copies. llvm-svn: 249081	2015-10-01 21:51:57 +00:00
Matt Arsenault	d1d499aa56	AMDGPU: Make SIInsertWaits about a factor of 4 faster This was the slowest target custom pass and was spending 80% of the time in getMinimalPhysRegClass which was called for every register operand. Try to use the statically known register class when possible from the instruction's MCOperandInfo. There are a few pseudo instructions which are not well behaved with unknown register classes which still require the expensive physical register class search. There are a few other possibilities for making this even faster, such as not inspecting implicit operands. For now those are checked because it is technically possible to have a scalar load into exec or vcc which can be implicitly used. llvm-svn: 249079	2015-10-01 21:43:15 +00:00
Reid Kleckner	fc64fae6e3	[WinEH] Emit __C_specific_handler tables for the new IR We emit denormalized tables, where every range of invokes in the same state gets a complete list of EH action entries. This is significantly simpler than trying to infer the correct nested scoping structure from the MI. Fortunately, for SEH, the nesting structure is really just a size optimization. With this, some basic __try / __except examples work. llvm-svn: 249078	2015-10-01 21:38:24 +00:00
Tom Stellard	e9f8b24985	AMDGPU/SI: Remove assert from AMDGPUOpenCLImageTypeLowering pass Summary: Instead of asserting when the kernel metadata is different than we expect, we should just skip lowering that function. This fixes assertion failures with OpenCL argument metadata from older LLVM releases. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13356 llvm-svn: 249073	2015-10-01 21:16:05 +00:00
David Majnemer	4600c06434	[WinEH] Stop BranchFolding from merging across funclets BranchFolding would merge two funclets together, this is not OK. Disable this and strengthen the assertion in FuncletLayout. llvm-svn: 249069	2015-10-01 21:04:13 +00:00
David Majnemer	f828a0ccc7	[WinEH] Make FuncletLayout more robust against catchret Catchret transfers control from a catch funclet to an earlier funclet. However, it is not completely clear which funclet the catchret target is part of. Make this clear by stapling the catchret target's funclet membership onto the CATCHRET SDAG node. llvm-svn: 249052	2015-10-01 18:44:59 +00:00
Chad Rosier	f11d040f01	[AArch64] Deprecate a command-line option used for testing. Support for pairing unscaled loads and stores has been enabled since the original ARM64 port. This feature is no longer experimental, AFAICT. llvm-svn: 249049	2015-10-01 18:17:12 +00:00
Jonas Paulsson	12629324a4	[SystemZ] Add some generic (floating point support) load instructions. Add generic instructions for load complement, load negative and load positive for fp32 and fp64, and let isel prefer them. They do not clobber CC, and so give scheduler more freedom. SystemZElimCompare pass will convert them when it can to the CC-setting variants. Regression tests updated to expect the new opcodes in places where the old ones were used. New test case SystemZ/fp-cmp-05.ll checks that SystemZCompareElim.cpp can handle the new opcodes. README.txt updated (bullet removed). Note that fp128 is not yet handled, because it is relatively rare, and is a bit trickier, because of the fact that l.dfr would operate on the sign bit of one of the subregisters of a fp128, but we would not want to copy the other sub-reg in case src and dst regs are not the same. Reviewed by Ulrich Weigand. llvm-svn: 249046	2015-10-01 18:12:28 +00:00
Tom Stellard	e0e582c9aa	AMDGPU: Add MEM_RAT STORE_TYPED. v2: Add test (Matt). Fix capitalization of isEOP (Matt). Move pattern to class parameter (Matt). Make the instruction available to Cayman (Matt). Change name from MEM_RAT WRITE_TYPED to MEM_RAT STORE_TYPED. Patch by: Zoltan Gilian llvm-svn: 249042	2015-10-01 17:51:34 +00:00
Tom Stellard	c0f0fba2c4	AMDGPU: Factor out EOP query. v2: Fix brace placement and capitalization (Matt). Patch by: Zoltan Gilian llvm-svn: 249041	2015-10-01 17:51:29 +00:00
NAKAMURA Takumi	096492a07b	Reformat. llvm-svn: 249033	2015-10-01 17:01:03 +00:00
NAKAMURA Takumi	1ed20db720	Revert r248959, "[WinEH] Emit int3 after noreturn calls on Win64" It broke; LLVM :: CodeGen__Generic__2009-11-16-BadKillsCrash.ll llvm-svn: 249032	2015-10-01 17:00:56 +00:00
Arnaud A. de Grandmaison	849f3bf8c9	[InstCombine] Remove trivially empty lifetime start/end ranges. Summary: Some passes may open up opportunities for optimizations, leaving empty lifetime start/end ranges. For example, with the following code: void foo(char *, char *); void bar(int Size, bool flag) { for (int i = 0; i < Size; ++i) { char text[1]; char buff[1]; if (flag) foo(text, buff); // BBFoo } } the loop unswitch pass will create 2 versions of the loop, one with flag==true, and the other one with flag==false, but always leaving the BBFoo basic block, with lifetime ranges covering the scope of the for loop. Simplify CFG will then remove BBFoo in the case where flag==false, but will leave the lifetime markers. This patch teaches InstCombine to remove trivially empty lifetime marker ranges, that is ranges ending right after they were started (ignoring debug info or other lifetime markers in the range). This fixes PR24598: excessive compile time after r234581. Reviewers: reames, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13305 llvm-svn: 249018	2015-10-01 14:54:31 +00:00
Ulrich Weigand	cf1670a095	[SystemZ] Add assembly instructions for obtaining clock values as well as CPU features Provide assembler support for STCK, STCKF, STCKE, and STFLE. Author: joncmu Differential Revision: http://reviews.llvm.org/D13299 llvm-svn: 249015	2015-10-01 14:43:48 +00:00
Chad Rosier	b7c5b91068	[AArch64] Hoist commonly failing check. NFC. llvm-svn: 249011	2015-10-01 13:43:05 +00:00
Chad Rosier	0b15e7c618	[AArch64] Rename variable to improve readability. NFC. llvm-svn: 249008	2015-10-01 13:33:31 +00:00
Chad Rosier	7a83d770ae	[AArch64] Update comment to reflect reality. llvm-svn: 249007	2015-10-01 13:09:44 +00:00
Zoran Jovanovic	2960f3a346	[mips][microMIPS] Implement CACHEE, WRPGPR and WSBH instructions Differential Revision: http://reviews.llvm.org/D10337 llvm-svn: 249004	2015-10-01 12:49:27 +00:00
Scott Douglass	290183d734	[ARM] More care with Thumb1 writeback in ARMLoadStoreOptimizer Differential Revision: http://reviews.llvm.org/D13240 llvm-svn: 249002	2015-10-01 11:56:19 +00:00
Jingyue Wu	df1a1b113b	[NaryReassociate] SeenExprs records WeakVH Summary: The instructions SeenExprs records may be deleted during rewriting. FindClosestMatchingDominator should ignore these deleted instructions. Fixes PR24301. Reviewers: grosser Subscribers: grosser, llvm-commits Differential Revision: http://reviews.llvm.org/D13315 llvm-svn: 248983	2015-10-01 03:51:44 +00:00
Keno Fischer	17433bd102	Fix performance problem in long-running SectionMemoryManagers Summary: Without this patch, the memory manager would call `mprotect` on every memory region it ever allocated whenever it wanted to finalize memory (i.e. not just the ones it just allocated). This caused terrible performance problems for long running memory managers. In one particular compile heavy julia benchmark, we were spending 50% of time in `mprotect` if running under MCJIT. Fix this by splitting allocated memory blocks into those on which memory permissions have been set and those on which they haven't and only running `mprotect` on the latter. Reviewers: lhames Subscribers: reames, llvm-commits Differential Revision: http://reviews.llvm.org/D13156 llvm-svn: 248981	2015-10-01 02:45:07 +00:00
Tom Stellard	1f0e7bbc5b	AMDGPU/SI: Re-order PreloadedValue enum and number entries based on init order Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D12451 llvm-svn: 248978	2015-10-01 02:02:46 +00:00
Dehao Chen	7c41dd6498	Update sample profile propagation algorithm. http://reviews.llvm.org/D13218 llvm-svn: 248968	2015-10-01 00:26:56 +00:00
Ahmed Bougacha	23a0d1a1d6	[X86] Don't custom-lower vNi32 uint_to_fp when unsafe-fp-math. The custom code produces incorrect results if later reassociated. Since r221657, on x86, vNi32 uitofp is lowered using an optimized sequence: movdqa LCPI0_0(%rip), %xmm1 ## xmm1 = [65535, ...] pand %xmm0, %xmm1 por LCPI0_1(%rip), %xmm1 ## [0x4b000000, ...] psrld $16, %xmm0 por LCPI0_2(%rip), %xmm0 ## [0x53000000, ...] addps LCPI0_3(%rip), %xmm0 ## [float -5.497642e+11, ...] addps %xmm1, %xmm0 Since r240361, the machine combiner opportunistically reassociates 2-instruction sequences (with -ffast-math). In the new code sequence, the ADDPS' are eligible. In isolation, for simple examples (without reassociable users), this makes no performance difference (the goal being to enable reassociation of longer chains). In the trivial example (just one uitofp), the reassociation doesn't happen, because (I think) it would require the emission of a separate movaps for a constantpool load (instead of folding it into addps). However, when we have multiple uitofp sequences, and the constantpool loads are CSE'd earlier, the machine combiner can do the reassociation. When the ADDPS' are reassociated, the resulting sequence isn't correct anymore, as we'd be adding large (2^39) constants with comparatively smaller values (~2^23). Given that two of the three inputs are powers of 2 larger than 2^16, and that ulp(2^39) == 2^(39-24) == 2^15, the reassociated chain will produce 0 for any input in [0, 2^14[. In my testing, it also produces wrong results for 99.5% of [0, 2^32[. Avoid this by disabling the new lowering when -ffast-math. It does mean that we'll get slower code than without it, but at least we won't get egregiously incorrect code. One might argue that, considering -ffast-math is all but meaningless, uitofp producing wrong results isn't a compiler bug. But it really is. Fixes PR24512. ...though this is really more of a workaround. Ideally, we'd have some sort of Machine FMF, but that's a problem that's not worth tackling until we do more with machine IR. llvm-svn: 248965	2015-10-01 00:11:07 +00:00
Reid Kleckner	6dec87a8a0	[WinEH] Emit int3 after noreturn calls on Win64 The Win64 unwinder disassembles forwards from each PC to try to determine if this PC is in an epilogue. If so, it skips calling the EH personality function for that frame. Typically, this means you cannot catch an exception in the same frame that you threw it, because 'throw' calls a noreturn runtime function. Previously we avoided this problem with the TrapUnreachable TargetOption, but that's a much bigger hammer than we need. All we need is a 1 byte non-epilogue instruction right after the call. Instead, what we got was an unconditional branch to a shared block containing the ud2, potentially 7 bytes instead of 1. So, this reverts r206684, which added TrapUnreachable, and replaces it with something better. The new code pattern matches for invoke/call followed by unreachable and inserts an int3 into the DAG. To be 100% watertight, we would need to insert SEH_Epilogue instructions into all basic blocks ending in a call with no terminators or successors, but in practice this is unlikely to come up. llvm-svn: 248959	2015-09-30 23:09:23 +00:00
Sanjay Patel	a114a10bbe	[x86] enable machine combiner reassociations for 256-bit vector logical integer insts llvm-svn: 248955	2015-09-30 22:25:55 +00:00
Kostya Serebryany	3287d7a6ed	[libFuzzer] Marking exported symbols as visible. Patch by Mike Aizatsky llvm-svn: 248954	2015-09-30 22:22:37 +00:00
Michael Zolotukhin	fc783e91e0	[SLP] Don't vectorize loads of non-packed types (like i1, i2). Summary: Given an array of i2 elements, 4 consecutive scalar loads will be lowered to i8-sized loads and thus will access 4 consecutive bytes in memory. If we vectorize these loads into a single <4 x i2> load, it'll access only 1 byte in memory. Hence, we should prohibit vectorization in such cases. PS: Initial patch was proposed by Arnold. Reviewers: aschwaighofer, nadav, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13277 llvm-svn: 248943	2015-09-30 21:05:43 +00:00
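The size mismatch behind r248943 is simple arithmetic. The sketch below assumes each value is stored in the smallest whole number of bytes that holds its bits; `store_size_bytes` is a hypothetical helper, not an LLVM API.

```python
def store_size_bytes(bits):
    """Bytes touched when a value of the given bit width is loaded or stored."""
    return (bits + 7) // 8

# Four consecutive scalar i2 loads: each one is widened to an i8 access.
scalar_bytes = 4 * store_size_bytes(2)

# One <4 x i2> vector load: 8 bits in total.
vector_bytes = store_size_bytes(4 * 2)

print(scalar_bytes, vector_bytes)  # 4 1
```

Because the two forms read different memory ranges, replacing the scalar loads with the vector load would change behavior.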
David Blaikie	757908e545	Fix -Wsign-compare warning llvm-svn: 248942	2015-09-30 20:37:48 +00:00
Evgeniy Stepanov	f608111d1b	Fix debug info with SafeStack. llvm-svn: 248933	2015-09-30 19:55:43 +00:00
Chad Rosier	11c825f7db	[AArch64] Remove an unnecessary restriction on pre-index instructions. Previously, the index was constrained to the size of the memory operation for no apparent reason. This change removes that constraint so that we can form pre-index instructions with any valid offset. llvm-svn: 248931	2015-09-30 19:44:40 +00:00
Fiona Glaser	b0c6d9174e	DeadCodeElimination: rewrite to be faster Same strategy as simplifyInstructionsInBlock. ~1/3 less time on my test suite. This pass doesn't have many in-tree users, but getting rid of an O(N^2) worst case and making it cleaner should at least make it a viable alternative to ADCE, since it's now consistently somewhat faster. llvm-svn: 248927	2015-09-30 17:49:49 +00:00
Hal Finkel	4c45775880	[PowerPC] Disable shrink wrapping Shrink wrapping is causing a self-hosting failure on PPC64/Linux. Disable for now until the problem can be fixed. llvm-svn: 248924	2015-09-30 17:29:03 +00:00
Artyom Skrobov	72ca6b8f3f	[ARM] Support for ARMv6-Z / ARMv6-ZK missing As Richard Barton observed at http://reviews.llvm.org/D12937#inline-107121 TargetParser in LLVM has insufficient support for ARMv6Z and ARMv6ZK. In particular, there were no tests for TrustZone being supported in these architectures. The patch clears a FIXME: left by Saleem Abdulrasool in r201471, and fixes his test case which hadn't really been testing what it was claiming to test. Differential Revision: http://reviews.llvm.org/D13236 llvm-svn: 248921	2015-09-30 17:25:52 +00:00
Erik Eckstein	848c1aa452	SLPVectorizer: limit the scheduling region size per basic block. Usually large blocks are not a problem. But if a large block (> 10k instructions) contains many (potential) chains of vector instructions, and those chains are spread over a wide range of instructions, then scheduling becomes a compile time problem. This change introduces a limit for the accumulate scheduling region size of a block. For real-world functions this limit will never be exceeded (it's about 10x larger than the maximum value seen in the test-suite and external test suite). llvm-svn: 248917	2015-09-30 17:00:44 +00:00
Chad Rosier	4f04e2ec87	[AArch64] Use helper function to improve readability. NFC. llvm-svn: 248914	2015-09-30 16:50:41 +00:00
Andrea Di Biagio	0594e2a1e9	[InstCombine] Teach how to convert SSSE3/AVX2 byte shuffles to builtin shuffles if the shuffle mask is constant. This patch teaches InstCombiner how to convert a SSSE3/AVX2 byte shuffle to a builtin shuffle if the mask is constant. Converting byte shuffle intrinsic calls to builtin shuffles can help finding more opportunities for combining shuffles later on in selection dag. We may end up with byte shuffles with constant masks as the result of inlining. Differential Revision: http://reviews.llvm.org/D13252 llvm-svn: 248913	2015-09-30 16:44:39 +00:00
Artur Pilipenko	029d8531e6	Refactor computeKnownBits alignment handling code Reviewed By: reames, hfinkel Differential Revision: http://reviews.llvm.org/D12958 llvm-svn: 248892	2015-09-30 11:55:45 +00:00
Jeroen Ketema	ab99b59e8c	[ARM][NEON] Use address space in vld([1234]|[234]lane) and vst([1234]|[234]lane) instructions This commit changes the interface of the vld[1234], vld[234]lane, and vst[1234], vst[234]lane ARM neon intrinsics and associates an address space with the pointer that these intrinsics take. This changes, e.g., <2 x i32> @llvm.arm.neon.vld1.v2i32(i8, i32) to <2 x i32> @llvm.arm.neon.vld1.v2i32.p0i8(i8, i32) This change ensures that address spaces are fully taken into account in the ARM target during lowering of interleaved loads and stores. Differential Revision: http://reviews.llvm.org/D12985 llvm-svn: 248887	2015-09-30 10:56:37 +00:00
Simon Pilgrim	3d11c994f7	[X86][XOP] Added support for the lowering of 128-bit vector shifts to XOP shift instructions The XOP shifts just have logical/arithmetic versions and the left/right shifts are controlled by whether the value is positive/negative. Because of this I've added new X86ISD nodes instead of trying to force them to use the existing shift nodes. Additionally Excavator cores (bdver4) support XOP and AVX2 - meaning that it should use the AVX2 shifts when it can and fall back to XOP in other cases. Differential Revision: http://reviews.llvm.org/D8690 llvm-svn: 248878	2015-09-30 08:17:50 +00:00
Justin Bogner	75df7187f3	InstrProf: Don't call std::unique twice here llvm-svn: 248872	2015-09-30 02:02:08 +00:00
Dehao Chen	6722688eaa	http://reviews.llvm.org/D13145 Support hierarchical sample profile format. llvm-svn: 248865	2015-09-30 00:42:46 +00:00
Evgeniy Stepanov	d3f544f271	[safestack] Fix a stupid mix-up in the direct-tls code path. llvm-svn: 248863	2015-09-30 00:01:47 +00:00
Marek Olsak	d1a69a2839	AMDGPU/SI: Don't set DATA_FORMAT if ADD_TID_ENABLE is set to prevent setting a huge stride, because DATA_FORMAT has a different meaning if ADD_TID_ENABLE is set. This is a candidate for stable llvm 3.7. Tested-and-Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 248858	2015-09-29 23:37:32 +00:00
Reid Kleckner	a13dfd539b	[WinEH] Setup RBP correctly in Win64 funclet prologues Previously local variable captures just didn't work in 64-bit. Now we can access local variables more or less correctly. llvm-svn: 248857	2015-09-29 23:32:01 +00:00
David Majnemer	91b0ab9172	[WinEH] Ensure that funclets obey the x64 ABI The x64 ABI requires that epilogues do not contain code other than stack adjustments and some limited control flow. However, we'd insert code to initialize the return address after stack adjustments. Instead, insert EAX/RAX with the current value before we create the stack adjustments in the epilogue. llvm-svn: 248839	2015-09-29 22:33:36 +00:00
Justin Bogner	9e9a057a9b	InstrProf: Support for value profiling in the indexed profile format Add support to the indexed instrprof reader and writer for the format that will be used for value profiling. Patch by Betul Buyukkurt, with minor modifications. llvm-svn: 248833	2015-09-29 22:13:58 +00:00
Maksim Panchenko	cce239c45d	HHVM calling conventions. HHVM calling convention, hhvmcc, is used by HHVM JIT for functions in translated cache. We currently support LLVM back end to generate code for X86-64 and may support other architectures in the future. In HHVM calling convention any GP register could be used to pass and return values, with the exception of R12 which is reserved for thread-local area and is callee-saved. Other than R12, we always pass RBX and RBP as args, which are our virtual machine's stack pointer and frame pointer respectively. When we enter translation cache via hhvmcc function, we expect the stack to be aligned at 16 bytes, i.e. skewed by 8 bytes as opposed to standard ABI alignment. This affects stack object alignment and stack adjustments for function calls. One extra calling convention, hhvm_ccc, is used to call C++ helpers from HHVM's translation cache. It is almost identical to standard C calling convention with an exception of first argument which is passed in RBP (before we use RDI, RSI, etc.) Differential Revision: http://reviews.llvm.org/D12681 llvm-svn: 248832	2015-09-29 22:09:16 +00:00
Chad Rosier	4315012769	[AArch64] Add support for pre- and post-index LDPSWs. llvm-svn: 248825	2015-09-29 20:39:55 +00:00
David Majnemer	a80c151286	[WinEH] Teach AsmPrinter about funclets Summary: Funclets have been turned into functions by the time they hit the object file. Make sure that they have decent names for the symbol table and CFI directives explaining how to reason about their prologues. Differential Revision: http://reviews.llvm.org/D13261 llvm-svn: 248824	2015-09-29 20:12:33 +00:00
Cong Hou	166e08542e	Rename some function arguments in MachineBasicBlock.cpp/h by turning the first letter into upper case. NFC. llvm-svn: 248821	2015-09-29 19:46:09 +00:00
Dehao Chen	8e7df83e6a	http://reviews.llvm.org/D13231 Change lookup functions to const functions. llvm-svn: 248818	2015-09-29 18:28:15 +00:00
Chad Rosier	dabe2534ed	[AArch64] Add integer pre- and post-index halfword/byte loads and stores. llvm-svn: 248817	2015-09-29 18:26:15 +00:00
Dehao Chen	028e122ca9	Revert r248810 which breaks tests. llvm-svn: 248814	2015-09-29 18:18:49 +00:00
Dehao Chen	410a25aa7a	http://reviews.llvm.org/D13231 Change lookup functions to const functions. llvm-svn: 248810	2015-09-29 17:59:58 +00:00
Nemanja Ivanovic	2c84b29464	Addition of interfaces the BE to conform to Table A-2 of ELF V2 ABI V1.1 This patch corresponds to review: http://reviews.llvm.org/D13191 Back end portion of the fifth round of additions to altivec.h. llvm-svn: 248809	2015-09-29 17:41:53 +00:00
Chad Rosier	32d4d37e61	[AArch64] Scale offsets by the size of the memory operation. NFC. The immediate in the load/store should be scaled by the size of the memory operation, not the size of the register being loaded/stored. This change gets us one step closer to forming LDPSW instructions. This change also enables pre- and post-indexing for halfword and byte loads and stores. llvm-svn: 248804	2015-09-29 16:07:32 +00:00
Igor Laevsky	cea9ede74e	[ValueTracking] Lower dom-conditions-dom-blocks and dom-conditions-max-uses thresholds On some of our benchmarks this change shows about 50% compile time improvement without any noticeable performance difference. Differential Revision: http://reviews.llvm.org/D13248 llvm-svn: 248801	2015-09-29 14:57:52 +00:00
Chad Rosier	a4d3217e81	[AArch64] Remove some redundant cases. NFC. llvm-svn: 248800	2015-09-29 14:57:10 +00:00
James Molloy	897048bee3	[ValueTracking] Teach isKnownNonZero about monotonically increasing PHIs If a PHI starts at a non-negative constant, monotonically increases (only adds of a constant are supported at the moment) and that add does not wrap, then the PHI is known never to be zero. llvm-svn: 248796	2015-09-29 14:08:45 +00:00
Jeroen Ketema	740f9d79ca	Arguments spilled on the stack before a function call may have alignment requirements, for example in the case of vectors. These requirements are exploited by the code generator by using move instructions that have similar alignment requirements, e.g., movaps on x86. Although the code generator properly aligns the arguments with respect to the displacement of the stack pointer it computes, the displacement itself may cause misalignment. For example if we have %3 = load <16 x float>, <16 x float>* %1, align 64 call void @bar(<16 x float> %3, i32 0) the x86 back-end emits: movaps 32(%ecx), %xmm2 movaps (%ecx), %xmm0 movaps 16(%ecx), %xmm1 movaps 48(%ecx), %xmm3 subl $20, %esp <-- if %esp was 16-byte aligned before this instruction, it no longer will be afterwards movaps %xmm3, (%esp) <-- movaps requires 16-byte alignment, while %esp is not aligned as such. movl $0, 16(%esp) calll __bar To solve this, we need to make sure that the computed value with which the stack pointer is changed is a multiple of the maximal alignment seen during its computation. With this change we get proper alignment: subl $32, %esp movaps %xmm3, (%esp) Differential Revision: http://reviews.llvm.org/D12337 llvm-svn: 248786	2015-09-29 10:12:57 +00:00
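The fix in r248786 amounts to rounding the stack adjustment up to a multiple of the largest alignment seen. A minimal sketch, with `align_to` and `stack_adjustment` as hypothetical names rather than functions from the x86 back-end:

```python
def align_to(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment

def stack_adjustment(arg_bytes, max_alignment):
    """Bytes to subtract from the stack pointer so aligned spills stay aligned."""
    return align_to(arg_bytes, max_alignment)

# 16 bytes for the spilled %xmm3 plus 4 bytes for the i32, max alignment 16.
print(stack_adjustment(20, 16))  # 32
```

This matches the `subl $20` to `subl $32` change shown in the commit message.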
Simon Pilgrim	43f5e0848e	[InstCombine] Improve Vector Demanded Bits Through Bitcasts Currently SimplifyDemandedVectorElts can only peek through bitcasts if the vectors have the same number of elements. This patch fixes and enables some existing (disabled) code to support bitcasting to vectors with more/fewer elements. It currently only accepts cases when vectors alias cleanly (i.e. number of elements are an exact multiple of the other vector). This was added to improve the demanded vector elements support for SSE vector shifts which require the __m128i (<2 x i64>) argument type to be bitcast to the vector type for the builtin shift. I've added extra tests for various additional bitcasts. Differential Revision: http://reviews.llvm.org/D12935 llvm-svn: 248784	2015-09-29 08:19:11 +00:00
Chen Li	9f27fc0599	[LoopUnswitch] Add block frequency analysis to recognize hot/cold regions Summary: This patch adds block frequency analysis to LoopUnswitch pass to recognize hot/cold regions. For cold regions the pass only performs trivial unswitches since they do not increase code size, and for hot regions everything works as before. This helps to minimize code growth in cold regions and be more aggressive in hot regions. Currently the default cold regions are blocks with frequencies below 20% of function entry frequency, and it can be adjusted via -loop-unswitch-cold-block-frequency flag. The entire feature is controlled via -loop-unswitch-with-block-frequency flag and it is off by default. Reviewers: broune, silvas, dnovillo, reames Subscribers: davidxl, llvm-commits Differential Revision: http://reviews.llvm.org/D11605 llvm-svn: 248777	2015-09-29 05:03:32 +00:00
NAKAMURA Takumi	0c12a3949e	[CMake] X86AsmParser: Prune redundant LINK_LIBS. It is described in LLVMBuild.txt. llvm-svn: 248771	2015-09-29 01:25:01 +00:00
Evgeniy Stepanov	d8b86f7cdc	Move dbg.declare intrinsics when merging and replacing allocas. Place new and update dbg.declare calls immediately after the corresponding alloca. Current code in replaceDbgDeclareForAlloca puts the new dbg.declare at the end of the basic block. LLVM codegen has problems emitting debug info in a situation when dbg.declare appears after all uses of the variable. This usually kinda works for inlining and ASan (two users of this function) but not for SafeStack (see the pending change in http://reviews.llvm.org/D13178). llvm-svn: 248769	2015-09-29 00:30:19 +00:00
Matthias Braun	99ae16217e	RegisterPressure: LiveRegSet tracks register units not physregs There are always more physical registers and register units so the previous behaviour was correct but we can do with less memory. llvm-svn: 248767	2015-09-29 00:20:32 +00:00
Reid Kleckner	c71d6275ca	[WinEH] Fix ip2state table emission with funclets Previously we were hijacking the old LandingPadInfo data structures to communicate our state numbers. Now we don't need that anymore. llvm-svn: 248763	2015-09-28 23:56:30 +00:00
Richard Trieu	e778e87d2a	Fix unused variable warning in non-debug builds. llvm-svn: 248754	2015-09-28 22:54:43 +00:00
Sanjay Patel	4e6527682a	tidy up comments; NFC llvm-svn: 248750	2015-09-28 22:14:51 +00:00
Sanjay Patel	3a14f1a338	add a FIXME for a CPU model check that should have an attribute instead llvm-svn: 248746	2015-09-28 22:00:24 +00:00
Sanjay Patel	5e5f0e9756	move one-use check under the comment that describes it; NFCI llvm-svn: 248745	2015-09-28 21:44:46 +00:00
Sanjoy Das	4f1c45952c	[SCEV] Don't crash on pointer comparisons `ScalarEvolution::isImpliedCondOperandsViaNoOverflow` tries to cast the operand type of the comparison it is given to an `IntegerType`. This is incorrect because it could actually be simplifying a comparison between two pointers. Switch it to using `getTypeSizeInBits` instead, which does the right thing for both pointers and integers. Fixed PR24956. llvm-svn: 248743	2015-09-28 21:14:32 +00:00
Matt Arsenault	ba6aae785a	AMDGPU: Factor switch into separate function llvm-svn: 248742	2015-09-28 20:54:57 +00:00
Matt Arsenault	73aa8f687a	AMDGPU: Fix splitting x16 SMRD loads When used recursively, this would set the kill flag on the intermediate step from first splitting x16 to x8. llvm-svn: 248741	2015-09-28 20:54:52 +00:00
Matt Arsenault	e5d042cd56	AMDGPU: Fix moving SMRD loads with literal offsets on CI llvm-svn: 248740	2015-09-28 20:54:46 +00:00
Matt Arsenault	dd49c5fc1b	AMDGPU: Fix splitting SMRD with large offset The splitting of > 4 dword SMRD instructions if using an offset in an SGPR instead of an immediate was not setting the destination register, resulting in an instruction missing an operand which would assert later. Test will be included in a following commit which fixes a related issue. llvm-svn: 248739	2015-09-28 20:54:42 +00:00
Andrew Kaylor	16c4da03d5	Improved the interface of methods commuting operands, improved X86-FMA3 mem-folding&coalescing. Patch by Slava Klochkov (vyacheslav.n.klochkov@intel.com) Differential Revision: http://reviews.llvm.org/D11370 llvm-svn: 248735	2015-09-28 20:33:22 +00:00
Sean Silva	ace7818ce6	[GlobalOpt] Sort members of llvm.used deterministically Patch by Jake VanAdrighem! Summary: Fix the way we sort the llvm.used and llvm.compiler.used members. This bug seems to have been introduced in rL183756 through a set of improper casts to GlobalValue*. In subsequent patches this problem was missed and transformed into a getName call on a ConstantExpr. Reviewers: silvas Subscribers: silvas, llvm-commits Differential Revision: http://reviews.llvm.org/D12851 llvm-svn: 248728	2015-09-28 19:02:11 +00:00
Fiona Glaser	f74cc40e34	Improve performance of SimplifyInstructionsInBlock 1. Use a worklist, not a recursive approach, to avoid needless revisitation and being repeatedly forced to jump back to the start of the BB if a handle is invalidated. 2. Only insert operands to the worklist if they become unused after a dead instruction is removed, so we don’t have to visit them again in most cases. 3. Use a SmallSetVector to track the worklist. 4. Instead of pre-initting the SmallSetVector like in DeadCodeEliminationPass, only put things into the worklist if they have to be revisited after the first run-through. This minimizes how much the actual SmallSetVector gets used, which saves a lot of time. llvm-svn: 248727	2015-09-28 18:56:07 +00:00
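The worklist strategy listed in r248727 can be sketched on a toy instruction list. This is an illustration only: it removes dead instructions but performs no simplification, the tuple layout `(name, operands, has_side_effects)` is an assumption, and a plain `dict` stands in for `SmallSetVector` as an insertion-ordered set.

```python
def remove_dead(block):
    """block: list of (name, operands, has_side_effects). Returns surviving names."""
    info = {name: (operands, side) for name, operands, side in block}
    uses = {name: 0 for name in info}
    for operands, _ in info.values():
        for op in operands:
            if op in uses:
                uses[op] += 1

    removed = set()
    worklist = {}  # ordered set: keys are instruction names

    def try_remove(name):
        operands, side = info[name]
        if side or uses[name] > 0 or name in removed:
            return
        removed.add(name)
        for op in operands:
            if op in uses:
                uses[op] -= 1
                # Revisit an operand only once it has become unused.
                if uses[op] == 0:
                    worklist[op] = None

    # First run-through: a single linear pass, no worklist pre-initialisation.
    for name, _, _ in block:
        try_remove(name)

    # Afterwards, visit only the instructions that actually need revisiting.
    while worklist:
        name, _ = worklist.popitem()
        try_remove(name)

    return [name for name, _, _ in block if name not in removed]

example = [
    ("a", [], False),
    ("b", ["a"], False),
    ("c", ["b"], False),   # unused, so dead
    ("d", ["a"], True),    # has side effects, must stay
]
print(remove_dead(example))  # ['a', 'd']
```

Removing `c` makes `b` unused, so `b` is queued and removed in turn; `a` survives because `d` still uses it.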
Daniel Sanders	7727e1098c	[mips][p5600] Added P5600 processor and initial scheduler. Summary: The P5600 is an out-of-order, superscalar implementation of the MIPS32R5 architecture. The scheduler has a few missing details (see the 'Tricky Instructions' section) and some quirks of the P5600 are deliberately omitted due to implementation difficulty and low chance of significant benefit (e.g. the predicate on P5600WriteEitherALU). However, testing on SingleSource is showing significant performance benefits on some apps (seven in the 10-30% range) and only one significant regression (12%) when -pre-RA-sched=linearize is given. Without -pre-RA-sched=linearize the results are more variable. Some do even better (up to 55% improvement) but increased numbers of copies are slowing others down (up to 12%). Overall, the scheduler as it currently stands is a 2.4% win with -pre-RA-sched=linearize and a 2.7% win without -pre-RA-sched=linearize. I'm sure we can improve on this further. For completeness, the FPGA this was tested on shows some failures with and without the P5600 scheduler. These appear to be scheduling related since the two test runs have fairly different sets of failing tests even after accounting for other factors (e.g. spurious connection failures) however it's not P5600 specific since we also get some for the generic scheduler. Reviewers: vkalintiris Subscribers: mpf, llvm-commits, atrick, vkalintiris Differential Revision: http://reviews.llvm.org/D12193 llvm-svn: 248725	2015-09-28 18:24:08 +00:00
Artur Pilipenko	b4d009042b	Introduce !align metadata for load instruction Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D12853 llvm-svn: 248721	2015-09-28 17:41:08 +00:00
Philip Reames	13f023c09d	[InstSimplify] Fold simple known implications to true This was split off of http://reviews.llvm.org/D13040 to make it easier to test the correctness of the implication logic. For the moment, this only handles a single easy case which shows up when eliminating and combining range checks. In the (near) future, I plan to extend this for other cases which show up in range checks, but I wanted to make those changes incrementally once the framework was in place. At the moment, the implication logic will be used by three places. One in InstSimplify (this review) and two in SimplifyCFG (http://reviews.llvm.org/D13040 & http://reviews.llvm.org/D13070). Can anyone think of other locations this style of reasoning would make sense? Differential Revision: http://reviews.llvm.org/D13074 llvm-svn: 248719	2015-09-28 17:14:24 +00:00
Weiming Zhao	310770a90f	[LoopReroll] Ignore debug intrinsics Originally, debug intrinsics and annotation intrinsics could prevent the loop from being rerolled; now they are ignored. Differential Revision: http://reviews.llvm.org/D13150 llvm-svn: 248718	2015-09-28 17:03:23 +00:00
Dan Gohman	05a17aa82a	[WebAssembly] Support for direct call and call_indirect. llvm-svn: 248716	2015-09-28 16:22:39 +00:00
Zoran Jovanovic	cdb64566cc	[mips] Handling of immediates bigger than 16 bits Differential Revision: http://reviews.llvm.org/D10539 llvm-svn: 248706	2015-09-28 11:11:34 +00:00
Artyom Skrobov	ad8a0638f7	[ARM] Avoid redundant checks for isThumb1Only() after supportsTailCall() supportsTailCall() has two callers. Both of them double-check isThumb1Only(), and refuse to proceed with tail-calling in that case. Therefore, it makes sense to move this check to ARMSubtarget::initSubtargetFeatures, where SupportsTailCall is initialized; and to eliminate the extra checks at the call sites. Following a review comment, added an "assert(supportsTailCall())" in IsEligibleForTailCall. NFC. llvm-svn: 248703	2015-09-28 09:44:11 +00:00
Hal Finkel	bd582581b8	[DAGCombine] Fix getStoreMergeAndAliasCandidates's AA-enabled chain walking When AA is being used, non-aliasing stores are canonicalized to use the same chain, and DAGCombiner::getStoreMergeAndAliasCandidates can take advantage of this by looking only at users of a store's chain operand. However, user iteration is not result-number specific, so we need to check that the use is as a chain operand, and not via some other operand. It is certainly possible to have another potentially-aliasing store, which shares the first's base pointer, and uses the first's chain's node via some other operand. Failure to catch this situation caused, at least in the included test case, an assert later because the relative sequence-number ordering caused later replacement to create a cycle in the DAG. llvm-svn: 248698	2015-09-28 08:02:14 +00:00
Craig Topper	862d5d8322	Remove 'const' from some ArrayRefs. ArrayRefs are already immutable. NFC llvm-svn: 248693	2015-09-28 00:15:34 +00:00
Justin Bogner	d7d1a72f66	AsmWriter: Print the argument names in declarations while debugging When llvm declarations have argument names, it's helpful to actually print those names when debugging. Arguably, it'd be nice to print them all the time, but that would mean the IR we output wouldn't round trip through bitcode, which doesn't store the names. Make the various print() methods in AsmWriter optionally print "for debug" and set that flag in the dump() methods. The only thing this does differently for now is print the argument names in declarations. llvm-svn: 248692	2015-09-27 22:38:50 +00:00
Yaron Keren	e5a9dc2f5b	Silence clang warning: variable ‘Status’ set but not used. llvm-svn: 248691	2015-09-27 21:31:33 +00:00
Sanjoy Das	f1090b6061	[SCEV] identical instructions don't compute equal values Before this change `HasSameValue` would return true for distinct `alloca` instructions if they happened to be allocating the same type (`alloca` instructions are not specified as reading memory). This change adds an explicit whitelist of instruction types for which "identical" instructions compute the same value. Fixes PR24952. llvm-svn: 248690	2015-09-27 21:09:48 +00:00
Sanjay Patel	9533407566	[InstCombine] fold zexts and constants into a phi (PR24766) This is one step towards solving PR24766: https://llvm.org/bugs/show_bug.cgi?id=24766 We were not producing the same IR for these two C functions because the store to the temp bool causes extra zexts: #include <stdbool.h> bool switchy(char x1, char x2, char condition) { bool conditionMet = false; switch (condition) { case 0: conditionMet = (x1 == x2); break; case 1: conditionMet = (x1 <= x2); break; } return conditionMet; } bool switchy2(char x1, char x2, char condition) { switch (condition) { case 0: return (x1 == x2); case 1: return (x1 <= x2); } return false; } As noted in the code comments, this test case manages to avoid the more general existing phi optimizations where there are only 2 phi inputs or where there are no constant phi args mixed in with the casts ops. It seems like a corner case, but if we don't catch it, then I don't think we can get SimplifyCFG to further optimize towards the canonical form for this function shown in the bug report. Differential Revision: http://reviews.llvm.org/D12866 llvm-svn: 248689	2015-09-27 20:34:31 +00:00
Joseph Tremoulet	09af67aba5	[EH] Create removeUnwindEdge utility Summary: Factor the code that rewrites invokes to calls and rewrites WinEH terminators to their "unwind to caller" equivalents into a helper in Utils/Local, and use it in the three places I'm aware of that need to do this. Reviewers: andrew.w.kaylor, majnemer, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13152 llvm-svn: 248677	2015-09-27 01:47:46 +00:00
Benjamin Kramer	2a63631abd	[BranchProbability] Manually round the floating point output. llvm::format compiles down to snprintf which has no defined rounding for floating point arguments, and MSVC has implemented it differently from what the BSD libcs and glibc do. Try to emulate the glibc rounding behavior to avoid changing tests. While there simplify code a bit and move trivial methods inline. llvm-svn: 248665	2015-09-26 10:09:36 +00:00
Matt Arsenault	1d36b717a5	AMDGPU: Remove hasPostISelHook from most instructions Since this is only needed for VOP3 and a few other special case instructions, stop setting it on everything. llvm-svn: 248657	2015-09-26 05:06:48 +00:00
Matt Arsenault	f32481372c	AMDGPU: Switch over reg class size instead of checking all super classes This gets isSGPRClass out of my profile of SIFixSGPRCopies. llvm-svn: 248656	2015-09-26 04:59:04 +00:00
Matt Arsenault	6e28010215	AMDGPU: Don't handle invalid reg classes in helper functions No tests hit these and it would be better to have checks like this explicit where they are used. llvm-svn: 248655	2015-09-26 04:53:30 +00:00
Saleem Abdulrasool	9174623b2d	AMDGPU: address -Winconsistent-missing-override Add missing override. NFC. llvm-svn: 248652	2015-09-26 04:34:52 +00:00
Matt Arsenault	8e1ddf84fe	AMDGPU: Set CopyCost of register classes These require multiple mov instructions to copy, but the default value is that 1 instruction is needed. I'm not sure if this actually changes anything. llvm-svn: 248651	2015-09-26 04:09:34 +00:00
Chen Li	7452d95656	[Bug 24848] Use range metadata to constant fold comparisons between two values Summary: This is the second part of fixing bug 24848 https://llvm.org/bugs/show_bug.cgi?id=24848. If both operands of a comparison have range metadata, they should be used to constant fold the comparison. Reviewers: sanjoy, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13177 llvm-svn: 248650	2015-09-26 03:26:47 +00:00
Matt Arsenault	e98a074c42	AMDGPU: VOP3b definition cleanups llvm-svn: 248647	2015-09-26 02:25:48 +00:00
Matt Arsenault	86095b8dec	AMDGPU: Fix sched model for VOP2b instructions Trying to use the version with the explicit output operand would complain because of the missing WriteSALU. I'm not sure why it doesn't complain about this with the implicit VCC def. llvm-svn: 248646	2015-09-26 02:25:45 +00:00
Dan Gohman	d0bf981296	[WebAssembly] Rename several functions and types according to the new spec. llvm-svn: 248644	2015-09-26 01:09:44 +00:00
Ahmed Bougacha	e81610fabb	[ARM] Don't generate clrex for pre-v7 targets. Since r248294, we emit clrex, but it doesn't exist on v6. llvm-svn: 248640	2015-09-26 00:14:02 +00:00
Sanjoy Das	b174f9a316	[SCEV] Reapply 'Teach isLoopBackedgeGuardedByCond to exploit trip counts' Summary: If the trip count of a specific backedge is `N`, then we know that backedge is effectively guarded by the condition `{0,+,1} u< N`. This change teaches SCEV to use this condition to prove things in `isLoopBackedgeGuardedByCond`. Depends on D12948 Depends on D12949 The original checkin, r248608 had to be backed out due to an issue with a ObjCXX unit test. That issue is now fixed, so re-landing. Reviewers: atrick, reames, majnemer, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12950 llvm-svn: 248638	2015-09-25 23:53:50 +00:00
Sanjoy Das	96709c4854	[SCEV] Reapply 'Exploit A < B => (A+K) < (B+K) when possible' Summary: This change teaches SCEV's `isImpliedCond` two new identities: A u< B u< -C => (A + C) u< (B + C) A s< B s< INT_MIN - C => (A + C) s< (B + C) While these are useful on their own, they're really intended to support D12950. The original checkin, r248606 had to be backed out due to an issue with a ObjCXX unit test. That issue is now fixed, so re-landing. Reviewers: atrick, reames, majnemer, nlewycky, hfinkel Subscribers: aadg, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D12948 llvm-svn: 248637	2015-09-25 23:53:45 +00:00
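The two `isImpliedCond` identities from r248637 can be checked exhaustively at a small bit width. The sketch below uses 5-bit integers with wrapping arithmetic; the width and helper names are arbitrary choices for illustration, and a brute-force check at one width is a sanity test, not a proof.

```python
BITS = 5
MASK = (1 << BITS) - 1
INT_MIN = -(1 << (BITS - 1))

def to_signed(value):
    """Wrap an integer to BITS bits and interpret it as two's complement."""
    value &= MASK
    return value - (1 << BITS) if value >> (BITS - 1) else value

def unsigned_identity_holds():
    """A u< B u< -C  =>  (A + C) u< (B + C), all arithmetic modulo 2^BITS."""
    for a in range(1 << BITS):
        for b in range(1 << BITS):
            for c in range(1 << BITS):
                if a < b and b < (-c & MASK):
                    if not ((a + c) & MASK) < ((b + c) & MASK):
                        return False
    return True

def signed_identity_holds():
    """A s< B s< INT_MIN - C  =>  (A + C) s< (B + C), with wrapping."""
    values = range(INT_MIN, -INT_MIN)
    for a in values:
        for b in values:
            for c in values:
                if a < b and b < to_signed(INT_MIN - c):
                    if not to_signed(a + c) < to_signed(b + c):
                        return False
    return True

print(unsigned_identity_holds(), signed_identity_holds())  # True True
```

The bound on `B` is what rules out the cases where only one of the two additions wraps.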
Matthias Braun	93ab942c24	LivePhysRegs: Fix live-outs of return blocks I realized that the live-out set computed for the return block is missing the callee saved registers (the non-pristine ones to be exact). This only affects the liveness computed for instructions inside the function epilogue which currently none of the LivePhysRegs users in llvm cares about, so this is just a drive-by fix without a testcase. Differential Revision: http://reviews.llvm.org/D13180 llvm-svn: 248636	2015-09-25 23:50:53 +00:00
Sanjay Patel	e1b09caaaf	[InstCombine] match De Morgan's Law hidden by zext ops (PR22723) This is a fix for PR22723: https://llvm.org/bugs/show_bug.cgi?id=22723 My first attempt at this was to change what I thought was the root problem: xor (zext i1 X to i32), 1 --> zext (xor i1 X, true) to i32 ...but we create the opposite pattern in InstCombiner::visitZExt(), so infinite loop! My next idea was to fix the matchIfNot() implementation in PatternMatch, but that would mean potentially returning a different size for the match than what was input. I think this would require all users of m_Not to check the size of the returned match, so I abandoned that idea. I settled on just fixing the exact case presented in the PR. This patch does allow the 2 functions in PR22723 to compile identically (x86): bool test(bool x, bool y) { return !x | !y; } bool test(bool x, bool y) { return !x || !y; } ... andb %sil, %dil xorb $1, %dil movb %dil, %al retq Differential Revision: http://reviews.llvm.org/D12705 llvm-svn: 248634	2015-09-25 23:21:38 +00:00
Cong Hou	15ea016346	Use fixed-point representation for BranchProbability. BranchProbability now is represented by its numerator and denominator in uint32_t type. This patch changes this representation into a fixed point that is represented by the numerator in uint32_t type and a constant denominator 1<<31. This is quite similar to the representation of BlockMass in BlockFrequencyInfoImpl.h. There are several pros and cons of this change: Pros: 1. It uses only half the space of the current one. 2. Some operations are much faster like addition, subtraction, comparison, and scaling by an integer. Cons: 1. Constructing a probability using arbitrary numerator and denominator needs additional calculations. 2. It is a little less precise than before as we use a fixed denominator. For example, 1 - 1/3 may not be exactly identical to 2/3 (this will lead to many BranchProbability unit test failures). This should not matter when we only use it for branch probability. If we use it like a rational value for some precise calculations we may need another construct like ValueRatio. One important reason for this change is that we propose to store branch probabilities instead of edge weights in MachineBasicBlock. We also want clients to use probability instead of weight when adding successors to a MBB. The current BranchProbability has more space which may be a concern. Differential revision: http://reviews.llvm.org/D12603 llvm-svn: 248633	2015-09-25 23:09:59 +00:00
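The trade-off described in r248633 can be seen in a few lines. The sketch below stores a probability as an integer numerator over the fixed denominator 1<<31; the round-to-nearest conversion in `to_fixed` is an assumption made for this example and may not match the rounding used by `BranchProbability`.

```python
DENOMINATOR = 1 << 31

def to_fixed(numerator, denominator):
    """Convert numerator/denominator to a fixed-point numerator (round to nearest)."""
    return (numerator * DENOMINATOR + denominator // 2) // denominator

one_third = to_fixed(1, 3)
half = to_fixed(1, 2)

print(one_third)                    # 715827883
print(3 * one_third - DENOMINATOR)  # 1
print(one_third < half)             # True
```

Addition, subtraction and comparison reduce to plain integer operations on the numerators, which is the speed advantage the message lists. The cost is precision: under this rounding, three copies of 1/3 overshoot 1 by one unit.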
Matthias Braun	a3b701f828	SelectionDAGDumper: Print simple operands inline. Print simple operands inline instead of their pointer/value number. Simple operands are SDNodes without predecessors like Constant(FP), Register, UNDEF. This unifies the behaviour with dumpr() which was already doing this. Previously: t0: ch = EntryToken t1: i64 = Register %vreg0 t2: i64,ch = CopyFromReg t0, t1 t3: i64 = Constant<1> t4: i64 = add t2, t3 t5: i64 = Constant<2> t6: i64 = add t2, t5 t10: i64 = undef t11: i8,ch = load t0, t2, t10<LD1[%tmp81]> t12: i8,ch = load t0, t4, t10<LD1[%tmp10]> t13: i8,ch = load t0, t6, t10<LD1[%tmp12]> Now: t0: ch = EntryToken t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0 t4: i64 = add t2, Constant:i64<1> t6: i64 = add t2, Constant:i64<2> t11: i8,ch = load<LD1[%tmp81]> t0, t2, undef:i64 t12: i8,ch = load<LD1[%tmp10]> t0, t4, undef:i64 t13: i8,ch = load<LD1[%tmp12]> t0, t6, undef:i64 Differential Revision: http://reviews.llvm.org/D12567 llvm-svn: 248628	2015-09-25 22:27:02 +00:00
Matt Arsenault	e229c0c45e	AMDGPU: Construct new buffer instruction when moving SMRD It's easier to understand creating a full instruction than the current situation where sometimes a new instruction is created and sometimes it is awkwardly mutated in place. llvm-svn: 248627	2015-09-25 22:21:19 +00:00
Matt Arsenault	3c07e963b8	DAGCombiner: Check if store is volatile first This is the simpler check. NFC. llvm-svn: 248625	2015-09-25 22:06:19 +00:00
Matthias Braun	c804cdb912	TargetRegisterInfo: Introduce PrintLaneMask. This makes it more convenient to print lane masks and lead to more uniform printing. llvm-svn: 248624	2015-09-25 21:51:24 +00:00
Matthias Braun	e6a2485e1a	TargetRegisterInfo: Add typedef unsigned LaneBitmask and use it where appropriate; NFC llvm-svn: 248623	2015-09-25 21:51:14 +00:00
Sanjay Patel	bbbf9a1a34	merge vector stores into wider vector stores and fix AArch64 misaligned access TLI hook (PR21711) This is a redo of D7208 ( r227242 - http://llvm.org/viewvc/llvm-project?view=revision&revision=227242 ). The patch was reverted because an AArch64 target could infinite loop after the change in DAGCombiner to merge vector stores. That happened because AArch64's allowsMisalignedMemoryAccesses() wasn't telling the truth. It reported all unaligned memory accesses as fast, but then split some 128-bit unaligned accesses up in performSTORECombine() because they are slow. This patch attempts to fix the problem in AArch64's allowsMisalignedMemoryAccesses() while preserving existing (perhaps questionable) lowering behavior. The x86 test shows that store merging is working as intended for a target with fast 32-byte unaligned stores. Differential Revision: http://reviews.llvm.org/D12635 llvm-svn: 248622	2015-09-25 21:49:48 +00:00
Matthias Braun	e86bbd8979	PrologueEpilogInserter: Fix missing live-ins when savepoint equals restorepoint The algorithm would not modify the live-in list of blocks below the save block point which is correct unless it happens to be a restore point at the same time. Also fixes the benign issue of live-in registers being added twice in some cases. The testcase is based on a test submitted by Kit Barton. Differential Revision: http://reviews.llvm.org/D13176 llvm-svn: 248620	2015-09-25 21:41:40 +00:00
Tom Stellard	e135ffd554	AMDGPU/SI: Use .hsatext section instead of .text for HSA Reviewers: arsenm, grosbach, rafael Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D12424 llvm-svn: 248619	2015-09-25 21:41:28 +00:00
Tom Stellard	8e0257625d	MCAsmInfo: Allow targets to specify when the .section directive should be omitted Summary: The default behavior is to omit the .section directive for .text, .data, and sometimes .bss, but some targets may want to omit this directive for other sections too. The AMDGPU backend will use this to emit a simplified syntax for section switches. For example if the section directive is not omitted (current behavior), section switches to .hsatext will be printed like this: .section .hsatext,#alloc,#execinstr,#write This is actually wrong, because .hsatext has some custom STT_* flags, which MC doesn't know how to print or parse. If the section directive is omitted (made possible by this commit), section switches will be printed like this: .hsatext The motivation for this patch is to make it possible to emit sections with custom STT_* flags without having to teach MC about all the target specific STT_* flags. Reviewers: rafael, grosbach Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12423 llvm-svn: 248618	2015-09-25 21:41:14 +00:00
Matthias Braun	c2d4befb54	MachineBasicBlock: Factor out common code into isReturnBlock() llvm-svn: 248617	2015-09-25 21:25:19 +00:00
Sanjoy Das	4a39b97671	Revert two SCEV changes that caused test failures in clang. r248606: "[SCEV] Exploit A < B => (A+K) < (B+K) when possible" r248608: "[SCEV] Teach isLoopBackedgeGuardedByCond to exploit trip counts." llvm-svn: 248614	2015-09-25 21:16:50 +00:00
Justin Bogner	0638b7ba99	ADCE: Fix typo in file comment. NFC llvm-svn: 248613	2015-09-25 21:03:46 +00:00
Matt Arsenault	10aa807856	PeepholeOptimizer: Remove redundant copies If a virtual register is copied and another copy was already seen, replace with the previous copy. This only handles the simplest cases for now. This pattern shows up from various operand restrictions AMDGPU has which require inserting copies depending on the register class of the operands. llvm-svn: 248611	2015-09-25 20:22:12 +00:00
Chad Rosier	d9f102b464	Simplify code. NFC. llvm-svn: 248610	2015-09-25 20:20:22 +00:00
Sanjay Patel	a67559c106	more space; NFC llvm-svn: 248609	2015-09-25 20:12:43 +00:00
Sanjoy Das	d706fa8a0c	[SCEV] Teach isLoopBackedgeGuardedByCond to exploit trip counts. Summary: If the trip count of a specific backedge is `N`, then we know that backedge is effectively guarded by the condition `{0,+,1} u< N`. This change teaches SCEV to use this condition to prove things in `isLoopBackedgeGuardedByCond`. Depends on D12948 Depends on D12949 Reviewers: atrick, reames, majnemer, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12950 llvm-svn: 248608	2015-09-25 19:59:57 +00:00
Sanjoy Das	df1635d394	[SCEV] Extract helper function from isImpliedCond; NFC Summary: This new helper routine will be used in a subsequent change. Reviewers: hfinkel Subscribers: hfinkel, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D12949 llvm-svn: 248607	2015-09-25 19:59:52 +00:00
Sanjoy Das	fdec9deb13	[SCEV] Exploit A < B => (A+K) < (B+K) when possible Summary: This change teaches SCEV's `isImpliedCond` two new identities: A u< B u< -C => (A + C) u< (B + C) A s< B s< INT_MIN - C => (A + C) s< (B + C) While these are useful on their own, they're really intended to support D12950. Reviewers: atrick, reames, majnemer, nlewycky, hfinkel Subscribers: aadg, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D12948 llvm-svn: 248606	2015-09-25 19:59:49 +00:00
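The unsigned identity quoted in r248606 can be sanity-checked by brute force at a small bit width. The sketch below is not SCEV code; it reads `A u< B u< -C` as "A u< B and B u< -C" (an assumption about the notation) and tests every 8-bit triple, with all arithmetic wrapping modulo 256.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if, for all 8-bit A, B, C:
//   A u< B  and  B u< -C   implies   (A + C) u< (B + C)
bool unsignedIdentityHoldsFor8Bit() {
  for (unsigned a = 0; a < 256; ++a) {
    for (unsigned b = 0; b < 256; ++b) {
      for (unsigned c = 0; c < 256; ++c) {
        uint8_t A = static_cast<uint8_t>(a);
        uint8_t B = static_cast<uint8_t>(b);
        uint8_t C = static_cast<uint8_t>(c);
        uint8_t negC = static_cast<uint8_t>(0u - C);  // -C modulo 256
        bool premise = A < B && B < negC;
        uint8_t lhs = static_cast<uint8_t>(A + C);    // wraps like i8 add
        uint8_t rhs = static_cast<uint8_t>(B + C);
        if (premise && !(lhs < rhs))
          return false;  // counterexample found
      }
    }
  }
  return true;
}
```

The premise `B u< -C` guarantees that `B + C` does not wrap, and since `A u< B`, neither does `A + C`, which is why the ordering is preserved.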
Matt Arsenault	f743b838cb	AMDGPU: Make getNamedOperandIdx declaration readonly This matches how it is defined in the generated implementation. llvm-svn: 248598	2015-09-25 18:09:15 +00:00
Chad Rosier	1bbd7fb38e	[AArch64] Add support for generating pre- and post-index load/store pairs. llvm-svn: 248593	2015-09-25 17:48:17 +00:00
Matt Arsenault	0a10900070	AMDGPU: Disable some passes that are not meaningful Don't run passes related to stack maps, garbage collection, exceptions since these aren't useful for GPUs. There might be a few more to turn off that I'm less sure about (e.g. ShrinkWrapping) or I'm not sure how to disable (SafeStack and StackProtector) llvm-svn: 248591	2015-09-25 17:41:20 +00:00
Matt Arsenault	4bf43d4e68	AMDGPU: Handle i64->v2i32 loads/stores in PreprocessISelDAG This fixes a select error when the i64 source was also bitcasted to v2i32 in the original source. Instead of awkwardly trying to select the modified source value and the store, replace before isel begins. Uses a worklist to avoid possible problems from mutating the DAG, although it seems to work OK without it. llvm-svn: 248589	2015-09-25 17:27:08 +00:00
Matt Arsenault	0cb8517dc6	AMDGPU: Fix recomputing dominator tree unnecessarily SIFixSGPRCopies does not modify the CFG, but this was being recomputed before running SIFoldOperands. llvm-svn: 248587	2015-09-25 17:21:28 +00:00
Matt Arsenault	2d6fdb8495	AMDGPU: Re-justify workaround and fix worked around problem When buffer resource descriptors were built, the upper two components of the descriptor were first composed into a 64-bit register because legalizeOperands assumed all operands had the same register class. Fix that problem, but keep the workaround. I'm not sure anything is actually emitting such a REG_SEQUENCE now. If multiple resource descriptors are set up with different base pointers, this is copied with a single s_mov_b64. We probably should fix this better by recognizing a pair of s_mov_b32 later, but for now delete the dead code. llvm-svn: 248585	2015-09-25 17:08:42 +00:00
Matt Arsenault	3ad55ec946	AMDGPU: Don't create REG_SEQUENCE with SGPR dest and VGPR sources This avoids needing to re-legalize the new REG_SEQUENCE. llvm-svn: 248584	2015-09-25 17:08:40 +00:00
Matt Arsenault	6525aa3529	AMDGPU: Fix not adding exec to defs of cmpx instruction pseudos This was only set on the final _si/_vi version, but not on the pseudos most of codegen sees. No test since these instructions aren't used yet. llvm-svn: 248583	2015-09-25 16:58:27 +00:00
Matt Arsenault	5f70436c49	AMDGPU: Improve accuracy of instruction rates for VOPC These were all using the default 32-bit VALU write class, but the i64/f64 compares are half rate. I'm not sure this is really correct, because they are still using the write to VALU write class, even though they really write to the SALU. llvm-svn: 248582	2015-09-25 16:58:25 +00:00
James Molloy	eb46641c28	[GlobalsAA] Teach GlobalsAA about nocapture Arguments to function calls marked "nocapture" can be marked as non-escaping. However, nocapture is defined in terms of the lifetime of the callee, and if the callee can directly or indirectly recurse to the caller, the semantics of nocapture are invalid. Therefore, we eagerly discover which SCC each function belongs to, and later can check if callee and caller of a callsite belong to the same SCC, in which case there could be recursion. This means that we can't be so optimistic in getModRefInfo(ImmutableCallsite) - previously we assumed all call arguments never aliased with an escaping global. Now we need to check, because a global could now be passed as an argument but still not escape. This also solves a related conformance problem: MemCpyOptimizer can turn non-escaping stores of globals into calls to intrinsics like llvm.memcpy/llvm.memset. This confuses GlobalsAA, which knows the global can't escape and so returns NoModRef when queried, when obviously a memcpy/memset call does indeed reference and modify its arguments. This fixes PR24800, PR24801, and PR24802. llvm-svn: 248576	2015-09-25 15:39:29 +00:00
Saleem Abdulrasool	8e99f50768	ARM: make -Asserts,-Werror=unused-variable build happy The value was only used in an assertion. Sink the variable usage into the assertion. llvm-svn: 248562	2015-09-25 05:41:02 +00:00
Saleem Abdulrasool	fe83b50289	ARM: address WoA division limitation We now emit the compiler generated divide by zero check that was needed for the MSVC routines. We construct a pseudo-instruction for the DBZ check as the operation requires splitting up the BB. For the 64-bit operations, we need to custom expand the node as we need to insert the DBZ check and then emit the libcall to the appropriate name. Because this is target specific, it seemed better to reproduce the expansion operation from the target-agnostic type legalization rather than sink this there to avoid the duplication. The division library calls now match MSVC semantically. llvm-svn: 248561	2015-09-25 05:15:46 +00:00
Matt Arsenault	8aa9973696	AMDGPU: Remove unused includes llvm-svn: 248553	2015-09-25 00:28:43 +00:00
Sanjoy Das	b513a9fa4f	[Bitcode][Asm] Teach LLVM to read and write operand bundles. Summary: This also adds the first set of tests for operand bundles. The optimizer has not been audited to ensure that it does the right thing with operand bundles. Depends on D12456. Reviewers: reames, chandlerc, majnemer, dexonsmith, kmod, JosephTremoulet, rnk, bogner Subscribers: maksfb, llvm-commits Differential Revision: http://reviews.llvm.org/D12457 llvm-svn: 248551	2015-09-24 23:34:52 +00:00
Matt Arsenault	50f0a42b66	Fix typo llvm-svn: 248549	2015-09-24 22:36:49 +00:00
Chad Rosier	b02f5a5a1f	[AArch64] Improve the readability of the ld/st optimization pass. NFC. In this context, MI is an add/sub instruction not a load/store. llvm-svn: 248540	2015-09-24 21:27:49 +00:00
Simon Pilgrim	68d0050c6a	[X86][SSE2] Fix zero/any extension shuffles that don't start from the first element Fix for D12561 - we weren't correctly ensuring that the base element for extension was moved to start on a boundary suitable for UNPCKL/H llvm-svn: 248536	2015-09-24 21:02:17 +00:00
Matt Arsenault	e66621b306	AMDGPU: Add s_dcache_* instructions llvm-svn: 248533	2015-09-24 19:52:27 +00:00
Matt Arsenault	d6adfb401c	AMDGPU: Add cache invalidation instructions. These are necessary for implementing mem_fence for OpenCL 2.0. The VI assembler tests are disabled since it seems to be using the wrong encoding or opcode. llvm-svn: 248532	2015-09-24 19:52:21 +00:00
Chad Rosier	7cd472b719	[AArch64] The paired post-increment store instruction has an output register. The pre- and post-increment versions update the base register, but the post- version was defined incorrectly. There is no test case as we don't currently generate these instructions, but I plan on changing that in the near future. llvm-svn: 248528	2015-09-24 19:21:42 +00:00
Sanjoy Das	9303c24650	[IR] Add operand bundles to CallInst and InvokeInst. Summary: This change teaches `CallInst`s and `InvokeInst`s to maintain a set of operand bundles as part of its operands. `CallInst`s and `InvokeInst`s with operand bundles co-allocate some space before their `Use` array to hold meta information about which of its operands are part of an operand bundle. The strings corresponding to the bundle tags are interned into `LLVMContextImpl::BundleTagCache` This change does not include any parsing / bitcode support. That's the next change. Depends on D12455. Reviewers: reames, chandlerc, majnemer, dexonsmith, kmod, JosephTremoulet, rnk, bogner Subscribers: MatzeB, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D12456 llvm-svn: 248527	2015-09-24 19:14:18 +00:00
Artyom Skrobov	cf296444ab	[ARM] Handle +t2dsp feature as an ArchExtKind in ARMTargetParser.def Currently, the availability of DSP instructions (ACLE 6.4.7) is handled in a hand-rolled tricky condition block in tools/clang/lib/Basic/Targets.cpp, with a FIXME: attached. This patch changes the handling of +t2dsp to be in line with other architecture extensions. Following a revert of r248152 and new review comments, this patch also includes renaming FeatureDSPThumb2 -> FeatureDSP, hasThumb2DSP() -> hasDSP(), etc. The spelling of "t2dsp" is preserved, pending a further investigation of its possible external usage. Differential Revision: http://reviews.llvm.org/D12937 llvm-svn: 248519	2015-09-24 17:31:16 +00:00
James Molloy	b6be1ebb7d	[ValueTracking] Teach isKnownNonZero a new trick If the shifter operand is a constant, and all of the bits shifted out are known to be zero, then if X is known non-zero at least one non-zero bit must remain. llvm-svn: 248508	2015-09-24 16:06:32 +00:00
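The reasoning in r248508 can be sketched for a 32-bit left shift. This is not LLVM's isKnownNonZero: the helper name and its parameters are hypothetical, and only the left-shift direction is modeled. `knownZero` has a bit set wherever the corresponding bit of X is known to be 0.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: can we prove (X << shift) != 0 for a 32-bit X?
bool shlIsKnownNonZero(uint32_t knownZero, bool xKnownNonZero, unsigned shift) {
  assert(shift < 32 && "shift amount out of range for a 32-bit value");
  if (!xKnownNonZero)
    return false;
  // Mask of the `shift` high bits that a left shift discards.
  uint32_t shiftedOut = shift == 0 ? 0u : ~0u << (32 - shift);
  // If every discarded bit is known zero, the set bit that makes X non-zero
  // must lie in the part that survives the shift.
  return (knownZero & shiftedOut) == shiftedOut;
}
```

For example, with the top 8 bits of X known zero, a shift by 8 keeps the result provably non-zero, while a shift by 9 might discard the only set bit.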
Daniel Sanders	090f6e41c4	[mips] Use PredicateControl for the MSA ASE instructions. NFC. Reviewers: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13092 llvm-svn: 248486	2015-09-24 12:10:23 +00:00
Mohammad Shahid	13f1dfdf2e	Codegen: Fix llvm.absdiff semantic. Fixes the overflow case of the llvm.absdiff intrinsic and updates the tests and LangRef.rst accordingly. Differential Revision: http://reviews.llvm.org/D11678 llvm-svn: 248483	2015-09-24 10:35:03 +00:00
Charlie Turner	2720593ab4	[InstCombine] Recognize another bswap idiom. Summary: The byte-swap recognizer can now notice that this ``` uint32_t bswap(uint32_t x) { x = (x & 0x0000FFFF) << 16 | (x & 0xFFFF0000) >> 16; x = (x & 0x00FF00FF) << 8 | (x & 0xFF00FF00) >> 8; return x; } ``` is a bswap. Fixes PR23863. Reviewers: nlewycky, hfinkel, hans, jmolloy, rengolin Subscribers: majnemer, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D12637 llvm-svn: 248482	2015-09-24 10:24:58 +00:00
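To see why the idiom in r248482 is a byte swap, it can be compared with a straightforward byte-by-byte reversal. The sketch below reuses the two statements from the commit message; the function names and the reference implementation are additions for illustration and say nothing about how InstCombine performs the match.

```cpp
#include <cassert>
#include <cstdint>

// The idiom from the commit message: swap 16-bit halves, then swap the bytes
// within each half.
uint32_t bswapIdiom(uint32_t x) {
  x = (x & 0x0000FFFF) << 16 | (x & 0xFFFF0000) >> 16;
  x = (x & 0x00FF00FF) << 8 | (x & 0xFF00FF00) >> 8;
  return x;
}

// Reference: move each byte to its mirrored position directly.
uint32_t bswapReference(uint32_t x) {
  return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
         ((x << 8) & 0x00FF0000u) | (x << 24);
}

// Spot-check a few values, including ones with distinct bytes.
void checkIdiomMatchesReference() {
  const uint32_t samples[] = {0u, 0x11223344u, 0xDEADBEEFu, 0xFFFFFFFFu, 0x000000FFu};
  for (uint32_t s : samples)
    assert(bswapIdiom(s) == bswapReference(s));
}
```

After the first statement 0x11223344 becomes 0x33441122, and after the second it becomes 0x44332211, which is the fully byte-reversed value.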
Matt Arsenault	68d938649e	Introduce target hook for optimizing register copies Allow a target to do something other than search for copies that will avoid cross register bank copies. Implement for SI by only rewriting the most basic copies, so it should look through anything like a subregister extract. I'm not entirely satisfied with this because it seems like eliminating a reg_sequence that isn't fully used should work generically for all targets without them having to override something. However, it seems to be tricky to have a simple implementation of this without rewriting to invalid kinds of subregister copies on some targets. I'm not sure if there is currently a generic way to easily check if a subregister index would be valid for the current use. The current set of TargetRegisterInfo::get*Class functions don't quite behave like I would expect (e.g. getSubClassWithSubReg returns the maximal register class rather than the minimal), so I'm not sure how to make the generic test keep searching if SrcRC:SrcSubReg is a valid replacement for DefRC:DefSubReg. Making the default implementation to check for simple copies breaks a variety of ARM and x86 tests by producing illegal subregister uses. The ARM tests are not actually changed since it should still be using the same sharesSameRegisterFile implementation, this just relaxes them to not check for specific registers. llvm-svn: 248478	2015-09-24 08:36:14 +00:00
Matt Arsenault	e068f9a263	AMDGPU: Return after instruction is processed. llvm-svn: 248476	2015-09-24 07:51:28 +00:00
Matt Arsenault	708586faa2	AMDGPU: Remove another unnecessary check from commuteInstruction llvm-svn: 248475	2015-09-24 07:51:25 +00:00
Matt Arsenault	fa242960fc	AMDGPU: Add readonly to InstrMapping functions llvm-svn: 248474	2015-09-24 07:51:23 +00:00
Matt Arsenault	cab64f1c75	AMDGPU: Fix printing trailing whitespace for mubuf atomics llvm-svn: 248472	2015-09-24 07:51:17 +00:00
Matt Arsenault	c7ec46c3aa	Remove dead declaration llvm-svn: 248471	2015-09-24 07:51:12 +00:00
Matt Arsenault	c721df0478	Use new TokenFactor chain when merging stores If the stores are storing values from loads which partially alias the stores, we could end up placing the merged loads and stores on the same chain which has the potential to break. Each store may have a different chain dependency on only some of the original loads. Create a new TokenFactor to capture all of the required dependencies of the stores rather than assuming all stores can use the same chain. The testcase is a situation where this happens, although it does not have an observable change from this. The DAG nodes just happened to not be reordered before despite this missing chain dependency. This is based on an off-list report for an out of tree target which regressed due to r246307 and I haven't managed to find a case where the nodes do end up reordered with an in tree target. llvm-svn: 248468	2015-09-24 07:22:38 +00:00
Matt Arsenault	c8e2ce4046	AMDGPU: Reduce number of copies emitted Instead of always inserting a copy in case the super register is itself a subregister, only extract to the super reg class if this is actually the case. This shouldn't really change codegen, but makes looking at the output of SIFixSGPRCopies easier to read. llvm-svn: 248467	2015-09-24 07:16:37 +00:00
Justin Bogner	abdcb3c1b3	Fix a think-o in which functions these should surround llvm-svn: 248465	2015-09-24 05:29:31 +00:00
Justin Bogner	aa57ac5d96	Add some NDEBUG checks I accidentally dropped in r248462 llvm-svn: 248464	2015-09-24 05:20:04 +00:00
Justin Bogner	49655f806f	BasicAA: Move BasicAAResult::alias out-of-line. NFC This makes the header more readable and cleans up some unnecessary header differences between NDEBUG and !NDEBUG. llvm-svn: 248462	2015-09-24 04:59:24 +00:00
Michael Zolotukhin	74621cced7	Add CFG Simplification pass after Loop Unswitching. Loop unswitching produces conditional branches with constant condition, and it's beneficial for later passes to clean this up with simplify-cfg. We do this after the second invocation of loop-unswitch, but not after the first one. Not doing so might cause problem for passes like LoopUnroll, whose estimate of loop body size would be less accurate. Reviewers: hfinkel Differential Revision: http://reviews.llvm.org/D13064 llvm-svn: 248460	2015-09-24 03:50:17 +00:00
Evgeniy Stepanov	8685daf23e	[safestack] Fix compiler crash in the presence of stack restores. A use can be emitted before def in a function with stack restore points but no static allocas. llvm-svn: 248455	2015-09-24 01:23:51 +00:00
Sanjoy Das	b07fc572bd	[IR] Teach `llvm::User` to co-allocate a descriptor. Summary: With this change, subclasses of `llvm::User` will be able to co-allocate a variable number of bytes (called a "descriptor") with the `llvm::User` instance. The co-allocated descriptor can later be accessed using `llvm::User::getDescriptor`. This will be used in later changes to implement operand bundles. This change steals one bit from `NumUserOperands`, but given that it is still 28 bits wide I don't think this will be a practical issue. This change does not allow allocating hung off uses with descriptors. This only for simplicity, not for any fundamental reason; and we can easily add this functionality later if needed. Reviewers: reames, chandlerc, dexonsmith, kmod, majnemer, pete, JosephTremoulet Subscribers: pete, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D12455 llvm-svn: 248453	2015-09-24 01:00:49 +00:00
Michael Zolotukhin	d56ee06d1f	[Unroll] When completely unrolling the loop, replace conditional branches with unconditional. Nothing is expected to change, except we do less redundant work in clean-up. Reviewers: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12951 llvm-svn: 248444	2015-09-23 23:12:43 +00:00
Wei Mi	3cc9204a52	Put profile variables of COMDAT functions to its own COMDAT group. In -fprofile-instr-generate compilation, to remove the redundant profile variables for the COMDAT functions, these variables are placed in the same COMDAT group as its associated function. This way when the COMDAT function is not picked by the linker, those profile variables will also not be output in the final binary. This may cause a warning when linking a mix of objects built with and without -fprofile-instr-generate. This patch puts the profile variables for COMDAT functions to its own COMDAT group to avoid the problem. Patch by xur. Differential Revision: http://reviews.llvm.org/D12248 llvm-svn: 248440	2015-09-23 22:40:45 +00:00
Tim Northover	beb5bccf88	ARM: fix folding stack adjustment (again again again...) This time, the issue is that we weren't accounting for the possibility that aligned DPRs could have been stored after the final "push" in a prologue. When that happened we effectively moved a "sub sp, #N" from below the aligned stores to above them, and everything went to pot. To make it worse, I'd actually committed something testing that we produced wrong code, so the test update is tiny. llvm-svn: 248437	2015-09-23 22:21:09 +00:00
Philip Reames	d63df5107e	Remove handling of AddrSpaceCast in stripAndAccumulateInBoundsConstantOffsets Patch by: simoncook Unlike BitCasts, AddrSpaceCasts do not always produce an output the same size as its input, which was previously assumed. This fixes cases where two address spaces do not have the same size pointer, as an assertion failure would occur when trying to prove dereferenceability. LoopUnswitch is used in the particular test, but LICM also exhibits the same problem. Differential Revision: http://reviews.llvm.org/D13008 llvm-svn: 248422	2015-09-23 19:48:43 +00:00
Lawrence Hu	cac0b89289	Swap loop invariant GEP with loop variant GEP to allow more LICM. This patch changes the order of GEPs generated by Splitting GEPs pass, especially when one of the GEPs has constant and the base is loop invariant, then we will generate the GEP with constant first when beneficial, to expose more cases for LICM. If originally Splitting GEP generated the following: do.body.i: %idxprom.i = sext i32 %shr.i to i64 %2 = bitcast %typeD* %s to i8* %3 = shl i64 %idxprom.i, 2 %uglygep = getelementptr i8, i8* %2, i64 %3 %uglygep7 = getelementptr i8, i8* %uglygep, i64 1032 ... Now it generates: do.body.i: %idxprom.i = sext i32 %shr.i to i64 %2 = bitcast %typeD* %s to i8* %3 = shl i64 %idxprom.i, 2 %uglygep = getelementptr i8, i8* %2, i64 1032 %uglygep7 = getelementptr i8, i8* %uglygep, i64 %3 ... For no-loop cases, the original way of generating GEPs seems to expose more CSE cases, so we don't change the logic for no-loop cases, and only limit our change to the specific case we are interested in. llvm-svn: 248420	2015-09-23 19:25:30 +00:00
Akira Hatanaka	f6afd11538	[InstCombine] Preserve metadata when merging loads that are phi arguments. Make sure InstCombiner::FoldPHIArgLoadIntoPHI doesn't drop the following metadata: MD_tbaa MD_alias_scope MD_noalias MD_invariant_load MD_nonnull MD_range rdar://problem/17617709 Differential Revision: http://reviews.llvm.org/D12710 llvm-svn: 248419	2015-09-23 18:40:57 +00:00
Sanjay Patel	1a6534661b	[x86] replace integer 'xor' ops with packed SSE FP 'xor' ops when operating on FP scalars Turn this: movd %xmm0, %eax movd %xmm1, %ecx xorl %eax, %ecx movd %ecx, %xmm0 into this: xorps %xmm1, %xmm0 This is related to, but does not solve: https://llvm.org/bugs/show_bug.cgi?id=22428 This is an extension of: http://reviews.llvm.org/rL248395 llvm-svn: 248415	2015-09-23 18:33:42 +00:00
Sanjay Patel	aba37553c4	[x86] replace integer 'or' ops with packed SSE FP 'or' ops when operating on FP scalars Turn this: movd %xmm0, %eax movd %xmm1, %ecx orl %eax, %ecx movd %ecx, %xmm0 into this: orps %xmm1, %xmm0 This is related to, but does not solve: https://llvm.org/bugs/show_bug.cgi?id=22428 This is an extension of: http://reviews.llvm.org/rL248395 llvm-svn: 248409	2015-09-23 18:19:07 +00:00
Evgeniy Stepanov	a2002b08f7	Android support for SafeStack. Add two new ways of accessing the unsafe stack pointer: * At a fixed offset from the thread TLS base. This is very similar to StackProtector cookies, but we plan to extend it to other backends (ARM in particular) soon. Bionic-side implementation here: https://android-review.googlesource.com/170988. * Via a function call, as a fallback for platforms that provide neither a fixed TLS slot, nor a reasonable TLS implementation (i.e. not emutls). This is a re-commit of a change in r248357 that was reverted in r248358. llvm-svn: 248405	2015-09-23 18:07:56 +00:00
Sanjay Patel	b14ecd34f7	move call to convertIntLogicToFPLogic up; NFCI The BEXTR comments didn't make sense before, we may want to extend the FP logic transform to work on vectors, and this way is more beautiful. llvm-svn: 248404	2015-09-23 18:03:37 +00:00
Chen Li	5cd6deeae3	[Bug 24848] Use range metadata to constant fold comparisons with constant values Summary: This is the first part of fixing bug 24848 https://llvm.org/bugs/show_bug.cgi?id=24848. When range metadata is provided, it should be used to constant fold comparisons with constant values. Reviewers: sanjoy, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12988 llvm-svn: 248402	2015-09-23 17:58:44 +00:00
Sanjay Patel	ade3abd2d9	[x86] move code for converting int logic to FP logic to a helper function; NFCI This is a follow-on to: http://reviews.llvm.org/rL248395 so we can add the call to the or/xor combines too. llvm-svn: 248399	2015-09-23 17:39:41 +00:00
Sanjay Patel	df2495f331	[x86] replace integer 'and' ops with packed SSE FP 'and' ops when operating on FP scalars Turn this: movd %xmm0, %eax movd %xmm1, %ecx andl %eax, %ecx movd %ecx, %xmm0 into this: andps %xmm1, %xmm0 This is related to, but does not solve: https://llvm.org/bugs/show_bug.cgi?id=22428 Differential Revision: http://reviews.llvm.org/D13065 llvm-svn: 248395	2015-09-23 17:00:06 +00:00
Dan Gohman	979840d31f	[WebAssembly] Fix hasAddr64 being used before being initialized. This reverts r248388 and fixes the underlying bug: hasAddr64 was initialized in runOnMachineFunction, but runOnMachineFunction isn't ever called in CodeGen/WebAssembly/global.ll since that testcase has no functions. The fix here is to use AsmPrinter's getPointerSize() as needed to determine the pointer size instead. llvm-svn: 248394	2015-09-23 16:59:10 +00:00
Vedant Kumar	ff08e926ba	[Inline] Use AssumptionCache from the right Function This changes the behavior of AddAlignmentAssumptions to match its comment. I.e, prove the asserted alignment in the context of the caller, not the callee. Thanks to Mehdi Amini for seeing the issue here! Also to Artur Pilipenko who also saw a fix for the issue. rdar://22521387 Differential Revision: http://reviews.llvm.org/D12997 llvm-svn: 248390	2015-09-23 15:49:08 +00:00
Alexander Kornienko	a3eaa204e6	Fix CodeGen/WebAssembly/global.ll test under ASAN. llvm-svn: 248388	2015-09-23 15:41:25 +00:00
David Majnemer	fa36bde2f6	[DeadArgElim] Split the invoke successor edge Invoking a function which returns an aggregate can sometimes be transformed to return a scalar value. However, this means that we need to create an insertvalue instruction(s) to recreate the correct aggregate type. We achieved this by inserting an insertvalue instruction at the invoke's normal successor. However, this is not feasible if the normal successor uses the invoke's return value inside a PHI node. Instead, split the edge between the invoke and the unwind successor and create the insertvalue instruction in the new basic block. The new basic block's successor will be the old invoke successor which leaves us with IR which is well behaved. This fixes PR24906. llvm-svn: 248387	2015-09-23 15:41:09 +00:00
Chad Rosier	2dfd35499e	[AArch64] Refactor pre- and post-index merge functions into a single function. NFC. llvm-svn: 248377	2015-09-23 13:51:44 +00:00
Igor Laevsky	029bd93c5d	[DeadStoreElimination] Remove dead zero store to calloc initialized memory This change allows dead store elimination to remove zero and null stores into memory freshly allocated with calloc-like function. Differential Revision: http://reviews.llvm.org/D13021 llvm-svn: 248374	2015-09-23 11:38:44 +00:00
Oliver Stannard	f2ed5c68d2	[ARM] Add option to force fast-isel The ARM backend has some logic that only allows the fast-isel to be enabled for subtargets where it is known to be stable. This adds a backend option to override this and force the fast-isel to be used for any target, to allow it to be tested. This is an ARM-specific option, because no other backend disables the fast-isel on a per-subtarget basis. llvm-svn: 248369	2015-09-23 09:19:54 +00:00
Simon Pilgrim	9cb018b6b6	[X86][SSE] Replace 128-bit SSE41 PMOVSX intrinsics with native IR This patch removes the x86.sse41.pmovsx* intrinsics, provides a suitable upgrade path and updates relevant tests to sign extend a subvector instead. LLVM counterpart to D12835 Differential Revision: http://reviews.llvm.org/D13002 llvm-svn: 248368	2015-09-23 08:48:33 +00:00
Sanjoy Das	2aacc0ecca	[SCEV] Introduce ScalarEvolution::getOne and getZero. Summary: It is fairly common to call SE->getConstant(Ty, 0) or SE->getConstant(Ty, 1); this change makes such uses a little bit briefer. I've refactored the call sites I could find easily to use getZero / getOne. Reviewers: hfinkel, majnemer, reames Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D12947 llvm-svn: 248362	2015-09-23 01:59:04 +00:00
Evgeniy Stepanov	8d0e3011d8	Revert "Android support for SafeStack." test/Transforms/SafeStack/abi.ll breaks when target is not supported; needs refactoring. llvm-svn: 248358	2015-09-23 01:23:22 +00:00
Evgeniy Stepanov	ce2e16f00c	Android support for SafeStack. Add two new ways of accessing the unsafe stack pointer: * At a fixed offset from the thread TLS base. This is very similar to StackProtector cookies, but we plan to extend it to other backends (ARM in particular) soon. Bionic-side implementation here: https://android-review.googlesource.com/170988. * Via a function call, as a fallback for platforms that provide neither a fixed TLS slot, nor a reasonable TLS implementation (i.e. not emutls). llvm-svn: 248357	2015-09-23 01:03:51 +00:00
Cong Hou	9def6efd7e	Fixed an issue on updating profile data when lowering switch statement. Fixed the issue that when there is an edge from the jump table to the default statement, we should check it directly instead of checking if the sibling of the jump table header is a successor of the jump table header, which may not be the default statement but a successor of it. llvm-svn: 248354	2015-09-23 00:20:27 +00:00
Adrian Prantl	77fefeba37	Debug Info: Emit the dwo_name only in skeleton CUs, not in DWOs. llvm-svn: 248340	2015-09-22 23:21:00 +00:00
Matthias Braun	73e4221e6c	LiveIntervalAnalysis: Avoid multiple connected liveness components We may have subregister defs which are unused but not discovered and cleaned up prior to liveness analysis. This creates multiple connected components in the resulting live range which are forbidden in the MachineVerifier because they would unnecessarily constrain the register allocator. Rewrite those dead definitions to define a newly created virtual register. Differential Revision: http://reviews.llvm.org/D13035 llvm-svn: 248335	2015-09-22 22:37:44 +00:00
Matthias Braun	5efe871971	LiveInterval: Distribute subregister liveranges to new intervals in ConnectedVNInfoEqClasses::Distribute() This improves ConnectedVNInfoEqClasses::Distribute() to distribute the segments and value numbers in the subranges instead of conservatively clearing all subregister info. No separate test here, just clearing the subrange instead of properly distributing them would however break my upcoming fix regarding dead super register definitions. Differential Revision: http://reviews.llvm.org/D13075 llvm-svn: 248334	2015-09-22 22:37:42 +00:00
Michael Zolotukhin	deade19630	[Unroll] Do not crash trying to propagate a value to vector load. llvm-svn: 248333	2015-09-22 22:27:12 +00:00
Michael Zolotukhin	8bb31dd08a	[Unroll] Follow-up for r247769: fix a bug in UnrolledInstAnalyzer::visitLoad. Apart from checking that GlobalVariable is a constant, we should check that it's not a weak constant, in which case we can't propagate its value. llvm-svn: 248327	2015-09-22 21:41:29 +00:00
Ahmed Bougacha	81616a72ea	[ARM] Emit clrex in the expanded cmpxchg fail block. ARM counterpart to r248291: In the comparison failure block of a cmpxchg expansion, the initial ldrex/ldxr will not be followed by a matching strex/stxr. On ARM/AArch64, this unnecessarily ties up the execution monitor, which might have a negative performance impact on some uarchs. Instead, release the monitor in the failure block. The clrex instruction was designed for this: use it. Also see ARMARM v8-A B2.10.2: "Exclusive access instructions and Shareable memory locations". Differential Revision: http://reviews.llvm.org/D13033 llvm-svn: 248294	2015-09-22 17:22:58 +00:00
Ahmed Bougacha	07a844d758	[AArch64] Emit clrex in the expanded cmpxchg fail block. In the comparison failure block of a cmpxchg expansion, the initial ldrex/ldxr will not be followed by a matching strex/stxr. On ARM/AArch64, this unnecessarily ties up the execution monitor, which might have a negative performance impact on some uarchs. Instead, release the monitor in the failure block. The clrex instruction was designed for this: use it. Also see ARMARM v8-A B2.10.2: "Exclusive access instructions and Shareable memory locations". Differential Revision: http://reviews.llvm.org/D13033 llvm-svn: 248291	2015-09-22 17:21:44 +00:00
Benjamin Kramer	3c96f0a54e	Make helper function static. NFC. llvm-svn: 248278	2015-09-22 14:34:57 +00:00
Daniel Sanders	86cce70010	[mips][sched] Split IIBranch into specific instruction classes. Summary: Almost no functional change since the InstrItinData's have been duplicated. The one functional change is to remove IIBranch from the MSA branches. The classes will be assigned to the MSA instructions as part of implementing the P5600 scheduler. II_IndirectBranchPseudo and II_ReturnPseudo can probably be removed. I've preserved the itinerary information for the corresponding pseudo instructions to avoid making a functional change to these pseudos in this patch. Reviewers: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12189 llvm-svn: 248273	2015-09-22 13:36:28 +00:00
Daniel Sanders	1af1d275bc	[mips][sched] Temporarily rename IIAlu to IIM16Alu. NFC. Summary: The only instructions left in IIAlu are MIPS16 specific. We're not implementing a MIPS16 scheduler at this time so rename the class to make it obvious that they are MIPS16 instructions. Reviewers: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12188 llvm-svn: 248267	2015-09-22 12:36:28 +00:00
Stephen Canon	8216d88511	Don't raise inexact when lowering ceil, floor, round, trunc. The C standard has historically not specified whether or not these functions should raise the inexact flag. Traditionally on Darwin, these functions did raise inexact, and the llvm lowerings followed that convention. n1778 (C bindings for IEEE-754 (2008)) clarifies that these functions should not set inexact. This patch brings the lowerings for arm64 and x86 in line with the newly specified behavior. This also lets us fold some logic into TD patterns, which is nice. Differential Revision: http://reviews.llvm.org/D12969 llvm-svn: 248266	2015-09-22 11:43:17 +00:00
NAKAMURA Takumi	10c80e7996	Prune trailing whitespaces. llvm-svn: 248265	2015-09-22 11:19:03 +00:00
NAKAMURA Takumi	0a7d0ad95f	Untabify. llvm-svn: 248264	2015-09-22 11:15:07 +00:00
NAKAMURA Takumi	a9cb538a74	Reformat blank lines. llvm-svn: 248263	2015-09-22 11:14:39 +00:00
NAKAMURA Takumi	84965031a7	Reformat comment lines. llvm-svn: 248262	2015-09-22 11:14:12 +00:00
NAKAMURA Takumi	70ad98aca4	Reformat. llvm-svn: 248261	2015-09-22 11:13:55 +00:00
NAKAMURA Takumi	59a16a76be	ARMInstrInfo.cpp: Reformat. llvm-svn: 248260	2015-09-22 11:10:17 +00:00
NAKAMURA Takumi	bf9cc7f30b	Fix utf8 chars. llvm-svn: 248259	2015-09-22 11:10:08 +00:00
Daniel Sanders	f173dda0e2	[mips][ias] Implement .cpreturn directive. Summary: Based on a patch by David Chisnall. I've modified the original patch as follows: * Moved the expansion to the TargetStreamers so that the directive isn't expanded when emitting assembly. * Fixed an operand order bug. * Changed the move instructions from DADDu to OR to match recent changes to GAS. Reviewers: vkalintiris Subscribers: llvm-commits, emaste, seanbruno, theraven Differential Revision: http://reviews.llvm.org/D13017 llvm-svn: 248258	2015-09-22 10:50:09 +00:00
Daniel Sanders	254f387723	[mips][sched] Added class for WSBH Summary: No functional change since no InstrItinData is provided. Reviewers: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12190 llvm-svn: 248257	2015-09-22 10:01:13 +00:00
Simon Pilgrim	1cad0cd3ce	[X86][SSE] Match zero/any extension shuffles that don't start from the first element This patch generalizes the lowering of shuffles as zero extensions to allow extensions that don't start from the first element. It now recognises extensions starting anywhere in the lower 128-bits or at the start of any higher 128-bit lane. The motivation was to reduce the number of high cost pshufb calls, but it also improves the SSE2 case as well. Differential Revision: http://reviews.llvm.org/D12561 llvm-svn: 248250	2015-09-22 08:16:08 +00:00
Matt Arsenault	f11e7489e1	AMDGPU: Remove unnecessary check If the instruction doesn't have enough operands, it either shouldn't be marked as isCommutable or is malformed. llvm-svn: 248242	2015-09-22 04:17:45 +00:00
Matthias Braun	d3dd1354a4	LiveIntervalAnalysis: Factor common code into splitSeparateComponents; NFC llvm-svn: 248241	2015-09-22 03:44:41 +00:00
Evgeniy Stepanov	3c9c8338d0	Remove unused TargetTransformInfo dependency from SafeStack pass. llvm-svn: 248233	2015-09-22 00:44:32 +00:00
Michael Zolotukhin	9f3aea6e1f	[LoopUnswitch] Require DominatorTree info. Summary: We should either require the DT info to be available, or check if it's available in every place we use DT (and we already miss such check in one place, which causes failures in some cases). As other loop passes preserve DT and it's usually available, it makes sense to just require it here. There is no regression test, because the bug only shows up if pass manager decides to clean DT info right before LoopUnswitch. If loop-unswitch is run separately, DT is available, so bug isn't exposed. Reviewers: chandlerc, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13036 llvm-svn: 248230	2015-09-22 00:22:47 +00:00
Sanjoy Das	5d9a8cbb68	[SCEV] Use SaveAndRestore<T> instead of a hand rolled struct; NFCI. `ClearWalkingBEDominatingCondsOnExit` is exactly `SaveAndRestore<bool>`, so use `SaveAndRestore<bool>` instead. llvm-svn: 248227	2015-09-22 00:10:57 +00:00
Sanjay Patel	fc580a60e2	function names should start with a lower case letter; NFC llvm-svn: 248224	2015-09-21 23:03:16 +00:00
Sanjay Patel	4ac6b115e8	don't repeat function/variable names in header comments; NFC llvm-svn: 248222	2015-09-21 22:47:23 +00:00
Philip Reames	5f99423de9	[LICM] Hoist calls to readonly argmemonly functions even with stores in the loop We know that an argmemonly function can only access memory pointed to by its pointer arguments. Rather than needing to consider all possible stores as aliasing (as we do for a readonly function), we can only consider the aliasing of the pointer arguments. Note that this change only addresses hoisting. I'm thinking about how to address speculation safety as well, but that will be a different change. FYI, argmemonly disallows accessing memory through non-pointer typed arguments. Differential Revision: http://reviews.llvm.org/D12771 llvm-svn: 248220	2015-09-21 22:27:59 +00:00
Philip Reames	963febd4f8	Fix for pr24866 Turns out that not every basic block is guaranteed to have a node within the DominatorTree. This is really hard to trigger, but the test case from the PR managed to do so. There's active discussion continuing about what documentation and/or invariants need to be cleaned up. llvm-svn: 248216	2015-09-21 22:04:10 +00:00
Mehdi Amini	24e20583d1	Fix UB: can't bind a reference to nullptr (NFC) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 248213	2015-09-21 21:29:43 +00:00
David Blaikie	9ebdc69214	auto and range-for-ify some things to make changing container types a bit easier in the (possibly near) future llvm-svn: 248212	2015-09-21 21:07:50 +00:00
Simon Pilgrim	4003ed2da3	[DAGCombiner] Improve FMA support for interpolation patterns This patch adds support for combining patterns such as (FMUL(FADD(1.0, x), y)) and (FMUL(FSUB(x, 1.0), y)) to their FMA equivalents. This is useful in particular for linear interpolation cases such as (FADD(FMUL(x, t), FMUL(y, FSUB(1.0, t)))) Differential Revision: http://reviews.llvm.org/D13003 llvm-svn: 248210	2015-09-21 20:32:48 +00:00
Jeroen Ketema	41681a5329	[ARM] Do not scale vext with a factor The vext pseudo-instruction takes the number of elements that need to be extracted, not the number of bytes. Hence, use the number of elements directly instead of scaling them with a factor. Reviewers: Silviu Baranga, James Molloy (not reflected in the differential revision) Differential Revision: http://reviews.llvm.org/D12974 llvm-svn: 248208	2015-09-21 20:28:04 +00:00
Simon Pilgrim	e8e5a17a12	[DAGCombiner] Tidy up FMA combine helpers. NFCI. Based on feedback for D13003. llvm-svn: 248206	2015-09-21 20:15:03 +00:00
James Molloy	50a4c27f97	[LoopUtils,LV] Propagate fast-math flags on generated FCmp instructions We're currently losing any fast-math flags when synthesizing fcmps for min/max reductions. In LV, make sure we copy over the scalar inst's flags. In LoopUtils, we know we only ever match patterns with hasUnsafeAlgebra, so apply that to any synthesized ops. llvm-svn: 248201	2015-09-21 19:41:19 +00:00
Stephen Canon	b12db0e42c	Remove roundingMode argument in APFloat::mod Because mod is always exact, this function should have never taken a rounding mode argument. The actual implementation still has issues, which I'll look at resolving in a subsequent patch. llvm-svn: 248195	2015-09-21 19:29:25 +00:00
Matt Arsenault	8fb9b94f7f	Fix accidentally committed debug printing llvm-svn: 248190	2015-09-21 18:21:10 +00:00
Marcello Maggioni	ab58c74d98	[DivergenceAnalysis] Separated definition of class into header. The definition of the DivergenceAnalysis pass was in a CPP file and wasn't accessible to users of the analysis to get it through "getAnalysis<>()". This patch extracts the definition into a separate header that can be used by users of the analysis to fetch the results. Patch by Volkan Keles (vkeles@apple.com) llvm-svn: 248186	2015-09-21 17:58:14 +00:00
Matthias Braun	b9fe44ddb0	SelectionDAG: Use InsertNode for EntryNode This fixes problems where two nodes have persistent debug id 0 assigned. llvm-svn: 248182	2015-09-21 17:41:05 +00:00
Chandler Carruth	7542d37688	[FunctionAttrs] Extract a helper function for the core logic used to evaluate whether 'readonly' or 'readnone' apply to a given function. This both reduces indentation and will make it easy to share the logic with a new pass manager implementation. llvm-svn: 248181	2015-09-21 17:39:41 +00:00
Ulrich Weigand	126caeb043	[SystemZ] Fix expansion of ISD::FPOW and ISD::FSINCOS The ISD::FPOW and ISD::FSINCOS opcodes default to Legal, but there is no legal instruction for those on SystemZ. This could cause LLVM internal errors. Fixed by setting the operation action to Expand for those opcodes. Also added test cases for all other LLVM IR intrinsics that should generate a library call. (Those already work correctly since the default operation action is fine.) llvm-svn: 248180	2015-09-21 17:35:45 +00:00
James Molloy	e46da3849a	Revert "[ARM] Handle +t2dsp feature as an ArchExtKind in ARMTargetParser.def" This was committed without the code review (http://reviews.llvm.org/D12937) being approved. This reverts commit r248152. llvm-svn: 248174	2015-09-21 16:35:08 +00:00
Matt Arsenault	85441dd724	AMDGPU: Move copy handling under switch like other instructions llvm-svn: 248172	2015-09-21 16:27:22 +00:00
Sanjay Patel	55dcd40d3e	add ShouldChangeType() variant that takes bitwidths This is more efficient for cases like D12965 where we already have widths. llvm-svn: 248170	2015-09-21 16:09:37 +00:00
Matt Arsenault	b774834429	DAGCombiner: Replace store of FP constant after attempting store merges If storing multiple FP constants, some subset of the stores would be replaced with integers due to visit order, so MergeConsecutiveStores would only partially merge these. llvm-svn: 248169	2015-09-21 15:59:46 +00:00
Matt Arsenault	a30ddb6524	Factor replacement of stores of FP constants into new function llvm-svn: 248168	2015-09-21 15:59:43 +00:00
Sanjay Patel	84dca494b1	don't repeat function names in comments; NFC llvm-svn: 248166	2015-09-21 15:33:26 +00:00
Chad Rosier	03a47305ec	[Machine Combiner] Refactor machine reassociation code to be target-independent. No functional change intended. Patch by Haicheng Wu <haicheng@codeaurora.org>! http://reviews.llvm.org/D12887 PR24522 llvm-svn: 248164	2015-09-21 15:09:11 +00:00
Artyom Skrobov	79b0adaae4	[ARM] Handle +t2dsp feature as an ArchExtKind in ARMTargetParser.def Currently, the availability of DSP instructions (ACLE 6.4.7) is handled in a hand-rolled tricky condition block in tools/clang/lib/Basic/Targets.cpp, with a FIXME: attached. This patch changes the handling of +t2dsp to be in line with other architecture extensions. Following review comments, also updating the description of FeatureDSPThumb2 in ARM.td. Differential Revision: http://reviews.llvm.org/D12937 llvm-svn: 248152	2015-09-21 12:43:10 +00:00
Asaf Badouh	eaf2da14bf	[X86][AVX512] add masked version for RSQRT14 & RCP14 Scalar FP Differential Revision: http://reviews.llvm.org/D12524 llvm-svn: 248147	2015-09-21 10:23:53 +00:00
Daniel Sanders	5d7962880d	[mips] Allow constant expressions in second argument of .cpsetup. Summary: Also tightened up the test and made a trivial fix to prevent double-newline after emitting .cpsetup directives. Reviewers: vkalintiris Subscribers: seanbruno, emaste, llvm-commits Differential Revision: http://reviews.llvm.org/D12956 llvm-svn: 248143	2015-09-21 09:26:55 +00:00
Craig Topper	0013be16ff	Use makeArrayRef or None to avoid unnecessarily mentioning the ArrayRef type extra times. NFC llvm-svn: 248140	2015-09-21 05:32:41 +00:00
Craig Topper	4e9b03d6f9	Don't pass StringRefs around by const reference. Pass by value instead per coding standards. NFC llvm-svn: 248136	2015-09-21 00:18:00 +00:00
Craig Topper	3c76c523e1	Cleanup places that passed SMLoc by const reference to pass it by value instead. NFC llvm-svn: 248135	2015-09-20 23:35:59 +00:00
Sanjoy Das	7cc2cfecd9	[IndVars] Use C++11 style field initialization; NFCI. llvm-svn: 248131	2015-09-20 18:42:53 +00:00
Sanjoy Das	e1e352d5c5	[IndVars] Don't add a level of indentation for namespace {. NFC. Whitespace-only change. llvm-svn: 248130	2015-09-20 18:42:50 +00:00
Igor Breger	b7e1f9d680	AVX512: Implemented encoding and intrinsics for vcmpss/sd. Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D12593 llvm-svn: 248121	2015-09-20 15:15:10 +00:00
Asaf Badouh	2744d21fb8	[X86][AVX512] extend support in Scalar conversion add scalar FP to Int conversion with truncation intrinsics add scalar conversion FP32 from/to FP64 intrinsics add rounding mode and SAE mode encoding for these intrinsics Differential Revision: http://reviews.llvm.org/D12665 llvm-svn: 248117	2015-09-20 14:31:19 +00:00
Igor Breger	4c4cd789c9	AVX512: vsqrtss/sd encoding and intrinsics implementation. Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D12102 llvm-svn: 248116	2015-09-20 09:13:41 +00:00
Asaf Badouh	572bbceecc	[X86][AVX512DQ] Add fpclass instruction Differential Revision: http://reviews.llvm.org/D12931 llvm-svn: 248115	2015-09-20 08:46:07 +00:00
Michael Kuperstein	58e86bc893	[X86] Fix sitofp and uitofp instruction matching failures with long double and avx512 The operation action for i32 and i64 cannot be set to legal, as long double needs custom lowering. Patch by: mitch.l.bodart@intel.com Differential Revision: http://reviews.llvm.org/D12372 llvm-svn: 248114	2015-09-20 08:12:17 +00:00
Igor Breger	1d55f20bee	AVX512: Implemented intrinsics for vshuff32x4, vshuff64x2, vshufi64x2, vshufi32x4 Added tests for intrinsics. Differential Revision: http://reviews.llvm.org/D12525 llvm-svn: 248113	2015-09-20 07:18:53 +00:00
Sanjoy Das	9119bf4c0b	[IndVars] Don't repeat function names in comment; NFC. Only changes comments. llvm-svn: 248112	2015-09-20 06:58:03 +00:00
Igor Breger	0ede3cbb5c	AVX512: Implement instructions encoding, lowering and intrinsics vinserti64x4, vinserti64x2, vinserti32x8, vinserti32x4, vinsertf64x4, vinsertf64x2, vinsertf32x8, vinsertf32x4 Added tests for encoding, lowering and intrinsics. Differential Revision: http://reviews.llvm.org/D11893 llvm-svn: 248111	2015-09-20 06:52:42 +00:00
Saleem Abdulrasool	4966f58ac2	ARM: cleanup formatting clang-format a line which was poorly formatted. NFC. llvm-svn: 248110	2015-09-20 03:19:09 +00:00
Sanjoy Das	428db150d1	[IndVars] Fix a bug in r248045. Because -indvars widens induction variables through arithmetic, `NeverNegative` cannot be a property of the `WidenIV` (a `WidenIV` manages information for all transitive uses of an IV being widened, including uses of `-1 * IV`). Instead it must live on `NarrowIVDefUse` which manages information for a specific def-use edge in the transitive use list of an induction variable. This change also adds a test case that demonstrates the problem with r248045. llvm-svn: 248107	2015-09-20 01:52:18 +00:00
Simon Pilgrim	d0448ee59f	[X86][SSE] Vectorize CTTZ + CTTZ_ZERO_UNDEF Now that we have fast vector CTPOP implementations we can use this to speed up vector CTTZ using the pattern (cttz(x) = ctpop((x & -x) - 1)) Additionally, for AVX512CD that provides lzcnt instructions we can use the pattern (cttz_undef(x) = (width - 1) - ctlz(x & -x)) Differential Revision: http://reviews.llvm.org/D12663 llvm-svn: 248091	2015-09-19 13:22:57 +00:00
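The two identities quoted in r248091 can be sanity-checked on scalars. The sketch below is an illustration only, not the X86 lowering itself: `WIDTH`, `ctpop`, `ctlz`, `cttz_reference` and the `cttz_*` helper names are assumptions made for this example, and Python integers are masked to a fixed width to mimic one vector lane.

```python
WIDTH = 32                      # assumed lane width for the illustration
MASK = (1 << WIDTH) - 1


def ctpop(x):
    """Population count of x, treated as a WIDTH-bit unsigned value."""
    return bin(x & MASK).count("1")


def ctlz(x):
    """Leading-zero count of x, treated as a WIDTH-bit unsigned value."""
    return WIDTH - (x & MASK).bit_length()


def cttz_reference(x):
    """Straightforward trailing-zero count used as the oracle."""
    x &= MASK
    if x == 0:
        return WIDTH
    n = 0
    while (x & 1) == 0:
        x >>= 1
        n += 1
    return n


def cttz_via_ctpop(x):
    """cttz(x) = ctpop((x & -x) - 1); also yields WIDTH for x == 0."""
    x &= MASK
    lowest_set = x & (-x & MASK)          # isolate the lowest set bit
    return ctpop((lowest_set - 1) & MASK)  # ones below that bit


def cttz_undef_via_ctlz(x):
    """cttz_undef(x) = (width - 1) - ctlz(x & -x); only valid for x != 0."""
    x &= MASK
    lowest_set = x & (-x & MASK)
    return (WIDTH - 1) - ctlz(lowest_set)
```

The ctpop form also covers a zero input, because `(0 & -0) - 1` wraps to all ones. The ctlz form is only meaningful for nonzero inputs, which matches the zero-undef variant named in the commit.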
Simon Pilgrim	996725eb17	[InstCombine] Use SimplifyDemandedVectorEltsLow helper function. NFCI. Use the SimplifyDemandedVectorEltsLow helper function introduced in D12680. llvm-svn: 248089	2015-09-19 11:41:53 +00:00
Matt Arsenault	1fafdc82d6	AMDGPU: Remove dead code getCFGStructurizerRegClass is not used for SI, so move it into R600 specific stuff. llvm-svn: 248087	2015-09-19 06:41:10 +00:00
Bob Wilson	8823b84fae	NFC: Fix indentation and add braces to clarify nesting of else-statement. llvm-svn: 248086	2015-09-19 06:20:59 +00:00
Maksim Panchenko	0510cd5161	[PrologEpilogInserter] Minor refactoring. Differential Revision: http://reviews.llvm.org/D12924 llvm-svn: 248084	2015-09-19 04:42:15 +00:00
Maksim Panchenko	07b754daf8	Test commit. Fix comment. NFC. llvm-svn: 248082	2015-09-19 04:01:19 +00:00
David Majnemer	47ce0b81b0	[InstCombine] FoldICmpCstShrCst failed for ashr when comparing against -1 (icmp eq (ashr C1, %V) -1) may have multiple answers if C1 is not a power of two and has the sign bit set. This fixes PR24873. llvm-svn: 248074	2015-09-19 00:48:31 +00:00
David Majnemer	e5977ebecc	[InstCombine] FoldICmpCstShrCst didn't handle icmps of -1 in the ashr case correctly llvm-svn: 248073	2015-09-19 00:48:26 +00:00
Sanjoy Das	f69d0e3384	[IndVars] Widen more comparisons for non-negative induction vars Summary: If an induction variable is provably non-negative, its sign extension is equal to its zero extension. This means narrow uses like icmp slt iNarrow %indvar, %rhs can be widened into icmp slt iWide zext(%indvar), sext(%rhs) Reviewers: atrick, mcrosier, hfinkel Subscribers: hfinkel, reames, llvm-commits Differential Revision: http://reviews.llvm.org/D12745 llvm-svn: 248045	2015-09-18 21:21:02 +00:00
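The premise of r248045, that sign extension and zero extension agree on non-negative values, can be checked exhaustively for a small width. This is a minimal sketch under stated assumptions: the helper names `zext` and `sext` and the 8-bit narrow type are chosen for illustration and do not correspond to any LLVM API.

```python
def zext(value, narrow_bits):
    """Zero-extend: the widened value is the narrow bit pattern read as unsigned."""
    return value & ((1 << narrow_bits) - 1)


def sext(value, narrow_bits):
    """Sign-extend: the widened value is the narrow bit pattern read as two's complement."""
    value &= (1 << narrow_bits) - 1
    sign_bit = 1 << (narrow_bits - 1)
    return (value ^ sign_bit) - sign_bit


# Every 8-bit pattern with the sign bit clear (that is, every non-negative i8).
agree_on_non_negative = all(sext(v, 8) == zext(v, 8) for v in range(0, 128))

# Every 8-bit pattern with the sign bit set: the two extensions always differ.
differ_on_negative = all(sext(v, 8) != zext(v, 8) for v in range(128, 256))
```

Because the two extensions only agree when the sign bit is clear, the widening depends on the induction variable being provably non-negative, as the commit message states.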
Cong Hou	d40105d321	Update edge weights properly when merging blocks in if-conversion. In if-conversion, there is a utility function MergeBlocks() that is used to merge blocks. However, when new edges are built in this function the edge weight is either not provided or not updated properly, leading to a modified CFG with incorrect edge weights. This patch corrects this issue. Differential Revision: http://reviews.llvm.org/D12513 llvm-svn: 248030	2015-09-18 20:22:41 +00:00
Eric Christopher	a835956bda	Limit the range of processors supported by ARM fast isel to v6 or later as that's all that is tested right now. Fixes PR24858. llvm-svn: 248027	2015-09-18 20:08:18 +00:00
Larisse Voufo	532bf7153c	Clean up: Refactoring the hardcoded value of 6 for FindAvailableLoadedValue()'s parameter MaxInstsToScan. (Complete version of r247497. See D12886) llvm-svn: 248022	2015-09-18 19:14:35 +00:00
James Y Knight	e72b0dbf97	Make MachineScheduler debug output less confusing. At least...a little bit. llvm-svn: 248020	2015-09-18 18:52:20 +00:00
Cong Hou	f9f9ffb98b	Scaling up values in ARMBaseInstrInfo::isProfitableToIfCvt() before they are scaled by a probability to avoid precision issue. In ARMBaseInstrInfo::isProfitableToIfCvt(), there is a simple cost model in which the number of cycles is scaled by a probability to estimate the cost. However, when the number of cycles is small (which is usually the case), there is a precision issue after the computation. To avoid this issue, this patch scales those cycles by 1024 (chosen to make the multiplication a little faster) before they are scaled by the probability. Other variables are also scaled up for the final comparison. Differential Revision: http://reviews.llvm.org/D12742 llvm-svn: 248018	2015-09-18 18:19:40 +00:00
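The precision issue described in r248018 is the usual one with integer arithmetic on small values. The sketch below is not the code in `ARMBaseInstrInfo::isProfitableToIfCvt()`: it assumes the probability is a numerator/denominator pair and that the cost is computed with truncating integer division, and the function names are hypothetical. Only the factor 1024 comes from the commit message.

```python
SCALE = 1024  # scaling factor named in the commit message


def cost_unscaled(cycles, prob_num, prob_den):
    """Scale a cycle count by a probability using truncating integer math."""
    return cycles * prob_num // prob_den


def cost_prescaled(cycles, prob_num, prob_den):
    """Scale the cycle count up first so the division keeps more precision."""
    return cycles * SCALE * prob_num // prob_den


# With a probability of 1/3, one cycle and two cycles both truncate to zero
# without pre-scaling, so the cost model cannot tell them apart.
unscaled = (cost_unscaled(1, 1, 3), cost_unscaled(2, 1, 3))      # (0, 0)
prescaled = (cost_prescaled(1, 1, 3), cost_prescaled(2, 1, 3))   # (341, 682)
```

Any quantity compared against a pre-scaled cost has to be multiplied by the same factor, which is why the commit notes that other variables are scaled up for the final comparison.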
Matthias Braun	77771cfd97	SelectionDAGDumper: Leave out the <multiple use> markers They mostly clutter the output while it is still possible to see which node has multiple users without them. Differential Revision: http://reviews.llvm.org/D12569 llvm-svn: 248013	2015-09-18 17:57:33 +00:00
Matthias Braun	bab3fb45e5	SelectionDAGDumper: Avoid unnecessary newlines Before: t0 = EntryToken:ch t0: <multiple use> t0: <multiple use> t1 = CopyFromReg:v4f32,ch t0, Register:v4f32 %vreg0 t25 = IMPLICIT_DEF:v4f32 t26 = HADDPSrr:v4f32 t1, t25 t23 = CopyToReg:ch,glue t0, Register:v4f32 %XMM0, t26 t23: <multiple use> t23: <multiple use> t24 = RETQ:ch Register:v4f32 %XMM0, t23, t23:1 After: t0: <multiple use> t0: <multiple use> t1 = CopyFromReg:v4f32,ch t0, Register:v4f32 %vreg0 t26 = X86ISD::FHADD:v4f32 t1, undef:v4f32 t23 = CopyToReg:ch,glue t0, Register:v4f32 %XMM0, t26 t23: <multiple use> t21 = TargetConstant:i16<0> t23: <multiple use> t24 = X86ISD::RET_FLAG:ch t23, t21, Register:v4f32 %XMM0, t23:1 Differential Revision: http://reviews.llvm.org/D12568 llvm-svn: 248012	2015-09-18 17:57:31 +00:00
Matthias Braun	f89b7c7188	SelectionDAGDumper: Hide [ID=X], [ORD=X] and source locations by default. You can show them with the new -dag-dump-verbose switch. Differential Revision: http://reviews.llvm.org/D12566 llvm-svn: 248011	2015-09-18 17:57:28 +00:00
Matthias Braun	0b7d6c14c9	SelectionDAG: Introduce PersistentID to SDNode for assert builds. This gives us more human readable numbers to identify nodes in debug dumps. Before: 0x7fcbd9700160: ch = EntryToken 0x7fcbd985c7c8: i64 = Register %RAX ... 0x7fcbd9700160: <multiple use> 0x7fcbd985c578: i64,ch = MOV64rm 0x7fcbd985c6a0, 0x7fcbd985cc68, 0x7fcbd985c200, 0x7fcbd985cd90, 0x7fcbd985ceb8, 0x7fcbd9700160<Mem:LD8[@foo]> [ORD=2] 0x7fcbd985c8f0: ch,glue = CopyToReg 0x7fcbd9700160, 0x7fcbd985c7c8, 0x7fcbd985c578 [ORD=3] 0x7fcbd985c7c8: <multiple use> 0x7fcbd985c8f0: <multiple use> 0x7fcbd985c8f0: <multiple use> 0x7fcbd985ca18: ch = RETQ 0x7fcbd985c7c8, 0x7fcbd985c8f0, 0x7fcbd985c8f0:1 [ORD=3] Now: t0: ch = EntryToken t5: i64 = Register %RAX ... t0: <multiple use> t3: i64,ch = MOV64rm t10, t12, t11, t13, t14, t0<Mem:LD8[@foo]> [ORD=2] t6: ch,glue = CopyToReg t0, t5, t3 [ORD=3] t5: <multiple use> t6: <multiple use> t6: <multiple use> t7: ch = RETQ t5, t6, t6:1 [ORD=3] Differential Revision: http://reviews.llvm.org/D12564 llvm-svn: 248010	2015-09-18 17:41:00 +00:00
Geoff Berry	43ec15e57e	[AArch64] Improved bitfield instruction selection. Summary: For bitfield insert OR matching, check both operands for larger pattern first before checking for smaller pattern. Add pattern for unsigned bitfield insert-in-zero done with SHL+AND. Resolves PR21631. Reviewers: jmolloy, t.p.northover Subscribers: aemerson, rengolin, llvm-commits, mcrosier Differential Revision: http://reviews.llvm.org/D12908 llvm-svn: 248006	2015-09-18 17:11:53 +00:00
Rafael Espindola	7dbb577826	Remove temporary file on signal. Without this lld leaves temporary files behind when it crashes. llvm-svn: 247994	2015-09-18 15:17:53 +00:00
Daniel Sanders	df19a5e605	[mips][microMIPS] Fix an invalid read for lwm32 and reserved reglist values. Summary: Some values of 'reglist' are reserved and cause the disassembler to read past the end of the Regs array. Treat lwm32's containing reserved values as invalid instructions. Reviewers: zoran.jovanovic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12959 llvm-svn: 247990	2015-09-18 14:20:54 +00:00
Chad Rosier	d90e2ebdf6	[AArch64] Reorder cases to improve readability. NFC. llvm-svn: 247989	2015-09-18 14:15:19 +00:00
Chad Rosier	84a0afdeff	[AArch64] Remove some redundant cases. NFC. llvm-svn: 247988	2015-09-18 14:13:18 +00:00
Aaron Ballman	2d0f38c5fb	Silencing a -Wsign-compare warning; NFC. llvm-svn: 247986	2015-09-18 13:31:42 +00:00
Igor Laevsky	0fa4819dd8	[LazyValueInfo] Report nonnull range for nonnull pointers Currently LazyValueInfo will report only allocas as having a nonnull range. For loads with !nonnull metadata it will bail out with no additional information. Same is true for calls returning nonnull pointers. This change extends LazyValueInfo to handle additional nonnull instructions. Differential Revision: http://reviews.llvm.org/D12932 llvm-svn: 247985	2015-09-18 13:01:48 +00:00
Artur Pilipenko	84bc62f7a3	Support align attribute for return values Reviewed By: reames Differential Revision: http://reviews.llvm.org/D12844 llvm-svn: 247984	2015-09-18 12:33:31 +00:00
Michael Kruse	020296a968	[Support] Reapply r245289 "Always wait for GraphViz before opening the viewer" The change was accidentally undone by r245290. Original log message: When calling DisplayGraph and a PS viewer is chosen, two programs are executed: The GraphViz generator and the PostScript viewer. Always wait for the generator to finish to ensure that the .ps file is written before opening the viewer for that file. DisplayGraph's wait parameter refers to whether to wait until the user closes the viewer. This happened on Windows and if none of the options to open the .dot file directly applies, also on Linux. Differential Revision: http://reviews.llvm.org/D11876 llvm-svn: 247980	2015-09-18 10:56:30 +00:00
David Majnemer	9966fe8f85	[WinEH] Moved funclet pads should be in relative order We shifted the MachineBasicBlocks to the end of the MachineFunction in DFS order. This will not ensure that MachineBasicBlocks which fell through to one another will remain contiguous. Instead, implement a stable sort algorithm for iplist. This partially reverts commit r214150. llvm-svn: 247978	2015-09-18 08:18:07 +00:00
Bob Wilson	dd0eadce7d	Whitespace. Indent with spaces instead of a tab. llvm-svn: 247969	2015-09-18 05:36:13 +00:00
Quentin Colombet	b4c6886215	[ShrinkWrap] Refactor the handling of infinite loop in the analysis. - Strengthen the logic to be sure we hoist the restore point out of the current loop. (This fixes a bug with infinite loop, added as part of the patch.) - Walk over the exit blocks of the current loop to converge to the desired restore point in one iteration of the update loop. llvm-svn: 247958	2015-09-17 23:21:34 +00:00
David Blaikie	6a51dbdb3c	[opaque pointer types] Add an explicit pointee type to alias records in the IR Since aliases actually use and verify their explicit type already, no further invalid testing is required here. The invalid.test:ALIAS-TYPE-MISMATCH case catches errors due to emitting a non-pointee type in the new format or a non-pointer type in the old format. llvm-svn: 247952	2015-09-17 22:18:59 +00:00
Alexei Starovoitov	bf19a11116	[bpf] expand indirect branches BPF instruction set doesn't have indirect branches. Expand them. Reported by John Fastabend. llvm-svn: 247951	2015-09-17 22:18:08 +00:00
Matthias Braun	3e86de1acb	Revert "(HEAD -> master, origin/master, origin/HEAD) RegisterPressure: Move LiveInRegs/LiveOutRegs from RegisterPressure to PressureTracker" This reverts commit r247943. Accidental commit, code review was not finished yet. llvm-svn: 247945	2015-09-17 21:12:24 +00:00
Matthias Braun	70eff2571f	RegisterPressure: Move LiveInRegs/LiveOutRegs from RegisterPressure to PressureTracker Differential Revision: http://reviews.llvm.org/D12814 llvm-svn: 247943	2015-09-17 21:10:06 +00:00
Matthias Braun	d78ee54a54	MachineScheduler: Provide an option for node hiding cutoff and disable it by default llvm-svn: 247942	2015-09-17 21:09:59 +00:00
Joerg Sonnenberger	1bbfa7f9d7	[SPARC] Add mulscc. llvm-svn: 247940	2015-09-17 20:54:26 +00:00
Sanjay Patel	5dd66c3ca2	fix typo; NFC llvm-svn: 247938	2015-09-17 20:51:50 +00:00
David Majnemer	978902309a	[WinEH] Add a funclet layout pass Windows EH funclets need to be contiguous. The FuncletLayout pass will ensure that the funclets are together and begin with a funclet entry MBB. Differential Revision: http://reviews.llvm.org/D12943 llvm-svn: 247937	2015-09-17 20:45:18 +00:00
Reid Kleckner	5b8a46e771	[WinEH] Make funclet return instrs pseudo instrs This makes catchret look more like a branch, and less like a weird use of BlockAddress. It also lets us get away from llvm.x86.seh.restoreframe, which relies on the old parentfpoffset label arithmetic. llvm-svn: 247936	2015-09-17 20:43:47 +00:00
Piotr Padlewski	a4d43337d4	gvn small fix http://reviews.llvm.org/D12928 llvm-svn: 247935	2015-09-17 20:34:22 +00:00
Simon Pilgrim	61116ddc7b	[InstCombine] Added vector demanded bits support for SSE4A EXTRQ/INSERTQ instructions The SSE4A instructions EXTRQ/INSERTQ only use the lower 64-bits (or less) for many of their input vector operands and all of them have undefined upper 64-bits results. Differential Revision: http://reviews.llvm.org/D12680 llvm-svn: 247934	2015-09-17 20:32:45 +00:00
Piotr Padlewski	ea09288ee7	Added MD_invariant_group to LLVMContext http://reviews.llvm.org/D12926 llvm-svn: 247931	2015-09-17 20:25:07 +00:00
Teresa Johnson	ff642b9b84	Restore "Function bitcode index in Value Symbol Table and lazy reading support" This reverts commit r247898 (which reverted r247894). Patch fixed to address two issues exposed by buildbots: - unused variable warning in NDEBUG mode - std::initializer_list lifetime issue causing test failures Original Summary: Support for including the function bitcode indices in the Value Symbol Table. This requires writing the VST after the function blocks, which in turn requires a new VST forward declaration record encoding the offset of the full VST (which is backpatched to contain the offset after the VST is written). This patch also enables the lazy function reader to use the new function indices out of the VST. This support will be used by ThinLTO as well, which will be in a follow on patch. Backwards compatibility with older bitcode files is maintained. A new test is also included. The bitcode format (used for the lazy reader as well as the upcoming ThinLTO patches) came out of discussions with Duncan and others and is described here: https://drive.google.com/file/d/0B036uwnWM6RWdnBLakxmeDdOeXc/view Reviewers: dexonsmith, davidxl, joker.eph Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12536 llvm-svn: 247927	2015-09-17 20:12:00 +00:00
Sanjoy Das	7a9f8bb995	[SCEV] Use auto instead of full iterator type; NFCI. llvm-svn: 247919	2015-09-17 19:04:09 +00:00
Reid Kleckner	ed17079b52	[WinEH] Add and use hasEHPadSuccessor instead of getLandingPadSuccessor getLandingPadSuccessor assumes that each invoke can have at most one EH pad successor, but WinEH invokes can have more than one. Two out of three callers of getLandingPadSuccessor don't use the returned landingpad, so we can make them use this simple predicate instead. Eventually we'll have to circle back and fix SplitKit.cpp so that register allocation works. Baby steps. llvm-svn: 247904	2015-09-17 17:19:40 +00:00
Zia Ansari	841cce1ae9	Test commit. llvm-svn: 247901	2015-09-17 16:51:27 +00:00
Teresa Johnson	2e98d57ad4	Revert "Function bitcode index in Value Symbol Table and lazy reading support" Temporarily revert to fix some buildbot issues. One is a minor issue with a variable unused in NDEBUG mode. More concerning are some test failures on win7 that I need to dig into. This reverts commit 4e66a74543459832cfd571db42b4543580ae1d1d. llvm-svn: 247898	2015-09-17 16:19:10 +00:00
Daniel Sanders	e2982adc0b	[mips] Add assembler support for the .cprestore directive. Summary: This assembler directive is used in O32 PIC to restore the current function's $gp after executing JAL's. The $gp is first stored on the stack at a user-specified offset. It has the following format: ".cprestore 8" (where 8 is the offset). This fixes llvm.org/PR20967. Patch by Toma Tabacu. Reviewers: seanbruno, tomatabacu Subscribers: brooks, seanbruno, emaste, llvm-commits Differential Revision: http://reviews.llvm.org/D6267 llvm-svn: 247897	2015-09-17 16:08:39 +00:00
Teresa Johnson	b77b1f8a0c	Function bitcode index in Value Symbol Table and lazy reading support Summary: Support for including the function bitcode indices in the Value Symbol Table. This requires writing the VST after the function blocks, which in turn requires a new VST forward declaration record encoding the offset of the full VST (which is backpatched to contain the offset after the VST is written). This patch also enables the lazy function reader to use the new function indices out of the VST. This support will be used by ThinLTO as well, which will be in a follow on patch. Backwards compatibility with older bitcode files is maintained. A new test is also included. The bitcode format (used for the lazy reader as well as the upcoming ThinLTO patches) came out of discussions with Duncan and others and is described here: https://drive.google.com/file/d/0B036uwnWM6RWdnBLakxmeDdOeXc/view Reviewers: dexonsmith, davidxl, joker.eph Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12536 llvm-svn: 247894	2015-09-17 15:52:30 +00:00
Teresa Johnson	c01e4cbccc	Refactor string encoding checks in BitcodeWriter (NFC) llvm-svn: 247891	2015-09-17 14:37:35 +00:00
Chad Rosier	6c1f0933ac	Typos. NFC. llvm-svn: 247884	2015-09-17 13:10:27 +00:00
Zoran Jovanovic	7ba636cb4c	[mips][microMIPS] Implement TEQ, TGE, TGEU, TLT, TLTU and TNE instructions Differential Revision: http://reviews.llvm.org/D9658 llvm-svn: 247880	2015-09-17 10:14:09 +00:00
Elena Demikhovsky	702a6adfaa	AVX-512: shufflevector for i1 vectors <2 x i1> .. <64 x i1> AVX-512 does not provide an instruction that shuffles mask register. So I do the following way: mask-2-simd , shuffle simd , simd-2-mask Differential Revision: http://reviews.llvm.org/D12727 llvm-svn: 247876	2015-09-17 06:53:12 +00:00
Diego Novillo	3376a78781	GCC AutoFDO profile reader - Initial support. This adds enough machinery to support reading simple GCC AutoFDO profiles. It now supports reading flat profiles (no function calls). Subsequent patches will add support for: - Inlined calls (in particular, the inline call stack is not traversed to accumulate samples). - Working sets and modules. These are used mostly for GCC's LIPO optimizations, so they're not needed in LLVM atm. I'm not sure that we will ever need them. For now, I've if0'd around the calls. The patch also adds support in GCOV.h for gcov version V704 (generated by GCC's profile conversion tool). llvm-svn: 247874	2015-09-17 00:17:24 +00:00
Hans Wennborg	9099b5e644	Try to fix WebAssembly build after r247864 llvm-svn: 247870	2015-09-16 23:59:57 +00:00
Naomi Musgrave	f90c1be78c	ScalarEvolution: added tmp to avoid use-after-dtor in for loop. Summary: For loop destroyed current instance before invoking next. Temporary variable added to prevent use-after-dtor when invoke destructor on current instance. Reviewers: eugenis Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D12912 Rename temp var. llvm-svn: 247867	2015-09-16 23:46:40 +00:00
Eric Christopher	eb8b4bc473	Make sure we're negating the assembler predicate - no testcase because it isn't being used on anything via the assembler right now. llvm-svn: 247866	2015-09-16 23:38:18 +00:00
Eric Christopher	c7b155f670	Use the cached TargetInstrInfo instead of looking it up again. llvm-svn: 247865	2015-09-16 23:38:16 +00:00
Eric Christopher	a4e5d3cf8e	constify the Function parameter to the TTI creation callback and propagate to all callers/users/etc. llvm-svn: 247864	2015-09-16 23:38:13 +00:00
Reid Kleckner	813f1b65bc	[WinEH] Rip out the landingpad-based C++ EH state numbering code It never really worked, and the new code is working better every day. llvm-svn: 247860	2015-09-16 22:14:46 +00:00
David Majnemer	67bff0d88b	[WinEHPrepare] Turn terminatepad into a cleanuppad + call + cleanupret The MSVC doesn't really support exception specifications so let's just turn these into cleanuppads. Later, we might use terminatepad to more efficiently encode the "noexcept"-ness of a function body. llvm-svn: 247848	2015-09-16 20:42:16 +00:00
Sanjoy Das	e5f4889ba9	[InstCombine] Optimize icmp slt signum(x), 1 --> icmp slt x, 1 Summary: `signum(x)` is sometimes implemented as `(x >> 63) | (-x >>> 63)` (for an `i64` `x`). This change adds a matcher for that pattern, and an instcombine rule to optimize `signum(x) s< 1`. Later, we can also consider optimizing: icmp slt signum(x), 0 --> icmp slt x, 0 icmp sle signum(x), 1 --> true etc. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12703 llvm-svn: 247846	2015-09-16 20:41:29 +00:00
Reid Kleckner	b005d281c3	[WinEH] Pull Adjectives and CatchObj out of the catchpad arg list Clang now passes the adjectives as an argument to catchpad. Getting the CatchObj working is simply a matter of threading another static alloca through codegen, first as an alloca, then as a frame index, and finally as a frame offset. llvm-svn: 247844	2015-09-16 20:16:27 +00:00
David Majnemer	459a64aed7	[WinEHPrepare] Provide a cloning mode which doesn't demote We are experimenting with a new approach to saving and restoring SSA values used across funclets: let the register allocator do the dirty work for us. However, this means that we need to be able to clone commoned blocks without relying on demotion. llvm-svn: 247835	2015-09-16 18:40:37 +00:00
David Majnemer	b3d9b960ea	[WinEHPrepare] Refactor explicit EH preparation Split the preparation machinery into several functions, we will want to selectively enable/disable different parts of it for an alternative mechanism for dealing with cross-funclet uses. llvm-svn: 247834	2015-09-16 18:40:24 +00:00
Reid Kleckner	84ebff4a5e	[WinEH] Skip state numbering when no EH pads are present Otherwise we'd try to emit the thunk that passes the LSDA to __CxxFrameHandler3. We don't emit the LSDA if there were no landingpads, so we'd end up with an assembler error when trying to write the COFF object. llvm-svn: 247820	2015-09-16 17:19:44 +00:00
Dan Gohman	950a13cfa3	[WebAssembly] Check in an initial CFG Stackifier pass This pass implements a simple algorithm for conversion from CFG to wasm's structured control flow. It doesn't yet handle multiple-entry loops; that will be added in a future patch. It also adds initial support for switch statements. Differential Revision: http://reviews.llvm.org/D12735 llvm-svn: 247818	2015-09-16 16:51:30 +00:00
Sanjay Patel	a260701bbb	propagate fast-math-flags on DAG nodes After D10403, we had FMF in the DAG but disabled by default. Nick reported no crashing errors after some stress testing, so I enabled them at r243687. However, Escha soon notified us of a bug not covered by any in-tree regression tests: if we don't propagate the flags, we may fail to CSE DAG nodes because differing FMF causes them to not match. There is one test case in this patch to prove that point. This patch hopes to fix or leave a 'TODO' for all of the in-tree places where we create nodes that are FMF-capable. I did this by putting an assert in SelectionDAG.getNode() to find any FMF-capable node that was being created without FMF ( D11807 ). I then ran all regression tests and test-suite and confirmed that everything passes. This patch exposes remaining work to get DAG FMF to be fully functional: (1) add the flags to non-binary nodes such as FCMP, FMA and FNEG; (2) add the flags to intrinsics; (3) use the flags as conditions for transforms rather than the current global settings. Differential Revision: http://reviews.llvm.org/D12095 llvm-svn: 247815	2015-09-16 16:31:21 +00:00
Reid Kleckner	85dfb68e50	Add assembler fatal error for undefined assembler labels in COFF writer llvm-svn: 247814	2015-09-16 16:26:29 +00:00
Sanjay Patel	815adacd22	don't repeat function names in comments; NFC llvm-svn: 247813	2015-09-16 16:21:08 +00:00
Adhemerval Zanella	f0c95bd2ca	[sanitizer] Add MSan support for AArch64 This patch adds support for msan on aarch64-linux for both 39 and 42-bit VMA. The support is enabled by defining the SANITIZER_AARCH64_VMA compiler flag to either 39 or 42 at build time for both clang/llvm and compiler-rt. The default VMA is 39 bits. llvm-svn: 247807	2015-09-16 15:10:27 +00:00
Joerg Sonnenberger	22cd644e1b	[SPARC] Both GNU and Solaris as support eq as condition code for integer ops. llvm-svn: 247804	2015-09-16 14:41:36 +00:00
Joerg Sonnenberger	9763490e4d	[SPARC] Recognize st/stx operations with %fsr argument too. llvm-svn: 247794	2015-09-16 13:30:54 +00:00
David L Kreitzer	da700ce581	Test commit: Fixed a few typos in the comments. llvm-svn: 247793	2015-09-16 13:27:30 +00:00
Chad Rosier	5d485db6b2	[ARM] Register ARMPreAllocLoadStoreOpt pass with LLVM pass manager. llvm-svn: 247791	2015-09-16 13:11:31 +00:00
Michael Kuperstein	d926465342	[X86] Do not generate 64-bit pops of 32-bit GPRs. When trying to emit a stack adjustment using pops, frame lowering selects an arbitrary free GPR. It should always select one from an appropriate class... This fixes PR24649. Patch by: amjad.aboud@intel.com Differential Revision: http://reviews.llvm.org/D12609 llvm-svn: 247785	2015-09-16 11:27:20 +00:00
Michael Kuperstein	098cd9fba7	[X86] Fix emitEpilogue() to make less assumptions about pops This is the mirror image of r242395. When X86FrameLowering::emitEpilogue() looks for where to insert the %esp addition that deallocates stack space used for local allocations, it assumes that any sequence of pop instructions from function exit backwards consists purely of restoring callee-save registers. This may be false, since from some point backward, the pops may be clean-up of stack space allocated for arguments to a call. Patch by: amjad.aboud@intel.com Differential Revision: http://reviews.llvm.org/D12688 llvm-svn: 247784	2015-09-16 11:18:25 +00:00
Zoran Jovanovic	6e6a2c9cd7	[mips][microMIPS] Implement PREFX, LHUE, LBE, LBUE, LHE, LWE, SBE, SHE and SWE instructions Differential Revision: http://reviews.llvm.org/D9189 llvm-svn: 247780	2015-09-16 09:14:35 +00:00
Craig Topper	5db36df4d0	Use range-based for loops. NFC llvm-svn: 247772	2015-09-16 03:52:35 +00:00
Craig Topper	77ec077067	Fix a spelling error in the description of a statistic. NFC llvm-svn: 247771	2015-09-16 03:52:32 +00:00
Michael Zolotukhin	fc314be0ec	[Unroll] Fix a bug in UnrolledInstAnalyzer::visitLoad. We only checked that a global is initialized with constants, which is incorrect. We should be checking that GlobalVariable is a constant, not just initialized with it. llvm-svn: 247769	2015-09-16 03:25:09 +00:00
Sanjoy Das	8a5526e8be	[IndVars] Fix PR24783. In `IndVarSimplify::ExpandSCEVIfNeeded`, `SCEVExpander::findExistingExpansion` may return an `llvm::Value` that differs in type from the SCEV it was asked to find an expansion for (but computes the same value). In such cases, we fall back on `expandCodeFor`; and rely on LLVM to CSE the two equivalent expressions (different only by a no-op cast) into a single computation. I tried a few other approaches to fixing PR24783, all of which turned out to be more complex than this current version: 1. Move the `ExpandSCEVIfNeeded` logic into `expandCodeFor`. This got problematic because currently we do not pass in the `Loop *` into `expandCodeFor`. Changing the interface to do this is a more invasive change, and really does not make much semantic sense unless the SCEV being passed in is an add recurrence. There is also the problem of `expandCodeFor` being used in places other than `indvars` -- there may be performance / correctness issues elsewhere if `expandCodeFor` is moved from always generating IR from scratch to cache-like model. 2. Have `findExistingExpansion` only return expression with the correct type. This would make `isHighCostExpansionHelper` and thus `isHighCostExpansion` more conservative than necessary. 3. Insert casts on the value returned by `findExistingExpansion` if needed using `InsertNoopCastOfTo`. This is complicated because `InsertNoopCastOfTo` depends on internal state of its `SCEVExpander` (specifically `Builder.GetInsertPoint()`), and this may not be set up when `ExpandSCEVIfNeeded` is called. 4. Manually insert casts on the value returned by `findExistingExpansion` if needed using `InsertNoopCastOfTo` via `CastInst::Create`. This is probably workable, but figuring out the location where the cast instruction needs to be inserted has enough edge cases (arguments, constants, invokes, LCSSA must be preserved) makes me feel what I have right now is simplest solution. llvm-svn: 247749	2015-09-15 23:45:39 +00:00
Sanjoy Das	0ce51a92a8	[IndVars] Rename variable; NFC. llvm-svn: 247748	2015-09-15 23:45:35 +00:00
Duncan P. N. Exon Smith	cff5feff6f	Reapply "LTO: Disable extra verify runs in release builds" This reverts commit r247730, effectively reapplying r247729. This time I have an lld commit ready to follow. llvm-svn: 247735	2015-09-15 23:05:59 +00:00
Alexey Samsonov	c1603b6493	[ASan] Don't instrument globals in .preinit_array/.init_array/.fini_array These sections contain pointers to function that should be invoked during startup/shutdown by __libc_csu_init and __libc_csu_fini. Instrumenting these globals will append redzone to them, which will be filled with zeroes. This will cause null pointer dereference at runtime. Merge ASan regression tests for globals that should be ignored by instrumentation pass. llvm-svn: 247734	2015-09-15 23:05:48 +00:00
Duncan P. N. Exon Smith	7de73e56a4	Revert "LTO: Disable extra verify runs in release builds" This temporarily reverts commit r247729, as it caused lld build failures. I'll recommit once I have an lld patch ready-to-go. llvm-svn: 247730	2015-09-15 22:47:38 +00:00
Duncan P. N. Exon Smith	236787838c	LTO: Disable extra verify runs in release builds The verifier currently runs three times in LTO: (1) after parsing, (2) at the beginning of the optimization pipeline, and (3) at the end of it. The first run is important, since we're not sure where the bitcode comes from and it's nice to validate it, but in release builds the extra runs aren't appropriate. This commit: - Allows these runs to be disabled in LTOCodeGenerator. - Adds command-line options to llvm-lto. - Adds command-line options to libLTO.dylib, and disables the verifier by default in release builds (based on NDEBUG). This shaves about 3.5% off the runtime of ld64 when linking verify-uselistorder with -flto -g. rdar://22509081 llvm-svn: 247729	2015-09-15 22:26:11 +00:00
Larisse Voufo	6b867c7254	Revert "Clean up: Refactoring the hardcoded value of 6 for FindAvailableLoadedValue()'s parameter MaxInstsToScan." for preliminary community discussion (See. D12886) llvm-svn: 247716	2015-09-15 19:14:05 +00:00
Piotr Padlewski	6c15ec49ed	Introducing llvm.invariant.group.barrier intrinsic For more info for what reason it was invented, goto: http://lists.llvm.org/pipermail/cfe-dev/2015-July/044227.html invariant.group.barrier: http://reviews.llvm.org/D12310 docs: http://reviews.llvm.org/D11399 CodeGenPrepare: http://reviews.llvm.org/D12875 llvm-svn: 247711	2015-09-15 18:32:14 +00:00
Quentin Colombet	dc29c973e5	[ShrinkWrapping] Fix an infinite loop while looking for restore point. This may happen when the input program itself contains an infinite loop with no exit block. In that case, we would fail to find a block post-dominating the loop such that this block is outside of the loop. This fixes PR24823. Working on reducing the test case. llvm-svn: 247710	2015-09-15 18:19:39 +00:00
Arch D. Robison	8ed0854f55	Broaden optimization of fcmp ([us]itofp x, constant) by instcombine. The patch extends the optimization to cases where the constant's magnitude is so small or large that the rounding of the conversion is irrelevant. The "so small" case includes negative zero. Differential review: http://reviews.llvm.org/D11210 llvm-svn: 247708	2015-09-15 17:51:59 +00:00
Igor Laevsky	bdc1eafe20	[CorrelatedValuePropagation] Infer nonnull attributes LazyValueInfo can prove that value is nonnull based on the context information. Make use of this ability to infer nonnull attributes for the call arguments. Differential Revision: http://reviews.llvm.org/D12836 llvm-svn: 247707	2015-09-15 17:51:50 +00:00
Marcello Maggioni	454faa84e2	[NaryReassociate] Add support for Mul instructions This patch extends the current pass by handling Mul instructions as well. Patch by: Volkan Keles (vkeles@apple.com) llvm-svn: 247705	2015-09-15 17:22:52 +00:00
Daniel Sanders	50f17235dd	Revert r247692: Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC. Eric has replied and has demanded the patch be reverted. llvm-svn: 247702	2015-09-15 16:17:27 +00:00
Sanjay Patel	e9434e80d1	80-cols; NFC llvm-svn: 247700	2015-09-15 15:26:25 +00:00
Sanjay Patel	f9b776350f	more space; NFC llvm-svn: 247699	2015-09-15 15:24:42 +00:00
Zoran Jovanovic	dc4b8c2761	[mips][microMIPS] Fix an issue with disassembling lwm32 instruction Fixed microMIPS disassembler crash on test case generated by llvm-mc-fuzzer. Differential Revision: http://reviews.llvm.org/D12881 llvm-svn: 247698	2015-09-15 15:21:27 +00:00
Zoran Jovanovic	8eb8c9861d	[mips] Add support for branch-likely pseudo-instructions Differential Revision: http://reviews.llvm.org/D10537 llvm-svn: 247697	2015-09-15 15:06:26 +00:00
Ulrich Weigand	e861e6442c	[SystemZ] Fix assertion failure in tryBuildVectorShuffle Under certain circumstances, tryBuildVectorShuffle would attempt to create a BUILD_VECTOR node with an invalid combination of types. This happened when one of the components of the original BUILD_VECTOR was itself a TRUNCATE node. That TRUNCATE was stripped off during intermediate processing to simplify code, but when adding the node back to the result vector, we still need it to get the type right. llvm-svn: 247694	2015-09-15 14:27:46 +00:00
Daniel Sanders	153010c52d	Re-commit r247683: Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC. Summary: This is the first patch in the series to migrate Triple's (which are ambiguous) to TargetTuple's (which aren't). For the moment, TargetTuple simply passes all requests to the Triple object it holds. Once it has replaced Triple, it will start to implement the interface in a more suitable way. This change makes some changes to the public C++ API. In particular, InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer() now take TargetTuples instead of Triples. The other public C++ API's have been left as-is for the moment to reduce patch size. This commit also contains a trivial patch to clang to account for the C++ API change. Thanks go to Pavel Labath for fixing LLDB for me. Reviewers: rengolin Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10969 llvm-svn: 247692	2015-09-15 14:08:28 +00:00
Daniel Sanders	c40de48041	Revert r247684 - Replace Triple with a new TargetTuple ... LLDB needs to be updated in the same commit. llvm-svn: 247686	2015-09-15 13:46:21 +00:00
Daniel Sanders	18d4b0dab7	Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC. Summary: This is the first patch in the series to migrate Triple's (which are ambiguous) to TargetTuple's (which aren't). For the moment, TargetTuple simply passes all requests to the Triple object it holds. Once it has replaced Triple, it will start to implement the interface in a more suitable way. This change makes some changes to the public C++ API. In particular, InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer() now take TargetTuples instead of Triples. The other public C++ API's have been left as-is for the moment to reduce patch size. This commit also contains a trivial patch to clang to account for the C++ API change. Reviewers: rengolin Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10969 llvm-svn: 247683	2015-09-15 13:17:40 +00:00
Daniel Sanders	c8cd6e95d2	Fix namespace indentation and missing blank lines before 'public:' in *MCAsmInfo.h. NFC. This is to reduce noise in a following commit. Also fixes a couple missing spaces before the reference operator. llvm-svn: 247679	2015-09-15 12:27:06 +00:00
James Molloy	d5b161a221	[GlobalsAA] Disable globals-aa by default Several issues have been found with it - disabling in the meantime. llvm-svn: 247674	2015-09-15 10:44:06 +00:00
Zoran Jovanovic	7beb737b46	[mips][microMIPS] Implement CACHEE and PREFE instructions for microMIPS32r6 Differential Revision: http://reviews.llvm.org/D11632 llvm-svn: 247670	2015-09-15 10:05:10 +00:00
Daniel Sanders	e4e83a7bc1	[mips] Added support for various EVA ASE instructions. Summary: Added support for the following instructions: CACHEE, LBE, LBUE, LHE, LHUE, LWE, LLE, LWLE, LWRE, PREFE, SBE, SHE, SWE, SCE, SWLE, SWRE, TLBINV, TLBINVF This required adding some infrastructure for the EVA ASE. Patch by Scott Egerton. Reviewers: vkalintiris, dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11139 llvm-svn: 247669	2015-09-15 10:02:16 +00:00
Sanjoy Das	f75e15e5ac	[PlaceSafepoints] Make the width of a counted loop settable. Summary: This change lets a `PlaceSafepoints` client change how wide the trip count of a loop has to be for the loop to be considered "counted", via `CountedLoopTripWidth`. It also removes the boolean `SkipCounted` flag and the `upperTripBound` constant -- we can get the old behavior of `SkipCounted` == `false` by setting `CountedLoopTripWidth` to `13` (2 ^ 13 == 8192). Reviewers: reames Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D12789 llvm-svn: 247656	2015-09-15 01:42:48 +00:00
Dan Gohman	311b488d76	[WebAssembly] Implement int64-to-int32 conversion. llvm-svn: 247649	2015-09-15 00:55:19 +00:00
Adrian Prantl	deef90d7f5	DwarfDebug: Emit dwo_id+dwo_name for DICompileUnits that provide a dwoId. For module debugging clang emits prefabricated skeleton compile units that can be recognized by a nonzero dwoId. llvm-svn: 247626	2015-09-14 22:10:22 +00:00
David Blaikie	798f3079d4	[opaque pointer types] Add an explicit value type to GlobalObject This is needed by all GlobalObjects (GlobalAlias, Function, GlobalVariable), see the GlobalObject::getValueType which is used in many places. If at some point that can be removed, then we can remove this member. llvm-svn: 247621	2015-09-14 21:47:27 +00:00
David Blaikie	6614d8d230	[opaque pointer types] Switch a few cases of getElementType over, since I had them lying around anyway llvm-svn: 247610	2015-09-14 20:29:26 +00:00
Matthias Braun	3f3934b010	RegisterPressure: Simplify close{Top|Bottom}() - There are no duplicate registers in LiveRegs list we are copying from and so we do not need to sort the registers. - Simply use SmallVector::append instead of a loop between begin() and end() with push_back(). Differential Revision: http://reviews.llvm.org/D12813 llvm-svn: 247588	2015-09-14 18:24:15 +00:00
Chen Li	0d043b52eb	[InstCombineCalls] Use isKnownNonNullAt() to check nullness of passing arguments at callsite Summary: This patch replaces isKnownNonNull() with isKnownNonNullAt() when checking nullness of passing arguments at callsite. In this way it can handle cases where the argument does not have nonnull attribute but has a dominating null check from the CFG. It also adds assertions in isKnownNonNull() and isKnownNonNullFromDominatingCondition() to make sure the value checked is pointer type (as defined in LLVM document). These assertions might trip failures in things which are not covered under llvm/test, but fixes should be pretty obvious. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12779 llvm-svn: 247587	2015-09-14 18:10:43 +00:00
David Blaikie	16a2f3e302	Revert "[opaque pointer type] Pass GlobalAlias the actual pointer type rather than decomposing it into pointee type + address space" This was a flawed change - it just caused the getElementType call to be deferred until later, when we really need to remove it. Now that the IR for GlobalAliases has been updated, the root cause is addressed that way instead and this change is no longer needed (and in fact gets in the way - because we want to pass the pointee type directly down further). Follow up patches to push this through GlobalValue, bitcode format, etc, will come along soon. This reverts commit 236160. llvm-svn: 247585	2015-09-14 18:01:59 +00:00
Jun Bum Lim	34b9bd0435	Improve ISel using across lane min/max reduction In vectorized integer min/max reduction code, the final "reduce" step is sub-optimal. In AArch64, this change will combine : %svn0 = vector_shuffle %0, undef<2,3,u,u> %smax0 = smax %0, %svn0 %svn3 = vector_shuffle %smax0, undef<1,u,u,u> %sc = setcc %smax0, %svn3, gt %n0 = extract_vector_elt %sc, #0 %n1 = extract_vector_elt %smax0, #0 %n2 = extract_vector_elt %smax0, #1 %result = select %n0, %n1, %n2 becomes : %1 = smaxv %0 %result = extract_vector_elt %1, 0 This change extends r246790. llvm-svn: 247575	2015-09-14 16:19:52 +00:00
Daniel Sanders	7d6d898826	[mips] Unified the MipsMemSimm9GPRAsmOperand and MipsMemSimm9AsmOperand operands, NFC. Summary: These operands had the same purpose, however the MipsMemSimm9GPRAsmOperand operand was only for micromips32r6 and the MipsMemSimm9AsmOperand did not have a ParserMatchClass. Patch by Scott Egerton Reviewers: vkalintiris, dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12730 llvm-svn: 247573	2015-09-14 15:57:24 +00:00
JF Bastien	26aca14b15	[MergeFuncs] Fix bug in merging GetElementPointers GetElementPointers must have the first argument's type compared for structural equivalence. Previously the code erroneously compared the pointer's type, but this code was dead because all pointer types (of the same address space) are the same. The pointee must be compared instead (using the type stored in the GEP, not from the pointer type which will be erased anyway). Author: jrkoenig Reviewers: dschuff, nlewycky, jfb Subscribers: nlewycky, llvm-commits Differential revision: http://reviews.llvm.org/D12820 llvm-svn: 247570	2015-09-14 15:37:48 +00:00
John Brawn	056e67865a	[ARM] Extract shifts out of multiply-by-constant Turning (op x (mul y k)) into (op x (lsl (mul y k>>n) n)) is beneficial when we can do the lsl as a shifted operand and the resulting multiply constant is simpler to generate. Do this by doing the transformation when trying to select a shifted operand, as that ensures that it actually turns out better (the alternative would be to do it in PreprocessISelDAG, but we don't know for sure there if extracting the shift would allow a shifted operand to be used). Differential Revision: http://reviews.llvm.org/D12196 llvm-svn: 247569	2015-09-14 15:19:41 +00:00
Simon Atanasyan	75e3b3c826	[mips] Remove redundant nested-name-specifier. NFC llvm-svn: 247547	2015-09-14 11:18:22 +00:00
Simon Atanasyan	4d45c76895	[mips] Save a copy of MipsABIInfo in the MipsTargetStreamer to escape a dangling pointer The MipsTargetELFStreamer can receive ABI info from many sources. For example, from the MipsAsmParser instance. Lifetime of the MipsAsmParser can be shorter than MipsTargetELFStreamer's lifetime. In that case we get a dangling pointer to MipsABIInfo. Differential Revision: http://reviews.llvm.org/D12805 llvm-svn: 247546	2015-09-14 11:18:03 +00:00
NAKAMURA Takumi	98800d9c3a	GlobalsAAResult: Try to fix crash. DeletionCallbackHandle holds GAR in its creation. It assumes; - It is registered as CallbackVH. It should not be moved in its life. - Its parent, GAR, may be moved. To move list<DeletionCallbackHandle> GlobalsAAResult::Handles, GAR must be updated with the destination in GlobalsAAResult(&&). llvm-svn: 247534	2015-09-14 06:16:44 +00:00
Simon Pilgrim	f8f86ab176	[X86][MMX] Added shuffle decodes for MMX/3DNow! shuffles. Added shuffle decodes for MMX PUNPCK + PSHUFW shuffles. Added shuffle decodes for 3DNow! PSWAPD shuffles. llvm-svn: 247526	2015-09-13 11:28:45 +00:00
Chandler Carruth	3824f859f5	[FunctionAttrs] Move the malloc-like test to a static helper function that could be used from a new pass manager. This one makes particular sense as a static helper as it doesn't even need TLI. llvm-svn: 247525	2015-09-13 08:23:27 +00:00
Chandler Carruth	8874b78697	[FunctionAttrs] Factor the logic to test for a known non-null return out of a method and into a re-usable static helper. We can potentially use this function from the implementation of a new pass manager oriented version of the pass. Also add some better documentation of exactly what the semantic model of this routine is (it isn't trivial) and use a more modern naming convention for it. llvm-svn: 247524	2015-09-13 08:17:14 +00:00
Elena Demikhovsky	8671fcbbd6	AVX-512: Fixed a bug in OR/XOR operations for 512-bit FP values on KNL. KNL does not have VXORPS, VORPS for 512-bit values. I use integer VPXOR, VPOR that actually do the same. X86ISD::FXOR/FOR are generated as a result of FSUB combining. Differential Revision: http://reviews.llvm.org/D12753 llvm-svn: 247523	2015-09-13 08:15:15 +00:00
Chandler Carruth	444d005615	[FunctionAttrs] Make the per-function attribute inference a boring static function rather than a method. It just needed access to TargetLibraryInfo, and this way it can be easily reused between the current FunctionAttrs implementation and any port for the new pass manager. llvm-svn: 247522	2015-09-13 08:03:23 +00:00
Chandler Carruth	d02452015c	[FunctionAttrs] Collect utility functions as static helpers rather than methods. They don't need anything from the class anyways. Also, collect the declarations into the private section of the pass. llvm-svn: 247521	2015-09-13 07:50:43 +00:00
Chandler Carruth	a632fb9e86	Clean up doxygen comments in FunctionAttrs, promoting some non-doxygen comments, deleting duplicate comments, moving comments to consistently live on the definition since these are all really internal routines, etc. NFC. llvm-svn: 247520	2015-09-13 06:57:25 +00:00
Chandler Carruth	63559d7cd4	Do some spring cleaning on FunctionAttrs.cpp with clang-format prior to other refactorings and cleanups here. llvm-svn: 247519	2015-09-13 06:47:20 +00:00
Sanjay Patel	8b960d22ad	[x86] enable machine combiner reassociations for 128-bit vector logical integer insts (2nd try) The changes in: test/CodeGen/X86/machine-cp.ll are just due to scheduling differences after some logic instructions were reassociated. llvm-svn: 247516	2015-09-12 19:47:50 +00:00
Ahmed Bougacha	49b531a08d	[CodeGen] Fix AtomicExpand invalidation issue caused by r247429. llvm-svn: 247514	2015-09-12 18:51:23 +00:00
Simon Pilgrim	5253b7b4a7	[X86] Renamed lowerVectorShuffleAsUnpack NFCI. Renamed to lowerVectorShuffleAsPermuteAndUnpack to make it clear that it lowers to more than just a UNPCK instruction. llvm-svn: 247513	2015-09-12 18:26:47 +00:00
Simon Pilgrim	2fcfef542a	[X86] Moved lowerVectorShuffleWithUNPCK earlier to make reuse easier. NFCI. llvm-svn: 247511	2015-09-12 16:03:06 +00:00
Sanjay Patel	99f7370a79	revert r247506; need to verify changes in existing tests llvm-svn: 247507	2015-09-12 15:27:31 +00:00
Sanjay Patel	08755c7dbc	[x86] enable machine combiner reassociations for 128-bit vector logical integer insts llvm-svn: 247506	2015-09-12 14:58:04 +00:00
Simon Pilgrim	48ffca0f47	Fixed unused variable warning. llvm-svn: 247505	2015-09-12 14:00:17 +00:00
Simon Pilgrim	20c607b110	[InstCombine] CVTPH2PS Vector Demanded Elements + Constant Folding Improved InstCombine support for CVTPH2PS (F16C half 2 float conversion): <4 x float> @llvm.x86.vcvtph2ps.128(<8 x i16>) - only uses the bottom 4 i16 elements for the conversion. Added constant folding support. Differential Revision: http://reviews.llvm.org/D12731 llvm-svn: 247504	2015-09-12 13:39:53 +00:00
Chandler Carruth	29a18a4663	[PM] Port SROA to the new pass manager. In some ways this is a very boring port to the new pass manager as there are no interesting analyses or dependencies or other oddities. However, this does introduce the first good example of a transformation pass with non-trivial state porting to the new pass manager. I've tried to carve out patterns here to replicate elsewhere, and would appreciate comments on whether folks like these patterns: - A common need in the new pass manager is to effectively lift the pass class and some of its state into a public header file. Prior to this, LLVM used anonymous namespaces to provide "module private" types and utilities, but that doesn't scale to cases where a public header file is needed and the new pass manager will exacerbate that. The pattern I've adopted here is to use the namespace-cased-name of the core pass (what would be a module if we had them) as a module-private namespace. Then utility and other code can be declared and defined in this namespace. At some point in the future, we could even have (conditionally compiled) code that used modules features when available to do the same basic thing. - I've split the actual pass run method in two in order to expose a private method usable by the old pass manager to wrap the new class with a minimum of duplicated code. I actually looked at a bunch of ways to automate or generate these, but they are all quite terrible IMO. The fundamental need is to extract the set of analyses which need to cross this interface boundary, and that will end up being too unpredictable to effectively encapsulate IMO. This is also a relatively small amount of boiler plate that will live a relatively short time, so I'm not too worried about the fact that it is boiler plate. The rest of the patch is totally boring but results in a massive diff (sorry). It just moves code around and removes or adds qualifiers to reflect the new name and nesting structure. Differential Revision: http://reviews.llvm.org/D12773 llvm-svn: 247501	2015-09-12 09:09:14 +00:00
Larisse Voufo	f57162b6e7	Clean up: Refactoring the hardcoded value of 6 for FindAvailableLoadedValue()'s parameter MaxInstsToScan. llvm-svn: 247497	2015-09-12 01:41:55 +00:00
Bruce Mitchener	e9ffb45b60	Fix typos. Summary: This fixes a variety of typos in docs, code and headers. Subscribers: jholewinski, sanjoy, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D12626 llvm-svn: 247495	2015-09-12 01:17:08 +00:00
Davide Italiano	983366ab12	[MC] Fix style bugs introduced in r247471. Reported by Rafael Espindola. llvm-svn: 247483	2015-09-11 22:04:21 +00:00
Davide Italiano	63cee81c3c	[MC] Don't crash on division by zero. Differential Revision: http://reviews.llvm.org/D12776 llvm-svn: 247471	2015-09-11 20:47:35 +00:00
Yunzhong Gao	46261a74db	Add a non-exiting diagnostic handler for LTO. This is in order to give LTO clients a chance to do some clean-up before terminating the process. llvm-svn: 247461	2015-09-11 20:01:53 +00:00
Sanjay Patel	41c739b3fa	typo; NFC llvm-svn: 247454	2015-09-11 19:29:18 +00:00
Akira Hatanaka	bc497c93f5	Use function attribute "stackrealign" to decide whether stack realignment should be forced. With this commit, we can now force stack realignment when doing LTO and do so on a per-function basis. Also, add a new cl::opt option "stackrealign" to CommandFlags.h which is used to force stack realignment via llc's command line. Out-of-tree projects currently using -force-align-stack to force stack realignment should make changes to attach the attribute to the functions in the IR. Differential Revision: http://reviews.llvm.org/D11814 llvm-svn: 247450	2015-09-11 18:54:38 +00:00
David Majnemer	0e70598a5b	[X86] Make sure startproc/endproc are paired We used different conditions to determine if we should emit startproc vs endproc. Use the same condition to ensure that they will always be paired. This fixes PR24374. llvm-svn: 247435	2015-09-11 17:34:34 +00:00
Reid Kleckner	5dbee7baef	[IR] Print the label operands of a catchpad like an invoke The rest of the EH pads are fine, since they have at most one label and take fewer operands for the personality. Old catchpad vs. new: %5 = catchpad [i8* bitcast (i32 ()* @"\01?filt$0@0@main@@" to i8*)] to label %__except.ret.10 unwind label %catchendblock.9 ----- %5 = catchpad [i8* bitcast (i32 ()* @"\01?filt$0@0@main@@" to i8*)] to label %__except.ret.10 unwind label %catchendblock.9 llvm-svn: 247433	2015-09-11 17:27:52 +00:00
Ahmed Bougacha	5246867384	[CodeGen] Refactor TLI/AtomicExpand interface to make LLSC explicit. We used to have this magic "hasLoadLinkedStoreConditional()" callback, which really meant two things: - expand cmpxchg (to ll/sc). - expand atomic loads using ll/sc (rather than cmpxchg). Remove it, and, instead, introduce explicit callbacks: - bool shouldExpandAtomicCmpXchgInIR(inst) - AtomicExpansionKind shouldExpandAtomicLoadInIR(inst) Differential Revision: http://reviews.llvm.org/D12557 llvm-svn: 247429	2015-09-11 17:08:28 +00:00
Ahmed Bougacha	9d677131c4	[CodeGen] Rename AtomicRMWExpansionKind to AtomicExpansionKind. This lets us generalize its usage to the other atomic instructions. llvm-svn: 247428	2015-09-11 17:08:17 +00:00
NAKAMURA Takumi	25b5bd27ea	[PR24785] Appease MSC18 to tweak optimizations. This brings a warning. cl : Command line warning D9035: option 'Og-' has been deprecated and will be removed in a future release We should resolve PR11951 to remove this tweak. llvm-svn: 247427	2015-09-11 17:08:02 +00:00
Yaron Keren	102e3ce5b3	Add #include llvm-config.h to Locale.cpp which depends on LLVM_ON_WIN32. Source code was assuming that llvm-config.h would be included somehow but up to r247253 that added #include "llvm/Support/Compiler.h" to StringRef.h the config file was not actually included. The inclusion of llvm-config.h caused a change of behaviour in tools/clang/test/Frontend/source-col-map.c: previously it would output the original UTF-8 but now it outputs <U+03B1>. llvm-svn: 247409	2015-09-11 13:22:47 +00:00
NAKAMURA Takumi	8061e8645f	PPCFrameLowering::emitEpilogue(): Avoid manipulating MBBI on iterator end. It caused crash in MachineInstr::hasPropertyInBundle() since r247237. llvm-svn: 247395	2015-09-11 08:20:56 +00:00
David Blaikie	2f40830dde	[opaque pointer type] Add textual IR support for explicit type parameter for global aliases update.py: import fileinput import sys import re alias_match_prefix = r"(.(?:=\|:\|^)\s(?:external \|)(?:(?:private\|internal\|linkonce\|linkonce_odr\|weak\|weak_odr\|common\|appending\|extern_weak\|available_externally) )?(?:default \|hidden \|protected )?(?:dllimport \|dllexport )?(?:unnamed_addr \|)(?:thread_local(?:$[a-z]$)? )?alias" plain = re.compile(alias_match_prefix + r" (.?))(\| addrspace$\d+$ )\($\| (?:%\|@\|null\|undef\|blockaddress\|addrspacecast\|\[\[[a-zA-Z]\|\{\{).$)") cast = re.compile(alias_match_prefix + r") ((?:bitcast\|inttoptr\|addrspacecast)\s$. to (.?)(\| addrspace\(\d+$ )\\)\s(?:;.)?$)") gep = re.compile(alias_match_prefix + r") ((?:getelementptr)\s(?:inbounds)?\s$(?P<type>.), (?P=type)(?:\saddrspace\(\d+$\s)?\* .\)\s(?:;.)?$)") def conv(line): m = re.match(cast, line) if m: return m.group(1) + " " + m.group(3) + ", " + m.group(2) m = re.match(gep, line) if m: return m.group(1) + " " + m.group(3) + ", " + m.group(2) m = re.match(plain, line) if m: return m.group(1) + ", " + m.group(2) + m.group(3) + "" + m.group(4) + "\n" return line for line in sys.stdin: sys.stdout.write(conv(line)) apply.sh: for name in "$@" do python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name" rm -f "$name.tmp" done The actual commands: From llvm/src: find test/ -name .ll \| xargs ./apply.sh From llvm/src/tools/clang: find test/ -name .mm -o -name .m -o -name .cpp -o -name .c \| xargs -I '{}' ../../apply.sh "{}" From llvm/src/tools/polly: find test/ -name .ll \| xargs ./apply.sh llvm-svn: 247378	2015-09-11 03:22:04 +00:00
Cong Hou	c416e4182a	Fixed a bug that BranchProbability is not defined in BlockFrequency.cpp. NFC. llvm-svn: 247376	2015-09-11 02:47:30 +00:00
Duncan P. N. Exon Smith	adc5456da0	AsmWriter: Avoid O(N^2) processing of metadata Fix embarrassing bugs I introduced to the `SlotTracker` in or around r235785. I had us iterating through every instruction in a function (and hitting a map in the LLVMContext) for every basic block in the function. While there, completely avoid the call to `SlotTracker::processFunctionMetadata()` from `SlotTracker::processFunction()` if we've speculatively done this already in `SlotTracker::processModule()` by checking `ShouldInitializeAllMetadata` (this wasn't an algorithmic problem, but it's touching the same line of code). Fixes PR24699. llvm-svn: 247372	2015-09-11 01:34:59 +00:00
Mehdi Amini	2bd08527ff	Revert "[InstCombineCalls] Use isKnownNonNullAt() to check nullness of passing arguments at callsite" This reverts commit r247356. Breaks test/Transforms/InstCombine/pr8547.ll with: Wrong types for attribute: byval inalloca nest noalias nocapture nonnull readnone readonly sret dereferenceable(1) dereferenceable_or_null(1) %call = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([10 x i8], [10 x i8]* @.str, i64 0, i64 0), i32 nonnull %conv2) #0 LLVM ERROR: Broken function found, compilation aborted! From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 247371	2015-09-11 01:33:48 +00:00
Kostya Serebryany	dd02f1f8ab	[libFuzzer] perform fewer crossover operations compared to plain mutations llvm-svn: 247364	2015-09-11 00:20:58 +00:00
Reid Kleckner	95ce1df93a	Add .exe check to Execute to fix clang-modernize tests broken in r247358 llvm-svn: 247361	2015-09-10 23:59:45 +00:00
Reid Kleckner	89d4b1a77c	ScanDirForExecutable on Windows fails to find executables with the "exe" extension in name When the driver tries to locate a program by its name, e.g. a linker, it scans the paths provided by the toolchain using the ScanDirForExecutable function. If the lookup fails, the driver uses llvm::sys::findProgramByName. Unlike llvm::sys::findProgramByName, ScanDirForExecutable is not aware of file extensions. If the program has the "exe" extension in its name, which is very common on Windows, ScanDirForExecutable won't find it under the toolchain-provided paths. This patch changes the Windows version of the "`can_execute`" function called by ScanDirForExecutable to respect file extensions, similarly to llvm::sys::findProgramByName. Patch by Oleg Ranevskyy Reviewers: rnk Differential Revision: http://reviews.llvm.org/D12711 llvm-svn: 247358	2015-09-10 23:28:06 +00:00
Cong Hou	c536bd9e73	Pass BranchProbability/BlockMass by value instead of const& as they are small. NFC. llvm-svn: 247357	2015-09-10 23:10:42 +00:00
Chen Li	a29c612ddd	[InstCombineCalls] Use isKnownNonNullAt() to check nullness of passing arguments at callsite Summary: This patch replaces isKnownNonNull() with isKnownNonNullAt() when checking nullness of passing arguments at callsite. In this way it can handle cases where the argument does not have nonnull attribute but has a dominating null check from the CFG. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12779 llvm-svn: 247356	2015-09-10 23:04:49 +00:00
Chen Li	32a51416e5	[InstCombineCalls] Use isKnownNonNullAt() to check nullness of gc.relocate return value Summary: This patch replaces isKnownNonNull() with isKnownNonNullAt() when checking nullness of gc.relocate return value. In this way it can handle cases where the relocated value does not have nonnull attribute but has a dominating null check from the CFG. Reviewers: reames Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D12772 llvm-svn: 247353	2015-09-10 22:35:41 +00:00
Filipe Cabecinhas	48b090a31f	Remove gcc warning when comparing an unsigned var for >= 0 llvm-svn: 247352	2015-09-10 22:34:39 +00:00
Reid Kleckner	da6dcc5d92	[WinEH] Push and pop EBP for 32-bit funclets The Win32 EH runtime caller does not preserve EBP, even though it does preserve the CSRs (EBX, ESI, EDI) for us. The result was that each finally funclet call would leave the frame pointer off by 12 bytes. llvm-svn: 247348	2015-09-10 22:00:02 +00:00
Matt Arsenault	e0b44040aa	AMDGPU: Simplify debug printing llvm-svn: 247345	2015-09-10 21:51:19 +00:00
Matt Arsenault	57116cce19	AMDGPU: Use StringRef value llvm-svn: 247344	2015-09-10 21:51:15 +00:00
James Y Knight	1f3e6af7d0	[SPARC] Switch to the Machine Scheduler. The (mostly-deprecated) SelectionDAG-based ILPListDAGScheduler scheduler was making poor scheduling decisions, causing high register pressure and extraneous register spills. Switching to the newer machine scheduler generates better code -- even without there being a machine model defined for SPARC yet. (Actually committing the test changes too, this time, unlike r247315) llvm-svn: 247343	2015-09-10 21:49:06 +00:00
Reid Kleckner	7bb20bd69e	Fix SEH state numbering algorithm to handle cleanupendpads WinEHPrepare's new coloring algorithm really expects to see cleanupendpads now, so Clang will start emitting them soon. llvm-svn: 247341	2015-09-10 21:46:36 +00:00
Matthew Simpson	29dc0f7075	[LV] Relax Small Size Reduction Type Requirement This patch enables small size reductions in which the source types are smaller than the reduction type (e.g., computing an i16 sum from the values in an i8 array). The previous behavior was to only allow small size reductions if the source types and reduction type were the same. The change accounts for the fact that the existing sign- and zero-extend instructions in these cases should still be included in the cost model. Differential Revision: http://reviews.llvm.org/D12770 llvm-svn: 247337	2015-09-10 21:12:57 +00:00
Lang Hames	21a77ba1f7	[RuntimeDyld] Support non-zero addends for the MachO X86_64 SUBTRACTOR reloc. This functionality was accidentally left out of r247119. llvm-svn: 247336	2015-09-10 21:05:58 +00:00
Lang Hames	79fce4711b	[RuntimeDyld] Fix a bug in debugging output: all sections should be dumped before any relocations have been applied, and again after all relocations have been applied. Previously each section was dumped before and after relocations targeting it were applied, but this only shows the impact of relocations that point to other symbols in the same section. llvm-svn: 247335	2015-09-10 20:44:36 +00:00
Chandler Carruth	2e4ca848f4	Add an explicit 'inline' specifier to these static functions. GCC is warning on them having always_inline attribute for reasons I don't fully understand -- static functions are just as inlinable as inline functions in terms of linkage. llvm-svn: 247334	2015-09-10 20:34:57 +00:00
James Y Knight	221885c7cb	Revert "[SPARC] Switch to the Machine Scheduler." This reverts commit r247315. Accidentally omitted test changes; will resubmit full change shortly. llvm-svn: 247328	2015-09-10 19:42:03 +00:00
David Majnemer	880c2cb097	[IR] Conservatively mark 'catchpad' as accessing memory The exact semantics of 'catchpad' are really in the hands of the personality routine so we shouldn't assume that they have no side effects. llvm-svn: 247322	2015-09-10 18:50:09 +00:00
Kostya Serebryany	65f50868e5	[libFuzzer] refactor the code to allow building libFuzzer on platforms that don't have dfsan and don't support weak functions llvm-svn: 247321	2015-09-10 18:48:38 +00:00
James Y Knight	8a772cfd61	[SPARC] Switch to the Machine Scheduler. The (mostly-deprecated) SelectionDAG-based ILPListDAGScheduler scheduler was making poor scheduling decisions, causing high register pressure and extraneous register spills. Switching to the newer machine scheduler generates better code -- even without there being a machine model defined for SPARC yet. llvm-svn: 247315	2015-09-10 18:20:45 +00:00
Matthew Simpson	ddb4d9741f	[SCEV] Consistently Handle Expressions That Cannot Be Divided This patch addresses the issue of SCEV division asserting on some input expressions (e.g., non-affine expressions) and quietly giving up on others. When giving up, we set the quotient to be equal to zero and the remainder to be equal to the numerator. With this patch, we always quietly give up when we cannot perform the division. This patch also adds a test case for DependenceAnalysis that previously caused an assertion. Differential Revision: http://reviews.llvm.org/D11725 llvm-svn: 247314	2015-09-10 18:12:47 +00:00
JF Bastien	fa946233b4	[MergeFuncs] Fix callsite attributes in thunk generation This change correctly sets the attributes on the callsites generated in thunks. This makes sure things such as sret, sext, etc. are correctly set, so that the call can be a proper tailcall. Also, the transfer of attributes in the replaceDirectCallers function appears to be unnecessary, but until this is confirmed it will remain. Author: jrkoenig Reviewers: dschuff, jfb Subscribers: llvm-commits, nlewycky Differential revision: http://reviews.llvm.org/D12581 llvm-svn: 247313	2015-09-10 18:08:35 +00:00
Philip Reames	053701399d	[SimplifyCFG] Use known bits to eliminate dead switch defaults This is a follow up to http://reviews.llvm.org/D11995 implementing the suggestion by Hans. If we know some of the bits of the value being switched on, we know that the maximum number of unique cases covers the unknown bits. This allows to eliminate switch defaults for large integers (i32) when most bits in the value are known. Note that I had to make the transform contingent on not having any dead cases. This is conservatively correct with the old code, but required for the new code since we might have a dead case which varies one of the known bits. Counting that towards our number of covering cases would be bad. If we do have dead cases, we'll eliminate them first, then revisit the possibly dead default. Differential Revision: http://reviews.llvm.org/D12497 llvm-svn: 247309	2015-09-10 17:44:47 +00:00
Adrian Prantl	d209500fd5	Debug Info: Allow a DIModule to appear as the scope of other entities. llvm-svn: 247304	2015-09-10 17:13:58 +00:00
Kostya Serebryany	a938bcb89a	[libFuzzer] add two more variants of FuzzerDriver for convenience llvm-svn: 247300	2015-09-10 16:57:57 +00:00
Joseph Tremoulet	f3aff31401	[WinEH] Fix single-block cleanup coloring Summary: The coloring code in WinEHPrepare queues cleanuprets' successors with the correct color (the parent one) when it sees their cleanuppad, and so later when iterating successors knows to skip processing cleanuprets since they've already been queued. This latter check was incorrectly under an 'else' condition and so inadvertently was not kicking in for single-block cleanups. This change sinks the check out of the 'else' to fix the bug. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12751 llvm-svn: 247299	2015-09-10 16:51:25 +00:00
Hans Wennborg	aa15bffa1f	Re-commit r247216: "Fix Clang-tidy misc-use-override warnings, other minor fixes" Except the changes that defined virtual destructors as =default, because that ran into problems with GCC 4.7 and overriding methods that weren't noexcept. llvm-svn: 247298	2015-09-10 16:49:58 +00:00
Steven Wu	e3b1f2b765	Fix an undefined behavior introduced in r247234 llvm-svn: 247296	2015-09-10 16:32:28 +00:00
Sanjay Patel	9361d35525	80-cols; NFC llvm-svn: 247295	2015-09-10 16:31:19 +00:00
Sanjay Patel	f4b34b76d4	use range-based for loop; NFCI llvm-svn: 247294	2015-09-10 16:25:38 +00:00
Sanjay Patel	5e7bd91891	use range-based for loop; NFCI llvm-svn: 247293	2015-09-10 16:15:21 +00:00
Sanjay Patel	59661459f1	fix typo; NFC llvm-svn: 247287	2015-09-10 15:14:34 +00:00
Alex Lorenz	0153e59935	Fix PR 24724 - The implicit register verifier shouldn't assume certain operand order. The implicit register verifier in the MIR parser should only check if the instruction's default implicit operands are present in the instruction. It should not check the order in which they occur. llvm-svn: 247283	2015-09-10 14:04:34 +00:00
Igor Breger	7f69a99c54	AVX512: Implemented encoding and intrinsics for vextracti64x4 ,vextracti64x2, vextracti32x8, vextracti32x4, vextractf64x4, vextractf64x2, vextractf32x8, vextractf32x4 Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D11802 llvm-svn: 247276	2015-09-10 12:54:54 +00:00
Jakub Kuderski	58ea4eeb9e	There is a trunc(lshr (zext A), Cst) optimization in InstCombineCasts that removes cast by performing the lshr on smaller types. However, currently there is no trunc(lshr (sext A), Cst) variant. This patch adds such an optimization by transforming trunc(lshr (sext A), Cst) to ashr A, Cst. Differential Revision: http://reviews.llvm.org/D12520 llvm-svn: 247271	2015-09-10 11:31:20 +00:00
Chandler Carruth	233edd20a7	[ADT] Rewrite the StringRef::find implementation to be simpler, clearer, and tremendously less reliant on the optimizer to fix things. The code is always necessarily looking for the entire length of the string when doing the equality tests in this find implementation, but it previously was needlessly re-checking the size each time among other annoyances. By writing this so simply and directly in terms of memcmp, it also is about 8x faster in a debug build, which in turn makes FileCheck about 2x faster in 'ninja check-llvm'. This saves about 8% of the time for FileCheck-heavy parts of the test suite like the x86 backend tests. llvm-svn: 247269	2015-09-10 11:17:49 +00:00
Silviu Baranga	df9ce8408a	[DAGCombine] Truncate BUILD_VECTOR operators if necessary when constant folding vectors Summary: The BUILD_VECTOR node will truncate its operators to match the type. We need to take this into account when constant folding - we need to perform a truncation before constant folding the elements. This is because the upper bits can change the result, depending on the operation type (for example this is the case for min/max). This change also adds a regression test. Reviewers: jmolloy Subscribers: jmolloy, llvm-commits Differential Revision: http://reviews.llvm.org/D12697 llvm-svn: 247265	2015-09-10 10:34:34 +00:00
James Molloy	d47634d781	Enable GlobalsAA by default This can give significant improvements to alias analysis in some situations, and improves its testing coverage in all situations. llvm-svn: 247264	2015-09-10 10:22:20 +00:00
James Molloy	efbba72cb2	Add GlobalsAA as preserved to a bunch of transforms GlobalsAA must by definition be preserved in function passes, but the passmanager doesn't know that. Make each pass explicitly preserve GlobalsAA. llvm-svn: 247263	2015-09-10 10:22:12 +00:00
James Molloy	8c995a93ce	[ARM] Do not use vtrn for vectorshuffle if the order is reversed The tests in isVTRNMask and isVTRN_v_undef_Mask should also check that the elements of the upper and lower half of the vectorshuffle occur in the correct order when both halves are used. Without this test the code assumes that it is correct to use vector transpose (vtrn) for the masks <1, 1, 0, 0> and <1, 3, 0, 2>, among others, but the transpose actually incorrectly generates shuffles for <0, 0, 1, 1> and <0, 2, 1, 3> in this case. Patch by Jeroen Ketema! llvm-svn: 247254	2015-09-10 08:42:28 +00:00
Chandler Carruth	f054eca167	[ADT] Micro-optimize the Triple constructor by doing a single split and re-using the resulting components rather than repeatedly splitting and re-splitting to compute each component as part of the initializer list. This is more work on PR23676. Sadly, it doesn't help much. It removes the constructor from my profile, but doesn't make a sufficient dent in the total time. But it should play together nicely with subsequent changes. llvm-svn: 247250	2015-09-10 07:51:43 +00:00
Chandler Carruth	4425c91dea	[ADT] Fix a confusing interface spec and some annoying peculiarities with the StringRef::split method when used with a MaxSplit argument other than '-1' (which nobody really does today, but which should actually work). The spec claimed both to split up to MaxSplit times, but also to append <= MaxSplit strings to the vector. One of these doesn't make sense. Given the name "MaxSplit", let's go with it being a max over how many splits occur, which means the max on how many strings get appended is MaxSplit+1. I'm not actually sure the implementation correctly provided this logic either, as it used a really opaque loop structure. The implementation was also playing weird games with nullptr in the data field to try to rely on a totally opaque hidden property of the split method that returns a pair. Nasty IMO. Replace all of this with what is (IMO) simpler code that doesn't use the pair returning split method, and instead just finds each separator and appends directly. I think this is a lot easier to read, and it most definitely matches the spec. Added some tests that exercise the corner cases around StringRef() and StringRef("") that all now pass. I'll start using this in code in the next commit. llvm-svn: 247249	2015-09-10 07:51:37 +00:00
NAKAMURA Takumi	1a296ec6d1	GlobalsAAResult(&&): Move every members. Or, one of MSVC builders failed with unexpected behavior. llvm-svn: 247247	2015-09-10 07:16:42 +00:00
Chandler Carruth	e4405e949f	[ADT] Switch a bunch of places in LLVM that were doing single-character splits to actually use the single character split routine which does less work, and in a debug build is substantially faster. llvm-svn: 247245	2015-09-10 06:12:31 +00:00
Chandler Carruth	477121721b	[ADT] Add a single-character version of the small vector split routine on StringRef. Finding and splitting on a single character is substantially faster than doing it on even a single character StringRef -- we immediately get to a very tuned memchr call this way. Even nicer, we get to this even in a debug build, shaving 18% off the runtime of TripleTest.Normalization, helping PR23676 some more. llvm-svn: 247244	2015-09-10 06:07:03 +00:00
Sanjoy Das	f3132d3b03	[ScalarEvolution] Fix PR24757. Summary: PR24757 was caused by some incorrect math in `ScalarEvolution::HowFarToZero` -- the smallest unsigned solution for X in 2^N * A = 2^N * X is not necessarily A. Reviewers: atrick, majnemer, meheff Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D12721 llvm-svn: 247242	2015-09-10 05:27:38 +00:00
Chandler Carruth	87275186d1	[LPM] Simplify this code and fix a compile error for compilers that don't correctly implement the scoping rules of C++11 range based for loops. This kind of aliasing isn't a good idea anyways (and wasn't really intended). llvm-svn: 247241	2015-09-10 04:22:36 +00:00
Chandler Carruth	b1e3a9ae8d	[LPM] Use a map from analysis ID to immutable passes in the legacy pass manager to avoid a slow linear scan of every immutable pass and on every attempt to find an analysis pass. This speeds up 'check-llvm' on an unoptimized build for me by 15%, YMMV. It should also help (a tiny bit) other folks that are really bottlenecked on repeated runs of tiny pass pipelines across small IR files. llvm-svn: 247240	2015-09-10 02:31:42 +00:00
Kit Barton	d3b904d440	Enable the shrink wrapping optimization for PPC64. The changes in this patch are as follows: 1. Modify the emitPrologue and emitEpilogue methods to work properly when the prologue and epilogue blocks are not the first/last blocks in the function 2. Fix a bug in PPCEarlyReturn optimization caused by an empty entry block in the function 3. Override the runShrinkWrap PredicateFtor (defined in TargetMachine) to check whether shrink wrapping should run: Shrink wrapping will run on PPC64 (Little Endian and Big Endian) unless -enable-shrink-wrap=false is specified on command line A new test case, ppc-shrink-wrapping.ll was created based on the existing shrink wrapping tests for x86, arm, and arm64. Phabricator review: http://reviews.llvm.org/D11817 llvm-svn: 247237	2015-09-10 01:55:44 +00:00
Ahmed Bougacha	05541459fa	[AArch64] Match FI+offset in STNP addressing mode. First, we need to teach isFrameOffsetLegal about STNP. It already knew about the STP/LDP variants, but those were probably never exercised, because it's only the load/store optimizer that generates STP/LDP, and the only user of the method is frame lowering, which runs earlier. The STP/LDP cases were wrong: they didn't take into account the fact that they return two results, not one, so the immediate offset will be the 4th operand, not the 3rd. Follow-up to r247234. llvm-svn: 247236	2015-09-10 01:54:43 +00:00
Ahmed Bougacha	c0ac38d584	[AArch64] Match base+offset in STNP addressing mode. Followup to r247231. llvm-svn: 247234	2015-09-10 01:48:29 +00:00
Ahmed Bougacha	b8886b517d	[AArch64] Support selecting STNP. We could go through the load/store optimizer and match STNP where we would have matched a nontemporal-annotated STP, but that's not reliable enough, as an opportunistic optimization. Instead, we can guarantee emitting STNP, by matching them at ISel. Since there are no single-input nontemporal stores, we have to resort to some high-bits-extracting trickery to generate an STNP from a plain store. Also, we need to support another, LDP/STP-specific addressing mode, base + signed scaled 7-bit immediate offset. For now, only match the base. Let's make it smart separately. Part of PR24086. llvm-svn: 247231	2015-09-10 01:42:28 +00:00
Matt Arsenault	80f766a032	AMDGPU/SI: Fix more cases of losing exec operands llvm-svn: 247230	2015-09-10 01:23:28 +00:00
Matt Arsenault	ad46e0c1ab	AMDGPU/SI: Fix creating v_mov_b32s without exec uses This will be caught by existing tests with a verifier check to be added in a future commit. llvm-svn: 247229	2015-09-10 01:06:06 +00:00
Hans Wennborg	d2799a963f	Revert r247216: "Fix Clang-tidy misc-use-override warnings, other minor fixes" This caused build breakages, e.g. http://lab.llvm.org:8011/builders/clang-x86_64-ubuntu-gdb-75/builds/24926 llvm-svn: 247226	2015-09-10 00:57:26 +00:00
Ahmed Bougacha	37bffd83f0	[CodeGen] Make x86 nontemporal store patfrags generic. NFC. To be used by other targets. llvm-svn: 247225	2015-09-10 00:53:15 +00:00
Philip Reames	953817b65d	[RewriteStatepointsForGC] Minor refactor to use shared implementation [NFC] llvm-svn: 247223	2015-09-10 00:44:10 +00:00
Philip Reames	b4e55f3923	[RewriteStatepointsForGC] Strengthen a confusingly weak assertion [NFC] The assertion was weaker than it should be and gave the impression we're growing the number of base defining values being considered during the fixed point iteration. That's not true. The tighter form of the assert is useful documentation. llvm-svn: 247221	2015-09-10 00:32:56 +00:00
Philip Reames	c8ded462c4	[RewriteStatepointsForGC] One last bit of naming [NFCI] llvm-svn: 247220	2015-09-10 00:27:50 +00:00
Reid Kleckner	7878391208	[WinEH] Add codegen support for cleanuppad and cleanupret All of the complexity is in cleanupret, and it mostly follows the same codepaths as catchret, except it doesn't take a return value in RAX. This small example now compiles and executes successfully on win32: extern "C" int printf(const char *, ...) noexcept; struct Dtor { ~Dtor() { printf("~Dtor\n"); } }; void has_cleanup() { Dtor o; throw 42; } int main() { try { has_cleanup(); } catch (int) { printf("caught it\n"); } } Don't try to put the cleanup in the same function as the catch, or Bad Things will happen. llvm-svn: 247219	2015-09-10 00:25:23 +00:00
Philip Reames	34d7a7493d	[RewriteStatepointsForGC] Further style/naming fixup [NFCI] llvm-svn: 247217	2015-09-10 00:22:49 +00:00
Hans Wennborg	6fa09455ed	Fix Clang-tidy misc-use-override warnings, other minor fixes Patch by Eugene Zelenko! Differential Revision: http://reviews.llvm.org/D12740 llvm-svn: 247216	2015-09-10 00:12:56 +00:00
Philip Reames	7540e3a45d	[RewriteStatepointsForGC] More naming cleanup [NFCI] llvm-svn: 247213	2015-09-10 00:01:53 +00:00
Philip Reames	ece70b8042	[RewriteStatepointsForGC] Code cleanup [NFC] Factor out common code related to naming values, fix a small style issue. More to follow in separate changes. llvm-svn: 247211	2015-09-09 23:57:18 +00:00
Philip Reames	6628713f4f	[RewriteStatepointsForGC] Extend base pointer inference to handle insertelement This change is simply enhancing the existing inference algorithm to handle insertelement instructions by conservatively inserting a new instruction to propagate the vector of associated base pointers. In the process, I'm ripping out the peephole optimizations which mostly helped cover the fact this hadn't been done. Note that most of the newly inserted nodes will be nearly immediately removed by the post insertion optimization pass introduced in 246718. Arguably, we should be trying harder to avoid the malloc traffic here, but I'd rather get the code correct, then worry about compile time. Unlike previous extensions of the algorithm to handle more cases, I discovered the existing code was causing miscompiles in some cases. In particular, we had an implicit assumption that the peephole covered all insert element instructions, so if we had a value directly based on an insert element the peephole didn't cover, we proceeded as if it were a base anyways. Not good. I believe we had the same issue with shufflevector which is why I adjusted the predicate for them as well. Differential Revision: http://reviews.llvm.org/D12583 llvm-svn: 247210	2015-09-09 23:40:12 +00:00
Philip Reames	15d5563cea	[RewriteStatepointsForGC] Make base pointer inference deterministic Previously, the base pointer algorithm wasn't deterministic. The core fixed point was (of course), but we were inserting new nodes and optimizing them in an order which was unspecified and variable. We'd somewhat hacked around this for testing by sorting by value name, but that doesn't solve the general determinism problem. Instead, we can use the order of traversal over the def/use graph to give us a single consistent ordering. Today, this is a DFS order, but the exact order doesn't matter provided it's deterministic for a given input. (Q: It is safe to rely on a deterministic order of operands right?) Note that this only fixes the determinism within a single inference step. The inference step is currently invoked many times in a non-deterministic order. That's a future change in the sequence. :) Differential Revision: http://reviews.llvm.org/D12640 llvm-svn: 247208	2015-09-09 23:26:08 +00:00
Peter Collingbourne	1cbc91eccf	LowerBitSets: Fix non-determinism bug. Visit disjoint sets in a deterministic order based on the maximum BitSetNM index, otherwise the order in which we visit them will depend on pointer comparisons. This was being exposed by MSan. llvm-svn: 247201	2015-09-09 22:30:32 +00:00
Reid Kleckner	94b704c469	[SEH] Emit 32-bit SEH tables for the new EH IR The 32-bit tables don't actually contain PC range data, so emitting them is incredibly simple. The 64-bit tables, on the other hand, use the same table for state numbering as well as label ranges. This makes things more difficult, so it will be implemented later. llvm-svn: 247192	2015-09-09 21:10:03 +00:00
Piotr Padlewski	0dde00d239	ScalarEvolution assume hanging bugfix http://reviews.llvm.org/D12719 llvm-svn: 247184	2015-09-09 20:47:30 +00:00
David Majnemer	d34dbf07bd	Revert trunc(lshr (sext A), Cst) to ashr A, Cst This reverts commit r246997, it introduced a regression (PR24763). llvm-svn: 247180	2015-09-09 20:20:08 +00:00
Renato Golin	db7ea86bf4	Revert "AVX512: Implemented encoding and intrinsics for vextracti64x4 ,vextracti64x2, vextracti32x8, vextracti32x4, vextractf64x4, vextractf64x2, vextractf32x8, vextractf32x4 Added tests for intrinsics and encoding." This reverts commit r247149, as it was breaking numerous buildbots of varied architectures. llvm-svn: 247177	2015-09-09 19:44:40 +00:00
Matthias Braun	d9da162789	Save LaneMask with livein registers With subregister liveness enabled we can detect the case where only parts of a register are live in, this is expressed as a 32bit lanemask. The current code only keeps registers in the live-in list and therefore enumerated all subregisters affected by the lanemask. This turned out to be too conservative as the subregister may also cover additional parts of the lanemask which are not live. Expressing a given lanemask by enumerating a minimum set of subregisters is computationally expensive so the best solution is to simply change the live-in list to store the lanemasks as well. This will reduce memory usage for targets using subregister liveness and slightly increase it for other targets Differential Revision: http://reviews.llvm.org/D12442 llvm-svn: 247171	2015-09-09 18:08:03 +00:00
Matthias Braun	cc58005885	VirtRegMap: Improve addMBBLiveIns() using SlotIndex::MBBIndexIterator; NFC Now that we have an explicit iterator over the idx2MBBMap in SlotIndices we can use the fact that segments and the idx2MBBMap is sorted by SlotIndex position so can advance both simultaneously instead of starting from the beginning for each segment. This complicates the code for the subregister case somewhat but should be more efficient and has the advantage that we get the final lanemask for each block immediately which will be important for a subsequent change. Removes the now unused SlotIndexes::findMBBLiveIns function. Differential Revision: http://reviews.llvm.org/D12443 llvm-svn: 247170	2015-09-09 18:07:54 +00:00
Chandler Carruth	7b560d40bd	[PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible with the new pass manager, and no longer relying on analysis groups. This builds essentially a ground-up new AA infrastructure stack for LLVM. The core ideas are the same that are used throughout the new pass manager: type erased polymorphism and direct composition. The design is as follows: - FunctionAAResults is a type-erasing alias analysis results aggregation interface to walk a single query across a range of results from different alias analyses. Currently this is function-specific as we always assume that aliasing queries are within a function. - AAResultBase is a CRTP utility providing stub implementations of various parts of the alias analysis result concept, notably in several cases in terms of other more general parts of the interface. This can be used to implement only a narrow part of the interface rather than the entire interface. This isn't really ideal, this logic should be hoisted into FunctionAAResults as currently it will cause a significant amount of redundant work, but it faithfully models the behavior of the prior infrastructure. - All the alias analysis passes are ported to be wrapper passes for the legacy PM and new-style analysis passes for the new PM with a shared result object. In some cases (most notably CFL), this is an extremely naive approach that we should revisit when we can specialize for the new pass manager. - BasicAA has been restructured to reflect that it is much more fundamentally a function analysis because it uses dominator trees and loop info that need to be constructed for each function. All of the references to getting alias analysis results have been updated to use the new aggregation interface. All the preservation and other pass management code has been updated accordingly. The way the FunctionAAResultsWrapperPass works is to detect the available alias analyses when run, and add them to the results object. 
This means that we should be able to continue to respect when various passes are added to the pipeline, for example adding CFL or adding TBAA passes should just cause their results to be available and to get folded into this. The exception to this rule is BasicAA which really needs to be a function pass due to using dominator trees and loop info. As a consequence, the FunctionAAResultsWrapperPass directly depends on BasicAA and always includes it in the aggregation. This has significant implications for preserving analyses. Generally, most passes shouldn't bother preserving FunctionAAResultsWrapperPass because rebuilding the results just updates the set of known AA passes. The exception to this rule are LoopPass instances which need to preserve all the function analyses that the loop pass manager will end up needing. This means preserving both BasicAAWrapperPass and the aggregating FunctionAAResultsWrapperPass. Now, when preserving an alias analysis, you do so by directly preserving that analysis. This is only necessary for non-immutable-pass-provided alias analyses though, and there are only three of interest: BasicAA, GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is preserved when needed because it (like DominatorTree and LoopInfo) is marked as a CFG-only pass. I've expanded GlobalsAA into the preserved set everywhere we previously were preserving all of AliasAnalysis, and I've added SCEVAA in the intersection of that with where we preserve SCEV itself. One significant challenge to all of this is that the CGSCC passes were actually using the alias analysis implementations by taking advantage of a pretty amazing set of loop holes in the old pass manager's analysis management code which allowed analysis groups to slide through in many cases. Moving away from analysis groups makes this problem much more obvious. 
To fix it, I've leveraged the flexibility the design of the new PM components provides to just directly construct the relevant alias analyses for the relevant functions in the IPO passes that need them. This is a bit hacky, but should go away with the new pass manager, and is already in many ways cleaner than the prior state. Another significant challenge is that various facilities of the old alias analysis infrastructure just don't fit any more. The most significant of these is the alias analysis 'counter' pass. That pass relied on the ability to snoop on AA queries at different points in the analysis group chain. Instead, I'm planning to build printing functionality directly into the aggregation layer. I've not included that in this patch merely to keep it smaller. Note that all of this needs a nearly complete rewrite of the AA documentation. I'm planning to do that, but I'd like to make sure the new design settles, and to flesh out a bit more of what it looks like in the new pass manager first. Differential Revision: http://reviews.llvm.org/D12080 llvm-svn: 247167	2015-09-09 17:55:00 +00:00
Matthias Braun	80595460d8	MachineVerifier: Check that SlotIndex MBBIndexList is sorted. This introduces a check that the MBBIndexList is sorted as proposed in http://reviews.llvm.org/D12443 but split up into a separate commit. llvm-svn: 247166	2015-09-09 17:49:46 +00:00
Matt Arsenault	ef67d76869	AMDGPU: Extract full 64-bit subregister and use subregs Instead of extracting both 32-bit components from the 128-bit register. This produces fewer copies and is easier for the copy peephole optimizer to understand and see the actual uses as extracts from a reg_sequence. This avoids needing to handle subregister composing in the PeepholeOptimizer's ValueTracker for this case. llvm-svn: 247162	2015-09-09 17:03:29 +00:00
Matt Arsenault	b5541fb098	AMDGPU: Remove unused multiclass argument llvm-svn: 247161	2015-09-09 17:03:18 +00:00
Dan Gohman	f71abef701	[WebAssembly] Implement calls with void return types. llvm-svn: 247158	2015-09-09 16:13:47 +00:00
Tom Stellard	9a197676b1	AMDGPU/SI: Fold operands through REG_SEQUENCE instructions Summary: This helps mostly when we use add instructions for address calculations that contain immediates. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D12256 llvm-svn: 247157	2015-09-09 15:43:26 +00:00
Silviu Baranga	a3e27edb5d	[CostModel][AArch64] Remove amortization factor for some of the vector select instructions Summary: We are not scalarizing the wide selects in codegen for i16 and i32 and therefore we can remove the amortization factor. We still have issues with i64 vectors in codegen though. Reviewers: mcrosier Subscribers: mcrosier, aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D12724 llvm-svn: 247156	2015-09-09 15:35:02 +00:00
Sanjay Patel	6eccf487c9	don't repeat function names in comments; NFC llvm-svn: 247154	2015-09-09 15:24:36 +00:00
Dan Gohman	1ce7ba5fe0	[WebAssembly] Tidy up some unneeded newline characters. llvm-svn: 247152	2015-09-09 15:13:36 +00:00
Sanjay Patel	e283441836	function names start with a lower case letter; NFC llvm-svn: 247150	2015-09-09 14:54:29 +00:00
Igor Breger	ac29a82921	AVX512: Implemented encoding and intrinsics for vextracti64x4 ,vextracti64x2, vextracti32x8, vextracti32x4, vextractf64x4, vextractf64x2, vextractf32x8, vextractf32x4 Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D11802 llvm-svn: 247149	2015-09-09 14:35:09 +00:00
Sanjay Patel	2fbab9d893	don't repeat function names in comments; NFC llvm-svn: 247148	2015-09-09 14:34:26 +00:00
Zoran Jovanovic	6b28f09d67	[mips][microMIPS] Implement ADDU16, AND16, ANDI16, NOT16, OR16, SLL16 and SRL16 instructions Differential Revision: http://reviews.llvm.org/D11178 llvm-svn: 247146	2015-09-09 13:55:45 +00:00
Alex Lorenz	b9a68dbcae	Fix PR 24633 - Handle undef values when parsing standalone constants. llvm-svn: 247145	2015-09-09 13:44:33 +00:00
James Molloy	520838977b	Rename ExitCount to BackedgeTakenCount, because that's what it is. We called a variable ExitCount, stored the backedge count in it, then redefined it to be the exit count again. llvm-svn: 247140	2015-09-09 12:51:10 +00:00
James Molloy	89eccee4db	Delay predication of stores until near the end of vector code generation Predicating stores requires creating extra blocks. It's much cleaner if we do this in one pass instead of mutating the CFG while writing vector instructions. Besides which we can make use of helper functions to update domtree for us, reducing the work we need to do. llvm-svn: 247139	2015-09-09 12:51:06 +00:00
Daniel Sanders	2038747fce	Fix vector splitting for extract_vector_elt and vector elements of <8-bits. Summary: One of the vector splitting paths for extract_vector_elt tries to lower: define i1 @via_stack_bug(i8 signext %idx) { %1 = extractelement <2 x i1> <i1 false, i1 true>, i8 %idx ret i1 %1 } to: define i1 @via_stack_bug(i8 signext %idx) { %base = alloca <2 x i1> store <2 x i1> <i1 false, i1 true>, <2 x i1>* %base %2 = getelementptr <2 x i1>, <2 x i1>* %base, i32 %idx %3 = load i1, i1* %2 ret i1 %3 } However, the elements of <2 x i1> are not byte-addressible. The result of this is that the getelementptr expands to '%base + %idx * (1 / 8)' which simplifies to '%base + %idx * 0', and then simply '%base' causing all values of %idx to extract element zero. This commit fixes this by promoting the vector elements of <8-bits to i8 before splitting the vector. This fixes a number of test failures in pocl. Reviewers: pekka.jaaskelainen Subscribers: pekka.jaaskelainen, llvm-commits Differential Revision: http://reviews.llvm.org/D12591 llvm-svn: 247128	2015-09-09 09:53:20 +00:00
Chandler Carruth	1688a772fc	Fix a typo I spotted when hacking on SROA. Somewhat alarming that nothing broke. llvm-svn: 247127	2015-09-09 09:46:16 +00:00
Zoran Jovanovic	d9790793d6	[mips][microMIPS] Implement CACHEE and PREFE instructions Differential Revision: http://reviews.llvm.org/D11628 llvm-svn: 247125	2015-09-09 09:10:46 +00:00
Matt Arsenault	d768737454	AMDGPU: Fix not encoding src2 of VOP3b instructions Broken by r247074. Should include an assembler test, but the assembler is currently broken for VOP3b apparently. llvm-svn: 247123	2015-09-09 08:39:49 +00:00
Sanjoy Das	da0d79e0a0	[IRCE] Add INITIALIZE_PASS_DEPENDENCY invocations. IRCE was just using INITIALIZE_PASS(), which is incorrect. llvm-svn: 247122	2015-09-09 03:47:18 +00:00
Lang Hames	856e4767ff	[RuntimeDyld] Add support for MachO x86_64 SUBTRACTOR relocation. llvm-svn: 247119	2015-09-09 03:14:29 +00:00
Dan Gohman	e590b33bf8	[WebAssembly] Fix lowering of calls with more than one argument. llvm-svn: 247118	2015-09-09 01:52:45 +00:00
Matt Arsenault	acd68b58ae	SelectionDAG: Support Expand of f16 extloads Currently this hits an assert that extload should always be supported, which assumes integer extloads. This moves a hack out of SI's argument lowering and is covered by existing tests. llvm-svn: 247113	2015-09-09 01:12:27 +00:00
Dan Gohman	4f52e00ecb	[WebAssembly] Implement WebAssemblyInstrInfo::copyPhysReg llvm-svn: 247110	2015-09-09 00:52:47 +00:00
Matt Arsenault	3099156261	Fix typos / grammar llvm-svn: 247109	2015-09-09 00:38:33 +00:00
Reid Kleckner	51189f0a1d	[WinEH] Avoid creating MBBs for LLVM BBs that cannot contain code Typically these are catchpads, which hold data used to decide whether to catch the exception or continue unwinding. We also shouldn't create MBBs for catchendpads, cleanupendpads, or terminatepads, since no real code can live in them. This fixes a problem where MI passes (like the register allocator) would try to put code into catchpad blocks, which are not executed by the runtime. In the new world, blocks ending in invokes now have many possible successors. llvm-svn: 247102	2015-09-08 23:28:38 +00:00
Peter Collingbourne	8d24ae9441	Re-apply r247080 with order of evaluation fix. llvm-svn: 247095	2015-09-08 22:49:35 +00:00
Reid Kleckner	df1295173f	[WinEH] Emit prologues and epilogues for funclets Summary: 32-bit funclets have short prologues that allocate enough stack for the largest call in the whole function. The runtime saves CSRs for the funclet. It doesn't restore CSRs after we finally transfer control back to the parent function via a CATCHRET, but that's a separate issue. 32-bit funclets also have to adjust the incoming EBP value, which is what llvm.x86.seh.recoverframe does in the old model. 64-bit funclets need to spill CSRs as normal. For simplicity, this just spills the same set of CSRs as the parent function, rather than trying to compute different CSR sets for the parent function and each funclet. 64-bit funclets also allocate enough stack space for the largest outgoing call frame, like 32-bit. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12546 llvm-svn: 247092	2015-09-08 22:44:41 +00:00
Peter Collingbourne	07f3af2e82	Revert r247080, "LowerBitSets: Extend pass to support functions as bitset members." as it causes test failures on a number of bots. llvm-svn: 247088	2015-09-08 22:33:23 +00:00
Eric Christopher	71f6e2f568	Fix the PPC CTR Loop pass to look for calls to the intrinsics that read CTR and count them as reading the CTR. llvm-svn: 247083	2015-09-08 22:14:58 +00:00
Peter Collingbourne	c634ed0b1a	LowerBitSets: Extend pass to support functions as bitset members. This change extends the bitset lowering pass to support bitsets that may contain either functions or global variables. A function bitset is lowered to a jump table that is laid out before one of the functions in the bitset. Also add support for non-string bitset identifier names. This allows for distinct metadata nodes to stand in for names with internal linkage, as done in D11857. Differential Revision: http://reviews.llvm.org/D11856 llvm-svn: 247080	2015-09-08 21:57:45 +00:00
Ivan Krasin	a610cb5ba0	[libFuzzer]Add a test for defeating a hash sum. Summary: Add a test for a data followed by 4-byte hash value. I use a slightly modified Jenkins hash function, as described in https://en.wikipedia.org/wiki/Jenkins_hash_function The modification is to ensure that hash(zeros) != 0. Reviewers: kcc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12648 llvm-svn: 247076	2015-09-08 21:22:52 +00:00
Matt Arsenault	86d336e91b	AMDGPU/SI: Fix input vcc operand for VOP2b instructions Adds vcc to output string input for e32. Allows option of using e64 encoding with assembler. Also fixes these instructions not implicitly reading exec. llvm-svn: 247074	2015-09-08 21:15:00 +00:00
Artem Belevich	0127d80986	[NVPTX] Added run NVVMReflect pass to NVPTX back-end. The pass is needed to remove __nvvm_reflect calls when we link in libdevice bitcode that comes with CUDA. Differential Revision: http://reviews.llvm.org/D11663 llvm-svn: 247072	2015-09-08 21:04:55 +00:00
Derek Schuff	eef533f422	x32. Fixes a bug in how struct va_list is initialized in x32 Summary: This patch modifies X86TargetLowering::LowerVASTART so that struct va_list is initialized with 32 bit pointers in x32. It also includes tests that call @llvm.va_start() for x32. Patch by João Porto Subscribers: llvm-commits, hjl.tools Differential Revision: http://reviews.llvm.org/D12346 llvm-svn: 247069	2015-09-08 20:51:31 +00:00
Kostya Serebryany	4b82de2e47	[libFuzzer] remove a piece of stale code llvm-svn: 247067	2015-09-08 20:40:10 +00:00
Kostya Serebryany	9cdea94f66	[libFuzzer] be more robust when dealing with files on disk (e.g. don't crash if a file was there but disappeared) llvm-svn: 247066	2015-09-08 20:36:33 +00:00
Dan Gohman	e32c57443f	[WebAssembly] Support running without a register allocator in the default CodeGen passes This allows backends which don't use a traditional register allocator, but do need PHI lowering and other passes, to use the default TargetPassConfig::addFastRegAlloc and TargetPassConfig::addOptimizedRegAlloc implementations. Differential Revision: http://reviews.llvm.org/D12691 llvm-svn: 247065	2015-09-08 20:36:33 +00:00
Sanjay Patel	b54e62fe17	refactor matches for De Morgan's Laws; NFCI llvm-svn: 247061	2015-09-08 20:14:13 +00:00
Matt Arsenault	8ac35cd031	AMDGPU: Mark s_barrier as a high latency instruction These were marked as WriteSALU, which is low latency. I'm guessing at the value to use, but it should probably be considered the highest latency instruction. I'm not sure this has any actual effect since hasSideEffects probably is preventing any moving of these. llvm-svn: 247060	2015-09-08 19:54:32 +00:00
Matt Arsenault	8fb810a1d2	AMDGPU: Fix s_barrier flags This should be convergent. This is not a barrier in the isBarrier sense, nor hasCtrlDep. llvm-svn: 247059	2015-09-08 19:54:25 +00:00
Derek Schuff	ee4e947e23	x32. Fixes a bug in i8mem_NOREX declaration. The old implementation assumed LP64 which is broken for x32. Specifically, the MOVE8rm_NOREX and MOVE8mr_NOREX, when selected, would cause a 'Cannot emit physreg copy instruction' error message to be reported. This patch also enable the h-register*ll tests for x32. Differential Revision: http://reviews.llvm.org/D12336 Patch by João Porto llvm-svn: 247058	2015-09-08 19:47:15 +00:00
Matt Arsenault	966a94f861	AMDGPU: Handle sub of constant for DS offset folding sub C, x - > add (sub 0, x), C for DS offsets. This is mostly to fix regressions that show up when SeparateConstOffsetFromGEP is enabled. llvm-svn: 247054	2015-09-08 19:34:22 +00:00
Diego Novillo	f9aa39b0cf	Fix PR 24723 - Handle 0-mass backedges in irreducible loops This corner case happens when we have an irreducible SCC that is deeply nested. As we work down the tree, the backedge masses start getting smaller and smaller until we reach one that is down to 0. Since we distribute the incoming mass using the backedge masses as weight, the distributor does not allow zero weights. So, we simply ignore them (which will just use the weights of the non-zero nodes). llvm-svn: 247050	2015-09-08 19:22:17 +00:00
Davide Italiano	cb2da7166a	[MC/ELF] Accept zero for .align directive .align directive refuses alignment 0 -- a comment in the code hints this is done for GNU as compatibility, but it seems GNU as accepts .align 0 (and silently rounds up alignment to 1). Differential Revision: http://reviews.llvm.org/D12682 llvm-svn: 247048	2015-09-08 18:59:47 +00:00
David Blaikie	12dd3c4ebb	Fix CPP Backend for GEP API changes for opaque pointer types Based on a patch by Jerome Witmann. llvm-svn: 247047	2015-09-08 18:42:29 +00:00
Sanjay Patel	1854927556	remove function names from comments; NFC llvm-svn: 247043	2015-09-08 18:24:36 +00:00
Andrew Kaylor	e2ea93c6c0	Fix for bz24500: Avoid non-deterministic code generation triggered by the x86 call frame optimization Patch by Dave Kreitzer Differential Revision: http://reviews.llvm.org/D12620 llvm-svn: 247042	2015-09-08 18:18:46 +00:00
Kostya Serebryany	b06fae5ede	[libFuzzer] better documentation for -save_minimized_corpus=1 llvm-svn: 247033	2015-09-08 17:43:51 +00:00
Kostya Serebryany	468ed78434	[libFuzzer] remove -iterations as redundant (there is also -num_runs) llvm-svn: 247030	2015-09-08 17:30:35 +00:00
JF Bastien	749ed88aa5	WebAssembly: NFC rename shr/sar Renamed from: https://github.com/WebAssembly/design/pull/332 llvm-svn: 247028	2015-09-08 17:21:21 +00:00
Kostya Serebryany	25425ad920	[libFuzzer] add one more mutator: Mutate_ChangeASCIIInteger llvm-svn: 247027	2015-09-08 17:19:31 +00:00
Jun Bum Lim	4d3c5986f2	Remove white space (test commit) llvm-svn: 247021	2015-09-08 16:11:22 +00:00
Zoran Jovanovic	2da1437d62	[mips][microMIPS] Implement LLE, LUI, LW and LWE instructions Differential Revision: http://reviews.llvm.org/D1179 llvm-svn: 247017	2015-09-08 15:02:50 +00:00
Igor Breger	a54a1a84dd	AVX512: kunpck encoding implementation Added tests for encoding. Differential Revision: http://reviews.llvm.org/D12061 llvm-svn: 247010	2015-09-08 13:10:00 +00:00
Dan Gohman	25d2a0dda4	[WebAssembly] Enable SSA lowering and other pre-regalloc passes llvm-svn: 247008	2015-09-08 12:39:25 +00:00
Elena Demikhovsky	ddf715ef77	Removed an old comment, NFC llvm-svn: 247006	2015-09-08 12:22:22 +00:00
Zoran Jovanovic	9eaa30d2bf	[mips][microMIPS] Implement SB, SBE, SCE, SH and SHE instructions Differential Revision: http://reviews.llvm.org/D11801 llvm-svn: 246999	2015-09-08 10:18:38 +00:00
Jakub Kuderski	7cd4810021	There is a trunc(lshr (zext A), Cst) optimization in InstCombineCasts that removes the cast by performing the lshr on smaller types. However, currently there is no trunc(lshr (sext A), Cst) variant. This patch adds such an optimization by transforming trunc(lshr (sext A), Cst) to ashr A, Cst. Differential Revision: http://reviews.llvm.org/D12520 llvm-svn: 246997	2015-09-08 10:03:17 +00:00
Daniel Sanders	808dfb8ba7	[mips] Reserve address spaces 1-255 for software use. Summary: And define them to have noop casts with address spaces 0-255. Reviewers: pekka.jaaskelainen Subscribers: pekka.jaaskelainen, llvm-commits Differential Revision: http://reviews.llvm.org/D12678 llvm-svn: 246990	2015-09-08 09:07:03 +00:00
Zoran Jovanovic	68be5f21a9	[mips][microMIPS] Add microMIPS32r6 and microMIPS64r6 tests for existing 16-bit LBU16, LHU16, LW16, LWGP and LWSP instructions Differential Revision: http://reviews.llvm.org/D10956 llvm-svn: 246987	2015-09-08 08:25:34 +00:00
Elena Demikhovsky	dec0f0885f	compilation issue, NFC llvm-svn: 246983	2015-09-08 07:34:06 +00:00
Elena Demikhovsky	d240d778b3	fixed compilation issue, NFC. llvm-svn: 246982	2015-09-08 07:10:08 +00:00
Elena Demikhovsky	e88038f235	AVX-512: Lowering for 512-bit vector shuffles. Vector types: <8 x 64>, <16 x 32>, <32 x 16> float and integer. Differential Revision: http://reviews.llvm.org/D10683 llvm-svn: 246981	2015-09-08 06:38:21 +00:00
Zoran Jovanovic	7b85682541	[mips][microMIPS] Implement ABS.fmt, CEIL.L.fmt, CEIL.W.fmt, FLOOR.L.fmt, FLOOR.W.fmt, TRUNC.L.fmt, TRUNC.W.fmt, RSQRT.fmt and SQRT.fmt instructions Differential Revision: http://reviews.llvm.org/D11674 llvm-svn: 246968	2015-09-07 13:01:04 +00:00
Zoran Jovanovic	ada7091812	[mips][microMIPS] Implement BC16, BEQZC16 and BNEZC16 instructions Differential Revision: http://reviews.llvm.org/D11181 llvm-svn: 246963	2015-09-07 11:56:37 +00:00
John Brawn	d8b405abf7	[ARM] Get rid of SelectT2ShifterOperandReg, NFC SelectT2ShifterOperandReg has identical behaviour to SelectImmShifterOperand, so get rid of it and use SelectImmShifterOperand instead. Differential Revision: http://reviews.llvm.org/D12195 llvm-svn: 246962	2015-09-07 11:45:18 +00:00
Zoran Jovanovic	14f308e44f	[mips][microMIPS] Implement CVT.D.fmt, CVT.L.fmt, CVT.S.fmt, CVT.W.fmt, MAX.fmt, MIN.fmt, MAXA.fmt, MINA.fmt and CMP.condn.fmt instructions Differential Revision: http://reviews.llvm.org/D12141 llvm-svn: 246960	2015-09-07 10:31:31 +00:00
NAKAMURA Takumi	0d72539d5a	Prune utf8 chars in comments. llvm-svn: 246953	2015-09-07 00:26:54 +00:00
David Majnemer	135ca40a7d	[InstCombine] Don't divide by zero when evaluating a potential transform Trivial multiplication by zero may survive the worklist. We tried to reassociate the multiplication with a division instruction, causing us to divide by zero; bail out instead. This fixes PR24726. llvm-svn: 246939	2015-09-06 06:49:59 +00:00
Hal Finkel	10aac5fd0e	[SelectionDAG] Swap commutative binops before constant-based folding In searching for a fix for the underlying code-quality bug highlighted by r246937 (that SDAG simplification can lead to us generating an ISD::OR node with a constant zero LHS), I ran across this: We generically canonicalize commutative binary-operation nodes in SDAG getNode so that, if only one operand is a constant, it will be on the RHS. However, we were doing this only after a bunch of constant-based simplification checks that all assume this canonical form (that any constant will be on the RHS). Moving the operand-swapping canonicalization prior to these checks seems like the right thing to do (and, as it turns out, causes SDAG to completely fold away the computation in test/CodeGen/ARM/2012-11-14-subs_carry.ll, just like InstCombine would do). llvm-svn: 246938	2015-09-06 05:42:13 +00:00
Hal Finkel	ccf9259c00	[PowerPC] Don't commute trivial rlwimi instructions To commute a trivial rlwimi instruction (meaning one with a full mask and zero shift), we'd need the ability to form an all-zero mask (instead of an all-one mask) using rlwimi. We can't represent this, however, and we'll miscompile code if we try. The code quality problem that this highlights (that SDAG simplification can lead to us generating an ISD::OR node with a constant zero LHS) will be fixed as a follow-up. Fixes PR24719. llvm-svn: 246937	2015-09-06 04:17:30 +00:00
David Majnemer	daa24b9789	[InstCombine] Don't assume m_Mul gives back an Instruction This fixes PR24713. llvm-svn: 246933	2015-09-05 20:44:56 +00:00
Alexandros Lamprineas	ea33e5e88e	Added arch extensions and default target features in TargetParser. Differential: http://reviews.llvm.org/D11590 llvm-svn: 246930	2015-09-05 17:05:33 +00:00
Zoran Jovanovic	89ca2b982e	[mips][microMIPS] Implement ADD.fmt, SUB.fmt, MOV.fmt, MUL.fmt, DIV.fmt, MADDF.fmt, MSUBF.fmt and NEG.fmt instructions Differential Revision: http://reviews.llvm.org/D11978 llvm-svn: 246919	2015-09-05 09:25:30 +00:00
Craig Topper	02a55d701d	Fix build warning. llvm-svn: 246908	2015-09-05 04:49:44 +00:00
NAKAMURA Takumi	2f9e8c0570	WinCOFFObjectWriter.cpp: Roll back TimeDateStamp along ENABLE_TIMESTAMPS. We want a deterministic output. GNU AS leaves it zero. FIXME: It may be optional by its user, like llc and clang. llvm-svn: 246905	2015-09-05 01:17:49 +00:00
Andrew Kaylor	2a9a6d8c38	Fix build warning llvm-svn: 246903	2015-09-05 01:00:51 +00:00
Hal Finkel	b1518d6c24	[PowerPC] Fix and(or(x, c1), c2) -> rlwimi generation PPCISelDAGToDAG has a transformation that generates a rlwimi instruction from an input pattern that looks like this: and(or(x, c1), c2) but the associated logic does not work if there are bits that are 1 in c1 but 0 in c2 (these are normally canonicalized away, but that can't happen if the 'or' has other users). Make sure we abort the transformation if such bits are discovered. Fixes PR24704. llvm-svn: 246900	2015-09-05 00:02:59 +00:00
Andrew Kaylor	a212aba680	Fix build warning llvm-svn: 246899	2015-09-04 23:58:32 +00:00
Andrew Kaylor	50e4e86c26	[WinEH] Teach SimplifyCFG to eliminate empty cleanup pads. Differential Revision: http://reviews.llvm.org/D12434 llvm-svn: 246896	2015-09-04 23:39:40 +00:00
Kostya Serebryany	e641dd6479	[libFuzzer] more accurate logic for traces, 80-char fix llvm-svn: 246888	2015-09-04 22:32:25 +00:00
Yaron Keren	771e31964d	Remove two unused includes and C++11 rangify for loops. llvm-svn: 246865	2015-09-04 20:24:24 +00:00
Chad Rosier	a67b2d0117	Typo. NFC. llvm-svn: 246851	2015-09-04 12:34:55 +00:00
David Majnemer	5ca46f0df1	[MC] Replace comparison with isUInt<32>. Casting to unsigned long can cause the time to get truncated to 32-bits, making it appear to be a valid timestamp. Just use isUInt<32> instead. llvm-svn: 246840	2015-09-04 07:22:36 +00:00
NAKAMURA Takumi	c95358b1ea	WinCOFFObjectWriter.cpp: Appease a warning in checking std::time_t. [-Wsign-compare] llvm-svn: 246839	2015-09-04 05:19:37 +00:00
Kostya Serebryany	b2e9897644	[libFuzzer] when a single mutation fails try a few more times with other mutations before returning un-mutated data llvm-svn: 246828	2015-09-04 00:40:29 +00:00
Kostya Serebryany	7d21166218	[libFuzzer] actually make the dictionaries work (+docs) llvm-svn: 246825	2015-09-04 00:12:11 +00:00
Hal Finkel	4a7be23976	[PowerPC] Enable interleaved-access vectorization This adds a basic cost model for interleaved-access vectorization (and a better default for shuffles), and enables interleaved-access vectorization by default. The relevant difference from the default cost model for interleaved-access vectorization, is that on PPC, the shuffles that end up being used are much cheaper than modeling the process with insert/extract pairs (which are quite expensive, especially on older cores). llvm-svn: 246824	2015-09-04 00:10:41 +00:00
Hal Finkel	75afa2b6b6	[PowerPC] Always use aggressive interleaving on the A2 On the A2, with an eye toward QPX unaligned-load merging, we should always use aggressive interleaving. It is generally superior to only using concatenation unrolling. llvm-svn: 246819	2015-09-03 23:23:00 +00:00
Hal Finkel	e6702ca0e2	[PowerPC] Try harder to find a base+offset when looking for consecutive accesses When forming permutation-based unaligned vector loads, we need to know whether it is valid to read ahead of the requested address by a full vector length. Doing so is more efficient (and allows for more CSE with later loads), but could trigger a page fault if invalid. To determine validity, we look for other loads in the same block that access the relevant address range. The relevant point here is that we need to do this as part of the process of forming permutation-based vector loads, and this happens quite early in the SDAG pipeline - specifically before many of the address calculations are fully canonicalized. As a result, we need to try harder to recognize base+offset address computations, because they still might appear as a chain of adds (base+offset+offset, for example). To account for this, we'll look through chains of adds, accumulating the constant offsets. llvm-svn: 246813	2015-09-03 22:37:44 +00:00
Sanjoy Das	88d0fdeb00	[IR] Have AttrBuilder::clear clear `TargetDepAttrs`. Test case attached -- currently the parser smears the "foo bar" to all of the formal arguments. llvm-svn: 246812	2015-09-03 22:27:42 +00:00
Philip Reames	3ea158950e	[RewriteStatepointsForGC] Extract common code, comment, and fix a build warning [NFC] llvm-svn: 246810	2015-09-03 21:57:40 +00:00
Philip Reames	f5b8e47651	[RewriteStatepointsForGC] Strengthen invariants around BDVs As a first step towards a new implementation of the base pointer inference algorithm, introduce an abstraction for BDVs, strengthen the assertions around them, and rewrite the BDV relation code in terms of the abstraction which includes an explicit notion of whether the BDV is also a base. The latter is motivated by the fact we had a bug where insertelement was always assumed to be a base pointer even though the BDV code knew it wasn't. The strengthened assertions in this patch would have caught that bug. The next step will be to separate the DefiningValueMap into a BDV use list cache (entirely within findBasePointers) and a base pointer cache. Having the former will allow me to use a deterministic visit order when visiting BDVs in the inference algorithm and remove a bunch of ordering related hacks. Before actually doing the last step, I'm likely going to extend the lattice with a 'BaseN' (seen only base inputs) state so that I can kill the post process optimization step. Phabricator Revision: http://reviews.llvm.org/D12608 llvm-svn: 246809	2015-09-03 21:34:30 +00:00
Kostya Serebryany	ec2dcb1d91	[libFuzzer] refactor the mutation functions so that they are now methods of a class. NFC llvm-svn: 246808	2015-09-03 21:24:19 +00:00
Hal Finkel	f11bc761d8	[PowerPC] Include the permutation cost for unaligned vector loads Pre-P8, when we generate code for unaligned vector loads (for Altivec and QPX types), even when accounting for the combining that takes place for multiple consecutive such loads, there is at least one load instruction and one permutation for each load. Make sure the cost reported reflects the cost of the permutes as well. llvm-svn: 246807	2015-09-03 21:23:18 +00:00
Hal Finkel	99d95328d6	[PowerPC] Compute the MMO offset for an unaligned load with signed arithmetic If you compute the MMO offset using unsigned arithmetic, you end up with a large positive offset instead of a small negative one. In theory, this could cause bad instruction-scheduling decisions later. I noticed this by inspection from the debug output, and using that for the regression test is the best I can do right now. llvm-svn: 246805	2015-09-03 21:12:15 +00:00
Philip Reames	246e618e77	[RewriteStatepointsForGC] Workaround a lack of determinism in visit order The visit order being used in the base pointer inference algorithm is currently non-deterministic. When working on http://reviews.llvm.org/D12583, I discovered that we were relying on a peephole optimization to get deterministic ordering in one of the test cases. This change is intended to let me test and land http://reviews.llvm.org/D12583. The current code will not be long lived. I'm starting to investigate a rewrite of the algorithm which will combine the post-process step into the initial algorithm and make the visit order deterministic. Before doing that, I wanted to make sure the existing code was complete and the tests were stable. Hopefully, patches should be up for review for the new algorithm this week or early next. llvm-svn: 246801	2015-09-03 20:24:29 +00:00
Kostya Serebryany	9838b2be87	[libFuzzer] adding a parser for AFL-style dictionaries + tests. llvm-svn: 246800	2015-09-03 20:23:46 +00:00
Reid Kleckner	df52337bfc	[sancov] Disable sanitizer coverage on functions using SEH Splitting basic blocks really messes up WinEHPrepare. We can remove this change when SEH uses the new EH IR. llvm-svn: 246799	2015-09-03 20:18:29 +00:00
Chad Rosier	6c36eff1d6	[AArch64] Improve ISel using across lane addition reduction. In vectorized add reduction code, the final "reduce" step is sub-optimal. This change will combine : ext v1.16b, v0.16b, v0.16b, #8 add v0.4s, v1.4s, v0.4s dup v1.4s, v0.s[1] add v0.4s, v1.4s, v0.4s into addv s0, v0.4s PR21371 http://reviews.llvm.org/D12325 Patch by Jun Bum Lim <junbuml@codeaurora.org>! llvm-svn: 246790	2015-09-03 18:13:57 +00:00
Karl Schimpf	7772978ccf	Allow global address space forward decls using IDs in .ll files. Summary: This fixes bugzilla bug 24656. Fixes the case where there is a forward reference to a global variable using an ID (i.e. @0). It does this by passing the address space of the initializer pointer for which the forward referenced global is used. llvm-svn: 246788	2015-09-03 18:06:44 +00:00
Reid Kleckner	1f13d4789f	Sink COFF.h MC include into .cpp files This prevents MC clients from getting COFF.h, which conflicts with winnt.h macros. Also a minor IWYU cleanup. Now the only public headers including COFF.h are in Object, and they actually need it. llvm-svn: 246784	2015-09-03 16:41:50 +00:00
Chad Rosier	08ef462d15	Revert "[AArch64] Improve load/store optimizer to handle LDUR + LDR." This reverts commit r246769. This appears to have broken Multisource/Benchmarks/tramp3d-v4. llvm-svn: 246782	2015-09-03 16:41:28 +00:00
Sanjay Patel	c9ae9d72f8	[x86] enable machine combiner reassociations for scalar 'xor' insts llvm-svn: 246781	2015-09-03 16:36:16 +00:00
Karl Schimpf	44876c535f	Fix assertion failure in LLParser::ConvertValIDToValue Summary: Fixes bug 24645. Problem appears to be that the type may be undefined when ConvertValIDToValue is called. Reviewers: kcc Subscribers: llvm-commits llvm-svn: 246779	2015-09-03 16:18:32 +00:00
Karl Schimpf	f04a5d5978	Fix SEGV in InlineAsm::ConstraintInfo::Parse. Summary: Fixes bug 24646. Previous code was not checking if an index into a vector was valid, resulting in a SEGV. Fixed by assuming the construct can't be parsed when given this input. Reformat and add test. Differential Revision: http://reviews.llvm.org/D12539 llvm-svn: 246774	2015-09-03 15:41:37 +00:00
Karl Schimpf	388fd5ae2f	Fix SEGV in InlineAsm::ConstraintInfo::Parse. Fixes bug 24646. Previous code was not checking if an index into a vector was valid, resulting in a SEGV. Fixed by assuming the construct can't be parsed when given this input. llvm-svn: 246773	2015-09-03 15:41:34 +00:00
Sanjay Patel	ce74db9d8d	check for fastness before merging in DAGCombiner::MergeConsecutiveStores() Use and check the 'IsFast' optional parameter to TLI.allowsMemoryAccess() any time we have a merged access candidate. Without this patch, we were generating unaligned 16-byte (SSE) memops for x86 targets where those accesses are slow. This change was mentioned in: http://reviews.llvm.org/D10662 and http://reviews.llvm.org/D10905 and will help solve PR21711. Differential Revision: http://reviews.llvm.org/D12573 llvm-svn: 246771	2015-09-03 15:03:19 +00:00
Chad Rosier	491a1bd998	[AArch64] Improve load/store optimizer to handle LDUR + LDR. This patch allows the mixing of scaled and unscaled load/stores to form load/store pairs. PR24465 http://reviews.llvm.org/D12116 Many thanks to Ahmed and Michael for fixes and code review. llvm-svn: 246769	2015-09-03 14:41:37 +00:00
Chad Rosier	5f668e170a	[AArch64] Reuse MayLoad. NFC. llvm-svn: 246767	2015-09-03 14:19:43 +00:00
Daniel Sanders	3ebcaf6685	[mips] Added support for the div, divu, ddiv and ddivu macros which use traps and breaks in the integrated assembler. Summary: Patch by Scott Egerton Reviewers: vkalintiris, dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11675 llvm-svn: 246763	2015-09-03 12:31:22 +00:00
Joseph Tremoulet	61efbc32a6	[WinEH] Add llvm.eh.exceptionpointer intrinsic Summary: This intrinsic can be used to extract a pointer to the exception caught by a given catchpad. Its argument has token type and must be a `catchpad`. Also clarify ExtendingLLVM documentation regarding overloaded intrinsics. Reviewers: majnemer, andrew.w.kaylor, sanjoy, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12533 llvm-svn: 246752	2015-09-03 09:15:32 +00:00
Joseph Tremoulet	9ce71f76b9	[WinEH] Add cleanupendpad instruction Summary: Add a `cleanupendpad` instruction, used to mark exceptional exits out of cleanups (for languages/targets that can abort a cleanup with another exception). The `cleanupendpad` instruction is similar to the `catchendpad` instruction in that it is an EH pad which is the target of unwind edges in the handler and which itself has an unwind edge to the next EH action. The `cleanupendpad` instruction, similar to `cleanupret` has a `cleanuppad` argument indicating which cleanup it exits. The unwind successors of a `cleanuppad`'s `cleanupendpad`s must agree with each other and with its `cleanupret`s. Update WinEHPrepare (and docs/tests) to accommodate `cleanupendpad`. Reviewers: rnk, andrew.w.kaylor, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12433 llvm-svn: 246751	2015-09-03 09:09:43 +00:00
Igor Breger	0dcd8bcf24	AVX512: Implemented encoding and intrinsics for vplzcntq, vplzcntd, vpconflictq, vpconflictd Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D11931 llvm-svn: 246750	2015-09-03 09:05:31 +00:00
JF Bastien	3a4ad61c2f	[MergeFuncs] Efficiently defer functions on merge Summary: This patch introduces a side table in Merge Functions to efficiently remove functions from the function set when functions they refer to are merged. Previously these functions would need to be compared lg(N) times to find the appropriate FunctionNode in the tree to defer. With the recent determinism changes, this comparison is more expensive. In addition, the removal function would not always actually remove the function from the set (i.e. after remove(F), there would sometimes still be a node in the tree which contains F). With these changes, these functions are properly deferred, and so more functions can be merged. In addition, when there are many merged functions (and thus more deferred functions), there is a speedup: chromium: 48678 merged -> 49380 merged; 6.58s -> 5.49s libxul.so: 41004 merged -> 41030 merged; 8.02s -> 6.94s mysqld: 1607 merged -> 1607 merged (same); 0.215s -> 0.212s (probably noise) Author: jrkoenig Reviewers: jfb, dschuff Subscribers: llvm-commits, nlewycky Differential revision: http://reviews.llvm.org/D12537 llvm-svn: 246735	2015-09-02 23:55:23 +00:00
Kostya Serebryany	6ea1b69fcf	[libFuzzer] deprecate the -tokens flag. This was a bad idea because the corpus with this flag contains encrypted inputs, not the real inputs, which complicates interoperation with other fuzzers. Instead we'll need to implement AFL dictionary support llvm-svn: 246734	2015-09-02 23:27:39 +00:00
Ahmed Bougacha	b03ea02479	[X86] Require 32-byte alignment for 32-byte VMOVNTs. We used to accept (and even test, and generate) 16-byte alignment for 32-byte nontemporal stores, but they require 32-byte alignment, per SDM. Found by inspection. Instead of hardcoding 16 in the patfrag, check for natural alignment. Also fix the autoupgrade and the various tests. Also, use explicit -mattr instead of -mcpu: I stared at the output several minutes wondering why I get 2x movntps for the unaligned case (which is the ideal output, but needs some work: see FIXME), until I remembered corei7-avx implies +slow-unaligned-mem-32. llvm-svn: 246733	2015-09-02 23:25:39 +00:00
Douglas Katzman	78425200ee	Add Myriad into enum VendorType Differential Revision: http://reviews.llvm.org/D12540 llvm-svn: 246732	2015-09-02 23:11:25 +00:00
Justin Bogner	0ffea9d47f	IR: Remove an unused AssemblyWriter constructor. NFC llvm-svn: 246729	2015-09-02 22:46:15 +00:00
Philip Reames	07a2ee1aff	[RewriteStatepointsForGC] Delete stale comment [NFC] llvm-svn: 246722	2015-09-02 22:35:42 +00:00
Philip Reames	b3967cd08e	[RewriteStatepointsForGC] Pull a function out of anon namespace [NFC] Thanks to David Blaikie for noticing in previous commit. llvm-svn: 246721	2015-09-02 22:30:53 +00:00
Justin Bogner	7fe2469150	IR: Remove a redundant function. NFC Function::print isn't interestingly different from Value::print. Just let the only caller (in PrintCallGraphPass) call the Value version. llvm-svn: 246720	2015-09-02 22:28:47 +00:00
Ahmed Bougacha	bc72ad7b27	[X86] Cleanup nontemporal fragments. NFCI. We can chain other fragments to avoid repeating conditions. This also fixes a potential bug (that realistically can't happen), where we would match indexed nontemporal stores for i32/i64. llvm-svn: 246719	2015-09-02 22:27:38 +00:00
Philip Reames	9546f367f7	[RewriteStatepointsForGC] Bugfix for change 246133 Fix a bug in change 246133. I didn't handle the case where we had a cycle in the use graph and could add an instruction we were about to erase back on to the worklist. Oddly, I have not been able to write a small test case for this, even with the AssertingVH added. I have confirmed the basic theory for the fix on a large failing example, but all attempts to reduce that to something appropriate for a test case have failed. Differential Revision: http://reviews.llvm.org/D12575 llvm-svn: 246718	2015-09-02 22:25:07 +00:00
Philip Reames	6906e92812	Fix release build warning for unused function llvm-svn: 246717	2015-09-02 21:57:17 +00:00
Philip Reames	dab35f317d	[RewriteStatepointsForGC] Improve debug output [NFC] llvm-svn: 246713	2015-09-02 21:11:44 +00:00
Hal Finkel	79dbf5b562	[PowerPC] Cleanup cost model for unaligned vector loads/stores I'm adding a regression test to better cover code generation for unaligned vector loads and stores, but there's no functional change to the code generation here. There is an improvement to the cost model for unaligned vector loads and stores, mostly for QPX (for which we were not previously accounting for the permutation-based loads), and the cost model implementation is cleaner. llvm-svn: 246712	2015-09-02 21:03:28 +00:00
Douglas Katzman	a26be4a946	Move twice-repeated clang path operation into a new function. And make it more robust in the edge case of exactly "./" as input. llvm-svn: 246711	2015-09-02 21:02:10 +00:00
Piotr Padlewski	0c7d8fc1f6	assume(X) handling in GVN bugfix There was an infinite loop because it was trying to change assume(true) into assume(true). Also added handling for when assume(false) appears. http://reviews.llvm.org/D12516 llvm-svn: 246697	2015-09-02 20:00:03 +00:00
Piotr Padlewski	28ffcbe1cc	Constant propagation after hitting assume(cmp) bugfix Last time the code ran into the assertion `BBE.isSingleEdge()` in lib/IR/Dominators.cpp:200. http://reviews.llvm.org/D12170 llvm-svn: 246696	2015-09-02 19:59:59 +00:00
Piotr Padlewski	14e815c22b	Constant propagation after hitting llvm.assume After hitting @llvm.assume(X) we can: - propagate equality that X == true - if X is icmp/fcmp (with eq operation), and one of the operands is constant we can change all variables with constants in the same BasicBlock http://reviews.llvm.org/D11918 llvm-svn: 246695	2015-09-02 19:59:53 +00:00
Benjamin Kramer	f175e04435	[RemoveDuplicatePHINodes] Start over after removing a PHI. This makes RemoveDuplicatePHINodes more effective and fixes an assertion failure. Triggering the assertions requires a DenseSet reallocation so this change only contains a constructive test. I'll explain the issue with a small example. In the following function there's a duplicate PHI, %4 and %5 are identical. When this is found the DenseSet in RemoveDuplicatePHINodes contains %2, %3 and %4. define void @F() { br label %1 ; <label>:1 ; preds = %1, %0 %2 = phi i32 [ 42, %0 ], [ %4, %1 ] %3 = phi i32 [ 42, %0 ], [ %5, %1 ] %4 = phi i32 [ 42, %0 ], [ 23, %1 ] %5 = phi i32 [ 42, %0 ], [ 23, %1 ] br label %1 } after RemoveDuplicatePHINodes runs the function looks like this. %3 has changed and is now identical to %2, but RemoveDuplicatePHINodes never saw this. define void @F() { br label %1 ; <label>:1 ; preds = %1, %0 %2 = phi i32 [ 42, %0 ], [ %4, %1 ] %3 = phi i32 [ 42, %0 ], [ %4, %1 ] %4 = phi i32 [ 42, %0 ], [ 23, %1 ] br label %1 } If the DenseSet does a reallocation now it will reinsert all keys and stumble over %3 now having a different hash value than it had when inserted into the map for the first time. This change clears the set whenever a PHI is deleted and starts the progress from the beginning, allowing %3 to be deleted and avoiding inconsistent DenseSet state. This potentially has a negative performance impact because it rescans all PHIs, but I don't think that this ever makes a difference in practice. llvm-svn: 246694	2015-09-02 19:52:23 +00:00
Sanjay Patel	42574203e5	use "unpredictable" metadata in fast-isel when splitting compares This patch uses the metadata defined in D12341 to avoid creating an unpredictable branch. Differential Revision: http://reviews.llvm.org/D12342 llvm-svn: 246692	2015-09-02 19:23:23 +00:00
Sanjay Patel	fff7c6dc73	use "unpredictable" metadata in SelectionDAG when splitting compares This patch uses the metadata defined in D12341 to avoid creating an unpredictable branch. Differential Revision: http://reviews.llvm.org/D12343 llvm-svn: 246691	2015-09-02 19:17:25 +00:00
Kostya Serebryany	a9346c2e65	[libFuzzer] honour -only_ascii=1 when reading the initial corpus. Also, remove ugly #ifdef llvm-svn: 246689	2015-09-02 19:08:08 +00:00
Sanjay Patel	a99ab1f536	add unpredictable metadata type for control flow This patch defines 'unpredictable' metadata. This metadata can be used to signal to the optimizer or backend that a branch or switch is unpredictable, and therefore, it's probably better to not split a compound predicate into multiple branches such as in CodeGenPrepare::splitBranchCondition(). This was discussed in: https://llvm.org/bugs/show_bug.cgi?id=23827 Dependent patches to alter codegen and expose this in clang to follow. Differential Revision: http://reviews.llvm.org/D12341 llvm-svn: 246688	2015-09-02 19:06:43 +00:00
Ahmed Bougacha	63fae0e58b	[AArch64] More consistently separate asm opc and operands with '\t'. Somehow missed these in r246686. llvm-svn: 246687	2015-09-02 18:52:54 +00:00
Ahmed Bougacha	cca07716f5	[AArch64] Consistently separate asm opc and operands with '\t'. Some of the instructions use ' ', which drives OCD-me nuts. Let's put an end to this. NFC-ish: hopefully nobody cares about whitespace. llvm-svn: 246686	2015-09-02 18:38:36 +00:00
Justin Bogner	58e0823ee9	IR: Invert a condition to make it more legible. NFC Also updates the style to more modern conventions. llvm-svn: 246681	2015-09-02 17:54:41 +00:00
James Molloy	569cea65f0	[ValueTracking] Look through casts when both operands are casts. We only looked through casts when one operand was a constant. We can also look through casts when both operands are non-constant, but both are in fact the same cast type. For example: %1 = icmp ult i8 %a, %b %2 = zext i8 %a to i32 %3 = zext i8 %b to i32 %4 = select i1 %1, i32 %2, i32 %3 llvm-svn: 246678	2015-09-02 17:25:25 +00:00
Hal Finkel	77c8b7ffd3	[PowerPC] Don't always consider P8Altivec-only masks in LowerVECTOR_SHUFFLE LowerVECTOR_SHUFFLE needs to decide whether to pass a vector shuffle off to the TableGen-generated matching code, and it does this by testing the same predicates used by the TableGen files. Unfortunately, when we added new P8Altivec-only predicates, we started universally testing them in LowerVECTOR_SHUFFLE, and if they then matched when targeting a system prior to a P8, we'd end up with a selection failure. llvm-svn: 246675	2015-09-02 16:52:37 +00:00
Sanjay Patel	fbcd189f8a	[x86] fix allowsMisalignedMemoryAccesses() for 8-byte and smaller accesses This is a continuation of the fix from: http://reviews.llvm.org/D10662 and discussion in: http://reviews.llvm.org/D12154 Here, we distinguish slow unaligned SSE (128-bit) accesses from slow unaligned scalar (64-bit and under) accesses. Other lowering (eg, getOptimalMemOpType) assumes that unaligned scalar accesses are always ok, so this changes allowsMisalignedMemoryAccesses() to match that behavior. Differential Revision: http://reviews.llvm.org/D12543 llvm-svn: 246658	2015-09-02 15:42:49 +00:00
Asaf Badouh	d2c3599c5f	[X86][AVX512VLBW] add support in byte shift and SAD add byte shift left/right add SAD - compute sum of absolute differences Differential Revision: http://reviews.llvm.org/D12479 llvm-svn: 246654	2015-09-02 14:21:54 +00:00
Joseph Tremoulet	917c7382c1	[TableGen] Allow TokenTy in intrinsic signatures Summary: Add the necessary plumbing so that llvm_token_ty can be used as an argument/return type in intrinsic definitions and correspondingly require TokenTy in function types. TokenTy is an opaque type that has no target lowering, but can be used in machine-independent intrinsics. It is required for the upcoming llvm.eh.padparam intrinsic. Reviewers: majnemer, rnk Subscribers: stoklund, llvm-commits Differential Revision: http://reviews.llvm.org/D12532 llvm-svn: 246651	2015-09-02 13:36:25 +00:00
Igor Breger	1e58e8adf6	AVX512: Implemented encoding and intrinsics for VGETMANTPD/S , VGETMANTSD/S instructions Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D11593 llvm-svn: 246642	2015-09-02 11:18:55 +00:00
Igor Breger	a6297c701e	AVX512: Implemented encoding and intrinsics for vshufps/d. Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D11709 llvm-svn: 246640	2015-09-02 10:50:58 +00:00
James Molloy	1e583704f5	[LV] Don't bail to MiddleBlock if a runtime check fails, bail to ScalarPH instead We were bailing to two places if our runtime checks failed. If the initial overflow check failed, we'd go to ScalarPH. If any other check failed, we'd go to MiddleBlock. This caused us to have to have an extra PHI per induction and reduction as the vector loop's exit block was not dominated by its latch. There's no need to have this behavior - if we just always go to ScalarPH we can get rid of a bunch of complexity. llvm-svn: 246637	2015-09-02 10:15:39 +00:00
James Molloy	f2523e38d8	[LV] Move some code around slightly to make the intent of the function more clear. NFC. llvm-svn: 246636	2015-09-02 10:15:32 +00:00
James Molloy	aca2f400ba	[LV] Cleanup: Sink an IRBuilder closer to its uses. NFC. llvm-svn: 246635	2015-09-02 10:15:27 +00:00
James Molloy	cba9230507	[LV] Refactor all runtime check emissions into helper functions. This reduces the complexity of createEmptyBlock() and will open the door to further refactoring. The test change is simply because we're now constant folding a trivial test. llvm-svn: 246634	2015-09-02 10:15:22 +00:00
James Molloy	ff623dce39	[LV] Pull creation of trip counts into a helper function. ... and do a tad of tidyup while we're at it. Because StartIdx must now be zero, there's no difference between Count and EndIdx. llvm-svn: 246633	2015-09-02 10:15:16 +00:00
James Molloy	239ff5d193	[LV] Factor the creation of the loop induction variable out of createEmptyLoop() It makes things easier to understand if this is in a helper method. This is part of my ongoing spaghetti-removal operation on createEmptyLoop. llvm-svn: 246632	2015-09-02 10:15:09 +00:00
James Molloy	a860a2216a	[LV] Never widen an induction variable. There's no need to widen canonical induction variables. It's just as efficient to create a new, wide, induction variable. Consider, if we widen an indvar, then we'll have to truncate it before its uses anyway (1 trunc). If we create a new indvar instead, we'll have to truncate that instead (1 trunc) [besides which IndVars should go and clean up our mess after us anyway on principle]. This lets us remove a ton of special-casing code. llvm-svn: 246631	2015-09-02 10:15:05 +00:00
James Molloy	c07701b017	[LV] Switch to using canonical induction variables. Vectorized loops only ever have one induction variable. All induction PHIs from the scalar loop are rewritten to be in terms of this single indvar. We were trying very hard to pick an indvar that already existed, even if that indvar wasn't canonical (didn't start at zero). But trying so hard is really fruitless - creating a new, canonical, indvar only results in one extra add in the worst case and that add is trivially easy to push through the PHI out of the loop by instcombine. If we try and be less clever here and instead let instcombine clean up our mess (as we do in many other places in LV), we can remove unneeded complexity. llvm-svn: 246630	2015-09-02 10:14:54 +00:00
Elena Demikhovsky	9f83c7346f	AVX-512: store <4 x i1> and <2 x i1> values in memory Enabled DAG pattern lowering for SKX with DQI predicate. Differential Revision: http://reviews.llvm.org/D12550 llvm-svn: 246625	2015-09-02 09:20:58 +00:00
Elena Demikhovsky	1b9d6914d3	Optimization for Gather/Scatter with uniform base Vector 'getelementptr' with scalar base is an opportunity for gather/scatter intrinsic to generate a better sequence. While looking for uniform base, we want to use the scalar base pointer of GEP, if exists. Differential Revision: http://reviews.llvm.org/D11121 llvm-svn: 246622	2015-09-02 08:39:13 +00:00
Yaron Keren	611c7cff53	Move createEliminateAvailableExternallyPass earlier in the pass pipeline to save running many ModulePasses on available external functions that are thrown away anyhow. llvm-svn: 246619	2015-09-02 06:34:11 +00:00
Vedant Kumar	b5c2fd7257	[CodeGen] Fix FREM on 32-bit MSVC on x86 Patch by Dylan McKay! Differential Revision: http://reviews.llvm.org/D12099 llvm-svn: 246615	2015-09-02 01:31:58 +00:00
David Majnemer	088ba020dd	[MC] Generate a timestamp for COFF object files The MS incremental linker seems to inspect the timestamp written into the object file to determine whether or not its contents need to be considered. Failing to set the timestamp to a date newer than the executable will result in the object file not participating in subsequent links. To ameliorate this, write the current time into the object file's TimeDateStamp field. llvm-svn: 246607	2015-09-01 23:46:11 +00:00
David Majnemer	83c862ad52	[MC] Remove MCAssembler's copy of OS We can just ask the ObjectWriter for its stream instead of caching our own reference to it. No functionality change is intended. llvm-svn: 246604	2015-09-01 23:19:38 +00:00
Ahmed Bougacha	699a9dd7c3	[ARM] Don't abort on variable-idx extractelt in ReconstructShuffle. The code introduced in r244314 assumed that EXTRACT_VECTOR_ELT only takes constant indices, but it does accept variables. Bail out for those: we can't use them, as the shuffles we want to reconstruct do require constant masks. llvm-svn: 246594	2015-09-01 21:56:00 +00:00
David Majnemer	6ddc636862	[MC] Add support for generating COFF CRCs COFF sections are accompanied with an auxiliary symbol which includes a checksum. This checksum used to be filled with just zero but this seems to upset LINK.exe when it is processing a /INCREMENTAL link job. Instead, fill the CheckSum field with the JamCRC of the section contents. This matches MSVC's behavior. This fixes PR19666. N.B. A rather simple implementation of JamCRC is given. It implements a byte-wise calculation using the method given by Sarwate. There are implementations with higher throughput like slice-by-eight and making use of PCLMULQDQ. We can switch to one of those techniques if it turns out to be a significant use of time. llvm-svn: 246590	2015-09-01 21:23:58 +00:00
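The byte-wise, table-driven calculation that r246590 attributes to Sarwate can be sketched as follows. This is a minimal illustration, not the code from the commit: the names `makeCrcTable` and `jamCRC` are hypothetical, and the sketch assumes the usual definition of JamCRC as CRC-32 (reflected polynomial 0xEDB88320) with an initial value of 0xFFFFFFFF and no final inversion.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Build the 256-entry lookup table for the reflected CRC-32 polynomial.
// (Hypothetical helper; not taken from the LLVM sources.)
static std::array<uint32_t, 256> makeCrcTable() {
  std::array<uint32_t, 256> Table{};
  for (uint32_t I = 0; I < 256; ++I) {
    uint32_t C = I;
    for (int Bit = 0; Bit < 8; ++Bit)
      C = (C & 1) ? (0xEDB88320u ^ (C >> 1)) : (C >> 1);
    Table[I] = C;
  }
  return Table;
}

// JamCRC over a buffer, one table lookup per input byte.
// Unlike plain CRC-32, the result is not inverted at the end.
uint32_t jamCRC(const uint8_t *Data, size_t Size, uint32_t Crc = 0xFFFFFFFFu) {
  assert((Data != nullptr || Size == 0) && "null buffer with non-zero size");
  static const std::array<uint32_t, 256> Table = makeCrcTable();
  for (size_t I = 0; I < Size; ++I)
    Crc = Table[(Crc ^ Data[I]) & 0xFF] ^ (Crc >> 8);
  return Crc;
}
```

The `Crc` parameter lets a caller feed section contents in several chunks by passing the previous result back in. Slice-by-eight or PCLMULQDQ variants, which the commit message mentions as faster alternatives, would replace only the inner loop.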
Sanjay Patel	30145677a8	rename "slow-unaligned-mem-under-32" to "slow-unaligned-mem-16" (NFCI) This is a follow-on suggested by: http://reviews.llvm.org/D12154 ( http://reviews.llvm.org/rL245729 ) http://reviews.llvm.org/D10662 ( http://reviews.llvm.org/rL245075 ) This makes the attribute name match most of the existing lowering logic and regression test expectations. But the current use of this attribute is inconsistent; see the FIXME comment for "allowsMisalignedMemoryAccesses()". That change will result in functional changes and should be coming soon. llvm-svn: 246585	2015-09-01 20:51:51 +00:00
Hans Wennborg	dada1d20ba	DeadArgElim: don't eliminate arguments from naked functions Differential Revision: http://reviews.llvm.org/D12534 llvm-svn: 246564	2015-09-01 18:06:46 +00:00
Artem Belevich	020d4fb17f	New bitcode linker flags: -only-needed -- link in only symbols needed by destination module -internalize -- internalize linked symbols Differential Revision: http://reviews.llvm.org/D12459 llvm-svn: 246561	2015-09-01 17:55:55 +00:00
Ahmed Bougacha	b0ff6437cb	[AArch64] Lower READCYCLECOUNTER using MRS PMCCNTR_EL0. This matches the ARM behavior. In both cases, the register is part of the optional Performance Monitors extension, so, add the feature, and enable it for the A-class processors we support. Differential Revision: http://reviews.llvm.org/D12425 llvm-svn: 246555	2015-09-01 16:23:45 +00:00
David Majnemer	abdb2d2aba	[MC] Allow MCObjectWriter's output stream to be swapped out There are occasions where it is useful to consider the entirety of the contents of a section. For example, compressed debug info needs the entire section available before it can compress it and write it out. The compressed debug info scenario was previously implemented by mirroring the implementation of writeSectionData in the ELFObjectWriter. Instead, allow the output stream to be swapped on demand. This lets callers redirect the output stream to a more convenient location before it hits the object file. No functionality change is intended. Differential Revision: http://reviews.llvm.org/D12509 llvm-svn: 246554	2015-09-01 16:19:03 +00:00
Igor Breger	f6f1bb6ddc	AVX512: Implemented intrinsics for valign. Differential Revision: http://reviews.llvm.org/D12526 llvm-svn: 246551	2015-09-01 15:27:18 +00:00
Silviu Baranga	755ec0e027	[AArch64] Turn on by default interleaved access vectorization Summary: This change turns on by default interleaved access vectorization for AArch64. We also clean up some tests which were specifically enabling this behaviour. Reviewers: rengolin Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D12149 llvm-svn: 246542	2015-09-01 11:26:46 +00:00
Silviu Baranga	e748c9ef55	[ARM] Turn on by default interleaved access vectorization Summary: This change turns on by default interleaved access vectorization on ARM, as it has shown to be beneficial on ARM. Reviewers: rengolin Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D12146 llvm-svn: 246541	2015-09-01 11:19:15 +00:00
Silviu Baranga	6d3f05c04b	[ARM][AArch64] Turn on by default interleaved access lowering Summary: Interleaved access lowering removes a memory operation and a sequence of vector shuffles and replaces it with a series of memory operations. This should be always beneficial. This pass is only enabled on ARM/AArch64. Reviewers: rengolin Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D12145 llvm-svn: 246540	2015-09-01 11:12:35 +00:00
Yaron Keren	55f5c3d43b	Fix typo. llvm-svn: 246538	2015-09-01 10:13:49 +00:00
Matt Arsenault	51d2d0f668	AMDGPU: Fix adding redundant implicit operands These are already added during the MachineInstr construction, so this was adding the implicit registers twice. llvm-svn: 246525	2015-09-01 02:02:21 +00:00
Cong Hou	511298b919	Distribute the weight on the edge from switch to default statement to edges generated in lowering switch. Currently, when edge weights are assigned to edges that are created when lowering switch statement, the weight on the edge to default statement (let's call it "default weight" here) is not considered. We need to distribute this weight properly. However, without value profiling, we have no idea how to distribute it. In this patch, I applied the heuristic that this weight is evenly distributed to successors. For example, given a switch statement with cases 1,2,3,5,10,11,20, and every edge from switch to each successor has weight 10. If there is a binary search tree built to test if n < 10, then its two out-edges will have weight 4x10+10/2 = 45 and 3x10 + 10/2 = 35 respectively (currently they are 40 and 30 without considering the default weight). Each distribution (which is 5 here) will be stored in each SwitchWorkListItem for further distribution. There are some exceptions: For a jump table header which doesn't have any edge to default statement, we don't distribute the default weight to it. For a bit test header which covers a contiguous range and hence has no edges to default statement, we don't distribute the default weight to it. When the branch checks a single value or a contiguous range with no edge to default statement, we don't distribute the default weight to it. In other cases, the default weight is evenly distributed to successors. Differential Revision: http://reviews.llvm.org/D12418 llvm-svn: 246522	2015-09-01 01:42:16 +00:00
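The arithmetic in the r246522 example can be reproduced with a small sketch. The helper `splitWeights` is hypothetical and only models the even-split heuristic for one binary-search node; it does not model the jump table, bit test, or contiguous-range exceptions listed in the commit message, and it is not the code from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// For a binary split "n < Pivot", return {weight of the left out-edge,
// weight of the right out-edge}. Every case edge carries CaseWeight, and
// the default weight is divided evenly between the two successors.
// (Hypothetical helper for illustration only.)
std::pair<uint32_t, uint32_t> splitWeights(const std::vector<int> &Cases,
                                           uint32_t CaseWeight,
                                           uint32_t DefaultWeight, int Pivot) {
  assert(!Cases.empty() && "a switch needs at least one case");
  uint32_t Left = 0;
  for (int C : Cases)
    if (C < Pivot)
      ++Left;
  uint32_t Right = static_cast<uint32_t>(Cases.size()) - Left;
  uint32_t Share = DefaultWeight / 2; // the "distribution" kept per work item
  return {Left * CaseWeight + Share, Right * CaseWeight + Share};
}
```

With cases 1,2,3,5,10,11,20, a case weight of 10, a default weight of 10 and a pivot of 10, this gives 45 for the left edge and 35 for the right edge, matching the figures quoted in the message.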
Duncan P. N. Exon Smith	f4967754a5	LTO: Cleanup parameter names and header docs, NFC Follow LLVM style for the parameter names (`CamelCase` not `camelCase`), and surface the header docs in doxygen. No functionality change intended. llvm-svn: 246509	2015-08-31 23:44:06 +00:00
Hal Finkel	1baec5323b	[DAGCombine] Fixup SETCC legality checking SETCC is one of those special node types for which operation actions (legality, etc.) is keyed off of an operand type, not the node's value type. This makes sense because the value type of a legal SETCC node is determined by its operands' value type (via the TLI function getSetCCResultType). When the SDAGBuilder creates SETCC nodes, it either creates them with an MVT::i1 value type, or directly with the value type provided by TLI.getSetCCResultType. The first problem being fixed here is that DAGCombine had several places querying TLI.isOperationLegal on SETCC, but providing the return of getSetCCResultType, instead of the operand type directly. This does not mean what the author thought, and "luckily", most in-tree targets have SETCC with Custom lowering, instead of marking them Legal, so these checks return false anyway. The second problem being fixed here is that two of the DAGCombines could create SETCC nodes with arbitrary (integer) value types; specifically, those that would simplify: (setcc a, b, op1) and\|or (setcc a, b, op2) -> setcc a, b, op3 (which is possible for some combinations of (op1, op2)) If the operands of the and\|or node are actual setcc nodes, then this is not an issue (because the and\|or must share the same type), but, the relevant code in DAGCombiner::visitANDLike and DAGCombiner::visitORLike actually calls DAGCombiner::isSetCCEquivalent on each operand, and that function will recognise setcc-like select_cc nodes with other return types. And, thus, when creating new SETCC nodes, we need to be careful to respect the value-type constraint. This is even true before type legalization, because it is quite possible for the SELECT_CC node to have a legal type that does not happen to match the corresponding TLI.getSetCCResultType type. 
To be explicit, there is nothing that later fixes the value types of SETCC nodes (if the type is legal, but does not happen to match TLI.getSetCCResultType). Creating SETCCs with an MVT::i1 value type seems to work only because, either MVT::i1 is not legal, or it is what TLI.getSetCCResultType returns if it is legal. Fixing that is a larger change, however. For the time being, restrict the relevant transformations to produce only SETCC nodes with a value type matching TLI.getSetCCResultType (or MVT::i1 prior to type legalization). Fixes PR24636. llvm-svn: 246507	2015-08-31 23:15:04 +00:00
Sanjay Patel	719b3e6a3e	don't set a legal vector type if we know we can't use that type (NFCI) Added benefit: the 'if' logic now matches the text of the comment that describes it. llvm-svn: 246506	2015-08-31 22:59:03 +00:00
Quentin Colombet	5989bc6f41	[BasicAA] Fix the handling of sext and zext in the analysis of GEPs. Hopefully this will end the GEPs saga! This commit reverts r245394, i.e., it reapplies r221876 while incorporating the fixes from D11847. r221876 was not reapplied alone because it was not safe and D11847 was not applied alone because it needs r221876 to produce correct results. This should fix PR24596. Original commit message for r221876: Let's try this again... This reverts r219432, plus a bug fix. Description of the bug in r219432 (by Nick): The bug was using AllPositive to break out of the loop; if the loop break condition i != e is changed to i != e && AllPositive then the test_modulo_analysis_with_global test I've added will fail as the Modulo will be calculated incorrectly (as the last loop iteration is skipped, so Modulo isn't updated with its Scale). Nick also adds this comment: ComputeSignBit is safe to use in loops as it takes into account phi nodes, and the == EK_ZeroEx check is safe in loops as, no matter how the variable changes between iterations, zero-extensions will always guarantee a zero sign bit. The isValueEqualInPotentialCycles check is therefore definitely not needed as all the variable analysis holds no matter how the variables change between loop iterations. And this patch also adds another enhancement to GetLinearExpression - basically to convert ConstantInts to Offsets (see test_const_eval and test_const_eval_scaled for the situations this improves). Original commit message: This reverts r218944, which reverted r218714, plus a bug fix. Description of the bug in r218714 (by Nick): The original patch forgot to check if the Scale in VariableGEPIndex flipped the sign of the variable. The BasicAA pass iterates over the instructions in the order they appear in the function, and so BasicAliasAnalysis::aliasGEP is called with the variable it first comes across as parameter GEP1. 
Adding a %reorder label puts the definition of %a after %b so aliasGEP is called with %b as the first parameter and %a as the second. aliasGEP later calculates that %a == %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) - ignoring that %idxprom is scaled by -1 here led the patch to incorrectly conclude that %a > %b. Revised patch by Nick White, thanks! Thanks to Lang for isolating the bug. Slightly modified by me to add an early exit from the loop and avoid unnecessary, but expensive, function calls. Original commit message: Two related things: 1. Fixes a bug when calculating the offset in GetLinearExpression. The code previously used zext to extend the offset, so negative offsets were converted to large positive ones. 2. Enhance aliasGEP to deduce that, if the difference between two GEP allocations is positive and all the variables that govern the offset are also positive (i.e. the offset is strictly after the higher base pointer), then locations that fit in the gap between the two base pointers are NoAlias. Patch by Nick White! Message from D11847: Un-revert of r241981 and fix for PR23626. The 'Or' case of GetLinearExpression delegates to 'Add' if possible, and if not it returns an Opaque value. Unfortunately the Scale and Offsets weren't being set (and so defaulted to 0) - and a scale of zero effectively removes the variable from the GEP instruction. This meant that BasicAA would return MustAliases when it should have been returning PartialAliases (and PR23626 was an example of the GVN pass using an incorrect MustAlias to merge loads from what should have been different pointers). Differential Revision: http://reviews.llvm.org/D11847 Patch by Nick White <n.j.white@gmail.com>! llvm-svn: 246502	2015-08-31 22:32:47 +00:00
JF Bastien	73ff6afa87	WebAssembly: generate load/store Summary: This handles all load/store operations that WebAssembly defines, and handles those necessary for C++ such as i1. I left a FIXME for outstanding features which aren't required for now. Reviewers: sunfish Subscribers: jfb, llvm-commits, dschuff llvm-svn: 246500	2015-08-31 22:24:11 +00:00
Sanjay Patel	218cbd5a48	generalize helper function of MergeConsecutiveStores to handle vector types (NFCI) This was part of D7208 (r227242), but that commit was reverted because it exposed a bug in AArch64 lowering. I should have that fixed and the rest of the commit reinstated soon. llvm-svn: 246493	2015-08-31 21:50:16 +00:00
Karl Schimpf	4da0e12968	Fix bug in method LLLexer::FP80HexToIntPair llvm-svn: 246489	2015-08-31 21:36:14 +00:00
Hans Wennborg	043bf5b296	Fix Windows build by including raw_ostream.h llvm-svn: 246486	2015-08-31 21:19:18 +00:00
Naomi Musgrave	21c1bc46ae	Rollback of commit "Repress sanitization on User dtor." This would have suppressed bug 24578, about use-after-destroy on User and MDNode. Rolled back suppression for the sake of code cleanliness, in preference for bug tracking to keep track of this issue. This reverts commit 6ff2baabc4625d5b0a8dccf76aa0f72d930ea6c0. llvm-svn: 246484	2015-08-31 21:06:08 +00:00
Hal Finkel	2483f2060a	[DAGCombine] Use getSetCCResultType utility function DAGCombine has a utility wrapper around TLI's getSetCCResultType; use it in the one place in DAGCombine still directly calling the TLI function. NFC. llvm-svn: 246482	2015-08-31 20:42:38 +00:00
Sanjay Patel	d9a5c225d1	[x86] enable machine combiner reassociations for scalar 'or' insts llvm-svn: 246481	2015-08-31 20:27:03 +00:00
Reid Kleckner	e00faf8ce1	[EH] Handle non-Function personalities like unknown personalities Also delete and simplify a lot of MachineModuleInfo code that used to be needed to handle personalities on landingpads. Now that the personality is on the LLVM Function, we no longer need to track it this way on MMI. Certainly it should not live on LandingPadInfo. llvm-svn: 246478	2015-08-31 20:02:16 +00:00
Philip Reames	a88caeab6c	[FunctionAttr] Infer nonnull attributes on returns Teach FunctionAttr to infer the nonnull attribute on return values of functions which never return a potentially null value. This is done both via a conservative local analysis for the function itself and an optimistic per-SCC analysis. If no function in the SCC returns anything which could be null (other than values from other functions in the SCC), we can conclude no function returned a null pointer. Even if some function within the SCC returns a null pointer, we may be able to locally conclude that some don't. Differential Revision: http://reviews.llvm.org/D9688 llvm-svn: 246476	2015-08-31 19:44:38 +00:00
Quentin Colombet	a80b9c824e	[AArch64][CollectLOH] Remove an invalid assertion and add a test case exposing it. rdar://problem/22491525 llvm-svn: 246472	2015-08-31 19:02:00 +00:00
Naomi Musgrave	763468baec	Undo reversion on commit: Revert "Revert "Repress sanitization on User dtor. Modify msan macros for applying attribute"" This reverts commit 020e70a79878c96457e6882bcdfaf6628baf32b7. llvm-svn: 246470	2015-08-31 18:49:31 +00:00
Hal Finkel	a894266d28	[DAGCombine] Remove some old dead code for forming SETCC nodes This code was dead when it was committed in r23665 (Oct 7, 2005), and before it reaches its 10th anniversary, it really should go. We can always bring it back if we'd like, but it forms more SETCC nodes, and the way we do legality checking on SETCC nodes is wrong in a number of places, and removing this means fewer places to fix. NFC. llvm-svn: 246466	2015-08-31 18:38:55 +00:00
Philip Reames	bb11d62a5a	[LazyValueInfo] Look through Phi nodes when trying to prove a predicate If asked to prove a predicate about a value produced by a PHI node, LazyValueInfo was unable to do so even if the predicate was known to be true for each input to the PHI. This prevented JumpThreading from eliminating a provably redundant branch. The problematic test case looks something like this: ListNode *p = ...; while (p != null) { if (!p) return; x = g->x; // unrelated p = p->next } The null check at the top of the loop is redundant since the value of 'p' is null checked on entry to the loop and before executing the backedge. This resulted in us a) executing an extra null check per iteration and b) not being able to LICM unrelated loads after the check since we couldn't prove they would execute or that their dereferenceability wasn't affected by the null check on the first iteration. Differential Revision: http://reviews.llvm.org/D12383 llvm-svn: 246465	2015-08-31 18:31:48 +00:00
Kit Barton	d3cc1678e8	Rework of the new interface for shrink wrapping Based on comments from Hal (http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150810/292978.html), I've changed the interface to add a callback mechanism to the TargetFrameLowering class to query whether the specific target supports shrink wrapping. Shrink wrapping is disabled by default. Each target can override the default behaviour using the TargetFrameLowering::targetSupportsShrinkWrapping() method. Shrink wrapping can still be explicitly enabled or disabled from the command line, using the existing -enable-shrink-wrap=<true\|false> option. Phabricator: http://reviews.llvm.org/D12293 llvm-svn: 246463	2015-08-31 18:26:45 +00:00
Matthias Braun	0acbd08f3c	AArch64: Fix loads to lower NEON vector lanes using GPR registers The ISelLowering code turned the insertion of the element for the lowest lane of a BUILD_VECTOR into an INSERT_SUBREG; this prohibited the patterns for SCALAR_TO_VECTOR(Load) to match later. Restrict this to cases without a load argument. Reported in rdar://22223823 Differential Revision: http://reviews.llvm.org/D12467 llvm-svn: 246462	2015-08-31 18:25:15 +00:00
Matthias Braun	818c78d0cc	X86: Fix FastISel SSESelect register class X86FastISel has been using the wrong register class for VBLENDVPS which produces a VR128 and needs an extra copy to the target register. The problem was already hit by the existing test cases when using > llvm-lit -Dllc="llc -verify-machineinstr" llvm-svn: 246461	2015-08-31 18:25:11 +00:00
Filipe Cabecinhas	984fefdd81	[BitcodeReader] Ensure we can read constant vector selects with an i1 condition Summary: Constant vectors weren't allowed to have an i1 condition in the BitcodeReader. Make sure we have the same restrictions that are documented, not more. Reviewers: nlewycky, rafael, kschimpf Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12440 llvm-svn: 246459	2015-08-31 18:00:30 +00:00
Vedant Kumar	86dbd92334	[MC/AsmParser] Avoid setting MCSymbol.IsUsed in some cases Avoid marking some MCSymbols as used in MC/AsmParser.cpp when no uses exist. This fixes a bug in parseAssignmentExpression() which inadvertently sets IsUsed, thereby triggering: "invalid re-assignment of non-absolute variable" on otherwise valid code. No other functionality change intended. The original version of this patch touched many calls to MCSymbol accessors. On rafael's advice, I have stripped this patch down a bit. As a follow-up, I intend to find the call sites which intentionally set IsUsed and force them to do so explicitly. Differential Revision: http://reviews.llvm.org/D12347 llvm-svn: 246457	2015-08-31 17:44:53 +00:00
Karl Schimpf	36440082f8	Change comment to verify commit access. llvm-svn: 246451	2015-08-31 16:43:55 +00:00
Naomi Musgrave	5f79c6653d	Revert "Repress sanitization on User dtor. Modify msan macros for applying attribute" This reverts commit 5e3bfbb38eb3fb6f568b107f6b239e0aa4c5f334. llvm-svn: 246450	2015-08-31 16:26:44 +00:00
Naomi Musgrave	d8c1a064e5	Repress sanitization on User dtor. Modify msan macros for applying attribute to repress sanitization. Move attribute for repressing sanitization to operator delete for User, MDNode. Summary: In response to bug 24578, reported against failing LLVM test. Reviewers: chandlerc, rsmith, eugenis Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12335 llvm-svn: 246449	2015-08-31 15:57:40 +00:00
Benjamin Kramer	efeddcc552	[SectionMemoryManager] Use range-based for loops. No functional change intended. llvm-svn: 246440	2015-08-31 13:39:14 +00:00
Igor Breger	5ea0a68115	AVX512: ktest implementation Added tests for encoding. Differential Revision: http://reviews.llvm.org/D11979 llvm-svn: 246439	2015-08-31 13:30:19 +00:00
Igor Breger	f3ded811b2	AVX512: Implemented encoding and intrinsics for vdbpsadbw Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D12491 llvm-svn: 246436	2015-08-31 13:09:30 +00:00
Igor Breger	59ac339357	AVX512: kadd implementation Added tests for encoding. Differential Revision: http://reviews.llvm.org/D11973 llvm-svn: 246432	2015-08-31 11:50:23 +00:00
Igor Breger	2ae0fe3ac3	AVX512: Implemented encoding and intrinsics for vpalignr Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D12270 llvm-svn: 246428	2015-08-31 11:14:02 +00:00
Hal Finkel	e0a28e54c7	[AggressiveAntiDepBreaker] Check for EarlyClobber on defining instruction AggressiveAntiDepBreaker was doing some EarlyClobber checking, but was not checking that the register being potentially renamed was defined by an early-clobber def where there was also a use, in that instruction, of the register being considered as the target of the rename. Fixes PR24014. llvm-svn: 246423	2015-08-31 07:51:36 +00:00
Jingyue Wu	e84f671830	[JumpThreading] make jump threading respect convergent annotation. Summary: JumpThreading shouldn't duplicate a convergent call, because that would move a convergent call into a control-inequivalent location. For example, if (cond) { ... } else { ... } convergent_call(); if (cond) { ... } else { ... } should not be optimized to if (cond) { ... convergent_call(); ... } else { ... convergent_call(); ... } Test Plan: test/Transforms/JumpThreading/basic.ll Patch by Xuetian Weng. Reviewers: resistor, arsenm, jingyue Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12484 llvm-svn: 246415	2015-08-31 06:10:27 +00:00
Peter Collingbourne	592ee15e14	Support: Support LLVM_ENABLE_THREADS=0 in llvm/Support/thread.h. Specifically, the header now provides llvm::thread, which is either a typedef of std::thread or a replacement that calls the function synchronously depending on the value of LLVM_ENABLE_THREADS. llvm-svn: 246402	2015-08-31 00:09:01 +00:00
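The idea described in r246402, a thread type that is either `std::thread` or a stand-in that runs the function synchronously, can be sketched as below. This is an illustration under assumptions: the macro `SKETCH_ENABLE_THREADS` and the namespace `sketch` are hypothetical stand-ins for LLVM_ENABLE_THREADS and the `llvm` namespace, and the real header may differ in detail.

```cpp
#include <cassert>
#include <utility>

// Hypothetical configuration switch; defaults to the single-threaded path.
#ifndef SKETCH_ENABLE_THREADS
#define SKETCH_ENABLE_THREADS 0
#endif

#if SKETCH_ENABLE_THREADS
#include <thread>
namespace sketch {
typedef std::thread thread;
}
#else
namespace sketch {
// Replacement type: the "thread" body runs to completion inside the
// constructor, so join() has nothing left to wait for.
struct thread {
  thread() {}
  thread(thread &&) {}
  template <class Function, class... Args>
  explicit thread(Function &&F, Args &&...A) {
    F(std::forward<Args>(A)...);
  }
  thread(const thread &) = delete;
  void join() {}
};
} // namespace sketch
#endif

// Usage is identical in both configurations.
int runOnWorker() {
  int Result = 0;
  sketch::thread Worker([&Result] { Result = 42; });
  Worker.join();
  assert(Result == 42 && "worker must have finished after join()");
  return Result;
}
```

Callers written against this interface do not need their own conditional compilation, which appears to be the point of providing the type in one header.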
Hal Finkel	a2cdbce661	[PowerPC] Fixup SELECT_CC (and SETCC) patterns with i1 comparison operands There were really two problems here. The first was that we had the truth tables for signed i1 comparisons backward. I imagine these are not very common, but if you have: setcc i1 x, y, LT this has the '0 1' and the '1 0' results flipped compared to: setcc i1 x, y, ULT because, in the signed case, '1 0' is really '-1 0', and the answer is not the same as in the unsigned case. The second problem was that we did not have patterns (at all) for the unsigned comparisons select_cc nodes for i1 comparison operands. This was the specific cause of PR24552. These had to be added (and a missing Altivec promotion added as well) to make sure these function for all types. I've added a bunch more test cases for these patterns, and there are a few FIXMEs in the test case regarding code-quality. Fixes PR24552. llvm-svn: 246400	2015-08-30 22:12:50 +00:00
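The flipped truth tables described in r246400 follow directly from how a one-bit value is interpreted. A minimal sketch, with hypothetical helper names and no connection to the PowerPC patterns themselves:

```cpp
#include <cassert>

// Unsigned view of an i1: the bit pattern 1 means the value 1.
constexpr int asUnsignedI1(bool Bit) { return Bit ? 1 : 0; }
// Signed (two's complement) view of an i1: the bit pattern 1 means -1.
constexpr int asSignedI1(bool Bit) { return Bit ? -1 : 0; }

// setcc i1 x, y, ULT
constexpr bool setccULT(bool X, bool Y) {
  return asUnsignedI1(X) < asUnsignedI1(Y);
}
// setcc i1 x, y, LT
constexpr bool setccLT(bool X, bool Y) {
  return asSignedI1(X) < asSignedI1(Y);
}

// The '0 1' and '1 0' rows swap between the two predicates.
void checkI1Comparisons() {
  assert(setccULT(false, true) && !setccULT(true, false));
  assert(setccLT(true, false) && !setccLT(false, true));
}
```

In the signed case the inputs '1 0' are really '-1 0', so LT is true exactly where ULT is false; the equal-input rows are false for both.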
Elena Demikhovsky	63a7ca9948	NFC: Code style in VectorUtils.cpp Differential Revision: http://reviews.llvm.org/D12478 llvm-svn: 246381	2015-08-30 13:48:02 +00:00
Renato Golin	3b1d3b0d84	Revert "Revert "New interface function is added to VectorUtils Value getSplatValue(Value Val);"" This reverts commit r246379. It seems that the commit was not the culprit, and the bot will be investigated for instability. llvm-svn: 246380	2015-08-30 10:49:04 +00:00
Renato Golin	c7be31736c	Revert "New interface function is added to VectorUtils Value getSplatValue(Value Val);" This reverts commit r246371, as it caused a rather obscure bug in AArch64 test-suite paq8p (timeouts, seg-faults). I'll investigate it before reapplying. llvm-svn: 246379	2015-08-30 10:05:30 +00:00
Chandler Carruth	5543fbc9b2	Stop calling the flat out insane ARM target parsing code unless the architecture string is something quite weird. Similarly delay calling the BPF parsing code, although that is more reasonable. To understand why I was motivated to make this change, it cuts the time for running the ADT TripleTest unittests by a factor of two in non-optimized builds (the developer default) and reduces my 'check-llvm' time by a full 15 seconds. The implementation of parseARMArch is that slow. I tried to fix it in the prior series of commits, but frankly, I have no idea how to finish fixing it. The entire premise of the function (to allow 'v7a-unknown-linux' or some such to parse as an 'arm-unknown-linux' triple) seems completely insane to me, but I'll let the ARM folks sort that out. At least it is now out of the critical path of every developer working on LLVM. It also will likely make some other folks' code significantly faster as I've heard reports of 2% of time spent in triple parsing even in optimized builds! I'm not done making this code faster, but I am done trying to improve the ARM target parsing code. llvm-svn: 246378	2015-08-30 09:54:34 +00:00
Chandler Carruth	822d54a22c	Remove a linear walk to find the default FPU for a given CPU by directly expanding the .def file within a StringSwitch. llvm-svn: 246377	2015-08-30 09:01:38 +00:00
Hal Finkel	982e8d48f8	[MIR Serialization] static -> static const in getSerializable*MachineOperandTargetFlags Make the arrays 'static const' instead of just 'static'. Post-commit review comment from Roman Divacky on IRC. NFC. llvm-svn: 246376	2015-08-30 08:07:29 +00:00
Chandler Carruth	3309ef6f02	Teach the target parsing framework to directly compute the length of all of its strings when expanding the string literals from the macros, and push all of the APIs to be StringRef instead of C-string APIs. This (remarkably) removes a very non-trivial number of strlen calls. It even deletes code and complexity from one of the primary users -- Clang. llvm-svn: 246374	2015-08-30 07:51:04 +00:00
Hal Finkel	2d55698ed7	[PowerPC/MIR Serialization] Target flags serialization support Add support for MIR serialization of PowerPC-specific operand target flags (based on the generic infrastructure added in r244185 and r245383). I won't even pretend that this is good test coverage, but this includes the regression test associated with r246372. Adding an MIR test for that fix is far superior to adding an IR-level test because particular instruction-scheduling decisions are necessary in order to expose the bug, and using an MIR test we can start the pipeline post-scheduling. llvm-svn: 246373	2015-08-30 07:50:35 +00:00
Hal Finkel	d2fd9becf4	[PowerPC] Don't assume ADDISdtprelHA's source is r3 Even though ADDISdtprelHA generally has r3 as its source register, it is possible for the instruction scheduler to move things around such that some other register is the source. We need to print the actual source register, not always r3. Fixes PR24394. The test case will come in a follow-up commit because it depends on MIR target-flags parsing. llvm-svn: 246372	2015-08-30 07:44:05 +00:00
Elena Demikhovsky	a59fcfa56b	New interface function is added to VectorUtils Value getSplatValue(Value Val); It complements the CreateVectorSplat(), which creates 2 instructions - insertelement and shuffle with all-zero mask. The new function recognizes the pattern - insertelement+shuffle and returns the splat value (or nullptr). It also returns a splat value from ConstantDataVector, for completeness. Differential Revision: http://reviews.llvm.org/D11124 llvm-svn: 246371	2015-08-30 07:28:18 +00:00
Chandler Carruth	799e880e95	Refactor the ARM target parsing to use a def file with macros to expand the necessary tables. This will allow me to restructure the code and structures using this to be significantly more efficient. It also removes the duplication of the list of several enumerators. It also enshrines that the order of enumerators match the order of the entries in the tables, something the implementation code actually uses. No functionality changed (yet). llvm-svn: 246370	2015-08-30 05:27:31 +00:00
Chandler Carruth	4fc3a9862c	[Triple] Use clang-format to normalize the formatting of the ARM target parsing logic prior to making substantial changes to it. This parsing logic is incredibly wasteful, so I'm planning to rewrite it. Just unittesting the triple parsing logic spends well over 80% of its time in the ARM parsing logic, and others have measured significant time spent here in real production compiles. Stay tuned... llvm-svn: 246369	2015-08-30 02:17:15 +00:00
Chandler Carruth	bb47b9a367	[Triple] Stop abusing a class to have only static methods and just use the namespace that we are already using for the enums that are produced by the parsing. llvm-svn: 246367	2015-08-30 02:09:48 +00:00
Fiona Glaser	934765c1df	SelectionDAG: add missing ComputeSignBits case for SELECT_CC Identical to SELECT, just with different operand numbers. llvm-svn: 246366	2015-08-29 23:04:38 +00:00
Peter Collingbourne	79bf113dca	Fix shared library build. llvm-svn: 246365	2015-08-29 22:34:34 +00:00
James Molloy	45ee9898ec	[ARM] Hoist fabs/fneg above a conversion to float. This is especially visible in softfp mode, for example in the implementation of libm fabs/fneg functions. If we have: %1 = vmovdrr r0, r1 %2 = fabs %1 then move the fabs before the vmovdrr: %1 = and r1, #0x7FFFFFFF %2 = vmovdrr r0, r1 This is never a loss, and could be a serious win because the vmovdrr may be followed by a vmovrrd, which would enable us to remove the conversion into FPRs completely. We already do this for f32, but not for f64. Tests are added for both. llvm-svn: 246360	2015-08-29 10:49:11 +00:00
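The reason the `and r1, #0x7FFFFFFF` in r246360 is equivalent to fabs is that the sign of an IEEE-754 double lives in the top bit of its high 32-bit word. The sketch below shows this on the host; `fabsViaHighWord` is a hypothetical name, and the code assumes `double` is IEEE-754 binary64.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(uint64_t), "expects a 64-bit double");

// Compute fabs using integer operations on the two 32-bit halves only,
// mirroring "and r1, #0x7FFFFFFF" applied before the halves are combined.
double fabsViaHighWord(double D) {
  uint64_t Bits;
  std::memcpy(&Bits, &D, sizeof Bits);
  uint32_t Lo = static_cast<uint32_t>(Bits);       // plays the role of r0
  uint32_t Hi = static_cast<uint32_t>(Bits >> 32); // plays the role of r1
  Hi &= 0x7FFFFFFFu;                               // clear the sign bit
  Bits = (static_cast<uint64_t>(Hi) << 32) | Lo;
  std::memcpy(&D, &Bits, sizeof D);
  return D;
}

void checkFabsViaHighWord() {
  assert(fabsViaHighWord(-2.5) == 2.5);
  assert(fabsViaHighWord(3.0) == 3.0);
}
```

fneg works the same way with an exclusive-or of 0x80000000 on the high word instead of the mask.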
Matt Arsenault	e4d0c142e8	AMDGPU: Add sdst operand to VOP2b instructions The VOP3 encoding of these allows any SGPR pair for the i1 output, but this was forced before to always use vcc. This doesn't yet try to use this, but does add the operand to the definitions so the main change is adding vcc to the output of the VOP2 encoding. llvm-svn: 246358	2015-08-29 07:16:50 +00:00
Matt Arsenault	9a32cd3d3b	AMDGPU: Set mem operands for spill instructions llvm-svn: 246357	2015-08-29 06:48:57 +00:00
Matt Arsenault	5c004a7c61	AMDGPU: Fix dropping mem operands when moving to VALU Without a memory operand, mayLoad or mayStore instructions are treated as hasUnorderedMemRef, which results in much worse scheduling. We really should have a verifier check that any non-side effecting mayLoad or mayStore has a memory operand. There are a few instructions (interp and images) which I'm not sure what / where to add these. llvm-svn: 246356	2015-08-29 06:48:46 +00:00
Tom Stellard	eea72ccbf2	AMDGPU/SI: Fix some invalid assumptions when folding 64-bit immediates Summary: We were assuming that if the use operand had a sub-register that the immediate was 64-bits, but this was breaking the case of folding a 64-bit immediate into another 64-bit instruction. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D12255 llvm-svn: 246354	2015-08-29 01:58:21 +00:00
Tom Stellard	b8ce14c4c3	AMDGPU/SI: Factor operand folding code into its own function Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D12254 llvm-svn: 246353	2015-08-28 23:45:19 +00:00
Duncan P. N. Exon Smith	b09eb9f1c2	DI: Set DILexicalBlock columns >= 65536 to 0/unknown This fixes PR24621 and matches what we do for `DILocation`. Although the limit seems somewhat artificial, there are places in the backend that also assume 16-bit columns, so we may as well just be consistent about the limits. llvm-svn: 246349	2015-08-28 22:58:50 +00:00
Vedant Kumar	44fccb7b50	[X86] NFC: Clean up and clang-format a few lines llvm-svn: 246340	2015-08-28 21:59:00 +00:00
Duncan P. N. Exon Smith	b56b5af4c3	DI: Add Function::getSubprogram() Add `Function::setSubprogram()` and `Function::getSubprogram()`, convenience methods to forward to `setMetadata()` and `getMetadata()`, respectively, and deal in `DISubprogram` instead of `MDNode`. Also add a verifier check to enforce that `!dbg` attachments are always subprograms. Originally (when I had the llvm-dev discussion back in April) I thought I'd store a pointer directly on `llvm::Function` for these attachments -- we frequently have debug info, and that's much cheaper than using map in the context if there are no other function-level attachments -- but for now I'm just using the generic infrastructure. Let's add the extra complexity only if this shows up in a profile. llvm-svn: 246339	2015-08-28 21:55:35 +00:00
Duncan P. N. Exon Smith	0660bcda53	AsmPrinter: Allow null subroutine type Currently the DWARF backend requires that subprograms have a type, and the type is ignored if it has an empty type array. The long term direction here -- see PR23079 -- is instead to skip the type entirely if there's no valid type. It turns out we have cases in tree of missing types on subprograms, but since they're not referenced by compile units, the backend never crashes on them. One option would be to add a Verifier check that subprograms have types, and fix the bitrot. However, this is a fair bit of churn (20-30 testcases) that would be reversed anyway by PR23079. I found this inconsistency because of a WIP patch and upgrade script for PR23367 that started crashing on test/DebugInfo/2010-10-01-crash.ll. This commit updates the testcase to reference the subprogram from the compile unit, and fixes the resulting crash (in line with the direction of PR23079). This also updates `DIBuilder` to stop assuming a non-null pointer for the subroutine types. llvm-svn: 246333	2015-08-28 21:38:24 +00:00
David Majnemer	0a92f86fe6	Revert r246232 and r246304. This reverts isSafeToSpeculativelyExecute's use of ReadNone until we split ReadNone into two pieces: one attribute which reasons about how the function reasons about memory and another attribute which determines how it may be speculated, CSE'd, trap, etc. llvm-svn: 246331	2015-08-28 21:13:39 +00:00
Duncan P. N. Exon Smith	814b8e91c7	DI: Require subprogram definitions to be distinct As a follow-up to r246098, require `DISubprogram` definitions (`isDefinition: true`) to be 'distinct'. Specifically, add an assembler check, a verifier check, and bitcode upgrading logic to combat testcase bitrot after the `DIBuilder` change. While working on the testcases, I realized that test/Linker/subprogram-linkonce-weak-odr.ll isn't relevant anymore. Its purpose was to check for a corner case in PR22792 where two subprogram definitions match exactly and share the same metadata node. The new verifier check, requiring that subprogram definitions are 'distinct', precludes that possibility. I updated almost all the IR with the following script: git grep -l -E -e '= !DISubprogram\(.* isDefinition: true' | grep -v test/Bitcode | xargs sed -i '' -e 's/= \(!DISubprogram(.*, isDefinition: true\)/= distinct \1/' Likely some variant of it would work for out-of-tree testcases. llvm-svn: 246327	2015-08-28 20:26:49 +00:00
Sanjoy Das	6f5dca70ed	[InstCombine] Fix PR24605. PR24605 is caused due to an incorrect insert point in instcombine's IR builder. When simplifying %t = add X Y ... %m = icmp ... %t the replacement for %t should be placed before %t, not before %m, as there could be a use of %t between %t and %m. llvm-svn: 246315	2015-08-28 19:09:31 +00:00
Chad Rosier	dc65532fd9	Optimize memcmp(x,y,n)==0 for small n and suitably aligned x/y. http://reviews.llvm.org/D6952 PR20673 llvm-svn: 246313	2015-08-28 18:30:18 +00:00
Petar Jovanovic	207a191a98	[mips64][mcjit] Add N64R6 relocations tests and fix N64R2 tests This patch adds a test for MIPS64R6 relocations, it corrects check expressions for R_MIPS_26 and R_MIPS_PC16 relocations in MIPS64R2 test, and it adds run for big endian in MIPS64R2 test. Patch by Vladimir Radosavljevic. Differential Revision: http://reviews.llvm.org/D11217 llvm-svn: 246311	2015-08-28 18:02:53 +00:00
Petar Jovanovic	28e2b717fc	[mips] Remove incorrect DebugLoc entries from prologue This has been causing the prologue_end to be incorrectly positioned. Patch by Vladimir Radosavljevic. Differential Revision: http://reviews.llvm.org/D11293 llvm-svn: 246309	2015-08-28 17:53:26 +00:00
Matt Arsenault	d9c830154f	Make MergeConsecutiveStores look at other stores on same chain When combiner AA is enabled, look at stores on the same chain. Non-aliasing stores are moved to the same chain so the existing code fails because it expects to find an adjacent store on a consecutive chain. Because of how DAGCombiner tries these store combines, MergeConsecutiveStores doesn't see the correct set of stores on the chain when it visits the other stores. Each store individually has its chain fixed before trying to merge consecutive stores, and then tries to merge stores from that point before the other stores have been processed to have their chains fixed. To fix this, attempt to use FindBetterChain on any possibly neighboring stores in visitSTORE. Suppose you have 4 32-bit stores that should be merged into 1 vector store. One store would be visited first, fixing the chain. What happens is because not all of the store chains have yet been fixed, 2 of the stores are merged. The other 2 stores later have their chains fixed, but because the other stores were already merged, they have different memory types and merging the two different sized stores is not supported and would be more difficult to handle. llvm-svn: 246307	2015-08-28 17:31:28 +00:00
JF Bastien	f5aa1ca655	Remove Merge Functions pointer comparisons Summary: This patch removes two remaining places where pointer value comparisons are used to order functions: comparing range annotation metadata, and comparing block address constants. (These are both rare cases, and so no actual non-determinism was observed from either case). The fix for range metadata is simple: the annotation always consists of a pair of integers, so we just order by those integers. The fix for block addresses is more subtle. Two constants are the same if they are the same basic block in the same function, or if they refer to corresponding basic blocks in each respective function. Note that in the first case, merging is trivially correct. In the second, the correctness of merging relies on the fact that the values of block addresses cannot be compared. This change is actually an enhancement, as these functions could not previously be merged (see merge-block-address.ll). There is still a problem with cross function block addresses, in that constants pointing to a basic block in a merged function are not updated. This also more robustly compares floating point constants by all fields of their semantics, and fixes a dyn_cast/cast mixup. Author: jrkoenig Reviewers: dschuff, nlewycky, jfb Subscribers: llvm-commits Differential revision: http://reviews.llvm.org/D12376 llvm-svn: 246305	2015-08-28 16:49:09 +00:00
David Majnemer	a787de3227	[CodeGen] isInTailCallPosition didn't consider readnone tailcalls A readnone tailcall may still have a chain of computation which follows it that would invalidate a tailcall lowering. Don't skip the analysis in such cases. This fixes PR24613. llvm-svn: 246304	2015-08-28 16:44:09 +00:00
Sanjay Patel	7c912898a5	[x86] enable machine combiner reassociations for scalar 'and' insts llvm-svn: 246300	2015-08-28 14:09:48 +00:00
Chandler Carruth	4b682f6f24	[SROA] Fix PR24463, a crash I introduced in SROA by allowing it to handle more allocas with loads past the end of the alloca. I suspect there are some related crashers with slightly different patterns, but I'll fix those and add test cases as I find them. Thanks to David Majnemer for the excellent test case reduction here. Made this super simple to debug and fix. llvm-svn: 246289	2015-08-28 09:03:52 +00:00
Rui Ueyama	71ba9bdd23	Re-apply r246276 - Object: Teach llvm-ar to create symbol table for COFF short import files This patch includes a fix for a llvm-readobj test. With this patch, the tool no longer prints out COFF headers for the short import file, but that's probably desirable because the header for the short import file is a dummy. llvm-svn: 246283	2015-08-28 07:40:30 +00:00
Steven Wu	61db34d12e	Revert r246244 and r246243 These two commits cause clang/llvm bootstrap to hang. llvm-svn: 246279	2015-08-28 06:52:00 +00:00
Rui Ueyama	8cff17469f	Rollback r246276 - Object: Teach llvm-ar to create symbol table for COFF short import files This change caused a test for llvm-readobj to fail. llvm-svn: 246277	2015-08-28 06:03:01 +00:00
Rui Ueyama	22b1b7aad2	Object: Teach llvm-ar to create symbol table for COFF short import files. COFF short import files are a special kind of file that contains only DLL-exported symbol names. That's different from object files because it has no data except symbol names. This change implements a SymbolicFile interface for the short import files so that symbol names can be accessed through that interface. llvm-ar is now able to read the file and create symbol table entries for short import files. llvm-svn: 246276	2015-08-28 05:47:46 +00:00
NAKAMURA Takumi	bc3af7b031	LLVMCodeGen: Update libdeps corresponding to r246236. llvm-svn: 246274	2015-08-28 05:38:49 +00:00
Ahmed Bougacha	f9c19da03a	[CodeGen] Support (and default to) expanding READCYCLECOUNTER to 0. For targets that didn't support this, this will let us respect the langref instead of failing to select. Note that we don't need to change the 32-bit x86/PPC lowerings (to account for the result type/# difference) because they're both custom and bypass type legalization. llvm-svn: 246258	2015-08-28 01:49:59 +00:00
Joseph Tremoulet	ec18285b91	[WinEH] Update coloring to handle nested cases cleanly Summary: Change the coloring algorithm in WinEHPrepare to visit a funclet's exits in its parents' contexts and so properly classify the continuations of nested funclets. Also change the placement of cloned blocks to be deterministic and to maintain the relative order of each funclet's blocks. Add a lit test showing various patterns that require cloning, the last several of which don't have CHECKs yet because they require cloning entire funclets which is NYI. Reviewers: rnk, andrew.w.kaylor, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12353 llvm-svn: 246245	2015-08-28 01:12:35 +00:00
Piotr Padlewski	3f81ec1e38	Constant propagation after hitting assume(cmp) bugfix Last time the code ran into the assertion `BBE.isSingleEdge()` in lib/IR/Dominators.cpp:200. http://reviews.llvm.org/D12170 llvm-svn: 246244	2015-08-28 01:02:00 +00:00
Piotr Padlewski	63cc5d4627	Constant propagation after hitting llvm.assume After hitting @llvm.assume(X) we can: - propagate equality that X == true - if X is icmp/fcmp (with eq operation), and one of the operands is constant, we can replace all variables with constants in the same BasicBlock http://reviews.llvm.org/D11918 llvm-svn: 246243	2015-08-28 01:01:57 +00:00
George Burgess IV	68b36e01da	Fix: CFLAA -- Mark no-args returns as unknown Prior to this patch, we hadn't been marking StratifiedSets with the appropriate StratifiedAttrs when handling the result of no-args call instructions. This caused us to report NoAlias when handed, for example, an escaped alloca and a result from an opaque function. Now we properly mark the return value of said functions. Thanks again to Chandler, Richard, and Nick for pinging me about this. Differential review: http://reviews.llvm.org/D12408 llvm-svn: 246240	2015-08-28 00:16:18 +00:00
Quentin Colombet	fa4ecb4b9a	[AArch64][CollectLOH] Fix a regression that prevented us from detecting chains of more than 2 instructions. I introduced this regression a while back and did not notice it because I somehow forgot to push the initial test cases for the pass! Fix that as well! llvm-svn: 246239	2015-08-27 23:47:10 +00:00
Peter Collingbourne	c269ed5115	CodeGen: Introduce splitCodeGen and teach LTOCodeGenerator to use it. llvm::splitCodeGen is a function that implements the core of parallel LTO code generation. It uses llvm::SplitModule to split the module into linkable partitions and spawns one code generation thread per partition. The function produces multiple object files which can be linked in the usual way. This has been threaded through to LTOCodeGenerator (and llvm-lto for testing purposes). Separate patches will add parallel LTO support to the gold plugin and lld. Differential Revision: http://reviews.llvm.org/D12260 llvm-svn: 246236	2015-08-27 23:37:36 +00:00
Reid Kleckner	0e2882345d	[WinEH] Add some support for code generating catchpad We can now run 32-bit programs with empty catch bodies. The next step is to change PEI so that we get funclet prologues and epilogues. llvm-svn: 246235	2015-08-27 23:27:47 +00:00
David Majnemer	0293704be2	[ValueTracking] readnone CallInsts are fair game for speculation Any call which is side effect free is trivially OK to speculate. We already had similar logic in EarlyCSE and GVN but we were missing it from isSafeToSpeculativelyExecute. This fixes PR24601. llvm-svn: 246232	2015-08-27 23:03:01 +00:00
Ahmed Bougacha	87166905c8	[CodeGen] Check FoldConstantArithmetic result before using it. Fixes PR24602: r245689 introduced an unguarded use of SelectionDAG::FoldConstantArithmetic, which returns 0 when it fails because of opaque (hoisted) constants. llvm-svn: 246217	2015-08-27 21:46:04 +00:00
Erik Schnetter	5e93e28d8b	Enable constant propagation for more math functions Constant propagation for single precision math functions (such as tanf) is already working, but was not enabled. This patch enables these for many single-precision functions, and adds respective test cases. Newly handled functions: acosf asinf atanf atan2f ceilf coshf expf exp2f fabsf floorf fmodf logf log10f powf sinhf tanf tanhf llvm-svn: 246194	2015-08-27 19:56:57 +00:00
Erik Schnetter	ed6eab32b3	Revert 246186; still breaks on some systems llvm-svn: 246191	2015-08-27 19:34:14 +00:00
Tyler Nowicki	5eaa5a9d26	Improve vectorization diagnostic messages and extend vectorize(enable) pragma. This patch changes the analysis diagnostics produced when loops with floating-point recurrences or memory operations are identified. The new messages say "cannot prove it is safe to reorder * operations; allow reordering by specifying #pragma clang loop vectorize(enable)". Depending on the type of diagnostic the message will include additional options such as ffast-math or __restrict__. This patch also allows the vectorize(enable) pragma to override the low pointer memory check threshold. When the hint is given a higher threshold is used. See the clang patch for the options produced for each diagnostic. llvm-svn: 246187	2015-08-27 18:56:49 +00:00
Erik Schnetter	05845d31c9	Enable constant propagation for more math functions Constant propagation for single precision math functions (such as tanf) is already working, but was not enabled. This patch enables these for many single-precision functions, and adds respective test cases. Newly handled functions: acosf asinf atanf atan2f ceilf coshf expf exp2f fabsf floorf fmodf logf log10f powf sinhf tanf tanhf llvm-svn: 246186	2015-08-27 18:56:23 +00:00
Erik Schnetter	a23672626d	Revert r246158 since it breaks LLVM.Transforms/ConstProp.calls.ll llvm-svn: 246166	2015-08-27 17:24:01 +00:00
Erik Schnetter	694bf5c9b5	Enable constant propagation for more math functions Constant propagation for single precision math functions (such as tanf) is already working, but was not enabled. This patch enables these for many single-precision functions, and adds respective test cases. Newly handled functions: acosf asinf atanf atan2f ceilf coshf expf exp2f fabsf floorf fmodf logf log10f powf sinhf tanf tanhf llvm-svn: 246158	2015-08-27 16:36:37 +00:00
Chad Rosier	c94f8e2906	[LoopVectorize] Add Support for Small Size Reductions. Unlike scalar operations, we can perform vector operations on element types that are smaller than the native integer types. We type-promote scalar operations if they are smaller than a native type (e.g., i8 arithmetic is promoted to i32 arithmetic on Arm targets). This patch detects and removes type-promotions within the reduction detection framework, enabling the vectorization of small size reductions. In the legality phase, we look through the ANDs and extensions that InstCombine creates during promotion, keeping track of the smaller type. In the profitability phase, we use the smaller type and ignore the ANDs and extensions in the cost model. Finally, in the code generation phase, we truncate the result of the reduction to allow InstCombine to rewrite the entire expression in the smaller type. This fixes PR21369. http://reviews.llvm.org/D12202 Patch by Matt Simpson <mssimpso@codeaurora.org>! llvm-svn: 246149	2015-08-27 14:12:17 +00:00
James Molloy	1bbf15c57c	[LoopVectorize] Extract InductionInfo into a helper class... ... and move it into LoopUtils where it can be used by other passes, just like ReductionDescriptor. The API is very similar to ReductionDescriptor - that is, not very nice at all. Sorting these both out will come in a followup. NFC llvm-svn: 246145	2015-08-27 09:53:00 +00:00
Alex Rosenberg	a0a19c1c91	Whoops, remove trailing whitespace. llvm-svn: 246141	2015-08-27 05:37:12 +00:00
Pete Cooper	6b716218fa	isKnownNonNull needs to consider globals in non-zero address spaces. Globals in address spaces other than zero may have 0 as a valid address, so we should not assume that they cannot be null. Reviewed by Philip Reames. llvm-svn: 246137	2015-08-27 03:16:29 +00:00
Philip Reames	dfd890dd3a	Allow value forwarding past release fences in EarlyCSE A release fence acts as a publication barrier for stores within the current thread to become visible to other threads which might observe the release fence. It does not require the current thread to observe stores performed on other threads. As a result, we can allow store-load and load-store forwarding across a release fence. We do need to make sure that stores before the fence can't be eliminated even if there's another store to the same location after the fence. In theory, we could reorder the second store above the fence and then eliminate the former, but we can't do this if the stores are on opposite sides of the fence. Note: While more aggressive than what's there, this patch is still implementing a really conservative ordering. In particular, I'm not trying to exploit undefined behavior via races, or the fact that the LangRef says only 'atomic' accesses are ordered w.r.t. fences. Differential Revision: http://reviews.llvm.org/D11434 llvm-svn: 246134	2015-08-27 01:32:33 +00:00
Philip Reames	abcdc5e3a8	[RewriteStatepointsForGC] Reduce the number of new instructions for base pointers When computing base pointers, we introduce new instructions to propagate the base of existing instructions which might not be bases. However, the algorithm doesn't make any effort to recognize when the new instruction to be inserted is the same as an existing one already in the IR. Since this is happening immediately before rewriting, we don't really have a chance to fix it after the pass runs without teaching loop passes about statepoints. I'm really not thrilled with this patch. I've rewritten it 4 different ways now, but this is the best I've come up with. The case where the new instruction is just the original base defining value could be merged into the existing algorithm with some complexity. The problem is that we might have something like an extractelement from a phi of two vectors. It may be trivially obvious that the base of the 0th element is an existing instruction, but I can't see how to make the algorithm itself figure that out. Thus, I resort to the call to SimplifyInstruction instead. Note that we can only adjust the instructions we've inserted ourselves. The live sets are still being tracked in side structures at this point in the code. We can't easily muck with instructions which might be in them. Long term, I'm really thinking we need to materialize the live pointer sets explicitly in the IR somehow rather than using side structures to track them. Differential Revision: http://reviews.llvm.org/D12004 llvm-svn: 246133	2015-08-27 01:02:28 +00:00
Tyler Nowicki	e0f400feaa	Improved printing of analysis diagnostics in the loop vectorizer. This patch ensures that every analysis diagnostic produced by the vectorizer will be printed if the loop has a vectorization hint on it. The condition has also been improved to prevent printing when a disabling hint is specified. llvm-svn: 246132	2015-08-27 01:02:04 +00:00
Cong Hou	08cb4fc688	Fixed a bug where edge weights are not assigned correctly when lowering switch statement. This is a one-line-change patch that moves the update to UnhandledWeights to the correct position: it should be updated for all clusters instead of just range clusters. Differential Revision: http://reviews.llvm.org/D12391 llvm-svn: 246129	2015-08-27 00:37:40 +00:00
Philip Reames	98a2dabc08	[SimplifyCFG] Prune code from a provably unreachable switch default As Sanjoy pointed out over in http://reviews.llvm.org/D11819, a switch on an icmp should always be able to become a branch instruction. This patch generalizes that notion slightly to prove that the default case of a switch is unreachable if the cases completely cover all possible bit patterns in the condition. Once that's done, the switch to branch conversion kicks in just fine. Note: Duplicate case values are disallowed by the LangRef and verifier. Differential Revision: http://reviews.llvm.org/D11995 llvm-svn: 246125	2015-08-26 23:56:46 +00:00
Hal Finkel	7ffe55ae9d	[PowerPC] Remove unnecessary braces in PPCVSXFMAMutate Address Eric's post-commit review of r245741. NFC. llvm-svn: 246121	2015-08-26 23:41:53 +00:00
Bjarke Hammersholt Roune	6c64738e87	[NVPTX] Let NVPTX backend detect integer min and max patterns. Summary: Let NVPTX backend detect integer min and max patterns during isel and emit intrinsics that enable hardware support. Reviewers: jholewinski, meheff, jingyue Subscribers: arsenm, llvm-commits, meheff, jingyue, eliben, jholewinski Differential Revision: http://reviews.llvm.org/D12377 llvm-svn: 246107	2015-08-26 23:22:02 +00:00
Cong Hou	b5ef475e5c	[ARM] Use BranchProbability::scale() to scale an integer with a probability in ARMBaseInstrInfo.cpp. Previously in isProfitableToIfCvt() in ARMBaseInstrInfo.cpp, the multiplication between an integer and a branch probability was done manually in an unsafe way that could lead to overflow. This patch corrects those cases by using BranchProbability's member function scale() to avoid overflow (which stores the intermediate result in int64). Differential Revision: http://reviews.llvm.org/D12295 llvm-svn: 246106	2015-08-26 23:17:52 +00:00
Cong Hou	03127700d5	Assign weights to edges to jump table / bit test header when lowering switch statement. Currently, when lowering a switch statement and a new basic block is built for the jump table / bit test header, the edge to this new block is not assigned a correct weight. This patch collects the edge weight from all its successors and assigns this sum of weights to the edge (and also the other fall-through edge). Test cases are adjusted accordingly. Differential Revision: http://reviews.llvm.org/D12166#fae6eca7 llvm-svn: 246104	2015-08-26 23:15:32 +00:00
JF Bastien	b1b61ebb21	WebAssembly: NFC comment update llvm-svn: 246101	2015-08-26 23:03:07 +00:00
Duncan P. N. Exon Smith	b2df64721c	DI: Make Subprogram definitions 'distinct' Change `DIBuilder` always to produce 'distinct' nodes when creating `DISubprogram` definitions. I measured a ~5% memory improvement in the link step (of ld64) when using `-flto -g`. `DISubprogram`s are used in two ways in the debug info graph. Some are definitions, point at actual functions, and can't really be shared between compile units. With full debug info, these point down at their variables, forming uniquing cycles. These uniquing cycles are expensive to link between modules, since all unique nodes that reference them transitively need to be duplicated (see commit message for r244181 for more details). Others are declarations, primarily used for member functions in the type hierarchy. Definitions never show up there; instead, a definition points at its corresponding declaration node. I started by making all subprograms 'distinct'. However, that was too big a hammer: memory usage increased ~5% (net increase vs. this patch of ~10%) because the 'distinct' declarations undermine LTO type uniquing. This is a targeted fix for the definitions (where uniquing is an observable problem). A couple of notes: - There's an accompanying commit to update IRGen testcases in clang. - ^ That's what I'm using to test this commit. - In a follow-up, I'll change the verifier to require 'distinct' on definitions and add an upgrade to `BitcodeReader`. llvm-svn: 246098	2015-08-26 22:50:16 +00:00
JF Bastien	45479f627a	WebAssembly: handle private/internal globals. Things of note: - Other linkage types aren't handled yet. We'll figure it out with dynamic linking. - Special LLVM globals are either ignored, or error out for now. - TLS isn't supported yet (WebAssembly will have threads later). - There currently isn't a syntax for alignment, I left it in a comment so it's easy to hook up. - Undef is converted to whatever the type's appropriate null value is. - assert versus report_fatal_error: follow what other AsmPrinters do, and assert only on what should have been caught elsewhere. llvm-svn: 246092	2015-08-26 22:09:54 +00:00
Reid Kleckner	c2b9254426	[ms-inline-asm] Relax assertion around funky identifiers slightly A corresponding clang change will make it so that clang can consume part of an assembler token. The assembler treats '.' as an identifier character while clang does not, so its view of the token stream is a little different. llvm-svn: 246089	2015-08-26 21:57:25 +00:00
Kostya Serebryany	06c199ac9d	[libFuzzer] fix minor inefficiency, PR24584 llvm-svn: 246087	2015-08-26 21:55:19 +00:00
Mehdi Amini	0ab4b5b52e	Fix LLVM C API for DataLayout We removed access to the DataLayout on the TargetMachine and deprecated the C API function LLVMGetTargetMachineData() in r243114. However the way I tried to be backward compatible was broken: I changed the wrapper of the TargetMachine to be a structure that includes the DataLayout as well. However the TargetMachine is also wrapped by the ExecutionEngine, in the more classic way. A client using the TargetMachine wrapped by the ExecutionEngine and trying to get the DataLayout would break. It seems tricky to solve the problem completely in the C API implementation. This patch tries to address this backward compatibility in a lighter way in the C++ API. The C API is restored in its original state and the removed C++ API is reintroduced, but privately. The C API is friended to the TargetMachine and should be the only consumer for this API. Reviewers: ributzka Differential Revision: http://reviews.llvm.org/D12263 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 246082	2015-08-26 21:16:29 +00:00
Matt Arsenault	8a067121f8	AMDGPU: Delete dead code There is no context where s_mov_b64 is emitted and could potentially be moved to the VALU. It is currently only emitted for materializing immediates, which can't be dependent on vector sources. The immediate splitting is already done when selecting constants. I'm not sure what contexts if any the register splitting would have been used before. Also clean up using s_mov_b64 in place of v_mov_b64_pseudo, although this isn't required and just skips the extra step of eliminating the copy from the SReg_64. llvm-svn: 246080	2015-08-26 20:48:08 +00:00
Matt Arsenault	5e7f95e567	AMDGPU: Don't reprocess instructions when splitting i64 bcnt llvm-svn: 246079	2015-08-26 20:48:04 +00:00
Matt Arsenault	445833cc91	AMDGPU: Fix not moving users of s_bfe_i64 to VALU This wouldn't propagate to users of the original BFE and would hit a verifier error. llvm-svn: 246078	2015-08-26 20:47:58 +00:00
Matt Arsenault	f003c38e1e	AMDGPU: Don't create intermediate SALU instructions When splitting 64-bit operations, create the correct VALU instructions immediately. This was splitting things like s_or_b64 into the two s_or_b32s and then pushing the new instructions onto the worklist. There's no reason we need to do this intermediate step. llvm-svn: 246077	2015-08-26 20:47:50 +00:00
Matthias Braun	4e7ded834f	SelectionDAGBuilder: Fix SPDescriptor not resetting GuardReg This was causing problems when some functions use a GuardReg and some don't as can happen when mixing SelectionDAG and FastISel generated functions. llvm-svn: 246075	2015-08-26 20:46:52 +00:00
Matthias Braun	4816b18d86	FastISel: Avoid adding a successor block twice for degenerate IR. This fixes http://llvm.org/PR24581 Differential Revision: http://reviews.llvm.org/D12350 llvm-svn: 246074	2015-08-26 20:46:49 +00:00
Andrew Kaylor	af083d4cf9	Expose hasLiveCondCodeDef as a member function of the X86InstrInfo class. NFC This takes the existing static function hasLiveCondCodeDef and makes it a member function of the X86InstrInfo class. This is a useful utility function that an upcoming change would like to use. NFC. Patch by: Kevin B. Smith Differential Revision: http://reviews.llvm.org/D12371 llvm-svn: 246073	2015-08-26 20:36:52 +00:00
Diego Novillo	7732ae4a4f	Fix memory leak in sample profile pass. The problem here was the function analyses invoked by the function pass manager from the new IPO pass. I looked at other IPO passes needing dominance information and the only one that requires it (partial inliner) does not use the standard dependency mechanism. This patch mimics what the partial inliner does to compute dominance, post-dominance and loop info. One thing I like about this approach is that I can delay the computation of all this until I actually need it. This should bring the ASAN buildbot back to green. If there's a better way to fix this, I'll do it in a follow-up patch. llvm-svn: 246066	2015-08-26 20:00:27 +00:00
Mehdi Amini	31ebf03c09	Revert "Fix LLVM C API for DataLayout" This reverts commit r246052. Third attempt, still unpleasant for some bots. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 246057	2015-08-26 19:24:59 +00:00
Matt Arsenault	602a16d3db	AMDGPU/SI: Report SIFixSGPRLiveRanges changed function llvm-svn: 246056	2015-08-26 19:12:03 +00:00
Mehdi Amini	9d692b6805	Fix LLVM C API for DataLayout We removed access to the DataLayout on the TargetMachine and deprecated the C API function LLVMGetTargetMachineData() in r243114. However the way I tried to be backward compatible was broken: I changed the wrapper of the TargetMachine to be a structure that includes the DataLayout as well. However the TargetMachine is also wrapped by the ExecutionEngine, in the more classic way. A client using the TargetMachine wrapped by the ExecutionEngine and trying to get the DataLayout would break. It seems tricky to solve the problem completely in the C API implementation. This patch tries to address this backward compatibility in a lighter way in the C++ API. The C API is restored in its original state and the removed C++ API is reintroduced, but privately. The C API is friended to the TargetMachine and should be the only consumer for this API. Reviewers: ributzka Differential Revision: http://reviews.llvm.org/D12263 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 246052	2015-08-26 18:56:01 +00:00
Matt Arsenault	bd66061db7	AMDGPU: Make sure to reserve super registers I think this could potentially have broken if one of the super registers were allocated that contain v254/v255. llvm-svn: 246051	2015-08-26 18:54:50 +00:00
Mehdi Amini	8b3dda3f71	Revert "Fix LLVM C API for DataLayout" This reverts commit r246044. Build broken, still. It builds for me... From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 246049	2015-08-26 18:37:59 +00:00
Matt Arsenault	19c5488015	AMDGPU: Produce error on dynamic_stackalloc llvm-svn: 246048	2015-08-26 18:37:13 +00:00
David Majnemer	3354fe473f	[SimplifyLibCalls] Fix a typo cbrt(sqrt(x)) calculates the sixth root, not the ninth root. cbrt(cbrt(x)) calculates the ninth root. llvm-svn: 246046	2015-08-26 18:30:16 +00:00
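The root arithmetic behind this typo fix can be checked with a minimal sketch. The helper names below are illustrative only and are not taken from SimplifyLibCalls.

```python
import math


def cbrt(x: float) -> float:
    """Real cube root of a non-negative float (illustrative helper)."""
    return x ** (1.0 / 3.0)


def cbrt_of_sqrt(x: float) -> float:
    # (x^(1/2))^(1/3) = x^(1/6): the sixth root.
    return cbrt(math.sqrt(x))


def cbrt_of_cbrt(x: float) -> float:
    # (x^(1/3))^(1/3) = x^(1/9): the ninth root.
    return cbrt(cbrt(x))
```

For example, `cbrt_of_sqrt(64.0)` is approximately 2 because 2 to the sixth power is 64, and `cbrt_of_cbrt(512.0)` is approximately 2 because 2 to the ninth power is 512.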
Mehdi Amini	b5d8b27fc8	Fix LLVM C API for DataLayout We removed access to the DataLayout on the TargetMachine and deprecated the C API function LLVMGetTargetMachineData() in r243114. However the way I tried to be backward compatible was broken: I changed the wrapper of the TargetMachine to be a structure that includes the DataLayout as well. However the TargetMachine is also wrapped by the ExecutionEngine, in the more classic way. A client using the TargetMachine wrapped by the ExecutionEngine and trying to get the DataLayout would break. It seems tricky to solve the problem completely in the C API implementation. This patch tries to address this backward compatibility in a lighter way in the C++ API. The C API is restored in its original state and the removed C++ API is reintroduced, but privately. The C API is friended to the TargetMachine and should be the only consumer for this API. Reviewers: ributzka Differential Revision: http://reviews.llvm.org/D12263 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 246044	2015-08-26 18:22:34 +00:00
James Y Knight	3602286937	[SPARC] Fix stupid oversight in stack realignment support. If you're going to realign %sp to get object alignment properly (which the code does), and stack offsets and alignments are calculated going down from %fp (which they are), then the total stack size had better be a multiple of the alignment. LLVM did indeed ensure that. And then, after aligning, the sparc frame code added 96 (for sparcv8) to the frame size, making any requested alignment of 64-bytes or higher guaranteed to be misaligned. The test case added with r245668 even tests this exact scenario, and asserted the incorrect behavior, which I somehow failed to notice. D'oh. This change fixes the frame lowering code to align the stack size after adding the spill area, instead. Differential Revision: http://reviews.llvm.org/D12349 llvm-svn: 246042	2015-08-26 17:57:51 +00:00
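The ordering problem described in this message can be illustrated with a small sketch. It assumes only the 96-byte sparcv8 figure quoted above; the function names are hypothetical and do not mirror the actual frame lowering code.

```python
SPILL_AREA = 96  # sparcv8 value quoted in the commit message


def align_to(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def frame_size_align_then_add(locals_size: int, alignment: int) -> int:
    # Old order: align first, then add the spill area (can break alignment).
    return align_to(locals_size, alignment) + SPILL_AREA


def frame_size_add_then_align(locals_size: int, alignment: int) -> int:
    # Fixed order: add the spill area, then align the total.
    return align_to(locals_size + SPILL_AREA, alignment)
```

With 100 bytes of locals and a 64-byte alignment, the old order yields 224 (not a multiple of 64), while the fixed order yields 256.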
Vedant Kumar	bf891b12b4	[llvm-mc] Ignore opcode size prefix in 64-bit CALL disassembly This is a fix for disassembling unusual instruction sequences in 64-bit mode w.r.t the CALL rel16 instruction. It might be desirable to move the check somewhere else, but it essentially mimics the special case handling with JCXZ in 16-bit mode. The current behavior accepts the opcode size prefix and causes the call's immediate to stop disassembling after 2 bytes. When debugging sequences of instructions with this pattern, the disassembler output becomes extremely unreliable and essentially useless (if you jump midway into what lldb thinks is a unified instruction, you'll lose %rip). So we ignore the prefix and consume all 4 bytes when disassembling a 64-bit mode binary. Note: in Vol. 2A 3-99 the Intel spec states that CALL rel16 is N.S. N.S. is defined as: Indicates an instruction syntax that requires an address override prefix in 64-bit mode and is not supported. Using an address override prefix in 64-bit mode may result in model-specific execution behavior. (Vol. 2A 3-7) Since 0x66 is an operand override prefix we should be OK (although we may want to warn about 0x67 prefixes to 0xe8). On the CPUs I tested with, they all ignore the 0x66 prefix in 64-bit mode. Patch by Matthew Barney! Differential Revision: http://reviews.llvm.org/D9573 llvm-svn: 246038	2015-08-26 16:20:29 +00:00
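The effect of honoring versus ignoring the 0x66 prefix can be sketched with a toy decoder. This is a simplified model, not the LLVM disassembler: `decode_call_rel` and its `ignore_prefix` flag are hypothetical, and only an optional 0x66 prefix followed by opcode 0xE8 is handled.

```python
import struct

OPERAND_SIZE_PREFIX = 0x66
CALL_REL_OPCODE = 0xE8


def decode_call_rel(code: bytes, ignore_prefix: bool):
    """Return (instruction_length, displacement) for [0x66] E8 imm."""
    pos = 0
    has_prefix = code[pos] == OPERAND_SIZE_PREFIX
    if has_prefix:
        pos += 1
    if code[pos] != CALL_REL_OPCODE:
        raise ValueError("not a CALL rel instruction")
    pos += 1
    # Honoring the prefix shrinks the immediate to 2 bytes; ignoring it
    # (the 64-bit behavior described above) always consumes 4 bytes.
    imm_size = 2 if (has_prefix and not ignore_prefix) else 4
    fmt = "<h" if imm_size == 2 else "<i"
    (disp,) = struct.unpack_from(fmt, code, pos)
    return pos + imm_size, disp
```

For the byte sequence `66 E8 10 00 00 00`, honoring the prefix stops after 4 bytes and leaves two stray bytes to be misread as the next instruction, whereas ignoring it consumes all 6 bytes.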
Chad Rosier	9f4709b261	[AArch64] Remove a use-after-free when collecting stats. The call to mergePairedInsns() deletes MI, so the later use by isUnscaledLdSt() is referencing freed memory. llvm-svn: 246033	2015-08-26 13:39:48 +00:00
Silviu Baranga	db1ddb32ce	[AArch64] Unify the integer min/max vector selection patterns with the intrinsic ones Summary: This change lowers the aarch64 integer vector min/max intrinsic nodes to generic min/max nodes and replaces the intrinsic selection patterns with the generic ones. There should already be testing in place for this, so no further tests were added. Reviewers: jmolloy Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D12276 llvm-svn: 246030	2015-08-26 11:11:14 +00:00
Chandler Carruth	748d095ff0	[SROA] Rip out all support for SSAUpdater in SROA. This was only added to preserve the old ScalarRepl's use of SSAUpdater which was originally to avoid use of dominance frontiers. Now, we only need a domtree, and we'll need a domtree right after this pass as well and so it makes perfect sense to always and only use the dom-tree powered mem2reg. This was flag-flipper earlier and has stuck reasonably so I wanted to gut the now-dead code out of SROA before we waste more time with it. Among other things, this will make passmanager porting easier. llvm-svn: 246028	2015-08-26 09:09:29 +00:00
Alex Rosenberg	81cfed21ca	Modernize with range-based for loops. llvm-svn: 246018	2015-08-26 06:11:41 +00:00
Alex Rosenberg	99805ed45a	Reduce code duplication. llvm-svn: 246017	2015-08-26 06:11:38 +00:00
Alex Rosenberg	5b3404a03e	Trailing whitespace llvm-svn: 246016	2015-08-26 06:11:36 +00:00
Frederic Riss	74b9882ec3	[MC] Split the layout part of MCAssembler::finish() into its own method. NFC. Split a MCAssembler::layout() method out of MCAssembler::finish(). This allows running the MCSections layout separately from the streaming of the output file. This way if a client wants to use MC to generate section contents, but emit something different than the standard relocatable object files it is possible (llvm-dsymutil is such a client). llvm-svn: 246008	2015-08-26 05:09:49 +00:00
Frederic Riss	75c0c7050a	[MC/MachO] Make some MachObjectWriter methods more generic. NFC. Hardcode less values in some mach-o header writing routines and pass them as argument. Doing so will allow reusing this code in llvm-dsymutil. llvm-svn: 246007	2015-08-26 05:09:46 +00:00
JF Bastien	9dc042a0b6	Comparing operands should not require the same ValueID Summary: When comparing basic blocks, there is an additional check that two Value*'s should have the same ID, which interferes with merging equivalent constants of different kinds (such as a ConstantInt and a ConstantPointerNull in the included testcase). The cmpValues function already ensures that the two values in each function are the same, so removing this check should not cause incorrect merging. Also, the type comparison is redundant, based on reviewing the code and testing on the test suite and several large LTO bitcodes. Author: jrkoenig Reviewers: nlewycky, jfb, dschuff Subscribers: llvm-commits Differential revision: http://reviews.llvm.org/D12302 llvm-svn: 246001	2015-08-26 03:02:58 +00:00
JF Bastien	a1d3c24ccf	Expose more properties of llvm::fltSemantics Summary: Adds accessor functions for all the fields in llvm::fltSemantics. This will be used in MergeFunctions to order two APFloats with different semantics. Author: jrkoenig Reviewers: jfb Subscribers: dschuff, llvm-commits Differential revision: http://reviews.llvm.org/D12253 llvm-svn: 245999	2015-08-26 02:32:45 +00:00
Matthias Braun	ccfc9c8d6d	FastISel: Use finishCondBranch() for ARM,Mips,PowerPC FastISel Note that after this change branch probabilities are preserved now. llvm-svn: 245998	2015-08-26 01:55:47 +00:00
Matthias Braun	17af607796	FastISel: Factor out common code; NFC intended This should be no functional change but for the record: For three cases in X86FastISel this will change the order in which the FalseMBB and TrueMBB of a conditional branch are added to the successor/predecessor lists. llvm-svn: 245997	2015-08-26 01:38:00 +00:00
JF Bastien	1a4aa1589b	WebAssembly: add small FIXME for AsmPrinter. Suggested by @sunfish as a follow-up to r245982. llvm-svn: 245996	2015-08-26 00:50:49 +00:00
Charles Davis	119525914c	Make variable argument intrinsics behave correctly in a Win64 CC function. Summary: This change makes the variable argument intrinsics, `llvm.va_start` and `llvm.va_copy`, and the `va_arg` instruction behave as they do on Windows inside a `CallingConv::X86_64_Win64` function. It's needed for a Clang patch I have to add support for GCC's `__builtin_ms_va_list` constructs. Reviewers: nadav, asl, eugenis CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D1622 llvm-svn: 245990	2015-08-25 23:27:41 +00:00
JF Bastien	54be3b1f03	WebAssembly: assert that there aren't any constant pools WebAssembly will either use globals or immediates, since it's a virtual ISA. llvm-svn: 245989	2015-08-25 23:19:49 +00:00
JF Bastien	b6091dfe0f	WebAssembly: emit `(func (param t) (result t))` s-expressions Summary: Match spec format: https://github.com/WebAssembly/spec/blob/master/ml-proto/test/fac.wasm Reviewers: sunfish Subscribers: llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D12307 llvm-svn: 245986	2015-08-25 22:58:05 +00:00
JF Bastien	289287060b	WebAssembly: comment out .globl when printing textual assembly Do the same for .weak (not implemented for now, but may as well do it). Update comment string to two semicolons. llvm-svn: 245982	2015-08-25 22:23:15 +00:00
Evgeniy Stepanov	d04d07e65e	[msan] Precise instrumentation for icmp sgt %x, -1. Extend signed relational comparison instrumentation with a special case for comparisons with -1. This fixes an MSan false positive when such comparison is used as a sign bit test. https://llvm.org/bugs/show_bug.cgi?id=24561 llvm-svn: 245980	2015-08-25 22:19:11 +00:00
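Why a comparison with -1 needs only the sign bit's shadow can be shown with a small model. This is a sketch under the assumption that a set shadow bit marks an uninitialized bit; the function names are illustrative and not MSan's.

```python
def sgt_minus_one(x: int, bits: int = 32) -> bool:
    """icmp sgt x, -1 on a two's-complement bit pattern: sign bit clear."""
    return ((x >> (bits - 1)) & 1) == 0


def sgt_minus_one_shadow(shadow: int, bits: int = 32) -> bool:
    """The result is uninitialized only if the sign bit itself is."""
    return ((shadow >> (bits - 1)) & 1) == 1
```

A value whose low 16 bits are uninitialized (shadow `0x0000FFFF`) still gives a fully defined result for the sign bit test, so no report is warranted.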
Matthias Braun	130bd90e17	MachineBasicBlock: Use MCPhysReg instead of unsigned in livein API This is friendlier to the readers as it makes it clear that the API is not meant for vregs but just for physregs. llvm-svn: 245977	2015-08-25 22:05:55 +00:00
Cong Hou	cd59591396	Remove the final bit test during lowering switch statement if all cases in bit test cover a contiguous range. When lowering a switch statement, if bit tests are used then LLVM will always generate a jump to the default statement in the last bit test. However, this is not necessary when all cases in bit tests cover a contiguous range. This is because when generating the bit tests header MBB, there is a range check that guarantees cases in bit tests won't go outside of [low, high], where low and high are minimum and maximum case values in the bit tests. This patch checks if this is the case and then doesn't emit jump to default statement and hence saves a bit test and a branch. Differential Revision: http://reviews.llvm.org/D12249 llvm-svn: 245976	2015-08-25 21:34:38 +00:00
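The contiguity condition described here can be sketched as a check over the combined bit masks. The data layout (a mapping from destination label to case values) and the function name are assumptions for illustration, not the SelectionDAGBuilder representation.

```python
from typing import Iterable, Mapping


def bit_tests_cover_range(case_groups: Mapping[str, Iterable[int]]) -> bool:
    """True if the union of all bit-test cases fills [low, high] with no holes."""
    all_cases = [c for cases in case_groups.values() for c in cases]
    low, high = min(all_cases), max(all_cases)
    combined = 0
    for c in all_cases:
        combined |= 1 << (c - low)  # one bit per case, relative to low
    full_range_mask = (1 << (high - low + 1)) - 1
    return combined == full_range_mask
```

When this holds, any value that passes the header's range check must hit one of the bit tests, so the last test cannot fall through to the default block.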
Davide Italiano	68961bba06	[MachO] Move trivial accessors to header. Requested by: Jim Grosbach. llvm-svn: 245963	2015-08-25 18:27:59 +00:00
NAKAMURA Takumi	c57a09821f	Update libdeps in LLVMipo and LLVMScalarOpts, corresponding to r245940. llvm-svn: 245957	2015-08-25 17:11:17 +00:00
Matthias Braun	a7fc3856f1	Fix dependencies/shared library build llvm-svn: 245955	2015-08-25 17:07:40 +00:00
David Blaikie	d486000387	Fix dropped conditional in cleanup in r245752 Code review feedback by Charlie Turner. llvm-svn: 245954	2015-08-25 17:01:36 +00:00
Wei Mi	edae87d819	The patch replaces the overflow check in loop vectorization with the minimum loop iterations check. The loop minimum iterations check below ensures the loop has enough trip count so the generated vector loop will likely be executed, and it covers the overflow check. Differential Revision: http://reviews.llvm.org/D12107. llvm-svn: 245952	2015-08-25 16:43:47 +00:00
Sanjay Patel	deb8f826a5	make fast unaligned memory accesses implicit with SSE4.2 or SSE4a This is a follow-on from the discussion in http://reviews.llvm.org/D12154. This change allows memset/memcpy to use SSE or AVX memory accesses for any chip that has generally fast unaligned memory ops. A motivating use case for this change is a clang invocation that doesn't explicitly set the CPU, but does target a feature that we know only exists on a CPU that supports fast unaligned memops. For example: $ clang -O1 foo.c -mavx This resolves a difference in lowering noted in PR24449: https://llvm.org/bugs/show_bug.cgi?id=24449 Before this patch, we used different store types depending on whether the example can be lowered as a memset or not. Differential Revision: http://reviews.llvm.org/D12288 llvm-svn: 245950	2015-08-25 16:29:21 +00:00
Diego Novillo	4d71113cdb	Convert SampleProfile pass into a Module pass. Eventually, we will need sample profiles to be incorporated into the inliner's cost models. To do this, we need the sample profile pass to be a module pass. This patch makes no functional changes beyond the mechanical adjustments needed to run SampleProfile as a module pass. llvm-svn: 245940	2015-08-25 15:25:11 +00:00
Davide Italiano	933e230738	[MachO] Introduce MinVersion API. While introducing support for MinVersionLoadCommand in llvm-readobj I noticed there's no API to extract Major/Minor/Update components conveniently. Currently consumers do the bit twiddling on their own, but this will change from now on. I'll convert llvm-objdump (and llvm-readobj) in a later commit. Differential Revision: http://reviews.llvm.org/D12282 Reviewed by: rafael llvm-svn: 245938	2015-08-25 15:02:23 +00:00
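The bit twiddling that consumers previously did by hand looks roughly like the sketch below. It relies on the Mach-O convention that a version X.Y.Z is packed as xxxx.yy.zz (16 bits, 8 bits, 8 bits); the function names are hypothetical and are not the API added by this commit.

```python
def version_major(packed: int) -> int:
    return (packed >> 16) & 0xFFFF


def version_minor(packed: int) -> int:
    return (packed >> 8) & 0xFF


def version_update(packed: int) -> int:
    return packed & 0xFF
```

For instance, the packed value `0x000A0A03` decodes to version 10.10.3.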
Michael Kuperstein	6e3fee07f7	[X86] Remove references to _ftol2 As of r245924, _ftol2 is no longer used for fptoui on MS platforms. Remove the dead code associated with it. llvm-svn: 245925	2015-08-25 07:58:33 +00:00
Michael Kuperstein	8515893be8	[X86] Fix fptoui conversions This fixes two issues in x86 fptoui lowering. 1) Makes conversions from f80 go through the right path on AVX-512. 2) Implements an inline sequence for fptoui i64 instead of a library call. This improves performance by 6X on SSE3+ and 3X otherwise. Incidentally, it also removes the use of ftol2 for fptoui, which was wrong to begin with, as ftol2 converts to a signed i64, producing wrong results for values >= 2^63. Patch by: mitch.l.bodart@intel.com Differential Revision: http://reviews.llvm.org/D11316 llvm-svn: 245924	2015-08-25 07:42:09 +00:00
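The range problem with a signed conversion can be modeled in a few lines. The split-at-2^63 approach below is one common way to build an unsigned conversion from a signed one; it is an assumption for illustration and not necessarily the exact inline sequence this patch emits.

```python
TWO_63 = 2 ** 63


def fptosi_i64(x: float) -> int:
    """Signed conversion: only valid for results in [-2^63, 2^63)."""
    n = int(x)
    if not -TWO_63 <= n < TWO_63:
        raise OverflowError("value does not fit in a signed i64")
    return n


def fptoui_i64(x: float) -> int:
    """Unsigned conversion built from the signed one (illustrative)."""
    if x < float(TWO_63):
        return fptosi_i64(x)
    # Shift into signed range, convert, then set the top bit again.
    return fptosi_i64(x - float(TWO_63)) ^ (1 << 63)
```

A plain signed conversion raises in this model for any input at or above 2^63, which corresponds to the wrong results mentioned above.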
Steve King	5cdbd20cc3	Pass function attributes instead of boolean in isIntDivCheap(). llvm-svn: 245921	2015-08-25 02:31:21 +00:00
Piotr Padlewski	4e7f752bb8	Assume intrinsic handling in global opt It doesn't solve the problem when, for example, we load something and then assume that it is the same as some constant value, because globalopt will fail on the unknown load instruction. The proposed solution would be to skip some instructions that we can't evaluate and that are safe to skip (e.g. load, assume and many others) and see if they are required to perform the optimization (e.g. we don't care about ephemeral instructions that may appear using @llvm.assume()) http://reviews.llvm.org/D12266 llvm-svn: 245919	2015-08-25 01:34:15 +00:00
Mehdi Amini	f83b865448	Revert "Fix LLVM C API for DataLayout" This reverts commit 433bfd94e4b7e3cc3f8b08f8513ce47817941b0c. Broke some bot, have to see why it passed locally. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 245917	2015-08-25 01:21:09 +00:00
Mehdi Amini	84b2e325d3	Fix LLVM C API for DataLayout We removed access to the DataLayout on the TargetMachine and deprecated the C API function LLVMGetTargetMachineData() in r243114. However the way I tried to be backward compatible was broken: I changed the wrapper of the TargetMachine to be a structure that includes the DataLayout as well. However the TargetMachine is also wrapped by the ExecutionEngine, in the more classic way. A client using the TargetMachine wrapped by the ExecutionEngine and trying to get the DataLayout would break. It seems tricky to solve the problem completely in the C API implementation. This patch tries to address this backward compatibility in a lighter way in the C++ API. The C API is restored in its original state and the removed C++ API is reintroduced, but privately. The C API is friended to the TargetMachine and should be the only consumer for this API. Reviewers: ributzka Differential Revision: http://reviews.llvm.org/D12263 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 245916	2015-08-25 01:07:25 +00:00
Hal Finkel	0f2ddcb83f	[PowerPC] PPCVSXFMAMutate should ignore trivial-copy addends We might end up with a trivial copy as the addend, and if so, we should ignore the corresponding FMA instruction. The trivial copy can be coalesced away later, so there's nothing to do here. We should not, however, assert. Fixes PR24544. llvm-svn: 245907	2015-08-24 23:48:28 +00:00
Matthias Braun	1b50bb58a1	Try to fix buildbots Apparently std::vector::erase(const_iterator) (as opposed to the non-const iterator) is a part of C++11 but it seems this is not available on all the buildbots. llvm-svn: 245900	2015-08-24 23:30:39 +00:00
Sanjay Patel	4104337d9d	fix typos; NFC llvm-svn: 245899	2015-08-24 23:20:16 +00:00
Matthias Braun	7a8b1150bf	Let's try to fix GNU libstdc++ buildbots llvm-svn: 245898	2015-08-24 23:19:39 +00:00
Sanjay Patel	942b46a011	fix typo; NFC llvm-svn: 245896	2015-08-24 23:18:44 +00:00
Matthias Braun	b2b7ef1de8	MachineBasicBlock: Add liveins() method returning an iterator_range llvm-svn: 245895	2015-08-24 22:59:52 +00:00
Dan Gohman	2683a5534e	[WebAssembly] DYNAMIC_STACKALLOC returns a pointer. llvm-svn: 245893	2015-08-24 22:31:52 +00:00
Peter Collingbourne	9c8909dbd1	LTO: Simplify merged module ownership. This change moves LTOCodeGenerator's ownership of the merged module to a field of type std::unique_ptr<Module>. This helps simplify parts of the code and clears the way for the module to be consumed by LLVM CodeGen (see D12132 review comments). Differential Revision: http://reviews.llvm.org/D12205 llvm-svn: 245891	2015-08-24 22:22:53 +00:00
JF Bastien	af111db8af	WebAssembly: Implement call Summary: Support function calls. Reviewers: sunfish, sunfishcode Subscribers: sunfishcode, jfb, llvm-commits Differential revision: http://reviews.llvm.org/D12219 llvm-svn: 245887	2015-08-24 22:16:48 +00:00
JF Bastien	19c2e6634d	Revert two bad commits. Summary: I forgot to squash git commits before doing an svn dcommit of D12219. Reverting, and re-submitting. Subscribers: jfb, llvm-commits Differential Revision: http://reviews.llvm.org/D12298 llvm-svn: 245886	2015-08-24 22:07:33 +00:00
JF Bastien	744ad106c3	Missing print. llvm-svn: 245883	2015-08-24 22:00:04 +00:00
JF Bastien	d8a9d66d50	call llvm-svn: 245882	2015-08-24 21:59:51 +00:00
Dan Gohman	12e1997e4b	[WebAssembly] Make the assembly printer indent instructions. llvm-svn: 245875	2015-08-24 21:19:48 +00:00
Peter Collingbourne	e34034c8d0	LTO: Rename mergedModule variables to MergedModule to prepare for ownership change. Also convert a few loops to range-for loops and correct a comment. llvm-svn: 245874	2015-08-24 21:15:35 +00:00
Dan Gohman	69c4c76396	[WebAssembly] CodeGen support for __builtin_wasm_page_size() llvm-svn: 245872	2015-08-24 21:03:24 +00:00
Sanjay Patel	6b2765fe49	fix typo; NFC llvm-svn: 245869	2015-08-24 20:11:14 +00:00
Bill Schmidt	32fd189de2	[PPC64LE] Fix PR24546 - Swap optimization and debug values This patch fixes PR24546, which demonstrates a segfault during the VSX swap removal pass. The problem is that debug value instructions were not excluded from the list of instructions to be analyzed for webs of related computation. I've added the test case from the PR as a crash test in test/CodeGen/PowerPC. llvm-svn: 245862	2015-08-24 19:27:27 +00:00
Dan Gohman	7b63484b99	[WebAssembly] Skeleton FastISel support llvm-svn: 245860	2015-08-24 18:44:37 +00:00
Dan Gohman	896e53fae8	[WebAssembly] Implement floating point rounding operators. llvm-svn: 245859	2015-08-24 18:23:13 +00:00
Dan Gohman	01612f627d	[WebAssembly] Tell TargetTransformInfo about popcnt and sqrt. llvm-svn: 245853	2015-08-24 16:51:46 +00:00
Dan Gohman	e419a7c307	[WebAssembly] Use the checked form of MachineFunction::getSubtarget. NFC. llvm-svn: 245852	2015-08-24 16:46:31 +00:00
Dan Gohman	08fc966d3c	[WebAssembly] Implement the is_zero_undef forms of cttz and ctlz llvm-svn: 245851	2015-08-24 16:39:37 +00:00
Adhemerval Zanella	4754e2d59c	[sanitizers] Add DFSan support for AArch64 42-bit VMA This patch adds support for dfsan on aarch64-linux with 42-bit VMA (current default config for 64K pagesize kernels). The support is enabled by defining the SANITIZER_AARCH64_VMA to 42 at build time for both clang/llvm and compiler-rt. The default VMA is 39 bits. llvm-svn: 245840	2015-08-24 13:48:10 +00:00
Michael Zuckerman	9beca2e7e2	[X86] Add support for mmword memory operand size for Intel-syntax x86 assembly Differential Revision: http://reviews.llvm.org/D12151 llvm-svn: 245835	2015-08-24 10:26:54 +00:00
Oliver Stannard	284f2bffc9	Add DAG optimisation for FP16_TO_FP The FP16_TO_FP node only uses the bottom 16 bits of its input, so the following pattern can be optimised by removing the AND: (FP16_TO_FP (AND op, 0xffff)) -> (FP16_TO_FP op) This is a common pattern for ARM targets when functions have __fp16 arguments, as they are passed as floats (so that they get passed in the correct registers), but then bitcast and truncated to ignore the top 16 bits. llvm-svn: 245832	2015-08-24 09:47:45 +00:00
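The redundancy of the AND can be seen in a small model of the node. The sketch assumes FP16_TO_FP reads only the low 16 bits of its operand, as the message states; `fp16_to_fp` is an illustrative stand-in built on Python's half-precision `struct` format.

```python
import struct


def fp16_to_fp(op: int) -> float:
    """Model of FP16_TO_FP: interpret the low 16 bits of op as an IEEE half."""
    low16 = op & 0xFFFF  # the node itself ignores the upper bits
    return struct.unpack("<e", struct.pack("<H", low16))[0]
```

Because the upper bits never reach the conversion, `fp16_to_fp(op & 0xFFFF)` and `fp16_to_fp(op)` agree for any `op`, which is what lets the combine drop the mask.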
Scott Douglass	bdef60462d	[ARM] Use AEABI helpers for i64 div and rem Differential Revision: http://reviews.llvm.org/D12232 llvm-svn: 245830	2015-08-24 09:17:18 +00:00
Scott Douglass	d2974a6afa	[ARM] Refactor LowerDivRem before adding LowerREM (nfc) Differential Revision: http://reviews.llvm.org/D12230 llvm-svn: 245829	2015-08-24 09:17:11 +00:00
Michael Zuckerman	2fe19db94f	first commit to llvm llvm-svn: 245825	2015-08-24 07:48:50 +00:00
Mehdi Amini	d134a67ce9	Require Dominator Tree For SROA, improve compile-time TL-DR: SROA is followed by EarlyCSE which requires the DominatorTree. There is no reason not to require it up-front for SROA. Some history is necessary to understand why we ended up here. r123437 switched the second (Legacy)SROA in the optimizer pipeline to use SSAUpdater in order to avoid recomputing the costly DominanceFrontier. The purpose was to speed-up the compile-time. Later r123609 removed the need for the DominanceFrontier in (Legacy)SROA. Right after, some cleanup was made in r123724 to remove any reference to the DominanceFrontier. SROA existed in two flavors: SROA_SSAUp and SROA_DT (the latter replacing SROA_DF). The second argument of `createScalarReplAggregatesPass` was renamed from `UseDomFrontier` to `UseDomTree`. I believe this is where a mistake was made. The pipeline was not updated and the call site was still: PM->add(createScalarReplAggregatesPass(-1, false)); At that time, SROA was immediately followed in the pipeline by EarlyCSE which already required the DominatorTree. Not requiring the DominatorTree in SROA didn't save anything, but unfortunately it was lost at this point. When the new SROA Pass was introduced in r163965, I believe the goal was to have an exact replacement of the existing SROA, this bug slipped through. You can see currently: $ echo "" | clang -x c++ -O3 -c - -mllvm -debug-pass=Structure ... ... FunctionPass Manager SROA Dominator Tree Construction Early CSE After this patch: $ echo "" | clang -x c++ -O3 -c - -mllvm -debug-pass=Structure ... ... FunctionPass Manager Dominator Tree Construction SROA Early CSE This improves the compile time from 88s to 23s for PR17855. https://llvm.org/bugs/show_bug.cgi?id=17855 And from 113s to 12s for PR16756 https://llvm.org/bugs/show_bug.cgi?id=16756 Reviewers: chandlerc Differential Revision: http://reviews.llvm.org/D12267 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 245820	2015-08-23 22:15:49 +00:00
David Majnemer	b01aa9f794	[IR] Cleanup EH instructions a little bit Just a cosmetic change, no functionality change is intended. llvm-svn: 245818	2015-08-23 19:22:31 +00:00
Simon Pilgrim	2a7049abe0	[DAGCombiner] Fold CONCAT_VECTORS of bitcasted EXTRACT_SUBVECTOR Minor generalization of D12125 - peek through any bitcast to the original vector that we're extracting from. llvm-svn: 245814	2015-08-23 15:22:14 +00:00
Frederic Riss	7bb12261a3	[dwarfdump] Do not apply relocations in mach-o files if there is no LoadedObjectInfo. Not only do we not need to do anything to read correct values from the object files, but the current logic actually wrongly applies twice the section base address when there is no LoadedObjectInfo passed to the DWARFContext creation (as the added test shows). Simply do not apply any relocations on the mach-o debug info if there is no load offset to apply. llvm-svn: 245807	2015-08-23 04:44:21 +00:00
Mehdi Amini	a758398833	Add missing break in AArch64DAGToDAGISel::Select() switch case Reported by coverity. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 245800	2015-08-23 00:42:57 +00:00
Mehdi Amini	5aa7bd7d62	Do not use dyn_cast<> after isa<> Reported by coverity. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 245799	2015-08-23 00:27:57 +00:00
Joseph Tremoulet	8220bcc570	[WinEH] Require token linkage in EH pad/ret signatures Summary: WinEHPrepare is going to require that cleanuppad and catchpad produce values of token type which are consumed by any cleanupret or catchret exiting the pad. This change updates the signatures of those operators to require/enforce that the type produced by the pads is token type and that the rets have an appropriate argument. The catchpad argument of a `CatchReturnInst` must be a `CatchPadInst` (and similarly for `CleanupReturnInst`/`CleanupPadInst`). To accommodate that restriction, this change adds a notion of an operator constraint to both LLParser and BitcodeReader, allowing appropriate sentinels to be constructed for forward references and appropriate error messages to be emitted for illegal inputs. Also add a verifier rule (noted in LangRef) that a catchpad with a catchpad predecessor must have no other predecessors; this ensures that WinEHPrepare will see the expected linear relationship between sibling catches on the same try. Lastly, remove some superfluous/vestigial casts from instruction operand setters operating on BasicBlocks. Reviewers: rnk, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12108 llvm-svn: 245797	2015-08-23 00:26:33 +00:00
David Blaikie	3c338f3a7e	Verifier: Don't crash on null entries in debug info retained types list There was already a good error path for this. Added a test for it & made a minor code change to ensure the error path was actually reached, rather than crashing before we got that far. llvm-svn: 245795	2015-08-22 22:36:40 +00:00
Jingyue Wu	fcec09866a	[NVPTX] Allow undef value as global initializer Summary: __shared__ variable may now emit undef value as initializer, do not throw error on that. Test Plan: test/CodeGen/NVPTX/global-addrspace.ll Patch by Xuetian Weng Reviewers: jholewinski, tra, jingyue Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D12242 llvm-svn: 245785	2015-08-22 05:40:26 +00:00
Peter Collingbourne	c7b675f48c	LTO: Maintain target triple, FeatureStr and CGOptLevel in the module or LTOCodeGenerator. This makes it easier to create new TargetMachines on demand. llvm-svn: 245781	2015-08-22 02:25:53 +00:00
Matt Arsenault	0a3ac1be43	AMDGPU: Allow specifying different opcode on VI for SMRD/SMEM Although the basic s_load_* instructions happen to use the same opcode, some of the special case SMRD instructions have different opcodes. llvm-svn: 245775	2015-08-22 00:54:31 +00:00
Matt Arsenault	e8df879948	AMDGPU: Improve accuracy of instruction rates for some FP instructions llvm-svn: 245774	2015-08-22 00:50:41 +00:00
Matt Arsenault	33010103b7	AMDGPU: Use DFS to avoid second loop over function llvm-svn: 245772	2015-08-22 00:43:38 +00:00
Matt Arsenault	c8d8e4ed76	AMDGPU: Make sure to run verifier after SIFixSGPRLiveRanges llvm-svn: 245769	2015-08-22 00:19:34 +00:00
Matt Arsenault	aba29d6ab1	AMDGPU: Improve debug printing in SIFixSGPRLiveRanges llvm-svn: 245768	2015-08-22 00:19:25 +00:00
Matt Arsenault	6adf07a92e	AMDGPU: Move CI instructions into CIInstructions.td There are still a couple of CI patterns left in SIInstructions. llvm-svn: 245767	2015-08-22 00:16:34 +00:00
Matt Arsenault	f56872dc30	AMDGPU: Minor cleanups to help with f16 support The main change is inverting the condition for the operand class classes so that VT.Size == 16 uses VGPR_32 instead of 64. llvm-svn: 245764	2015-08-21 23:49:51 +00:00
JF Bastien	057292a76c	Improve the determinism of MergeFunctions Summary: Merge functions previously relied on unsigned comparisons of pointer values to order functions. This caused observable non-determinism in the compiler for large bitcode programs. Basically, opt -mergefuncs program.bc \| md5sum produces different hashes when run repeatedly on the same machine. Differing output was observed on three large bitcodes, but it was less frequent on the smallest file. It is possible that this only manifests on the large inputs, hence remaining undetected until now. This patch fixes this by removing (almost, see below) all places where comparisons between pointers are used to order functions. Most of these changes are local, but the comparison of global values requires assigning an identifier to each local in the order it is visited. This is very similar to the way the comparison function identifies Value's defined within a function. Because the order of visiting the functions and their subparts is deterministic, the identifiers assigned to the globals will be as well, and the order of functions will be deterministic. With these changes, there is no more observed non-determinism. There is also only minor slowdowns (negligible to 4%) compared to the baseline, which is likely a result of the fact that global comparisons involve hash lookups and not just pointer comparisons. The one caveat so far is that programs containing BlockAddress constants can still be non-deterministic. It is not clear what the right solution is here. In particular, even if the global numbers are used to order by function, we still need a way to order the BasicBlock's. Unfortunately, we cannot just bail out and fail to order the functions or consider them equal, because we require a total order over functions. Note that programs with BlockAddress constants are relatively rare, so the impact of leaving this in is minor as long as this pass is opt-in. 
Author: jrkoenig Reviewers: nlewycky, jfb, dschuff Subscribers: jevinskie, llvm-commits, chapuni Differential revision: http://reviews.llvm.org/D12168 llvm-svn: 245762	2015-08-21 23:27:24 +00:00
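The numbering idea described in the MergeFunctions message above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `VisitNumbering` and the global names are hypothetical stand-ins, not the pass's actual data structures. Each global receives the next integer the first time it is visited, so the resulting order depends on the traversal order rather than on memory addresses.

```python
class VisitNumbering:
    """Hypothetical helper: hands out identifiers in first-visit order."""

    def __init__(self):
        self._ids = {}

    def number_of(self, name):
        # The default (current table size) is only stored on first visit,
        # so repeated visits return the identifier assigned earlier.
        return self._ids.setdefault(name, len(self._ids))


numbering = VisitNumbering()
visit_order = ["g_b", "g_a", "g_b", "g_c"]  # assumed deterministic traversal
numbers = [numbering.number_of(g) for g in visit_order]
ordered = sorted(set(visit_order), key=numbering.number_of)
print(numbers)  # [0, 1, 0, 2]
print(ordered)  # ['g_b', 'g_a', 'g_c']
```

The lookup is a hash-table access rather than a raw pointer comparison, which is consistent with the small slowdown the message reports.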
Adam Nemet	4e533ef7a9	[LAA] Hold bounds via ValueHandles during SCEV expansion SCEV expansion can invalidate previously expanded values. For example in SCEVExpander::ReuseOrCreateCast, if we already have the requested cast value but it's not at the desired location, a new cast is inserted and the old cast will be invalidated. Therefore, when expanding the bounds for the pointers, a later entry can invalidate the IR value for an earlier one. The fix is to store a value handle rather than the value itself. The newly added test has a more detailed description of how the bug triggers. This bug can have a negative but potentially highly variable performance impact in Loop Distribution. Because one of the bound values was invalidated and is an undef expression now, InstCombine is free to transform the array overlap check: Start0 <= End1 && Start1 <= End0 into: Start0 <= End1 So depending on the runtime location of the arrays, we would detect a conflict and fall back on the original loop of the versioned loop. Also tested compile time with SPEC2006 LTO bc files. llvm-svn: 245760	2015-08-21 23:19:57 +00:00
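To see why losing half of the overlap check in the LAA message above matters, here is a minimal Python sketch. The function names and the sample address ranges are assumptions made for illustration. The weakened form reports a conflict for some layouts of two disjoint arrays and not for others, which matches the "depending on the runtime location of the arrays" remark.

```python
def ranges_conflict(start0, end0, start1, end1):
    # Full check from the message: Start0 <= End1 && Start1 <= End0
    return start0 <= end1 and start1 <= end0


def weakened_check(start0, end0, start1, end1):
    # What remains once the second comparison is folded away
    return start0 <= end1


# Two disjoint arrays, array 0 placed below array 1 in memory.
low_first = (0, 50, 100, 199)
# The same arrays with the placement swapped.
high_first = (100, 199, 0, 50)

print(ranges_conflict(*low_first), weakened_check(*low_first))    # False True
print(ranges_conflict(*high_first), weakened_check(*high_first))  # False False
```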
Tyler Nowicki	552a62fabc	Standardized 'failed' to 'Failed' in LoopVectorizationRequirements. llvm-svn: 245759	2015-08-21 23:03:24 +00:00
Peter Collingbourne	44ee84eec5	LTO: Change signature of LTOCodeGenerator::setCodePICModel() to take a Reloc::Model. This allows us to remove a bunch of code in LTOCodeGenerator and llvm-lto and has the side effect of improving error handling in the libLTO C API. llvm-svn: 245756	2015-08-21 22:57:17 +00:00
Tom Stellard	bd8a0856e2	AMDGPU/SI: Better handle s_wait insertion We can wait on either VM, EXP or LGKM. The waits are independent. Without this patch, a wait inserted because of one of them would also wait for all the previous others. This patch makes s_wait only wait for the ones we need for the next instruction. Here's an example of subtle perf reduction this patch solves: This is without the patch: buffer_load_format_xyzw v[8:11], v0, s[44:47], 0 idxen buffer_load_format_xyzw v[12:15], v0, s[48:51], 0 idxen s_load_dwordx4 s[44:47], s[8:9], 0xc s_waitcnt lgkmcnt(0) buffer_load_format_xyzw v[16:19], v0, s[52:55], 0 idxen s_load_dwordx4 s[48:51], s[8:9], 0x10 s_waitcnt vmcnt(1) buffer_load_format_xyzw v[20:23], v0, s[44:47], 0 idxen The s_waitcnt vmcnt(1) is useless. The reason it is added is because the last buffer_load_format_xyzw needs s[44:47], which was issued by the first s_load_dwordx4. It waits for all VM before that call to have finished. Internally, after every instruction, 3 counters (for VM, EXP and LGKM) are updated. For example buffer_load_format_xyzw will increase the VM counter, and s_load_dwordx4 the LGKM one. Without the patch, for every defined register, the current 3 counters are stored, and are used to know how long to wait when an instruction needs the register. Because of that, the s[44:47] counter includes that to use the register you need to wait for the previous buffer_load_format_xyzw. Instead this patch stores only the counters that matter for the register, and puts zero for the other ones, since we don't need any wait for them. Patch by: Axel Davy Differential Revision: http://reviews.llvm.org/D11883 llvm-svn: 245755	2015-08-21 22:47:27 +00:00
Sanjoy Das	c86c162a58	Re-apply r245635, "[InstCombine] Transform A & (L - 1) u< L --> L != 0" The original checkin was buggy; this change has a fix. Original commit message: [InstCombine] Transform A & (L - 1) u< L --> L != 0 Summary: This transform is never a pessimization at the IR level (since it replaces an `icmp` with another), and has potential payoffs: 1. It may make the `icmp` fold away or become loop invariant. 2. It may make the `A & (L - 1)` computation dead. This shows up in Java, in range checks generated by array accesses of the form `a[i & (a.length - 1)]`. Reviewers: reames, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12210 llvm-svn: 245753	2015-08-21 22:22:37 +00:00
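The identity behind this InstCombine transform can be checked exhaustively at a small bit width. The Python sketch below assumes 8-bit unsigned values with wrapping subtraction; the helper names are made up for this illustration and are not LLVM code.

```python
BITS = 8
MASK = (1 << BITS) - 1


def original(a, l):
    # A & (L - 1) u< L, where L - 1 wraps around at BITS bits
    return (a & ((l - 1) & MASK)) < l


def transformed(a, l):
    # The replacement comparison: L != 0
    return l != 0


all_equal = all(
    original(a, l) == transformed(a, l)
    for a in range(1 << BITS)
    for l in range(1 << BITS)
)
print(all_equal)  # True
```

The two cases are easy to see by hand: for nonzero `L`, the masked value is at most `L - 1`; for `L == 0`, the mask is all ones and no unsigned value is below zero.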
David Blaikie	47bf5c019d	Range-for-ify some things in GlobalMerge llvm-svn: 245752	2015-08-21 22:19:06 +00:00
David Blaikie	9ed57a9ef0	[opaque pointer types] Fix a few easy places in GlobalMerge that were accessing value types through pointee types llvm-svn: 245746	2015-08-21 22:00:44 +00:00
Alex Lorenz	c1136ef3b8	MIR Serialization: Serialize the pointer IR expression values in the machine memory operands. llvm-svn: 245745	2015-08-21 21:54:12 +00:00
Vedant Kumar	366dd9fd2b	[ARM] Fix MachO CPU Subtype selection Differential Revision: http://reviews.llvm.org/D12040 llvm-svn: 245744	2015-08-21 21:52:48 +00:00
Alex Lorenz	5d8b0bd9b0	MIRParser: Split the 'parseIRConstant' method into two methods. NFC. One variant of this method can be reused when parsing the quoted IR pointer expressions in the machine memory operands. llvm-svn: 245743	2015-08-21 21:48:22 +00:00
David Blaikie	d583b19569	[opaque pointer types] Push the passing of value types up from Function/GlobalVariable to GlobalObject (coming next, pushing this up into GlobalValue, so it can store the value type directly) llvm-svn: 245742	2015-08-21 21:35:28 +00:00
Hal Finkel	ff9639d6b7	[PowerPC] PPCVSXFMAMutate should not segfault on undef input registers When PPCVSXFMAMutate would look at the input addend register, it would get its input value number. This would fail, however, if the register was undef, causing a segfault. Don't segfault (just skip such FMA instructions). Fixes the test case from PR24542 (although that may have been over-reduced). llvm-svn: 245741	2015-08-21 21:34:24 +00:00
Alex Lorenz	1de2acd3c2	AsmParser: Save and restore the parsing state for types using SlotMapping. This commit extends the 'SlotMapping' structure and includes mappings for named and numbered types in it. The LLParser is extended accordingly to fill out those mappings at the end of module parsing. This information is useful when we want to parse standalone constant values at a later stage using the 'parseConstantValue' method. The constant values can be constant expressions, which can contain references to types. In order to parse such constant values, we have to restore the internal named and numbered mappings for the types in LLParser, otherwise the parser will report a parsing error. Therefore, this commit also introduces a new method called 'restoreParsingState' to LLParser, which uses the slot mappings to restore some of its internal parsing state. This commit is required to serialize constant value pointers in the machine memory operands for the MIR format. Reviewers: Duncan P. N. Exon Smith llvm-svn: 245740	2015-08-21 21:32:39 +00:00
Bruno Cardoso Lopes	7a1483e7d1	[LVI] Use a SmallVector instead of SmallPtrSet. NFC llvm-svn: 245739	2015-08-21 21:18:26 +00:00
Alex Lorenz	f22ca8ad35	MIR Serialization: Print MCSymbol operands. This commit allows the MIR printer to print the MCSymbol machine operands. Unfortunately they can't be parsed at this time. I will create a bug that will track the fact that the MCSymbol operands can't be parsed yet. llvm-svn: 245737	2015-08-21 21:12:44 +00:00
Sanjay Patel	f0bc07f7a5	[x86] enable machine combiner reassociations for 256-bit vector min/max llvm-svn: 245735	2015-08-21 21:04:21 +00:00
Sanjay Patel	dddad10241	remove 'FeatureSlowUAMem' from AMD CPUs based on 10H micro-arch or later See discussion in D12154 ( http://reviews.llvm.org/D12154 ), AMD Software Optimization Guides for 10H/12H/15H/16H, and Agner Fog's experimental data. llvm-svn: 245733	2015-08-21 20:39:17 +00:00
David Blaikie	51973e1088	Add comment as follow up to r245712 llvm-svn: 245730	2015-08-21 20:18:39 +00:00
Sanjay Patel	9e916dc48d	[x86] invert logic for attribute 'FeatureFastUAMem' This is a 'no functional change intended' patch. It removes one FIXME, but adds several more. Motivation: the FeatureFastUAMem attribute may be too general. It is used to determine if any sized misaligned memory access under 32-bytes is 'fast'. From the added FIXME comments, however, you can see that we're not consistent about this. Changing the name of the attribute makes it clearer to see the logic holes. Changing this to a 'slow' attribute also means we don't have to add an explicit 'fast' attribute to new chips; fast unaligned accesses have been standard for several generations of CPUs now. Differential Revision: http://reviews.llvm.org/D12154 llvm-svn: 245729	2015-08-21 20:17:26 +00:00
David Blaikie	88208840b5	[opaque pointer type]: Pass explicit pointee type when building a constant GEP. Gets a bit tricky in the ValueMapper, of course - not sure if we should just expose a list of explicit types for each Value so that the ValueMapper can be neutral to these special cases (it's OK for things like load, where the explicit type is the result type - but when that's not the case, it means plumbing through another "special" type... ) llvm-svn: 245728	2015-08-21 20:16:51 +00:00
Sanjay Patel	cf942fa905	[x86] enable machine combiner reassociations for 128-bit vector min/max llvm-svn: 245715	2015-08-21 18:06:49 +00:00
David Blaikie	401bb64b31	Remove an unnecessary use of pointee types introduced in r194220 David Majnemer (the original author) believes this to be an impossible condition to reach anyway, and no test cases cover this so we'll go with that. llvm-svn: 245712	2015-08-21 17:37:41 +00:00
Yaron Keren	528d8d6092	Disable Visual C++ 2013 Debug mode assert on null pointer in some STL algorithms, such as std::equal on the third argument. This reverts previous workarounds. Predefining _DEBUG_POINTER_IMPL disables Visual C++ 2013 headers from defining it to a function performing the null pointer check. In practice, it's not that bad since any function actually using the nullptr will seg fault. The other iterator sanity checks remain enabled in the headers. Reviewed by Aaron Ballman and Duncan P. N. Exon Smith. llvm-svn: 245711	2015-08-21 17:31:03 +00:00
Benjamin Kramer	103fc94d2d	[APFloat] Remove else after return and replace loop with std::equal. NFC. llvm-svn: 245707	2015-08-21 16:44:52 +00:00
Eric Christopher	e5e302f7e0	Fix typo - symetric -> symmetric. llvm-svn: 245705	2015-08-21 16:23:39 +00:00
John Brawn	eab960c46f	[DAGCombiner] Fold together mul and shl when both are by a constant This is intended to improve code generation for GEPs, as the index value is shifted by the element size and in GEPs of multi-dimensional arrays the index of higher dimensions is multiplied by the lower dimension size. Differential Revision: http://reviews.llvm.org/D12197 llvm-svn: 245689	2015-08-21 10:48:17 +00:00
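A shift by a constant followed by a multiply by a constant can be collapsed into one multiply because both are multiplications modulo the bit width. The Python sketch below checks that arithmetic fact at 6 bits. The exact pattern the DAGCombiner matches is not spelled out in the message, so the form used here, `(x << c1) * c2` versus `x * (c2 << c1)`, is an assumption.

```python
BITS = 6
MASK = (1 << BITS) - 1


def shl_then_mul(x, c1, c2):
    # Two operations, each wrapping at BITS bits
    return (((x << c1) & MASK) * c2) & MASK


def single_mul(x, c1, c2):
    # One multiply by the pre-combined constant c2 << c1
    return (x * ((c2 << c1) & MASK)) & MASK


fold_is_sound = all(
    shl_then_mul(x, c1, c2) == single_mul(x, c1, c2)
    for x in range(1 << BITS)
    for c1 in range(BITS)
    for c2 in range(1 << BITS)
)
print(fold_is_sound)  # True
```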
NAKAMURA Takumi	6a6232818d	Revert r245635, "[InstCombine] Transform A & (L - 1) u< L --> L != 0" It caused miscompilation in clang. llvm-svn: 245678	2015-08-21 07:46:07 +00:00
Peter Collingbourne	5cd1e8d3ab	Linker: Remove empty destructor. llvm-svn: 245672	2015-08-21 04:51:24 +00:00
Peter Collingbourne	ec43d0f356	LTO: Simplify ownership of LTOCodeGenerator::TargetMach. llvm-svn: 245671	2015-08-21 04:45:57 +00:00
Peter Collingbourne	2257512f87	LTO: Simplify ownership of LTOCodeGenerator::CodegenOptions. llvm-svn: 245670	2015-08-21 04:45:55 +00:00
James Y Knight	667395f334	[Sparc] Support user-specified stack object overalignment. Note: I do not implement a base pointer, so it's still impossible to have dynamic realignment AND dynamic alloca in the same function. This also moves the code for determining the frame index reference into getFrameIndexReference, where it belongs, instead of inline in eliminateFrameIndex. [Begin long-winded screed] Now, stack realignment for Sparc is actually a silly thing to support, because the Sparc ABI has no need for it -- unlike the situation on x86, the stack is ALWAYS aligned to the required alignment for the CPU instructions: 8 bytes on sparcv8, and 16 bytes on sparcv9. However, LLVM unfortunately implements user-specified overalignment using stack realignment support, so for now, I'm going to go along with that tradition. GCC instead treats objects which have alignment specification greater than the maximum CPU-required alignment for the target as a separate block of stack memory, with their own virtual base pointer (which gets aligned). Doing it that way avoids needing to implement per-target support for stack realignment, except for the targets which actually have an ABI-specified stack alignment which is too small for the CPU's requirements. Further unfortunately in LLVM, the default canRealignStack for all targets effectively returns true, despite that implementing that is something a target needs to do specifically. So, the previous behavior on Sparc was to silently ignore the user's specified stack alignment. Ugh. Yet MORE unfortunate, if a target actually does return false from canRealignStack, that also causes the user-specified alignment to be silently ignored, rather than emitting an error. (I started looking into fixing that last, but it broke a bunch of tests, because LLVM actually depends on having it silently ignored: some architectures (e.g. non-linux i386) have smaller stack alignment than spilled-register alignment. 
But, the fact that a register needs spilling is not known until within the register allocator. And by that point, the decision to not reserve the frame pointer has been frozen in place. And without a frame pointer, stack realignment is not possible. So, canRealignStack() returns false, and needsStackRealignment() then returns false, assuming everyone can just go on their merry way assuming the alignment requirements were probably just suggestions after-all. Sigh...) Differential Revision: http://reviews.llvm.org/D12208 llvm-svn: 245668	2015-08-21 04:17:56 +00:00
Peter Collingbourne	1dc6a8d179	TransformUtils: Introduce module splitter. The module splitter splits a module into linkable partitions. It will be used to implement parallel LTO code generation. This initial version of the splitter does not attempt to deal with the somewhat subtle symbol visibility issues around module splitting. These will be dealt with in a future change. Differential Revision: http://reviews.llvm.org/D12132 llvm-svn: 245662	2015-08-21 02:48:20 +00:00
NAKAMURA Takumi	cf61aae163	SparcAsmParser.cpp: Appease msc x86. llvm-svn: 245661	2015-08-21 01:12:19 +00:00
Matthias Braun	46e5639806	AArch64: Fix cmp;ccmp ordering When producing conditional compare sequences for or operations we need to negate the operands and the finally tested flags. The thing is if we negate the finally tested flags this equals a logical negation of all previously emitted expressions. There was a case missing where we have to order OR expressions so they get emitted first. This fixes http://llvm.org/PR24459 llvm-svn: 245641	2015-08-20 23:33:34 +00:00
Matthias Braun	266204b7dc	AArch64: Do not create CCMP on multiple users. Creating CMP;CCMP sequences from and/or trees does not gain us anything if the and/or tree is materialized to a GP register anyway. While most of the code already checked for hasOneUse() there was one important case missing. llvm-svn: 245640	2015-08-20 23:33:31 +00:00
David Majnemer	2df38cd0c4	[InstSimplify] add nuw %x, C2 must be at least C2 Use the fact that add nuw always creates a larger bit pattern when trying to simplify comparisons. llvm-svn: 245638	2015-08-20 23:01:41 +00:00
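The fact used by this InstSimplify change can be illustrated with a small exhaustive check in Python. The sketch assumes 8-bit values and models an `add nuw` that would wrap as `None`; which comparisons the commit actually simplifies is not stated, so the `u<` example at the end is an assumption.

```python
BITS = 8
LIMIT = 1 << BITS


def add_nuw(x, c2):
    # Returns None where the unsigned add would wrap (poison in LLVM terms)
    total = x + c2
    return total if total < LIMIT else None


defined = [
    (x, c2, add_nuw(x, c2))
    for x in range(LIMIT)
    for c2 in range(LIMIT)
    if add_nuw(x, c2) is not None
]

# Whenever the add does not wrap, the result is at least C2.
lower_bound_holds = all(total >= c2 for _, c2, total in defined)

# Consequence: (add nuw x, C2) u< C1 can never be true when C1 <= C2.
C1 = 10
never_below = not any(total < C1 for _, c2, total in defined if C1 <= c2)
print(lower_bound_holds, never_below)  # True True
```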
Dan Gohman	32907a6b21	[WebAssembly] Mark more operators as Expand. llvm-svn: 245636	2015-08-20 22:57:13 +00:00
Sanjoy Das	e472d8a57a	[InstCombine] Transform A & (L - 1) u< L --> L != 0 Summary: This transform is never a pessimization at the IR level (since it replaces an `icmp` with another), and has potential payoffs: 1. It may make the `icmp` fold away or become loop invariant. 2. It may make the `A & (L - 1)` computation dead. This shows up in Java, in range checks generated by array accesses of the form `a[i & (a.length - 1)]`. Reviewers: reames, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12210 llvm-svn: 245635	2015-08-20 22:31:55 +00:00
Michael Zolotukhin	51b00e6d82	[SLP] Propagate 'nontemporal' attribute into vectorized instructions. llvm-svn: 245633	2015-08-20 22:28:15 +00:00
Michael Zolotukhin	2a3d99fedf	[LoopVectorize] Propagate 'nontemporal' attribute into vectorized instructions. llvm-svn: 245632	2015-08-20 22:27:38 +00:00
Adrian Prantl	cbdfdb74d3	Rename Instruction::dropUnknownMetadata() to dropUnknownNonDebugMetadata() and make it always preserve debug locations, since all callers wanted this behavior anyway. This is addressing a post-commit review feedback for r245589. NFC (inside the LLVM tree). llvm-svn: 245622	2015-08-20 22:00:30 +00:00
Ahmed Bougacha	0cdc7719f0	[X86] Look for scalar through one bitcast when lowering to VBROADCAST. Fixes PR23464: one way to use the broadcast intrinsics is: _mm256_broadcastw_epi16(_mm_cvtsi32_si128((int)src)); We don't currently fold this, but now that we use native IR for the intrinsics (r245605), we can look through one bitcast to find the broadcast scalar. Differential Revision: http://reviews.llvm.org/D10557 llvm-svn: 245613	2015-08-20 21:02:39 +00:00
Jingyue Wu	ca3ef11a9b	[NVPTX] truncating 64-bit to 32-bit is free Summary: Add an LSR test that exercises isTruncateFree. Without this change, LSR creates another indvar representing the truncated value. Reviewers: jholewinski, eliben Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D12058 llvm-svn: 245611	2015-08-20 20:59:02 +00:00
Ahmed Bougacha	1a498705e4	[X86] Replace avx2 broadcast intrinsics with native IR. Since r245605, the clang headers don't use these anymore. r245165 updated some of the tests already; update the others, add an autoupgrade, remove the intrinsics, and cleanup the definitions. Differential Revision: http://reviews.llvm.org/D10555 llvm-svn: 245606	2015-08-20 20:36:19 +00:00
Adhemerval Zanella	e00b497242	[asan] Add ASAN support for AArch64 42-bit VMA This patch adds support for asan on aarch64-linux with 42-bit VMA (current default config for 64K pagesize kernels). The support is enabled by defining the SANITIZER_AARCH64_VMA to 42 at build time for both clang/llvm and compiler-rt. The default VMA is 39 bits. llvm-svn: 245594	2015-08-20 18:30:40 +00:00
Jingyue Wu	10fcea5d4b	[ValueTracking] computeOverflowForSignedAdd and isKnownNonNegative Summary: Refactor, NFC Extracts computeOverflowForSignedAdd and isKnownNonNegative from NaryReassociate to ValueTracking in case others need it. Reviewers: reames Subscribers: majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D11313 llvm-svn: 245591	2015-08-20 18:27:04 +00:00
Bruno Cardoso Lopes	ed6b9bfeab	[LVI] Avoid iterator invalidation in LazyValueInfoCache::threadEdge Do that by copying out the elements to another SmallPtrSet. Follow up from r245309. llvm-svn: 245590	2015-08-20 18:24:54 +00:00
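The copy-before-iterate fix described above has a close analog in Python, sketched here with hypothetical names. In Python, resizing a set while iterating over it raises `RuntimeError`, whereas in C++ it invalidates iterators; in both cases, iterating over a snapshot avoids the problem.

```python
def process_all(pending, visit):
    # Iterate over a snapshot so that visit() may freely mutate `pending`.
    for item in list(pending):
        visit(item, pending)


def replace_with_successor(item, pending):
    # Hypothetical visitor: removes the item and queues a derived one.
    pending.discard(item)
    pending.add(item + 10)


pending = {1, 2, 3}
process_all(pending, replace_with_successor)
print(sorted(pending))  # [11, 12, 13]
```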
Adrian Prantl	baf90fc265	Fix a bug that caused SimplifyCFG to drop DebugLocs. Instruction::dropUnknownMetadata(KnownSet) is supposed to preserve all metadata in KnownSet, but the condition for DebugLocs was inverted. Most users of dropUnknownMetadata() actually worked around this by not adding LLVMContext::MD_dbg to their list of KnownIDs. This is now made explicit. llvm-svn: 245589	2015-08-20 18:24:02 +00:00
Adrian Prantl	a317cd2583	Fix a debug location handling bug in GVN. Caught by the famous "DebugLoc describes the currect SubProgram" assertion. When GVN is removing a nonlocal load it updates the debug location of the SSA value it replaced the load with with the one of the load. In the testcase this actually overwrites a valid debug location with an empty one. In reality GVN has to make an arbitrary choice between two equally valid debug locations. This patch changes the behavior to only update the location if the value doesn't already have a debug location. llvm-svn: 245588	2015-08-20 18:23:56 +00:00
Adam Nemet	e48134093d	[LVer] Fix FIXME: hide addPHINodes, NFC Since Ashutosh made findDefsUsedOutsideOfLoop public, we can clean this up. Now clients that don't compute DefsUsedOutsideOfLoop can just call versionLoop() and computing DefsUsedOutsideOfLoop will happen implicitly. With that there is no reason to expose addPHINodes anymore. Ashutosh, you can now drop the calls to findDefsUsedOutsideOfLoop and addPHINodes in LVerLICM and things should just work. llvm-svn: 245579	2015-08-20 17:22:29 +00:00
James Molloy	bf17009a97	[ARM] Don't try and custom lower a vNi64 SETCC. It won't go well. We've already marked 64-bit SETCCs as non-Custom, but it's just possible that a SETCC has a legal result type but an illegal operand type. If this happens, bail out before we create unselectable nodes. Fixes PR24292. I tried to create a testcase but in 99% of cases we can't trigger this - not surprising that this bug has been latent since 2009. llvm-svn: 245577	2015-08-20 16:33:44 +00:00
Rafael Espindola	c30c7c493f	Fix symbol value computation when part of the expression is weak. This matches the behaviour of the gnu assembler and is part of fixing pr24486. llvm-svn: 245576	2015-08-20 16:18:30 +00:00
Douglas Katzman	58195a2d74	[Sparc]: correct the 'set' synthetic instruction Differential Revision: http://reviews.llvm.org/D12194 llvm-svn: 245575	2015-08-20 16:16:16 +00:00
Balaram Makam	ccf59731e3	Optimize bitwise even/odd test (-x&1 -> x&1) to not use negation. Summary: We know that -x & 1 is equivalent to x & 1, avoid using negation for testing if a negative integer is even or odd. Reviewers: majnemer Subscribers: junbuml, mssimpso, gberry, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D12156 llvm-svn: 245569	2015-08-20 15:35:00 +00:00
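The equivalence this change relies on is that negation never changes the lowest bit in two's complement arithmetic. A short Python check at an assumed width of 8 bits (the helper name is hypothetical):

```python
BITS = 8
MASK = (1 << BITS) - 1


def neg(x):
    # Two's complement negation at BITS bits
    return (-x) & MASK


low_bit_preserved = all((neg(x) & 1) == (x & 1) for x in range(1 << BITS))
print(low_bit_preserved)  # True
```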
Marina Yatsina	bce1ab67a5	[X86] Fix FBLD and FBSTP FBLD and FBSTP should receive TBYTE because it is defined as FBLD m80 FBSTP m80 Differential Revision: http://reviews.llvm.org/D11748 llvm-svn: 245553	2015-08-20 11:51:24 +00:00
Marina Yatsina	7a4e1ba737	[X86] Fix bug in COMISD and COMISS definition in td files COMISD should receive QWORD because it is defined as (V)COMISD xmm1, xmm2/m64 COMISS should receive DWORD because it is defined as (V)COMISS xmm1, xmm2/m32 Differential Revision: http://reviews.llvm.org/D11712 llvm-svn: 245551	2015-08-20 11:21:36 +00:00
Benjamin Kramer	fcdb1c14ac	Make helper functions static. NFC. llvm-svn: 245549	2015-08-20 09:57:22 +00:00
David Majnemer	cfc1df553e	[X86] Fix the (shl (and (setcc_c), c1), c2) -> (and setcc_c, (c1 << c2)) fold We didn't check for the necessary preconditions before folding a mask/shift into a single mask. This fixes PR24516. llvm-svn: 245544	2015-08-20 09:00:56 +00:00
Bjorn Steinbrink	2e2f66557e	Revert "[DSE] Enable removal of lifetime intrinsics in terminating blocks" llvm-svn: 245543	2015-08-20 08:58:47 +00:00
Bjorn Steinbrink	cc7e8a9705	[DSE] Enable removal of lifetime intrinsics in terminating blocks Usually DSE is not supposed to remove lifetime intrinsics, but it's actually ok to remove them for dead objects in terminating blocks, because they convey no extra information there. Until we hit a lifetime start that cannot be removed, that is. Because from that point on the lifetime intrinsics become interesting again, e.g. for stack coloring. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11710 llvm-svn: 245542	2015-08-20 08:25:28 +00:00
Chandler Carruth	0f792189a4	[ARC] Pull the ObjC ARC components that really serve the role of analyses into LLVM's Analysis library rather than having them in a Transforms library. This is motivated by the need to have the core AliasAnalysis infrastructure be aware of the ObjCARCAliasAnalysis. However, it also seems like a nice and clean separation. Everything was very easy to move and this doesn't create much clutter in the analysis library IMO. Differential Revision: http://reviews.llvm.org/D12133 llvm-svn: 245541	2015-08-20 08:06:03 +00:00
Hal Finkel	9fdce9adee	[PowerPC] Fix value type on XVCMPEQDP for v2f64 comparisons XVCMPEQDP is used for VSX v2f64 equality comparisons, but the value type needs to be v2i64 (as that's the corresponding SETCC type). Fixes PR24225. llvm-svn: 245535	2015-08-20 03:02:02 +00:00
Hal Finkel	be78c25acb	[PowerPC] Fix the int2fp(fp2int(x)) DAGCombine to ignore ppc_fp128 This DAGCombine was creating custom SDAG nodes with an illegal ppc_fp128 operand type because it was triggering on f64/f32 int2fp(fp2int(ppc_fp128 x)), but shouldn't (it should only apply to f32/f64 types). The result was a crash. llvm-svn: 245530	2015-08-20 01:18:20 +00:00
Alex Lorenz	36efd3883d	MIR Serialization: Use the global value syntax for global value memory operands. This commit modifies the serialization syntax so that the global IR values in machine memory operands use the global value '@<name>' syntax instead of the current '%ir.<name>' syntax. The unnamed global IR values are handled by this commit as well, as the existing global value parsing method can parse the unnamed globals already. llvm-svn: 245527	2015-08-20 00:20:03 +00:00
Alex Lorenz	0d009645a1	MIR Serialization: Change syntax for the call entry pseudo source values. The global IR values in machine memory operands should use the global value '@<name>' syntax instead of the current '%ir.<name>' syntax. However, the global value call entry pseudo source values use the global value syntax already. Therefore, the syntax for the call entry pseudo source values has to be changed so that the global values and call entry global value PSVs can be parsed without ambiguities. llvm-svn: 245526	2015-08-20 00:12:57 +00:00
Alex Lorenz	dbd22a9a6c	Fix test failure introduced by r245521. Machine memory operands can contain pointer values that are constants, and the 'getLocalSlot' method requires non-constant values. The constant pointer values will have to be serialized in a different patch. llvm-svn: 245523	2015-08-19 23:56:37 +00:00
Alex Lorenz	dd13be0bcc	MIR Serialization: Serialize unnamed local IR values in memory operands. llvm-svn: 245521	2015-08-19 23:31:05 +00:00
Alex Lorenz	36593ac51b	MIR Parser: parseIRValue should take in a constant pointer. NFC. llvm-svn: 245520	2015-08-19 23:27:07 +00:00
Alex Lorenz	55dc6f8165	MIR Printer: Extract the code that prints IR slots to a separate function. NFC. This code can be reused when printing references to unnamed local IR values. llvm-svn: 245519	2015-08-19 23:24:37 +00:00
Sanjay Patel	9e5927fdc3	[x86] enable machine combiner reassociations for scalar double-precision min/max llvm-svn: 245506	2015-08-19 21:27:27 +00:00
Sanjay Patel	4e3ee1e548	[x86] enable machine combiner reassociations for scalar single-precision maximums llvm-svn: 245504	2015-08-19 21:18:46 +00:00
Simon Pilgrim	35f528262f	[DAGCombiner] Added SMAX/SMIN/UMAX/UMIN constant folding We still need to add constant folding of vector comparisons to fold the tests for targets that don't support the respective min/max nodes. I needed to update 2011-12-06-AVXVectorExtractCombine to load a vector instead of using a constant vector to prevent it folding. Differential Revision: http://reviews.llvm.org/D12118 llvm-svn: 245503	2015-08-19 21:11:58 +00:00
Juergen Ributzka	b12248e9cd	[AArch64][FastISel] Don't fold shifts with UB. We are already falling back to SelectionDAG when encountering a shift with UB. This adds the same checks for shifts with UB that get folded into arithmetic or logical operations. This fixes rdar://problem/22345295. llvm-svn: 245499	2015-08-19 20:52:55 +00:00
David Majnemer	f25fe64716	[X86] Emit more efficient >= comparisons against 0 We don't do a great job with >= 0 comparisons against zero when the result is used as an i8. Given something like: void f(long long LL, bool B) { B = LL >= 0; } We used to generate: shrq $63, %rdi xorb $1, %dil movb %dil, (%rsi) Now we generate: testq %rdi, %rdi setns (%rsi) Differential Revision: http://reviews.llvm.org/D12136 llvm-svn: 245498	2015-08-19 20:51:40 +00:00
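Both instruction sequences quoted in this message compute the inverted sign bit of a 64-bit value. The Python model below is a rough sketch, assuming a simplified reading of `shrq`/`xorb` and of `testq`/`setns`; it is not an x86 emulator, and the function names are invented for the illustration.

```python
MASK64 = (1 << 64) - 1


def old_sequence(x):
    # shrq $63 isolates the sign bit; xorb $1 inverts it
    return ((x & MASK64) >> 63) ^ 1


def new_sequence(x):
    # testq copies the sign bit into SF; setns stores 1 when SF is clear
    sign_flag = ((x & MASK64) >> 63) & 1
    return 0 if sign_flag else 1


samples = [0, 1, -1, 42, -42, 2**63 - 1, -(2**63)]
agree = all(old_sequence(x) == new_sequence(x) == int(x >= 0) for x in samples)
print(agree)  # True
```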
Dan Gohman	dde8dce6a9	[WebAssembly] Use the default alignment for SIMD types. Previously WebAssembly's datalayout string had -v128:8:128. This had been an attempt to declare a certain level of support for unaligned SIMD accesses. However, clang makes its own determinations for SIMD alignment that are independent of the datalayout string, so this wasn't actually meaningful. llvm-svn: 245494	2015-08-19 20:30:20 +00:00
Simon Pilgrim	989cbbd2f5	[DAGCombiner] Fold CONCAT_VECTORS of EXTRACT_SUBVECTOR (or undef) to VECTOR_SHUFFLE. Check to see if this is a CONCAT_VECTORS of a bunch of EXTRACT_SUBVECTOR operations. If so, and if the EXTRACT_SUBVECTOR vector inputs come from at most two distinct vectors the same size as the result, attempt to turn this into a legal shuffle. Differential Revision: http://reviews.llvm.org/D12125 llvm-svn: 245490	2015-08-19 20:09:50 +00:00
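The rewrite described above amounts to computing a shuffle mask from the extracted pieces. The Python sketch below models vectors as lists. The `(source, start)` encoding, the `UNDEF` marker, and the helper names are assumptions chosen for illustration, and the legality check mentioned in the message is not modeled.

```python
UNDEF = -1


def concat_of_extracts_to_mask(pieces, piece_len, vec_len):
    """pieces: (source, start) pairs; source is 0, 1, or None for undef."""
    mask = []
    for source, start in pieces:
        for i in range(piece_len):
            if source is None:
                mask.append(UNDEF)
            else:
                # Lanes of the second input are numbered after the first.
                mask.append(source * vec_len + start + i)
    return mask


def shuffle(a, b, mask):
    both = a + b
    return [None if m == UNDEF else both[m] for m in mask]


a = [0, 1, 2, 3]
b = [10, 11, 12, 13]
# concat(extract_subvector(a, 2), extract_subvector(b, 0))
mask = concat_of_extracts_to_mask([(0, 2), (1, 0)], piece_len=2, vec_len=4)
print(mask)                 # [2, 3, 4, 5]
print(shuffle(a, b, mask))  # [2, 3, 10, 11]
```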
David Majnemer	ba275f9947	Replace some calls to isa<LandingPadInst> with isEHPad() No functionality change is intended. llvm-svn: 245487	2015-08-19 19:54:02 +00:00
Douglas Katzman	2362b69dd9	[Sparc]: asm-only support for the ldstub instruction. llvm-svn: 245485	2015-08-19 19:30:57 +00:00
Alex Lorenz	feb6b4395b	MIR Parser: Rename 'MachineOperandWithLocation' to 'ParsedMachineOperand'. NFC. Besides storing the operand's source range, this structure now stores other attributes as well, so the name should reflect this fact. llvm-svn: 245483	2015-08-19 19:19:16 +00:00
Alex Lorenz	5ef93b0c4c	MIR Serialization: Serialize instruction's register ties. This commit serializes the machine instruction's register operand ties. The ties are printed out only when the instruction has register ties that are different from the ties that are specified in the instruction's description. llvm-svn: 245482	2015-08-19 19:05:34 +00:00
Nemanja Ivanovic	5f1cea4141	Temporary fix for the self-host failures introduced by rL244921. This revision has introduced an issue that only affects bootstrapped compiler when it is printing the ASM. I am working on resolving the issue, but in the meantime, I'm disabling the legalization of scalar_to_vector operation for v2i64 and the associated testing until I can get this fixed. llvm-svn: 245481	2015-08-19 19:04:47 +00:00
Alex Lorenz	e66a7ccf77	MIR Serialization: Serialize defined registers that require 'def' register flag. The defined registers are already serialized - they are represented by placing them before the '=' in a machine instruction. However, certain instructions like INLINEASM can have defined register operands after the '=', so this commit introduces the 'def' register flag for such operands. llvm-svn: 245480	2015-08-19 18:55:47 +00:00
Bruno Cardoso Lopes	27fd06922b	[PeepholeOptimizer] Look through PHIs to find additional register sources Reintroduce r245442. Remove an overly conservative assertion introduced in r245442. We could replace the assertion to use `shareSameRegisterFile` instead, but in that point in `insertPHI` we already lost the original Def subreg to check against. So drop the assertion completely. Original commit message: - Teaches the ValueTracker in the PeepholeOptimizer to look through PHI instructions. - Add findNextSourceAndRewritePHI method to lookup into multiple sources returned by the ValueTracker and rewrite PHIs with new sources. With these changes we can find more register sources and rewrite more copies to allow coalescing of bitcast instructions. Hence, we eliminate unnecessary VR64 <-> GR64 copies in x86, but it could be extended to other archs by marking "isBitcast" on target specific instructions. The x86 example follows: A: psllq %mm1, %mm0 movd %mm0, %r9 jmp C B: por %mm1, %mm0 movd %mm0, %r9 jmp C C: movd %r9, %mm0 pshufw $238, %mm0, %mm0 Becomes: A: psllq %mm1, %mm0 jmp C B: por %mm1, %mm0 jmp C C: pshufw $238, %mm0, %mm0 Differential Revision: http://reviews.llvm.org/D11197 rdar://problem/20404526 llvm-svn: 245479	2015-08-19 18:53:36 +00:00
Douglas Katzman	e5485c651e	[SPARC] Enable writing to floating-point-state register. llvm-svn: 245475	2015-08-19 18:34:48 +00:00
Ahmed Bougacha	9e00ec6195	[AArch64] Improve short-form diags on long-form Match_InvalidOperand. Since r244955, we try to use the short-form ErrorInfo when both tries failed, and the long-form match failed on a suffix operand. However, this means we sometimes mix ErrorInfo and MatchResult (one manifestation of this being PR24498). Instead, restore both. llvm-svn: 245469	2015-08-19 17:40:19 +00:00
Hal Finkel	ff08a2ecad	[SCEV] Fix GCC 4.8.0 ICE in lambda function Rewrite some code to not use a lambda function. The non-lambda code is just about as clean as the original, and not any longer. The lambda function causes an internal compiler error in GCC 4.8.0, and it is not worth breaking support for that compiler over this. NFC. llvm-svn: 245466	2015-08-19 17:26:07 +00:00
Adam Nemet	cdb791cd33	[LAA] Comment how memchecks are codegened llvm-svn: 245465	2015-08-19 17:24:36 +00:00
Renato Golin	eb552e83e0	Revert "[AArch64] Simplify/refactor code to ease code review. NFC." This reverts commit r245443, as it broke AArch64 test-suite tramp3d with an assert "Reg && "Null register has no regunits". llvm-svn: 245455	2015-08-19 16:29:53 +00:00
Derek Schuff	55817ee604	x32. Fixes a bug in x32 exception handling. This patch updates the X86 lowering so that the Exception Pointer and Selector are 64-bit wide only if Subtarget.isTarget64BitLP64. Patch by João Porto Reviewers: dschuff, rnk Differential Revision: http://reviews.llvm.org/D12111 llvm-svn: 245454	2015-08-19 16:28:21 +00:00
JF Bastien	5ab87edbb4	x32. Fixes jmp %reg in x32 x32 has 32-bit pointers; x86-64 can't jmp %r32. This patch addresses this issue by explicitly zero-extending brind's target to 64-bits. Author: jpp Reviewers: jfb, dschuff, pavel.v.chupin Subscribers: llvm-commits Differential revision: http://reviews.llvm.org/D12112 llvm-svn: 245452	2015-08-19 16:17:08 +00:00
James Y Knight	3b0fd753c4	[Sparc] Rename LoadASR and StoreASR from r245360 to *ASI, as was intended. llvm-svn: 245450	2015-08-19 15:59:49 +00:00
Bruno Cardoso Lopes	61009142b8	Revert "[PeepholeOptimizer] Look through PHIs to find additional register sources" Revert r245442 while investigating a fix. An assertion hit in http://lab.llvm.org:8080/green/job/clang-stage1-configure-RA_build/11380 llvm-svn: 245446	2015-08-19 15:10:32 +00:00
James Y Knight	d966fb6fef	[SPARC] Fix BooleanContents, so that select of a trunc doesn't eliminate the trunc. Differential Revision: http://reviews.llvm.org/D10442 llvm-svn: 245444	2015-08-19 14:47:04 +00:00
Chad Rosier	494abf1ad8	[AArch64] Simplify/refactor code to ease code review. NFC. llvm-svn: 245443	2015-08-19 14:34:54 +00:00
Bruno Cardoso Lopes	0a1c126684	[PeepholeOptimizer] Look through PHIs to find additional register sources Reapply r243486. - Teaches the ValueTracker in the PeepholeOptimizer to look through PHI instructions. - Add findNextSourceAndRewritePHI method to lookup into multiple sources returned by the ValueTracker and rewrite PHIs with new sources. With these changes we can find more register sources and rewrite more copies to allow coalescing of bitcast instructions. Hence, we eliminate unnecessary VR64 <-> GR64 copies in x86, but it could be extended to other archs by marking "isBitcast" on target specific instructions. The x86 example follows: A: psllq %mm1, %mm0 movd %mm0, %r9 jmp C B: por %mm1, %mm0 movd %mm0, %r9 jmp C C: movd %r9, %mm0 pshufw $238, %mm0, %mm0 Becomes: A: psllq %mm1, %mm0 jmp C B: por %mm1, %mm0 jmp C C: pshufw $238, %mm0, %mm0 Differential Revision: http://reviews.llvm.org/D11197 rdar://problem/20404526 llvm-svn: 245442	2015-08-19 14:34:41 +00:00
Silviu Baranga	ad1b19fcb7	[ARM] Add instruction selection patterns for vmin/vmax Summary: The mid-end was generating vector smin/smax/umin/umax nodes, but we were using vbsl to generate the code. This adds the vmin/vmax patterns and a test to check that we are now generating vmin/vmax instructions. Reviewers: rengolin, jmolloy Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D12105 llvm-svn: 245439	2015-08-19 14:11:27 +00:00
Joerg Sonnenberger	7d180c59bb	Map %fprs to %asr6 in the Sparc assembler parser. llvm-svn: 245437	2015-08-19 13:55:14 +00:00
Daniel Sanders	1e97a0b324	Emit <regmask R1 R2 R3 ...> instead of just <regmask> in IR dumps. Reviewers: qcolombet Subscribers: kparzysz, qcolombet, llvm-commits Differential Revision: http://reviews.llvm.org/D11644 llvm-svn: 245433	2015-08-19 12:03:04 +00:00
Tobias Grosser	85508e804b	Revert "[X86] Widen the 'AND' mask if doing so shrinks the encoding size" This reverts commit 245169 which miscompiles MultiSource/Applications/siod from LNT. llvm-svn: 245432	2015-08-19 11:35:10 +00:00
Michael Kuperstein	9fe42604aa	[X86] Do not lower scalar sdiv/udiv to a shifts + mul sequence when optimizing for minsize There are some cases where the mul sequence is smaller, but for the most part, using a div is preferable. This does not apply to vectors, since x86 doesn't have vector idiv, and a vector mul/shifts sequence ought to be smaller than a scalarized division. Differential Revision: http://reviews.llvm.org/D12082 llvm-svn: 245431	2015-08-19 11:21:43 +00:00
Michael Kuperstein	dcdab4cd3a	[TLI] Refactor "is integer division cheap" queries. This removes the isPow2SDivCheap() query, as it is not currently used in any meaningful way. isIntDivCheap() no longer relies on a state variable (as all in-tree targets set it to false), but the interface allows querying based on the type optimization level. NFC. Differential Revision: http://reviews.llvm.org/D12082 llvm-svn: 245430	2015-08-19 11:17:59 +00:00
Nick Lewycky	1098e496e1	More clean up, still NFC. Remove dead variables now that the casts are gone. llvm-svn: 245420	2015-08-19 06:25:30 +00:00
Nick Lewycky	2c852543a3	Clean up this file a little. Remove dead casts, casting Values to Values. Adjust some comments for typos and whitespace. NFC. llvm-svn: 245419	2015-08-19 06:22:33 +00:00
Ashutosh Nema	c5b7b55589	Exposed findDefsUsedOutsideOfLoop as a loop utility function Exposed findDefsUsedOutsideOfLoop as a loop utility function by moving it from LoopDistribute to LoopUtils. Reviewed By: anemet llvm-svn: 245416	2015-08-19 05:40:42 +00:00
Chandler Carruth	44a1385c45	[LPM] Teach the legacy pass manager to support using an analysis without requiring it. This allows a pass to indicate that it will use an analysis if available (through getAnalysisIfAvailable). When the pass manager knows this, it will refrain from deleting that analysis if it can. Naturally, it will still get invalidated at the correct time. These passes are not considered when scheduling the pass pipeline, so typically they will require manual scheduling, but this may also allow passes with getAnalysisIfAvailable to find the analysis more often if nothing after them requires that analysis and it wasn't invalidated. I don't have a particular use case with the current passes, but with my new structure for alias analyses, this will be very useful. We want to allow people to customize the set of AAs available by scheduling additional passes. These aren't ever required for obvious reasons. So we need some way to mark in the legacy pass manager that they will still be used if available. This is essentially how analysis groups already work. But this makes the feature generally available and more explicit. It should allow the AA change to not impact how people trigger a custom alias analysis being available at a certain point in compilation. Differential Revision: http://reviews.llvm.org/D12114 llvm-svn: 245409	2015-08-19 03:02:12 +00:00
Hal Finkel	0ef2b10f16	Fix how DependenceAnalysis calls delinearization Fix how DependenceAnalysis calls delinearization, mirroring what is done in Delinearization.cpp (mostly by making sure to call getSCEVAtScope before delinearizing, and by removing the unnecessary 'Pairs == 1' check). Patch by Vaivaswatha Nagaraj! llvm-svn: 245408	2015-08-19 02:56:36 +00:00
Eric Christopher	0efe9f60bb	Revert "Fix PR24469 resulting from r245025 and re-enable dead store elimination across basicblocks." This is causing bootstrap problems, e.g.: http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/2960 This reverts r245195. llvm-svn: 245402	2015-08-19 02:15:13 +00:00
Hal Finkel	a8d205f145	Make ScalarEvolution::isKnownPredicate a little smarter Here we make ScalarEvolution::isKnownPredicate, indirectly, a little smarter. Given some relational comparison operator OP, and two AddRec SCEVs, {I,+,S} OP {J,+,T}, we can reduce this to the comparison I OP J when S == T, both AddRecs are for the same loop, and both are known not to wrap. As it turns out, because of the way that backedge-guard expressions can be leveraged when computing known predicates, this allows indvars to simplify the if-statement comparison in this loop: void foo (int *a, int *b, int n) { for (int i = 0; i < n; ++i) { if (i > n) a[i] = b[i] + 1; } } which, somewhat surprisingly, we were not previously optimizing away. llvm-svn: 245400	2015-08-19 01:51:51 +00:00
Alex Lorenz	df9e3c6fb0	MIR Serialization: Serialize MMI's variable debug information. llvm-svn: 245396	2015-08-19 00:13:25 +00:00
Quentin Colombet	b700e357b5	[BasicAA] Revert r221876 because it can produce incorrect aliasing information: see PR24468. llvm-svn: 245394	2015-08-19 00:07:20 +00:00
Steve King	d4c8f70ce1	Fix backward operands in call to isTruncateFree() and improve comments. llvm-svn: 245385	2015-08-18 23:02:41 +00:00
Alex Lorenz	607efb6c7e	MIR Parser: Return true on error when parsing standalone registers. llvm-svn: 245384	2015-08-18 22:57:36 +00:00
Alex Lorenz	f3630113cd	MIR Serialization: Serialize the operand's bit mask target flags. This commit adds support for bit mask target flag serialization to the MIR printer and the MIR parser. It also adds support for the machine operand's target flag serialization to the AArch64 target. Reviewers: Duncan P. N. Exon Smith llvm-svn: 245383	2015-08-18 22:52:15 +00:00
Sanjay Patel	5c55fbc5ea	use TLI.allowsMemoryAccess() to check if memory accesses are fast; NFCI This consolidates use of isUnalignedMem32Slow() in one place. There is a slight change in logic although I'm not sure that it would ever come up in the real world: we were assuming that an alignment of the type size is always fast; now, we actually check the data layout to confirm that. llvm-svn: 245382	2015-08-18 22:48:12 +00:00
Nick Lewycky	06b0ea2e8f	Fix three typos in comments; "easilly" -> "easily". llvm-svn: 245379	2015-08-18 22:41:58 +00:00
Peter Collingbourne	4cfa086df2	Support: Clean up TSan annotations. Remove support for Valgrind-based TSan, which hasn't been maintained for a few years. We now use the TSan annotations only if LLVM is compiled with -fsanitize=thread. We no longer need the weak function definitions as we are guaranteed that our program is linked directly with the TSan runtime. Differential Revision: http://reviews.llvm.org/D12121 llvm-svn: 245374	2015-08-18 22:31:24 +00:00
Alex Lorenz	a314d81328	MIR Serialization: Serialize the frame information's stack protector index. llvm-svn: 245372	2015-08-18 22:26:26 +00:00
Alex Lorenz	dc9dadf683	MIR Parser: Extract the code that parses stack object references into a new method. This commit extracts the code that parses the stack object references into a new method named 'parseStackFrameIndex', so that it can be reused when parsing standalone stack object references. llvm-svn: 245370	2015-08-18 22:18:52 +00:00
David Majnemer	8e335ca278	[InstSimplify] Remove unused variable No functionality change is intended. llvm-svn: 245369	2015-08-18 22:18:22 +00:00
David Majnemer	c6bb0e2a51	[InstSimplify] Don't assume getAggregateElement will succeed It isn't always possible to get a value from getAggregateElement. This fixes PR24488. llvm-svn: 245365	2015-08-18 22:07:25 +00:00
David Majnemer	5eaf08ff1f	[VectorUtils] Replace 'llvm::' qualification with 'using llvm' No functional change is intended, this just makes the file look more like the rest of LLVM. llvm-svn: 245364	2015-08-18 22:07:20 +00:00
Joerg Sonnenberger	b0ce8747c3	Load/store instructions for floating points with address space require SparcV9. To properly handle this, define the *a instructions as separate instruction classes by refactoring the LoadA and StoreA multiclasses. Move the instruction tests into the sparcv9 file to test the difference. llvm-svn: 245360	2015-08-18 21:31:46 +00:00
Matthias Braun	fa3b248a66	DAGCombiner: Improve DAGCombiner select normalization The current code normalizes select(C0, x, select(C1, x, y)) towards select(C0|C1, x, y) if the target prefers that form. This patch adds an additional rule that if the select(C1, x, y) part already exists in the function then we want to normalize into the other direction because the effects of reusing the existing value are bigger than transforming into the target preferred form. This addresses regressions following r238793, see also: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150727/290272.html Differential Revision: http://reviews.llvm.org/D11616 llvm-svn: 245350	2015-08-18 20:48:36 +00:00
Matthias Braun	2e920bd04f	DAGCombiner: Optimize SELECTs first before turning them into SELECT_CC This is part of http://reviews.llvm.org/D11616 - I just decided to split this up into a separate commit. llvm-svn: 245349	2015-08-18 20:48:29 +00:00
David Majnemer	0ad363eebc	[WinEH] Calculate state numbers for the new EH representation State numbers are calculated by performing a walk from the innermost funclet to the outermost funclet. Rudimentary support for the new EH constructs has been added to the assembly printer, just enough to test the new machinery. Differential Revision: http://reviews.llvm.org/D12098 llvm-svn: 245331	2015-08-18 19:07:12 +00:00
Matthias Braun	d55bcf2646	MachineRegisterInfo: Introduce isPhysRegUsed() This method checks whether a physical register or any of its aliases are used in the function. Using this function in SIRegisterInfo::findUnusedReg() should also fix this reported failure: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150803/292143.html http://reviews.llvm.org/rL242173#inline-533 The report doesn't come with a testcase and I don't know enough about AMDGPU to create one myself. llvm-svn: 245329	2015-08-18 18:54:27 +00:00
Chandler Carruth	2f02ea462c	[LPM] Cleanup some loops to be range based for loops before hacking on this code. NFC. llvm-svn: 245327	2015-08-18 18:41:53 +00:00
Chandler Carruth	7adc3a2b0e	[PM/AA] Remove the last relics of the separate IPA library from LLVM, folding the code into the main Analysis library. There already wasn't much of a distinction between Analysis and IPA. A number of the passes in Analysis are actually IPA passes, and there doesn't seem to be any advantage to separating them. Moreover, it makes it hard to have interactions between analyses that are both local and interprocedural. In trying to make the Alias Analysis infrastructure work with the new pass manager, it becomes particularly awkward to navigate this split. I've tried to find all the places where we referenced this, but I may have missed some. I have also adjusted the C API to continue to be equivalently functional after this change. Differential Revision: http://reviews.llvm.org/D12075 llvm-svn: 245318	2015-08-18 17:51:53 +00:00
Alex Lorenz	eb7c9be43c	MIR Parser: Implicit register verifier should accept unexpected implicit subregister operands. llvm-svn: 245315	2015-08-18 17:17:13 +00:00
Bruno Cardoso Lopes	1846ea3c71	[LVI] Use a SmallDenseMap instead of std::map for ValueCacheEntryTy Historically there seems to be some resistance regarding the change to DenseMap (r147980). However, I couldn't find cases of iterator invalidation for ValueCacheEntryTy, but only for ValueCache, which I left untouched. This reduces 20s on an internal testcase. Follow up from r245309. Differential Revision: http://reviews.llvm.org/D11651 rdar://problem/21320066 llvm-svn: 245314	2015-08-18 16:54:36 +00:00
Sanjay Patel	1cd6d88e4d	use minSize wrapper; NFCI These were missed when other uses were switched over: http://llvm.org/viewvc/llvm-project?view=revision&revision=243994 llvm-svn: 245311	2015-08-18 16:44:23 +00:00
Bruno Cardoso Lopes	6ac4ea4d29	[LVI] Improve LazyValueInfo compile time performance Changes in LoopUnroll in the past six months exposed scalability issues in LazyValueInfo when used from JumpThreading. One internal test that used to take 20s under -O2 now takes 6min. This commit changes the OverDefinedCache from DenseSet<std::pair<AssertingVH<BasicBlock>, Value *>> to DenseMap<AssertingVH<BasicBlock>, SmallPtrSet<Value *, 4>> and reduces compile time down to 1m40s. Differential Revision: http://reviews.llvm.org/D11651 rdar://problem/21320066 llvm-svn: 245309	2015-08-18 16:34:27 +00:00
Chad Rosier	3dd0e942b6	[AArch64] Simplify the logic for computing in bounds offset. NFC. llvm-svn: 245307	2015-08-18 16:20:03 +00:00
Daniel Sanders	63f4a5dcad	[mips] Expand JAL instructions when PIC is enabled. Summary: This is the correct way to handle JAL instructions when PIC is enabled. Patch by Toma Tabacu Reviewers: seanbruno, tomatabacu Subscribers: brooks, seanbruno, emaste, llvm-commits Differential Revision: http://reviews.llvm.org/D6231 llvm-svn: 245305	2015-08-18 16:18:09 +00:00
Zoran Jovanovic	2fe8466f6e	[mips][microMIPS] Implement DDIV, DMOD, DDIVU and DMODU instructions Differential Revision: http://reviews.llvm.org/D10953 llvm-svn: 245297	2015-08-18 14:40:43 +00:00
Zoran Jovanovic	a6593ff613	[mips][microMIPS] Implement SW and SWE instructions Differential Revision: http://reviews.llvm.org/D10869 llvm-svn: 245293	2015-08-18 12:53:08 +00:00
Daniel Sanders	a699444094	[mips] Make the MipsAsmParser capable of knowing whether PIC mode is enabled or not. Summary: This information is needed to decide whether we do the PIC-only JAL expansions or not. It's also needed for an upcoming patch which implements the .cprestore assembler directive (which can only be used effectively in PIC mode). By making this information available to the MipsAsmParser, we will know when to insert the instructions mandated by the .cprestore assembler directive and we will be able to give some useful warnings when we encounter a potential misuse of this directive. Patch by Toma Tabacu Reviewers: dsanders, seanbruno Subscribers: brooks, seanbruno, rafael, llvm-commits Differential Revision: http://reviews.llvm.org/D5626 llvm-svn: 245291	2015-08-18 12:33:54 +00:00
Michael Kruse	c1c9f8a0d5	[Support] On Windows, generate PDF files for graphs and open with associated viewer Summary: Windows systems rarely have good PostScript viewers installed, but PDF viewers are common. So for viewing graphs, generate PDF files and open with the associated PDF viewer using cmd.exe's start command. Reviewers: Bigcheese, aaron.ballman Subscribers: aaron.ballman, JakeVanAdrighem, dwiberg, llvm-commits Differential Revision: http://reviews.llvm.org/D11877 llvm-svn: 245290	2015-08-18 12:17:37 +00:00
Michael Kruse	c0a8414c1c	[Support] Always wait for GraphViz before opening the viewer Summary: When calling DisplayGraph and a PS viewer is chosen, two programs are executed: The GraphViz generator and the PostScript viewer. Always wait for the generator to finish to ensure that the .ps file is written before opening the viewer for that file. DisplayGraph's wait parameter refers to whether to wait until the user closes the viewer. This happened on Windows and if none of the options to open the .dot file directly applies, also on Linux. Reviewers: Bigcheese, chandlerc, aaron.ballman Subscribers: dwiberg, aaron.ballman, llvm-commits Differential Revision: http://reviews.llvm.org/D11876 llvm-svn: 245289	2015-08-18 12:13:57 +00:00
Daniel Sanders	f1ae367a99	[mips] Correct -Woverflow warning in r245208 without changing signedness of the constant. This was supposed to have been committed as part of r245208 llvm-svn: 245285	2015-08-18 09:55:57 +00:00
Justin Bogner	9f00ebaeda	Revert "Constant propagation after hiting llvm.assume" This was also failing bootstrap: http://lab.llvm.org:8080/green/job/clang-stage2-configure-Rlto_build This reverts r245265. llvm-svn: 245269	2015-08-18 07:00:34 +00:00
Piotr Padlewski	94ca3783b8	Constant propagation after hiting llvm.assume After hitting @llvm.assume(X) we can: - propagate equality that X == true - if X is icmp/fcmp (with eq operation), and one of the operands is constant, we can replace all variables with constants in the same BasicBlock http://reviews.llvm.org/D11918 llvm-svn: 245265	2015-08-18 03:55:30 +00:00
Dan Gohman	ab48abeafa	[WebAssembly] Don't default to ELF in the triple. WebAssembly doesn't yet have a specified binary format, and it may not end up being ELF, so we don't want the Triple class defaulting to ELF for it at this time. llvm-svn: 245254	2015-08-17 22:37:56 +00:00
Guozhi Wei	f66d384443	Align SP adjustment in function getSPAdjust This commit adds a new function TargetFrameLowering::alignSPAdjust and calls it from TargetInstrInfo::getSPAdjust. It fixes PR24142. llvm-svn: 245253	2015-08-17 22:36:27 +00:00
Dan Gohman	4e2d799cab	[WebAssembly] Make getArchTypePrefix return "wasm". The arch prefix string isn't currently being used for anything on WebAssembly, but if it were to be used, it makes sense to use the same arch prefix string for wasm32 and wasm64. llvm-svn: 245252	2015-08-17 22:35:40 +00:00
Alex Lorenz	a56ba6a6dd	MIR Serialization: Serialize the local offsets for the stack objects. llvm-svn: 245249	2015-08-17 22:17:42 +00:00
Alex Lorenz	eb62568625	MIR Serialization: Serialize the memory operand's range metadata node. llvm-svn: 245247	2015-08-17 22:09:52 +00:00
Alex Lorenz	03e940d1f8	MIR Serialization: Serialize the memory operand's noalias metadata node. llvm-svn: 245246	2015-08-17 22:08:02 +00:00
Alex Lorenz	a16f624dc3	MIR Serialization: Serialize the memory operand's alias scope metadata node. llvm-svn: 245245	2015-08-17 22:06:40 +00:00
Alex Lorenz	a617c9162d	MIR Serialization: Serialize the memory operand's TBAA metadata node. llvm-svn: 245244	2015-08-17 22:05:15 +00:00
David Majnemer	83f4bb23c4	[WinEHPrepare] Replace unreasonable funclet terminators with unreachable It is possible to be in a situation where more than one funclet token is a valid SSA value. If we see a terminator which exits a funclet which doesn't use the funclet's token, replace it with unreachable. Differential Revision: http://reviews.llvm.org/D12074 llvm-svn: 245238	2015-08-17 20:56:39 +00:00
Douglas Katzman	685a7d1a70	[SPARC]: recognize '.' as the start of an assembler expression. llvm-svn: 245232	2015-08-17 19:55:01 +00:00
James Molloy	974838f294	[ARM] Fix crash when targetting CPU without NEON We emulate a scalar vmin/vmax with NEON instructions as they don't exist in the VFP ISA. So only mark these as legal when NEON is available. Found here: https://code.google.com/p/chromium/issues/detail?id=521671 llvm-svn: 245231	2015-08-17 19:37:12 +00:00
Igor Laevsky	06044f97d2	[ScalarEvolutionExpander] Reuse findExistingExpansion during expansion cost calculation for division Primary purpose of this change is to reuse existing code inside findExistingExpansion. However it introduces a very slight semantic change - findExistingExpansion now looks into exiting blocks instead of loop latches. Originally the heuristic was based on the fact that we want to look at the loop exit conditions. And since all exiting latches will be listed in the ExitingBlocks, the heuristic stays roughly the same. Differential Revision: http://reviews.llvm.org/D12008 llvm-svn: 245227	2015-08-17 16:37:04 +00:00
Silviu Baranga	b322aa6f53	[CostModel][AArch64] Increase cost of vector insert element and add missing cast costs Summary: Increase the estimated costs for insert/extract element operations on AArch64. This is motivated by results from benchmarking interleaved accesses. Add missing costs for zext/sext/trunc instructions and some integer to floating point conversions. These costs were previously calculated by scalarizing these operations and were affected by the cost increase of the insert/extract element operations. Reviewers: rengolin Subscribers: mcrosier, aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D11939 llvm-svn: 245226	2015-08-17 16:05:09 +00:00
Silviu Baranga	d5ac26937c	[CostModel][ARM] Increase cost of insert/extract operations Summary: This change limits the minimum cost of an insert/extract element operation to 2 in cases where this would result in mixing of NEON and VFP code. Reviewers: rengolin Subscribers: mssimpso, aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D12030 llvm-svn: 245225	2015-08-17 15:57:05 +00:00
Igor Laevsky	b20bda77e7	[BasicAliasAnalysis] Do not check ModRef table for intrinsics All possible ModRef behaviours can be completely represented using existing LLVM IR attributes. Differential Revision: http://reviews.llvm.org/D12033 llvm-svn: 245224	2015-08-17 15:56:56 +00:00
Artur Pilipenko	34d8ba84c8	Take alignment into account in isSafeToSpeculativelyExecute and isSafeToLoadUnconditionally. Reviewed By: hfinkel, sanjoy, MatzeB Differential Revision: http://reviews.llvm.org/D9791 llvm-svn: 245223	2015-08-17 15:54:26 +00:00
Benjamin Kramer	1ee99a8b46	Extend MCAsmLexer so that it can peek forward several tokens This commit adds a virtual `peekTokens()` function to `MCAsmLexer` which can peek forward an arbitrary number of tokens. It also makes the `peekTok()` method call the `peekTokens()` method, but only requesting one token. The idea is to better support targets which have more ambiguous assembly syntaxes. Patch by Dylan McKay! llvm-svn: 245221	2015-08-17 14:35:25 +00:00
Aaron Ballman	aa3d810b5f	Correcting a -Woverflow warning where 0xFFFF was overflowing an implicit constant conversion. llvm-svn: 245220	2015-08-17 14:25:57 +00:00
Joseph Tremoulet	7031c9fc2e	[WinEHPrepare] Fix catchret successor phi demotion Summary: When demoting an SSA value that has a use on a phi and one of the phi's predecessors terminates with catchret, the edge needs to be split and the load inserted in the new block, else we'll still have a cross-funclet SSA value. Add a test for this, and for the similar case where a def to be spilled is on an invoke and a critical edge, which was already implemented but missing a test. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12065 llvm-svn: 245218	2015-08-17 13:51:37 +00:00
Tobias Grosser	58fdd88751	Revert "Disable targetdatalayoutcheck" I committed by accident a local hack that should not have made it upstream. Sorry for the noise. llvm-svn: 245212	2015-08-17 10:58:03 +00:00
Tobias Grosser	607b8b26e9	Disable targetdatalayoutcheck llvm-svn: 245210	2015-08-17 10:56:35 +00:00
Daniel Sanders	a39ef1c68f	[mips] [IAS] Add support for the DLA pseudo-instruction and fix problems with DLI Summary: It is the same as LA, except that it can also load 64-bit addresses and it only works on 64-bit MIPS architectures. Reviewers: tomatabacu, seanbruno, vkalintiris Subscribers: brooks, seanbruno, emaste, llvm-commits Differential Revision: http://reviews.llvm.org/D9524 llvm-svn: 245208	2015-08-17 10:11:55 +00:00
Michael Kuperstein	adc4e9c414	[GMR] isNonEscapingGlobalNoAlias() should look through Bitcasts/GEPs when looking at loads. This fixes yet another case from PR24288. Differential Revision: http://reviews.llvm.org/D12064 llvm-svn: 245207	2015-08-17 10:06:08 +00:00
James Molloy	88edc8243d	Remove hand-rolled matching for fmin and fmax. SDAGBuilder now does this all for us. llvm-svn: 245198	2015-08-17 07:13:20 +00:00
James Molloy	c617be559a	Rip out hand-rolled matching code for VMIN, VMAX, VMINNM and VMAXNM This is no longer needed - SDAGBuilder will do this for us. llvm-svn: 245197	2015-08-17 07:13:15 +00:00
James Molloy	ef183397b1	Generate FMINNAN/FMINNUM/FMAXNAN/FMAXNUM from SDAGBuilder. These only get generated if the target supports them. If one of the variants is not legal and the other is, and it is safe to do so, the other variant will be emitted. For example on AArch32 (V8), we have scalar fminnm but not fmin. Fix up a couple of tests while we're here - one now produces better code, and the other was just plain wrong to start with. llvm-svn: 245196	2015-08-17 07:13:10 +00:00
Karthik Bhat	3af28945b9	Fix PR24469 resulting from r245025 and re-enable dead store elimination across basicblocks. PR24469 resulted because DeleteDeadInstruction in handleNonLocalStoreDeletion was deleting the next basic block iterator. Fixed the same by resetting the basic block iterator post call to DeleteDeadInstruction. llvm-svn: 245195	2015-08-17 05:51:39 +00:00
David Majnemer	8ed559ad22	Revert "[InstCombinePHI] Partial simplification of identity operations." This reverts commit r244887, it caused PR24470. llvm-svn: 245194	2015-08-17 03:11:26 +00:00
Chandler Carruth	2f1fd1658f	[PM] Port ScalarEvolution to the new pass manager. This change makes ScalarEvolution a stand-alone object and just produces one from a pass as needed. Making this work well requires making the object movable, using references instead of overwritten pointers in a number of places, and other refactorings. I've also wired it up to the new pass manager and added a RUN line to a test to exercise it under the new pass manager. This includes basic printing support much like with other analyses. But there is a big and somewhat scary change here. Prior to this patch ScalarEvolution was never actually invalidated!!! Re-running the pass just re-wired up the various other analyses and didn't remove any of the existing entries in the SCEV caches or clear out anything at all. This might seem OK as everything in SCEV that can do so uses ValueHandles to track updates to the values that serve as SCEV keys. However, this still means that as we ran SCEV over each function in the module, we kept accumulating more and more SCEVs into the cache. At the end, we would have a SCEV cache with every value that we ever needed a SCEV for in the entire module!!! Yowzers. The releaseMemory routine would dump all of this, but that isn't really called during normal runs of the pipeline as far as I can see. To make matters worse, there is actually a key that we don't update with value handles -- there is a map keyed off of Loops. Because LoopInfo does release its memory from run to run, it is entirely possible to run SCEV over one function, then over another function, and then lookup a Loop* from the second function but find an entry inserted for the first function! Ouch. To make matters still worse, there are plenty of updates that don't trip a value handle. It seems incredibly unlikely that today GVN or another pass that invalidates SCEV can update values in just such a way that a subsequent run of SCEV will incorrectly find lookups in a cache, but it is theoretically possible and would be a nightmare to debug. With this refactoring, I've fixed all this by actually destroying and recreating the ScalarEvolution object from run to run. Technically, this could increase the amount of malloc traffic we see, but then again it is also technically correct. ;] I don't actually think we're suffering from tons of malloc traffic from SCEV because if we were, the fact that we never clear the memory would seem more likely to have come up as an actual problem before now. So, I've made the simple fix here. If in fact there are serious issues with too much allocation and deallocation, I can work on a clever fix that preserves the allocations (while clearing the data) between each run, but I'd prefer to do that kind of optimization with a test case / benchmark that shows why we need such cleverness (and that can test that we actually make it faster). It's possible that this will make some things faster by making the SCEV caches have higher locality (due to being significantly smaller) so until there is a clear benchmark, I think the simple change is best. Differential Revision: http://reviews.llvm.org/D12063 llvm-svn: 245193	2015-08-17 02:08:17 +00:00
Chandler Carruth	b596ba2376	[ADT] Teach FoldingSet to be movable. This is a very minimal move support - it leaves the moved-from object in a zombie state that is only valid for destruction and move assignment. This seems fine to me, and leaving it in the default constructed state would require adding more state to the object and potentially allocating memory (!!!) and so seems like a Bad Idea. llvm-svn: 245192	2015-08-16 23:17:27 +00:00
Benjamin Kramer	bb70d751de	[SimplifyLibCalls] Drop default template args. No functional change. llvm-svn: 245189	2015-08-16 21:16:37 +00:00
Benjamin Kramer	dc1d1cbd82	[IR] Simplify code. No functionality change. llvm-svn: 245188	2015-08-16 21:16:26 +00:00
Sanjay Patel	57fd1dc5db	transform fmin/fmax calls when possible (PR24314) If we can ignore NaNs, fmin/fmax libcalls can become compare and select (this is what we turn std::min / std::max into). This IR should then be optimized in the backend to whatever is best for any given target. Eg, x86 can use minss/maxss instructions. This should solve PR24314: https://llvm.org/bugs/show_bug.cgi?id=24314 Differential Revision: http://reviews.llvm.org/D11866 llvm-svn: 245187	2015-08-16 20:18:19 +00:00
Sanjoy Das	94c4aecf83	[LSR][NFC] Don’t duplicate entity name at the beginning of the comment. llvm-svn: 245183	2015-08-16 18:22:46 +00:00
Sanjoy Das	302bfd04b5	[LSR][NFC] Use camelCase for method names in Formula and RegUseTracker. llvm-svn: 245182	2015-08-16 18:22:43 +00:00
Sanjay Patel	3ab4a73bac	use SDValue bool operator; NFCI llvm-svn: 245181	2015-08-16 17:54:28 +00:00
Yaron Keren	178c465223	Add missing include guard. llvm-svn: 245173	2015-08-16 07:55:08 +00:00
David Majnemer	e04443baff	Revert "Add support for cross block dse. This patch enables dead stroe elimination across basicblocks." This reverts commit r245025, it caused PR24469. llvm-svn: 245172	2015-08-16 07:11:59 +00:00
David Majnemer	dfa3b09541	[InstCombine] Replace an and+icmp with a trunc+icmp Bitwise arithmetic can obscure a simple sign-test. Replacing the mask with a truncate is preferable if the type is legal because it permits us to rephrase the comparison more explicitly. llvm-svn: 245171	2015-08-16 07:09:17 +00:00
Chandler Carruth	5efd530cbc	Revert r244127: [PM] Remove a failed attempt to port the CallGraph analysis ... It turns out that we do need the old CallGraph ported to the new pass manager. There are times where this model of a call graph is really superior to the one provided by the LazyCallGraph. For example, GlobalsModRef very specifically needs the model provided by CallGraph. While here, I've tried to make the move semantics actually work. =] llvm-svn: 245170	2015-08-16 06:35:19 +00:00
David Majnemer	1a59e49f3c	[X86] Widen the 'AND' mask if doing so shrinks the encoding size We can set additional bits in a mask given that we know the other operand of an AND already has some bits set to zero. This can be more efficient if doing so allows us to use an instruction which implicitly sign extends the immediate. This fixes PR24085. Differential Revision: http://reviews.llvm.org/D11289 llvm-svn: 245169	2015-08-16 04:52:11 +00:00
NAKAMURA Takumi	5196275eea	MergeFunc: Quick fix for r245140, Ignore second, aka Function*, in sorting. Don't assume second would be ordered in the module. llvm-svn: 245168	2015-08-16 02:41:23 +00:00
Yaron Keren	dfb655fe17	Try to appease VS 2015 warnings from http://reviews.llvm.org/D11890 ByteSize and BitSize should not be size_t but unsigned, considering 1) They are at most 2^16 and 2^19, respectively. 2) BitSize is an argument to Type::getIntNTy which takes unsigned. Also, use the correct utostr instead of itostr and cache the string result. Thanks to James Touton for reporting this! llvm-svn: 245167	2015-08-15 19:06:14 +00:00
Sanjay Patel	40d4eb40f6	[x86] enable machine combiner reassociations for scalar single-precision minimums llvm-svn: 245166	2015-08-15 17:01:54 +00:00
Yaron Keren	8b2a031cff	Silence VS2015 warning. Patch by James Touton! http://reviews.llvm.org/D11890 llvm-svn: 245161	2015-08-15 14:54:43 +00:00
Simon Pilgrim	0750c84623	[DAGCombiner] Attempt to mask vectors before zero extension instead of after. For cases where we TRUNCATE and then ZERO_EXTEND to a larger size (often from vector legalization), see if we can mask the source data and then ZERO_EXTEND (instead of after an ANY_EXTEND). This can help avoid having to generate a larger mask, and possibly applying it to several sub-vectors. (zext (truncate x)) -> (zext (and(x, m))) Includes a minor patch to SystemZ to better recognise 8/16-bit zero extension patterns from RISBG bit-extraction code. This is the first of a number of minor patches to help improve the conversion of byte masks to clear mask shuffles. Differential Revision: http://reviews.llvm.org/D11764 llvm-svn: 245160	2015-08-15 13:27:30 +00:00
Chandler Carruth	e8824e3026	[PM/AA] Delete the LibCallAliasAnalysis and all the associated infrastructure. This AA was never used in tree. Its infrastructure also completely overlaps that of TargetLibraryInfo which is used heavily by BasicAA to achieve similar goals to those stated for this analysis. As has come up in several discussions, the use case here is still really important, but this code isn't helping move toward that use case. Any progress on better supporting rich AA information for runtime library environments would likely be better off starting from scratch or starting from TargetLibraryInfo than from this base. Differential Revision: http://reviews.llvm.org/D12028 llvm-svn: 245155	2015-08-15 09:22:21 +00:00
Matt Arsenault	588732bd6e	AMDGPU/SI: Only look at live out SGPR defs When trying to fix SGPR live ranges, skip defs that are killed in the same block as the def. I don't think we need to worry about these cases as long as the live ranges of the SGPRs in dominating blocks are correct. This reduces the number of elements the second loop over the function needs to look at, and makes it generally easier to understand. The second loop also only considers if the live range is live in to a block, which logically means it must have been live out from another. llvm-svn: 245150	2015-08-15 02:58:49 +00:00
David Majnemer	0bc0eef71c	[IR] Give catchret an optional 'return value' operand Some personality routines require funclet exit points to be clearly marked, this is done by producing a token at the funclet pad and consuming it at the corresponding ret instruction. CleanupReturnInst already had a spot for this operand but CatchReturnInst did not. Other personality routines don't need to use this which is why it has been made optional. llvm-svn: 245149	2015-08-15 02:46:08 +00:00
James Y Knight	5567bafe93	Remove redundant TargetFrameLowering::getFrameIndexOffset virtual function. This was the same as getFrameIndexReference, but without the FrameReg output. Differential Revision: http://reviews.llvm.org/D12042 llvm-svn: 245148	2015-08-15 02:32:35 +00:00
JF Bastien	d4698e1bac	[WebAssembly] Add Relooper This is just an initial checkin of an implementation of the Relooper algorithm, in preparation for WebAssembly codegen to utilize. It doesn't do anything yet by itself. The Relooper algorithm takes an arbitrary control flow graph and generates structured control flow from that, utilizing a helper variable when necessary to handle irreducibility. The WebAssembly backend will be able to use this in order to generate an AST for its binary format. Author: azakai Reviewers: jfb, sunfish Subscribers: jevinskie, arsenm, jroelofs, llvm-commits Differential revision: http://reviews.llvm.org/D11691 llvm-svn: 245142	2015-08-15 01:23:28 +00:00
JF Bastien	5e4303dc14	Accelerate MergeFunctions with hashing This patch makes the Merge Functions pass faster by calculating and comparing a hash value which captures the essential structure of a function before performing a full function comparison. The hash is calculated by hashing the function signature, then walking the basic blocks of the function in the same order as the main comparison function. The opcode of each instruction is hashed in sequence, which means that different functions according to the existing total order cannot have the same hash, as the comparison requires the opcodes of the two functions to be the same order. The hash function is a static member of the FunctionComparator class because it is tightly coupled to the exact comparison function used. For example, functions which are equivalent modulo a single variant callsite might be merged by a more aggressive MergeFunctions, and the hash function would need to be insensitive to these differences in order to exploit this. The hashing function uses a utility class which accumulates the values into an internal state using a standard bit-mixing function. Note that this is a different interface than a regular hashing routine, because the values to be hashed are scattered amongst the properties of a llvm::Function, not linear in memory. This scheme is fast because only one word of state needs to be kept, and the mixing function is a few instructions. The main runOnModule function first computes the hash of each function, and only further processes functions which do not have a unique function hash. The hash is also used to order the sorted function set. If the hashes differ, their values are used to order the functions, otherwise the full comparison is done. Both of these are helpful in speeding up MergeFunctions. Together they result in speedups of 9% for mysqld (a mostly C application with little redundancy), 46% for libxul in Firefox, and 117% for Chromium. (These are all LTO builds.) 
In all three cases, the new speed of MergeFunctions is about half that of the module verifier, making it relatively inexpensive even for large LTO builds with hundreds of thousands of functions. The same functions are merged, so this change is free performance. Author: jrkoenig Reviewers: nlewycky, dschuff, jfb Subscribers: llvm-commits, aemerson Differential revision: http://reviews.llvm.org/D11923 llvm-svn: 245140	2015-08-15 01:18:18 +00:00
Matt Arsenault	427a0fd22e	LoopStrengthReduce: Try to pass address space to isLegalAddressingMode This seems to only work some of the time. In some situations, this seems to use a nonsensical type and isn't actually aware of the memory being accessed. e.g. if branch condition is an icmp of a pointer, it checks the addressing mode of i1. llvm-svn: 245137	2015-08-15 00:53:06 +00:00
Matt Arsenault	297ae311ce	AMDGPU/SI: Fix printing useless info with amdhsa The comments at the bottom would all report 0 if amdhsa was used. llvm-svn: 245135	2015-08-15 00:12:39 +00:00
Matt Arsenault	0259a7aa41	AMDGPU/SI: Update LiveVariables This is simple but won't work if/when this pass is moved to be post-SSA. llvm-svn: 245134	2015-08-15 00:12:37 +00:00
Matt Arsenault	670ba46efe	AMDGPU/SI: Update LiveIntervals during SIFixSGPRLiveRanges Does not mark SlotIndexes as reserved, although I think that might be OK. LiveVariables still need to be handled. llvm-svn: 245133	2015-08-15 00:12:35 +00:00
Matt Arsenault	b75233235c	AMDGPU: Remove unnecessary assert These shouldn't ever be null. The number of successors was already asserted to be 2. llvm-svn: 245132	2015-08-15 00:12:32 +00:00
Matt Arsenault	4275c29a02	AMDGPU/SI: Make comments more precise. True branch instructions do behave as expected with liveness. Avoid the phrasing "branch decision is based on a value in an SGPR" because this could be misleading. A VALU compare instruction's result is still based on an SGPR, even though that condition may be divergent. llvm-svn: 245131	2015-08-15 00:12:30 +00:00
Nick Lewycky	8075fd22b9	Fix a crash where a utility function wasn't aware of fcmp vectors and created a value with the wrong type. Fixes PR24458! llvm-svn: 245119	2015-08-14 22:46:49 +00:00
Bjarke Hammersholt Roune	9791ed4705	[SCEV] Apply NSW and NUW flags via poison value analysis for sub, mul and shl Summary: http://reviews.llvm.org/D11212 made Scalar Evolution able to propagate NSW and NUW flags from instructions to SCEVs for add instructions. This patch expands that to sub, mul and shl instructions. This change makes LSR able to generate pointer induction variables for loops like these, where the index is 32 bit and the pointer is 64 bit: for (int i = 0; i < numIterations; ++i) sum += ptr[i - offset]; for (int i = 0; i < numIterations; ++i) sum += ptr[i * stride]; for (int i = 0; i < numIterations; ++i) sum += ptr[3 * (i << 7)]; Reviewers: atrick, sanjoy Subscribers: sanjoy, majnemer, hfinkel, llvm-commits, meheff, jingyue, eliben Differential Revision: http://reviews.llvm.org/D11860 llvm-svn: 245118	2015-08-14 22:45:26 +00:00
Pat Gavlin	b399095c3f	Add a target environment for CoreCLR. Although targeting CoreCLR is similar to targeting MSVC, there are certain important differences that the backend must be aware of (e.g. differences in stack probes, EH, and library calls). Differential Revision: http://reviews.llvm.org/D11012 llvm-svn: 245115	2015-08-14 22:41:43 +00:00
Ahmed Bougacha	cd35787217	[AArch64] Fix FMLS scalar-indexed-from-2s-after-neg patterns. We canonicalize V64 vectors to V128 through insert_subvector: the other FMLA/FMLS/FMUL/FMULX patterns match that already, but this one doesn't, so we'd fail to match fmls and generate fneg+fmla instead. The vector equivalents are already tested and functional. llvm-svn: 245107	2015-08-14 22:06:05 +00:00
Evgeniy Stepanov	24ac55d884	[msan] Fix handling of musttail calls. MSan instrumentation for return values of musttail calls is not allowed by the IR constraints, and not needed at the same time. llvm-svn: 245106	2015-08-14 22:03:50 +00:00
Alex Lorenz	577d271a75	MIR Serialization: Serialize the '.cfi_same_value' CFI directive. llvm-svn: 245103	2015-08-14 21:55:58 +00:00
Alex Lorenz	c3ba7508f6	MIR Serialization: Serialize the external symbol call entry pseudo source values. llvm-svn: 245098	2015-08-14 21:14:50 +00:00
Alex Lorenz	50b826fb75	MIR Serialization: Serialize the global value call entry pseudo source values. llvm-svn: 245097	2015-08-14 21:08:30 +00:00
Tom Stellard	bef1094ee7	AMDGPU/SI: Add missing spill class The compiler was failing to spill for some shaders. Patch By: Axel Davy llvm-svn: 245087	2015-08-14 19:46:05 +00:00
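
The hash-before-compare scheme described in the "Accelerate MergeFunctions with hashing" entry above (r245140) can be illustrated with a small stand-alone sketch. This is not the code from that patch: ToyFunction, HashAccumulator, structuralHash, fullyEqual and findDuplicates are hypothetical names invented here, the function model (a signature plus opcodes per basic block) is a simplification of llvm::Function, and the FNV-1a-style mixing step is an assumption standing in for whatever bit-mixing function the real pass uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for llvm::Function: a signature and per-block opcodes.
struct ToyFunction {
  std::string Name;
  std::vector<int> Signature;           // illustrative type IDs
  std::vector<std::vector<int>> Blocks; // opcodes, in comparison order
};

// Accumulates values into a single word of state. The mixing step is
// FNV-1a-like and is an assumption, not the function used by the patch.
class HashAccumulator {
  std::uint64_t State = 14695981039346656037ull;

public:
  void add(std::uint64_t V) {
    State ^= V;
    State *= 1099511628211ull;
  }
  std::uint64_t get() const { return State; }
};

// Hash the signature, then walk the blocks in the same order that the
// full comparison below uses, so equal functions always hash equally.
std::uint64_t structuralHash(const ToyFunction &F) {
  HashAccumulator H;
  H.add(F.Signature.size());
  for (int T : F.Signature)
    H.add(static_cast<std::uint64_t>(T));
  for (const auto &BB : F.Blocks) {
    H.add(BB.size());
    for (int Op : BB)
      H.add(static_cast<std::uint64_t>(Op));
  }
  return H.get();
}

// The expensive check; names are deliberately ignored.
bool fullyEqual(const ToyFunction &A, const ToyFunction &B) {
  return A.Signature == B.Signature && A.Blocks == B.Blocks;
}

// Returns (duplicate index, canonical index) pairs. Functions whose hash
// is unique are never passed to fullyEqual.
std::vector<std::pair<std::size_t, std::size_t>>
findDuplicates(const std::vector<ToyFunction> &Fns) {
  std::map<std::uint64_t, std::vector<std::size_t>> Buckets;
  for (std::size_t I = 0; I < Fns.size(); ++I)
    Buckets[structuralHash(Fns[I])].push_back(I);

  std::vector<std::pair<std::size_t, std::size_t>> Result;
  for (const auto &KV : Buckets) {
    const std::vector<std::size_t> &Ids = KV.second;
    if (Ids.size() < 2)
      continue; // unique hash: cannot match anything else
    for (std::size_t J = 1; J < Ids.size(); ++J)
      for (std::size_t K = 0; K < J; ++K)
        if (fullyEqual(Fns[Ids[K]], Fns[Ids[J]])) {
          Result.push_back({Ids[J], Ids[K]});
          break;
        }
  }
  return Result;
}
```

Calling findDuplicates on a vector of ToyFunction values yields the index pairs that a merging step could act on. The hash is only a filter: two functions in the same bucket still go through fullyEqual, so a hash collision costs time but never causes a wrong merge.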

85123 Commits