llvm-project

Commit Graph

Author	SHA1	Message	Date
Daniel Berlin	3d512a2dc2	MSSA: Factor out phi node placement llvm-svn: 279462	2016-08-22 19:14:30 +00:00
Daniel Berlin	868381bff6	MSSA: Only rename accesses whose defining access is nullptr llvm-svn: 279461	2016-08-22 19:14:16 +00:00
James Molloy	5bf2114265	[SimplifyCFG] Rewrite SinkThenElseCodeToEnd [Recommitting now an unrelated assertion in SROA is sorted out] The new version has several advantages: 1) IMSHO it's more readable and neater 2) It handles loads and stores properly 3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch. With this change we can now finally sink load-modify-store idioms such as: if (a) return b += 3; else return b += 4; => %z = load i32, i32* %y %.sink = select i1 %a, i32 5, i32 7 %b = add i32 %z, %.sink store i32 %b, i32* %y ret i32 %b When this works for switches it'll be even more powerful. Round 4. This time we should handle all instructions correctly, and not replace any operands that need to be constant with variables. This was really hard to determine safely, so the helper function should be put into the Instruction API. I'll do that as a followup. llvm-svn: 279460	2016-08-22 19:07:15 +00:00
James Molloy	0fee97f8ba	[SROA] Remove incorrect assertion Confirmed with aprantl, this assertion is incorrect - code can get here (for example 80-bit FP types) and if it does it's benign. This is exposed by a completely unrelated patch of mine, so stop the compiler falling over. Original differential: http://reviews.llvm.org/D16187 aprantl's advice to remove assertion: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160815/382129.html llvm-svn: 279454	2016-08-22 18:49:42 +00:00
Tim Shen	a5cc25e50f	[SSP] Do not set __guard_local to hidden for OpenBSD SSP __guard_local is defined as long on OpenBSD. If the source file contains a definition of __guard_local, it mismatches with the int8 pointer type used in LLVM. In that case, Module::getOrInsertGlobal() returns a cast operation instead of a GlobalVariable. Trying to set the visibility on the cast operation leads to random segfaults (seen when compiling the OpenBSD kernel, which also runs with stack protection). In the kernel, the hidden attribute does not matter. For userspace code, __guard_local is defined as hidden in the startup code. If a program re-defines __guard_local, the definition from the startup code will either win or the linker complains about multiple definitions (depending on whether the re-defined __guard_local is placed in the common segment or not). It also matches what gcc on OpenBSD does. Thanks Stefan Kempf <sisnkemp@gmail.com> for the patch! Differential Revision: http://reviews.llvm.org/D23674 llvm-svn: 279449	2016-08-22 18:26:27 +00:00
Jun Bum Lim	ec8b8cc595	[InstCombine] Allow sinking from unique predecessor with multiple edges Summary: We can allow sinking if the single user block has only one unique predecessor, regardless of the number of edges. Note that a switch statement with multiple cases can have the same destination. Reviewers: mcrosier, majnemer, spatel, reames Subscribers: reames, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D23722 llvm-svn: 279448	2016-08-22 18:21:56 +00:00
James Molloy	475f4a763f	Revert "[SimplifyCFG] Rewrite SinkThenElseCodeToEnd" This reverts commit r279443. It caused buildbot failures. llvm-svn: 279447	2016-08-22 18:13:12 +00:00
James Molloy	353052698a	[SimplifyCFG] Rewrite SinkThenElseCodeToEnd The new version has several advantages: 1) IMSHO it's more readable and neater 2) It handles loads and stores properly 3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch. With this change we can now finally sink load-modify-store idioms such as: if (a) return b += 3; else return b += 4; => %z = load i32, i32* %y %.sink = select i1 %a, i32 5, i32 7 %b = add i32 %z, %.sink store i32 %b, i32* %y ret i32 %b When this works for switches it'll be even more powerful. Round 4. This time we should handle all instructions correctly, and not replace any operands that need to be constant with variables. This was really hard to determine safely, so the helper function should be put into the Instruction API. I'll do that as a followup. llvm-svn: 279443	2016-08-22 17:40:23 +00:00
Simon Pilgrim	c8ad5c069c	[X86][AVX] Don't use SubVectorBroadcast if there are additional users of the chain (PR29088) We could improve on this by making X86SubVBroadcast a full memory intrinsic similar to X86vzload llvm-svn: 279441	2016-08-22 16:47:55 +00:00
Simon Atanasyan	eb9ed61021	[mips][ias] Support .dtprel[d]word and .tprel[d]word directives Assembler directives .dtprelword, .dtpreldword, .tprelword, and .tpreldword generates relocations R_MIPS_TLS_DTPREL32, R_MIPS_TLS_DTPREL64, R_MIPS_TLS_TPREL32, and R_MIPS_TLS_TPREL64 respectively. The main motivation for this patch is to be able to write test cases for checking correctness of the LLD linker's behaviour. Differential Revision: https://reviews.llvm.org/D23669 llvm-svn: 279439	2016-08-22 16:18:42 +00:00
Mehdi Amini	f8c2f08cb3	[LTO] Constify the Module Hook function (NFC) It used to be non-const for the sole purpose of custom handling of common symbols. This is now moved into the regular LTO handling, so we can constify the callback. llvm-svn: 279438	2016-08-22 16:17:40 +00:00
Krzysztof Parzyszek	673b347e5a	Reset isUndef when removing subreg from a def operand llvm-svn: 279437	2016-08-22 14:50:12 +00:00
Simon Pilgrim	13fa33012b	[X86] Only accept SM_SentinelUndef (-1) as an undefined shuffle mask in range As discussed on D23027 we should be trying to be more strict on what is an undefined mask value. llvm-svn: 279435	2016-08-22 13:18:56 +00:00
Artur Pilipenko	bc76ecada0	Revert -r278267 [ValueTracking] An improvement to IR ValueTracking on Non-negative Integers This change causes a performance regression on MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt from LNT and some other benchmarks. See https://reviews.llvm.org/D18777 for details. llvm-svn: 279433	2016-08-22 13:14:07 +00:00
Artur Pilipenko	b78ad9d41f	Revert -r278269 [IndVarSimplify] Eliminate zext of a signed IV when the IV is known to be non-negative This change needs to be reverted in order to revert -r278267, which causes a performance regression on MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt from LNT and some other benchmarks. See comments on https://reviews.llvm.org/D18777 for details. llvm-svn: 279432	2016-08-22 13:12:07 +00:00
Simon Pilgrim	2279e59573	[X86][SSE] Avoid specifying unused arguments in SHUFPD lowering As discussed on PR26491, we are missing the opportunity to make use of the smaller MOVHLPS instruction because we set both arguments of a SHUFPD when using it to lower a single input shuffle. This patch sets the lowered argument to UNDEF if that shuffle element is undefined. This in turn makes it easier for target shuffle combining to decode UNDEF shuffle elements, allowing combines to MOVHLPS to occur. A fix to match against MOVHPD stores was necessary as well. This builds on the improved MOVLHPS/MOVHLPS lowering and memory folding support added in D16956 Adding similar support for SHUFPS will have to wait until we have better support for target combining of binary shuffles. Differential Revision: https://reviews.llvm.org/D23027 llvm-svn: 279430	2016-08-22 12:56:54 +00:00
Hrvoje Varga	f0ed16eae5	[mips][microMIPS] Implement BLTZC, BLEZC, BGEZC and BGTZC instructions, fix disassembly and add operand checking to existing B<cond>C implementations Differential Revision: https://reviews.llvm.org/D22667 llvm-svn: 279429	2016-08-22 12:17:59 +00:00
Davide Italiano	80d379f228	[MC] Remove guard(s). NFCI. All the methods are already marked with LLVM_DUMP_METHOD. llvm-svn: 279428	2016-08-22 11:55:22 +00:00
Craig Topper	5f8419da34	[X86] Create a new instruction format to handle 4VOp3 encoding. This saves one bit in TSFlags and simplifies MRMSrcMem/MRMSrcReg format handling. llvm-svn: 279424	2016-08-22 07:38:50 +00:00
Craig Topper	9b20fece81	[X86] Create a new instruction format to handle MemOp4 encoding. This saves one bit in TSFlags and simplifies MRMSrcMem/MRMSrcReg format handling. llvm-svn: 279423	2016-08-22 07:38:45 +00:00
Craig Topper	61b62e56b7	[X86] Space out the encodings of X86 instruction formats. I plan to add some new encodings in future commits and this will reduce the size of those commits. NFC This tries to keep all the ModRM memory and register forms in their own regions of the encodings. Hoping to make it simple on some of the switch statements that operate on these encodings. llvm-svn: 279422	2016-08-22 07:38:41 +00:00
Mehdi Amini	dc4c8cf9ac	[LTO] Handles commons in monolithic LTO The gold-plugin was doing this internally, now the API is handling commons correctly based on the given resolution. Differential Revision: https://reviews.llvm.org/D23739 llvm-svn: 279417	2016-08-22 06:25:46 +00:00
Mehdi Amini	d310b47c23	[LTO] Add a "CodeGenOnly" option. Allows the client to skip the optimizer. Summary: Slowly getting on par with libLTO Reviewers: tejohnson Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23615 llvm-svn: 279416	2016-08-22 06:25:41 +00:00
Vitaly Buka	0672a27bb5	[asan] Use 1 byte aligned stores to poison shadow memory Summary: r279379 introduced crash on arm 32bit bot. I suspect this is alignment issue. Reviewers: eugenis Subscribers: llvm-commits, aemerson Differential Revision: https://reviews.llvm.org/D23762 llvm-svn: 279413	2016-08-22 04:16:14 +00:00
Craig Topper	ca0eda3e6a	[X86] Merge hasVEX_i8ImmReg into the ImmFormat type which had extra unused encodings. This saves one bit in TSFlags. NFC llvm-svn: 279412	2016-08-22 01:37:19 +00:00
Craig Topper	522541231a	[X86] Remove ignoreVEX_L from TSFlags. Only the disassembler needs it and the disassembler doesn't use TSFlags. NFC llvm-svn: 279411	2016-08-22 01:37:16 +00:00
NAKAMURA Takumi	9d0b53129c	Reformat. llvm-svn: 279409	2016-08-22 00:58:47 +00:00
NAKAMURA Takumi	59a20649c6	Untabify. llvm-svn: 279408	2016-08-22 00:58:04 +00:00
Sanjay Patel	643d21a62c	[InstCombine] use m_APInt to allow icmp (shl X, Y), C folds for splat constant vectors, part 4 This concludes the fixes for icmp+shl in this series: https://reviews.llvm.org/rL279339 https://reviews.llvm.org/rL279398 https://reviews.llvm.org/rL279399 llvm-svn: 279401	2016-08-21 17:10:07 +00:00
Sanjay Patel	7ffcde7422	[InstCombine] use m_APInt to allow icmp (shl X, Y), C folds for splat constant vectors, part 3 This is a partial enablement (move the ConstantInt guard down). llvm-svn: 279399	2016-08-21 16:35:34 +00:00
Sanjay Patel	7e09f13fed	[InstCombine] use m_APInt to allow icmp (shl X, Y), C folds for splat constant vectors, part 2 This is a partial enablement (move the ConstantInt guard down). llvm-svn: 279398	2016-08-21 16:28:22 +00:00
Simon Pilgrim	67e7e22462	[X86][AVX] Dropped combineShuffle256 - this can now be performed by EltsFromConsecutiveLoads llvm-svn: 279397	2016-08-21 15:39:45 +00:00
Sanjay Patel	792636603f	[InstCombine] use APInt instead of ConstantInt in isSignBitCheck(); NFCI The callers still have ConstantInt guards, so there is no functional change intended from this change. But relaxing the callers will allow more folds for vector types. llvm-svn: 279396	2016-08-21 15:07:45 +00:00
Guy Blank	9ae797a798	[AVX512][FastISel] Do not use K registers in TEST instructions In some cases, FastIsel was emitting TEST instruction with K reg input, which is illegal. Changed to using KORTEST when dealing with K regs. Differential Revision: https://reviews.llvm.org/D23163 llvm-svn: 279393	2016-08-21 08:02:27 +00:00
Duncan P. N. Exon Smith	8f44c98d04	ARM: Avoid dereferencing end() in ARMFrameLowering::emitEpilogue This fixes the crash from PR29072, where the MachineBasicBlock::iterator wasn't being properly checked against MachineBasicBlock::end() before iterating. This was another bug exposed by the new ilist::iterator::operator*() assertion from r279314. This testcase is poor quality. bugpoint couldn't reduce any further, and I haven't had time to dig into what's going on so I can't invent a better one. I didn't even get good CHECK lines in: this is just a crasher. I'm committing anyway since this is a real crash with an obvious fix, but I'll leave PR29072 open and ask an ARM maintainer to help improve the testcase. llvm-svn: 279391	2016-08-21 00:08:10 +00:00
Vitaly Buka	1f9e135023	[asan] Minimize code size by using __asan_set_shadow_* for large blocks Summary: We can insert function call instead of multiple store operation. Current default is blocks larger than 64 bytes. Changes are hidden behind -asan-experimental-poisoning flag. PR27453 Differential Revision: https://reviews.llvm.org/D23711 llvm-svn: 279383	2016-08-20 20:23:50 +00:00
Simon Pilgrim	02b13d4d3c	Use SDValue::getOpcode() helper instead of via SDValue::getNode() llvm-svn: 279381	2016-08-20 20:04:18 +00:00
Vitaly Buka	3455b9b8bc	[asan] Initialize __asan_set_shadow_* callbacks Summary: Callbacks are not being used yet. PR27453 Reviewers: kcc, eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23634 llvm-svn: 279380	2016-08-20 18:34:39 +00:00
Vitaly Buka	186280daa5	[asan] Optimize store size in FunctionStackPoisoner::poisonRedZones Summary: Reduce store size to avoid leading and trailing zeros. Reviewers: kcc, eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23648 llvm-svn: 279379	2016-08-20 18:34:36 +00:00
Vitaly Buka	5b4f12176c	[asan] Cleanup instrumentation of dynamic allocas Summary: Extract instrumenting dynamic allocas into separate method. Rename asan-instrument-allocas -> asan-instrument-dynamic-allocas Differential Revision: https://reviews.llvm.org/D23707 llvm-svn: 279376	2016-08-20 17:22:27 +00:00
Vitaly Buka	f9fd63ad39	[asan] Add support of lifetime poisoning into ComputeASanStackFrameLayout Summary: We are going to combine poisoning of red zones and scope poisoning. PR27453 Reviewers: kcc, eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23623 llvm-svn: 279373	2016-08-20 16:48:24 +00:00
Matthew Simpson	235e479984	Reapply "[SLP] Initialize VectorizedValue when gathering" The test case included in r279125 exposed existing undefined behavior in the SLP vectorizer that it did not introduce. This patch reapplies the original patch, but modifies the test case to avoid hitting the undefined behavior. This allows us to close PR28330 while keeping the UBSan bot happy. The undefined behavior the original test uncovered will be addressed in a follow-on patch. Reference: https://llvm.org/bugs/show_bug.cgi?id=28330 llvm-svn: 279370	2016-08-20 14:49:02 +00:00
Matthew Simpson	2429656aa9	[SLP] Add command line option for minimum tree size (NFC) llvm-svn: 279369	2016-08-20 14:10:06 +00:00
Vitaly Buka	cc7db13bf0	Revert "[SLP] Initialize VectorizedValue when gathering" to fix ubsan bot. This reverts commit r279125. https://reviews.llvm.org/D23410 llvm-svn: 279363	2016-08-20 07:09:39 +00:00
Chandler Carruth	8abdf75d6b	[PM] Introduce an abstraction for all the analyses over a particular IR unit for use in the PreservedAnalyses set. This doesn't have any important functional change yet but it cleans things up and makes the analysis substantially more efficient by avoiding querying through the type erasure for every analysis. I also think it makes it much easier to reason about how analyses are preserved when walking across pass managers and across IR unit abstractions. Thanks to Sean and Mehdi both for the comments and suggestions. Differential Revision: https://reviews.llvm.org/D23691 llvm-svn: 279360	2016-08-20 04:57:28 +00:00
Matthias Braun	367d853042	MachineFunction: Add llvm_unreachable for missing properties Most compilers should give you a warning anyway though. llvm-svn: 279346	2016-08-19 23:03:28 +00:00
Krzysztof Parzyszek	d95d100c28	Reset "undef" flag when coalescing subregister into whole register llvm-svn: 279344	2016-08-19 22:57:23 +00:00
Tim Northover	a11be04769	GlobalISel: support legalization of G_FCONSTANTs llvm-svn: 279341	2016-08-19 22:40:08 +00:00
Tim Northover	ea904f9424	GlobalISel: teach legalizer how to handle integer constants. llvm-svn: 279340	2016-08-19 22:40:00 +00:00
Sanjay Patel	fa7de606c4	[InstCombine] use m_APInt to allow icmp (shl X, Y), C folds for splat constant vectors, part 1 This is a partial enablement (move the ConstantInt guard down) because there are many different folds here and one of the later ones will require reworking 'isSignBitCheck'. llvm-svn: 279339	2016-08-19 22:33:26 +00:00
Matthias Braun	a7d6fc9618	MachineFunction: Cleanup/simplify MachineFunctionProperties::print() - Always compile print() regardless of LLVM_ENABLE_DUMP. (We usually only guard dump() functions with that). - Only show the set properties to reduce output clutter. - Remove the unused variant that even shows the unset properties. - Fix comments llvm-svn: 279338	2016-08-19 22:31:45 +00:00
Matthias Braun	a3b983aa5e	MachineFunction: Make LastProperty an alias of the last property This avoids unnecessary cases in switch statements covering all properties. llvm-svn: 279337	2016-08-19 22:31:42 +00:00
Daniel Berlin	11da66fc10	Partially revert 279331, as we modify this instruction in the loop llvm-svn: 279335	2016-08-19 22:18:38 +00:00
Vitaly Buka	e149b392a8	Revert "[asan] Add support of lifetime poisoning into ComputeASanStackFrameLayout" This reverts commit r279020. Speculative revert in hope to fix asan test on arm. llvm-svn: 279332	2016-08-19 22:12:58 +00:00
Daniel Berlin	a36f46363f	Convert some depth first traversals to depth_first llvm-svn: 279331	2016-08-19 22:06:23 +00:00
Tim Shen	b5e0f5ac95	[GraphTraits] Make nodes_iterator dereference to NodeType/NodeRef Currently nodes_iterator may dereference to a NodeType or a NodeType&. Make them all dereference to NodeType*, which is NodeRef later. Differential Revision: https://reviews.llvm.org/D23704 Differential Revision: https://reviews.llvm.org/D23705 llvm-svn: 279326	2016-08-19 21:20:13 +00:00
Krzysztof Parzyszek	e4582d4a2e	[Packetizer] Add debugging code to stop packetization after N instructions llvm-svn: 279325	2016-08-19 21:12:52 +00:00
Krzysztof Parzyszek	29a6a2eb8f	[Hexagon] Avoid register dependencies on indirect branches in packetizer Do not packetize the instruction setting the branch address with the indirect branch itself. llvm-svn: 279324	2016-08-19 21:07:35 +00:00
Kostya Serebryany	a533e514b8	[libFuzzer] fix the non-debug build warnings llvm-svn: 279321	2016-08-19 20:57:09 +00:00
Tim Northover	d5c23bcfc9	GlobalISel: translate floating-point comparisons llvm-svn: 279319	2016-08-19 20:48:16 +00:00
Justin Lebar	d13880a750	[NVPTX] Switch nvptx-use-infer-addrspace to true. Summary: This switches us to use a different, more powerful algorithm for address space inference. I've tested this locally and it seems to work great. Once we're more confident in it, we can remove the old pass altogether. Reviewers: jingyue Subscribers: llvm-commits, tra, jholewinski Differential Revision: https://reviews.llvm.org/D23694 llvm-svn: 279317	2016-08-19 20:46:45 +00:00
Reid Kleckner	98a48afa5d	Revert "[SimplifyCFG] Rewrite SinkThenElseCodeToEnd" This reverts commit r279229. It breaks intrinsic function calls in diamonds. llvm-svn: 279313	2016-08-19 20:22:39 +00:00
Tim Northover	b16734fbaa	GlobalISel: translate floating-point constants llvm-svn: 279311	2016-08-19 20:09:15 +00:00
Tim Northover	5a28c3642f	GlobalISel: support translating select instructions. llvm-svn: 279309	2016-08-19 20:09:07 +00:00
Tim Northover	b604622bba	GlobalISel: fix insert/extract to work on ConstantExprs too. No tests yet unfortunately (ConstantFolding reduces all supported constants to ConstantInts before we get to translation). Soon. llvm-svn: 279308	2016-08-19 20:09:03 +00:00
Tim Northover	bbbfb1cfb8	GlobalISel: translate insertvalue instructions. This adds a G_INSERT instruction, which technically makes G_SEQUENCE redundant (it's equivalent to a G_INSERT into an IMPLICIT_DEF). We'll leave G_SEQUENCE for now though: it's likely to be far more common as it's a fundamental part of legalization, so avoiding the mess and bloat of the extra IMPLICIT_DEFs is probably worthwhile. llvm-svn: 279306	2016-08-19 20:08:55 +00:00
Tom Stellard	68726a5359	MachineScheduler: Add constructor functions for the DAGMutations Summary: This way they can be re-used by target-specific schedulers. Reviewers: atrick, MatzeB, kparzysz Subscribers: kparzysz, llvm-commits, MatzeB Differential Revision: https://reviews.llvm.org/D23678 llvm-svn: 279305	2016-08-19 19:59:18 +00:00
Krzysztof Parzyszek	fb4c4178a2	[Hexagon] Fix subesthetic indentation llvm-svn: 279303	2016-08-19 19:29:15 +00:00
Krzysztof Parzyszek	505eb498bd	[Hexagon] Allow i1 values for 'r' constraint in inline-asm llvm-svn: 279302	2016-08-19 19:17:28 +00:00
Sanjay Patel	7a104615c5	[InstCombine] remove an icmp fold that is already handled by InstSimplify Specifically, this is done near the end of "SimplifyICmpInst" using computeKnownBits() as the broader solution. There are even vector tests (yay!) for this in test/Transforms/InstSimplify/compare.ll. I considered putting an assert here instead of just deleting, but then we could assert every possible fold in InstSimplify in InstCombine, so...less is more? llvm-svn: 279300	2016-08-19 19:03:07 +00:00
Krzysztof Parzyszek	8849a51370	[Hexagon] Do not cache alloca instructions during isel They can be deleted or replicated, so the cache may become outdated. They only need to be visited once during frame lowering, so just scan the function instead. llvm-svn: 279297	2016-08-19 18:46:13 +00:00
Chandler Carruth	9b35e6d746	[PM] Re-instate r279227 and r279228 with a fix to the way the templating was done to hopefully appease MSVC. As an upside, this also implements the suggestion Sanjoy made in code review, so two for one! =] I'll be watching the bots to see if there are still issues. llvm-svn: 279295	2016-08-19 18:36:06 +00:00
Tim Northover	26b76f2c59	GlobalISel: improve representation of G_SEQUENCE and G_EXTRACT First, make sure all types involved are represented, rather than being implicit from the register width. Second, canonicalize all types to scalar. These operations just act in bits and don't care about vectors. Also standardize spelling of Indices in the MachineIRBuilder (NFC here). llvm-svn: 279294	2016-08-19 18:32:14 +00:00
Kyle Butt	5b10483618	Revert "IfConversion: Rescan diamonds." This reverts commit bfd62a4b4465dd21811bf615c3b04c30ddb09f7b. llvm-svn: 279289	2016-08-19 18:17:06 +00:00
Kyle Butt	ce0196de3f	Revert "CodeGen: If Convert blocks that would form a diamond when tail-merged." This reverts commit 0fda93481c4231c06b838ef476c0c404c51ff875. llvm-svn: 279288	2016-08-19 18:17:04 +00:00
Tim Northover	2fa5fa391f	GlobalISel: allow extractvalue to extract an aggregate. llvm-svn: 279287	2016-08-19 18:09:41 +00:00
Krzysztof Parzyszek	3d9946eb23	[Hexagon] Fixes for new-value jump formation - Recognize C2_cmpgtui, S2_tstbit_i, and S4_ntstbit_i. - Avoid creating new-value instructions with both source operands equal. llvm-svn: 279286	2016-08-19 17:54:49 +00:00
Tim Northover	6f80b08c64	GlobalISel: support translation of extractvalue instructions. llvm-svn: 279285	2016-08-19 17:47:05 +00:00
Sanjay Patel	e38e79c3e6	[InstCombine] use local variables to reduce code in foldICmpShlConstant; NFC llvm-svn: 279282	2016-08-19 17:34:05 +00:00
Krzysztof Parzyszek	5a7bef9c14	[Hexagon] Fix a few omissions in HexagonInstrInfo llvm-svn: 279280	2016-08-19 17:20:57 +00:00
Sanjay Patel	38b7506f75	[InstCombine] rename variables in foldICmpShlConstant(); NFC llvm-svn: 279279	2016-08-19 17:20:37 +00:00
Tim Northover	91c8173093	GlobalISel: support overflow arithmetic intrinsics. Unsigned addition and subtraction can reuse the instructions created to legalize large width operations (i.e. both produce and consume a carry flag). Signed operations and multiplies get a dedicated op-with-overflow instruction. Once this is produced the two values are combined into a struct register (which will almost always be merged with a corresponding G_EXTRACT as part of legalization). llvm-svn: 279278	2016-08-19 17:17:06 +00:00
Vitaly Buka	170dede75d	Revert "[asan] Optimize store size in FunctionStackPoisoner::poisonRedZones" This reverts commit r279178. Speculative revert in hope to fix asan crash on arm. llvm-svn: 279277	2016-08-19 17:15:38 +00:00
Vitaly Buka	c8f4d69c82	Revert "[asan] Fix size of shadow incorrectly calculated in r279178" This reverts commit r279222. Speculative revert in hope to fix asan crash on arm. llvm-svn: 279276	2016-08-19 17:15:33 +00:00
Lang Hames	6e9f0309e9	[RuntimeDyld] Revert r279182 and 279201 -- they broke some ARM bots. llvm-svn: 279275	2016-08-19 17:06:39 +00:00
Michael Kuperstein	41898f0396	[AliasSetTracker] Degrade AliasSetTracker when may-alias sets get too large. Repeated inserts into AliasSetTracker have quadratic behavior - inserting a pointer into AST is linear, since it requires walking over all "may" alias sets and running an alias check vs. every pointer in the set. We can avoid this by tracking the total number of pointers in "may" sets, and when that number exceeds a threshold, declare the tracker "saturated". This lumps all pointers into a single "may" set that aliases every other pointer. (This is a stop-gap solution until we migrate to MemorySSA) This fixes PR28832. Differential Revision: https://reviews.llvm.org/D23432 llvm-svn: 279274	2016-08-19 17:05:22 +00:00
Simon Pilgrim	d7a3782ae4	[X86][SSE] Generalised combining to VZEXT_MOVL to any vector size This doesn't change tests codegen as we already combined to blend+zero which is what we lower VZEXT_MOVL to on SSE41+ targets, but it does put us in a better position when we improve shuffling for optsize. llvm-svn: 279273	2016-08-19 17:02:00 +00:00
Krzysztof Parzyszek	639545b4d8	[Hexagon] Enforce LLSC packetization rules Ensure that load locked and store conditional instructions are only packetized with ALU32 instructions. Patch by Ben Craig. llvm-svn: 279272	2016-08-19 16:57:05 +00:00
Reid Kleckner	a871d3872a	Fix regression in InstCombine introduced by r278944 The intended transform is: // Simplify icmp eq (or (ptrtoint P), (ptrtoint Q)), 0 // -> and (icmp eq P, null), (icmp eq Q, null). P and Q are both pointer types, but may have different types. We need two calls to getNullValue() to make the icmps. llvm-svn: 279271	2016-08-19 16:53:18 +00:00
Krzysztof Parzyszek	b7640d4df0	[Hexagon] Minor updates to register definitions llvm-svn: 279269	2016-08-19 16:40:19 +00:00
David Majnemer	5554edabef	[CloneFunction] Don't remove unrelated nodes from the CGSCC CGSCC uses a WeakVH to track call sites. RAUW a call within a function can result in that WeakVH getting confused about whether or not the call site is still around. llvm-svn: 279268	2016-08-19 16:37:40 +00:00
Krzysztof Parzyszek	9335bf0ec5	[Hexagon] Fix incorrect generation of S4_subi_asl_ri Patch by Jyotsna Verma. llvm-svn: 279267	2016-08-19 16:35:05 +00:00
Sanjay Patel	a867afe094	[InstCombine] use m_APInt to allow icmp (shl 1, Y), C folds for splat constant vectors llvm-svn: 279266	2016-08-19 16:12:16 +00:00
Krzysztof Parzyszek	dddb097a1f	[Hexagon] Add missing pattern for C4_cmplte llvm-svn: 279265	2016-08-19 16:11:33 +00:00
Sanjay Patel	57b12d3876	[InstCombine] use m_APInt to allow icmp X, C folds for splat constant vectors Of course, we really need to refactor and fix all of the cmp predicates, but this one is interesting because without it, we later perform an information-losing transform of icmp (shl 1, Y), C, and we can't recover the better fold. llvm-svn: 279263	2016-08-19 15:40:44 +00:00
Mehdi Amini	9989f80ae8	[LTO] Remove dead-code: collectUsedGlobalVariables has been moved to Thin and LTO specific path (NFC) llvm-svn: 279261	2016-08-19 15:35:44 +00:00
Krzysztof Parzyszek	0b8672269c	[Hexagon] Make p0 an explicit operand in VA1_clr* subinstructions, NFC llvm-svn: 279255	2016-08-19 15:17:19 +00:00
Krzysztof Parzyszek	6ce82951c3	[Hexagon] Add explicit default constructor for HexagonSelectionDAGInfo llvm-svn: 279254	2016-08-19 15:13:54 +00:00
Krzysztof Parzyszek	0ba9754584	[Hexagon] Allow tail-call optimization when mixing C and fast calling conv Patch by Arnold Schwaighofer. llvm-svn: 279251	2016-08-19 15:02:18 +00:00
Krzysztof Parzyszek	66dd6797e8	[Hexagon] Check for empty live interval Patch by Brendon Cahoon. llvm-svn: 279249	2016-08-19 14:29:43 +00:00
Krzysztof Parzyszek	db019ae801	[Hexagon] Consider zext/sext of a load to i32 to be free llvm-svn: 279248	2016-08-19 14:22:07 +00:00
Anton Korobeynikov	b38195c1a8	Revert r279242 - it's failing the tests llvm-svn: 279247	2016-08-19 14:18:34 +00:00
Krzysztof Parzyszek	a243adfd27	[Hexagon] Handle J2_jumptpt and J2_jumpfpt instructions llvm-svn: 279246	2016-08-19 14:14:09 +00:00
Krzysztof Parzyszek	067debe0a0	[Hexagon] Fix indentation, NFC llvm-svn: 279245	2016-08-19 14:12:51 +00:00
Krzysztof Parzyszek	9273ecc176	[Hexagon] Remove unnecessary llvm::, NFC llvm-svn: 279244	2016-08-19 14:10:57 +00:00
Krzysztof Parzyszek	75e74ee699	[Hexagon] Rename the HEXAGON_MC namespace to Hexagon_MC, NFC llvm-svn: 279243	2016-08-19 14:09:47 +00:00
Anton Korobeynikov	2aae31a945	Fix PR27500: on MSP430 the branch destination offset is measured in words, not bytes. In addition, the branch instructions will have proper BB destinations, not offsets, like before. Patch by Vadzim Dambrouski! Differential Revision: https://reviews.llvm.org/D20162 llvm-svn: 279242	2016-08-19 14:07:10 +00:00
Krzysztof Parzyszek	6421b934ec	[Hexagon] Mark PS_jumpret as pseudo-instruction, expand it into J2_jumpr llvm-svn: 279241	2016-08-19 14:04:45 +00:00
Krzysztof Parzyszek	bd8ef4b8ce	[Hexagon] Improvements to handling and generation of FP instructions Improved handling of fma, floating point min/max, additional load/store instructions for floating point types. Patch by Jyotsna Verma. llvm-svn: 279239	2016-08-19 13:34:31 +00:00
Benjamin Kramer	96fcf5df03	[LoopVectorize] Don't copy std::vector in for-range loop. llvm-svn: 279233	2016-08-19 12:44:24 +00:00
Chandler Carruth	b8824a5d3f	[PM] Revert r279227 and r279228 until I can find someone to help me solve completely opaque MSVC build errors. It complains about lots of stuff with this change without giving nearly enough information to even try to fix. llvm-svn: 279231	2016-08-19 10:51:55 +00:00
Simon Pilgrim	f1b8fdc074	[X86][SSE] Add support for matching commuted insertps patterns INSERTPS doesn't fit well with our shuffle mask canonicalization, so we need to attempt both the original mask and the commuted mask to more likely get a match llvm-svn: 279230	2016-08-19 10:31:53 +00:00
James Molloy	11a1936b70	[SimplifyCFG] Rewrite SinkThenElseCodeToEnd The new version has several advantages: 1) IMSHO it's more readable and neater 2) It handles loads and stores properly 3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch. With this change we can now finally sink load-modify-store idioms such as: if (a) return b += 3; else return b += 4; => %z = load i32, i32* %y %.sink = select i1 %a, i32 5, i32 7 %b = add i32 %z, %.sink store i32 %b, i32* %y ret i32 %b When this works for switches it'll be even more powerful. llvm-svn: 279229	2016-08-19 10:10:27 +00:00
Chandler Carruth	db1759ace1	[PM] Make the new pass manager support fully generic extra arguments to run methods, both for transform passes and analysis passes. This also allows the analysis manager to use a different set of extra arguments from the pass manager where useful. Consider passes over analysis produced units of IR like SCCs of the call graph or loops. Passes of this nature will often want to refer to the analysis result that was used to compute their IR units (the call graph or LoopInfo). And for transformations, they may want to communicate special update information to the outer pass manager. With this change, it becomes possible to have a run method for a loop pass that looks more like: PreservedAnalyses run(Loop &L, AnalysisManager<Loop, LoopInfo> &AM, LoopInfo &LI, LoopUpdateRecord &UR); And to query the analysis manager like: AM.getResult<MyLoopAnalysis>(L, LI); This makes accessing the known-available analyses convenient and clear, and it makes passing customized data structures around easy. My initial use case is going to be in updating the pass manager layers when the analysis units of IR change. But there are more use cases here such as having a layer that lets inner passes signal whether certain additional passes should be run because of particular simplifications made. Two desires for this have come up in the past: triggering additional optimization after successfully unrolling loops, and triggering additional inlining after collapsing indirect calls to direct calls. Despite adding this layer of generic extensibility, the only change to existing, simple usage are for places where we forward declare the AnalysisManager template. We really shouldn't be doing this because of the fragility exposed here, but currently it makes coping with the legacy PM code easier. Differential Revision: http://reviews.llvm.org/D21462 llvm-svn: 279227	2016-08-19 09:45:16 +00:00
James Molloy	7ee640f9b6	[CodeGen] Fix a trivial type conversion bug dating back to pre-2008 The heuristic above this code is incredibly suspect, but disregarding that it mutates the cast opcode so we need to check the mutated opcode later to see if we need to emit an AssertSext or AssertZext node. Fixes PR29041. llvm-svn: 279223	2016-08-19 08:38:50 +00:00
Vitaly Buka	b81960a6c8	[asan] Fix size of shadow incorrectly calculated in r279178 Summary: r279178 generates 8 times more stores than necessary. Reviewers: eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23708 llvm-svn: 279222	2016-08-19 08:33:53 +00:00
Chandler Carruth	b7be5b6479	[PM] Rework the new PM support for building the ModuleSummaryIndex to directly produce the index as the value type result. This requires making the index movable which is straightforward. It greatly simplifies things by allowing us to completely avoid the builder API and the layers of abstraction inherent there. Instead both pass managers can directly construct these when run by value. They still won't be constructed truly eagerly thanks to the optional in the legacy PM. The code that directly builds the index can also just share a direct function. A notable change here is that the result type of the analysis for the new PM is no longer a reference type. This was really problematic when making changes to how we handle result types to make our interface requirements much more strict and precise. But I think this is an overall improvement. Differential Revision: https://reviews.llvm.org/D23701 llvm-svn: 279216	2016-08-19 07:49:19 +00:00
Xinliang David Li	63248ab888	[Profile] Fix edge count read bug Use uint64_t to avoid value truncation before scaling. llvm-svn: 279213	2016-08-19 06:31:45 +00:00
Mehdi Amini	18b91112af	[LTO] Move callback member from base class to the derived where it is used (NFC) llvm-svn: 279212	2016-08-19 06:10:03 +00:00
Mehdi Amini	cc1fe9b9d6	Constify some path in the bitcode writer (NFC) llvm-svn: 279211	2016-08-19 06:06:18 +00:00
Mehdi Amini	026ddbb4d6	[LTO] Add a move to initialize member in ctor initialization list (NFC) llvm-svn: 279210	2016-08-19 05:56:37 +00:00
Xinliang David Li	2c9336823c	[Profile] Simple code refactoring for reuse /NFC llvm-svn: 279209	2016-08-19 05:31:33 +00:00
Dean Michael Berris	1dd1ca9727	[XRay] Synthesize a reference to the xray_instr_map Without the synthesized reference to a symbol in the xray_instr_map, linker section garbage collection will helpfully remove the whole xray_instr_map section from the final executable (or archive). This will cause the runtime to not be able to identify the sleds and hot-patch the calls/jumps into the runtime trampolines. This change adds a reference from the text section at the end of the function to keep around the associated xray_instr_map section as well. We also make sure that we catch this reference in the test. Reviewers: chandlerc, echristo, majnemer, mehdi_amini Subscribers: mehdi_amini, llvm-commits, dberris Differential Revision: https://reviews.llvm.org/D23398 llvm-svn: 279204	2016-08-19 04:44:30 +00:00
Matthias Braun	fdc4c6b426	Revert "RegScavenging: Add scavengeRegisterBackwards()" The ppc64 multistage bot fails on this. This reverts commit r279124. Also Revert "CodeGen: Add/Factor out LiveRegUnits class; NFCI" because it depends on the previous change This reverts commit r279171. llvm-svn: 279199	2016-08-19 03:03:24 +00:00
Lang Hames	b65f16c8e5	[RuntimeDyld] Add support for ELF R_ARM_REL32 and R_ARM_GOT_PREL. Patch by William Dillon. Thanks William! This patch adds support for the R_ARM_REL32 and R_ARM_GOT_PREL ELF ARM relocations to RuntimeDyld, which should allow JITing of code that produces these relocations. No test case: Unfortunately RuntimeDyldELF's GOT building mechanism (which uses a separate section for GOT entries) isn't compatible with RuntimeDyldChecker. The correct fix for this is to fix RuntimeDyldELF's GOT support (it's fundamentally broken at the moment: separate sections aren't guaranteed to be in range of a GOT entry load), but that's a non-trivial job. llvm-svn: 279182	2016-08-19 01:15:39 +00:00
Vitaly Buka	aa654292bd	[asan] Optimize store size in FunctionStackPoisoner::poisonRedZones Summary: Reduce store size to avoid leading and trailing zeros. Reviewers: kcc, eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23648 llvm-svn: 279178	2016-08-18 23:51:15 +00:00
Andrew Kaylor	81901d658f	Include X86CallFrameOptimization in the opt-bisect process. Differential Revision: https://reviews.llvm.org/D23683 llvm-svn: 279175	2016-08-18 22:49:51 +00:00
Saleem Abdulrasool	dab786fb78	AArch64: remove extraneous padding The structs BarrierOp, PrefetchOp, PSBHintOp are in AArch64AsmParser.cpp (inside anonymous namespace). This diff changes the order of fields and removes the excessive padding (8 bytes). Patch by Alexander Shaposhnikov! llvm-svn: 279173	2016-08-18 22:35:06 +00:00
Matthias Braun	91f95f0201	CodeGen: Add/Factor out LiveRegUnits class; NFCI This is a set of register units intended to track register liveness, it is similar in spirit to LivePhysRegs. You can also think of this as the liveness tracking parts of the RegisterScavenger factored out into an own class. This was proposed in http://llvm.org/PR27609 Differential Revision: http://reviews.llvm.org/D21916 llvm-svn: 279171	2016-08-18 22:11:28 +00:00
Kyle Butt	780b517d6b	CodeGen: If Convert blocks that would form a diamond when tail-merged. The following function currently relies on tail-merging for if conversion to succeed. The common tail of cond_true and cond_false is extracted, and this then forms a diamond pattern that can be successfully if converted. If this block does not get extracted, either because tail-merging is disabled or the threshold is higher, we should still recognize this pattern and if-convert it. Fixed a regression in the original commit. Need to un-reverse branches after reversing them, or other conversions go awry. Regression on self-hosting bots with no obvious explanation. Tidied up range handling to be more obviously correct, but there was no smoking gun. define i32 @t2(i32 %a, i32 %b) nounwind { entry: %tmp1434 = icmp eq i32 %a, %b ; <i1> [#uses=1] br i1 %tmp1434, label %bb17, label %bb.outer bb.outer: ; preds = %cond_false, %entry %b_addr.021.0.ph = phi i32 [ %b, %entry ], [ %tmp10, %cond_false ] %a_addr.026.0.ph = phi i32 [ %a, %entry ], [ %a_addr.026.0, %cond_false ] br label %bb bb: ; preds = %cond_true, %bb.outer %indvar = phi i32 [ 0, %bb.outer ], [ %indvar.next, %cond_true ] %tmp. = sub i32 0, %b_addr.021.0.ph %tmp.40 = mul i32 %indvar, %tmp. 
%a_addr.026.0 = add i32 %tmp.40, %a_addr.026.0.ph %tmp3 = icmp sgt i32 %a_addr.026.0, %b_addr.021.0.ph br i1 %tmp3, label %cond_true, label %cond_false cond_true: ; preds = %bb %tmp7 = sub i32 %a_addr.026.0, %b_addr.021.0.ph %tmp1437 = icmp eq i32 %tmp7, %b_addr.021.0.ph %indvar.next = add i32 %indvar, 1 br i1 %tmp1437, label %bb17, label %bb cond_false: ; preds = %bb %tmp10 = sub i32 %b_addr.021.0.ph, %a_addr.026.0 %tmp14 = icmp eq i32 %a_addr.026.0, %tmp10 br i1 %tmp14, label %bb17, label %bb.outer bb17: ; preds = %cond_false, %cond_true, %entry %a_addr.026.1 = phi i32 [ %a, %entry ], [ %tmp7, %cond_true ], [ %a_addr.026.0, %cond_false ] ret i32 %a_addr.026.1 } Without tail-merging or diamond-tail if conversion: LBB1_1: @ %bb @ =>This Inner Loop Header: Depth=1 cmp r0, r1 ble LBB1_3 @ BB#2: @ %cond_true @ in Loop: Header=BB1_1 Depth=1 subs r0, r0, r1 cmp r1, r0 it ne cmpne r0, r1 bgt LBB1_4 LBB1_3: @ %cond_false @ in Loop: Header=BB1_1 Depth=1 subs r1, r1, r0 cmp r1, r0 bne LBB1_1 LBB1_4: @ %bb17 bx lr With diamond-tail if conversion, but without tail-merging: @ BB#0: @ %entry cmp r0, r1 it eq bxeq lr LBB1_1: @ %bb @ =>This Inner Loop Header: Depth=1 cmp r0, r1 ite le suble r1, r1, r0 subgt r0, r0, r1 cmp r1, r0 bne LBB1_1 @ BB#2: @ %bb17 bx lr llvm-svn: 279168	2016-08-18 22:09:27 +00:00
Kyle Butt	491afad8f6	IfConversion: Rescan diamonds. The cost of predicating a diamond is only the instructions that are not shared between the two branches. Additionally, if a predicate clobbering instruction occurs in the shared portion of the branches (e.g. a cond move), it may still be possible to if convert the sub-cfg. This change handles these two facts by rescanning the non-shared portion of a diamond sub-cfg to recalculate both the predication cost and whether both blocks are pred-clobbering. llvm-svn: 279167	2016-08-18 22:09:25 +00:00
Kyle Butt	d76755ec95	IfConversion: Handle inclusive ranges more carefully. This may affect calculations for thresholds, but is not a significant change in behavior. The problem was that an inclusive range must have an additional flag to show that it is empty, because otherwise begin == end implies that the range has one element, and it may not be possible to move past on either side. llvm-svn: 279166	2016-08-18 22:09:23 +00:00
Zhan Jun Liau	cf2f4b3251	[SystemZ] Use valid base/index regs for inline asm Summary: Inline asm memory constraints can have the base or index register be assigned to %r0 right now. Make sure that we assign only ADDR64 registers to the base and index. Reviewers: uweigand Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23367 llvm-svn: 279157	2016-08-18 21:44:15 +00:00
Sanjay Patel	98cd99dfc6	[InstCombine] add helper function for folds of icmp (shl 1, Y), C; NFCI Clean up the existing code by: 1. Renaming variables 2. Adding local variables 3. Making it vector-safe This is still guarded by a ConstantInt check, so no functional change is intended. But this should be ready to go: if we move the ConstantInt check down, all of these folds should do the right thing for vector types. llvm-svn: 279150	2016-08-18 21:28:30 +00:00
Kostya Serebryany	32661f9d66	[libFuzzer] add more __attribute__((visibility("default"))) llvm-svn: 279143	2016-08-18 20:52:52 +00:00
Amaury Sechet	763c59dc9a	Make ctlz and cttz zero undef when the operand cannot be zero in InstCombine Summary: Also add popcount(n) == bitsize(n) -> n == -1 transformation. Reviewers: majnemer, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23134 llvm-svn: 279141	2016-08-18 20:43:50 +00:00
Sanjay Patel	40e8ca46ad	[InstCombine] use m_APInt to allow icmp (trunc X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 https://reviews.llvm.org/rL278945 https://reviews.llvm.org/rL279066 https://reviews.llvm.org/rL279077 https://reviews.llvm.org/rL279101 llvm-svn: 279133	2016-08-18 20:28:54 +00:00
Sanjay Patel	5f4ce4e23d	[InstCombine] clean up foldICmpTruncConstant(); NFCI 1. Fix variable names 2. Add local variables to reduce code llvm-svn: 279132	2016-08-18 20:25:16 +00:00
Michael Kuperstein	2bc3d4d46c	[SelectionDAG] Rename fextend -> fpextend, fround -> fpround, frnd -> fround The names of the tablegen defs now match the names of the ISD nodes. This makes the world a slightly saner place, as previously "fround" matched ISD::FP_ROUND and not ISD::FROUND. Differential Revision: https://reviews.llvm.org/D23597 llvm-svn: 279129	2016-08-18 20:08:15 +00:00
Wei Ding	52bb661dec	AMDGPU : Fix QSAD and MQSAD instructions' incorrect data type. Differential Revision: http://reviews.llvm.org/D23689 llvm-svn: 279126	2016-08-18 19:51:14 +00:00
Matthew Simpson	11db6b6b8c	[SLP] Initialize VectorizedValue when gathering We abort building vectorizable trees in some cases (e.g., if the maximum recursion depth is reached, if the region size is too large, etc.). If this happens for a reduction, we can be left with a root entry that needs to be gathered. For these cases, we need to make sure we actually set VectorizedValue to the resulting vector. This patch ensures we properly set VectorizedValue, and it also ensures the insertelement sequence generated for the gathers is inserted at the correct location. Reference: https://llvm.org/bugs/show_bug.cgi?id=28330 Differential Revision: https://reviews.llvm.org/D23410 llvm-svn: 279125	2016-08-18 19:50:32 +00:00
Matthias Braun	075d0c23d5	RegScavenging: Add scavengeRegisterBackwards() Re-apply r276044 with off-by-1 instruction fix for the reload placement. This is a variant of scavengeRegister() that works for enterBasicBlockEnd()/backward(). The benefit of the backward mode is that it is not affected by incomplete kill flags. This patch also changes PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register scavenger in backwards mode. Differential Revision: http://reviews.llvm.org/D21885 llvm-svn: 279124	2016-08-18 19:47:59 +00:00
Kyle Butt	64e428147f	Branch Folding: Accept explicit threshold for tail merge size. This is prep work for allowing the threshold to be different during layout, and to enforce a single threshold between merging and duplicating during layout. No observable change intended. llvm-svn: 279117	2016-08-18 18:57:29 +00:00
Pete Cooper	a8db71e840	Add a version of Intrinsic::getName which is more efficient when there are no overloads. When running 'opt -O2 verify-uselistorder-nodbg.lto.bc', there are 33m allocations. 8.2m come from std::string allocations in Intrinsic::getName(). Turns out this method only returns a std::string because it needs to handle overloads, but that is not the common case. This adds an overload of getName which just returns a StringRef when there are no overloads and so saves on the allocations. llvm-svn: 279113	2016-08-18 18:30:54 +00:00
Valery Pykhtin	609c2f8137	[AMDGPU] add s_incperflevel/s_decperflevel intrinsics. Differential revision: https://reviews.llvm.org/D23666 llvm-svn: 279106	2016-08-18 18:06:20 +00:00
Elliot Colp	687691aeac	Fix SystemZ compilation abort caused by negative AND mask Normally, when an AND with a constant is lowered to NILL, the constant value is truncated to 16 bits. However, since r274066, ANDs whose results are used in a shift are caught by a different pattern that does not truncate. The instruction printer expects a 16-bit unsigned immediate operand for NILL, so this results in an abort. This patch adds code to manually truncate the constant in this situation. The rest of the bits are then set, so we will detect a case for NILL "naturally" rather than using peephole optimizations. Differential Revision: http://reviews.llvm.org/D21854 llvm-svn: 279105	2016-08-18 18:04:26 +00:00
Duncan P. N. Exon Smith	84c2da47f9	AArch64: Don't call getIterator() on iterators Remove an unnecessary round-trip: iterator => operator->() => getIterator() In some cases, the iterator is end(), so the dereference of operator-> is invalid (UB). The testcase only crashes with r278974 (currently reverted to investigate this), which adds an assertion for invalid dereferences of ilist nodes. Fixes PR29035. llvm-svn: 279104	2016-08-18 17:58:09 +00:00
Eugene Zelenko	61a72d8850	[LLVM] Fix some Clang-tidy modernize-use-using and Include What You Use warnings Differential revision: https://reviews.llvm.org/D23675 llvm-svn: 279102	2016-08-18 17:56:27 +00:00
Sanjay Patel	fa5ca2bf46	[InstCombine] use m_APInt to allow icmp (udiv X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 https://reviews.llvm.org/rL278945 https://reviews.llvm.org/rL279066 https://reviews.llvm.org/rL279077 llvm-svn: 279101	2016-08-18 17:55:59 +00:00
Dan Gohman	c9623db884	[WebAssembly] Disable the store-results optimization. The WebAssembly spec is removing the return value from store instructions, so remove the associated optimization from LLVM. This patch leaves the store instruction operands in place for now, so stores now always write to "$drop"; these will be removed in a separate patch. llvm-svn: 279100	2016-08-18 17:51:27 +00:00
Chandler Carruth	e2f36bcb84	[Assumptions] Make collecting ephemeral values not quadratic in the number of assume intrinsics. The classical way to have a cache-friendly vector style container when we need queue semantics for BFS instead of stack semantics for DFS is to use an ever-growing vector and an index. Erasing from the front requires O(size) work, and unless we expect the worklist to grow very large, it's probably cheaper to just grow and race down the list. But that makes it worse that we're putting the assume intrinsics in this at all. We end up looking at the (by definition empty) use list to see if they're ephemeral (when we've already put them in that set), etc. Instead, directly populate the worklist with the operands when we mark the assume intrinsics as ephemeral. Also, test the visited set before putting things into the worklist so we don't accumulate the same value in the list 100s of times. It would be nice to use a set-vector for this but I think it's useful to test the set earlier to avoid repeatedly querying whether the same instruction is safe to speculate. Hopefully with these changes the number of values pushed onto the worklist is smaller, and we avoid quadratic work by letting it grow as necessary. Differential Revision: https://reviews.llvm.org/D23396 llvm-svn: 279099	2016-08-18 17:51:24 +00:00
Vedant Kumar	c948d182e1	Fix -Wpessimizing-move error, NFC llvm-svn: 279095	2016-08-18 17:39:53 +00:00
Sanjay Patel	12a4105647	[InstCombine] clean up foldICmpUDivConstant; NFC 1. Better variable names 2. Remove unnecessary check of ConstantInt llvm-svn: 279094	2016-08-18 17:37:26 +00:00
Zachary Turner	ac5763eca4	Resubmit "Write the TPI stream from a PDB to Yaml." The original patch was breaking some buildbots due to an incorrect ordering of function definitions which caused some compilers to recognize a definition but others to not. llvm-svn: 279089	2016-08-18 16:49:29 +00:00
Artur Pilipenko	615b820af6	CVP. Turn marking adds as no wrap (introduced by r278107) off by default It causes a regression on our internal benchmark. Introduce cvp-dont-process flag and set it off by default while investigating the regression. llvm-svn: 279082	2016-08-18 16:08:35 +00:00
Ahmed Bougacha	33e19fe1c4	[AArch64][GlobalISel] Select floating-point binary ops. There is no FREM instruction, but the others are straightforward. llvm-svn: 279081	2016-08-18 16:05:11 +00:00
Davide Italiano	d1279df752	[IRCE] Switch over to LLVM_DUMP_METHOD. NFCI. llvm-svn: 279079	2016-08-18 15:55:49 +00:00
Sanjay Patel	6347807f87	[InstCombine] use m_APInt to allow icmp (mul X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 https://reviews.llvm.org/rL278945 https://reviews.llvm.org/rL279066 llvm-svn: 279077	2016-08-18 15:44:44 +00:00
Derek Schuff	ccdceda128	[WebAssembly] Refactor WebAssemblyLowerEmscriptenException pass for setjmp/longjmp This patch changes the code structure of WebAssemblyLowerEmscriptenException pass to support both exception handling and setjmp/longjmp. It also changes the name of the pass and the source file. 1. Change the file/pass name to WebAssemblyLowerEmscriptenExceptions -> WebAssemblyLowerEmscriptenEHSjLj to make it clear that it supports both EH and SjLj 2. List function / global variable names at the top so they can be changed easily 3. Some cosmetic changes Patch by Heejin Ahn Differential Revision: https://reviews.llvm.org/D23588 llvm-svn: 279075	2016-08-18 15:27:25 +00:00
Ahmed Bougacha	1d0560b14d	[AArch64][GlobalISel] Select G_SDIV/G_UDIV. There is no REM instruction; that will require an expansion. It's not obvious that should be done in select, rather than as a (custom?) legalization. llvm-svn: 279074	2016-08-18 15:17:13 +00:00
Sanjay Patel	5b112845da	[InstCombine] use APInt in isSignTest instead of ConstantInt; NFC This will enable vector splat folding, but NFC until the callers have their ConstantInt restrictions removed. llvm-svn: 279072	2016-08-18 14:59:14 +00:00
Sanjay Patel	7d37b221a2	fix typo; NFC llvm-svn: 279068	2016-08-18 14:17:34 +00:00
Krzysztof Parzyszek	b1b0372337	[Hexagon] Create vcombine in HexagonCopyToCombine llvm-svn: 279067	2016-08-18 14:12:34 +00:00
Sanjay Patel	4c5e60d95c	[InstCombine] use m_APInt to allow icmp (xor X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 https://reviews.llvm.org/rL278945 llvm-svn: 279066	2016-08-18 14:10:48 +00:00
Simon Dardis	ea3431598e	[mips] Correct tail call encoding for MIPSR6 r277708 enabled tail calls for MIPS but used the 'jr' instruction when the jump target was held in a register. For MIPSR6, 'jalr $zero, $reg' should have been used. Additionally, add missing patterns for external and global symbols for tail calls. Reviewers: dsanders, vkalintiris Differential Review: https://reviews.llvm.org/D23301 llvm-svn: 279064	2016-08-18 13:22:43 +00:00
Alex Bradbury	3447ca3f08	(Trivial) TargetPassConfig: assert when TargetMachine has no MCAsmInfo Summary: This is pretty trivial, but I thought it was worth just checking that nobody feels it's completely the wrong thing to be doing. The motivation is that when starting a new backend, you often start with a minimal stub, pretty much just FooTargetMachine and FooTargetInfo. Once that's built, you might naturally try `llc -march=foo myinput.ll` and it seems more developer-friendly if this ends up asserting due to the lack of MCAsmInfo with an informative message rather than just segfaulting. Reviewers: MatzeB, chandlerc Subscribers: bogner, llvm-commits Differential Revision: https://reviews.llvm.org/D23443 llvm-svn: 279061	2016-08-18 13:08:58 +00:00
Simon Pilgrim	916485c765	Remove trailing whitespace llvm-svn: 279054	2016-08-18 11:22:22 +00:00
Lang Hames	75601bf71e	Revert r279016 -- it breaks win32-elf JIT tests. llvm-svn: 279029	2016-08-18 01:33:28 +00:00
Kostya Serebryany	524c3f32e7	[sanitizer-coverage/libFuzzer] instrument comparisons with __sanitizer_cov_trace_cmp[1248] instead of __sanitizer_cov_trace_cmp, don't pass the comparison type to save a bit of performance. Use these new callbacks in libFuzzer llvm-svn: 279027	2016-08-18 01:25:28 +00:00
Matthias Braun	6442fc1f6e	TailDuplicator: Fix crash after r278974 Some inputs would crash after r278974 without this fix (see http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_build/2733/console for an example) llvm-svn: 279022	2016-08-18 00:59:32 +00:00
Mehdi Amini	8ac7b32207	[LTO] Promote before performing weak resolution Summary: This was reversed compared to ThinLTOCodeGenerator for some reason, and led to an increased code-size on my tests. I figured that the weak resolution may internalize a linkonce function, which will be promoted immediately (and renamed), before being internalized again. Reviewers: tejohnson Subscribers: pcc, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23632 llvm-svn: 279021	2016-08-18 00:59:24 +00:00
Vitaly Buka	d5ec14989d	[asan] Add support of lifetime poisoning into ComputeASanStackFrameLayout Summary: We are going to combine poisoning of red zones and scope poisoning. PR27453 Reviewers: kcc, eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23623 llvm-svn: 279020	2016-08-18 00:56:58 +00:00
Lang Hames	1d39cb16ec	[RuntimeDyld] Strip leading '_' from symbols on 32-bit windows in RTDyldMemoryManager::getSymbolAddressInProcess() This should allow JIT'd code for win32 to find in-process symbols. See http://llvm.org/PR28699 . Patch by James Holderness. Thanks James! llvm-svn: 279016	2016-08-18 00:22:34 +00:00
Mehdi Amini	eccffada33	[LTO] Change addSaveTemps API: do not add dot to the supplied prefix path Summary: It does not play well with directories (end up with a bunch of hidden files). Also, do not strip the 0 suffix for the first task, especially since 0 can be used by ThinLTO as well now. Reviewers: tejohnson Subscribers: mehdi_amini, pcc, llvm-commits Differential Revision: https://reviews.llvm.org/D23612 llvm-svn: 279014	2016-08-18 00:12:33 +00:00
Dominic Chen	a8a638292c	[WebAssembly] Handle debug information and virtual registers without crashing (reland r278967) Summary: Currently, enabling debug information when compiling for WebAssembly crashes the backend. This commit fixes these by skipping debug values in backend passes. Reviewers: jfb, aprantl, dschuff, echristo Subscribers: llvm-commits, dschuff, jfb, MatzeB, dexonsmith, yurydelendik, mehdi_amini Differential Revision: https://reviews.llvm.org/D23635 llvm-svn: 279011	2016-08-17 23:42:27 +00:00
Kostya Serebryany	5a5d5548f0	[libFuzzer] force proper popcnt instruction llvm-svn: 279002	2016-08-17 23:09:57 +00:00
Hans Wennborg	3879035e66	SCEV: Don't assert about non-SCEV-able value in isSCEVExprNeverPoison() (PR28932) Differential Revision: https://reviews.llvm.org/D23594 llvm-svn: 278999	2016-08-17 22:50:18 +00:00
Haicheng Wu	e787763275	[LoopUnroll] Move a simple check earlier. NFC. Move the check of CallInst earlier to skip expensive recursive operations. Differential Revision: https://reviews.llvm.org/D23611 llvm-svn: 278998	2016-08-17 22:42:58 +00:00
Tim Shen	5c0c063ad5	[LV] Move LoopBodyTraits to a better place, and add comment for simplifying LoopBlocksTraversal. NFC. Summary: I later (after r278573) found that LoopIterator.h has some overlapping with LoopBodyTraits. It's good to use LoopBodyTraits because a *Traits struct is algorithm independent. Reviewers: anemet, nadav, mkuper Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D23529 llvm-svn: 278996	2016-08-17 22:20:07 +00:00
Kostya Serebryany	e72774dd69	[libFuzzer] give 0 and 255 more preference when inserting repeated bytes llvm-svn: 278986	2016-08-17 21:50:54 +00:00
Chris Bieneman	432ba9d89a	[macho2yaml] Don't write empty linkedit data Since I stopped writing empty export tries it causes LinkEdit to potentially be completely empty which results in invalid yaml being generated. To prevent this we skip linkedit data if it is empty. llvm-svn: 278985	2016-08-17 21:46:04 +00:00
Kostya Serebryany	0c537b124c	[libFuzzer] one more mutation: ChangeBinaryInteger; also fix the breakage from r278970 llvm-svn: 278982	2016-08-17 21:30:30 +00:00
Kyle Butt	db3391ebe0	Tail Duplication: Accept explicit threshold for duplicating. This will allow tail duplication and tail merging during layout to have a shared threshold to make sure that they don't overlap. No observable change intended. llvm-svn: 278981	2016-08-17 21:07:35 +00:00
Kyle Butt	feafec588f	TailDuplicator: Use optForSize instead of hasFnAttribute. This will cause minsize functions to have the same threshold as optsize functions, but otherwise should have no effects. llvm-svn: 278980	2016-08-17 21:07:33 +00:00
Kostya Serebryany	a9a548049a	[libFuzzer] when printing the reproducer input, also print the base input and the mutation sequence llvm-svn: 278975	2016-08-17 20:45:23 +00:00
Duncan P. N. Exon Smith	afdd8e541b	Revert "[WebAssembly] Handle debug information and virtual registers without crashing" This reverts commit r278967, since the new test is failing when you don't build the WebAssembly target (most people, since it's off-by-default). llvm-svn: 278973	2016-08-17 20:41:50 +00:00
Justin Bogner	cd1d5aaf2e	Replace a few more "fall through" comments with LLVM_FALLTHROUGH Follow up to r278902. I had missed "fall through", with a space. llvm-svn: 278970	2016-08-17 20:30:52 +00:00
Tim Northover	de3aea0412	GlobalISel: support irtranslation of icmp instructions. llvm-svn: 278969	2016-08-17 20:25:25 +00:00
Dominic Chen	4326167a37	[WebAssembly] Handle debug information and virtual registers without crashing Summary: Currently, enabling debug information when compiling for WebAssembly crashes the backend. This commit fixes these by skipping debug values in backend passes. Reviewers: jfb, aprantl, dschuff, echristo Subscribers: mehdi_amini, yurydelendik, dexonsmith, MatzeB, jfb, dschuff, llvm-commits Differential Revision: https://reviews.llvm.org/D21808 llvm-svn: 278967	2016-08-17 20:11:03 +00:00
Tim Shen	eb3958fafd	[GraphWriter] Change GraphWriter to use NodeRef in GraphTraits Summary: This is part of the "NodeType* -> NodeRef" migration. Notice that since GraphWriter prints object address as identity, I added a static_assert on NodeRef to be a pointer type. Reviewers: dblaikie Subscribers: llvm-commits, MatzeB Differential Revision: https://reviews.llvm.org/D23580 llvm-svn: 278966	2016-08-17 20:07:29 +00:00
Matt Arsenault	d42d58cf21	AMDGPU: Remove dead option llvm-svn: 278965	2016-08-17 20:07:16 +00:00
Tim Shen	8b58bdfe6f	[GenericDomTree] Change GenericDomTree to use NodeRef in GraphTraits. NFC. Summary: Looking at the implementation, GenericDomTree has more specific requirements on NodeRef, e.g. NodeRefObject->getParent() should compile, and NodeRef should be a pointer. We can remove the pointer requirement, but it seems to have little gain, given the limited use cases. Also changed GraphTraits<Inverse<Inverse<T>>> to be more accurate. Reviewers: dblaikie, chandlerc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23593 llvm-svn: 278961	2016-08-17 20:01:58 +00:00
Sanjay Patel	daffec91ef	[InstCombine] more clean up of foldICmpXorConstant(); NFCI Use m_APInt for the xor constant, but this is all still guarded by the initial ConstantInt check, so no vector types should make it in here. llvm-svn: 278957	2016-08-17 19:45:18 +00:00
Sanjay Patel	6d5f448746	[InstCombine] clean up foldICmpXorConstant(); NFCI 1. Change variable names 2. Use local variables to reduce code 3. Early exit to reduce indent llvm-svn: 278955	2016-08-17 19:23:42 +00:00
Marina Yatsina	53ce3f9d02	Fix for PR29010 This is a fix for https://llvm.org/bugs/show_bug.cgi?id=29010 Root cause of the bug is that the register class of the machine instruction operand does not fully reflect which registers can be allocated. Both for i386 and x86_64 the operand's register class is VR128RegClass and thus contains xmm0-xmm15, though in i386 we can only use xmm0-xmm7. In order to get the actual allocable registers of the class we need to use RegisterClassInfo. Differential Revision: https://reviews.llvm.org/D23613 llvm-svn: 278954	2016-08-17 19:07:40 +00:00
Kostya Serebryany	a7398ba024	[libFuzzer] more mutations llvm-svn: 278950	2016-08-17 18:10:42 +00:00
Sanjay Patel	63e14a07e8	[InstCombine] use m_APInt to allow icmp (or X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 llvm-svn: 278945	2016-08-17 16:38:57 +00:00
Sanjay Patel	943e92efde	[InstCombine] clean up foldICmpOrConstant(); NFCI 1. Change variable names 2. Use local variables to reduce code 3. Use ? instead of if/else 4. Use the APInt variable instead of 'RHS' so the removal of the FIXME code will be direct llvm-svn: 278944	2016-08-17 16:30:43 +00:00
Adrian Prantl	c19dee734f	Support the DW_AT_noreturn DWARF flag. This is used to mark functions with the C++11 [[ noreturn ]] or C11 _Noreturn attributes. Patch by Victor Leschuk! https://reviews.llvm.org/D23167 llvm-svn: 278940	2016-08-17 16:02:43 +00:00
Chad Rosier	ea7e4647db	Revert "Reassociate: Reprocess RedoInsts after each inst". This reverts commit r258830, which introduced a bug described in PR28367. PR28367 llvm-svn: 278938	2016-08-17 15:54:39 +00:00
Sanjay Patel	4f7eb2aa95	[InstCombine] use m_APInt to allow icmp (add X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 llvm-svn: 278935	2016-08-17 15:24:30 +00:00
Simon Dardis	ac96ec7906	[mips] Add l.[sd] and s.[sd] instruction aliases Reviewers: dsanders, vkalintiris Differential Review: https://reviews.llvm.org/D23121 llvm-svn: 278930	2016-08-17 14:45:09 +00:00
Chad Rosier	a6822f64f3	Revert "[Reassociate] Avoid iterator invalidation when negating value." This reverts commit r278928 due to lit test failures. llvm-svn: 278929	2016-08-17 14:31:34 +00:00
Chad Rosier	cf3e8121a6	[Reassociate] Avoid iterator invalidation when negating value. Differential Revision: https://reviews.llvm.org/D23464 PR28367 llvm-svn: 278928	2016-08-17 14:16:45 +00:00
Jonas Paulsson	7a79422536	[LoopStrengthReduce] Refactoring and addition of a new target cost function. Refactored so that a LSRUse owns its fixups, as opposed to letting the LSRInstance own them. This makes it easier to rate formulas for LSRUses, since the fixups are available directly. The Offsets vector has been removed since it was no longer necessary. New target hook isFoldableMemAccessOffset(), which is used during formula rating. For SystemZ, this is useful to express that loads and stores with float or vector types with a big/negative offset should be avoided in loops. Without this, LSR will generate a lot of negative offsets that would require extra instructions for loading the address. Updated tests: test/CodeGen/SystemZ/loop-01.ll Reviewed by: Quentin Colombet and Ulrich Weigand. https://reviews.llvm.org/D19152 llvm-svn: 278927	2016-08-17 13:24:19 +00:00
Marina Yatsina	4b22642e6f	Fixing bug committed in rev. 278321 In theory the indices of RC (and thus the index used for LiveRegs) may differ from the indices of OpRC. Fixed the code to extract the correct RC index. OpRC contains the first X consecutive elements of RC, and thus their indices are currently de facto the same, therefore a test cannot be added at this point. Differential Revision: https://reviews.llvm.org/D23491 llvm-svn: 278923	2016-08-17 11:40:21 +00:00
Ayman Musa	71b43c5c1d	Fix bug in DAGBuilder for getelementptr with expanded vector. Replacing the usage of MVT with EVT in case the vector type is expanded. Differential Revision: https://reviews.llvm.org/D23306 llvm-svn: 278913	2016-08-17 07:52:15 +00:00
Ayman Musa	c96f421ad4	First commit (test commit) - Adding empty line. llvm-svn: 278910	2016-08-17 07:37:34 +00:00
Mehdi Amini	970800e0c8	[LTO] Introduce an Output class to wrap the output stream creation (NFC) Summary: While NFC for now, this will allow more flexibility on the client side to hold state necessary to back up the stream. Also when adding caching, this class will grow in complexity. Note I blindly modified the gold-plugin as I can't compile it. Reviewers: tejohnson Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D23542 llvm-svn: 278907	2016-08-17 06:23:09 +00:00
Justin Bogner	14f383e9c4	Fix a use of LLVM_FALLTHROUGH that wasn't even in a switch. I was over-aggressive in my conversions from comments to the fallthrough attribute. llvm-svn: 278903	2016-08-17 05:25:38 +00:00
Justin Bogner	b03fd12cef	Replace "fallthrough" comments with LLVM_FALLTHROUGH This is a mechanical change of comments in switches like fallthrough, fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead. llvm-svn: 278902	2016-08-17 05:10:15 +00:00
Chuang-Yu Cheng	f7ba716bcb	[ppc64] Don't apply sibling call optimization if callee has any byval arg This is a quick workaround, because in some cases, e.g. caller's stack size > callee's stack size, we are still able to apply sibling call optimization even if the callee has a byval arg. This patch fixes: https://llvm.org/bugs/show_bug.cgi?id=28328 Reviewers: hfinkel kbarton nemanjai amehsan Subscribers: hans, tjablin https://reviews.llvm.org/D23441 llvm-svn: 278900	2016-08-17 03:17:44 +00:00
Chandler Carruth	67fc52f067	[PM] Port the always inliner to the new pass manager in a much more minimal and boring form than the old pass manager's version. This pass does the very minimal amount of work necessary to inline functions declared as always-inline. It doesn't support a wide array of things that the legacy pass manager did support, but is also ... about 20 lines of code. So it has that going for it. Notably things this doesn't support: - Array alloca merging - To support the above, bottom-up inlining with careful history tracking and call graph updates - DCE of the functions that become dead after this inlining. - Inlining through call instructions with the always_inline attribute. Instead, it focuses on inlining functions with that attribute. The first I've omitted because I'm hoping to just turn it off for the primary pass manager. If that doesn't pan out, I can add it here but it will be reasonably expensive to do so. The second should really be handled by running global-dce after the inliner. I don't want to re-implement the non-trivial logic necessary to do comdat-correct DCE of functions. This means the -O0 pipeline will have to be at least 'always-inline,global-dce', but that seems reasonable to me. If others are seriously worried about this I'd like to hear about it and understand why. Again, this is all solvable by factoring that logic into a utility and calling it here, but I'd like to wait to do that until there is a clear reason why the existing pass-based factoring won't work. The final point is a serious one. I can fairly easily add support for this, but it seems both costly and a confusing construct for the use case of the always inliner running at -O0. This attribute can of course still impact the normal inliner easily (although I find that a questionable re-use of the same attribute). I've started a discussion to sort out what semantics we want here and based on that can figure out if it makes sense to have this complexity at O0 or not. One other advantage of this design is that it should be quite a bit faster due to checking for whether the function is a viable candidate for inlining exactly once per function instead of doing it for each call site. Anyways, hopefully a reasonable starting point for this pass. Differential Revision: https://reviews.llvm.org/D23299 llvm-svn: 278896	2016-08-17 02:56:20 +00:00
Matthias Braun	08f4704ec8	IfConversion: Use references instead of pointers where possible; NFC Also put some commonly used subexpressions into variables. llvm-svn: 278895	2016-08-17 02:52:01 +00:00
Matthias Braun	b1e0558df4	IfConversion: Use range based for; NFC Also avoid some pointless use of auto! Because that's friendlier to readers and avoids several types accidentally resolving to unnecessary references here (MachineInstr *&, unsigned &). llvm-svn: 278894	2016-08-17 02:51:59 +00:00
Matthias Braun	2c931798d6	IfConversion: Improve doxygen comments llvm-svn: 278893	2016-08-17 02:51:57 +00:00
Chandler Carruth	f702d8ecb6	[Inliner] Add a flag to disable manual alloca merging in the Inliner. This is off for now while testing can take place to make sure that in fact we do sufficient stack coloring to fully obviate the manual alloca array merging. Some context on why we should be using stack coloring rather than merging allocas in this way: LLVM relies very heavily on analyzing pointers as coming from different allocas in order to make aliasing decisions. These are some of the most powerful aliasing signals available in LLVM. So merging allocas is an extremely destructive operation on the LLVM IR -- it takes away highly valuable and hard to reconstruct information. As a consequence, inlined functions which happen to have array allocas that this pattern matches will fail to be properly interleaved unless SROA manages to hoist everything to an SSA register. Instead, the inliner will have added an unnecessary dependence that one inlined function execute after the other because they will have been rewritten to refer to the same memory. All that said, folks will reasonably want some time to experiment here and make sure there are no significant regressions. A flag should give us an easy knob to test. For more context, see the thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-July/103277.html http://lists.llvm.org/pipermail/llvm-dev/2016-August/103285.html Differential Revision: https://reviews.llvm.org/D23052 llvm-svn: 278892	2016-08-17 02:40:23 +00:00
Zijiao Ma	53d55f45a1	Some places that could be using TargetParser in LLVM. NFC. llvm-svn: 278888	2016-08-17 02:08:28 +00:00
Duncan P. N. Exon Smith	362d120488	Scalar: Avoid dereferencing end() in IndVarSimplify IndVarSimplify::sinkUnusedInvariants calls BasicBlock::getFirstInsertionPt on the ExitBlock and moves instructions before it. This can return end(), so it's not safe to dereference. Add an iterator-based overload to Instruction::moveBefore to avoid the UB. llvm-svn: 278886	2016-08-17 01:54:41 +00:00
Duncan P. N. Exon Smith	9e3edad932	IPO: Swap || operands to avoid dereferencing end() IsOperandBundleUse conveniently indicates whether std::next(F->arg_begin(),UseIndex) will get to (or past) end(). Check it first to avoid dereferencing end(). llvm-svn: 278884	2016-08-17 01:23:58 +00:00
Duncan P. N. Exon Smith	3bcaa81204	Scalar: Avoid dereferencing end() in InductiveRangeCheckElimination BasicBlock::Create isn't designed to take iterators (which might be end()), but pointers (which might be nullptr). Fix the UB that was converting end() to a BasicBlock* by calling BasicBlock::getNextNode() in the first place. llvm-svn: 278883	2016-08-17 01:16:17 +00:00
Duncan P. N. Exon Smith	6331dc171c	ObjCARC: Don't increment or dereference end() when scanning args When there's only one argument and it doesn't match one of the known functions, return ARCInstKind::CallOrUser rather than falling through to the two argument case. The old behaviour both incremented past and dereferenced end(). llvm-svn: 278881	2016-08-17 01:02:18 +00:00
Duncan P. N. Exon Smith	ec083b59ed	ARM: Avoid dereferencing end() in ARMFrameLowering::emitPrologue llvm::tryFoldSPUpdateIntoPushPop assumes its arguments are valid MachineInstrs. Update ARMFrameLowering::emitPrologue to respect that; when LastPush==end(), it can't possibly be a push instruction anyway. llvm-svn: 278880	2016-08-17 00:53:04 +00:00
Duncan P. N. Exon Smith	00ec93da26	CodeGen: Avoid dereferencing end() in OptimizePHIs::OptimizeBB llvm-svn: 278879	2016-08-17 00:43:59 +00:00
Duncan P. N. Exon Smith	e04fe1a394	Hexagon: Avoid dereferencing end() in HexagonInstrInfo::InsertBranch llvm-svn: 278878	2016-08-17 00:34:00 +00:00
Duncan P. N. Exon Smith	db53d99d02	AMDGPU: Avoid looking for the DebugLoc in end() The end() iterator isn't a safe thing to dereference. Pass the DebugLoc into EmitFetchClause and EmitALUClause to avoid it. llvm-svn: 278873	2016-08-17 00:06:43 +00:00
Duncan P. N. Exon Smith	0a12729f99	SimplifyCFG: Avoid dereferencing end() When comparing a User* to a BasicBlock::iterator in passingValueIsAlwaysUndefined, don't dereference the iterator in case it is end(). llvm-svn: 278872	2016-08-16 23:57:56 +00:00
Justin Bogner	39eec466a2	Revert "Write the TPI stream from a PDB to Yaml." This is hitting a "use of undeclared identifier 'skipPadding' error locally and on some bots. This reverts r278869. llvm-svn: 278871	2016-08-16 23:37:10 +00:00
Duncan P. N. Exon Smith	dcbce9c391	CodeGen: Avoid dereferencing end() when unconstifying iterators Rather than doing a funny dance that relies on dereferencing end() not crashing, add some API to MachineInstrBundleIterator to get a non-const version of the iterator. llvm-svn: 278870	2016-08-16 23:34:07 +00:00
Zachary Turner	8321ba5437	Write the TPI stream from a PDB to Yaml. Reviewed By: ruiu, rnk Differential Revision: https://reviews.llvm.org/D23226 llvm-svn: 278869	2016-08-16 23:28:54 +00:00
Kyle Butt	07d61425e3	Codegen: Don't tail-duplicate blocks with un-analyzable fallthrough. If AnalyzeBranch can't analyze a block and it is possible to fallthrough, then duplicating the block doesn't make sense, as only one block can be the layout predecessor for the un-analyzable fallthrough. Submitted with a test case, but NOTE: the test case doesn't currently fail. However, the test case fails with D20505 and would have saved me some time debugging. llvm-svn: 278866	2016-08-16 22:56:14 +00:00
Sanjay Patel	60ea1b43d6	[InstCombine] clean up foldICmpAddConstant(); NFCI 1. Fix variable names 2. Add local variables to reduce code 3. Fix code comments 4. Add early exit to reduce indentation 5. Remove 'else' after if -> return 6. Hoist common predicate llvm-svn: 278864	2016-08-16 22:34:42 +00:00
Konstantin Zhuravlyov	e0b87181cf	[AMDGPU] Remove duplicate initialization of SIDebuggerInsertNops pass Differential Revision: https://reviews.llvm.org/D23556 llvm-svn: 278863	2016-08-16 22:30:11 +00:00
David Majnemer	744a8753db	Preserve the assumption cache more often We were clearing it out in LoopUnswitch and InlineFunction instead of attempting to preserve it. llvm-svn: 278860	2016-08-16 22:07:32 +00:00
Sanjay Patel	e47df1ac62	[InstCombine] use m_APInt to allow icmp (sub X, Y), C folds for splat constant vectors llvm-svn: 278859	2016-08-16 21:53:19 +00:00
Duncan P. N. Exon Smith	41cf73ce16	CodeGen: Don't dereference end() in MachineBasicBlock::CorrectExtraCFGEdges The current MachineBasicBlock might be the last block, so FallThru may be past the end(). Use getNextNode(), which will convert to nullptr, rather than &*++, which is invalid if we reach the end(). llvm-svn: 278858	2016-08-16 21:46:03 +00:00
Sanjay Patel	904cd39b05	[x86] Allow merging multiple instances of an immediate within a basic block for code size savings, for 64-bit constants. This patch handles 64-bit constants which can be encoded as 32-bit immediates. It extends the functionality added by https://reviews.llvm.org/D11363 for 32-bit constants to 64-bit constants. Patch by Sunita Marathe! Differential Revision: https://reviews.llvm.org/D23391 llvm-svn: 278857	2016-08-16 21:35:16 +00:00
Kostya Serebryany	3044390af1	[libFuzzer] minor speed improvement llvm-svn: 278856	2016-08-16 21:28:05 +00:00
Sanjay Patel	b9aa67bfcf	[InstCombine] fix variable names to match formula comments; NFC llvm-svn: 278855	2016-08-16 21:26:10 +00:00
David Majnemer	110522bc0f	[LoopUnroll] Don't clear out the AssumptionCache on each loop Clearing out the AssumptionCache can cause us to rescan the entire function for assumes. If there are many loops, then we are scanning over the entire function many times. Instead of clearing out the AssumptionCache, register all cloned assumes. llvm-svn: 278854	2016-08-16 21:09:46 +00:00
Reid Kleckner	b99b709068	Revert "Enhance SCEV to compute the trip count for some loops with unknown stride." This reverts commit r278731. It caused http://crbug.com/638314 llvm-svn: 278853	2016-08-16 21:02:04 +00:00
Matt Arsenault	b8037a1bd3	TailDuplicator: Use range loops llvm-svn: 278847	2016-08-16 20:38:05 +00:00
Evandro Menezes	5a5b8dcd32	[AArch64] Adjust the scheduling model for Exynos M1. Refine the model for the FP division unit. llvm-svn: 278846	2016-08-16 20:35:01 +00:00
Evandro Menezes	d03aff2e11	[AArch64] Adjust the scheduling model for Exynos M1. Refine the model for the integer division unit. llvm-svn: 278845	2016-08-16 20:34:58 +00:00
Matt Arsenault	7f19298bfa	AMDGPU: Remove excessive padding from ImmOp and RegOp. The structs ImmOp and RegOp are in AArch64AsmParser.cpp (inside anonymous namespace). This diff changes the order of fields and removes the excessive padding (8 bytes). Patch by Alexander Shaposhnikov llvm-svn: 278844	2016-08-16 20:28:06 +00:00
Sjoerd Meijer	15c81b05ea	[MBP] do not reorder and move up loop latch block Do not reorder and move up a loop latch block before a loop header when optimising for size because this will generate an extra unconditional branch. Differential Revision: https://reviews.llvm.org/D22521 llvm-svn: 278840	2016-08-16 19:50:33 +00:00
Kostya Serebryany	d46a59fac4	[libFuzzer] new experimental feature: value profiling. Profiles values that affect control flow and treats new values as new coverage. llvm-svn: 278839	2016-08-16 19:33:51 +00:00
Benjamin Kramer	0464ae83e7	Remove excessive padding from LineNoCacheTy The struct LineNoCacheTy is in SourceMgr.cpp inside anonymous namespace. This diff changes the order of fields and removes the excessive padding (8 bytes). Patch by Alexander Shaposhnikov! Differential revision: https://reviews.llvm.org/D23546 llvm-svn: 278838	2016-08-16 19:20:10 +00:00
David Majnemer	00940fb854	Make MDNode::intersect faster than O(n * m) It is pretty easy to get it down to O(nlogn + mlogm). This implementation has the added benefit of automatically deduplicating entries between the two sets. llvm-svn: 278837	2016-08-16 18:48:37 +00:00
David Majnemer	fa0f1e660b	Don't passively concatenate MDNodes I have audited all the callers of concatenate and none require duplicate entries to service concatenation. These duplicates serve no purpose but to needlessly embiggen the IR. N.B. Layering getMostGenericAliasScope on top of concatenate makes it O(nlogn + mlogm) instead of O(n*m). llvm-svn: 278836	2016-08-16 18:48:34 +00:00
Krzysztof Parzyszek	1d01a79304	[Hexagon] Standardize next batch of pseudo instructions ALIGNA PS_aligna ALLOCA PS_alloca TFR_FI PS_fi TFR_FIA PS_fia TFR_PdFalse PS_false TFR_PdTrue PS_true VMULW PS_vmulw VMULW_ACC PS_vmulw_acc llvm-svn: 278832	2016-08-16 18:08:40 +00:00
Gor Nishanov	74309fa014	[Coroutines] Part 7: Split coroutine into subfunctions Summary: This patch adds simple coroutine splitting logic to CoroSplit pass. Documentation and overview is here: http://llvm.org/docs/Coroutines.html. Upstreaming sequence (rough plan) 1.Add documentation. (https://reviews.llvm.org/D22603) 2.Add coroutine intrinsics. (https://reviews.llvm.org/D22659) ... 7. Split coroutine into subfunctions <= we are here 8. Coroutine Frame Building algorithm 9. Handle coroutine with unwinds 10+. The rest of the logic Reviewers: majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23461 llvm-svn: 278830	2016-08-16 18:04:14 +00:00
Sanjay Patel	a3f4f0828b	[InstCombine] add helper functions for foldICmpWithConstant; NFCI Besides breaking up a 700 line function to improve readability, this sinks the 'FIXME: ConstantInt' check into each helper. So now we can independently break that restriction within any of the helper functions. As much as possible, the code was only {cut/paste/clang-format}'ed to minimize risk (no functional changes intended), so several more readability improvements are still possible. llvm-svn: 278828	2016-08-16 17:54:36 +00:00
Kostya Serebryany	c98ef718ea	[libFuzzer] refactoring around PCMap, NFC llvm-svn: 278825	2016-08-16 17:37:13 +00:00
Simon Dardis	4893aff94e	[mips] Enforce compact branch restrictions Check both operands for use of the $zero register which cannot be used with a compact branch instruction. Reviewers: dsanders, vkalintris Differential Review: https://reviews.llvm.org/D23547 llvm-svn: 278824	2016-08-16 17:16:11 +00:00
Krzysztof Parzyszek	eabc0d0fd5	[Hexagon] Clean up some miscellaneous V60 intrinsics a bit llvm-svn: 278823	2016-08-16 17:14:44 +00:00
Wolfgang Pieb	8df58f48dd	When the inline spiller rematerializes an instruction, take the debug location from the instruction that immediately follows the rematerialization point. Patch by Andrea DiBiagio. Differential Revision: http://reviews.llvm.org/D23539 llvm-svn: 278822	2016-08-16 17:12:50 +00:00
Vitaly Buka	1ce73ef11c	[Asan] Unpoison red zones even if use-after-scope was disabled with runtime flag Summary: PR27453 Reviewers: eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23481 llvm-svn: 278818	2016-08-16 16:24:10 +00:00
Sanjay Patel	1e5b2d1611	[InstCombine] use m_APInt in foldICmpWithConstant; NFCI There's some formatting and pointer deref ugliness here that I intend to fix in subsequent patches. The overall goal is to refactor the obnoxiously long switch and incrementally remove the restriction to scalar types (allow folds for vector splats). This patch introduces the use of m_APInt which means the RHSV reference is now a pointer (and may have matched a vector splat), but the check of 'RHS' remains, so vector folds are disallowed and no functional change is intended. llvm-svn: 278816	2016-08-16 16:08:11 +00:00
Krzysztof Parzyszek	17aa4136a2	[Hexagon] Standardize vector predicate load/store pseudo instructions - Remove unused instructions: LDriq_pred_vec_V6, STriq_pred_vec_V6, and the 128B counterparts. - Rename: LDriq_pred_V6 PS_vloadrq_ai LDriq_pred_V6_128B PS_vloadrq_ai_128B STriq_pred_V6 PS_vstorerq_ai STriq_pred_V6_128B PS_vstorerq_ai_128B llvm-svn: 278813	2016-08-16 15:43:54 +00:00
Ahmed Bougacha	e4c03abddd	[AArch64][GlobalISel] Select G_MUL. llvm-svn: 278810	2016-08-16 14:37:46 +00:00
Ahmed Bougacha	59e160a19c	[AArch64][GlobalISel] Factor out unsupported binop check. NFC. We're going to need it for G_MUL, and, if other targets end up using something similar, we can easily put it in the generic selector. llvm-svn: 278808	2016-08-16 14:37:40 +00:00
David Callahan	947be0fa66	[ADCE] Modify data structures to support removing control flow Summary: This is part of a series of patches to evolve ADCE.cpp to support removing of unnecessary control flow. This patch changes the data structures to hold liveness information to support the additional information we will eventually need. In particular we now have a notion of basic blocks being live because they contain live operations. This will eventually feed into control dependence analysis of which branches are live. We cater to getting from instructions to associated block information and from blocks to information about their terminators. This patch also changes the structure of the main loop of the algorithm so that it alternates propagating liveness between instructions and using control dependence information to mark branches live. We force all terminators live for now until we add code to handle removing control flow in a later patch. No changes to effective behavior with this patch. Previous patches: D23065 [ADCE] Refactor anticipating new functionality (NFC) D23102 [ADCE] Refactoring for new functionality (NFC) Reviewers: nadav, majnemer, mehdi_amini Subscribers: freik, twoh, llvm-commits Differential Revision: https://reviews.llvm.org/D23225 llvm-svn: 278807	2016-08-16 14:31:51 +00:00
Brendon Cahoon	65b6ebccad	[Pipeliner] Fix an assert due to invalid Phi in the epilog The pipeliner was generating an invalid Phi name for an operand in the epilog block, which caused an assert in the live variable analysis pass. The fix is to the code that generates new Phis in the epilog block. In this case, there is an existing Phi that needs to be reused rather than creating a new Phi instruction. Differential Revision: https://reviews.llvm.org/D23513 llvm-svn: 278805	2016-08-16 14:29:24 +00:00
Ahmed Bougacha	2ac5bf94bc	[AArch64][GlobalISel] Select (variable) shifts. For now, no support for immediates. llvm-svn: 278804	2016-08-16 14:02:47 +00:00
Ahmed Bougacha	0306b5ef07	[AArch64][GlobalISel] Select p0 G_FRAME_INDEX. And mark it as legal. llvm-svn: 278802	2016-08-16 14:02:42 +00:00
Pierre Gousseau	051db7d838	[x86] Refactor a PowerPC specific ctlz/srl transformation (NFC). Following the discussion on D22038, this refactors a PowerPC specific setcc -> srl(ctlz) transformation so it can be used by other targets. Differential Revision: https://reviews.llvm.org/D23445 llvm-svn: 278799	2016-08-16 13:53:53 +00:00
Sagar Thakur	e311740bde	[MemorySanitizer] [MIPS] Changed memory mapping to support pie executable. Reviewed by eugenis Differential: D22994 llvm-svn: 278795	2016-08-16 12:55:38 +00:00
Simon Pilgrim	cc316f013a	[X86][SSE] Add support for combining v2f64 target shuffles to VZEXT_MOVL byte rotations The combine was only matching v2i64 as it assumed lowering to MOVQ - but we have v2f64 patterns that match in a similar fashion llvm-svn: 278794	2016-08-16 12:52:06 +00:00
Prakhar Bahuguna	a27c4a0e66	Correct the upper bound for a CBZ/CBNZ branch target. Summary: Fix for the upper bound check that was causing a build failure. Reviewers: olista01, rengolin, t.p.northover Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23501 llvm-svn: 278789	2016-08-16 10:41:56 +00:00
Prakhar Bahuguna	15ed7ec5aa	[Thumb] Validate branch target for CBZ/CBNZ instructions. Summary: The assembler currently does not check the branch target for CBZ/CBNZ instructions, which only permit branching forwards with a positive offset. This adds validation for the branch target to ensure negative PC-relative offsets are not encoded into the instruction, whether specified as a literal or as an assembler symbol. Reviewers: rengolin, t.p.northover Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D23312 llvm-svn: 278788	2016-08-16 10:41:52 +00:00
Simon Pilgrim	f16cd361d4	[X86][SSE] Add support for combining target shuffles to PALIGNR byte rotations llvm-svn: 278787	2016-08-16 10:03:23 +00:00
Job Noorman	6cd8c9a9d6	[AVR] Fix compile errors Differential Revision: https://reviews.llvm.org/D23450 llvm-svn: 278784	2016-08-16 08:41:35 +00:00
Guy Blank	722caebdae	[X86] Add xgetbv/xsetbv intrinsics to non-windows platforms Differential Revision: https://reviews.llvm.org/D21958 llvm-svn: 278782	2016-08-16 06:41:00 +00:00
David Majnemer	5c5df6283a	[InstSimplify] Fold gep (gep V, C), (xor V, -1) to C-1 llvm-svn: 278779	2016-08-16 06:13:46 +00:00
Mehdi Amini	88c491ddec	FunctionImport: missed one occurrence of ImportListForModule to rename (NFC) llvm-svn: 278778	2016-08-16 05:49:12 +00:00
Mehdi Amini	9b490f10e1	FunctionImport: rename ImportsForModule to ImportList for consistency (NFC) llvm-svn: 278777	2016-08-16 05:47:12 +00:00
Mehdi Amini	cdbcbf7477	[LTO] Simplify APIs and constify (NFC) Summary: Multiple APIs were taking a StringMap for the ImportLists containing the entries for all the modules while operating on a single entry for the current module. Instead we can pass the desired ModuleImport directly. Also some of the APIs were not const, I believe just to be able to use operator[] on the StringMap. Reviewers: tejohnson Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23537 llvm-svn: 278776	2016-08-16 05:46:05 +00:00
Mehdi Amini	acc50c4334	[LTO] Rename variables with meaningful names, i.e. more than one character (NFC) llvm-svn: 278766	2016-08-16 00:44:46 +00:00
Reid Kleckner	229d32abfc	[AMDGPU] Give enum an explicit 64-bit type to fix MSVC 2013 failures Recall that MSVC always gives enums the type 'int', nothing else. MSVC 2015 does not appear to have this problem anymore. Clang-cl -Wmicrosoft-enum-value flags this, FWIW, so now I have a true positive for my warning. :) llvm-svn: 278762	2016-08-15 23:54:44 +00:00
Teresa Johnson	c44a12244f	[ThinLTO] Fix temp file dumping, enable via llvm-lto and test it Summary: Fixed a bug in ThinLTOCodeGenerator's temp file dumping. The Twine needs to be passed directly as an argument, or a copy saved into a std::string. It doesn't seem there are any consumers of this, so I added a new option to llvm-lto to enable saving of temp files during ThinLTO, and augmented a test to use it to check post-import but pre-opt bitcode. Reviewers: mehdi_amini Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23525 llvm-svn: 278761	2016-08-15 23:24:57 +00:00
Justin Bogner	375f71e3a3	Linker: Avoid some ridiculous indentation by using a temporary. NFC This was indented really awkwardly, and clang-format didn't seem to know how to do any better. Avoid the issue with a temporary variable. llvm-svn: 278756	2016-08-15 22:41:42 +00:00
Tim Shen	e0793db41d	[ADT] Change PostOrderIterator to use NodeRef. NFC. Reviewers: dblaikie Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D23522 llvm-svn: 278752	2016-08-15 21:52:54 +00:00
Eli Friedman	98151d6440	Fix typo in lowering for fp128 ueq. Regression from r259791. Differential Revision: https://reviews.llvm.org/D23374 llvm-svn: 278750	2016-08-15 21:46:19 +00:00
Jan Vesely	0486f739a4	AMDGPU/R600: Convert buffer id to VTX_READ input Use patterns instead of multiple instructions Add buffer id to asm string https://reviews.llvm.org/D22650 llvm-svn: 278749	2016-08-15 21:38:30 +00:00
Tim Northover	28fdc4272d	GlobalISel: support loads and stores of strange types. Before we mischaracterized structs and i1 types as a scalar with size 0 in various ways. llvm-svn: 278744	2016-08-15 21:13:17 +00:00
Sanjoy Das	78db2963f6	Revert "[ValueTracking] Improve ValueTracking on left shift with nsw flag" This reverts commit r278172. It causes PR28946. llvm-svn: 278740	2016-08-15 21:01:31 +00:00
Teresa Johnson	6107a4195d	[ThinLTO] Remove functions resolved to available_externally from comdats Summary: thinLTOResolveWeakForLinkerModule needs to drop any preempted weak symbols that were converted to available_externally from comdats, otherwise we will get a verification failure (since available_externally is a declaration for the linker, and no declarations can be in a comdat). Reviewers: mehdi_amini Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23015 llvm-svn: 278739	2016-08-15 21:00:04 +00:00
David L Kreitzer	7fe18251a5	Enhance SCEV to compute the trip count for some loops with unknown stride. Patch by Pankaj Chawla Differential Revision: https://reviews.llvm.org/D22377 llvm-svn: 278731	2016-08-15 20:21:41 +00:00
Kostya Serebryany	bdb220c7a0	[libFuzzer] print a verbose message after executing inputs in non-fuzzing mode llvm-svn: 278724	2016-08-15 19:44:04 +00:00
Kostya Serebryany	a0d40a21e7	[libFuzzer] fix the bot llvm-svn: 278721	2016-08-15 19:36:13 +00:00
Matthias Braun	b948c52416	Revert "[Thumb] Validate branch target for CBZ/CBNZ instructions." This currently breaks the greendragon clang-stage1-configure-RA/ and brotli. It is probably just uncovering a pre-existing problem. Reverting temporarily to get the buildbots green again. A reduced testcase will follow shortly. This reverts commit r278659. llvm-svn: 278711	2016-08-15 18:50:13 +00:00
Wolfgang Pieb	dfad9b20c9	Local variables whose address is taken and passed on to a call are described in debug info using their stack slots instead of as an indirection of param reg + 0 offset. This is done by detecting FrameIndexSDNodes in SelectionDAG and generating FrameIndexDbgValues for them. This ultimately generates DBG_VALUEs with stack location operands. Differential Revision: http://reviews.llvm.org/D23283 llvm-svn: 278703	2016-08-15 18:18:26 +00:00
Kostya Serebryany	dfbe59b03d	[libFuzzer] add InsertRepeatedBytes and EraseBytes. New mutation: InsertRepeatedBytes. Updated mutation: EraseByte => EraseBytes. This helps https://github.com/google/sanitizers/issues/710 where libFuzzer was not able to find a known bug. Now it finds it in minutes. Hopefully, the change is general enough to help other targets. llvm-svn: 278687	2016-08-15 17:48:28 +00:00
Yaxun Liu	c7cbd72921	AMDGPU: Update AMDGPURuntimeMetadata.h for enums of address space qualifiers llvm-svn: 278682	2016-08-15 16:54:25 +00:00
Matt Arsenault	3661e90e71	AMDGPU: Don't fold subregister extracts into tied operands llvm-svn: 278676	2016-08-15 16:18:36 +00:00
Reid Kleckner	70a600b8bb	Revert "[SimplifyCFG] Rewrite SinkThenElseCodeToEnd" This reverts commit r278660. It causes downstream assertion failure in InstCombine on shuffle instructions. Comes up in __mm_swizzle_epi32. llvm-svn: 278672	2016-08-15 15:42:31 +00:00
Valery Pykhtin	c761675ef4	[AMDGPU] fix failure on printing of non-existing instruction operands. Differential revision: https://reviews.llvm.org/D23323 llvm-svn: 278665	2016-08-15 10:56:48 +00:00
Sjoerd Meijer	58156715b4	MachineLoop: add methods findLoopControlBlock and findLoopPreheader This adds two new utility functions findLoopControlBlock and findLoopPreheader to MachineLoop and MachineLoopInfo. These functions are refactored and taken from the Hexagon target as they are target independent; thus this is intended to be a non-functional change. Differential Revision: https://reviews.llvm.org/D22959 llvm-svn: 278661	2016-08-15 08:22:42 +00:00
James Molloy	9a3c82f5cf	[SimplifyCFG] Rewrite SinkThenElseCodeToEnd The new version has several advantages: 1) IMSHO it's more readable and neater 2) It handles loads and stores properly 3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch. With this change we can now finally sink load-modify-store idioms such as: if (a) return b += 3; else return b += 4; => %z = load i32, i32* %y %.sink = select i1 %a, i32 5, i32 7 %b = add i32 %z, %.sink store i32 %b, i32* %y ret i32 %b When this works for switches it'll be even more powerful. llvm-svn: 278660	2016-08-15 08:04:56 +00:00
Prakhar Bahuguna	a305a435a6	[Thumb] Validate branch target for CBZ/CBNZ instructions. Summary: The assembler currently does not check the branch target for CBZ/CBNZ instructions, which only permit branching forwards with a positive offset. This adds validation for the branch target to ensure negative PC-relative offsets are not encoded into the instruction, whether specified as a literal or as an assembler symbol. Reviewers: rengolin, t.p.northover Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D23312 llvm-svn: 278659	2016-08-15 07:57:44 +00:00
James Molloy	196ad0823e	[LSR] Don't try and create post-inc expressions on non-rotated loops If a loop is not rotated (for example when optimizing for size), the latch is not the backedge. If we promote an expression to post-inc form, we not only increase register pressure and add a COPY for that IV expression but for all IVs! Motivating testcase: void f(float *a, float *b, float *c, int n) { while (n-- > 0) *c++ = *a++ + *b++; } It's imperative that the pointer increments be located in the latch block and not the header block; if not, we cannot use post-increment loads and stores and we have to keep both the post-inc and pre-inc values around until the end of the latch which bloats register usage. llvm-svn: 278658	2016-08-15 07:53:03 +00:00
Craig Topper	f774de6d54	[X86] PADDUSB/W instructions should be commutable. llvm-svn: 278654	2016-08-15 06:31:57 +00:00
Craig Topper	80c8b80919	[X86] Mark some of the X86 SDNodes as commutative. llvm-svn: 278653	2016-08-15 04:47:30 +00:00
Craig Topper	dbc387cfc9	[X86] X86ISD::FANDN is not commutative or associative. llvm-svn: 278652	2016-08-15 04:47:28 +00:00
David Majnemer	3b47a5a562	[ScopedNoAliasAA] collectMDInDomain should be a free function collectMDInDomain doesn't use any class members, making it a free function is not a functional change. llvm-svn: 278651	2016-08-15 03:56:06 +00:00
David Majnemer	8b8869f8ef	[ScopedNoAliasAA] Only collect noalias nodes if we have alias.scope nodes No functional change is intended. llvm-svn: 278646	2016-08-15 02:23:50 +00:00
David Majnemer	ddc7ab26fc	[ScopedNoAliasAA] Replace !ScopeNodes.size() with ScopeNodes.empty() No functional change is intended. llvm-svn: 278645	2016-08-15 02:23:48 +00:00
David Majnemer	c77a1390de	Revert "[ScopedNoAliasAA] Remove an unneccesary set" This reverts commit r278641. I'm not sure why but this has upset the multistage builders... llvm-svn: 278644	2016-08-15 02:23:46 +00:00
David Majnemer	5ec9c58f13	[ScopedNoAliasAA] Remove an unneccesary set We are trying to prove that one group of operands is a subset of another. We did this by populating two Sets and determining that every element within one was inside the other. However, this is unnecessary. We can simply construct a single set and test if each operand is within it. llvm-svn: 278641	2016-08-15 00:13:04 +00:00
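The single-set subset test described in the entry above can be sketched roughly as follows. This is only an illustration under assumptions: `isSubsetOf` and `checkSubsetExamples` are hypothetical names, plain `int` stands in for the metadata operands, and `std::unordered_set` stands in for whatever set type the pass actually uses.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical helper: returns true if every element of `needles` also
// occurs in `haystack`. Only one set is built (from `haystack`); each
// needle is then probed against it, so no second set is required.
bool isSubsetOf(const std::vector<int> &needles,
                const std::vector<int> &haystack) {
  std::unordered_set<int> seen(haystack.begin(), haystack.end());
  for (int n : needles)
    if (seen.count(n) == 0)
      return false; // found an element that is missing from the haystack
  return true;
}

// Small usage example for the sketch above.
void checkSubsetExamples() {
  assert(isSubsetOf({1, 2}, {3, 2, 1}));
  assert(!isSubsetOf({1, 4}, {1, 2, 3}));
}
```

The probe loop exits on the first missing element, so the cost is one set construction plus at most one lookup per operand.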
Craig Topper	37e8c5443c	[AVX-512] Mark VPMADDWD as commutable to match SSE/AVX version. llvm-svn: 278629	2016-08-14 17:57:22 +00:00
Craig Topper	c677e97dff	[AVX-512] Add masked commutable floating point max/min instructions to folding tables. llvm-svn: 278628	2016-08-14 17:57:19 +00:00
Craig Topper	29fbdc309a	[AVX-512] Add masked logical operations to memory folding tables. llvm-svn: 278627	2016-08-14 17:57:16 +00:00
Igor Breger	505f2cc468	[AVX512] Fix VFPCLASSSD/VFPCLASSSS intrinsic lowering. The i1 result should be zero extended according to SPEC. Differential Revision: http://reviews.llvm.org/D23489 llvm-svn: 278626	2016-08-14 13:58:57 +00:00
Igor Breger	8672408db0	[AVX512] Fix insertelement i1 lowering. 1. Use shuffle to insert element i1 into vector. The previous implementation was incorrect (dest_bit OR src_bit; it doesn't clear the bit if src_bit=0). 2. Improve shuffle i1 vector, use CVT2MASK if supported instead of TRUNCATE. Differential Revision: http://reviews.llvm.org/D23347 llvm-svn: 278623	2016-08-14 05:25:07 +00:00
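Why OR-ing the new bit into the destination is wrong can be seen on a scalar bitmask. The sketch below is not the lowering code itself: it models an i1 vector as a `uint16_t` mask, assumes `idx < 16`, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Buggy variant in the spirit of the old lowering: OR can set a bit,
// but it can never clear one that is already set.
uint16_t insertBitOr(uint16_t mask, unsigned idx, bool bit) {
  return static_cast<uint16_t>(mask | (static_cast<unsigned>(bit) << idx));
}

// Correct variant: clear the target lane first, then write the new value.
uint16_t insertBit(uint16_t mask, unsigned idx, bool bit) {
  const uint16_t lane = static_cast<uint16_t>(1u << idx);
  const uint16_t cleared = static_cast<uint16_t>(mask & ~lane);
  return static_cast<uint16_t>(cleared | (bit ? lane : 0));
}

// Inserting 0 into lane 1 of 0b1010 must yield 0b1000.
void checkInsertBitExamples() {
  assert(insertBitOr(0xA, 1, false) == 0xA); // stale bit survives
  assert(insertBit(0xA, 1, false) == 0x8);   // bit is cleared
}
```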
Diana Picus	68be1eb885	Revert "CodeGen: If Convert blocks that would form a diamond when tail-merged." This reverts commit r278287. This commit broke the clang-cmake-thumbv7-a15-full-sh bot. See https://llvm.org/bugs/show_bug.cgi?id=28949 llvm-svn: 278621	2016-08-14 02:10:18 +00:00
Diana Picus	35ccf53e75	Revert "Codegen: Don't tail-duplicate blocks with un-analyzable fallthrough." This reverts commit r278288. r278287 broke the clang-cmake-thumbv7-a15-full-sh bot. Revert this so we can get to r278287. llvm-svn: 278620	2016-08-14 02:10:12 +00:00
Sanjoy Das	35459f0e34	[IRCE] Change variable grouping; NFC llvm-svn: 278619	2016-08-14 01:04:50 +00:00
Sanjoy Das	2143447c73	[IRCE] Create llvm::Loop instances for cloned out loops llvm-svn: 278618	2016-08-14 01:04:46 +00:00
Sanjoy Das	7a18a238c6	[IRCE] Don't iterate on loops that were cloned out IRCE has the ability to further version pre-loops and post-loops that it created, but this isn't useful at all. This change teaches IRCE to leave behind some metadata in the loops it creates (by cloning the main loop) so that these new loops are not re-processed by IRCE. Today this bug is hidden by another bug -- IRCE does not update LoopInfo properly so the loop pass manager does not re-invoke IRCE on the loops it split out. However, once the latter is fixed the bug addressed in this change causes IRCE to infinite-loop in some cases (e.g. it splits out a pre-loop, a pre-pre-loop from that, a pre-pre-pre-loop from that and so on). llvm-svn: 278617	2016-08-14 01:04:36 +00:00
Sanjoy Das	43fdc54303	[IRCE] Add better DEBUG diagnostic; NFC NFC meaning IRCE should not _do_ anything different, but -debug-only=irce will be a little friendlier. llvm-svn: 278616	2016-08-14 01:04:31 +00:00
Mehdi Amini	a71002e7f1	Fix bitcode auto-upgrade when using bitcode lazy loading The auto-upgrade path could be called before the VST (global names) was fully parsed, and thus intrinsic names were not available and the autoupgrade logic could not operate. Fix link failures with ThinLTO. This is a recommit of r278610 with a different fix. llvm-svn: 278615	2016-08-14 00:01:27 +00:00
Ron Lieberman	822ee88ab8	Fix unsupported relocation type R_HEX_6_X' for symbol .rodata LowerTargetConstantPool is not properly setting the TargetFlag to indicate desired relocation. Coding error, the offset parameter was omitted, so the TargetFlag was used as the offset, and the TargetFlag defaulted to zero. This only affects -fpic compilation, and only those items created in a Constant Pool, for example a vector of constants. Halide ran into this issue. llvm-svn: 278614	2016-08-13 23:41:11 +00:00
Mehdi Amini	466a64e298	Revert "Fix bitcode auto-upgrade when using bitcode lazy loading" This reverts commit r278610. Tests are broken llvm-svn: 278613	2016-08-13 23:39:14 +00:00
Sanjoy Das	2a2f14d7ab	[IRCE] Be resilient in the face of non-simplified loops Loops containing `indirectbr` may not be in simplified form, even after running LoopSimplify. Reject them gracefully, instead of tripping an assert. llvm-svn: 278611	2016-08-13 23:36:35 +00:00
Mehdi Amini	e62aaf2303	Fix bitcode auto-upgrade when using bitcode lazy loading The auto-upgrade path could be called before the VST (global names) was fully parsed, and thus intrinsic names were not available and the autoupgrade logic could not operate. Fix link failures with ThinLTO. llvm-svn: 278610	2016-08-13 23:31:53 +00:00
Mehdi Amini	8c629ecf3a	Revert "Revert "Invariant start/end intrinsics overloaded for address space"" This reverts commit 32fc6488e48eafc0ca1bac1bd9cbf0008224d530. llvm-svn: 278609	2016-08-13 23:31:24 +00:00
Mehdi Amini	164ac651da	Revert "Invariant start/end intrinsics overloaded for address space" This reverts commit r276447. llvm-svn: 278608	2016-08-13 23:27:32 +00:00
Sanjoy Das	f2b7bafae4	[IRCE] Use dyn_cast instead of explicit isa/cast; NFC llvm-svn: 278607	2016-08-13 22:00:12 +00:00
Sanjoy Das	d1d62a1354	[IRCE] Use range-for; NFC llvm-svn: 278606	2016-08-13 22:00:09 +00:00
Aditya Kumar	f24939b1f4	Test commit llvm-svn: 278598	2016-08-13 11:56:50 +00:00
Craig Topper	8c372a31b7	[X86] Add a check of isCommutable at the top of X86InstrInfo::findCommutedOpIndices. Most callers don't check if the instruction is commutable before calling. This saves us the trouble of ending up in the default of the switch and having to determine if this is an FMA or not. llvm-svn: 278597	2016-08-13 06:48:44 +00:00
Craig Topper	eafdbecc44	[AVX-512] Add isCommutable to scalar FMA3 instructions. llvm-svn: 278596	2016-08-13 06:48:41 +00:00
Craig Topper	5f2441d8f3	[AVX-512] Add commutable flags to 132 form FMA3 instructions. llvm-svn: 278595	2016-08-13 06:48:39 +00:00
Craig Topper	e5115aa4ca	[X86] Remove patterns for (vzmovl (insert_subvector undef, (scalar_to_vector))) as the (vzmovl VR256) pattern has higher priority. NFC llvm-svn: 278594	2016-08-13 06:02:19 +00:00
Craig Topper	3f8126e6fa	[AVX-512] Remove an AddedComplexity that was prioritizing basic vzmovl patterns over more complex ones that produce better code. llvm-svn: 278593	2016-08-13 05:43:20 +00:00
Craig Topper	600685d510	[AVX-512] Add patterns to support VZEXT_MOVL from 512-bit vectors with 64-bit and 32-bit elements. Fixes PR28961. llvm-svn: 278592	2016-08-13 05:33:12 +00:00
Teresa Johnson	1eca6bc6a7	[PM] Port LoopDataPrefetch to new pass manager Summary: Refactor the existing support into a LoopDataPrefetch implementation class and a LoopDataPrefetchLegacyPass class that invokes it. Add a new LoopDataPrefetchPass for the new pass manager that utilizes the LoopDataPrefetch implementation class. Reviewers: mehdi_amini Subscribers: sanjoy, mzolotukhin, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D23483 llvm-svn: 278591	2016-08-13 04:11:27 +00:00
Matt Arsenault	c1ebd82ebe	AMDGPU: Fix not estimating MBB operand sizes correctly llvm-svn: 278590	2016-08-13 01:43:54 +00:00
Matt Arsenault	3cc1e0066d	AMDGPU: Fix missing test for addressing mode with odd offsets Add test if the constant offset looks unaligned. llvm-svn: 278589	2016-08-13 01:43:51 +00:00
Matt Arsenault	44f6d694b3	AMDGPU/R600: Remove macros llvm-svn: 278588	2016-08-13 01:43:46 +00:00
Hans Wennborg	0dd9ed1d45	Fix more dereferenced end() iterators after r278532 llvm-svn: 278587	2016-08-13 01:12:49 +00:00
Pete Cooper	35b00d5d9e	Constify ValueTracking. NFC. Almost all of the methods here only analyse Values rather than mutating them. Mark all of the easy ones as const. llvm-svn: 278585	2016-08-13 01:05:32 +00:00
Sanjoy Das	3502511548	[IndVars] Ignore (s|z)exts that don't extend the induction variable `IVVisitor::visitCast` used to have the invariant that if the instruction it was passed was a sext or zext instruction, the result of the instruction would be wider than the induction variable. This is no longer true after rL275037, so this change teaches `IndVarSimplify`'s implementation of `IVVisitor::visitCast` to work with the relaxed invariant. A corresponding change to SimplifyIndVar to preserve the said invariant after rL275037 would also work, but given how `IVVisitor::visitCast` is spelled (no indication of said invariant), I figured the current fix is cleaner. Fixes PR28935. llvm-svn: 278584	2016-08-13 00:58:31 +00:00
Eugene Zelenko	3e3a057c20	Fix some Clang-tidy modernize-use-using and Include What You Use warnings. Differential revision: https://reviews.llvm.org/D23478 llvm-svn: 278583	2016-08-13 00:50:41 +00:00
Justin Lebar	d1675aadf6	[LSV] Use a set rather than an ArraySlice at the end of getVectorizablePrefix. NFC Summary: This avoids a small O(n^2) loop. Reviewers: asbirlea Subscribers: mzolotukhin, llvm-commits, arsenm Differential Revision: https://reviews.llvm.org/D23473 llvm-svn: 278581	2016-08-13 00:04:12 +00:00
Justin Lebar	222ceff289	[LSV] Use OrderedBasicBlock instead of rolling it ourselves. NFC Summary: In getVectorizablePrefix, this is less efficient (because we have to iterate over the BB twice), but boy is it simpler. Given how much trouble we've had here, I think the simplicity gain is worthwhile. In reorder(), this is actually more efficient, as DominatorTree::dominates iterates over the BB from the beginning when the two instructions are in the same BB. Reviewers: asbirlea Subscribers: arsenm, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D23472 llvm-svn: 278580	2016-08-13 00:04:08 +00:00
Justin Lebar	cf56e92c50	Minor comment fix ("generate" --> "generates"). llvm-svn: 278578	2016-08-12 23:58:19 +00:00
Hans Wennborg	2d87ccfd58	X86: Fix another dereferenced end() iterator after r278532 llvm-svn: 278577	2016-08-12 23:35:59 +00:00
Haicheng Wu	7c4535d1e7	Reapply [BranchFolding] Restrict tail merging loop blocks after MBP Fixed a bug in the test case. To fix PR28104, this patch restricts tail merging to blocks that belong to the same loop after MBP. llvm-svn: 278575	2016-08-12 23:13:38 +00:00
Dominic Chen	2868fa171a	Avoid accessing LLVM/DWARF register mappings if undefined Summary: If the backend does not define LLVM/DWARF register mappings, the associated variables are undefined since the map initializer is called by auto-generated TableGen routines. This patch initializes the pointers and sizes to nullptr and zero, respectively, and checks that they are valid before searching for a mapping. Reviewers: grosbach, dschuff Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23458 llvm-svn: 278574	2016-08-12 23:12:59 +00:00
Tim Shen	c9c0d2dcb5	[LoopVectorize] Detect loops in the innermost loop before creating InnerLoopVectorizer InnerLoopVectorizer shouldn't handle a loop with cycles inside the loop body, even if that cycle isn't a natural loop. Fixes PR28541. Differential Revision: https://reviews.llvm.org/D22952 llvm-svn: 278573	2016-08-12 22:47:13 +00:00
Duncan P. N. Exon Smith	69b0650548	X86: Stop dereferencing end() in X86FrameLowering::emitEpilogue On a Windows build of Chromium, r278532 (up to r278539) X86FrameLowering::emitEpilogue because it wasn't wary enough of the return of MachineBasicBlock::getFirstTerminator. Guard all the uses here. Note that r278532 looks like an NFC commit (just an API change), but it removes a couple of layers of abstraction and is probably causing optimization differences in MSVC. llvm-svn: 278572	2016-08-12 22:43:33 +00:00
Reid Kleckner	6ee00a2602	[Inliner] Don't treat inalloca allocas as static They aren't static, and moving them to the entry block across something else will only result in tears. Root cause of http://crbug.com/636558. llvm-svn: 278571	2016-08-12 22:23:04 +00:00
Artem Belevich	2f0a3dfe64	[NVPTX] Use untyped (.b) integer registers in PTX. This brings LLVM-generated PTX closer to what nvcc generates and avoids triggering issues in ptxas. For instance, ptxas does not accept .s16 (or .u16) registers as operands for .fp16 instructions. Differential Revision: https://reviews.llvm.org/D23460 llvm-svn: 278568	2016-08-12 22:02:19 +00:00
David L Kreitzer	9667417a1a	Fixed typo. llvm-svn: 278565	2016-08-12 21:06:53 +00:00
Krzysztof Parzyszek	f285963608	[Hexagon] Cleanup and standardize vector load/store pseudo instructions Remove the following single-vector load/store pseudo instructions, use real instructions instead: LDriv_pseudo_V6 STriv_pseudo_V6 LDriv_pseudo_V6_128B STriv_pseudo_V6_128B LDrivv_indexed STrivv_indexed LDrivv_indexed_128B STrivv_indexed_128B Rename the double-vector load/store pseudo instructions, add unaligned counterparts: -- old -- -- new -- -- unaligned -- LDrivv_pseudo_V6 PS_vloadrw_io PS_vloadrwu_io LDrivv_pseudo_V6_128B PS_vloadrw_io_128B PS_vloadrwu_io_128B STrivv_pseudo_V6 PS_vstorerw_io PS_vstorerwu_io STrivv_pseudo_V6_128B PS_vstorerw_io_128 PS_vstorerwu_io_128 llvm-svn: 278564	2016-08-12 21:05:05 +00:00
Eli Friedman	f184e4befc	[AArch64LoadStoreOptimizer] Check aliasing correctly when creating paired loads/stores. The existing code accidentally skipped the aliasing check in edge cases. Differential revision: https://reviews.llvm.org/D23372 llvm-svn: 278562	2016-08-12 20:39:51 +00:00
Mike Aizatsky	f4fdb5ddf3	[AArch64] Registering default MCInstrAnalysis Even in this form it is useful: it can detect branch instructions. https://github.com/google/sanitizers/issues/706 Subscribers: aemerson, rengolin Differential Revision: https://reviews.llvm.org/D23426 llvm-svn: 278560	2016-08-12 20:28:05 +00:00
Eli Friedman	8585e9d33d	[AArch64LoadStoreOpt] Handle offsets correctly for post-indexed paired loads. Trunk would try to create something like "stp x9, x8, [x0], #512", which isn't actually a valid instruction. Differential revision: https://reviews.llvm.org/D23368 llvm-svn: 278559	2016-08-12 20:28:02 +00:00
Kevin Enderby	c614d283b7	Next set of additional error checks for invalid Mach-O files. This contains the two missing checks for LC_SEGMENT load command fields, and checks for the Mach-O section fields that would make them invalid. With the new checks, some of the existing malformed file checks now trip one of these instead of the issue they were hitting before, so those tests were adjusted. llvm-svn: 278557	2016-08-12 20:10:25 +00:00
Tim Shen	dc698c3e91	[PPC] Memoize getValueBits. NFC. Summary: It triggers exponential behavior when the DAG has many branches. Reviewers: hfinkel, kbarton Subscribers: iteratee, nemanjai, echristo Differential Revision: https://reviews.llvm.org/D23428 llvm-svn: 278548	2016-08-12 18:40:04 +00:00
Benjamin Kramer	9bc1b230fd	[WebAssembly] Plug MachineMemOperand leaks. llvm-svn: 278545	2016-08-12 18:33:50 +00:00
Dan Liew	ed3c9cae49	[LibFuzzer] Fix `-jobs=<N>` where <N> > 1 and the number of workers is > 1 on macOS. The original `ExecuteCommand()` called `system()` from the C library. The C library implementation of this on macOS contains a mutex which serializes calls to `system()`. This prevented the `-jobs=` flag from running copies of the fuzzing binary in parallel which is the opposite of what is intended. To fix this on macOS an alternative implementation of `ExecuteCommand()` is provided that can be used concurrently. This is provided in `FuzzerUtilDarwin.cpp` which is guarded to only compile code on Apple platforms. The existing implementation has been moved to a new file `FuzzerUtilLinux.cpp` which is guarded to only compile code on Linux. This commit includes a simple test to check that LibFuzzer is being executed in parallel when requested. Differential Revision: https://reviews.llvm.org/D22742 llvm-svn: 278544	2016-08-12 18:29:36 +00:00
Michael Kuperstein	31b8399beb	[PM] Port LowerInvoke to the new pass manager llvm-svn: 278531	2016-08-12 17:28:27 +00:00
Pete Cooper	980a935e27	constify InstCombine::foldAllocaCmp. NFC. This is part of an effort to constify ValueTracking.cpp. This change is to methods which need const Value* instead of Value* to go with the upcoming changes to ValueTracking. llvm-svn: 278528	2016-08-12 17:13:28 +00:00
Dehao Chen	c0a1e432c7	Fine tuning of sample profile propagation algorithm. Summary: The refined propagation algorithm is more accurate and robust. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23224 llvm-svn: 278522	2016-08-12 16:22:12 +00:00
Artur Pilipenko	87e4038a91	[x86] X86ISelLowering zext(add_nuw(x, C)) --> add(zext(x), C_zext) Currently X86ISelLowering has a similar transformation for sexts: sext(add_nsw(x, C)) --> add(sext(x), C_sext) In this change I extend this code to handle zexts as well. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D23359 llvm-svn: 278520	2016-08-12 16:08:30 +00:00
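The identity behind the entry above, zext(add_nuw(x, C)) == add(zext(x), zext(C)), holds only when the narrow add does not wrap. A small scalar sketch, assuming an 8-bit to 32-bit extension and using hypothetical function names, shows both the valid case and the wrapping case:

```cpp
#include <cassert>
#include <cstdint>

// True when the 8-bit add x + c does not wrap (the "nuw" condition).
bool addHasNoUnsignedWrap(uint8_t x, uint8_t c) {
  return static_cast<unsigned>(x) + static_cast<unsigned>(c) <= 0xFFu;
}

// zext(add(x, c)): add in 8 bits (wrapping), then widen to 32 bits.
uint32_t narrowAddThenZext(uint8_t x, uint8_t c) {
  return static_cast<uint32_t>(static_cast<uint8_t>(x + c));
}

// add(zext(x), zext(c)): widen both operands first, then add in 32 bits.
uint32_t zextThenWideAdd(uint8_t x, uint8_t c) {
  return static_cast<uint32_t>(x) + static_cast<uint32_t>(c);
}

// Exhaustively checks that the two forms agree whenever the add is nuw.
bool formsAgreeUnderNuw() {
  for (unsigned x = 0; x <= 0xFF; ++x)
    for (unsigned c = 0; c <= 0xFF; ++c) {
      const uint8_t x8 = static_cast<uint8_t>(x);
      const uint8_t c8 = static_cast<uint8_t>(c);
      if (addHasNoUnsignedWrap(x8, c8) &&
          narrowAddThenZext(x8, c8) != zextThenWideAdd(x8, c8))
        return false;
    }
  return true;
}
```

Without the no-wrap guarantee the two forms differ: for x = 200 and C = 100 the narrow add wraps to 44, while the widened add gives 300.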
Ehsan Amiri	17e1701075	[BasicAA] Avoid calling GetUnderlyingObject, when the result of a previous call can be reused. Recursive calls to aliasCheck from alias[GEP|Select|PHI] may result in a second call to GetUnderlyingObject for a Value, whose underlying object is already computed. This patch ensures that in these situations, the underlying object is not computed again, and the result of the previous call is reused. https://reviews.llvm.org/D22305 llvm-svn: 278519	2016-08-12 16:05:03 +00:00
Artur Pilipenko	2e8f82d962	[LVI] Take guards into account Teach LVI to gather control-dependent constraints from guards. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D23358 llvm-svn: 278518	2016-08-12 15:52:23 +00:00
Geoff Berry	22dfbc5637	[AArch64] Re-factor code shared by AArch64LoadStoreOpt and AArch64InstrInfo. This re-factoring could cause the following slight changes in generated code, though none were observed during testing: - MachineScheduler could decide not to cluster some loads/stores if there are other load/stores with non-pairable opcodes that have the same base register and offset as a pairable set of load/stores. One case of different MachineScheduler pairing did show up in my testing, but it wasn't due to this issue, but due to BaseMemOpClusterMutation::clusterNeighboringMemOps() being unstable w.r.t. the order it considers memory operations. See PR28942. - The ImplicitNullChecks optimization could be done for more load/store opcodes. This optimization isn't done for C/C++ code, so it didn't show up in my testing. Reviewers: mcrosier, t.p.northover Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D23365 llvm-svn: 278515	2016-08-12 15:26:00 +00:00
Artur Pilipenko	b623088abe	[LVI] Fix potential memory corruption in getValueFromCondition Rewrite Visited[Cond] = getValueFromConditionImpl(..., Visited) statement which can lead to a memory corruption since getValueFromConditionImpl changes Visited map and invalidates the iterators. llvm-svn: 278514	2016-08-12 15:08:15 +00:00
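The hazard in the entry above is a statement of the form `Visited[Cond] = compute(..., Visited)`, where the callee itself inserts into the map. A safe shape is to finish the computation first and store the result afterwards. The sketch below only illustrates that pattern: `std::unordered_map` stands in for the LLVM map type, and `sumTo` / `sumToImpl` are hypothetical names with a toy recursive computation.

```cpp
#include <cassert>
#include <unordered_map>

using Cache = std::unordered_map<int, long>;

long sumToImpl(int n, Cache &visited);

// Memoized wrapper. The recursive call may insert into `visited`, so the
// result is computed into a local before the map entry for `n` is touched.
long sumTo(int n, Cache &visited) {
  auto it = visited.find(n);
  if (it != visited.end())
    return it->second;
  // Do NOT write `visited[n] = sumToImpl(n, visited);` with a container
  // whose element references are invalidated when it grows.
  const long result = sumToImpl(n, visited);
  visited[n] = result;
  return result;
}

// Toy computation: 0 + 1 + ... + n, recursing through the memoized wrapper.
long sumToImpl(int n, Cache &visited) {
  if (n <= 0)
    return 0;
  return n + sumTo(n - 1, visited);
}

// Small usage example for the sketch above.
void checkSumToExample() {
  Cache cache;
  assert(sumTo(4, cache) == 10);
}
```

Computing into a local costs nothing and removes any dependence on the order in which the two sides of the assignment are evaluated.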
Duncan P. N. Exon Smith	0d2ed35d3e	ADT: Share code for embedded sentinel traits, NFC Share code for the (mostly problematic) embedded sentinel traits. - Move the LLVM_NO_SANITIZE("object-size") attribute to ilist_half_embedded_sentinel_traits and ilist_embedded_sentinel_traits (previously it spread throughout the code duplication). - Add an ilist_full_embedded_sentinel_traits which has no UB (but has the downside of storing the complete node). - Replace all the custom sentinel traits in LLVM with a declaration of ilist_sentinel_traits that inherits from one of the embedded sentinel traits classes. There are still custom sentinel traits in other LLVM subprojects. I'll remove those in a follow-up. Nothing at all should be changing here, this is just rearranging code. Note that the final goal here is to remove the sentinel traits altogether, settling on the memory layout of ilist_half_embedded_sentinel_traits without the UB. This intermediate step moves the logic into ilist.h. llvm-svn: 278513	2016-08-12 15:00:55 +00:00
James Y Knight	2cc9da9a65	Revert "[Sparc] Leon errata fix passes." ...and the two followup commits: Revert "[Sparc][Leon] Missed resetting option flags from check-in 278489." Revert "[Sparc][Leon] Errata fixes for various errata in different versions of the Leon variants of the Sparc 32 bit processor." This reverts commit r274856, r278489, and r278492. llvm-svn: 278511	2016-08-12 14:48:09 +00:00
Teresa Johnson	4223dd8559	[PM] Port NameAnonFunction pass to new pass manager Summary: Port the NameAnonFunction pass and add a test. Depends on D23439. Reviewers: mehdi_amini Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23440 llvm-svn: 278509	2016-08-12 14:03:36 +00:00
Teresa Johnson	f93b246f8b	[PM] Port ModuleSummaryIndex analysis to new pass manager Summary: Port the ModuleSummaryAnalysisWrapperPass to the new pass manager. Use it in the ported BitcodeWriterPass (similar to how we use the legacy ModuleSummaryAnalysisWrapperPass in the legacy WriteBitcodePass). Also, pass the -module-summary opt flag through to the new pass manager pipeline and through to the bitcode writer pass, and add a test that uses it. Reviewers: mehdi_amini Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23439 llvm-svn: 278508	2016-08-12 13:53:02 +00:00
Simon Pilgrim	687d71e877	[X86][SSE] Add support for combining target shuffles to PSLLDQ/PSRLDQ byte shifts llvm-svn: 278502	2016-08-12 11:24:34 +00:00
Krzysztof Parzyszek	be976d4ea9	[Hexagon] Standardize pseudo-instructions for calls and returns - CALLv3nr PS_call_nr - CALLRv3nr PS_callr_nr - CALLstk PS_call_stk - TCRETURNi PS_tailcall_i - TCRETURNr PS_tailcall_r - JMPret PS_jmpret - JMPrett PS_jmprett - JMPretf PS_jmpretf - JMPrettnew PS_jmprettnew - JMPretfnew PS_jmpretfnew - JMPrettnewpt PS_jmprettnewpt - JMPretfnewpt PS_jmpretfnewpt llvm-svn: 278499	2016-08-12 11:12:02 +00:00
Krzysztof Parzyszek	ab9127ca3c	[Hexagon] Treat non-returning indirect calls as scheduling boundaries llvm-svn: 278498	2016-08-12 11:01:10 +00:00
Artur Pilipenko	6669f253d5	[LVI] Take range metadata into account while calculating icmp condition constraints Take range metadata into account for conditions like this: %length = load i32, i32* %length_ptr, !range !{i32 0, i32 2147483647} %cmp = icmp ult i32 %a, %length This is a common pattern for range checks where the length of the array is dynamically loaded. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D23267 llvm-svn: 278496	2016-08-12 10:14:11 +00:00
Simon Pilgrim	ed96b9adfb	[X86][SSE] Fixed PALIGNR target shuffle decode The PALIGNR target shuffle decode was not taking into account that DecodePALIGNRMask (rather oddly) expects the operands to be in reverse order, nor was it detecting unary patterns, causing combines to combine with the incorrect input. The cgbuiltin, auto upgrade and instruction comments code correctly swap the operands so are not affected. llvm-svn: 278494	2016-08-12 10:10:51 +00:00
Artur Pilipenko	635625855f	[LVI] Handle any predicate in comparisons like icmp <pred> (add Val, Offset), ... Currently LVI can only gather value constraints from comparisons like: * icmp <pred> Val, ... * icmp ult (add Val, Offset), ... In fact we can handle any predicate in latter comparisons. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D23357 llvm-svn: 278493	2016-08-12 10:05:11 +00:00
Chris Dewhurst	5247af24c3	[Sparc][Leon] Missed resetting option flags from check-in 278489. llvm-svn: 278492	2016-08-12 09:54:39 +00:00
Chris Dewhurst	829f8efe55	[Sparc][Leon] Errata fixes for various errata in different versions of the Leon variants of the Sparc 32 bit processor. The nature of the errata are listed in the comments preceding the errata fix passes. Relevant unit tests are implemented for each of these. These changes update older versions of these errata fixes with improvements to code and unit tests. Differential Revision: https://reviews.llvm.org/D21960 llvm-svn: 278489	2016-08-12 09:34:26 +00:00
Benjamin Kramer	bbff9c6130	[Coroutines] Move class into anonymous namespace. Hopefully fixes visibility warnings from GCC. No functionality change. llvm-svn: 278485	2016-08-12 08:47:13 +00:00
Haicheng Wu	d9cbb1608f	Revert "[BranchFolding] Restrict tail merging loop blocks after MBP" This reverts commit r278463 because it hits the bot. llvm-svn: 278484	2016-08-12 08:40:24 +00:00
Gor Nishanov	0f303accde	[Coroutines]: Part6b: Add coro.id intrinsic. Summary: 1. Make coroutine representation more robust against optimizations that may duplicate instructions by introducing a coro.id intrinsic that returns a token that will get fed into coro.alloc and coro.begin. Due to coro.id returning a token, it won't get duplicated and can be used as a reliable indicator of coroutine identity when a particular coroutine call gets inlined. 2. Move last three arguments of coro.begin into coro.id as they will be shared if coro.begin gets duplicated. 3. doc + test + code updated to support the new intrinsic. Reviewers: mehdi_amini, majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D23412 llvm-svn: 278481	2016-08-12 05:45:49 +00:00
Duncan P. N. Exon Smith	f197b1f78f	ADT: Remove all ilist_iterator => pointer casts, NFC Remove all ilist_iterator to pointer casts. There were two reasons for casts: - Checking for an uninitialized (i.e., null) iterator. I added MachineInstrBundleIterator::isValid() to check for that case. - Comparing an iterator against the underlying pointer value while avoiding converting the pointer value to an iterator. This is occasionally necessary in MachineInstrBundleIterator, since there is an assertion in the constructors that the underlying MachineInstr is not bundled (but we don't care about that if we're just checking for pointer equality). To support the latter case, I rewrote the == and != operators for ilist_iterator and MachineInstrBundleIterator. - The implicit constructors now use enable_if to exclude const-iterator => non-const-iterator conversions from overload resolution (previously it was a compiler error on instantiation, now it's SFINAE). - The == and != operators are now global (friends), and are not templated. - MachineInstrBundleIterator has overloads to compare against both const_pointer and const_reference. This avoids the implicit conversions to MachineInstrBundleIterator that assert, instead just checking the address (and I added unit tests to confirm this). Notably, the only remaining uses of ilist_iterator::getNodePtrUnchecked are in ilist.h, and no code outside of ilist.h directly relies on this UB end-iterator-to-pointer conversion anymore. It's still needed for ilist_sentinel_traits, but I'll clean that up soon. llvm-svn: 278478	2016-08-12 05:05:36 +00:00
David Majnemer	91a02f5bee	Use the range variant of transform instead of unpacking begin/end No functionality change is intended. llvm-svn: 278477	2016-08-12 04:32:45 +00:00
David Majnemer	2d006e7673	Use the range variant of transform instead of unpacking begin/end No functionality change is intended. llvm-svn: 278476	2016-08-12 04:32:42 +00:00
David Majnemer	c700490f48	Use the range variant of remove_if instead of unpacking begin/end No functionality change is intended. llvm-svn: 278475	2016-08-12 04:32:37 +00:00
David Majnemer	0da5afe717	Use the range variant of count_if instead of unpacking begin/end No functionality change is intended. llvm-svn: 278474	2016-08-12 04:32:29 +00:00
David Majnemer	42531260b3	Use the range variant of find/find_if instead of unpacking begin/end If the result of the find is only used to compare against end(), just use is_contained instead. No functionality change is intended. llvm-svn: 278469	2016-08-12 03:55:06 +00:00
Haicheng Wu	ea02372059	[BranchFolding] Restrict tail merging loop blocks after MBP To fix PR28014, this patch restricts tail merging to blocks that belong to the same loop after MBP. Differential Revision: https://reviews.llvm.org/D23191 llvm-svn: 278463	2016-08-12 03:30:23 +00:00
Ivan Krasin	89439a7939	WholeProgramDevirt: initialize WasDevirt in all constructors. Summary: This is a follow up to r278389 and r278442. Differential Revision: https://reviews.llvm.org/D23438 llvm-svn: 278455	2016-08-12 01:40:10 +00:00
Eli Friedman	a6707f56b5	[DSE] Don't remove stores made live by a call which unwinds. Issue exposed by noalias or more aggressive alias analysis. Fixes http://llvm.org/PR25422. Differential revision: https://reviews.llvm.org/D21007 llvm-svn: 278451	2016-08-12 01:09:53 +00:00
Pete Cooper	54a0255679	Refactor isValidAssumeForContext to reduce duplication and indentation. NFC. This method had some duplicate code when we did or did not have a dom tree. Refactor it to remove the duplication, but also clean up the control flow to have less duplication. llvm-svn: 278450	2016-08-12 01:00:15 +00:00
David Majnemer	562e82945e	Use the range variant of find_if instead of unpacking begin/end No functionality change is intended. llvm-svn: 278443	2016-08-12 00:18:03 +00:00
Xinliang David Li	1ce88fa0a5	Add comment /NFC llvm-svn: 278438	2016-08-11 23:09:56 +00:00
Xinliang David Li	cbb5e02f4a	Fix typos /NFC llvm-svn: 278436	2016-08-11 22:34:00 +00:00
Pete Cooper	fa7ae4f3b6	Remove unnecessary extra version of isValidAssumeForContext. NFC. There were 2 versions of this method. A public one which takes a const Instruction* and a private implementation which takes a mutable Value* and casts to an Instruction. There was no need for the 2 versions as all callers pass a const Instruction and there was no need for a mutable pointer as we only do analysis here. llvm-svn: 278434	2016-08-11 22:23:07 +00:00
David Majnemer	0d955d0bf5	Use the range variant of find instead of unpacking begin/end If the result of the find is only used to compare against end(), just use is_contained instead. No functionality change is intended. llvm-svn: 278433	2016-08-11 22:21:41 +00:00
Piotr Padlewski	332b3b2210	Don't import variadic functions Summary: This patch adds IsVariadicFunction bit to summary in order to not import variadic functions. Inliner doesn't inline variadic functions because it is hard to reason about it. This one small fix improves Importer by about 16% (going from 86% to 100% of imported functions that are inlined anywhere) on some spec benchmarks like 'int' and others. Reviewers: eraman, mehdi_amini, tejohnson Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D23339 llvm-svn: 278432	2016-08-11 22:13:57 +00:00
Vyacheslav Klochkov	6daefcf626	X86-FMA3: Implemented commute transformation for EVEX/AVX512 FMA3 opcodes. This helped to improve memory-folding and register coalescing optimizations. Also, this patch fixed the tracker #17229. Reviewer: Craig Topper. Differential Revision: https://reviews.llvm.org/D23108 llvm-svn: 278431	2016-08-11 22:07:33 +00:00
Rui Ueyama	8950a53821	Re-commit r278066: Do not ignore SizeOfOptionalHeader in COFF header even if PE header is not present. llvm-svn: 278429	2016-08-11 22:02:44 +00:00
Tim Northover	8e0c53a018	GlobalISel: support 'null' constant in translation. It's sharing the integer G_CONSTANT for now since I don't think it creates any ambiguity (even on weird archs). If that turns out wrong we can create a G_PTRCONSTANT or something. llvm-svn: 278423	2016-08-11 21:40:55 +00:00
Ehsan Amiri	dbcfea9811	Extend trip count instead of truncating IV in LFTR, when legal When legal, extending trip count in the loop control logic generates better code compared to truncating IV. This is because (1) extending trip count is a loop invariant operation (see genLoopLimit where we prove trip count is loop invariant). (2) Scalar Evolution seems to have problems understanding trunc when computing loop trip count. So removing them allows better analysis performed in Scalar Evolution. (In particular this fixes PR 28363 which is the motivation for this change). I am not going to perform any performance test. Any degradation caused by this should be an indication of a bug elsewhere. To prove legality, we rely on SCEV to prove zext(trunc(IV)) == IV (or similarly for sext). If this holds, we can prove equivalence of trunc(IV)==ExitCnt (1) and IV == zext(ExitCnt). Simply take zext of both sides of (1) and apply the proven equivalence. This commit contains changes in a newly added testcase which was not included in the previous commit (which was reverted later on). https://reviews.llvm.org/D23075 llvm-svn: 278421	2016-08-11 21:31:40 +00:00
Daniel Berlin	da2f38e0f4	[MSSA] Use is_contained llvm-svn: 278418	2016-08-11 21:26:50 +00:00
David Majnemer	0a16c22846	Use range algorithms instead of unpacking begin/end No functionality change is intended. llvm-svn: 278417	2016-08-11 21:15:00 +00:00
Krzysztof Parzyszek	1b689da04e	[Hexagon] Allow non-returning calls in hardware loops llvm-svn: 278416	2016-08-11 21:14:25 +00:00
Matt Arsenault	18da70dd2d	AMDGPU: Remove unused tablegen utilities llvm-svn: 278414	2016-08-11 21:08:43 +00:00
Geoff Berry	d01828096f	[SCEV] Update interface to handle SCEVExpander insert point motion. Summary: This is an extension of the fix in r271424. That fix dealt with builder insert points being moved by SCEV expansion, but only for the lifetime of the expand call. This change modifies the interface so that LSR can safely call expand multiple times at the same insert point and do the right thing if one of the expansions decides to move the original insert point. This is a fix for PR28719. Reviewers: sanjoy Subscribers: llvm-commits, mcrosier, mzolotukhin Differential Revision: https://reviews.llvm.org/D23342 llvm-svn: 278413	2016-08-11 21:05:17 +00:00
Teresa Johnson	faa7506f18	Fix type truncation warnings Avoid type truncation warnings from a 32-bit bot due to size_t not being unsigned long long, by converting the variables and constants to unsigned. This was introduced by r278338 and caused warnings here: http://bb.pgr.jp/builders/i686-mingw32-RA-on-linux/builds/15527/steps/build_llvmclang/logs/warnings%20%287%29 llvm-svn: 278406	2016-08-11 20:38:39 +00:00
Wei Ding	70cda07526	AMDGPU : Add intrinsic for instruction v_cvt_pk_u8_f32 Differential Revision: http://reviews.llvm.org/D23336 llvm-svn: 278403	2016-08-11 20:34:48 +00:00
Daniel Berlin	f75fd1b58b	Fix PR 28933 Summary: This fixes PR 28933 by making sure GVNHoist does not try to recreate memory accesses when it has not actually moved them. Reviewers: sebpop Subscribers: llvm-commits, george.burgess.iv Differential Revision: https://reviews.llvm.org/D23411 llvm-svn: 278401	2016-08-11 20:32:43 +00:00
Duncan P. N. Exon Smith	38eea4a76f	CodeGen: Avoid dereferencing end() in MachineScheduler Check MachineInstr::isDebugValue for the same instruction as we're calling isSchedBoundary, avoiding the possibility of dereferencing end(). This is a functionality change even when I!=end(). Matthias had a look and agrees this is the right resolution (as opposed to checking for end()). This is triggered by a huge number of tests, but they happen to magically pass right now. I found this because WIP patches for PR26753 convert them into crashes. llvm-svn: 278394	2016-08-11 20:03:09 +00:00
Matt Arsenault	2ffe8fd2ce	AMDGPU: Prune includes llvm-svn: 278391	2016-08-11 19:18:50 +00:00
Krzysztof Parzyszek	258af19d99	[Hexagon] Standardize "select" pseudo-instructions - PS_pselect: general register pairs - PS_vselect: vector registers (+ 128B version) - PS_wselect: vector register pairs (+ 128B version) llvm-svn: 278390	2016-08-11 19:12:18 +00:00
Ivan Krasin	f3403fd2c8	WholeProgramDevirt: generate more detailed and accurate remarks. Summary: Keep track of all methods for which we have devirtualized at least one call and then print them sorted alphabetically. That avoids duplicates and also makes the order deterministic. Add optimization names into the remarks, so that it's easier to understand how each method has been devirtualized. Fix a bug where the wrong methods could have been reported for tryVirtualConstProp. Reviewers: kcc, mehdi_amini Differential Revision: https://reviews.llvm.org/D23297 llvm-svn: 278389	2016-08-11 19:09:02 +00:00
Krzysztof Parzyszek	a003b76391	If-conversion incorrectly calculates liveness of redefined registers Differential Revision: https://reviews.llvm.org/D23207 llvm-svn: 278383	2016-08-11 18:42:06 +00:00
Andrew Kaylor	7cdf01ef58	Target independent codesize heuristics for Loop Idiom Recognition Patch by Sunita Marathe Differential Revision: https://reviews.llvm.org/D21449 llvm-svn: 278378	2016-08-11 18:28:33 +00:00
Easwaran Raman	61edc107bb	Add a new method to create SimpleInliner instance and make pre-inliner use this. This adds a createFunctionInliningPass method that takes an InlineParams object and uses it to create the pre-inliner pass. This prevents the regular inliner's threshold flag from influencing the preinliner. Differential revision: https://reviews.llvm.org/D23377 llvm-svn: 278377	2016-08-11 18:24:08 +00:00
Krzysztof Parzyszek	60f0b51485	[Hexagon] Skip byval arguments when checking parameter attributes From the point of view of register assignment, byval parameters are ignored: a byval parameter is not going to be assigned to a register, and it will not affect the assignments of subsequent parameters. When matching registers with parameters in the bit tracker, make sure to skip byval parameters before advancing the registers. llvm-svn: 278375	2016-08-11 18:15:16 +00:00
Dominic Chen	6ba19659cb	Improve virtual register handling when computing debug information Summary: Some backends, like WebAssembly, use virtual registers instead of physical registers. This crashes the DbgValueHistoryCalculator pass, which assumes that all registers are physical. Instead, skip virtual registers when iterating aliases, and assume that they are clobbered. Reviewers: dexonsmith, dschuff, aprantl Subscribers: yurydelendik, llvm-commits, jfb, sunfish Differential Revision: https://reviews.llvm.org/D22590 llvm-svn: 278371	2016-08-11 17:52:40 +00:00
Michael Kuperstein	e36d7716c3	Make TwoAddressInstructionPass::rescheduleMIBelowKill subreg-aware This fixes PR28824. Differential Revision: https://reviews.llvm.org/D23220 llvm-svn: 278370	2016-08-11 17:38:33 +00:00
Matt Arsenault	56684d4538	AMDGPU: Fix crashes on memory functions llvm-svn: 278369	2016-08-11 17:31:42 +00:00
Matt Arsenault	76837df6ff	AArch64: Assert on analyzeBranch failing llvm-svn: 278366	2016-08-11 17:22:59 +00:00
Michael Kuperstein	ee900b62ef	[AliasSetTracker] Delete dead code Deletes unused remove() and containsPointer() interfaces. NFC. Differential Revision: https://reviews.llvm.org/D23360 llvm-svn: 278365	2016-08-11 17:20:20 +00:00
Eugene Zelenko	cdc7161281	Fix some Clang-tidy modernize and Include What You Use warnings. Differential revision: https://reviews.llvm.org/D23291 llvm-svn: 278364	2016-08-11 17:20:18 +00:00
Matt Arsenault	4b5fc093d0	AMDGPU: Remove custom getSubReg This was kind of confusing, the subregister class shouldn't really be necessary. llvm-svn: 278362	2016-08-11 17:15:32 +00:00
Matt Arsenault	69fd2c1179	AMDGPU: Remove unused tracking of flat instructions llvm-svn: 278361	2016-08-11 17:15:28 +00:00
Duncan P. N. Exon Smith	ec30cc2171	Hexagon: Avoid dereferencing end() in HexagonCopyToCombine::findPairable Check for end() before skipping through debug values. This avoids dereferencing end() when the instruction is the final one in the basic block. (It still assumes that a debug value will not be the final instruction in the basic block. No tests seemed to violate that.) Many Hexagon tests trigger this, but they happen to magically pass right now. I found this because WIP patches for PR26753 convert them into crashes. llvm-svn: 278355	2016-08-11 16:40:03 +00:00
Wei Ding	34e1753585	AMDGPU : Add LLVM intrinsics for SAD related instructions. Differential Revision: http://reviews.llvm.org/D23133 llvm-svn: 278354	2016-08-11 16:33:53 +00:00
Tim Northover	0d51044b69	GlobalISel: clear vreg mapping after translating each function Otherwise we only materialize (shared) constants in the first function they appear in. This doesn't go well. llvm-svn: 278351	2016-08-11 16:21:29 +00:00
Reid Kleckner	26f9e9ebc3	Remove FIXME about asserting on the end iterator After machine block placement, MBBs may not have terminators, and it is appropriate to check for the end iterator here. We can fold the check into the next if, as well. This loop is really just looking for BBs that end in CATCHRET. llvm-svn: 278350	2016-08-11 16:00:43 +00:00
Lang Hames	30526070ab	[MCJIT] Improve documentation and error handling for MCJIT::runFunction. ExecutionEngine::runFunction is supposed to allow execution of arbitrary function types, but MCJIT can only reasonably support a limited subset of main-like function types. This patch documents this limitation, and fixes MCJIT::runFunction to abort with a meaningful error at runtime if called with an unsupported function type. llvm-svn: 278348	2016-08-11 15:56:23 +00:00
Duncan P. N. Exon Smith	62e351f5a4	X86: Use operator lookup for operator==, NFC Avoid relying on the MachineInstrBundleIterator operator== being implemented as a member function. llvm-svn: 278347	2016-08-11 15:51:29 +00:00
Duncan P. N. Exon Smith	43724649c3	IR: Don't cast the end iterator to Instruction* End iterators are usually sentinels, not actually Instruction* at all. Stop casting to it just to get an iterator back. There is likely no observable functionality change here right now (although this is relying on UB, I doubt it was triggering anything), but I'll be removing the cast soon. llvm-svn: 278346	2016-08-11 15:45:04 +00:00
Duncan P. N. Exon Smith	2e7af979b9	CodeGen: Check for a terminator in llvm::getFuncletMembership Check for an end iterator from MachineBasicBlock::getFirstTerminator in llvm::getFuncletMembership. If this is turned into an assertion, it fires in 48 X86 testcases (for example, CodeGen/X86/regalloc-spill-at-ehpad.ll). Since this is likely a latent bug (shouldn't all basic blocks end with a terminator?) I've filed PR28938. llvm-svn: 278344	2016-08-11 15:29:02 +00:00
Matthew Simpson	3f69195b9e	[SLP] Make RecursionMaxDepth a command line option (NFC) llvm-svn: 278343	2016-08-11 15:28:45 +00:00
Sanjay Patel	38ae83de38	fix comment; NFC llvm-svn: 278342	2016-08-11 15:23:56 +00:00
Sanjay Patel	e3c335cbed	use auto* with dyn_cast ; NFC llvm-svn: 278340	2016-08-11 15:21:21 +00:00
Sanjay Patel	5a470950b9	getParent()->getParent() == getFunction() ; NFC llvm-svn: 278339	2016-08-11 15:16:06 +00:00
Teresa Johnson	9ba95f99f3	Restore "Resolution-based LTO API." This restores commit r278330, with fixes for a few bot failures: - Fix a late change I had made to the save temps output file that I missed due to existing files sitting on my disk - Fix a bunch of Windows bot failures with "ambiguous call to overloaded function" due to confusion between llvm::make_unique vs std::make_unique (preface the new make_unique calls with "llvm::") - Attempt to fix a modules bot failure by adding a missing include to LTO/Config.h. Original change: Resolution-based LTO API. Summary: This introduces a resolution-based LTO API. The main advantage of this API over existing APIs is that it allows the linker to supply a resolution for each symbol in each object, rather than the combined object as a whole. This will become increasingly important for use cases such as ThinLTO which require us to process symbol resolutions in a more complicated way than just adjusting linkage. Patch by Peter Collingbourne. Reviewers: rafael, tejohnson, mehdi_amini Subscribers: lhames, tejohnson, mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D20268 llvm-svn: 278338	2016-08-11 14:58:12 +00:00
Ehsan Amiri	3818f1b38a	revert 278334 llvm-svn: 278337	2016-08-11 14:51:14 +00:00
Valery Pykhtin	82c73bee2b	Revert "[AMDGPU] fix failure on printing of non-existing instruction operands." This reverts revision 278333, newly added test failed. llvm-svn: 278336	2016-08-11 14:22:05 +00:00
Ehsan Amiri	b9fcc2b171	Extend trip count instead of truncating IV in LFTR, when legal When legal, extending trip count in the loop control logic generates better code compared to truncating IV. This is because (1) extending trip count is a loop invariant operation (see genLoopLimit where we prove trip count is loop invariant). (2) Scalar Evolution seems to have problems understanding trunc when computing loop trip count. So removing them allows better analysis performed in Scalar Evolution. (In particular this fixes PR 28363 which is the motivation for this change). I am not going to perform any performance test. Any degradation caused by this should be an indication of a bug elsewhere. To prove legality, we rely on SCEV to prove zext(trunc(IV)) == IV (or similarly for sext). If this holds, we can prove equivalence of trunc(IV)==ExitCnt (1) and IV == zext(ExitCnt). Simply take zext of both sides of (1) and apply the proven equivalence. https://reviews.llvm.org/D23075 llvm-svn: 278334	2016-08-11 13:51:20 +00:00
Valery Pykhtin	3048ff6ec3	[AMDGPU] fix failure on printing of non-existing instruction operands. Differential revision: https://reviews.llvm.org/D23323 llvm-svn: 278333	2016-08-11 13:49:46 +00:00
Teresa Johnson	cbf684e6c6	Revert "Resolution-based LTO API." This reverts commit r278330. I made a change to the save temps output that is causing issues with the bots. Didn't realize this because I had older output files sitting on disk in my test output directory. llvm-svn: 278331	2016-08-11 13:03:56 +00:00
Teresa Johnson	f99573b3ee	Resolution-based LTO API. Summary: This introduces a resolution-based LTO API. The main advantage of this API over existing APIs is that it allows the linker to supply a resolution for each symbol in each object, rather than the combined object as a whole. This will become increasingly important for use cases such as ThinLTO which require us to process symbol resolutions in a more complicated way than just adjusting linkage. Patch by Peter Collingbourne. Reviewers: rafael, tejohnson, mehdi_amini Subscribers: lhames, tejohnson, mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D20268 Address review comments llvm-svn: 278330	2016-08-11 12:56:40 +00:00
Simon Pilgrim	5c91764af5	Fixed VS2015 (Update 3) warning - differing const/volatile qualifiers for overridden function Dropped the const qualifier to match llvm::CallLowering::lowerCall llvm-svn: 278329	2016-08-11 12:19:43 +00:00
Igor Breger	a77b14d02c	[AVX512] Fix extractelement i1 lowering. The previous implementation (not custom) doesn't enforce zeroing of upper bits. The assumption is that i1 PRODUCER (truncate and extractelement) must zero all upper bits, so i1 CONSUMER instructions (test, zext, save, etc.) can be done without additional zeroing. Make extractelement i1 lowering custom for all vector i1. Differential Revision: http://reviews.llvm.org/D23246 llvm-svn: 278328	2016-08-11 12:13:46 +00:00
Marina Yatsina	88f0c31f13	Avoid false dependencies of undef machine operands This patch helps avoid false dependencies on undef registers by updating the machine instructions' undef operand to use a register that the instruction is truly dependent on, or use a register with clearance higher than Pref. Pseudo example: loop: xmm0 = ... xmm1 = vcvtsi2sdl eax, xmm0<undef> ... = inst xmm0 jmp loop In this example, selecting xmm0 as the undef register creates false dependency between loop iterations. This false dependency cannot be solved by inserting an xor before vcvtsi2sdl because xmm0 is alive at the point of the vcvtsi2sdl instruction. Selecting a different register instead of xmm0, especially a register that is not used in the loop, will eliminate this problem. Differential Revision: https://reviews.llvm.org/D22466 llvm-svn: 278321	2016-08-11 07:32:08 +00:00
Craig Topper	a78b768ed4	[AVX-512] Promote 512-bit integer loads to v8i64 similar to what is done for 128/256-bit vectors for overall consistency. llvm-svn: 278318	2016-08-11 06:04:07 +00:00
Craig Topper	14aa2665d3	[AVX-512] Add patterns to allow EVEX encoded stores of v16i16/v8i16/v16i8/v32i8 even when BWI is not supported. llvm-svn: 278317	2016-08-11 06:04:04 +00:00
Craig Topper	3563d0f622	[AVX-512] Fix the 128-bit and 256-bit nontemporal load patterns with elements type other than i64. These loads have all been promoted to v2i64/v4i64 loads so we need bitcasts or we end up selecting VMOVDQA32/VMOVDQU32 instead. llvm-svn: 278316	2016-08-11 06:04:00 +00:00
Xinliang David Li	76a0108be4	[Profile] improve warning control option Change --no-pgo-warn-missing to -pgo-warn-missing-function and negate the default. /NFC Add more tests to make sure the warning is off by default llvm-svn: 278314	2016-08-11 05:09:30 +00:00
Dominic Chen	4173fffa08	[WebAssembly] Cleanup trailing whitespace Summary: Test for commit access. Subscribers: jfb, dschuff Differential Revision: https://reviews.llvm.org/D23392 llvm-svn: 278313	2016-08-11 04:10:56 +00:00
Easwaran Raman	0d58fcac99	Make more fields of InlineParams Optional. Differential revision: https://reviews.llvm.org/D23386 llvm-svn: 278312	2016-08-11 03:58:05 +00:00
Sanjoy Das	25fb5bda0f	[Statepoints] Minor cosmetic change; NFC The verification failure message was missing a space. llvm-svn: 278309	2016-08-11 00:56:46 +00:00
Chris Bieneman	ca5de9d9e3	[MachOYAML] Don't output empty ExportTrie The YAML representation was always outputting the root node of an export trie even if the trie was empty. While this doesn't really have any functional impact, it does add visual clutter to the yaml file. llvm-svn: 278307	2016-08-11 00:20:03 +00:00
Tim Northover	357f1be2ca	GlobalISel: support same ConstantExprs as Instructions. It's more than just inttoptr, but the others can't be tested until we have support for non-trivial constants (they currently get unavoidably folded to a ConstantInt). llvm-svn: 278303	2016-08-10 23:02:41 +00:00
Tim Northover	406024a108	GlobalISel: implement simple function calls on AArch64. We're still limited in the arguments we support, but this at least handles the basic cases. llvm-svn: 278293	2016-08-10 21:44:01 +00:00
Changpeng Fang	fb9c3818dd	AMDGPU/SI: Implement amdgcn image intrinsics with sampler Summary: This patch defines and implements amdgcn image intrinsics with sampler. 1. define vdata type to be llvm_anyfloat_ty, address type to be llvm_anyfloat_ty, and rsrc type to be llvm_anyint_ty. As a result, we expect the intrinsics name to have three suffixes to overload each of these three types; 2. D128 as well as two other flags are implied in the three types, for example, if you use v8i32 as resource type, then r128 is 0! 3. don't expose TFE flag, and other flags are exposed in the instruction order: unrm, glc, slc, lwe and da. Differential Revision: http://reviews.llvm.org/D22838 Reviewed by: arsenm and tstellarAMD llvm-svn: 278291	2016-08-10 21:15:30 +00:00
Piotr Padlewski	d89875ca39	Changed sign of LastCallToStaticBonus Summary: I think it is much better this way. When I first saw the line: Cost += InlineConstants::LastCallToStaticBonus; I thought that this was a bug, because everywhere else the cost is reduced using -=. Reviewers: eraman, tejohnson, mehdi_amini Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23222 llvm-svn: 278290	2016-08-10 21:15:22 +00:00
Kyle Butt	81d32846b0	Codegen: Don't tail-duplicate blocks with un-analyzable fallthrough. If AnalyzeBranch can't analyze a block and it is possible to fallthrough, then duplicating the block doesn't make sense, as only one block can be the layout predecessor for the un-analyzable fallthrough. Submitted with a test case, but NOTE: the test case doesn't currently fail. However, the test case fails with D20505 and would have saved me some time debugging. llvm-svn: 278288	2016-08-10 21:03:27 +00:00
Kyle Butt	e1c931b171	CodeGen: If Convert blocks that would form a diamond when tail-merged. The following function currently relies on tail-merging for if conversion to succeed. The common tail of cond_true and cond_false is extracted, and this then forms a diamond pattern that can be successfully if converted. If this block does not get extracted, either because tail-merging is disabled or the threshold is higher, we should still recognize this pattern and if-convert it. Fixed a regression in the original commit. Need to un-reverse branches after reversing them, or other conversions go awry. define i32 @t2(i32 %a, i32 %b) nounwind { entry: %tmp1434 = icmp eq i32 %a, %b ; <i1> [#uses=1] br i1 %tmp1434, label %bb17, label %bb.outer bb.outer: ; preds = %cond_false, %entry %b_addr.021.0.ph = phi i32 [ %b, %entry ], [ %tmp10, %cond_false ] %a_addr.026.0.ph = phi i32 [ %a, %entry ], [ %a_addr.026.0, %cond_false ] br label %bb bb: ; preds = %cond_true, %bb.outer %indvar = phi i32 [ 0, %bb.outer ], [ %indvar.next, %cond_true ] %tmp. = sub i32 0, %b_addr.021.0.ph %tmp.40 = mul i32 %indvar, %tmp. 
%a_addr.026.0 = add i32 %tmp.40, %a_addr.026.0.ph %tmp3 = icmp sgt i32 %a_addr.026.0, %b_addr.021.0.ph br i1 %tmp3, label %cond_true, label %cond_false cond_true: ; preds = %bb %tmp7 = sub i32 %a_addr.026.0, %b_addr.021.0.ph %tmp1437 = icmp eq i32 %tmp7, %b_addr.021.0.ph %indvar.next = add i32 %indvar, 1 br i1 %tmp1437, label %bb17, label %bb cond_false: ; preds = %bb %tmp10 = sub i32 %b_addr.021.0.ph, %a_addr.026.0 %tmp14 = icmp eq i32 %a_addr.026.0, %tmp10 br i1 %tmp14, label %bb17, label %bb.outer bb17: ; preds = %cond_false, %cond_true, %entry %a_addr.026.1 = phi i32 [ %a, %entry ], [ %tmp7, %cond_true ], [ %a_addr.026.0, %cond_false ] ret i32 %a_addr.026.1 } Without tail-merging or diamond-tail if conversion: LBB1_1: @ %bb @ =>This Inner Loop Header: Depth=1 cmp r0, r1 ble LBB1_3 @ BB#2: @ %cond_true @ in Loop: Header=BB1_1 Depth=1 subs r0, r0, r1 cmp r1, r0 it ne cmpne r0, r1 bgt LBB1_4 LBB1_3: @ %cond_false @ in Loop: Header=BB1_1 Depth=1 subs r1, r1, r0 cmp r1, r0 bne LBB1_1 LBB1_4: @ %bb17 bx lr With diamond-tail if conversion, but without tail-merging: @ BB#0: @ %entry cmp r0, r1 it eq bxeq lr LBB1_1: @ %bb @ =>This Inner Loop Header: Depth=1 cmp r0, r1 ite le suble r1, r1, r0 subgt r0, r0, r1 cmp r1, r0 bne LBB1_1 @ BB#2: @ %bb17 bx lr llvm-svn: 278287	2016-08-10 20:45:56 +00:00
Jonathan Roelofs	851b79dc4d	Fix UB in APInt::ashr i64 -1, whose sign bit is the 0th one, can't be left shifted without invoking UB. https://reviews.llvm.org/D23362 llvm-svn: 278280	2016-08-10 19:50:14 +00:00
Matt Arsenault	61f8ba8b79	AMDGPU: s_setpc_b64 should be an indirect branch llvm-svn: 278278	2016-08-10 19:20:02 +00:00
Matt Arsenault	c6b1350039	AMDGPU: Set sizes on control flow pseudos llvm-svn: 278276	2016-08-10 19:11:51 +00:00
Matt Arsenault	f4af802381	AMDGPU: Remove empty file comment llvm-svn: 278275	2016-08-10 19:11:48 +00:00
Matt Arsenault	11587d97be	AMDGPU: Remove unnecessary cast llvm-svn: 278274	2016-08-10 19:11:45 +00:00
Matt Arsenault	57431c9680	AMDGPU: Change insertion point of si_mask_branch Insert before the skip branch if one is created. This is a somewhat more natural placement relative to the skip branches, and makes it possible to implement analyzeBranch for skip blocks. The test changes are mostly due to a quirk where the block label is not emitted if there is a terminator that is not also a branch. llvm-svn: 278273	2016-08-10 19:11:42 +00:00
Matt Arsenault	b920e9987d	AMDGPU: Use CreateStackObject instead of CreateSpillStackObject I'm not sure what the difference is, but no other target uses this for emergency spill slots. llvm-svn: 278272	2016-08-10 19:11:36 +00:00
Sanjay Patel	5ccc85fe83	[x86, AVX] allow FP vector select folding to bitwise logic ops (PR28895) This handles the case in: https://llvm.org/bugs/show_bug.cgi?id=28895 ...but we are not getting all of the possibilities yet. Eg, we use 'X86::FANDN' for scalar FP select combines. That enhancement is filed as: https://llvm.org/bugs/show_bug.cgi?id=28925 Differential Revision: https://reviews.llvm.org/D23337 llvm-svn: 278270	2016-08-10 19:00:11 +00:00
Andrew Kaylor	498d3113c3	[IndVarSimplify] Eliminate zext of a signed IV when the IV is known to be non-negative Patch by Li Huang Differential Revision: https://reviews.llvm.org/D18867 llvm-svn: 278269	2016-08-10 18:56:35 +00:00
Nicolai Haehnle	02d784172c	LiveIntervalAnalysis: fix a crash in repairOldRegInRange Summary: See the new test case for one that was (non-deterministically) crashing on trunk and deterministically hit the assertion that I added in D23302. Basically, the machine function contains a sequence DS_WRITE_B32 %vreg4, %vreg14:sub0, ... DS_WRITE_B32 %vreg4, %vreg14:sub0, ... %vreg14:sub1<def> = COPY %vreg14:sub0 and SILoadStoreOptimizer::mergeWrite2Pair merges the two DS_WRITE_B32 instructions into one before calling repairIntervalsInRange. Now repairIntervalsInRange wants to repair %vreg14, in particular, and ends up trying to repair %vreg14:sub1 as well, but that only becomes active _after_ the range that is to be repaired, hence the crash due to LR.find(...) == LR.begin() at the start of repairOldRegInRange. I believe that just skipping those subranges is fine, but again, I am not too familiar with that code. Reviewers: MatzeB, kparzysz, tstellarAMD Subscribers: llvm-commits, MatzeB Differential Revision: https://reviews.llvm.org/D23303 llvm-svn: 278268	2016-08-10 18:51:14 +00:00
Andrew Kaylor	b10f6876cd	[ValueTracking] An improvement to IR ValueTracking on Non-negative Integers Patch by Li Huang Differential Revision: https://reviews.llvm.org/D18777 llvm-svn: 278267	2016-08-10 18:47:19 +00:00
Krzysztof Parzyszek	c9c2bba621	[Hexagon] Remove unused variants of LO/HI instructions llvm-svn: 278266	2016-08-10 18:40:36 +00:00
Kyle Butt	71b1ca1be4	Codegen: Tail Merge: Be less aggressive with special cases. This change makes it possible for tail-duplication and tail-merging to be disjoint. By being less aggressive when merging during layout, there are no overlapping cases between tail-duplication and tail-merging, provided the thresholds are disjoint. There is a remaining TODO to benchmark the succ_size() test for non-layout tail merging. llvm-svn: 278265	2016-08-10 18:36:18 +00:00
Simon Pilgrim	675c257a32	[X86][SSE] Dropped blend(insertps(x,y),zero) combine - this is now handled by target shuffle chain combining llvm-svn: 278260	2016-08-10 18:10:29 +00:00
Krzysztof Parzyszek	0bbad0fc86	[Hexagon] Simplify the SplitConst32/64 pass llvm-svn: 278256	2016-08-10 18:05:47 +00:00
Krzysztof Parzyszek	3b946c90ef	[Hexagon] Add extra patterns for single-precision min/max instructions llvm-svn: 278252	2016-08-10 17:56:24 +00:00
Rong Xu	63f970ee24	Fix LCSSA increased compile time We are seeing r276077 drastically increase compile time for our larger benchmarks in PGO profile generation builds (both clang-based and IR-based modes) -- it can be 20x slower than without the patch (like from 30 secs to 780 secs). The increased time is all in the LCSSA pass. The problematic code is in PostProcessPHIs after use-rewrite. Note that the InsertedPhis from ssa_updater accumulate (they are never cleared). Since the inserted PHIs are added to the candidates for each rewrite, the earlier ones are added repeatedly. Later, when adding the new PHIs to the work-list, we don't check for duplicates either. This can result in an extremely long work-list containing tons of duplicated PHIs. This patch fixes the issue by hoisting the code out of the loop. Differential Revision: http://reviews.llvm.org/D23344 llvm-svn: 278250	2016-08-10 17:49:11 +00:00
Krzysztof Parzyszek	c1f6cd2980	[Hexagon] Fix table-gen decode conflict warnings for CONST32/64 llvm-svn: 278247	2016-08-10 17:22:24 +00:00
Tim Northover	7552ef5a00	GlobalISel: avoid inserting redundant COPYs for bitcasts. If the value produced by the bitcast hasn't been referenced yet, we can simply reuse the input register avoiding an unnecessary COPY instruction. llvm-svn: 278245	2016-08-10 16:51:14 +00:00
Krzysztof Parzyszek	a3386501af	[Hexagon] Use integer instructions for floating point immediates Floating point instructions use general purpose registers, so the few instructions that can put floating point immediates into registers are, in fact, integer instructions. Use them explicitly instead of having pseudo-instructions specifically for dealing with floating point values. Simplify the constant loading instructions (from sdata) to have only two: one for 32-bit values and one for 64-bit values: CONST32 and CONST64. llvm-svn: 278244	2016-08-10 16:46:36 +00:00
Gor Nishanov	b2a9c02521	[Coroutines] Part 6: Elide dynamic allocation of a coroutine frame when possible Summary: A particular coroutine usage pattern, where a coroutine is created, manipulated and destroyed by the same calling function, is common for coroutines implementing RAII idiom and is suitable for allocation elision optimization which avoids dynamic allocation by storing the coroutine frame as a static `alloca` in its caller. coro.free and coro.alloc intrinsics are used to indicate which code needs to be suppressed when dynamic allocation elision happens: ``` entry: %elide = call i8* @llvm.coro.alloc() %need.dyn.alloc = icmp ne i8* %elide, null br i1 %need.dyn.alloc, label %coro.begin, label %dyn.alloc dyn.alloc: %alloc = call i8* @CustomAlloc(i32 4) br label %coro.begin coro.begin: %phi = phi i8* [ %elide, %entry ], [ %alloc, %dyn.alloc ] %hdl = call i8* @llvm.coro.begin(i8* %phi, i32 0, i8* null, i8* bitcast ([2 x void (%f.frame*)*]* @f.resumers to i8*)) ``` and ``` %mem = call i8* @llvm.coro.free(i8* %hdl) %need.dyn.free = icmp ne i8* %mem, null br i1 %need.dyn.free, label %dyn.free, label %if.end dyn.free: call void @CustomFree(i8* %mem) br label %if.end if.end: ... ``` If heap allocation elision is performed, we replace coro.alloc with a static alloca on the caller frame and coro.free with a null constant. Also, if there are any tail calls referencing the coroutine frame, we need to remove the tail call attribute, since the coroutine frame now lives on the stack. Documentation and overview is here: http://llvm.org/docs/Coroutines.html. Upstreaming sequence (rough plan) 1.Add documentation. (https://reviews.llvm.org/D22603) 2.Add coroutine intrinsics. (https://reviews.llvm.org/D22659) 3.Add empty coroutine passes. (https://reviews.llvm.org/D22847) 4.Add coroutine devirtualization + tests. 
ab) Lower coro.resume and coro.destroy (https://reviews.llvm.org/D22998) c) Do devirtualization (https://reviews.llvm.org/D23229) 5.Add CGSCC restart trigger + tests. (https://reviews.llvm.org/D23234) 6.Add coroutine heap elision + tests. <= we are here 7.Add the rest of the logic (split into more patches) Reviewers: mehdi_amini, majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D23245 llvm-svn: 278242	2016-08-10 16:40:39 +00:00
Roger Ferrer Ibanez	17586582e7	Fix build break of VS 2013 debug builds In debug mode extra macros are enabled for several C++ algorithms. Some of them may cause unfortunate build failures. This commit adds a redundant operator() to work around one of those troublesome macros which was hit accidentally by change r278012. llvm-svn: 278241	2016-08-10 16:39:58 +00:00
Krzysztof Parzyszek	12e03aa5fe	[Hexagon] Delete HexagonSelectCCInfo.td This file is not used. The location assignment of call arguments and return values is implemented directly in HexagonISelLowering. llvm-svn: 278237	2016-08-10 16:23:53 +00:00
Krzysztof Parzyszek	2a48ce4ec2	[Hexagon] Remove unneeded/unused ISD opcodes ARGEXTEND and FCONST32 llvm-svn: 278236	2016-08-10 16:20:33 +00:00
Artur Pilipenko	fd223d5d25	[LVI] Handle conditions in the form of (cond1 && cond2) Teach LVI how to gather information from conditions in the form of (cond1 && cond2). Our out-of-tree front-end emits range checks in this form. Reviewed By: sanjoy Differential Revision: http://reviews.llvm.org/D23200 llvm-svn: 278231	2016-08-10 15:13:15 +00:00
Simon Pilgrim	ac8fa6c2c6	[X86][SSE] Add support for combining target shuffles to MOVSS/MOVSD Only do this on pre-SSE41 targets where we should be lowering to BLENDPS/BLENDPD instead llvm-svn: 278228	2016-08-10 14:15:41 +00:00
Artur Pilipenko	933c07a4fb	[LVI] NFC. Make getValueFromCondition return LVILatticeValue instead of changing reference argument Instead of returning bool and setting an LVILatticeValue reference argument, return an LVILatticeValue. Use overdefined value to denote the case when we didn't gather any information from the condition. This change was separated from the review "[LVI] Handle conditions in the form of (cond1 && cond2)" (https://reviews.llvm.org/D23200#inline-199531). Once getValueFromCondition returns LVILatticeValue we can cache the result in Visited map. llvm-svn: 278224	2016-08-10 13:38:07 +00:00
Artur Pilipenko	e171ea8a33	Teach CorrelatedValuePropagation to mark adds as no wrap This is a resubmission of previously reverted r277592. It was hitting an overly strong assertion in getConstantRange which was relaxed in r278217. Use LVI to prove that adds do not wrap. The change is motivated by https://llvm.org/bugs/show_bug.cgi?id=28620 bug and it's the first step to fix that problem. Reviewed By: sanjoy Differential Revision: http://reviews.llvm.org/D23059 llvm-svn: 278220	2016-08-10 13:08:34 +00:00
Simon Pilgrim	9811e98495	[X86][SSE] Only treat SM_SentinelUndef as UNDEF in shuffle mask predicates isUndefOrEqual and isUndefOrInRange treated all -ve shuffle mask values as UNDEF, now it has to be SM_SentinelUndef (-1) We already have asserts to check that lowered SHUFFLE_VECTOR indices are in the range -1 <= index < 2*masksize (or masksize for unary shuffles) llvm-svn: 278218	2016-08-10 12:55:25 +00:00
Artur Pilipenko	a4b6a70a9c	[LVI] Relax the assertion about LVILatticeVal type in getConstantRange The problem was triggered by my recent change in CVP (D23059). Current code expected that integer constants are represented by constantrange LVILatticeVal and never represented as LVILatticeVal with constant tag. That is true for ConstantInt constants, although ConstantExpr integer type constants are legally represented as constant LVILatticeVal. This code fails with CVP change in: @b = global i32 0, align 4 define void @test6(i32 %a) { bb: %add = add i32 %a, ptrtoint (i32* @b to i32) ret void } Currently getConstantRange code is not executed by any of the upstream passes. I'm going to add a test case to test/Transforms/CorrelatedValuePropagation/add.ll once I resubmit the CVP change. Reviewed By: sanjoy Differential Revision: http://reviews.llvm.org/D23194 llvm-svn: 278217	2016-08-10 12:54:54 +00:00
Simon Pilgrim	cb419a896c	[X86][SSE] Reorder shuffle mask undef helper predicates. NFCI To make it easier for a more complex helper to use a simpler one llvm-svn: 278216	2016-08-10 12:34:23 +00:00
Simon Pilgrim	85c7ea86ae	[DAGCombine] Avoid INSERT_SUBVECTOR reinsertions (PR28678) If the input vector to INSERT_SUBVECTOR is another INSERT_SUBVECTOR, and this inserted subvector replaces the last insertion, then insert into the common source vector. i.e. INSERT_SUBVECTOR( INSERT_SUBVECTOR( Vec, SubOld, Idx ), SubNew, Idx ) --> INSERT_SUBVECTOR( Vec, SubNew, Idx ) Differential Revision: https://reviews.llvm.org/D23330 llvm-svn: 278211	2016-08-10 10:50:53 +00:00
Sam Parker	62965c96df	[ARM] Improve sxta{b|h} and uxta{b|h} tests Created a Thumb2 predicated pattern matcher that uses Thumb2 and HasT2ExtractPack and used it to redefine the patterns for sxta{b|h} and uxta{b|h}. Also used the similar patterns to fill in isel pattern gaps for the corresponding instructions in the ARM backend. The patch is mainly changes to tests since most of this functionality appears not to have been tested. Differential Revision: https://reviews.llvm.org/D23273 llvm-svn: 278207	2016-08-10 09:34:34 +00:00
Chandler Carruth	0215e76836	[x86] Fix a bug in the auto-upgrade from r276416 where we failed to give a sufficiently low alignment for the IR load created. There is no test case because we don't have any test cases for the IR produced by the autoupgrade, only the x86 assembly, and it happens that the x86 assembly for this intrinsic as it is tested in the autoupgrade path just happens to not produce a separate load instruction where we might have observed the alignment. I'm going to follow up on the original commit to suggest getting IR-level testing in addition to the asm level testing here so that we can see and test these kinds of issues. We might never get an x86 instruction out with an alignment constraint, but we could still miscompile code by folding against the alignment marked on (or inferred for in this case) the load. llvm-svn: 278203	2016-08-10 07:41:26 +00:00
Davide Italiano	873219c406	[SimplifyLibCalls] Restore the old behaviour, emit a libcall. Hal pointed out that the semantic of our intrinsic and the libc call are slightly different. Add a comment while I'm here to explain why we can't emit an intrinsic. Thanks Hal! llvm-svn: 278200	2016-08-10 06:33:32 +00:00
Easwaran Raman	1c57cc2b68	Do not directly use inline threshold cl options in cost analysis. This adds an InlineParams struct which is populated from the command line options by getInlineParams and passed to getInlineCost for the call analyzer to use. Differential revision: https://reviews.llvm.org/D22120 llvm-svn: 278189	2016-08-10 00:48:04 +00:00
Adam Nemet	896c09bd10	[Inliner,OptDiag] Add hotness attribute to opt diagnostics Summary: The inliner not being a function pass requires the work-around of generating the OptimizationRemarkEmitter and in turn BFI on demand. This will go away after the new PM is ready. BFI is only computed inside ORE if the user has requested hotness information for optimization diagnostics (-pass-remark-with-hotness at the 'opt' level). Thus there is no additional overhead without the flag. Reviewers: hfinkel, davidxl, eraman Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D22694 llvm-svn: 278185	2016-08-10 00:44:44 +00:00
Vedant Kumar	809fe6ca30	[IR] Remove some unused #includes (NFC) I needed a reader-writer lock for a downstream project and noticed that llvm has one. Function.cpp is the only file in-tree that refers to it. To anyone reading this: are you using RWMutex in out-of-tree code? Maybe it's not worth keeping around any more... Since we're not actually using RWMutex here, remove the #include (and a few other stale headers while we're at it). llvm-svn: 278178	2016-08-09 23:14:37 +00:00
Tim Northover	d403a3d8ee	GlobalISel: support 'undef' constant. llvm-svn: 278174	2016-08-09 23:01:30 +00:00
Michael Zolotukhin	aae168f993	[LoopSimplify] Rebuild LCSSA for the inner loop after separating nested loops. Summary: This hopefully fixes PR28825. The problem now was that a value from the original loop was used in a subloop, which became a sibling after separation. While a subloop doesn't need an lcssa phi node, a sibling does, and that's where we broke LCSSA. The most natural way to fix this now is to simply call formLCSSA on the original loop: it'll do what we've been doing before plus it'll cover situations described above. I think we don't need to run formLCSSARecursively here, and we have an assert to verify this (I've tried testing it on LLVM testsuite + SPECs). I'd be happy to be corrected here though. I also changed a run line in the test from '-lcssa -loop-unroll' to '-lcssa -loop-simplify -indvars', because it exercises LCSSA preservation to the same extent, but also makes fewer unrelated transformations on the CFG, which makes it easier to verify. Reviewers: chandlerc, sanjoy, silvas Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23288 llvm-svn: 278173	2016-08-09 22:44:56 +00:00
Andrew Kaylor	3c05edfd5e	[ValueTracking] Improve ValueTracking on left shift with nsw flag Patch by Li Huang Differential Revision: https://reviews.llvm.org/D23296 llvm-svn: 278172	2016-08-09 22:41:35 +00:00
Derek Schuff	66641322ce	[WebAssembly] Add -emscripten-cxx-exceptions-whitelist option This patch adds -emscripten-cxx-exceptions-whitelist option to WebAssemblyLowerEmscriptenExceptions pass. This option is the list of function names in which Emscripten-style exception handling is enabled. This is to support emscripten's EXCEPTION_CATCHING_WHITELIST which exists because of the performance impact of emscripten's non-zero-cost EH method. Patch by Heejin Ahn Differential Revision: https://reviews.llvm.org/D23292 llvm-svn: 278171	2016-08-09 22:37:00 +00:00
Tim Northover	5ed648e509	GlobalISel: first translation support for Constants. For now put them all in the entry block. This should be correct but may give poor runtime performance. Hopefully MachineSinking combined with isReMaterializable can solve those issues, but if not the interface is sound enough to support alternatives. llvm-svn: 278168	2016-08-09 21:28:04 +00:00
Wei Mi	575435012c	Fix the runtime error caused by "Use ValueOffsetPair to enhance value reuse during SCEV expansion". The patch is to fix the bug in PR28705. It was caused by setting wrong return value for SCEVExpander::findExistingExpansion. The return values of findExistingExpansion have different meanings when the function is used in different ways so it is easy to make a mistake. The fix creates two new interfaces to replace SCEVExpander::findExistingExpansion, and specifies where each interface is expected to be used. Differential Revision: https://reviews.llvm.org/D22942 llvm-svn: 278161	2016-08-09 20:40:03 +00:00
Wei Mi	785858cf6c	Recommit "Use ValueOffsetPair to enhance value reuse during SCEV expansion". The fix for PR28705 will be committed consecutively. In D12090, the ExprValueMap was added to reuse existing value during SCEV expansion. However, const folding and sext/zext distribution can make the reuse still difficult. A simplified case is: suppose we know S1 expands to V1 in ExprValueMap, and S1 = S2 + C_a S3 = S2 + C_b where C_a and C_b are different SCEVConstants. Then we'd like to expand S3 as V1 - C_a + C_b instead of expanding S2 literally. It is helpful when S2 is a complex SCEV expr and S2 has no entry in ExprValueMap, which is usually caused by the fact that S3 is generated from S1 after const folding. In order to do that, we represent ExprValueMap as a mapping from SCEV to ValueOffsetPair. We will save both S1->{V1, 0} and S2->{V1, C_a} into the ExprValueMap when we create SCEV for V1. When S3 is expanded, it will first expand S2 to V1 - C_a because of S2->{V1, C_a} in the map, then expand S3 to V1 - C_a + C_b. Differential Revision: https://reviews.llvm.org/D21313 llvm-svn: 278160	2016-08-09 20:37:50 +00:00
Anna Thomas	b2d12b81c3	[EarlyCSE] Teach about CSE'ing over invariant.start intrinsics Summary: Teach EarlyCSE about invariant.start intrinsic. Specifically, we can perform store-load, load-load forwarding over this call. Reviewers: majnemer, reames, dberlin, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23268 llvm-svn: 278153	2016-08-09 20:00:47 +00:00
Lang Hames	bb9431acda	Re-apply r278065 (Weak symbol support in RuntimeDyld) with a fix for ELF. llvm-svn: 278149	2016-08-09 19:27:17 +00:00
David Majnemer	adc688ce9c	[X86] Don't model UD2/UD2B as a terminator A UD2 might make its way into the program via a call to @llvm.trap. Obviously, calls are not terminators. However, we modeled the X86 instruction, UD2, as a terminator. Later on, this confuses the epilogue insertion machinery which results in the epilogue getting inserted before the UD2. For some platforms, like x64, the result is a violation of the ABI. Instead, model UD2/UD2B as a side effecting instruction which may observe memory. llvm-svn: 278144	2016-08-09 17:55:12 +00:00
Simon Pilgrim	76964e3140	[DAGCombiner] Better support for shifting large value type by constants As detailed on D22726, much of the shift combining code assumes constant values will fit into a uint64_t value and calls ConstantSDNode::getZExtValue where it probably shouldn't (leading to asserts). Using APInt directly avoids this problem but we encounter other assertions if we attempt to compare/operate on 2 APInt of different bitwidths. This patch adds a helper function to ensure that 2 APInt values are zero extended as required so that they can be safely used together. I've only added an initial example use for this to the '(SHIFT (SHIFT x, c1), c2) --> (SHIFT x, (ADD c1, c2))' combines. Further cases can easily be added as required. Differential Revision: https://reviews.llvm.org/D23007 llvm-svn: 278141	2016-08-09 17:39:11 +00:00
Anna Thomas	037e540f08	[AliasAnalysis] Treat invariant.start as read-memory Summary: We teach alias analysis that invariant.start is readonly. This helps with GVN and memcopy optimizations that currently treat invariant.start as a clobber. We need to treat this as readonly, so that DSE does not incorrectly remove stores prior to the invariant.start Reviewers: sanjoy, reames, majnemer, dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23214 llvm-svn: 278138	2016-08-09 17:18:05 +00:00
Xinliang David Li	9035cfceef	[Profile] turn off verbose warnings by default no prof data for func warning is turned off by default due to its high verbosity and minimal usefulness. Differential Revision: http://reviews.llvm.org/D23295 llvm-svn: 278127	2016-08-09 15:35:28 +00:00
Artur Pilipenko	c710a461b5	[LVI] Make LVI smarter about comparisons with non-constants Make LVI smarter about comparisons with a non-constant. For example, a s< b constrains a to be in [INT_MIN, INT_MAX) range. This is a part of https://llvm.org/bugs/show_bug.cgi?id=28620 fix. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D23205 llvm-svn: 278122	2016-08-09 14:50:08 +00:00
Simon Pilgrim	27740d038c	[X86][XOP] Add support for combining target shuffles to VPERMIL2PD/VPERMIL2PS llvm-svn: 278120	2016-08-09 12:56:15 +00:00
Simon Pilgrim	aae7d4a1b6	[X86][XOP] Add support for combining target shuffles to VPPERM llvm-svn: 278114	2016-08-09 10:56:29 +00:00
Dean Michael Berris	3a25d84a51	[XRay] Test for xray_instr_map in object file. (NFC) This makes a trivial change in the emission of the per-function XRay tables, and makes sure that the xray_instr_map section does show up in the object file. llvm-svn: 278113	2016-08-09 10:42:11 +00:00
Artur Pilipenko	d97eedff40	Revert 278107 which causes buildbot failures and in addition has wrong commit message llvm-svn: 278109	2016-08-09 10:00:22 +00:00
Artur Pilipenko	a410d81f64	Teach CorrelatedValuePropagation to mark adds as no wrap Use LVI to prove that adds do not wrap. The change is motivated by https://llvm.org/bugs/show_bug.cgi?id=28620 bug and it's the first step to fix that problem. Reviewed By: sanjoy Differential Revision: http://reviews.llvm.org/D23059 llvm-svn: 278107	2016-08-09 09:41:34 +00:00
Simon Pilgrim	54c32ddf55	[X86][SSE] Fix memory folding of (v)roundsd / (v)roundss We only had partial memory folding support for the intrinsic definitions, and (as noted on PR27481) was causing FR32/FR64/VR128 mismatch errors with the machine verifier. This patch adds missing memory folding support for both intrinsics and the ffloor/fnearbyint/fceil/frint/ftrunc patterns and in doing so fixes the failing machine verifier stack folding tests from PR27481. Differential Revision: https://reviews.llvm.org/D23276 llvm-svn: 278106	2016-08-09 09:32:34 +00:00
Artur Pilipenko	adcd01f6cd	[LVI] NFC. Fix a typo Bofore -> Before llvm-svn: 278105	2016-08-09 09:14:29 +00:00
Craig Topper	a10549d3e9	[X86] Reduce duplicated code in the execution domain lookup functions by passing tables as an argument. llvm-svn: 278098	2016-08-09 05:26:09 +00:00
Craig Topper	92a4ff1294	[AVX-512] Add support for execution domain switching masked logical ops between floating point and integer domain. This switches PS<->D and PD<->Q. llvm-svn: 278097	2016-08-09 05:26:07 +00:00
Craig Topper	9bd6241106	[X86] Remove the Fv packed logical operation alias instructions. Replace them with patterns to the regular instructions. This enables execution domain fixing which is why the tests changed. llvm-svn: 278090	2016-08-09 03:06:33 +00:00
Craig Topper	c09273b42b	[X86] Cleanup patterns for AVX/SSE for PS operations. Always try to look for bitcasts from floating point types. If only AVX1 is supported we also need to handle integer types with floating point ops without looking for bitcasts. Previously SSE1 had a pattern that looked for integer types without bitcasts, but the type wasn't legal with only SSE1 and SSE2 add an identical pattern for the integer instructions. llvm-svn: 278089	2016-08-09 03:06:28 +00:00
Craig Topper	de06b51d3d	[X86] Remove unnecessary bitcast from the front of AVX1Only 256-bit logical operation patterns. llvm-svn: 278088	2016-08-09 03:06:26 +00:00
Matthias Braun	7313ca6dbf	X86InstrInfo: Update liveness in classifyLea() We need to update liveness information when we create COPYs in classifyLea(). This fixes http://llvm.org/28301 llvm-svn: 278086	2016-08-09 01:47:26 +00:00
Derek Schuff	53b9af02c8	[WebAssembly] Fix bugs in WebAssemblyLowerEmscriptenExceptions pass * Delete extra '_' prefixes from JS library function names. fixImports() function in JS glue code deals with this for wasm. * Change command-line option names in order to be consistent with asm.js. * Add missing lowering code for llvm.eh.typeid.for intrinsics * Delete commas in mangled function names * Fix a function argument attributes bug. Because we add the pointer to the original callee as the first argument of invoke wrapper, all argument attribute indices have to be incremented by one. Patch by Heejin Ahn Differential Revision: https://reviews.llvm.org/D23258 llvm-svn: 278081	2016-08-09 00:29:55 +00:00
Sean Silva	5f6ec06f17	Consistently use CGSCCAnalysisManager Besides a general consistency benefit, the extra layer of indirection allows the mechanical part of https://reviews.llvm.org/D23256 that requires touching every transformation and analysis to be factored out cleanly. Thanks to David for the suggestion. llvm-svn: 278080	2016-08-09 00:28:56 +00:00
Sean Silva	0746f3bfa4	Consistently use LoopAnalysisManager One exception here is LoopInfo which must forward-declare it (because the typedef is in LoopPassManager.h which depends on LoopInfo). Also, some includes for LoopPassManager.h were needed since that file provides the typedef. Besides a general consistency benefit, the extra layer of indirection allows the mechanical part of https://reviews.llvm.org/D23256 that requires touching every transformation and analysis to be factored out cleanly. Thanks to David for the suggestion. llvm-svn: 278079	2016-08-09 00:28:52 +00:00
Sean Silva	fd03ac6a0c	Consistently use ModuleAnalysisManager Besides a general consistency benefit, the extra layer of indirection allows the mechanical part of https://reviews.llvm.org/D23256 that requires touching every transformation and analysis to be factored out cleanly. Thanks to David for the suggestion. llvm-svn: 278078	2016-08-09 00:28:38 +00:00
Sean Silva	36e0d01e13	Consistently use FunctionAnalysisManager Besides a general consistency benefit, the extra layer of indirection allows the mechanical part of https://reviews.llvm.org/D23256 that requires touching every transformation and analysis to be factored out cleanly. Thanks to David for the suggestion. llvm-svn: 278077	2016-08-09 00:28:15 +00:00
Saleem Abdulrasool	015280211b	CodeView: extract the OMF Directory Header The DebugDirectory contains a pointer to the CodeView info structure which is a derivative of the OMF debug directory. The structure has evolved a bit over time, and PDB 2.0 used a slightly different definition from PDB 7.0. Both of these are specific to CodeView and not COFF. Reflect this by moving the structure definitions into the DebugInfo/CodeView headers. Define a generic DebugInfo union type that can be used to pass around a reference to the DebugInfo irrespective of the versioning. NFC. llvm-svn: 278075	2016-08-09 00:25:12 +00:00
Sanjay Patel	06ba09af67	[x86] split combineVSelectWithAllOnesOrZeros into a helper function; NFCI llvm-svn: 278074	2016-08-09 00:01:11 +00:00
Derek Schuff	b7d6d9e3cd	[WebAssembly] Fix CFI index to account for padding nullptr function The WebAssembly linker now creates a dummy function at index 0 to prevent miscomparisons with the NULL pointer, see https://github.com/WebAssembly/binaryen/pull/658. Thanks to pcc for pointing out this problem! Patch by Dominic Chen Differential Revision: https://reviews.llvm.org/D23137 llvm-svn: 278073	2016-08-08 23:56:01 +00:00
Rui Ueyama	f53c8cb439	Revert "Do not ignore SizeOfOptionalHeader in COFF header even if PE header is not present." This reverts commit r278066 to unbreak buildbots. llvm-svn: 278070	2016-08-08 23:07:03 +00:00
Lang Hames	072728d419	Revert r278065 while I investigate some build-bot breakage. llvm-svn: 278069	2016-08-08 22:57:30 +00:00
Rui Ueyama	776c6828a5	Do not ignore SizeOfOptionalHeader in COFF header even if PE header is not present. Attribute SizeOfOptionalHeader is ignored if no PE header is present in the file. This attribute should be ignored according to standard, however there are uses of this field even though it should not be used. This change does not conform to PE/COFF standard, but there are several COFF files without PE header, where you had to add up SizeOfOptionalHeader in order to get proper section headers. Other tools and their own parsers do take this into account. Patch by Marek Milkovič! https://reviews.llvm.org/D22750 llvm-svn: 278066	2016-08-08 22:54:22 +00:00
Lang Hames	33c0b6bfca	[RuntimeDyld][Orc][MCJIT] Add partial weak-symbol support to RuntimeDyld. This patch causes RuntimeDyld to check for existing definitions when it encounters weak symbols. If a definition already exists then the new weak definition is discarded. All symbol lookups within a "logical dylib" should now agree on the address of any given weak symbol. This allows the JIT to better match the behavior of the static linker for C++ code. This support is only partial, as it does not allow strong definitions that occur after the first weak definition (in JIT symbol lookup order) to override the previous weak definitions. Support for this will be added in a future patch. llvm-svn: 278065	2016-08-08 22:53:37 +00:00
Charles Davis	e9c32c7ed3	Revert "[X86] Support the "ms-hotpatch" attribute." This reverts commit r278048. Something changed between the last time I built this--it takes awhile on my ridiculously slow and ancient computer--and now that broke this. llvm-svn: 278053	2016-08-08 21:20:15 +00:00
Justin Bogner	6b4422e6fe	InstCombine: Remove a redundant #ifdef NDEBUG. NFC The DEBUG() macro already does this. llvm-svn: 278049	2016-08-08 21:02:11 +00:00
Charles Davis	0822aa118e	[X86] Support the "ms-hotpatch" attribute. Summary: Based on two patches by Michael Mueller. This is a target attribute that causes a function marked with it to be emitted as "hotpatchable". This particular mechanism was originally devised by Microsoft for patching their binaries (which they are constantly updating to stay ahead of crackers, script kiddies, and other ne'er-do-wells on the Internet), but is now commonly abused by Windows programs to hook API functions. This mechanism is target-specific. For x86, a two-byte no-op instruction is emitted at the function's entry point; the entry point must be immediately preceded by 64 (32-bit) or 128 (64-bit) bytes of padding. This padding is where the patch code is written. The two byte no-op is then overwritten with a short jump into this code. The no-op is usually a `movl %edi, %edi` instruction; this is used as a magic value indicating that this is a hotpatchable function. Reviewers: majnemer, sanjoy, rnk Subscribers: dberris, llvm-commits Differential Revision: https://reviews.llvm.org/D19908 llvm-svn: 278048	2016-08-08 21:01:39 +00:00
Krzysztof Parzyszek	341cf3fbe5	[Hexagon] Add pattern for 64-bit mulhs llvm-svn: 278040	2016-08-08 19:24:25 +00:00
Michael Zolotukhin	2f50725dbd	[LoopUnroll] Simplify loops created by unrolling. Summary: Currently loop-unrolling doesn't preserve loop-simplified form. This patch fixes it by resimplifying affected loops. Reviewers: chandlerc, sanjoy, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23148 llvm-svn: 278038	2016-08-08 19:02:15 +00:00
Mehdi Amini	c137c28c8b	RefreshCallGraph does not modify the SCC, adding "const" to make it clear (NFC) llvm-svn: 278037	2016-08-08 18:51:05 +00:00
Geoff Berry	290a13e7c7	[MemorySSA] Fix windows build breakage caused by r278028 r278028: [MemorySSA] Ensure address stability of MemorySSA object. llvm-svn: 278035	2016-08-08 18:27:22 +00:00
Nirav Dave	f45fd2ba87	[X86] Improve code size on X86 segment moves A move of a value to a segment register from a 16-bit register is equivalent to one from its corresponding 32-bit register. Match gas's behavior and rewrite instructions to the shorter of equivalent forms. Reviewers: rnk, ab Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23166 llvm-svn: 278031	2016-08-08 18:01:04 +00:00
Geoff Berry	cdf5333f6f	[MemorySSA] Ensure address stability of MemorySSA object. Summary: Ensure that the MemorySSA object never changes address when using the new pass manager since the walkers contained by MemorySSA cache pointers to it at construction time. This is achieved by wrapping the MemorySSAAnalysis result in a unique_ptr. Also add some asserts that check for this bug. Reviewers: george.burgess.iv, dberlin Subscribers: mcrosier, hfinkel, chandlerc, silvas, llvm-commits Differential Revision: https://reviews.llvm.org/D23171 llvm-svn: 278028	2016-08-08 17:52:01 +00:00
Oliver Stannard	8331aaee8f	[ARM] Add support for embedded position-independent code This patch adds support for some new relocation models to the ARM backend: * Read-only position independence (ROPI): Code and read-only data is accessed PC-relative. The offsets between all code and RO data sections are known at static link time. This does not affect read-write data. * Read-write position independence (RWPI): Read-write data is accessed relative to the static base register (r9). The offsets between all writeable data sections are known at static link time. This does not affect read-only data. These two modes are independent (they specify how different objects should be addressed), so they can be used individually or together. They are otherwise the same as the "static" relocation model, and are not compatible with SysV-style PIC using a global offset table. These modes are normally used by bare-metal systems or systems with small real-time operating systems. They are designed to avoid the need for a dynamic linker; the only initialisation required is setting r9 to an appropriate value for RWPI code. I have only added support to SelectionDAG, not FastISel, because FastISel is currently disabled for bare-metal targets where these modes would be used. Differential Revision: https://reviews.llvm.org/D23195 llvm-svn: 278015	2016-08-08 15:28:31 +00:00
Zhan Jun Liau	4fbc3f4a37	[SystemZ] Add support for the .insn directive Summary: Add support for the .insn directive. .insn is an s390 specific directive that allows encoding of an instruction instead of using a mnemonic. The motivating case is some code in node.js that requires support for the .insn directive. Reviewers: koriakin, uweigand Subscribers: koriakin, llvm-commits Differential Revision: https://reviews.llvm.org/D21809 llvm-svn: 278012	2016-08-08 15:13:08 +00:00
Sebastian Pop	bfb96c5bfd	GVN-hoist: enable by default llvm-svn: 278010	2016-08-08 14:46:15 +00:00
Artur Pilipenko	eed618d5c0	[LVI] NFC. On the false dest path use inverse predicate instead of inverse range result When gathering constraints from a condition on the false path, ask makeAllowedICmpRegion about the inverse predicate instead of inverting the resulting range. This change was separated from the review "[LVI] Make LVI smarter about comparisons with non-constants" (https://reviews.llvm.org/D23205#inline-198361) llvm-svn: 278009	2016-08-08 14:33:11 +00:00
Artur Pilipenko	54b50cc1a8	[LVI] NFC. Rename confusing local NegOffset to Offset NegOffset is not necessarily negative llvm-svn: 278008	2016-08-08 14:13:56 +00:00
Artur Pilipenko	21472910c1	[LVI] NFC. Extract LHS, RHS, Predicate locals in getValueFromCondition llvm-svn: 278007	2016-08-08 14:08:37 +00:00
Silviu Baranga	fa00ba3c1a	[AArch64] PR28877: Don't assume we're running after legalization when creating vcvtfp2fxs Summary: The DAG combine transformation that was generating the aarch64_neon_vcvtfp2fxs node was assuming that all inputs were legal and wasn't accounting for the fact that the input could be a v4f64 if we're trying to do the transformation before legalization. We now bail out in this case. All illegal types besides v4f64 were already rejected. Fixes https://llvm.org/bugs/show_bug.cgi?id=28877. Reviewers: jmolloy Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D23261 llvm-svn: 278002	2016-08-08 13:13:57 +00:00
Daniel Sanders	3feeb9c851	Re-commit r277988: [mips][ias] Fix all the hacks related to MIPS-specific unary operators (%hi/%lo/%gp_rel/etc.). Hopefully with the MSVC builds fixed. I've added a missing '#include <tuple>' that gcc and clang don't seem to need. llvm-svn: 277995	2016-08-08 11:50:25 +00:00
Simon Pilgrim	33fc788374	[X86][SSE] Assert if the shuffle mask indices are not -1 or within a valid input range As discussed in post-review rL277959 llvm-svn: 277993	2016-08-08 11:07:34 +00:00
Daniel Sanders	cae9aeed39	Revert r277988: [mips][ias] Fix all the hacks related to MIPS-specific unary operators (%hi/%lo/%gp_rel/etc.). It seems that MSVC doesn't like std::tie(). llvm-svn: 277990	2016-08-08 09:33:14 +00:00
Daniel Sanders	2ab623b5a3	[mips][ias] Fix all the hacks related to MIPS-specific unary operators (%hi/%lo/%gp_rel/etc.). Summary: They are now lexed as a single token on targets where MCAsmInfo::HasMipsExpressions is true and then parsed in a similar way to the '~' operator as part of MCExpr::parseExpression. As a result: * expressions and immediates no longer have different parsing rules. The difference is now solely down to whether evaluateAsAbsolute() succeeds. * %hi(%neg(%gp_rel(x))) are no longer parsed as a single operator and decomposed into the three MipsMCExpr nodes. They are parsed directly as three MipsMCExpr nodes. * parseMemOperand no longer needs to eat all the surrounding parentheses to get at the outermost operator to make this work * %hi(%neg(%gp_rel(x))) and %lo(%neg(%gp_rel(x))) are no longer the only 3-in-1 relocs that parse for N64. They're still the only combinations that are permitted in relocatable expressions though. Fixing that should be a later patch. * We no longer need to list all the tokens that can occur as the first token of an expression or immediate. test/MC/Mips/expr1.s: This change also prevents the incorrect lowering of %lo(2*4)+foo to %lo(8+foo) which is not an equivalent expression (the difference is whether foo is truncated to 16-bit or not) and the test has been updated to account for the macro expansion the correct expression requires. Reviewers: sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: https://reviews.llvm.org/D23110 llvm-svn: 277988	2016-08-08 09:20:52 +00:00
Diana Picus	4dd6c249ac	[SelectionDAG] Refactor visitInlineAsm a bit. NFCI. This shaves off ~100 lines from visitInlineAsm. llvm-svn: 277987	2016-08-08 08:54:39 +00:00
Sean Silva	0873e7d218	Add some comments linking back to PR28400. Thanks to Mehdi for the suggestion! llvm-svn: 277984	2016-08-08 07:03:49 +00:00
Sean Silva	7f21f4b264	[PM] More workaround for PR28400 llvm-svn: 277982	2016-08-08 05:38:06 +00:00
Sean Silva	744f7a843f	[PM] Invalidate CallGraphAnalysis because it holds AssertingVH This is essentially PR28400. The fix here is similar to that implemented in r274656. llvm-svn: 277980	2016-08-08 05:38:01 +00:00
Daniel Berlin	4b4c722e79	[MSSA] Fix PR28880 by fixing use optimizer's lower bound tracking behavior. Summary: In the use optimizer, we need to keep track of whether the lower bound still dominates us or else we may decide a lower bound is still valid when it is not due to intervening pushes/pops. Fixes PR28880 (and probably a bunch of other things). Reviewers: george.burgess.iv Subscribers: MatzeB, llvm-commits, sebpop Differential Revision: https://reviews.llvm.org/D23237 llvm-svn: 277978	2016-08-08 04:44:53 +00:00
Eli Friedman	02419a9849	[JumpThreading] Fix handling of aliasing metadata. Summary: The correctness fix here is that when we CSE a load with another load, we need to combine the metadata on the two loads. This matches the behavior of other passes, like instcombine and GVN. There's also a minor optimization improvement here: for load PRE, the aliasing metadata on the inserted load should be the same as the metadata on the original load. Not sure why the old code was throwing it away. Issue found by inspection. Differential Revision: http://reviews.llvm.org/D21460 llvm-svn: 277977	2016-08-08 04:10:22 +00:00
Davide Italiano	151e5be5ea	[MC] Delete use of *structors_used. Jim Grosbach and Kevin Enderby think those are not used anymore. Originally submitted by: Rafael Espindola llvm-svn: 277973	2016-08-08 03:30:01 +00:00
Davide Italiano	e3b916d164	[SimplifyLibCalls] Emit sqrt intrinsic instead of a libcall. llvm-svn: 277972	2016-08-08 03:23:01 +00:00
Eli Friedman	2a65dd1ba6	[SROA] Fix crash with lifetime intrinsic partially covering alloca. Summary: PromoteMemToReg looks specifically for the pattern bitcast+lifetime.start (or a bitcast-equivalent GEP); any offset will lead to an assertion failure. Fixes https://llvm.org/bugs/show_bug.cgi?id=27999 . Differential Revision: https://reviews.llvm.org/D22737 llvm-svn: 277969	2016-08-08 01:30:53 +00:00
Craig Topper	f44423120f	[AVX-512] Improve lowering of inserting a single element into lowest element of a 512-bit vector of zeroes by using vmovq/vmovd/vmovss/vmovsd. llvm-svn: 277965	2016-08-07 21:52:59 +00:00
Davide Italiano	27da131f32	[SLC] Emit an intrinsic instead of a libcall for pow. Differential Revision: https://reviews.llvm.org/D22104 llvm-svn: 277963	2016-08-07 20:27:03 +00:00
Nico Weber	99ceee8a85	Revert r277905, it caused PR28894 llvm-svn: 277962	2016-08-07 20:18:04 +00:00
Craig Topper	2c51c74d52	[AVX-512] Add 512-bit logical operations to load folding tables. Add avx512f stack folding test and move some tests from the avx512vl test. llvm-svn: 277961	2016-08-07 17:14:09 +00:00
Craig Topper	938e7ab9e1	[AVX-512] Add EVEX encoded floating point MAX/MIN instructions to the load folding tables. llvm-svn: 277960	2016-08-07 17:14:05 +00:00
Simon Pilgrim	21c61fba45	[X86] lowerVectorShuffle - ensure that undefined mask elements only use SM_SentinelUndef Help lowering and combining (which can specify SM_SentinelZero mask elements) share more shuffle matching code. llvm-svn: 277959	2016-08-07 15:29:12 +00:00
Elena Demikhovsky	dca03bebd3	AVX-512: Changed lowering of BITCAST between i1 vectors and i8/i16/i32 integer values Optimized lowering of BITCAST node. The BITCAST node can be replaced with COPY_TO_REG instead of KMOV. This allows suppressing two opposite BITCAST operations and avoids redundant "movs". Differential Revision: https://reviews.llvm.org/D23247 llvm-svn: 277958	2016-08-07 13:05:58 +00:00
David Majnemer	d150137f64	[InstSimplify] Fold gep (gep V, C), (sub 0, V) to C llvm-svn: 277952	2016-08-07 07:58:12 +00:00
David Majnemer	dc8767a49a	[InstSimplify] Try hard to simplify pointer comparisons Simplify ptrtoint comparisons involving operands with different source types. llvm-svn: 277951	2016-08-07 07:58:10 +00:00
David Majnemer	4e4f4437c2	[InstCombine] Infer inbounds on geps of allocas llvm-svn: 277950	2016-08-07 07:58:00 +00:00
Craig Topper	49841c3812	[X86] Add commutable floating point max/min instructions to the load folding tables. llvm-svn: 277949	2016-08-07 05:39:51 +00:00
Craig Topper	c4d757093e	[X86] Simplify a shuffle mask copy. NFC llvm-svn: 277947	2016-08-07 05:39:46 +00:00
Michael Zolotukhin	442b82f0eb	Revert "Revert "[LoopSimplify] Fix updating LCSSA after separating nested loops."" This reverts commit r277901. Reapply the commit as it looks like it has nothing to do with the bot failures. llvm-svn: 277946	2016-08-07 01:56:54 +00:00
Lang Hames	4679644c53	[ExecutionEngine][RuntimeDyld] Move JITSymbol from ExecutionEngine to RuntimeDyld. JITSymbol really belongs in RuntimeDyld. This should fix the llvm-rtdyld link failures caused by r277943. llvm-svn: 277945	2016-08-07 01:19:37 +00:00
Lang Hames	71f089c82b	[RuntimeDyld] Remove symbol that is unused as of r277943. llvm-svn: 277944	2016-08-07 01:12:44 +00:00
Lang Hames	00769a0904	[RuntimeDyld] Replace manual flag checks with JITSymbolFlags::fromObjectSymbol. llvm-svn: 277943	2016-08-07 00:18:14 +00:00
Lang Hames	73976f622d	[ORC] Re-apply r277896, removing bogus triples and datalayouts that broke tests on linux last time. llvm-svn: 277942	2016-08-06 22:36:26 +00:00
Kostya Serebryany	728447bd3b	[libFuzzer] make libFuzzer work with a bit older clang versions llvm-svn: 277941	2016-08-06 21:28:56 +00:00
Kostya Serebryany	ff1f2107ec	[libFuzzer] don't print bogus error message llvm-svn: 277940	2016-08-06 21:23:29 +00:00
Simon Pilgrim	bc573ca1b8	[X86][AVX2] Improve sign/zero extension on AVX2 targets Split extensions to large vectors into 256-bit chunks - the equivalent of what we do with pre-AVX2 into 128-bit chunks llvm-svn: 277939	2016-08-06 21:21:12 +00:00
Gor Nishanov	28c889593a	CoroSplit: Squash unused variable FnTrigger warning in NDEBUG llvm-svn: 277938	2016-08-06 21:11:10 +00:00
Gor Nishanov	2ed6e788a8	[Coroutines] Part 5: Add CGSCC restart trigger Summary: CoroSplit pass processes the coroutine twice. First, it lets it go through complete IPO optimization pipeline as a single function. It forces restart of the pipeline by inserting an indirect call to an empty function "coro.devirt.trigger" which is devirtualized by CoroElide pass that triggers a restart of the pipeline by CGPassManager. (In later patches, when CoroSplit pass sees the same coroutine the second time, it splits it up, adds coroutine subfunctions to the SCC to be processed by IPO pipeline.) Documentation and overview is here: http://llvm.org/docs/Coroutines.html. Upstreaming sequence (rough plan) 1.Add documentation. (https://reviews.llvm.org/D22603) 2.Add coroutine intrinsics. (https://reviews.llvm.org/D22659) 3.Add empty coroutine passes. (https://reviews.llvm.org/D22847) 4.Add coroutine devirtualization + tests. ab) Lower coro.resume and coro.destroy (https://reviews.llvm.org/D22998) c) Do devirtualization (https://reviews.llvm.org/D23229) 5.Add CGSCC restart trigger + tests. <= we are here 6.Add coroutine heap elision + tests. 7.Add the rest of the logic (split into more patches) Reviewers: mehdi_amini, majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23234 llvm-svn: 277936	2016-08-06 20:44:39 +00:00
Craig Topper	9d8676acc0	[AVX-512] Add SQRT/RCP14/RNDSCALE to hasUndefRegUpdate. llvm-svn: 277934	2016-08-06 19:31:52 +00:00
Craig Topper	19505bc354	[AVX-512] Add AVX-512 scalar CVT instructions to hasUndefRegUpdate. llvm-svn: 277933	2016-08-06 19:31:50 +00:00
Craig Topper	f5d05fb0ce	[X86] Add VRCPSSr_Int, VRSQRTSSr_Int, VSQRTSSr_Int, and VSQRTSDr_Int to hasUndefRegUpdate. llvm-svn: 277931	2016-08-06 19:31:44 +00:00
Simon Pilgrim	7d168e19e8	[X86][SSE] Enable commutation between MOVHLPS and UNPCKHPD Assuming SSE2 is available then we can safely commute between these, removing some unnecessary register moves and improving memory folding opportunities. VEX encoded versions don't benefit so I haven't added support to them. llvm-svn: 277930	2016-08-06 18:40:28 +00:00
Mike Aizatsky	a8e84b9b37	[libfuzzer] do not warn about missing pcbuffer functions: they are new. llvm-svn: 277927	2016-08-06 17:03:22 +00:00
Benjamin Kramer	3f0c1e625d	[ARM] Don't copy MCInsts in loop. NFC. llvm-svn: 277924	2016-08-06 12:58:24 +00:00
Benjamin Kramer	41e66dade1	[Inliner] Use function_ref for functors which are never taken ownership of. llvm-svn: 277922	2016-08-06 12:33:46 +00:00
Benjamin Kramer	a3d4def878	[LoadCombine] Simplify code with a brace init. NFC. llvm-svn: 277921	2016-08-06 12:11:11 +00:00
Simon Pilgrim	f56309f11a	[X86][SSE] Add 2 input shuffle support to matchBinaryVectorShuffle Not actually used yet... llvm-svn: 277919	2016-08-06 11:22:39 +00:00
Benjamin Kramer	b7d3311c77	Move helpers into anonymous namespaces. NFC. llvm-svn: 277916	2016-08-06 11:13:10 +00:00
David Majnemer	70c93fa69a	[CodeGen] Fix a -Wdocumentation warning A parameter was documented with the wrong name. No functionality change is intended. llvm-svn: 277915	2016-08-06 08:37:12 +00:00
David Majnemer	a19d0f2f3e	[ValueTracking] Teach computeKnownBits about [su]min/max Reasoning about a select in terms of a min or max allows us to derive a tighter bound on the result. llvm-svn: 277914	2016-08-06 08:16:00 +00:00
David Majnemer	1665d8635e	[CallGraphSCCPass] Use an ArrayRef instead of a pair of iterators No functional change is intended. llvm-svn: 277913	2016-08-06 06:21:02 +00:00
Sanjoy Das	ba04d3a620	[InstCombine] Don't coerce non-integral pointers to integers Reviewers: majnemer Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D23231 llvm-svn: 277910	2016-08-06 02:58:48 +00:00
Matthias Braun	9a0035d8d2	Revert "(refs/bisect/bad) GVN-hoist: enable by default" GVN-Hoist appears to miscompile llvm-testsuite SingleSource/Benchmarks/Misc/fbench.c at the moment. I filed http://llvm.org/PR28880 This reverts commit r277786. llvm-svn: 277909	2016-08-06 02:23:15 +00:00
Gor Nishanov	31d8c9af89	Part 4c: Coroutine Devirtualization: Devirtualize coro.resume and coro.destroy. Summary: This is the 4c patch of the coroutine series. CoroElide pass now checks if PostSplit coro.begin is referenced by coro.subfn.addr intrinsics. If so replace coro.subfn.addrs with an appropriate coroutine subfunction associated with that coro.begin. Documentation and overview is here: http://llvm.org/docs/Coroutines.html. Upstreaming sequence (rough plan) 1.Add documentation. (https://reviews.llvm.org/D22603) 2.Add coroutine intrinsics. (https://reviews.llvm.org/D22659) 3.Add empty coroutine passes. (https://reviews.llvm.org/D22847) 4.Add coroutine devirtualization + tests. ab) Lower coro.resume and coro.destroy (https://reviews.llvm.org/D22998) c) Do devirtualization <= we are here 5.Add CGSCC restart trigger + tests. 6.Add coroutine heap elision + tests. 7.Add the rest of the logic (split into more patches) Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D23229 llvm-svn: 277908	2016-08-06 02:16:35 +00:00
Nico Weber	c893e603ab	Revert r277896. It breaks ExecutionEngine/OrcLazy/weak-function.ll on most bots. Script: -- ... -- Exit Code: 1 Command Output (stderr): -- Could not find main function. llvm-svn: 277907	2016-08-06 02:00:45 +00:00
Kyle Butt	71cb44d969	CodeGen: If Convert blocks that would form a diamond when tail-merged. The following function currently relies on tail-merging for if conversion to succeed. The common tail of cond_true and cond_false is extracted, and this then forms a diamond pattern that can be successfully if converted. If this block does not get extracted, either because tail-merging is disabled or the threshold is higher, we should still recognize this pattern and if-convert it. define i32 @t2(i32 %a, i32 %b) nounwind { entry: %tmp1434 = icmp eq i32 %a, %b ; <i1> [#uses=1] br i1 %tmp1434, label %bb17, label %bb.outer bb.outer: ; preds = %cond_false, %entry %b_addr.021.0.ph = phi i32 [ %b, %entry ], [ %tmp10, %cond_false ] %a_addr.026.0.ph = phi i32 [ %a, %entry ], [ %a_addr.026.0, %cond_false ] br label %bb bb: ; preds = %cond_true, %bb.outer %indvar = phi i32 [ 0, %bb.outer ], [ %indvar.next, %cond_true ] %tmp. = sub i32 0, %b_addr.021.0.ph %tmp.40 = mul i32 %indvar, %tmp. %a_addr.026.0 = add i32 %tmp.40, %a_addr.026.0.ph %tmp3 = icmp sgt i32 %a_addr.026.0, %b_addr.021.0.ph br i1 %tmp3, label %cond_true, label %cond_false cond_true: ; preds = %bb %tmp7 = sub i32 %a_addr.026.0, %b_addr.021.0.ph %tmp1437 = icmp eq i32 %tmp7, %b_addr.021.0.ph %indvar.next = add i32 %indvar, 1 br i1 %tmp1437, label %bb17, label %bb cond_false: ; preds = %bb %tmp10 = sub i32 %b_addr.021.0.ph, %a_addr.026.0 %tmp14 = icmp eq i32 %a_addr.026.0, %tmp10 br i1 %tmp14, label %bb17, label %bb.outer bb17: ; preds = %cond_false, %cond_true, %entry %a_addr.026.1 = phi i32 [ %a, %entry ], [ %tmp7, %cond_true ], [ %a_addr.026.0, %cond_false ] ret i32 %a_addr.026.1 } Without tail-merging or diamond-tail if conversion: LBB1_1: @ %bb @ =>This Inner Loop Header: Depth=1 cmp r0, r1 ble LBB1_3 @ BB#2: @ %cond_true @ in Loop: Header=BB1_1 Depth=1 subs r0, r0, r1 cmp r1, r0 it ne cmpne r0, r1 bgt LBB1_4 LBB1_3: @ %cond_false @ in Loop: Header=BB1_1 Depth=1 subs r1, r1, r0 cmp r1, r0 bne LBB1_1 LBB1_4: @ %bb17 bx lr With diamond-tail if conversion, but without tail-merging: @ BB#0: @ %entry cmp r0, r1 it eq bxeq lr LBB1_1: @ %bb @ =>This Inner Loop Header: Depth=1 cmp r0, r1 ite le suble r1, r1, r0 subgt r0, r0, r1 cmp r1, r0 bne LBB1_1 @ BB#2: @ %bb17 bx lr llvm-svn: 277905	2016-08-06 01:52:37 +00:00
Kyle Butt	54bf3cef92	IfConverter: Split ScanInstructions into 2 functions. ScanInstructions is now 2 functions: AnalyzeBranches and ScanInstructions. ScanInstructions also now takes a pair of arguments delimiting the instructions to be scanned. This will be used for forked diamond support to re-scan only a portion of the block. llvm-svn: 277904	2016-08-06 01:52:34 +00:00
Kyle Butt	4f0e287906	IfConversion: Document countDuplicatedInstructions. NFC llvm-svn: 277903	2016-08-06 01:52:33 +00:00
Kyle Butt	fe916828ee	IfConversion: factor out 2 functions to skip debug instrs. NFC Skipping debug instructions occurs repeatedly; factor it out. llvm-svn: 277902	2016-08-06 01:52:31 +00:00
Michael Zolotukhin	09cf304ebc	Revert "[LoopSimplify] Fix updating LCSSA after separating nested loops." This reverts commit r277877. Try to appease clang-x64-ninja-win7 buildbot. llvm-svn: 277901	2016-08-06 01:48:51 +00:00
Lang Hames	62a459603c	[ORC] Add (partial) weak symbol support to the CompileOnDemand layer. This adds partial support for weak functions to the CompileOnDemandLayer by modifying the addLogicalModule method to check for existing stub definitions before building a new stub for a weak function. This scheme is sufficient to support ODR definitions, but fails for general weak definitions if strong definition is encountered after the first weak definition. (A more extensive refactor will be required to fully support weak symbols). This patch does not add weak symbol support to RuntimeDyld: I hope to add that in the near future. llvm-svn: 277896	2016-08-06 00:54:43 +00:00
Sanjoy Das	b8c2ebea08	[IRCE] Remove unused headers; NFC llvm-svn: 277892	2016-08-06 00:02:01 +00:00
Sanjoy Das	cf181867a6	[IRCE] Preserve loop-simplify form Fixes PR28764. Right now there is no way to test this, but (as mentioned on the PR) with Michael Zolotukhin's yet to be checked in LoopSimplify verifier, 8 of the llvm-lit tests for IRCE crash. llvm-svn: 277891	2016-08-06 00:01:56 +00:00
Sanjay Patel	8e3ab17c44	[InstCombine] refactor ctlz/cttz folds (NFCI) Note that this fold really belongs in InstSimplify. Refactoring here anyway as an intermediate step because there's a planned addition to this function in D23134. Differential Revision: https://reviews.llvm.org/D23223 llvm-svn: 277883	2016-08-05 22:42:46 +00:00
Daniel Berlin	7ac3d74017	[MSSA] Use depth first iterator instead of custom version. Summary: Originally the plan was to use the custom worklist to do some block popping, and because we don't actually need a visited set. The custom one we have here is slightly broken, and it's not worth fixing vs using depth_first_iterator since we aren't going to go the route we originally were. Fixes PR28874 Reviewers: george.burgess.iv Subscribers: llvm-commits, gberry Differential Revision: https://reviews.llvm.org/D23187 llvm-svn: 277880	2016-08-05 22:09:14 +00:00
Justin Bogner	272cbacc25	CodeView: Remove an unused variable It was breaking the -Werror build. llvm-svn: 277878	2016-08-05 21:57:10 +00:00
Michael Zolotukhin	4c65c3596a	[LoopSimplify] Fix updating LCSSA after separating nested loops. This fixes PR28825. The problem was that we only checked if a value from a created inner loop is used in the outer loop, and fixed LCSSA for them. But we failed to fix up LCSSA for values used in exits of the outer loop. llvm-svn: 277877	2016-08-05 21:52:58 +00:00
Zachary Turner	5e35eaac83	Fix non portable include path. llvm-svn: 277876	2016-08-05 21:50:02 +00:00
Daniel Berlin	7af95876cf	[MSSA] Match assert vs llvm_unreachable style in verification functions. llvm-svn: 277873	2016-08-05 21:47:20 +00:00
Daniel Berlin	2919b1c41b	Rewrite domination verifier to handle local domination as well. Summary: Rewrite domination verifier to handle local domination as well. This catches a bug Geoff Berry noticed. Reviewers: george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23184 llvm-svn: 277872	2016-08-05 21:46:52 +00:00
Zachary Turner	5e3e4bb26b	[CodeView] Decouple record deserialization from visitor dispatch. Until now, our use case for the visitor has been to take a stream of bytes representing a type stream, deserialize the records in sequence, and do something with them, where "something" is determined by how the user implements a particular set of callbacks on an abstract class. For actually writing PDBs, however, we want to do the reverse. We have some kind of description of the list of records in their in-memory format, and we want to process each one. Perhaps by serializing them to a byte stream, or perhaps by converting them from one description format (Yaml) to another (in-memory representation). This was difficult in the current model because deserialization and invoking the callbacks were tightly coupled. With this patch we change this so that TypeDeserializer is itself an implementation of the particular set of callbacks. This decouples deserialization from the iteration over a list of records and invocation of the callbacks. TypeDeserializer is initialized with another implementation of the callback interface, so that upon deserialization it can pass the deserialized record through to the next set of callbacks. In a sense this is like an implementation of the Decorator design pattern, where the Deserializer is a decorator. This will be useful for writing Pdbs from yaml, where we have a description of the type records in Yaml format. In this case, the visitor implementation would have each visitation callback method implemented in such a way as to extract the proper set of fields from the Yaml, and it could maintain state that builds up a list of these records. Finally at the end we can pass this information through to another set of callbacks which serializes them into a byte stream. Reviewed By: majnemer, ruiu, rnk Differential Revision: https://reviews.llvm.org/D23177 llvm-svn: 277871	2016-08-05 21:45:34 +00:00
Marek Olsak	355a8642b4	AMDGPU/SI: Increase SGPR limit to 96 on Tonga/Iceland Summary: This is the setting of the Vulkan closed source driver. It decreases the max wave count from 10 to 8. 26010 shaders in 14650 tests Totals: VGPRS: 829593 -> 808440 (-2.55 %) Spilled SGPRs: 81878 -> 42226 (-48.43 %) Spilled VGPRs: 367 -> 358 (-2.45 %) Scratch VGPRs: 1764 -> 1748 (-0.91 %) dwords per thread Code Size: 36677864 -> 35923932 (-2.06 %) bytes There is a massive decrease in SGPR spilling in general and -7.4% spilled VGPRs for DiRT Showdown (= SGPRs spilled to scratch?) Reviewers: arsenm, tstellarAMD, nhaehnle Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D23034 llvm-svn: 277867	2016-08-05 21:23:29 +00:00
Weiming Zhao	f68a6a720c	[ARM] Constant Materialize: imms with specific value can be encoded into mov.w Summary: Thumb2 supports encoding immediates with specific patterns into mov.w by splatting the low 8 bits into other bytes. I'm resubmitting this patch. The test case in the original commit r277610 does not specify a triple, so builds with a different default triple will have different output. This patch fixes the triple as thumb-darwin-apple. Reviewers: john.brawn, jmolloy, bruno Subscribers: jmolloy, aemerson, rengolin, samparker, llvm-commits Differential Revision: https://reviews.llvm.org/D23090 llvm-svn: 277865	2016-08-05 20:58:29 +00:00
Davide Italiano	500929df9c	[FlattenCFG] Simplify + remove unused variable. NFCI. llvm-svn: 277864	2016-08-05 20:53:35 +00:00
Dehao Chen	e1c7c57d11	Remove cold callsite heuristic that is not necessary because of cold callee heuristic. llvm-svn: 277863	2016-08-05 20:49:04 +00:00
Dehao Chen	de39cb9384	Replace hot-callsite based heuristic to use its own threshold parameter instead of share inline-hint parameter Summary: Hot callsites should have higher threshold than inline hints. This patch uses separate threshold parameter for hot callsites. Reviewers: davidxl, eraman Subscribers: Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D22368 llvm-svn: 277860	2016-08-05 20:28:41 +00:00
Mike Aizatsky	b4bbc3bb7a	[sanitizers] trace buffer API to use user-allocated buffer. Differential Revision: https://reviews.llvm.org/D23185 llvm-svn: 277859	2016-08-05 20:09:53 +00:00
Ivan Krasin	b05e06e4fd	WholeProgramDevirt: print remarks with devirtualized method names. Summary: Chrome on Linux uses WholeProgramDevirt for speed ups, and it's important to detect regressions on both sides: the toolchain, if fewer methods get devirtualized after an update, and Chrome, if an innocent-looking change caused many hot methods to become virtual again. The need to track devirtualized methods is not Chrome-specific, but it's probably the only user of the pass at this time. Reviewers: kcc Differential Revision: https://reviews.llvm.org/D23219 llvm-svn: 277856	2016-08-05 19:45:16 +00:00
David Callahan	45e442ebaa	[ADCE] Refactoring for new functionality (NFC) Summary: This is another refactoring to break up the one function into three logical component functions. Another non-functional change before we start adding features. Reviewers: nadav, mehdi_amini, majnemer Subscribers: twoh, freik, llvm-commits Differential Revision: https://reviews.llvm.org/D23102 llvm-svn: 277855	2016-08-05 19:38:11 +00:00
Sanjoy Das	6fa08aafcc	[ConstantFolding] Don't create illegal (non-integral) inttoptrs Reviewers: majnemer, arsenm Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D23182 llvm-svn: 277854	2016-08-05 19:23:29 +00:00
David Callahan	c1c810de0b	[AutoFDO] Fix handling of empty profiles Summary: If a profile has no samples for a function, then the function "entry count" is set to the value 0. Several places in the code test whether Function::getEntryCount is defined at all. Here we change to treat a 0 entry count the same as undefined. In particular, this fixes a problem in getLayoutSuccessorProbThreshold in MachineBlockPlacement.cpp where we use a different and inferior heuristic for laying out basic blocks. Reviewers: danielcdh, dnovillo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23082 llvm-svn: 277849	2016-08-05 18:38:19 +00:00
Sanjoy Das	b0b4e86215	[SCEV] Don't infinitely recurse on unreachable code llvm-svn: 277848	2016-08-05 18:34:14 +00:00
Kevin Enderby	600fb3f28e	Add the first of what will be a long line of additional error checks for invalid Mach-O files. This is where an LC_SEGMENT load command has a fileoff field that extends past the end of the file. Also fix llvm-nm and llvm-size to remove the errorToErrorCode() call so error messages are printed. And needed to update a few test cases now that they do print the error messages just a bit differently. llvm-svn: 277845	2016-08-05 18:19:40 +00:00
Dehao Chen	17c6afc35b	Do not assign new discriminator for all intrinsics. Summary: We do not care about intrinsic calls when assigning discriminators. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23212 llvm-svn: 277843	2016-08-05 17:56:49 +00:00
Tim Northover	14e7f73a0f	GlobalISel: clear pending phis after MachineFunction translated Test is just reordering the existing functions (it would trigger for any function after one with a phi). llvm-svn: 277841	2016-08-05 17:50:36 +00:00
Simon Pilgrim	69b6a70834	[X86][SSE] Add initial support for 2 input target shuffle combining. At the moment only the INSERTPS matching can actually use 2 inputs but the plumbing is now in place. llvm-svn: 277839	2016-08-05 17:36:14 +00:00
Tim Northover	97d0cb3165	GlobalISel: IRTranslate PHI instructions llvm-svn: 277835	2016-08-05 17:16:40 +00:00
Ulrich Weigand	c3b495a649	[PowerPC] Wrong fast-isel codegen for VSX floating-point loads There were two locations where fast-isel would generate a LFD instruction with a target register class VSFRC instead of F8RC when VSX was enabled. This can cause invalid registers to be used in certain cases, like: lfd 36, ... instead of using a VSX load instruction. The wrong register number gets silently truncated, causing invalid code to be generated. The first place is PPCFastISel::PPCEmitLoad, which had multiple problems: 1.) The IsVSSRC and IsVSFRC flags are not initialized correctly, since they are computed from resultReg, which is still zero at this point in many cases. Fixed by changing the helper routines to operate on a register class instead of a register and passing in UseRC. 2.) Even with this fixed, Is64VSXLoad is still wrong due to a typo: bool Is32VSXLoad = IsVSSRC && Opc == PPC::LFS; bool Is64VSXLoad = IsVSSRC && Opc == PPC::LFD; The second line needs to use isVSFRC (like PPCEmitStore does). 3.) Once both the above are fixed, we're now generating a VSX instruction -- but an incorrect one, since generation of an indexed instruction with null index is wrong. Fixed by copying the code handling the same issue in PPCEmitStore. The second place is PPCFastISel::PPCMaterializeFP, where we would emit an LFD to load a constant from the literal pool, and use the wrong result register class. Fixed by hardcoding a F8RC class even on systems supporting VSX. Fixes: https://llvm.org/bugs/show_bug.cgi?id=28630 Differential Revision: https://reviews.llvm.org/D22632 llvm-svn: 277823	2016-08-05 15:22:05 +00:00
Zhan Jun Liau	8d3f29759f	[SystemZ] Add missing classes and instructions Summary: Add instruction formats E, RSI, SSd, SSE, and SSF. Added BRXH, BRXLE, PR, MVCK, STRAG, and ECTG instructions to test out those formats. Reviewers: uweigand Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23179 llvm-svn: 277822	2016-08-05 15:14:34 +00:00
Benjamin Kramer	aa160c22f7	[SimplifyCFG] Make range reduction code deterministic. This generated IR based on the order of evaluation, which is different between GCC and Clang. With that in mind you get bootstrap miscompares if you compare a Clang built with GCC-built Clang vs. Clang built with Clang-built Clang. Diagnosing that made my head hurt. This also reverts commit r277337, which "fixed" the test case. llvm-svn: 277820	2016-08-05 14:55:02 +00:00
Simon Pilgrim	24dc1e7a90	[X86][SSE] Update the target shuffle matches to use the effective mask's value type directly instead of via the input value type. Preparation for adding 2 input support so we want to avoid unnecessary references to the input value type. llvm-svn: 277817	2016-08-05 14:33:11 +00:00
Simon Pilgrim	7080005e67	[X86][SSE] Consistently use the target shuffle root value type for vector size calculations. NFCI. Preparation for adding 2 input support so we want to avoid unnecessary references to the input value type. llvm-svn: 277814	2016-08-05 13:02:53 +00:00
NAKAMURA Takumi	f72c663ac5	LLLexer.cpp: Avoid using BitsToDouble() to preserve SNaN like "double 0x7FF4000000000000". We should not use double (or float) in the LLVM, unless it is really needed. x87 FP register doesn't preserve SNaN to move the value. FIXME: APFloat() may have the constructor by raw bit. llvm-svn: 277813	2016-08-05 11:59:49 +00:00
NAKAMURA Takumi	2b8c774ce7	Reformat. llvm-svn: 277812	2016-08-05 11:59:45 +00:00
Simon Pilgrim	6f7b0cd530	[X86][SSE] Added target shuffle combine binary compute matching function. NFCI. Added matchBinaryPermuteVectorShuffle and moved the blend+zero and insertps matching code into it. llvm-svn: 277808	2016-08-05 11:16:53 +00:00
John Brawn	4d79ec7fe8	Reapply r276973 "Adjust Registry interface to not require plugins to export a registry" This differs from the previous version by being more careful about template instantiation/specialization in order to prevent errors when building with clang -Werror. Specifically: * begin is not defined in the template and is instead instantiated when Head is. I think the warning when we don't do that is wrong (PR28815) but for now at least do it this way to avoid the warning. * Instead of performing template specializations in LLVM_INSTANTIATE_REGISTRY instead provide a template definition then do explicit instantiation. No compiler I've tried has problems with doing it the other way, but strictly speaking it's not permitted by the C++ standard so better safe than sorry. Original commit message: Currently the Registry class contains the vestiges of a previous attempt to allow plugins to be used on Windows without using BUILD_SHARED_LIBS, where a plugin would have its own copy of a registry and export it to be imported by the tool that's loading the plugin. This only works if the plugin is entirely self-contained with the only interface between the plugin and tool being the registry, and in particular this conflicts with how IR pass plugins work. This patch changes things so that instead the add_node function of the registry is exported by the tool and then imported by the plugin, which solves this problem and also means that instead of every plugin having to export every registry they use instead LLVM only has to export the add_node functions. This allows plugins that use a registry to work on Windows if LLVM_EXPORT_SYMBOLS_FOR_PLUGINS is used. llvm-svn: 277806	2016-08-05 11:01:08 +00:00
Strahinja Petrovic	30e0ce8e9f	[PowerPC] fix passing long double arguments to function (soft-float) This patch fixes passing long double type arguments to function in soft float mode. If there are fewer than 4 argument registers free (long double type is mapped in 4 gpr registers in soft float mode) long double type argument must be passed through the stack. Differential Revision: https://reviews.llvm.org/D20114. llvm-svn: 277804	2016-08-05 08:47:26 +00:00
Nicolai Haehnle	870bf1788c	[InstCombine] try to fold (select C, (sext A), B) into logical ops Summary: Turn (select C, (sext A), B) into (sext (select C, A, B')) when A is i1 and B is a compatible constant, also for zext instead of sext. This will then be further folded into logical operations. The transformation would be valid for non-i1 types as well, but other parts of InstCombine prefer to have sext from non-i1 as an operand of select. Motivated by the shader compiler frontend in Mesa for AMDGPU, which emits i32 for boolean operations. With this change, the boolean logic is fully recovered. Reviewers: majnemer, spatel, tstellarAMD Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D22747 llvm-svn: 277801	2016-08-05 08:22:29 +00:00
Justin Bogner	c7e4fbe11c	InstCombine: Clean up some trailing whitespace. NFC llvm-svn: 277793	2016-08-05 01:09:48 +00:00
Justin Bogner	9979840f59	InstCombine: Replace some never-null pointers with references. NFC llvm-svn: 277792	2016-08-05 01:06:44 +00:00
Sebastian Pop	c33f0e25c9	GVN-hoist: enable by default llvm-svn: 277786	2016-08-04 23:49:07 +00:00
Sebastian Pop	429740a6c2	GVN-hoist: fix early exit logic The patch splits a complex && if condition into easier to read and understand logic. That wrong early exit condition was letting some instructions with not all operands available pass through when HoistingGeps was true. Differential Revision: https://reviews.llvm.org/D23174 llvm-svn: 277785	2016-08-04 23:49:05 +00:00
Justin Bogner	19dd0da153	IR: Provide an IRBuilder Inserter that calls a callback after insertion Add a generalized IRBuilderCallbackInserter, which is just given a callback to execute after insertion. This can be used to get rid of the custom inserter in InstCombine, which will in turn allow me to add target specific InstCombineCalls API for intrinsics without horrible layering violations. llvm-svn: 277784	2016-08-04 23:41:01 +00:00
Michael Kuperstein	3ceac2bbd5	[LV, X86] Be more optimistic about vectorizing shifts. Shifts with a uniform but non-constant count were considered very expensive to vectorize, because the splat of the uniform count and the shift would tend to appear in different blocks. That made the splat invisible to ISel, and we'd scalarize the shift at codegen time. Since r201655, CodeGenPrepare sinks those splats to be next to their use, and we are able to select the appropriate vector shifts. This updates the cost model to take this into account by making shifts by a uniform count cheap again. Differential Revision: https://reviews.llvm.org/D23049 llvm-svn: 277782	2016-08-04 22:48:03 +00:00
Sanjay Patel	3bade138b5	[InstCombine] use m_APInt to allow icmp eq (mul X, C1), C2 folds for splat constant vectors This concludes the splat vector enhancements for foldICmpEqualityWithConstant(). Other commits in this series: https://reviews.llvm.org/rL277762 https://reviews.llvm.org/rL277752 https://reviews.llvm.org/rL277738 https://reviews.llvm.org/rL277731 https://reviews.llvm.org/rL277659 https://reviews.llvm.org/rL277638 https://reviews.llvm.org/rL277629 llvm-svn: 277779	2016-08-04 22:19:27 +00:00
Kevin Enderby	2c18270075	Clean up the logic of the Archive::Child::Child() with an assert to know Err is not a nullptr when we are pointed at real data. David Blaikie pointed out some odd logic in the case the Err value was a nullptr and Lang Hames suggested it could be cleaned up with an assert to know that Err is not a nullptr when we are pointed at real data. Only in the case of constructing the sentinel value by pointing it at null data is Err permitted to be a nullptr, since no error could occur in that case. With this change the testing for “if (Err)” is removed from the constructor’s logic and *Err is used directly without any check after the assert(). llvm-svn: 277776	2016-08-04 21:54:19 +00:00
Tim Northover	61c16142b4	GlobalISel: extend add widening to SUB, MUL, OR, AND and XOR. These are the operations that are trivially identical. Division is omitted for now because you need to use the correct sign/zero extension. llvm-svn: 277775	2016-08-04 21:39:49 +00:00
Tim Northover	1cfa919b3d	GlobalISel: add support for G_MUL llvm-svn: 277774	2016-08-04 21:39:44 +00:00
Tim Northover	9656f1476c	GlobalISel: implement narrowing for G_ADD. llvm-svn: 277769	2016-08-04 20:54:13 +00:00
Matt Arsenault	6ad97732aa	GVNHoist: Don't hoist convergent calls llvm-svn: 277767	2016-08-04 20:52:57 +00:00
Lang Hames	aac59a26a5	[ExecutionEngine] Refactor - Roll JITSymbolFlags functionality into JITSymbol.h and remove the JITSymbolFlags header. llvm-svn: 277766	2016-08-04 20:32:37 +00:00
David Majnemer	f93082e71a	[coroutines] Part 4[ab]: Coroutine Devirtualization: Lower coro.resume and coro.destroy. This is the fourth patch in the coroutine series. CoroEarly pass now lowers coro.resume and coro.destroy intrinsics by replacing them with an indirect call to an address returned by coro.subfn.addr intrinsic. This is done so that CGPassManager recognizes devirtualization when CoroElide replaces a call to coro.subfn.addr with an appropriate function address. Patch by Gor Nishanov! Differential Revision: https://reviews.llvm.org/D22998 llvm-svn: 277765	2016-08-04 20:30:07 +00:00
Sanjay Patel	d938e88e89	[InstCombine] use m_APInt to allow icmp eq (and X, C1), C2 folds for splat constant vectors llvm-svn: 277762	2016-08-04 20:05:02 +00:00
Yaxun Liu	86c052238a	[OpenCL] Add missing tests for getOCLTypeName Adding missing tests for OCL type names for half, float, double, char, short, long, and unknown. Patch by Aaron En Ye Shi. Differential Revision: https://reviews.llvm.org/D22964 llvm-svn: 277759	2016-08-04 19:45:00 +00:00
Zachary Turner	660230eba4	[CodeView] Use llvm::Error instead of std::error_code. This eliminates the remnants of std::error_code from the DebugInfo libraries. llvm-svn: 277758	2016-08-04 19:39:55 +00:00
Tim Northover	2f32e7f0ac	AArch64: don't assume all i128s are BUILD_PAIRs It leads to a crash when they're not. I'm sure I've made this mistake before, at least once. llvm-svn: 277755	2016-08-04 19:32:28 +00:00
Sanjay Patel	b3de75d3a0	[InstCombine] use m_APInt to allow icmp eq (or X, C1), C2 folds for splat constant vectors llvm-svn: 277752	2016-08-04 19:12:12 +00:00
Tim Northover	06db18fbf8	GlobalISel: also add G_TRUNC to IRTranslator. llvm-svn: 277749	2016-08-04 18:35:17 +00:00
Tim Northover	323358184e	GlobalISel: add code to widen scalar G_ADD llvm-svn: 277747	2016-08-04 18:35:11 +00:00
Derek Schuff	732636d901	[WebAssembly] Check return value of getRegForValue in FastISel Previously, FastISel for WebAssembly wasn't checking the return value of `getRegForValue` in certain cases, which would generate instructions referencing NoReg. This patch fixes this behavior. Patch by Dominic Chen Differential Revision: https://reviews.llvm.org/D23100 llvm-svn: 277742	2016-08-04 18:01:52 +00:00
Krzysztof Parzyszek	04c0796e37	[Hexagon] Validate register class when doing bit simplification llvm-svn: 277740	2016-08-04 17:56:19 +00:00
Sanjay Patel	bcaf6f39dd	[InstCombine] use m_APInt to allow icmp eq (op X, Y), C folds for splat constant vectors I'm removing a misplaced pair of more specific folds from InstCombine in this patch as well, so we know where those folds are happening in InstSimplify. llvm-svn: 277738	2016-08-04 17:48:04 +00:00
Simon Pilgrim	3dbce52c16	[X86][SSE] Rename target shuffle unary permute matching function. NFCI. In preparation for adding a binary permute matching function. llvm-svn: 277737	2016-08-04 17:16:50 +00:00
Alina Sbirlea	6f937b1144	LoadStoreVectorizer: Remove TargetBaseAlign. Keep alignment for stack adjustments. Summary: TargetBaseAlign is no longer required since LSV checks if target allows misaligned accesses. A constant defining a base alignment is still needed for stack accesses where alignment can be adjusted. Previous patch (D22936) was reverted because tests were failing. This patch also fixes the cause of those failures: - x86 failing tests either did not have the right target, or the right alignment. - NVPTX failing tests did not have the right alignment. - AMDGPU failing test (merge-stores) should allow vectorization with the given alignment but the target info considers <3xi32> a non-standard type and gives up early. This patch removes the condition and only checks for a maximum size allowed and relies on the next condition checking for %4 for correctness. This should be revisited to include 3xi32 as a MVT type (on arsenm's non-immediate todo list). Note that checking the sizeInBits for a MVT is undefined (leads to an assertion failure), so we need to create an EVT, hence the interface change in allowsMisaligned to include the Context. Reviewers: arsenm, jlebar, tstellarAMD Subscribers: jholewinski, arsenm, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D23068 llvm-svn: 277735	2016-08-04 16:38:44 +00:00
Daniel Sanders	5dcbac57c5	[mips] Set Personality and LSDA encoding for FreeBSD Reviewers: seanbruno, sdardis Subscribers: tberghammer, danalbert, srhines, dsanders, sdardis, llvm-commits, seanbruno Differential Revision: https://reviews.llvm.org/D23113 llvm-svn: 277732	2016-08-04 15:36:03 +00:00
Sanjay Patel	9d591d15ec	[InstCombine] use m_APInt to allow icmp eq (sub C1, X), C2 folds for splat constant vectors llvm-svn: 277731	2016-08-04 15:19:25 +00:00
Simon Pilgrim	c2370b810d	[X86][SSE] Split off shuffle mask canonicalization from lowerVectorShuffle. NFCI. The new function now returns true if the shuffle should be commuted. This will allow target shuffle combines to share the code. llvm-svn: 277728	2016-08-04 14:21:32 +00:00
Krzysztof Parzyszek	7773c58458	[Hexagon] Clear kill flags from modified registers in peephole optimizer llvm-svn: 277727	2016-08-04 14:17:16 +00:00
Nikolai Bozhenov	f679530ba1	[X86] Heuristic to selectively build Newton-Raphson SQRT estimation On modern Intel processors hardware SQRT in many cases is faster than RSQRT followed by Newton-Raphson refinement. The patch introduces a simple heuristic to choose between hardware SQRT instruction and Newton-Raphson software estimation. The patch treats scalars and vectors differently. The heuristic is that for scalars the compiler should optimize for latency while for vectors it should optimize for throughput. It is based on the assumption that throughput bound code is likely to be vectorized. Basically, the patch disables scalar NR for big cores and disables NR completely for Skylake. Firstly, scalar SQRT has shorter latency than NR code in big cores. Secondly, vector SQRT has been greatly improved in Skylake and has better throughput compared to NR. Differential Revision: https://reviews.llvm.org/D21379 llvm-svn: 277725	2016-08-04 12:47:28 +00:00
Hrvoje Varga	846bdb746d	[mips][microMIPS] Implement CFC1, CFC2, CTC1 and CTC2 instructions Differential Revision: https://reviews.llvm.org/D22347 llvm-svn: 277719	2016-08-04 11:22:52 +00:00
Simon Pilgrim	5d5ca9c0cb	[X86][SSE] Add initial costs for vector CTTZ/CTLZ llvm-svn: 277716	2016-08-04 10:51:41 +00:00
Simon Pilgrim	8ae6dad49b	[X86][SSE] Don't decide when to scalarize CTTZ/CTLZ for performance at lowering - this is what cost models are for Improved CTTZ/CTLZ costings will be added shortly llvm-svn: 277713	2016-08-04 10:14:39 +00:00
Simon Dardis	57f4ae4625	[mips] Enable tail calls by default Enable tail calls by default for (micro)MIPS(64). microMIPS is slightly more tricky than doing it for MIPS(R6) or microMIPSR6. microMIPS has two instruction encodings: 16bit and 32bit along with some restrictions on the size of the instruction that can fill the delay slot. For safe tail calls for microMIPS, the delay slot filler attempts to find a correct size instruction for the delay slot of TAILCALL pseudos. Reviewers: dsanders, vkalintris Subscribers: jfb, dsanders, sdardis, llvm-commits Differential Revision: https://reviews.llvm.org/D21138 llvm-svn: 277708	2016-08-04 09:17:07 +00:00
Diana Picus	ddddbc2440	Typo fix in comment. NFC llvm-svn: 277704	2016-08-04 08:25:08 +00:00
Dean Michael Berris	7e9abea2ae	[XRay] Align entry and return sleds to 2 byte boundaries This should ensure that we can atomically write two bytes (on top of the retq and the one past it) and have those two bytes not straddle cache lines. We also move the label past the alignment instruction so that we can refer to the actual first instruction, as opposed to potential padding before the aligned instruction. Update the tests to allow us to reflect the new order of assembly. Reviewers: rSerge, echristo, majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23101 llvm-svn: 277701	2016-08-04 07:37:28 +00:00
Amaury Sechet	6bea674c43	Add popcount(n) == bitsize(n) -> n == -1 transformation. Summary: As per title. Reviewers: majnemer, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23139 llvm-svn: 277694	2016-08-04 05:27:20 +00:00
David Majnemer	4eefd6bca4	Forgot the dyn_cast_or_null intended for r277691. llvm-svn: 277693	2016-08-04 04:47:18 +00:00
David Majnemer	909793fa63	Reinstate "[CloneFunction] Don't remove side effecting calls" This reinstates r277611 + r277614 and reverts r277642. A cast_or_null should have been a dyn_cast_or_null. llvm-svn: 277691	2016-08-04 04:24:02 +00:00
Bruno Cardoso Lopes	bd887581fc	Revert "GVN-hoist: enable by default" & "Make GVN Hoisting obey optnone/bisect." This reverts commits r277685 & r277688. r277685 broke compiler-rt compilation http://lab.llvm.org:8080/green/job/clang-stage1-configure-RA_build/23335 and r277688 is a followup to it. llvm-svn: 277690	2016-08-04 04:16:24 +00:00
Chandler Carruth	a053a88df5	[PM] Change the name of the repeating utility to something less overloaded (and simpler). Sean rightly pointed out in code review that we've started using "wrapper pass" as a specific part of the old pass manager, and in fact it is more applicable there. Here, we really have a pass template to build a repeated pass, so call it that. llvm-svn: 277689	2016-08-04 03:52:53 +00:00
Sebastian Pop	70ffe6523f	GVN-hoist: enable by default As we addressed all compilation time problems with GVN-hoist https://llvm.org/bugs/show_bug.cgi?id=28670 this patch turns GVN-hoist back by default. Differential Revision: https://reviews.llvm.org/D23136 llvm-svn: 277685	2016-08-04 01:59:42 +00:00
Rui Ueyama	d1d8c8312a	pdbdump: Fix crash bug. pdbdump calls DbiStreamBuilder::commit through PDBFileBuilder::commit without calling DbiStreamBuilder::finalize. Because `finalize` initializes `Header` member, `Header` remained nullptr which caused a crash bug. Differential Revision: https://reviews.llvm.org/D23143 llvm-svn: 277681	2016-08-03 23:43:23 +00:00
Matthias Braun	1873998b16	RenameIndependentSubregs: Fix liveness query in rewriteOperands() rewriteOperands() always performed liveness queries at the base index rather than the RegSlot/Base as appropriate for the machine operand. This could lead to illegal rewriting in some cases. llvm-svn: 277661	2016-08-03 22:37:47 +00:00
Sanjay Patel	00a324e893	[InstCombine] use m_APInt to allow icmp eq (add X, C1), C2 folds for splat constant vectors llvm-svn: 277659	2016-08-03 22:08:44 +00:00
Kevin Enderby	27e85bd0a6	Clean up of libObject/Archive interfaces and change the last three uses of ErrorOr<> changing them to Expected<> to allow them to pass through llvm Errors. No functional change. This commit by itself will break the next lld builds. I’ll be committing the matching change for lld immediately next. llvm-svn: 277656	2016-08-03 21:57:47 +00:00
Guozhi Wei	9584d18d48	[PPC] Handling CallInst in PPCBoolRetToInt This patch fixes pr25548. Current implementation of PPCBoolRetToInt doesn't handle CallInst correctly, so it failed to do the intended optimization when there is a CallInst with parameters. This patch fixed that. llvm-svn: 277655	2016-08-03 21:43:51 +00:00
Bruno Cardoso Lopes	3fcf832cce	Revert "[ARM] Constant Materialize: imms with specific value can be encoded into mov.w" This reverts commit r277610 / d619aa8878c3dafcc0d29a46517f63ff3209fdd4. This makes subtarget-no-movt.ll fail in http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/26892 llvm-svn: 277654	2016-08-03 21:26:21 +00:00
George Burgess IV	363da6f589	[MSSA] Fix a bug in MemorySSA's move ctor. Not a correctness issue, but it would be nice if we didn't have to recompute our block numbering (worst-case) every time we move MSSA. llvm-svn: 277652	2016-08-03 21:07:52 +00:00
Sebastian Pop	2aadad7243	GVN-hoist: limit the length of dependent instructions Limit the number of times the while(1) loop is executed. With this restriction the number of hoisted instructions does not change in a significant way on the test-suite. Differential Revision: https://reviews.llvm.org/D23028 llvm-svn: 277651	2016-08-03 20:54:38 +00:00
Sebastian Pop	4ba7c88cc7	GVN-hoist: compute DFS numbers once With this patch we compute the DFS numbers of instructions only once and update them during the code generation when an instruction gets hoisted. Differential Revision: https://reviews.llvm.org/D23021 llvm-svn: 277650	2016-08-03 20:54:36 +00:00
Sebastian Pop	5d3822fc12	GVN-hoist: compute MSSA once per function (PR28670) With this patch we compute the MemorySSA once and update it in the code generator. Differential Revision: https://reviews.llvm.org/D22966 llvm-svn: 277649	2016-08-03 20:54:33 +00:00
Reid Kleckner	a6be60871f	Revert "[CloneFunction] Don't remove side effecting calls" This reverts commit r277611 and the followup r277614. Bootstrap builds and chromium builds are crashing during inlining after this change. llvm-svn: 277642	2016-08-03 20:01:01 +00:00
George Burgess IV	f7672854f0	[MSSA] clang-format. NFC. Didn't want to fold this in with r277640, since it touches bits that aren't entirely related to r277640. llvm-svn: 277641	2016-08-03 19:59:11 +00:00
George Burgess IV	024f3d2683	[MSSA] Add special handling for invariant/constant loads. This is a follow-up to r277637. It teaches MemorySSA that invariant loads (and loads of provably constant memory) are always liveOnEntry. llvm-svn: 277640	2016-08-03 19:57:02 +00:00
Sanjay Patel	2e9675ff52	[InstCombine] use m_APInt to allow icmp eq (srem X, C1), C2 folds for splat constant vectors llvm-svn: 277638	2016-08-03 19:48:40 +00:00
George Burgess IV	82e355ce48	[MSSA] Add logic for special handling of atomics/volatiles. This patch makes MemorySSA recognize atomic/volatile loads, and makes MSSA treat said loads specially. This allows us to be a bit more aggressive in some cases. Administrative note: Revision was LGTM'ed by reames in person. Additionally, this doesn't include the `invariant.load` recognition in the differential revision, because I feel it's better to commit that separately. Will commit soon. Differential Revision: https://reviews.llvm.org/D16875 llvm-svn: 277637	2016-08-03 19:39:54 +00:00
Tobias Grosser	8757e387dd	[InstCombine] Refactor optimization of zext(or(icmp, icmp)) to enable more aggressive cast-folding Summary: InstCombine unfolds expressions of the form `zext(or(icmp, icmp))` to `or(zext(icmp), zext(icmp))` such that in a later iteration of InstCombine the exposed `zext(icmp)` instructions can be optimized. We now combine this unfolding and the subsequent `zext(icmp)` optimization to be performed together. Since the unfolding doesn't happen separately anymore, we also again enable the folding of `logic(cast(icmp), cast(icmp))` expressions to `cast(logic(icmp, icmp))` which had been disabled due to its interference with the unfolding transformation. Tested via `make check` and `lnt`. Background ========== For a better understanding on how it came to this change we subsequently summarize its history. In commit r275989 we've already tried to enable the folding of `logic(cast(icmp), cast(icmp))` to `cast(logic(icmp, icmp))` which had to be reverted in r276106 because it could lead to an endless loop in InstCombine (also see http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160718/374347.html). The root of this problem is that in `visitZExt()` in InstCombineCasts.cpp there also exists a reverse of the above folding transformation, that unfolds `zext(or(icmp, icmp))` to `or(zext(icmp), zext(icmp))` in order to expose `zext(icmp)` operations which would then possibly be eliminated by subsequent iterations of InstCombine. However, before these `zext(icmp)` would be eliminated the folding from r275989 could kick in and cause InstCombine to endlessly switch back and forth between the folding and the unfolding transformation. This is the reason why we now combine the `zext`-unfolding and the elimination of the exposed `zext(icmp)` to happen at one go because this enables us to still allow the cast-folding in `logic(cast(icmp), cast(icmp))` without entering an endless loop again. 
Details on the submitted changes ================================ - In `visitZExt()` we combine the unfolding and optimization of `zext` instructions. - In `transformZExtICmp()` we have to use `Builder->CreateIntCast()` instead of `CastInst::CreateIntegerCast()` to make sure that the new `CastInst` is inserted in a `BasicBlock`. The new calls to `transformZExtICmp()` that we introduce in `visitZExt()` would otherwise cause the corresponding assertions to be triggered (in our case this happened, for example, with lnt for the MultiSource/Applications/sqlite3 and SingleSource/Regression/C++/EH/recursive-throw tests). The subsequent usage of `replaceInstUsesWith()` is necessary to ensure that the new `CastInst` replaces the `ZExtInst` accordingly. - In InstCombineAndOrXor.cpp we again allow the folding of casts on `icmp` instructions. - The instruction order in the optimized IR for the zext-or-icmp.ll test case is different with the introduced changes. - The test cases in zext.ll have been adopted from the reverted commits r275989 and r276105. Reviewers: grosser, majnemer, spatel Subscribers: eli.friedman, majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D22864 Contributed-by: Matthias Reisinger <d412vv1n@gmail.com> llvm-svn: 277635	2016-08-03 19:30:35 +00:00
Sebastian Pop	031b1bc06f	Pass EphValues by const-ref as it is not modified in the callee Patch by Aditya Kumar. Differential Revision: https://reviews.llvm.org/D22967 llvm-svn: 277634	2016-08-03 19:13:50 +00:00
Simon Pilgrim	898f030f70	[X86][SSE] Enable target shuffle combining to combine multiple shuffle inputs. We currently only support combining target shuffles that consist of a single source input (plus elements known to be undef/zero). This patch generalizes the recursive combining of the target shuffle to collect all the inputs, merging any duplicates along the way, into a full set of src ops and its shuffle mask. We uncover a number of cases where we have failed to combine a unary shuffle because the input has been duplicated and separated during lowering. This will allow us to combine to 2-input shuffles in a future patch. Differential Revision: https://reviews.llvm.org/D22859 llvm-svn: 277631	2016-08-03 19:08:24 +00:00
Vedant Kumar	4031d9f80e	Reapply "More fixes to get good error messages for bad archives." This reverts commit the revert commit r277627. The build errors mentioned in r277627 were likely caused by an unclean build directory. Sorry for the noise. llvm-svn: 277630	2016-08-03 19:02:50 +00:00
Sanjay Patel	43aeb001c9	[InstCombine] use m_APInt to allow icmp (binop X, Y), C folds with constant splat vectors This removes the restriction for the icmp constant, but as noted by the FIXME comments, we still need to change individual checks for binop operand constants. llvm-svn: 277629	2016-08-03 18:59:03 +00:00
Vedant Kumar	bfb6072d84	Revert "More fixes to get good error messages for bad archives." This reverts commit r277540. It breaks the build with: ../lib/Object/Archive.cpp:264:41: error: return type of out-of-line definition of 'llvm::object::ArchiveMemberHeader::getUID' differs from that in the declaration Expected<unsigned> ArchiveMemberHeader::getUID() const { ~~~~~~~~~~~~~~~~~~ ^ include/llvm/Object/Archive.h:53:12: note: previous declaration is here unsigned getUID() const; ~~~~~~~~ ^ llvm-svn: 277627	2016-08-03 18:44:32 +00:00
Krzysztof Parzyszek	23ee12e173	[Hexagon] Generate COPY/REG_SEQUENCE more aggressively for vectors llvm-svn: 277626	2016-08-03 18:35:48 +00:00
Duncan P. N. Exon Smith	9cbc69d1fe	IR: Drop uniquing when an MDNode Value operand is deleted This is a fix for PR28697. An MDNode can indirectly refer to a GlobalValue, through a ConstantAsMetadata. When the GlobalValue is deleted, the MDNode operand is reset to `nullptr`. If the node is uniqued, this can lead to a hard-to-detect cache invalidation in a Metadata map that's shared across an LLVMContext. Consider: 1. A map from Metadata* to `T` called RemappedMDs. 2. A node that references a global variable, `!{i1* @GV}`. 3. Insert `!{i1* @GV} -> SomeT` in the map. 4. Delete `@GV`, leaving behind `!{null} -> SomeT`. Looking up the generic and uninteresting `!{null}` gives you `SomeT`, which is likely related to `@GV`. Worse, `SomeT`'s lifetime may be tied to the deleted `@GV`. This occurs in practice in the shared ValueMap used since r266579 in the IRMover. Other code that handles more than one Module (with different lifetimes) in the same LLVMContext could hit it too. The fix here is a partial revert of r225223: in the rare case that an MDNode operand is a ConstantAsMetadata (i.e., wrapping a node from the Value hierarchy), drop uniquing if it gets replaced with `nullptr`. This changes step #4 above to leave behind `distinct !{null} -> SomeT`, which can't be confused with the generic `!{null}`. In theory, this can cause some churn in the LLVMContext's MDNode uniquing map when Values are being deleted. However: - The number of GlobalValues referenced from uniqued MDNodes is expected to be quite small. E.g., the debug info metadata schema only references GlobalValues from distinct nodes. - Other Constants have the lifetime of the LLVMContext, whose teardown is careful to drop references before deleting the constants. As a result, I don't expect a compile time regression from this change. llvm-svn: 277625	2016-08-03 18:19:43 +00:00
Krzysztof Parzyszek	623afbdbd7	[Hexagon-ish] Add function to print cell map contents in bit tracker llvm-svn: 277622	2016-08-03 18:13:32 +00:00
David Majnemer	fa8ef91748	[CloneFunction] Don't crash if the value map doesn't hold something It is possible for the value map to not have an entry for some value that has already been removed. I don't have a testcase, this is fall-out from a buildbot. llvm-svn: 277614	2016-08-03 17:37:10 +00:00
Sanjay Patel	51a767c6b8	use local variables; NFC llvm-svn: 277612	2016-08-03 17:23:08 +00:00
David Majnemer	fad0490869	[CloneFunction] Don't remove side effecting calls We were able to figure out that the result of a call is some constant. While propagating that fact, we added the constant to the value map. This is problematic because it results in us losing the call site when processing the value map. This fixes PR28802. llvm-svn: 277611	2016-08-03 17:12:47 +00:00
Weiming Zhao	57dc4cf0e1	[ARM] Constant Materialize: imms with specific value can be encoded into mov.w Summary: Thumb2 supports encoding immediates with specific patterns into mov.w by splatting the low 8 bits into other bytes. Reviewers: john.brawn, jmolloy Subscribers: jmolloy, aemerson, rengolin, samparker, llvm-commits Differential Revision: https://reviews.llvm.org/D23090 llvm-svn: 277610	2016-08-03 17:05:23 +00:00
Zachary Turner	8cf51c340d	[msf] Make FPM reader use MappedBlockStream. MappedBlockStream can work with any sequence of block data where the ordering is specified by a list of block numbers. So rather than manually stitch them together in the case of the FPM, reuse this functionality so that we can treat the FPM as if it were contiguous. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D23066 llvm-svn: 277609	2016-08-03 16:53:21 +00:00
Renato Golin	f583097434	Revert "Teach CorrelatedValuePropagation to mark adds as no wrap" This reverts commit r277592, trying to fix the AArch64 42VMA buildbot. llvm-svn: 277607	2016-08-03 16:20:48 +00:00
Benjamin Kramer	0e4b7646c1	Hexagon: Use llvm_unreachable. NFC. llvm-svn: 277605	2016-08-03 15:51:10 +00:00
Elliot Colp	82b1468a4d	Disable shrinking of SNaN constants When expanding FP constants, we attempt to shrink doubles to floats and perform an extending load. However, on SystemZ, and possibly on other targets (I've only confirmed the problem on SystemZ), the FP extending load instruction may convert SNaN into QNaN, or may cause an exception. So in the general case, we would still like to shrink FP constants, but SNaNs should be left as doubles. Differential Revision: https://reviews.llvm.org/D22685 llvm-svn: 277602	2016-08-03 15:09:21 +00:00
Krzysztof Parzyszek	ed4e7827bb	[Hexagon] Do not check alignment for unsized types in isLegalAddressingMode When the same base address is used to load two different data types, LSR would assume a memory type of "void". This type is not sized and has no alignment information. Checking for it causes a crash. llvm-svn: 277601	2016-08-03 15:06:18 +00:00
Gil Rapaport	e7a8fab275	[Loop Vectorizer] Move store-predication into its own function, remove obsolete comment (NFC) Differential Revision: https://reviews.llvm.org/D23013 llvm-svn: 277595	2016-08-03 13:23:43 +00:00
Artur Pilipenko	68cb947cc9	Teach CorrelatedValuePropagation to mark adds as no wrap Use LVI to prove that adds do not wrap. The change is motivated by https://llvm.org/bugs/show_bug.cgi?id=28620 bug and it's the first step to fix that problem. Reviewed By: sanjoy Differential Revision: http://reviews.llvm.org/D23059 llvm-svn: 277592	2016-08-03 13:11:39 +00:00
Igor Breger	c59b3a2236	[AVX512] Add aliases for vcvttss2si{l\|q}, vcvttsd2si{l\|q}, vcvttss2usi{l\|q}, vcvttsd2usi{l\|q} instructions. Differential Revision: http://reviews.llvm.org/D23111 llvm-svn: 277586	2016-08-03 10:58:05 +00:00
Chandler Carruth	fdc6ba1e45	[PM] Fix a mis-named parameter in parseLoopPass -- the pass manager was called "FPM" instead of "LPM" in a hold-over from when the code was modeled on that used to parse function passes. llvm-svn: 277584	2016-08-03 09:14:03 +00:00
Chandler Carruth	241bf2456f	[PM] Add a generic 'repeat N times' pass wrapper to the new pass manager. While this has some utility for debugging and testing on its own, it is primarily intended to demonstrate the technique for adding custom wrappers that can provide more interesting iteration behavior in a nice, orthogonal, and composable layer. Being able to write these kinds of very dynamic and customized controls for running passes was one of the motivating use cases of the new pass manager design, and this gives a hint at how they might look. The actual logic is tiny here, and most of this is just wiring in the pipeline parsing so that this can be widely used. I'm adding this now to show the wiring without a lot of business logic. This is a precursor patch for showing how an "iterate up to N times as long as we devirtualize a call" utility can be added as a separable and composable component alongside the CGSCC pass management. Differential Revision: https://reviews.llvm.org/D22405 llvm-svn: 277581	2016-08-03 07:44:48 +00:00
Dean Michael Berris	0b8f6c8777	[XRay] Make the xray_instr_map section specification more correct Summary: We also add a test to show what currently happens when we create a section per function and emit an xray_instr_map. This illustrates the relationship (or lack thereof) between the per-function section and the xray_instr_map section. We also change the code generation slightly so that we don't always create group sections, but rather only do so if a function where the table is associated with is in a group. Also in this change: - Remove the "merge" flag on the xray_instr_map section. - Test that we're generating the right table for comdat and non-comdat functions. Reviewers: echristo, majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23104 llvm-svn: 277580	2016-08-03 07:21:55 +00:00
Jonas Paulsson	196986ca95	[IfConversion] Bugfix: Don't use undef flag while adding use operands. IfConversion used to always add the undef flag when adding a use operand on a newly predicated instruction. This would be an operand for the register being conditionally redefined. Due to the undef flag, the liveness of this register prior to the predicated instruction would get lost. This patch changes this so that such use operands are added only when the register is live, without the undef flag. This was reverted but pushed again now, for details follow link below. Reviewed by Quentin Colombet. http://reviews.llvm.org/D209077 llvm-svn: 277571	2016-08-03 05:46:35 +00:00
David Callahan	cc5cd4dc65	[ADCE] Refactor anticipating new functionality (NFC) Summary: This is the first refactoring before adding new functionality. Add a class wrapper for the functions and container for state associated with the transformation. No functional change Reviewers: majnemer, nadav, mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23065 llvm-svn: 277565	2016-08-03 04:28:39 +00:00
Mehdi Amini	f9721ba5f1	RecordStreamer: handle inline asm "lazy_reference" and mark symbols as "used" llvm-svn: 277564	2016-08-03 03:51:42 +00:00
Chandler Carruth	4c3e3bf9fb	[PM] Remove the NDEBUG condition around isModulePassName. I forgot to do this initially, and added it when I saw this fail in a no-asserts build, but managed to lose the diff from the actual patch that got submitted. Very sorry. llvm-svn: 277562	2016-08-03 03:26:09 +00:00
Chandler Carruth	6cb2ab2c60	[PM] Significantly refactor the pass pipeline parsing to be easier to reason about and less error prone. The core idea is to fully parse the text without trying to identify passes or structure. This is done with a single state machine. There were various bugs in the logic around this previously that were repeated and scattered across the code. Having a single routine makes it much easier to fix and get correct. For example, this routine doesn't suffer from PR28577. Then the actual pass construction is handled using much easier to read code and simple loops, with particular pass manager construction sunk to live with other pass construction. This is especially nice as the pass managers are in fact passes. Finally, the "implicit" pass manager synthesis is done much more simply by forming "pre-parsed" structures rather than having to duplicate tons of logic. One of the bugs fixed by this was evident in the tests where we accepted a pipeline that wasn't really well formed. Another bug is PR28577 for which I have added a test case. The code is less efficient than the previous code but I'm really hoping that's not a priority. ;] Thanks to Sean for the review! Differential Revision: https://reviews.llvm.org/D22724 llvm-svn: 277561	2016-08-03 03:21:41 +00:00
George Burgess IV	14633b5cd3	[MSSA] Fix a caching bug. This fixes a bug where we'd sometimes cache overly-conservative results with our walker. This bug was made more obvious by r277480, which makes our cache far more spotty than it was. Test case is llvm-unit, because we're likely going to use CachingWalker only for def optimization in the future. The bug stems from that there was a place where the walker assumed that `DefNode.Last` was a valid target to cache to when failing to optimize phis. This is sometimes incorrect if we have a cache hit. The fix is to use the thing we can assume is a valid target to cache to. :) llvm-svn: 277559	2016-08-03 01:22:19 +00:00
Chandler Carruth	8562d3a5e4	[Inliner] clang-format various parts of the inliner prior to changes here. NFC. llvm-svn: 277557	2016-08-03 01:02:31 +00:00
Ivan Krasin	3aade11252	Add -lowertypetests-bitsets-level to control bitsets generation. Summary: Sometimes, bitsets could get really large (>300k entries) and we might want to drop a check, as it would cost too much. Adding a flag to control how much penalty we are willing to pay for bitsets. Reviewers: kcc Differential Revision: https://reviews.llvm.org/D23088 llvm-svn: 277556	2016-08-03 00:59:38 +00:00
Daniel Berlin	df10119e4e	Support for lifetime begin/end markers in the MemorySSA use optimizer Summary: Depends on D23072 Reviewers: george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23076 llvm-svn: 277553	2016-08-03 00:01:46 +00:00
Derek Schuff	5629ec141f	[WebAssembly] Remove unnecessary subtarget checks in peephole pass Leftover from D22686; the passes can handle all the instructions unconditionally; only isel needs to care whether to generate them. llvm-svn: 277549	2016-08-02 23:31:56 +00:00
Evgeniy Stepanov	d99f80b48e	[safestack] Layout large allocas first to reduce fragmentation. llvm-svn: 277544	2016-08-02 23:21:30 +00:00
Derek Schuff	39bf39f35c	[WebAssembly] Initial SIMD128 support. Kicks off the implementation of wasm SIMD128 support (spec: https://github.com/stoklund/portable-simd/blob/master/portable-simd.md), adding support for add, sub, mul for i8x16, i16x8, i32x4, and f32x4. The spec is WIP, and might change in the near future. Patch by João Porto Differential Revision: https://reviews.llvm.org/D22686 llvm-svn: 277543	2016-08-02 23:16:09 +00:00
Tim Northover	765777ce67	ARM: only form SMMLS when SUBE flags unused. In this particular example we wouldn't want the smmls anyway (the value is actually unused), but in general smmls does not provide the required flags register so if that SUBE result is used we can't replace it. llvm-svn: 277541	2016-08-02 23:12:36 +00:00

94831 Commits