llvm-project

Commit Graph

Author	SHA1	Message	Date
Petar Jovanovic	b71386a4a4	[Mips] Add support to match more patterns for DEXT and CINS This patch adds support for recognizing more patterns to match to DEXT and CINS instructions. It finds cases where multiple instructions could be replaced with a single DEXT or CINS instruction. For example, for the following: define i64 @dext_and32(i64 zeroext %a) { entry: %and = and i64 %a, 4294967295 ret i64 %and } instead of generating: 0000000000000088 <dext_and32>: 88: 64010001 daddiu at,zero,1 8c: 0001083c dsll32 at,at,0x0 90: 6421ffff daddiu at,at,-1 94: 03e00008 jr ra 98: 00811024 and v0,a0,at 9c: 00000000 nop the following gets generated: 0000000000000068 <dext_and32>: 68: 03e00008 jr ra 6c: 7c82f803 dext v0,a0,0x0,0x20 Cases that are covered: DEXT: 1. and $src, mask where mask > 0xffff 2. zext $src zero extend from i32 to i64 CINS: 1. and (shl $src, pos), mask 2. shl (and $src, mask), pos 3. zext (shl $src, pos) zero extend from i32 to i64 Patch by Violeta Vukobrat. Differential Revision: https://reviews.llvm.org/D30464 llvm-svn: 297832	2017-03-15 13:10:08 +00:00
Zvi Rackover	4aacd5d3c4	Fix malformed XFAIL in previous commit llvm-svn: 297823	2017-03-15 11:44:14 +00:00
Zvi Rackover	81f7b88910	[DAGCombine] Add reproducer for pr32278 llvm-svn: 297822	2017-03-15 11:34:51 +00:00
Sam Parker	274472f7c5	[ARM] Fix for branch label disassembly for Thumb Different MCInstrAnalysis classes for arm and thumb mode, each with their own evaluateBranch implementation. I added a test case and fixed the coff-relocations test to use '<label>:' rather than '<label>' in the CHECK-LABEL entries, since the ones without the colon would match branch targets. Might be worth noticing that llvm-objdump does not look up the relocation and thus assigns it a target depending on the encoded immediate, which is #0, so it thinks it branches to the next instruction. Committed on behalf of Andre Vieira (avieira). Differential Revision: https://reviews.llvm.org/D30943 llvm-svn: 297821	2017-03-15 10:21:23 +00:00
Artyom Skrobov	3fa5fd1dd2	[Thumb1] Fix the bug when adding/subtracting -2147483648 Differential Revision: https://reviews.llvm.org/D30829 llvm-svn: 297820	2017-03-15 10:19:16 +00:00
Sam Parker	654cb8263a	[ARM] Enable SMLAL[B|T] isel Enable the selection of the 64-bit signed multiply accumulate instructions which operate on 16-bit operands. These are enabled for ARMv5TE onwards for ARM and for V6T2 and other DSP enabled Thumb architectures. Differential Revision: https://reviews.llvm.org/D30044 llvm-svn: 297809	2017-03-15 08:27:11 +00:00
Michal Gorny	f89c874d44	[llvm-config] Add minimal sanity tests for path options Add minimal tests that check whether path options do not fail and output directories looking like expected. Requested in https://reviews.llvm.org/rL291218. Differential Revision: https://reviews.llvm.org/D28533 llvm-svn: 297807	2017-03-15 05:57:29 +00:00
Taewook Oh	fb1833efeb	[BranchFolding] Merge debug locations from common tail instead of removing Summary: D25742 improved the precision of debug locations for PGO by removing debug locations from common tail when tail-merging. However, if identical instructions that are merged into a common tail have the same debug locations, there's no need to remove them. This patch creates a merged debug location of identical instructions across SameTails and assigns it to the instruction in the common tail, so that the debug locations are maintained if they are the same across identical instructions. Reviewers: aprantl, probinson, MatzeB, rob.lougher Reviewed By: aprantl Subscribers: andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D30226 llvm-svn: 297805	2017-03-15 05:44:59 +00:00
Peter Collingbourne	7f6e2c97b8	Ensure that prefix data is preserved with subsections-via-symbols On MachO platforms that use subsections-via-symbols, dead code stripping will drop prefix data. Unfortunately there is no great way to convey the relationship between a function and its prefix data to the linker. We are forced to use a bit of a hack: we give the prefix data its own symbol, and mark the actual function entry as an .alt_entry. Patch by Moritz Angermann! Differential Revision: https://reviews.llvm.org/D30770 llvm-svn: 297804	2017-03-15 04:18:16 +00:00
Volkan Keles	4862c63594	[GlobalISel] IRTranslator: Return the scalar for <1 x Ty> constant vectors Summary: <1 x Ty> is not a legal vector type in LLT, we shouldn’t build G_MERGE_VALUES instruction for them. Reviewers: qcolombet, aditya_nandakumar, dsanders, t.p.northover, ab, javed.absar Reviewed By: qcolombet Subscribers: dberris, rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D30948 llvm-svn: 297792	2017-03-14 23:45:06 +00:00
Fiona Glaser	a9bd572b6f	MemCpyOptimizer: don't create new addrspace casts This isn't safe on all targets, and since we don't have a way to know it's safe, avoid doing it for now. llvm-svn: 297788	2017-03-14 22:37:38 +00:00
Daniel Sanders	8a4bae9993	[globalisel][tblgen] Add support for ComplexPatterns Summary: Adds a new kind of MachineOperand: MO_Placeholder. This operand must not appear in the MIR and only exists as a way of creating an 'uninitialized' operand until a matcher function overwrites it. Depends on D30046, D29712 Reviewers: t.p.northover, ab, rovka, aditya_nandakumar, javed.absar, qcolombet Reviewed By: qcolombet Subscribers: dberris, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D30089 llvm-svn: 297782	2017-03-14 21:32:08 +00:00
Simon Pilgrim	cf2da96c82	[SelectionDAG] Add a signed integer absolute ISD node Reduced version of D26357 - based on the discussion on llvm-dev about canonicalization of UMIN/UMAX/SMIN/SMAX as well as ABS I've reduced that patch to just the ABS ISD node (with x86/sse support) to improve basic combines and lowering. ARM/AArch64, Hexagon, PowerPC and NVPTX all have similar instructions allowing us to make this a generic opcode and move away from the hard coded tablegen patterns which makes it tricky to match more complex patterns. At the moment this patch doesn't attempt legalization as we only create an ABS node if its legal/custom. Differential Revision: https://reviews.llvm.org/D29639 llvm-svn: 297780	2017-03-14 21:26:58 +00:00
Rafael Espindola	8f2dd7c042	Archives require a symbol table on Solaris, even if empty. On Solaris ld (and some other tools that use the underlying utility libraries, such as elfdump) chokes on an archive library that has no symbol table. The Solaris tools always create one, even if it's empty. That bug has been fixed in the latest development line, and can probably be backported to a supported release, but it would be nice if LLVM's archiver could emit the empty symbol table, too. Patch by Danek Duvall! llvm-svn: 297773	2017-03-14 19:57:13 +00:00
Evgeniy Stepanov	43dcf4d330	Fix asm printing of associated sections. Make MCSectionELF::AssociatedSection be a link to a symbol, because that's how it works in the assembly, and use it in the asm printer. llvm-svn: 297769	2017-03-14 19:28:51 +00:00
Sanjay Patel	8dd99dce6c	[DAG] vector div/rem with any zero element in divisor is undef This is the backend counterpart to: https://reviews.llvm.org/rL297390 https://reviews.llvm.org/rL297409 and follow-up to: https://reviews.llvm.org/rL297384 It surprised me that we need to duplicate the check in FoldConstantArithmetic and FoldConstantVectorArithmetic, but one or the other doesn't catch all of the test cases. There is an existing code comment about merging those someday. Differential Revision: https://reviews.llvm.org/D30826 llvm-svn: 297762	2017-03-14 18:06:28 +00:00
Dehao Chen	4a435e0896	SamplePGO ThinLTO ICP fix for local functions. Summary: In SamplePGO, if the profile is collected from non-LTO binary, and used to drive ThinLTO, the indirect call promotion may fail because ThinLTO adjusts local function names to avoid conflicts. There are two places where the mismatch can happen: 1. thin-link prepends SourceFileName to front of FuncName to build the GUID (GlobalValue::getGlobalIdentifier). Unlike instrumentation FDO, SamplePGO does not use the PGOFuncName scheme and therefore the indirect call target profile data contains a hash of the OriginalName. 2. backend compiler promotes some local functions to global and appends .llvm.{$ModuleHash} to the end of the FuncName to derive PromotedFunctionName This patch makes a best-effort attempt to find the GUID from the original local function name (in profile), and use that in ICP promotion, and in SamplePGO matching that happens in the backend after importing/inlining: 1. in thin-link, it builds the map from OriginalName to GUID so that when thin-link reads in indirect call target profile (represented by OriginalName), it knows which GUID to import. 2. in backend compiler, if sample profile reader cannot find a profile match for PromotedFunctionName, it will try to find if there is a match for OriginalFunctionName. 3. in backend compiler, we build symbol table entry for OriginalFunctionName and pointer to the same symbol of PromotedFunctionName, so that ICP can find the correct target to promote. Reviewers: mehdi_amini, tejohnson Reviewed By: tejohnson Subscribers: llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D30754 llvm-svn: 297757	2017-03-14 17:33:01 +00:00
Sanjay Patel	1c8c6a457d	[InstCombine] consolidate rem tests and update checks; NFC llvm-svn: 297747	2017-03-14 16:27:46 +00:00
Sanjay Patel	9deec85c34	[InstCombine] regenerate checks; NFC llvm-svn: 297746	2017-03-14 16:16:40 +00:00
Simon Pilgrim	3a196cbc4f	[X86] Add extra BITREVERSE tests Test on 32-bit and 64-bit targets. Add bitreverse tests for i64, i32 and i16 llvm-svn: 297741	2017-03-14 14:03:16 +00:00
Oliver Stannard	6ee22c41f8	[ARM] Diagnose ARM MOVT without :lower16: or :upper16: expression This instruction was missing from the list of opcodes that we check, so we were hitting an llvm_unreachable in ARMMCCodeEmitter.cpp for the ARM MOVT instruction, rather than the diagnostic that is emitted for the other MOVW/MOVT instructions. Differential revision: https://reviews.llvm.org/D30936 llvm-svn: 297739	2017-03-14 13:50:10 +00:00
Simon Pilgrim	e1a72a936f	[X86][MMX] Update FIXME comment. NFCI. llvm-svn: 297736	2017-03-14 12:13:41 +00:00
Oliver Stannard	062041113f	[ValueTracking] Out of range shifts might be undef If it is possible for the RHS of a shift operation to be greater than or equal to the bit-width, then the result might be undef, and we can't report any known bits. In some cases, this was allowing a transformation in instcombine which widened an undef value from i1 to i32, increasing the range of values that a function could return. Differential revision: https://reviews.llvm.org/D30781 llvm-svn: 297724	2017-03-14 10:13:17 +00:00
Sam Parker	916b1ba617	[ARM] Move SMULW[B|T] isel to DAG Combine Create nodes for smulwb and smulwt and move their selection from DAGToDAG to DAG combine. smlawb and smlawt can then be selected using tablegen. Added some helper functions to detect shift patterns as well as a wrapper around SimplifyDemandedBits. Added a couple of extra tests. Differential Revision: https://reviews.llvm.org/D30708 llvm-svn: 297716	2017-03-14 09:13:22 +00:00
Oren Ben Simhon	fe34c5e429	Disable Callee Saved Registers Each Calling convention (CC) defines a static list of registers that should be preserved by a callee function. All other registers should be saved by the caller. Some CCs use an additional condition: If the register is used for passing/returning arguments – the caller needs to save it - even if it is part of the Callee Saved Registers (CSR) list. The current LLVM implementation doesn’t support it. It will save a register if it is part of the static CSR list and will not care if the register is passed/returned by the callee. The solution is to dynamically allocate the CSR lists (Only for these CCs). The lists will be updated with actual registers that should be saved by the callee. Since we need the allocated lists to live as long as the function exists, the list should reside inside the Machine Register Info (MRI) which is a property of the Machine Function and managed by it (and has the same life span). The lists should be saved in the MRI and populated upon LowerCall and LowerFormalArguments. The patch will also help implement the future no_caller_saved_registers attribute intended for interrupt handler CC. Differential Revision: https://reviews.llvm.org/D28566 llvm-svn: 297715	2017-03-14 09:09:26 +00:00
Craig Topper	7a5ee1c5ed	[AVX-512] Use iPTR instead of i64 in patterns for extract_subvector/insert_subvector index. llvm-svn: 297707	2017-03-14 06:40:04 +00:00
Craig Topper	b0a82eaea6	[AVX-512] Add test cases that demonstrate some patterns that don't work correctly in 32-bit mode. NFC llvm-svn: 297706	2017-03-14 06:40:00 +00:00
Jonas Paulsson	a48ea231c0	[TargetTransformInfo] getIntrinsicInstrCost() scalarization estimation improved getIntrinsicInstrCost() used to only compute scalarization cost based on types. This patch improves this so that the actual arguments are checked when they are available, in order to handle only unique non-constant operands. Tests updates: Analysis/CostModel/X86/arith-fp.ll Transforms/LoopVectorize/AArch64/interleaved_cost.ll Transforms/LoopVectorize/ARM/interleaved_cost.ll The improvement in getOperandsScalarizationOverhead() to differentiate on constants made it necessary to update the interleaved_cost.ll tests even though they do not relate to intrinsics. Review: Hal Finkel https://reviews.llvm.org/D29540 llvm-svn: 297705	2017-03-14 06:35:36 +00:00
Daniel Berlin	620f86ff2b	Add missing condprop-xfail.ll that contains the remaining xfail'd tests llvm-svn: 297699	2017-03-14 01:46:51 +00:00
Nirav Dave	4fc8401abf	Recommitting Craig Topper's patch now that r296476 has been recommitted. When checking if chain node is foldable, make sure the intermediate nodes have a single use across all results not just the result that was used to reach the chain node. This recovers a test case that was severely broken by r296476, by making sure we don't create ADD/ADC that loads and stores when there is also a flag dependency. llvm-svn: 297698	2017-03-14 01:42:23 +00:00
Nirav Dave	54e22f33d9	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Recommitting with compile time improvements Recommitting after fixup of 32-bit aliasing sign offset bug in DAGCombiner. * Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores search up the chain through a single load, and finds all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly construct wider loads, but then promote them to float operations which appear but requires more expensive constant generation). Some minor peephole optimizations to deal with improved SubDAG shapes (listed below) Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests. 5. Remove Chain dependencies of Memory operations on CopyfromReg nodes as these are captured by data dependence 6. Forward loads-store values through tokenfactors containing {CopyToReg,CopyFromReg} Values. 7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll) 8. Store merging for the ARM target is restricted to 32-bit as in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting: CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle llvm-svn: 297695	2017-03-14 00:34:14 +00:00
Adrian Prantl	19aadf57c8	Revert "Debug Info: Add basic support for external types references." This reverts commit r242302. External type refs of this form were never used by any LLVM frontend so this is effectively dead code. (They were introduced to support clang module debug info, but in the end we came up with a better design that doesn't use this feature at all.) rdar://problem/25897929 Differential Revision: https://reviews.llvm.org/D30917 llvm-svn: 297684	2017-03-13 22:56:14 +00:00
Daniel Berlin	2aa23e8881	NewGVN: We pass rle-nonlocal, we just perform the replacement in a way that keeps the old name instead of the new one llvm-svn: 297683	2017-03-13 22:43:30 +00:00
Artyom Skrobov	bf19d4bc29	[Thumb1] combine ADDC/SUBC with a negative immediate Summary: This simple optimization has been split out of https://reviews.llvm.org/D30400 Reviewers: efriedma, jmolloy Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D30829 llvm-svn: 297682	2017-03-13 22:36:14 +00:00
Craig Topper	784f241b59	[AVX-512] Fix another case where we are copying from a mask register using AH/BH/CH/DH with fastisel. Fixes PR32256. Still planning to do an audit for other possible cases. llvm-svn: 297678	2017-03-13 21:58:54 +00:00
Volkan Keles	38a91a0de6	GlobalISel: Translate ConstantDataVector Reviewers: qcolombet, aditya_nandakumar, dsanders, t.p.northover, javed.absar, ab Reviewed By: qcolombet, dsanders, ab Subscribers: dberris, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30216 llvm-svn: 297670	2017-03-13 21:36:19 +00:00
Tim Northover	55e6f10d69	Revert "GlobalISel: move vector extract/insert inside generic opcode region." I was writing against an earlier branch and Volkan had already fixed this. llvm-svn: 297668	2017-03-13 21:25:10 +00:00
Simon Pilgrim	9df7d08cb2	[X86][MMX] Fix folding of shift value loads to cover whole 64-bits rL230225 made the assumption that only the lower 32-bits of an MMX register load is used as a shift value, when in fact the whole 64-bits are reloaded and treated as a i64 to determine the shift value. This patch reverts rL230225 to ensure that the whole 64-bits of memory are folded and ensures that the upper 32-bit are zero'd for cases where the shift value has come from a scalar source. Found during fuzz testing. Differential Revision: https://reviews.llvm.org/D30833 llvm-svn: 297667	2017-03-13 21:23:29 +00:00
Tim Northover	0f1d32d557	GlobalISel: move vector extract/insert inside generic opcode region. Otherwise they won't be legalized or selected, causing instruction selection to fail horribly. llvm-svn: 297666	2017-03-13 21:18:59 +00:00
Andrew Kaylor	a11d020699	Revert r295004 (Add MXCSR) due to errors reported by MachineVerifier I am leaving the code in clang which filters mxcsr from the clobber list because that is still technically correct and will be useful again when the MXCSR register is reintroduced. llvm-svn: 297664	2017-03-13 20:35:10 +00:00
Rafael Espindola	d31f04b319	Bring back r297624. The issue was just a missing REQUIRES in the test. llvm-svn: 297661	2017-03-13 20:00:25 +00:00
Sanjay Patel	caf369bd03	[SimplifyCFG] move tests for PR31028 from CGP Hopefully, this will make sense with a forthcoming patch. If not, we can move these back. llvm-svn: 297660	2017-03-13 19:59:14 +00:00
Matt Arsenault	971c85ebb4	AMDGPU: Treat 0 as private null pointer in addrspacecast lowering llvm-svn: 297658	2017-03-13 19:47:31 +00:00
Rafael Espindola	3978b877d7	Revert "Fix crash when multiple raw_fd_ostreams to stdout are created." This reverts commit r297624. It was failing on the bots. llvm-svn: 297657	2017-03-13 19:38:32 +00:00
Jessica Paquette	c984e21394	[Outliner] Add tail call support This commit adds tail call support to the MachineOutliner pass. This allows the outliner to insert jumps rather than calls in areas where tail calling is possible. Outlined tail calls include the return or terminator of the basic block being outlined from. Tail call support allows the outliner to take returns and terminators into consideration while finding candidates to outline. It also allows the outliner to save more instructions. For example, in the X86-64 outliner, a tail called outlined function saves one instruction since no return has to be inserted. llvm-svn: 297653	2017-03-13 18:39:33 +00:00
Craig Topper	616641632e	[X86] Lower AVX2 gather intrinsics similar to AVX-512. Apply the same input source optimizations to break execution dependencies. For AVX-512 we force the input to zero if the input is undef or the mask is all ones to break an execution dependency. This patch brings the same behavior to AVX2. llvm-svn: 297652	2017-03-13 18:34:46 +00:00
Craig Topper	eb7ea28bdd	[AVX-512] If gather mask is all ones, force the input to a zero vector. We were already forcing undef inputs to become a zero vector, this now catches an all ones mask too. Ideally we'd use undef and let execution dep fix handle picking the best register/clearance for the undef, but I don't think it can handle the early clobber today. llvm-svn: 297651	2017-03-13 18:17:46 +00:00
Matt Arsenault	d81f557fe2	AMDGPU: Fold icmp/fcmp into icmp intrinsic The typical use is a library vote function which compares to 0. Fold the user condition into the intrinsic. llvm-svn: 297650	2017-03-13 18:14:02 +00:00
Jonas Devlieghere	5eb9c81d82	[Linker] Provide callback for internalization Differential Revision: https://reviews.llvm.org/D30738 llvm-svn: 297649	2017-03-13 18:08:11 +00:00
Sanjay Patel	6023a2501c	[CGP] add tests for PR31028; NFC llvm-svn: 297629	2017-03-13 15:45:37 +00:00
Rafael Espindola	82d55239ea	Fix crash when multiple raw_fd_ostreams to stdout are created. If raw_fd_ostream is constructed with the path of "-", it claims ownership of the stdout file descriptor. This means that it closes stdout when it is destroyed. If there are multiple users of raw_fd_ostream wrapped around stdout, then a crash can occur because of operations on a closed stream. An example of this would be running something like "clang -S -o - -MD -MF - test.cpp". Alternatively, using outs() (which creates a local version of raw_fd_ostream to stdout) anywhere combined with such a stream usage would cause the crash. The fix duplicates the stdout file descriptor when used within raw_fd_ostream, so that only that particular descriptor is closed when the stream is destroyed. Patch by James Henderson! llvm-svn: 297624	2017-03-13 14:45:06 +00:00
Diana Picus	94db2e288b	[ARM] GlobalISel: Support SP in regbankselect We used to hit an unreachable in getRegBankFromRegClass when dealing with the stack pointer. This commit adds support for the GPRsp reg class. llvm-svn: 297621	2017-03-13 14:28:34 +00:00
Craig Topper	7746565754	[AVX-512] Add EVEX2VEX test cases for the cvt instructions fixed in r297599 and r297600. llvm-svn: 297603	2017-03-13 05:47:56 +00:00
Craig Topper	bb4089d260	Revert "[AVX-512] EVEX2VEX, don't reject intrinsic instructions when both have a memory operand. We should just continue to check other operands instead." This reverts r297596. There were other issues that were making this not work that have been fixed now. Reverting this results in a more accurate table. llvm-svn: 297602	2017-03-13 05:34:03 +00:00
Craig Topper	166085f0f2	[AVX-512] EVEX2VEX, don't reject intrinsic instructions when both have a memory operand. We should just continue to check other operands instead. This exposed that we have several intrinsic instructions that have identical TSFlags to other instructions. We should merge their patterns and kill off the duplicates. I'll fix that in a follow up patch. llvm-svn: 297596	2017-03-13 00:36:49 +00:00
Craig Topper	7d56c8315b	[AVX-512] Fix the valid immediates for the scatter/gather prefetch intrinsics. The immediate should be 1 or 2, not 0 or 1. This was found while adding bounds checking to clang. In fact the existing clang builtin test failed if we ran it all the way to assembly. llvm-svn: 297591	2017-03-12 22:29:12 +00:00
Sanjay Patel	f06b963a2b	[x86] don't blindly transform SETB into SBB I noticed unnecessary 'sbb' instructions in D30472 and while looking at 'ptest' codegen recently. This happens because we were transforming any 'setb' - even when we only wanted a single-bit result. This patch moves those transforms under visitAdd/visitSub, so we're only creating sbb/adc when it is a win. I don't know why we need a SETCC_CARRY node type, but I'm not proposing to change that existing behavior in this patch. Also, I'm skeptical that sbb/adc are a win for all micro-arches, so I added comments to the test files where this transform still fires. The test changes here are all cases where we no longer produce sbb/adc. Avoiding partial register stalls (generating an xor to clear a register) is not handled in some cases, but that's a separate issue. Differential Revision: https://reviews.llvm.org/D30611 llvm-svn: 297586	2017-03-12 18:28:48 +00:00
Azharuddin Mohammed	473b75c3d5	Remove CRC32 instructions from AArch64InstrInfo::hasShiftedReg Summary: A53 scheduler causes an assertion failure on all CRC instructions: include/llvm/CodeGen/MachineInstr.h:280: const llvm::MachineOperand &llvm::MachineInstr::getOperand(unsigned int) const: Assertion `i < getNumOperands() && "getOperand() out of range!"' failed. The case statements corresponding to CRC instructions are incorrect and should be removed. Also adding a testcase while on this. Reviewers: t.p.northover, javed.absar, apazos, rengolin Reviewed By: rengolin Subscribers: evandro, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D30274 llvm-svn: 297582	2017-03-12 14:02:32 +00:00
Igor Breger	293dfb9768	[X86] Add vector zext tests. llvm-svn: 297581	2017-03-12 13:20:10 +00:00
Craig Topper	58647b16e5	[AVX-512] Fix a bad use of a high GR8 register after copying from a mask register during fast isel. This ends up extracting from bits 15:8 instead of the lower bits of the mask. I'm pretty sure there are more problems lurking here. But I think this fixes PR32241. I've added the test case from that bug and added asserts that will fail if we ever try to copy between high registers and mask registers again. llvm-svn: 297574	2017-03-12 03:37:37 +00:00
Craig Topper	e726cd0cd1	[AVX-512] Add test case for PR32241. Fix coming in another commit. llvm-svn: 297573	2017-03-12 03:37:34 +00:00
Simon Pilgrim	18debfa5b4	[X86][SSE] Improve extraction of elements from v16i8 (pre-SSE41) Without SSE41 (pextrb) we currently extract byte elements from a vector by spilling to stack and reloading the byte. This patch is an initial attempt at using MOVD/PEXTRW to extract the relevant DWORD/WORD from the vector and then shift+truncate to collect the correct byte. Extraction of multiple bytes this way would result in code bloat, but as explained in the patch we could probably afford to be more aggressive with the supported extractions before again falling back on spilling - possibly through counting the number of extracts and which DWORD/WORD they originate? Differential Revision: https://reviews.llvm.org/D29841 llvm-svn: 297568	2017-03-11 20:42:31 +00:00
Craig Topper	d511c2ce04	[X86] Add avx2 gather test cases that show a failure to remove zeroing of the source when the mask is all ones. llvm-svn: 297564	2017-03-11 18:26:00 +00:00
Matt Arsenault	dd905b0e9b	AMDGPU: Remove packf16 intrinsic llvm-svn: 297557	2017-03-11 05:51:16 +00:00
Matt Arsenault	3cb9ff8863	AMDGPU: Keep track of modifiers when converting v_mac to v_mad Since v_max_f32_e64/v_max_f16_e64 can be folded if the target instruction supports the clamp bit, we also need to maintain modifiers when converting v_mac to v_mad. This fixes a rendering issue with Dirt Rally because a v_mac instruction with the clamp bit set was converted to a v_mad but that bit was lost during the conversion. Fixes: e184e01dd79 ("AMDGPU: Fold FP clamp as modifier bit") Patch by Samuel Pitoiset <samuel.pitoiset@gmail.com> llvm-svn: 297556	2017-03-11 05:40:40 +00:00
Sanjoy Das	3f1e8e0102	Use a WeakVH for UnknownInstructions in AliasSetTracker Summary: This change solves the same problem as D30726, except that this only throws out the bathwater. AST was not correctly tracking and deleting UnknownInstructions via handles. The existing code only tracks "pointers" in its `ASTCallbackVH`, so an UnknownInstruction (that isn't also def'ing a pointer used by another memory instruction) never gets a `ASTCallbackVH`. There are two other ways to solve this problem: - Use the `PointerRec` scheme for both known and unknown instructions. - Use a `CallbackVH` that erases the offending Instruction from the UnknownInstruction list. Both of the above changes seemed to be significantly (and unnecessarily IMO) more complex than this. Reviewers: chandlerc, dberlin, hfinkel, reames Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D30849 llvm-svn: 297539	2017-03-11 01:15:48 +00:00
Stanislav Mekhanoshin	79da2a7698	[AMDGPU] Remove getBidirectionalReasonRank This method inverts the Reason field of a scheduling candidate. It does right comparison between RegCritical and RegExcess, but everything else is broken. In fact it can prefer less strong reason such as Weak over RegCritical because Weak > -RegCritical. The CandReason enum is properly sorted, so just remove artificial ranking. Differential Revision: https://reviews.llvm.org/D30557 llvm-svn: 297536	2017-03-11 00:29:27 +00:00
Krzysztof Parzyszek	0e7b1f83b7	[RDF] Remove the map of reaching defs from copy propagation Use Liveness::getNearestAliasedRef to find the reaching def instead. llvm-svn: 297526	2017-03-10 22:44:24 +00:00
Simon Pilgrim	128a10a41d	[X86][SSE] Fix load folding for (V)CVTDQ2PD This only requires a 64-bit memory source, not the whole 128-bits. But the 128-bit case is still supported via X86InstrInfo::foldMemoryOperandImpl llvm-svn: 297523	2017-03-10 22:35:07 +00:00
Simon Pilgrim	9956661456	[X86][RTM] Regenerate RTM intrinsic tests for 32/64-bit targets. llvm-svn: 297518	2017-03-10 21:55:24 +00:00
Peter Collingbourne	711284b017	LTO: Hash type identifier resolutions for WholeProgramDevirt. Differential Revision: https://reviews.llvm.org/D30555 llvm-svn: 297514	2017-03-10 21:37:10 +00:00
Peter Collingbourne	780a4dd35f	LTO: Hash type identifier resolutions for LowerTypeTests. Differential Revision: https://reviews.llvm.org/D30553 llvm-svn: 297513	2017-03-10 21:35:17 +00:00
Volkan Keles	970fee4bfe	GlobalISel: Translate ConstantAggregateZero vectors Reviewers: qcolombet, aditya_nandakumar, dsanders, ab, t.p.northover, javed.absar Reviewed By: qcolombet Subscribers: dberris, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30259 llvm-svn: 297509	2017-03-10 21:23:13 +00:00
Peter Collingbourne	14dcf02fcb	WholeProgramDevirt: Implement export/import support for VCP. Differential Revision: https://reviews.llvm.org/D30017 llvm-svn: 297503	2017-03-10 20:13:58 +00:00
Peter Collingbourne	59675ba0f8	WholeProgramDevirt: Implement export/import support for unique ret val opt. Differential Revision: https://reviews.llvm.org/D29917 llvm-svn: 297502	2017-03-10 20:09:11 +00:00
Konstantin Zhuravlyov	ffdb00eda9	[AMDGPU] Split R600/SI getFrameIndexReference and emit stack object offsets for SI Differential Revision: https://reviews.llvm.org/D29674 llvm-svn: 297499	2017-03-10 19:39:07 +00:00
Volkan Keles	04cb08cc83	[GlobalISel] Translate insertelement and extractelement Reviewers: qcolombet, aditya_nandakumar, dsanders, ab, t.p.northover, javed.absar Reviewed By: qcolombet Subscribers: dberris, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30761 llvm-svn: 297495	2017-03-10 19:08:28 +00:00
Michael Kuperstein	5fb39a7966	[SLP] Revert everything that has to do with memory access sorting. This reverts r293386, r294027, r294029 and r296411. Turns out the SLP tree isn't actually a "tree" and we don't handle accessing the same packet of loads in several different orders well, causing miscompiles. Revert until we can fix this properly. llvm-svn: 297493	2017-03-10 18:59:07 +00:00
Simon Pilgrim	7dedbfa89d	[SelectionDAG] Add support for BUILD_VECTOR to ComputeNumSignBits llvm-svn: 297492	2017-03-10 18:36:46 +00:00
Simon Pilgrim	e54cd65399	[X86][SSE] Added tests showing missed truncations for sitofp conversion SelectionDAG::ComputeNumSignBits is poor at build_vector handling, meaning that we can't see that all the vXi64 sources are in fact sign extended i32 or smaller. llvm-svn: 297486	2017-03-10 18:01:53 +00:00
Amaury Sechet	62e0759d56	[SelectionDAG] Make SelectionDAG aware of the known bits in USUBO and SSUBO and SUBC. Summary: Depends on D30379 This improves the state of things for the sub class of operation. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30436 llvm-svn: 297482	2017-03-10 17:26:44 +00:00
Simon Pilgrim	ed655f09db	[X86][MMX] Add tests showing missed opportunities to use MMX sitofp conversions If we are transferring MMX registers to XMM for conversion we could use the MMX equivalents (CVTPI2PD + CVTPI2PS) without affecting rounding/exceptions etc. llvm-svn: 297481	2017-03-10 17:23:55 +00:00
Amaury Sechet	69fa16c810	[SelectionDAG] Make SelectionDAG aware of the known bits in UADDO and SADDO. Summary: As per title. This is extracted from D29872 and I threw SADDO in. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30379 llvm-svn: 297479	2017-03-10 17:06:52 +00:00
Simon Pilgrim	c6b55729a5	[X86][MMX] Add tests showing missed opportunities to use MMX fptosi conversions If we are transferring XMM conversion results to MMX registers we could use the MMX equivalents (CVTPD2PI/CVTTPD2PI + CVTPS2PI/CVTTPS2PI) without affecting rounding/exceptions etc. llvm-svn: 297476	2017-03-10 16:59:43 +00:00
Simon Pilgrim	b8856148d9	[X86][MMX] Updated bad stack spill shift value test to actually show the problem Cleaning up the ir had stopped showing the issue. llvm-svn: 297475	2017-03-10 16:18:50 +00:00
Simon Pilgrim	67d25b298a	[X86][MMX] Regenerate mmx bitcast tests llvm-svn: 297474	2017-03-10 16:07:39 +00:00
Simon Pilgrim	caa9172ba7	[X86][MMX] Add test showing bad stack spill of shift value i32 is spilled to stack but 64-bit mmx is reloaded - leaving garbage in the other half of the register llvm-svn: 297471	2017-03-10 15:53:41 +00:00
Simon Pilgrim	63ad95aee6	[X86][MMX] Regenerate mmx load folding tests llvm-svn: 297470	2017-03-10 15:41:05 +00:00
Simon Dardis	7090d145e8	[mips][msa] Accept more values for constant splats This patch teaches the MIPS backend to accept more values for constant splats. Previously, only 10 bit signed immediates or values that could be loaded using an ldi.[bhwd] instruction would be accepted. This patch relaxes that constraint so that any constant value that can be splatted is accepted. As a result, the constant pool is used less for vector operations, and the suite of bit manipulation instructions b(clr\|set\|neg)i can now be used with the full range of their immediate operand. Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D30640 llvm-svn: 297457	2017-03-10 13:27:14 +00:00
Sanne Wouda	9dfa6ade4f	[Assembler] Add location info to unary expressions. Summary: This is a continuation of D28861. Add an SMLoc to MCUnaryExpr such that a better diagnostic can be given in case of an error in later stages of assembling. Reviewers: rengolin, grosbach, javed.absar, olista01 Reviewed By: olista01 Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30581 llvm-svn: 297454	2017-03-10 13:08:20 +00:00
Simon Atanasyan	6cfb101a6b	[llvm-readobj] Support SHT_MIPS_DWARF section type flag llvm-svn: 297448	2017-03-10 08:22:25 +00:00
Simon Atanasyan	ec8dfb1ca7	[MC] Set SHT_MIPS_DWARF section type for all .debug_* sections on MIPS All MIPS .debug_* sections should be marked with ELF type SHT_MIPS_DWARF according to the specification [1]. GNU tools also assign the same section type to these sections. [1] ftp.software.ibm.com/software/os390/czos/dwarf/mips_extensions.pdf Differential Revision: https://reviews.llvm.org/D29789 llvm-svn: 297447	2017-03-10 08:22:20 +00:00
Simon Atanasyan	2953224d64	[MC] Accept a numeric value as an ELF section header's type GAS supports specifying a section header's type using a numeric value [1]. This patch brings the same functionality to LLVM. That allows setting up target-specific section types belonging to the SHT_LOPROC - SHT_HIPROC range. If we attempt to print an unknown section type, the MCSectionELF class shows an error message. That is better than printing a sole '@' sign without any section type name. In the case of MIPS, an example of such a section type is SHT_MIPS_DWARF. Without the patch we would have to implement workarounds in a probably non-MIPS-specific part of the code base to convert SHT_MIPS_DWARF to @progbits while printing assembly, and to assign SHT_MIPS_DWARF to @progbits sections named .debug_* if we encounter such a section in input assembly. [1] https://sourceware.org/binutils/docs/as/Section.html Differential Revision: https://reviews.llvm.org/D29719 llvm-svn: 297446	2017-03-10 08:22:13 +00:00
Artyom Skrobov	0c93ceb5d8	For Thumb1, lower ADDC/ADDE/SUBC/SUBE via the glueless ARMISD nodes, same as already done for ARM and Thumb2. Reviewers: jmolloy, rogfer01, efriedma Subscribers: aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D30400 llvm-svn: 297443	2017-03-10 07:40:27 +00:00
Matt Arsenault	a3bdd8f27b	AMDGPU: Fix insertion point when reducing load intrinsics The insertion point may be later than the next instruction, so it is necessary to set it when replacing the call. llvm-svn: 297439	2017-03-10 05:25:49 +00:00
Sanjay Patel	65e2e6805a	[x86] add tests for vec div/rem with 0 element in divisor; NFC llvm-svn: 297433	2017-03-10 00:55:29 +00:00
Daniel Berlin	e3e69e1680	NewGVN: Rewrite DCE during elimination so we do it as well as old GVN did. llvm-svn: 297428	2017-03-10 00:32:33 +00:00
Ahmed Bougacha	4ec6d5abed	[GlobalISel] Fallback when failing to translate invoke. We unintentionally stopped falling back in r293670. While there, change an unusual construct. llvm-svn: 297425	2017-03-10 00:25:35 +00:00
Tim Northover	aa995c98f4	GlobalISel: support trivial inlineasm calls. They're used for nefarious purposes by ObjC. llvm-svn: 297422	2017-03-09 23:36:26 +00:00
Amaury Sechet	e7d102cf02	[DAGCombiner] Do various combine on uaddo. Summary: This essentially does the same transform as for ADC. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30417 llvm-svn: 297416	2017-03-09 22:47:00 +00:00
Krzysztof Parzyszek	544210304f	[Hexagon] Fixes to the bitsplit generation - Fix the insertion point, which occasionally could have been incorrect. - Avoid creating multiple bitsplits with the same operands, if an old one could be reused. llvm-svn: 297414	2017-03-09 22:02:14 +00:00
Tim Northover	d1e951e5eb	GlobalISel: inform FrameLowering when we emit a function call. Amongst other things (I expect) this is necessary to ensure decent backtraces when an "unreachable" is involved. llvm-svn: 297413	2017-03-09 22:00:39 +00:00
Sanjay Patel	962a8431ea	[InstSimplify] allow folds for bool vector div/rem llvm-svn: 297411	2017-03-09 21:56:03 +00:00
Tim Northover	7a9ea8f628	GlobalISel: put debug info for static allocas in the MachineFunction. The good reason to do this is that static allocas are pretty simple to handle (especially at -O0) and avoiding tracking DBG_VALUEs throughout the pipeline should give some kind of performance benefit. The bad reason is that the debug pipeline is an unholy mess of implicit contracts, where determining whether "DBG_VALUE %reg, imm" actually implies a load or not involves the services of at least 3 soothsayers and the sacrifice of at least one chicken. And it still gets it wrong if the variable is at SP directly. llvm-svn: 297410	2017-03-09 21:12:06 +00:00
Sanjay Patel	7e56366204	[ConstantFold] vector div/rem with any zero element in divisor is undef Follow-up for: https://reviews.llvm.org/D30665 https://reviews.llvm.org/rL297390 llvm-svn: 297409	2017-03-09 20:42:30 +00:00
Matt Arsenault	efe949cc67	AMDGPU: Support for SimplifyDemandedVectorElts for load intrinsics llvm-svn: 297408	2017-03-09 20:34:27 +00:00
Sanjay Patel	bb47616aef	[InstSimplify] add tests for vector constant folding div/rem-by-0; NFC llvm-svn: 297407	2017-03-09 20:31:20 +00:00
Amaury Sechet	10425de063	[DAGCombiner] Do various combine on usubo. Summary: This essentially does the same transform as for SUBC. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30437 llvm-svn: 297404	2017-03-09 19:28:00 +00:00
Rong Xu	0cf1f56a8c	[PGO] Refactor profile dumping function for ease of adding other profile kind Refactor the dumping function so that we can add other value profile kind easily. Differential Revision: https://reviews.llvm.org/D30752 llvm-svn: 297399	2017-03-09 19:03:57 +00:00
Artem Belevich	f55e72a5a0	[FileCheck] Added --enable-var-scope option to enable scope for regex variables. If `--enable-var-scope` is in effect, variables with names that start with `$` are considered to be global. All other variables are local. All local variables get undefined at the beginning of each CHECK-LABEL block. Global variables are not affected by CHECK-LABEL. This makes it easier to ensure that individual tests are not affected by variables set in preceding tests. Differential Revision: https://reviews.llvm.org/D30749 llvm-svn: 297396	2017-03-09 17:59:04 +00:00
Krzysztof Parzyszek	78c4fcf12e	[Hexagon] Propagate zext of i1 into arithmetic code in selection DAG (op ... (zext i1 c) ...) -> (select c (op ... 1 ...), (op ... 0 ...)) llvm-svn: 297391	2017-03-09 16:29:30 +00:00
Sanjay Patel	2b1f6f4b92	[InstSimplify] vector div/rem with any zero element in divisor is undef This was suggested as a DAG simplification in the review for rL297026 : http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170306/435253.html ...but let's start with IR since we have actual docs for IR (LangRef). Differential Revision: https://reviews.llvm.org/D30665 llvm-svn: 297390	2017-03-09 16:20:52 +00:00
Sam Parker	b308b48d69	[ARM] Remove t2xtpk feature from tests I previously removed the T2XtPk feature from the ARM backend, but it looks like I missed some of the tests that were using the feature. Differential Revision: https://reviews.llvm.org/D30778 llvm-svn: 297386	2017-03-09 15:14:32 +00:00
Sanjay Patel	df21979db7	[DAG] recognize div/rem by 0 as undef before trying constant folding As discussed in the review thread for rL297026, this is actually 2 changes that would independently fix all of the test cases in the patch: 1. Return undef in FoldConstantArithmetic for div/rem by 0. 2. Move basic undef simplifications for div/rem (simplifyDivRem()) before foldBinopIntoSelect() as a matter of efficiency. I will handle the case of vectors with any zero element as a follow-up. That change is the DAG sibling for D30665 + adding a check of vector elements to FoldConstantVectorArithmetic(). I'm deleting the test for PR30693 because it does not test for the actual bug any more (dangers of using bugpoint). Differential Revision: https://reviews.llvm.org/D30741 llvm-svn: 297384	2017-03-09 15:02:25 +00:00
Simon Dardis	7577ce2140	[mips] Revert fixes for PR32020. The fix introduces segfaults and clobbers the value to be stored when the atomic sequence loops. Revert "[Target/MIPS] Kill dead code, no functional change intended." This reverts commit r296153. Revert "Recommit "[mips] Fix atomic compare and swap at O0."" This reverts commit r296134. llvm-svn: 297380	2017-03-09 14:03:26 +00:00
Sjoerd Meijer	7f1a982d3d	[ARM] remove FIXMEs and add vcmp MC test Minor cleanup in ARMInstrVFP.td: removed some FIXMEs and added a MC test for vcmp that was actually missing. Differential Revision: https://reviews.llvm.org/D30745 llvm-svn: 297376	2017-03-09 13:28:37 +00:00
Chandler Carruth	20e588e1af	[PM/Inliner] Make the new PM's inliner process call edges across an entire SCC before iterating on newly-introduced call edges resulting from any inlined function bodies. This more closely matches the behavior of the old PM's inliner. While it wasn't really clear to me initially, this behavior is actually essential to the inliner behaving reasonably in its current design. Because the inliner is fundamentally a bottom-up inliner and all of its cost modeling is designed around that it often runs into trouble within an SCC where we don't have any meaningful bottom-up ordering to use. In addition to potentially cyclic, infinite inlining that we block with the inline history mechanism, it can also take seemingly simple call graph patterns within an SCC and turn them into insanely large functions by accidentally working top-down across the SCC without any of the threshold limitations that traditional top-down inliners use. Consider this diabolical monster.cpp file that Richard Smith came up with to help demonstrate this issue: ``` template <int N> extern const char *str; void g(const char *); template <bool K, int N> void f(bool *B, bool *E) { if (K) g(str<N>); if (B == E) return; if (*B) f<true, N + 1>(B + 1, E); else f<false, N + 1>(B + 1, E); } template <> void f<false, MAX>(bool *B, bool *E) { return f<false, 0>(B, E); } template <> void f<true, MAX>(bool *B, bool *E) { return f<true, 0>(B, E); } extern bool *arr, *end; void test() { f<false, 0>(arr, end); } ``` When compiled with '-DMAX=N' for various values of N, this will create an SCC with a reasonably large number of functions. Previously, the inliner would try to exhaust the inlining candidates in a single function before moving on. This, unfortunately, turns it into a top-down inliner within the SCC. Because our thresholds were never built for that, we will incrementally decide that it is always worth inlining and proceed to flatten the entire SCC into that one function. 
What's worse, we'll then proceed to the next function, and do the exact same thing except we'll skip the first function, and so on. And at each step, we'll also make some of the constant factors larger, which is awesome. The fix in this patch is the obvious one which makes the new PM's inliner use the same technique used by the old PM: consider all the call edges across the entire SCC before beginning to process call edges introduced by inlining. The result of this is essentially to distribute the inlining across the SCC so that every function incrementally grows toward the inline thresholds rather than allowing the inliner to grow one of the functions vastly beyond the threshold. The code for this is a bit awkward, but it works out OK. We could consider in the future doing something more powerful here such as prioritized order (via lowest cost and/or profile info) and/or a code-growth budget per SCC. However, both of those would require really substantial work both to design the system in a way that wouldn't break really useful abstraction decomposition properties of the current inliner and to be tuned across a reasonably diverse set of code and workloads. It also seems really risky in many ways. I have only found a single real-world file that triggers the bad behavior here and it is generated code that has a pretty pathological pattern. I'm not worried about the inliner not doing an *awesome* job here as long as it does ok. On the other hand, the cases that will be tricky to get right in a prioritized scheme with a budget will be more common and idiomatic for at least some frontends (C++ and Rust at least). So while these approaches are still really interesting, I'm not in a huge rush to go after them. Staying even closer to the existing PM's behavior, especially when this is easy to do, seems like the right short to medium term approach. I don't really have a test case that makes sense yet... 
I'll try to find a variant of the IR produced by the monster template metaprogram that is both small enough to be sane and large enough to clearly show when we get this wrong in the future. But I'm not confident this exists. And the behavior change here should be unobservable without snooping on debug logging. So there isn't really much to test. The test case updates come from two incidental changes: 1) We now visit functions in an SCC in the opposite order. I don't think there really is a "right" order here, so I just update the test cases. 2) We no longer compute some analyses when an SCC has no call instructions that we consider for inlining. llvm-svn: 297374	2017-03-09 11:35:40 +00:00
Simon Dardis	158956c6cc	[mips] Fix return lowering Fix a machine verifier issue where an instruction was using an invalid register. The return pseudo is expanded and has the return address register added to it. The return register may have been spuriously marked as killed earlier. This partially resolves PR/27458 Thanks to Quentin Colombet for reporting the issue! llvm-svn: 297372	2017-03-09 11:19:48 +00:00
Adam Nemet	5361b82d54	[SSP] In opt remarks, stream Function directly With this, it shows up as an attribute in YAML and non-printable characters are properly removed by GlobalValue::getRealLinkageName. llvm-svn: 297362	2017-03-09 06:10:27 +00:00
Matt Arsenault	9a3fd87523	DAG: Check no signed zeros instead of unsafe math attribute llvm-svn: 297354	2017-03-09 01:36:39 +00:00
Peter Collingbourne	0152c8156b	WholeProgramDevirt: Implement importing for uniform ret val opt. Differential Revision: https://reviews.llvm.org/D29854 llvm-svn: 297350	2017-03-09 01:11:15 +00:00
Konstantin Zhuravlyov	d923a35f34	AMDGPU: add missing lit.local.cfg to test/DebugInfo/AMDGPU llvm-svn: 297334	2017-03-09 00:21:36 +00:00
Peter Collingbourne	6d284fab20	WholeProgramDevirt: Implement importing for single-impl devirtualization. Differential Revision: https://reviews.llvm.org/D29844 llvm-svn: 297333	2017-03-09 00:21:25 +00:00
Teresa Johnson	d820447212	Perform symbol binding for .symver versioned symbols Summary: In a .symver assembler directive like: .symver name, name2@@nodename "name2@@nodename" should get the same symbol binding as "name". While the ELF object writer is updating the symbol binding for .symver aliases before emitting the object file, not doing so when the module inline assembly is handled by the RecordStreamer is causing the wrong behavior in LTO mode. E.g. when "name" is global, "name2@@nodename" must also be marked as global. Otherwise, the symbol is skipped when iterating over the LTO InputFile symbols (InputFile::Symbol::shouldSkip). So, for example, when performing any LTO via the gold-plugin, the versioned symbol definition is not recorded by the plugin and passed back to the linker. If the object was in an archive, and there were no other symbols needed from that object, the object would not be included in the final link and references to the versioned symbol are undefined. The llvm-lto2 tests added will give an error about an unused symbol resolution without the fix. Reviewers: rafael, pcc Reviewed By: pcc Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D30485 llvm-svn: 297332	2017-03-09 00:19:49 +00:00
Changpeng Fang	1be9b9f816	AMDGPU/SI: Disable unrolling in the loop vectorizer if the loop is not vectorized. Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D30719 llvm-svn: 297328	2017-03-09 00:07:00 +00:00
Evgeniy Stepanov	8537d9994d	Don't merge global constants with non-dbg metadata. !type metadata can not be dropped. An alternative to this is adding !type metadata from the replaced globals to the replacement, but that may weaken type tests and make them slower at the same time. The merged global gets !dbg metadata from replaced globals, and can end up with multiple debug locations. llvm-svn: 297327	2017-03-09 00:03:37 +00:00
Konstantin Zhuravlyov	d5561e0a0b	[DebugInfo] Emit address space with DW_AT_address_class attribute for pointer and reference types Differential Revision: https://reviews.llvm.org/D29670 llvm-svn: 297320	2017-03-08 23:55:44 +00:00
Javed Absar	382f98733a	[ConstantFold] Fix defect in constant folding computation for GEP When the array indexes are all determined by GVN to be constants, a call is made to constant-folding to optimize/simplify the address computation. The constant-folding, however, makes a mistake in that it sometimes reads back stale Idxs instead of the NewIdxs that it re-computed in the previous iteration. This leads to incorrect addresses coming out of constant-folding to GEP. A test case is included. The error is only triggered when the indexes have particular patterns for which the interplay of stale/new index updates matters. Reviewers: Daniel Berlin Differential Revision: https://reviews.llvm.org/D30642 llvm-svn: 297317	2017-03-08 23:01:50 +00:00
Tim Northover	7596bd7a27	GlobalISel: correctly handle trivial fcmp predicates. It makes sense to only do them once in IRTranslator rather than making everyone deal with them. llvm-svn: 297304	2017-03-08 18:49:54 +00:00
Matthew Simpson	3388de1349	[LV] Select legal insert point when fixing first-order recurrences Because IRBuilder performs constant-folding, it's not guaranteed that an instruction in the original loop maps to an instruction in the vector loop. It could map to a constant vector instead. The handling of first-order recurrences was incorrectly making this assumption when setting the IRBuilder's insert point. llvm-svn: 297302	2017-03-08 18:18:20 +00:00
Volkan Keles	5698b2ae6e	[GlobalISel] Add default action for G_FNEG Summary: rL297171 introduced G_FNEG for floating-point negation instruction and IRTranslator started to translate `FSUB -0.0, X` to `FNEG X`. This patch adds a default action for G_FNEG to avoid breaking existing targets. Reviewers: qcolombet, ab, kristof.beyls, t.p.northover, aditya_nandakumar, dsanders Reviewed By: qcolombet Subscribers: dberris, rovka, llvm-commits Differential Revision: https://reviews.llvm.org/D30721 llvm-svn: 297301	2017-03-08 18:09:14 +00:00
Sanjay Patel	9f495695bb	[x86] regenerate checks; NFC This test could be reduced? The check fails for a seemingly unrelated change, so I'm adding full checks to see what is happening. llvm-svn: 297296	2017-03-08 17:19:56 +00:00
Matthew Simpson	8966848d17	[LV] Make the test case for PR30183 less fragile This patch also renames the PR number the test points to. The previous reference was PR29559, but that bug was somehow deleted and recreated under PR30183. llvm-svn: 297295	2017-03-08 17:03:38 +00:00
Matthew Simpson	903dd5aa9b	[LV] Add missing check labels to tests and reformat llvm-svn: 297294	2017-03-08 16:55:34 +00:00
Krzysztof Parzyszek	1b7197e690	[Hexagon] Use correct offset when extracting from the high word When extracting a bitfield from the high register in a register pair, the final offset should be relative to the high register (for 32-bit extracts). llvm-svn: 297288	2017-03-08 15:46:28 +00:00
Daniel Cederman	9db582a656	[Sparc] Check register use with isPhysRegUsed() instead of reg_nodbg_empty() Summary: By using reg_nodbg_empty() to determine if a function can be treated as a leaf function or not, we miss the case when the register pair L0_L1 is used but not L0 by itself. This has the effect that use_all_i32_regs(), a test in reserved-regs.ll which tries to use all registers, gets treated as a leaf function. Reviewers: jyknight, venkatra Reviewed By: jyknight Subscribers: davide, RKSimon, sepavloff, llvm-commits Differential Revision: https://reviews.llvm.org/D27089 llvm-svn: 297285	2017-03-08 15:23:10 +00:00
Jun Bum Lim	ac170872b2	[JumpThread] Use AA in SimplifyPartiallyRedundantLoad() Summary: Use AA when scanning to find an available load value. Reviewers: rengolin, mcrosier, hfinkel, trentxintong, dberlin Reviewed By: rengolin, dberlin Subscribers: aemerson, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D30352 llvm-svn: 297284	2017-03-08 15:22:30 +00:00
Sanjay Patel	62906af379	[InstCombine] avoid crashing on shuffle shrinkage when input type is not same as result type llvm-svn: 297280	2017-03-08 15:02:23 +00:00
John Brawn	f82d68ff53	[ARM] Split up lsl-zero test into two tests On Windows stderr and stdout happen to get interleaved in a way that causes the test to fail, so split it up into a test that checks for errors and a test that doesn't. llvm-svn: 297273	2017-03-08 12:49:18 +00:00
Sam Parker	0f4db38c20	[LoopRotate] Propagate dbg.value intrinsics Recommitting patch which was previously reverted in r297159. These changes should address the casting issues. The original patch enables dbg.value intrinsics to be attached to newly inserted PHI nodes. Differential Review: https://reviews.llvm.org/D30701 llvm-svn: 297269	2017-03-08 09:56:22 +00:00
Tim Shen	c7472d912b	Revert "Revert "[PowerPC][ELFv2ABI] Allocate parameter area on-demand to reduce stack frame size"" After inspection, it's UB in our code base. Someone cast a var-arg function pointer to a non-var-arg one. :/ Re-commit r296771 to continue testing on the patch. Sorry for the trouble! llvm-svn: 297256	2017-03-08 02:41:35 +00:00
Sebastian Pop	4a4d245b19	Handle UnreachableInst in isGuaranteedToTransferExecutionToSuccessor A block with an UnreachableInst does not transfer execution to a successor. The problem was exposed by GVN-hoist. This patch fixes bug 32153. Patch by Aditya Kumar. Differential Revision: https://reviews.llvm.org/D30667 llvm-svn: 297254	2017-03-08 01:54:50 +00:00
Matt Arsenault	52d1b62a28	AMDGPU: Don't wait at end of block with a trivial successor If there is only one successor, and that successor only has one predecessor the wait can obviously be delayed until uses or the end of the next block. This avoids code quality regressions when there are trivial fallthrough blocks inserted for structurization. llvm-svn: 297251	2017-03-08 01:06:58 +00:00
Eli Friedman	c2c2e21d77	[DAGCombine] Simplify ISD::AND in GetDemandedBits. This helps in cases involving bitfields where an AND is exposed by legalization. Differential Revision: https://reviews.llvm.org/D30472 llvm-svn: 297249	2017-03-08 00:56:35 +00:00
Matt Arsenault	d8ed207a20	AMDGPU: Constant fold rcp node When doing arcp optimization with a constant denominator, this was leaving behind rcps with constant inputs. llvm-svn: 297248	2017-03-08 00:48:46 +00:00
Konstantin Zhuravlyov	f9b41cd3d8	[DebugInfo] Make legal and emit DW_OP_swap and DW_OP_xderef Differential Revision: https://reviews.llvm.org/D29672 llvm-svn: 297247	2017-03-08 00:28:57 +00:00
Changpeng Fang	6b49fa4ca7	AMDGPU/SI: Do not insert EndCf in an unreachable block Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D22025 llvm-svn: 297243	2017-03-07 23:29:36 +00:00
Sanjay Patel	fe9705149b	[InstCombine] shrink truncated insertelement into undef vector This is the 2nd part of solving: http://lists.llvm.org/pipermail/llvm-dev/2017-February/110293.html D30123 moves the trunc ahead of the shuffle, and this moves the trunc ahead of the insertelement. We're limiting this transform to undef rather than any constant to avoid backend problems. Differential Revision: https://reviews.llvm.org/D30137 llvm-svn: 297242	2017-03-07 23:27:14 +00:00
Krzysztof Parzyszek	434d50a796	[Hexagon] Check for presence before looking registers up in bit tracker llvm-svn: 297240	2017-03-07 23:12:04 +00:00
Krzysztof Parzyszek	8e4d2e0512	[Hexagon] Generate bitsplit instruction llvm-svn: 297239	2017-03-07 23:08:35 +00:00
Tim Northover	542d1c1463	GlobalISel: use inserts for landingpad instead of sequences. llvm-svn: 297237	2017-03-07 23:04:06 +00:00
Evgeniy Stepanov	7a5cfa9a11	Fix one-after-the-end type metadata handling in globalsplit. Itanium ABI may have an address point one byte after the end of a vtable. When such vtable global is split, the !type metadata needs to follow the right vtable. Differential Revision: https://reviews.llvm.org/D30716 llvm-svn: 297236	2017-03-07 22:18:48 +00:00
Zachary Turner	48d257d76c	Fix source-lines test on Windows. llvm-svn: 297233	2017-03-07 21:53:21 +00:00
Sanjay Patel	53fa17a014	[InstCombine] shrink truncated splat shuffle (2nd try) This was committed at r297155 and reverted at r297166 because of an over-reaching clang test. That should be fixed with r297189. This is one part of solving a recent bug report: http://lists.llvm.org/pipermail/llvm-dev/2017-February/110293.html This keeps with our general approach: changing arbitrary shuffles is off-limits, but changing splat is ok. The transform is very similar to the existing shrinkBitwiseLogic() canonicalization. Differential Revision: https://reviews.llvm.org/D30123 llvm-svn: 297232	2017-03-07 21:45:16 +00:00
Chris Bieneman	a03cbcc6a6	[ObjectYAML] Fix issue with DWARF2 AddrSize 8 In my refactoring I introduced a bug where we were using the reference size instead of the offset size for DW_FORM_strp and similar forms. This patch resolves the error and adds a test case testing all the DWARF forms for DWARF2 AddrSize 8. There is similar coverage already in the DWARFDebugInfoTest sources that covers the parser. Once I migrate the DWARFGenerator APIs to be built on the YAML tools they will be fully covered under the same tests. llvm-svn: 297230	2017-03-07 21:34:35 +00:00
Tim Northover	2eb18d3c4b	GlobalISel: fix legalization of G_INSERT We were calculating incorrect extract/insert offsets by trying to be too tricksy with min/max. It's clearer to just split the logic up into "register starts before this segment" vs "after". llvm-svn: 297226	2017-03-07 21:24:33 +00:00
Gor Nishanov	c52006ab09	[coroutines] Add handling for unwind coro.ends Summary: The purpose of coro.end intrinsic is to allow frontends to mark the cleanup and other code that is only relevant during the initial invocation of the coroutine and should not be present in resume and destroy parts. In landing pads coro.end is replaced with an appropriate instruction to unwind to caller. The handling of coro.end differs depending on whether the target is using landingpad or WinEH exception model. For landingpad based exception model, it is expected that frontend uses the `coro.end` intrinsic as follows: ``` ehcleanup: %InResumePart = call i1 @llvm.coro.end(i8* null, i1 true) br i1 %InResumePart, label %eh.resume, label %cleanup.cont cleanup.cont: ; rest of the cleanup eh.resume: %exn = load i8*, i8** %exn.slot, align 8 %sel = load i32, i32* %ehselector.slot, align 4 %lpad.val = insertvalue { i8*, i32 } undef, i8* %exn, 0 %lpad.val29 = insertvalue { i8*, i32 } %lpad.val, i32 %sel, 1 resume { i8*, i32 } %lpad.val29 ``` The `CoroSplit` pass replaces `coro.end` with ``True`` in the resume functions, thus leading to immediate unwind to the caller, whereas in start function it is replaced with ``False``, thus allowing to proceed to the rest of the cleanup code that is only needed during initial invocation of the coroutine. For Windows Exception handling model, a frontend should attach a funclet bundle referring to an enclosing cleanuppad as follows: ``` ehcleanup: %tok = cleanuppad within none [] %unused = call i1 @llvm.coro.end(i8* null, i1 true) [ "funclet"(token %tok) ] cleanupret from %tok unwind label %RestOfTheCleanup ``` The `CoroSplit` pass, if the funclet bundle is present, will insert ``cleanupret from %tok unwind to caller`` before the `coro.end` intrinsic and will remove the rest of the block. Reviewers: majnemer Reviewed By: majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D25543 llvm-svn: 297223	2017-03-07 21:00:54 +00:00
Ahmed Bougacha	55d10423a6	[GlobalISel] Don't translate intrinsics with metadata parameters. Some intrinsics take metadata parameters. These all need custom handling of some form, and cannot possibly be lowered generically to G_INTRINSIC calls with vreg operands. Reject them, instead of hitting an assert later in getOrCreateVReg. llvm-svn: 297209	2017-03-07 20:53:09 +00:00
Ahmed Bougacha	5c7924fca5	[GlobalISel] Avoid invalidating ValToVReg when translating no-op bitcast. When we translate a no-op (same type) bitcast, we try to be clever and only emit a COPY if we already assigned a vreg to the defined value. However, when we didn't, we tried to assign to a reference into the ValToVReg DenseMap, even though the RHS of the assignment (getOrCreateVReg) could potentially grow that DenseMap, invalidating the reference. Avoid that by getting the source vreg first. I audited the rest of the translator; this is the only tricky case. The test is quite unwieldy, as the problem is caused by the DenseMap growing, which happens after the 47th mapped value. llvm-svn: 297208	2017-03-07 20:53:06 +00:00
Ahmed Bougacha	38455ea8a6	[GlobalISel] Relax vector G_SELECT assertion. For vector operands, the `select` instruction supports both vector and non-vector conditions. The MIR builder had an overly restrictive assertion, that only accepted vector conditions for vector selects (in effect implementing ISD::VSELECT). Make it possible to express the full range of G_SELECTs. llvm-svn: 297207	2017-03-07 20:53:03 +00:00
Ahmed Bougacha	70dd6c2212	[GlobalISel] Add vector select translation test. NFC. llvm-svn: 297206	2017-03-07 20:53:00 +00:00
Ahmed Bougacha	c373262d52	[GlobalISel] Ignore %noreg when applying default regbank mapping. When computing the mapping for non-generic instructions, we skipped %noreg operands, because we can't always reason about their banks. Also skip them when applying the mapping. Otherwise, we could end up with mappings that we can't apply. While there, duplicate an assert to distinguish between the two error conditions. llvm-svn: 297201	2017-03-07 20:34:23 +00:00
Ahmed Bougacha	4826bae8b4	[GlobalISel] Emit DBG_VALUE %noreg for non-int/fp constant values. When a dbg_value has a constant operand that isn't representable in MI, there isn't much we can do. Use %noreg (0) for those situations. This matches the SelectionDAG behavior. llvm-svn: 297200	2017-03-07 20:34:20 +00:00
Ahmed Bougacha	ab50ecb1c7	[GlobalISel] Add constant dbg.value translation tests. NFC. llvm-svn: 297199	2017-03-07 20:34:13 +00:00
Artem Belevich	2524a22562	[NVPTX] Fixed lowering of unaligned loads/stores of f16 scalars and vectors. Differential Revision: https://reviews.llvm.org/D30672 llvm-svn: 297198	2017-03-07 20:33:38 +00:00
Arnold Schwaighofer	69e74b48f2	SjLjEHPrepare: Fix the pass for swifterror arguments We cannot leave the identity copies 'select true, arg, undef' that this pass inserts for arguments to simplify handling of values on swifterror arguments. swifterror arguments have restrictions on their uses. rdar://30839288 llvm-svn: 297197	2017-03-07 20:29:02 +00:00
Konstantin Zhuravlyov	f895b2019b	llvm-objdump: handle line numbers and source options for amdgpu objects Differential Revision: https://reviews.llvm.org/D30679 llvm-svn: 297193	2017-03-07 20:17:11 +00:00
Joel Jones	2852088126	[AArch64] Vulcan is now ThunderX2T99 Broadcom Vulcan is now Cavium ThunderX2T99. LLVM Bugzilla: http://bugs.llvm.org/show_bug.cgi?id=32113 Minor fixes for the alignments of loops and functions for ThunderX T81/T83/T88 (better performance). Patch was tested with SpecCPU2006. Patch by Stefan Teleman Differential Revision: https://reviews.llvm.org/D30510 llvm-svn: 297190	2017-03-07 19:42:40 +00:00
Chris Bieneman	b3ca711ab3	[ObjectYAML] Add support for DWARF5 Unit header In DWARF5 the Unit header added a new field, UnitType, and swapped the order of the address size and abbreviation offset fields. llvm-svn: 297183	2017-03-07 18:50:58 +00:00
Matthew Simpson	c86b2134c7	[LV] Consider users that are memory accesses in uniforms expansion step When expanding the set of uniform instructions beyond the seed instructions (e.g., consecutive pointers), we mark a new instruction uniform if all its loop-varying users are uniform. We should also allow users that are consecutive or interleaved memory accesses. This fixes cases where we have an instruction that is used as the pointer operand of a consecutive access but also used by a non-memory instruction that later becomes uniform as part of the expansion. llvm-svn: 297179	2017-03-07 18:47:30 +00:00
Adrian Prantl	d4ac2a2b43	Further reduce testcase llvm-svn: 297176	2017-03-07 18:26:36 +00:00
Teresa Johnson	a404d1436e	Fix test and add missing return for llvm-lto2 error case Summary: This test was missing the target triple. Once I fixed that, the case with the invalid character error stopped returning 1 from llvm-lto2 and the test reported a failure. Fixed by adding the missing return from llvm-lto2. Apparently we were failing when we eventually tried to get the target. Reviewers: pcc Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D30585 llvm-svn: 297173	2017-03-07 18:15:13 +00:00
Volkan Keles	20d3c4200d	[GlobalISel] Translate floating-point negation Reviewers: qcolombet, javed.absar, aditya_nandakumar, dsanders, t.p.northover, ab Reviewed By: qcolombet Subscribers: dberris, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30671 llvm-svn: 297171	2017-03-07 18:03:28 +00:00
Adrian Prantl	654975b5a6	Update comment in testcase llvm-svn: 297170	2017-03-07 17:55:36 +00:00
Sanjay Patel	6d30606168	revert r297155 because there's a clang test that depends on InstCombine: tools/clang/test/CodeGen/zvector.c llvm-svn: 297166	2017-03-07 17:41:45 +00:00
Adrian Prantl	d4056501fb	Revert "Strip debug info when inlining into a nodebug function." This reverts commit r296488. As noted by David Blaikie on llvm-commits, I overlooked the case of a debug function being inlined into a nodebug function being inlined into a debug function. llvm-svn: 297163	2017-03-07 17:28:57 +00:00
Adrian Prantl	63d9695261	Relax the conflicting function arg verifier to allow for inlined debug info in nodebug functions. llvm-svn: 297161	2017-03-07 17:28:54 +00:00
Nico Weber	3b2f0094d7	Revert r297132, it caused PR32171 llvm-svn: 297159	2017-03-07 17:23:52 +00:00
Sanjay Patel	defdb7bed5	[InstCombine] shrink truncated splat shuffle This is one part of solving a recent bug report: http://lists.llvm.org/pipermail/llvm-dev/2017-February/110293.html This keeps with our general approach: changing arbitrary shuffles is off-limits, but changing splat is ok. The transform is very similar to the existing shrinkBitwiseLogic() canonicalization. Differential Revision: https://reviews.llvm.org/D30123 llvm-svn: 297155	2017-03-07 16:10:36 +00:00
John Brawn	eba9fdac7e	[ARM] Correct handling of LSL #0 in an IT block The check for LSL #0 in an IT block was checking if operand 4 was zero, but operand 4 is the condition code operand so it was actually checking for LSLEQ. Fix this by checking operand 3, which really is the immediate operand, and add some tests. Differential Revision: https://reviews.llvm.org/D30692 llvm-svn: 297142	2017-03-07 14:42:03 +00:00
Krzysztof Parzyszek	3cceffb752	[Hexagon] Do not insert instructions before PHI nodes llvm-svn: 297141	2017-03-07 14:20:19 +00:00
Ranjeet Singh	3d0af578cc	[ARM] Reapply r296865 "[ARM] fpscr read/write intrinsics not aware of each other" The original patch r296865 was reverted as it broke the chromium builds for Android (https://bugs.llvm.org/show_bug.cgi?id=32134). This patch reapplies r296865 with a fix to make sure it doesn't cause the build regression. The problem was that intrinsic selection on int_arm_get_fpscr was failing in ISel. This was because the code to manually select this intrinsic still thought it was the version with no side-effects (INTRINSIC_WO_CHAIN), which is wrong as it doesn't semantically match the definition in the tablegen code, which says it does have side-effects. I've fixed this by updating the intrinsic type to INTRINSIC_W_CHAIN (has side-effects). I've also added a test for this based on Hans's original reproducer. Differential Revision: https://reviews.llvm.org/D30645 llvm-svn: 297137	2017-03-07 11:17:53 +00:00
Jonas Paulsson	1d33cd3988	[SystemZ] Add check VT.isSimple() in canTreatAsByteVector() Since BB-vectorizer can produce vectors of, for example, 3 elements, this check is needed. Review: Ulrich Weigand llvm-svn: 297136	2017-03-07 09:49:31 +00:00
Artyom Skrobov	1388e2f792	In Thumb1, materialize a move between low registers as a `movs`, if CPSR isn't live. Summary: Previously, it had always been materialized as a push/pop sequence. Reviewers: labrinea, jroelofs Reviewed By: jroelofs Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D30648 llvm-svn: 297134	2017-03-07 09:38:16 +00:00
Sam Parker	6ec5fdbc94	[LoopRotate] Update dbg.value intrinsics Propagate debug info through the newly inserted PHI nodes. Differential Revision: https://reviews.llvm.org/D30190 llvm-svn: 297132	2017-03-07 09:34:25 +00:00
Ayman Musa	ac5a2c43af	[X86][AVX512] Add missing entries to EVEX2VEX tables The evex2vex pass defines 2 tables which map EVEX instructions to their identical VEX counterparts when possible. Adding all missing entries. Differential Revision: https://reviews.llvm.org/D30501 llvm-svn: 297126	2017-03-07 08:05:53 +00:00
Tim Shen	70054bb827	Revert "[PowerPC][ELFv2ABI] Allocate parameter area on-demand to reduce stack frame size" This reverts commit r296771. We found some widespread test failures internally. I'm working on a testcase. Politely revert the patch in the meantime. :) llvm-svn: 297124	2017-03-07 07:40:10 +00:00
Sanjoy Das	30c3538e2e	[LoopUnrolling] Fix loop size check for peeling Summary: We should check if loop size allows us to peel at least one iteration before we do so. Patch by Max Kazantsev! Reviewers: sanjoy, mkuper, efriedma Reviewed By: mkuper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30632 llvm-svn: 297122	2017-03-07 06:03:15 +00:00
Konstantin Zhuravlyov	e8aaab8abe	Revert "AMDGPU: Set MCAsmInfo::PointerSize" It breaks line tables because the patch is not complete, working on a complete one at the moment This reverts commit r294031. llvm-svn: 297118	2017-03-07 04:44:33 +00:00
Adrian Prantl	cb7b8f1094	Add a testcase for r297072. Check that missing debug locations on inlinable calls are a recoverable error. llvm-svn: 297113	2017-03-07 02:49:57 +00:00
Michael Kuperstein	768d013a03	[SLP] Revert r296863 due to miscompiles. Details and reproducer are on the email thread for r296863. llvm-svn: 297103	2017-03-06 23:54:51 +00:00
Tim Northover	c2c545b8f7	GlobalISel: restrict G_EXTRACT instruction to just one operand. A bit more painful than G_INSERT because it was more widely used, but this should simplify the handling of extract operations in most locations. llvm-svn: 297100	2017-03-06 23:50:28 +00:00
Chris Bieneman	bcf513f25a	[ObjectYAML] Support for DW_FORM_implicit_const DWARF5 form This patch adds support to the DWARF YAML reader and writer for the new DWARF5 abbreviation form, DW_FORM_implicit_const. The attribute was added in r291599. llvm-svn: 297091	2017-03-06 23:22:49 +00:00
Jessica Paquette	596f483a5e	[Outliner] Fixed Asan bot failure in r296418 Fixed the asan bot failure which led to the last commit of the outliner being reverted. The change is in lib/CodeGen/MachineOutliner.cpp in the SuffixTree's constructor. LeafVector is no longer initialized using reserve but just a standard constructor. llvm-svn: 297081	2017-03-06 21:31:18 +00:00
Chad Rosier	9a70c7c02a	[AArch64][Redundant Copy Elim] Add support for CMN and shifted imm. This patch extends the current functionality of the AArch64 redundant copy elimination pass to handle CMN instructions as well as shifted immediates. Differential Revision: https://reviews.llvm.org/D30576. llvm-svn: 297078	2017-03-06 21:20:00 +00:00
Hans Wennborg	254f5fa5f2	Disable gvn-hoist (PR32153) llvm-svn: 297075	2017-03-06 21:10:40 +00:00
Krzysztof Parzyszek	9e60e51a71	Revert r297039, it's causing some mysterious buildbot failures llvm-svn: 297062	2017-03-06 20:24:21 +00:00
Jan Vesely	3ea1704434	AMDGPU/R600: Fix ALU clause markers use detection also exit early on kill instead of redefinition. Differential Revision: https://reviews.llvm.org/D30230 llvm-svn: 297060	2017-03-06 20:10:05 +00:00
Daniel Berlin	961b002714	NewGVN: We were not really failing this testcase, because the instructions it was looking for are unused. GVN value numbers unused instructions, NewGVN does not. Fix the instructions to be used, so we eliminate the redundancies it's checking for, and un-XFAIL it llvm-svn: 297058	2017-03-06 20:01:31 +00:00
Krzysztof Parzyszek	5b8fae5edd	[IfConversion] Only renormalize probabilities if branches are analyzable If a block has non-analyzable branches, the listed successors don't need to add up to one. For example, if a block has a conditional tail call, that tail call will not have a corresponding successor in the successor list, but will still be a possible branch. Differential Revision: https://reviews.llvm.org/D30556 llvm-svn: 297054	2017-03-06 19:12:42 +00:00

43709 Commits