llvm-project

Commit Graph

Author	SHA1	Message	Date
Adrian Prantl	210a29de7b	Fix a bug in GlobalOpt's handling of DIExpressions. This patch adds support for fragment expressions TryToShrinkGlobalToBoolean() which were previously just dropped. Thanks to Reid Kleckner for providing me a reproducer! llvm-svn: 331086	2018-04-27 21:41:36 +00:00
Roman Lebedev	6959b8e76f	[PatternMatch] Stabilize the matching order of commutative matchers Summary: Currently, we 1. match `LHS` matcher to the `first` operand of binary operator, 2. and then match `RHS` matcher to the `second` operand of binary operator. If that does not match, we swap the `LHS` and `RHS` matchers: 1. match `RHS` matcher to the `first` operand of binary operator, 2. and then match `LHS` matcher to the `second` operand of binary operator. This works ok. But it complicates writing of commutative matchers, where one would like to match (`m_Value()`) the value on one side, and use (`m_Specific()`) it on the other side. This is additionally complicated by the fact that `m_Specific()` stores the `Value *`, not `Value **`, so it won't work at all out of the box. The last problem is trivially solved by adding a new `m_c_Specific()` that stores the `Value **`, not `Value *`. I'm choosing to add a new matcher, not change the existing one because I guess all the current users are ok with existing behavior, and this additional pointer indirection may have performance drawbacks. Also, I'm storing pointer, not reference, because for some mysterious-to-me reason it did not work with the reference. The first one appears trivial, too. Currently, we 1. match `LHS` matcher to the `first` operand of binary operator, 2. and then match `RHS` matcher to the `second` operand of binary operator. If that does not match, we swap the ~~`LHS` and `RHS` matchers~~ operands: 1. match ~~`RHS`~~ `LHS` matcher to the ~~`first`~~ `second` operand of binary operator, 2. and then match ~~`LHS`~~ `RHS` matcher to the ~~`second`~~ `first` operand of binary operator. Surprisingly, `$ ninja check-llvm` still passes with this. But I expect the bots will disagree. The motivational unittest is included. I'd like to use this in D45664. Reviewers: spatel, craig.topper, arsenm, RKSimon Reviewed By: craig.topper Subscribers: xbolva00, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D45828 llvm-svn: 331085	2018-04-27 21:23:20 +00:00
Simon Pilgrim	8ee7d01dcf	[X86] Merge some x87 instruction instregex single matches. NFCI. llvm-svn: 331084	2018-04-27 21:14:19 +00:00
Daniel Sanders	4f246999d9	Attempt to fix remaining build failures after r331071 by changing the tuple to a struct Some of the bots were failing in a different way to the others. These were unable to compare tuples. Fix this by changing to a struct, thereby avoiding the quirks of tuples. llvm-svn: 331081	2018-04-27 21:03:27 +00:00
Philip Reames	5a6482450a	[LICM] Reduce nesting with an early return [NFC] llvm-svn: 331080	2018-04-27 20:58:30 +00:00
Philip Reames	e4ec473b3f	[MustExecute/LICM] Special case first instruction in throwing header We currently have a hard-to-solve analysis problem around the order of instructions within a potentially throwing block. We can't cheaply determine whether a given instruction is before the first potential throw in the block. While we're working on that in the background, special case the first instruction within the header. Why this particular special case? Well, headers are guaranteed to execute if the loop does, and it turns out we tend to produce this form in practice. In a follow on patch, I intend to extend LICM with an alternate approach which works for any instruction in the header before the first throw, but this is the best I can come up with for other users of the analysis (such as store promotion.) Note: I can't show the difference in the analysis result since we're ORing in the expensive instruction walk used by SCEV. Using the full walk is not suitable for a general solution. llvm-svn: 331079	2018-04-27 20:44:01 +00:00
Vlad Tsyrklevich	201a1086cf	ELFObjectWriter: Allow one unique symver per symbol Summary: Only allow a single unique .symver alias per symbol. This matches the behavior of gas. I noticed that we ignored multiple mismatched symver directives looking at https://reviews.llvm.org/D45798 Reviewers: pcc, tejohnson, espindola Reviewed By: pcc Subscribers: emaste, arichardson, llvm-commits, kcc Differential Revision: https://reviews.llvm.org/D45845 llvm-svn: 331078	2018-04-27 20:32:34 +00:00
Daniel Neilson	a19ee7d7b6	[LV] Common duplicate vector load/store address calculation (NFC) Summary: Commoning some obviously copy/paste code in InnerLoopVectorizer::vectorizeMemoryInstruction llvm-svn: 331076	2018-04-27 20:29:18 +00:00
Daniel Sanders	a05e8d3e68	Attempt to fix build failure after r331071 using std::make_tuple llvm-svn: 331074	2018-04-27 20:17:44 +00:00
Jun Bum Lim	9e3e14b5f9	[PostRASink] extend the live-in check for all aliased registers Extend the live-in check for all aliased registers so that we can allow sinking Copy instructions when only implicit def is in successor's live-in. llvm-svn: 331072	2018-04-27 19:59:20 +00:00
Daniel Sanders	27fe8a5011	[globalisel][legalizerinfo] Add support for legalization based on the MachineMemOperand Summary: Currently only the memory size is supported but others can be added as needed. narrowScalar for G_LOAD and G_STORE now correctly update the MachineMemOperand and will refuse to legalize atomics since those need more careful expansions to maintain atomicity. Reviewers: ab, aditya_nandakumar, bogner, rtereshin, aemerson, javed.absar Reviewed By: aemerson Subscribers: aemerson, rovka, kristof.beyls, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D45466 llvm-svn: 331071	2018-04-27 19:48:53 +00:00
Jun Bum Lim	47aece1344	[CodeGen] Use RegUnits to track register aliases (NFC) Summary: Use RegUnits to track register aliases in PostRASink and AArch64LoadStoreOptimizer. Reviewers: thegameg, mcrosier, gberry, qcolombet, sebpop, MatzeB, t.p.northover, javed.absar Reviewed By: thegameg, sebpop Subscribers: javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D45695 llvm-svn: 331066	2018-04-27 18:44:37 +00:00
Simon Pilgrim	8a937e00d8	[X86] Split WriteFBlend/WriteFVarBlend/WriteFVarShuffle into XMM and YMM/ZMM scheduler classes This removes all the WriteFBlend/WriteFVarBlend InstRW overrides - some WriteFVarShuffle remain to be fixed. llvm-svn: 331065	2018-04-27 18:19:48 +00:00
Philip Reames	de5a1da2d2	[GuardWidening] Add some clarifying comments about heuristics [NFC] llvm-svn: 331061	2018-04-27 17:41:37 +00:00
Philip Reames	9258e9d190	[LoopGuardWidening] Split out a loop pass version of GuardWidening The idea is to have a pass which performs the same transformation as GuardWidening, but can be run within a loop pass manager without disrupting the pass manager structure. As demonstrated by the test case, this doesn't quite get there because of issues with post dom, but it gives a good step in the right direction. The motivation is purely to reduce compile time since we can now preserve locality during the loop walk. This patch only includes a legacy pass. A follow up will add a new style pass as well. llvm-svn: 331060	2018-04-27 17:29:10 +00:00
Nirav Dave	6b01b88012	[MC] Undo spurious commit added into r331052. llvm-svn: 331055	2018-04-27 16:16:06 +00:00
Simon Pilgrim	c3c767bf50	[X86] Split WriteFHadd into XMM and YMM/ZMM scheduler classes This removes all the HADD/HSUB PS/PD InstRW overrides. llvm-svn: 331054	2018-04-27 16:11:57 +00:00
Nirav Dave	38b4b54a2c	[MC] Provide default value for IsResolved. llvm-svn: 331052	2018-04-27 16:11:24 +00:00
Simon Pilgrim	b2aa89c909	[X86][AVX] Split WriteFLogic into XMM and YMM/ZMM scheduler classes This removes all the AND/ANDN/OR/XOR PS/PD InstRW overrides. llvm-svn: 331051	2018-04-27 15:50:33 +00:00
Simon Dardis	e3c3c5a7a7	[mips] Analyze and provide selection patterns for microMIPSR6 branches These branches were previously unanalyzable and unselectable. Add them and recognize how to generate their inverses. Reviewers: smaksimovic, atanasyan, abeserminji Differential Revision: https://reviews.llvm.org/D46113 llvm-svn: 331050	2018-04-27 15:49:49 +00:00
Nirav Dave	1b5533c9e8	[MC] Modify MCAsmStreamer to always build MCAssembler. NFCI. llvm-svn: 331048	2018-04-27 15:45:54 +00:00
Nirav Dave	8728e097df	[MC] Allow MCAssembler to be constructed without all subcomponents. NFCI. llvm-svn: 331047	2018-04-27 15:45:27 +00:00
Francis Visoiu Mistrih	c855e92ca9	[AArch64] Place the first ldp at the end when ReverseCSRRestoreSeq is true Put the first ldp at the end, so that the load-store optimizer can run and merge the ldp and the add into a post-index ldp. This didn't work in case no frame was needed and resulted in code size regressions. llvm-svn: 331044	2018-04-27 15:30:54 +00:00
Jonas Paulsson	9a485985cd	[SystemZ] Remove scheduling info from some Pseudo instructions (NFC). If the MachineInstr uses a custom inserter and is then erased after instruction selection, there is no use for mapping it to a sched class. Review: Ulrich Weigand llvm-svn: 331040	2018-04-27 14:09:03 +00:00
Florian Hahn	f3fea0f11f	[LoopInterchange] Allow some loops with PHI nodes in the exit block. We currently support LCSSA PHI nodes in the outer loop exit, if their incoming values do not come from the outer loop latch or if the outer loop latch has a single predecessor. In that case, the outer loop latch will be executed only if the inner loop gets executed. If we have multiple predecessors for the outer loop latch, it may be executed even if the inner loop does not get executed. This is a first step to support the case described in https://bugs.llvm.org/show_bug.cgi?id=30472 Reviewers: efriedma, karthikthecool, mcrosier Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D43237 llvm-svn: 331037	2018-04-27 13:52:51 +00:00
Oliver Stannard	76088a5929	[AArch64] Codegen for v8.2A dot product intrinsics This adds IR intrinsics for the AArch64 dot-product instructions introduced in v8.2-A. Differential Revision: https://reviews.llvm.org/D46107 llvm-svn: 331036	2018-04-27 13:45:32 +00:00
Benjamin Kramer	733c7fc55d	[NVPTX] Turn on Loop/SLP vectorization Since PTX has grown a <2 x half> datatype vectorization has become more important. The late LoadStoreVectorizer intentionally only does loads and stores, but now arithmetic has to be vectorized for optimal throughput too. This is still very limited, SLP vectorization happily creates <2 x half> if it's a legal type but there's still a lot of register moving happening to get that fed into a vectorized store. Overall it's a small performance win by reducing the amount of arithmetic instructions. I haven't really checked what the loop vectorizer does to PTX code, the cost model there might need some more tweaks. I didn't see it causing harm though. Differential Revision: https://reviews.llvm.org/D46130 llvm-svn: 331035	2018-04-27 13:36:05 +00:00
Simon Pilgrim	aef5ca7299	[X86] Replace some system instruction instregex single matches with instrs entry. NFCI. llvm-svn: 331034	2018-04-27 13:32:42 +00:00
Aleksandar Beserminji	3546c1603a	[mips] Fix how the compiler fuses instructions to fmadd/fmsub This patch makes the compiler not fuse fmul and fadd/fsub into fmadd/fmsub by default. Instead, the -fp-contract=fast option can be used when such behavior is desired. Differential Revision: https://reviews.llvm.org/D46057 llvm-svn: 331033	2018-04-27 13:30:27 +00:00
Oliver Stannard	f3632143da	[ARM] Codegen for v8.2A dot product intrinsics This adds IR intrinsics for the ARM dot-product instructions introduced in v8.2-A. Differential revision: https://reviews.llvm.org/D46106 llvm-svn: 331032	2018-04-27 12:50:40 +00:00
David Green	c4cccea4c9	[ARM] Enable misched for R52. Back when the R52 schedule was added in rL286949, there was no way to enable machine schedules in ARM for specific cores. Since then a target feature has been added. This enables the feature for R52, removing the need to manually specify compiler flags. llvm-svn: 331027	2018-04-27 11:29:49 +00:00
Mikhail Maltsev	ffaa8a8781	[IR] Do not assume that function pointers are aligned Summary: The value tracking analysis uses function alignment to infer that the least significant bits of function pointers are known to be zero. Unfortunately, this is not correct for ARM targets: the least significant bit of a function pointer stores the ARM/Thumb state information (i.e., the LSB is set for Thumb functions and cleared for ARM functions). The original approach (https://reviews.llvm.org/D44781) introduced a new field for function pointer alignment in the DataLayout structure to address this. But it seems unlikely that optimizations based on function pointer alignment would bring much benefit in practice to justify the additional maintenance burden, so this patch simply assumes that function pointer alignment is always unknown. Reviewers: javed.absar, efriedma Reviewed By: efriedma Subscribers: kristof.beyls, llvm-commits, hfinkel, rogfer01 Differential Revision: https://reviews.llvm.org/D46110 llvm-svn: 331025	2018-04-27 09:12:12 +00:00
Petar Jovanovic	d4349f3bf6	[mips] Add support for Virtualization ASE This includes Instructions: tlbginv, tlbginvf, tlbgp, tlbgr, tlbgwi, tlbgwr, hypcall mfgc0, mtgc0, mfhgc0, mthgc0, dmfgc0, dmtgc0, Assembler directives: .set virt, .set novirt, .module virt, .module novirt Attribute: virt .MIPS.abiflags: VZ (0x100) Patch by Vladimir Stefanovic. Differential Revision: https://reviews.llvm.org/D44905 llvm-svn: 331024	2018-04-27 09:12:08 +00:00
Serguei Katkov	1956a48d27	[SCEV] Add trivial case handling for umin utilities. NFC. Reviewers: sanjoy, mkazantsev Reviewed By: mkazantsev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46175 llvm-svn: 331022	2018-04-27 08:02:50 +00:00
Serguei Katkov	fa7fd13cf8	[SCEV] Introduce bulk umin creation utilities Add a new umin creation method which accepts a list of operands. SCEV does not represent umin, which is required in getExact, so it transforms umin to umax with not. As a result the transformation of a tree of max to max with several operands does not work. We just use the newly introduced method for creating umin from several operands. Reviewers: sanjoy, mkazantsev Reviewed By: sanjoy Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D46047 llvm-svn: 331015	2018-04-27 03:56:53 +00:00
Matt Morehouse	1ae1febfde	Revert "[SimplifyLibcalls] Replace locked IO with unlocked IO" This reverts r331002 due to sanitizer bot breakage. llvm-svn: 331011	2018-04-27 01:48:09 +00:00
Eli Friedman	e06539456c	[LowerTypeTests] Mark .cfi.jumptable nounwind. It doesn't unwind, and the wrong marking leads to the creation of an .eh_frame section when it isn't necessary. Differential Revision: https://reviews.llvm.org/D46082 llvm-svn: 331008	2018-04-27 00:32:24 +00:00
Eli Friedman	da018e5687	[MachineOutliner] Don't outline from functions with a section marking. The program might have unusual expectations for functions; for example, the Linux kernel's build system warns if it finds references from .text to .init.data. I'm not sure this is something we actually want to make any guarantees about (there isn't any explicit rule that would disallow outlining in this case), but we might want to be conservative anyway. Differential Revision: https://reviews.llvm.org/D46091 llvm-svn: 331007	2018-04-27 00:21:34 +00:00
Sam Clegg	e0658119ba	typo llvm-svn: 331006	2018-04-27 00:17:24 +00:00
Sam Clegg	d5504a0a62	[WebAssembly] Section symbols must have local binding Summary: Also test for symbols information in test/MC/WebAssembly/debug-info.ll. Subscribers: jfb, dschuff, jgravelle-google, aheejin, sunfish, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D46160 llvm-svn: 331005	2018-04-27 00:17:21 +00:00
David Bolvansky	2c9cc9c731	[SimplifyLibcalls] Replace locked IO with unlocked IO Summary: If the file stream arg is not captured and the source is fopen, we could replace IO calls by unlocked IO ("_unlocked" function variants) to gain better speed. Reviewers: efriedma, RKSimon, spatel, sanjoy, hfinkel, majnemer Subscribers: lebedev.ri, llvm-commits Differential Revision: https://reviews.llvm.org/D45736 llvm-svn: 331002	2018-04-26 22:31:43 +00:00
Chandler Carruth	16429acacb	[x86] Revert r330322 (& r330323): Lowering x86 adds/addus/subs/subus intrinsics The LLVM commit introduces a crash in LLVM's instruction selection. I filed http://llvm.org/PR37260 with the test case. llvm-svn: 330997	2018-04-26 21:46:01 +00:00
Adrian Prantl	855b91022d	Revert "Fix a bug that prevents global variables from having a DW_OP_deref." This reverts commit r330970 while investigating bot breakage. llvm-svn: 330993	2018-04-26 20:59:58 +00:00
Sanjoy Das	6f1937b10f	[InstCombine] Simplify Add with remainder expressions as operands. Summary: Simplify integer add expression X % C0 + (( X / C0 ) % C1) * C0 to X % (C0 * C1). This is a common pattern seen in code generated by the XLA GPU backend. Add test cases for this new optimization. Patch by Bixia Zheng! Reviewers: sanjoy Reviewed By: sanjoy Subscribers: efriedma, craig.topper, lebedev.ri, llvm-commits, jlebar Differential Revision: https://reviews.llvm.org/D45976 llvm-svn: 330992	2018-04-26 20:52:28 +00:00
Roman Tereshin	38489ed416	[GlobalISel] Reporting rules covered as part of the InstructionSelect's debug-only printing The main goal of this change is to make it much easier to track which rules are actually covered by Testgen'erated regression tests. Reviewers: aemerson, dsanders Differential Revision: https://reviews.llvm.org/D46095 llvm-svn: 330988	2018-04-26 20:22:17 +00:00
Simon Atanasyan	d4d892ff9f	[mips] Accept 32-bit offsets for lb and lbu commands `lb` and `lbu` commands accept 16-bit signed offsets. But GAS accepts larger offsets for these commands. If an offset does not fit in the 16-bit range, the `lb` command is translated into a lui/lb or lui/addu/lb series. It's interesting that initially the LLVM assembler supported this feature, but later it was broken. This patch restores support for 32-bit offsets. It replaces the `mem_simm16` operand for `LB` and `LBu` definitions by the new `mem_simmptr` operand. This operand is intended to check that the offset fits in the same size as is used for pointers. Later we will be able to extend this rule and accept 64-bit offsets when it is possible. Some issues remain: - The regression also affects LD, SD, LH, LHU commands. I'm going to fix them by a separate patch. - GAS accepts any 32-bit values as an offset. Now LLVM accepts signed 16-bit values and this patch extends the range to signed 32-bit offsets. In other words, the following code is accepted by GAS but still triggers an error in LLVM: ``` lb $4, 0x80000004 # gas lui a0, 0x8000 lb a0, 4(a0) ``` - In case of 64-bit pointers GAS accepts a 64-bit offset and translates it to the li/dsll/lb series of commands. LLVM still rejects it. Probably this feature has never been implemented in LLVM. This issue is for a separate patch. ``` lb $4, 0x800000001 # gas li a0, 0x8000 dsll a0, a0, 0x14 lb a0, 4(a0) ``` Differential Revision: https://reviews.llvm.org/D45020 llvm-svn: 330983	2018-04-26 19:55:28 +00:00
Sam Clegg	6a31a0d694	[WebAssembly] Write DWARF data into wasm object file - Writes ".debug_XXX" into corresponding custom sections. - Writes relocation records into "reloc.debug_XXX" sections. Patch by Yury Delendik! Differential Revision: https://reviews.llvm.org/D44184 llvm-svn: 330982	2018-04-26 19:27:28 +00:00
Matt Arsenault	540512c297	DAG: Fix not legalizing vector fcanonicalizes If an fcanonicalize was done on a vector type that was legal, llvm-svn: 330981	2018-04-26 19:21:37 +00:00
Matt Arsenault	fcc5ba46b7	AMDGPU: Extend extract_vector_elt fneg combine to fabs Fixes a regression in a future commit. llvm-svn: 330980	2018-04-26 19:21:32 +00:00
Matt Arsenault	8474803c7c	AMDGPU: Consolidate SubtargetPredicate definitions llvm-svn: 330979	2018-04-26 19:21:26 +00:00
Geoff Berry	08ab8c9544	[AArch64] Fix scavenged spill slot base when stack realignment required. Summary: Use the FP for scavenged spill slot accesses to prevent corruption of the callee-save region when the SP is re-aligned. Based on problem and patch reported by @paulwalker-arm This is an alternative to solution proposed in D45770 Reviewers: t.p.northover, paulwalker-arm, thegameg, javed.absar Subscribers: qcolombet, mcrosier, paulwalker-arm, kristof.beyls, rengolin, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D46063 llvm-svn: 330976	2018-04-26 18:50:45 +00:00
Adrian Prantl	e42805d07c	Fix a bug that prevents global variables from having a DW_OP_deref. For local variables the first DW_OP_deref is consumed by turning the location kind into a memory location, but that only makes sense for values that are in a register to begin with, which cannot happen for global variables that are attached to a symbol. rdar://problem/39741860 llvm-svn: 330970	2018-04-26 18:17:04 +00:00
Sam Clegg	6bb5a41f99	[WebAssembly] Add version to object file metadata Summary: See https://github.com/WebAssembly/tool-conventions/issues/54 Subscribers: jfb, dschuff, jgravelle-google, aheejin, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D46069 llvm-svn: 330969	2018-04-26 18:15:32 +00:00
Haicheng Wu	b09308d82a	[GlobalMerge] Fix a typo now => know llvm-svn: 330965	2018-04-26 17:56:50 +00:00
Vlad Tsyrklevich	b768d235a9	Revert "Enable EliminateAvailableExternally pass for -O1" This reverts commit r330961 because it breaks a handful of clang tests. llvm-svn: 330964	2018-04-26 17:54:53 +00:00
Vlad Tsyrklevich	3b59a8aba0	Update stale comment in AsmWriter.cpp Summary: The old comment referred to llvm/IR/Writer.h which no longer exists. This patch replaces it with an up-to-date description of the AsmWriter library. Patch by Alex Yursha. Reviewers: gribozavr, vlad.tsyrklevich Reviewed By: vlad.tsyrklevich Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45895 llvm-svn: 330962	2018-04-26 17:34:51 +00:00
Vlad Tsyrklevich	42c5a9c29a	Enable EliminateAvailableExternally pass for -O1 Summary: Follow-up to D43690, the EliminateAvailableExternally pass currently runs under -O0 and -O2 and up. Under -O1 we would still want to drop available_externally symbols to reduce space without inlining having run. Reviewers: tejohnson Reviewed By: tejohnson Subscribers: mehdi_amini, llvm-commits, kcc Differential Revision: https://reviews.llvm.org/D46093 llvm-svn: 330961	2018-04-26 17:33:24 +00:00
Sam Clegg	f676cdd515	[WebAssembly] Implement getRelocationValueString() And use it in llvm-objdump. Differential Revision: https://reviews.llvm.org/D46092 llvm-svn: 330957	2018-04-26 16:41:51 +00:00
Mark Searles	2a19af6e17	[AMDGPU][Waitcnt] As of gfx7, VMEM operations do not increment the export counter and the input registers are available in the next instruction; update the waitcnt pass to take this into account. Differential Revision: https://reviews.llvm.org/D46067 llvm-svn: 330954	2018-04-26 16:11:19 +00:00
Simon Dardis	8086b9db3d	[mips] Correct the definitions of some control instructions Correct the definitions of ei, di, eret, deret, wait, syscall and break. Also provide microMIPS specific aliases to match the MIPS aliases. Additionally correct the definition of the wait instruction so that it is present in the instruction mapping tables. Reviewers: smaksimovic, abeserminji, atanasyan Differential Revision: https://reviews.llvm.org/D45939 llvm-svn: 330952	2018-04-26 16:06:34 +00:00
Sanjay Patel	5a90285bd9	[DAGCombiner] limit ftrunc optimizations with function attribute As noted, the attribute name is subject to change once we have the clang side implemented, but it's clear that we need some kind of attribute-based predication here based on the discussion for: rL330437 llvm-svn: 330951	2018-04-26 16:04:44 +00:00
Alex Bradbury	fda6037e98	[RISCV] Implement isLoadFromStackSlot and isStoreToStackSlot This causes some slight shuffling but no meaningful codegen differences on the corpus I used for testing, but it has a larger impact when combined with e.g. rematerialisation. Regardless, it makes sense to report as accurate target-specific information as possible. llvm-svn: 330949	2018-04-26 15:34:27 +00:00
Benjamin Kramer	7dd437710e	[NVPTX] Make the legalizer expand shufflevector of <2 x half> There's no direct instruction for this, but it's trivially implemented with two movs. Without this the code generator just dies when encountering a shufflevector. Differential Revision: https://reviews.llvm.org/D46116 llvm-svn: 330948	2018-04-26 15:26:29 +00:00
Sanjay Patel	a5da086386	[DAGCombiner] refactor FP->int->FP folds; NFC As discussed in the post-review comments for rL330437, we need to guard this fold to allow existing code to keep working with the undefined behavior that they've come to rely on. That would mean duplicating more code than we already have, so let's fix that first. llvm-svn: 330947	2018-04-26 15:20:18 +00:00
Alex Bradbury	15e894baee	[RISCV] Implement isZextFree This returns true for 8-bit and 16-bit loads, allowing LBU/LHU to be selected and avoiding unnecessary masks. llvm-svn: 330943	2018-04-26 14:04:18 +00:00
Matthew Simpson	b4096ebe26	[TTI, AArch64] Add transpose shuffle kind This patch adds a new shuffle kind useful for transposing a 2xn matrix. These transpose shuffle masks read corresponding even- or odd-numbered vector elements from two n-dimensional source vectors and write each result into consecutive elements of an n-dimensional destination vector. The transpose shuffle kind is meant to model the TRN1 and TRN2 AArch64 instructions. As such, this patch also considers transpose shuffles in the AArch64 implementation of getShuffleCost. Differential Revision: https://reviews.llvm.org/D45982 llvm-svn: 330941	2018-04-26 13:48:33 +00:00
Alex Bradbury	130b8b3f2b	[RISCV] Implement isTruncateFree Adapted from ARM's implementation introduced in r313533 and r314280. llvm-svn: 330940	2018-04-26 13:37:00 +00:00
Lama Saba	a331f91853	[X86] Fix Update Kill Register in Avoid SFB Pass - Bug 37153 Differential Revision: https://reviews.llvm.org/D45823 Change-Id: Icf6f34f6babc3cb2ff5292fde003472473037a71 llvm-svn: 330939	2018-04-26 13:16:11 +00:00
Alex Bradbury	dcbff63c24	[RISCV] Implement isLegalICmpImmediate I'm unable to construct a representative test case that demonstrates the advantage, but it seems sensible to report accurate target-specific information regardless. llvm-svn: 330938	2018-04-26 13:15:17 +00:00
Alex Bradbury	5c41ecedf8	[RISCV] Implement isLegalAddImmediate This causes a trivial improvement in the recently added lsr-legaladdimm.ll test case. llvm-svn: 330937	2018-04-26 13:00:37 +00:00
Sander de Smalen	fe17a78b86	[AArch64][SVE] Enable DiagnosticPredicates for SVE LD1 instructions. This patch extends the PredicateMethod of AsmOperands used in SVE's LD1 instructions with a DiagnosticPredicate. This makes them 'context sensitive' to the operand that has been parsed and tells the user to use the right register (with expected shift/extend), rather than telling the user the immediate is out of range when it actually parsed a register. Patch [2/2] in a series to improve assembler diagnostics for SVE: - Patch [1/2]: https://reviews.llvm.org/D45879 - Patch [2/2]: https://reviews.llvm.org/D45880 Reviewers: olista01, stoklund, craig.topper, mcrosier, rengolin, echristo, fhahn, SjoerdMeijer, evandro, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D45880 llvm-svn: 330934	2018-04-26 12:54:42 +00:00
Benjamin Kramer	bd89647229	[NVPTX] Deduplicate code. No functionality change. llvm-svn: 330933	2018-04-26 12:30:16 +00:00
Alex Bradbury	09926296df	[RISCV] Implement isLegalAddressingMode for RISC-V This has no impact on codegen for the current RISC-V unit tests or my small benchmark set and very minor changes in a few programs in the GCC torture suite. Based on this, I haven't been able to produce a representative test program that demonstrates a benefit from isLegalAddressingMode. I'm committing the patch anyway, on the basis that presenting accurate information to the target-independent code is preferable to relying on incorrect generic assumptions. llvm-svn: 330932	2018-04-26 12:13:48 +00:00
Florian Hahn	fd2bc11248	[LoopInterchange] Ignore debug intrinsics during legality checks. Reviewers: aprantl, mcrosier, karthikthecool Reviewed By: aprantl Subscribers: mattd, vsk, #debug-info, llvm-commits Differential Revision: https://reviews.llvm.org/D45379 llvm-svn: 330931	2018-04-26 10:26:17 +00:00
Sander de Smalen	74f9e6720b	[AArch64][SVE] Asm: Support for gather LD1/LDFF1 (scalar + vector) load instructions. Patch [2/3] in series to add support for SVE's gather load instructions that use scalar+vector addressing modes: - Patch [1/3]: https://reviews.llvm.org/D45951 - Patch [2/3]: https://reviews.llvm.org/D46023 - Patch [3/3]: https://reviews.llvm.org/D45958 Reviewers: fhahn, rengolin, samparker, SjoerdMeijer, t.p.northover, echristo, evandro, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D46023 llvm-svn: 330928	2018-04-26 08:19:53 +00:00
Craig Topper	bc26f3b61b	[X86] Print 'tbyte ptr' instead of 'xword ptr' for f80mem in Intel syntax. This matches objdump. llvm-svn: 330922	2018-04-26 05:07:40 +00:00
Craig Topper	b0227189fd	[X86] Remove alignment restriction on loading folding of pcmp[ei]str* during isel too. This is a follow up to the changes in r330896 which enabled folding after isel during peephole and register allocation. llvm-svn: 330897	2018-04-26 03:53:39 +00:00
Chandler Carruth	eb631ef51e	[x86] Allow folding unaligned memory operands into pcmp[ei]str* instructions. These have special permission according to the x86 manual to read unaligned memory, and this folding is done by ICC and GCC as well. This corrects one of the issues identified in PR37246. llvm-svn: 330896	2018-04-26 03:17:25 +00:00
Max Kazantsev	2c287ec9c5	Revert "[SCEV] Make computeExitLimit more simple and more powerful" This reverts commit 023c8be90980e0180766196cba86f81608b35d38. This patch triggers a miscompile of zlib on the PowerPC platform. Most likely it is caused by some pre-backend PPC-specific pass, but we don't clearly know the reason yet. So we temporarily revert this patch with the intention to return it once the problem is resolved. See bug 37229 for details. llvm-svn: 330893	2018-04-26 02:07:40 +00:00
Reid Kleckner	2c6430fe3c	[codeview] Ignore .cv_loc directives at the end of a function If no data or instructions are emitted after a location directive, we should clear the cv_loc when we change sections, or it will be emitted at the beginning of the next section. This violates our invariant that all .cv_loc directives belong to the same section. Add clearer assertions for this. llvm-svn: 330884	2018-04-25 23:34:15 +00:00
Simon Pilgrim	2faf606fb6	[CostModel][X86] Remove hard coded SDIV/UDIV vector costs Algorithmically compute the 'x20' SDIV/UDIV vector costs - this is necessary for PR36550 when DIV costs will be driven from the scheduler models. llvm-svn: 330870	2018-04-25 20:59:16 +00:00
Tom Stellard	dce46fa1cf	AMDGPU/R600: Move int_r600_store_stream_output to the public intrinsic file Summary: The TableGen'd GlobalISel instruction selector assumes all intrinsics are in the public Intrinsic:: namespace. Reviewers: jvesely, nhaehnle Reviewed By: jvesely, nhaehnle Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45989 llvm-svn: 330866	2018-04-25 20:02:53 +00:00
Mark Searles	ec58183e1b	[AMDGPU] Waitcnt pass: add debug options - Add "amdgpu-waitcnt-forcezero" to force all waitcnt instrs to be emitted as s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) - Add debug counters to control force emit of s_waitcnt instrs; debug counters: si-insert-waitcnts-forceexp: force emit s_waitcnt expcnt(0) instrs si-insert-waitcnts-forcevm: force emit s_waitcnt vmcnt(0) instrs si-insert-waitcnts-forcelgkm: force emit s_waitcnt lgkmcnt(0) instrs - Add some debug statements Note that a variant of this patch was previously committed/reverted. Differential Revision: https://reviews.llvm.org/D45888 llvm-svn: 330862	2018-04-25 19:21:26 +00:00
David Bolvansky	cb8ca5f37c	[SimplifyLibcalls] Atoi, strtol replacements Reviewers: spatel, lebedev.ri, xbolva00, efriedma Reviewed By: xbolva00, efriedma Subscribers: efriedma, llvm-commits Differential Revision: https://reviews.llvm.org/D45418 llvm-svn: 330860	2018-04-25 18:58:53 +00:00
Francis Visoiu Mistrih	57fcd3454a	[MIR] Add support for debug metadata for fixed stack objects Debug var, expr and loc were only supported for non-fixed stack objects. This patch adds the following fields to the "fixedStack:" entries, and renames the ones from "stack:" to: * debug-info-variable * debug-info-expression * debug-info-location Differential Revision: https://reviews.llvm.org/D46032 llvm-svn: 330859	2018-04-25 18:58:06 +00:00
Sam Clegg	9067b46e1b	[WebAssembly] Add Module name to WasmSymbol Imports in a wasm module can have a custom module name. This change adds the module name to the WasmSymbol structure so that the linker can preserve this module name. This is needed to fix: https://bugs.llvm.org/show_bug.cgi?id=37168 Differential Revision: https://reviews.llvm.org/D45797 llvm-svn: 330854	2018-04-25 18:24:08 +00:00
Craig Topper	300e20d61c	[X86] Form MUL_IMM for multiplies with 3/5/9 to encourage LEA formation over load folding. Previously we only formed MUL_IMM when we split a constant. This blocked load folding on those cases. We should also form MUL_IMM for 3/5/9 to favor LEA over load folding. Differential Revision: https://reviews.llvm.org/D46040 llvm-svn: 330850	2018-04-25 17:35:03 +00:00
Alex Bradbury	cd8688a4c2	[RISCV] Allow call pseudoinstruction to be used to call a function name that coincides with a register name Previously `call zero`, `call f0` etc would fail. This leads to compilation failures if building programs that define functions with those names and using -save-temps. llvm-svn: 330846	2018-04-25 17:25:29 +00:00
Taewook Oh	923c216da5	[ICP] Do not attempt type matching for variable length arguments. Summary: When performing indirect call promotion, the current implementation inspects "all" parameters of the callsite and attempts to match them with the formal argument types of the callee function. However, it is not possible to find the type for variable length arguments, and the compiler crashes when it attempts to match the type for a variable length argument. It seems that the bug was introduced with D40658. Prior to that, the type matching was performed only for the parameters whose ID is less than callee->getFunctionNumParams(). The attached test case will crash without the patch. Reviewers: mssimpso, davidxl, davide Reviewed By: mssimpso Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46026 llvm-svn: 330844	2018-04-25 17:19:21 +00:00
Nico Weber	79c6ec484e	Rename Attributes.gen, Intrinsics.gen to Attributes.inc, Intrinsics.inc Virtually all other tablegen outputs are called .inc, not .gen, so rename these two too for consistency. No behavior change. https://reviews.llvm.org/D46058 llvm-svn: 330843	2018-04-25 17:07:46 +00:00
Sanjay Patel	807ddee1bf	[InstCombine] clean up foldSelectICmpAnd(); NFC As discussed in D45862, we want to delete parts of this code because it can create more instructions than it removes. But we also want to preserve some folds that are winners, so tidy up what's here to make splitting the good from bad a bit easier. llvm-svn: 330841	2018-04-25 16:34:01 +00:00
Simon Pilgrim	58e03a09db	[CostModel][X86] Recursive call for cost of imul for packed v16i16 constant shift left. Don't just assume cost = 1. llvm-svn: 330834	2018-04-25 15:22:03 +00:00
Amara Emerson	1f5d994119	[AArch64][GlobalISel] Implement selection for the llvm.trap intrinsic. rdar://38674040 llvm-svn: 330831	2018-04-25 14:43:59 +00:00
Shiva Chen	d58bd8dc4a	[RISCV] Expand function call to "call" pseudoinstruction To do this: 1. Change GlobalAddress SDNode to TargetGlobalAddress to prevent the legalizer from splitting the symbol. 2. Change ExternalSymbol SDNode to TargetExternalSymbol to prevent the legalizer from splitting the symbol. 3. Let PseudoCALL match direct call with target operand TargetGlobalAddress and TargetExternalSymbol. Differential Revision: https://reviews.llvm.org/D44885 llvm-svn: 330827	2018-04-25 14:19:12 +00:00
Shiva Chen	98f9389f65	[RISCV] Support "call" pseudoinstruction in the MC layer To do this: 1. Add PseudoCALLIndirct to match indirect function call. 2. Add PseudoCALL to support parsing and printing the pseudo `call` in assembly 3. Expand PseudoCALL to the following form with R_RISCV_CALL relocation type while encoding: auipc ra, func jalr ra, ra, 0 If we expanded PseudoCALL before emitting assembly, we would see an auipc and jalr pair when compiling with -S. It is hard for the assembly parser to parse this pair, identify that its semantics are a function call, and then insert the R_RISCV_CALL relocation type. Although we could insert R_RISCV_PCREL_HI20 and R_RISCV_PCREL_LO12_I relocation types instead of R_RISCV_CALL, due to the RISCV relocation design an auipc and jalr pair can only be relaxed to jal with the R_RISCV_CALL + R_RISCV_RELAX relocation types. We expand PseudoCALL as late as encoding (RISCVMCCodeEmitter) instead of before emitting assembly (RISCVAsmPrinter) because we want to preserve the call pseudoinstruction in assembly code. It is more readable, and the assembly parser can identify the call assembly and insert the R_RISCV_CALL relocation type. Differential Revision: https://reviews.llvm.org/D45859 llvm-svn: 330826	2018-04-25 14:18:55 +00:00
Simon Dardis	0f2f5976d0	[mips] Teach the delay slot filler to transform 'jal' for microMIPS ISel is currently picking 'JAL' over 'JAL_MM' for calling a function when targeting microMIPS. A later patch will correct this behaviour. This patch extends the mechanism for transforming instructions into their short delay to recognise 'JAL_MM' for transforming into 'JALS_MM'. llvm-svn: 330825	2018-04-25 14:12:57 +00:00
Simon Pilgrim	dbd1ae7ddd	[X86] Split WriteFMA into XMM, Scalar and YMM/ZMM scheduler classes This removes all the FMA InstRW overrides. If we ever get PR36924, then we can remove many of these declarations from models. llvm-svn: 330820	2018-04-25 13:07:58 +00:00
Alexander Timofeev	b934728cd2	[AMDGPU] Revert b0efc4fd6 (https://reviews.llvm.org/D40556 ) llvm-svn: 330818	2018-04-25 12:32:46 +00:00
Simon Pilgrim	6a82e96ed9	[X86][SKX] Setup WriteFAdd and remove unnecessary InstRW scheduler overrides. llvm-svn: 330813	2018-04-25 10:51:19 +00:00
Simon Pilgrim	98e21c5ade	[X86][SNB] Remove unnecessary WriteFBlendLd InstRW scheduler overrides. llvm-svn: 330812	2018-04-25 10:50:39 +00:00
Simon Dardis	eac9301cdb	[mips] Fix the definition of sync, synci Also, fix the disassembly of synci for microMIPS. Reviewers: abeserminji, smaksimovic, atanasyan Differential Revision: https://reviews.llvm.org/D45870 llvm-svn: 330810	2018-04-25 10:19:22 +00:00
Florian Hahn	1da30c659d	[LoopInterchange] Use getExitBlock()/getExitingBlock instead of manual impl. This also means we have to check if the latch is the exiting block now, as `transform` expects the latches to be the exiting blocks too. https://bugs.llvm.org/show_bug.cgi?id=36586 Reviewers: efriedma, davide, karthikthecool Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D45279 llvm-svn: 330806	2018-04-25 09:35:54 +00:00
Sander de Smalen	eb896b148b	[AArch64][SVE] Asm: Add AsmOperand classes for SVE gather/scatter addressing modes. This patch adds parsing support for 'vector + shift/extend' and corresponding asm operand classes, needed for implementing SVE's gather/scatter addressing modes. The added combinations of vector (ZPR) and Shift/Extend are: Unscaled: ZPR64ExtLSL8: signed 64-bit offsets (z0.d) ZPR32ExtUXTW8: unsigned 32-bit offsets (z0.s, uxtw) ZPR32ExtSXTW8: signed 32-bit offsets (z0.s, sxtw) Unpacked and unscaled: ZPR64ExtUXTW8: unsigned 32-bit offsets (z0.d, uxtw) ZPR64ExtSXTW8: signed 32-bit offsets (z0.d, sxtw) Unpacked and scaled: ZPR64ExtUXTW<scale>: unsigned 32-bit offsets (z0.d, uxtw #<shift>) ZPR64ExtSXTW<scale>: signed 32-bit offsets (z0.d, sxtw #<shift>) Scaled: ZPR32ExtUXTW<scale>: unsigned 32-bit offsets (z0.s, uxtw #<shift>) ZPR32ExtSXTW<scale>: signed 32-bit offsets (z0.s, sxtw #<shift>) ZPR64ExtLSL<scale>: unsigned 64-bit offsets (z0.d, lsl #<shift>) ZPR64ExtLSL<scale>: signed 64-bit offsets (z0.d, lsl #<shift>) Patch [1/3] in series to add support for SVE's gather load instructions that use scalar+vector addressing modes: - Patch [1/3]: https://reviews.llvm.org/D45951 - Patch [2/3]: https://reviews.llvm.org/D46023 - Patch [3/3]: https://reviews.llvm.org/D45958 Reviewers: fhahn, rengolin, samparker, SjoerdMeijer, t.p.northover, echristo, evandro, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D45951 llvm-svn: 330805	2018-04-25 09:26:47 +00:00
Bjorn Pettersson	bec2a7c4eb	[DebugInfo] Invalidate debug info in ReassociatePass::RewriteExprTree Summary: When Reassociate is rewriting an expression tree it may reuse old binary expression nodes for new expressions. Whenever an expression node is reused, but with a non-trivial change in the result, we need to invalidate any debug info that is associated with the node. If for example rewriting x = mul a, b y = mul c, x into x = mul c, b y = mul a, x we still get the same result for 'y', but 'x' is a new expression. All debug info referring to 'x' must be invalidated (marked as optimized out) since we no longer calculate the expected value. As a side-effect this patch avoids (at least some) problems where reassociate could end up creating IR with debug-use before def. Earlier the dbg.value nodes were left untouched in the IR, while the reused binary nodes were sunk to just before the root node of the rewritten expression tree. See PR27273 for more info about such problems. Reviewers: dblaikie, aprantl, dexonsmith Reviewed By: aprantl Subscribers: JDevlieghere, llvm-commits Tags: #debug-info Differential Revision: https://reviews.llvm.org/D45975 llvm-svn: 330804	2018-04-25 09:23:56 +00:00
David Bolvansky	3ea50f9fef	Merging r46043: llvm-svn: 330799	2018-04-25 04:33:36 +00:00
Geoff Berry	2af5f3c1e5	[DivRemPairs] Fix non-determinism in use list order. Summary: Use a MapVector instead of a DenseMap for RemMap since it is iterated over and the order of iteration can affect the order in which new instructions are created. This can in turn affect the use list order of div/rem input values if multiple new instructions are created that share any input values. Reviewers: spatel Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D45858 llvm-svn: 330792	2018-04-25 02:17:56 +00:00
Chandler Carruth	69e68f8468	[PM/LoopUnswitch] Begin teaching SimpleLoopUnswitch to use the new update API for dominators rather than doing manual, hacky updates. This is just the first step, but in some ways the most important as it moves the non-trivial unswitching to update the domtree rather than fully recalculating it each time. Subsequent patches should remove the custom update logic used by the trivial unswitch and replace it with uses of the update API. This also fixes a number of bugs I was seeing when testing non-trivial unswitch due to it querying the quasi-correct dominator tree. Now the tree is 100% correct and safe to query. That said, there are still more bugs I can see with non-trivial unswitch just running over the test suite, so more bugfix patches are needed as well. Thanks to both Sanjoy and Fedor for reviews and testing! Differential Revision: https://reviews.llvm.org/D45943 llvm-svn: 330787	2018-04-25 00:18:07 +00:00
Jessica Paquette	4f56428de1	[MachineOutliner] Check for explicit uses of LR/W30 in MI operands Before, the outliner would grab ADRPs that used LR/W30. This patch fixes that by checking for explicit uses of those registers before the special-casing for ADRPs. This also adds a test that ensures that those sorts of ADRPs won't be outlined. llvm-svn: 330783	2018-04-24 22:38:15 +00:00
Craig Topper	f3cefad255	[DAGCombiner][X86] When promoting loads don't use ZEXTLOAD even if it's legal We were previously preferring ZEXTLOAD over EXTLOAD if it is legal. This triggers during X86's promotion of i16->i32. Not sure about other targets. Using ZEXTLOAD can prevent folding it to SEXTLOAD later if we were to promote a sign extended operand like we would need for SRA. However, X86 doesn't currently promote i16 SRA. I was looking into doing that which is how I found this issue. This is also blocking our ability to fold 4 byte aligned EXTLOADs with "loadi32". This is what caused most of the test changes here. Differential Revision: https://reviews.llvm.org/D45585#inline-402825 llvm-svn: 330781	2018-04-24 22:35:27 +00:00
Warren Ristow	b960d2cb40	[X86] Account for partial stack slot spills (PR30821) Previously, _any_ store or load instruction was considered to be operating on a spill if it had a frameindex as an operand, and thus was fair game for optimisations such as "StackSlotColoring". This usually works, except on architectures where spills can be partially restored, for example on X86 where a spilt vector can have a single component loaded (zeroing the rest of the target register). This can be mis-interpreted and the zero extension unsoundly eliminated, see pr30821. To avoid this, this commit optionally provides the caller to isLoadFromStackSlot and isStoreToStackSlot with the number of bytes spilt/loaded by the given instruction. Optimisations can then determine that a full spill followed by a partial load (or vice versa), for example, cannot necessarily be commuted. Patch by Jeremy Morse! Differential Revision: https://reviews.llvm.org/D44782 llvm-svn: 330778	2018-04-24 22:01:50 +00:00
Tom Stellard	a2be8f4c35	AMDGPU: Remove deprecated llvm.AMDGPU.kilp intrinsic Summary: This is no longer used by mesa since its 18.0.0 release. Reviewers: nhaehnle Reviewed By: nhaehnle Subscribers: arsenm, kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D45988 llvm-svn: 330775	2018-04-24 21:37:57 +00:00
Tom Stellard	257882ff72	AMDGPU/GlobalISel: Fall-back to SelectionDAG for non-void functions Reviewers: arsenm, nhaehnle Reviewed By: nhaehnle Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45843 llvm-svn: 330774	2018-04-24 21:29:36 +00:00
Daniel Neilson	3c148720fa	[CaptureTracking] Fixup const correctness of DomTree arg (NFC) Summary: The PointerMayBeCapturedBefore function's DomTree arg should be const instead of non-const. There are no non-const uses of it in the function. llvm-svn: 330769	2018-04-24 21:12:45 +00:00
Tom Stellard	c7709e1c29	AMDGPU/GlobalISel: Add support for amdgpu_ps calling convention Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45837 llvm-svn: 330767	2018-04-24 20:51:28 +00:00
Chandler Carruth	7e1c3345a0	[wasm] Fix uninitialized memory introduced in r330749. Found with MSan. This was causing all the WASM MC tests to fail about 10% of the time. llvm-svn: 330764	2018-04-24 20:30:56 +00:00
Simon Pilgrim	c4d25a2922	[X86][SKX] Setup WriteFMul and remove unnecessary InstRW scheduler overrides. llvm-svn: 330760	2018-04-24 19:22:01 +00:00
Simon Pilgrim	27bc83e228	[X86] Split off PHMINPOSUW to its own schedule class This also fixes Jaguar's schedule which was treating it as the WriteVecIMul default. llvm-svn: 330756	2018-04-24 18:49:25 +00:00
Stanislav Mekhanoshin	a4bfb3c446	[AMDGPU] Truncate packed inline constant If a packed inline constant is sign extended it must be truncated after the shift. I.e. a constant (0xH0000, 0xHBC00), will be represented as 0xFFFFFFFFBC000000 in the IR because the immediate is sign extended to 64 bit. After the value shifted right by 16 to use it in a low part with op_sel_hi it becomes 0xFFFFFFFFBC00 and does not qualify as inline constant any longer. Fixed the error and added verification code. Without the fix and with the verification bug is causing pk_max_f16_literal.ll to fail. Differential Revision: https://reviews.llvm.org/D45987 llvm-svn: 330752	2018-04-24 18:17:55 +00:00
Simon Pilgrim	81cb67ad82	[XOP] v4i32 IFMA 'VPMACS' instructions should use the WritePMULLD schedule class llvm-svn: 330751	2018-04-24 18:13:57 +00:00
Sam Clegg	6f08c84ae5	[WebAssembly] Use section index in relocation section header Rather than referring to sections by their code, use the absolute index of the target section within the module. See https://github.com/WebAssembly/tool-conventions/issues/52 Differential Revision: https://reviews.llvm.org/D45980 llvm-svn: 330749	2018-04-24 18:11:36 +00:00
Simon Pilgrim	cf0199a289	[AVX512] VPERMQ/VPERMPD/VPERMIL single op shuffles are not variable shuffles These variants all take an immediate shuffle mask value and should be scheduled as such. llvm-svn: 330747	2018-04-24 17:59:54 +00:00
Nico Weber	ebc7c74f2f	Let TableGen write output only if it changed, instead of doing so in cmake. Removes one subprocess and one temp file from the build for each tablegen invocation. No intended behavior change. https://reviews.llvm.org/D45899 llvm-svn: 330742	2018-04-24 17:29:05 +00:00
Simon Dardis	d2ac0faf3b	Reland "[mips] Guard traps for microMIPS correctly" This is part of fixing the instruction predicates for MIPS. Reviewers: atanasyan, abeserminji Differential Revision: https://reviews.llvm.org/D44212 This patch relands r327409, hopefully without the problematic part of the tests that cause FileCheck to assert on the windows expensive checks bot. llvm-svn: 330741	2018-04-24 17:11:37 +00:00
Diego Caballero	60f2776b2f	[LV][VPlan] Detect outer loops for explicit vectorization. Patch #2 from VPlan Outer Loop Vectorization Patch Series #1 (RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html). This patch introduces the basic infrastructure to detect, legality check and process outer loops annotated with hints for explicit vectorization. All these changes are protected under the feature flag -enable-vplan-native-path. This should make this patch NFC for the existing inner loop vectorizer. Reviewers: hfinkel, mkuper, rengolin, fhahn, aemerson, mssimpso. Differential Revision: https://reviews.llvm.org/D42447 llvm-svn: 330739	2018-04-24 17:04:17 +00:00
Florian Hahn	ceee788947	[LoopInterchange] Make isProfitableForVectorization slightly more conservative. After D43236, we started interchanging loops with empty dependence matrices. In isProfitableForVectorization, we try to determine if interchanging makes the loop dependences more friendly to the vectorizer. If there are no dependences, we should not interchange, based on that heuristic. Reviewers: efriedma, mcrosier, karthikthecool, blitz.opensource Reviewed By: mcrosier Differential Revision: https://reviews.llvm.org/D45208 llvm-svn: 330738	2018-04-24 16:55:32 +00:00
Simon Pilgrim	f0945aa0e0	[X86][F16C] Add WriteCvtF2FSt scheduling class Fixes the classification of VCVTPS2PHmr/VCVTPS2PHYmr which were tagged as WriteCvtF2FLd_WriteRMW (PR36887) llvm-svn: 330737	2018-04-24 16:43:07 +00:00
Simon Pilgrim	828ef9e013	[X86][BtVer2] Fix VCVTPS2PHmr/VCVTPS2PHYmr latencies These are stores, not loads, so don't need to account for load latency. llvm-svn: 330735	2018-04-24 16:26:51 +00:00
Simon Atanasyan	9df3be3ccb	[mips] Show an error if register number is out of range Current code does not check that a register number is in the 0-31 range. Sometimes the parser checks that later for some kinds of instructions, but that leads to unclear / incorrect error messages like that: % cat test.s .text lb $4, 8($32) % llvm-mc test.s -triple=mips64-unknown-linux test.s:2:10: error: expected memory with 16-bit signed offset lb $4, 8($32) ^ Sometimes the parser just crashes: % cat test.s .text lw $4, 8($32) % llvm-mc test.s -triple=mips64-unknown-linux This patch resolves the problem by checking that register number after '$' sign is in the 0-31 range. If the number is out of the range the parser shows the `invalid register number` error, but treats invalid register number as a normal one to continue parsing and catch other possible errors. Differential Revision: https://reviews.llvm.org/D45919 llvm-svn: 330732	2018-04-24 16:14:00 +00:00
Mark Searles	70901b9047	[AMDGPU][Waitcnt] NFC. Cleanup some code/naming consistency: - s/SWaitcnt/Waitcnt s/WaitCnt/Waitcnt llvm-svn: 330730	2018-04-24 15:59:59 +00:00
David Blaikie	ba47dd16c5	Fix some layering in AggressiveInstCombine (avoiding inclusion of Scalar.h) llvm-svn: 330726	2018-04-24 15:40:07 +00:00
Benjamin Kramer	f85f5da3b2	[LoadStoreVectorize] Ignore interleaved invariant loads. The memory location an invariant load is using can never be clobbered by any store, so it's safe to move the load ahead of the store. Differential Revision: https://reviews.llvm.org/D46011 llvm-svn: 330725	2018-04-24 15:28:47 +00:00
Simon Pilgrim	16299273d0	[X86] Remove unnecessary FMA reg-mem InstRW scheduler overrides. llvm-svn: 330720	2018-04-24 14:47:11 +00:00
Ulrich Weigand	497c70fff1	[SystemZ] Use preferred 16-byte function alignment While not necessary for correctness, it is preferable for performance reasons on all architectures we currently support to align functions to 16-byte boundaries by default. llvm-svn: 330718	2018-04-24 14:03:21 +00:00
Simon Pilgrim	f7d2a93d5f	[X86] Add vector element insertion/extraction scheduler classes Split off pinsr/pextr and extractps instructions. (Mostly) fixes PR36887. Note: It might be worth adding a WriteFInsertLd class as well in the future. Differential Revision: https://reviews.llvm.org/D45929 llvm-svn: 330714	2018-04-24 13:21:41 +00:00
Alexander Ivchenko	5717fbaf4c	[X86] Replace action Promote with Expand for operation ISD::SINT_TO_FP Summary: If attribute "use-soft-float"="true" is set then X86ISelLowering.cpp sets the 'Promote' action for the ISD::SINT_TO_FP operation on type i32. But the 'Promote' action is not appropriate in this case since the lib function __floatsidf is available for casting from signed int to float type. Thus the Expand action is more suitable here. The Expand action should be set for ISD::UINT_TO_FP for soft float as well. If function attribute "use-soft-float"="true" is set then infinite looping can happen in DAG combining: function visitSINT_TO_FP() replaces the SINT_TO_FP node with a UINT_TO_FP node, and function combineUIntToFP() does the reverse, in a cycle. The fix prevents it. Patch by vrybalov Differential Revision: https://reviews.llvm.org/D45572 llvm-svn: 330711	2018-04-24 12:57:51 +00:00
Francis Visoiu Mistrih	8ed0f741ae	[CodeGen] Print user-friendly debug locations as MI comments If available, print the file, line and column of the DebugLoc attached to the MachineInstr: MOV16mr $rbp, 1, $noreg, -112, $noreg, killed renamable $ax, debug-location !56 :: (store 2 into %ir.._value12); stepping.swift:10:17 renamable $edx = MOVZX32rm16 $rbp, 1, $noreg, -112, $noreg, debug-location !62 :: (dereferenceable load 2 from %ir.._value13); stepping.swift:10:17 Differential Revision: https://reviews.llvm.org/D45992 llvm-svn: 330709	2018-04-24 11:00:46 +00:00
Chandler Carruth	43acdb35bc	[PM/LoopUnswitch] Fix a bug in the loop block set formation of the new loop unswitch. This code incorrectly added the header to the loop block set early. As a consequence we would incorrectly conclude that a nested loop body had already been visited when the header of the outer loop was the preheader of the nested loop. In retrospect, adding the header eagerly doesn't really make sense. It seems nicer to let the cycle be formed naturally. This will catch crazy bugs in the CFG reconstruction where we can't correctly form the cycle earlier rather than later, and makes the rest of the logic just fall out. I've also added various asserts that make these issues much easier to debug. llvm-svn: 330707	2018-04-24 10:33:08 +00:00
Petar Jovanovic	e2bfcd6394	Correct dwarf unwind information in function epilogue This patch aims to provide correct dwarf unwind information in function epilogue for X86. It consists of two parts. The first part inserts CFI instructions that set appropriate cfa offset and cfa register in emitEpilogue() in X86FrameLowering. This part is X86 specific. The second part is platform independent and ensures that: * CFI instructions do not affect code generation (they are not counted as instructions when tail duplicating or tail merging) * Unwind information remains correct when a function is modified by different passes. This is done in a late pass by analyzing information about cfa offset and cfa register in BBs and inserting additional CFI directives where necessary. Added CFIInstrInserter pass: * analyzes each basic block to determine cfa offset and register are valid at its entry and exit * verifies that outgoing cfa offset and register of predecessor blocks match incoming values of their successors * inserts additional CFI directives at basic block beginning to correct the rule for calculating CFA Having CFI instructions in function epilogue can cause incorrect CFA calculation rule for some basic blocks. This can happen if, due to basic block reordering, or the existence of multiple epilogue blocks, some of the blocks have wrong cfa offset and register values set by the epilogue block above them. CFIInstrInserter is currently run only on X86, but can be used by any target that implements support for adding CFI instructions in epilogue. Patch by Violeta Vukobrat. Differential Revision: https://reviews.llvm.org/D42848 llvm-svn: 330706	2018-04-24 10:32:08 +00:00
Simon Dardis	fce722e6f8	[mips] Correct the patterns for bswap Guard the MIPS64 variant correctly for i64, mark the MIPS32 version as not in microMIPS and provide the microMIPS version. Additionally, remove a related stale XFAIL'd test as bswap has its own test case providing coverage. Reviewers: smaksimovic, abeserminji, atanasyan Differential Revision: https://reviews.llvm.org/D45816 llvm-svn: 330705	2018-04-24 10:19:29 +00:00
Andrei Elovikov	822602a75e	[CodeGen] Do not allow opt-bisect-limit to skip ScalarizeMaskedMemIntrin. Summary: The pass is supposed to scalarize such intrinsics if the target does not support them natively, so if the scalarization does not happen instruction selection crashes due to inability to lower these intrinsics. Reviewers: andrew.w.kaylor, craig.topper Reviewed By: andrew.w.kaylor Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45947 llvm-svn: 330700	2018-04-24 09:24:29 +00:00
Max Kazantsev	c54e67d6b9	[NFC] Remove recently added SE verification because it may be false-positive llvm-svn: 330699	2018-04-24 09:11:01 +00:00
Sander de Smalen	eb1053f9d3	[AArch64][SVE] Asm: Support for contiguous, first-faulting LDFF1 (scalar+scalar) load instructions. Reviewers: fhahn, rengolin, samparker, SjoerdMeijer, t.p.northover, echristo, evandro, javed.absar Reviewed By: rengolin Subscribers: tschuett, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D45946 llvm-svn: 330697	2018-04-24 08:59:08 +00:00
Xin Tong	adb5bfe75b	[LVI] Fix typo. NFC llvm-svn: 330688	2018-04-24 07:38:07 +00:00
Max Kazantsev	30dee7874d	[NFC] Use forgetTopmostLoop instead of logic duplication llvm-svn: 330683	2018-04-24 04:33:04 +00:00
Craig Topper	19b85103a3	[X86] Add a BSWAP16 instruction using the 32-bit encoding plus a 0x66 prefix. This encoding is recognized by the CPU, but the behavior is undefined. This makes the disassembler handle it correctly so we don't print bswapl with a 16-bit register. llvm-svn: 330682	2018-04-24 04:28:02 +00:00
Chandler Carruth	0ace148ca6	[PM/LoopUnswitch] Remove another over-aggressive assert. This code path can very clearly be called in a context where we have baselined all the cloned blocks to a particular loop and are trying to handle nested subloops. There is no harm in this, so just relax the assert. I've added a test case that will make sure we actually exercise this code path. llvm-svn: 330680	2018-04-24 03:27:00 +00:00
Eric Christopher	b9733d0f7c	Remove unused function HexagonEarlyIfConversion::replacePhiEdges. NFC. llvm-svn: 330678	2018-04-24 02:10:59 +00:00
Max Kazantsev	5a0a40b8cb	[NFC] Add clarification comment llvm-svn: 330677	2018-04-24 02:08:05 +00:00
Eric Christopher	24004d65a5	Reflow formatting after previous NFC commit. llvm-svn: 330676	2018-04-24 01:57:03 +00:00
Eric Christopher	29ff50454c	Change if-conditionals to else-if as they should all be mutually exclusive. No functional change intended. llvm-svn: 330675	2018-04-24 01:57:02 +00:00

112901 Commits