llvm-project

Author	SHA1	Message	Date
Johannes Doerfert	23400e618b	[Attributor] Manifest constant return values Summary: If the unique return value is a constant we now replace call uses with that constant. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66551 llvm-svn: 369785	2019-08-23 17:41:37 +00:00
Johannes Doerfert	785fad3202	[Attributor] Deal with shrinking dereferenceability in a loop Summary: If we have a loop in which the dereferenceability of a pointer decreases we did slowly decrease it iteration by iteration, leading to a timeout. With this patch we detect such circular reasoning and indicate a fixpoint early. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66558 llvm-svn: 369784	2019-08-23 17:29:23 +00:00
Nathan Huckleberry	5808077bc6	Allow Compiler.h to be included in C files and fix fallthrough warnings Summary: Since clang does not support comment style fallthrough annotations these should be switched to macros defined in Compiler.h. This requires some fixing to Compiler.h. Original patch: https://reviews.llvm.org/D66487 Reviewers: nickdesaulniers, aaron.ballman, xbolva00, rsmith Reviewed By: nickdesaulniers, aaron.ballman, rsmith Subscribers: rsmith, sfertile, ormris, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66609 llvm-svn: 369782	2019-08-23 17:25:21 +00:00
Craig Topper	e7211bb567	[SelectionDAG][X86] Enable iX SimplifyDemandedBits to vXi1 SimplifyDemandedVectorElts simplification. Add a hack to X86 to avoid a regression Patch showing the effect of enabling bool vector oversimplification. Non-VLX builds can simplify a kshift shuffle, but VLX builds simplify: insert_subvector v8i zeroinitializer, v2i --> insert_subvector v8i undef, v2i Preventing the removal of the AND to clear the upper bits of result Differential Revision: https://reviews.llvm.org/D53022 llvm-svn: 369780	2019-08-23 17:14:58 +00:00
Jeremy Morse	0ae5498146	[DebugInfo] Remove invalidated locations during LiveDebugValues LiveDebugValues gives variable locations to blocks, but it should also take away. There are various circumstances where a variable location is known until a loop backedge with a different location is detected. In those circumstances, where there's no agreement on the variable location, it should be undef / removed, otherwise we end up picking a location that's valid on some loop iterations but not others. However, LiveDebugValues doesn't currently do this, see the new testcase attached. Without this patch, the location of !3 is assumed to be %bar through the loop. Once it's added to the In-Locations list, it's never removed, even though the later dbg.value(0... of !3 makes the location un-knowable. This patch checks during block-location-joining to see whether any previously-present locations have been removed in a predecessor. If they have, the live-ins have changed, and the block needs reprocessing. Similarly, in transferTerminator, assign rather than |= the Out-Locations after processing a block, as we may have deleted some previously valid locations. This will mean that LiveDebugValues performs more propagation -- but that's necessary for it being correct. Differential Revision: https://reviews.llvm.org/D66599 llvm-svn: 369778	2019-08-23 16:33:42 +00:00
Sanjay Patel	5a5d44e801	[SLP] use range-for loops, fix formatting; NFC These are part of D57059, but that patch doesn't apply cleanly to trunk at this point, so we might as well remove some of the noise. llvm-svn: 369776	2019-08-23 16:22:32 +00:00
Cameron McInally	688f3bc240	[Reassoc] Small fix to support unary FNeg in NegateValue(...) Differential Revision: https://reviews.llvm.org/D66612 llvm-svn: 369772	2019-08-23 15:49:38 +00:00
Johannes Doerfert	2f2d7c3add	[Attributor][Fix] Deal with "growing" dereferenceability Summary: If we have a negative inbounds offset, dereferenceability "grows". However, until we handle the overflow that can occur in the dereferenceable bytes and the problem with loops, we simply do not grow the state. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66557 llvm-svn: 369771	2019-08-23 15:45:46 +00:00
Johannes Doerfert	deb9ea3a8c	[Attributor][NFCI] Avoid lookups when resolving returned values If the number of potentially returned values did not change since the last traversal, we do not need to visit the returned values again. This works as we only add values to the returned values set now. Differential Revision: https://reviews.llvm.org/D66484 llvm-svn: 369770	2019-08-23 15:42:19 +00:00
Sanjay Patel	9182467886	[SLP] fix formatting; NFC These are part of D57059, but that patch doesn't apply cleanly to trunk at this point, so we might as well remove some of the noise. llvm-svn: 369769	2019-08-23 15:26:12 +00:00
Johannes Doerfert	9543f1498c	[Attributor] FIX: Treat new attributes as changed ones Summary: When we have new attributes and we end the fixpoint iteration because the iteration limit is reached, we need to treat the new ones as if they changed in the last iteration, as they might have. This adds a test for which we should not derive anything regardless of the iteration limit, e.g., if we abort there should not be any attributes manifested in the IR. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66549 llvm-svn: 369768	2019-08-23 15:24:57 +00:00
Johannes Doerfert	695089ecfb	[Attributor][NFCI] Try to avoid potential non-deterministic behavior This commit replaces sets with set vectors in an effort to make the behavior of the Attributor deterministic. llvm-svn: 369767	2019-08-23 15:23:49 +00:00
Teresa Johnson	ea314fd476	[ThinLTO] Fix handling of weak interposable symbols Summary: Keep aliasees alive if their alias is live, otherwise we end up with an alias to a declaration, which is invalid. This can happen when the aliasee is weak and non-prevailing. This fix exposed the fact that we were then attempting to internalize the weak symbol, which was not exported as it was not prevailing. We should not internalize interposable symbols in general, unless this is the prevailing copy, since it can lead to incorrect inlining and other optimizations. Most of the changes in this patch are due to the restructuring required to pass down the prevailing callback. Finally, while implementing the test cases, I found that in the case of a weak aliasee that is still marked not live because its alias isn't live, after dropping the definition we incorrectly marked the declaration with weak linkage when resolving prevailing symbols in the module. This was due to some special case handling for symbols marked WeakLinkage in the summary located before instead of after a subsequent check for the symbol being a declaration. It turns out that we don't actually need this special case handling any more (looking back at the history, when that was added the code was structured quite differently) - we will correctly mark with weak linkage further below when the definition hasn't been dropped. Fixes PR42542. Reviewers: pcc Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66264 llvm-svn: 369766	2019-08-23 15:18:58 +00:00
Johannes Doerfert	a5b10b464e	[MustExec] Add a generic "must-be-executed-context" explorer Given an instruction I, the MustBeExecutedContextExplorer allows easy traversal of instructions that are guaranteed to be executed whenever I is. For now, these instructions have to be statically "after" I, in the same or different basic blocks. This patch also adds a pass which prints the must-be-executed-context for each instruction in a module. It is used to test the MustBeExecutedContextExplorer, for now on the examples given in the class comment of the MustBeExecutedIterator. Differential Revision: https://reviews.llvm.org/D65186 llvm-svn: 369765	2019-08-23 15:17:27 +00:00
Simon Atanasyan	5f7d6ac7bf	[mips] Reduce number of instructions used for loading a global symbol's value Now `lw/sw $reg, sym+offset` pseudo instructions for global symbol `sym` are lowered into the following three instructions. ``` lw $reg, %got(symbol)($gp) addiu $reg, $reg, offset lw/sw $reg, 0($reg) ``` It's possible to reduce the number of instructions by taking the offset into account in the final `lw/sw` command. This patch implements that optimization. ``` lw $reg, %got(symbol)($gp) lw/sw $reg, offset($reg) ``` Differential Revision: https://reviews.llvm.org/D66553 llvm-svn: 369756	2019-08-23 13:36:24 +00:00
Simon Atanasyan	58492b1895	[mips] Do not include offset into `%got` expression for global symbols Now pseudo instruction `la $6, symbol+8($6)` is expanding into the following chain of commands: ``` lw $1, %got(symbol+8)($gp) addiu $1, $1, 8 addu $6, $1, $6 ``` This is incorrect. When a linker handles the `R_MIPS_GOT16` relocation, it does not expect to get any addend and breaks on assertion. Otherwise it has to create new GOT entry for each unique "sym + offset" pair. Offset for a global symbol should be added to result of loading GOT entry by a separate `add` command. The patch fixes the problem by stripping off an offset from the expression passed to the `%got`. That's interesting that even current code inserts a separate `add` command. Differential Revision: https://reviews.llvm.org/D66552 llvm-svn: 369755	2019-08-23 13:36:14 +00:00
Simon Pilgrim	c88408cf85	Use VT::getHalfNumVectorElementsVT helpers in a few places. NFCI. llvm-svn: 369751	2019-08-23 12:37:09 +00:00
Andrea Di Biagio	8e9af64da6	[X86][BtVer2] Add a read-advance to every implicit register use of CMPXCHG8B/16B. This is a follow up of r369642. This patch assigns a ReadAfterLd to every implicit register use of instruction CMPXCHG8B and instruction CMPXCHG16B. Perf micro-benchmarks show that implicit registers are read after 3cy from the start of execution. llvm-svn: 369750	2019-08-23 12:19:45 +00:00
Andrea Di Biagio	1630f64e2f	[X86][BtVer2] Fix latency of ALU RMW instructions. Excluding ADC/SBB and the bit-test instructions (BTR/BTS/BTC), the observed latency of all other RMW integer arithmetic/logic instructions is 6cy and not 5cy. Example (ADD): ``` addb $0, (%rsp) # Latency: 6cy addb $7, (%rsp) # Latency: 6cy addb %sil, (%rsp) # Latency: 6cy addw $0, (%rsp) # Latency: 6cy addw $511, (%rsp) # Latency: 6cy addw %si, (%rsp) # Latency: 6cy addl $0, (%rsp) # Latency: 6cy addl $511, (%rsp) # Latency: 6cy addl %esi, (%rsp) # Latency: 6cy addq $0, (%rsp) # Latency: 6cy addq $511, (%rsp) # Latency: 6cy addq %rsi, (%rsp) # Latency: 6cy ``` The same latency profile applies to SUB/AND/OR/XOR/INC/DEC. The observed latency of ADC/SBB is 7-8cy. So we need a different write to model those. Latency of BTS/BTR/BTC is not fixed by this patch (they are much slower than what the model for btver2 currently reports). Differential Revision: https://reviews.llvm.org/D66636 llvm-svn: 369748	2019-08-23 11:34:10 +00:00
Martin Storsjo	8dbdb1c2a2	[llvm-dlltool] Make sure to strip decorations from ExtName for renamed exports ExtName should not be decorated, just like Name. This avoids double decoration on symbols in import libraries that use = for renaming functions. (Weak aliases, which use ==, worked fine with respect to decoration.) Differential Revision: https://reviews.llvm.org/D66617 llvm-svn: 369747	2019-08-23 11:18:11 +00:00
Simon Pilgrim	04906ef1f2	[DAGCombine] GetNegatedExpression - add FMA\FMAD support If the accumulator and either of the multiply operands are negatable then we can negate the entire expression. Differential Revision: https://reviews.llvm.org/D63141 llvm-svn: 369746	2019-08-23 10:49:46 +00:00
Jay Foad	eac23862a8	[AMDGPU] gfx10 atomic optimizer changes. Summary: Add support for gfx10, where all DPP operations are confined to work within a single row of 16 lanes, and wave32. Reviewers: arsenm, sheredom, critson, rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, t-tye, hiraditya, jfb, dstuttard, tpr, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65644 llvm-svn: 369745	2019-08-23 10:07:43 +00:00
George Rimar	668b11b2c8	[yaml2obj] - Allow setting the symbol st_other field to any integer. st_other field of a symbol usually contains its visibility. Other bits are usually 0, though some targets, like MIPS can set them using the named bit field values. Problem is that there is no way to set an arbitrary value now, though that might be useful for our test cases. In this patch I introduced a way to set st_other to any numeric value using the new StOther field. I added a test and simplified the existent one to show the effect/benefit Differential revision: https://reviews.llvm.org/D66583 llvm-svn: 369742	2019-08-23 09:31:07 +00:00
Craig Topper	4deb388bca	[X86] Make combineLoopSADPattern use CONCAT_VECTORS instead of INSERT_SUBVECTORS for widening with zeros. CONCAT_VECTORS is more canonical for the early DAG combine runs until we start getting into the op legalization phases. llvm-svn: 369734	2019-08-23 06:08:33 +00:00
Craig Topper	bdceb9fb14	[X86] Improve lowering of v2i32 SAD handling in combineLoopSADPattern. For v2i32 we only feed 2 i8 elements into the psadbw instructions with 0s in the other 14 bytes. The resulting psadbw instruction will produce zeros in bits [127:16] of the output. We need to take the result and feed it to a v2i32 add where the first element includes bits [15:0] of the sad result. The other element should be zero. Prior to this patch we were using a truncate to take 0 from bits 95:64 of the psadbw. This results in a pshufd to move those bits to 63:32. But since we also have zeroes in bits 63:32 of the psadbw output, we should just take those bits. The previous code probably worked better with promoting legalization, but now we use widening legalization. I've preserved the old behavior if -x86-experimental-vector-widening-legalization=false until we get that option removed. llvm-svn: 369733	2019-08-23 05:33:27 +00:00
Philip Reames	2a52583d67	[IndVars] Fix a bug noticed by inspection We were computing the loop exit value, but not ensuring the addrec belonged to the loop whose exit value we were computing. I couldn't actually trip this; the test case shows the basic setup which might trip this, but none of the variations I've tried actually do. llvm-svn: 369730	2019-08-23 04:03:23 +00:00
Fangrui Song	3fc933af8b	[AlignmentFromAssumptions] getNewAlignmentDiff(): use getURemExpr() The alignment is calculated incorrectly, thus sometimes it doesn't generate aligned mov instructions, as shown by the example below: ``` // b.cc typedef long long index; extern "C" index g_tid; extern "C" index g_num; void add3(float* __restrict__ a, float* __restrict__ b, float* __restrict__ c) { index n = 64*1024; index m = 16*1024; index k = 4*1024; index tid = g_tid; index num = g_num; __builtin_assume_aligned(a, 32); __builtin_assume_aligned(b, 32); __builtin_assume_aligned(c, 32); for (index i0=tid*k; i0<m; i0+=num*k) for (index i1=0; i1<n*m; i1+=m) for (index i2=0; i2<k; i2++) c[i1+i0+i2] = b[i0+i2] + a[i1+i0+i2]; } ``` Compile with `clang b.cc -Ofast -march=skylake -mavx2 -S` ``` vmovaps -224(%rdi,%rbx,4), %ymm0 vmovups -192(%rdi,%rbx,4), %ymm1 # should be movaps vmovups -160(%rdi,%rbx,4), %ymm2 # should be movaps vmovups -128(%rdi,%rbx,4), %ymm3 # should be movaps vaddps -224(%rsi,%rbx,4), %ymm0, %ymm0 vaddps -192(%rsi,%rbx,4), %ymm1, %ymm1 vaddps -160(%rsi,%rbx,4), %ymm2, %ymm2 vaddps -128(%rsi,%rbx,4), %ymm3, %ymm3 vmovaps %ymm0, -224(%rdx,%rbx,4) vmovups %ymm1, -192(%rdx,%rbx,4) # should be movaps vmovups %ymm2, -160(%rdx,%rbx,4) # should be movaps vmovups %ymm3, -128(%rdx,%rbx,4) # should be movaps ``` Differential Revision: https://reviews.llvm.org/D66575 Patch by Dun Liang llvm-svn: 369723	2019-08-23 02:17:04 +00:00
Peter Collingbourne	21a1814417	hwasan: Untag unwound stack frames by wrapping personality functions. One problem with untagging memory in landing pads is that it only works correctly if the function that catches the exception is instrumented. If the function is uninstrumented, we have no opportunity to untag the memory. To address this, replace landing pad instrumentation with personality function wrapping. Each function with an instrumented stack has its personality function replaced with a wrapper provided by the runtime. Functions that did not have a personality function to begin with also get wrappers if they may be unwound past. As the unwinder calls personality functions during stack unwinding, the original personality function is called and the function's stack frame is untagged by the wrapper if the personality function instructs the unwinder to keep unwinding. If unwinding stops at a landing pad, the function is still responsible for untagging its stack frame if it resumes unwinding. The old landing pad mechanism is preserved for compatibility with old runtimes. Differential Revision: https://reviews.llvm.org/D66377 llvm-svn: 369721	2019-08-23 01:28:44 +00:00
Sam Clegg	90b6bb75e8	[MC] Minor cleanup to MCFixup::Kind handling. NFC. Prefer `MCFixupKind` where possible and add getTargetKind() to convert to `unsigned` when needed rather than scattering cast operators around the place. Differential Revision: https://reviews.llvm.org/D59890 llvm-svn: 369720	2019-08-23 01:00:55 +00:00
Peter Collingbourne	2452d7030b	IR. Change strip* family of functions to not look through aliases. I noticed another instance of the issue where references to aliases were being replaced with aliasees, this time in InstCombine. In the instance that I saw it turned out to be only a QoI issue (a symbol ended up being missing from the symbol table due to the last reference to the alias being removed, preventing HWASAN from symbolizing a global reference), but it could easily have manifested as incorrect behaviour. Since this is the third such issue encountered (previously: D65118, D65314) it seems to be time to address this common error/QoI issue once and for all and make the strip* family of functions not look through aliases. Includes a test for the specific issue that I saw, but no doubt there are other similar bugs fixed here. As with D65118 this has been tested to make sure that the optimization isn't load bearing. I built Clang, Chromium for Linux, Android and Windows as well as the test-suite and there were no size regressions. Differential Revision: https://reviews.llvm.org/D66606 llvm-svn: 369697	2019-08-22 19:56:14 +00:00
Benjamin Kramer	b3a991df3c	Fight a bit against global initializers. NFC. llvm-svn: 369695	2019-08-22 19:43:27 +00:00
Matt Arsenault	fba82858f2	GlobalISel: Don't create G_UADDE with constant false carry in The x86 tests are now broken (in particular add-scalar.ll now hits the DAG fallback) due to not handling G_UADDO. The DAG x86 backend has a custom lowering for this, so that will need to be implemented. llvm-svn: 369673	2019-08-22 17:29:17 +00:00
Francis Visoiu Mistrih	5b5ee61b5f	[MachO][TLOF] Use hasLocalLinkage to determine if indirect symbol is local Local symbols in the indirect symbol table contain the value `INDIRECT_SYMBOL_LOCAL` and the corresponding __pointers entry must contain the address of the target. In r349060, I added support for local symbols in the indirect symbol table, which was checking if the symbol `isDefined` && `!isExternal` to determine if the symbol is local or not. It turns out that `isDefined` will return false if the user of the symbol comes before its definition, and we'll again generate .long 0 which will be the symbol at the address 0x0. Instead of doing that, use GlobalValue::hasLocalLinkage() to check if the symbol is local. Differential Revision: https://reviews.llvm.org/D66563 llvm-svn: 369671	2019-08-22 16:59:00 +00:00
Craig Topper	898a0e9b84	[X86] Remove MCInstLower code that drops operands from some CALL and TAILJMP instructions. Add asserts to verify operand count It appears the FIXME here was handled at some point. r159728 from 2012 seems to be at least a portion of fixing it. Differential Revision: https://reviews.llvm.org/D66570 llvm-svn: 369665	2019-08-22 16:23:35 +00:00
Guozhi Wei	51f48295cb	[MBP] Disable aggressive loop rotate in plain mode Patch https://reviews.llvm.org/D43256 introduced more aggressive loop layout optimization which depends on profile information. If profile information is not available, the statically estimated profile information(generated by BranchProbabilityInfo.cpp) is used. If user program doesn't behave as BranchProbabilityInfo.cpp expected, the layout may be worse. To be conservative this patch restores the original layout algorithm in plain mode. But user can still try the aggressive layout optimization with -force-precise-rotation-cost=true. Differential Revision: https://reviews.llvm.org/D65673 llvm-svn: 369664	2019-08-22 16:21:32 +00:00
Amaury Sechet	95cf66de7c	[DAGCombiner] Remove explicit call to AddToWorklist in sqrt and reciprocal computations Summary: These nodes end up being processed regardless due to DAGCombiner ensuring arguments are processed. This changes the order in which nodes are processed, which fixes an issue on PowerPC. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri, mcberg2017, stefanp, hfinkel Subscribers: nemanjai, MaskRay, jsji, steven.zhang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66548 llvm-svn: 369662	2019-08-22 15:35:45 +00:00
Andrea Di Biagio	c9649eb9da	[X86][BtVer2] Fix latency/throughput of scalar integer MUL instructions. Single operand MUL instructions that implicitly set EAX have the following latency/throughput profile (see below): imul %cl # latency: 3cy - uOPs: 1 - 1 JMul imul %cx # latency: 3cy - uOPs: 3 - 3 JMul imul %ecx # latency: 3cy - uOPs: 2 - 2 JMul imul %rcx # latency: 6cy - uOPs: 2 - 4 JMul mul %cl # latency: 3cy - uOPs: 1 - 1 JMul mul %cx # latency: 3cy - uOPs: 3 - 3 JMul mul %ecx # latency: 3cy - uOPs: 2 - 2 JMul mul %rcx # latency: 6cy - uOPs: 2 - 4 JMul Excluding the 64bit variant, which has a latency of 6cy, every other instruction has a latency of 3cy. However, the number of decoded macro-opcodes (as well as the resource cycles) depends on the MUL size. The two operand MULs have a more predictable profile (see below): imul %dx, %dx # latency: 3cy - uOPs: 1 - 1 JMul imul %edx, %edx # latency: 3cy - uOPs: 1 - 1 JMul imul %rdx, %rdx # latency: 6cy - uOPs: 1 - 4 JMul imul $3, %dx, %dx # latency: 4cy - uOPs: 2 - 2 JMul imul $3, %ecx, %ecx # latency: 3cy - uOPs: 1 - 1 JMul imul $3, %rdx, %rdx # latency: 6cy - uOPs: 1 - 4 JMul This patch updates the values in the Jaguar scheduling model and regenerates llvm-mca tests. Differential Revision: https://reviews.llvm.org/D66547 llvm-svn: 369661	2019-08-22 15:20:16 +00:00
Sean Fertile	5f85a7b1cf	[PowerPC] Add combined ELF ABI and 32/64 bit queries to the subtarget. [NFC] A lot of places in the code combine checks for both ABI (SVR4/Darwin/AIX) and addressing mode (64-bit vs 32-bit). In an attempt to make some of the code more readable I've added a couple functions that combine checking for the ELF abi and 64-bit/32-bit code at once. As we add more AIX support I intend to add similar functions for the AIX ABI. Differential Revision: https://reviews.llvm.org/D65814 llvm-svn: 369658	2019-08-22 15:11:28 +00:00
Sean Fertile	18fd1b0b49	[PowerPC][XCOFF][MC] Explicitly set containing csect on symbols. [NFC] Previously we would get the csect a symbol was contained in through its fragment. This works only if we are writing an object file, and only for defined symbols. To fix this we set the containing csect explicitly on the MCSymbolXCOFF object. Differential Revision: https://reviews.llvm.org/D66032 llvm-svn: 369657	2019-08-22 15:11:23 +00:00
Hideto Ueno	70576cac52	[Attributor][NFC] Move DerefState to header and use StateWrapper Summary: In D65402, I want to get DerefState from AADereferenceable but it was not allowed. This patch moves DerefState definition into Attributor.h and makes AADereferenceable inherit StateWrapper. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66585 llvm-svn: 369653	2019-08-22 14:18:29 +00:00
Jinsong Ji	545e993b8b	[SlotIndexes] Add print-slotindexes to disable printing slotindexes Summary: When we print the IR with --print-after/before-*, SlotIndexes will be printed whenever available (We haven't freed it). This introduces some noises when we try to compare the IR among different optimizations. eg: -print-before=machine-cp will print SlotIndexes for 1st machine-cp pass, but NOT for 2nd machine-cp; -print-after=machine-cp will NOT print SlotIndexes for both machine-cp passes. So SlotIndexes in 1st pass introduce noises when differing these IRs. This patch introduces an option to hide indexes. Reviewers: stoklund, thegameg, qcolombet Reviewed By: thegameg Subscribers: hiraditya, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66500 llvm-svn: 369650	2019-08-22 13:44:47 +00:00
Andrea Di Biagio	589cb004de	[MCA] consistently use MCPhysReg instead of unsigned as register type. NFCI llvm-svn: 369648	2019-08-22 13:32:17 +00:00
George Rimar	91208447d0	[yaml2obj] - Lookup relocation symbols in dynamic symbol when .dynsym referenced. This fixes https://bugs.llvm.org/show_bug.cgi?id=40337. Previously, it was always assumed that relocations referenced symbols in the static symbol table. Now, if the Link field references a section called ".dynsym" it will look up these symbols in the dynamic symbol table. This patch is heavily based on D59097 by James Henderson Differential revision: https://reviews.llvm.org/D66532 llvm-svn: 369645	2019-08-22 12:39:56 +00:00
Andrea Di Biagio	c6744055ad	[X86][BtVer2] Fix latency and throughput of XCHG and XADD. On Jaguar, XCHG has a latency of 1cy and decodes to 2 macro-opcodes. Maximum throughput for XCHG is 1 IPC. The byte exchange has worse latency and decodes to 1 extra uOP; maximum observed throughput is 0.5 IPC. ``` xchgb %cl, %dl # Latency: 2cy - uOPs: 3 - 2 ALU xchgw %cx, %dx # Latency: 1cy - uOPs: 2 - 2 ALU xchgl %ecx, %edx # Latency: 1cy - uOPs: 2 - 2 ALU xchgq %rcx, %rdx # Latency: 1cy - uOPs: 2 - 2 ALU ``` The reg-mem forms of XCHG are atomic operations with an observed latency of 16cy. The resource usage is similar to the XCHGrr variants. The biggest difference is obviously the bus-locking, which prevents the LS to issue other memory uOPs in parallel until the unlocking store uOP is executed. ``` xchgb %cl, (%rsp) # Latency: 16cy - uOPs: 3 - ECX latency: 11cy xchgw %cx, (%rsp) # Latency: 16cy - uOPs: 3 - ECX latency: 11cy xchgl %ecx, (%rsp) # Latency: 16cy - uOPs: 3 - ECX latency: 11cy xchgq %rcx, (%rsp) # Latency: 16cy - uOPs: 3 - ECX latency: 11cy ``` The exchanged in/out register operand becomes available after 11cy from the start of execution. Added test xchg.s to verify that we correctly see that register write committed in 11cy (and not 16cy). Reg-reg XADD instructions have the same latency/throughput than the byte exchange (register-register variant). ``` xaddb %cl, %dl # latency: 2cy - uOPs: 3 - 3 ALU xaddw %cx, %dx # latency: 2cy - uOPs: 3 - 3 ALU xaddl %ecx, %edx # latency: 2cy - uOPs: 3 - 3 ALU xaddq %rcx, %rdx # latency: 2cy - uOPs: 3 - 3 ALU ``` The non-atomic RM variants have a latency of 11cy, and decode to 4 macro-opcodes. They still consume 2 ALU pipes, and the exchange in/out register operand becomes available in 3cy (it matches the 'load-to-use latency'). 
``` xaddb %cl, (%rsp) # latency: 11cy - uOPs: 4 - 3 ALU xaddw %cx, (%rsp) # latency: 11cy - uOPs: 4 - 3 ALU xaddl %ecx, (%rsp) # latency: 11cy - uOPs: 4 - 3 ALU xaddq %rcx, (%rsp) # latency: 11cy - uOPs: 4 - 3 ALU ``` The atomic XADD variants execute in 16cy. The in/out register operand is available after 11cy from the start of execution. ``` lock xaddb %cl, (%rsp) # latency: 16cy - uOPs: 4 - 3 ALU -- ECX latency: 11cy lock xaddw %cx, (%rsp) # latency: 16cy - uOPs: 4 - 3 ALU -- ECX latency: 11cy lock xaddl %ecx, (%rsp) # latency: 16cy - uOPs: 4 - 3 ALU -- ECX latency: 11cy lock xaddq %rcx, (%rsp) # latency: 16cy - uOPs: 4 - 3 ALU -- ECX latency: 11cy ``` Added test xadd.s to verify those latencies as well as read-advance values. Differential Revision: https://reviews.llvm.org/D66535 llvm-svn: 369642	2019-08-22 11:32:47 +00:00
Simon Pilgrim	6dd51c2f19	[MVT] Add MVT equivalent to EVT::getHalfNumVectorElementsVT() helper. NFCI. Allows for some cleanup in a lot of SSE/AVX vector splitting code llvm-svn: 369640	2019-08-22 11:14:30 +00:00
Sam Tebbs	a69d9d6156	Reapply: [ARM] Fix lsrl with a 128/256 bit shift amount or a shift of 32 The CodeGen/Thumb2/mve-vaddv.ll test needed to be amended to reflect the changes from the above patch. This reverts commit `cd53ff6`, reapplying `7c6b229`. llvm-svn: 369638	2019-08-22 10:29:20 +00:00
Serguei Katkov	036e636aa7	[Loop Peeling] Fix silly bug in metadata update. We must update loop metadata before we move to the parent loop, if it is present. llvm-svn: 369637	2019-08-22 10:06:46 +00:00
Hans Wennborg	cd53ff6c0d	Revert r369626 "[ARM] Fix lsrl with a 128/256 bit shift amount or a shift of 32" It broke the bots, see e.g. http://lab.llvm.org:8011/builders/clang-cuda-build/builds/36275/ > This patch fixes shifts by a 128/256 bit shift amount. It also fixes > codegen for shifts of 32 by delegating to LLVM's default optimisation > instead of emitting a long shift. > > Tests that used to generate long shifts of 32 are updated to check for the > more optimised codegen. > > Differential revision: https://reviews.llvm.org/D66519 > > llvm-svn: 369626 llvm-svn: 369636	2019-08-22 09:16:53 +00:00
Craig Topper	d420616313	[X86] Lower the cost of v2i32->v2f64 sint_to_fp under vector widening legalization. I don't really understand the costs we're using for fp_to_sint, but prior to widening legalization we used 20 as the cost for this via the v2i64->v2f64 entry. That number seems better than the 40 we got with widening legalization. So now we need either a v2i32->v2f64 entry or a v4i32->v2f64 entry depending on whether AVX is enabled or not since we skip the first SSE2 table look up under AVX. llvm-svn: 369628	2019-08-22 08:18:45 +00:00
Pavel Labath	1b30ea2c50	[Support] Improve readNativeFile(Slice) interface Summary: There was a subtle, but pretty important difference between the Slice and regular versions of this function. The Slice function was zero-initializing the rest of the buffer when the read syscall returned less bytes than expected, while the regular function did not. This patch removes the inconsistency by making both functions not zero-initialize the buffer. The zeroing code is moved to the MemoryBuffer class, which is currently the only user of this code. This makes the API more consistent, and the code shorter. While in there, I also refactor the functions to return the number of bytes through the regular return value (via Expected<size_t>) instead of a separate by-ref argument. Reviewers: aganea, rnk Subscribers: kristina, Bigcheese, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66471 llvm-svn: 369627	2019-08-22 08:13:30 +00:00
Sam Tebbs	7c6b229204	[ARM] Fix lsrl with a 128/256 bit shift amount or a shift of 32 This patch fixes shifts by a 128/256 bit shift amount. It also fixes codegen for shifts of 32 by delegating to LLVM's default optimisation instead of emitting a long shift. Tests that used to generate long shifts of 32 are updated to check for the more optimised codegen. Differential revision: https://reviews.llvm.org/D66519 llvm-svn: 369626	2019-08-22 08:12:06 +00:00
Shiva Chen	72a41e7b0d	[TargetLowering] Remove optional arguments passing to makeLibCall The patch introduces MakeLibCallOptions struct as suggested by @efriedma on D65497. The struct contains argument flags which will be passed to the makeLibCall function. The patch should not have any functionality changes. Differential Revision: https://reviews.llvm.org/D65795 llvm-svn: 369622	2019-08-22 04:59:43 +00:00
Pengfei Wang	7630e24492	[X86] Making X86OptimizeLEAs pass public. NFC Reviewers: wxiao3, LuoYuanke, andrew.w.kaylor, craig.topper, annita.zhang, liutianle, pengfei, xiangzhangllvm, RKSimon, spatel, andreadb Reviewed By: RKSimon Subscribers: andreadb, hiraditya, llvm-commits Tags: #llvm Patch by Gen Pei (gpei) Differential Revision: https://reviews.llvm.org/D65933 llvm-svn: 369612	2019-08-22 02:29:27 +00:00
Fangrui Song	246750c2a9	[COFF] Fix section name for constants larger than 64 bits on Windows APIntToHexString returns the wrong value ("0000000000000000ffffffffffffffff") for integers larger than 64 bits, and thus TargetLoweringObjectFileCOFF::getSectionForConstant returns the same section name for all numbers larger than 64 bits. This patch tries to fix it. Differential Revision: https://reviews.llvm.org/D66458 Patch by Senran Zhang llvm-svn: 369610	2019-08-22 01:48:34 +00:00
Cyndy Ishida	9443d0e2c0	[Object] FIX: update PlatformKind name in TapiFile Buildbots that use GCC failed to compile because a namespace was overwritten by a variable name llvm-svn: 369602	2019-08-21 23:57:57 +00:00
Cyndy Ishida	c20d1f90b5	[Object] Add tapi files to object Summary: The intention for this is to allow reading and printing symbols out from llvm-nm. Tapi file, and Tapi universal follow a similar format to their respective MachO Object format. The tests are dependent on llvm-nm processing tbd files which is why it's in D66160 Reviewers: ributzka, steven_wu, lhames Reviewed By: ributzka, lhames Subscribers: mgorny, hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66159 llvm-svn: 369600	2019-08-21 23:30:53 +00:00
Craig Topper	78e6507b0a	[X86] Correct the scheduler classes for TAILJMP and TCRETURN CodeGenOnly instructions. We had an odd combination of WriteJump applied to some memory instructions and WriteJumpLd applied to register and immediate instructions. This should hopefully assign them all correctly. llvm-svn: 369599	2019-08-21 23:17:52 +00:00
Craig Topper	303bbc3be2	[X86] Replace a couple hardcoded '5's with X86::AddrNumOperands for readability. NFC llvm-svn: 369598	2019-08-21 22:40:07 +00:00
Luis Marques	f7cdff4ffd	[RISCV] Remove fix introduced by r369573, superseded by r369580 llvm-svn: 369590	2019-08-21 22:02:56 +00:00
Johannes Doerfert	d98f975089	[Attributor] Fix: Gracefully handle non-instruction users Function can have users that are not instructions, e.g., bitcasts. For now, we simply give up when we see them. llvm-svn: 369588	2019-08-21 21:48:56 +00:00
Greg Clayton	bf9ee07afa	Add FileWriter to GSYM and encode/decode functions to AddressRange and AddressRanges The full GSYM patch started with: https://reviews.llvm.org/D53379 This patch adds the ability to encode data using the new llvm::gsym::FileWriter class. FileWriter is a simplified binary data writer class that doesn't require targets, target definitions, architectures, or require any other optional compile time libraries to be enabled via the build process. This class needs the ability to seek to different spots in the binary data that it produces to fix up offsets and sizes in GSYM data. It currently uses std::ostream over llvm::raw_ostream because llvm::raw_ostream doesn't support seeking which is required when encoding and decoding GSYM data. AddressRange objects are encoded and decoded to be relative to a base address. This will be the FunctionInfo's start address if the AddressRange is directly contained in a FunctionInfo, or a base address of the containing parent AddressRange or AddressRanges. This allows address ranges to be efficiently encoded using ULEB128 encodings as we encode the offset and size of each range instead of full addresses. This also makes encoded addresses easy to relocate as we just need to relocate one base address. Differential Revision: https://reviews.llvm.org/D63828 llvm-svn: 369587	2019-08-21 21:48:11 +00:00
Luis Marques	4f488b594a	[RISCV] Fix use of side-effects in asserts in decoder functions llvm-svn: 369580	2019-08-21 21:11:37 +00:00
Cyndy Ishida	359840a6e4	[BinaryFormat] Teach identify_magic about Tapi files. Summary: Tapi files are YAML files that start with the !tapi tag. The only exception is TBD v1 files, which don't have a tag. In that case we have to scan a little further and check if the first key "archs" exists. This is the first patch in a series of patches to add libObject support for text-based dynamic library (.tbd) files. This patch is practically exactly the same as D37820, that was never pushed to master, and is needed for future commits related to reading tbd files for llvm-nm Reviewers: ributzka, steven_wu, bollu, espindola, jfb, shafik, jdoerfert Reviewed By: steven_wu Subscribers: dexonsmith, llvm-commits Tags: #llvm, #clang, #sanitizers, #lldb, #libc, #openmp Differential Revision: https://reviews.llvm.org/D66149 llvm-svn: 369579	2019-08-21 21:00:16 +00:00
Johannes Doerfert	5427aa843b	[Attributor][NFC] Fix copy & paste error llvm-svn: 369577	2019-08-21 20:57:20 +00:00
Johannes Doerfert	2db8528fb4	[Attributor][NFC] Remove leftover semicolon llvm-svn: 369576	2019-08-21 20:56:56 +00:00
Johannes Doerfert	d410805d57	[Attributor] Use existing unreachable instead of introducing new ones So far we split the unreachable off and placed a new one, this is not necessary. llvm-svn: 369575	2019-08-21 20:56:41 +00:00
Richard Smith	b73cd33625	Fix -Werror=unused-variable error after r369528. llvm-svn: 369573	2019-08-21 20:42:37 +00:00
Florian Hahn	b5e52bfd83	[GVN] Do PHI translations across all edges between the load and the unavailable pred. Currently we do not properly translate addresses with PHIs if LoadBB != LI->getParent(), because PHITranslateAddr expects a direct predecessor as argument, because it considers all instructions outside of the current block as not requiring translation. The number of cases that trigger this should be very low, as most single predecessor blocks should be folded into their predecessor by GVN before we actually start with value numbering. It is still not guaranteed to happen, so we should do PHI translation along all edges between the loads' block and the predecessor where we have to place a load. There are a few test cases showing current limits of the PHI translation, which could be improved later. Reviewers: spatel, reames, efriedma, john.brawn Reviewed By: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D65020 llvm-svn: 369570	2019-08-21 20:06:50 +00:00
Aaron Ballman	6a29ff1754	Revert r369549 as it broke the bots. http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/13605/ llvm-svn: 369569	2019-08-21 20:00:41 +00:00
Nico Weber	ed18e70c86	Revert r367389 (and follow-up r368404); it caused PR43073. llvm-svn: 369567	2019-08-21 19:53:42 +00:00
Sam Clegg	dde8a25a4b	[WebAssembly] Handle aliases in WebAssemblyFixFunctionBitcasts Fixes: https://github.com/emscripten-core/emscripten/issues/8770 Differential Revision: https://reviews.llvm.org/D66508 llvm-svn: 369566	2019-08-21 19:52:33 +00:00
Craig Topper	3f59bfd5be	[MVT] Add v16f16 and v32f16 vectors. I might look at improving PR43065 which will require being able to mark a 256 and 512 bit vector of f16 as Legal. Differential Revision: https://reviews.llvm.org/D66515 llvm-svn: 369565	2019-08-21 19:14:48 +00:00
Simon Atanasyan	159f621c5c	[mips] Replace call `expandLoadAddress` by `loadAndAddSymbolAddress`. NFC In case of expanding `lw/sw $reg, symbol($reg)` instruction for PIC it's enough to call the `loadAndAddSymbolAddress` method. Additional work performed by the `expandLoadAddress` is not required here. llvm-svn: 369563	2019-08-21 18:54:51 +00:00
Simon Atanasyan	bb2f857247	[mips] Remove duplicated case from the `StringSwitch`. NFC llvm-svn: 369562	2019-08-21 18:54:41 +00:00
Amaury Sechet	c0f190a048	[DAGCombiner] Remove mostly redundant calls to AddToWorklist Summary: These calls change the order in which some nodes are processed and so have an effect on codegen. The change in fixup-bw-copy.ll is due to (and (load anyext)) gets transformed into (load zext) while previously the and was removed by SimplifyDemandedBits, so the (load anyext) remained. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66543 llvm-svn: 369561	2019-08-21 18:51:08 +00:00
Florian Hahn	969b3e6a8f	[BitcodeReader] Check if we can create a null constant for type. We cannot create null constants for certain types, e.g. VoidTy, FunctionTy or LabelTy. getNullValue asserts if we pass in an unsupported type. We should also check for opaque types, but I'm not sure how. This fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14795. Reviewers: t.p.northover, jfb, vsk Reviewed By: vsk Tags: #llvm Differential Revision: https://reviews.llvm.org/D65897 llvm-svn: 369557	2019-08-21 18:20:11 +00:00
Nathan Huckleberry	01a413695c	Fix -Wimplicit-fallthrough warnings in regcomp.c Summary: Since clang does not support comment style fallthrough annotations these should be switched. Reviewers: aaron.ballman, nickdesaulniers, xbolva00 Reviewed By: aaron.ballman, nickdesaulniers, xbolva00 Subscribers: xbolva00, nickdesaulniers, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66487 llvm-svn: 369549	2019-08-21 17:07:43 +00:00
Alina Sbirlea	7425179fee	[LoopPassManager + MemorySSA] Only enable use of MemorySSA for LPMs known to preserve it. Summary: Add a flag to the FunctionToLoopAdaptor that allows enabling MemorySSA only for the loop pass managers that are known to preserve it. If an LPM is known to have only loop transforms that all preserve MemorySSA, then use MemorySSA if `EnableMSSALoopDependency` is set. If an LPM has loop passes that do not preserve MemorySSA, then the flag passed is `false`, regardless of the value of `EnableMSSALoopDependency`. When using a custom loop pass pipeline via `passes=...`, use keyword `loop` vs `loop-mssa` to use MemorySSA in that LPM. If a loop that does not preserve MemorySSA is added while using the `loop-mssa` keyword, that's an error. Add the new `loop-mssa` keyword to a few tests where a difference occurs when enabling MemorySSA. Reviewers: chandlerc Subscribers: mehdi_amini, Prazek, george.burgess.iv, sanjoy.google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66376 llvm-svn: 369548	2019-08-21 17:00:57 +00:00
Matt Arsenault	954a012b4c	GlobalISel: Implement moreElementsVector for G_UNMERGE_VALUES sources This is necessary for handling <3 x s16> on AMDGPU, assuming this should be handled as 2 separate legalization actions. The alternative would be for fewerElementsVector to handle 3->2. llvm-svn: 369547	2019-08-21 16:59:10 +00:00
David Green	717feabdf0	[ARM] Formatting for ARMInstrMVE.td. NFC This is just some formatting cleanup, prior to the masked load and store patch in D66534. llvm-svn: 369545	2019-08-21 16:20:35 +00:00
Philip Reames	764b0fd5a3	[instcombine] icmp eq/ne (sub C, Y), C -> icmp eq/ne Y, 0 Noticed while looking at pr43028. llvm-svn: 369541	2019-08-21 15:51:57 +00:00
Nilanjana Basu	ac3851c434	Improving CodeView debug info type record's inline comments llvm-svn: 369533	2019-08-21 15:19:58 +00:00
Alexander Timofeev	78347c979e	[AMDGPU] Prevent VGPR copies from moving across the EXEC mask definitions Differential Revision: https://reviews.llvm.org/D63731 Reviewers: qcolombet, rampitec llvm-svn: 369532	2019-08-21 15:15:04 +00:00
Guillaume Chatelet	1c18a9cb9e	[LLVM][Alignment] Introduce Alignment In MachineFrameInfo Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: jfb Subscribers: hiraditya, dexonsmith, llvm-commits, courbet Tags: #llvm Differential Revision: https://reviews.llvm.org/D65800 llvm-svn: 369531	2019-08-21 14:29:30 +00:00
Igor Kudrin	ed413074f2	[DWARF] Adjust return type of DWARFUnit::getLength(). DWARFUnitHeader::getLength() returns uint64_t. DWARFUnit::getLength() should do the same. Differential Revision: https://reviews.llvm.org/D66472 llvm-svn: 369529	2019-08-21 14:10:57 +00:00
Luis Marques	c3bf3d14ea	[RISCV] Add support for RVC HINT instructions The hint instructions are enabled by default (if the standard C extension is enabled). To disable them pass -mattr=-rvc-hints. Differential Revision: https://reviews.llvm.org/D62592 llvm-svn: 369528	2019-08-21 14:00:58 +00:00
Amaury Sechet	045f33aec9	[DAGCombiner] Various nits. NFC llvm-svn: 369520	2019-08-21 12:01:37 +00:00
Sanjay Patel	e728259278	[InstCombine] narrow icmp with extended operands of different widths An intermediate extend is used to widen the narrow operand to the width of the other (wider) operand. At that point, we have the same logic as the existing transform that was restricted to folds of equal width zext/sext. This mostly solves PR42700: https://bugs.llvm.org/show_bug.cgi?id=42700 llvm-svn: 369519	2019-08-21 11:56:08 +00:00
Pavel Labath	82275ec51d	MinidumpYAML: move serialization code to MinidumpEmitter.cpp Summary: The code for serializing minidumps was living in MinidumpYAML.cpp so that it would be accessible from unit tests. While this had its advantages, it was also unfortunate because it broke symmetry with all other yaml2obj serializers. Fortunately, nowadays all of yaml2obj is a library, so we don't need to do anything special. This patch improves the code consistency by moving the serialization code to MinidumpEmitter.cpp to match the style used in other backends. It also removes the writeAsBinary entry point in favor of the more general convertYAML interface. This patch is just massaging the code a bit. There shouldn't be any functional change here. Reviewers: jhenderson, abrachet Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66474 llvm-svn: 369517	2019-08-21 11:30:48 +00:00
Petar Avramovic	7f581df649	[MIPS GlobalISel] NarrowScalar G_ZEXTLOAD and G_SEXTLOAD NarrowScalar G_ZEXTLOAD and G_SEXTLOAD to s32 for MIPS32. Differential Revision: https://reviews.llvm.org/D66205 llvm-svn: 369512	2019-08-21 09:43:20 +00:00
Petar Avramovic	e406aa791c	[MIPS GlobalISel] NarrowScalar G_ZEXT and G_SEXT NarrowScalar G_ZEXT and G_SEXT to s32 for MIPS32. Differential Revision: https://reviews.llvm.org/D66204 llvm-svn: 369511	2019-08-21 09:35:02 +00:00
Petar Avramovic	61bf2675b9	[MIPS GlobalISel] Consider type1 when legalizing shifts after r351882 r351882 allows a different type for the shift amount than for the result and the value being shifted. Fix MIPS Legalizer rules to take r351882 into account. Differential Revision: https://reviews.llvm.org/D66203 llvm-svn: 369510	2019-08-21 09:31:29 +00:00
Petar Avramovic	5b4c5c2c54	[MIPS GlobalISel] NarrowScalar G_TRUNC Add NarrowScalar for G_TRUNC when NarrowTy is half the size of source. NarrowScalar G_TRUNC to s32 for MIPS32. Differential Revision: https://reviews.llvm.org/D66202 llvm-svn: 369509	2019-08-21 09:26:39 +00:00
Jeremy Morse	67443c3c6e	[DebugInfo] Avoid dropping location info across block boundaries LiveDebugValues propagates variable locations between blocks by creating new DBG_VALUE insts in the successors, then interpreting them when it passes back through the block at a later time. However, this flushes out any extra information about the location that LiveDebugValues holds: for example, connections between variable locations such as discussed in D65368. And as reported in PR42772 this causes us to lose track of the fact that a spill-location is actually a spill, not a register location. This patch fixes that by deferring the creation of propagated DBG_VALUEs until after propagation has completed: instead location propagation occurs only by sharing location ID numbers between blocks. Differential Revision: https://reviews.llvm.org/D66412 llvm-svn: 369508	2019-08-21 09:22:31 +00:00
Luke Cheeseman	71d38b3c62	[AArch64] Update MTE system register encodings The encodings for the system registers TFSRE0_EL1, TFSR_EL1 TFSR_EL2, TFSR_EL3 and TFSR_EL12 have been changed so that they consistently have CRn=5 and CRm=6 as per https://developer.arm.com/docs/ddi0487/latest. Differential Revision: https://reviews.llvm.org/D65442 llvm-svn: 369505	2019-08-21 09:09:56 +00:00
Serge Guelton	d1262a6e91	Be explicit about Windows coff name trailing character policy It's okay to not copy the trailing zero of a windows section/symbol name. This is compatible with strncpy behavior but gcc doesn't know that and throws an invalid warning. Encode this behavior in a proper function. Differential Revision: https://reviews.llvm.org/D66420 llvm-svn: 369501	2019-08-21 07:54:42 +00:00
Vitaly Buka	5d84a67ce0	Fix 'fall through' annotation llvm-svn: 369490	2019-08-21 04:05:34 +00:00
Amara Emerson	56606a4db3	[AArch64][GlobalISel] Add support for narrowScalar of G_ZEXT We do this by merging the source with the high bits set to 0. Differential Revision: https://reviews.llvm.org/D66181 llvm-svn: 369480	2019-08-21 00:12:37 +00:00
Sean Fertile	9467734a1c	Fix assert in XCOFFObjectWriter related to program code csects. Removed code that added program code csects to a collection as part of addressing review comments, but I failed to update an assert affected by the change before committing. llvm-svn: 369471	2019-08-20 23:24:47 +00:00
Stefan Stipanovic	26121ae4d0	[Attributor] Liveness for internal functions. For an internal function, if all its call sites are dead, the body of the function is considered dead. Reviewers: jdoerfert, uenoku Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D66155 llvm-svn: 369470	2019-08-20 23:16:57 +00:00
Alexandre Ganea	21e9603030	[Sanitizer] Remove unused functions Differential Revision: https://reviews.llvm.org/D66503 llvm-svn: 369468	2019-08-20 22:56:40 +00:00
Daniel Sanders	a16bd4f9f2	[RISCV GlobalISel] Adding initial GlobalISel infrastructure Summary: Add an initial GlobalISel skeleton for RISCV. It can only run ir translator for `ret void`. Patch by Andrew Wei Reviewers: asb, sabuasal, apazos, lenary, simoncook, lewis-revill, edward-jones, rogfer01, xiangzhai, rovka, Petar.Avramovic, mgorny, dsanders Reviewed By: dsanders Subscribers: pzheng, s.egerton, dsanders, hiraditya, rbar, johnrusso, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, psnobl, benna, Jim, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65219 llvm-svn: 369467	2019-08-20 22:53:24 +00:00
Alina Sbirlea	2863721f05	[MemorySSA] Make Phi cleanups consistent. Summary: Make Phi cleanups consistent: remove self as a trivial Phi and recurse to potentially remove other trivial phis. Reviewers: george.burgess.iv Subscribers: Prazek, sanjoy.google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66454 llvm-svn: 369466	2019-08-20 22:47:58 +00:00
Jessica Paquette	e6c299b983	[AArch64][GlobalISel] Select logical_imm32 and logical_imm64 patterns Add a GlobalISel equivalent for the logical_imm32_XFORM and logical_imm64_XFORM SDNodeXForms in AArch64InstrFormats.td. - Add select-logical-imm.mir, which contains tests for each imported pattern. - Update select-pr32733.mir and select-scalar-shift-imm.mir, since they now select instructions of this form. Differential Revision: https://reviews.llvm.org/D66162 llvm-svn: 369465	2019-08-20 22:31:25 +00:00
Alina Sbirlea	1c528e8f1b	[MemorySSA] Fix existing phis when inserting defs. Summary: When inserting a new Def, and inserting Phis in the IDF when needed, also mark the already existing Phis in the IDF as non-optimized, since these may need fixing as well. In the test attached, there is a Phi in the IDF that happens to be trivial, and is wrongfully removed by the call to getLastDef that follows. This is a valid situation and the existing IDF Phis need to be marked as "may need fixing" as well. Resolves PR43044. Reviewers: george.burgess.iv Subscribers: Prazek, sanjoy.google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66495 llvm-svn: 369464	2019-08-20 22:29:06 +00:00
Sean Fertile	89463fcfc7	Remove assert with tautological compare from XCOFFObjectWriter. Remove assert of 'Sec->getCSectType() <= 0x07u' added in r369454, since it's always true. llvm-svn: 369462	2019-08-20 22:23:34 +00:00
Jessica Paquette	9a95e79b1b	[AArch64][GlobalISel] Select patterns which use shifted register operands This adds GlobalISel equivalents for the following from AArch64InstrFormats: - arith_shifted_reg32 - arith_shifted_reg64 And partial support for - logical_shifted_reg32 - logical_shifted_reg64 The only thing missing for the logical cases is support for rotates. Other than the missing support, the transformation is identical for the arithmetic shifted register and the logical shifted register. Lots of tests here: - Add select-arith-shifted-reg.mir to show that we correctly select add and sub instructions which use this pattern. - Add select-logical-shifted-reg.mir to cover patterns which are not shared between the arithmetic and logical cases. - Update addsub-shifted.ll to show that we correctly fold shifts into adds/subs. - Update eon.ll to show that we can select the eon instruction by folding xors. Differential Revision: https://reviews.llvm.org/D66163 llvm-svn: 369460	2019-08-20 22:18:06 +00:00
Craig Topper	ba375263e8	[DAGCombiner][X86] Teach visitCONCAT_VECTORS to combine (concat_vectors (concat_vectors X, Y), undef)) -> (concat_vectors X, Y, undef, undef) I also had to add a new combine to X86's combineExtractSubvector to prevent a regression. This helps our vXi1 code see the full concat operation and allow it to optimize undef to a zero if there is already a zero in the concat. This helped us use a movzx instead of an AND in some of the tests. In those tests, one concat comes from SelectionDAGBuilder and the second comes from type legalization of v4i1->i4 bitcasts which uses an additional concat. Though these changes weren't my original motivation. I'm looking at making X86ISelLowering's narrowShuffle emit a concat_vectors instead of an insert_subvector since concat_vectors is more canonical during early DAG combine. This patch helps prevent a regression from my experiments with that. Differential Revision: https://reviews.llvm.org/D66456 llvm-svn: 369459	2019-08-20 22:12:50 +00:00
Reid Kleckner	22fb734907	Revert [WinEH] Allocate space in funclets stack to save XMM CSRs This reverts r367088 (git commit `9ad565f70e`) And the follow up fix r368631 / `e9865b9b31` llvm-svn: 369457	2019-08-20 22:08:57 +00:00
Sean Fertile	1e46d4cec5	Adds support for writing the .bss section for XCOFF object files. Adds Wrapper classes for MCSymbol and MCSection into the XCOFF target object writer. Also adds a class to represent the top-level sections, which we materialize in the ObjectWriter. executePostLayoutBinding will map all csects into the appropriate container depending on its storage mapping class, and map all symbols into their containing csect. Once all symbols have been processed we - Assign addresses and symbol table indices. - Calculate section sizes. - Build the section header table. - Assign the sections raw-pointer value for non-virtual sections. Since the .bss section is virtual, writing the header table is enough to add support. Writing of a section's raw data, or of any relocations is not included in this patch. Testing is done by dumping the section header table, but it needs to be extended to include dumping the symbol table once readobj support for dumping auxiliary entries lands. Differential Revision: https://reviews.llvm.org/D65159 llvm-svn: 369454	2019-08-20 22:03:18 +00:00
Michael Liao	a99086dbdd	[Attributor] Remove unused variable. NFC. llvm-svn: 369444	2019-08-20 21:02:31 +00:00
Wenlei He	5adace352d	[AutoFDO] Make call targets order deterministic for sample profile Summary: StringMap is used for storing call target to frequency map for AutoFDO. However the iteration order of StringMap is non-deterministic, which leads to non-determinism in AutoFDO profile output. New APIs getSortedCallTargets and SortCallTargets are now added for deterministic ordering and output. Roundtrip test for text profile and binary profile is added. Reviewers: wmi, davidxl, danielcdh Subscribers: hiraditya, mgrang, llvm-commits, twoh Tags: #llvm Differential Revision: https://reviews.llvm.org/D66191 llvm-svn: 369440	2019-08-20 20:52:00 +00:00
Craig Topper	3a2b08e6c9	[X86] Add a DAG combine to transform (i8 (bitcast (v8i1 (extract_subvector (v16i1 X), 0)))) -> (i8 (trunc (i16 (bitcast (v16i1 X))))) on KNL target Without AVX512DQ we don't have KMOVB so we can't really copy 8-bits of a k-register to a GPR. We have to copy 16 bits instead. We do this even if the DAG copy is from v8i1->v16i1. If we detect the (i8 (bitcast (v8i1 (extract_subvector (v16i1 X), 0)))) we should rewrite the types to match the copy we do support. By doing this, we can help known bits to propagate without losing the upper 8 bits of the input to the extract_subvector. This allows some zero extends to be removed since we have an isel pattern to use kmovw for (zero_extend (i16 (bitcast (v16i1 X))). Differential Revision: https://reviews.llvm.org/D66489 llvm-svn: 369434	2019-08-20 20:20:04 +00:00
Craig Topper	250951abf5	[X86] Add isel patterns for (i64 (zext (i8 (bitcast (v16i1 X))))) to use a KMOVW and a SUBREG_TO_REG. Similar for i8 and anyextend. We already had patterns for extending to i32 to take advantage of the implicit zeroing of the upper bits of a 32-bit GPR that is done by KMOVW/KMOVB. But the extend might be all the way to i64, in which case the existing patterns would fail and we'd get a KMOVW/B followed by a MOVZX. By adding patterns for i64 we can use the fact that KMOVW/B zero the upper bits of the 32-bit GPR and the normal property that 32-bit GPR writes implicitly zero the upper 32-bits of the full 64-bit GPR. The anyextend patterns are slightly different since we don't care about the upper zeros. For the i8->i64 I think this avoids selecting the anyextend as a MOVZX to prevent a partial register issue that doesn't exist. For i16->i64 I think we would have just emitted an insert_subreg on top of the extract_subreg that the vXi16->i16 bitcast pattern emits. The register coalescer or peephole pass should combine those, but this saves that work and makes i8/16 consistent. llvm-svn: 369431	2019-08-20 19:43:48 +00:00
Martin Storsjo	514f3a122d	[TargetMachine] Don't try to create COFFSTUB references on windows on non-COFF This avoids spurious relocation types for windows/elf targets. Differential Revision: https://reviews.llvm.org/D66401 llvm-svn: 369426	2019-08-20 18:58:05 +00:00
Sam Clegg	cf2b8722d4	[WebAssembly][lld] Fix crash when applying relocations to debug sections Debug sections are special in that they can contain relocations against symbols that are not present in the final output (i.e. not live). However it is also possible to have R_WASM_TABLE_INDEX relocations against symbols that don't have a table index assigned (since they are not address taken by actual code). Fixes: https://github.com/emscripten-core/emscripten/issues/9023 Differential Revision: https://reviews.llvm.org/D66435 llvm-svn: 369423	2019-08-20 18:39:24 +00:00
Sanjay Patel	292b1087f4	[InstCombine] add helper function for icmp+zext/sext; NFC llvm-svn: 369421	2019-08-20 18:15:17 +00:00
Matt Arsenault	4b7fc85c0b	Revert "AMDGPU: Fix iterator error when lowering SI_END_CF" This reverts r367500 and r369203. This is causing various test failures. llvm-svn: 369417	2019-08-20 17:45:25 +00:00
Andrea Di Biagio	2e897a94f5	[X86][BtVer2] Use ReadAfterLd entries for the register operands of CMPXCHG. This is a follow-up of r369365. llvm-svn: 369412	2019-08-20 17:05:56 +00:00
Sanjay Patel	2e68e4d60e	[InstCombine] make fold for icmp with sext more efficient; NFC We were creating 2 instructions and relying on a subsequent fold to invert a not(icmp). Create the final icmp directly instead. llvm-svn: 369411	2019-08-20 17:03:22 +00:00
Craig Topper	22ac9f396f	[X86] Use isNullConstant instead of getConstantOperandVal == 0. NFC llvm-svn: 369410	2019-08-20 16:55:12 +00:00
Sam Tebbs	dcfc2d40d3	[ARM] Select vaddva This patch adds vaddva selection. Differential revision: https://reviews.llvm.org/D66410 llvm-svn: 369404	2019-08-20 16:33:34 +00:00
Aditya Nandakumar	08bd080872	[GlobalISel] Handle multiple registers in dbg.value intrinsic https://reviews.llvm.org/D66077 The value passed into dbg.value may relate to multiple registers, each of which needs a DBG_VALUE. This fix calls MIRBuilder.buildDirectDbgValue for each register. Without this, IR passed in from flang-compiler/flang may fail an assertion in getOrCreateVReg. Patch by : peterwaller-arm. llvm-svn: 369403	2019-08-20 16:28:37 +00:00
Thomas Raoux	be699bf389	[CodeGen] Add a pass to do block predication on SSA machine IR. For targets requiring aggressive scheduling and/or software pipelining we need to apply predication before preRA scheduling. This adds a pass re-using the early if-cvt infrastructure but generating predicated instructions instead of speculatively executing instructions. It allows doing if conversion on blocks containing instructions with side-effects. The pass re-uses the target hook from postRA if-conversion to let the target decide on the heuristic to apply. Differential Revision: https://reviews.llvm.org/D66190 llvm-svn: 369395	2019-08-20 15:54:59 +00:00
Sanjay Patel	a90ee0eeb6	[InstCombine] improve readability for icmp with cast folds; NFC 1. Update function name and stale code comments. 2. Use variable names that are less ambiguous. 3. Move operand checks into the function as early exits. llvm-svn: 369390	2019-08-20 14:56:44 +00:00
Jinsong Ji	cda334ba54	[BlockExtractor] Avoid assert with wrong line format Summary: When the line format is wrong, we may end up accessing out of bound memory. E.g., the test with an invalid line will trigger the assert: Assertion `idx < size()' failed The fix is to report a fatal error when we find a mismatched line format. Reviewers: qcolombet, volkan Reviewed By: qcolombet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66444 llvm-svn: 369389	2019-08-20 14:46:02 +00:00
Andrea Di Biagio	16111d3795	[X86][BtVer2] Fix latency and throughput of atomic INC/DEC/NEG/NOT. Latency and throughput of LOCK INC/DEC/NEG/NOT is always 19cy. Number of uOPs is still 1. Differential Revision: https://reviews.llvm.org/D66469 llvm-svn: 369388	2019-08-20 14:31:27 +00:00
Sanjay Patel	f99d254aae	[InstCombine] simplify min/max of min/max with same operands (PR35607) This is the original integer variant requested in: https://bugs.llvm.org/show_bug.cgi?id=35607 As noted in the TODO and several similar TODOs around this block, we could do this in instsimplify, but then it would cost more because we would be trying to match min/max via ValueTracking in 2 different places. There are 4 commuted variants for each of smin/smax/umin/umax that are not matched here. There are also icmp predicate variants that are not included in the affected test file because they are already handled by instsimplify by folding the final icmp to true/false. https://rise4fun.com/Alive/3KVc Name: smax(smax, smin) %c1 = icmp slt i32 %x, %y %c2 = icmp slt i32 %y, %x %min = select i1 %c1, i32 %x, i32 %y %max = select i1 %c2, i32 %x, i32 %y %c3 = icmp sgt i32 %max, %min %r = select i1 %c3, i32 %max, i32 %min => %r = %max Name: smin(smax, smin) %c1 = icmp slt i32 %x, %y %c2 = icmp slt i32 %y, %x %min = select i1 %c1, i32 %x, i32 %y %max = select i1 %c2, i32 %x, i32 %y %c3 = icmp sgt i32 %max, %min %r = select i1 %c3, i32 %min, i32 %max => %r = %min Name: umax(umax, umin) %c1 = icmp ult i32 %x, %y %c2 = icmp ult i32 %y, %x %min = select i1 %c1, i32 %x, i32 %y %max = select i1 %c2, i32 %x, i32 %y %c3 = icmp ult i32 %min, %max %r = select i1 %c3, i32 %max, i32 %min => %r = %max Name: umin(umax, umin) %c1 = icmp ult i32 %x, %y %c2 = icmp ult i32 %y, %x %min = select i1 %c1, i32 %x, i32 %y %max = select i1 %c2, i32 %x, i32 %y %c3 = icmp ult i32 %min, %max %r = select i1 %c3, i32 %min, i32 %max => %r = %min llvm-svn: 369386	2019-08-20 13:39:17 +00:00
Igor Kudrin	59d5abaa71	[DWARF] Fix reading 64-bit DWARF type units. The type_offset field is 8 bytes long in DWARF64. The patch extends TypeOffset to uint64_t and fixes its reading. The patch also fixes checking of TypeOffset bounds as it was inaccurate in DWARF64 case. Differential Revision: https://reviews.llvm.org/D66465 llvm-svn: 369378	2019-08-20 12:52:32 +00:00
Alex Bradbury	7cb3cd34e8	[RISCV] Implement getExprForFDESymbol to ensure RISCV_32_PCREL is used for the FDE location Follow binutils in using RISCV_32_PCREL for the FDE initial location. As explained in the relevant binutils commit <`a6cbf936e3`>, the ADD/SUB pair of relocations is problematic in the presence of linker relaxation. This patch has the same end goal as D64715 but includes test changes and avoids adding a new global VariantKind to MCExpr.h (preferring RISCVMCExpr VKs like the rest of the RISC-V backend). Differential Revision: https://reviews.llvm.org/D66419 llvm-svn: 369375	2019-08-20 12:32:31 +00:00
Pavel Labath	51d7398f63	Recommit "MemoryBuffer: Add a missing error-check to getOpenFileImpl" This recommits r368977, which was reverted in r369027 due to test failures in lldb. The cause of this was different behavior of readNativeFileSlice on windows and unix. These have been addressed in r369269. The original commit message was: In case the function was called with a desired read size and the file was not an "mmap()" candidate, the function was falling back to a "pread()", but it was failing to check the result of that system call. This meant that the function would return "success" even though the read operation failed, and it returned a buffer full of uninitialized memory. Reviewers: rnk, dblaikie Subscribers: kristina, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66224 llvm-svn: 369370	2019-08-20 12:08:52 +00:00
Andrea Di Biagio	b1bdd97a26	[X86][Btver2] Fix latency and throughput of CMPXCHG instructions. On Jaguar, CMPXCHG has a latency of 11cy, and a maximum throughput of 0.33 IPC. Throughput is capped at 0.33 because of the implicit in/out dependency on register EAX. In the case of repeated non-atomic CMPXCHG with the same memory location, store-to-load forwarding occurs and values for subsequent loads are quickly forwarded from the store buffer. Interestingly, the functionality in LLVM that computes the reciprocal throughput doesn't seem to know about RMW instructions. That functionality only looks at the "consumed resource cycles" for the throughput computation. It should be fixed/improved by a future patch. In particular, for RMW instructions, that logic should also take into account the write latency of in/out register operands. An atomic CMPXCHG has a latency of ~17cy. Throughput is also limited to ~17cy/inst due to cache locking, which prevents other memory uOPs from starting to execute before the "lock releasing" store uOP. CMPXCHG8rr and CMPXCHG8rm are treated specially because they decode to one less macro opcode. Their latency tends to be the same as the other RR/RM variants. RR variants are relatively fast at 3cy (but still microcoded - 5 macro opcodes). CMPXCHG8B is 11cy and unfortunately doesn't seem to benefit from store-to-load forwarding. That means throughput is clearly limited by the in/out dependency on GPR registers. The uOP composition is sadly unknown (due to the lack of PMCs for the Integer pipes). I have reused the same mix of consumed resources from the other CMPXCHG instructions for CMPXCHG8B too. LOCK CMPXCHG8B is instead 18 cycles. CMPXCHG16B is 32 cycles, and up to 38 cycles when the LOCK prefix is specified. Due to the in/out dependencies, throughput is limited to 1 instruction every 32 (or 38) cycles depending on whether the LOCK prefix is specified or not. I wouldn't be surprised if the microcode for CMPXCHG16B is similar to 2x microcode from CMPXCHG8B. So, I have speculatively set the JALU01 consumption to 2x the resource cycles used for CMPXCHG8B. The two new hasLockPrefix() functions are used by the btver2 scheduling model to check if a MCInst/MachineInst has a LOCK prefix. Calls to hasLockPrefix() have been encoded in predicates of variant scheduling classes that describe lat/thr of CMPXCHG. Differential Revision: https://reviews.llvm.org/D66424 llvm-svn: 369365	2019-08-20 10:23:55 +00:00
Seiya Nuta	522377494b	[yaml2obj/obj2yaml][MachO] Allow setting custom section data Reviewers: alexshap, jhenderson, rupprecht Reviewed By: alexshap, jhenderson Subscribers: abrachet, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65799 llvm-svn: 369348	2019-08-20 08:49:07 +00:00
Fangrui Song	2682340cdf	[MC] Delete an overload of MCExpr::evaluateKnownAbsolute and its associated hack The hack dated back to 2010 (r121076) and was documented by r122144: // FIXME: The use if InSet = Addrs is a hack. Setting InSet causes us // absolutize differences across sections and that is what the MachO writer // uses Addrs for. llvm-svn: 369337	2019-08-20 07:42:04 +00:00
Fangrui Song	f182617352	[Attributor] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after r369331 llvm-svn: 369334	2019-08-20 07:21:43 +00:00
Craig Topper	1ada137854	[X86] Add back the -x86-experimental-vector-widening-legalization command line flag and all associated code, but leave it enabled by default Google is reporting performance issues with the new default behavior and has asked for a way to switch back to the old behavior while we investigate and make fixes. I've restored all of the code that had since been removed and added additional checks of the command flag onto code paths that are not otherwise guarded by a check of getTypeAction. I've also modified the cost model tables to hopefully get us back to the previous costs. Hopefully we won't need to support this for very long since we have no test coverage of the old behavior so we can very easily break it. llvm-svn: 369332	2019-08-20 06:58:00 +00:00
Johannes Doerfert	12cbbab9d9	[Attributor] Create abstract attributes on-demand Before, we created the set of abstract attributes initially and then dealt with the fact that a lookup could fail, e.g., return a nullptr. This patch will ensure we always return a valid object from a lookup, allowing us not only to remove the nullptr checks but also to grow the set of abstract attributes "in-flight" on-demand. One can now start from those that have the best chance of improving performance without the need to specify all they might depend on. While this introduces some boilerplate, the usage of attributes is much easier and cleaner now. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66276 llvm-svn: 369331	2019-08-20 06:15:50 +00:00
Johannes Doerfert	169af994bc	[Attributor][NFC] Cleanup statistics code llvm-svn: 369330	2019-08-20 06:09:56 +00:00
Johannes Doerfert	cfcca1a5b1	[Attributor] Use structured deduction for AADereferenceable Summary: This is analogous to D66128 but for AADereferenceable. We have the logic concentrated in the floating value updateImpl and we use the combiner helper classes for arguments and return values. The regressions will go away with "on-demand" attribute creation. Improvements are already visible in the existing tests. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66272 llvm-svn: 369329	2019-08-20 06:08:35 +00:00
Johannes Doerfert	b9b8791fed	[Attributor] Use structured deduction for AANonNull Summary: What D66126 did for AAAlign, this patch does for AANonNull. Again, the logic becomes more concise and localized. Again, returned pointers are not annotated properly but that will not be an issue if this lands with the "on-demand" generation of attributes. First improvements due to the genericValueTraversal are already visible. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66128 llvm-svn: 369328	2019-08-20 06:02:39 +00:00
Johannes Doerfert	028b2aa56a	[Attributor] Fix the "clamp" operator The clamp operator should not take the known of the given state as the known is potentially based on assumed information. This also adds TODOs to guide improvements. llvm-svn: 369327	2019-08-20 05:57:01 +00:00
Thomas Raoux	a08e139d50	[NFC] Test commit, fix some comment spelling. llvm-svn: 369326	2019-08-20 05:21:27 +00:00
Karl-Johan Karlsson	40da6be2bd	[AsmPrinter] Remove const qualifier from EmitBasicBlockStart. Overriders may want to modify state in it. AMDGPU wants to, but has to make its members mutable in order to do so. Besides, EmitBasicBlockEnd is not const, so why should Start be? Patch by Bevin Hansson. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D66341 llvm-svn: 369325	2019-08-20 05:13:57 +00:00
Fangrui Song	ce21c3e12c	MCAsmMacro: add `#if !defined(NDEBUG) \|\| defined(LLVM_ENABLE_DUMP)` to some dump() declarations llvm-svn: 369324	2019-08-20 04:14:43 +00:00
Fangrui Song	e828ce1b88	[WebAssembly][MC] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after r369317 llvm-svn: 369318	2019-08-20 02:02:57 +00:00
Sam Clegg	ecc5e8084f	[WebAssembly][MC] Simplify WasmObjectWriter::recordRelocation. NFC. WebAssembly doesn't support PC relative relocation or relocation expressions that can't be reduced to a single symbol. The only support we have for fixups involving two symbols is when both symbols are defined and within the same section. In this case evaluateFixup will already have evaluated the expression before calling recordRelocation. llvm-svn: 369317	2019-08-20 00:33:50 +00:00
Dinar Temirbulatov	081c57989e	[SLP][NFC] Avoid repetitive calls to getSameOpcode() We can avoid repetitive calls to getSameOpcode() for already known tree elements by keeping MainOp and AltOp in TreeEntry. Differential Revision: https://reviews.llvm.org/D64700 llvm-svn: 369315	2019-08-20 00:22:04 +00:00
Hubert Tong	71974b5175	[cmake] Link in LLVMPasses due to dependency by LLVMOrcJIT; NFC Summary: rL367756 (`f5c40cb`) increases the dependency of LLVMOrcJIT on LLVMPasses. In particular, symbols defined in LLVMPasses that are referenced by the destructor of `PassBuilder` are now referenced by LLVMOrcJIT through `Speculation.cpp.o`. We believe that referencing symbols defined in LLVMPasses in the destructor of `PassBuilder` is valid, and that adding to the set of such symbols is legitimate. To support such cases, this patch adds LLVMPasses to the set of libraries being linked when linking in LLVMOrcJIT causes such symbols from LLVMPasses to be referenced. Reviewers: Whitney, anhtuyen, pree-jackie Reviewed By: pree-jackie Subscribers: mgorny, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66441 llvm-svn: 369310	2019-08-19 23:12:48 +00:00
Anton Afanasyev	3f3a2573c3	[Support][Time profiler] Make FE codegen blocks to be inside frontend blocks Summary: Add a `Frontend` time trace entry to the `HandleTranslationUnit()` function. Add a test to check that all codegen blocks are inside frontend blocks. Also, change the `--time-trace-granularity` option a bit to make sure very small time blocks are output to the JSON file when using `--time-trace-granularity=0`. This fixes http://llvm.org/pr41969 Reviewers: russell.gallop, lebedev.ri, thakis Reviewed By: russell.gallop Subscribers: vsapsai, aras-p, lebedev.ri, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D63325 llvm-svn: 369308	2019-08-19 22:58:26 +00:00
Matthias Gehre	5b3275e56f	[ORC] fix use-after-free detected by -Wreturn-stack-address Summary: llvm/lib/ExecutionEngine/Orc/Layer.cpp:53:12: warning: returning address of local temporary object [-Wreturn-stack-address] In ``` StringRef IRMaterializationUnit::getName() const { [...] return TSM.withModuleDo( [](const Module &M) { return M.getModuleIdentifier(); }); ``` `getModuleIdentifier()` returns a `const std::string &`, but the implicit return type of the lambda is `std::string` by value, and thus the returned `StringRef` refers to a temporary `std::string`. Detect by annotating `llvm::StringRef` with `[[gsl::Pointer]]`. Reviewers: lhames, sgraenitz Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66440 llvm-svn: 369306	2019-08-19 21:59:44 +00:00
Johannes Doerfert	8b962f2814	[CaptureTracker] Let subclasses provide dereferenceability information Summary: CaptureTracker subclasses might have better dereferenceability information which allows null pointer checks to be no-capturing. The first user will be D59922. Reviewers: sanjoy, hfinkel, aykevl, sstefan1, uenoku, xbolva00 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66371 llvm-svn: 369305	2019-08-19 21:56:38 +00:00
Johannes Doerfert	de7674ce76	Recommit "[Attributor] Fix: Do not partially resolve returned calls." This reverts commit `b1752f670f`. Fixed the issue with a different commit, reapply this one as it was, afaik, not broken. llvm-svn: 369303	2019-08-19 21:35:31 +00:00
Evgeniy Stepanov	55ccd16354	Refactor isPointerOffset (NFC). Summary: Simplify the API using Optional<> and address comments in https://reviews.llvm.org/D66165 Reviewers: vitalybuka Subscribers: hiraditya, llvm-commits, ostannard, pcc Tags: #llvm Differential Revision: https://reviews.llvm.org/D66317 llvm-svn: 369300	2019-08-19 21:08:04 +00:00
Vyacheslav Zakharin	f7229ac7d8	Fixed placement of llvm.global_dtors on Windows. Differential revision: https://reviews.llvm.org/D66373 llvm-svn: 369299	2019-08-19 21:07:03 +00:00
Evgeniy Stepanov	50affbe47f	MemTag: stack initializer merging. Summary: MTE provides instructions to update memory tags and data at the same time. This change makes use of those to generate more compact code for stack variable tagging + initialization. We collect memory store and memset instructions following an alloca or a lifetime.start call, and replace them with the corresponding MTE intrinsics. Since the intrinsics work on 16-byte aligned chunks, the stored values are combined as necessary. Reviewers: pcc, vitalybuka, ostannard Subscribers: srhines, javed.absar, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66167 llvm-svn: 369297	2019-08-19 20:47:09 +00:00
Benjamin Kramer	928071ae4e	[Support] Replace sys::Mutex with their standard equivalents. Only use a recursive mutex if it can be locked recursively. llvm-svn: 369295	2019-08-19 19:49:57 +00:00
Johannes Doerfert	056f1b5cc7	Re-apply fixed "[Attributor] Fix: Make sure we set the changed flag" This reverts commit `cedd0d9a6e`. Re-apply the original commit but make sure the variables are initialized (even if they are not used) so UBSan is not complaining. llvm-svn: 369294	2019-08-19 19:14:10 +00:00
Sam Clegg	19bf637eb1	[WebAssembly][MC] Allow empty assembly functions Differential Revision: https://reviews.llvm.org/D66434 llvm-svn: 369292	2019-08-19 19:04:54 +00:00
Alina Sbirlea	1a3fdaf6a6	[MemorySSA] Rename uses when inserting memory uses. Summary: When inserting uses from outside the MemorySSA creation, we don't normally need to rename uses, based on the assumption that there will be no inserted Phis (if Def existed that required a Phi, that Phi already exists). However, when dealing with unreachable blocks, MemorySSA will optimize away Phis whose incoming blocks are unreachable, and these Phis end up being re-added when inserting a Use. There are two potential solutions here: 1. Analyze the inserted Phis and clean them up if they are unneeded (current method for cleaning up trivial phis does not cover this) 2. Leave the Phi in place and rename uses, the same way as when inserting defs. This patch uses approach 2. Resolves first test in PR42940. Reviewers: george.burgess.iv Subscribers: Prazek, sanjoy.google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66033 llvm-svn: 369291	2019-08-19 18:57:40 +00:00
Craig Topper	a0d92c7262	[X86] Teach lowerV4I32Shuffle to only use broadcasts if the mask has more than one undef element. Prioritize shifts over broadcast in lowerV8I16Shuffle. The motivating case are the changes in vector-reduce-add.ll where we were doing extra work in the scalar domain instead of shuffling. There may be some one use check that needs to be looked into there, but this patch sidesteps the issue by avoiding broadcasts that aren't really broadcasting. Differential Revision: https://reviews.llvm.org/D66071 llvm-svn: 369287	2019-08-19 18:15:50 +00:00
Craig Topper	93c2787193	[CGP] Remove ModifiedDT from the makeBitReverse loop I don't think anything in this loop modifies the control flow and we don't restart any iteration after setting the flag. This code was added in http://reviews.llvm.org/D16893 but looking at the test case added there the code that caused the dominator tree to change was merging blocks with their predecessor not the bitreverse optimization. Differential Revision: https://reviews.llvm.org/D66366 llvm-svn: 369283	2019-08-19 18:02:24 +00:00
Pavel Labath	08c77b97c0	Filesystem/Windows: fix inconsistency in readNativeFileSlice API Summary: The Windows implementation of readNativeFileSlice was trying to match the POSIX behavior of not treating EOF as an error, but it was only handling the case of reading from a pipe. Attempting to read past the end of a regular file returns a slightly different error code, which needs to be handled too. This patch adds ERROR_HANDLE_EOF to the list of error codes to be treated as an end of file, and adds some unit tests for the API. This issue was found while attempting to land D66224, which caused a bunch of lldb tests to start failing on Windows. Reviewers: rnk, aganea Subscribers: kristina, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66344 llvm-svn: 369269	2019-08-19 15:40:49 +00:00
Roman Lebedev	edfaee0811	[TargetLowering] x s% C == 0 fold: vector divisor with INT_MIN handling Summary: The general fold is only valid for positive divisors. Which effectively means, it is invalid for `INT_MIN` divisors, and we currently bail out if we see them. But that is too strict, we can just fix up the results. For that, let's do a second computation 'in parallel': ``` Name: srem -> and Pre: isPowerOf2(C) %o = srem i8 %X, C %r = icmp eq %o, 0 => %n = and i8 %X, C-1 %r = icmp eq %n, 0 ``` https://rise4fun.com/Alive/Sup And then just blend results: if the divisor was `INT_MIN`, pick the value we got via bit-test, else pick the value from general fold. There's an interesting observation - `ISD::ROTR` is set to `LegalizeAction::Expand` before AVX512, so we should not treat `INT_MIN` divisor as even; and as can be seen, while `@test_srem_odd_even_one` improves on all run-lines, `@test_srem_odd_even_INT_MIN` only improves for AVX512. Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66300 llvm-svn: 369268	2019-08-19 15:01:42 +00:00
Serge Guelton	a023d6b7de	[nfc] Silent gcc warning llvm-svn: 369266	2019-08-19 14:40:33 +00:00
George Rimar	9d5e8a476f	[Object/COFF.h] - Stop returning std::error_code in a few methods. NFCI. There are 4 methods that return std::error_code now, though they do not have to because they always succeed. I refactored them. This allows simplifying the code in tools a bit. llvm-svn: 369263	2019-08-19 14:32:23 +00:00
Jinsong Ji	0776da5236	[PeepholeOptimizer] Don't assume bitcast def always has input Summary: If we have a MI marked with bitcast bits, but without input operands, PeepholeOptimizer might crash with an assert. E.g., if we apply the changes in PPCInstrVSX.td as in this patch: [(set v4i32:$XT, (bitconvert (v16i8 immAllOnesV)))]>; we will get an assert in PeepholeOptimizer. ``` llvm-lit llvm-project/llvm/test/CodeGen/PowerPC/build-vector-tests.ll -v llvm-project/llvm/include/llvm/CodeGen/MachineInstr.h:417: const llvm::MachineOperand &llvm::MachineInstr::getOperand(unsigned int) const: Assertion `i < getNumOperands() && "getOperand() out of range!"' failed. ``` The fix is to abort if we find an out-of-bounds access. Reviewers: qcolombet, MatzeB, hfinkel, arsenm Reviewed By: qcolombet Subscribers: wdng, arsenm, steven.zhang, wuzish, nemanjai, hiraditya, kbarton, MaskRay, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65542 llvm-svn: 369261	2019-08-19 14:19:04 +00:00
Alex Bradbury	1c1f8f215d	[RISCV] Don't force absolute FK_Data_X fixups to relocs The current behavior of shouldForceRelocation forces relocations for the majority of fixups when relaxation is enabled. This makes sense for fixups which incorporate symbols but is unnecessary for simple data fixups where the fixup target is already resolved to an absolute value. Differential Revision: https://reviews.llvm.org/D63404 Patch by Edward Jones. llvm-svn: 369257	2019-08-19 13:23:02 +00:00
David Stenberg	88df53e6ea	[DebugInfo] Allow bundled calls in the MIR's call site info Summary: Extend the MIR parser and writer so that the call site information can refer to calls that are bundled. Reviewers: aprantl, asowda, NikolaPrica, djtodoro, ivanbaev, vsk Reviewed By: aprantl Subscribers: arsenm, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D66145 llvm-svn: 369256	2019-08-19 12:41:22 +00:00
Sanjay Patel	b38bac3699	[SLP] reduce duplicated code; NFC llvm-svn: 369250	2019-08-19 11:39:56 +00:00
Fangrui Song	d9a071c54b	[MC] Simplify ELFObjectWriter::recordRelocation. NFC llvm-svn: 369248	2019-08-19 10:05:59 +00:00
Jeremy Morse	176bbd5cde	[DebugInfo] Make postra sinking of DBG_VALUEs subregister-safe Currently the machine instruction sinker identifies DBG_VALUE insts that also need to sink by comparing register numbers. Unfortunately this isn't safe, because (after register allocation) a DBG_VALUE may read a register that aliases what's being sunk. To fix this, identify the DBG_VALUEs that need to sink by recording & examining their register units. Register units gives us the following guarantee: "Two registers overlap if and only if they have a common register unit" [MCRegisterInfo.h] Thus we can always identify aliasing DBG_VALUEs if the set of register units read by the DBG_VALUE, and the register units of the instruction being sunk, intersect. (MachineSink already uses classes like "LiveRegUnits" for determining sinking validity anyway). The test added checks for super and subregister DBG_VALUE reads of a sunk copy being sunk as well. Differential Revision: https://reviews.llvm.org/D58191 llvm-svn: 369247	2019-08-19 09:53:07 +00:00
Sam Tebbs	f312c1ecf4	[ARM] Add support for MVE vaddv This patch adds vecreduce_add and the relevant instruction selection for vaddv. Differential revision: https://reviews.llvm.org/D66085 llvm-svn: 369245	2019-08-19 09:38:28 +00:00
David Green	2bfc13fde1	[ARM] MVE sext costs This adds some sext costs for MVE, taken from the length of assembly sequences that we currently generate. Differential Revision: https://reviews.llvm.org/D66010 llvm-svn: 369244	2019-08-19 09:13:22 +00:00
David L. Jones	cedd0d9a6e	Revert [Attributor] Fix: Make sure we set the changed flag This reverts r369159 (git commit `cbaf1fdea2`) r369160 caused a test to fail under UBSAN. See thread on llvm-commits. llvm-svn: 369241	2019-08-19 08:00:08 +00:00
Fangrui Song	b127771f7d	[MC] Delete unnecessary diagnostic: "No relocation available to represent this relative expression" Replace - error: No relocation available to represent this relative expression with + error: symbol 'undef' can not be undefined in a subtraction expression or + error: Cannot represent a difference across sections Keep !IsPcRel as an assertion after the two diagnostic checks are done. llvm-svn: 369239	2019-08-19 07:59:35 +00:00
David L. Jones	b1752f670f	Revert [Attributor] Fix: Do not partially resolve returned calls. This reverts r369160 (git commit `f72d9b1c97`) r369160 caused some tests to fail under UBSAN. See thread on llvm-commits. llvm-svn: 369236	2019-08-19 07:16:24 +00:00
Fangrui Song	38426c114f	[MC] Don't emit .symver redirected symbols to the symbol table GNU as keeps the original symbol in the symbol table for defined @ and @@, but suppresses it in other cases (@@@ or undefined). The original symbol is usually undesired: In a shared object, the original symbol can be localized with a version script, but it is hard to remove/localize in an archive: 1) a post-processing step removes the undesired original symbol 2) consumers (executable) of the archive are built with the version script Moreover, it can cause linker issues like binutils PR/18703 if the original symbol name and the base name of the versioned symbol is the same (both ld.bfd and gold have some code to work around defined @ and @@). In lld, if it sees f and f@v1: --version-script =(printf 'v1 {};') => f and f@v1 --version-script =(printf 'v1 { f; };') => f@v1 and f@@v1 It can be argued that @@@ added on 2000-11-13 corrected the @ and @@ mistake. This patch catches some more multiple version errors (defined @ and @@), and consistently suppress the original symbol. This addresses all the problems listed above. If the user wants other aliases to the versioned symbol, they can copy the original symbol to other symbol names with .set directive, e.g. .symver f, f@v1 # emit f@v1 but not f into .symtab .set f_impl, f # emit f_impl into .symtab llvm-svn: 369233	2019-08-19 06:17:30 +00:00
Craig Topper	ebb7ddc633	[X86] Teach lower1BitShuffle to match right shifts with upper zero elements on types that don't natively support KSHIFT. We can support these by widening to a supported type, then shifting all the way to the left and then back to the right to ensure that we shift in zeroes. llvm-svn: 369232	2019-08-19 05:45:39 +00:00
Craig Topper	e47437a6ef	[X86] Fix the lower1BitShuffle code added in r369215 to correctly pass the widened vector to the KSHIFT node. Not sure how to test this as we have tests that exercise this code, but nothing failed for the types not matching. Since all the k-registers use equivalent register classes everything just ends up working. llvm-svn: 369228	2019-08-19 04:08:44 +00:00
Craig Topper	269c6b1c15	[X86] Teach lower1BitShuffle to match KSHIFTR that doesn't use Zeroable and only relies on undef. This allows us to widen the type when the KSHIFTR instruction doesn't exist for the type. If we need to shift in zeroes into the upper elements we would need more work to guarantee zeroes when widening. llvm-svn: 369227	2019-08-19 04:08:40 +00:00
Craig Topper	2eb7951da3	[X86] Teach lower1BitShuffle to recognize padding a subvector with zeros with V2 as the source and V1 as the zero vector. Shuffle canonicalization can swap the sources so the zero vector might be V1 and the subvector that's being padded can be V2. llvm-svn: 369226	2019-08-19 00:39:22 +00:00
Craig Topper	2ee46c7c4b	[X86] Add a special case to LowerCONCAT_VECTORSvXi1 to handle concatenating zero vectors followed by one non-zero vector followed by undef vectors. For such a case we should only need a KSHIFTL, but we were previously generating a KSHIFTL followed by a KSHIFTR because we mistakenly believed we need to zero the undef elements. llvm-svn: 369224	2019-08-18 23:30:11 +00:00
Craig Topper	388b8dd94a	[X86] Replace uses of getZeroVector for vXi1 vectors with DAG.getConstant. vXi1 vectors don't need special handling. llvm-svn: 369222	2019-08-18 23:30:03 +00:00
Craig Topper	9e074c06fe	[X86] Improve lower1BitShuffle handling for KSHIFTL on narrow vectors. We can insert the value into a larger legal type and shift that by the desired amount. llvm-svn: 369215	2019-08-18 18:52:46 +00:00
Simon Pilgrim	63b3c56fca	Fix signed/unsigned comparison warning. NFCI. llvm-svn: 369213	2019-08-18 17:26:30 +00:00
Simon Pilgrim	fee2546f3f	[X86] isTargetShuffleEquivalent - add BUILD_VECTOR matching Add similar functionality to isShuffleEquivalent - if the mask elements don't match, try matching the BUILD_VECTOR scalars instead. As target shuffles need to handle SM_Sentinel values, this can get a bit tricky, so commit just adds actual mask element index handling - full SM_SentinelZero support will be added when the need arises. Also, enables support in matchVectorShuffleWithPACK llvm-svn: 369212	2019-08-18 17:15:26 +00:00
Simon Pilgrim	a66edd86e2	[X86] isTargetShuffleEquivalent - early out on illegal shuffle masks. NFCI. Simplifies shuffle mask comparisons by just bailing out if the shuffle mask has any out of range values - will make an upcoming patch much simpler. llvm-svn: 369211	2019-08-18 16:37:58 +00:00
Roman Lebedev	9b957d3321	[InstCombine] Cherry-pick NFC cleanups of foldShiftIntoShiftInAnotherHandOfAndInICmp() from D66383 llvm-svn: 369207	2019-08-18 12:26:33 +00:00
Craig Topper	74168ded03	[TargetLowering] Teach computeRegisterProperties to only widen v3i16/v3f16 vectors to the next power of 2 type if that's legal. These were recently made simple types. This restores their behavior back to something like their EVT legalization. We might be able to fix the code in type legalization where the assert was failing, but I didn't investigate too much as I had already looked at the computeRegisterProperties code during the review for v3i16/v3f16. Most of the test changes restore the X86 codegen back to what it looked like before the recent change. The test case in vec_setcc.ll is a reduced version of the reproducer from the fuzzer. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=16490 llvm-svn: 369205	2019-08-18 06:28:06 +00:00
Craig Topper	f43106e341	[SelectionDAG] Add a node creation debug message to getMachineNode. llvm-svn: 369204	2019-08-18 06:28:00 +00:00
Matt Arsenault	479f3bdb2c	AMDGPU: Fix iterator error when lowering SI_END_CF If the instruction is the last in the block, there is no next instruction but the iteration still needs to look at the new block. llvm-svn: 369203	2019-08-18 00:20:44 +00:00
Matt Arsenault	cfdc2b9bd9	AMDGPU: Disambiguate v3f16 format in load/store tables Currently the searchable tables report the number of dwords. These round to the same number for 3 and 4 component d16 instructions. Change this to report the number of elements so this isn't ambiguous. llvm-svn: 369202	2019-08-18 00:20:43 +00:00
Craig Topper	31f829f0cd	[X86] Add a one use check to the combineStore code that handles v16i16->v16i8 truncate+store by extending to v16i32 and then emitting a v16i32->v16i8 truncstore. This prevents us from emitting a separate truncate and a truncating store instruction. llvm-svn: 369200	2019-08-17 22:46:15 +00:00
Yonghong Song	a8dad5c79b	[BPF] Fix bpf llvm-objdump issues. Commit https://reviews.llvm.org/D57939 ("[DWARF] Refactor RelocVisitor and fix computation of SHT_RELA-typed relocation entries") made a change for relocation resolution when operating on an object file. The change unfortunately broke BPF: given SymbolValue (S) and Addend (A), the relocation was previously resolved to S + A, and after the change it is resolved to S. This patch fixes the issue by resolving the relocation correctly. It looks like not all relocation resolution reaches here, and I did not trace down exactly when it does. But I do find that if the object file includes code in two different ELF sections other than the default ".text", the above bug will be triggered. This patch includes a trivial two-function source file to demonstrate this issue. The relocation for .debug_loc is resolved incorrectly due to this and llvm-objdump cannot display source annotated assembly. Differential Revision: https://reviews.llvm.org/D66372 llvm-svn: 369199	2019-08-17 22:12:00 +00:00
Kang Zhang	b3d258fc44	[CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks Summary: Fix a bug of predecessors. In the `block-placement` pass, it will create some patterns for unconditional we can do the simple early return. But the `early-ret` pass is before `block-placement`, we don't want to run it again. This patch is to do the simple early return to optimize the blocks at the end of `block-placement`. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D63972 llvm-svn: 369191	2019-08-17 14:37:05 +00:00
Paul Walker	26295676a4	Revert Revert [AArch64InstrInfo] Stop getInstSizeInBytes returning non-zero for meta instructions. This reverts r369132 (git commit `19301d75f0`) llvm-svn: 369186	2019-08-17 09:22:36 +00:00
Paul Walker	93c7a4a47c	Revert [AArch64InstrInfo] Stop getInstSizeInBytes returning non-zero for meta instructions. This reverts r369133 (git commit `2632c677f8`) llvm-svn: 369185	2019-08-17 09:22:28 +00:00
Alina Sbirlea	f92109dc01	[MemorySSA] Loop passes should mark MSSA preserved when available. This patch applies only to the new pass manager. Currently, when MSSA Analysis is available, and passed to each loop pass, it will be preserved by that loop pass. Hence, mark the analysis preserved based on that condition, vs the current `EnableMSSALoopDependency`. This leaves the global flag to affect only the entry point in the loop pass manager (in FunctionToLoopPassAdaptor). llvm-svn: 369181	2019-08-17 01:02:12 +00:00
Sanjay Patel	a53ad0e157	Revert r367891 - "[InstCombine] combine mul+shl separated by zext" This reverts commit `5dbb90bfe1`. As noted in the post-commit thread for r367891, this can create a multiply that is lowered to a libcall that may not exist. We need to improve the backend decomposition for integer multiply before trying to re-land this (if it's still worthwhile after doing the backend work). llvm-svn: 369174	2019-08-16 23:36:28 +00:00
Jian Cai	16fa8b0970	Reland "[ARM] push LR before __gnu_mcount_nc" This relands r369147 with fixes to unit tests. https://reviews.llvm.org/D65019 llvm-svn: 369173	2019-08-16 23:30:16 +00:00
Amara Emerson	57ec292ab8	[AArch64][GlobalISel] Fix an assertion during G_UNMERGE selection for s128 types. llvm-svn: 369172	2019-08-16 23:23:40 +00:00
Sanjay Patel	acceedb15f	[CodeGenPrepare] Fix use-after-free If OptimizeExtractBits() encountered a shift instruction with no operands at all, it would erase the instruction, but still return false. This previously didn’t matter because its caller would always return after processing the instruction, but https://reviews.llvm.org/D63233 changed the function’s caller to fall through if it returned false, which would then cause a use-after-free detectable by ASAN. This change makes OptimizeExtractBits return true if it removes a shift instruction with no users, terminating processing of the instruction. Patch by: @brentdax (Brent Royal-Gordon) Differential Revision: https://reviews.llvm.org/D66330 llvm-svn: 369168	2019-08-16 23:10:34 +00:00
Jordan Rupprecht	d0797ece46	Revert [X86] SimplifyDemandedVectorElts - attempt to recombine target shuffle using DemandedElts mask (reapplied) This reverts r368662 (git commit `1a8d790cf5`) The compile-time regression repro is in https://bugs.llvm.org/show_bug.cgi?id=43024 llvm-svn: 369167	2019-08-16 23:08:56 +00:00
Eli Friedman	eaff844fe9	[ARM] Preserve liveness in ARMConstantIslands. We currently don't use liveness information after this point, but it can be useful to catch bugs using -verify-machineinstrs, and optimizations could potentially use this information in the future. Differential Revision: https://reviews.llvm.org/D66319 llvm-svn: 369162	2019-08-16 22:20:14 +00:00
Johannes Doerfert	f72d9b1c97	[Attributor] Fix: Do not partially resolve returned calls. By partially resolving returned calls we did not record that they were not fully resolved which caused odd behavior down the line. We could also end up with some, but not all, returned values of the callee in the returned values map of the caller, another odd behavior we want to avoid. llvm-svn: 369160	2019-08-16 21:59:52 +00:00
Johannes Doerfert	cbaf1fdea2	[Attributor] Fix: Make sure we set the changed flag The flag was updated before we actually run the visitor callback so we might miss updates. llvm-svn: 369159	2019-08-16 21:55:01 +00:00
Johannes Doerfert	17cb918536	[CaptureTracking] Allow null to be in either icmp operand Summary: Before we required the comparison against null to be "canonical", hence null to be operand #1. This patch allows null to be in either operand, similar to the handling of loaded globals that follows. Reviewers: sanjoy, hfinkel, aykevl, sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66321 llvm-svn: 369158	2019-08-16 21:53:49 +00:00
Johannes Doerfert	6dedc78d9d	[Attributor] Add all missing attribute definitions/symbols As a preparation to "on-demand" abstract attribute generation we need implementations for all attributes (as they can be queried and then created on-demand where we now fail to find one). Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66129 llvm-svn: 369155	2019-08-16 21:31:11 +00:00
Jonas Devlieghere	f4bdbea02f	[RWMutex] Simplify availability check Check for the actual version number for the scenarios where the macOS version isn't available (__MAC_10_12). llvm-svn: 369154	2019-08-16 21:25:40 +00:00
Craig Topper	a17d1d2250	[X86] Use Register/MCRegister in more places in X86 This was a quick pass through some obvious places. I haven't tried the clang-tidy check. I also replaced the zeroes in getX86SubSuperRegister with X86::NoRegister which is the real sentinel name. Differential Revision: https://reviews.llvm.org/D66363 llvm-svn: 369151	2019-08-16 20:50:23 +00:00
Jian Cai	2d957cfe02	Revert "[ARM] push LR before __gnu_mcount_nc" This reverts commit `f4cf3b9593`. llvm-svn: 369149	2019-08-16 20:40:21 +00:00
Jian Cai	f4cf3b9593	[ARM] push LR before __gnu_mcount_nc Push LR register before calling __gnu_mcount_nc as it expects the value of LR register to be the top value of the stack on ARM32. Differential Revision: https://reviews.llvm.org/D65019 llvm-svn: 369147	2019-08-16 20:21:08 +00:00
Johannes Doerfert	234eda563d	[Attributor] Towards a more structured deduction pattern Summary: This is the first commit aiming to structure the attribute deduction. The base idea is that we have default propagation patterns as listed below on top of which we can add specific, e.g., context sensitive, logic. Deduction patterns used in this patch: - argument states are determined from call site argument states, see AAAlignArgument and AAArgumentFromCallSiteArguments. - call site argument states are determined as if they were floating values, see AAAlignCallSiteArgument and AAAlignFloating. - floating value states are determined by traversing the def-use chain and combining the states determined for the leaves, see AAAlignFloating and genericValueTraversal. - call site return states are determined from function return states, see AAAlignCallSiteReturned and AACallSiteReturnedFromReturned. - function return states are determined from returned value states, see AAAlignReturned and AAReturnedFromReturnedValues. Through this strategy all logic for alignment is concentrated in the AAAlignFloating::updateImpl method. Note: This commit works on its own but is part of a larger change that involves "on-demand" creation of abstract attributes that will participate in the fixpoint iteration. Without this part, we sometimes do not have an AAAlign abstract attribute to query, losing information we determined before. All tests have appropriate FIXMEs and the information will be recovered once we added all parts. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66126 llvm-svn: 369144	2019-08-16 19:51:23 +00:00
Johannes Doerfert	66cf87e290	[Attributor][NFC] Introduce aliases for call site attributes Until we have call site specific liveness and/or value information there is no need to do call site specific deduction. Though, we need the symbols in follow up patches that make Attributor::getAAFor return a reference. llvm-svn: 369143	2019-08-16 19:49:00 +00:00
Johannes Doerfert	fe6dbadc0d	[Attributor] Introduce initialize calls and move code to keep attributes concise Summary: This patch should not change the behavior except that the added initialize methods might indicate an optimistic fixpoint earlier. The code movement is done to keep the attribute definitions in a single block where it makes sense. No functional changes intended there. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66258 llvm-svn: 369142	2019-08-16 19:36:17 +00:00
Lang Hames	9bb9a0c10b	[ORC] Remove some stray debugging output accidentally left in r368707 llvm-svn: 369141	2019-08-16 19:33:37 +00:00
Sanjay Patel	39eb2324f7	[InstCombine] canonicalize a scalar-select-of-vectors to vector select This pattern may arise more frequently with an enhancement to SLP vectorization suggested in PR42755: https://bugs.llvm.org/show_bug.cgi?id=42755 ...but we should handle this pattern to make things easier for the backend either way. For all in-tree targets that I looked at, codegen for typical vector sizes looks better when we change to a vector select, so this is safe to do without a cost model (in other words, as a target-independent canonicalization). For example, if the condition of the select is a scalar, we end up with something like this on x86: vpcmpgtd %xmm0, %xmm1, %xmm0 vpextrb $12, %xmm0, %eax testb $1, %al jne LBB0_2 ## %bb.1: vmovaps %xmm3, %xmm2 LBB0_2: vmovaps %xmm2, %xmm0 Rather than the splat-condition variant: vpcmpgtd %xmm0, %xmm1, %xmm0 vpshufd $255, %xmm0, %xmm0 ## xmm0 = xmm0[3,3,3,3] vblendvps %xmm0, %xmm2, %xmm3, %xmm0 Differential Revision: https://reviews.llvm.org/D66095 llvm-svn: 369140	2019-08-16 18:51:30 +00:00
Evgeniy Stepanov	187c63f145	Escape % in printf format string. Fixes branch-relax-block-size.mir on the ASan builder. llvm-svn: 369138	2019-08-16 18:23:54 +00:00
Guanzhong Chen	b1cb9fd1aa	[WebAssembly] Forbid use of EM_ASM with setjmp/longjmp Summary: We tried to support EM_ASM with setjmp/longjmp in binaryen. But with dynamic linking thrown into the mix, the code is no longer understandable and cannot be maintained. We also discovered more bugs in the EM_ASM handling code. To ensure maintainability and correctness of the binaryen code, EM_ASM will no longer be supported with setjmp/longjmp. This is probably fine since the support was added recently and hasn't been published. Reviewers: tlively, sbc100, jgravelle-google, kripken Reviewed By: tlively, kripken Subscribers: dschuff, hiraditya, aheejin, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66356 llvm-svn: 369137	2019-08-16 18:21:08 +00:00
Simon Pilgrim	63b78b678b	[X86] resolveTargetShuffleInputs - add DemandedElts variant. NFCI. Nothing calls this yet, everything still goes through the non (all) DemandedElts wrapper. llvm-svn: 369136	2019-08-16 18:13:22 +00:00
Amara Emerson	c809230a69	[AArch64][GlobalISel] Lower G_SHUFFLE_VECTOR with 1 elt src and 1 elt mask. Again, it's weird that these are allowed. Since lowering support was added in r368709 we started crashing on compiling the neon intrinsics test in the test suite. This fixes the lowering to fold the 1 elt src/mask case into copies. llvm-svn: 369135	2019-08-16 18:06:53 +00:00
Simon Pilgrim	8ff1b7de4d	[X86] combineExtractWithShuffle - handle extract(truncate(x), 0) Eventually we need to generalize combineExtractWithShuffle to handle all faux shuffles and handle truncate (and X86ISD::VTRUNC etc.) there, but we're not ready yet (still creates nodes on the fly, incomplete DemandedElts support, bad use of recursive Depth limit). llvm-svn: 369134	2019-08-16 17:35:08 +00:00
Paul Walker	2632c677f8	[AArch64InstrInfo] Stop getInstSizeInBytes returning non-zero for meta instructions. Recommit with fixes for mac builders. Summary: AArch64InstrInfo::getInstSizeInBytes is incorrectly treating meta instructions (e.g. CFI_INSTRUCTION) as normal instructions and giving them a size of 4. This results in branch relaxation calculating block sizes wrong. Branch relaxation also considers alignment and thus a single mistake can result in later blocks being incorrectly sized even when they themselves do not contain meta instructions. The net result is we might not relax a branch whose destination is not within range. Reviewers: nickdesaulniers, peter.smith Reviewed By: peter.smith Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66337 > llvm-svn: 369111 llvm-svn: 369133	2019-08-16 17:29:53 +00:00
Paul Walker	19301d75f0	Revert [AArch64InstrInfo] Stop getInstSizeInBytes returning non-zero for meta instructions. This reverts r369111 (git commit `3ccee5f7c4`) llvm-svn: 369132	2019-08-16 17:29:42 +00:00
Vasileios Porpodas	1d254f3dae	[SLPVectorizer] Make the scheduler aware of the TreeEntry operands. Summary: The scheduler's dependence graph gets the use-def dependencies by accessing the operands of the instructions in a bundle. However, buildTree_rec() may change the order of the operands in TreeEntry, and the scheduler is currently not aware of this. This is not causing any functional issues currently, because reordering is restricted to the operands of a single instruction. Once we support operand reordering across multiple TreeEntries, as shown here: http://www.llvm.org/devmtg/2019-04/slides/Poster-Porpodas-Supernode_SLP.pdf , the scheduler will need to get the correct operands from TreeEntry and not from the individual instructions. In short, this patch: - Connects the scheduler's bundle with the corresponding TreeEntry. It introduces new TE and Lane fields in ScheduleData. - Moves the location where the operands of the TreeEntry are initialized. This used to take place in newTreeEntry() setting one operand at a time, but is now moved pre-order just before the recursion of buildTree_rec(). This is required because the scheduler needs to access both operands of the TreeEntry in tryScheduleBundle(). - Updates the scheduler to access the instruction operands through the TreeEntry operands instead of accessing the instruction operands directly. Reviewers: ABataev, RKSimon, dtemirbulatov, Ayal, dorit, hfinkel Reviewed By: ABataev Subscribers: hiraditya, llvm-commits, lebedev.ri, rcorcs Tags: #llvm Differential Revision: https://reviews.llvm.org/D62432 llvm-svn: 369131	2019-08-16 17:21:18 +00:00
Simon Pilgrim	3a8c698771	[X86] Alphabetize pass initialization definitions. NFCI. llvm-svn: 369126	2019-08-16 16:41:38 +00:00
Guozhi Wei	e03f6a1631	[CodeGen/Analysis] Intrinsic llvm.assume should not block tail call optimization In function Analysis.cpp:isInTailCallPosition, instructions between call and ret are checked to see if they block tail call optimization. If an instruction is an intrinsic call, only llvm.lifetime_end is allowed and other intrinsic functions block tail call. When compiling tcmalloc, we found llvm.assume between a hot function call and ret, it blocks the optimization. But llvm.assume doesn't generate instructions, it should not block tail call. Differential Revision: https://reviews.llvm.org/D66096 llvm-svn: 369125	2019-08-16 16:26:12 +00:00
Krzysztof Parzyszek	ac83aab035	[Hexagon] Generate min/max instructions for 64-bit vectors llvm-svn: 369124	2019-08-16 16:16:27 +00:00
Sander de Smalen	f28e1128d9	Relanding r368987 [AArch64] Change location of frame-record within callee-save area. Changes: There was a condition for `!NeedsFrameRecord` missing in the assert. The assert in question has changed to: + assert((!RPI.isPaired() || !NeedsFrameRecord || RPI.Reg2 != AArch64::FP || + RPI.Reg1 == AArch64::LR) && + "FrameRecord must be allocated together with LR"); This addresses PR43016. llvm-svn: 369122	2019-08-16 15:42:28 +00:00
Evandro Menezes	05e9c2ac2e	[InstCombine] Simplify pow(2.0, itofp(y)) to ldexp(1.0, y) Simplify `pow(2.0, itofp(y))` to `ldexp(1.0, y)`. Differential revision: https://reviews.llvm.org/D65979 llvm-svn: 369120	2019-08-16 15:33:41 +00:00
Cyndy Ishida	5f865ecf06	[TextAPI] Update reader to be supported by lib/Object Summary: To be able to use the TextAPI/Reader for tbd file consumption (by libObject) it gets passed a MemoryBufferRef which isn't castable to MemoryBuffer. Updated the tests to expect that input as well. Reviewers: ributzka, steven_wu Reviewed By: steven_wu Subscribers: hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66147 llvm-svn: 369119	2019-08-16 15:30:48 +00:00
David Green	b782e61e47	[ARM] MVE sext of a load is free MVE also has some sext of loads, which will be free just as scalar instructions are. Differential Revision: https://reviews.llvm.org/D66008 llvm-svn: 369118	2019-08-16 15:13:37 +00:00
Roman Lebedev	16244fccfe	[InstCombine] Shift amount reassociation in bittest: trunc-of-shl (PR42399) Summary: This is a continuation of D63829 / https://bugs.llvm.org/show_bug.cgi?id=42399 I thought naive pattern would solve my issue, but nope, it involved truncation, thus more folds needed. This isn't really the fold i'm interested in, i need trunc-of-lshr, but i've decided to start with `shl` because it's simpler. In this case, no extra legality checks are needed: https://rise4fun.com/Alive/CAb We should be careful about not increasing instruction count, since we need to produce `zext` because `and` is done in wider type. Reviewers: spatel, nikic, xbolva00 Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66057 llvm-svn: 369117	2019-08-16 15:10:41 +00:00
Luis Marques	fa06e95898	[RISCV] Convert registers from unsigned to Register Only in public interfaces that have not yet been converted should there remain registers with unsigned type. Differential Revision: https://reviews.llvm.org/D66252 llvm-svn: 369114	2019-08-16 14:27:50 +00:00
Paul Walker	3ccee5f7c4	[AArch64InstrInfo] Stop getInstSizeInBytes returning non-zero for meta instructions. Summary: AArch64InstrInfo::getInstSizeInBytes is incorrectly treating meta instructions (e.g. CFI_INSTRUCTION) as normal instructions and giving them a size of 4. This results in branch relaxation calculating block sizes wrong. Branch relaxation also considers alignment and thus a single mistake can result in later blocks being incorrectly sized even when they themselves do not contain meta instructions. The net result is we might not relax a branch whose destination is not within range. Reviewers: nickdesaulniers, peter.smith Reviewed By: peter.smith Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66337 llvm-svn: 369111	2019-08-16 14:17:52 +00:00
Simon Pilgrim	9da4989c52	[X86] Remove unused include. NFCI. We don't use anything from TargetOptions.h directly and its included via TargetLowering.h anyhow. llvm-svn: 369110	2019-08-16 14:05:46 +00:00
David Green	6e1ac42474	[ARM] Correct register for narrowing and widening MVE loads and stores. The widening and narrowing MVE instructions like VLDRH.32 are only permitted to use low tGPR registers. This means that if they are used for a stack slot, where the register used is only decided during frame setup, we need to be able to correctly pick a thumb1 register over a normal GPR. This attempts to add the required logic into eliminateFrameIndex and rewriteT2FrameIndex, only picking the FrameReg if it is a valid register for the operands register class, and picking a valid scratch register for the register class. Differential Revision: https://reviews.llvm.org/D66285 llvm-svn: 369108	2019-08-16 13:42:39 +00:00
Florian Hahn	403e85cbc5	Revert [CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks This reverts r368997 (git commit `2a903c0b67`) It looks like this commit adds invalid predecessors to MBBs. The example below fails the verifier after MachineBlockPlacement (run llc -verify-machineinstrs): @global.4 = external constant i8* declare i32 @zot(...) define i16* @snork.67() personality i8* bitcast (i32 (...)* @zot to i8*) { bb: invoke void undef() to label %bb5 unwind label %bb4 bb4: ; preds = %bb %tmp = landingpad { i8*, i32 } catch i8* null unreachable bb5: ; preds = %bb %tmp6 = load i32, i32* null, align 4 %tmp7 = icmp eq i32 %tmp6, 0 br i1 %tmp7, label %bb14, label %bb8 bb8: ; preds = %bb11, %bb5 invoke void undef() to label %bb9 unwind label %bb11 bb9: ; preds = %bb8 %tmp10 = invoke i16* undef() to label %bb14 unwind label %bb11 bb11: ; preds = %bb9, %bb8 %tmp12 = landingpad { i8*, i32 } cleanup catch i8* bitcast (i8** @global.4 to i8*) %tmp13 = icmp ult i64 undef, undef br i1 %tmp13, label %bb8, label %bb14 bb14: ; preds = %bb11, %bb9, %bb5 %tmp15 = phi i16* [ null, %bb5 ], [ null, %bb11 ], [ %tmp10, %bb9 ] ret i16* %tmp15 } llvm-svn: 369104	2019-08-16 13:19:29 +00:00
Bjorn Pettersson	9dddd26e31	[DAGCombiner] Add simple folds for SMULFIX/UMULFIX/SMULFIXSAT Summary: Add the following DAGCombiner folds for mulfix being one of SMULFIX/UMULFIX/SMULFIXSAT: (mulfix x, undef, scale) -> 0 (mulfix x, 0, scale) -> 0 Also added canonicalization of constants to RHS. Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66052 llvm-svn: 369103	2019-08-16 13:16:48 +00:00
David Green	8c2c5f5045	[ARM] Don't pretend we know how to generate MVE VLDn We don't yet know how to generate these instructions for MVE. And in the case of VLD3, we don't even have the instruction. For the moment don't tell the vectoriser that we have VLD4, just to end up serialising the results. Differential Revision: https://reviews.llvm.org/D66009 llvm-svn: 369101	2019-08-16 13:06:49 +00:00
Lewis Revill	d3f774d33c	[RISCV] Allow parsing of bare symbols with offsets This patch allows symbols followed by an expression for an offset to be parsed as bare symbols. Differential Revision: https://reviews.llvm.org/D57332 llvm-svn: 369097	2019-08-16 12:00:56 +00:00
Benjamin Kramer	31a47f9890	Revert "[CallGraph] Refine call graph for indirect calls with !callees metadata" This reverts commit r369025. Crashes clang, test case is on the mailing list. llvm-svn: 369096	2019-08-16 10:59:18 +00:00
Lewis Revill	7abf863f76	[RISCV] Lower inline asm constraint A for RISC-V This allows arguments with the constraint A to be lowered to input nodes for RISC-V, which implies a memory address stored in a register. This patch adds the minimal amount of code required to get operands with the right constraints to compile. https://reviews.llvm.org/D54296 llvm-svn: 369095	2019-08-16 10:28:34 +00:00
Simon Pilgrim	59894d4668	[SLPVectorizer] Silence null dereference warning. NFCI. cppcheck + MSVC analyzer both overzealously warn that we might dereference a null Bundle pointer - add an assertion to check for null to silence the warning, plus it's a good idea to check that we succeeded in finding a schedule bundle anyway. llvm-svn: 369094	2019-08-16 10:28:23 +00:00
Jeremy Morse	8b593480d3	[DebugInfo] Handle complex expressions with spills in LiveDebugValues In r369026 we disabled spill-recognition in LiveDebugValues for anything that has a complex expression. This is because it's hard to recover the complex expression once the spill location is baked into it. This patch re-enables spill-recognition and slightly adjusts the DBG_VALUE insts that LiveDebugValues tracks: instead of tracking the last DBG_VALUE for a variable, it tracks the last _unspilt_ DBG_VALUE. The spill-restore code is then able to access and copy the original complex expression; but the rest of LiveDebugValues has to be aware of the slight semantic shift, and produce a new spilt location if a spilt location is propagated between blocks. The test added produces an incorrect variable location (see FIXME), which will be the subject of future work. Differential Revision: https://reviews.llvm.org/D65368 llvm-svn: 369092	2019-08-16 10:04:17 +00:00
Tim Northover	22970d66be	AssumptionCache: remove old affected values after RAUW. If they're left in the cache then they can't be removed efficiently when the cache is notified to unlink a @llvm.assume call, and that can lead to values from different functions entirely remaining there. llvm-svn: 369091	2019-08-16 09:34:27 +00:00
Florian Hahn	75be1a9e58	[ValueTracking] Fix recurrence detection to check both PHI operands. Summary: Currently we fail to compute known bits for recurrences where the first incoming value is the start value of the recurrence. Instead of exiting the loop when the first incoming value is not the step of the recurrence, continue to check the second incoming value. The original code uses a loop to handle both cases, but incorrectly exits instead of continuing. Reviewers: lebedev.ri, spatel, nikic Reviewed By: lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66216 llvm-svn: 369088	2019-08-16 09:15:02 +00:00
Craig Topper	120cffccf8	[X86] Manually reimplement getTargetInsertSubreg in X86DAGToDAGISel::matchBitExtract so we can call insertDAGNode on the target constant. This is needed to maintain the topological sort order. Fixes PR42992. llvm-svn: 369084	2019-08-16 04:47:44 +00:00
Igor Kudrin	a33004aca7	Remove the temporary code. NFC. That should have been done in rL368156 but somehow was missed. llvm-svn: 369082	2019-08-16 03:40:04 +00:00
Nico Weber	ee96499a42	Revert r368987, it caused PR43016. llvm-svn: 369080	2019-08-16 02:21:21 +00:00
Jonas Devlieghere	de0ce98abe	[DebugLine] Don't try to guess the path style In r368879 I made an attempt to guess the path style from the files in the line table. After some consideration I now think this is a poor idea. This patch undoes that behavior and instead adds an optional argument to specify the path style. This allows us to make that decision elsewhere where we have more information. In case of LLDB based on the Unit. llvm-svn: 369072	2019-08-15 23:53:15 +00:00
Volkan Keles	0ae6006bee	[GlobalISel] CSEMIRBuilder: Add support for G_GEP Summary: This patch adds G_GEP to `shouldCSEOpc` so that it can be CSEd. It also refactors `translateGetElementPtr` by replacing `createGenericVirtualRegister` calls with types. Reviewers: aditya_nandakumar, arsenm, dsanders, paquette, aemerson Reviewed By: aditya_nandakumar Subscribers: wdng, rovka, javed.absar, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66316 llvm-svn: 369070	2019-08-15 23:45:45 +00:00
Eli Friedman	9b9a308452	[ARM][LowOverheadLoops] Fix generated code for "revert". Two issues: 1. t2CMPri shouldn't use CPSR if it isn't predicated. This doesn't really have any visible effect at the moment, but it might matter in the future. 2. The t2CMPri generated for t2WhileLoopStart might need to use a register that isn't LR. My team found this because we have a patch to track register liveness late in the pass pipeline. I'll look into upstreaming it to help catch issues like this earlier. Differential Revision: https://reviews.llvm.org/D66243 llvm-svn: 369069	2019-08-15 23:35:53 +00:00
Jonas Devlieghere	6d6babf745	[Support] Re-introduce the RWMutexImpl for macOS < 10.12 In r369018, Benjamin replaced the custom RWMutex implementation with their C++14 counterpart. Unfortunately, std::shared_timed_mutex is only available on macOS 10.12 and later. This prevents LLVM from compiling even on newer versions of the OS when you have an older deployment target. This patch reintroduced the old RWMutexImpl but guards it by the macOS availability macro. Differential revision: https://reviews.llvm.org/D66313 llvm-svn: 369064	2019-08-15 23:07:20 +00:00
Evgeniy Stepanov	75344955fc	Move isPointerOffset function to ValueTracking (NFC). Summary: To be reused in MemTag sanitizer. Reviewers: pcc, vitalybuka, ostannard Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66165 llvm-svn: 369062	2019-08-15 22:58:28 +00:00
Philip Reames	5c38ca3534	[SDAG] Minor code cleanup/standardization of atomic accessors [NFC] llvm-svn: 369057	2019-08-15 22:21:14 +00:00
Evgeniy Stepanov	10ce5f88d1	Add missing MIR serialization text for AArch64II::MO_TAGGED. Reviewers: pcc Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66312 llvm-svn: 369053	2019-08-15 22:03:55 +00:00
Alina Sbirlea	79ff20428e	[MemorySSA] Remove restrictive asserts. The verification I added has overly restrictive asserts. Unreachable blocks can have any incoming value in practice, after an update due to a "replaceAllUses" call when the replaced entry is LiveOnEntry. llvm-svn: 369050	2019-08-15 21:20:08 +00:00
Daniel Sanders	0c47611131	Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Partial reverts in: X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned& MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register PPCFastISel.cpp - No Register::operator-=() PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned& MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor Manual fixups in: ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned& HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register. PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned& Depends on D65919 Reviewers: arsenm, bogner, craig.topper, RKSimon Reviewed By: arsenm Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65962 llvm-svn: 369041	2019-08-15 19:22:08 +00:00
Krzysztof Parzyszek	8e987702b1	[Hexagon] Fix instruction selection for vselect v4i8 llvm-svn: 369040	2019-08-15 19:20:09 +00:00
Matt Arsenault	1f2b727298	MVT: Add v3i16/v3f16 vectors AMDGPU has some buffer intrinsics which theoretically could use this. Some of the generated tables include the 3 and 4 element vector versions of these rounded to 64-bits, which is ambiguous. Add these to help the table disambiguate these. Assertion change is for the path odd sized vectors now take for R600. v3i16 is widened to v4i16, which then needs to be promoted to v4i32. llvm-svn: 369038	2019-08-15 18:58:25 +00:00
Philip Reames	d202899431	[NFC] Add a couple of dump routines for RegisterPressure helper classes llvm-svn: 369037	2019-08-15 18:49:39 +00:00
Florian Hahn	3f2850bc60	[ValueTracking] Look through ptrmask intrinsics during getUnderlyingObject. Reviewers: nlopes, efriedma, hfinkel, sanjoy, aqjune, jdoerfert Reviewed By: jdoerfert Subscribers: jdoerfert, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61669 llvm-svn: 369036	2019-08-15 18:39:56 +00:00
Craig Topper	2a372ba534	[X86] Add custom type legalization for bitcasting mmx to v2i32/v4i16/v8i8 to use movq2dq instead of going through memory. llvm-svn: 369031	2019-08-15 18:23:37 +00:00
Benjamin Kramer	2e62396c2f	Link libpthread into LLVMCore.so After r369018 the compiler can inline pthread calls into users of RWMutex. llvm-svn: 369029	2019-08-15 18:06:30 +00:00
Pavel Labath	11d9e46f8e	Revert "MemoryBuffer: Add a missing error-check to getOpenFileImpl" This reverts commit r368977 because it broke a couple of tests in lldb. llvm-svn: 369027	2019-08-15 17:52:40 +00:00
Jeremy Morse	c476124bc8	[DebugInfo] Avoid crash from dropped fragments in LiveDebugValues This patch avoids a crash caused by DW_OP_LLVM_fragments being dropped from DIExpressions by LiveDebugValues spill-restore code. The appearance of a previously unseen fragment configuration confuses LDV, as documented in PR42773, and reproduced by the test function this patch adds (Crashes on an x86_64 debug build). To avoid this, on spill restore, we now use fragment information from the spilt-location-expression. In addition, when spilling, we now don't spill any DBG_VALUE with a complex expression, as it can't be safely restored and will definitely lead to an incorrect variable location. The discussion of this is in D65368. Differential Revision: https://reviews.llvm.org/D66284 llvm-svn: 369026	2019-08-15 17:49:46 +00:00
Mark Lacey	626ed22fbe	[CallGraph] Refine call graph for indirect calls with !callees metadata For indirect call sites having a small set of possible callees, !callees metadata can be used to indicate what those callees are. This patch updates the call graph and lazy call graph analyses so that they consider this metadata when encountering call sites. For the call graph, it adds a new external call graph node to the graph for each unique !callees metadata node. A call graph edge connects an indirect call site with the external node associated with the !callees metadata that is attached to it. And there is an edge from this external node to each of the callees indicated by the metadata. Similarly, for the lazy call graph, the patch adds Ref edges from a caller to the possible callees indicated by the metadata. The primary purpose of the patch is to facilitate iterating over the functions in a module such that all of the callees indicated by a given !callees metadata node will be visited prior to the functions containing call sites annotated by that node. This property is required by optimizations performing a bottom-up traversal of the SCC DAG. For example, the inliner can be made to inline through an indirect call. If the call site is annotated with !callees metadata, this patch ensures that the inliner will have visited all of the callees prior to the caller, allowing it to reliably compute the cost of inlining one or more of the potential callees. Original patch by @mssimpso. I've made some small changes to get it to apply, build, and pass tests on the top of tree, as well as some minor tweaks to formatting and functionality. Subscribers: mehdi_amini, hiraditya, llvm-commits, mssimpso Tags: #llvm Differential Revision: https://reviews.llvm.org/D39339 llvm-svn: 369025	2019-08-15 17:47:53 +00:00
Taewook Oh	213d8a9f13	[NewPM][PassInstrumentation] IR printing support for (Thin)LTO Summary: IR printing has not been correctly supported with (Thin)LTO if the new pass manager is enabled. Previously we only get outputs from backend(codegen) passes, as they are still under legacy pass manager even when the new pass manager is enabled. This patch addresses the issue and enables IR printing for optimization passes with new pass manager + (Thin)LTO setting. Reviewers: fedor.sergeev, philip.pfaffe Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66253 llvm-svn: 369024	2019-08-15 17:47:44 +00:00
Craig Topper	6eebd2bcd7	[X86] Improve cost model for subvector extraction of less than 128-bit vectors Now that we're using widening legalization, we need to improve our extract_subvector cost model for these types. This patch begins by modeling these as a subvector extract followed by a permute. I've left FIXMEs in the code for future improvements. Differential Revision: https://reviews.llvm.org/D65892 llvm-svn: 369022	2019-08-15 17:29:42 +00:00
Benjamin Kramer	8d3a1523dd	[Support] Base RWMutex on std::shared_timed_mutex (C++14) This should have the same semantics. We use std::shared_mutex instead on MSVC and C++17: std::shared_timed_mutex is less efficient than our custom implementation on Windows, and std::shared_mutex should be faster. llvm-svn: 369018	2019-08-15 16:55:23 +00:00
Krzysztof Parzyszek	8460301d58	[Hexagon] Generate vector min/max for HVX llvm-svn: 369014	2019-08-15 16:13:17 +00:00
Jonas Devlieghere	0eaee545ee	[llvm] Migrate llvm::make_unique to std::make_unique Now that we've moved to C++14, we no longer need the llvm::make_unique implementation from STLExtras.h. This patch is a mechanical replacement of (hopefully) all the llvm::make_unique instances across the monorepo. llvm-svn: 369013	2019-08-15 15:54:37 +00:00
Andrea Di Biagio	3de2f0330f	[MCA] Slightly refactor class RetireControlUnit, and add the ability to override the mask of used buffered resources in class mca::Instruction. NFCI This patch teaches the RCU how to peek 'next' RCUTokens. A new method has been added to the RetireControlUnit class with the goal of minimizing the complexity of follow-up patches that will enable macro-fusion support in mca. This patch also adds method Instruction::getNumMicroOpcodes() to simplify common interactions with the instruction descriptor (a pattern quite common in some pipeline stages). Added the ability to override the default set of consumed scheduler resources (this -again- is to simplify future patches that add support for macro-op fusion). No functional change intended. llvm-svn: 369010	2019-08-15 15:27:40 +00:00
Simon Pilgrim	d4df81f463	Remove SmallBitVector.h include. NFCI. SmallBitVector/BitVector types aren't used at all in the cpp file. llvm-svn: 369008	2019-08-15 14:40:37 +00:00
Simon Pilgrim	983e9118a2	Remove BitVector.h include. NFCI. BitVector type isn't used at all in the cpp file. llvm-svn: 369007	2019-08-15 14:39:28 +00:00
Jinsong Ji	9fd81dc139	[PowerPC] Use xxleqv to set all one vector IMM(-1). Summary: xxspltib/vspltisb are 3 cycle PM instructions, xxleqv is 2 cycle ALU instruction. We should use xxleqv to set all one vectors. Reviewers: hfinkel, nemanjai, steven.zhang Subscribers: hiraditya, kbarton, MaskRay, shchenz, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65529 llvm-svn: 369006	2019-08-15 14:32:51 +00:00
Simon Pilgrim	ed804dad1e	[DAGCombine] MergeConsecutiveStores - fix cppcheck/MSVC extension warning. NFCI. Set the StartIdx type to size_t so that it matches the StoreNodes SmallVector size() and index types. Silences the MSVC analyzer warning that unsigned increment might overflow before exceeding size_t on 64-bit targets - this isn't likely to happen but it means we use consistent types and reduces the warning "noise" a little. llvm-svn: 368998	2019-08-15 13:07:14 +00:00
Kang Zhang	2a903c0b67	[CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks Summary: This patch triggered a bug in r368339; r368339 has since been reverted, so this patch is upstreamed again. The `block-placement` pass creates some patterns for unconditional branches on which we can do the simple early return. But the `early-ret` pass runs before `block-placement`, and we don't want to run it again. This patch does the simple early return at the end of `block-placement` to optimize the blocks. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D63972 llvm-svn: 368997	2019-08-15 13:05:16 +00:00
David Green	3a99101812	[ARM] Fix alignment checks for BE VLDRH We need to allow any alignment at least 2, not just exactly 2, so that the big endian loads and stores can be selected successfully. I've also added extra BE testing for the load and store tests. Thanks to Oliver for the report. Differential Revision: https://reviews.llvm.org/D66222 llvm-svn: 368996	2019-08-15 12:54:47 +00:00
Sanjay Patel	57d459309d	[SDAG][x86] check for relaxed math when matching an FP reduction If the last step in an FP add reduction allows reassociation and doesn't care about -0.0, then we are free to recognize that computation as a reduction that may reorder the intermediate steps. This is requested directly by PR42705: https://bugs.llvm.org/show_bug.cgi?id=42705 and solves PR42947 (if horizontal math instructions are actually faster than the alternative): https://bugs.llvm.org/show_bug.cgi?id=42947 Differential Revision: https://reviews.llvm.org/D66236 llvm-svn: 368995	2019-08-15 12:43:15 +00:00
Andrea Di Biagio	7aa0dbb664	[MCA] Slightly refactor the logic in ResourceManager. NFCI This patch slightly changes the API in the attempt to simplify resource buffer queries. It is done in preparation for a patch that will enable support for macro fusion. llvm-svn: 368994	2019-08-15 12:39:55 +00:00
Florian Hahn	fd72bf21c9	[ValueTracking] Add MustPreserveNullness arg to functions analyzing calls. (NFC) Some uses of getArgumentAliasingToReturnedPointer and isIntrinsicReturningPointerAliasingArgumentWithoutCapturing require the calls/intrinsics to preserve the nullness of the argument. For alias analysis, the nullness property does not really come into play. This patch explicitly sets it to true. In D61669, the alias analysis uses will be switched to not require preserving nullness. Reviewers: nlopes, efriedma, hfinkel, sanjoy, aqjune, jdoerfert Reviewed By: jdoerfert Tags: #llvm Differential Revision: https://reviews.llvm.org/D64150 llvm-svn: 368993	2019-08-15 12:13:02 +00:00
David Green	0ff2296a49	[ARM] MVE predicate store patterns Stack loads and stores were already working, but direct stores were not. This adds the patterns for them, same as predicate loads. Differential Revision: https://reviews.llvm.org/D66213 llvm-svn: 368988	2019-08-15 10:41:42 +00:00
Sander de Smalen	643adb5576	[AArch64] Change location of frame-record within callee-save area. This patch changes the location of the frame-record (FP, LR) to the bottom of the callee-saved area. According to the AAPCS the location of the frame-record within the stackframe is unspecified (section 5.2.3 The Frame Pointer), so the compiler should be free to choose a different location. The reason for changing the location of the frame-record is to prepare the frame for allocating an SVE area below the callee-saves. This way the compiler can use the VL-scaled addressing modes to directly access SVE objects from the frame-pointer. Old layout, from high to low addresses: stack args; frame-record x30, x29 (FP points here); x19, x20, x21, x22; realignment gap; spills/locals (SP at the bottom). New layout, from high to low addresses: stack args; x19, x20, x21, x22; frame-record x30, x29 (FP points here, at the bottom of the callee-save area); realignment gap; spills/locals (SP at the bottom). Things to point out: - The algorithm to find a paired register should be prevented from accidentally pairing some callee-saved register with LR that is not FP, since they should always be paired together when the frame has a frame-record. - For Darwin platforms the location of the frame-record is unchanged, since the unwind encoding does not allow for encoding this position dynamically and other tools currently depend on the former layout. Reviewers: efriedma, rovka, rengolin, thegameg, greened, t.p.northover Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D65653 llvm-svn: 368987	2019-08-15 10:34:16 +00:00
Florian Hahn	de1d6c8220	Add ptrmask intrinsic This patch adds a ptrmask intrinsic which allows masking out bits of a pointer that must be zero when accessing it, because of ABI alignment requirements or a restriction of the meaningful bits of a pointer through the data layout. This avoids doing a ptrtoint/inttoptr round trip in some cases (e.g. tagged pointers) and allows us to not lose information about the underlying object. Reviewers: nlopes, efriedma, hfinkel, sanjoy, jdoerfert, aqjune Reviewed by: sanjoy, jdoerfert Differential Revision: https://reviews.llvm.org/D59065 llvm-svn: 368986	2019-08-15 10:12:26 +00:00
Sven van Haastregt	0096d1938e	[Support] Fix Wundef warning llvm-svn: 368984	2019-08-15 10:05:22 +00:00
David Green	04f2f32869	[ARM] MVE trunc to i1 vectors This adds patterns for selecting trunc instructions from full vectors to i1's vectors. Differential Revision: https://reviews.llvm.org/D66201 llvm-svn: 368981	2019-08-15 09:26:51 +00:00
Pavel Labath	46bfdb956c	MemoryBuffer: Add a missing error-check to getOpenFileImpl Summary: In case the function was called with a desired read size and the file was not an "mmap()" candidate, the function was falling back to a "pread()", but it was failing to check the result of that system call. This meant that the function would return "success" even though the read operation failed, and it returned a buffer full of uninitialized memory. Reviewers: rnk, dblaikie Subscribers: kristina, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66224 llvm-svn: 368977	2019-08-15 08:20:15 +00:00
Dorit Nuzman	d57d73daed	[LV] fold-tail predication should be respected even with assume_safety assume_safety implies that loads under "if's" can be safely executed speculatively (unguarded, unmasked). However this assumption holds only for the original user "if's", not those introduced by the compiler, such as the fold-tail "if" that guards us from loading beyond the original loop trip-count. Currently the combination of fold-tail and assume-safety pragmas results in ignoring the fold-tail predicate that guards the loads, generating unmasked loads. This patch fixes this behavior. Differential Revision: https://reviews.llvm.org/D66106 Reviewers: Ayal, hsaito, fhahn llvm-svn: 368973	2019-08-15 07:12:14 +00:00
Craig Topper	1e246b20c0	[X86] Add isel pattern to match VZEXT_MOVL and a v2i64 scalar_to_vector bitcasted from x86mmx to MOVQ2DQ. We already had the pattern for just the scalar to vector and bitcast, but not the case where we wanted zeroes in the high half of the xmm. llvm-svn: 368972	2019-08-15 06:46:30 +00:00
Craig Topper	e6409602a1	[X86] Make sure load is non-volatile in the MMX_X86movdq2q (loadv2i64) isel pattern. This pattern will narrow the load so we should make sure it's not volatile. llvm-svn: 368971	2019-08-15 06:46:26 +00:00
Craig Topper	dbcbbf5658	[X86] Remove unneeded isel pattern for v4f32->v4i32 fp_to_sint and conversion to MMX. fp_to_sint is turned into X86cvttp2si during isel preprocessing. The other redundant isel patterns were removed previously, but I missed this one because it's in the MMX td file. llvm-svn: 368968	2019-08-15 05:52:02 +00:00
Craig Topper	a57734ba4e	[X86] Disable custom type legalization for v2i32/v4i16/v8i8->i64. The default legalization can take care of this. llvm-svn: 368967	2019-08-15 05:51:58 +00:00
Craig Topper	57286afe4e	[X86] Disable custom type legalization for v2i32/v4i16/v8i8->f64 bitcast. The generic legalization handles this in the same way so just use that. llvm-svn: 368966	2019-08-15 05:51:54 +00:00
Craig Topper	ba39fcd8c6	[X86] Remove some unreachable code from LowerBITCAST. llvm-svn: 368965	2019-08-15 05:51:50 +00:00
Michael Pozulp	9abf668c08	[llvm-objdump] Add warning messages if disassembly + source for problematic inputs Summary: Addresses https://bugs.llvm.org/show_bug.cgi?id=41905 Reviewers: jhenderson, rupprecht, grimar Reviewed By: jhenderson, grimar Subscribers: RKSimon, MaskRay, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62462 llvm-svn: 368963	2019-08-15 05:15:22 +00:00
Craig Topper	14f7560020	[X86] Remove some dead code and combine some repeated code that's left. If the width is 256 bits, then we must have AVX so the else here was unnecessary. Once that's removed then the >= 256 bit code is identical to the 128 bit code with a different VT so combine them. llvm-svn: 368956	2019-08-15 04:07:43 +00:00
Jonas Devlieghere	ed3b6d1bb2	Revert "Expose TailCallKind via the LLVM C API" This is failing on several build bots. Reverting as discussed in https://reviews.llvm.org/D66061. llvm-svn: 368953	2019-08-15 03:49:51 +00:00
Gor Nishanov	efe0093404	[coroutine] Fixes "cannot move instruction since its users are not dominated by CoroBegin" problem. Summary: Fixes https://bugs.llvm.org/show_bug.cgi?id=36578 and https://bugs.llvm.org/show_bug.cgi?id=36296. Supersedes: https://reviews.llvm.org/D55966 One of the fundamental transformations that the CoroSplit pass performs before splitting the coroutine is to find which values need to survive between suspend and resume and provide a slot for them in the coroutine frame to spill and restore the value as needed. The coroutine frame becomes available once the storage for it has been allocated, and that point is marked in the pre-split coroutine with an llvm.coro.begin intrinsic. The FE normally puts all of the user-authored code that would be accessing those values after llvm.coro.begin; however, sometimes instructions accessing those values end up prior to coro.begin. Examples are writing out the value of a parameter into the alloca done by the FE, or instructions added by optimization passes such as SROA when it rewrites allocas. Prior to this change, the CoroSplit pass would try to move instructions that may end up accessing the values in the coroutine frame after CoroBegin. However, it would run into problems (report_fatal_error) if some of the values were used both in the allocation function (for example, an allocator is passed as a parameter to a coroutine) and in the user-authored body of the coroutine. To handle this case and to simplify the instruction moving logic, this change removes all of the instruction moving. Instead, we only change the uses of the spilled values that are dominated by coro.begin and leave other instructions intact. Before: ``` %var = alloca i32 %1 = getelementptr .. %var; ; will move this one after coro.begin %f = call i8* @llvm.coro.begin( ``` After: ``` %var = alloca i32 %1 = getelementptr .. %var; stays put %f = call i8* @llvm.coro.begin( ``` If we discover that there is a potential write into an alloca prior to coro.begin, we copy its value from the alloca into the spill slot in the coroutine frame. Before: ``` %var = alloca i32 store .. %var ; will move this one after coro.begin %f = call i8* @llvm.coro.begin( ``` After: ``` %var = alloca i32 store .. %var ;stays put %f = call i8* @llvm.coro.begin( %tmp = load %var store %tmp, %spill.slot.for.var ``` Note: This change does not handle array allocas, as that is something the C++ FE does not produce, but it can be added in the future if the need arises. Reviewers: llvm-commits, modocache, ben-clayton, tks2103, rjmccall Reviewed By: modocache Subscribers: bartdesmet Differential Revision: https://reviews.llvm.org/D66230 llvm-svn: 368949	2019-08-15 00:48:51 +00:00
Robert Widmann	708c4605a1	Expose TailCallKind via the LLVM C API Summary: This exposes `CallInst`'s tail call kind via new `LLVMGetTailCallKind` and `LLVMSetTailCallKind` functions. The motivation for this is to be able to see `musttail` for languages that require mandatory tail calls for correctness. Today only the weaker `LLVMSetTail` is exposed and there is no way to set `GuaranteedTailCallOpt` via the C API. Reviewers: CodaFi, jyknight, deadalnix, rnk Reviewed By: CodaFi Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66061 llvm-svn: 368945	2019-08-14 23:54:35 +00:00
Johannes Doerfert	54f6be7b83	[Attributor] Try to fix "missing field 'RetInsts' initializer" warning http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/35674/steps/build_Lld/logs/stdio llvm-svn: 368938	2019-08-14 22:32:29 +00:00
Johannes Doerfert	5304b72a81	[Attributor][NFC] Make debug output consistent llvm-svn: 368931	2019-08-14 22:04:28 +00:00
Philip Reames	7b0515176b	[SCEV] Rename getMaxBackedgeTakenCount to getConstantMaxBackedgeTakenCount [NFC] llvm-svn: 368930	2019-08-14 21:58:13 +00:00
Johannes Doerfert	4395b31d99	[Attributor][NFC] Try to eliminate warnings (debug build + fall through) llvm-svn: 368928	2019-08-14 21:46:28 +00:00
Johannes Doerfert	17b578bc75	[Attributor][NFC] Introduce statistics macros for new positions llvm-svn: 368927	2019-08-14 21:46:25 +00:00
Craig Topper	e7ea06b7d2	[SelectionDAGBuilder] Teach gather/scatter getUniformBase to look through vector zeroinitializer indices in addition to scalar zeroes. llvm-svn: 368926	2019-08-14 21:38:56 +00:00
Johannes Doerfert	e1e844d6b0	[Attributor][NFC] Add merge/join/clamp operators to the IntegerState Differential Revision: https://reviews.llvm.org/D66146 llvm-svn: 368925	2019-08-14 21:35:20 +00:00
Johannes Doerfert	6a1274a52e	[Attributor] Use the AANoNull attribute directly in AADereferenceable Summary: Instead of constantly keeping track of the nonnull status with the dereferenceable information we can simply query the nonnull attribute whenever we need the information (debug + manifest). Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66113 llvm-svn: 368924	2019-08-14 21:31:32 +00:00
Amara Emerson	1222cfd5fe	[AArch64][GlobalISel] Custom selection for s8 load acquire. Implement this single atomic load instruction so that we can compile stack protector code. Differential Revision: https://reviews.llvm.org/D66245 llvm-svn: 368923	2019-08-14 21:30:30 +00:00
Johannes Doerfert	def9928204	[Attributor] Use liveness during the creation of AAReturnedValues Summary: As one of the first attributes, and one of the complex ones, AAReturnedValues was not using liveness but we filtered the result after the fact. This change adds liveness usage during the creation. The algorithm is also improved and shorter. The new algorithm will collect returned values over time using the generic facilities that work with liveness already, e.g., genericValueTraversal which does not look at dead PHI node predecessors. A test to show how this leads to better results is included. Note: Unresolved calls and resolved calls are now tracked explicitly. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66120 llvm-svn: 368922	2019-08-14 21:29:37 +00:00
Johannes Doerfert	9a1a1f96d9	[Attributor] Do not update or manifest dead attributes Summary: If the associated context instruction is assumed dead we do not need to update or manifest the state. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66116 llvm-svn: 368921	2019-08-14 21:25:08 +00:00
Johannes Doerfert	710ebb03ed	[Attributor] Use IRPosition consistently Summary: The next attempt to clean up the Attributor interface before we grow it further. Before, we used a combination of two values (associated + anchor) and an argument number (or -1) to determine a location. This was very fragile. The new system uses IR positions exclusively, and we restrict the generation of IR positions to special constructor methods that verify the internal constraints we have. This will catch misuse early. The auto-conversion, e.g., in getAAFor, is now performed through the SubsumingPositionIterator. This iterator takes an IR position and allows visiting all IR positions that "subsume" the given one, e.g., function attributes "subsume" argument attributes of that function. For a detailed breakdown see the class comment of SubsumingPositionIterator. This patch also introduces IRPosition::getAttrs() to extract IR attributes at a certain position. The method knows how to look up in different positions that are equivalent, e.g., the argument position for call site arguments. We also introduce three new position kinds such that we have all IR positions where attributes can be placed and one for "floating" values. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65977 llvm-svn: 368919	2019-08-14 21:18:01 +00:00
Dinar Temirbulatov	da0435a690	[SLP][NFC] Use pointers to address to ScalarToTreeEntry elements, instead of indexes. llvm-svn: 368906	2019-08-14 19:46:50 +00:00
Sanjay Patel	ecccf29e6c	[SDAG] move variable closer to use; NFC llvm-svn: 368905	2019-08-14 19:46:15 +00:00
Jan Korous	14230f9926	[Support][NFC] Fix error message for posix_spawn_file_actions_addopen failed call Seems like a copy-paste from a couple of lines above. llvm-svn: 368899	2019-08-14 18:30:18 +00:00
Philip Reames	6cca3ad43e	[RLEV] Rewrite loop exit values for multiple exit loops w/o overall loop exit count We already supported rewriting loop exit values for multiple exit loops, but if any of the loop exits were not computable, we gave up on all loop exit values. This patch generalizes the existing code to handle individual computable loop exits where possible. As discussed in the review, this is a starting point for figuring out a better API. The code is a bit ugly, but getting it in lets us test as we go. Differential Revision: https://reviews.llvm.org/D65544 llvm-svn: 368898	2019-08-14 18:27:57 +00:00
Matt Arsenault	dbc1f207fa	InferAddressSpaces: Move target intrinsic handling to TTI I'm planning on handling intrinsics that will benefit from checking the address space enums. Don't bother moving the address collection for now, since those won't need the enums. llvm-svn: 368895	2019-08-14 18:13:00 +00:00
Matt Arsenault	0eac2a2963	InferAddressSpaces: Remove unnecessary check for ConstantInt The IR is invalid if this isn't a constant since immarg was added. llvm-svn: 368893	2019-08-14 18:01:42 +00:00
Taewook Oh	df7022825c	[DebugInfo] Consider debug label scope has an extra lexical block file Summary: There are places where the case of a debug label scope having an extra lexical block file is not handled properly. The modified test won't pass without this patch. Reviewers: aprantl, HsiangKai Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66187 llvm-svn: 368891	2019-08-14 17:58:45 +00:00
David Bolvansky	f94460d4b6	[SLC] Dereferenceable annotation - handle valid null pointers Reviewers: jdoerfert, reames Reviewed By: jdoerfert Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66161 llvm-svn: 368884	2019-08-14 17:15:20 +00:00
Jonas Devlieghere	c0a9b1edca	[DebugLine] Improve path handling. After switching over LLDB's line table parser to libDebugInfo, we noticed two regressions on the Windows bot. The problem is that when obtaining a file from the line table prologue, we append paths without specifying a path style. This leads to incorrect results on Windows for debug info containing Posix paths: 0x0000000000201000: /tmp\b.c, is_start_of_statement = TRUE This patch is an attempt to fix that by guessing the path style whenever possible. Differential revision: https://reviews.llvm.org/D66227 llvm-svn: 368879	2019-08-14 17:00:10 +00:00
David Bolvansky	0e0fbae1a4	[BuildLibCalls] Noalias annotation Summary: I think this is a better solution than annotating callsites in IC/SLC. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: MaskRay, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66217 llvm-svn: 368875	2019-08-14 16:50:06 +00:00
Bill Wendling	cc2bebe039	Ignore indirect branches from callbr. Summary: We can't speculate around indirect branches: indirectbr and invoke. The callbr instruction needs to be included here. Reviewers: nickdesaulniers, manojgupta, chandlerc Reviewed By: chandlerc Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66200 llvm-svn: 368873	2019-08-14 16:44:07 +00:00
Thomas Lively	de0133eaa2	[WebAssembly] Stop unrolling SIMD shifts since they are fixed in V8 Summary: Fixes PR42973. Tests don't change because simd-arith.ll tests behavior on unimplemented-simd128, which does not include any temporary workarounds such as the one removed in this revision. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66166 llvm-svn: 368868	2019-08-14 16:24:37 +00:00
Craig Topper	3e44d96170	[X86] Use PSADBW for v8i8 addition reductions. Improves the 8 byte case from PR42674. Differential Revision: https://reviews.llvm.org/D66069 llvm-svn: 368864	2019-08-14 15:57:29 +00:00
Xiangling Liao	49661f94c8	[NFC][AIX] Change assertion Address one left comment on https://reviews.llvm.org/D63547. A minor change for assertion. Differential Revision: https://reviews.llvm.org/D63547 llvm-svn: 368860	2019-08-14 14:57:25 +00:00
Craig Topper	30d3e9c395	[X86][CostModel] Adjust the costs of ZERO_EXTEND/SIGN_EXTEND with less than 128-bit inputs Now that we legalize by widening, the element types here won't change. Previously these were modeled as the elements being widened and then the instruction might become an AND or SHL/ASHR pair. But now they'll become something like a ZERO_EXTEND_VECTOR_INREG/SIGN_EXTEND_VECTOR_INREG. For AVX2, when the destination type is legal it's clear the cost should be 1 since we have extend instructions that can produce 256 bit vectors from less than 128 bit vectors. I'm a little less sure about AVX1 costs, but I think the ones I changed were definitely too high, but they might still be too high. Differential Revision: https://reviews.llvm.org/D66169 llvm-svn: 368858	2019-08-14 14:52:39 +00:00
Craig Topper	8c545168ee	[X86] Add llvm_unreachable to a switch that covers all expected values. llvm-svn: 368857	2019-08-14 14:51:19 +00:00
Jinsong Ji	e71db6584d	[PowerPC][NFC] Consolidate duplicate XX3Form_SetZero and XX3Form_Zero. Rename one to XX3Form_SameOp, remove the other one. llvm-svn: 368856	2019-08-14 14:16:26 +00:00
Jason Liu	8fc095d453	[AIX] Add call lowering for parameters that could pass onto FPRs Summary: This patch adds call lowering functionality to enable passing parameters onto floating point registers when needed. Differential Revision: https://reviews.llvm.org/D63654 llvm-svn: 368855	2019-08-14 14:13:11 +00:00
Pavel Labath	0d802a4923	Revert "raw_ostream: add operator<< overload for std::error_code" This reverts commit r368849, because it breaks some bots (e.g. llvm-clang-x86_64-win-fast). It turns out this is not as NFC as we had hoped, because operator== will consider two std::error_codes to be distinct even though they both hold "success" values if they have different categories. llvm-svn: 368854	2019-08-14 13:59:04 +00:00
Pavel Labath	40837e97b1	raw_ostream: add operator<< overload for std::error_code Summary: The main motivation for this is unit tests, which contain a large macro for pretty-printing std::error_code, and this macro is duplicated in every file that needs to do this. However, the functionality may be useful elsewhere too. In this patch I have reimplemented the existing ASSERT_NO_ERROR macros to reuse the new functionality, but I have kept the macro (as a one-liner) as it is slightly more readable than ASSERT_EQ(..., std::error_code()). Reviewers: sammccall, ilya-biryukov Subscribers: zturner, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65643 llvm-svn: 368849	2019-08-14 13:33:28 +00:00
Jeremy Morse	90c2794bfc	[DebugInfo] MCP: collect and update DBG_VALUEs encountered in local block MCP currently uses changeDebugValuesDefReg / collectDebugValues to find debug users of a register, however those functions assume that all DBG_VALUEs immediately follow the specified instruction, which isn't necessarily true. This is going to become very often untrue when we turn off CodeGenPrepare::placeDbgValues. Instead of calling changeDebugValuesDefReg on an instruction to change its debug users, in this patch we instead collect DBG_VALUEs of copies as we iterate over insns, and update the debug users of copies that are made dead. This isn't a non-functional change, because MCP will now update DBG_VALUEs that aren't immediately after a copy, but refer to the same register. I've hijacked the regression test for PR38773 to test for this new behaviour, an entirely new test seemed overkill. Differential Revision: https://reviews.llvm.org/D56265 llvm-svn: 368835	2019-08-14 12:20:02 +00:00
Fangrui Song	4c8deb6172	[IR] Simplify removeDeadConstantUsers. NFC llvm-svn: 368833	2019-08-14 11:38:45 +00:00
Simon Pilgrim	828a89e244	Fix "not all control paths return a value" MSVC warnings. NFCI. llvm-svn: 368831	2019-08-14 11:31:05 +00:00
Simon Pilgrim	3f40bdb558	Fix "not all control paths return a value" MSVC warning. NFCI. llvm-svn: 368830	2019-08-14 11:29:56 +00:00
Simon Pilgrim	8bba4798c2	Fix "not all control paths return a value" MSVC warnings. NFCI. llvm-svn: 368829	2019-08-14 11:29:16 +00:00
George Rimar	bcc00e1afb	Recommit r368812 "[llvm/Object] - Convert SectionRef::getName() to return Expected<>" Changes: no changes. A fix for the clang code will be landed right on top. Original commit message: SectionRef::getName() returns std::error_code now. Returning Expected<> instead has multiple benefits. For example, it forces the user to check the error returned. Also Expected<> may keep a valuable string error message, which is more useful than having an error code. (Object\invalid.test was updated to show the new messages printed.) This patch makes a change for all users to switch to Expected<> version. Note: in a few places the error returned was ignored before my changes. In such places I left them ignored. My intention was to convert the interface used, and not to improve the existing users in this patch. (Though I think it is a good idea for follow-ups to revisit such places and either remove consumeError calls or comment each of them to clarify why it is OK to have them). Differential revision: https://reviews.llvm.org/D66089 llvm-svn: 368826	2019-08-14 11:10:11 +00:00
Fangrui Song	8caa0aaa4d	[AsmPrinter] Delete redundant .type foo, @function when emitting an ifunc In MCAsmStreamer: .type foo,@function # <--- this is redundant .type foo,@gnu_indirect_function In MCELFStreamer, the latter STT_GNU_IFUNC overrides STT_FUNC. llvm-svn: 368823	2019-08-14 10:30:27 +00:00
Roman Lebedev	32f1e1a01d	[InstCombine] Refactor getFlippedStrictnessPredicateAndConstant() out of canonicalizeCmpWithConstant(), NFCI I'd like to use it elsewhere, hopefully without reinventing the wheel. No functional change intended so far. llvm-svn: 368820	2019-08-14 09:57:20 +00:00
George Rimar	468919e182	Revert r368812 "[llvm/Object] - Convert SectionRef::getName() to return Expected<>" It broke clang BB: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/16455 llvm-svn: 368813	2019-08-14 08:56:55 +00:00
George Rimar	a0c6a35714	[llvm/Object] - Convert SectionRef::getName() to return Expected<> SectionRef::getName() returns std::error_code now. Returning Expected<> instead has multiple benefits. For example, it forces the user to check the error returned. Also Expected<> may keep a valuable string error message, which is more useful than having an error code. (Object\invalid.test was updated to show the new messages printed.) This patch makes a change for all users to switch to Expected<> version. Note: in a few places the error returned was ignored before my changes. In such places I left them ignored. My intention was to convert the interface used, and not to improve the existing users in this patch. (Though I think it is a good idea for follow-ups to revisit such places and either remove consumeError calls or comment each of them to clarify why it is OK to have them). Differential revision: https://reviews.llvm.org/D66089 llvm-svn: 368812	2019-08-14 08:46:54 +00:00
Dorit Nuzman	491ca2425d	[LV] Fold-tail flag This is the compiler-flag equivalent of the Predicate pragma (https://reviews.llvm.org/D65197), to direct the vectorizer to fold the remainder-loop into the main-loop using predication. Differential Revision: https://reviews.llvm.org/D66108 Reviewers: Ayal, hsaito, fhahn, SjoerdMeije llvm-svn: 368801	2019-08-14 05:22:20 +00:00
David L. Jones	d4edd9d97e	Revert '[LICM] Make Loop ICM profile aware' and 'Fix pass dependency for LICM' This reverts r368526 (git commit `7e71aa24bc`) This reverts r368542 (git commit `cb5a90fd31`) llvm-svn: 368800	2019-08-14 04:50:33 +00:00
John McCall	a318c55073	Coroutines: adjust for SVN r358739 CallSite has been removed in favour of CallBase. Adjust the coroutine split to account for that. llvm-svn: 368798	2019-08-14 03:54:25 +00:00
John McCall	3bbf207fbc	Don't run a full verifier pass in coro-splitting's private pipeline. Potentially addresses rdar://49022293. llvm-svn: 368797	2019-08-14 03:54:18 +00:00
John McCall	5f60b68c68	Remove unreachable blocks before splitting a coroutine. The suspend-crossing algorithm is not correct in the presence of uses that cannot be reached on some successor path from their defs. llvm-svn: 368796	2019-08-14 03:54:13 +00:00
John McCall	2133feec93	Support swifterror in coroutine lowering. The support for swifterror allocas should work in all lowerings. The support for swifterror arguments only really works in a lowering with prototypes where you can ensure that the prototype also has a swifterror argument; I'm not really sure how it could possibly be made to work in the switch lowering. llvm-svn: 368795	2019-08-14 03:54:05 +00:00
John McCall	d47801e718	In coro.retcon lowering, don't explode if the optimizer messes around with the linkage of the prototype or the exact types of the yielded values. llvm-svn: 368793	2019-08-14 03:53:52 +00:00
John McCall	ac40483276	Fix a use-after-free in the coro.alloca treatment. llvm-svn: 368792	2019-08-14 03:53:46 +00:00
John McCall	62a5dde0c2	Add intrinsics for doing frame-bound dynamic allocations within a coroutine. These rely on having an allocator provided to the coroutine and thus, for now, only work in retcon lowerings. llvm-svn: 368791	2019-08-14 03:53:40 +00:00
John McCall	137b50f0c3	Guard dumps in the coro intrinsic validation logic behind NDEBUG checks. dump() is not guaranteed to be defined in all builds. llvm-svn: 368790	2019-08-14 03:53:31 +00:00
John McCall	3829214185	Generalize llvm.coro.suspend.retcon to allow an arbitrary number of arguments to be passed back to the continuation function. llvm-svn: 368789	2019-08-14 03:53:26 +00:00
John McCall	94010b2b7f	Extend coroutines to support a "returned continuation" lowering. A quick contrast of this ABI with the currently-implemented ABI: - Allocation is implicitly managed by the lowering passes, which is fine for frontends that are fine with assuming that allocation cannot fail. This assumption is necessary to implement dynamic allocas anyway. - The lowering attempts to fit the coroutine frame into an opaque, statically-sized buffer before falling back on allocation; the same buffer must be provided to every resume point. A buffer must be at least pointer-sized. - The resume and destroy functions have been combined; the continuation function takes a parameter indicating whether it has succeeded. - Conversely, every suspend point begins its own continuation function. - The continuation function pointer is directly returned to the caller instead of being stored in the frame. The continuation can therefore directly destroy the frame when exiting the coroutine instead of having to leave it in a defunct state. - Other values can be returned directly to the caller instead of going through a promise allocation. The frontend provides a "prototype" function declaration from which the type, calling convention, and attributes of the continuation functions are taken. - On the caller side, the frontend can generate natural IR that directly uses the continuation functions as long as it prevents IPO with the coroutine until lowering has happened. In combination with the point above, the frontend is almost totally in charge of the ABI of the coroutine. - Unique-yield coroutines are given some special treatment. llvm-svn: 368788	2019-08-14 03:53:17 +00:00
Aditya Nandakumar	c65ac865c3	[GlobalISel]: Fix lowering of G_SHUFFLE_VECTOR where we pick up the wrong source index https://reviews.llvm.org/D66182 llvm-svn: 368781	2019-08-14 01:23:33 +00:00
Amara Emerson	2a312fc989	[AArch64][GlobalISel] RBS: Treat s128s like vectors when unmerging. The destinations should be FPRs (for now). Differential Revision: https://reviews.llvm.org/D66184 llvm-svn: 368775	2019-08-13 23:51:20 +00:00
Eli Friedman	b5eb3e1e82	[AArch64] Remove incorrect usage of MONonTemporal. This has no effect at the moment, but might matter if we try to implement non-temporal loads in the future. llvm-svn: 368770	2019-08-13 23:12:14 +00:00
Aditya Nandakumar	615eee6402	[GlobalISel]: Fix lowering of G_SHUFFLE_VECTOR with scalar sources https://reviews.llvm.org/D66171 llvm-svn: 368753	2019-08-13 21:49:11 +00:00
Xiangling Liao	a8c624a1c4	[AIX]Lowering global address for 32/64bit small/large code models This patch implements global address lowering for 32/64 bit with small/large code models. 1.For 32bit large code model on AIX, there are newly added pseudo opcode LWZtocL & ADDIStocHA32, the support of which on MC layer will be provided by future patches. 2.The default code model on AIX should be small code model. 3.Since AIX does not have medium code model, "report_fatal_error" when users specify it. Differential Revision: https://reviews.llvm.org/D63547 llvm-svn: 368744	2019-08-13 20:29:01 +00:00
Tim Renouf	10db641aab	[AMDGPU] Fix to 'Fold readlane from copy of SGPR or imm' That change (r363670) could leave a copy from vgpr to sgpr. Fixed. Differential Revision: https://reviews.llvm.org/D66133 Change-Id: I00c3fe6fda2e8e1e36f53195b881b1449c777ea4 llvm-svn: 368736	2019-08-13 18:57:55 +00:00
David Green	a655393f17	[ARM] Add MVE beats vector cost model The MVE architecture has the idea of "beats", where a vector instruction can be executed over several ticks of the architecture. This adds a similar system into the Arm backend cost model, multiplying the cost of all vector instructions by a factor. This factor essentially becomes the expected difference between scalar code and vector code, on average. MVE vector instructions can also overlap, so their true cost is often lower. But equally scalar instructions can in some situations be dual issued, or have other optimisations such as unrolling or making use of dsp instructions. The default is chosen as 2. This should not prevent vectorisation in most cases (as the vector instructions will still be doing at least 4 times the work), but it will help prevent over vectorising in cases where the benefits are less likely. This adds things so far to the obvious places in ARMTargetTransformInfo, and updates a few related costs like not treating float instructions as cost 2 just because they are floats. Differential Revision: https://reviews.llvm.org/D66005 llvm-svn: 368733	2019-08-13 18:12:08 +00:00
Wenlei He	d328954467	[llvm-profdata] Profile dump for compact binary format Summary: Fix "llvm-profdata show" so it can work with compact binary format profile. The change is to mark all functions "used" so SampleProfileReaderCompactBinary::read will read in all profiles available for dumping. The function names will be MD5 hash for compact binary format. Reviewers: wmi, davidxl, danielcdh Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65162 llvm-svn: 368731	2019-08-13 17:56:08 +00:00
Steven Wu	9e51fb6c57	[AutoUpgrader] Make ArcRuntime Autoupgrader more conservative Summary: This is a tweak to r368311 and r368646 which auto upgrades the calls to objc runtime functions to objc runtime intrinsics, in order to make sure that the auto upgrader does not trigger with up-to-date bitcode. It is possible for bitcode that is up-to-date to contain direct calls to objc runtime functions; those are not inserted by the compiler as part of ARC and they should not be upgraded. Now the auto upgrader only triggers when the old style of ARC marker is used, so it is guaranteed that it won't trigger on up-to-date bitcode. This also means it won't do this upgrade for bitcode from llvm-8 and llvm-9, which preserves the behavior of those releases. Ideally they should be upgraded as well but it is more important to make sure AutoUpgrader will not trigger on up-to-date bitcode. Reviewers: ahatanak, rjmccall, dexonsmith, pete Reviewed By: dexonsmith Subscribers: hiraditya, jkorous, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66153 llvm-svn: 368730	2019-08-13 17:52:21 +00:00
Heejin Ahn	64517a6419	Use Register over unsigned in LateEHPrepare (NFC) Summary: While D65962 is pending for review, I landed D65475 that added one more use of `unsigned`. Changed it to `Register`. Reviewers: dsanders Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66064 llvm-svn: 368727	2019-08-13 17:35:44 +00:00
David Bolvansky	038d604f4f	[SimplifyLibCalls] Add noalias from known callsites Summary: Should be fine for memcpy, strcpy, strncpy. Reviewers: jdoerfert, efriedma Reviewed By: jdoerfert Subscribers: uenoku, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66135 llvm-svn: 368724	2019-08-13 17:18:46 +00:00
Nikita Popov	2a4f26b4c2	[ValueTracking] Improve reverse assumption inference Use isGuaranteedToTransferExecutionToSuccessor() instead of isSafeToSpeculativelyExecute() when seeing whether we can propagate the information in an assume backwards in isValidAssumeForContext(). The latter is more general - it also allows arbitrary loads/stores - and is also the condition we want: if our assume is guaranteed to execute, its condition not holding would be UB. Original patch by arielb1. Differential Revision: https://reviews.llvm.org/D37215 llvm-svn: 368723	2019-08-13 17:15:42 +00:00
Hubert Tong	0996705009	Reland r368691: "[AIX] Implement LR prolog/epilog save/restore" Trying again with the code changes (and not just the new test). Summary: This patch fixes the offsets of fields in the stack frame linkage save area for AIX. Reviewers: sfertile, hubert.reinterpretcast, jasonliu, Xiangling_L, xingxue, ZarkoCA, daltenty Reviewed By: hubert.reinterpretcast Subscribers: wuzish, nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64424 Patch by Chris Bowler! llvm-svn: 368721	2019-08-13 17:05:53 +00:00
David Tenty	9bf01e53a3	[NFC][AIX] Use assert instead of llvm_unreachable Addresses post-commit comments on https://reviews.llvm.org/D64825. Use assert instead of llvm_unreachable to check if invalid csect types are being generated. Use report_fatal_error on unimplemented XCOFF features. Differential Revision: https://reviews.llvm.org/D64825 llvm-svn: 368720	2019-08-13 17:04:51 +00:00
Jonas Devlieghere	57ae300562	[Dwarf] Complete the list of type tags. An incorrect verification error revealed that the list of type tags was incomplete. This patch adds the missing types by adding a tag kind to the Dwarf.def file, which is used by the `isType` function. A test was added for the original verification error. Differential revision: https://reviews.llvm.org/D65914 llvm-svn: 368718	2019-08-13 17:00:54 +00:00
David Bolvansky	90a30fdcc3	[SLC] Improve dereferenceable bytes annotation llvm-svn: 368715	2019-08-13 16:44:16 +00:00
Matt Arsenault	28215caa60	GlobalISel: Partially implement fewerElementsVector G_UNMERGE_VALUES Odd sized vectors aren't handled yet. llvm-svn: 368713	2019-08-13 16:26:28 +00:00
Momchil Velikov	114c37e72a	[ARM] Fix detection of duplicates when parsing reg list operands Differential Revision: https://reviews.llvm.org/D65957 llvm-svn: 368712	2019-08-13 16:13:00 +00:00
Momchil Velikov	f990e4a4c7	[ARM] Fix encoding of APSR in CLRM instruction The APSR is encoded by setting bit 15 in the register list of the CLRM instruction (cf. https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf). Differential Revision: https://reviews.llvm.org/D65873 llvm-svn: 368711	2019-08-13 16:12:46 +00:00
Matt Arsenault	690645bda0	GlobalISel: Implement lower for G_SHUFFLE_VECTOR llvm-svn: 368709	2019-08-13 16:09:07 +00:00
Lang Hames	52a34a78d9	[ORC] Refactor definition-generation, add a generator for static libraries. This patch replaces the JITDylib::DefinitionGenerator typedef with a class of the same name, and adds support for attaching a sequence of DefinitionGenerator objects to a JITDylib. This patch also adds a new definition generator, StaticLibraryDefinitionGenerator, that can be used to add symbols from a static library to a JITDylib. An object from the static library will be added (via a supplied ObjectLayer reference) whenever a symbol from that object is referenced. To enable testing, lli is updated to add support for the --extra-archive option when running in -jit-kind=orc-lazy mode. llvm-svn: 368707	2019-08-13 16:05:18 +00:00
Matt Arsenault	0a04a06250	GlobalISel: Add more verifier checks for G_SHUFFLE_VECTOR llvm-svn: 368705	2019-08-13 15:52:21 +00:00
Matt Arsenault	5af9cf042f	GlobalISel: Change representation of shuffle masks Currently shufflemasks get emitted as any other constant, and you end up with a bunch of virtual registers of G_CONSTANT with a G_BUILD_VECTOR. The AArch64 selector then asserts on anything that doesn't fit this pattern. This isn't an ideal representation, and should avoid legalization and have fewer opportunities for a representational error. Rather than invent a new shuffle mask operand type, similar to what ShuffleVectorSDNode does, just track the original IR Constant mask operand. I don't completely like the idea of adding another link to the IR, but MIR is already quite dependent on IR constants already, and this will allow sharing the shuffle mask utility functions with the IR. llvm-svn: 368704	2019-08-13 15:34:38 +00:00
Roman Lebedev	676594305a	[CodeGen][SelectionDAG] More efficient code for X % C == 0 (SREM case) Summary: This implements an optimization described in Hacker's Delight 10-17: when `C` is constant, the result of `X % C == 0` can be computed more cheaply without actually calculating the remainder. The motivation is discussed here: https://bugs.llvm.org/show_bug.cgi?id=35479. One huge caveat: this signed case is only valid for positive divisors. While we can freely negate negative divisors, we can't negate `INT_MIN`, so for now if `INT_MIN` is encountered, we bail out. As a follow-up, it should be possible to handle that more gracefully via extra `and`+`setcc`+`select`. This passes llvm's test-suite, and from cursory(!) cross-examination the folds (the assembly) match those of GCC, and manual checking via alive did not reveal any issues (other than the `INT_MIN` case) Reviewers: RKSimon, spatel, hermord, craig.topper, xbolva00 Reviewed By: RKSimon, xbolva00 Subscribers: xbolva00, thakis, javed.absar, hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65366 llvm-svn: 368702	2019-08-13 14:57:37 +00:00
Roman Lebedev	f4de7eda4a	[TargetLowering][NFC] prepareUREMEqFold(): fixup comment The comment initially matched the code, but the code was incorrect and was fixed after the initial revert back when it was introduced, but the comment was never updated. llvm-svn: 368701	2019-08-13 14:57:08 +00:00
Roman Lebedev	73f702ff19	[InstCombine] Non-canonical clamp-like pattern handling Summary: Given a pattern like: ``` %old_cmp1 = icmp slt i32 %x, C2 %old_replacement = select i1 %old_cmp1, i32 %target_low, i32 %target_high %old_x_offseted = add i32 %x, C1 %old_cmp0 = icmp ult i32 %old_x_offseted, C0 %r = select i1 %old_cmp0, i32 %x, i32 %old_replacement ``` it can be rewritten as more canonical pattern: ``` %new_cmp1 = icmp slt i32 %x, -C1 %new_cmp2 = icmp sge i32 %x, C0-C1 %new_clamped_low = select i1 %new_cmp1, i32 %target_low, i32 %x %r = select i1 %new_cmp2, i32 %target_high, i32 %new_clamped_low ``` Iff `-C1 s<= C2 s<= C0-C1` Also, `ULT` predicate can also be `UGE`; or `UGT` iff `C0 != -1` (+invert result) Also, `SLT` predicate can also be `SGE`; or `SGT` iff `C2 != INT_MAX` (+invert result) If `C1 == 0`, then all 3 instructions must be one-use; else at most either `%old_cmp1` or `%old_x_offseted` can have extra uses. NOTE: if we could reuse `%old_cmp1` as one of the comparisons we'll have to build, this could be less limiting. So there are two icmp's, each one with 3 predicate variants, so there are 9 fold variants: \| \| ULT \| UGE \| UGT \| \| SLT \| https://rise4fun.com/Alive/yIJ \| https://rise4fun.com/Alive/5BfN \| https://rise4fun.com/Alive/INH \| \| SGE \| https://rise4fun.com/Alive/hd8 \| https://rise4fun.com/Alive/Abk \| https://rise4fun.com/Alive/PlzS \| \| SGT \| https://rise4fun.com/Alive/VYG \| https://rise4fun.com/Alive/oMY \| https://rise4fun.com/Alive/KrzC \| {F9730206} This fold was brought up in https://reviews.llvm.org/D65148#1603922 by @dmgreen, and is needed to unblock that patch. This patch requires D65530. Reviewers: spatel, nikic, xbolva00, dmgreen Reviewed By: spatel Subscribers: hiraditya, llvm-commits, dmgreen Tags: #llvm Differential Revision: https://reviews.llvm.org/D65765 llvm-svn: 368687	2019-08-13 12:49:28 +00:00
Roman Lebedev	0410489a34	[InstCombine][NFC] Rename IsFreeToInvert() -> isFreeToInvert() for consistency As per https://reviews.llvm.org/D65530#inline-592325 llvm-svn: 368686	2019-08-13 12:49:16 +00:00
Roman Lebedev	2635c324da	[InstCombine] foldXorOfICmps(): don't give up on non-single-use ICmp's if all users are freely invertible Summary: This is rather unconventional.. As the comment there says, we don't have many folds for xor-of-icmps, we try to turn them into an and-of-icmps, for which we have plenty of folds. But if the ICmp we need to invert is not single-use - we give up. As discussed in https://reviews.llvm.org/D65148#1603922, we may have a non-canonical CLAMP pattern, with bit math and select-of-threshold that we'll potentially clamp. As it can be seen in `canonicalize-clamp-with-select-of-constant-threshold-pattern.ll`, out of all 8 variations of the pattern, only two are not canonicalized into the variant with and+icmp instead of bit math. The reason is because the ICmp we need to invert is not single-use - we give up. We indeed can't perform this fold at will, the general rule is that we should not increase instruction count in InstCombine. But we wouldn't end up increasing instruction count if we can adapt every other user to the inverted value. This way the `not` we create will get folded, and in the end the instruction count did not increase. For that, of course, we need to look at the users of a Value, which is again rather unconventional for InstCombine :S Thus I'm proposing to be a little bit more insistent in `foldXorOfICmps()`. The alternatives would be to not create that `not`, but add duplicate code to manually invert all users; or to add some even less general combine to handle some more specific pattern[s]. Reviewers: spatel, nikic, RKSimon, craig.topper Reviewed By: spatel Subscribers: hiraditya, jdoerfert, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65530 llvm-svn: 368685	2019-08-13 12:49:06 +00:00
Simon Pilgrim	e7b350a5d1	[X86] XFormVExtractWithShuffleIntoLoad - handle shuffle mask scaling If the target shuffle mask is from a wider type, attempt to scale the mask so that the extraction can attempt to peek through. Fixes the regression mentioned in rL368662 Reapplying this as rL368308 had to be reverted as part of rL368660 to revert rL368276 llvm-svn: 368663	2019-08-13 11:11:42 +00:00
Simon Pilgrim	1a8d790cf5	[X86] SimplifyDemandedVectorElts - attempt to recombine target shuffle using DemandedElts mask (reapplied) If we don't demand all elements, then attempt to combine to a simpler shuffle. At the moment we can only do this if Depth == 0 as combineX86ShufflesRecursively uses Depth to track whether the shuffle has really changed or not - we'll need to change this before we can properly start merging combineX86ShufflesRecursively into SimplifyDemandedVectorElts. The insertps-combine.ll regression is because XFormVExtractWithShuffleIntoLoad can't see through shuffles of different widths - this will be fixed in a follow-up commit. Reapplying this as rL368307 had to be reverted as part of rL368660 to revert rL368276 llvm-svn: 368662	2019-08-13 10:51:39 +00:00
Hans Wennborg	5390d25f2b	Revert r368276 "[TargetLowering] SimplifyDemandedBits - call SimplifyMultipleUseDemandedBits for ISD::EXTRACT_VECTOR_ELT" This introduced a false positive MemorySanitizer warning about use of uninitialized memory in a vectorized crc function in Chromium. That suggests maybe something is not right with this transformation. See https://crbug.com/992853#c7 for a reproducer. This also reverts the follow-up commits r368307 and r368308 which depended on this. > This patch attempts to peek through vectors based on the demanded bits/elt of a particular ISD::EXTRACT_VECTOR_ELT node, allowing us to avoid dependencies on ops that have no impact on the extract. > > In particular this helps remove some unnecessary scalar->vector->scalar patterns. > > The wasm shift patterns are annoying - @tlively has indicated that the wasm vector shift codegen are to be refactored in the near-term and isn't considered a major issue. > > Differential Revision: https://reviews.llvm.org/D65887 llvm-svn: 368660	2019-08-13 09:33:25 +00:00
David Bolvansky	39130314fe	[SimplifyLibCalls] Add dereferenceable bytes from known callsites Summary: int mm(char *a, char *b) { return memcmp(a,b,16); } Currently: define dso_local i32 @mm(i8* nocapture readonly %a, i8* nocapture readonly %b) local_unnamed_addr #1 { entry: %call = tail call i32 @memcmp(i8* %a, i8* %b, i64 16) ret i32 %call } After patch: define dso_local i32 @mm(i8* nocapture readonly %a, i8* nocapture readonly %b) local_unnamed_addr #1 { entry: %call = tail call i32 @memcmp(i8* dereferenceable(16) %a, i8* dereferenceable(16) %b, i64 16) ret i32 %call } Reviewers: jdoerfert, efriedma Reviewed By: jdoerfert Subscribers: javed.absar, spatel, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66079 llvm-svn: 368657	2019-08-13 09:11:49 +00:00
Qiu Chaofan	4fb99a3330	[PowerPC] Fix ICE when truncating some vectors The legalizer would hit an assertion on PowerPC platform when truncating a vector whose size is not power of 2. This patch is to add a check to prevent vectors with such odd-size elements from being custom lowered. Reviewed By: Hal Finkel Differential Revision: https://reviews.llvm.org/D65261 llvm-svn: 368654	2019-08-13 07:53:29 +00:00
Amara Emerson	72c81b94cb	[AArch64][GlobalISel] Replace explicit vreg creation with implicit using SrcOp. NFC. llvm-svn: 368653	2019-08-13 06:55:32 +00:00
Amara Emerson	e14c91b71a	[GlobalISel] Make the InstructionSelector instance non-const, allowing state to be maintained. Currently we can't keep any state in the selector object that we get from subtarget. As a result we have to plumb through all our variables through multiple functions. This change makes it non-const and adds a virtual init() method to allow further state to be captured for each target. AArch64 makes use of this in this patch to cache a call to hasFnAttribute() which is expensive to call, and is used on each selection of G_BRCOND. Differential Revision: https://reviews.llvm.org/D65984 llvm-svn: 368652	2019-08-13 06:26:59 +00:00
Serge Pavlov	2a09b9acfb	Added unit tests to check supported rounding modes Also fixed misspelled metadata name. Differential Revision: https://reviews.llvm.org/D66073 llvm-svn: 368650	2019-08-13 05:21:18 +00:00
Aditya Nandakumar	70fdfed45f	[GlobalISel]: Add KnownBits for G_XOR https://reviews.llvm.org/D66119 llvm-svn: 368648	2019-08-13 04:32:33 +00:00
Yevgeny Rouban	8b996dc16e	Verifier: check prof branch_weights This patch is to check some of the constraints on !prof branch_weights metadata: https://llvm.org/docs/BranchWeightMetadata.html Reviewers: asbirlea, reames, chandlerc Reviewed By: reames Differential Revision: https://reviews.llvm.org/D61179 llvm-svn: 368647	2019-08-13 04:03:38 +00:00
Akira Hatanaka	3c7c053145	Do not call replaceAllUsesWith to upgrade calls to ARC runtime functions to intrinsic calls This fixes a bug in r368311. It turns out that the ARC runtime functions in the IR can have pointer parameter types that are not i8* or i8**. Instead of RAUWing normal functions with intrinsics, manually bitcast the arguments before passing them to the intrinsic functions and bitcast the return value back to the type of the original call instruction. This recommits r368634, which was reverted in r368637. The loop in the patch was iterating over uses of a function and deleting function calls inside it, which caused bots to crash. rdar://problem/54125406 Differential Revision: https://reviews.llvm.org/D66047 llvm-svn: 368646	2019-08-13 01:23:06 +00:00
Stanislav Mekhanoshin	438315bf69	[AMDGPU] Fix msan failure in printf lowering llvm-svn: 368645	2019-08-13 01:07:27 +00:00
Daniel Sanders	a58a27513b	Eliminate implicit Register->unsigned conversions in VirtRegMap. NFC Summary: This was mostly an experiment to assess the feasibility of completely eliminating a problematic implicit conversion case in D61321 in advance of landing that* but it also happens to align with the goal of propagating the use of Register/MCRegister instead of unsigned so I believe it makes sense to commit it. The overall process for eliminating the implicit conversions from Register/MCRegister -> unsigned was to: 1. Add an explicit conversion to support genuinely required conversions to unsigned. For example, using them as an index for IndexedMap. Sadly it's not possible to have an explicit and implicit conversion to the same type and only deprecate the implicit one so I called the explicit conversion get(). 2. Temporarily annotate the implicit conversion to unsigned with LLVM_ATTRIBUTE_DEPRECATED to make them visible 3. Eliminate implicit conversions by propagating Register/MCRegister/ explicit-conversions appropriately 4. Remove the deprecation added in 2. * My conclusion is that it isn't feasible as there's too much code to update in one go. Depends on D65678 Reviewers: arsenm Subscribers: MatzeB, wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65685 llvm-svn: 368643	2019-08-13 00:55:24 +00:00
Akira Hatanaka	c109808982	Revert "Do not call replaceAllUsesWith to upgrade calls to ARC runtime functions" This reverts commit r368634 because it broke a bot. llvm-svn: 368637	2019-08-13 00:20:36 +00:00
Eric Christopher	4acb4ee767	Move findBBwithCalls to the file it's used in to avoid unused function warnings. llvm-svn: 368636	2019-08-13 00:05:01 +00:00
Akira Hatanaka	6817ce24c1	Do not call replaceAllUsesWith to upgrade calls to ARC runtime functions to intrinsic calls This fixes a bug in r368311. It turns out that the ARC runtime functions in the IR can have pointer parameter types that are not i8* or i8**. Instead of RAUWing normal functions with intrinsics, manually bitcast the arguments before passing them to the intrinsic functions and bitcast the return value back to the type of the original call instruction. rdar://problem/54125406 llvm-svn: 368634	2019-08-12 23:53:23 +00:00
Stanislav Mekhanoshin	5b32752d10	[AMDGPU] removed unused functions from printf lowering Differential Revision: https://reviews.llvm.org/D66117 llvm-svn: 368633	2019-08-12 23:32:35 +00:00
Reid Kleckner	e9865b9b31	[WinEH] Fix catch block parent frame pointer offset r367088 made it so that funclets store XMM registers into their local frame instead of storing them to the parent frame. However, that change forgot to update the parent frame pointer offset for catch blocks. This change does that. Fixes crashes when an exception is rethrown in a catch block that saves XMMs, as described in https://crbug.com/992860. llvm-svn: 368631	2019-08-12 23:02:00 +00:00
Juergen Ributzka	b978c51ce4	[TextAPI] Fix & Add tests for tbd files version 3. - There was a simple typo in TextStub code that prevented version 3 files from being read. - Included a version 3 unit test to handle the differences in the format. - Also fixed a typo in the comments inside Error.h. https://reviews.llvm.org/D66041 This patch is from Cyndy Ishida <cyndy_ishida@apple.com>. llvm-svn: 368630	2019-08-12 23:01:07 +00:00
Daniel Sanders	3836874dbb	[risc-v] Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Depends on D65919 Reviewers: lenary Subscribers: jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision for full review was: https://reviews.llvm.org/D65962 llvm-svn: 368629	2019-08-12 22:41:02 +00:00
Daniel Sanders	5ae66e56cf	[aarch64] Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Manual fixups in: AArch64InstrInfo.cpp - genFusedMultiply() now takes a Register* instead of unsigned* AArch64LoadStoreOptimizer.cpp - Ternary operator was ambiguous between Register/MCRegister. Settled on Register Depends on D65919 Reviewers: aemerson Subscribers: jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision for full review was: https://reviews.llvm.org/D65962 llvm-svn: 368628	2019-08-12 22:40:53 +00:00
Daniel Sanders	05c145d694	[webassembly] Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Reviewers: aheejin Subscribers: jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision for whole review: https://reviews.llvm.org/D65962 llvm-svn: 368627	2019-08-12 22:40:45 +00:00
Stanislav Mekhanoshin	ef8f1c473a	[AMDGPU] Use PredicateControl in MIMGBaseOpcode. NFC. This is infrastructural, will be needed for future work. For some reason it was only used in MIMG_NoSampler, while needed everywhere we use MIMGBaseOpcode if we want to use predicates. Differential Revision: https://reviews.llvm.org/D66115 llvm-svn: 368626	2019-08-12 22:32:21 +00:00
Johannes Doerfert	26e58466de	[Attributor] Use the cached data layout directly This removes the warning by using the new DL member. It also simplifies the code. llvm-svn: 368625	2019-08-12 22:21:09 +00:00
Craig Topper	e07e593782	[X86] Allow combineTruncateWithSat to use pack instructions for i16->i8 without AVX512BW. We need AVX512BW to be able to truncate an i16 vector. If we don't have that we have to extend i16->i32, then trunc, i32->i8. But we won't be able to remove the min/max if we do that. At least not without more special handling. llvm-svn: 368623	2019-08-12 22:18:23 +00:00
Johannes Doerfert	acc8079f8e	[Attributor][NFC] Add IntegerState raw_ostream << operator llvm-svn: 368622	2019-08-12 22:07:34 +00:00
Johannes Doerfert	ece8190497	[Attributor] Make the InformationCache an Attributor member The functionality is not changed but the interfaces are simplified and repetition is removed. llvm-svn: 368621	2019-08-12 22:05:53 +00:00
Aditya Nandakumar	55371e697c	[GISel]: Fix a bug in KnownBits where we should have been using SizeInBits https://reviews.llvm.org/D66039 We were using getIndexSize instead of getIndexSizeInBits(). Added test case for G_PTRTOINT and G_INTTOPTR. llvm-svn: 368618	2019-08-12 21:28:12 +00:00
Craig Topper	0761a38e8a	[X86] Remove unreachable code from LowerTRUNCATE. NFC All three 256->128 bit cases were already handled above. Noticed while looking at the coverage report. llvm-svn: 368609	2019-08-12 19:26:45 +00:00
Craig Topper	a3605baaff	[X86] Add a paranoia type check to the code that detects AVG patterns from truncating stores. If we're after type legalize, we should make sure we won't create a store with an illegal type when we separate the AVG pattern from the truncating store. I don't know of a way to fail for this today. Just noticed while I was in the vicinity. llvm-svn: 368608	2019-08-12 19:26:37 +00:00
Craig Topper	1b02909847	[X86] Simplify creation of saturating truncating stores. We just need to check if the truncating store is legal instead of going through isSATValidOnAVX512Subtarget. llvm-svn: 368607	2019-08-12 19:26:30 +00:00
Craig Topper	3f4e9b156d	[X86] Replace call to isTruncStoreLegalOrCustom with isTruncStoreLegal. NFC We have no custom trunc stores on X86. llvm-svn: 368606	2019-08-12 19:26:22 +00:00
Wenlei He	4b99b58a84	[ThinLTO][AutoFDO] Fix memory corruption due to race condition from thin backends Summary: This commit fixed a race condition from multi-threaded thinLTO backends that causes non-deterministic memory corruption for a data structure used only by AutoFDO with compact binary profile. GUIDToFuncNameMap, a static data member of type DenseMap in FunctionSamples is used as a per-module mapping from function name MD5 to name string when input AutoFDO profile is in compact binary format. However with ThinLTO, we can have parallel backends modifying and accessing the class static map concurrently. The fix is to make GUIDToFuncNameMap a member of SampleProfileLoader instead of a file static data. Reviewers: wmi, davidxl, danielcdh Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65848 llvm-svn: 368596	2019-08-12 17:45:14 +00:00
Craig Topper	09d5d15339	[X86] Disable use of zmm registers for varargs musttail calls under prefer-vector-width=256 and min-legal-vector-width=256. Under this config, the v16f32 type we try to use isn't to a register class so the getRegClassFor call will fail. llvm-svn: 368594	2019-08-12 17:43:26 +00:00
David Green	86876422ef	[ARM] sext of a load is free This teaches the cost model that the sext or zext of a load is going to be free. Differential Revision: https://reviews.llvm.org/D66006 llvm-svn: 368593	2019-08-12 17:39:56 +00:00
Stanislav Mekhanoshin	4c9c98f36b	[AMDGPU] Printf runtime binding pass This pass is a port of the corresponding pass from the HSAIL compiler. It parses printf calls and sets up the runtime printf buffer. After that it copies printf arguments to the buffer and fills in module metadata for the runtime. Differential Revision: https://reviews.llvm.org/D24035 llvm-svn: 368592	2019-08-12 17:12:29 +00:00
David Green	3e39f39ad9	[ARM] MVE shuffle broadcast costs A VDUP will perform a vector broadcast in a single instruction. Update the cost model for MVE accordingly. Code originally by David Sherwood. Differential Revision: https://reviews.llvm.org/D63448 llvm-svn: 368589	2019-08-12 16:54:07 +00:00
David Green	83bbfaa5e4	[ARM] Put some of the TTI costmodel behind hasNeon calls. This puts some of the calls in ARMTargetTransformInfo.cpp behind hasNeon() checks, now that we have MVE, and updates all the tests accordingly. Differential Revision: https://reviews.llvm.org/D63447 llvm-svn: 368587	2019-08-12 15:59:52 +00:00
Sean Fertile	29141da75e	[XCOFF] Use a single symbolic constant for the size of an embedded name. [NFC] Convert SymbolNameSize and SectionNameSize into just `NameSize`. The length of a name embedded in a symbol table entry or section header table entry is length 8 for Sections, Symbols and Files. No need to have a distinct constant for each one. Also removes the Size argument to 'generateStringRef' as the size is always 'XCOFF::NameSize'. llvm-svn: 368584	2019-08-12 15:27:40 +00:00
Hans Wennborg	a45f301f7a	Revert r368339 "[MBP] Disable aggressive loop rotate in plain mode" It caused assertions to fire when building Chromium: lib/CodeGen/LiveDebugValues.cpp:331: bool {anonymous}::LiveDebugValues::OpenRangesSet::empty() const: Assertion `Vars.empty() == VarLocs.empty() && "open ranges are inconsistent"' failed. See https://crbug.com/992871#c3 for how to reproduce. > Patch https://reviews.llvm.org/D43256 introduced more aggressive loop layout optimization which depends on profile information. If profile information is not available, the statically estimated profile information(generated by BranchProbabilityInfo.cpp) is used. If user program doesn't behave as BranchProbabilityInfo.cpp expected, the layout may be worse. > > To be conservative this patch restores the original layout algorithm in plain mode. But user can still try the aggressive layout optimization with -force-precise-rotation-cost=true. > > Differential Revision: https://reviews.llvm.org/D65673 llvm-svn: 368579	2019-08-12 14:23:13 +00:00
Kang Zhang	489efc68a5	Revert r368565: [CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks llvm-svn: 368574	2019-08-12 14:00:31 +00:00
Sam Elliott	fee242aed4	[RISCV] Fix ICE in isDesirableToCommuteWithShift Summary: Ana Pazos reported a bug where we were not checking that an APInt would fit into 64-bits before calling `getSExtValue()`. This caused asserts when compiling large constants, such as i128s, as happens when compiling compiler-rt. This patch adds a testcase and makes the callback less error-prone. Reviewers: apazos, asb, luismarques Reviewed By: luismarques Subscribers: hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66081 llvm-svn: 368572	2019-08-12 13:51:00 +00:00
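The guard this commit describes (check that a wide integer fits into 64 bits before taking its sign-extended value) can be sketched without LLVM. The `Wide128` struct and both helper functions are hypothetical stand-ins for illustration; they are not the `APInt` API.

```cpp
#include <cstdint>

// Hypothetical 128-bit two's-complement value split into two 64-bit halves.
struct Wide128 {
  std::uint64_t Hi; // upper 64 bits
  std::uint64_t Lo; // lower 64 bits
};

// True when V is representable as int64_t: the upper half must be nothing
// but copies of the sign bit of the lower half.
inline bool fitsInSigned64(const Wide128 &V) {
  const std::uint64_t SignFill = (V.Lo >> 63) ? ~std::uint64_t{0} : 0;
  return V.Hi == SignFill;
}

// Writes the value to Out only after the range check succeeds, instead of
// converting unconditionally.
inline bool trySignExtend64(const Wide128 &V, std::int64_t &Out) {
  if (!fitsInSigned64(V))
    return false;
  // Assumes the usual modular unsigned-to-signed conversion (guaranteed
  // since C++20, implementation-defined but universal before that).
  Out = static_cast<std::int64_t>(V.Lo);
  return true;
}
```

Returning a `bool` rather than asserting is one way to make such a callback "less error-prone": the caller has to handle the too-wide case explicitly.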
David Bolvansky	20d37fab82	[InstCombine] x /c fabs(x) -> copysign(1.0, x) Summary: x / fabs(x) -> copysign(1.0, x) fabs(x) / x -> copysign(1.0, x) Reviewers: spatel, foad, RKSimon, efriedma Reviewed By: spatel Subscribers: lebedev.ri, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65898 llvm-svn: 368570	2019-08-12 13:43:35 +00:00
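A small numeric check of the identity behind this fold, written with the standard library rather than LLVM IR. The function names are made up for illustration. The two forms agree for finite non-zero inputs but differ at zero, which suggests the fold cannot be applied unconditionally; the exact preconditions the patch uses are not stated in the message above.

```cpp
#include <cmath>

// Original form: divide a value by its own magnitude.
inline double signByDivision(double X) { return X / std::fabs(X); }

// Folded form: copy the sign of X onto the constant 1.0.
inline double signByCopysign(double X) { return std::copysign(1.0, X); }
```

For `X = 2.5` both return `1.0` and for `X = -0.125` both return `-1.0`. For `X = 0.0`, the division yields NaN under IEEE 754 while `copysign` yields `1.0`.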
David Stenberg	9b29ec58b7	[DebugInfo] Remove call sites when eliminating unreachable blocks Summary: When eliminating an unreachable block we must remove any call site information for calls residing in the block. This was originally found on a downstream target, and the attached x86 test case was produced by hand-modifying some MIR. Reviewers: aprantl, asowda, NikolaPrica, djtodoro, ivanbaev, vsk Reviewed By: NikolaPrica, vsk Subscribers: vsk, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D64500 llvm-svn: 368566	2019-08-12 13:22:29 +00:00
Kang Zhang	342fb0db6d	[CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks Summary: In `block-placement` pass, it will create some patterns for unconditional we can do the simple early return. But the `early-ret` pass is before `block-placement`, we don't want to run it again. This patch is to do the simple early return to optimize the blocks at the last of `block-placement`. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D63972 llvm-svn: 368565	2019-08-12 13:15:31 +00:00
Hans Wennborg	5b96d4655c	Revert r368509 "[CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks" > In `block-placement` pass, it will create some patterns for unconditional we can do the simple early return. > But the `early-ret` pass is before `block-placement`, we don't want to run it again. > This patch is to do the simple early return to optimize the blocks at the last of `block-placement`. > > Reviewed By: efriedma > > Differential Revision: https://reviews.llvm.org/D63972 This also reverts follow-ups r368514 and r368532. llvm-svn: 368560	2019-08-12 12:43:51 +00:00
Simon Pilgrim	182249daee	[X86][SSE] ComputeKnownBits - add basic PSADBW handling llvm-svn: 368558	2019-08-12 12:19:19 +00:00
Roman Lebedev	ccdad6ef48	[InstCombine] foldShiftIntoShiftInAnotherHandOfAndInICmp(): avoid constantexpr pitfall (PR42962) Instead of matching value and then blindly casting to BinaryOperator just to get the opcode, just match instruction and do no cast. Fixes https://bugs.llvm.org/show_bug.cgi?id=42962 llvm-svn: 368554	2019-08-12 11:28:02 +00:00
Simon Pilgrim	05e8209e33	[TargetLowering] SimplifyDemandedBits - call SimplifyMultipleUseDemandedBits for ISD::TRUNCATE llvm-svn: 368553	2019-08-12 10:56:05 +00:00
Pengfei Wang	e28cbbd5d4	[X86] Support -march=tigerlake Support -march=tigerlake for x86. Compared with Icelake Client, it includes 4 new features: avx512vp2intersect, movdiri, movdir64b, and shstk. Patch by Xiang Zhang (xiangzhangllvm) Differential Revision: https://reviews.llvm.org/D65840 llvm-svn: 368543	2019-08-12 01:29:46 +00:00
Wenlei He	cb5a90fd31	Fix pass dependency for LICM Expected to address buildbot failure http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/16285 caused by D65060. llvm-svn: 368542	2019-08-11 22:54:05 +00:00
Bjorn Pettersson	27038a3780	[SelectionDAG] Widen vector results of SMULFIX/UMULFIX/SMULFIXSAT Summary: After the commits that changed x86 backend to widen vectors instead of using promotion some of our downstream tests started to fail. It was noticed that WidenVectorResult has been missing support for SMULFIX/UMULFIX/SMULFIXSAT. This patch adds the missing functionality. Reviewers: craig.topper, RKSimon Reviewed By: craig.topper Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66051 llvm-svn: 368540	2019-08-11 19:27:06 +00:00
Craig Topper	ce6a2cf966	[X86] Simplify some of the type checks in combineSubToSubus. If we have SSE2 we can handle any i8/i16 type and let type legalization deal with it. llvm-svn: 368538	2019-08-11 17:36:49 +00:00
Craig Topper	637964bfd8	[X86] Don't use SplitOpsAndApply for ISD::USUBSAT. Target independent type legalization and custom lowering should be able to handle it. llvm-svn: 368537	2019-08-11 17:36:45 +00:00
Kang Zhang	b1a62d168f	[NFC][CodeGen] Use while loop instead for loop in MachineBlockPlacement::optimizeBranches() This will pass EXPENSIVE check. llvm-svn: 368532	2019-08-11 12:58:50 +00:00
David Green	11c4602fce	[MVE] Don't try to unroll vectorised MVE loops Due to the nature of the beat system in the MVE architecture, along with tail predication and low-overhead loops, unrolling has less benefit compared to normal loops. You can not, for example, hide the latency of a load with other instructions as you can for scalar code. Preventing unrolling also makes the code easier to read and reason about. So if a loop contains vector code, don't enable the runtime unrolling. At least for the time being. Differential Revision: https://reviews.llvm.org/D65803 llvm-svn: 368530	2019-08-11 08:53:18 +00:00
David Green	44f8d635e2	[ARM] Permit auto-vectorization using MVE With enough codegen complete, we can now correctly report the number and size of vector registers for MVE, allowing auto vectorisation. This also allows FP auto-vectorization for MVE without -Ofast/-ffast-math, due to support for IEEE FP arithmetic and parity between scalar and vector FP behaviour. Patch by David Sherwood. Differential Revision: https://reviews.llvm.org/D63728 llvm-svn: 368529	2019-08-11 08:42:57 +00:00
Heejin Ahn	831efe0e0f	Fix __clang_call_terminate's argument for foreign exceptions Summary: When exceptions are repeatedly thrown in the middle of handling another exception, we call `__clang_call_terminate` with the exception pointer (i32) as an argument. But in case of foreign exceptions, we don't have the pointer, so we call the function with 0. (This requires `__clang_call_terminate` can deal with 0 argument, which will be done later) But previously the 0 argument was not added as a `i32.const 0` but an immediate by mistake, causing the `call` instruction to take not an i32 but rather an exnref, because an `exnref` is left on top of the value stack if `br_on_exn` is not taken. ``` block i32 br_on_exn 0, __cpp_exception ;; exnref is on top of stack now i32.const 0 ;; This was missing! call __clang_call_terminate unreachable end call __clang_call_terminate ;; This takes i32 extracted by br_on_exn ``` Reviewers: dschuff Subscribers: sbc100, jgravelle-google, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65475 llvm-svn: 368527	2019-08-11 06:24:07 +00:00
Wenlei He	7e71aa24bc	[LICM] Make Loop ICM profile aware Summary: Hoisting/sinking instruction out of a loop isn't always beneficial. Hoisting an instruction from a cold block inside a loop body out of the loop could hurt performance. This change makes Loop ICM profile aware - it now checks block frequency to make sure hoisting/sinking only moves instructions to a colder block. Test Plan: ninja check Reviewers: asbirlea, sanjoy, reames, nikic, hfinkel, vsk Reviewed By: asbirlea Subscribers: fhahn, vsk, davidxl, xbolva00, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65060 llvm-svn: 368526	2019-08-11 06:05:35 +00:00
Wenlei He	d664072dd5	Revert "test commit" This reverts commit ad92a4a2769425ad0d39ac1dbb6282f6f51a1af7. llvm-svn: 368525	2019-08-11 05:59:20 +00:00
Wenlei He	7bd327da00	test commit llvm-svn: 368524	2019-08-11 05:50:28 +00:00
Craig Topper	9758e0e1bf	[X86] Remove some more code from combineShuffle that is no longer needed with widening legalization. llvm-svn: 368523	2019-08-11 02:17:18 +00:00
Craig Topper	0f74b82ef1	[X86] Remove some code from combineShuffle that seems largely unnecessary with widening legalization. The test case that changed is probably better served through allowing combineTruncatedArithmetic to create narrow vectors. It also appears InstCombine would have simplified this test case to remove the zext and trunc anyway. llvm-svn: 368522	2019-08-11 02:08:38 +00:00
Roman Lebedev	96474d17c6	[InstCombine][NFC] Use SimplifyAddInst() instead of SimplifyBinOp(Instruction::BinaryOps::Add, ) llvm-svn: 368521	2019-08-10 19:29:10 +00:00
Roman Lebedev	a8d20b4467	[InstCombine] Shift amount reassociation in bittest: relax one-use check when shifting constant If one of the values being shifted is a constant, since the new shift amount is known-constant, the new shift will end up being constant-folded so, we don't need that one-use restriction then. llvm-svn: 368519	2019-08-10 19:28:54 +00:00
Roman Lebedev	64fe806c4e	[InstCombine] Shift amount reassociation in bittest: drop pointless one-use restriction That one-use restriction is not needed for correctness - we have already ensured that one of the shifts will go away, so we know we won't increase the instruction count. So there is no need for that restriction. llvm-svn: 368518	2019-08-10 19:28:44 +00:00
Simon Pilgrim	ec128709f0	[X86][SSE] Lower shuffle as ANY_EXTEND_VECTOR_INREG On SSE41+ targets we always lower vector shuffles to ZERO_EXTEND_VECTOR_INREG, even if we don't need the extended bits. This patch relaxes this so that we lower to ANY_EXTEND_VECTOR_INREG if we can, meaning that shuffle combines have a better idea of what elements need to be kept zero. This helps the multiple reduction code as we can now combine away a lot more of the pack+extend codes. Differential Revision: https://reviews.llvm.org/D65741 llvm-svn: 368515	2019-08-10 16:46:07 +00:00
Kang Zhang	555f7495df	[NFC][CodeGen] Modify the PI++ to ++PI in MachineBlockPlacement::optimizeBranches() llvm-svn: 368514	2019-08-10 16:23:17 +00:00
Sanjay Patel	21c15ef384	[Reassociate] try harder to convert negative FP constants to positive This is an extension of a transform that tries to produce positive floating-point constants to improve canonicalization (and hopefully lead to more reassociation and CSE). The original patches were: D4904 D5363 (rL221721) But as the test diffs show, these were limited to basic patterns by walking from an instruction to its single user rather than recursively moving up the def-use sequence. No fast-math is required here because we're only rearranging implicit FP negations in intermediate ops. A motivating bug is: https://bugs.llvm.org/show_bug.cgi?id=32939 Differential Revision: https://reviews.llvm.org/D65954 llvm-svn: 368512	2019-08-10 13:17:54 +00:00
Kang Zhang	36cd84bdd9	[CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks Summary: In `block-placement` pass, it will create some patterns for unconditional we can do the simple early return. But the `early-ret` pass is before `block-placement`, we don't want to run it again. This patch is to do the simple early return to optimize the blocks at the last of `block-placement`. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D63972 llvm-svn: 368509	2019-08-10 09:58:52 +00:00
Craig Topper	74c43a2277	[X86] Match the IR pattern for movmsk on SSE1 only targets where v4i32 isn't legal Summary: This patch adds a special DAG combine for SSE1 to recognize the IR pattern InstCombine gives us for movmsk. This only does the recognition for a few cases where it's obvious the input won't be scalarized, resulting in building a vector just to do the movmsk. I've made it separate from our existing matching for movmsk since that's called in multiple places and I didn't spend time to see if the other callers would make sense here. Plus the restrictions and additional checks would complicate that. This fixes the case from PR42870. But it's probably still broken in the presence of logic ops feeding the movmsk pattern, which would further hide the v4f32 type. Reviewers: spatel, RKSimon, xbolva00 Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65689 llvm-svn: 368506	2019-08-10 07:51:13 +00:00
Craig Topper	a8e5e73711	[X86] Improve the diagnostic for larger than 4-bit immediate for vpermil2pd/ps. Only allow MCConstantExprs. llvm-svn: 368505	2019-08-10 04:28:52 +00:00
Luo, Yuanke	c6c86f4f81	[X86] Fix stack probe issue on windows32. Summary: On Windows, if the frame size exceeds 4096 bytes, the compiler needs to generate a call to _alloca_probe. The X86CallFrameOptimization pass changes the reserved stack size, which causes the stack probe function not to be inserted. This patch fixes the issue by detecting the call frame size: if the size exceeds 4096 bytes, X86CallFrameOptimization is dropped. Reviewers: craig.topper, wxiao3, annita.zhang, rnk, RKSimon Reviewed By: rnk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65923 llvm-svn: 368503	2019-08-10 02:49:02 +00:00
Fedor Sergeev	92e160abab	[MemDep] allow to select block-scan-limit when constructing MemoryDependenceAnalysis Introducing non-global control for default block-scan-limit in MemDep analysis. Useful when there are many compilations per initialized LLVM instance (e.g. JIT). Reviewed By: asbirlea Tags: #llvm Differential Revision: https://reviews.llvm.org/D65806 llvm-svn: 368502	2019-08-10 01:23:38 +00:00
Peter Collingbourne	0e497d1554	cfi-icall: Allow the jump table to be optionally made non-canonical. The default behavior of Clang's indirect function call checker will replace the address of each CFI-checked function in the output file's symbol table with the address of a jump table entry which will pass CFI checks. We refer to this as making the jump table `canonical`. This property allows code that was not compiled with ``-fsanitize=cfi-icall`` to take a CFI-valid address of a function, but it comes with a couple of caveats that are especially relevant for users of cross-DSO CFI: - There is a performance and code size overhead associated with each exported function, because each such function must have an associated jump table entry, which must be emitted even in the common case where the function is never address-taken anywhere in the program, and must be used even for direct calls between DSOs, in addition to the PLT overhead. - There is no good way to take a CFI-valid address of a function written in assembly or a language not supported by Clang. The reason is that the code generator would need to insert a jump table in order to form a CFI-valid address for assembly functions, but there is no way in general for the code generator to determine the language of the function. This may be possible with LTO in the intra-DSO case, but in the cross-DSO case the only information available is the function declaration. One possible solution is to add a C wrapper for each assembly function, but these wrappers can present a significant maintenance burden for heavy users of assembly in addition to adding runtime overhead. For these reasons, we provide the option of making the jump table non-canonical with the flag ``-fno-sanitize-cfi-canonical-jump-tables``. When the jump table is made non-canonical, symbol table entries point directly to the function body. Any instances of a function's address being taken in C will be replaced with a jump table address. 
This scheme does have its own caveats, however. It does end up breaking function address equality more aggressively than the default behavior, especially in cross-DSO mode which normally preserves function address equality entirely. Furthermore, it is occasionally necessary for code not compiled with ``-fsanitize=cfi-icall`` to take a function address that is valid for CFI. For example, this is necessary when a function's address is taken by assembly code and then called by CFI-checking C code. The ``__attribute__((cfi_jump_table_canonical))`` attribute may be used to make the jump table entry of a specific function canonical so that the external code will end up taking a address for the function that will pass CFI checks. Fixes PR41972. Differential Revision: https://reviews.llvm.org/D65629 llvm-svn: 368495	2019-08-09 22:31:59 +00:00
Sanjay Patel	26b2c11451	[DAGCombiner] exclude x*2.0 from normal negation profitability rules This is the codegen part of fixing: https://bugs.llvm.org/show_bug.cgi?id=32939 Even with the optimal/canonical IR that is ideally created by D65954, we would reverse that transform in DAGCombiner and end up with the same asm on AArch64 or x86. I see 2 options for trying to correct this: 1. Limit isNegatibleForFree() by special-casing the fmul pattern (this patch). 2. Avoid creating (fmul X, 2.0) in the 1st place by adding a special-case transform to SelectionDAG::getNode() and/or SelectionDAGBuilder::visitFMul() that matches the transform done by DAGCombiner. This seems like the less intrusive patch, but if there's some other reason to prefer 1 option over the other, we can change to the other option. Differential Revision: https://reviews.llvm.org/D66016 llvm-svn: 368490	2019-08-09 21:37:32 +00:00
Daniel Sanders	e9a57c2b23	[globalisel] Add G_SEXT_INREG Summary: Targets often have instructions that can sign-extend certain cases faster than the equivalent shift-left/arithmetic-shift-right. Such cases can be identified by matching a shift-left/shift-right pair but there are some issues with this in the context of combines. For example, suppose you can sign-extend 8-bit up to 32-bit with a target extend instruction. %1:_(s32) = G_SHL %0:_(s32), i32 24 # (I've inlined the G_CONSTANT for brevity) %2:_(s32) = G_ASHR %1:_(s32), i32 24 %3:_(s32) = G_ASHR %2:_(s32), i32 1 would reasonably combine to: %1:_(s32) = G_SHL %0:_(s32), i32 24 %2:_(s32) = G_ASHR %1:_(s32), i32 25 which no longer matches the special case. If your shifts and extend are equal cost, this would break even as a pair of shifts but if your shift is more expensive than the extend then it's cheaper as: %2:_(s32) = G_SEXT_INREG %0:_(s32), i32 8 %3:_(s32) = G_ASHR %2:_(s32), i32 1 It's possible to match the shift-pair in ISel and emit an extend and ashr. However, this is far from the only way to break this shift pair and make it hard to match the extends. Another example is that with the right known-zeros, this: %1:_(s32) = G_SHL %0:_(s32), i32 24 %2:_(s32) = G_ASHR %1:_(s32), i32 24 %3:_(s32) = G_MUL %2:_(s32), i32 2 can become: %1:_(s32) = G_SHL %0:_(s32), i32 24 %2:_(s32) = G_ASHR %1:_(s32), i32 23 All upstream targets have been configured to lower it to the current G_SHL,G_ASHR pair but will likely want to make it legal in some cases to handle their faster cases. To follow-up: Provide a way to legalize based on the constant. At the moment, I'm thinking that the best way to achieve this is to provide the MI in LegalityQuery but that opens the door to breaking core principles of the legalizer (legality is not context sensitive). That said, it's worth noting that looking at other instructions and acting on that information doesn't violate this principle in itself. 
It's only a violation if, at the end of legalization, a pass that checks legality without being able to see the context would say an instruction might not be legal. That's a fairly subtle distinction so to give a concrete example, saying %2 in: %1 = G_CONSTANT 16 %2 = G_SEXT_INREG %0, %1 is legal is in violation of that principle if the legality of %2 depends on %1 being constant and/or being 16. However, legalizing to either: %2 = G_SEXT_INREG %0, 16 or: %1 = G_CONSTANT 16 %2:_(s32) = G_SHL %0, %1 %3:_(s32) = G_ASHR %2, %1 depending on whether %1 is constant and 16 does not violate that principle since both outputs are genuinely legal. Reviewers: bogner, aditya_nandakumar, volkan, aemerson, paquette, arsenm Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, kristof.beyls, javed.absar, hiraditya, jrtc27, atanasyan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61289 llvm-svn: 368487	2019-08-09 21:11:20 +00:00
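The equivalence this commit relies on, between a shift-left/arithmetic-shift-right pair and a sign extension of the low bits within the same register, can be checked with ordinary 32-bit integers. This is a rough sketch with made-up function names, not GlobalISel code. `Bits` is assumed to be in the range 1 to 32.

```cpp
#include <cstdint>

// Shift-pair form: shift left by (32 - Bits), then arithmetic shift right
// by the same amount, as in the G_SHL / G_ASHR pair from the message.
inline std::int32_t sextViaShiftPair(std::uint32_t X, unsigned Bits) {
  const unsigned Shift = 32 - Bits;
  // Assumes arithmetic right shift on signed values and modular
  // unsigned-to-signed conversion (both guaranteed since C++20).
  return static_cast<std::int32_t>(X << Shift) >> Shift;
}

// Dedicated-extend form: keep the low Bits bits and replicate bit
// (Bits - 1) upward using the xor-then-subtract trick.
inline std::int32_t sextInReg(std::uint32_t X, unsigned Bits) {
  const std::uint32_t Mask =
      (Bits == 32) ? ~std::uint32_t{0} : ((std::uint32_t{1} << Bits) - 1);
  const std::uint32_t SignBit = std::uint32_t{1} << (Bits - 1);
  return static_cast<std::int32_t>(((X & Mask) ^ SignBit) - SignBit);
}
```

For example, with `Bits = 8` the input `0xF0` extends to `-16` in both forms, and a further arithmetic shift right by 1 gives `-8`. That matches shifting left by 24 and then right by 25 in a single merged step, the combined form the message says no longer looks like a sign extension.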
Eric Christopher	db2f17d362	Remove variable only used in an assert. llvm-svn: 368486	2019-08-09 21:02:47 +00:00
Craig Topper	6cb05ca044	[X86] Remove custom handling for extloads from LowerLoad. We don't appear to need this with widening legalization. llvm-svn: 368479	2019-08-09 20:27:22 +00:00
Bill Wendling	79176a2542	[CodeGen] Require a name for a block addr target Summary: A block address may be used in inline assembly. In which case it requires a name so that the asm parser has something to parse. Creating a name for every block address is a large hammer, but is necessary because at the point when a temp symbol is created we don't necessarily know if it's used in inline asm. This ensures that it exists regardless. Reviewers: nickdesaulniers, craig.topper Subscribers: nathanchance, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65352 llvm-svn: 368478	2019-08-09 20:18:30 +00:00
Bill Wendling	1b10438875	[MC] Don't recreate a label if it's already used Summary: This patch keeps track of MCSymbols created for blocks that were referenced in inline asm. It prevents creating a new symbol which doesn't refer to the block. Inline asm may have a reference to a label. The asm parser however doesn't recognize it as a label and tries to create a new symbol. The result being that instead of the original symbol (e.g. ".Ltmp0") the parser replaces it in the inline asm with the new one (e.g. ".Ltmp00") without updating it in the symbol table. So the machine basic block retains the "old" symbol (".Ltmp0"), but the inline asm uses the new one (".Ltmp00"). Reviewers: nickdesaulniers, craig.topper Subscribers: nathanchance, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65304 llvm-svn: 368477	2019-08-09 20:16:31 +00:00
Evandro Menezes	59fbe516bd	[InstCombine] Refactor optimizeExp2() (NFC) Refactor `LibCallSimplifier::optimizeExp2()` to use the new `emitBinaryFloatFnCall()` version that fetches the function name from TLI. llvm-svn: 368457	2019-08-09 17:22:56 +00:00
Evandro Menezes	8a21214174	[Transforms] Add an emitBinaryFloatFnCall() version that fetches the function name from TLI Add the counterpart to a similar function for single operands. Differential revision: https://reviews.llvm.org/D65976 llvm-svn: 368453	2019-08-09 17:06:46 +00:00
Sunil Srivastava	27f6f2f88b	Print reasonable representations of type names in llvm-nm, readelf and readobj For type values that do not have proper names, print reasonable representation in llvm-nm, llvm-readobj and llvm-readelf, matching GNU tools. Fixes PR41713. Differential Revision: https://reviews.llvm.org/D65537 llvm-svn: 368451	2019-08-09 16:54:51 +00:00
Evandro Menezes	c6c00cdf2e	[Transforms] Rename hasUnaryFloatFn() and getUnaryFloatFn() (NFC) Rename `hasUnaryFloatFn()` to `hasFloatFn()` and `getUnaryFloatFn()` to `getFloatFnName()`. llvm-svn: 368449	2019-08-09 16:04:18 +00:00
Sanjay Patel	0b4ae34c2f	[DAGCombiner] remove redundant fold for X*1.0; NFC This is handled at node creation time (similar to X/1.0) after: rL357029 (no fast-math-flags needed) llvm-svn: 368443	2019-08-09 14:30:59 +00:00
Jinsong Ji	6349ce5ca5	[MachinePipeliner] Avoid indeterminate order in FuncUnitSorter Summary: This is exposed by adding a new testcase in PowerPC in https://reviews.llvm.org/rL367732 The testcase got different output on different platforms, hence breaking buildbots. The problem is that we get different FuncUnitOrder when calculateResMII. The root cause is: 1. Two MachineInstr might get the SAME priority (MFUsx) from minFuncUnits. 2. The current comparison operator() will return `MFUs1 > MFUs2`. 3. We use iterators for MachineInstr, so the input to FuncUnitSorter might be different on different platforms due to the iterator nature. So for two MI with the same MFU, their order actually depends on the iterator order, which is platform (implementation) dependent. This is risky, and may cause cross-compiling problems. The fix is to make sure we assign a deterministic order when they are equal. Reviewers: bcahoon, hfinkel, jmolloy Subscribers: nemanjai, hiraditya, MaskRay, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65992 llvm-svn: 368441	2019-08-09 14:10:57 +00:00
Whitney Tsang	dd3b6498b0	Title: Loop Cache Analysis Summary: Implement a new analysis to estimate the number of cache lines required by a loop nest. The analysis is largely based on the following paper: Compiler Optimizations for Improving Data Locality By: Steve Carr, Katherine S. McKinley, Chau-Wen Tseng http://www.cs.utexas.edu/users/mckinley/papers/asplos-1994.pdf The analysis considers temporal reuse (accesses to the same memory location) and spatial reuse (accesses to memory locations within a cache line). For simplicity the analysis considers memory accesses in the innermost loop in a loop nest, and thus determines the number of cache lines used when the loop L in loop nest LN is placed in the innermost position. The result of the analysis can be used to drive several transformations. As an example, loop interchange could use it to determine which loops in a perfect loop nest should be interchanged to maximize cache reuse. Similarly, loop distribution could be enhanced to take into consideration cache reuse between arrays when distributing a loop to eliminate vectorization inhibiting dependencies. The general approach taken to estimate the number of cache lines used by the memory references in the inner loop of a loop nest is: Partition memory references that exhibit temporal or spatial reuse into reference groups. For each loop L in the loop nest LN: a. Compute the cost of the reference group b. Compute the 'cache cost' of the loop nest by summing up the reference groups costs For further details of the algorithm please refer to the paper. Authored By: etiotto Reviewers: hfinkel, Meinersbur, jdoerfert, kbarton, bmahjour, anemet, fhahn Reviewed By: Meinersbur Subscribers: reames, nemanjai, MaskRay, wuzish, Hahnfeld, xusx595, venkataramanan.kumar.llvm, greened, dmgreen, steleman, fhahn, xblvaOO, Whitney, mgorny, hiraditya, mgrang, jsji, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D63459 llvm-svn: 368439	2019-08-09 13:56:29 +00:00
Simon Pilgrim	60394f47b0	[X86][SSE] Swap X86ISD::BLENDV inputs with an inverted selection mask (PR42825) As discussed on PR42825, if we are inverting the selection mask we can just swap the inputs and avoid the inversion. Differential Revision: https://reviews.llvm.org/D65522 llvm-svn: 368438	2019-08-09 12:44:20 +00:00
Sanjay Patel	991834a516	[GlobalOpt] prevent crashing on large integer types (PR42932) This is a minimal fix (copy the predicate for the assert) to prevent the crashing seen in: https://bugs.llvm.org/show_bug.cgi?id=42932 ...when converting a constant integer of arbitrary width to uint64_t. Differential Revision: https://reviews.llvm.org/D65970 llvm-svn: 368437	2019-08-09 12:43:25 +00:00
Simon Atanasyan	242c5a70d4	[Mips][Codegen] Fix fast-isel mixing of FGR64 and AFGR64 registers Fast-isel was picking the AFGR64 register class for processing call arguments when the +fp64 option was used. We simply check if option +fp64 is used and pick the appropriate register. Patch by Mirko Brkusanin. Differential Revision: https://reviews.llvm.org/D65886 llvm-svn: 368433	2019-08-09 12:02:32 +00:00
Andrea Di Biagio	cbec9af6bf	[MCA] Add flag -show-encoding to llvm-mca. Flag -show-encoding enables the printing of instruction encodings as part of the instruction info view. Example (with flags -mtriple=x86_64-- -mcpu=btver2): Instruction Info: [1]: #uOps [2]: Latency [3]: RThroughput [4]: MayLoad [5]: MayStore [6]: HasSideEffects (U) [7]: Encoding Size [1] [2] [3] [4] [5] [6] [7] Encodings: Instructions: 1 2 1.00 4 c5 f0 59 d0 vmulps %xmm0, %xmm1, %xmm2 1 4 1.00 4 c5 eb 7c da vhaddps %xmm2, %xmm2, %xmm3 1 4 1.00 4 c5 e3 7c e3 vhaddps %xmm3, %xmm3, %xmm4 In this example, column Encoding Size is the size in bytes of the instruction encoding. Column Encodings reports the actual instruction encodings as byte sequences in hex (objdump style). The computation of encodings is done by a utility class named mca::CodeEmitter. In future, I plan to expose the CodeEmitter to the instruction builder, so that information about instruction encoding sizes can be used by the simulator. That would be a first step towards simulating the throughput from the decoders in the hardware frontend. Differential Revision: https://reviews.llvm.org/D65948 llvm-svn: 368432	2019-08-09 11:26:27 +00:00
Pablo Barrio	3cdd586be2	[AArch64] Set pref. func. align to 8 bytes on Neoverse E1 & Cortex-A65 Summary: The Arm Neoverse E1 and Cortex-A65 Software Optimization Guide [1][2], Section "4.7 Branch instruction alignment" state: "It is preferable for branch targets, including subroutine entry points, to be placed on aligned 64-bit boundaries to maximize instruction fetch efficiency." This patch sets the preferred function alignment on Neoverse E1 and Cortex-A65 to 2^3=8B. This was already the case in some Cortex-A CPUs such as Cortex-A53. [1] https://developer.arm.com/docs/swog466751/latest/arm-neoversetm-e1-core-software-optimization-guide [2] https://developer.arm.com/docs/swog010045/latest/arm-cortex-a65-core-software-optimization-guide Reviewers: dmgreen, fhahn, samparker Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65937 llvm-svn: 368431	2019-08-09 11:05:15 +00:00
Tim Northover	01eb869114	AArch64: support TLS on Darwin platforms in GlobalISel. All TLS access on Darwin is in the "general dynamic" form where we call a function to resolve the address, so implementation is pretty simple. llvm-svn: 368418	2019-08-09 09:32:38 +00:00
Tim Northover	e1a5f668b3	GlobalISel: pack various parameters for lowerCall into a struct. I've now needed to add an extra parameter to this call twice recently. Not only is the signature getting extremely unwieldy, but just updating all of the callsites and implementations is a pain. Putting the parameters in a struct sidesteps both issues. llvm-svn: 368408	2019-08-09 08:26:38 +00:00
Sam Parker	0dba791a25	[ARM][ParallelDSP] Replace SExt uses As loads are combined and widened, we replaced their sext users operands whereas we should have been replacing the uses of the sext. I've added a load of tests, with only a few of them originally causing assertion failures, the rest improve pattern coverage. Differential Revision: https://reviews.llvm.org/D65740 llvm-svn: 368404	2019-08-09 07:48:50 +00:00
Bjorn Pettersson	d218a3326e	[InstSimplify] Report "Changed" also when only deleting dead instructions Summary: Make sure that we report that changes have been made by InstSimplify also in situations when only trivially dead instructions have been removed. If for example a call is removed the call graph must be updated. The bug seems to have been introduced by llvm-svn r367173 (commit `02b9e45a7e`), since the code in question was rewritten in that commit. Reviewers: spatel, chandlerc, foad Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65973 llvm-svn: 368401	2019-08-09 07:08:25 +00:00
Craig Topper	6179175551	[X86] Remove code that expands truncating stores from combineStore. We shouldn't form trunc stores that need to be expanded now that we are using widening legalization. llvm-svn: 368400	2019-08-09 06:59:53 +00:00
Craig Topper	7e33f11ba7	[X86] Remove stale FIXME from combineMaskedStore. NFC I believe PR34584 was tracking that FIXME, but it's since been closed and a test case was added. llvm-svn: 368397	2019-08-09 05:55:41 +00:00
Craig Topper	8c5c09780d	[X86] Remove DAG combine expansion of extending masked load and truncating masked store. The only way to generate these was through promoting legalization of narrow vectors, but we widen those types now. So we shouldn't produce these nodes. llvm-svn: 368396	2019-08-09 05:53:37 +00:00
Craig Topper	509c8774fa	[X86] Remove handler for (U/S)(ADD/SUB)SAT from ReplaceNodeResults. Remove TypeWidenVector check from code that handles X86ISD::VPMADDWD and X86ISD::AVG. More unneeded code since we now legalize narrow vectors by widening. llvm-svn: 368395	2019-08-09 05:17:52 +00:00
Craig Topper	824961824f	[X86] Remove ISD::SETCC handling from ReplaceNodeResults. This is no longer needed since we widen v2i32 instead of promoting. llvm-svn: 368394	2019-08-09 05:17:48 +00:00
Craig Topper	ef5b435b00	[X86] Simplify ISD::LOAD handling in ReplaceNodeResults and ISD::STORE handling in LowerStore now that v2i32 is widened to v4i32. llvm-svn: 368390	2019-08-09 03:09:43 +00:00
Craig Topper	0da681a2be	[X86] Merge v2f32 and v2i32 gather/scatter handling in ReplaceNodeResults/LowerMSCATTER now that v2i32 is also widened like v2f32. llvm-svn: 368389	2019-08-09 03:09:28 +00:00
Craig Topper	6f81db0f68	[X86] Remove now unreachable handling for f64->v2i32/v4i16/v8i8 bitcasts from ReplaceNodeResults. We rely on the generic type legalizer for this now. llvm-svn: 368388	2019-08-09 03:09:19 +00:00
Craig Topper	d871f638d7	[X86] Simplify ReplaceNodeResults handling for FP_TO_SINT/UINT for vectors to only handle widening. llvm-svn: 368387	2019-08-09 03:09:10 +00:00
Craig Topper	0bd44d59db	[X86] Simplify ReplaceNodeResults handling for SIGN_EXTEND/ZERO_EXTEND/TRUNCATE for vectors to only handle widening. llvm-svn: 368386	2019-08-09 03:08:54 +00:00
Craig Topper	cdb9a8ebd8	[X86] Simplify ReplaceNodeResults handling for UDIV/UREM/SDIV/SREM for vectors to only handle widening. llvm-svn: 368385	2019-08-09 03:08:45 +00:00
Craig Topper	35848345f0	[X86] Remove vector promotion handling from the ReplaceNodeResults ISD::MUL handling code. We now widen illegal vector types so we don't need this anymore. llvm-svn: 368384	2019-08-09 03:08:28 +00:00
David Blaikie	0fcc1f7bac	DebugInfo/DWARF: Provide some (pretty half-hearted) error handling access when parsing units This isn't the most robust error handling API, but does allow clients to opt-in to getting Errors they can handle. I suspect the long-term solution would be to move away from the lazy unit parsing and have an explicit step that parses the unit and then allows access to the other APIs that require a parsed unit. llvm-dwarfdump could be expanded to use this (or newer/better API) to demonstrate the benefit of it - but for now lld will use this in a follow-up cl which ensures lld can exit non-zero on errors like this (& provide more descriptive diagnostics including which object file the error came from). (error access to later errors when parsing nested DIEs would be good too - but, again, exposing that without it being a hassle for every consumer may be tricky) llvm-svn: 368377	2019-08-09 01:14:33 +00:00
Akira Hatanaka	3e61ed0299	Change the return type of UpgradeARCRuntimeCalls to void Nothing is using the function return. llvm-svn: 368367	2019-08-08 23:33:17 +00:00
David Blaikie	5b9508396c	Remove else-after-return llvm-svn: 368364	2019-08-08 23:17:23 +00:00
Peter Collingbourne	bb17e46644	Linker: Add support for GlobalIFunc. GlobalAlias and GlobalIFunc ought to be treated the same by the IR linker, so we can generalize the code to be in terms of their common base class GlobalIndirectSymbol. Differential Revision: https://reviews.llvm.org/D55046 llvm-svn: 368357	2019-08-08 22:09:18 +00:00
Cameron McInally	8416f20f2f	[LICM] Support unary FNeg in LICM Differential Revision: https://reviews.llvm.org/D65908 llvm-svn: 368350	2019-08-08 21:38:31 +00:00
Craig Topper	c49d3e6c4d	[X86] Improve codegen of v8i64->v8i16 and v16i32->v16i8 truncate with avx512vl, avx512bw, min-legal-vector-width<=256 and prefer-vector-width=256 Under this configuration we'll want to split the v8i64 or v16i32 into two vectors. The default legalization will try to truncate each of those 256-bit pieces one step to 128-bit, concatenate those, then truncate one more time from the new 256 to 128 bits. With this patch we now truncate the two splits to 64-bits then concatenate those. We have to do this two different ways depending on whether we have widening legalization enabled. Without widening legalization we have to manually construct X86ISD::VTRUNC to prevent the ISD::TRUNCATE with a narrow result being promoted to 128 bits with a larger element type than what we want followed by something like a pshufb to grab the lower half of each element to finish the job. With widening legalization we just get the right thing. When we switch to widening by default we can just delete the other code path. Differential Revision: https://reviews.llvm.org/D65626 llvm-svn: 368349	2019-08-08 21:36:47 +00:00
Craig Topper	9158e54270	[SelectionDAG][X86] Move setcc mask splitting for mload/mstore/mgather/mscatter from DAGCombiner to the type legalizer. We may be able to look to how VSELECT is handled to further improve this, but this appears to be neutral or an improvement on the test cases we have. llvm-svn: 368344	2019-08-08 21:14:08 +00:00
Craig Topper	bce4d79f37	[LegalizeTypes] Remove SplitVSETCC helper and just call SplitVecRes_SETCC. llvm-svn: 368343	2019-08-08 21:13:58 +00:00
Guozhi Wei	80347c3acc	[MBP] Disable aggressive loop rotate in plain mode Patch https://reviews.llvm.org/D43256 introduced more aggressive loop layout optimization which depends on profile information. If profile information is not available, the statically estimated profile information (generated by BranchProbabilityInfo.cpp) is used. If the user program doesn't behave as BranchProbabilityInfo.cpp expected, the layout may be worse. To be conservative this patch restores the original layout algorithm in plain mode. But the user can still try the aggressive layout optimization with -force-precise-rotation-cost=true. Differential Revision: https://reviews.llvm.org/D65673 llvm-svn: 368339	2019-08-08 20:25:23 +00:00
Brian Cain	6dbbd0f343	[llvm-mc] Add reportWarning() to MCContext Adding reportWarning() to MCContext, so that it can be used from the Hexagon assembler backend. llvm-svn: 368327	2019-08-08 19:13:23 +00:00
Craig Topper	9d55e2c85e	[X86] Make CMPXCHG16B feature imply CMPXCHG8B feature. This fixes znver1 so that it properly enables CMPXCHG8B. We can probably remove explicit CMPXCHG8B from CPUs that also have CMPXCHG16B, but keeping this simple to allow cherry pick to 9.0. Fixes PR42935. llvm-svn: 368324	2019-08-08 18:11:17 +00:00
Pirama Arumuga Nainar	0cb2a33dfd	[AArch64] Do not emit '#' before immediates in inline asm Summary: The A64 assembly language does not require the '#' character to introduce constant immediate operands. Avoid the '#' since the AArch64 asm parser does not accept '#' before the lane specifier and rejects the following: __asm__ ("fmla v2.4s, v0.4s, v1.s[%0]" :: "I"(0x1)) Fix a test to not expect the '#' and add a new test case with the above asm. Fixes: https://github.com/android-ndk/ndk/issues/1036 Reviewers: peter.smith, kristof.beyls Subscribers: javed.absar, hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D65550 llvm-svn: 368320	2019-08-08 17:50:39 +00:00
Akira Hatanaka	ecde8c7ad4	[ObjC][ARC] Upgrade calls to ARC runtime functions to intrinsic calls if the bitcode has the arm64 retainAutoreleasedReturnValue marker The ARC middle-end passes stopped optimizing or transforming bitcode that has been compiled with old compilers after we started emitting calls to ARC runtime functions as intrinsic calls instead of normal function calls in the front-end and made changes to teach the ARC middle-end passes about those intrinsics (see r349534). This patch converts calls to ARC runtime functions that are not intrinsic functions to intrinsic function calls if the bitcode has the arm64 retainAutoreleasedReturnValue marker. Checking for the presence of the marker is necessary to make sure we aren't changing ARC function calls that were originally MRR message sends (see r349952). rdar://problem/53280660 Differential Revision: https://reviews.llvm.org/D65902 llvm-svn: 368311	2019-08-08 16:59:31 +00:00
Simon Pilgrim	eb7a553db8	[X86] XFormVExtractWithShuffleIntoLoad - handle shuffle mask scaling If the target shuffle mask is from a wider type, attempt to scale the mask so that the extraction can attempt to peek through. Fixes the regression mentioned in rL368307 llvm-svn: 368308	2019-08-08 16:05:23 +00:00
Simon Pilgrim	67c246bbe6	[X86] SimplifyDemandedVectorElts - attempt to recombine target shuffle using DemandedElts mask If we don't demand all elements, then attempt to combine to a simpler shuffle. At the moment we can only do this if Depth == 0 as combineX86ShufflesRecursively uses Depth to track whether the shuffle has really changed or not - we'll need to change this before we can properly start merging combineX86ShufflesRecursively into SimplifyDemandedVectorElts. The insertps-combine.ll regression is because XFormVExtractWithShuffleIntoLoad can't see through shuffles of different widths - this will be fixed in a follow-up commit. llvm-svn: 368307	2019-08-08 15:54:20 +00:00
David Tenty	8558aac82c	Enable assembly output of local commons for AIX Summary: This patch enables assembly output of local commons for AIX using .lcomm directives. Adds an EmitXCOFFLocalCommonSymbol to MCStreamer so we can emit the AIX version of .lcomm assembly directives which include a csect name. Handle the case of BSS locals in PPCAIXAsmPrinter by using EmitXCOFFLocalCommonSymbol. Adds a test for generating .lcomm on AIX Targets. Reviewers: cebowleratibm, hubert.reinterpretcast, Xiangling_L, jasonliu, sfertile Reviewed By: sfertile Subscribers: wuzish, nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64825 llvm-svn: 368306	2019-08-08 15:40:35 +00:00
David Green	27ca82f32a	[ARM] Add support for MVE pre and post inc loads and stores This adds pre- and post- increment and decrements for MVE loads and stores. It uses the builtin pre and post load/store detection, unlike Neon. Loads are selected with the code in tryT2IndexedLoad, stores are selected with tablegen patterns. The immediates have a +/-7bit range, multiplied by the size of the element. Differential Revision: https://reviews.llvm.org/D63840 llvm-svn: 368305	2019-08-08 15:27:58 +00:00
David Green	824ffd8b12	[ARM] MVE big endian loads/stores This adds some missing patterns for big endian loads/stores, allowing unaligned loads/stores to also be selected with an extra VREV, which produces better code than aligning through a stack. Also moves VLDR_P0 to not be LE only, and adjusts some of the tests to show all that working. Differential Revision: https://reviews.llvm.org/D65583 llvm-svn: 368304	2019-08-08 15:15:19 +00:00
Sam Elliott	856d5c5817	[RISCV] Allow ABI Names in Inline Assembly Constraints Summary: Clang will replace references to registers using ABI names in inline assembly constraints with references to architecture names, but other frontends do not. LLVM uses the regular assembly parser to parse inline asm, so inline assembly strings can contain references to registers using their ABI names. This patch adds support for parsing constraints using either the ABI name or the architectural register name. This means we do not need to implement the ABI name replacement code in every single frontend, especially those like Rust which are a very thin shim on top of LLVM IR's inline asm, and that constraints can more closely match the assembly strings they refer to. Reviewers: asb, simoncook Reviewed By: simoncook Subscribers: hiraditya, rbar, johnrusso, JDevlieghere, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65947 llvm-svn: 368303	2019-08-08 14:59:16 +00:00
Sam Elliott	cd44aee3da	[RISCV] Minimal stack realignment support Summary: Currently the RISC-V backend does not realign the stack. This can be an issue even for the RV32I/RV64I ABIs (where the stack is 16-byte aligned), though is rare. It will be much more common with RV32E (though the alignment requirements for common data types remain under-documented...). This patch adds minimal support for stack realignment. It should cope with large realignments. It will error out if the stack needs realignment and variable sized objects are present. It feels like a lot of the code like getFrameIndexReference and determineFrameLayout could be refactored somehow, as right now it feels fiddly and brittle. We also seem to allocate a lot more memory than GCC does for equivalent C code. Reviewers: asb Reviewed By: asb Subscribers: wwei, jrtc27, s.egerton, MaskRay, Jim, lenary, hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62007 llvm-svn: 368300	2019-08-08 14:40:54 +00:00
Thomas Preud'homme	b1add2b774	[FileCheck] Add missing includes in header Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65778 llvm-svn: 368297	2019-08-08 13:56:59 +00:00
Tim Corringham	4f64f1ba3c	Add llvm.licm.disable metadata For some targets the LICM pass can result in sub-optimal code in some cases where it would be better not to run the pass, but it isn't always possible to suppress the transformations heuristically. Where the front-end has insight into such cases it is beneficial to attach loop metadata to disable the pass - this change adds the llvm.licm.disable metadata to enable that. Differential Revision: https://reviews.llvm.org/D64557 llvm-svn: 368296	2019-08-08 13:46:17 +00:00
Simon Pilgrim	59fabf9c60	[X86][SSE] matchBinaryPermuteShuffle - split INSERTPS combines We need to prefer INSERTPS with zeros over SHUFPS, but fallback to INSERTPS if that fails. llvm-svn: 368292	2019-08-08 13:23:53 +00:00
Simon Pilgrim	e2e366797e	[TargetLowering] SimplifyDemandedBits - call SimplifyMultipleUseDemandedBits for ISD::EXTRACT_VECTOR_ELT This patch attempts to peek through vectors based on the demanded bits/elt of a particular ISD::EXTRACT_VECTOR_ELT node, allowing us to avoid dependencies on ops that have no impact on the extract. In particular this helps remove some unnecessary scalar->vector->scalar patterns. The wasm shift patterns are annoying - @tlively has indicated that the wasm vector shift codegen are to be refactored in the near-term and isn't considered a major issue. Differential Revision: https://reviews.llvm.org/D65887 llvm-svn: 368276	2019-08-08 10:37:03 +00:00
Andrea Di Biagio	987331671f	[MCA] Remove dependency from InstrBuilder in mca::Context. NFC InstrBuilder is not required to construct the default pipeline. llvm-svn: 368275	2019-08-08 10:30:58 +00:00
Petar Avramovic	caef930699	[MIPS GlobalISel] Select jump_table and brjt G_JUMP_TABLE and G_BRJT appear from translation of switch statement. Select these two instructions for MIPS32, both pic and non-pic. Differential Revision: https://reviews.llvm.org/D65861 llvm-svn: 368274	2019-08-08 10:21:12 +00:00
George Rimar	d3963051c4	[yaml2obj/obj2yaml] - Add basic support for extended section indexes. In some cases a symbol might have section index == SHN_XINDEX. This is an escape value indicating that the actual section header index is too large to fit in the containing field. Then the SHT_SYMTAB_SHNDX section is used. It contains the 32-bit values that store section indexes. ELF gABI says that there can be multiple SHT_SYMTAB_SHNDX sections, i.e. for example one for .symtab and one for .dynsym (1) https://groups.google.com/forum/#!topic/generic-abi/-XJAV5d8PRg (2) DT_SYMTAB_SHNDX: http://www.sco.com/developers/gabi/latest/ch5.dynamic.html In this patch I am only supporting a single SHT_SYMTAB_SHNDX associated with a .symtab. This is a more or less common case which is used in a few tests I saw in LLVM. I decided not to create the SHT_SYMTAB_SHNDX section as "implicit", but to implement it like a kind of regular section for now, i.e. tools do not recreate this section or its content, like they do for symbol table sections, for example. That should allow us to write all kinds of possible broken test cases for our needs and keep the output closer to what was requested. Differential revision: https://reviews.llvm.org/D65446 llvm-svn: 368272	2019-08-08 09:49:05 +00:00
Sam Tebbs	7ca980edcd	[ARM] Select VFMA llvm-svn: 368264	2019-08-08 08:21:01 +00:00
Craig Topper	724c6053ac	[X86] Remove -x86-experimental-vector-widening-legalization command line option and all its uses. This option is now defaulted to true and we don't want to support turning it off so remove the option. llvm-svn: 368258	2019-08-08 06:48:22 +00:00
David Green	1becefd3f7	[ARM] Tighten up VLDRH.32 with low alignments VLDRH needs to have an alignment of at least 2, including the widening/narrowing versions. This tightens up the ISel patterns for it and alters allowsMisalignedMemoryAccesses so that unaligned accesses are expanded through the stack. It also fixed some incorrect shift amounts, which seemed to be passing a multiple not a shift. Differential Revision: https://reviews.llvm.org/D65580 llvm-svn: 368256	2019-08-08 06:22:03 +00:00
Craig Topper	0aacc7da8b	[X86] Add CMOV_FR32X and CMOV_FR64X to the isCMOVPseudo function. llvm-svn: 368250	2019-08-08 04:40:59 +00:00
Amy Huang	0b870b969f	Recommit "[MS] Emit S_HEAPALLOCSITE debug info in Selection DAG" with a fix to clear the SDNode map when SelectionDAG is cleared. llvm-svn: 368230	2019-08-07 22:49:40 +00:00
Johannes Doerfert	d1b79e0774	[Attributor][Stats] Locate statistics tracking with the attributes Summary: The ever growing switch required Attribute::AttrKind values but they might not be available for all abstract attributes we deduce. With the new method we track statistics at the abstract attribute level. The provided macros simplify the usage and make the messages uniform. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65732 llvm-svn: 368227	2019-08-07 22:46:11 +00:00
Johannes Doerfert	beb5150f47	[Attributor][NFC] Code simplification and style normalization llvm-svn: 368225	2019-08-07 22:36:15 +00:00
Johannes Doerfert	344d038960	[Attributor] Introduce a state wrapper class Summary: The wrapper reduces boilerplate code and also provides a nice way to determine the state type used by an abstract attribute statically via AAType::StateType. This was already discussed as part of the review of D65711. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65786 llvm-svn: 368224	2019-08-07 22:34:26 +00:00
Johannes Doerfert	d620781872	[Attributor][NFC] Avoid unnecessary liveness queries If we know everything is live there is no need to query for liveness. Indicating a pessimistic fixpoint will cause the state to be "invalid" which will cause the Attributor to not return the AAIsDead on request, which will prevent us from querying isAssumedDead(). llvm-svn: 368223	2019-08-07 22:32:38 +00:00
Johannes Doerfert	14a0493a88	[Attributor] Provide easier checkForallReturnedValues functionality Summary: So far, whenever one wants to look at returned values, one had to deal with the AAReturnedValues and potentially with the AAIsDead attribute. In the same spirit as other checkForAllXXX methods, we add this functionality now to the Attributor. By adopting the use sites we got better results when return instructions were dead. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65733 llvm-svn: 368222	2019-08-07 22:27:24 +00:00
Craig Topper	005b22855e	[LoopVectorize][X86] Clamp interleave factor if we have a known constant trip count that is less than VF*interleave If we know the trip count, we should make sure the interleave factor won't cause the vectorized loop to exceed it. Improves one of the cases from PR42674 Differential Revision: https://reviews.llvm.org/D65896 llvm-svn: 368215	2019-08-07 21:44:14 +00:00
David Blaikie	1b1f1d6677	DebugInfo/DWARF: Remove unused return type from DWARFUnit::extractDIEsIfNeeded llvm-svn: 368212	2019-08-07 21:31:33 +00:00
Craig Topper	7f7ef0208b	[X86] Allow pack instructions to be used for 512->256 truncates when -mprefer-vector-width=256 is causing 512-bit vectors to be split If we're splitting the 512-bit vector anyway and we have zero/sign bits, then we might as well use pack instructions to concat and truncate at once. Differential Revision: https://reviews.llvm.org/D65904 llvm-svn: 368210	2019-08-07 21:16:10 +00:00
Bob Haarman	885fa02da9	Revert r367501 "Create unique, but identically-named ELF sections..." This reverts commit `fbc563e2cb` "Create unique, but identically-named ELF sections for explicitly-sectioned functions and globals when using -function-sections and -data-sections." Reason for revert: sections are created with potentially wrong attributes. llvm-svn: 368204	2019-08-07 20:45:23 +00:00
David Blaikie	353938ec68	Fix indentation llvm-svn: 368198	2019-08-07 19:09:31 +00:00
Craig Topper	66c08430f6	[ValueTracking] When calculating known bits for integer abs, make sure we're looking at a negate and not just any instruction with the nsw flag set. The matchSelectPattern code can match patterns like (x >= 0) ? x : -x for absolute value. But it can also match ((x-y) >= 0) ? (x-y) : (y-x). If the latter form was matched we can only use the nsw flag if it's set on both subtracts. This match makes sure we're looking at the former case only. Differential Revision: https://reviews.llvm.org/D65692 llvm-svn: 368195	2019-08-07 18:28:16 +00:00
Stefan Stipanovic	aaa5270c53	[Attributor] Introduce checkForAllReadWriteInstructions(...). Summary: Similarly to D65731 `Attributor::checkForAllReadWriteInstructions` is introduced. Reviewers: jdoerfert, uenoku Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65825 llvm-svn: 368194	2019-08-07 18:26:02 +00:00
Nikolai Bozhenov	03edcd68dd	[SCEV] Return zero from computeConstantDifference(X, X) Without this patch computeConstantDifference returns None for cases like these: computeConstantDifference(%x, %x) computeConstantDifference({%x,+,16}, {%x,+,16}) Differential Revision: https://reviews.llvm.org/D65474 llvm-svn: 368193	2019-08-07 17:38:38 +00:00
Florian Hahn	d8c3c17394	[DataLayout] Check StackNatural and FunctionPtr alignments. MaybeAlignment asserts that the passed in value is == 0 or a power of 2. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=16272 Reviewers: michaelplatings, gchatelet, jakehehrlich, jfb Reviewed By: gchatelet Tags: #llvm Differential Revision: https://reviews.llvm.org/D65858 llvm-svn: 368191	2019-08-07 17:20:55 +00:00
David Blaikie	90146cd8b9	DebugInfo/DWARF: Normalize DWARFObject members on the DWARF spec section names Some of these names were abbreviated, some were not, some pluralised, some not. Made the API difficult to use - since it's an exact 1:1 mapping to the DWARF sections - use those names (changing underscore separation for camel casing). llvm-svn: 368189	2019-08-07 17:18:11 +00:00
Nico Weber	1919317929	Support: Remove needless allocation when getMainExecutable() calls readlink() We built a StringRef from a string literal which we then converted to a std::string to call c_str(). Just use a pointer to the string literal instead of a StringRef. No behavior change. Differential Revision: https://reviews.llvm.org/D65890 llvm-svn: 368187	2019-08-07 17:00:19 +00:00
Craig Topper	8b5f2ab2a4	Recommit r367901 "[X86] Enable -x86-experimental-vector-widening-legalization by default." The assert that caused this to be reverted should be fixed now. Original commit message: This patch changes our default legalization behavior for 16, 32, and 64 bit vectors with i8/i16/i32/i64 scalar types from promotion to widening. For example, v8i8 will now be widened to v16i8 instead of promoted to v8i16. This keeps the element widths the same and pads with undef elements. We believe this is a better legalization strategy. But it carries some issues due to the fragmented vector ISA. For example, i8 shifts and multiplies get widened and then later have to be promoted/split into vXi16 vectors. This has the potential to cause regressions so we wanted to get it in early in the 10.0 cycle so we have plenty of time to address them. Next steps will be to merge tests that explicitly test the command line option. And then we can remove the option and its associated code. llvm-svn: 368183	2019-08-07 16:24:26 +00:00
Oliver Cruickshank	4d4eefda6c	[ARM] Expand CTPOP intrinsic for MVE llvm-svn: 368180	2019-08-07 15:47:45 +00:00
Jay Foad	8e8b295835	[InstCombine] Propagate fast math flags through selects Summary: In SimplifySelectsFeedingBinaryOp, propagate fast math flags from the outer op into both arms of the new select, to take advantage of simplifications that require fast math flags. Reviewers: mcberg2017, majnemer, spatel, arsenm, xbolva00 Subscribers: wdng, javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65658 llvm-svn: 368175	2019-08-07 15:16:28 +00:00
Cameron McInally	303b6dbfb4	[EarlyCSE] Add support for unary FNeg to EarlyCSE Differential Revision: https://reviews.llvm.org/D65815 llvm-svn: 368171	2019-08-07 14:34:41 +00:00
Tim Northover	3c10f346dc	GlobalISel: factor common code from translateCall and translateInvoke. NFC. llvm-svn: 368166	2019-08-07 12:43:53 +00:00
Simon Pilgrim	d52bc482a5	[X86] EltsFromConsecutiveLoads - early out for non-byte sized memory (PR42909) Don't attempt to merge loads for types that aren't modulo 8-bits. llvm-svn: 368165	2019-08-07 12:41:59 +00:00
Sander de Smalen	1d2bfa4a86	[AArch64][WinCFI] Do not pair callee-save instructions in LoadStoreOptimizer Prevent the LoadStoreOptimizer from pairing any load/store instructions with instructions from the prologue/epilogue if the CFI information has encoded the operations as separate instructions. This would otherwise lead to a mismatch of the actual prologue size from the size as recorded in the Windows CFI. Reviewers: efriedma, mstorsjo, ssijaric Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D65817 llvm-svn: 368164	2019-08-07 12:41:38 +00:00
Simon Atanasyan	e5fa049efa	[mips] Make a couple of class methods plain static functions. NFC llvm-svn: 368162	2019-08-07 12:21:41 +00:00
Simon Atanasyan	8a7c0e7c0a	[mips] Use isMicroMips() function to check enabled feature flag. NFC llvm-svn: 368161	2019-08-07 12:21:32 +00:00
Simon Atanasyan	9f2e076f27	[Mips] Instruction `sc` now accepts symbol as an argument Function MipsAsmParser::expandMemInst() did not properly handle instruction `sc` with a symbol as an argument because first argument would be counted twice. We add additional checks and handle this case separately. Patch by Mirko Brkusanin. Differential Revision: https://reviews.llvm.org/D64252 llvm-svn: 368160	2019-08-07 12:21:26 +00:00
Benjamin Kramer	ea134f221f	[Support] Base SmartMutex on std::recursive_mutex - Remove support for non-recursive mutexes. This was unused. - The std::recursive_mutex is now created/destroyed unconditionally. Locking is still only done if threading is enabled. - Alias SmartScopedLock to std::lock_guard. This should make no semantic difference on the existing APIs. llvm-svn: 368158	2019-08-07 11:59:57 +00:00
Igor Kudrin	45ee93323b	Remove support for 32-bit offsets in utility classes (5/5) Differential Revision: https://reviews.llvm.org/D65641 llvm-svn: 368156	2019-08-07 11:44:47 +00:00
Simon Pilgrim	0eafe011ca	[TargetLowering] SimplifyDemandedBits - call SimplifyMultipleUseDemandedBits for ISD::VECTOR_SHUFFLE In particular this helps the SSE vector shift cvttps2dq+add+shl pattern by avoiding the need for zeros in shuffle style extensions to vXi32 types as we'll be shifting out those bits anyway llvm-svn: 368155	2019-08-07 11:43:13 +00:00
Benjamin Kramer	3d5360a439	Replace llvm::MutexGuard/UniqueLock with their standard equivalents All supported platforms have <mutex> now, so we don't need our own copies any longer. No functionality change intended. llvm-svn: 368149	2019-08-07 10:57:25 +00:00
Oliver Cruickshank	30dcae0956	[ARM] Generate MVE VHADDs/VHSUBs llvm-svn: 368146	2019-08-07 10:26:57 +00:00
Roman Lebedev	9bece444dd	[InstCombine] Recommit: Shift amount reassociation: shl-trunc-shl pattern This was initially committed in r368059 but got reverted in r368084 because there was faulty logic in how the shift amounts type mismatch was being handled (it simply wasn't). I've added an explicit bailout before we SimplifyAddInst() - I don't think it's designed in general to handle differently-typed values, even though the actual problem only comes from ConstantExpr's. I have also changed the common type deduction, to not just blindly look past zext, but try to do that so that in the end types match. Differential Revision: https://reviews.llvm.org/D65380 llvm-svn: 368141	2019-08-07 09:41:50 +00:00
Rui Ueyama	cac8df1ab9	Re-submit r367649: Improve raw_ostream so that you can "write" colors using operator<< The original patch broke buildbots, perhaps because it changed the default setting whether colors are enabled or not. llvm-svn: 368131	2019-08-07 08:08:17 +00:00
Sam Parker	173de03740	[ARM][LowOverheadLoops] Revert after read/write Currently we check whether LR is stored/loaded to/from in between the loop decrement and loop end pseudo instructions. There are two problems here: - It relies on all load/store instructions being labelled as such in tablegen. - Actually any use of loop decrement is troublesome because the value doesn't exist! So we need to check for any read/write of LR that occurs between the two instructions and revert if we find anything. Differential Revision: https://reviews.llvm.org/D65792 llvm-svn: 368130	2019-08-07 07:39:19 +00:00
Yevgeny Rouban	cb87f3734b	Force check prof branch_weights consistency in SwitchInstProfUpdateWrapper This patch turns on the prof branch_weights metadata consistency check in SwitchInstProfUpdateWrapper. If this patch causes a failure then please before reverting do report the IR that hits the assertion and try identifying the pass that introduces the inconsistency. We have to fix all such passes. See also the upcoming change https://reviews.llvm.org/D61179 in the Verifier. Reviewers: davidx, nikic, eraman, reames, chandlerc Reviewed By: davidx Differential Revision: https://reviews.llvm.org/D64061 llvm-svn: 368129	2019-08-07 07:17:45 +00:00
Craig Topper	f192cc587c	[X86] Allow any 8-bit immediate to be used with bt/btc/btr/bts memory aliases. We have aliases that disambiguate memory forms of bt/btc/btr/bts without suffixes to the 32-bit form. These aliases should have been updated when the instructions were updated in r356413. llvm-svn: 368127	2019-08-07 06:17:58 +00:00
Craig Topper	624980037d	[X86] Use isInt<8> to simplify some code. NFC llvm-svn: 368126	2019-08-07 06:17:55 +00:00
Kai Luo	02b8056cc1	[MachineCSE][NFC] Use 'profitable' rather than 'beneficial' to name method. llvm-svn: 368124	2019-08-07 05:40:21 +00:00
Craig Topper	29688f4da0	[X86] Limit vpermil2pd/vpermil2ps immediates to 4 bits in the assembly parser. The upper 4 bits of the immediate byte are used to encode a register. We need to limit the explicit immediate to fit in the remaining 4 bits. Fixes PR42899. llvm-svn: 368123	2019-08-07 05:34:27 +00:00
Alex Brachet	c22d9666fc	[yaml2obj] Move core yaml2obj code into lib and include for use in unit tests Reviewers: jhenderson, rupprecht, MaskRay, grimar, labath Reviewed By: rupprecht Subscribers: gribozavr, mgrang, seiya, mgorny, sbc100, hiraditya, aheejin, jakehehrlich, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65255 llvm-svn: 368119	2019-08-07 02:44:49 +00:00
Alex Lorenz	8d5c280316	TLI: darwin does not support _bcmp Not all Darwin targets support _bcmp in all circumstances. Differential Revision: https://reviews.llvm.org/D65834 llvm-svn: 368113	2019-08-07 00:03:37 +00:00
Mitch Phillips	924359dc0f	Revert "[X86] Add more extract subvector cost model tests for smaller element sizes and smaller than 128-bit vectors." This reverts commit `fc33e33776`. This commit depends on the rolled back commit rL367901, and thus needs to be rolled back. llvm-svn: 368109	2019-08-06 23:38:14 +00:00
Mitch Phillips	bd0d97e1c4	Revert "[X86] Enable -x86-experimental-vector-widening-legalization by default." This reverts commit `3de33245d2`. This commit broke the MSan buildbots. See https://reviews.llvm.org/rL367901 for more information. llvm-svn: 368107	2019-08-06 23:00:43 +00:00
Bill Wendling	73be7cf5aa	Use parentheses to silence warning. llvm-svn: 368105	2019-08-06 22:47:47 +00:00
Peter Collingbourne	0930643ff6	hwasan: Instrument globals. Globals are instrumented by adding a pointer tag to their symbol values and emitting metadata into a special section that allows the runtime to tag their memory when the library is loaded. Due to order of initialization issues explained in more detail in the comments, shadow initialization cannot happen during regular global initialization. Instead, the location of the global section is marked using an ELF note, and we require libc support for calling a function provided by the HWASAN runtime when libraries are loaded and unloaded. Based on ideas discussed with @evgeny777 in D56672. Differential Revision: https://reviews.llvm.org/D65770 llvm-svn: 368102	2019-08-06 22:07:29 +00:00
Guanzhong Chen	b3292a8469	[WebAssembly] Lower ASan constructor priority on Emscripten Summary: This change gives Emscripten the ability to use more than one constructor priority that runs before ASan. By convention, constructor priorities 0-100 are reserved for use by the system. ASan on Emscripten now uses priority 50, leaving plenty of room for use by Emscripten before and after ASan. This change is done in response to: https://github.com/emscripten-core/emscripten/pull/9076#discussion_r310323723 Reviewers: kripken, tlively, aheejin Reviewed By: tlively Subscribers: cfe-commits, dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm, #clang Differential Revision: https://reviews.llvm.org/D65684 llvm-svn: 368101	2019-08-06 21:52:58 +00:00
Peter Collingbourne	411d96f99a	IR: Disable verifier check for GlobalValues with private linkage named after a comdat for non-COFF. This check is only meaningful for COFF and it is perfectly valid to create such a GlobalValue in ELF. Differential Revision: https://reviews.llvm.org/D65686 llvm-svn: 368094	2019-08-06 21:47:18 +00:00
Craig Topper	ecc1e5d476	[X86] Don't allow combineSIntToFP to create v2i32 vectors after type legalization. If we're after type legalization we should only be trying to turn v2i64 into v2i32. So bitcast to v4i32, shuffle the even elements together. Then use X86ISD::CVTSI2P. The alternative is to leave the v2i64 type alone and let it be scalarized. Hopefully keeping it packed is better. Fixes PR42905. llvm-svn: 368091	2019-08-06 21:43:15 +00:00
Reid Kleckner	e4bd38478b	Revert [InstCombine] Shift amount reassociation: shl-trunc-shl pattern This reverts r368059 (git commit `0f95710976`) This caused Clang to assert while self-hosting and compiling SystemZInstrInfo.cpp. Reduction is running. llvm-svn: 368084	2019-08-06 20:32:07 +00:00
Craig Topper	fc33e33776	[X86] Add more extract subvector cost model tests for smaller element sizes and smaller than 128-bit vectors. With the switch to widening legalization, we need to do a better job of costing extractions of less than 128-bits. llvm-svn: 368081	2019-08-06 20:12:41 +00:00
Kristina Brooks	26e60f0653	[Attributor][modulemap] Revert r368064 but fix the build Commit r368064 was necessary after r367953 (D65712) broke the module build. That happened, apparently, because the template class IRAttribute defined in the header had a virtual method defined in the corresponding source file (IRAttribute::manifest). To unbreak the situation this patch introduces a helper function IRAttributeManifest::manifestAttrs which is used to implement IRAttribute::manifest in the header. The definition of the helper function is still in the source file. Patch by jdoerfert (Johannes Doerfert) Differential Revision: https://reviews.llvm.org/D65821 llvm-svn: 368076	2019-08-06 19:53:19 +00:00
Aditya Nandakumar	6bbfde5c48	[GISel]: Fix trivial build breakage llvm-svn: 368067	2019-08-06 17:53:04 +00:00
Aditya Nandakumar	c8ac029d0a	[GISel]: Add GISelKnownBits analysis https://reviews.llvm.org/D65698 This adds a KnownBits analysis pass for GISel. This was done as a pass (compared to static functions) so that we can add other features such as caching queries(within a pass and across passes) in the future. This patch only adds the basic pass boiler plate, and implements a lazy non caching knownbits implementation (ported from SelectionDAG). I've also hooked up the AArch64PreLegalizerCombiner pass to use this - there should be no compile time regression as the analysis is lazy. llvm-svn: 368065	2019-08-06 17:18:29 +00:00
Daniel Sanders	d9934d4939	[globalisel] Allow SrcOp to convert an APInt and render it as an immediate operand (MO.isImm() == true) Summary: This is tested by D61289 but has been pulled into a separate patch at a reviewer's request. Reviewers: bogner, aditya_nandakumar, volkan, aemerson, paquette, arsenm, rovka Reviewed By: arsenm Subscribers: javed.absar, hiraditya, wdng, kristof.beyls, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61321 llvm-svn: 368063	2019-08-06 17:16:27 +00:00
Roman Lebedev	213817327f	[X86] Move CPU features for Barcelona/K10 out of line Summary: Cleans X86.td's Barcelona entry to be more like the others, by moving the features out of the `Proc<>`, thus potentially making it possible to inherit from them. Split off from D63628 Reviewers: craig.topper, RKSimon Reviewed By: craig.topper Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65791 llvm-svn: 368061	2019-08-06 17:04:02 +00:00
Roman Lebedev	0f95710976	[InstCombine] Shift amount reassociation: shl-trunc-shl pattern Summary: Currently `reassociateShiftAmtsOfTwoSameDirectionShifts()` only handles two shifts one after another. If the shifts are `shl`, we still can easily perform the fold, with no extra legality checks: https://rise4fun.com/Alive/OQbM If we have right-shift however, we won't be able to make it any simpler than it already is. After this the only thing missing here is constant-folding: (`NewShAmt >= bitwidth(X)`) * If it's a logical shift, then constant-fold to `0` (not `undef`) * If it's a `ashr`, then a splat of original signbit https://rise4fun.com/Alive/E1K https://rise4fun.com/Alive/i0V Reviewers: spatel, nikic, xbolva00 Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65380 llvm-svn: 368059	2019-08-06 17:03:40 +00:00
Diego Caballero	8bac17709e	Re-land D65760/r367944 Fixed most vexing parse ambiguation. llvm-svn: 368055	2019-08-06 16:24:17 +00:00
Jonas Devlieghere	cb6f2646fd	[Path] Fix bug in make_absolute logic This fixes a bug for making path with a //net style root absolute. I discovered the bug while writing a test case for the VFS, which uses these paths because they're both legal absolute paths on Windows and Unix. Differential revision: https://reviews.llvm.org/D65675 llvm-svn: 368053	2019-08-06 15:46:45 +00:00
Sander de Smalen	ad7e95df5a	[AArch64] NFC: Generalize emitFrameOffset to support more than byte offsets. Refactor emitFrameOffset to accept a StackOffset struct as its offset argument. This method currently only supports byte offsets (MVT::i8) but will be extended in a later patch to support scalable offsets (MVT::nxv1i8) as well. Reviewers: thegameg, rovka, t.p.northover, efriedma, greened Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D61436 llvm-svn: 368049	2019-08-06 15:06:31 +00:00
Hubert Tong	fc34a536d0	[XCOFF][MC] report_fatal_error before dereferencing NULL This patch replaces a TODO comment with a call to `report_fatal_error`. The path that reaches the added call to `report_fatal_error` manifestly dereferences a null pointer. llvm-svn: 368048	2019-08-06 15:05:20 +00:00
Simon Pilgrim	dae5ddad9d	[TargetLowering] SimplifyMultipleUseDemandedBits - return UNDEF for undemanded ops If we demand no bits/elts from an Op, just return UNDEF llvm-svn: 368043	2019-08-06 14:30:42 +00:00
Tim Renouf	5a0794327a	[StructurizeCFG] Enable -structurizecfg-relaxed-uniform-regions by default D62198 introduced an option to relax the checks for hasOnlyUniformBranches. This commit turns the option on by default, for better code generation in some cases in AMDGPU. Differential Revision: https://reviews.llvm.org/D63198 Change-Id: I9cbff002a1e74d3b7eb96b4192dc8129936d537d llvm-svn: 368042	2019-08-06 14:30:19 +00:00
Dmitri Gribenko	fc21bb661f	Revert "[yaml2obj] Move core yaml2obj code into lib and include for use in unit tests" This reverts commit r368021, it broke tests. llvm-svn: 368035	2019-08-06 13:39:50 +00:00
Tim Northover	b5abc425d2	AArch64: bail instead of asserting on unexpected type in G_CONSTANT 0. llvm-svn: 368031	2019-08-06 13:34:08 +00:00
Simon Pilgrim	cf62047d29	[X86][SSE] Call SimplifyMultipleUseDemandedBits on PACKSS/PACKUS arguments. This mainly helps to replace unused arguments with UNDEF in the case where they have multiple users. llvm-svn: 368026	2019-08-06 13:10:42 +00:00
Sander de Smalen	612b038966	[AArch64] NFC: Add generic StackOffset to describe scalable offsets. To support spilling/filling of scalable vectors we need a more generic representation of a stack offset than simply 'int'. For this we introduce the StackOffset struct, which comprises multiple offsets sized by their respective MVTs. Byte-offsets will thus be a simple tuple such as { offset, MVT::i8 }. Adding two byte-offsets will result in a byte offset { offsetA + offsetB, MVT::i8 }. When two offsets have different types, we can canonicalise them to use the same MVT, as long as their runtime sizes are guaranteed to have the same size-ratio as they would have at compile-time. When we have both scalable- and fixed-size objects on the stack, we can create an offset that is: ({ offset_fixed, MVT::i8 } + { offset_scalable, MVT::nxv1i8 }) The struct also contains a getForFrameOffset() method that is specific to AArch64 and decomposes the frame-offset to be used directly in instructions that operate on the stack or index into the stack. Note: This patch adds StackOffset as an AArch64-only concept, but we would like to make this a generic concept/struct that is supported by all interfaces that take or return stack offsets (currently as 'int'). Since that would be a bigger change that is currently pending on D32530 landing, we thought it makes sense to first show/prove the concept in the AArch64 target before proposing to roll this out further. Reviewers: thegameg, rovka, t.p.northover, efriedma, greened Reviewed By: rovka, greened Differential Revision: https://reviews.llvm.org/D61435 llvm-svn: 368024	2019-08-06 13:06:40 +00:00
Simon Pilgrim	01d267dc4f	[X86] SimplifyMultipleUseDemandedBits - target shuffles might not be identity If we don't demand any non-undef shuffle elements then the assert will fail as all shuffle inputs would still be flagged as 'identity' safe. Exposed by an incoming patch. llvm-svn: 368022	2019-08-06 12:41:29 +00:00
Alex Brachet	3cfeaa4d2c	[yaml2obj] Move core yaml2obj code into lib and include for use in unit tests Reviewers: jhenderson, rupprecht, MaskRay, grimar, labath Reviewed By: rupprecht Subscribers: seiya, mgorny, sbc100, hiraditya, aheejin, jakehehrlich, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65255 llvm-svn: 368021	2019-08-06 12:15:18 +00:00
Igor Kudrin	2836cf0b72	Try to unbreak buildbots after r368014 llvm-svn: 368018	2019-08-06 11:12:13 +00:00
Simon Pilgrim	c6735aecfa	[X86][SSE] Enable min/max partial reduction As mentioned on D65047 / rL366933 the plan is to enable partial reduction handling wherever possible. llvm-svn: 368016	2019-08-06 11:00:34 +00:00
Igor Kudrin	f26a70a5e7	Switch LLVM to use 64-bit offsets (2/5) This updates all libraries and tools in LLVM Core to use 64-bit offsets which directly or indirectly come to DataExtractor. Differential Revision: https://reviews.llvm.org/D65638 llvm-svn: 368014	2019-08-06 10:49:40 +00:00
Igor Kudrin	f5f35c5cd1	Support 64-bit offsets in utility classes (1/5) Using 64-bit offsets is required to fully implement 64-bit DWARF. As these classes are used in many different libraries they should temporarily support both 32- and 64-bit offsets. Differential Revision: https://reviews.llvm.org/D64006 llvm-svn: 368013	2019-08-06 10:47:20 +00:00
Ulrich Weigand	7b24dd741c	[Strict FP] Allow custom operation actions This patch changes the DAG legalizer to respect the operation actions set by the target for strict floating-point operations. (Currently, the legalizer will usually fall back to mutate to the non-strict action (which is assumed to be legal), and only skip mutation if the strict operation is marked legal.) With this patch, whenever a strict operation is marked as Legal or Custom, it is passed to the target as usual. Only if it is marked as Expand will the legalizer attempt to mutate to the non-strict operation. Note that this will now fail if the non-strict operation is itself marked as Custom -- the target will have to provide a Custom definition for the strict operation then as well. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D65226 llvm-svn: 368012	2019-08-06 10:43:13 +00:00
Fangrui Song	cb4327d7db	Change two unnecessary uses of llvm::size(C) to C.size() llvm-svn: 368011	2019-08-06 10:24:36 +00:00
Cullen Rhodes	ced419f4d7	[SelectionDAG] Extend base addressing modes supported by MGATHER/MSCATTER Summary: Before this patch MGATHER/MSCATTER is capable of representing all common addressing modes, but only when illegal types are used. This patch adds an IndexType property so more representations are available when using legal types only. Original modes: vector of bases base + vector of signed scaled offsets New modes: base + vector of signed unscaled offsets base + vector of unsigned scaled offsets base + vector of unsigned unscaled offsets The current behaviour of addressing modes for gather/scatter remains unchanged. Patch by Paul Walker. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D65636 llvm-svn: 368008	2019-08-06 09:46:13 +00:00
Tim Northover	de98e92bc2	AArch64: use xzr/wzr for constant 0 in GlobalISel. COPYs from xzr and wzr can often be folded away entirely during register allocation, unlike a movz. llvm-svn: 368003	2019-08-06 09:18:41 +00:00
Guillaume Chatelet	a7b6a7c851	[LLVM][Alignment] Introduce Alignment In Attributes Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: jfb Subscribers: hiraditya, dexonsmith, llvm-commits, courbet Tags: #llvm Differential Revision: https://reviews.llvm.org/D65742 llvm-svn: 368002	2019-08-06 09:16:33 +00:00
Guillaume Chatelet	396521378f	[LLVM][Alignment] Introduce Alignment In GlobalObject Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: jfb Subscribers: hiraditya, dexonsmith, llvm-commits, courbet Tags: #llvm Differential Revision: https://reviews.llvm.org/D65748 Address comments llvm-svn: 368000	2019-08-06 09:03:21 +00:00
Bill Wendling	ebc2cf9c27	Use "isa" since the variable isn't used. llvm-svn: 367985	2019-08-06 07:27:26 +00:00
Hideki Saito	ec818d7fb3	[LV][NFC] Share the LV illegality reporting with LoopVectorize. Reviewers: hsaito, fhahn, rengolin Reviewed By: rengolin Patch by psamolysov, thanks! Differential Revision: https://reviews.llvm.org/D62997 llvm-svn: 367980	2019-08-06 06:08:48 +00:00
Matt Arsenault	f4d3113a5f	CodeGen: Migration to using Register llvm-svn: 367974	2019-08-06 03:59:31 +00:00
Austin Kerbow	a05c384132	Re-commit: [AMDGPU] Use S_DENORM_MODE for gfx10 Summary: During fdiv32 lowering use S_DENORM_MODE to select denorm mode in gfx10. Reviewers: arsenm, rampitec Reviewed By: arsenm, rampitec Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65620 llvm-svn: 367969	2019-08-06 02:16:11 +00:00
Johannes Doerfert	21fe0a314e	[Attributor][NFC] Outline common pattern into helper method This helper will also allow us to place logic to determine if an abstract attribute is necessary in the first place. llvm-svn: 367966	2019-08-06 00:55:11 +00:00
Johannes Doerfert	d0f6400978	[Attributor] Provide a generic interface to check live instructions Summary: Similar to `Attributor::checkForAllCallSites`, we now provide such functionality for instructions of a certain opcode through `Attributor::checkForAllInstructions` and the convenient wrapper `Attributor::checkForAllCallLikeInstructions`. This cleans up code, avoids duplication, and simplifies the usage of liveness information. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65731 llvm-svn: 367961	2019-08-06 00:32:43 +00:00
Shiva Chen	b12056bd33	[RISCV] Custom legalize i32 operations for RV64 to reduce signed extensions Differential Revision: https://reviews.llvm.org/D65434 llvm-svn: 367960	2019-08-06 00:24:00 +00:00
Peter Collingbourne	f0380bac5f	Silence ubsan after r367926. Fixes e.g. http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap-ubsan/builds/14273 We can't left shift here because left shifting of a negative number is UB. The same doesn't apply to unsigned arithmetic, but switching to unsigned doesn't appear to stop ubsan from complaining, so we need to mask out the high bits. llvm-svn: 367959	2019-08-06 00:21:30 +00:00
Puyan Lotfi	37fe40c330	Reverting D65760/r367944 due to buildbot failure. http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/15952/steps/build/logs/stdio JITTargetMachineBuilder.cpp fails to build. llvm-svn: 367954	2019-08-05 23:47:07 +00:00
Johannes Doerfert	eccdf08577	[Attributor] Introduce the IRAttribute helper struct Summary: Certain properties, e.g., an AttrKind, are not shared among all abstract attributes. This patch extracts the functionality into a helper struct. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65712 llvm-svn: 367953	2019-08-05 23:35:12 +00:00
Johannes Doerfert	fb69f7688a	[Attributor] Make abstract attributes stateless To remove boilerplate, mostly passing through values to the AbstractAttribute base class, we extract the state into an IRPosition helper. There is no function change intended but the IRPosition struct will provide more functionality down the line. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65711 llvm-svn: 367952	2019-08-05 23:32:31 +00:00
Johannes Doerfert	2402062557	[Attributor] Use proper ID for attribute lookup Summary: The new scheme is similar to the pass manager and dyn_cast scheme where we identify classes by the address of a static member. This is better than the old scheme in which we had to "invent" new Attributor enums if there was no corresponding one. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65710 llvm-svn: 367951	2019-08-05 23:30:01 +00:00
Johannes Doerfert	007153e9d4	[Attributor][NFCI] Avoid duplication of the InformationCache reference Summary: Instead of storing the reference to the InformationCache we now pass it whenever it might be needed. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65709 llvm-svn: 367950	2019-08-05 23:26:06 +00:00
Johannes Doerfert	e83f303938	[Attributor] Deduce the "no-return" attribute for functions A function is "no-return" if we never reach a return instruction, either because there are none or the ones that exist are dead. Tests have been adjusted: - either noreturn was added, or - noreturn was avoided by modifying the code. The new noreturn_{sync,async} tests make sure we do handle invoke instructions with a noreturn (and potentially nounwind) callee correctly, even in the presence of potential asynchronous exceptions. llvm-svn: 367948	2019-08-05 23:22:05 +00:00
Amara Emerson	bc1172df14	[GlobalISel][CallLowering] Rename isArgumentHandler() -> isIncomingArgumentHandler() Previous name and comment incorrectly implied it was just for formal arg handlers, which is not true. llvm-svn: 367945	2019-08-05 23:05:28 +00:00
Diego Caballero	1647758882	[ORC] Add CPU name and sub-target features to detectHost This commit adds host CPU name and sub-target features to the `JITTargetMachineBuilder` created by `JITTargetMachineBuilder::detectHost()`. Differential Revision: https://reviews.llvm.org/D65760 llvm-svn: 367944	2019-08-05 23:02:12 +00:00
Keno Fischer	5c3cdef84b	[WebAssembly] Fix conflict between ret legalization and sjlj Summary: When the WebAssembly backend encounters a return type that doesn't fit within i32, SelectionDAG performs sret demotion, adding an additional argument to the start of the function that contains a pointer to an sret buffer to use instead. However, this conflicts with the emscripten sjlj lowering pass. There we translate calls like: ``` call {i32, i32} @foo() ``` into (in pseudo-llvm) ``` %addr = @foo call {i32, i32} @__invoke_{i32,i32}(%addr) ``` i.e. we perform an indirect call through an extra function. However, the sret transform now transforms this into the equivalent of ``` %addr = @foo %sret = alloca {i32, i32} call {i32, i32} @__invoke_{i32,i32}(%sret, %addr) ``` (while simultaneously translating the implementation of @foo as well). Unfortunately, this doesn't work out. The __invoke_ ABI expected the function address to be the first argument, causing crashes. There are several possible ways to fix this: 1. Implementing the sret rewrite at the IR level as well and performing it as part of lowering to __invoke 2. Fixing the wasm backend to recognize that __invoke has a special ABI 3. A change to the binaryen/emscripten ABI to recognize this situation This revision implements the middle option, teaching the backend to treat __invoke_ functions specially in sret lowering. This is achieved by 1) Introducing a new CallingConv ID for invoke functions 2) When this CallingConv ID is seen in the backend and the first argument is marked as sret (a function pointer would never be marked as sret), swapping the first two arguments. Reviewed By: tlively, aheejin Differential Revision: https://reviews.llvm.org/D65463 llvm-svn: 367935	2019-08-05 21:36:09 +00:00
Johannes Doerfert	3d7bbc6f9c	[Attributor][Fix] Do not remove instructions during manifestation When we remove instructions cached references could still be live. This patch avoids removing invoke instructions that are replaced by calls and instead keeps them around but in a dead block. llvm-svn: 367933	2019-08-05 21:35:02 +00:00
Daniel Sanders	eac86ec25f	Revert Register/MCRegister: Add conversion operators to avoid use of implicit convert to unsigned. NFC MSVC finds ambiguity where clang doesn't and it looks like it's not going to be an easy fix Reverting while I figure out how to fix it This reverts r367916 (git commit `aa15ec3c23`) This reverts r367920 (git commit `5d14efe279`) llvm-svn: 367932	2019-08-05 21:34:45 +00:00
Johannes Doerfert	924d2138fc	[Attributor][Fix] Keep invokes if handlers catch asynchronous exceptions Similar to other places where we transform invokes to calls we need to be careful if the handler (=personality) can catch asynchronous exceptions as they are not modeled as part of nounwind. This is tested with D59978. llvm-svn: 367931	2019-08-05 21:34:45 +00:00
Eric Christopher	1d73e228db	BMI2 support is indicated in bit eight of EBX, not nine. See Intel SDM, Vol 2A, Table 3-8: https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-2a-manual.pdf#page=296 Differential Revision: https://reviews.llvm.org/D65766 llvm-svn: 367929	2019-08-05 21:25:59 +00:00
Peter Collingbourne	a56d81f4fb	llvm-symbolizer: Untag addresses in object files by default. Any addresses that we pass to llvm-symbolizer are going to be untagged, while any HWASAN instrumented globals are going to be tagged in the symbol table. Therefore we need to untag the addresses before using them. Differential Revision: https://reviews.llvm.org/D65769 llvm-svn: 367926	2019-08-05 20:59:25 +00:00
Lang Hames	1707735fa4	[ORC] Work around broken GCC/libstdc++ by adding an explicit conversion. This should fix the bots that have been failing due to r367712. llvm-svn: 367921	2019-08-05 20:30:35 +00:00
Daniel Sanders	5d14efe279	Fix MSVC error after r367916 It seems that MSVC sees ambiguity between the operator==()'s where clang doesn't llvm-svn: 367920	2019-08-05 20:03:43 +00:00
Amara Emerson	85e5e28ab4	[AArch64][GlobalISel] Inline tiny memcpy et al at -O0. FastISel already does this since the initial arm64 port was upstreamed, so it seems there are no issues with doing this at -O0 for very small memcpys. Gives a 0.2% geomean code size improvement on CTMark. Differential Revision: https://reviews.llvm.org/D65758 llvm-svn: 367919	2019-08-05 20:02:52 +00:00
Dmitri Gribenko	37aa8ad663	Revert "[AMDGPU] Use S_DENORM_MODE for gfx10" This reverts commit r367882. It broke the test MC/Disassembler/AMDGPU/gfx10_dasm_all.txt. llvm-svn: 367904	2019-08-05 18:36:43 +00:00
Craig Topper	3de33245d2	[X86] Enable -x86-experimental-vector-widening-legalization by default. This patch changes our default legalization behavior for 16, 32, and 64 bit vectors with i8/i16/i32/i64 scalar types from promotion to widening. For example, v8i8 will now be widened to v16i8 instead of promoted to v8i16. This keeps the element widths the same and pads with undef elements. We believe this is a better legalization strategy. But it carries some issues due to the fragmented vector ISA. For example, i8 shifts and multiplies get widened and then later have to be promoted/split into vXi16 vectors. This has the potential to cause regressions so we wanted to get it in early in the 10.0 cycle so we have plenty of time to address them. Next steps will be to merge tests that explicitly test the command line option. And then we can remove the option and its associated code. llvm-svn: 367901	2019-08-05 18:25:36 +00:00
Evandro Menezes	a005c1ac4f	[AArch64] Expand bcmp() for small block lengths Patch D56593 by @courbet results in calls to `bcmp()` in some cases, should the target support it, unless `TTI::MemCmpExpansionOptions()` is overridden by the target. In a proprietary benchmark we see a performance drop of about 12% on PNG compression before this patch, though it passes all tests. This patch mirrors X86 for AArch64 and initializes `TTI::MemCmpExpansionOptions()` to then expand calls to `bcmp()` when appropriate. No tuning of the parameters was performed, but, at this point, it's enough to recover the performance drop above. This problem also exists on ARM. Once a consensus is reached for AArch64, we can work to fix ARM as well. Authors: - Evandro Menezes (@evandro) <e.menezes@samsung.com> - Brian Rzycki (@brzycki) <b.rzycki@samsung.com> Differential revision: https://reviews.llvm.org/D64805 llvm-svn: 367898	2019-08-05 18:09:14 +00:00
Pablo Barrio	a8426b43f8	[AArch64] Set preferred function alignment to 16 bytes on Neoverse N1 Summary: The Arm Neoverse N1 Software Optimization Guide [1], Section "4.8 Branch instruction alignment" states: "Consider aligning subroutine entry points and branch targets to 32B boundaries, within the bounds of the code-density requirements of the program." This patch sets the preferred function alignment on Neoverse N1 to 2^4=16B. This was already the case in some of the latest Cortex-A CPUs. Benchmarking in previous Cortex-A CPUs suggested that 16B alignment is already better than the default. See commit d04ee305. The reason we don't set it to 32B right now (as the optimisation guide suggests) is that this will impact code size and perhaps the instruction cache performance. Therefore we need benchmark numbers first. I have also added testing for A75 and A76 that we were missing. [1] https://developer.arm.com/docs/swog309707/latest Reviewers: fhahn, greened, samparker, dmgreen Reviewed By: dmgreen Subscribers: dmgreen, javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65654 llvm-svn: 367894	2019-08-05 17:38:58 +00:00
Sanjay Patel	5dbb90bfe1	[InstCombine] combine mul+shl separated by zext This appears to slightly help patterns similar to what's shown in PR42874: https://bugs.llvm.org/show_bug.cgi?id=42874 ...but not in the way requested. That fix will require some later IR and/or backend pass to decompose multiply/shifts into something more optimal per target. Those transforms already exist in some basic forms, but probably need enhancing to catch more cases. https://rise4fun.com/Alive/Qzv2 llvm-svn: 367891	2019-08-05 16:59:58 +00:00
Austin Kerbow	8d229dbb47	[AMDGPU] Use S_DENORM_MODE for gfx10 Summary: During fdiv32 lowering use S_DENORM_MODE to select denorm mode in gfx10. Reviewers: arsenm, rampitec Reviewed By: arsenm, rampitec Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65620 llvm-svn: 367882	2019-08-05 16:09:49 +00:00
Tom Stellard	e15d95a987	AMDGPU/LoadStoreOptimizer: Set the correct offset when merging MMOs Summary: This is a follow up to r367237. MachineFunction::getMachineMemOperand() adds the offset parameter to the existing offset instead of resetting it. So we need to reset the offset to the correct value after calling this function. Reviewers: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65557 llvm-svn: 367881	2019-08-05 16:08:44 +00:00
Sanjay Patel	1a29823b9c	[InstCombine] add extra use constraint for shl-zext fold As the test shows, we can end up with more instructions than we started with if we don't include the extra-use check. llvm-svn: 367880	2019-08-05 16:04:07 +00:00
Matt Arsenault	3922392969	AMDGPU: Correct behavior of f16 buffer loads Don't assume format loads for f16. Also fixes support for targets without i16. llvm-svn: 367879	2019-08-05 15:59:07 +00:00
Matt Arsenault	0e0a1c80fb	AMDGPU: Correct behavior of f16/i16 non-format store intrinsics This was switching to use a format store for a non-format store for f16 types. Also fixes i16/f16 stores on targets without legal f16. The corresponding loads also need to be fixed. llvm-svn: 367872	2019-08-05 14:57:59 +00:00
Matt Arsenault	ff6b007772	AMDGPU/GlobalISel: Alternative mappings for constants Without context we assume SGPR. Allowing VGPR constants theoretically helps avoid a copy. This seems to not actually work now, and the choice isn't based on the use bank. llvm-svn: 367871	2019-08-05 14:40:26 +00:00
Matt Arsenault	4e21730300	AMDGPU/GlobalISel: Don't reject shader types I'm not sure what complications these present, but the current argument lowering is pretty much directly copied from the DAG lowering, so I assume these work as they should. No tests because I'm lazy and things are getting pretty close to the point where the existing calling-conventions.ll can be shared with SelectionDAG. llvm-svn: 367870	2019-08-05 14:40:23 +00:00
Nilanjana Basu	da60fc813c	Changing representation of .cv_def_range directives in Codeview debug info assembly format for better readability llvm-svn: 367867	2019-08-05 14:16:58 +00:00
Nilanjana Basu	b5e4d7de17	Revert "Changing representation of .cv_def_range directives in Codeview debug info assembly format for better readability" This reverts commit `a885afa9fa`. llvm-svn: 367861	2019-08-05 13:55:21 +00:00
Cullen Rhodes	2a48176373	[AArch64] Implement initial SVE calling convention support Summary: This patch adds initial support for the SVE calling convention such that SVE types can be passed as arguments and return values to/from a subroutine. The SVE AAPCS states [1]: z0-z7 are used to pass scalable vector arguments to a subroutine, and to return scalable vector results from a function. If a subroutine takes arguments in scalable vector or predicate registers, or if it is a function that returns results in such registers, it must ensure that the entire contents of z8-z23 are preserved across the call. In other cases it need only preserve the low 64 bits of z8-z15, as described in §5.1.2. p0-p3 are used to pass scalable predicate arguments to a subroutine and to return scalable predicate results from a function. If a subroutine takes arguments in scalable vector or predicate registers, or if it is a function that returns results in these registers, it must ensure that p4-p15 are preserved across the call. In other cases it need not preserve any scalable predicate register contents. SVE predicate and data registers are passed indirectly (i.e. spilled to the stack and pass the address) if they exceed the registers used for argument passing defined by the PCS referenced above. Until SVE stack support is merged we can't spill SVE registers to the stack, so currently an llvm_unreachable is used where we will eventually handle this. [1] https://static.docs.arm.com/100986/0000/100986_0000.pdf Reviewed By: ostannard Differential Revision: https://reviews.llvm.org/D65448 llvm-svn: 367859	2019-08-05 13:44:10 +00:00
Nilanjana Basu	a885afa9fa	Changing representation of .cv_def_range directives in Codeview debug info assembly format for better readability llvm-svn: 367850	2019-08-05 13:11:51 +00:00
Sanjay Patel	eaf13044bd	[DAGCombiner][x86] prevent infinite loop from truncate/extend transforms The test case is based on the example from the post-commit thread for: https://reviews.llvm.org/rGc9171bd0a955 This replaces the x86-specific simple-type check from: rL367766 with a check in the DAGCombiner. Adding the check isn't strictly necessary after the fix from: rL367768 ...but it seems likely that we're heading for trouble if we are creating weird types in this transform. I combined the earlier legality check into the initial clause to simplify the code. So we should only try the trunc/sext transform at the earliest combine stage, but we limit the transform to simple types anyway because the TLI hook is probably too lax about what it considers a free truncate. llvm-svn: 367834	2019-08-05 11:27:07 +00:00
Graham Hunter	208d63ea90	[MVT][SVE] Map between scalable vector IR Type and VTs Adds a two way mapping between the scalable vector IR type and corresponding SelectionDAG ValueTypes. Reviewers: craig.topper, jeroen.dobbelaere, fhahn, rengolin, greened, rovka Reviewed By: greened Differential Revision: https://reviews.llvm.org/D47770 llvm-svn: 367832	2019-08-05 11:18:19 +00:00
Florian Hahn	e3ea97b049	[AArch64] Skip isZIPMask check for masks with an odd number of elements. We process 2 elements at a time and expect the number of elements to be even. Similar to D60690. Reviewers: dmgreen, samparker, t.p.northover Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D65400 llvm-svn: 367831	2019-08-05 11:12:23 +00:00
Guillaume Chatelet	c97a3d15d2	[LLVM][Alignment] Introduce Alignment Type Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, jfb, jakehehrlich Reviewed By: jfb Subscribers: wuzish, jholewinski, arsenm, dschuff, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65514 llvm-svn: 367828	2019-08-05 11:02:05 +00:00
David Bolvansky	ef72cded32	[TLI][NFC] Fixed typo llvm-svn: 367827	2019-08-05 10:14:09 +00:00
Guillaume Chatelet	6c5fb61f8b	[LLVM][Alignment] Introduce Alignment In CallingConv Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Subscribers: hiraditya, llvm-commits, courbet, jfb Tags: #llvm Differential Revision: https://reviews.llvm.org/D65659 llvm-svn: 367822	2019-08-05 09:49:09 +00:00
Nicolai Haehnle	e204786b6c	AMDGPU: add missing llvm.amdgcn.{raw,struct}.buffer.atomic.{inc,dec} Summary: Wrapping increment/decrement. These aren't exposed by many APIs... Change-Id: I1df25c7889de5a5ba76468ad8e8a2597efa9af6c Reviewers: arsenm, tpr, dstuttard Subscribers: kzhuravl, jvesely, wdng, yaxunl, t-tye, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65283 llvm-svn: 367821	2019-08-05 09:36:06 +00:00
Oliver Stannard	8ed8353fc4	Reland: Fix and test inter-procedural register allocation for ARM Add an explicit construction of the ArrayRef, gcc 5 and earlier don't seem to select the ArrayRef constructor which takes a C array when the construction is implicit. Original commit message: - Avoid a crash when IPRA calls ARMFrameLowering::determineCalleeSaves with a null RegScavenger. Simply not updating the register scavenger is fine because IPRA only cares about the SavedRegs vector, the actual code of the function has already been generated at this point. - Add a new hook to TargetRegisterInfo to get the set of registers which can be clobbered inside a call, even if the compiler can see both sides, by linker-generated code. Differential revision: https://reviews.llvm.org/D64908 llvm-svn: 367819	2019-08-05 09:04:10 +00:00
Guillaume Chatelet	65e4b47aad	[LLVM][Alignment] Introduce Alignment Type in DataLayout Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, jfb, jakehehrlich Subscribers: hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65521 Make getFunctionPtrAlign() return MaybeAlign llvm-svn: 367817	2019-08-05 09:00:43 +00:00
Michael Pozulp	3046ef5c11	Revert "[llvm-objdump] Re-commit r367284." This reverts r367776 (git commit `d34099926e`). My changes to llvm-objdump tests caused them to fail on windows: http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/27368 llvm-svn: 367816	2019-08-05 08:52:28 +00:00
Fangrui Song	db26488bf9	[DWARF] Change DWARFDebugLoc::Entry::Loc from SmallVector<char, 4> to SmallString<4> SmallString has a conversion to StringRef, which can be leveraged to simplify two use sites. llvm-svn: 367801	2019-08-05 06:33:52 +00:00
Fangrui Song	d9b948b6eb	Rename F_{None,Text,Append} to OF_{None,Text,Append}. NFC F_{None,Text,Append} are kept for compatibility since r334221. llvm-svn: 367800	2019-08-05 05:43:48 +00:00
Craig Topper	635f5ff580	[X86] Fix a bad early out in combineExtInVec that prevented recursive shuffle combining from running with -x86-experimental-vector-widening-legalization. llvm-svn: 367798	2019-08-05 03:48:31 +00:00
Johannes Doerfert	305b961f64	[Attributor][NFC] Create some attributes earlier llvm-svn: 367793	2019-08-04 18:40:01 +00:00
Johannes Doerfert	6471bb6f18	[Attributor][NFC] Improve debug output llvm-svn: 367792	2019-08-04 18:39:28 +00:00
Johannes Doerfert	4361da24ac	[Attributor][Fix] Resolve various liveness issues Summary: This contains various fixes: - Explicitly determine and return the next noreturn instruction. - If an invoke calls a noreturn function which is not nounwind we keep the unwind destination live. This also means we require an invoke. Though we can still add the unreachable to the normal destination block. - Check if the return instructions are dead after we look for calls to avoid triggering an optimistic fixpoint in the presence of assumed liveness information. - Make the interface work with "const" pointers. - Some simplifications While additional tests are included, full coverage is achieved only with D59978. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65701 llvm-svn: 367791	2019-08-04 18:38:53 +00:00
Johannes Doerfert	d1c3793563	[Attributor][NFC] Simplify common pattern wrt. fixpoints When a fixpoint is indicated the change status is known due to the fixpoint kind. This simplifies a common code pattern by making the connection explicit. llvm-svn: 367790	2019-08-04 18:37:38 +00:00
Johannes Doerfert	b6acee5c7b	[Attributor][NFC] Invalid DerefState is at fixpoint Summary: If the DerefBytesState (and thereby the DerefState) is invalid, we reached a fixpoint for the whole DerefState as we will not manifest/provide information then. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65586 llvm-svn: 367789	2019-08-04 17:55:15 +00:00
Craig Topper	5a4989e2ac	[TargetLowering][X86] Teach SimplifyDemandedVectorElts to replace the base vector of INSERT_SUBVECTOR with undef if none of the elements are demanded even if the node has other users. Summary: The SimplifyDemandedVectorElts function can replace with undef when no elements are demanded, but due to how it interacts with TargetLoweringOpts, it can only do this when the node has no other users. Remove a now unneeded DAG combine from the X86 backend. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65713 llvm-svn: 367788	2019-08-04 17:30:41 +00:00
Simon Pilgrim	436fd52a71	[X86] lowerShuffleAsSpecificZeroOrAnyExtend - use undef PSHUFB mask indices for ANY_EXTEND shuffles llvm-svn: 367784	2019-08-04 13:15:23 +00:00
Simon Pilgrim	c5891eaa34	Fix signed/unsigned comparison warning. NFC. llvm-svn: 367783	2019-08-04 12:48:19 +00:00
Simon Pilgrim	e16901844d	[X86] SimplifyMultipleUseDemandedBits - Add target shuffle support llvm-svn: 367782	2019-08-04 12:24:40 +00:00
David Green	91296295d0	[ARM] MVE big endian bitcasts This adds big endian MVE patterns for bitcasts. They are defined in llvm as being the same as a store of the existing type and the load into the new. This means that they have to become a VREV between the two types, working in the same way that NEON works in big-endian. This also adds some example tests for big endian, showing where code is and isn't different. The main difference, especially from a testing perspective, is that vectors are passed as v2f64, and so are VREV into and out of call arguments, and the parameters are passed in a v2f64 format. The same happens for inline assembly where the register class is used, so it is VREV to a v16i8. So some of this is probably not correct yet, but it is (mostly) self-consistent and seems to be consistent with how llvm treats vectors. The rest we can hopefully fix later. More details about big endian neon can be found in https://llvm.org/docs/BigEndianNEON.html. Differential Revision: https://reviews.llvm.org/D65581 llvm-svn: 367780	2019-08-04 10:18:15 +00:00
Michael Pozulp	d34099926e	[llvm-objdump] Re-commit r367284. Add warning messages if disassembly + source for problematic inputs Summary: Addresses https://bugs.llvm.org/show_bug.cgi?id=41905 Reviewers: jhenderson, rupprecht, grimar Reviewed By: jhenderson, grimar Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62462 llvm-svn: 367776	2019-08-04 06:04:00 +00:00
Craig Topper	0fff1e4f3d	[X86] Consistently use MVT::i8 for the constant operand of BLENDI and INSERTPS nodes. This is the type listed in the type constraint for isel. But since we list a type there, it doesn't get checked during isel matching. llvm-svn: 367775	2019-08-04 06:01:31 +00:00
Craig Topper	76f0f2e0f0	[SelectionDAG] Add node creation debug message to getMemIntrinsicNode. llvm-svn: 367771	2019-08-04 02:32:06 +00:00
Yonghong Song	44b16bd4a5	[Transforms] Do not drop !preserve.access.index metadata Currently, when a GVN or CSE optimization happens, the llvm.preserve.access.index metadata is dropped. This caused a problem for BPF AbstractMemberOffset phase as it relies on the metadata (debuginfo types). This patch added proper hooks in lib/Transforms to preserve !preserve.access.index metadata. A test case is added to ensure metadata is preserved under CSE. Differential Revision: https://reviews.llvm.org/D65700 llvm-svn: 367769	2019-08-03 23:41:26 +00:00
Craig Topper	2edeb8a11a	[DAGCombiner] Prevent the combine added in r367710 from creating illegal types after type legalization. This is a further fix for PR42880. Sanjay already disabled the X86 TLI hook for non-simple types, but we should really call isTypeLegal here if we're after type legalization. llvm-svn: 367768	2019-08-03 23:09:13 +00:00
Sanjay Patel	c9171bd0a9	[x86] change free truncate hook to handle only simple types (PR42880) This avoids the crash from: https://bugs.llvm.org/show_bug.cgi?id=42880 ...and I think it's a proper constraint for the TLI hook. But that example raises questions about what happens to get us into this situation (created i29 types) and what happens later (why does legalization die on those types), so I'm not sure if we will resolve the bug based on this change. llvm-svn: 367766	2019-08-03 21:46:27 +00:00
Keno Fischer	3c805d125a	[WebAssembly] Fix allocsize attribute in sjlj lowering Summary: The allocsize attribute refers to call parameters by index. Thus, when we add the extra parameter in sjlj lowering, we need to increment the referenced parameter in the allocsize attribute to avoid angering the Verifier. Reviewed By: aheejin Differential Revision: https://reviews.llvm.org/D65470 llvm-svn: 367765	2019-08-03 21:38:19 +00:00
Lang Hames	3daccaac8a	[JITLink] Add support for MachO/x86-64 UNSIGNED relocs with length=2. MachO/x86-64 UNSIGNED relocs are almost always 64-bit (length=3), but UNSIGNED relocs of length=2 are allowed if the target resides in the low 32-bits. This patch adds support for such relocations in JITLink (previously they would have triggered an unsupported relocation error). llvm-svn: 367764	2019-08-03 20:17:10 +00:00
Lang Hames	b31229af4f	[JITLink] Fix error message formatting. llvm-svn: 367763	2019-08-03 20:17:08 +00:00
Stefan Stipanovic	7849e41635	[Attributor][NFC] run clang-format on Attributor.cpp llvm-svn: 367757	2019-08-03 15:27:41 +00:00
Praveen Velliengiri	f5c40cb900	Speculative Compilation [ORC] Remove Speculator Variants for Different Program Representations [ORC] Block Freq Analysis Speculative Compilation with Naive Block Frequency Add Applications to OrcSpeculation ORC v2 with Block Freq Query & Example Deleted BenchMark Programs Signed-off-by: preejackie <praveenvelliengiri@gmail.com> ORCv2 comments resolved [ORCV2] NFC ORCv2 NFC [ORCv2] Speculative compilation - CFGWalkQuery ORCv2 Adapting IRSpeculationLayer to new locking scheme llvm-svn: 367756	2019-08-03 14:42:13 +00:00
Tim Northover	a009a60a91	IR: print value numbers for unnamed function arguments For consistency with normal instructions and clarity when reading IR, it's best to print the %0, %1, ... names of function arguments in definitions. Also modifies the parser to accept IR in that form for obvious reasons. llvm-svn: 367755	2019-08-03 14:28:34 +00:00
Sylvestre Ledru	6bf861298a	Finish moving TargetRegisterInfo::isVirtualRegister() and friends to llvm::Register as started by r367614. NFC llvm-svn: 367754	2019-08-03 13:51:58 +00:00
Nikita Popov	4f8259bdbc	[Thumb] Fix invalid symbol redefinition due to duplicated jumptable (PR42760) Fix for https://bugs.llvm.org/show_bug.cgi?id=42760. A tBR_JTr instruction is duplicated by tail duplication, which results in the same jumptable with the same label being emitted twice. Fix this by marking tBR_JTr as not duplicable. The corresponding ARM/Thumb instructions are already marked as not duplicable. Additionally also mark tTBB_JT and tTBH_JT to be consistent with Thumb2, even though this shouldn't be strictly necessary. Differential Revision: https://reviews.llvm.org/D65606 llvm-svn: 367753	2019-08-03 06:47:23 +00:00
Bill Wendling	41a2847a9a	Emit diagnostic if an inline asm constraint requires an immediate Summary: An inline asm call can result in an immediate after inlining. Therefore emit a diagnostic here if constraint requires an immediate but one isn't supplied. Reviewers: joerg, mgorny, efriedma, rsmith Reviewed By: joerg Subscribers: asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, s.egerton, MaskRay, jyknight, dylanmckay, javed.absar, fedor.sergeev, jrtc27, Jim, krytarowski, eraman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60942 llvm-svn: 367750	2019-08-03 05:52:47 +00:00
Hideto Ueno	96bb347205	[Attributor] Fix dereferenceable callsite argument initialization llvm-svn: 367748	2019-08-03 04:10:50 +00:00
Amara Emerson	c835164a47	Re-commit "[GlobalISel] Add legalization support for non-power-2 loads and stores" This is an old commit that exposed a bug in the GISel importer, which caused non-truncating stores to be selected for truncating store patterns. Now that's been fixed in r367737 this can go back in. llvm-svn: 367739	2019-08-02 23:44:24 +00:00
Craig Topper	b1cfcd1a56	[ScalarizeMaskedMemIntrin] Bitcast the mask to the scalar domain and use scalar bit tests for the branches for expandload/compressstore. Same as what was done for gather/scatter/load/store in r367489. Expandload/compressstore were delayed due to lack of constant masking handling that has since been fixed. llvm-svn: 367738	2019-08-02 23:43:53 +00:00
Craig Topper	45ea25289d	[X86] Use the pointer VT for the Scale node when lowering x86 gather/scatter intrinsics. This is consistent with the target independent intrinsic handling. Not sure this really matters since we just pull the constant out using getZExtValue later. llvm-svn: 367736	2019-08-02 23:18:16 +00:00
Yonghong Song	37d24a696b	[BPF] Handling type conversions correctly for CO-RE With newly added debuginfo type metadata for preserve_array_access_index() intrinsic, this patch did the following two things: (1). checking validity before adding a new access index to the access chain. (2). calculating access byte offset in IR phase BPFAbstractMemberAccess instead of when BTF is emitted. For (1), the metadata provided by all preserve_*_access_index() intrinsics are used to check whether the to-be-added type is a proper struct/union member or array element. For (2), with all available metadata, calculating access byte offset becomes easier in BPFAbstractMemberAccess IR phase. This enables us to remove the unnecessary complexity in BTFDebug.cpp. New tests are added for . user explicit casting to array/structure/union . global variable (or its dereference) as the source of base . multi-dimensional arrays . array access given a base pointer . cases where we won't generate relocation if we cannot find type name. Differential Revision: https://reviews.llvm.org/D65618 llvm-svn: 367735	2019-08-02 23:16:44 +00:00
JF Bastien	748dac7389	Remove support for unsupported MSVC versions Re-land r367727 with the #if fixed. Reviewers: rnk, lebedev.ri Subscribers: hiraditya, jkorous, dexonsmith, lebedev.ri, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65662 llvm-svn: 367734	2019-08-02 23:09:01 +00:00
Douglas Yung	42618b270d	Revert Fix and test inter-procedural register allocation for ARM This reverts r367669 (git commit `f6b00c279a`) This was breaking a build bot http://lab.llvm.org:8011/builders/netbsd-amd64/builds/21233 llvm-svn: 367731	2019-08-02 22:11:49 +00:00
JF Bastien	21d01ea9b6	Revert "Remove support for unsupported MSVC versions" Mismatched preprocessor, I'll fix in a follow-up. llvm-svn: 367728	2019-08-02 22:02:25 +00:00
JF Bastien	dc8af80c19	Remove support for unsupported MSVC versions Reviewers: rnk, lebedev.ri Subscribers: hiraditya, jkorous, dexonsmith, lebedev.ri, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65662 llvm-svn: 367727	2019-08-02 21:52:35 +00:00
Stefan Stipanovic	d021617bf7	[Attributor] Using liveness in other attributes. Modifying other AbstractAttributes to use Liveness AA and skip dead instructions. Reviewers: jdoerfert, uenoku Subscribers: hiraditya, llvm-commits Differential revision: https://reviews.llvm.org/D65243 llvm-svn: 367725	2019-08-02 21:31:22 +00:00
Amara Emerson	73752abeab	[AArch64][GlobalISel] Eliminate redundant G_ZEXT when the source is implicitly zext-loaded. These cases can come up when the extending loads combiner doesn't combine a zext(load) to a zextload op, due to some other operation being in between, which then gets simplified at a later stage. Differential Revision: https://reviews.llvm.org/D65360 llvm-svn: 367723	2019-08-02 21:15:36 +00:00
Simon Pilgrim	794f7591ec	[TargetLowering] SimplifyMultipleUseDemandedBits - don't assume INSERT_VECTOR_ELT value type is simple. Noticed by inspection - this was copied from the X86 target equivalent where we can assume its legal/simple. llvm-svn: 367721	2019-08-02 21:07:07 +00:00
Daniel Sanders	e7694f34ab	Use MCRegister in MCRegisterInfo's interfaces Summary: As part of this, define DenseMapInfo for MCRegister (and Register while I'm at it) Depends on D65599 Reviewers: arsenm Subscribers: MatzeB, qcolombet, jvesely, wdng, nhaehnle, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65605 llvm-svn: 367719	2019-08-02 20:23:00 +00:00
Philip Reames	511be2a158	[Statepoints] Fix overalignment of loads in no-realign-stack functions This really should have been part of 366765. For some reason, I forgot to handle the corresponding load side, and the readable test cases (using deopt vs statepoints) turned out to be overly reduced. Oops. As seen in the test change, the problem was that we were using a load with alignment expectations rather than the unaligned variant when the stack alignment was less than the preferred type alignment. llvm-svn: 367718	2019-08-02 20:17:37 +00:00
Peter Collingbourne	196931a7dd	hwasan: Remove unused field CurModuleUniqueId. NFCI. llvm-svn: 367717	2019-08-02 20:14:58 +00:00
Lang Hames	10430f4174	[ORC] Remove a dead method. llvm-svn: 367716	2019-08-02 20:09:30 +00:00
Craig Topper	de9b1d7912	[ScalarizeMaskedMemIntrin] Add constant mask support to expandload and compressstore scalarization This adds support for generating all the loads or stores for a constant mask into a single basic block with no conditionals. Differential Revision: https://reviews.llvm.org/D65613 llvm-svn: 367715	2019-08-02 20:04:34 +00:00
Lang Hames	cb391279b4	[ORC] Turn on symbol-flags overrides for LLJIT on Windows by default. libObject does not apply the Exported flag to symbols in COFF object files, which can lead to assertions when the symbol flags initially derived from IR added to the JIT clash with the flags seen by the JIT linker. Both RTDyldObjectLinkingLayer and ObjectLinkingLayer have a workaround for this: they can be told to override the flags seen by the linker with the flags attached to the materialization responsibility object that was passed down to the linker. This patch modifies LLJIT's setup code to enable this override by default on platforms where COFF is the default object format. llvm-svn: 367712	2019-08-02 19:43:20 +00:00
Sanjay Patel	68264558f9	[DAGCombiner] try to convert opposing shifts to casts This reverses a questionable IR canonicalization when a truncate is free: sra (add (shl X, N1C), AddC), N1C --> sext (add (trunc X to (width - N1C)), AddC') https://rise4fun.com/Alive/slRC More details in PR42644: https://bugs.llvm.org/show_bug.cgi?id=42644 I limited this to pre-legalization for code simplicity because that should be enough to reverse the IR patterns. I don't have any evidence (no regression test diffs) that we need to try this later. Differential Revision: https://reviews.llvm.org/D65607 llvm-svn: 367710	2019-08-02 19:33:46 +00:00
Eric Christopher	5fb56b1966	Temporarily Revert "Changing representation of cv_def_range directives in Codeview debug info assembly format for better readability" This is breaking bots and the author asked me to revert. This reverts commit 367704. llvm-svn: 367707	2019-08-02 19:10:37 +00:00
Nilanjana Basu	1c67521591	Changing representation of cv_def_range directives in Codeview debug info assembly format for better readability llvm-svn: 367704	2019-08-02 18:44:39 +00:00
Jessica Paquette	e4c46c34ce	[AArch64][GlobalISel] Support the neg_addsub_shifted_imm32 pattern Add an equivalent ComplexRendererFns function for SelectNegArithImmed. This allows us to select immediate adds of -1 by turning them into subtracts. Update select-binop.mir to show that the pattern works. Differential Revision: https://reviews.llvm.org/D65460 llvm-svn: 367700	2019-08-02 18:12:53 +00:00
Alina Sbirlea	5545e6963f	[SimplifyCFG] Cleanup redundant conditions [NFC]. Summary: Since the for loop iterates over BB's predecessors, the branch conditions found must have BB as one of the successors. For an unconditional branch the successor must be BB, added `assert`. For a conditional branch, one of the two successors must be BB, simplify `else if` to `else` and `assert`. Sink common instructions outside the if/else block. Reviewers: sanjoy.google Subscribers: jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65596 llvm-svn: 367699	2019-08-02 18:06:54 +00:00
Daniel Sanders	c94c91f55c	Fix ARC after r367633 llvm-svn: 367697	2019-08-02 17:52:17 +00:00
Peter Collingbourne	4dcf8800e2	CodeGen: Don't follow aliases when extracting type info. This fixes a crash in the case where the type info object is an alias pointing to a non-zero offset within a global or is otherwise unanalyzable by the stripPointerCasts() function. Looking through the alias is not the right thing to do anyway for similar reasons as D65118. Differential Revision: https://reviews.llvm.org/D65314 llvm-svn: 367696	2019-08-02 17:43:45 +00:00
Sanjay Patel	9ce5f41851	[InstCombine] fold cmp+select using select operand equivalence As discussed in PR42696: https://bugs.llvm.org/show_bug.cgi?id=42696 ...but won't help that case yet. We have an odd situation where a select operand equivalence fold was implemented in InstSimplify when it could have been done more generally in InstCombine if we allow dropping of {nsw,nuw,exact} from a binop operand. Here's an example: https://rise4fun.com/Alive/Xplr %cmp = icmp eq i32 %x, 2147483647 %add = add nsw i32 %x, 1 %sel = select i1 %cmp, i32 -2147483648, i32 %add => %sel = add i32 %x, 1 I've left the InstSimplify code in place for now, but my guess is that we'd prefer to remove that as a follow-up to save on code duplication and compile-time. Differential Revision: https://reviews.llvm.org/D65576 llvm-svn: 367695	2019-08-02 17:39:32 +00:00
Lang Hames	809e9d1efa	[ORC] Change the locking scheme for ThreadSafeModule. ThreadSafeModule/ThreadSafeContext are used to manage lifetimes and locking for LLVMContexts in ORCv2. Prior to this patch contexts were locked as soon as an associated Module was emitted (to be compiled and linked), and were not unlocked until the emit call returned. This could lead to deadlocks if interdependent modules that shared contexts were compiled on different threads: when, during emission of the first module, the dependence was discovered the second module (which would provide the required symbol) could not be emitted as the thread emitting the first module still held the lock. This patch eliminates this possibility by moving to a finer-grained locking scheme. Each client holds the module lock only while they are actively operating on it. To make this finer grained locking simpler/safer to implement this patch removes the explicit lock method, 'getContextLock', from ThreadSafeModule and replaces it with a new method, 'withModuleDo', that implicitly locks the context, calls a user-supplied function object to operate on the Module, then implicitly unlocks the context before returning the result. ThreadSafeModule TSM = getModule(...); size_t NumFunctions = TSM.withModuleDo( [](Module &M) { // <- context locked before entry to lambda. return M.size(); }); Existing ORCv2 layers that operate on ThreadSafeModules are updated to use the new method. This method is used to introduce Module locking into each of the existing layers. llvm-svn: 367686	2019-08-02 15:21:37 +00:00
David Candler	7eacefedab	[NFC] Test commit, corrected some spelling in comment Test commit, corrected some spelling in comment. Differential Revision: https://reviews.llvm.org/D65516 llvm-svn: 367685	2019-08-02 14:44:17 +00:00
Tim Northover	522fb7eedc	GlobalISel: support swiftself attribute llvm-svn: 367683	2019-08-02 14:09:49 +00:00
Teresa Johnson	d2df54e6a5	[ThinLTO] Implement index-based WPD This patch adds support to the WholeProgramDevirt pass to perform index-based WPD, which is invoked from ThinLTO during the thin link. The ThinLTO backend (WPD import phase) behaves the same regardless of whether the WPD decisions were made with the index-based or (the existing) IR-based analysis. Depends on D54815. Reviewers: pcc Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, arphaman, dang, llvm-commits Differential Revision: https://reviews.llvm.org/D55153 llvm-svn: 367679	2019-08-02 13:10:52 +00:00
Martin Storsjo	ed7e1cd877	[llvm-dlltool] Clarify an error message. NFC. The parameter to the -D (--dllname) option is the name of the dll that llvm-dlltool produces an import library for. Even though this is named "OutputFile" in the COFFModuleDefinition class, it's not an output file name in the context of llvm-dlltool, but the name of the DLL to create an import library for. llvm-svn: 367676	2019-08-02 11:20:03 +00:00
Oliver Stannard	4b7239ebac	[IPRA][ARM] Disable no-CSR optimisation for ARM This optimisation isn't generally profitable for ARM, because we can save/restore many registers in the prologue and epilogue using the PUSH and POP instructions, but mostly use individual LDR/STR instructions for other spills. Differential revision: https://reviews.llvm.org/D64910 llvm-svn: 367670	2019-08-02 10:23:17 +00:00
Oliver Stannard	f6b00c279a	Fix and test inter-procedural register allocation for ARM - Avoid a crash when IPRA calls ARMFrameLowering::determineCalleeSaves with a null RegScavenger. Simply not updating the register scavenger is fine because IPRA only cares about the SavedRegs vector, the actual code of the function has already been generated at this point. - Add a new hook to TargetRegisterInfo to get the set of registers which can be clobbered inside a call, even if the compiler can see both sides, by linker-generated code. Differential revision: https://reviews.llvm.org/D64908 llvm-svn: 367669	2019-08-02 10:23:05 +00:00
Serguei Katkov	de67affd00	[Loop Peeling] Introduce an option for profile based peeling disabling. This patch adds an ability to disable profile based peeling causing the peeling of all iterations and as a result prohibits further unroll/peeling attempts on that loop. The motivation to get an ability to separate peeling usage in pipeline where in the first part we peel only separate iterations if needed and later in pipeline we apply the full peeling which will prohibit further peeling. Reviewers: reames, fhahn Reviewed By: reames Subscribers: hiraditya, zzheng, dmgreen, llvm-commits Differential Revision: https://reviews.llvm.org/D64983 llvm-svn: 367668	2019-08-02 09:32:52 +00:00
Sam Parker	cd38599275	[NFC][ARM][ParallelDSP] Rename/remove/change types Remove forward declaration, fold a couple of typedefs and change one to be more useful. llvm-svn: 367665	2019-08-02 08:21:17 +00:00
Sam Parker	14c6dfdfe2	[NFC][ARM][ParallelDSP] Remove ValueList We only care about the first element in the list. llvm-svn: 367660	2019-08-02 07:32:28 +00:00
Rui Ueyama	4d41c332ef	Revert r367649: Improve raw_ostream so that you can "write" colors using operator<< This reverts commit r367649 in an attempt to unbreak Windows bots. llvm-svn: 367658	2019-08-02 07:22:34 +00:00
Hideki Saito	09fac2450b	[LV] Avoid building interleaved group in presence of WAW dependency Reviewers: hsaito, Ayal, fhahn, anna, mkazantsev Reviewed By: hsaito Patch by evrevnov, thanks! Differential Revision: https://reviews.llvm.org/D63981 llvm-svn: 367654	2019-08-02 06:31:50 +00:00
Rui Ueyama	a52f982f1c	Improve raw_ostream so that you can "write" colors using operator<< 1. raw_ostream supports ANSI colors so that you can write messages to the terminal with colors. Previously, in order to change and reset color, you had to call `changeColor` and `resetColor` functions, respectively. So, if you print out "error: " in red, for example, you had to do something like this: OS.changeColor(raw_ostream::RED); OS << "error: "; OS.resetColor(); With this patch, you can write the same code as follows: OS << raw_ostream::RED << "error: " << raw_ostream::RESET; 2. Add a boolean flag to raw_ostream so that you can disable colored output. If you disable colors, changeColor, operator<<(Color), resetColor and other color-related functions have no effect. Most LLVM tools automatically print out messages using colors, and you can disable it by passing a flag such as `--disable-colors`. This new flag makes it easy to write code that works that way. Differential Revision: https://reviews.llvm.org/D65564 llvm-svn: 367649	2019-08-02 04:48:30 +00:00
Serguei Katkov	bbdcc82111	[Loop Peeling] Do not close further unroll/peel if profile based peeling was not used. Current peeling cost model can decide to peel off not all iterations but only some of them to eliminate conditions on phi. At the same time if any peeling happens the door for further unroll/peel optimizations on that loop closes because the part of the code thinks that if peeling happened it is profile based peeling and all iterations are peeled off. To resolve this inconsistency the patch provides the flag which states whether the full peeling basing on profile is enabled or not and peeling cost model is able to modify this field like it does not PeelCount. In a separate patch I will introduce an option to allow/disallow peeling basing on profile. To avoid infinite loop peeling the patch tracks the total number of peeled iteration through llvm.loop.peeled.count loop metadata. Reviewers: reames, fhahn Reviewed By: reames Subscribers: hiraditya, zzheng, dmgreen, llvm-commits Differential Revision: https://reviews.llvm.org/D64972 llvm-svn: 367647	2019-08-02 04:29:23 +00:00
Stanislav Mekhanoshin	6fe00a21f2	Handle casts changing pointer size in the vectorizer Added code to truncate or shrink offsets so that we can continue base pointer search if size has changed along the way. Differential Revision: https://reviews.llvm.org/D65612 llvm-svn: 367646	2019-08-02 04:03:37 +00:00
Kai Luo	fec7da8285	[PowerPC][Peephole] Check if `extsw`'s second operand is a virtual register Summary: When combining `extsw` and `sldi` in `PPCMIPeephole`, we have to check if `extsw`'s second operand is a virtual register, otherwise we might get miscompile. Differential Revision: https://reviews.llvm.org/D65315 llvm-svn: 367645	2019-08-02 03:14:17 +00:00
Kang Zhang	038dd43782	[NFC][CodeGen] Modify the type element of TailCalls to simplify the dupRetToEnableTailCallOpts() Summary: The old code can be simplified to define the element type of TailCalls as `BasicBlock` not `CallInst`. Also I use the for-range loop instead of the for loop. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D64905 llvm-svn: 367644	2019-08-02 03:09:07 +00:00
Eric Christopher	5a00b0772a	Temporarily revert "Changes to improve CodeView debug info type record inline comments" due to a sanitizer failure. This reverts commit 367623. llvm-svn: 367640	2019-08-02 01:05:47 +00:00
Daniel Sanders	12961ff0fa	Fix up an unused variable warning caused by TRI->isVirtualRegister() -> Register::isVirtualRegister() llvm-svn: 367637	2019-08-02 00:17:48 +00:00
Daniel Sanders	2bea69bf65	Finish moving TargetRegisterInfo::isVirtualRegister() and friends to llvm::Register as started by r367614. NFC llvm-svn: 367633	2019-08-01 23:27:28 +00:00
Rong Xu	ca161fa008	[PGO] Add PGO support at -O0 in the experimental new pass manager Add PGO support at -O0 in the experimental new pass manager to sync the behavior of the legacy pass manager. Also change the test of gcc-flag-compatibility.c for more complete test: (1) change the match string to "profc" and "profd" to ensure the instrumentation is happening. (2) add IR format proftext so that PGO use compilation is tested. Differential Revision: https://reviews.llvm.org/D64029 llvm-svn: 367628	2019-08-01 22:36:34 +00:00
Stanislav Mekhanoshin	eee9312a85	Relax load store vectorizer pointer strip checks The previous change to fix crash in the vectorizer introduced performance regressions. The condition to preserve pointer address space during the search is too tight, we only need to match the size. Differential Revision: https://reviews.llvm.org/D65600 llvm-svn: 367624	2019-08-01 22:18:56 +00:00
Nilanjana Basu	ac7e5788ca	Changes to improve CodeView debug info type record inline comments Signed-off-by: Nilanjana Basu <nilanjana.basu87@gmail.com> llvm-svn: 367623	2019-08-01 22:05:14 +00:00
Wouter van Oortmerssen	7fee93ed59	[WebAssembly] Fixed relocation errors having no location. Summary: Fixes: https://bugs.llvm.org/show_bug.cgi?id=42441 Used to print: <unknown>:0: error: Cannot represent a difference across sections (the location was null). Now prints: err.s:20:3: error: Cannot represent a difference across sections i32.const foo-bar ^ Note: I looked at adding a test for this, but I don't think it is worth it. We're not testing error formatting in the Wasm backend :) Reviewers: sbc100, jgravelle-google Subscribers: dschuff, aheejin, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65602 llvm-svn: 367619	2019-08-01 21:34:54 +00:00
Matt Arsenault	d9d30a408e	GlobalISel: Lower scalarizing unmerge of a vector to shifts AMDGPU sometimes has legal s16 and <2 x s16> operations, but all registers are really 32-bit. An unmerge destination really should be widened to a 32-bit register. If widening a scalarizing vector with a target size that matches the vector size, bitcast to integer and extract the relevant bits with shifts. I'm not sure if this is the right place for this. This could arguably be part of widenScalar for the result. I also have a growing feeling that we're missing a bitcast legalize action. llvm-svn: 367604	2019-08-01 19:10:05 +00:00
Sjoerd Meijer	e0dfce0723	Follow up of rL367592, fix the build Some buildbots complained about: error: default label in switch which covers all enumeration values llvm-svn: 367603	2019-08-01 18:54:29 +00:00
Craig Topper	a9ed5436bd	[X86] In decomposeMulByConstant, legalize the VT before querying whether the multiply is legal If a type is larger than a legal type and needs to be split, we would previously allow the multiply to be decomposed even if the split multiply is legal. Since the shift + add/sub code would also need to be split, it's not any better to decompose it. This patch figures out what type the mul will eventually be legalized to and then uses that type for the query. I tried just returning false for illegal types and letting them get handled after type legalization, but then we can't recognize an i64 constant splat on 32-bit targets since it will be destroyed by type legalization. We could special case vectors of i64 to avoid that... Differential Revision: https://reviews.llvm.org/D65533 llvm-svn: 367601	2019-08-01 18:49:07 +00:00
Matt Arsenault	bb582ebdba	AMDGPU: Remove v0 workaround for DS_GWS_* instructions Any register should work for the src field since r366067, since the used value is not pulled from the expected encoding field. llvm-svn: 367598	2019-08-01 18:41:32 +00:00
Matt Arsenault	e56a2ad85e	CodeGen: Allow virtual registers in bundles The note in the documentation suggests this restriction is a compile time optimization for architectures that make heavy use of bundling. Allowing virtual registers in a bundle is useful for some (non-R600) AMDGPU use cases and are infrequent enough to matter. A more common AMDGPU use case has already been using virtual registers in bundles since r333691, although never calling finalizeBundle on them and manually creating the use/def list on the BUNDLE instruction. This is also relatively infrequent, and only happens for consecutive sequences of some load/store types. llvm-svn: 367597	2019-08-01 18:41:28 +00:00
Alina Sbirlea	3af2a69575	[SimplifyCFG] Mark missed Changed to true. Summary: DominatorTree is invalid after SimplifyCFG because of a missed `Changed = true` when simplifying a branch condition and removing an edge. Resolves PR42272. Reviewers: zhizhouy, manojgupta Subscribers: jlebar, sanjoy.google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65490 llvm-svn: 367596	2019-08-01 18:37:34 +00:00
Alina Sbirlea	172838df6b	[MemorySSA] Set LoopSimplify to preserve MemorySSA in the NPM, if analysis exists. Summary: LoopSimplify is preserved in the legacy pass manager, but not in the new pass manager. Update LoopSimplify to preserve MemorySSA conditionally when the analysis is available (same behavior as the legacy pass manager). Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, Prazek, george.burgess.iv, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65418 llvm-svn: 367594	2019-08-01 18:28:28 +00:00
Matt Arsenault	aff2995f46	AMDGPU: Use tablegen pattern for sendmsg intrinsics Since this now emits a direct copy to m0, SIFixSGPRCopies has to handle a physical register. llvm-svn: 367593	2019-08-01 18:27:11 +00:00
Sjoerd Meijer	20b198ec5e	[LV] Tail-Loop Folding This allows folding of the scalar epilogue loop (the tail) into the main vectorised loop body when the loop is annotated with a "vector predicate" metadata hint. To fold the tail, instructions need to be predicated (masked), enabling/disabling lanes for the remainder iterations. Differential Revision: https://reviews.llvm.org/D65197 llvm-svn: 367592	2019-08-01 18:21:44 +00:00
Matt Arsenault	5faa533e47	GlobalISel: Fix widenScalar for G_MERGE_VALUES to pointer AMDGPU testcase isn't broken now, but will be in a future patch without this. llvm-svn: 367591	2019-08-01 18:13:16 +00:00
Wouter van Oortmerssen	87af0b1911	[WebAssembly] Assembler/InstPrinter: support call_indirect type index. A TYPE_INDEX operand (as used by call_indirect) used to be represented by the InstPrinter as a symbol (e.g. .Ltype_index0@TYPE_INDEX) which was a bit of a mismatch with the WasmObjectWriter which expects an unnamed symbol, to receive the signature from and then turn into a reloc. There was really no good way to round-trip this information. An earlier version of this patch tried to attach the signature information using a .functype, but that ran into trouble when the symbol was re-emitted without a name. Removing the name was a giant hack also. The current version changes the assembly syntax to have an inline signature spec for TYPEINDEX operands that is always unnamed, which is much more elegant both in syntax and in implementation (as now the assembler is able to follow the same path as the regular backend) Reviewers: sbc100, dschuff, aheejin, jgravelle-google, sunfish, tlively Subscribers: arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64758 llvm-svn: 367590	2019-08-01 18:08:26 +00:00
Simon Pilgrim	1d183b407a	[TargetLowering] SimplifyMultipleUseDemandedBits - Add ISD::INSERT_VECTOR_ELT handling Allow us to peek through vector insertions to avoid dependencies on entire insertion chains. llvm-svn: 367588	2019-08-01 17:46:44 +00:00
Simon Pilgrim	63d4114f72	[X86][SSE] Add PEXTR(PINSR(v, s, c), c) -> s combine. We should probably extend this to cover bitcasts as well to help other cases in promote-vec3.ll. llvm-svn: 367582	2019-08-01 16:38:39 +00:00
Johannes Doerfert	da4d811707	[Attributor][FIX] Indicate a missing update change User of AAReturnedValues need to know if HasOverdefinedReturnedCalls changed from false to true as it will impact the result of the return value traversal (calls are not ignored anymore). This will be tested with the tests in D59978. llvm-svn: 367581	2019-08-01 16:21:54 +00:00
Simon Atanasyan	0620cf11ec	[mips] Fix lowering load/store instruction in PIC case If an operand of the `lw/sw` instructions is a symbol, these instructions are incorrectly lowered using a non-position-independent chain of commands. For PIC code we should use `lw/addiu` instructions with the `R_MIPS_GOT16` and `R_MIPS_LO16` relocations respectively. Instead of that LLVM generates position dependent code with the `R_MIPS_HI16` and `R_MIPS_LO16` relocations. This patch provides a fix for the bug by handling PIC case separately in the `MipsAsmParser::expandMemInst`. The main idea is to generate a chain of PIC instructions to load a symbol address into a register and then load the address content. The fix is not optimal and does not fix all PIC-related problems. This is a task for subsequent patches. Differential Revision: https://reviews.llvm.org/D65524 llvm-svn: 367580	2019-08-01 16:04:29 +00:00
Simon Pilgrim	33f5f863b5	[X86][SSE] SimplifyMultipleUseDemandedBits - Add PEXTR/PINSR B+W handling This adds SimplifyMultipleUseDemandedBitsForTargetNode X86 support and uses it to allow us to peek through vector insertions to avoid dependencies on entire insertion chains. llvm-svn: 367570	2019-08-01 14:46:03 +00:00
Simon Pilgrim	f99f9881e3	[X86] EltsFromConsecutiveLoads - don't attempt to merge volatile loads (PR42846) llvm-svn: 367556	2019-08-01 13:13:18 +00:00
Sam Elliott	f596f45070	[RISCV] Add Custom Parser for Atomic Memory Operands Summary: GCC Accepts both (reg) and 0(reg) for atomic instruction memory operands. These instructions do not allow for an offset in their encoding, so in the latter case, the 0 is silently dropped. Due to how we have structured the RISCVAsmParser, the easiest way to add support for parsing this offset is to add a custom AsmOperand and parser. This parser drops all the parens, and just keeps the register. This commit also adds a custom printer for these operands, which matches the GCC canonical printer, printing both `(a0)` and `0(a0)` as `(a0)`. Reviewers: asb, lewis-revill Reviewed By: asb Subscribers: s.egerton, hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, jfb, PkmX, jocewei, psnobl, benna, Jim, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65205 llvm-svn: 367553	2019-08-01 12:42:31 +00:00
Roman Lebedev	081e990d08	[IR] Value: add replaceUsesWithIf() utility Summary: While there is always a `Value::replaceAllUsesWith()`, sometimes the replacement needs to be conditional. I have only cleaned a few cases where `replaceUsesWithIf()` could be used, to both add test coverage, and show that it is actually useful. Reviewers: jdoerfert, spatel, RKSimon, craig.topper Reviewed By: jdoerfert Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, aheejin, george.burgess.iv, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65528 llvm-svn: 367548	2019-08-01 12:32:08 +00:00
Roman Lebedev	0efeaa8162	[IR] SelectInst: add swapValues() utility Summary: Sometimes we need to swap true-val and false-val of a `SelectInst`. Having a function for that is nicer than hand-writing it each time. Reviewers: spatel, RKSimon, craig.topper, jdoerfert Reviewed By: jdoerfert Subscribers: jdoerfert, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65520 llvm-svn: 367547	2019-08-01 12:31:35 +00:00
David Green	1343814fb4	[ARM] Fix for MVE VREV64 The VREV64 instruction is apparently unpredictable if Qd == Qm, due to the cross-beat nature of the instruction. This adds an earlyclobber to Qd, which seems to be the same way we deal with this on other instructions like the write-back on loads and stores. Differential Revision: https://reviews.llvm.org/D65502 llvm-svn: 367544	2019-08-01 11:22:03 +00:00
Sander de Smalen	7ebccfefb8	[AArch64] Do not allocate unnecessary emergency slot. Fix an issue where the compiler still allocates an emergency spill slot even though it already decided to spill an extra callee-save register to use as a scratch register. Reviewers: gberry, thegameg, mstorsjo, t.p.northover Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D65504 llvm-svn: 367540	2019-08-01 10:53:45 +00:00
Petar Avramovic	8a40cedfe6	[MIPS GlobalISel] Fold load/store + G_GEP + G_CONSTANT Fold load/store + G_GEP + G_CONSTANT when immediate in G_CONSTANT fits into 16 bit signed integer. Differential Revision: https://reviews.llvm.org/D65507 llvm-svn: 367535	2019-08-01 09:40:13 +00:00
Sam Parker	7ca8c6f6db	[NFC][ARM][ParallelDSP] Getters and renaming Add a couple of getters for Reduction and do some renaming of variables around CreateSMLAD for clarity. llvm-svn: 367522	2019-08-01 08:17:51 +00:00
Craig Topper	388df2ea19	[SelectionDAG] Use APInt::isSubsetOf/intersects to simplify some code. Also use KnownBits::isNegative/isNonNegative to further simplify. llvm-svn: 367518	2019-08-01 06:06:21 +00:00
Tom Stellard	7a2958bc20	AMDGPU/SILoadStoreOptimizer: Make some functions const Reviewers: arsenm, pendingchaos, rampitec Reviewed By: rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65316 llvm-svn: 367517	2019-08-01 05:39:17 +00:00
Zi Xuan Wu	66c320908b	recommit:[PowerPC] Eliminate loads/swap feeding swap/store for vector type by using big-endian load/store In PowerPC, there is instruction to load vector in big endian element order when it's in little endian target. So we can combine vector load + reverse into big endian load to eliminate the swap instruction. Also combine vector reverse + store into big endian store. Differential Revision: https://reviews.llvm.org/D65063 llvm-svn: 367516	2019-08-01 05:26:02 +00:00
Matt Arsenault	9952f46407	AMDGPU/GlobalISel: Fix flat load/store of pointer types llvm-svn: 367513	2019-08-01 03:57:42 +00:00
Matt Arsenault	57495268ac	AMDGPU/GlobalISel: Remove manual store select code This regresses the weird types that are newly treated as legal load types, but fixes incorrectly using flat instructions on SI. llvm-svn: 367512	2019-08-01 03:52:40 +00:00
Matt Arsenault	ae87b9f2c2	AMDGPU/GlobalISel: Select local atomic cmpxchg llvm-svn: 367511	2019-08-01 03:41:41 +00:00
Matt Arsenault	26cb53b260	AMDGPU/GlobalISel: Handle G_ATOMICRMW_FADD llvm-svn: 367509	2019-08-01 03:33:15 +00:00
Matt Arsenault	da5b9bfa95	AMDGPU/GlobalISel: Allow selection of DS atomicrmw llvm-svn: 367507	2019-08-01 03:29:01 +00:00
Matt Arsenault	e6ce48422c	AMDGPU: Start redefining atomic PatFrags Start migrating to a form that will be compatible with the global isel emitter. Also should fix some overly lax checks on the memory type, which allowed mis-selecting some illegal atomics. llvm-svn: 367506	2019-08-01 03:25:52 +00:00
Matt Arsenault	70e20c0f08	AMDGPU: Correct FP atomic patterns These need to use an fadd, not an add. Also make the noret part clear in the name. llvm-svn: 367505	2019-08-01 03:22:40 +00:00
Matt Arsenault	3baf4d3418	AMDGPU/GlobalISel: Select simple local stores llvm-svn: 367504	2019-08-01 03:09:15 +00:00
Matt Arsenault	7bedceb5b2	GlobalISel: moreElementsVector for G_LOAD/G_STORE AMDGPU change and test is a placeholder until a future patch with complete handling. llvm-svn: 367503	2019-08-01 01:44:22 +00:00
Peter Collingbourne	fbc563e2cb	Create unique, but identically-named ELF sections for explicitly-sectioned functions and globals when using -function-sections and -data-sections. This allows functions and globals to be reordered later in the linking phase (using the -symbol-ordering-file) even though reordering will be limited to the scope of the explicit section. Patch by Rahman Lavaee! Differential Revision: https://reviews.llvm.org/D65478 llvm-svn: 367501	2019-08-01 01:38:53 +00:00
Matt Arsenault	d48324ff6f	Reapply "AMDGPU: Split block for si_end_cf" This reverts commit r359363, reapplying r357634 llvm-svn: 367500	2019-08-01 01:25:27 +00:00
Philip Reames	79c27c9464	Fix a release-only build warning triggered by rL367485 llvm-svn: 367499	2019-08-01 01:16:08 +00:00
Matt Arsenault	3594011de0	AMDGPU/GlobalISel: Select local loads llvm-svn: 367498	2019-08-01 00:53:38 +00:00
Amy Huang	153f20057c	Revert "[MS] Emit S_HEAPALLOCSITE debug info in Selection DAG" and partial fix. Causes windows buildbot errors. This reverts commit 6e65c34523963094acd0d6c94a5f5c64b32fe6aa and `53da7ca943`. llvm-svn: 367496	2019-07-31 23:59:31 +00:00
Eli Friedman	89b80f1239	[ARM] Lower "(x<<c) > 0x80000000U" to "lsls" on Thumb1. This is extremely specific, but saves three instructions when it's legal. I don't think the code can be usefully generalized. Differential Revision: https://reviews.llvm.org/D65351 llvm-svn: 367492	2019-07-31 23:19:21 +00:00
Eli Friedman	2f45ec1c39	[ARM] Transform compare of masked value to shift on Thumb1. Thumb1 has very limited immediate modes, so turning an "and" into a shift can save multiple instructions. It's possible to simplify the generated code for test2 and test3 in cmp-and-fold.ll a little more, but I'll implement that as a followup. Differential Revision: https://reviews.llvm.org/D65175 llvm-svn: 367491	2019-07-31 23:17:34 +00:00
Craig Topper	b70026c43c	[ScalarizeMaskedMemIntrin] Bitcast the mask to the scalar domain and use scalar bit tests for the branches. X86 at least is able to use movmsk or kmov to move the mask to the scalar domain. Then we can just use test instructions to test individual bits. This is more efficient than extracting each mask element individually. I special cased v1i1 to use the previous behavior. This avoids poor type legalization of bitcast of v1i1 to i1. I've skipped expandload/compressstore as I think we need to handle constant masks for those better first. Many tests end up with duplicate test instructions due to tail duplication in the branch folding pass. But the same thing happens when constructing similar code in C. So it's not unique to the scalarization. Not sure if this lowering code will also be good for other targets, but we're only testing X86 today. Differential Revision: https://reviews.llvm.org/D65319 llvm-svn: 367489	2019-07-31 22:58:15 +00:00
Craig Topper	b51dc64063	[X86] Add DAG combine to fold any_extend_vector_inreg+truncstore to an extractelement+store We have custom code that ignores the normal promoting type legalization on less than 128-bit vector types like v4i8 to emit pavgb, paddusb, psubusb since we don't have the equivalent instruction on a larger element type like v4i32. If this operation appears before a store, we can be left with an any_extend_vector_inreg followed by a truncstore after type legalization. When truncstore isn't legal, this will normally be decomposed into shuffles and a non-truncating store. This will then combine away the any_extend_vector_inreg and shuffle leaving just the store. On avx512, truncstore is legal so we don't decompose it and we had no combines to fix it. This patch adds a new DAG combine to detect this case and emit either an extract_store for 64-bit stores or an extractelement+store for 32 and 16 bit stores. This makes the avx512 codegen match the avx2 codegen for these situations. I'm restricting to only when -x86-experimental-vector-widening-legalization is false. When we're widening we're not likely to create this any_extend_inreg+truncstore combination. This means we should be able to remove this code when we flip the default. I would like to flip the default soon, but I need to investigate some performance regressions it's causing in our branch that I wasn't seeing on trunk. Differential Revision: https://reviews.llvm.org/D65538 llvm-svn: 367488	2019-07-31 22:43:08 +00:00
Michael Berg	005d705d43	Migrate some more fadd and fsub cases away from UnsafeFPMath control to utilize NoSignedZerosFPMath options control Summary: Honoring no signed zeroes is also available as a user control through clang separately regardless of fastmath or UnsafeFPMath context, DAG guards should reflect this context. Reviewers: spatel, arsenm, hfinkel, wristow, craig.topper Reviewed By: spatel Subscribers: rampitec, foad, nhaehnle, wuzish, nemanjai, jvesely, wdng, javed.absar, MaskRay, jsji Differential Revision: https://reviews.llvm.org/D65170 llvm-svn: 367486	2019-07-31 21:57:28 +00:00
Philip Reames	f8e7b53657	[IndVars, RLEV] Support rewriting exit values in loops without known exits (prep work) This is a preparatory patch for future work on supporting exit value rewriting in loops with a mixture of computable and non-computable exit counts. The intention is to be "mostly NFC" - i.e. not enable any interesting new transforms - but in practice, there are some small output changes. The test differences are caused by cases where getSCEVAtScope can simplify a single entry phi without needing any knowledge of the loop. llvm-svn: 367485	2019-07-31 21:15:21 +00:00
Amy Huang	27a73dd02c	Fix to r367374 "[MS] Emit S_HEAPALLOCSITE debug info in Selection DAG" after windows buildbot failure. Added a check that the MachineInstr exists and is a call before trying to add symbols around it. llvm-svn: 367483	2019-07-31 21:03:38 +00:00
Eric Christopher	36fb93982f	Fix unused variable warning for non-assert builds. llvm-svn: 367482	2019-07-31 21:02:03 +00:00
Mark Lacey	641ea2e701	[GISel] Address review feedback on passing MD_callees to lowerCall. Preserve the nullptr default for KnownCallees that appears in the base class. llvm-svn: 367477	2019-07-31 20:34:05 +00:00
Mark Lacey	7b8d3eb9e2	[GISel] Pass MD_callees metadata down in call lowering. Summary: This will make it possible to improve IPRA by taking into account register usage in indirect calls. NFC yet; this is just laying the groundwork to start building up patches to take advantage of the information for improved register allocation. Reviewers: aditya_nandakumar, volkan, qcolombet, arsenm, rovka, aemerson, paquette Subscribers: sdardis, wdng, javed.absar, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65488 llvm-svn: 367476	2019-07-31 20:34:02 +00:00
Peter Collingbourne	09f39967a2	AArch64: Add a tagged-globals backend feature. This feature instructs the backend to allow locally defined global variable addresses to contain a pointer tag in bits 56-63 that will be ignored by the hardware (i.e. TBI), but may be used by an instrumentation pass such as HWASAN. It works by adding a MOVK instruction to the regular ADRP/ADD sequence that sets bits 48-63 to the corresponding bits of the global, with the linker bounds check disabled on the ADRP instruction to prevent the tag from causing a link failure. This implementation of the feature omits the MOVK when loading from or storing to a global, which is sufficient for TBI. If the same approach is extended to MTE, assuming that 0 is not configured as a catch-all tag, we will most likely also need the MOVK in this case in order to avoid a tag mismatch. Differential Revision: https://reviews.llvm.org/D65364 llvm-svn: 367475	2019-07-31 20:14:19 +00:00
Peter Collingbourne	33773d5cfc	SelectionDAG, MI, AArch64: Widen target flags fields/arguments from unsigned char to unsigned. This makes the field wider than MachineOperand::SubReg_TargetFlags so that we don't end up silently truncating any higher bits. We should still catch any bits truncated from the MachineOperand field as a consequence of the assertion in MachineOperand::setTargetFlags(). Differential Revision: https://reviews.llvm.org/D65465 llvm-svn: 367474	2019-07-31 20:14:09 +00:00
Wei Mi	f49c107f06	[DAGCombine] Limit the number of times for the same store and root nodes to bail out in store merging dependence check. We ran into a case where the dependence check in store merging bails out many times for the same store and root nodes in a huge basic block. That increases compile time by almost 100x. The patch adds a map to track how many times the bailing out happens for the same store and root, and if it is over a limit, stops considering the store with the same root as a merging candidate. Differential Revision: https://reviews.llvm.org/D65174 llvm-svn: 367472	2019-07-31 19:59:24 +00:00
Alina Sbirlea	7153f2784c	[SCCP] Update condition to avoid overflow. Summary: Update condition to remove addition that may cause an overflow. Resolves PR42814. Reviewers: sanjoy, RKSimon Subscribers: jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65417 llvm-svn: 367461	2019-07-31 18:22:22 +00:00
Alina Sbirlea	63e97fa0b3	[MemorySSA] Add additional verification for phis. Summary: Verify that the incoming defs into phis are the last defs from the respective incoming blocks. When moving blocks, insertDef must RenameUses. Adding this verification makes GVNHoist tests fail that uncovered this issue. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63147 llvm-svn: 367451	2019-07-31 17:41:04 +00:00
Sanjay Patel	435cdecdf7	[InstCombine] canonicalize fneg before fmul/fdiv Reverse the canonicalization of fneg relative to fmul/fdiv. That makes it easier to implement the transforms (and possibly other fneg transforms) in 1 place because we can always start the pattern match from fneg (either the legacy binop or the new unop). There's a secondary practical benefit seen in PR21914 and PR42681: https://bugs.llvm.org/show_bug.cgi?id=21914 https://bugs.llvm.org/show_bug.cgi?id=42681 ...hoisting fneg rather than sinking seems to play nicer with LICM in IR (although this change may expose analysis holes in the other direction). 1. The instcombine test changes show the expected neutral IR diffs from reversing the order. 2. The reassociation tests show that we were missing an optimization opportunity to fold away fneg-of-fneg. My reading of IEEE-754 says that all of these transforms are allowed (regardless of binop/unop fneg version) because: "For all other operations [besides copy/abs/negate/copysign], this standard does not specify the sign bit of a NaN result." In all of these transforms, we always have some other binop (fadd/fsub/fmul/fdiv), so we are free to flip the sign bit of a potential intermediate NaN operand. (If that interpretation is wrong, then we must already have a bug in the existing transforms?) 3. The clang tests shouldn't exist as-is, but that's effectively a revert of rL367149 (the test broke with an extension of the pre-existing fneg canonicalization in rL367146). Differential Revision: https://reviews.llvm.org/D65399 llvm-svn: 367447	2019-07-31 16:53:22 +00:00
Djordje Todorovic	b9973f87c6	Reland "[DwarfDebug] Dump call site debug info" The build failure found after the rL365467 has been resolved. Differential Revision: https://reviews.llvm.org/D60716 llvm-svn: 367446	2019-07-31 16:51:28 +00:00
Stanislav Mekhanoshin	ba1e845c21	[AMDGPU] Fix for vectorizer crash with pointers of different size When vectorizer strips pointers it can eventually end up with pointers of two different sizes, then SCEV will crash. Differential Revision: https://reviews.llvm.org/D65480 llvm-svn: 367443	2019-07-31 16:33:11 +00:00
Simon Pilgrim	0707f66ad0	[X86] Moved IsNOT helper earlier. NFCI. Makes it available for more combines to use without adding declarations. llvm-svn: 367436	2019-07-31 14:36:04 +00:00
Mikhail Maltsev	806231ecc3	[ARM] Reject CSEL instructions with invalid operands Summary: According to the Armv8.1-M manual CSEL, CSINC, CSINV and CSNEG are "constrained unpredictable" when SP is used as the source register Rn. The assembler should diagnose this case. Reviewers: momchil.velikov, dmgreen, ostannard, simon_tatham, t.p.northover Reviewed By: ostannard Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65505 llvm-svn: 367433	2019-07-31 14:22:45 +00:00
Florian Hahn	fa42f42858	[IPSCCP] Move callsite check to the beginning of the loop. We have some code that marks instructions with struct operands as overdefined, but if the instruction is a call to a function with tracked arguments, this breaks the assumption that the lattice values of all call sites are not overdefined and will be replaced by a constant. This also re-adds the assertion from D65222, with additionally skipping non-callsite uses. This patch should address the cases reported in which the assertion fired. Fixes PR42738. Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D65439 llvm-svn: 367430	2019-07-31 12:57:04 +00:00
Simon Pilgrim	24ad2b5e7d	[X86][AVX] Ensure chained subvector insertions are the same size (PR42833) Before combining insert_subvector(insert_subvector(vec, sub0, c0), sub1, c1) patterns, ensure that the subvectors are all the same type. On AVX512 targets especially we might have a mixture of 128/256 subvector insertions. llvm-svn: 367429	2019-07-31 12:55:39 +00:00
Momchil Velikov	a36d31478c	[AArch64] Add support for Transactional Memory Extension (TME) Re-commit r366322 after some fixes TME is a future architecture technology, documented in https://developer.arm.com/architectures/cpu-architecture/a-profile/exploration-tools https://developer.arm.com/docs/ddi0601/a More about the future architectures: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/new-technologies-for-the-arm-a-profile-architecture This patch adds support for the TME instructions TSTART, TTEST, TCOMMIT, and TCANCEL and the target feature/arch extension "tme". It also implements TME builtin functions, defined in ACLE Q2 2019 (https://developer.arm.com/docs/101028/latest) Differential Revision: https://reviews.llvm.org/D64416 Patch by Javed Absar and Momchil Velikov llvm-svn: 367428	2019-07-31 12:52:17 +00:00
Roman Lebedev	5e4e6b1fb1	[DivRemPairs] Fixup DNDEBUG build - variable is only used in assertion llvm-svn: 367423	2019-07-31 12:26:37 +00:00
Roman Lebedev	a686c60c45	[DivRemPairs] Recommit: Handling for expanded-form rem - recomposition (PR42673) Summary: While `-div-rem-pairs` pass can decompose rem in div+rem pair when div-rem pair is unsupported by target, nothing performs the opposite fold. We can't do that in InstCombine or DAGCombine since neither of those has access to TTI. So it makes most sense to teach `-div-rem-pairs` about it. If we matched rem in expanded form, we know we will be able to place div-rem pair next to each other so we won't regress the situation. Also, we shouldn't decompose rem if we matched already-decomposed form. This is surprisingly straight-forward otherwise. The original patch was committed in rL367288 but was reverted in rL367289 because it exposed pre-existing RAUW issues in internal data structures of the pass; those now have been addressed in a previous patch. https://bugs.llvm.org/show_bug.cgi?id=42673 Reviewers: spatel, RKSimon, efriedma, ZaMaZaN4iK, bogner Reviewed By: bogner Subscribers: bogner, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65298 llvm-svn: 367419	2019-07-31 12:06:51 +00:00
Roman Lebedev	5f616901f5	[DivRemPairs] Avoid RAUW pitfalls (PR42823) Summary: `DivRemPairs` internally creates two maps: * {sign, dividend, divisor} -> div instruction * {sign, dividend, divisor} -> rem instruction Then it iterates over rem map, and looks if there is an entry in div map with the same key. Then depending on some internal logic it may RAUW rem instruction with something else. But if that rem instruction is an input to other div/rem, then it was used as a key in these maps, so the old value (used in key) is now dangling, because RAUW didn't update those maps. And we can't even RAUW map keys in general, there's `ValueMap`, but we don't have a single `Value` as key... The bug was discovered via D65298, and the test there exists. Now, i'm not sure how to expose this issue in trunk. The bug is clearly there if i change the map keys to be `AssertingVH`/`PoisoningVH`, but i guess this didn't miscompile anything thus far? I really don't think this is benign without that patch. The fix is actually rather straight-forward - instead of trying to somehow shoe-horn `ValueMap` here (doesn't fit, key isn't just `Value`), or writing a new `ValueMap` with key being a struct of `Value`s, we can just have an intermediate data structure - a vector, each entry containing matching `Div, Rem` pair, and pre-filling it before doing any modifications. This way we won't need to query map after doing RAUW, so no bug is possible. Reviewers: spatel, bogner, RKSimon, craig.topper Reviewed By: spatel Subscribers: hiraditya, hans, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65451 llvm-svn: 367417	2019-07-31 12:06:38 +00:00
Oliver Cruickshank	09a1b8172b	[ARM] Generate MVE VFMAs llvm-svn: 367408	2019-07-31 10:44:11 +00:00
Oliver Cruickshank	e7241e8592	[NFC] Test Commit llvm-svn: 367405	2019-07-31 10:08:09 +00:00

126735 Commits