llvm-project

Commit Graph

Author	SHA1	Message	Date
Sanjay Patel	3950095edf	[InstCombine] add tests to show disabling of libcall/intrinsic shrinking; NFC llvm-svn: 339467	2018-08-10 20:12:36 +00:00
Zachary Turner	909b819cf9	Resubmit r339450 - [MS Demangler] Add conversion operator tests This was broken because of a malformed check line. Incidentally, this exposed a case where we crash when we should just be returning an error, so we should fix that. The demangler shouldn't crash due to user input. llvm-svn: 339466	2018-08-10 20:08:46 +00:00
Zachary Turner	073620bc3b	[MS Demangler] Demangle cv qualifiers on template args. Before we wouldn't properly demangle something like Foo<const int>. Template args have a special escape sequence '$$C' that is optional, but if it is present contains qualifiers. So we need to check for this and only if it is present, demangle qualifiers before demangling the type. With this fix, we re-enable some tests that were previously marked FIXME. llvm-svn: 339465	2018-08-10 19:57:36 +00:00
Matt Arsenault	940e6075e4	AMDGPU: More canonicalized operations llvm-svn: 339464	2018-08-10 19:20:17 +00:00
Sanjay Patel	8988b8d92c	revert r339450 - [MS Demangler] Add conversion operator tests Something here causes an assertion failure that killed a bunch of bots. Example: http://lab.llvm.org:8011/builders/reverse-iteration/builds/7021/steps/check_all/logs/stdio llvm-svn: 339463	2018-08-10 19:20:16 +00:00
Matt Arsenault	3dcf4ce435	AMDGPU: Combine and of seto/setuo and fp_class Clear the nan (or non-nan) test bits from the mask. llvm-svn: 339462	2018-08-10 18:58:56 +00:00
Matt Arsenault	d35f46caf1	AMDGPU: Turn class x, p_zero|n_zero into fcmp oeq x, 0 The library does use this for some reason. llvm-svn: 339461	2018-08-10 18:58:49 +00:00
Matt Arsenault	8ad00d30fa	AMDGPU: Match isfinite pattern to class instructions llvm-svn: 339460	2018-08-10 18:58:41 +00:00
Sanjay Patel	12a2911f62	[InstCombine] add/update tests for selectBinOpIdentity; NFC This includes a test that would have exposed the bug in rL339439 which was reverted at rL339446. The compare can be integer while the binop is FP or vice-versa, so we need to use the binop type when we ask for the identity constant. llvm-svn: 339453	2018-08-10 17:20:24 +00:00
Zachary Turner	d664117794	[MS Demangler] Add conversion operator tests. The mangled names were added in the original commit, but the demangled equivalents weren't, so nothing was actually being checked. llvm-svn: 339450	2018-08-10 16:55:59 +00:00
Evgeniy Stepanov	453e7ac785	[hwasan] Add -hwasan-with-ifunc flag. Summary: Similar to asan's flag, it can be used to disable the use of ifunc to access hwasan shadow address. Reviewers: vitalybuka, kcc Subscribers: srhines, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D50544 llvm-svn: 339447	2018-08-10 16:21:37 +00:00
David Bolvansky	5099835541	[InstCombine][NFC] Added tests for select with binop fold llvm-svn: 339441	2018-08-10 15:29:09 +00:00
Zachary Turner	a17721cf5d	[MS Demangler] Properly demangle conversion operators. These were completely broken before. We need to handle the 'B' operator tag. llvm-svn: 339436	2018-08-10 15:04:56 +00:00
Zachary Turner	e89f2fa657	[MS Demangler] Disable a couple of tests. The check lines are marked FIXME but not the mangled names. This is causing an error. llvm-svn: 339435	2018-08-10 14:53:33 +00:00
Zachary Turner	dbefc6cd4e	[MS Demangler] Fix several issues related to templates. These were uncovered when porting the mangling tests in ms-templates.cpp from clang/CodeGenCXX over to demangling tests. The main issues fixed here are surrounding integer literal signed and unsignedness, empty array dimensions, and pointer and reference non-type template parameters. Differential Revision: https://reviews.llvm.org/D50512 llvm-svn: 339434	2018-08-10 14:31:04 +00:00
Sam Parker	8c4b964c5a	[ARM] Disallow zexts in ARMCodeGenPrepare Enabling ARMCodeGenPrepare by default caused a whole load of failures. This is due to zexts and truncs not being handled properly. ZExts are messy so it's just easier to disable for now and truncs are allowed only as 'sinks'. I still need to figure out why allowing them as 'sources' causes so many failures. The other main changes are that we are explicit in the types that we are converting to, it's now always 'TypeSize'. Type support is also now performed while checking for valid opcodes, as it unnecessarily complicated things to have the checks at different stages. I've moved the tests around too, so we have the zext and truncs in their own file as well as the overflowing opcode tests. Differential Revision: https://reviews.llvm.org/D50518 llvm-svn: 339432	2018-08-10 13:57:13 +00:00
Hans Wennborg	d4090be340	Rename the cfguard module flag to cfguardtable The previous name sounds like it inserts cfguard implementation, but it really just emits the table of address-taken functions. Change the name to better reflect that. Clang will be updated in the next commit. llvm-svn: 339419	2018-08-10 09:48:53 +00:00
Max Kazantsev	4e9def57c7	[NFC] Add tests that demonstrate that MustExecute is fundamentally broken llvm-svn: 339417	2018-08-10 09:20:46 +00:00
Alexander Potapenko	75a954330b	[MSan] Shrink the register save area for non-SSE builds If code is compiled for X86 without SSE support, the register save area doesn't contain FPU registers, so `AMD64FpEndOffset` should be equal to `AMD64GpEndOffset`. llvm-svn: 339414	2018-08-10 08:06:43 +00:00
George Burgess IV	ff08c80efc	[MemorySSA] "Fix" lifetime intrinsic handling MemorySSA currently creates MemoryAccesses for lifetime intrinsics, and sometimes treats them as clobbers. This may/may not be the best way forward, but while we're doing it, we should consider MayAlias/PartialAlias to be clobbers. The ideal fix here is probably to remove all of this reasoning about lifetimes from MemorySSA + put it into the passes that need to care. But that's a wayyy broader fix that needs some consensus, and we have miscompiles + a release branch today, and this should solve the miscompiles just as well. differential revision is D43269. Landing without an explicit LGTM (and without using the special please-autoclose-this syntax) so we can still use that revision as a place to decide what the right fix here is. llvm-svn: 339411	2018-08-10 05:14:43 +00:00
David Bolvansky	909889b2cb	[InstCombine] Transform str(n)cmp to memcmp Summary: Motivation examples: int strcmp_memcmp() { char buf[12]; return strcmp(buf, "key") == 0; } int strcmp_memcmp2() { char buf[12]; return strcmp(buf, "key") != 0; } int strncmp_memcmp() { char buf[12]; return strncmp(buf, "key", 3) == 0; } can be turned to memcmp. See test file for more cases. Reviewers: efriedma Reviewed By: efriedma Subscribers: spatel, llvm-commits Differential Revision: https://reviews.llvm.org/D50233 llvm-svn: 339410	2018-08-10 04:32:54 +00:00
Heejin Ahn	5831e9cc79	[WebAssembly] Gate i64x2 and f64x2 on -wasm-enable-unimplemented Summary: i64x2 and f64x2 operations are not implemented in V8, so we normally do not want to emit them. However, they are in the SIMD spec proposal, so we still want to be able to test them in the toolchain. This patch adds a flag to enable their emission. Reviewers: aheejin, dschuff Subscribers: sunfish, jgravelle-google, sbc100, llvm-commits Differential Revision: https://reviews.llvm.org/D50423 Patch by Thomas Lively (tlively) llvm-svn: 339407	2018-08-09 23:58:51 +00:00
Craig Topper	9a8136f7b4	[X86] Qualify one of the heuristics in combineMul to only apply to positive multiply amounts. This seems to slightly help the performance of one of our internal benchmarks. We probably need better heuristics here. llvm-svn: 339406	2018-08-09 23:27:42 +00:00
Jordan Rupprecht	88ed5e59bd	[llvm-objcopy] NFC: Add some color to error() llvm-svn: 339404	2018-08-09 22:52:03 +00:00
Matt Arsenault	d54b7f0592	ValueTracking: Start enhancing isKnownNeverNaN llvm-svn: 339399	2018-08-09 22:40:08 +00:00
Sanjay Patel	c6944f795d	[InstSimplify] move minnum/maxnum with Inf folds from instcombine llvm-svn: 339396	2018-08-09 22:20:44 +00:00
Ana Pazos	10de234905	[RISC-V] Fixed alias for addi x2, x2, 0 A missing check for non-zero immediate in MCOperandPredicate caused c.addi16sp sp, 0 to be selected which is not a valid instruction. llvm-svn: 339381	2018-08-09 20:51:53 +00:00
Philip Reames	ca256d93fb	[LICM] hoist fences out of loops w/o memory operations The motivating case is an otherwise dead loop with a fence in it. At the moment, this goes all the way through the optimizer and we end up emitting an entirely pointless loop on x86. This case may seem a bit contrived, but we've seen it in real code as the result of otherwise reasonable lowering strategies combined w/thread local memory optimizations (such as escape analysis). To handle this simple case, we can teach LICM to hoist must execute fences when there is no other memory operation within the loop. Differential Revision: https://reviews.llvm.org/D50489 llvm-svn: 339378	2018-08-09 20:18:42 +00:00
Sanjay Patel	55accd7dd3	[InstCombine] allow fsub+fmul FMF folds for vectors llvm-svn: 339368	2018-08-09 18:42:12 +00:00
Krzysztof Parzyszek	75c2ca3638	[Hexagon] Map ISD::TRAP to J2_trap0(#0) llvm-svn: 339365	2018-08-09 18:03:45 +00:00
Alina Sbirlea	bf9fe79397	SCEV should forget all loops containing a deleted block. Summary: LoopSimplifyCFG should update ScEv for all loops after a block is deleted. If the deleted block "Succ" is part of L, then it is part of all parent loops, so forget topmost loop. Reviewers: greened, mkazantsev, sanjoy Subscribers: jlebar, javed.absar, uabelho, llvm-commits Differential Revision: https://reviews.llvm.org/D50422 llvm-svn: 339363	2018-08-09 17:53:26 +00:00
Paul Semel	7a3dc2c184	[llvm-objcopy] Add --prefix-symbols option Differential Revision: https://reviews.llvm.org/D50381 llvm-svn: 339362	2018-08-09 17:49:04 +00:00
Sanjay Patel	373790293e	[InstCombine] add vector tests for fsub+fmul; NFC llvm-svn: 339361	2018-08-09 17:40:27 +00:00
Reid Kleckner	80c6ec11d9	[GlobalOpt] Don't apply fastcc if it would break inalloca invariants The inalloca parameter has to be the only parameter passed in memory. Changing the convention to fastcc can break that. At some point we should teach global opt how to optimize ABI attributes like inalloca and maybe byval. These attributes are mainly used to match C ABIs. They are harder for LLVM to optimize and they don't always generate the best code. Fixes PR38487 llvm-svn: 339360	2018-08-09 17:29:26 +00:00
Sanjay Patel	15d1501aae	[SelectionDAG] try harder to convert funnel shift to rotate Similar to rL337966 - if the DAGCombiner's rotate matching was working as expected, I don't think we'd see any test diffs here. AArch only goes right, and PPC only goes left. x86 has both, so no diffs there. Differential Revision: https://reviews.llvm.org/D50091 llvm-svn: 339359	2018-08-09 17:26:22 +00:00
Paul Semel	a42dec7a1b	[llvm-objcopy] Add --dump-section Differential Revision: https://reviews.llvm.org/D49979 llvm-svn: 339358	2018-08-09 17:05:21 +00:00
Michael Berg	ca38254601	extend folding fsub/fadd to fneg for FMF Summary: This change provides a common optimization path for both Unsafe and FMF driven optimization for this fsub fold adding reassociation, as it is the flag that most closely represents the translation Reviewers: spatel, wristow, arsenm Reviewed By: spatel Subscribers: wdng Differential Revision: https://reviews.llvm.org/D50195 llvm-svn: 339357	2018-08-09 17:00:03 +00:00
Evandro Menezes	9a92fe0c9e	[ARM] Replace processor check with feature Add new feature, `FeatureUseWideStrideVFP`, that replaces the need for a processor check. Otherwise, NFC. llvm-svn: 339354	2018-08-09 16:13:24 +00:00
Sjoerd Meijer	806f70d229	[ARM] FP16: codegen support for VTRN Differential Revision: https://reviews.llvm.org/D50454 llvm-svn: 339340	2018-08-09 12:45:09 +00:00
Simon Pilgrim	511c3fc529	[X86][SSE] Remove PMULDQ/PMULUDQ by zero Exposed by D50328 Differential Revision: https://reviews.llvm.org/D50328 llvm-svn: 339337	2018-08-09 12:37:36 +00:00
Simon Pilgrim	01ae462fef	[X86][SSE] Combine (some) target shuffles with multiple uses As discussed on D41794, we have many cases where we fail to combine shuffles as the input operands have other uses. This patch permits these shuffles to be combined as long as they don't introduce additional variable shuffle masks, which should reduce instruction dependencies and allow the total number of shuffles to still drop without increasing the constant pool. However, this may mean that some memory folds may no longer occur, and on pre-AVX require the occasional extra register move. This also exposes some poor PMULDQ/PMULUDQ codegen which was doing unnecessary upper/lower calculations which will in fact fold to zero/undef - the fix will be added in a followup commit. Differential Revision: https://reviews.llvm.org/D50328 llvm-svn: 339335	2018-08-09 12:30:02 +00:00
Jonas Hahnfeld	20526bf483	[NVPTX] Select atomic loads and stores According to PTX ISA .volatile has the same memory synchronization semantics as .relaxed.sys, so it can be used to implement monotonic atomic loads and stores. This is important for OpenMP's atomic construct where - 'read's and 'write's are lowered to atomic loads and stores, and - an update of float or double types are lowered into a cmpxchg loop. (Note that PTX could do better because it has atom.add.f{32,64} but LLVM's atomicrmw instruction only allows integer types.) Higher levels of atomicity (like acquire and release) need additional synchronization properties which were added with PTX ISA 6.0 / sm_70. So using these instructions still results in an error. Differential Revision: https://reviews.llvm.org/D50391 llvm-svn: 339316	2018-08-09 07:45:49 +00:00
Roger Ferrer Ibanez	577a97e2b9	[RISCV] Add "lla" pseudo-instruction to assembler This pseudo-instruction is similar to la but uses PC-relative addressing unconditionally. That is, la only differs from lla when using -fPIC. This pseudo-instruction seems often forgotten in several specs but it is definitely mentioned in binutils opcodes/riscv-opc.c. The semantics are defined both on page 37 of the "RISC-V Reader" book and in the function macro found in gas/config/tc-riscv.c. This is a very first step towards adding PIC support for Linux in the RISC-V backend. The lla pseudo-instruction expands to a sequence of auipc + addi with a couple of pc-rel relocations where the second points to the first one. This is described in https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md#pc-relative-symbol-addresses For now, this patch only introduces support of that pseudo instruction at the assembler parser. Differential Revision: https://reviews.llvm.org/D49661 llvm-svn: 339314	2018-08-09 07:08:20 +00:00
Philip Reames	954eab1087	[LICM] Add tests for future hoisting of fence instructions [NFC] The main interesting case is a fence in an otherwise dead loop or one containing only arithmetic. This can happen as a result of DSE or other transforms from seemingly reasonable initial IR. llvm-svn: 339310	2018-08-09 04:21:02 +00:00
Petr Hosek	eb46c95c3e	[CMake] Use normalized Windows target triples Changes the default Windows target triple returned by GetHostTriple.cmake from the old environment names (which we wanted to move away from) to newer, normalized ones. This also requires updating all tests to use the new systems names in constraints. Differential Revision: https://reviews.llvm.org/D47381 llvm-svn: 339307	2018-08-09 02:16:18 +00:00
Paul Robinson	508b081514	[DWARF] Verifier now handles .debug_types sections. Differential Revision: https://reviews.llvm.org/D50466 llvm-svn: 339302	2018-08-08 23:50:22 +00:00
Sanjay Patel	f9a80fe87a	[x86] add test for commuted variant for fsub fold; NFC llvm-svn: 339300	2018-08-08 23:06:59 +00:00
Sanjay Patel	e47dc1a405	[DAGCombiner] loosen constraints for fsub+fadd fold isNegatibleForFree() should not matter here (as the test diffs show) because it's always a win to replace an fsub+fadd with fneg. The problem in D50195 persists because either (1) we are doing these folds in the wrong order or (2) we're missing another fold for fadd. llvm-svn: 339299	2018-08-08 23:04:43 +00:00
Petr Hosek	7b27454477	[ADT] Normalize empty triple components LLVM triple normalization is handling "unknown" and empty components differently; for example given "x86_64-unknown-linux-gnu" and "x86_64-linux-gnu" which should be equivalent, triple normalization returns "x86_64-unknown-linux-gnu" and "x86_64--linux-gnu". autoconf's config.sub returns "x86_64-unknown-linux-gnu" for both "x86_64-linux-gnu" and "x86_64-unknown-linux-gnu". This changes the triple normalization to behave the same way, replacing empty triple components with "unknown". This addresses PR37129. Differential Revision: https://reviews.llvm.org/D50219 llvm-svn: 339294	2018-08-08 22:23:57 +00:00
Sanjay Patel	f8937c8406	[x86] add tests for fsub+fadd with FMF; NFC These are related to the block of code under review in D50195. llvm-svn: 339293	2018-08-08 22:18:16 +00:00
Jonas Devlieghere	49ff4d9041	[DWARF] Unclamp line table version on Darwin for v5 and later. On Darwin we pin the DWARF line tables to version 2. Stop doing so for DWARF v5 and later. Differential revision: https://reviews.llvm.org/D49381 llvm-svn: 339288	2018-08-08 21:16:50 +00:00
Eli Friedman	5b45a39056	[ARM] Avoid spilling lr with Thumb1 tail calls. Normally, if any registers are spilled, we prefer to spill lr on Thumb1 so we can fold the "bx lr" into the "pop". However, if there are tail calls involved, restoring lr is expensive, so skip the optimization in that case. The spill of r7 in the new test also isn't necessary, but that's mostly orthogonal to this patch. (It's the same code in ARMFrameLowering, but it's not related to tail calls.) Differential Revision: https://reviews.llvm.org/D49459 llvm-svn: 339283	2018-08-08 20:03:10 +00:00
Ties Stuij	0244aa67d6	revert tests of '[CodeGen] emit inline asm clobber list warnings for reserved' llvm-svn: 339276	2018-08-08 17:19:32 +00:00
Zachary Turner	d346cba91b	[MS Demangler] Create a new backref context for template instantiations. Template manglings use a fresh back-referencing context, so we need to do the same. This fixes several existing tests which are marked as FIXME, so those are now actually run. llvm-svn: 339275	2018-08-08 17:17:04 +00:00
Krzysztof Parzyszek	1df7059150	[Hexagon] Diagnose misaligned absolute loads and stores Differential Revision: https://reviews.llvm.org/D50405 llvm-svn: 339272	2018-08-08 17:00:09 +00:00
Matt Arsenault	935f3b70fe	AMDGPU: Error more gracefully on libcalls I think this is the only situation where the callsite will have a null instruction. llvm-svn: 339271	2018-08-08 16:58:39 +00:00
Matt Arsenault	e719139b10	AMDGPU: Fix shifts for i128 llvm-svn: 339270	2018-08-08 16:58:33 +00:00
Jonas Devlieghere	8511777d3a	[WASM] Fix overflow when reading custom section When reading a custom WASM section, it was possible that its name extended beyond the size of the section. This resulted in a bogus value for the section size due to the size overflowing. Fixes heap buffer overflow detected by OSS-fuzz: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=8190 Differential revision: https://reviews.llvm.org/D50387 llvm-svn: 339269	2018-08-08 16:34:03 +00:00
Jonas Devlieghere	caacedb03e	[DebugInfo] Fine tune emitting flags as part of the producer When using APPLE extensions, don't duplicate the compiler invocation's flags both in AT_producer and AT_APPLE_flags. Differential revision: https://reviews.llvm.org/D50453 llvm-svn: 339268	2018-08-08 16:33:22 +00:00
Sanjay Patel	fe839695a8	[InstCombine] fold fadd+fsub with common operand This is a sibling to the simplify from: https://reviews.llvm.org/rL339174 llvm-svn: 339267	2018-08-08 16:19:22 +00:00
Sanjay Patel	2054dd79c2	[InstCombine] fold fsub+fsub with common operand This is a sibling to the simplify from: rL339171 llvm-svn: 339266	2018-08-08 16:04:48 +00:00
Sanjay Patel	abd4767a0d	[InstCombine] add tests for fsub folds; NFC The scalar cases are handled in instcombine's internal reassociation pass for FP ops, but it misses the vector types. These patterns are similar to what was handled in InstSimplify in: https://reviews.llvm.org/rL339171 https://reviews.llvm.org/rL339174 https://reviews.llvm.org/rL339176 ...but we can't use instsimplify on these because we require negation of the original operand. llvm-svn: 339263	2018-08-08 15:44:56 +00:00
Zaara Syeda	b2595b988b	[PowerPC] Improve codegen for vector loads using scalar_to_vector This patch aims to improve the codegen for vector loads involving the scalar_to_vector (load X) sequence. Initially, ld->mv instructions were used for scalar_to_vector (load X), so this patch allows scalar_to_vector (load X) to utilize: LXSD and LXSDX for i64 and f64 LXSIWAX for i32 (sign extension to i64) LXSIWZX for i32 and f64 Committing on behalf of Amy Kwan. Differential Revision: https://reviews.llvm.org/D48950 llvm-svn: 339260	2018-08-08 15:20:43 +00:00
Ties Stuij	52f3631f4b	[CodeGen] emit inline asm clobber list warnings for reserved Summary: Currently, in line with GCC, when specifying reserved registers like sp or pc on an inline asm() clobber list, we don't always preserve the original value across the statement. And in general, overwriting reserved registers can have surprising results. For example: ``` extern int bar(int[]); int foo(int i) { int a[i]; // VLA asm volatile( "mov r7, #1" : : : "r7" ); return 1 + bar(a); } ``` Compiled for thumb, this gives: ``` $ clang --target=arm-arm-none-eabi -march=armv7a -c test.c -o - -S -O1 -mthumb ... foo: .fnstart @ %bb.0: @ %entry .save {r4, r5, r6, r7, lr} push {r4, r5, r6, r7, lr} .setfp r7, sp, #12 add r7, sp, #12 .pad #4 sub sp, #4 movs r1, #7 add.w r0, r1, r0, lsl #2 bic r0, r0, #7 sub.w r0, sp, r0 mov sp, r0 @APP mov.w r7, #1 @NO_APP bl bar adds r0, #1 sub.w r4, r7, #12 mov sp, r4 pop {r4, r5, r6, r7, pc} ... ``` r7 is used as the frame pointer for thumb targets, and this function needs to restore the SP from the FP because of the variable-length stack allocation a. r7 is clobbered by the inline assembly (and r7 is included in the clobber list), but LLVM does not preserve the value of the frame pointer across the assembly block. This type of behavior is similar to GCC's and has been discussed on the bugtracker: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11807 . No consensus seemed to have been reached on the way forward. Clang behavior has briefly been discussed on the CFE mailing (starting here: http://lists.llvm.org/pipermail/cfe-dev/2018-July/058392.html). I've opted for following Eli Friedman's advice to print warnings when there are reserved registers on the clobber list so as not to diverge from GCC behavior for now. The patch uses MachineRegisterInfo's target-specific knowledge of reserved registers, just before we convert the inline asm string in the AsmPrinter. If we find a reserved register, we print a warning: ``` repro.c:6:7: warning: inline asm clobber list contains reserved registers: R7 [-Winline-asm] "mov r7, #1" ^ ``` Reviewers: eli.friedman, olista01, javed.absar, efriedma Reviewed By: efriedma Subscribers: efriedma, eraman, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D49727 llvm-svn: 339257	2018-08-08 15:15:59 +00:00
Alex Bradbury	07224dfb47	[RISCV] Add mnemonic alias: move, sbreak and scall. Further improve compatibility with the GNU assembler. Differential Revision: https://reviews.llvm.org/D50217 Patch by Kito Cheng. llvm-svn: 339255	2018-08-08 14:53:45 +00:00
Simon Pilgrim	164e8b0b5c	[TargetLowering] BuildUDIV - Add support for divide by one (PR38477) Provide a pass-through of the numerator for divide by one cases - this is the same approach we take in DAGCombiner::visitSDIVLike. I investigated whether we could achieve this by magic MULHU/SRL values but nothing appeared to work as we don't have a way for MULHU(x,c) -> x llvm-svn: 339254	2018-08-08 14:51:19 +00:00
Alex Bradbury	7d8d87c143	[RISCV] Add InstAlias definitions for add[w], and, xor, or, sll[w], srl[w], sra[w], slt and sltu with immediate Match the GNU assembler in supporting immediate operands for these instructions even when the reg-reg mnemonic is used. Differential Revision: https://reviews.llvm.org/D50046 Patch by Kito Cheng. llvm-svn: 339252	2018-08-08 14:45:44 +00:00
Sjoerd Meijer	1919ecfd0b	[ARM][NFC] Replaced tab-characters in test file vtrn.ll llvm-svn: 339251	2018-08-08 14:42:11 +00:00
Sanjay Patel	a194b2d2ff	[InstCombine] fold fneg into constant operand of fmul/fdiv This accounts for the missing IR fold noted in D50195. We don't need any fast-math to enable the negation transform. FP negation can always be folded into an fmul/fdiv constant to eliminate the fneg. I've limited this to one-use to ensure that we are eliminating an instruction rather than replacing fneg by a potentially expensive fdiv or fmul. Differential Revision: https://reviews.llvm.org/D50417 llvm-svn: 339248	2018-08-08 14:29:08 +00:00
Simon Pilgrim	9f5b8f093e	[X86][SSE] PR38477 test is more cleanly tested with udiv instead of urem Making the test use urem relies on it calling udiv-like combines, but the real issue is with the udiv so we're better off using that directly. llvm-svn: 339247	2018-08-08 14:11:44 +00:00
Roman Lebedev	a677651a5a	[InstCombine] De Morgan: sink 'not' into 'xor' (PR38446) Summary: https://rise4fun.com/Alive/IT3 Comes up in the [most ugliest] `signed int` -> `signed char` case of `-fsanitize=implicit-conversion` (https://reviews.llvm.org/D50250) Previously, we were stuck with `not`: {F6867736} But now we are able to completely get rid of it: {F6867737} (FIXME: why are we losing the metadata? that seems wrong/strange.) Here, we only want to do that if we will be able to completely get rid of that 'not'. Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: vsk, erichkeane, llvm-commits Differential Revision: https://reviews.llvm.org/D50301 llvm-svn: 339243	2018-08-08 13:31:19 +00:00
Sjoerd Meijer	f8c394f0f5	[ARM] FP16: codegen support for VEXT Differential Revision: https://reviews.llvm.org/D50427 llvm-svn: 339241	2018-08-08 13:26:38 +00:00
Sjoerd Meijer	db5908deb9	[ARM] FP16: vector vmov and vdup support This adds codegen support for the vmov_n_f16 and vdup_n_f16 variants. Differential Revision: https://reviews.llvm.org/D50329 llvm-svn: 339238	2018-08-08 13:11:31 +00:00
Sjoerd Meijer	920a453485	[ARM] FP16: vector VMUL variants This adds codegen support for the vmul_lane_f16 and vmul_n_f16 variants. Differential Revision: https://reviews.llvm.org/D50326 llvm-svn: 339232	2018-08-08 10:27:34 +00:00
Simon Pilgrim	5477f11ba3	[X86][SSE] Add divide-by-one exact sdiv vector test Based on PR38477, we need to ensure we're testing for divide-by-one in non-uniform vectors llvm-svn: 339231	2018-08-08 10:16:43 +00:00
Simon Pilgrim	a10cfcc1db	[TargetLowering] BuildUDIV - Early out for divide by one (PR38477) We're not handling the UDIV by one special case properly - for now just early out. llvm-svn: 339229	2018-08-08 10:00:54 +00:00
Sjoerd Meijer	b33a4c02cc	[ARM] FP16: support vector INT_TO_FP and FP_TO_INT This adds codegen support for the different vcvt_f16 variants. Differential Revision: https://reviews.llvm.org/D50393 llvm-svn: 339227	2018-08-08 09:45:34 +00:00
Thomas Preud'homme	4107b31df2	Support inline asm with multiple 64bit output in 32bit GPR Summary: Extend fix for PR34170 to support inline assembly with multiple output operands that do not naturally go in the register class it is constrained to (eg. double in a 32-bit GPR as in the PR). Reviewers: bogner, t.p.northover, lattner, javed.absar, efriedma Reviewed By: efriedma Subscribers: efriedma, tra, eraman, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D45437 llvm-svn: 339225	2018-08-08 09:35:26 +00:00
Roman Lebedev	c6a00f545c	[NFC][InstCombine] Cleanup demorgan-sink-not-into-xor.ll test We are only going to do it if it is free to do. llvm-svn: 339223	2018-08-08 08:46:07 +00:00
Sjoerd Meijer	b264944ed5	[ARM] FP16: support the vector vmin and vmax variants Differential Revision: https://reviews.llvm.org/D50238 llvm-svn: 339221	2018-08-08 07:20:15 +00:00
Max Kazantsev	c9dca6df78	[NFC] Add some tests on mustexec llvm-svn: 339219	2018-08-08 04:40:47 +00:00
Zachary Turner	58d29cf590	[MS Demangler] Properly handle backreferencing of special names. Function template names are not stored in the backref table, but non-template function names are. The general pattern seems to be that when you are demangling a symbol name, if the name starts with '?' it does not go into the backreference table, otherwise it does. Note that this even handles the general case of operator names (template or otherwise) not going into the back-reference table, anonymous namespaces not going into the backreference table, etc. It's important that we apply this check only for the unqualified portion of a name, and only for symbol names. For example, this does not apply to type names (such as class templates) and we need to make sure that these still do go into the backref table. Differential Revision: https://reviews.llvm.org/D50394 llvm-svn: 339211	2018-08-08 00:43:31 +00:00
Sanjay Patel	979423c996	[InstCombine] add tests for fneg fold including FMF; NFC llvm-svn: 339203	2018-08-07 23:24:25 +00:00
Sanjay Patel	bac052ef52	[InstCombine] fix FP constant in test; NFC Too many digits... llvm-svn: 339200	2018-08-07 23:03:29 +00:00
Michael Berg	2e60ad2e58	[NFC] adding tests for Y - (X + Y) --> -X llvm-svn: 339197	2018-08-07 22:52:57 +00:00
Sanjay Patel	25887da162	[InstCombine] add tests for fneg of fmul/fdiv with constant; NFC llvm-svn: 339195	2018-08-07 22:30:43 +00:00
Jan Vesely	7b2c98ab59	AMDGPU: Remove broken i16 ternary patterns Fixup test to check for GCN prefix These patterns always zero extend the result even though it might need sign extension. This has been broken since the addition of i16 support. It has popped up in mad_sat(char) test since min(max()) combination is turned into v_med3, resulting in the following (incorrect) sequence: v_mad_i16 v2, v10, v9, v11 v_med3_i32 v2, v2, v8, v7 Fixes mad_sat(char) piglit on VI. Differential Revision: https://reviews.llvm.org/D49836 llvm-svn: 339190	2018-08-07 21:54:37 +00:00
Derek Schuff	51ed131ed2	[WebAssembly] Update SIMD binary arithmetic Add missing SIMD types (v2f64) and binary ops. Also adds tablegen support for automatically prepending prefix byte to SIMD opcodes. Differential Revision: https://reviews.llvm.org/D50292 Patch by Thomas Lively llvm-svn: 339186	2018-08-07 21:24:01 +00:00
Krzysztof Parzyszek	e7ce247dd7	[Hexagon] Allow use of gather intrinsics even with no-packets Vgather must be in a packet with a store, which contradicts the no-packets feature. As a consequence, gather/scatter could not be used with no-packets. Relax this, and allow gather packets as exceptions to the no-packets requirements. llvm-svn: 339177	2018-08-07 20:33:47 +00:00
Sanjay Patel	9b07347033	[InstSimplify] fold fsub+fadd with common operand llvm-svn: 339176	2018-08-07 20:32:55 +00:00
Sanjay Patel	4364d604c2	[InstSimplify] fold fadd+fsub with common operand llvm-svn: 339174	2018-08-07 20:23:49 +00:00
Heejin Ahn	7fb68d2679	[WebAssembly] CFG sort support for exception handling Summary: This patch extends CFGSort pass to support exception handling. Once it places a loop header, it does not place blocks that are not dominated by the loop header until all the loop blocks are sorted. This patch extends the same algorithm to exception 'catch' part, using the information calculated by WebAssemblyExceptionInfo class. Reviewers: dschuff, sunfish Subscribers: sbc100, jgravelle-google, llvm-commits Differential Revision: https://reviews.llvm.org/D46500 llvm-svn: 339172	2018-08-07 20:19:23 +00:00
Sanjay Patel	f7a8fb2dee	[InstSimplify] fold fsub+fsub with common operand llvm-svn: 339171	2018-08-07 20:14:27 +00:00
Sanjay Patel	50976393ed	[InstSimplify] add tests for fadd/fsub; NFC Instcombine gets some, but not all, of these cases via its internal reassociation transforms. It fails in all cases with vector types. llvm-svn: 339168	2018-08-07 19:49:13 +00:00
Alexey Bataev	0edcd0278d	[SLP] Fix insert point for reused extract instructions. Summary: Reworked the previously committed patch to insert shuffles for reused extract element instructions in the correct position. Previous logic was incorrect, and might lead to the crash with PHIs and EH instructions. Reviewers: efriedma, javed.absar Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50143 llvm-svn: 339166	2018-08-07 19:21:05 +00:00
Wei Mi	b1ef2cc53d	[SampleFDO] Fix a bug in getOrCompHotCountThreshold/getOrCompColdCountThreshold getOrCompHotCountThreshold/getOrCompColdCountThreshold introduced in https://reviews.llvm.org/D45377 contain a bad mistake and will only return 1 or 0 instead of the true hot/cold cutoff value. The patch fixes the mistake. However, the mistake does not seem to cause a big performance difference according to internal server benchmark testing. Differential Revision: https://reviews.llvm.org/D50370 llvm-svn: 339162	2018-08-07 18:13:10 +00:00
Philip Reames	c792e197b4	[LICM] Strengthen assume hoisting tests [NFC] As requested in review of https://reviews.llvm.org/D50364 llvm-svn: 339159	2018-08-07 17:54:36 +00:00
Craig Topper	49ed49fcb1	[SelectionDAG] When splitting scatter nodes during DAGCombine, create a serial chain dependency. Scatter could have multiple identical indices. We need to maintain sequential order. We get this right in LegalizeVectorTypes, but not in this code. Differential Revision: https://reviews.llvm.org/D50374 llvm-svn: 339157	2018-08-07 17:35:02 +00:00
Florian Hahn	950576bdf8	[GVN,NewGVN] Keep nonnull if K does not move. In combineMetadata, we should be able to preserve K's nonnull metadata, if K does not move. This condition should hold for all replacements by NewGVN/GVN, but I added a bunch of assertions to verify that. Fixes PR35038. There probably are additional kinds of metadata that could be preserved using similar reasoning. This is follow-up work. Reviewers: dberlin, davide, efriedma, nlopes Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D47339 llvm-svn: 339149	2018-08-07 15:36:11 +00:00
Sjoerd Meijer	b39cd886b9	[ARM] FP16: codegen support for VACGT Differential Revision: https://reviews.llvm.org/D50236 llvm-svn: 339148	2018-08-07 15:11:47 +00:00
Andrew V. Tischenko	1fe3375620	[X86] MCA tests for XCHG, XADD and CMPXCHG* instructions Differential Revision: https://reviews.llvm.org/D49912 llvm-svn: 339145	2018-08-07 14:36:43 +00:00
Sanjay Patel	948ff87d7d	[InstSimplify] move minnum/maxnum with common op fold from instcombine llvm-svn: 339144	2018-08-07 14:36:27 +00:00
Sanjay Patel	b06d283909	[InstSimplify] add tests for minnum/maxnum with shared op; NFC llvm-svn: 339142	2018-08-07 14:13:40 +00:00
Sanjay Patel	b802d18df7	[InstSimplify] move misplaced minnum/maxnum tests; NFC llvm-svn: 339141	2018-08-07 14:12:08 +00:00
Jonas Devlieghere	42243df3b9	Fix inconsistency with/without debug information (-g) This fixes an inconsistency in code generation when compiling with or without debug information (-g). When debug information is available in an empty block, the original test would fail, resulting in possibly different code. Patch by: Jeroen Dobbelaere Differential revision: https://reviews.llvm.org/D49467 llvm-svn: 339129	2018-08-07 12:14:01 +00:00
Aleksandar Beserminji	949a17c016	[mips] Handle branch expansion corner cases When potential jump instruction and target are in the same segment, use jump instruction with immediate field. In cases where offset does not fit immediate value of a bc/j instructions, offset is stored into register, and then jump register instruction is used. Differential Revision: https://reviews.llvm.org/D48019 llvm-svn: 339126	2018-08-07 10:45:45 +00:00
Pavel Labath	2f0881160c	[DebugInfo] Reduce debug_str_offsets section size Summary: The accelerator tables use the debug_str section to store their strings. However, they do not support the indirect method of access that is available for the debug_info section (DW_FORM_strx et al.). Currently our code is assuming that all strings can/will be referenced indirectly, and puts all of them into the debug_str_offsets section. This is generally true for regular (unsplit) dwarf, but in the DWO case, most of the strings in the debug_str section will only be used from the accelerator tables. Therefore the contents of the debug_str_offsets section will be largely unused and bloating the main executable. This patch rectifies this by teaching the DwarfStringPool to differentiate between strings accessed directly and indirectly. When a user inserts a string into the pool it has to declare whether that string will be referenced directly or not. If at least one user requests indirect access, that string will be assigned an index ID and put into debug_str_offsets table. Otherwise, the offset table is skipped. This approach reduces the overall binary size (when compiled with -gdwarf-5 -gsplit-dwarf) in my tests by about 2% (debug_str_offsets is shrunk by 99%). Reviewers: probinson, dblaikie, JDevlieghere Subscribers: aprantl, mgrang, llvm-commits Differential Revision: https://reviews.llvm.org/D49493 llvm-svn: 339122	2018-08-07 09:54:52 +00:00
Simon Pilgrim	7e18938793	[TargetLowering] Add support for non-uniform vectors to BuildUDIV This patch refactors the existing TargetLowering::BuildUDIV base implementation to support non-uniform constant vector denominators. It also includes a fold for MULHU by pow2 constants to SRL which can now more readily occur from BuildUDIV. Differential Revision: https://reviews.llvm.org/D49248 llvm-svn: 339121	2018-08-07 09:51:34 +00:00
Simon Pilgrim	974a5a7d94	[X86][SSE] Add more non-uniform exact sdiv vector tests covering all/none ashr paths llvm-svn: 339120	2018-08-07 09:31:22 +00:00
George Rimar	65a6828b17	[yaml2obj] - Add a support for changing EntSize. I was trying to add a test case for LLD and found that it is impossible to set sh_entsize via yaml. The patch implements the missing part. Differential revision: https://reviews.llvm.org/D50235 llvm-svn: 339113	2018-08-07 08:11:38 +00:00
Sjoerd Meijer	a2ddddfd3e	[ARM][NFC] Replaced tab characters in test file vfcmp.ll. llvm-svn: 339111	2018-08-07 08:05:15 +00:00
Heejin Ahn	e8653bb89a	[WebAssembly] Enable atomic expansion for unsupported atomicrmws Summary: Wasm does not have direct counterparts to some of LLVM IR's atomicrmw instructions (min, max, umin, umax, and nand). This enables atomic expansion using cmpxchg instruction within a loop for those atomicrmw instructions. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D49440 llvm-svn: 339084	2018-08-07 00:22:22 +00:00
Matt Arsenault	08f3fe4fae	AMDGPU: cvt_pk_rtz_f16 canonicalizes llvm-svn: 339078	2018-08-06 23:01:31 +00:00
Matt Arsenault	e94ee833f9	AMDGPU: Handle some vector operations in isCanonicalized llvm-svn: 339077	2018-08-06 22:45:51 +00:00
Stella Stamenova	cc2404c01d	[lit, python] Always add quotes around the python path in lit Summary: The issue with the python path is that the path to python on Windows can contain spaces. To make the tests always work, the path to python needs to be surrounded by quotes. This change updates several configuration files which specify the path to python as a substitution and also remove quotes from existing tests. Reviewers: asmith, zturner, alexshap, jakehehrlich Reviewed By: zturner, alexshap, jakehehrlich Subscribers: mehdi_amini, nemanjai, eraman, kbarton, jakehehrlich, steven_wu, dexonsmith, stella.stamenova, delcypher, llvm-commits Differential Revision: https://reviews.llvm.org/D50206 llvm-svn: 339073	2018-08-06 22:37:44 +00:00
Matt Arsenault	a29e76244a	AMDGPU: Push fcanonicalize through partially constant build_vector This usually avoids some re-packing code, and may help find canonical sources. llvm-svn: 339072	2018-08-06 22:30:44 +00:00
Peter Collingbourne	69dd7cd45e	MC: Redirect .addrsig directives referring to private (.L) symbols to the section symbol. This matches our behaviour for regular (i.e. relocated) references to private symbols and therefore avoids needing to unnecessarily write address-significant .L symbols to the object file's symbol table, which can interfere with stack traces. Fixes check-cfi after r339050. llvm-svn: 339066	2018-08-06 21:59:58 +00:00
Matt Arsenault	d49ab0b214	AMDGPU: Treat more custom operations as canonicalizing Everything should quiet, and I think everything should flush. I assume the min3/med3/max3 follow the same rules as regular min/max for flushing, which should at least be conservatively correct. There are still more operations that need to be handled. llvm-svn: 339065	2018-08-06 21:58:11 +00:00
Matt Arsenault	ce6d61fba8	AMDGPU: Conversions always produce canonical results Not sure why this was checking for denormals for f16. My interpretation of the IEEE standard is conversions should produce a canonical result, and the ISA manual says denormals are created when appropriate. llvm-svn: 339064	2018-08-06 21:51:52 +00:00
Philip Reames	94b29601ef	[LICM] Further strengthen tests for hoisting guards and invariant.starts [NFC] llvm-svn: 339062	2018-08-06 21:39:43 +00:00
Matt Arsenault	f8768bfc84	AMDGPU: Fix implementation of isCanonicalized If denormals are enabled, denormals are canonical. Also fix a few other issues. minnum/maxnum are supposed to canonicalize. Temporarily improve workaround for the instruction behavior change in gfx9. Handle selects and fcopysign. The tests were also largely broken, since they were checking for a flush used on some targets after the store of the result. llvm-svn: 339061	2018-08-06 21:38:27 +00:00
Philip Reames	9d7bb2f700	[LICM] Strengthen invariant.start hoisting tests [NFC] llvm-svn: 339057	2018-08-06 21:18:34 +00:00
Reid Kleckner	15e91c3235	[X86] Fix assertion in subreg extraction This assert fires when attempting to extract a subregister from the global PIC base register. This virtual register SD node is not in the VRBaseMap, so we shouldn't call getVR to look it up there. If this is a RegisterSDNode, we should be able to use the virtual register directly. Fixes PR38385 llvm-svn: 339056	2018-08-06 21:16:16 +00:00
Philip Reames	81c7dc93d2	[LICM] Add tests highlighting missing hoists for intrinsics [NFC] llvm-svn: 339054	2018-08-06 21:06:15 +00:00
Evandro Menezes	6e137cb9f0	[SLC] Fix shrinking of pow() Properly shrink `pow()` to `powf()` as a binary function and, when no other simplification applies, do not discard it. Differential revision: https://reviews.llvm.org/D50113 llvm-svn: 339046	2018-08-06 19:40:17 +00:00
Alexandre Ganea	741cc3531a	[llvm-pdbutil] Support PDBs without a DBI stream Differential Revision: https://reviews.llvm.org/D50258 llvm-svn: 339045	2018-08-06 19:35:00 +00:00
Easwaran Raman	10fd92dd94	[X86] Recognize a splat of negate in isFNEG Summary: Expand isFNEG so that we generate the appropriate F(N)M(ADD\|SUB) instructions in more cases. For example, the following sequence a = _mm256_broadcast_ss(f) d = _mm256_fnmadd_ps(a, b, c) generates an fsub and fma without this patch and an fnma with this change. Reviewers: craig.topper Subscribers: llvm-commits, davidxl, wmi Differential Revision: https://reviews.llvm.org/D48467 llvm-svn: 339043	2018-08-06 19:23:38 +00:00
Craig Topper	0076477a4c	[X86] When using "and $0" and "orl $-1" to store 0 and -1 for minsize, make sure the store isn't volatile If the store is volatile this might be a memory mapped IO access. In that case we shouldn't generate a load that didn't exist in the source Differential Revision: https://reviews.llvm.org/D50270 llvm-svn: 339041	2018-08-06 18:44:26 +00:00
Craig Topper	f8a8c746e3	[X86] Add test cases to show bad use of "and $0" and "orl $-1" for minsize when the store is volatile If the store is volatile we shouldn't be adding a load that didn't exist in the source. llvm-svn: 339040	2018-08-06 18:44:21 +00:00
Wei Mi	3c1c088500	[RegisterCoalescer] Delay live interval update work until the rematerialization for all the uses from the same def is done. We run into a compile time problem with flex generated code combined with `-fno-jump-tables`. The cause is that machineLICM hoists a lot of invariants outside of a big loop, and drastically increases the compile time in global register splitting and copy coalescing. https://reviews.llvm.org/D49353 relieves the problem in global splitting. This patch is to handle the problem in copy coalescing. About the situation where the problem in copy coalescing happens. After machineLICM, we have several defs outside of a big loop with hundreds or thousands of uses inside the loop. Rematerialization in copy coalescing happens for each use and every time rematerialization is done, shrinkToUses will be called to update the huge live interval. Because we have 'n' uses for a def, and each live interval update will have at least 'n' complexity, the total update work is n^2. To fix the problem, we try to do the live interval update work in a collective way. If a def has many copylike uses larger than a threshold, each time rematerialization is done for one of those uses, we won't do the live interval update in time but delay that work until rematerialization for all those uses is completed, so we only have to do the live interval update work once. Delaying the live interval update could potentially change the copy coalescing result, so we hope to limit that change to those defs with many (like above a hundred) copylike uses, and the cutoff can be adjusted by the option -mllvm -late-remat-update-threshold=xxx. Differential Revision: https://reviews.llvm.org/D49519 llvm-svn: 339035	2018-08-06 17:30:45 +00:00
Matt Arsenault	0d1b3934e2	AMDGPU: Fold v_lshl_or_b32 with 0 src0 Appears from expansion of some packed cases. llvm-svn: 339025	2018-08-06 15:40:20 +00:00
Matt Arsenault	56b31d8d75	ValueTracking: Handle canonicalize in CannotBeNegativeZero Also fix apparently missing test coverage for any of the handling here. llvm-svn: 339023	2018-08-06 15:16:26 +00:00
Matt Arsenault	dbf77c5b41	AMDGPU: Rename check prefixes in test Will avoid noisy diff in future change. llvm-svn: 339022	2018-08-06 15:16:12 +00:00
Bryan Chan	e023706471	[AArch64] Fix assertion failure on widened f16 BUILD_VECTOR Summary: Ensure that NormalizedBuildVector returns a BUILD_VECTOR with operands of the same type. This fixes an assertion failure in VerifySDNode. Reviewers: SjoerdMeijer, t.p.northover, javed.absar Reviewed By: SjoerdMeijer Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D50202 llvm-svn: 339013	2018-08-06 14:14:41 +00:00
Tim Northover	9956e4a24b	ARM-MachO: don't add Thumb bit for addend to non-external relocation. ld64 supplies its own Thumb bit for Thumb functions, and intentionally zeroes out that part of any addend in an object file. But it only does that for symbols marked N_EXT -- i.e. external symbols. So LLVM should avoid setting that extra bit in other cases. llvm-svn: 339007	2018-08-06 11:32:44 +00:00
Max Kazantsev	2dbbd64cb7	Re-enable "[ValueTracking] Teach isKnownNonNullFromDominatingCondition about AND" The patch was reverted because of bug detected by sanitizer. The bug is fixed, respective tests added. Differential Revision: https://reviews.llvm.org/D50172 llvm-svn: 339005	2018-08-06 11:14:18 +00:00
Max Kazantsev	3271f379a9	Revert rL338990 to see if it causes sanitizer failures Multiple failures reported by sanitizer-x86_64-linux, seem to be caused by this patch. Reverting to see if they sustain without it. Differential Revision: https://reviews.llvm.org/D50172 llvm-svn: 338994	2018-08-06 08:10:28 +00:00
Max Kazantsev	34b0666be9	[ValueTracking] Teach isKnownNonNullFromDominatingCondition about AND `isKnownNonNullFromDominatingCondition` is able to prove non-null basing on `br` or `guard` by `%p != null` condition, but is unable to do so basing on `(%p != null) && %other_cond`. This patch allows it to do so. Differential Revision: https://reviews.llvm.org/D50172 Reviewed By: reames llvm-svn: 338990	2018-08-06 06:11:36 +00:00
Max Kazantsev	eded4abef8	[GuardWidening] Widen guards with conditions of frequently taken dominated branches If there is a frequently taken branch dominated by a guard, and its condition is available at the point of the guard, we can widen guard with condition of this branch and convert the branch into unconditional: guard(cond1) if (cond2) { // taken in 99.9% cases // do something } else { // do something else } Converts to guard(cond1 && cond2) // do something Differential Revision: https://reviews.llvm.org/D49974 Reviewed By: reames llvm-svn: 338988	2018-08-06 05:49:19 +00:00
David Bolvansky	b7fcd10700	[NFC] Fixed inliner tests - 2 llvm-svn: 338973	2018-08-05 16:53:36 +00:00
David Bolvansky	2f1f3b10ad	[NFC] Fixed inliner tests llvm-svn: 338972	2018-08-05 16:30:46 +00:00
David Bolvansky	c0aa4b75a4	Enrich inline messages Summary: This patch improves Inliner to provide causes/reasons for negative inline decisions. 1. It adds one new message field to InlineCost to report causes for Always and Never instances. All Never and Always instantiations must provide a simple message. 2. Several functions that used to return the inlining results as boolean are changed to return InlineResult which carries the cause for negative decision. 3. Changed remark printing and debug output messages to provide the additional messages and related inline cost. 4. Adjusted tests for changed printing. Patch by: yrouban (Yevgeny Rouban) Reviewers: craig.topper, sammccall, sgraenitz, NutshellySima, shchenz, chandlerc, apilipenko, javed.absar, tejohnson, dblaikie, sanjoy, eraman, xbolva00 Reviewed By: tejohnson, xbolva00 Subscribers: xbolva00, llvm-commits, arsenm, mehdi_amini, eraman, haicheng, steven_wu, dexonsmith Differential Revision: https://reviews.llvm.org/D49412 llvm-svn: 338969	2018-08-05 14:53:08 +00:00
Eric Christopher	9855a5a0a1	Revert "Add a warning if someone attempts to add extra section flags to sections" There are a bunch of edge cases and inconsistencies in how we're emitting sections that cause this warning to fire and it needs more work. This reverts commit r335558. llvm-svn: 338968	2018-08-05 14:23:37 +00:00
Roman Lebedev	365fa96055	[NFC][InstCombine] Add tests for sinking 'not' into 'xor' (PR38446) https://rise4fun.com/Alive/IT3 Comes up in the [most ugliest] signed int -> signed char case of -fsanitize=implicit-conversion (https://reviews.llvm.org/D50250) Not sure if we want to do it always, or only when it is free to invert. llvm-svn: 338967	2018-08-05 10:15:04 +00:00
Roman Lebedev	656a478e98	[NFC][InstCombine] Regenerate set.ll test llvm-svn: 338965	2018-08-05 08:53:40 +00:00
Craig Topper	fb33181038	[X86] Remove stale comments from a test. NFC The 16-bit case was recently fixed so this comment no longer applies. llvm-svn: 338964	2018-08-05 06:25:01 +00:00
David Bolvansky	b82a5ec1b6	[InstCombine] [NFC] Tests for strcmp to memcmp transformation llvm-svn: 338963	2018-08-05 05:46:56 +00:00
Chijun Sima	8b5de48d62	[TailCallElim] Preserve DT and PDT Summary: Previously, in the NewPM pipeline, TailCallElim recalculates the DomTree when it modifies any instruction in the Function. For example, ``` CallInst *CI = dyn_cast<CallInst>(&I); ... CI->setTailCall(); Modified = true; ... if (!Modified \|\| ...) return PreservedAnalyses::all(); ``` After applying this patch, the DomTree only recalculates if needed (plus an extra insertEdge() + an extra deleteEdge() call). When optimizing SQLite with `-passes="default<O3>"` pipeline of the newPM, the number of DomTree recalculation decreases by 6.2%, the number of nodes visited by DFS decreases by 2.9%. The time used by DomTree will decrease approximately 1%~2.5% after applying the patch. Statistics: ``` Before the patch: 23010 dom-tree-stats - Number of DomTree recalculations 489264 dom-tree-stats - Number of nodes visited by DFS -- DomTree After the patch: 21581 dom-tree-stats - Number of DomTree recalculations 475088 dom-tree-stats - Number of nodes visited by DFS -- DomTree ``` Reviewers: kuhar, dmgreen, brzycki, grosser, davide Reviewed By: kuhar, brzycki Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49982 llvm-svn: 338954	2018-08-04 08:13:47 +00:00
Chijun Sima	eacad79777	[ADCE] Remove the need of DomTree Summary: ADCE doesn't need to query domtree. Reviewers: kuhar, brzycki, dmgreen, davide, grosser Reviewed By: kuhar Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49988 llvm-svn: 338950	2018-08-04 02:50:12 +00:00
Aditya Nandakumar	e07b3b737b	[GISel]: Add Opcodes for CTLZ/CTTZ/CTPOP https://reviews.llvm.org/D48600 Added IRTranslator support to translate these known intrinsics into GISel opcodes. llvm-svn: 338944	2018-08-04 01:22:12 +00:00
Craig Topper	3c869cb5e5	[X86] Add isel patterns for atomic_load+sub+atomic_sub. Despite the comment removed in this patch, this is beneficial when the RHS of the sub is a register. llvm-svn: 338930	2018-08-03 22:08:30 +00:00
Craig Topper	84319d1b42	[X86] Add test cases to show missed opportunity to use RMW for atomic_load+sub+atomic_store. llvm-svn: 338929	2018-08-03 22:08:28 +00:00
Reid Kleckner	8e40702c1c	[X86] Re-generate abi-isel.ll checks with update_llc_test_checks.py These tests were clearly auto-generated when they were converted to FileCheck back in r80019 (2009), but we didn't have a fancy script to keep them up to date then. I've reviewed the diff, and we should be generating the exact same code sequences we used to. After this, I plan to commit a change that changes our output slightly, but in a way that is still correct. It will generate a large diff, and I want it to be clearly correct, so I am regenerating these checks in preparation for that. llvm-svn: 338928	2018-08-03 21:58:25 +00:00
Reid Kleckner	5578b53c92	[X86] Make abi-isel.ll like update_llc_test_checks.py output - Remove -asm-verbose=0 from every llc command. The tests still pass. - Reorder the RUN lines to match CHECKs. - Use -LABEL like update_llc_test_checks.py does. llvm-svn: 338927	2018-08-03 21:58:12 +00:00
Reid Kleckner	13a9035190	[X86] Layout tests exactly as update_llc_test_checks.py would Put the LLVM IR at the bottom of the function instead of the top. In my next patch, I will run update_llc_test_checks.py on this file, and I want to only highlight the diffs in the CHECK lines. Hopefully by doing this change first, the patch will be more understandable. llvm-svn: 338926	2018-08-03 21:57:59 +00:00
Craig Topper	d7391eefdf	[X86] Remove RELEASE_ and ACQUIRE_ pseudo instructions. Use isel patterns and the normal instructions instead At one point in time acquire implied mayLoad and mayStore as did release. Thus we needed separate pseudos that also carried that property. This appears to no longer be the case. I believe it was changed in 2012 with a comment saying that atomic memory accesses are marked volatile which preserves the ordering. So from what I can tell we shouldn't need additional pseudos since they don't carry any flags that are different from the normal instructions. The only thing I can think of is that we may consider them for load folding candidates in the peephole pass now where we didn't before. If that's important hopefully there's something in the memory operand we can check to prevent the folding without relying on pseudo instructions. Differential Revision: https://reviews.llvm.org/D50212 llvm-svn: 338925	2018-08-03 21:40:44 +00:00
Craig Topper	8c41136ca3	[X86] Autogenerate complete checks. NFC llvm-svn: 338921	2018-08-03 20:58:14 +00:00
Anastasis Grammenos	4dfe279e00	[TRE][DebugInfo] Preserve Debug Location in new branch instruction There are two branch instructions created so the new test covers them both. Differential Revision: https://reviews.llvm.org/D50263 llvm-svn: 338917	2018-08-03 20:27:13 +00:00
Craig Topper	c4960582ec	[SelectionDAG] Teach LegalizeVectorTypes to widen the mask input to a masked store. The mask operand is visited before the data operand so we need to be able to widen it. Fixes PR38436. llvm-svn: 338915	2018-08-03 20:14:18 +00:00
Matt Arsenault	c3dc8e65e2	DAG: Enhance isKnownNeverNaN Add a parameter for testing specifically for sNaNs - at least one instruction pattern on AMDGPU needs to check specifically for this. Also handle more cases, and add a target hook for custom nodes, similar to the hooks for known bits. llvm-svn: 338910	2018-08-03 18:27:52 +00:00
Artem Belevich	0a11b6366a	[NVPTX] Handle __nvvm_reflect("__CUDA_ARCH"). Summary: libdevice in recent CUDA versions relies on __nvvm_reflect() to select GPU-specific bitcode. This patch addresses the requirement. Reviewers: jlebar Subscribers: jholewinski, sanjoy, hiraditya, bixia, llvm-commits Differential Revision: https://reviews.llvm.org/D50207 llvm-svn: 338908	2018-08-03 18:05:24 +00:00
Craig Topper	feb2a58860	[X86] Add a DAG combine for the __builtin_parity idiom used by clang to enable better codegen Clang uses "ctpop & 1" to implement __builtin_parity. If the popcnt instruction isn't supported this generates a large amount of code to calculate the population count. Instead we can bisect the data down to a single byte using xor and then check the parity flag. Even when popcnt is supported, it's still a good idea to split 64-bit data on 32-bit targets using an xor in front of a single popcnt. Otherwise we get two popcnts and an add before the and. I've specifically targeted this at the sizes supported by clang builtins, but we could generalize this if we think that's useful. Differential Revision: https://reviews.llvm.org/D50165 llvm-svn: 338907	2018-08-03 18:00:29 +00:00
Craig Topper	b0ad9b9fd7	[X86] Add test cases for the current codegen of __builtin_parity. Will be improved in a follow-up commit llvm-svn: 338906	2018-08-03 18:00:23 +00:00
Joel Galenson	cfe5bc158d	Fix crash in bounds checking. In r337830 I added SCEV checks to enable us to insert fewer bounds checks. Unfortunately, this sometimes crashes when multiple bounds checks are added due to SCEV caching issues. This patch splits the bounds checking pass into two phases, one that computes all the conditions (using SCEV checks) and the other that adds the new instructions. Differential Revision: https://reviews.llvm.org/D49946 llvm-svn: 338902	2018-08-03 17:12:23 +00:00
Nicholas Wilson	e408a89a3a	[WebAssembly] Cleanup of the way globals and global flags are handled Differential Revision: https://reviews.llvm.org/D44030 llvm-svn: 338894	2018-08-03 14:33:37 +00:00
Jonas Devlieghere	3a92c5c1d3	[DebugInfo/Verifier] Don't emit error for missing module in index We don't expect module names to be present in the index. This patch adds DW_TAG_module to the blacklist. Differential revision: https://reviews.llvm.org/D50237 llvm-svn: 338878	2018-08-03 12:01:43 +00:00
Jonas Paulsson	f107b7275c	[SystemZ] Improve handling of instructions which expand to several groups Some instructions expand to more than one decoder group. This has been hitherto ignored, but is handled with this patch. Review: Ulrich Weigand https://reviews.llvm.org/D50187 llvm-svn: 338849	2018-08-03 10:43:05 +00:00
Sjoerd Meijer	d62c5ec2fe	[ARM] FP16: support vector zip and unzip This is addressing PR38404. Differential Revision: https://reviews.llvm.org/D50186 llvm-svn: 338835	2018-08-03 09:24:29 +00:00
Simon Pilgrim	4014fb1049	[X86] Add example of 'zero shift' guards on rotation patterns (PR34924) Basic pattern that leaves an unnecessary select on a rotation by zero result. This variant is trivial - the more general case with a compare+branch to prevent execution of undefined shifts is more tricky. llvm-svn: 338833	2018-08-03 09:20:02 +00:00
Sjoerd Meijer	9b30213828	[ARM] FP16: support VFMA This is addressing PR38404. llvm-svn: 338830	2018-08-03 09:12:56 +00:00
Craig Topper	a7a12399a1	[X86] Remove all the vector NOP bitcast patterns. Use a few lines of code in the Select method in X86ISelDAGToDAG.cpp instead. There are a lot of permutations of types here generating a lot of patterns in the isel table. It's more efficient to just ReplaceUses and RemoveDeadNode from the Select function. The test changes are because we have some shuffle patterns that have a bitcast as their root node. But the behavior is identical to another instruction whose pattern doesn't start with a bitcast. So this isn't a functional change. llvm-svn: 338824	2018-08-03 07:01:10 +00:00
Craig Topper	e902b7d0b0	[X86] Support fp128 and/or/xor/load/store with VEX and EVEX encoded instructions. Move all the patterns to X86InstrVecCompiler.td so we can keep SSE/AVX/AVX512 all in one place. To save some patterns we'll use an existing DAG combine to convert f128 fand/for/fxor to integer when sse2 is enabled. This allows us to reuse all the existing patterns for v2i64. I believe this now makes SHA instructions the only case where VEX/EVEX and legacy encoded instructions could be generated simultaneously. llvm-svn: 338821	2018-08-03 06:12:56 +00:00
Hiroshi Inoue	73f8b255b6	[InstSimplify] fold extracting from std::pair (2/2) This is the second patch of the series which intends to enable jump threading for an inlined method whose return type is std::pair<int, bool> or std::pair<bool, int>. The first patch is https://reviews.llvm.org/rL338485. This patch handles code sequences that merges two values using `shl` and `or`, then extracts one value using `and`. Differential Revision: https://reviews.llvm.org/D49981 llvm-svn: 338817	2018-08-03 05:39:48 +00:00
Craig Topper	a80352c04e	[X86] When post-processing the DAG to remove zero extending moves for YMM/ZMM, make sure the producing instruction is VEX/XOP/EVEX encoded. If the producing instruction is legacy encoded it doesn't implicitly zero the upper bits. This is important for the SHA instructions which don't have a VEX encoded version. We might also be able to hit this with the incomplete f128 support that hasn't been ported to VEX. llvm-svn: 338812	2018-08-03 04:49:42 +00:00
Craig Topper	ded14af7aa	[X86] Autogenerate complete checks. NFC llvm-svn: 338811	2018-08-03 04:49:41 +00:00
Craig Topper	55697276dc	[X86] Autogenerate complete checks. NFC llvm-svn: 338802	2018-08-03 01:28:12 +00:00
Craig Topper	b99281c9b8	[X86] Autogenerate complete checks. NFC llvm-svn: 338799	2018-08-03 01:20:32 +00:00
Craig Topper	2c095444a4	[X86] Prevent promotion of i16 add/sub/and/or/xor to i32 if we can fold an atomic load and atomic store. This makes them consistent with i8/i32/i64. Which still seems to be more aggressive on folding than icc, gcc, or MSVC. llvm-svn: 338795	2018-08-03 00:37:34 +00:00
Philip Reames	5937368d4f	[LICM] Remove unnecessary safety check to increase sinking effectiveness This one requires a bit of explanation. It's not every day you simply delete code to implement an optimization. :) The transform in question is sinking an instruction from a loop to the uses in loop exiting blocks. We know (from LCSSA) that all of the uses outside the loop must be phi nodes, and after predecessor splitting, we know all phi users must have a single operand. Since the use must be strictly dominated by the def, we know from the definition of dominance/ssa that the exit block must execute along a (non-strict) subset of paths which reach the def. As a result, duplicating a potentially faulting instruction can not introduce a fault that didn't previously exist in the program. The full story is that this patch builds on "rL338671: [LICM] Factor out fault legality from canHoistOrSinkInst [NFC]" which pulled this logic out of a common helper routine. As best I can tell, this check was originally added to the helper function for hoisting legality, later an incorrect fastpath for loads/calls was added, and then the bug was fixed by duplicating the fault safety check in the hoist path. This left the redundant check in the common code to pessimize sinking for no reason. I split it out in an NFC, and am now removing the unnecessary check. I wanted there to be something easy to revert in case I missed something. Reviewed by: Anna Thomas (in person) llvm-svn: 338794	2018-08-03 00:21:56 +00:00
Dave Lee	3fb120f12e	objdump: Better handling of Mach-O universal binaries Summary: With Mach-O, there is a flag requirement discrepancy between working with universal binaries and thin binaries. Many flags that don't require the `-macho` flag (for example `-private-headers` and `-disassemble`) fail to work on universal binaries unless `-macho` is given. When this happens, the error message is unhelpful, stating: The file was not recognized as a valid object file. Which can lead to confusion. This change allows generic flags to be used on universal binaries with and without the `-macho` flag. This means flags that can be used for thin files can be used consistently with fat files too. To do this, the universal binary support within `ParseInputMachO()` is extracted into a new function. This new function is called directly from `DumpInput()` when the input binary is universal. Additionally the `-arch` flag validation in `ParseInputMachO()` was extracted to be reused. Reviewers: compnerd Reviewed By: compnerd Subscribers: keith, llvm-commits Differential Revision: https://reviews.llvm.org/D48702 llvm-svn: 338792	2018-08-03 00:06:38 +00:00
Eli Friedman	1ba5e9ac24	[GlobalMerge] Allow merging globals with explicit section markings. At least on ELF, it's impossible to tell from the object file whether two globals with the same section marking were merged: the merged global uses "private" linkage to hide its symbol, and the aliases look like regular symbols. I can't think of any other reason to disallow it. (Of course, we can only merge globals in the same section.) The weird alignment handling matches AsmPrinter; our alignment handling for global variables should probably be refactored. Differential Revision: https://reviews.llvm.org/D49822 llvm-svn: 338791	2018-08-02 23:54:16 +00:00
Tim Renouf	abd85fb1f5	[AMDGPU] Reworked SIFixWWMLiveness Summary: I encountered some problems with SIFixWWMLiveness when WWM is in a loop: 1. It sometimes gave invalid MIR where there is some control flow path to the new implicit use of a register on EXIT_WWM that does not pass through any def. 2. There were lots of false positives of registers that needed to have an implicit use added to EXIT_WWM. 3. Adding an implicit use to EXIT_WWM (and adding an implicit def just before the WWM code, which I tried in order to fix (1)) caused lots of the values to be spilled and reloaded unnecessarily. This commit is a rework of SIFixWWMLiveness, with the following changes: 1. Instead of considering any register with a def that can reach the WWM code and a def that can be reached from the WWM code, it now considers three specific cases that need to be handled. 2. A register that needs liveness over WWM to be synthesized now has it done by adding itself as an implicit use to defs other than the dominant one. Also added the following fixmes: FIXME: We should detect whether a register in one of the above categories is already live at the WWM code before deciding to add the implicit uses to synthesize its liveness. FIXME: I believe this whole scheme may be flawed due to the possibility of the register allocator doing live interval splitting. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46756 Change-Id: Ie7fba0ede0378849181df3f1a9a7a39ed1a94a94 llvm-svn: 338783	2018-08-02 23:31:32 +00:00
Craig Topper	63873db5c4	[X86] Allow 'atomic_store (neg/not atomic_load)' to isel to a RMW instruction. There was a FIXME in the td file about a type inference issue that was easy to fix. llvm-svn: 338782	2018-08-02 23:30:38 +00:00
Craig Topper	2deeeae2a5	[X86] Add NEG and NOT test cases to atomic_mi.ll in preparation for fixing the FIXME in X86InstrCompiler.td to make these work for atomic load/store. llvm-svn: 338781	2018-08-02 23:30:31 +00:00
Tim Renouf	f1c7b92a6a	[AMDGPU] Avoid using divergent value in mubuf addr64 descriptor Summary: This fixes a problem where a load from global+idx generated incorrect code on <=gfx7 when the index is divergent. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D47383 Change-Id: Ib4d177d6254b1dd3f8ec0203fdddec94bd8bc5ed llvm-svn: 338779	2018-08-02 22:53:57 +00:00
Zachary Turner	666de23fbf	[MS Demangler] Fix some tests that are no longer broken. These were fixed with earlier patches, but had not yet been re-enabled. llvm-svn: 338778	2018-08-02 22:37:40 +00:00
Krzysztof Parzyszek	d91a9e27a9	[Hexagon] Simplify CFG after atomic expansion This will remove suboptimal branching from the generated ll/sc loops. The extra simplification pass affects a lot of testcases, which have been modified to accommodate this change: either by modifying the test to become immune to the CFG simplification, or (less preferably) by adding option -hexagon-initial-cfg-cleanup=0. llvm-svn: 338774	2018-08-02 22:17:53 +00:00
Heejin Ahn	4128cb0b6b	[WebAssembly] Support for atomic.wait / atomic.wake instructions Summary: This adds support for atomic.wait / atomic.wake instructions in the wasm thread proposal. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D49395 llvm-svn: 338770	2018-08-02 21:44:24 +00:00
Craig Topper	db89ec1185	[X86] Autogenerate complete checks. NFC llvm-svn: 338765	2018-08-02 20:28:45 +00:00
Krzysztof Parzyszek	90f3249ce2	[SCEV] Properly solve quadratic equations Differential Revision: https://reviews.llvm.org/D48283 llvm-svn: 338758	2018-08-02 19:13:35 +00:00
Sam Clegg	41d7047de5	[WebAssembly] Ensure bitcasts that would result in invalid wasm are removed by FixFunctionBitcasts Rather than allowing invalid bitcasts to be lowered to wasm call instructions that won't validate, generate wrappers that contain unreachable thereby delaying the error until runtime. Differential Revision: https://reviews.llvm.org/D49517 llvm-svn: 338744	2018-08-02 17:38:06 +00:00
Craig Topper	0423881820	[X86] Allow fake unary unpckhpd and movhlps to be commuted for execution domain fixing purposes These instructions perform the same operation, but the semantic of which operand is destroyed is reversed. If the same register is used as both operands we can change the execution domain without worrying about this difference. Unfortunately, this really only works in cases where the input register is killed by the instruction. If it's not killed, the two-address instruction pass inserts a copy that will become a move instruction. This makes the instruction use different physical registers that contain the same data at the time the unpck/movhlps executes. I've considered using a unary pseudo instruction with tied operand to trick the two-address instruction pass. We could then expand the pseudo post regalloc to get the same physical register on both inputs. Differential Revision: https://reviews.llvm.org/D50157 llvm-svn: 338735	2018-08-02 16:48:01 +00:00
Simon Pilgrim	ef494e1722	[X86][SSE] Add uniform/non-uniform exact sdiv vector tests covering all paths Regenerated tests and tested on 64-bit (AVX2) as well. llvm-svn: 338729	2018-08-02 15:34:51 +00:00
David Bolvansky	67647bcfbe	[InstCombine] [NFC] Tests for select with binop fold llvm-svn: 338727	2018-08-02 14:59:23 +00:00
Sanjay Patel	3f6e9a71f7	[InstSimplify] move minnum/maxnum with undef fold from instcombine llvm-svn: 338719	2018-08-02 14:33:40 +00:00
Sjoerd Meijer	8e7fab0443	[ARM][NFC] Follow up of r338568 I disabled more tests than necessary, this enables them. llvm-svn: 338717	2018-08-02 14:04:48 +00:00
Sanjay Patel	f9a0d593e9	[ValueTracking] fix maxnum miscompile for cannotBeOrderedLessThanZero (PR37776) This adds the NAN checks suggested in PR37776: https://bugs.llvm.org/show_bug.cgi?id=37776 If both operands to maxnum are NAN, that should get constant folded, so we don't have to handle that case. This is the same assumption as other FP ops in this function. Returning 'false' is always conservatively correct. Copying from the bug report: Currently, we have this for "when is cannotBeOrderedLessThanZero (mustBePositiveOrNaN) true for maxnum": L ------------------- \| Pos \| Neg \| NaN \| ------------------------ \|Pos \| x \| x \| x \| ------------------------ R \|Neg \| x \| \| x \| ------------------------ \|NaN \| x \| x \| x \| ------------------------ The cases with (Neg & NaN) are wrong. We should have: L ------------------- \| Pos \| Neg \| NaN \| ------------------------ \|Pos \| x \| x \| x \| ------------------------ R \|Neg \| x \| \| \| ------------------------ \|NaN \| x \| \| x \| ------------------------ Differential Revision: https://reviews.llvm.org/D50081 llvm-svn: 338716	2018-08-02 13:46:20 +00:00
Matt Arsenault	1f3977a856	DAG: Fix vector widening fcanonicalize llvm-svn: 338715	2018-08-02 13:43:53 +00:00
Matt Arsenault	36cdcfadcf	AMDGPU: Fix scalarizing v4f16 fcanonicalize llvm-svn: 338714	2018-08-02 13:43:42 +00:00
Ben Dunbobbin	d498dcdbbf	[llvm-ar] Fix help text test. NFC. Missed from @338703 llvm-svn: 338709	2018-08-02 12:27:01 +00:00
Simon Pilgrim	090d58b2b5	[X86][SSE] Add more UDIV nonuniform-constant vector tests Ensure we cover all paths for vector data as requested on D49248 llvm-svn: 338698	2018-08-02 10:53:53 +00:00
Alexander Ivchenko	49168f6778	[GlobalISel] Rewrite CallLowering::lowerReturn to accept multiple VRegs per Value This is logical continuation of https://reviews.llvm.org/D46018 (r332449) Differential Revision: https://reviews.llvm.org/D49660 llvm-svn: 338685	2018-08-02 08:33:31 +00:00
David Green	ea60446c6d	[AArch64] Add support for got relocated LDR's As a part of adding the tiny codemodel, we need to support ldr's with :got: relocations on them. This seems to be mostly already done, just needs the relocation type support. Differential Revision: https://reviews.llvm.org/D50137 llvm-svn: 338673	2018-08-02 06:24:40 +00:00
Philip Reames	24b13cb06d	[LICM] Expand tests to highlight an oddity in sinking implementation llvm-svn: 338670	2018-08-02 03:54:29 +00:00
Lei Liu	b9a7b7a84d	Fix FCOPYSIGN expansion In expansion of FCOPYSIGN, the shift node is missing when the two operands of FCOPYSIGN are of the same size. We should always generate shift node (if the required shift bit is not zero) to put the sign bit into the right position, regardless of the size of underlying types. Differential Revision: https://reviews.llvm.org/D49973 llvm-svn: 338665	2018-08-02 01:54:12 +00:00
Nemanja Ivanovic	e1a525ed06	[PowerPC] Do not round values prior to converting to integer Adding the FP_ROUND nodes when combining FP_TO_[SU]INT of elements feeding a BUILD_VECTOR into an FP_TO_[SU]INT of the built vector loses precision. This patch removes the code that adds these nodes to true f64 operands. It also adds patterns required to ensure the code is still vectorized rather than converting individual elements and inserting into a vector. Fixes https://bugs.llvm.org/show_bug.cgi?id=38342 Differential Revision: https://reviews.llvm.org/D50121 llvm-svn: 338658	2018-08-02 00:03:22 +00:00
Lei Liu	8e422b8403	[AArch64] DWARF: do not generate AT_location for thread local AArch64 ELF ABI does not define a static relocation type for TLS offset within a module, which makes it impossible for compiler to generate a valid DW_AT_location content for thread local variables. Currently LLVM generates an invalid R_AARCH64_ABS64 relocation at the DW_AT_location field for a TLS variable. That causes trouble for linker because thread local variable does not have an absolute address at link time. AArch64 GCC solves the problem by not generating DW_AT_location for thread local variables. We should do the same in LLVM. Differential Revision: https://reviews.llvm.org/D43860 llvm-svn: 338655	2018-08-01 23:46:49 +00:00
George Burgess IV	213d1d23ef	Reland r338431: "Add DebugCounters to DivRemPairs" (Previously reverted in r338442) I'm told that the breakage came from us using an x86 triple on configs that didn't have x86 enabled. This is remedied by moving the debugcounter test to an x86 directory (where there's also a opt-bisect-isel.ll test for similar reasons). I can't repro the reverse-iteration failure mentioned in the revert with this patch, so I assume that a misconfiguration on my end is what caused that. Original commit message: Add DebugCounters to DivRemPairs For people who don't use DebugCounters, NFCI. Patch by Zhizhou Yang! Differential Revision: https://reviews.llvm.org/D50033 llvm-svn: 338653	2018-08-01 23:14:14 +00:00
Sanjay Patel	28c7e41c09	[InstSimplify] move minnum/maxnum with same arg fold from instcombine llvm-svn: 338652	2018-08-01 23:05:55 +00:00
Reid Kleckner	a30a6d2c29	Load from the GOT for external symbols in the large, PIC code model Do the same handling for external symbols that we do for jump table symbols and global values. Fixes one of the cases in PR38385 llvm-svn: 338651	2018-08-01 22:56:05 +00:00
John Baldwin	c5d7e04052	[ASAN] Use the correct shadow offset for ASAN on FreeBSD/mips64. Reviewed By: atanasyan Differential Revision: https://reviews.llvm.org/D49939 llvm-svn: 338650	2018-08-01 22:51:13 +00:00
Matt Arsenault	709374d186	AMDGPU: Improve hack for packing conversion ops Mutate the node type during selection when it doesn't matter. This avoids an intermediate bitcast node on targets with legal i16/f16. Also fixes missing output modifiers on v_cvt_pkrtz_f32_f16, which I assume are OK. llvm-svn: 338619	2018-08-01 20:13:58 +00:00
Matt Arsenault	55ab9213d3	AMDGPU: Partially fix handling of packed amdgpu_ps arguments Fixes annoying limitations when writing tests. Also remove more leftover code for manually scalarizing arguments and return values. llvm-svn: 338618	2018-08-01 19:57:34 +00:00
Heejin Ahn	b3724b7169	[WebAssembly] Support for a ternary atomic RMW instruction Summary: This adds support for a ternary atomic RMW instruction: cmpxchg. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D49195 llvm-svn: 338617	2018-08-01 19:40:28 +00:00
Alexey Bataev	d4dd7215f6	[DEBUGINFO] Disable emission of the dwarf sections, but allow directives. Summary: Added an option that allows to emit only '.loc' and '.file' kind debug directives, but disables emission of the DWARF sections. Required for NVPTX target to support profiling. It requires '.loc' and '.file' directives, but does not require any DWARF sections for the profiler. Reviewers: probinson, echristo, dblaikie Subscribers: aprantl, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D46021 llvm-svn: 338616	2018-08-01 19:38:20 +00:00
Craig Topper	c985d42903	[X86] Canonicalize the pattern for __builtin_ffs in a similar way to '__builtin_ffs + 5' We now emit a move of -1 before the cmov and do the addition after the cmov just like the case with an extra addition. This may be slightly worse for code size, but is more consistent with other compilers. And we might be able to hoist the mov -1 outside of loops. llvm-svn: 338613	2018-08-01 18:38:46 +00:00
Craig Topper	ffb8eb30ff	[X86] Add test cases for the patterns used by __builtin_ffs. We previously had tests for "__builtin_ffs + 5", but the SelectionDAG without an extra addition came out slightly different. llvm-svn: 338612	2018-08-01 18:38:43 +00:00
Jan Vesely	93b252799b	AMDGPU/R600: Convert kernel param loads to use PARAM_I_ADDRESS Non ext aligned i32 loads are still optimized to use CONSTANT_BUFFER (AS 8) llvm-svn: 338610	2018-08-01 18:36:07 +00:00
Zachary Turner	44ebbc216a	[MS Demangler] Properly demangle templated operators. After we detected the presence of a template via ?$ we would proceed by only demangling a simple unqualified name. This means we would fail on templated operators (and perhaps other yet-to-be-determined things) This was discovered while doing some refactoring to store richer semantic information about the demangled types to pave the way for overhauling the way we handle backreferences. (Specifically, we need to defer recording or resolving back-references until a symbol has been completely demangled, because we need to use information that only occurs later in the mangled string to decide whether a back-reference should be recorded.) Differential Revision: https://reviews.llvm.org/D50145 llvm-svn: 338608	2018-08-01 18:32:47 +00:00
Vlad Tsyrklevich	ab016e00ec	[X86] FastISel fall back on !absolute_symbol GVs Summary: D25878, which added support for !absolute_symbol for normal X86 ISel, did not add support for materializing references to absolute symbols for X86 FastISel. This causes build failures because FastISel generates PC-relative relocations for absolute symbols. Fall back to normal ISel for references to !absolute_symbol GVs. Fix for PR38200. Reviewers: pcc, craig.topper Reviewed By: pcc Subscribers: hiraditya, llvm-commits, kcc Differential Revision: https://reviews.llvm.org/D50116 llvm-svn: 338599	2018-08-01 17:44:37 +00:00
Simon Pilgrim	b911d6721d	[llvm-mca][x86] Add CMPXCHG instruction resource tests I've put CMPXCHG8B/CMPXCHG16B in the same file, even though technically they are under separate CPUID bits all targets seem to support both (or neither). llvm-svn: 338595	2018-08-01 17:25:11 +00:00
Sanjay Patel	d5ae183034	[x86] remove stale FIXME note from test; NFC This was fixed with rL338592. llvm-svn: 338593	2018-08-01 17:18:50 +00:00
Sanjay Patel	8aac22e06a	[SelectionDAG] fix bug in translating funnel shift with non-power-of-2 type The bug is visible in the constant-folded x86 tests. We can't use the negated shift amount when the type is not power-of-2: https://rise4fun.com/Alive/US1r ...so in that case, use the regular lowering that includes a select to guard against a shift-by-bitwidth. This path is improved by only calculating the modulo shift amount once now. Also, improve the rotate (with power-of-2 size) lowering to use a negate rather than subtract from bitwidth. This improves the codegen whether we have a rotate instruction or not (although we can still see that we're not matching to a legal rotate in all cases). llvm-svn: 338592	2018-08-01 17:17:08 +00:00
Sanjay Patel	6d302c93cc	[x86] add tests to show miscompile for funnel shift with weird size; NFC llvm-svn: 338587	2018-08-01 16:59:54 +00:00
Simon Pilgrim	5c4fb14e07	[llvm-mca][x86] Add PREFETCHW instruction resource tests These aren't just available via 3DNow! so test for them separately as well. llvm-svn: 338584	2018-08-01 16:34:39 +00:00
Simon Pilgrim	dcfa732b2f	[llvm-mca][x86] Add PCLMUL instruction resource tests Renamed the btver2 file that already contained them - the other targets were only testing the AVX versions llvm-svn: 338583	2018-08-01 16:25:50 +00:00
Jordan Rupprecht	d67c1e129b	[llvm-objcopy] Add support for --rename-section flags from gnu objcopy Summary: Add support for --rename-section flags from gnu objcopy. Not all flags appear to have an effect for ELF objects, but allowing them would allow easier drop-in replacement. Other unrecognized flags are rejected. This was only tested by comparing flags printed by "readelf -e <.o>" against the output of gnu vs llvm objcopy, it hasn't been tested to be valid beyond that. Reviewers: jakehehrlich, alexshap Subscribers: llvm-commits, paulsemel, alexshap Differential Revision: https://reviews.llvm.org/D49870 llvm-svn: 338582	2018-08-01 16:23:22 +00:00
Andrea Di Biagio	7f3bf5c1f9	[llvm-mca] Correctly update the rank in `Scheduler::select()`. Found by inspection. llvm-svn: 338579	2018-08-01 16:06:33 +00:00
Simon Pilgrim	34ac6533f4	[llvm-mca][x86] Add SET/TEST instruction resource tests llvm-svn: 338576	2018-08-01 15:29:47 +00:00
Sjoerd Meijer	590e4e8dde	[ARM] Armv8.2-A FP16 vector intrinsics tests Clang support for the Armv8.2-A FP16 vector intrinsic was committed in rC328277, but this was never followed up, i.e. the LLVM part is missing. I've raised PR38404, and this is the first step to address this. I.e., this adds tests for the Armv8.2-A FP16 vector intrinsic, and thus shows which intrinsics already work, and which need further work. Differential Revision: https://reviews.llvm.org/D50142 llvm-svn: 338568	2018-08-01 14:43:59 +00:00
Simon Pilgrim	e364e57ac9	[llvm-mca][x86] Add LEA instruction resource tests We already added these to btver2, now add them to other targets, even though none of their models treat them specially (yet). llvm-svn: 338565	2018-08-01 14:25:33 +00:00
Simon Pilgrim	6754913e95	[llvm-mca][x86] Add more x86-64 system instruction resource tests CPUID, IN/OUT, INS/OUTS, INT, PAUSE, SCAS, UD2, XLAT llvm-svn: 338563	2018-08-01 14:18:09 +00:00
Cameron McInally	04ae85859d	[FPEnv] Widen illegal width StrictFP vector operations as needed Differential Revision: https://reviews.llvm.org/D49806 llvm-svn: 338562	2018-08-01 14:17:19 +00:00
Bryan Chan	67106b5e08	[AArch64] Fix FCCMP with FP16 operands Summary: This patch adds support for FCCMP instruction with FP16 operands, avoiding an assertion during instruction selection. Reviewers: olista01, SjoerdMeijer, t.p.northover, javed.absar Reviewed By: SjoerdMeijer Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D50115 llvm-svn: 338554	2018-08-01 13:50:29 +00:00
Simon Pilgrim	5f41ab79c0	[llvm-mca][x86] Add CLFLUSHOPT instruction resource tests llvm-svn: 338550	2018-08-01 13:34:17 +00:00
Simon Pilgrim	bd014f4d91	[llvm-mca][x86] Add CMPS/LODS/MOVS/STOS string instruction resource tests llvm-svn: 338532	2018-08-01 13:14:45 +00:00
Jonas Devlieghere	8acb74e01f	[MC] Report fatal error for DWARF types for non-ELF object files Getting the DWARF types section is only implemented for ELF object files. We already disabled emitting debug types in clang (r337717), but now we also report a fatal error (rather than crashing) when trying to obtain this section in MC. Additionally we ignore the generate debug types flag for unsupported target triples. See PR38190 for more information. Differential revision: https://reviews.llvm.org/D50057 llvm-svn: 338527	2018-08-01 12:53:06 +00:00
Ryan Taylor	894c8fd0e2	[AMDGPU] Optimize _L image intrinsic to _LZ when lod is zero Summary: Add _L to _LZ image intrinsic table mapping to table gen. In ISelLowering check if image intrinsic has lod and if it's equal to zero, if so remove lod and change opcode to equivalent mapped _LZ. Change-Id: Ie24cd7e788e2195d846c7bd256151178cbb9ec71 Subscribers: arsenm, mehdi_amini, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D49483 llvm-svn: 338523	2018-08-01 12:12:01 +00:00
Ulrich Weigand	58a9786e81	[SystemZ, TableGen] Fix shift count handling The DAG combiner logic to simplify AND masks in shift counts is invalid. While it is true that the SystemZ shift instructions ignore all but the low 6 bits of the shift count, it is still invalid to simplify the AND masks while the DAG still uses the standard shift operators (which are not defined to match the SystemZ instruction behavior). Instead, this patch performs equivalent operations during instruction selection. For completely removing the AND, this now happens via additional DAG match patterns implemented by a multi-alternative PatFrags. For simplifying a 32-bit AND to a 16-bit AND, the existing DAG patterns were already mostly OK, they just needed an output XForm to actually truncate the immediate value. Unfortunately, the latter change also exposed a bug in TableGen: it seems XForms are currently only handled correctly for direct operands of the outermost operation node. This patch also fixes that bug by simply recurring through the whole pattern. This should be NFC for all other targets. Differential Revision: https://reviews.llvm.org/D50096 llvm-svn: 338521	2018-08-01 11:57:58 +00:00
Simon Pilgrim	18d025a732	[llvm-mca][x86] Add STC + STD instruction resource tests llvm-svn: 338514	2018-08-01 11:00:11 +00:00
Petar Jovanovic	64c10ba8e2	[MIPS GlobalISel] Select global address Select G_GLOBAL_VALUE for position dependent code. Patch by Petar Avramovic. Differential Revision: https://reviews.llvm.org/D49803 llvm-svn: 338499	2018-08-01 09:03:23 +00:00
David Bolvansky	fbbb83c782	Revert "Enrich inline messages", tests fail llvm-svn: 338496	2018-08-01 08:02:40 +00:00
David Bolvansky	7f36cd9d96	Enrich inline messages Summary: This patch improves Inliner to provide causes/reasons for negative inline decisions. 1. It adds one new message field to InlineCost to report causes for Always and Never instances. All Never and Always instantiations must provide a simple message. 2. Several functions that used to return the inlining results as boolean are changed to return InlineResult which carries the cause for negative decision. 3. Changed remark printing and debug output messages to provide the additional messages and related inline cost. 4. Adjusted tests for changed printing. Patch by: yrouban (Yevgeny Rouban) Reviewers: craig.topper, sammccall, sgraenitz, NutshellySima, shchenz, chandlerc, apilipenko, javed.absar, tejohnson, dblaikie, sanjoy, eraman, xbolva00 Reviewed By: tejohnson, xbolva00 Subscribers: xbolva00, llvm-commits, arsenm, mehdi_amini, eraman, haicheng, steven_wu, dexonsmith Differential Revision: https://reviews.llvm.org/D49412 llvm-svn: 338494	2018-08-01 07:37:16 +00:00
Martin Storsjo	d4590c38ab	[AArch64] Disallow the MachO specific .loh directive for windows Also add a test for it being unsupported for linux. Differential Revision: https://reviews.llvm.org/D49929 llvm-svn: 338493	2018-08-01 06:50:18 +00:00
Victor Leschuk	64e0c56717	[DWARF] Basic support for producing DWARFv5 .debug_addr section This revision implements support for generating DWARFv5 .debug_addr section. The implementation is pretty straight-forward: we just check the dwarf version and emit section header if needed. Reviewers: aprantl, dblaikie, probinson Reviewed by: dblaikie Differential Revision: https://reviews.llvm.org/D50005 llvm-svn: 338487	2018-08-01 05:48:06 +00:00
Hiroshi Inoue	02f79eae06	[InstSimplify] fold extracting from std::pair (1/2) This patch intends to enable jump threading when a method whose return type is std::pair<int, bool> or std::pair<bool, int> is inlined. For example, jump threading does not happen for the if statement in func. std::pair<int, bool> callee(int v) { int a = dummy(v); if (a) return std::make_pair(dummy(v), true); else return std::make_pair(v, v < 0); } int func(int v) { std::pair<int, bool> rc = callee(v); if (rc.second) { // do something } SROA executed before the method inlining replaces std::pair by i64 without splitting in both callee and func since at this point no access to the individual fields is seen to SROA. After inlining, jump threading fails to identify that the incoming value is a constant due to additional instructions (like or, and, trunc). This series of patches adds patterns in InstructionSimplify to fold extraction of members of std::pair. To help jump threading, actually we need to optimize the code sequence spanning multiple BBs. These patches do not handle phi by themselves, but these additional patterns help NewGVN pass, which calls instsimplify to check opportunities for simplifying instructions over phi, apply phi-of-ops optimization to result in successful jump threading. SimplifyDemandedBits in InstCombine can do more general optimization but this patch aims to provide opportunities for other optimizers by supporting a simple but common case in InstSimplify. This first patch in the series handles code sequences that merge two values using shl and or and then extract one value using lshr. Differential Revision: https://reviews.llvm.org/D48828 llvm-svn: 338485	2018-08-01 04:40:32 +00:00
Jatin Bhateja	36432a70c1	[X86] Adding more test patterns for lea-opt (PR37939) Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50128 llvm-svn: 338483	2018-08-01 03:53:27 +00:00
Chandler Carruth	2ce191e220	[x86] Fix a really subtle miscompile due to a somewhat glaring bug in EFLAGS copy lowering. If you have a branch of LLVM, you may want to cherrypick this. It is extremely unlikely to hit this case empirically, but it will likely manifest as an "impossible" branch being taken somewhere, and will be ... very hard to debug. Hitting this requires complex conditions living across complex control flow combined with some interesting memory (non-stack) initialized with the results of a comparison. Also, because you have to arrange for an EFLAGS copy to be in just the right place, almost anything you do to the code will hide the bug. I was unable to reduce anything remotely resembling a "good" test case from the place where I hit it, and so instead I have constructed synthetic MIR testing that directly exercises the bug in question (as well as the good behavior for completeness). The issue is that we would mistakenly assume any SETcc with a valid condition and an initial operand that was a register and a virtual register at that to be a register defining SETcc... It isn't though.... This would in turn cause us to test some other bizarre register, typically the base pointer of some memory. Now, testing this register and using that to branch on doesn't make any sense. It even fails the machine verifier (if you are running it) due to the wrong register class. But it will make it through LLVM, assemble, and it looks fine... But wow do you get a very unusual and surprising branch taken in your actual code. The fix is to actually check what kind of SETcc instruction we're dealing with. Because there are a bunch of them, I just test the may-store bit in the instruction. I've also added an assert for sanity that ensures we are, in fact, defining the register operand. =D llvm-svn: 338481	2018-08-01 03:01:58 +00:00
Chandler Carruth	014047a99a	[x86/slh] Add unwind info to several tests to make it more obvious that we aren't incorrectly generating any of it when doing SLH. There was a bug that only occurred with SLH that very much looked like it could be caused by bad unwind info, and so this was a prime suspect. Turns out that everything is fine, but this way we'll see if we end up, for example, putting things we shouldn't inside the prolog. llvm-svn: 338480	2018-08-01 03:01:10 +00:00
Hsiangkai Wang	5c63af0d04	[DebugInfo] Generate fixups as emitting DWARF .debug_line. Fixups must be generated in .debug_line when relaxation is enabled, because the address delta may change after relaxation. DWARF will record the mappings of lines and addresses in .debug_line section. It will encode the information using special opcodes, standard opcodes and extended opcodes in Line Number Program. I use DW_LNS_fixed_advance_pc to encode fixed length address delta and DW_LNE_set_address to encode absolute address to make it possible to generate fixups in .debug_line section. Differential Revision: https://reviews.llvm.org/D46850 llvm-svn: 338477	2018-08-01 02:18:06 +00:00
Amara Emerson	6cdfe29d8e	[GlobalISel][IRTranslator] Use RPO traversal when visiting blocks to translate. Previously we were just visiting the blocks in the function in IR order, which is rather arbitrary. Therefore we wouldn't always visit defs before uses, but the translation code relies on this assumption in some places. Only codegen change seen in tests is an elision of a redundant copy. Fixes PR38396 llvm-svn: 338476	2018-08-01 02:17:42 +00:00
Konstantin Zhuravlyov	bb30ef7af4	AMDGPU: Add clamp bit to dot intrinsics Differential Revision: https://reviews.llvm.org/D49874 llvm-svn: 338470	2018-08-01 01:31:30 +00:00
Evandro Menezes	eb159e56a6	[PATCH] [SLC] Test simplification of pow() for vector types (NFC) Add test case for the simplification of `pow()` for vector types that D50035 enables. llvm-svn: 338463	2018-08-01 00:30:43 +00:00
Reid Kleckner	b32ff46ff7	Revert r338354 "[ARM] Revert r337821" Disable ARMCodeGenPrepare by default again. It is causing verifier failures in V8 that look like: Duplicate integer as switch case switch i32 %trunc, label %if.end13 [ i32 0, label %cleanup36 i32 0, label %if.then8 ], !dbg !4981 i32 0 fatal error: error in backend: Broken function found, compilation aborted! I will continue reducing the test case and send it along. llvm-svn: 338452	2018-07-31 23:09:42 +00:00
David L. Jones	9fb2e3ceaa	[WebAssembly] Fix debug info tests after r338437. After r338437, debug_ranges are no longer emitted. Previously, this was only done for DWARF version 5 and above. llvm-svn: 338448	2018-07-31 22:24:14 +00:00
Victor Leschuk	58d3399d8a	[DWARF] Support for .debug_addr (consumer) This patch implements basic support for parsing and dumping DWARFv5 .debug_addr section. llvm-svn: 338447	2018-07-31 22:19:19 +00:00
Fangrui Song	87b4b8f7b4	[llvm-objcopy] Make --strip-debug strip .gdb_index Summary: See binutils-gdb/bfd/elf.c, GNU objcopy also strips .stab* (STABS) .line* (DWARF 1) .gnu.linkonce.wi.* (linkonce section for .debug_info) but I'm not sure we need to be compatible with it. Reviewers: dblaikie, alexshap, jakehehrlich, jhenderson Reviewed By: alexshap, jakehehrlich Subscribers: aprantl, JDevlieghere, jakehehrlich, llvm-commits Differential Revision: https://reviews.llvm.org/D50100 llvm-svn: 338443	2018-07-31 21:26:35 +00:00
George Burgess IV	497e8fad51	Revert r338431: "Add DebugCounters to DivRemPairs" This reverts r338431; the test it added is making buildbots unhappy. Locally, I can repro the failure on reverse-iteration builds. llvm-svn: 338442	2018-07-31 21:18:44 +00:00
Matt Arsenault	118c47b6d1	AMDGPU: Split amdgcn/r600 fminnum/fmaxnum tests R600 breaks on too many things to usefully test changes with ieee_mode on vs. off. llvm-svn: 338435	2018-07-31 20:38:42 +00:00
George Burgess IV	907f4f6a74	Add DebugCounters to DivRemPairs For people who don't use DebugCounters, NFCI. Patch by Zhizhou Yang! Differential Revision: https://reviews.llvm.org/D50033 llvm-svn: 338431	2018-07-31 20:07:46 +00:00
Alexandre Ganea	b92d3ad762	[CodeView] Add coverage test for r338308 (Fixed crash in type merging) llvm-svn: 338423	2018-07-31 19:30:03 +00:00
Matt Arsenault	feedabfde7	AMDGPU: Break 64-bit arguments into 32-bit pieces llvm-svn: 338421	2018-07-31 19:29:04 +00:00
Matt Arsenault	0395da7842	AMDGPU: Split wide vectors of i16/f16 into 32-bit regs on calls This improves code for the same reasons as scalarizing 32-bit element vectors. llvm-svn: 338418	2018-07-31 19:17:47 +00:00
Alexandre Ganea	ee8a720051	[CodeView] Minimal support for S_UNAMESPACE records Differential Revision: https://reviews.llvm.org/D50007 llvm-svn: 338417	2018-07-31 19:15:50 +00:00
Matt Arsenault	9ced1e0d80	AMDGPU: Scalarize vector argument types to calls When lowering calling conventions, prefer to decompose vectors into the constituent register types. This avoids artificial constraints to satisfy a wide super-register. This improves code quality because now optimizations don't need to deal with the super-register constraint. For example the immediate folding code doesn't deal with 4 component reg_sequences, so by breaking the register down earlier the existing immediate folding code is able to work. This also avoids the need for the shader input processing code to manually split vector types. llvm-svn: 338416	2018-07-31 19:05:14 +00:00
Vlad Tsyrklevich	48ed9acede	Revert "[DebugInfo] Generate DWARF debug information for labels." This reverts commits r338390 and r338398, they were causing LSan failures on the ASan bot. llvm-svn: 338408	2018-07-31 18:10:37 +00:00
Simon Pilgrim	5d9b00d15b	[X86][SSE] Use ISD::MULHU for constant/non-zero ISD::SRL lowering (PR38151) As was done for vector rotations, we can efficiently use ISD::MULHU for vXi8/vXi16 ISD::SRL lowering. Shift-by-zero cases are still problematic (mainly on v32i8 due to extra AND/ANDN/OR or VPBLENDVB blend masks but v8i16/v16i16 aren't great either if PBLENDW fails) so I've limited this first patch to known non-zero cases if we can't easily use PBLENDW. Differential Revision: https://reviews.llvm.org/D49562 llvm-svn: 338407	2018-07-31 18:05:56 +00:00
Simon Pilgrim	1f4b9cb6fe	[llvm-mca][x86] Add 32-bit instruction resource tests These aren't exhaustive, but cover some instructions that are only available in 32-bit mode (where would we be without good BCD math performance?). llvm-svn: 338404	2018-07-31 17:33:08 +00:00
Zachary Turner	d30700f82d	Resubmit r338340 "[MS Demangler] Better demangling of template arguments." This broke the build with GCC, but has since been fixed. llvm-svn: 338403	2018-07-31 17:16:44 +00:00
Craig Topper	bef126fb71	[X86] Add pattern matching for PMADDUBSW Summary: Similar to D49636, but for PMADDUBSW. This instruction has the additional complexity that the addition of the two products saturates to 16-bits rather than wrapping around. And one operand is treated as signed and the other as unsigned. A C example that triggers this pattern ``` static const int N = 128; int8_t A[2*N]; uint8_t B[2*N]; int16_t C[N]; void foo() { for (int i = 0; i != N; ++i) C[i] = MIN(MAX((int16_t)A[2*i]*(int16_t)B[2*i] + (int16_t)A[2*i+1]*(int16_t)B[2*i+1], -32768), 32767); } ``` Reviewers: RKSimon, spatel, zvi Reviewed By: RKSimon, zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49829 llvm-svn: 338402	2018-07-31 17:12:08 +00:00
Craig Topper	d03d44e0b9	[X86] Add test cases that could use PMADDUBSW. llvm-svn: 338401	2018-07-31 17:12:06 +00:00
Francis Visoiu Mistrih	ae8002c1cf	[X86] Preserve more liveness information in emitStackProbeInline This commit fixes two issues with the liveness information after the call: 1) The code always spills RCX and RDX if InProlog == true, which results in an use of undefined phys reg. 2) FinalReg, JoinReg, RoundedReg, SizeReg are not added as live-ins to the basic blocks that use them, therefore they are seen undefined. https://llvm.org/PR38376 Differential Revision: https://reviews.llvm.org/D50020 llvm-svn: 338400	2018-07-31 16:41:12 +00:00
Hsiangkai Wang	68c6860434	[DebugInfo] Fix build failed in 'clang-cmake-armv8-full'. Builder clang-cmake-armv8-full failed due to the assembly 'comment' notation is not '#' in the target. So, I use CHECK-SAME to avoid to check the comment notation in the same line in the test case. llvm-svn: 338398	2018-07-31 16:22:09 +00:00
Ewan Crawford	d83beb804c	Fix InstCombine address space assert Workaround bug where the InstCombine pass was asserting on the IR added in lit test, where we have a bitcast instruction after a GEP from an addrspace cast. The second bitcast in the test was getting combined into `bitcast <16 x i32>* %0 to <16 x i32> addrspace(3)`, which looks like it should be an addrspace cast instruction instead. Otherwise if control flow is allowed to continue as it is now we create a GEP instruction `<badref> = getelementptr inbounds <16 x i32>, <16 x i32> %0, i32 0`. However because the type of this instruction doesn't match the address space we hit an assert when replacing the bitcast with that GEP. ``` void llvm::Value::doRAUW(llvm::Value*, bool): Assertion `New->getType() == getType() && "replaceAllUses of value with new value of different type!"' failed. ``` Differential Revision: https://reviews.llvm.org/D50058 llvm-svn: 338395	2018-07-31 15:53:03 +00:00
Sanjay Patel	a35781fdf9	[InstCombine] regenerate checks and add tests for D50035; NFC llvm-svn: 338392	2018-07-31 15:07:32 +00:00
Anastasis Grammenos	ac3f8028da	[DebugInfo][LCSSA] Preserve debug location in lcssa phis Summary: When inserting lcssa Phi Nodes in the exit block, make sure to preserve the original instruction's DL. Reviewers: vsk Subscribers: JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D50009 llvm-svn: 338391	2018-07-31 14:54:52 +00:00
Hsiangkai Wang	cbc58ada99	[DebugInfo] Generate DWARF debug information for labels. There are two forms for label debug information in DWARF format. 1. Labels in a non-inlined function: DW_TAG_label DW_AT_name DW_AT_decl_file DW_AT_decl_line DW_AT_low_pc 2. Labels in an inlined function: DW_TAG_label DW_AT_abstract_origin DW_AT_low_pc We will collect label information from DBG_LABEL. Before every DBG_LABEL, we will generate a temporary symbol to denote the location of the label. The symbol could be used to get DW_AT_low_pc afterwards. So, we create a mapping between 'inlined label' and DBG_LABEL MachineInstr in DebugHandlerBase. The DBG_LABEL in the mapping is used to query the symbol before it. The AbstractLabels in DwarfCompileUnit is used to process labels in inlined functions. We also keep a mapping between scope and labels in DwarfFile to help to generate correct tree structure of DIEs. It also generates label debug information under global isel. Differential Revision: https://reviews.llvm.org/D45556 llvm-svn: 338390	2018-07-31 14:48:32 +00:00
David Bolvansky	ab79414f7b	Revert Enrich inline messages llvm-svn: 338389	2018-07-31 14:47:22 +00:00
Sanjay Patel	57d617d676	[InstCombine] auto-generate checks; NFC llvm-svn: 338388	2018-07-31 14:27:30 +00:00
David Bolvansky	b562dbabda	Enrich inline messages Summary: This patch improves Inliner to provide causes/reasons for negative inline decisions. 1. It adds one new message field to InlineCost to report causes for Always and Never instances. All Never and Always instantiations must provide a simple message. 2. Several functions that used to return the inlining results as boolean are changed to return InlineResult which carries the cause for negative decision. 3. Changed remark printing and debug output messages to provide the additional messages and related inline cost. 4. Adjusted tests for changed printing. Patch by: yrouban (Yevgeny Rouban) Reviewers: craig.topper, sammccall, sgraenitz, NutshellySima, shchenz, chandlerc, apilipenko, javed.absar, tejohnson, dblaikie, sanjoy, eraman, xbolva00 Reviewed By: tejohnson, xbolva00 Subscribers: xbolva00, llvm-commits, arsenm, mehdi_amini, eraman, haicheng, steven_wu, dexonsmith Differential Revision: https://reviews.llvm.org/D49412 llvm-svn: 338387	2018-07-31 14:25:24 +00:00
John Brawn	cd5f37f3f1	[MemDep] Use PhiValuesAnalysis to improve alias analysis results This is being done in order to make GVN able to better optimize certain inputs. MemDep doesn't use PhiValues directly, but does need to notify it when things get invalidated. Differential Revision: https://reviews.llvm.org/D48489 llvm-svn: 338384	2018-07-31 14:19:29 +00:00
David Bolvansky	16d8a69b90	[InstSimplify] Fold another Select with And/Or pattern Summary: Proof: https://rise4fun.com/Alive/L5J Reviewers: lebedev.ri, spatel Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49975 llvm-svn: 338383	2018-07-31 14:17:15 +00:00
Matt Arsenault	a5ed032118	DAG: Fix PromoteFloatResult for fcanonicalize llvm-svn: 338382	2018-07-31 14:15:22 +00:00
Alexey Bataev	c0c3a6ed5e	[SLP] Fix PR38339: Instruction does not dominate all uses! Summary: If the ExtractElement instructions can be optimized out during the vectorization and we need to reshuffle the parent vector, this ShuffleInstruction may be inserted in the wrong place causing compiler to produce incorrect code. Reviewers: spatel, RKSimon, mkuper, hfinkel, javed.absar Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49928 llvm-svn: 338380	2018-07-31 14:02:43 +00:00
Matt Arsenault	4aec86d37a	AMDGPU: Fold undef fcanonicalize to qNaN We could choose a free 0 for this, but this matches the behavior for fmul undef, 1.0. Also, the NaN use is more useful for folding use operations although if it's not eliminated it is more expensive in terms of code size. llvm-svn: 338376	2018-07-31 13:34:31 +00:00
Matt Arsenault	c1335eaf7e	AMDGPU: Fix test check line bugs llvm-svn: 338374	2018-07-31 13:25:23 +00:00
Andrea Di Biagio	a1852b6194	[llvm-mca][BtVer2] Teach how to identify dependency-breaking idioms. This patch teaches llvm-mca how to identify dependency breaking instructions on btver2. An example of dependency breaking instructions is the zero-idiom XOR (example: `XOR %eax, %eax`), which always generates zero regardless of the actual value of the input register operands. Dependency breaking instructions don't have to wait on their input register operands before executing. This is because the computation is not dependent on the inputs. Not all dependency breaking idioms are also zero-latency instructions. For example, `CMPEQ %xmm1, %xmm1` is independent on the value of XMM1, and it generates a vector of all-ones. That instruction is not eliminated at register renaming stage, and its opcode is issued to a pipeline for execution. So, the latency is not zero. This patch adds a new method named isDependencyBreaking() to the MCInstrAnalysis interface. That method takes as input an instruction (i.e. MCInst) and a MCSubtargetInfo. The default implementation of isDependencyBreaking() conservatively returns false for all instructions. Targets may override the default behavior for specific CPUs, and return a value which better matches the subtarget behavior. In future, we should teach to Tablegen how to automatically generate the body of isDependencyBreaking from scheduling predicate definitions. This would allow us to expose the knowledge about dependency breaking instructions to the machine schedulers (and, potentially, other codegen passes). Differential Revision: https://reviews.llvm.org/D49310 llvm-svn: 338372	2018-07-31 13:21:43 +00:00
Jonas Paulsson	2f12e45d5a	[SystemZ] Improve decoding in case of instructions with four register operands. Since z13, the max group size will be 2 if any μop has more than 3 register sources. This has been ignored so far in the SystemZHazardRecognizer, but is now handled by recognizing those instructions and adjusting the tracking of decoding and the cost heuristic for grouping. Review: Ulrich Weigand https://reviews.llvm.org/D49847 llvm-svn: 338368	2018-07-31 13:00:42 +00:00
Sanjay Patel	9a801cb598	[InstCombine] simplify code for A & (A ^ B) --> A & ~B This fold was written in an odd way and tried to avoid an endless loop by bailing out on all constants instead of the supposedly problematic case of -1. But (X & -1) should always be simplified before we reach here, so I'm not sure how that is a problem. There were no tests for the commuted patterns, so I added those at rL338364. llvm-svn: 338367	2018-07-31 13:00:03 +00:00
Sanjay Patel	995138ce60	[InstCombine] move/add tests for xor+add fold; NFC llvm-svn: 338364	2018-07-31 12:31:00 +00:00
Martin Storsjo	293079f2de	[ARM] Allow automatically deducing the thumb instruction size for .inst This matches GAS, that allows unsuffixed .inst for thumb. Differential Revision: https://reviews.llvm.org/D49937 llvm-svn: 338357	2018-07-31 09:27:07 +00:00
Martin Storsjo	af18947f0a	[ARM] Support the .inst directive for MachO and COFF targets Contrary to ELF, we don't add any markers that distinguish data generated with .short/.long from normal instructions, so the .inst directive only adds compatibility with assembly that uses it. Differential Revision: https://reviews.llvm.org/D49936 llvm-svn: 338356	2018-07-31 09:27:01 +00:00
Martin Storsjo	3e3d39d07e	[AArch64] Support the .inst directive for MachO and COFF targets Contrary to ELF, we don't add any markers that distinguish data generated with .long from normal instructions, so the .inst directive only adds compatibility with assembly that uses it. Differential Revision: https://reviews.llvm.org/D49935 llvm-svn: 338355	2018-07-31 09:26:52 +00:00
Sam Parker	2a6c842fda	[ARM] Revert r337821 Re-enabling ARMCodeGenPrepare by default after failing to reproduce the bootstrap issues that I was concerned it was causing. llvm-svn: 338354	2018-07-31 09:04:14 +00:00
Hiroshi Inoue	2f6769be60	[InstSimplify] tests for D48828, D49981: fold extraction from std::pair Minor touch up in the previous comment. llvm-svn: 338351	2018-07-31 05:29:20 +00:00
Hiroshi Inoue	5427d3efc2	[InstSimplify] tests for D48828, D49981: fold extraction from std::pair Updated unit tests for D48828 and D49981. llvm-svn: 338350	2018-07-31 05:10:36 +00:00
Reid Kleckner	d2bad6c639	Revert r338340 "[MS Demangler] Better demangling of template arguments." Breaks the build with GCC, apparently. llvm-svn: 338344	2018-07-31 01:08:42 +00:00
Craig Topper	9164b9b16e	[X86] Stop accidentally running the Bonnell LEA fixup path on Goldmont. In one place we checked X86Subtarget.slowLEA() to decide if the pass should run. But to decide what the pass should do we only check isSLM. This resulted in Goldmont going down the Bonnell path. llvm-svn: 338342	2018-07-31 00:43:54 +00:00
Ana Pazos	2baa767455	[RISCV] Fixed test case failure due to r338047 llvm-svn: 338341	2018-07-31 00:36:28 +00:00
Zachary Turner	4f85809a84	[MS Demangler] Better demangling of template arguments. This patch fixes demangling of template aliases as template-template arguments, and also fixes function pointers and references as non-type template parameters. All of these can be properly demangled now, so I've ported over the test clang/test/CodeGenCXX/ms-template-callbacks.cpp. All of these tests pass. llvm-svn: 338340	2018-07-31 00:26:52 +00:00
Amara Emerson	1e8c164c63	[AArch64][GlobalISel] Add isel support for G_BLOCK_ADDR. Also refactors some existing code to materialize addresses for the large code model so it can be shared between G_GLOBAL_VALUE and G_BLOCK_ADDR. This implements PR36390. Differential Revision: https://reviews.llvm.org/D49903 llvm-svn: 338337	2018-07-31 00:09:02 +00:00
Amara Emerson	0e86c07077	[AArch64][GlobalISel] Make G_BLOCK_ADDR legal. Differential Revision: https://reviews.llvm.org/D49902 llvm-svn: 338336	2018-07-31 00:08:56 +00:00
Amara Emerson	6aff5a7810	[GlobalISel] Add a G_BLOCK_ADDR opcode to handle IR blockaddress constants. Differential Revision: https://reviews.llvm.org/D49900 llvm-svn: 338335	2018-07-31 00:08:50 +00:00
Zachary Turner	a51403f5cc	[MS Demangler] Add ms-return-qualifiers.test. This is a copy of the tests from clang/CodeGenCXX/ms-return-qualifiers.cpp converted to demangling tests. llvm-svn: 338330	2018-07-30 23:22:39 +00:00
Zachary Turner	931e879cef	[MS Demangler] Add rudimentary C++11 Support This patch adds support for demangling r-value references, new operators such as the ""_foo operator, lambdas, alias types, nullptr_t, and various other C++11'isms. There is 1 failing test remaining in this file, which appears to be related to back-referencing. This type of problem has the potential to get ugly so I'd rather fix it in a separate patch. Differential Revision: https://reviews.llvm.org/D50013 llvm-svn: 338324	2018-07-30 23:02:10 +00:00
Sanjay Patel	9f807f44b1	[DAGCombiner] transform sub-of-shifted-signbit to add This is exchanging a sub-of-1 with add-of-minus-1: https://rise4fun.com/Alive/plKAH This is another step towards improving select-of-constants codegen (see D48970). x86 is the motivating target, and those diffs all appear to be wins. PPC and AArch64 look neutral. I've limited this to early combining (!LegalOperations) in case a target wants to reverse it, but I think canonicalizing to 'add' is more likely to produce further transforms because we have more folds for 'add'. Differential Revision: https://reviews.llvm.org/D49924 llvm-svn: 338317	2018-07-30 22:21:37 +00:00
David Bolvansky	6737b3a6a1	[InstCombine] Fold Select with binary op Summary: Fold %A = icmp eq i8 %x, 0 %B = xor i8 %x, %z %C = select i1 %A, i8 %B, i8 %y To %C = select i1 %A, i8 %z, i8 %y Fixes https://bugs.llvm.org/show_bug.cgi?id=38345 Proof: https://rise4fun.com/Alive/43J Reviewers: lebedev.ri, spatel Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49954 llvm-svn: 338300	2018-07-30 20:38:53 +00:00
Vlad Tsyrklevich	1c7160e85f	Revert "[GVNHoist] Re-enable GVNHoist by default" This reverts commit r338240 because it was causing OOMs on the UBSan buildbot when building clang/lib/Sema/SemaChecking.cpp llvm-svn: 338297	2018-07-30 20:07:33 +00:00
Manoj Gupta	9d83ce9043	[Inline] Copy "null-pointer-is-valid" attribute in caller. Summary: Normally, inlining does not happen if the caller does not have the "null-pointer-is-valid"="true" attribute but the callee has it. However, alwaysinline may force the callee to be inlined. In this case, if the callee has the "null-pointer-is-valid"="true" attribute, copy the attribute to the caller. Reviewers: efriedma, a.elovikov, lebedev.ri, jyknight Reviewed By: efriedma Subscribers: eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D50000 llvm-svn: 338292	2018-07-30 19:33:53 +00:00
David Bolvansky	76e95865f3	[InstSimplify] [NFC] Tests for Select with AND/OR fold llvm-svn: 338285	2018-07-30 18:22:18 +00:00
Jessica Paquette	fa3bee4756	[MachineOutliner][AArch64] Add support for saving LR to a register This teaches the outliner to save LR to a register rather than the stack when possible. This allows us to avoid bumping the stack in outlined functions in some cases. By doing this, in a later patch, we can teach the outliner to do something like this: f1: ... bl OUTLINED_FUNCTION ... f2: ... move LR's contents to a register bl OUTLINED_FUNCTION move the register's contents back instead of falling back to saving LR in both cases. llvm-svn: 338278	2018-07-30 17:45:28 +00:00
Jessica Paquette	bbcc8895bb	Add machine verifier to arm64-opt-remarks-lazy-bfi Previously, I thought this was a Windows failure. Then I realized it failed on every bot that used the verifier. This makes it use the verifier always, and adds that pass to the pipeline checks so that it's consistent across all bots. llvm-svn: 338272	2018-07-30 17:13:25 +00:00
David Bolvansky	2fa7fb14ea	[DAGCombiner] Bug 31275- Extract a shift from a constant mul or udiv if a rotate can be formed Summary: Attempt to extract a shrl from a udiv or a shl from a mul if this allows a rotate to be formed. This targets cases where the input to a rotate pattern was a mul or udiv by a constant and InstCombine merged one of the shifts with the op. Patch by: sameconrad (Sam Conrad) Reviewers: RKSimon, craig.topper, spatel, lebedev.ri, javed.absar Reviewed By: lebedev.ri Subscribers: efriedma, kparzysz, llvm-commits Differential Revision: https://reviews.llvm.org/D47681 llvm-svn: 338270	2018-07-30 16:50:00 +00:00
Thomas Preud'homme	196149c943	Reapply "Fix crash on inline asm with 64bit matching input in 32bit GPR" This reapplies commit r338206 reverted by r338214 since the bug that r338206 uncovered has been fixed in r338268. Add support for inline assembly with matching input operand that do not naturally go in the register class it is constrained to (eg. double in a 32-bit GPR). Note that regular input is already handled by existing code. llvm-svn: 338269	2018-07-30 16:48:39 +00:00
Thomas Preud'homme	6c1b075299	Fix uninitialized read in ARM's PrintAsmOperand Summary: Fix read of uninitialized RC variable in ARM's PrintAsmOperand when hasRegClassConstraint returns false. This was causing inline-asm-operand-implicit-cast test to fail in r338206. Reviewers: t.p.northover, weimingz, javed.absar, chill Reviewed By: chill Subscribers: chill, eraman, kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D49984 llvm-svn: 338268	2018-07-30 16:45:40 +00:00
Jessica Paquette	7816531f3c	Attempt to fix Windows test failure caused by r338133 It seems like the pass pipeline on Windows is slightly different than on Linux and macOS. As a result, the arm64-opt-remarks-lazy-bfi test has been failing. This switches a CHECK-NEXT to a CHECK-DAG to try and get this running properly again. It'd be nice to switch it back to a CHECK-NEXT if possible, but the CHECK-NEXT lines following the line we care about (the optimization remark emitter) do a pretty good job of enforcing the ordering we want. Hopefully this works, since I don't have a Windows machine. ;) Example failure: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/11295 llvm-svn: 338267	2018-07-30 16:36:22 +00:00
Evandro Menezes	a7d48286fb	[SLC] Refactor the simplification of pow() (NFC) Use more meaningful variable names. Mostly NFC. llvm-svn: 338266	2018-07-30 16:20:04 +00:00
Simon Pilgrim	186b62c9e4	[X86] Regenerate NOBMI/BMI combine-select tests. Test cleanup for D38128 llvm-svn: 338265	2018-07-30 16:18:38 +00:00
Simon Pilgrim	2d5118432b	[X86] Regenerate PKU test to merge 32/64-bit rdpkru checks Test cleanup for D38128 llvm-svn: 338264	2018-07-30 16:15:18 +00:00
Simon Pilgrim	22ff9f94bb	[X86] Regenerate fast-isel tests. Test cleanup for D38128 llvm-svn: 338262	2018-07-30 16:13:40 +00:00
Sander de Smalen	e64206a02c	[AArch64][SVE] Asm: Enable instructions to be prefixed. This patch enables instructions that are destructive on their destination- and first source operand, to be prefixed with a MOVPRFX instruction. This patch also adds a variety of tests: - positive tests for all instructions and forms that accept a movprfx for either or both predicated and unpredicated forms. - negative tests for all instructions and forms that do not accept an unpredicated or predicated movprfx. - negative tests for the diagnostics that get emitted when a MOVPRFX instruction is used incorrectly. This is patch [2/2] in a series to add MOVPRFX instructions: - Patch [1/2]: https://reviews.llvm.org/D49592 - Patch [2/2]: https://reviews.llvm.org/D49593 Reviewers: rengolin, SjoerdMeijer, samparker, fhahn, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D49593 llvm-svn: 338261	2018-07-30 16:05:45 +00:00
David Bolvansky	403826ee0f	[InstCombine] [NFC] Added tests for Select with binop fold llvm-svn: 338257	2018-07-30 15:38:42 +00:00
Krzysztof Parzyszek	24fae50905	[Hexagon] Simplify A4_rcmp[n]eqi R, 0 Consider cases when register R is known to be zero/non-zero, or when it is defined by a C2_muxii instruction. llvm-svn: 338251	2018-07-30 14:28:02 +00:00
John Brawn	898cd398d3	Adjust opt pass pipeline tests to cope with combination of r338240 and r338242 The combination of r338240 and r338242 causes the opt pass pipeline tests to fail because of how r338242 makes BasicAA be invalidated more often. Adjust the tests to reflect this. llvm-svn: 338250	2018-07-30 14:26:24 +00:00
Matt Arsenault	de496c32a4	AMDGPU: Reduce code size with fcanonicalize (fneg x) When fcanonicalize is lowered to a mul, we can use -1.0 for free and avoid the cost of the bigger encoding for source modifiers. llvm-svn: 338244	2018-07-30 12:16:58 +00:00
Matt Arsenault	f3c9a34def	AMDGPU: Make fneg combine handle fcanonicalize llvm-svn: 338243	2018-07-30 12:16:47 +00:00
John Brawn	cd73fe8989	[BasicAA] Use PhiValuesAnalysis if available when handling phi alias By using PhiValuesAnalysis we can get all the values reachable from a phi, so we can be more precise instead of giving up when a phi has phi operands. We can't make BasicAA directly use PhiValuesAnalysis though, as the user of BasicAA may modify the function in ways that PhiValuesAnalysis can't cope with. For this optional usage to work correctly BasicAAWrapperPass now needs to be not marked as CFG-only (i.e. it is now invalidated even when CFG is preserved) due to how the legacy pass manager handles dependent passes being invalidated, namely the depending pass still has a pointer to the now-dead dependent pass. Differential Revision: https://reviews.llvm.org/D44564 llvm-svn: 338242	2018-07-30 11:52:08 +00:00
Alexandros Lamprineas	de3ca964c1	[GVNHoist] Re-enable GVNHoist by default My initial motivation for this came from https://reviews.llvm.org/D48122, where it was pointed out that my change didn't fit well in SimplifyCFG and therefore using GVNHoist was a better way to go. GVNHoist has been disabled for a while as there was a list of bugs related to it. I have fixed the following bugs: https://bugs.llvm.org/show_bug.cgi?id=37808 -> https://reviews.llvm.org/D48372 (rL337149) https://bugs.llvm.org/show_bug.cgi?id=36787 -> https://reviews.llvm.org/D49555 (rL337674) https://bugs.llvm.org/show_bug.cgi?id=37445 -> https://reviews.llvm.org/D49425 (rL337680) The next two bugs no longer occur, and it's unclear which commit fixed them: https://bugs.llvm.org/show_bug.cgi?id=36635 https://bugs.llvm.org/show_bug.cgi?id=37791 I investigated this one and proved to be unrelated to GVNHoist, but a genuine bug in NewGvn: https://bugs.llvm.org/show_bug.cgi?id=37660 To convince myself GVNHoist is in a good state I made a successful bootstrap build of LLVM. Merging this change now in order to make it to the LLVM 7.0.0 branch. Differential Revision: https://reviews.llvm.org/D49858 llvm-svn: 338240	2018-07-30 10:50:18 +00:00
Francis Visoiu Mistrih	7d003657de	[MachineOutliner][X86] Use TAILJMPd64 instead of JMP_1 for TailCall construction The machine verifier asserts with: Assertion failed: (isMBB() && "Wrong MachineOperand accessor"), function getMBB, file ../include/llvm/CodeGen/MachineOperand.h, line 542. It calls analyzeBranch which tries to call getMBB if the opcode is JMP_1, but in this case we do: JMP_1 @OUTLINED_FUNCTION I believe we have to use TAILJMPd64 instead of JMP_1 since JMP_1 is used with brtarget8. Differential Revision: https://reviews.llvm.org/D49299 llvm-svn: 338237	2018-07-30 09:59:33 +00:00
Nicolai Haehnle	7f0d05d532	AMDGPU: Force skip over s_sendmsg and exp instructions Summary: These instructions interact with hardware blocks outside the shader core, and they can have "scalar" side effects even when EXEC = 0. We don't want these scalar side effects to occur when all lanes want to skip these instructions, so always add the execz skip branch instruction for basic blocks that contain them. Also ensure that we skip scalar stores / atomics, though we don't code-gen those yet. Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D48431 Change-Id: Ieaeb58352e2789ffd64745603c14970c60819d44 llvm-svn: 338235	2018-07-30 09:23:59 +00:00
Petr Pavlu	8b6eff4e77	[ARM] Fix over-alignment in arguments that are HA of 128-bit vectors Code in `CC_ARM_AAPCS_Custom_Aggregate()` is responsible for handling homogeneous aggregates for `CC_ARM_AAPCS_VFP`. When an aggregate ends up fully on stack, the function tries to pack all resulting items of the aggregate as tightly as possible according to AAPCS. Once the first item was laid out, the alignment used for consecutive items was the size of one item. This logic went wrong for 128-bit vectors because their alignment is normally only 64 bits, and so could result in inserting unexpected padding between the first and second element. The patch fixes the problem by updating the alignment with the item size only if this results in reducing it. Differential Revision: https://reviews.llvm.org/D49720 llvm-svn: 338233	2018-07-30 08:49:30 +00:00
Zachary Turner	71c91f948a	[MS Demangler] Demangle symbols in function scopes. There are a couple of issues you run into when you start getting into more complex names, especially with regards to function local statics. When you've got something like: int x() { static int n = 0; return n; } Then this needs to demangle to something like int `int __cdecl x()'::`1'::n The nested mangled symbols (e.g. `int __cdecl x()` in the above example) also share state with regards to back-referencing, so we need to be able to re-use the demangler in the middle of demangling a symbol while sharing back-ref state. To make matters more complicated, there are a lot of ambiguities when demangling a symbol's qualified name, because a function local scope pattern (usually something like `?1??name?`) looks suspiciously like many other possible things that can occur, such as `?1` meaning the second back-ref and disambiguating these cases is rather interesting. The `?1?` in a local scope pattern is actually a special case of the more general pattern of `? + <encoded number> + ?`, where "encoded number" can itself have embedded `@` symbols, which is a common delimiter in mangled names. So we have to take care during the disambiguation, which is the reason for the overly complicated `isLocalScopePattern` function in this patch. I've added some pretty obnoxious tests to exercise all of this, which exposed several other problems related to back-referencing, so those are fixed here as well. Finally, I've uncommented some tests that were previously marked as `FIXME`, since now these work. Differential Revision: https://reviews.llvm.org/D49965 llvm-svn: 338226	2018-07-30 03:12:34 +00:00
Sanjay Patel	577c705752	[InstCombine] try to fold 'add+sub' to 'not+add' These are reassociated versions of the same pattern and similar transforms as in rL338200 and rL338118. The motivation is identical to those commits: Patterns with add/sub combos can be improved using 'not' ops. This is better for analysis and may lead to follow-on transforms because 'xor' and 'add' are commutative/associative. It can also help codegen. llvm-svn: 338221	2018-07-29 18:13:16 +00:00
Sanjay Patel	2daf28f9ce	[InstCombine] add tests for another sub-not variant; NFC llvm-svn: 338220	2018-07-29 18:07:28 +00:00
Sanjay Patel	54421ce918	[InstSimplify] fold funnel shifts with 0-shift amount llvm-svn: 338218	2018-07-29 16:36:38 +00:00
Sanjay Patel	46af5835af	[InstSimplify] add tests for funnel shift intrinsics; NFC llvm-svn: 338217	2018-07-29 16:27:17 +00:00
Jonas Devlieghere	ae1727e3dd	[dsymutil] Simplify temporary file handling. Dsymutil's update functionality was broken on Windows because we tried to rename a file while we're holding open handles to that file. TempFile provides a solution for this through its keep(Twine) method. This patch changes dsymutil to make use of that functionality. Differential revision: https://reviews.llvm.org/D49860 llvm-svn: 338216	2018-07-29 14:56:15 +00:00
Sanjay Patel	7312206f2f	revert r338206 because the test does not pass Example of bot failure: http://lab.llvm.org:8011/builders/clang-cmake-armv8-quick/builds/5107/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Ainline-asm-operand-implicit-cast.ll llvm-svn: 338214	2018-07-29 14:30:49 +00:00
Sander de Smalen	ad88a99956	[AArch64][SVE] Asm: Support for WHILE(LE|LO|LS|LT) instructions. The WHILE instructions generate a predicate that is true while the comparison of the first scalar operand (incremented for each predicate element) with the second scalar operand is true and false thereafter. WHILELE While incrementing signed scalar less than or equal to scalar WHILELO While incrementing unsigned scalar lower than scalar WHILELS While incrementing unsigned scalar lower than or same as scalar WHILELT While incrementing signed scalar less than scalar e.g. whilele p0.s, x0, x1 generates predicate p0 (for 32bit elements) by incrementing (signed) x0 and comparing that vector to splat(x1). llvm-svn: 338211	2018-07-29 08:51:08 +00:00
Sander de Smalen	e70ed3187c	[AArch64][SVE] Asm: Instructions to perform serialized operations. The instructions added in this patch permit active elements within a vector to be processed sequentially without unpacking the vector. PFIRST Set the first active element to true. PNEXT Find next active element in predicate. CTERMEQ Compare and terminate loop when equal. CTERMNE Compare and terminate loop when not equal. llvm-svn: 338210	2018-07-29 08:00:16 +00:00
Thomas Preud'homme	74ffd14e15	Fix crash on inline asm with 64bit matching input in 32bit GPR Add support for inline assembly with matching input operands that do not naturally go in the register class they are constrained to (e.g. double in a 32-bit GPR). Note that regular input is already handled by existing code. llvm-svn: 338206	2018-07-28 21:33:39 +00:00
David Bolvansky	f801a58c0d	[InstCombine] Tests for fold Select with binary op Differential Revision: https://reviews.llvm.org/D49961 llvm-svn: 338201	2018-07-28 17:13:33 +00:00
Sanjay Patel	818b253d3a	[InstCombine] try to fold 'sub' to 'not' https://rise4fun.com/Alive/jDd Patterns with add/sub combos can be improved using 'not' ops. This is better for analysis and may lead to follow-on transforms because 'xor' and 'add' are commutative/associative. It can also help codegen. llvm-svn: 338200	2018-07-28 16:48:44 +00:00
Sander de Smalen	5b3a289424	[AArch64][SVE] Asm: Support for PFALSE and PTEST instructions. This patch adds PFALSE (unconditionally sets all elements of the predicate to false) and PTEST (set the status flags for the predicate). llvm-svn: 338198	2018-07-28 14:18:11 +00:00
Matt Arsenault	8f9dde94b7	AMDGPU: Stop wasting argument registers with v3i32/v3f32 SelectionDAGBuilder widens v3i32/v3f32 arguments to v4i32/v4f32, which consume an additional register. In addition to wasting argument space, this produces extra instructions since now it appears the 4th vector component has a meaningful value to most combines. llvm-svn: 338197	2018-07-28 14:11:34 +00:00
Sander de Smalen	3878bf83dd	[AArch64][SVE] Asm: Data-dependent loop predicate partitioning instructions. This patch adds support for instructions that partition a predicate based on data-dependent termination conditions in a loop. BRKA Break after the first true condition BRKAS Break after the first true condition, setting condition flags BRKB Break before the first true condition BRKBS Break before the first true condition, setting condition flags BRKPA Break after the first true condition, propagating from the previous partition BRKPAS Break after the first true condition, propagating from the previous partition, setting condition flags BRKPB Break before the first true condition, propagating from the previous partition BRKPBS Break before the first true condition, propagating from the previous partition, setting condition flags BRKN Propagate break to next partition BRKNS Propagate break to next partition, setting condition flags llvm-svn: 338196	2018-07-28 14:04:52 +00:00
David Bolvansky	9800b710c2	[InstSimplify] Moved Select + AND/OR tests from InstCombine Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49957 llvm-svn: 338195	2018-07-28 13:52:45 +00:00
Matt Arsenault	72b0e38b26	AMDGPU: Stop trying to extend arguments for clover This was trying to replace i8/i16 arguments with i32, which was broken and no longer necessary. llvm-svn: 338193	2018-07-28 12:34:25 +00:00
David Green	fc4b0fe0a2	[GlobalOpt] Test array indices inside structs for out-of-bounds accesses We now, from clang, can turn arrays of static short g_data[] = {16, 16, 16, 16, 16, 16, 16, 16, 0, 0, 0, 0, 0, 0, 0, 0}; into structs of the form @g_data = internal global <{ [8 x i16], [8 x i16] }> ... GlobalOpt will incorrectly SROA it, not realising that the access to the first element may overflow into the second. This fixes it by checking geps more thoroughly. I believe this makes the globalsra-partial.ll test case invalid as the %i value could be out of bounds. I've re-purposed it as a negative test for this case. Differential Revision: https://reviews.llvm.org/D49816 llvm-svn: 338192	2018-07-28 08:20:10 +00:00
David Bolvansky	f947608ddf	[InstCombine] Fold Select with AND/OR condition Summary: Fold ``` %A = icmp ne i8 %X, %V1 %B = icmp ne i8 %X, %V2 %C = or i1 %A, %B %D = select i1 %C, i8 %X, i8 %V1 ret i8 %D => ret i8 %X ``` Fixes https://bugs.llvm.org/show_bug.cgi?id=38334 Proof: https://rise4fun.com/Alive/plI8 Reviewers: spatel, lebedev.ri Reviewed By: lebedev.ri Subscribers: craig.topper, llvm-commits Differential Revision: https://reviews.llvm.org/D49919 llvm-svn: 338191	2018-07-28 06:55:51 +00:00
Craig Topper	50b1d4303d	[DAGCombiner] Teach DAG combiner that A-(B-C) can be folded to A+(C-B) This can be useful since addition is commutable, and subtraction is not. This matches a transform that is also done by InstCombine. llvm-svn: 338181	2018-07-28 00:27:25 +00:00
Wouter van Oortmerssen	a90d24da1c	Revert "[WebAssembly] Added default stack-only instruction mode for MC." This reverts commit d3c9af4179eae7793d1487d652e2d4e23844555f. (SVN revision 338164) llvm-svn: 338176	2018-07-27 23:19:51 +00:00
Craig Topper	c3e11bf3f7	[X86] Add support for expanding multiplies by constant where the constant is -3/-5/-9 multiplied by a power of 2. These can be replaced with an LEA, a shift, and a negate. This seems to match what gcc and icc would do. llvm-svn: 338174	2018-07-27 23:04:59 +00:00
Reid Kleckner	ba82788ff6	[InstrProf] Don't register __llvm_profile_runtime_user Refactor some FileCheck prefixes while I'm at it. Fixes PR38340 llvm-svn: 338172	2018-07-27 22:21:35 +00:00
Wouter van Oortmerssen	a67c4137c3	[WebAssembly] Added default stack-only instruction mode for MC. Summary: Moved Explicit Locals pass to last. Made that pass obligatory. Made it convert from register to stack based instructions, and removed the registers. Fixes to related code that was expecting register based instructions. Added the correct testing flag to all tests, depending on what the format they were expecting so far. Translated one test to stack format as example: reg-stackify-stack.ll tested: llvm-lit -v `find test -name WebAssembly` unittests/MC/* Reviewers: dschuff, sunfish Subscribers: sbc100, jgravelle-google, eraman, aheejin, llvm-commits Differential Revision: https://reviews.llvm.org/D49160 llvm-svn: 338164	2018-07-27 20:56:43 +00:00
David Bolvansky	173484d78c	[InstCombine] [NFC] [Tests] Fold Select with AND/OR condition - fixed Differential Revision: https://reviews.llvm.org/D49933 llvm-svn: 338161	2018-07-27 20:29:32 +00:00
Jessica Paquette	f90edbe3d6	Recommit "Enable MachineOutliner by default under -Oz for AArch64" Fixed the ASAN failure from before in r338148, so recommitting. This patch enables the MachineOutliner by default in AArch64 under -Oz. The MachineOutliner offers around a 4.5% improvement on the current -Oz code size improvements. We have done work into improving the debuggability of outlined code, so that users of -Oz won't be surprised by the optimization. We have also been executing the LLVM test suite and common external tests such as the SPEC suites continuously with no issue. The outliner has a low compile-time overhead of roughly 1%. At this point, the outliner would be a really good addition to the -Oz pass pipeline! llvm-svn: 338160	2018-07-27 20:18:27 +00:00
David Bolvansky	1b82617473	[InstCombine] [NFC] [Tests] Fold Select with AND/OR condition Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49932 llvm-svn: 338159	2018-07-27 20:18:12 +00:00
Evandro Menezes	f611ce82c0	[SLC] Test simplification of pow(x, 0.333...) to cbrt(x) (NFC) Add test case for simplifying `pow(x, 0.333...)` into `cbrt(x)`, which D49040 enables. llvm-svn: 338152	2018-07-27 18:56:47 +00:00
Sanjay Patel	06c7d5aef6	[AArch64, PowerPC, x86] add more signbit math tests; NFC The tests with a constant sub operand were added with rL338143, but the potential transform doesn't have that requirement, so adding more tests with variable operands. llvm-svn: 338150	2018-07-27 18:31:21 +00:00
Evandro Menezes	fcca45f0dd	[ARM] Add new target feature to fuse literal generation This feature enables the fusion of such operations on Cortex A57 and Cortex A72, as recommended in their Software Optimisation Guides, sections 4.14 and 4.11, respectively. Differential revision: https://reviews.llvm.org/D49563 llvm-svn: 338147	2018-07-27 18:16:47 +00:00
Sanjay Patel	efac39eef6	[AArch64, PowerPC, x86] add more signbit math tests; NFC llvm-svn: 338143	2018-07-27 18:12:29 +00:00
Jessica Paquette	faea2d3130	Revert "Enable MachineOutliner by default under -Oz for AArch64" It failed an Asan test on a bot: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/21543/steps/check-llvm%20asan/logs/stdio Fixing that before recommitting. llvm-svn: 338136	2018-07-27 17:25:38 +00:00
Yonghong Song	04ccfda075	bpf: add missing RegState to notify the MachineInstr verifier of necessary register usage Errors like the following are reported by: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/11261 * Bad machine code: Explicit definition marked as use * - function: cal_align1 - basic block: %bb.0 entry (0x47edd98) - instruction: LDB $r3, $r2, 0 - operand 0: $r3 This is because RegState info was missing for ScratchReg inside expandMEMCPY. This gave the MachineInstr verifier incomplete register usage information, so it complained: there could be a code-gen issue if the flagged MachineInstr were used where register usage information matters, even though the memcpy expansion is not such a case because it happens at the last stage of the IR optimization pipeline. We should always specify register usage information that the compiler cannot deduce automatically whenever we add a hardware register manually. Reported-by: Builder llvm-clang-x86_64-expensive-checks-win Build #11261 Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 338134	2018-07-27 16:58:52 +00:00
Jessica Paquette	d4229b985c	Enable MachineOutliner by default under -Oz for AArch64 This patch enables the MachineOutliner by default in AArch64 under -Oz. The MachineOutliner offers around a 4.5% improvement on the current -Oz code size improvements. We have done work into improving the debuggability of outlined code, so that users of -Oz won't be surprised by the optimization. We have also been executing the LLVM test suite and common external tests such as the SPEC suites continuously with no issue. The outliner has a low compile-time overhead of roughly 1%. At this point, the outliner would be a really good addition to the -Oz pass pipeline! llvm-svn: 338133	2018-07-27 16:44:42 +00:00
Sanjay Patel	c7abb416dc	[DAGCombiner] fold 'not' with signbit math This is a follow-up suggested in D48970. Alive proofs: https://rise4fun.com/Alive/sII We can eliminate an instruction in the usual select-of-constants to bit hack transform by adjusting the add/sub with constant. This is always a win. There are more transforms that are likely wins, but they may need target hooks in case some targets do not benefit. This is another step towards making up for canonicalizing to select-of-constants in rL331486. llvm-svn: 338132	2018-07-27 16:42:55 +00:00
Sanjay Patel	1812d33e22	[x86] add more tests for signbit math; NFC llvm-svn: 338131	2018-07-27 16:22:40 +00:00
Sanjay Patel	60c04b961e	[PowerPC] add more tests for signbit math; NFC llvm-svn: 338130	2018-07-27 16:22:18 +00:00
Sanjay Patel	f815bc658b	[AArch64] add more tests for signbit math; NFC llvm-svn: 338129	2018-07-27 16:21:56 +00:00
Jan Vesely	6ff58ed5ca	AMDGPU/R600: Add MOV instructions to BFE patterns R600 can't handle immediates for BFE, these will be eliminated later. Fixes powr/pow regressions on r600 since r334817 Differential Revision: https://reviews.llvm.org/D49641 llvm-svn: 338127	2018-07-27 15:00:13 +00:00
Sander de Smalen	a703b8dc71	[AArch64][SVE] Asm: Predicated integer reductions. This patch adds support for various integer reduction operations: SADDV signed add reduction to scalar UADDV unsigned add reduction to scalar SMAXV signed maximum reduction to scalar SMINV signed minimum reduction to scalar UMAXV unsigned maximum reduction to scalar UMINV unsigned minimum reduction to scalar ANDV logical AND reduction to scalar ORV logical OR reduction to scalar EORV logical EOR reduction to scalar The reduction is predicated, e.g. smaxv s0, p0, z1.s performs a signed maximum reduction on active elements in z1, and stores the (signed max value) result in s0. llvm-svn: 338126	2018-07-27 14:24:55 +00:00
Sander de Smalen	fcb636d222	[AArch64][SVE] Asm: Predicated floating point reductions. This patch adds support for various floating-point reduction operations: FADDA strictly-ordered add reduction, accumulating in scalar FADDV recursive add reduction to scalar FMAXV recursive max reduction to scalar FMINV recursive min reduction to scalar FMAXNMV recursive max number reduction to scalar FMINNMV recursive min number reduction to scalar The reduction is predicated, e.g. fadda d0, p0, d0, z1.d performs the add-reduction in strict order on active elements in z1, accumulating into d0. faddv d0, p0, z1.d performs the add-reduction (not in strict order) on active elements in z1, storing the result in d0. llvm-svn: 338123	2018-07-27 13:58:48 +00:00
Sander de Smalen	88e154ff90	[AArch64][SVE] Asm: Support for FEXPA and FTSSEL. This patch adds support for transcendental acceleration instructions 'FEXPA' (exponential accelerator) and 'FTSSEL' (trigonometric select coefficient). llvm-svn: 338121	2018-07-27 12:40:09 +00:00
Sander de Smalen	71929e7cad	[AArch64][SVE] Asm: Support for FRECPE and FRSQRTE. Support for floating-point instructions for reciprocal estimate (FRECPE) and reciprocal square root estimate (FRSQRTE). llvm-svn: 338120	2018-07-27 12:26:24 +00:00
Sanjay Patel	78e4b4d3c4	[InstCombine] not(sub X, Y) --> add (not X), Y The tests with constants show a missing optimization. Analysis for adds is better than subs, so this can also help with other transforms. And codegen is better with adds for targets like x86 (destructive ops, no sub-from). https://rise4fun.com/Alive/llK llvm-svn: 338118	2018-07-27 10:54:48 +00:00
Sanjay Patel	eee52b5090	[InstCombine] add tests for not+sub; NFC llvm-svn: 338117	2018-07-27 10:45:04 +00:00
Max Kazantsev	4d980515d2	[SimplifyIndVar] Canonicalize comparisons to unsigned while eliminating truncs This is a follow-up for the patch rL335020. When we replace compares against trunc with compares against wide IV, we can also replace signed predicates with unsigned where it is legal. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D48763 llvm-svn: 338115	2018-07-27 09:43:39 +00:00
Matt Arsenault	0183c56c11	AMDGPU: Fix code size for return_to_epilog pseudo llvm-svn: 338113	2018-07-27 09:15:03 +00:00
Anastasis Grammenos	f6e143e67f	Revert "[LV][DebugInfo] Set DL to the middle block Icmp instruction" This reverts commit r338106. llvm-svn: 338109	2018-07-27 08:22:54 +00:00
Hiroshi Inoue	eeab694cea	[InstSimplify] tests for D48828: fold extraction from std::pair This commit includes unit tests for D48828, which enhances InstSimplify to enable jump threading with a method whose return type is std::pair<int, bool> or std::pair<bool, int>. I am going to commit the actual transformation later. llvm-svn: 338107	2018-07-27 07:21:02 +00:00
Anastasis Grammenos	03948d0e0f	[LV][DebugInfo] Set DL to the middle block Icmp instruction Reviewers: hsaito Differential Revision: https://reviews.llvm.org/D49746 llvm-svn: 338106	2018-07-27 07:12:44 +00:00
Tom Stellard	e9bdc5f1d8	AMDGPU/GlobalISel: Fix crash in regbankselect on non-power-of-2 types Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D49624 llvm-svn: 338102	2018-07-27 06:04:40 +00:00
Craig Topper	561e298e29	[X86] Remove an unnecessary 'if' that prevented treating INT64_MAX and -INT64_MAX as power of 2 minus 1 in the multiply expansion code. Not sure why they were being explicitly excluded, but I believe all the math inside the if works. I changed the absolute value to be uint64_t instead of int64_t so INT64_MIN+1 wouldn't be signed wrap. llvm-svn: 338101	2018-07-27 05:56:27 +00:00
Bob Haarman	eae4742d81	[LTO] Don't internalize declarations Summary: Some links were failing with "Global is external, but doesn't have external or weak linkage!" in ThinLTO builds with debug information. This happened when we elide the body of a global that is referenced by debug info. This results in a declaration, which we would then internalize - but declarations cannot be internal. This change avoids the problem by not internalizing these declarations. Fixes PR38046. Reviewers: pcc, tejohnson Subscribers: mehdi_amini, aprantl, hiraditya, JDevlieghere, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D49777 llvm-svn: 338100	2018-07-27 05:40:29 +00:00
Craig Topper	e364baa88b	[X86] Add matching for another pattern of PMADDWD. Summary: This is the pattern you get from the loop vectorizer for something like this int16_t A[1024]; int16_t B[1024]; int32_t C[512]; void pmaddwd() { for (int i = 0; i != 512; ++i) C[i] = (A[2*i]*B[2*i]) + (A[2*i+1]*B[2*i+1]); } In this case we will have (add (mul (build_vector), (build_vector)), (mul (build_vector), (build_vector))). This is different than the pattern we currently match which has the build_vectors between an add and a single multiply. I'm not sure what C code would get you that pattern. Reviewers: RKSimon, spatel, zvi Reviewed By: zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49636 llvm-svn: 338097	2018-07-27 04:29:10 +00:00
Chen Zheng	567485a72f	[InstCombine] canonicalize abs pattern Differential Revision: https://reviews.llvm.org/D48754 llvm-svn: 338092	2018-07-27 01:49:51 +00:00
Craig Topper	f7bc550223	[X86] When removing sign extends from gather/scatter indices, make sure we handle UpdateNodeOperands finding an existing node to CSE with. If this happens the operands aren't updated and the existing node is returned. Make sure we pass this existing node up to the DAG combiner so that a proper replacement happens. Otherwise we get stuck in an infinite loop with an unoptimized node. llvm-svn: 338090	2018-07-27 00:00:30 +00:00
Craig Topper	1a40a06549	[SelectionDAGBuilder] Add masked loads to PendingLoads rather than calling DAG.setRoot. Masked loads are calling DAG.getRoot rather than calling SelectionDAGBuilder::getRoot, which means the PendingLoads weren't emptied to update the root and create any needed TokenFactor. So it would be incorrect to call setRoot for the masked load. This patch instead adds the masked load to PendingLoads so that the root doesn't get updated until a store or scatter or something happens. Alternatively, we could call SelectionDAGBuilder::getRoot before it, but that would create unnecessary serialization. llvm-svn: 338085	2018-07-26 23:22:11 +00:00
Reid Kleckner	c6cf918f44	[InstrProf] Use comdats on COFF for available_externally functions Summary: r262157 added ELF-specific logic to put a comdat on the __profc_* globals created for available_externally functions. We should be able to generalize that logic to all object file formats that support comdats, i.e. everything other than MachO. This fixes duplicate symbol errors, since on COFF, linkonce_odr doesn't make the symbol weak. Fixes PR38251. Reviewers: davidxl, xur Subscribers: hiraditya, dmajor, llvm-commits, aheejin Differential Revision: https://reviews.llvm.org/D49882 llvm-svn: 338082	2018-07-26 22:59:17 +00:00
Wolfgang Pieb	9ea65082ff	[DWARF v5] Reposting r337981, which was reverted in r337997 due to a test failure in debuginfo_tests. The test failure was caused by the compiler not emitting a __debug_ranges section with DWARF 4 and earlier when no ranges are needed. The test checks for the existence regardless. llvm-svn: 338081	2018-07-26 22:48:52 +00:00
Zachary Turner	23df1319ca	[MS Demangler] Properly handle function parameter back-refs. Properly demangle function parameter back-references. Previously we treated lists of function parameters and template parameters the same. There are some important differences with regards to back-references, and some less important differences regarding which characters can appear before or after the name. The important differences are that with a given type T, all instances of a function parameter list share the same global back-ref table. Specifically, if X and Y are function pointers, then there are 3 entities in the declaration X func(Y) which all affect and are affected by the master parameter back-ref table: 1) The parameter list of X's function type 2) the parameter list of func itself 3) The parameter list of Y's function type. The previous code would create a back-reference table that was local to a single parameter list, so it would not be shared across parameter lists. This was discovered when porting ms-back-references.test from clang's mangling tests. All of these tests should now pass with the new changes. In doing so, I split the function for parsing template and function parameters into two separate functions. This makes the template parameter list parsing code in particular very small and easy to understand now. Differential Revision: https://reviews.llvm.org/D49875 llvm-svn: 338075	2018-07-26 22:13:39 +00:00
Keno Fischer	864fbd8e9a	[SCEV] Don't expand Wrap predicate using inttoptr in ni addrspaces Summary: In non-integral address spaces, we're not allowed to introduce inttoptr/ptrtoint intrinsics. Instead, we need to expand any pointer arithmetic as geps on the base pointer. Luckily this is a common task for SCEV, so all we have to do here is hook up the corresponding helper function and add test case. Fixes PR38290 Reviewers: sanjoy Differential Revision: https://reviews.llvm.org/D49832 llvm-svn: 338073	2018-07-26 21:55:06 +00:00
Vedant Kumar	b572f64212	[DebugInfo] LowerDbgDeclare: Add derefs when handling CallInst users LowerDbgDeclare inserts a dbg.value before each use of an address described by a dbg.declare. When inserting a dbg.value before a CallInst use, however, it fails to append DW_OP_deref to the DIExpression. The DW_OP_deref is needed to reflect the fact that a dbg.value describes a source variable directly (as opposed to a dbg.declare, which relies on pointer indirection). This patch adds in the DW_OP_deref where needed. This results in the correct values being shown during a debug session for a program compiled with ASan and optimizations (see https://reviews.llvm.org/D49520). Note that ConvertDebugDeclareToDebugValue is already correct -- no changes there were needed. One complication is that SelectionDAG is unable to distinguish between direct and indirect frame-index (FRAMEIX) SDDbgValues. This patch also fixes this long-standing issue in order to not regress integration tests relying on the incorrect assumption that all frame-index SDDbgValues are indirect. This is a necessary fix: the newly-added DW_OP_derefs cannot be lowered properly otherwise. Basically the fix prevents a direct SDDbgValue with DIExpression(DW_OP_deref) from being dereferenced twice by a debugger. There were a handful of tests relying on this incorrect "FRAMEIX => indirect" assumption which actually had incorrect DW_AT_locations: these are all fixed up in this patch. Testing: - check-llvm, and an end-to-end test using lldb to debug an optimized program. - Existing unit tests for DIExpression::appendToStack fully cover the new DIExpression::append utility. - check-debuginfo (the debug info integration tests) Differential Revision: https://reviews.llvm.org/D49454 llvm-svn: 338069	2018-07-26 20:56:53 +00:00
Zachary Turner	024e1762aa	[MS Demangler] Print calling convention inside parentheses. For function pointers, we would print something like int __cdecl (*)(int) We need to move the calling convention inside, and print int (__cdecl *)(int) This patch implements this change for regular function pointers as well as member function pointers. llvm-svn: 338068	2018-07-26 20:33:48 +00:00
Zachary Turner	ca7aef10c4	[MS Demangler] Add ms-arg-qualifiers.test This converts the arg qualifier mangling tests from clang/CodeGenCXX/mangle-ms-arg-qualifiers.cpp to demangling tests. Most tests already pass, so this patch doesn't come with any functional change, just the addition of new tests. The few tests that don't pass are left in with a FIXME label so that they don't run but serve as documentation about what still doesn't work. llvm-svn: 338067	2018-07-26 20:25:35 +00:00
Zachary Turner	f4c4519532	Add missing tests from ms-mangle.cpp. None of these tests pass yet so they are commented out, but I'm adding them with a FIXME label so that they don't get lost when copying tests over from clang's mangling tests. Currently these tests are all commented out. llvm-svn: 338066	2018-07-26 20:20:29 +00:00
Zachary Turner	38b78a7f0e	[MS Demangler] Demangle pointers to member functions. After this patch, we can now properly demangle pointers to member functions. The calling convention is located in the wrong place, but this will be fixed in a followup since it also affects non member function pointers. Differential Revision: https://reviews.llvm.org/D49639 llvm-svn: 338065	2018-07-26 20:20:10 +00:00
Martin Storsjo	390bce4322	[MC] Add support for the .rva assembler directive for COFF targets Even though gas doesn't document it, it has been supported there for a very long time. This produces the 32 bit relative virtual address (aka image relative address) for a given symbol. ".rva foo" is essentially equal to ".long foo@imgrel". Differential Revision: https://reviews.llvm.org/D49821 llvm-svn: 338063	2018-07-26 20:11:26 +00:00
Stephen Hines	e6e75bf84c	Handle the lack of a symbol table correctly. Summary: These two cases will trigger a dereference on a nullptr, since the SymbolTable can be nonexistent for a given library, in addition to just being empty. Reviewers: alexshap Reviewed By: alexshap Subscribers: meikeb, kongyi, chh, jakehehrlich, llvm-commits, pirama Differential Revision: https://reviews.llvm.org/D49534 llvm-svn: 338062	2018-07-26 20:05:31 +00:00
Zachary Turner	d742d645a1	[MS Demangler] Demangle data member pointers. Differential Revision: https://reviews.llvm.org/D49630 llvm-svn: 338061	2018-07-26 19:56:09 +00:00
Scott Linder	eb1f75d561	[AMDGPU] Fix VGPR spills where offset doesn't fit in 12 bits Scale the offset of VGPR spills by the wave size when it cannot fit in the 12-bit offset immediate field and so is added to the soffset SGPR. This accounts for hardware swizzling of scratch memory. Differential Revision: https://reviews.llvm.org/D49448 llvm-svn: 338060	2018-07-26 19:47:51 +00:00
Sanjay Patel	6d6eab66e0	[InstCombine] fold udiv with common factor from muls with nuw Unfortunately, sdiv isn't as simple because of UB due to overflow. This fold is mentioned in PR38239: https://bugs.llvm.org/show_bug.cgi?id=38239 llvm-svn: 338059	2018-07-26 19:22:41 +00:00
Ana Pazos	2e4106b73d	[RISCV] Add support for _interrupt attribute - Save/restore only registers that are used. This includes Callee saved registers and Caller saved registers (arguments and temporaries) for integer and FP registers. - If there is a call in the interrupt handler, save/restore all Caller saved registers (arguments and temporaries) and all FP registers. - Emit special return instructions depending on "interrupt" attribute type. Based on initial patch by Zhaoshi Zheng. Reviewers: asb Reviewed By: asb Subscribers: rkruppe, the_o, MartinMosbeck, brucehoult, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, llvm-commits Differential Revision: https://reviews.llvm.org/D48411 llvm-svn: 338047	2018-07-26 17:49:43 +00:00
Matthias Braun	09810c9269	MacroFusion: Fix macro fusion with ExitSU failing in top-down scheduling When fusing instructions A and B, we must add all predecessors of B as predecessors of A to avoid instructions getting scheduled in between. There is a special case involving ExitSU: Every other node must be scheduled before it by design and we don't need to make this explicit in the graph, however when fusing with a different node we need to schedule every other node before the fused node too and we need to make this explicit now: This patch adds a dependency from the fused node to all roots in the graph. Differential Revision: https://reviews.llvm.org/D49830 llvm-svn: 338046	2018-07-26 17:43:56 +00:00
Roman Lebedev	41ba5c1455	[DAGCombine] optimizeSetCCOfSignedTruncationCheck(): handle ule,ugt CondCodes. Summary: A follow-up for D49266 / rL337166. At least one of these cases is more canonical, so we really do have to handle it. https://godbolt.org/g/pkzP3X https://rise4fun.com/Alive/pQyhZZ We won't get to these cases with I1 being -1, as that will be constant-folded to true or false. I'm also not sure we actually hit the 'ule' case, but I think the worst thing that could happen is that it ends up being dead code. Reviewers: spatel, craig.topper, RKSimon, javed.absar, efriedma Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49497 llvm-svn: 338044	2018-07-26 17:34:28 +00:00
Alexey Bataev	4dd7558fab	[DEBUGINFO, NVPTX] Emit correct debug information for local variables. Summary: NVPTX target does not use register-based frame information. Instead it relies on the artificial local_depot that is used instead of the frame and the data for variables must be emitted relatively to this local_depot. Reviewers: tra, jlebar, echristo Subscribers: jholewinski, aprantl, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D45963 llvm-svn: 338039	2018-07-26 16:29:52 +00:00
Sanjay Patel	b381e7e59a	[InstCombine] add tests for udiv with common factor; NFC This fold is mentioned in PR38239: https://bugs.llvm.org/show_bug.cgi?id=38239 The general case probably belongs in -reassociate, but given that we do basic reassociation optimizations similar to this in instcombine already, we might as well be consistent within instcombine and handle this pattern? llvm-svn: 338038	2018-07-26 16:14:53 +00:00
Alexey Bataev	7ae86fe71c	[DEBUGINFO, NVPTX] Set `DW_AT_frame_base` to `DW_OP_call_frame_cfa`. Summary: For NVPTX target the value of `DW_AT_frame_base` attribute must be set to `DW_OP_call_frame_cfa`. Reviewers: tra, jlebar, echristo Subscribers: jholewinski, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D45785 llvm-svn: 338036	2018-07-26 16:10:05 +00:00
Jonas Devlieghere	640e790af2	[test] Disable dsymutil update test on windows Apparently, the issue with dsymutil update functionality on Windows was that Windows doesn't like dsymutil renaming files that have open handles to them. This disables the new accelerator test and updates the comment in the other two tests. We should be able to enable the tests again once we have updated the implementation to use TempFile::keep() to keep the temporary files in MachOUtils. A big thank you to Jeremy Morse from Sony for figuring this out and bringing it to my attention. llvm-svn: 338030	2018-07-26 14:16:19 +00:00
Luke Cheeseman	66b5e7da4c	Enable some pointer authentication instructions for aarch64 v8a targets - Some of the v8.3 pointer authentication instructions inhabit the Hint space - These instructions can be assembled to hint instructions which act as NOP instructions prior to v8.3 - This patch permits using the hint instructions for all v8a targets - Also, correct the RETA{A,B} instructions to match the instruction attributes of RET (set isTerminator and isBarrier) Differential Revision: https://reviews.llvm.org/D49786 llvm-svn: 338029	2018-07-26 14:00:50 +00:00
Stefan Maksimovic	4a612d4bf2	[mips] Sign extend i32 return values on MIPS64 Override getTypeForExtReturn so that functions returning an i32 typed value have it sign extended on MIPS64. Also provide patterns to get rid of unneeded sign extensions for arithmetic instructions which implicitly sign extend their results. Differential Revision: https://reviews.llvm.org/D48374 llvm-svn: 338019	2018-07-26 10:59:35 +00:00
Martin Storsjo	9dafd6f6d9	Revert "[COFF] Use comdat shared constants for MinGW as well" This reverts commit r337951. While that kind of shared constant generally works fine in a MinGW setting, it broke some cases of inline assembly that worked before: $ cat const-asm.c int MULH(int a, int b) { int rt, dummy; __asm__ ( "imull %3" :"=d"(rt), "=a"(dummy) :"a"(a), "rm"(b) ); return rt; } int func(int a) { return MULH(a, 1); } $ clang -target x86_64-win32-gnu -c const-asm.c -O2 const-asm.c:4:9: error: invalid variant '00000001' "imull %3" ^ <inline asm>:1:15: note: instantiated into assembly here imull __real@00000001(%rip) ^ A similar error is produced for i686 as well. The same test with a target of x86_64-win32-msvc or i686-win32-msvc works fine. llvm-svn: 338018	2018-07-26 10:48:20 +00:00
Jonas Devlieghere	f290256dfb	[test] Do dsymutil update in place Update the dSYM bundle in place when swapping out the accelerator tables. This should unbreak the windows bots that have been failing with an access denied error. llvm-svn: 338014	2018-07-26 09:23:10 +00:00
Sjoerd Meijer	31d38586e7	[AArch64][NFC] Removed tab characters from test files. llvm-svn: 338011	2018-07-26 07:59:39 +00:00
Sjoerd Meijer	dc198344ce	[AArch64] Armv8.2-A: add the crypto extensions This adds MC support for the crypto instructions that were made optional extensions in Armv8.2-A (AArch64 only). Differential Revision: https://reviews.llvm.org/D49370 llvm-svn: 338010	2018-07-26 07:13:59 +00:00
Fangrui Song	f2822e2d9d	[ConstProp] Fix calls-math-finite.ll on FreeBSD FreeBSD's log(3.0) is less precise than glibc and musl. Let's forgive its rounding error of more than half an ulp. llvm-svn: 338009	2018-07-26 06:24:11 +00:00
Fangrui Song	c32561ea8b	[AsmParser] Fix preserve-comments-crlf.s on FreeBSD --strip-trailing-cr is a diffutils option which is also available on BSD-licensed diff introduced in FreeBSD 11.2, however, it has a bug comparing files mixing \r and \r\n. Use -b (POSIX) instead. llvm-svn: 338008	2018-07-26 06:07:03 +00:00
Craig Topper	4e687d5bb2	[X86] Don't use CombineTo to skip adding new nodes to the DAGCombiner worklist in combineMul. I'm not sure if this was trying to avoid optimizing the new nodes further or what. Or maybe to prevent a cycle if something tried to reform the multiply? But I don't think it's a reliable way to do that. If the user of the expanded multiply is visited by the DAGCombiner after this conversion happens, the DAGCombiner will check its operands, see that they haven't been visited by the DAGCombiner before and it will then add the first node to the worklist. This process will repeat until all the new nodes are visited. So this seems like an unreliable prevention at best. So this patch just returns the new nodes like any other combine. If this starts causing problems we can try to add target specific nodes or something to more directly prevent optimizations. Now that we handle the combine normally, we can combine any negates the mul expansion creates into their users since those will be visited now. llvm-svn: 338007	2018-07-26 05:40:10 +00:00
Alex Lorenz	7d808c19ff	Revert r337981: it breaks the debuginfo-tests This commit caused a regression in the debuginfo-tests: FAIL: debuginfo-tests :: apple-accel.cpp (40748 of 46595) llvm-svn: 337997	2018-07-26 03:21:40 +00:00
Amara Emerson	fdd089aa14	[GlobalISel] Fall back to SDISel for swifterror/swiftself attributes. We don't currently support these, fall back until we do. llvm-svn: 337994	2018-07-26 01:25:58 +00:00
Wolfgang Pieb	1d56b4ae40	[DWARF v5] Don't report an error when the .debug_rnglists section is empty or non-existent. Fixes PR38297. Reviewer: JDevlieghere Differential Revision: https://reviews.llvm.org/D49815 llvm-svn: 337993	2018-07-26 01:12:41 +00:00
Wolfgang Pieb	c42087df7c	[DWARF v5] Don't emit multiple DW_AT_rnglists_base attributes. Some refactoring of range lists emissions and added test cases. Reviewer: dblaikie Differential Revision: https://reviews.llvm.org/D49522 llvm-svn: 337981	2018-07-25 23:03:22 +00:00
Jonas Devlieghere	743d351120	[dsymutil] Add support for generating DWARF5 accelerator tables. This patch adds support for emitting DWARF5 accelerator tables (.debug_names) from dsymutil. Just as with the Apple style accelerator tables, it's possible to update existing dSYMs. This patch includes a test that shows how you can convert back and forth between the two types. If no kind of table is specified, dsymutil will default to generating Apple-style accelerator tables whenever it finds those in its input. The same is true when there are no accelerator tables at all. Finally, in the remaining case, where there's at least one DWARF v5 table and no Apple ones, the output will contain a DWARF accelerator table (.debug_names). Differential revision: https://reviews.llvm.org/D49137 llvm-svn: 337980	2018-07-25 23:01:38 +00:00
Yonghong Song	71d81e5c8f	bpf: new option -bpf-expand-memcpy-in-order to expand memcpy in order Some BPF JIT backends would want to optimize memcpy in their own architecture specific way. However, at the moment, there is no way for JIT backends to see memcpy semantics in a reliable way. This is because the LLVM BPF backend expands memcpy into load/store sequences and could possibly schedule them further apart from each other. So, BPF JIT backends inside kernel can't reliably recognize memcpy semantics by peephole BPF sequence. This patch introduces new intrinsic expansion infrastructure for memcpy. To get a stable in-order load/store sequence from memcpy, we first lower memcpy into a BPF::MEMCPY node, which is then expanded into in-order load/store sequences in the expandPostRAPseudo pass, which happens after instruction scheduling. This way, kernel JIT backends could reliably recognize memcpy through scanning the BPF sequence. This new memcpy expand infrastructure is gated by a new option: -bpf-expand-memcpy-in-order Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 337977	2018-07-25 22:40:02 +00:00
Eli Friedman	d6baff65f7	[GlobalMerge] Handle llvm.compiler.used correctly. Reuse the handling for llvm.used, and don't transform such globals. Fixes a failure on the asan buildbot caused by my previous commit. llvm-svn: 337973	2018-07-25 22:03:35 +00:00
Sanjay Patel	215dcbf4db	[SelectionDAG] try to convert funnel shift directly to rotate if legal If the DAGCombiner's rotate matching was working as expected, I don't think we'd see any test diffs here. This sidesteps the issue of custom lowering for rotates raised in PR38243: https://bugs.llvm.org/show_bug.cgi?id=38243 ...by only dealing with legal operations. llvm-svn: 337966	2018-07-25 21:38:30 +00:00
Roman Tereshin	4f10a9d3a3	[LSV] Look through selects for consecutive addresses In some cases LSV sees (load/store _ (select _ <pointer expression> <pointer expression>)) patterns in input IR, often due to sinking and other forms of CFG simplification, sometimes interspersed with bitcasts and all-constant-indices GEPs. With this patch, the `areConsecutivePointers` method attempts to handle select instructions. This leads to an increased number of successful vectorizations. Technically, select instructions could appear in index arithmetic as well, however, we don't see those in our test suites / benchmarks. Also, there is a lot more freedom in IR shapes computing integral indices in general than in what's common in pointer computations, and it appears that it's quite unreliable to do anything short of making select instructions first class citizens of Scalar Evolution, which for the purposes of this patch is most definitely an overkill. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D49428 llvm-svn: 337965	2018-07-25 21:33:00 +00:00
Sanjay Patel	f94c4c84e6	[AArch, PowerPC] add more tests for legal rotate ops; NFC llvm-svn: 337964	2018-07-25 21:25:50 +00:00
Eli Friedman	0887cf9cab	[GlobalMerge] Allow merging globals with arbitrary alignment. Instead of depending on implicit padding from the structure layout code, use a packed struct and emit the padding explicitly. Differential Revision: https://reviews.llvm.org/D49710 llvm-svn: 337961	2018-07-25 20:58:01 +00:00
Florian Hahn	b6613ac665	Revert r337904: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions. I suspect it is causing the clang-stage2-Rthinlto failures. llvm-svn: 337956	2018-07-25 19:44:19 +00:00
Martin Storsjo	ff33a95ed4	[COFF] Use comdat shared constants for MinGW as well GNU binutils tools have no problems with this kind of shared constants, provided that we actually hook it up completely in AsmPrinter and produce a global symbol. This effectively reverts SVN r335918 by hooking the rest of it up properly. This feature was implemented originally in SVN r213006, with no reason for why it can't be used for MinGW other than the fact that GCC doesn't do it while MSVC does. Differential Revision: https://reviews.llvm.org/D49646 llvm-svn: 337951	2018-07-25 18:35:42 +00:00
Martin Storsjo	d2662c32fb	[COFF] Hoist constant pool handling from X86AsmPrinter into AsmPrinter In SVN r334523, the first half of comdat constant pool handling was hoisted from X86WindowsTargetObjectFile (which despite the name only was used for msvc targets) into the arch independent TargetLoweringObjectFileCOFF, but the other half of the handling was left behind in X86AsmPrinter::GetCPISymbol. With only half of the handling in place, inconsistent comdat sections/symbols are created, causing issues with both GNU binutils (avoided for X86 in SVN r335918) and with the MS linker, which would complain like this: fatal error LNK1143: invalid or corrupt file: no symbol for COMDAT section 0x4 Differential Revision: https://reviews.llvm.org/D49644 llvm-svn: 337950	2018-07-25 18:35:31 +00:00
Eli Friedman	733f4ed1bb	[ARM] Prefer lsls+lsrs over lsls+ands or lsrs+ands in Thumb1. Saves materializing the immediate for the "ands". Corresponding patterns exist for lsrs+lsls, but that seems less common in practice. Now implemented as a DAGCombine. Differential Revision: https://reviews.llvm.org/D49585 llvm-svn: 337945	2018-07-25 18:22:22 +00:00
Roman Tereshin	ed047b0184	[SCEV] Add [zs]ext{C,+,x} -> (D + [zs]ext{C-D,+,x})<nuw><nsw> transform as well as sext(C + x + ...) -> (D + sext(C-D + x + ...))<nuw><nsw> similar to the equivalent transformation for zext's if the top level addition in (D + (C-D + x * n)) could be proven to not wrap, where the choice of D also maximizes the number of trailing zeroes of (C-D + x * n), ensuring homogeneous behaviour of the transformation and better canonicalization of such AddRec's (indeed, there are 2^(2w) different expressions in `B1 + ext(B2 + Y)` form for the same Y, but only 2^(2w - k) different expressions in the resulting `B3 + ext((B4 * 2^k) + Y)` form, where w is the bit width of the integral type) This patch generalizes sext(C1 + C2X) --> sext(C1) + sext(C2X) and sext{C1,+,C2} --> sext(C1) + sext{0,+,C2} transformations added in r209568 relaxing the requirements the following way: 1. C2 doesn't have to be a power of 2, it's enough if it's divisible by 2 a sufficient number of times; 2. C1 doesn't have to be less than C2, instead of extracting the entire C1 we can split it into 2 terms: (00...0XXX + YY...Y000), keep the second one that may cause wrapping within the extension operator, and move the first one that doesn't affect wrapping out of the extension operator, enabling further simplifications; 3. C1 and C2 don't have to be positive, splitting C1 like shown above produces a sum that is guaranteed to not wrap, signed or unsigned; 4. in AddExpr case there could be more than 2 terms, and in case of AddExpr the 2nd and following terms and in case of AddRecExpr the Step component don't have to be in the C2X form or constant (respectively), they just need to have enough trailing zeros, which in turn could be guaranteed by means other than arithmetics, e.g. by a pointer alignment; 5. the extension operator doesn't have to be a sext, the same transformation works and profitable for zext's as well. 
Apparently, optimizations like SLPVectorizer currently fail to vectorize even rather trivial cases like the following: double bar(double *a, unsigned n) { double x = 0.0; double y = 0.0; for (unsigned i = 0; i < n; i += 2) { x += a[i]; y += a[i + 1]; } return x * y; } If compiled with `clang -std=c11 -Wpedantic -Wall -O3 main.c -S -o - -emit-llvm` (!{!"clang version 7.0.0 (trunk 337339) (llvm/trunk 337344)"}) it produces scalar code with the loop not unrolled with the unsigned `n` and `i` (like shown above), but vectorized and unrolled loop with signed `n` and `i`. With the changes made in this commit the unsigned version will be vectorized (though not unrolled for unclear reasons). How it all works: Let's say we have an AddExpr that looks like (C + x + y + ...), where C is a constant and x, y, ... are arbitrary SCEVs. Let's compute the minimum number of trailing zeroes guaranteed of that sum w/o the constant term: (x + y + ...). If, for example, those terms look as follows: i XXXX...X000 YYYY...YY00 ... ZZZZ...0000 then the rightmost non-guaranteed-zero bit (a potential one at i-th position above) can change the bits of the sum to the left (and at i-th position itself), but it can not possibly change the bits to the right. So we can compute the number of trailing zeroes by taking a minimum between the numbers of trailing zeroes of the terms. Now let's say that our original sum with the constant is effectively just C + X, where X = x + y + .... Let's also say that we've got 2 guaranteed trailing zeros for X: j CCCC...CCCC XXXX...XX00 // this is X = (x + y + ...) Any bit of C to the left of j may in the end cause the C + X sum to wrap, but the rightmost 2 bits of C (at positions j and j - 1) do not affect wrapping in any way. If the upper bits cause a wrap, it will be a wrap regardless of the values of the 2 least significant bits of C. If the upper bits do not cause a wrap, it won't be a wrap regardless of the values of the 2 bits on the right (again). 
So let's split C into 2 constants as follows: 0000...00CC = D CCCC...CC00 = (C - D) and represent the whole sum as D + (C - D + X). The second term of this new sum looks like this: CCCC...CC00 XXXX...XX00 ----------- // let's add them up YYYY...YY00 The sum above (let's call it Y) may or may not wrap, we don't know, so we need to keep it under a sext/zext. Adding D to that sum though will never wrap, signed or unsigned, if performed on the original bit width or the extended one, because all that the final add does is set the 2 least significant bits of Y to the bits of D: YYYY...YY00 = Y 0000...00CC = D ----------- <nuw><nsw> YYYY...YYCC Which means we can safely move that D out of the sext or zext and claim that the top-level sum neither sign wraps nor unsigned wraps. Let's run an example, let's say we're working in i8's and the original expression (zext's or sext's operand) is 21 + 12x + 8y. So it goes like this: 0001 0101 // 21 XXXX XX00 // 12x YYYY Y000 // 8y 0001 0101 // 21 ZZZZ ZZ00 // 12x + 8y 0000 0001 // D 0001 0100 // 21 - D = 20 ZZZZ ZZ00 // 12x + 8y 0000 0001 // D WWWW WW00 // 21 - D + 12x + 8y = 20 + 12x + 8y therefore zext(21 + 12x + 8y) = (1 + zext(20 + 12x + 8y))<nuw><nsw> This approach could be improved if we move away from using trailing zeroes and use KnownBits instead. For instance, with KnownBits we could have the following picture: i 10 1110...0011 // this is C XX X1XX...XX00 // this is X = (x + y + ...) Notice that some of the bits of X are known ones, also notice that known bits of X are interspersed with unknown bits and not grouped on the right or left. We can see at the position i that C(i) and X(i) are both known ones, therefore the (i + 1)th carry bit is guaranteed to be 1 regardless of the bits of C to the right of i. For instance, the C(i - 1) bit only affects the bits of the sum at positions i - 1 and i, and does not influence if the sum is going to wrap or not. 
Therefore we could split the constant C the following way: i 00 0010...0011 = D 10 1100...0000 = (C - D) Let's compute the KnownBits of (C - D) + X: XX1 1 = carry bit, blanks stand for known zeroes 10 1100...0000 = (C - D) XX X1XX...XX00 = X --- ----------- XX X0XX...XX00 Whether this add wraps or not essentially depends on bits of X. Adding D to this sum, however, is guaranteed not to wrap: 0 X 00 0010...0011 = D sX X0XX...XX00 = (C - D) + X --- ----------- sX XXXX XX11 As could be seen above, adding D preserves the sign bit of (C - D) + X, if any, and has a guaranteed 0 carry out, as expected. The more bits of (C - D) we constrain, the better the transformations introduced here canonicalize expressions as it leaves less freedom to what values the constant part of ((C - D) + x + y + ...) can take. Reviewed By: mzolotukhin, efriedma Differential Revision: https://reviews.llvm.org/D48853 llvm-svn: 337943	2018-07-25 18:01:41 +00:00
Ulrich Weigand	5f75371c5d	Fix corruption of result number in LegalizeVectorOps.cpp When VectorLegalizer::LegalizeOp creates a new SDValue after iterating over its arguments, we need to refer to the same result number of the new node that the original value used. Reviewed by: cameron.mcinally Differential Revision: https://reviews.llvm.org/D49805 llvm-svn: 337939	2018-07-25 17:08:13 +00:00
Stanislav Mekhanoshin	7e7268ac1c	[AMDGPU] Use AssumptionCacheTracker in the divrem32 expansion Differential Revision: https://reviews.llvm.org/D49761 llvm-svn: 337938	2018-07-25 17:02:11 +00:00
Stanislav Mekhanoshin	b8269a9589	Fix llvm::ComputeNumSignBits with some operations and llvm.assume Currently ComputeNumSignBits does early exit while processing some of the operations (add, sub, mul, and select). This prevents the function from using AssumptionCacheTracker if passed. Differential Revision: https://reviews.llvm.org/D49759 llvm-svn: 337936	2018-07-25 16:39:24 +00:00
Krzysztof Parzyszek	4e07509d18	[Hexagon] Properly scale bit index when extracting elements from vNi1 For example v = <2 x i1> is represented as bbbbaaaa in a predicate register, where b = v[1], a = v[0]. Extracting v[1] is equivalent to extracting bit 4 from the predicate register. llvm-svn: 337934	2018-07-25 16:20:59 +00:00
Petar Jovanovic	58c0210023	[MIPS GlobalISel] Lower pointer arguments Add support for lowering pointer arguments. Changing type from pointer to integer is already done in MipsTargetLowering::getRegisterTypeForCallingConv. Patch by Petar Avramovic. Differential Revision: https://reviews.llvm.org/D49419 llvm-svn: 337912	2018-07-25 12:35:01 +00:00
Florian Hahn	6f5c6adbcd	Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions. r337828 resolves a PredicateInfo issue with unnamed types. Original message: This patch updates IPSCCP to use PredicateInfo to propagate facts to true branches predicated by EQ and to false branches predicated by NE. As a follow up, we should be able to extend it to also propagate additional facts about nonnull. Reviewers: davide, mssimpso, dberlin, efriedma Reviewed By: davide, dberlin llvm-svn: 337904	2018-07-25 11:13:40 +00:00
Thomas Preud'homme	768d6ce4a3	Fix PR34170: Crash on inline asm with 64bit output in 32bit GPR Add support for inline assembly with output operands that do not naturally go in the register class they are constrained to (e.g. double in a 32-bit GPR as in the PR). llvm-svn: 337903	2018-07-25 11:11:12 +00:00
Paul Semel	0913dcd747	[llvm-objdump] Add dynamic section printing to private-headers option Differential Revision: https://reviews.llvm.org/D49016 llvm-svn: 337902	2018-07-25 11:09:20 +00:00
Paul Semel	5ce8f1598c	[llvm-readobj] Generic hex-dump option Helpers are available to make this option file format independent. This patch adds the feature for Wasm file format. It doesn't change the behavior of the other file format handling. Differential Revision: https://reviews.llvm.org/D49545 llvm-svn: 337896	2018-07-25 10:04:37 +00:00
Simon Atanasyan	b524459288	[mips] Replace custom parsing logic for data directives by the `addAliasForDirective` The target independent AsmParser doesn't recognise .hword, .word, .dword which are required for Mips. Currently MipsAsmParser recognises these through dispatch to MipsAsmParser::parseDataDirective. This contains equivalent logic to AsmParser::parseDirectiveValue. This patch allows reuse of AsmParser::parseDirectiveValue by making use of addAliasForDirective to support .hword, .word and .dword. Original patch provided by Alex Bradbury at D47001 was modified to fix handling of microMIPS symbols. The `AsmParser::parseDirectiveValue` calls either `EmitIntValue` or `EmitValue`. In this patch we override `EmitIntValue` in the `MipsELFStreamer` to clear a pending set of microMIPS symbols. Differential revision: https://reviews.llvm.org/D49539 llvm-svn: 337893	2018-07-25 07:07:43 +00:00
Craig Topper	d9fa8147c4	[X86] Autogenerate complete checks and fix a failure introduced in r337875. llvm-svn: 337889	2018-07-25 05:22:13 +00:00
Chandler Carruth	7024921c0a	[x86/SLH] Teach the x86 speculative load hardening pass to harden against v1.2 BCBS attacks directly. Attacks using spectre v1.2 (a subset of BCBS) are described in the paper here: https://people.csail.mit.edu/vlk/spectre11.pdf The core idea is to speculatively store over the address in a vtable, jumptable, or other target of indirect control flow that will be subsequently loaded. Speculative execution after such a store can forward the stored value to subsequent loads, and if called or jumped to, the speculative execution will be steered to this potentially attacker controlled address. Up until now, this could be mitigated by enabling retpolines. However, that is a relatively expensive technique to mitigate this particular flavor. Especially because in most cases SLH will have already mitigated this. To fully mitigate this with SLH, we need to do two core things: 1) Unfold loads from calls and jumps, allowing the loads to be post-load hardened. 2) Force hardening of incoming registers even if we didn't end up needing to harden the load itself. The reason we need to do these two things is because hardening calls and jumps from this particular variant is importantly different from hardening against leak of secret data. Because the "bad" data here isn't a secret, but in fact speculatively stored by the attacker, it may be loaded from any address, regardless of whether it is read-only memory, mapped memory, or a "hardened" address. The only 100% effective way to harden these instructions is to harden their operand itself. But to the extent possible, we'd like to take advantage of all the other hardening going on, we just need a fallback in case none of that happened to cover the particular input to the control transfer instruction. For users of SLH, currently they are paying 2% to 6% performance overhead for retpolines, but this mechanism is expected to be substantially cheaper. 
However, it is worth reminding folks that this does not mitigate all of the things retpolines do -- most notably, variant #2 is not in any way mitigated by this technique. So users of SLH may still want to enable retpolines, and the implementation is carefully designed to gracefully leverage retpolines to avoid the need for further hardening here when they are enabled. Differential Revision: https://reviews.llvm.org/D49663 llvm-svn: 337878	2018-07-25 01:51:29 +00:00
Craig Topper	fc501a9223	[X86] Use a shift plus an lea for multiplying by a constant that is a power of 2 plus 2/4/8. The LEA allows us to combine an add and the multiply by 2/4/8 together so we just need a shift for the larger power of 2. llvm-svn: 337875	2018-07-25 01:15:38 +00:00
Craig Topper	5be253d988	[X86] Expand mul by pow2 + 2 using a shift and two adds similar to what we do for pow2 - 2. llvm-svn: 337874	2018-07-25 01:15:35 +00:00
Craig Topper	56c104f104	[X86] Use a two lea sequence for multiply by 37, 41, and 73. These fit a pattern used by 11, 21, and 19. llvm-svn: 337871	2018-07-24 23:44:17 +00:00
Craig Topper	b5342b592e	[X86] Add test cases for multiply by 37, 41, and 73. These can all be handled with 2 LEAs similar to what we do for 11, 19, 21. llvm-svn: 337870	2018-07-24 23:44:15 +00:00
Craig Topper	f8fcee70a3	[X86] Change multiply by 26 to use two multiplies by 5 and an add instead of multiply by 3 and 9 and a subtract. Same number of operations, but ending in an add is friendlier due to it being commutable. llvm-svn: 337869	2018-07-24 23:44:12 +00:00
Hideki Saito	ef380b0fc5	[LV] Fix for PR38110, LV encountered llvm_unreachable() Summary: truncateToMinimalBitWidths() doesn't handle all Instructions and the worst case is compiler crash via llvm_unreachable(). Fix is to add a case to handle PHINode and change the worst case to NO-OP (from compiler crash). Reviewers: sbaranga, mssimpso, hsaito Reviewed By: hsaito Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49461 llvm-svn: 337861	2018-07-24 22:30:31 +00:00
Roman Tereshin	1ba1f9310c	[SCEV] Add zext(C + x + ...) -> D + zext(C-D + x + ...)<nuw><nsw> transform if the top level addition in (D + (C-D + x + ...)) could be proven to not wrap, where the choice of D also maximizes the number of trailing zeroes of (C-D + x + ...), ensuring homogeneous behaviour of the transformation and better canonicalization of such expressions. This enables better canonicalization of expressions like 1 + zext(5 + 20 * %x + 24 * %y) and zext(6 + 20 * %x + 24 * %y) which get both transformed to 2 + zext(4 + 20 * %x + 24 * %y) This pattern is common in address arithmetics and the transformation makes it easier for passes like LoadStoreVectorizer to prove that 2 or more memory accesses are consecutive and optimize (vectorize) them. Reviewed By: mzolotukhin Differential Revision: https://reviews.llvm.org/D48853 llvm-svn: 337859	2018-07-24 21:48:56 +00:00
Craig Topper	5ddc0a2b14	[X86] When expanding a multiply by a negative of one less than a power of 2, like 31, don't generate a negate of a subtract that we'll never optimize. We generated a subtract for the power of 2 minus one then negated the result. The negate can be optimized away by swapping the subtract operands, but DAG combine doesn't know how to do that and we don't add any of the new nodes to the worklist anyway. This patch makes us explicitly emit the swapped subtract. llvm-svn: 337858	2018-07-24 21:31:21 +00:00
Craig Topper	6d29891bef	[X86] Generalize the multiply by 30 lowering to generic multiply by power of 2 minus 2. Use a left shift and 2 subtracts like we do for 30. Move this out from behind the slow lea check since it doesn't even use an LEA. Use this for multiply by 14 as well. llvm-svn: 337856	2018-07-24 21:15:41 +00:00
Heejin Ahn	8daef0751d	[WebAssembly] Add tests for weaker memory consistency orderings Summary: Currently all wasm atomic memory access instructions are sequentially consistent, so even if LLVM IR specifies weaker orderings than that, we should upgrade them to sequential ordering and treat them in the same way. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D49194 llvm-svn: 337854	2018-07-24 21:06:44 +00:00
Craig Topper	86d6320b94	[X86] Change multiply by 19 to use (9 * X) * 2 + X instead of (5 * X) * 4 - X. The new lowering can be done in 2 LEAs. The old code took 1 LEA, 1 shift, and 1 sub. llvm-svn: 337851	2018-07-24 20:31:48 +00:00
Craig Topper	1296c622df	[X86] Add test case to show failure to combine away negates that may be created by mul by constant expansion. Mul by constant can expand to a sequence that ends with a negate. If the next instruction is an add or sub we might be able to fold the negate away. We currently fail to do this because we explicitly don't add anything to the DAG combine worklist when we expand multiplies. This is primarily to keep the multiply from being reformed, but we should consider adding the users to the worklist. llvm-svn: 337843	2018-07-24 18:36:46 +00:00
Joel Galenson	8dbcc58917	Use SCEV to avoid inserting some bounds checks. This patch uses SCEV to avoid inserting some bounds checks when they are not needed. This slightly improves the performance of code compiled with the bounds check sanitizer. Differential Revision: https://reviews.llvm.org/D49602 llvm-svn: 337830	2018-07-24 15:21:54 +00:00
Florian Hahn	36d2e25d5a	[PredicateInfo] Use custom mangling to support ssa_copy with unnamed types. This is a workaround and it would be better to fix this generally, but doing it generally is quite tricky. See D48541 and PR38117. Doing it in PredicateInfo directly allows us to use the type address to differentiate different unnamed types, because neither the created declarations nor the ssa_copy calls should be visible after PredicateInfo got destroyed. Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D49126 llvm-svn: 337828	2018-07-24 14:49:52 +00:00
Simon Atanasyan	28ded4ee19	[mips] Fix local dynamic TLS with Sym64 For the final DTPREL addition, rather than a lui/daddiu/daddu triple, LLVM was erroneously emitting a daddiu/daddiu pair, treating the %dtprel_hi as if it were a %dtprel_lo, since Mips::Hi expands unshifted for Sym64. Instead, use a new TlsHi node and, although unnecessary due to the exact structure of the nodes emitted, use TlsHi for local exec too to prevent future bugs. Also garbage-collect the unused TprelLo and TlsGd nodes, and TprelHi since its functionality is provided by the new common TlsHi node. Patch by James Clarke. Differential revision: https://reviews.llvm.org/D49259 llvm-svn: 337827	2018-07-24 13:47:52 +00:00
Sam Parker	8b93e82c3d	[ARM] Disable ARMCodeGenPrepare by default ARM Stage 2 builders have been suspiciously broken since the pass was committed. Disabling to hopefully fix the bots and give me time to debug. llvm-svn: 337821	2018-07-24 12:04:23 +00:00
Shiva Chen	f5938bfbf9	Revert "[DebugInfo] Generate DWARF debug information for labels." This reverts commit b454fa1b4079b6c0a5b1565982d16516385838d7. llvm-svn: 337812	2018-07-24 06:17:45 +00:00
Chandler Carruth	a25aca21af	[x86] Clean up and convert test to use generated CHECK lines. This test was already checking microscopic behavior of tail call under specific conditions. This just makes the CHECK lines much more consistent, clear, and easily updated when intentional changes are made. I've also switched the test to consistently name the entry block and to order the helper declarations and comments for specific tests in the more usual locations. llvm-svn: 337806	2018-07-24 03:18:08 +00:00
Chandler Carruth	d41dca2ddc	[x86] Update the CHECK lines of this test to use the latest patterns from the script. This minimizes the diff in subsequent changes. llvm-svn: 337805	2018-07-24 03:07:07 +00:00
Shiva Chen	d6b2cdf9d4	[DebugInfo] Generate DWARF debug information for labels. There are two forms for label debug information in DWARF format. 1. Labels in a non-inlined function: DW_TAG_label DW_AT_name DW_AT_decl_file DW_AT_decl_line DW_AT_low_pc 2. Labels in an inlined function: DW_TAG_label DW_AT_abstract_origin DW_AT_low_pc We will collect label information from DBG_LABEL. Before every DBG_LABEL, we will generate a temporary symbol to denote the location of the label. The symbol could be used to get DW_AT_low_pc afterwards. So, we create a mapping between 'inlined label' and DBG_LABEL MachineInstr in DebugHandlerBase. The DBG_LABEL in the mapping is used to query the symbol before it. The AbstractLabels in DwarfCompileUnit is used to process labels in inlined functions. We also keep a mapping between scope and labels in DwarfFile to help to generate correct tree structure of DIEs. Differential Revision: https://reviews.llvm.org/D45556 Patch by Hsiangkai Wang. llvm-svn: 337799	2018-07-24 02:22:55 +00:00
Tom Stellard	b7f19e6d1e	AMDGPU/GlobalISel: Legalize G_INSERT Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D49601 llvm-svn: 337798	2018-07-24 02:19:20 +00:00
Vedant Kumar	d6ff43cc71	[Debugify] Export per-pass debug info loss statistics Add a -debugify-export option to opt. This exports per-pass `debugify` loss statistics to a file in CSV format. For some interesting numbers on debug value loss during an -O2 build of the sqlite3 amalgamation, see the review thread. Differential Revision: https://reviews.llvm.org/D49003 llvm-svn: 337787	2018-07-24 00:41:29 +00:00
Thomas Anderson	8e8a652c2f	Fix typo in test/CodeGen/Mips/dins.ll Differential Revision: https://reviews.llvm.org/D49704 llvm-svn: 337771	2018-07-23 23:19:53 +00:00
Wolfgang Pieb	439801ba1d	[DWARF v5] Refactor range lists dumping by using a more generic way of handling tables of lists. The intent is to use it for location list tables as well. Change is almost NFC with the exception of the spelling of some strings used during dumping (all lowercase now). Reviewer: JDevlieghere Differential Revision: https://reviews.llvm.org/D49500 llvm-svn: 337763	2018-07-23 22:37:17 +00:00
Teresa Johnson	b963c0b658	[LTO] Handle __imp_ (dllimport) symbols consistently with lld Summary: Similar to what lld already does for dllimport symbols which are prefaced with __imp_ (see lld patch r240620), strip off the __imp_ prefix in LTO. Otherwise we can get 2 separate GlobalResolution for a single symbol, the dllimport declaration, and the definition, which leads to incorrect LTO handling. Fixes PR38105. Reviewers: pcc Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D49138 llvm-svn: 337762	2018-07-23 22:33:57 +00:00
Martin Storsjo	100fc97051	[COFF] Fix assembly output of comdat sections without an attached symbol Since SVN r335286, the .xdata sections are produced without an attached symbol, which requires using a different syntax when printing assembly output. Instead of the usual syntax of '.section <name>,"dr",discard,<symbol>', use '.section <name>,"dr"' + '.linkonce discard' (which is what GCC uses for all assembly output). This fixes PR38254. Differential Revision: https://reviews.llvm.org/D49651 llvm-svn: 337756	2018-07-23 22:15:19 +00:00
Martin Storsjo	c2b701408e	[AArch64] Use MCAsmInfoMicrosoft and MCAsmInfoGNUCOFF as base classes This matches the structure used on X86 and ARM. This requires a little bit of duplication of the parts that are equal in both AArch64 COFF variants though. Before SVN r335286, these classes didn't add anything that MCAsmInfoCOFF didn't, but now they do. This makes AArch64 match X86 in how comdat is used for float constants for MinGW. Differential Revision: https://reviews.llvm.org/D49637 llvm-svn: 337755	2018-07-23 22:15:14 +00:00
Teresa Johnson	e214fdeb69	[ThinLTO] Ensure the TargetLibraryInfo is constructed early enough Summary: Without this change, the WholeProgramDevirt pass, which requires the TargetLibraryInfo, will construct one from the default triple. Fixes PR38139. Reviewers: pcc Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D49278 llvm-svn: 337750	2018-07-23 21:58:19 +00:00
Manoj Gupta	f9f50f634d	ConstantFolding: Avoid a crash. Summary: Check if the parent basic block and caller exist before calling CS.getCaller when constant folding strip.invariant.group intrinsic. This avoids a crash when the function containing the intrinsic is being inlined. The instruction is checked for any simplification but has not yet been added to a basic block. Reviewers: Prazek, rsmith, efriedma Reviewed By: efriedma Subscribers: eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D49690 llvm-svn: 337742	2018-07-23 21:20:00 +00:00
Reid Kleckner	980c4df037	Re-land r335297 "[X86] Implement more of x86-64 large and medium PIC code models" Don't try to generate large PIC code for non-ELF targets. Neither COFF nor MachO have relocations for large position independent code, and users have been using "large PIC" code models to JIT 64-bit code for a while now. With this change, if they are generating ELF code, their JITed code will truly be PIC, but if they target MachO or COFF, it will contain 64-bit immediates that directly reference external symbols. For a JIT, that's perfectly fine. llvm-svn: 337740	2018-07-23 21:14:35 +00:00
Nirav Dave	5af81d5bfa	Add inline asm aliasing test. llvm-svn: 337734	2018-07-23 20:19:10 +00:00
Paul Semel	1dbbfba888	[yaml2obj] Add default sh_entsize for dynamic sections Dynamic section holds a table, so the sh_entsize might be set. As the dynamic section entry size never changes, we can default it to the size of a dynamic entry. Differential Revision: https://reviews.llvm.org/D49619 llvm-svn: 337725	2018-07-23 18:49:04 +00:00
Krzysztof Parzyszek	9500a24fce	[Hexagon] Handle unnamed globals in HexagonConstExpr Instead of comparing names, compare positions in the parent module. llvm-svn: 337723	2018-07-23 18:30:17 +00:00
Simon Atanasyan	307e5b31ce	[mips] Add more checks to the tls.ll test case. NFC llvm-svn: 337705	2018-07-23 16:05:44 +00:00
Cameron McInally	2c9bcffc92	[FPEnv] Legalize double width StrictFP vector operations Differential Revision: https://reviews.llvm.org/D48809 llvm-svn: 337698	2018-07-23 14:40:17 +00:00
Sam Parker	3828c6ff94	[ARM] ARMCodeGenPrepare backend pass Arm specific codegen prepare is implemented to perform type promotion on icmp operands, which can enable the removal of uxtb and uxth (unsigned extend) instructions. This is possible because performing type promotion before ISel alleviates this duty from the DAG builder which has to perform legalisation, but has a limited view on data ranges. The pass visits any instruction operand of an icmp and creates a worklist to traverse the use-def tree to determine whether the values can simply be promoted. Our concern is values in the registers overflowing the narrow (i8, i16) data range, so instructions marked with nuw can be promoted easily. For add and sub instructions, we are able to use the parallel dsp instructions to operate on scalar data types and avoid overflowing bits. Underflowing adds and subs are also permitted when the result is only used by an unsigned icmp. Differential Revision: https://reviews.llvm.org/D48832 llvm-svn: 337687	2018-07-23 12:27:47 +00:00
John Brawn	fc18a6ad7d	[GVN] Don't use the eliminated load as an available value in phi construction In ConstructSSAForLoadSet if an available value is actually the load that we're doing SSA construction to eliminate, then we can omit it as SSAUpdate will add in the value for the phi that will be replacing it anyway. This can result in simpler IR which can allow further optimisation. Differential Revision: https://reviews.llvm.org/D44160 llvm-svn: 337686	2018-07-23 12:14:45 +00:00
Alexandros Lamprineas	bf6009c234	[MemorySSAUpdater] Update Phi operands after trivial Phi elimination Bug fix for PR37445. The underlying problem and its fix are similar to PR37808. The bug lies in MemorySSAUpdater::getPreviousDefRecursive(), where PhiOps is computed before the call to tryRemoveTrivialPhi() and it ends up being out of date, pointing to stale data. We have now turned each of the PhiOps into a TrackingVH<MemoryAccess>. Differential Revision: https://reviews.llvm.org/D49425 llvm-svn: 337680	2018-07-23 10:56:30 +00:00
Roman Lebedev	52b85377eb	[NFC][MCA] ZnVer1: Update RegisterFile to identify false dependencies on partially written registers. Summary: Pretty mechanical follow-up for D49196. As microarchitecture.pdf notes, "20 AMD Ryzen pipeline", "20.8 Register renaming and out-of-order schedulers": The integer register file has 168 physical registers of 64 bits each. The floating point register file has 160 registers of 128 bits each. "20.14 Partial register access": The processor always keeps the different parts of an integer register together. ... An instruction that writes to part of a register will therefore have a false dependence on any previous write to the same register or any part of it. Reviewers: andreadb, courbet, RKSimon, craig.topper, GGanesh Reviewed By: GGanesh Subscribers: gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D49393 llvm-svn: 337676	2018-07-23 10:10:13 +00:00
Roman Lebedev	d57bd45acc	[NFC][MCA] ZnVer1: add partial-reg-update tests Reviewers: andreadb, courbet, RKSimon, craig.topper, GGanesh Reviewed By: GGanesh Subscribers: gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D49392 llvm-svn: 337675	2018-07-23 10:10:04 +00:00
Alexandros Lamprineas	592cc78dd8	[GVNHoist] safeToHoistLdSt allows illegal hoisting Bug fix for PR36787. When reasoning if it's safe to hoist a load we want to make sure that the defining memory access dominates the new insertion point of the hoisted instruction. safeToHoistLdSt calls firstInBB(InsertionPoint,DefiningAccess) which returns false if InsertionPoint == DefiningAccess, and therefore it falsely thinks it's safe to hoist. Differential Revision: https://reviews.llvm.org/D49555 llvm-svn: 337674	2018-07-23 09:42:35 +00:00
Chandler Carruth	1d926fb9f4	[x86/SLH] Fix a bug where we would harden tail calls twice -- once as a call, and then again as a return. Also added a comment to try and explain better why we would be doing what we're doing when hardening the (non-call) returns. llvm-svn: 337673	2018-07-23 07:56:15 +00:00
Chandler Carruth	b66f2d8df8	[x86/SLH] Add a test covering indirect forms of control flow. NFC. This specifically covers different ways of making indirect calls and jumps. There are some bugs in SLH that I will be fixing in subsequent patches where the diff in the generated instructions makes the bug fix much more clear, so just checking in a baseline of this test to start. I'm also going to be adding direct mitigation for variant 1.2 which this file very specifically tests in the various forms it can arise on x86. Again, the diff to the generated instructions should make the change for that much more clear, so having the test as a baseline seems useful. llvm-svn: 337672	2018-07-23 07:51:51 +00:00
Craig Topper	b2a626b52e	[X86] Remove the max vector width restriction from combineLoopMAddPattern and rely on splitOpsAndApply to handle splitting. This seems to be a net improvement. There's still an issue under avx512f where we have a 512-bit vpaddd, but not vpmaddwd so we end up doing two 256-bit vpmaddwds and inserting the results before a 512-bit vpaddd. It might be better to do two 512-bits paddds with zeros in the upper half. Same number of instructions, but breaks a dependency. llvm-svn: 337656	2018-07-22 19:44:35 +00:00
Craig Topper	d8f80e90ce	[X86] Add more MADD recurrence test cases with larger and narrower vector widths. llvm-svn: 337650	2018-07-22 05:16:47 +00:00
Simon Atanasyan	ecd1e0afdd	[mips] Move out the WrapperPat declaration from the NotInMicroMips predicate This is a follow-up to rL335185. That commit adds some WrapperPat patterns for the microMIPS target. But the declaration of the WrapperPat class is under the NotInMicroMips predicate, and microMIPS patterns cannot be selected because the predicate (Subtarget->inMicroMipsMode()) && (!Subtarget->inMicroMipsMode()) is always false. This change moves the WrapperPat class declaration out of the NotInMicroMips predicate and enables microMIPS WrapperPat patterns. Differential revision: https://reviews.llvm.org/D49533 llvm-svn: 337646	2018-07-21 16:16:03 +00:00
Chen Zheng	69bb064539	[InstrSimplify] fold sdiv if two operands are negated and non-overflow Differential Revision: https://reviews.llvm.org/D49382 llvm-svn: 337642	2018-07-21 12:27:54 +00:00
Krzysztof Parzyszek	05337bdb50	[Hexagon] Disable packets in test to avoid ordering issues in checks llvm-svn: 337624	2018-07-20 21:55:55 +00:00
Martin Storsjo	a6ffc9c8df	[COFF] Adjust how we flag weak externals This fixes PR36096. Originally based on a patch by Martell Malone. Differential Revision: https://reviews.llvm.org/D44357 llvm-svn: 337613	2018-07-20 20:48:29 +00:00
George Karpenkov	346dfbe2bc	[FileCheck] Provide an option for FileCheck to dump original input to stderr on failure The option can be either set using environment variable (e.g. env FILECHECK_DUMP_INPUT_ON_FAILURE=1 ninja check-fuzzer) or with a FileCheck flag. This can be extremely useful for debugging, cf. https://groups.google.com/forum/#!topic/llvm-dev/kLrzg8OM_h8 for discussion. Differential Revision: https://reviews.llvm.org/D49328 llvm-svn: 337609	2018-07-20 20:21:57 +00:00
Roman Tereshin	31d52847ef	Reapply "[LSV] Refactoring + supporting bitcasts to a type of different size" This reapplies commit r337489 reverted by r337541 Additionally, this commit contains a speculative fix to the issue reported in r337541 (the report does not contain an actionable reproducer, just a stack trace) llvm-svn: 337606	2018-07-20 20:10:04 +00:00
Joel E. Denny	6fc21c2522	[FileCheck] Fix search ranges for DAG-NOT-DAG A DAG-NOT-DAG is a CHECK-DAG group, X, followed by a CHECK-NOT group, N, followed by a CHECK-DAG group, Y. Let y be the initial directive of Y. This patch makes the following changes to the behavior: 1. Directives in N can no longer match within part of Y's match range just because y happens not to be the earliest match from Y. Specifically, this patch withdraws N's search range end from y's match range start to Y's match range start. 2. y can no longer match within X's match range, where a y match produced a reordering complaint, which is thus no longer possible. Specifically, this patch withdraws y's search range start from X's permitted range start to X's match range end, which was already the search range start for other members of Y. Both of these changes can only increase the number of test passes: #1 constrains the ability of CHECK-NOTs to match, and #2 expands the ability of CHECK-DAGs to match without complaints. These changes are based on discussions at: <http://lists.llvm.org/pipermail/llvm-dev/2018-May/123550.html> <https://reviews.llvm.org/D47106> which conclude that: 1. These changes simplify the FileCheck conceptual model. First, it makes search ranges for DAG-NOT-DAG more consistent with other cases. Second, it was confusing that y was treated differently from the rest of Y. 2. These changes add theoretical use cases for DAG-NOT-DAG that had no obvious means to be expressed otherwise. We can justify the first half of this assertion with the observation that these changes can only increase the number of test passes. 3. Reordering detection for DAG-NOT-DAG had no obvious real benefit. We don't have evidence from real use cases to help us debate conclusions #2 and #3, but #1 at least seems intuitive. Reviewed By: probinson Differential Revision: https://reviews.llvm.org/D48986 llvm-svn: 337605	2018-07-20 20:09:56 +00:00
Jordan Rupprecht	db2036e1f5	[llvm-objcopy] Add basic support for --rename-section Summary: Add basic support for --rename-section=old=new to llvm-objcopy. A full replacement for GNU objcopy requires also modifying flags (i.e. --rename-section=old=new,flag1,flag2); I'd like to keep that in a separate change to keep this simple. Reviewers: jakehehrlich, alexshap Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49576 llvm-svn: 337604	2018-07-20 19:54:24 +00:00
Reid Kleckner	415b0bf370	And add a lit substitution for llvm-undname, as the comment says to llvm-svn: 337600	2018-07-20 18:45:01 +00:00

55610 Commits