llvm-project

Commit Graph

Author	SHA1	Message	Date
Sam Parker	2088a7536c	[ARM] Fix for verifier buildbot Copy the MachineOperand first and then change the flags instead of making a copy. llvm-svn: 350811	2019-01-10 10:47:23 +00:00
Fedor Sergeev	b7871405fa	[LoopUnroll] add parsing for unroll parameters in -passes pipeline Allow to specify loop-unrolling with optional parameters explicitly spelled out in -passes pipeline specification. Introducing somewhat generic way of specifying parameters parsing via FUNCTION_PASS_PARAMETRIZED pass registration. Syntax of parametrized unroll pass name is as follows: 'unroll<' parameter-list '>' Where parameter-list is ';'-separate list of parameter names and optlevel optlevel: 'O[0-3]' parameter: { 'partial' \| 'peeling' \| 'runtime' \| 'upperbound' } negated: 'no-' parameter Example: -passes=loop(unroll<O3;runtime;no-upperbound>) this invokes LoopUnrollPass configured with OptLevel=3, Runtime, no UpperBound, everything else by default. llvm-svn: 350808	2019-01-10 10:01:53 +00:00
Sam Parker	7208221452	[ARM] Size reduce teq to eors Add t2TEQrr to the map of instructions with can be reduced down into a T1 instruction. This is a special case because TEQ just sets the CPSR and doesn't write to a GPR, which is not the case for EOR. So, we need to ensure that the EOR can write to the first operand. Differential Revision: https://reviews.llvm.org/D56255 llvm-svn: 350801	2019-01-10 08:36:33 +00:00
Craig Topper	5d20eb240f	[X86] Disable DomainReassignment pass when AVX512BW is disabled to avoid injecting VK32/VK64 references into the MachineIR Summary: This pass replaces GR8/GR16/GR32/GR64 with their equivalent sized mask register classes. But VK32/VK64 aren't legal without AVX512BW. Apparently this mostly appears to work if the register coalescer is able to remove the VK32/VK64 register class reference. Or if we don't ever spill it. But there's no guarantee of that. Another Intel employee managed to trigger a crash due to this with ISPC. Unfortunately, I've lost the test case he sent me at the time. I'm trying to get him to reproduce it for me. I'd like to get this in before 8.0 branches since its a little scary. The regressions here are unfortunate, but I think we can make some improvements to DAG combine, load folding, etc. to fix them. Just not sure if we can get that done for 8.0. Fixes PR39741 Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D56460 llvm-svn: 350800	2019-01-10 07:43:54 +00:00
Zi Xuan Wu	64c956eea8	Recommit "[PowerPC] Fix assert from machine verify pass that unmatched register class about fcmp selection in fast-isel" This re-commit r350685. Differential Revision: https://reviews.llvm.org/D55686 llvm-svn: 350799	2019-01-10 06:20:14 +00:00
Mandeep Singh Grang	859cb2e35d	[AArch64] Emit the correct MCExpr relocations specifiers like VK_ABS_G0, etc Summary: D55896 and D56029 add support to emit fixups for :abs_g0: , :abs_g1_s: , etc. This patch adds the necessary enums and MCExpr needed for lowering these. Reviewers: rnk, mstorsjo, efriedma Reviewed By: efriedma Subscribers: javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D56037 llvm-svn: 350798	2019-01-10 04:59:44 +00:00
Thomas Lively	fdd4999b86	Revert "[WebAssembly] Add simd128-unimplemented subtarget feature" This reverts rL350791. llvm-svn: 350795	2019-01-10 04:09:25 +00:00
Stanislav Mekhanoshin	d3757d3f3a	[AMDGPU] Separate feature dot-insts Differential Revision: https://reviews.llvm.org/D56524 llvm-svn: 350793	2019-01-10 03:25:20 +00:00
Thomas Lively	eb6f9abd41	[WebAssembly] Add simd128-unimplemented subtarget feature This is a second attempt at r350778, which was reverted in r350789. The only change is that the unimplemented-simd128 feature has been renamed simd128-unimplemented, since naming it unimplemented-simd128 somehow made the simd128 feature flag enable the unimplemented-simd128 feature on Windows. llvm-svn: 350791	2019-01-10 02:55:52 +00:00
Thomas Lively	fdca5fab60	Revert "[WebAssembly] Add unimplemented-simd128 subtarget feature" This reverts L350778. llvm-svn: 350789	2019-01-10 01:37:44 +00:00
Craig Topper	c38c9c120f	[X86] After turning VSELECT into SHRUNKBLEND, make we push the VSELECT into the worklist so it can be deleted. Found while trying to figure out why my second version of D56421 worked better than the first version. We weren't deleting the vselect in a timely fashion and that caused SimplfyDemandedBit to see an additional user. The new version doesn't have this problem so this fix isn't needed there, but seemed like the right thing to do. llvm-svn: 350781	2019-01-10 00:14:27 +00:00
Thomas Lively	2eeade1814	[WebAssembly] Add unimplemented-simd128 subtarget feature Summary: This replaces the old ad-hoc -wasm-enable-unimplemented-simd flag. Also makes the new unimplemented-simd128 feature imply the simd128 feature. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits, alexcrichton Differential Revision: https://reviews.llvm.org/D56501 llvm-svn: 350778	2019-01-09 23:59:37 +00:00
Evandro Menezes	224d831bed	[llvm-mca] Display masks in hex Display the resources masks as hexadecimal. Otherwise, NFC. llvm-svn: 350777	2019-01-09 23:57:15 +00:00
Eli Friedman	d4e7a0d83c	[SimplifyLibCalls] Fix memchr expansion for constant strings. The C standard says "The memchr function locates the first occurrence of c (converted to an unsigned char)[...]". The expansion was missing the conversion to unsigned char. Fixes https://bugs.llvm.org/show_bug.cgi?id=39041 . Differential Revision: https://reviews.llvm.org/D55947 llvm-svn: 350775	2019-01-09 23:39:26 +00:00
David Major	30ba0a0c95	Don't require a null terminator when loading objects When a null terminator is required and the file size is a multiple of the system page size, MemoryBuffer will prefer pread() over mmap(), which can result in excessive memory usage. Patch by Mike Hommey! Differential Revision: https://reviews.llvm.org/D56475 llvm-svn: 350774	2019-01-09 23:36:32 +00:00
Heejin Ahn	569f090922	[WebAssembly] Print a debug message at the start of each pass Summary: Looks like many passes print its pass description as a debug message at the start of each pass, so added that to (mostly newly added) other passes as well. Reviewers: dschuff Subscribers: jgravelle-google, sbc100, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D56142 llvm-svn: 350771	2019-01-09 23:05:21 +00:00
Easwaran Raman	b45994b843	Refactor synthetic profile count computation. NFC. Summary: Instead of using two separate callbacks to return the entry count and the relative block frequency, use a single callback to return callsite count. This would allow better supporting hybrid mode in the future as the count of callsite need not always be derived from entry count (as in sample PGO). Reviewers: davidxl Subscribers: mehdi_amini, steven_wu, dexonsmith, dang, llvm-commits Differential Revision: https://reviews.llvm.org/D56464 llvm-svn: 350755	2019-01-09 20:10:27 +00:00
Francis Visoiu Mistrih	ac6454a7f6	[CodeGen] Ignore return sext/zext attributes of unused results for tail calls If the caller's return type does not have a zeroext attribute but the callee does a tail call zeroext, we won't consider the tail call during CodeGenPrepare because the attributes don't match. However, if the result of the tail call has no uses, it makes sense to drop the sext/zext attributes. Differential Revision: https://reviews.llvm.org/D56486 llvm-svn: 350753	2019-01-09 19:46:15 +00:00
Easwaran Raman	ed279752f0	[Inliner] Assert that the computed inline threshold is non-negative. Reviewers: chandlerc Subscribers: haicheng, llvm-commits Differential Revision: https://reviews.llvm.org/D56409 llvm-svn: 350751	2019-01-09 19:26:17 +00:00
David Callahan	3ef0f4447d	refactor BlockFrequencyInfo::view to take a title parameter Summary: All a non-default title for the debugging this debugging aide Reviewers: twoh, Kader, modocache Reviewed By: twoh Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D56499 llvm-svn: 350749	2019-01-09 19:12:38 +00:00
Thomas Lively	edb54b22d3	[WebAssembly] Standardize order of SIMD bitselect arguments Summary: For some reason the backend assumed that the condition mask would be the first argument to the LLVM intrinsic, but everywhere else the condition mask is the third argument. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D56412 llvm-svn: 350746	2019-01-09 18:13:11 +00:00
Aleksandar Beserminji	8abf680424	[mips][micrompis] Emit 16bit NOPs by default Emit 16bit NOPs by default. Use 32bit NOPs in delay slots where necessary. Differential https://reviews.llvm.org/D55323 llvm-svn: 350733	2019-01-09 15:58:02 +00:00
Valery Pykhtin	b7a459547d	Revert "[AMDGPU] Fix DPP combiner" This reverts commit e3e2923a39cbec3b3bc3a7d3f0e9a77a4115080e, svn revision rL350721 llvm-svn: 350730	2019-01-09 15:21:53 +00:00
Kristof Beyls	c650ff77eb	Initial AArch64 SLH implementation. This is an initial implementation for Speculative Load Hardening for AArch64. It builds on top of the recently introduced AArch64SpeculationHardening pass. This doesn't implement (yet) some of the optimizations implemented for the X86SpeculativeLoadHardening pass. I thought introducing the optimizations incrementally in follow-up patches should make this easier to review. Differential Revision: https://reviews.llvm.org/D55929 llvm-svn: 350729	2019-01-09 15:13:34 +00:00
Valery Pykhtin	1e0b5c719b	[AMDGPU] Fix DPP combiner Fixed issue with identity values and other cases, f32/f16 identity values to be added later. fma/mac instructions is disabled for now. Test is fully reworked, added comments. Other fixes: 1. dpp move with uses and old reg initializer should be in the same BB. 2. bound_ctrl:0 is only considered when bank_mask and row_mask are fully enabled (0xF). Othervise the old register value is checked for identity. 3. Added add, subrev, and, or instructions to the old folding function. 4. Kill flag is cleared for the src0 (DPP register) as it may be copied into more than one user. Differential revision: https://reviews.llvm.org/D55444 llvm-svn: 350721	2019-01-09 13:43:32 +00:00
Florian Hahn	9697d2a764	Revert r350647: "[NewPM] Port tsan" This patch breaks thread sanitizer on some macOS builders, e.g. http://green.lab.llvm.org/green/job/clang-stage1-configure-RA/52725/ llvm-svn: 350719	2019-01-09 13:32:16 +00:00
Simon Pilgrim	5a7132ff0f	[X86] Enable combining shuffles to PACKSS/PACKUS for 256/512-bit vectors llvm-svn: 350716	2019-01-09 13:23:28 +00:00
Anton Korobeynikov	c18e90369d	[MSP430] Optimize 'shl x, 8[+ N] -> swpb(zext(x)) [<< N]' for i16 Perform additional simplification to reduce shift amount. Patch by Kristina Bessonova! Differential Revision: https://reviews.llvm.org/D56016 llvm-svn: 350712	2019-01-09 13:03:01 +00:00
Anton Korobeynikov	9222ed4485	[MSP430] Fix crash while lowering llvm.stacksave/stackrestore Perform the usual expansion of stacksave / restore intrinsics. Patch by Kristina Bessonova! Differential Revision: https://reviews.llvm.org/D54890 llvm-svn: 350710	2019-01-09 12:52:15 +00:00
Diogo N. Sampaio	1eb31c8e94	[AArch64] Move feature predctrl to predres Follow up patch of rL350385, for adding predres command line option. This patch renames the feature as to keep it aligned with the option passed by/to clang Differential Revision: https://reviews.llvm.org/D56484 llvm-svn: 350702	2019-01-09 11:24:15 +00:00
Simon Pilgrim	7ee86e8e81	[X86] Fix gcc7 -Wunused-but-set-variable warning. NFCI. llvm-svn: 350701	2019-01-09 11:18:49 +00:00
David Stenberg	33b192d72b	[DebugInfo] Omit location list entries with empty ranges Summary: This fixes PR39710. In that case we emitted a location list looking like this: .Ldebug_loc0: .quad .Lfunc_begin0-.Lfunc_begin0 .quad .Lfunc_begin0-.Lfunc_begin0 .short 1 # Loc expr size .byte 85 # DW_OP_reg5 .quad .Lfunc_begin0-.Lfunc_begin0 .quad .Lfunc_end0-.Lfunc_begin0 .short 1 # Loc expr size .byte 85 # super-register DW_OP_reg5 .quad 0 .quad 0 As seen, the first entry's beginning and ending addresses evalute to 0, which meant that the entry inadvertently became an "end of list" entry, resulting in the location list ending sooner than expected. To fix this, omit all entries with empty ranges. Location list entries with empty ranges do not have any effect, as specified by DWARF, so we might as well drop them: "A location list entry (but not a base address selection or end of list entry) whose beginning and ending addresses are equal has no effect because the size of the range covered by such an entry is zero." Reviewers: davide, aprantl, dblaikie Reviewed By: aprantl Subscribers: javed.absar, JDevlieghere, llvm-commits Tags: #debug-info Differential Revision: https://reviews.llvm.org/D55919 llvm-svn: 350698	2019-01-09 09:58:59 +00:00
Matt Arsenault	3dddb163dd	GlobalISel: Implement fewerElements for implicit_def llvm-svn: 350697	2019-01-09 07:51:52 +00:00
Matt Arsenault	befee402ff	GlobalISel: Implement widenScalar for implicit_def llvm-svn: 350695	2019-01-09 07:34:14 +00:00
Max Kazantsev	4615a505f8	[IPT] Drop cache less eagerly in GVN and LoopSafetyInfo Current strategy of dropping `InstructionPrecedenceTracking` cache is to invalidate the entire basic block whenever we change its contents. In fact, `InstructionPrecedenceTracking` has 2 internal strictures: `OrderedInstructions` that is needed to be invalidated whenever the contents changes, and the map with first special instructions in block. This second map does not need an update if we add/remove a non-special instuction because it cannot affect the contents of this map. This patch changes API of `InstructionPrecedenceTracking` so that it now accounts for reasons under which we invalidate blocks. This should lead to much less recalculations of the map and should save us some compile time because in practice we don't typically add/remove special instructions. Differential Revision: https://reviews.llvm.org/D54462 Reviewed By: efriedma llvm-svn: 350694	2019-01-09 07:28:13 +00:00
Zi Xuan Wu	f2a75eef41	Revert "[PowerPC] Fix assert from machine verify pass that unmatched register class about fcmp selection in fast-isel" This reverts commit r350685. See compile assert in compiler-rt. llvm-svn: 350693	2019-01-09 06:12:24 +00:00
Hiroshi Inoue	dad8c6a1c9	[NFC] fix trivial typos in comments llvm-svn: 350690	2019-01-09 05:11:10 +00:00
Craig Topper	2fa8e2d8a8	[X86] Correct the MaskVT for avx512 gather/scatter intrinsics to use the min of the number of index and data elements. When the result type is v2i64/v2f64 and the index element size is i32, the index vector has two unused elements making the type v4i32. The mask VT should match the number of memory accesses that will be made. This is consistent with the isel patterns used for the target independent gather/scatter intrinsic. llvm-svn: 350687	2019-01-09 04:21:12 +00:00
Zi Xuan Wu	9479f6d72e	[PowerPC] Fix assert from machine verify pass that unmatched register class about fcmp selection in fast-isel Bad machine code: Illegal virtual register for instruction function: TestULE basic block: %bb.0 entry (0x1000a39b158) instruction: %2:crrc = FCMPUD %1:vsfrc, %3:f8rc operand 1: %1:vsfrc Fix assert about missing match between fcmp instruction and register class. We should use vsx related cmp instruction xvcmpudp instead of fcmpu when vsx is opened. add -verifymachineinstrs option into related test cases to enable the verify pass. Differential Revision: https://reviews.llvm.org/D55686 llvm-svn: 350685	2019-01-09 02:31:10 +00:00
Stanislav Mekhanoshin	ed0d6c60af	Remove check for single use in ShrinkDemandedConstant This removes check for single use from general ShrinkDemandedConstant to the BE because of the AArch64 regression after D56289/rL350475. After several hours of experiments I did not come up with a testcase failing on any other targets if check is not performed. Moreover, direct call to ShrinkDemandedConstant is not really needed and superceed by SimplifyDemandedBits. Differential Revision: https://reviews.llvm.org/D56406 llvm-svn: 350684	2019-01-09 02:24:22 +00:00
Matt Arsenault	0ad1b71fe3	RegisterCoalescer: Assume CR_Replace for SubRangeJoin Currently it's possible for following check on V.WriteLanes (which is not really meaningful during SubRangeJoin) to pass for one half of the pair, and then fall through to to one of the impossible or unresolved states. This then fails as inconsistent on the other half. During the main range join, the check between V.WriteLanes and OtherV.ValidLanes must have passed, meaning this should be a CR_Replace. Fixes most of the testcases in bugs 39542 and 39602 llvm-svn: 350678	2019-01-08 23:22:18 +00:00
Matt Arsenault	2c807410fd	RegisterCoalescer: Defer clearing implicit_def lanes We can't go back and recover the lanes if it turns out the implicit_def really can't be erased. Assume all lanes are valid if an unresolved conflict is encountered. There aren't any tests where this seems to matter either way, but this seems like a safer option. Fixes bug 39602 llvm-svn: 350676	2019-01-08 23:10:47 +00:00
Sanjay Patel	d023dd60e9	[InstCombine] canonicalize another raw IR rotate pattern to funnel shift This is matching the equivalent of the DAG expansion, so it should never end up with worse perf than the original code even if the target doesn't have a rotate instruction. llvm-svn: 350672	2019-01-08 22:39:55 +00:00
Rong Xu	016220549d	[PGO] Use SourceFileName rather module name in PGOFuncName In LTO or Thin-lto mode (though linker plugin), the module names are of temp file names which are different for different compilations. Using SourceFileName avoids the issue. This should not change any functionality for current PGO as all the current callers of getPGOFuncName() is before LTO. Differential Revision: https://reviews.llvm.org/D56327 llvm-svn: 350671	2019-01-08 22:39:47 +00:00
Heejin Ahn	321d522038	[WebAssembly] Rename StoreResults to MemIntrinsicResults Summary: StoreResults pass does not optimize store instructions anymore because store instructions don't return results values anymore. Now this pass is used solely for memory intrinsics, so update the pass name accordingly and fix outdated pass descriptions as well. This patch does not change any meaningful behavior, but not marked as NFC because it changes a comment check line in a test case. Reviewers: dschuff Subscribers: mgorny, sbc100, jgravelle-google, sunfiish, llvm-commits Differential Revision: https://reviews.llvm.org/D56093 llvm-svn: 350669	2019-01-08 22:35:18 +00:00
Rong Xu	d236b0aa93	[PGO] Revert r350442 to fix commit message. Will re-commit it using the correct commit message. llvm-svn: 350667	2019-01-08 22:33:29 +00:00
Evandro Menezes	39c97bf6cd	[AArch64] Adjust the cost model for Exynos Improve the modeling of ALU instructions. llvm-svn: 350663	2019-01-08 22:29:58 +00:00
Evandro Menezes	5d780093fd	[llvm-mca] Improve debugging (NFC) llvm-svn: 350661	2019-01-08 22:29:38 +00:00
Zachary Turner	2fe4900525	[llvm-undname] Add support for demangling msvc's noexcept types. Starting in C++17, MSVC introduced a new mangling for function parameters that are themselves noexcept functions. This patch makes llvm-undname properly demangle them. Patch by Zachary Henkel Differential Revision: https://reviews.llvm.org/D55769 llvm-svn: 350656	2019-01-08 21:05:51 +00:00
Zachary Turner	4e83923d83	Don't write #include "Windows/WindowsSupport.h" from the Windows dir. This generates -Wnonportable-include-dir warnings, and doesn't need to be there. It seems this was just checked in on accident. llvm-svn: 350655	2019-01-08 21:05:34 +00:00
Adrian Prantl	8a753a2e5a	Revert "Revert "Revert "Resubmit rL345008 "Split MachinePipeliner code into header and cpp files"""" This reverts commit D56084. llvm-svn: 350654	2019-01-08 21:05:10 +00:00
Philip Pfaffe	82f995db75	[NewPM] Port tsan A straightforward port of tsan to the new PM, following the same path as D55647. Differential Revision: https://reviews.llvm.org/D56433 llvm-svn: 350647	2019-01-08 19:21:57 +00:00
Paul Robinson	7402fd9a35	Rename DIFlagFixedEnum to DIFlagEnumClass. NFC llvm-svn: 350641	2019-01-08 17:52:29 +00:00
Anna Thomas	2dfa412efe	[UnrollRuntime] Fix domTree failures in multiexit unrolling Summary: This fixes the IDom for exit blocks and all blocks reachable from the exit blocks, when runtime unrolling under multiexit/exiting case. We initially had a restrictive check that the IDom is only updated when it is the header of the loop. However, we also need to update the IDom to the correct one when the IDom is any block within the original loop. See added test cases (which fail dom tree verification without the patch). Reviewers: reames, mzolotukhin, mkazantsev, hfinkel Reviewed by: brzycki, kuhar Subscribers: zzheng, dmgreen, llvm-commits Differential Revision: https://reviews.llvm.org/D56284 llvm-svn: 350640	2019-01-08 17:16:25 +00:00
Yonghong Song	0d99031de0	[BPF] Fix .BTF.ext reloc type assigment issue Commit f1db33c5c1a9 ("[BPF] Disable relocation for .BTF.ext section") assigned relocation type R_BPF_NONE if the fixup type is FK_Data_4 and the symbol is temporary. The reason is we use FK_Data_4 as a fixup type for insn offsets in .BTF.ext section. Just checking whether the symbol is temporary is not enough. For example, .debug_info may reference some strings whose fixup is FK_Data_4 with a temporary symbol as well. To truely reflect the case for .BTF.ext section, this patch further checks that the section associateed with the symbol must be SHF_ALLOC and SHF_EXECINSTR, i.e., in the text section. This fixed the above-mentioned problem. Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 350637	2019-01-08 16:36:06 +00:00
Florian Hahn	c1ece1b41b	[MachineVerifier] Include offending register in allocatable live-in error msg. This patch adds a convenience report() method for physical registers and uses it to print the offending register with the 'MBB has allocatable live-in' error. Reviewers: MatzeB, rtereshin, dsanders Reviewed By: dsanders Differential Revision: https://reviews.llvm.org/D55946 llvm-svn: 350630	2019-01-08 15:16:23 +00:00
Petr Pavlu	bf4fdecc51	[GlobalISel] Fix choice of instruction selector for AArch64 at -O0 with -global-isel=0 Commit rL347861 introduced an unintentional change in the behaviour when compiling for AArch64 at -O0 with -global-isel=0. Previously, explicitly disabling GlobalISel resulted in using FastISel but an updated condition in the commit changed it to using SelectionDAG. The patch fixes this condition and slightly better organizes the code that chooses the instruction selector. Fixes PR40131. Differential Revision: https://reviews.llvm.org/D56266 llvm-svn: 350626	2019-01-08 14:19:06 +00:00
Philip Pfaffe	efb5ad1c58	[DA][NewPM] Add a printerpass and port the testsuite The new-pm version of DA is untested. Testing requires a printer, so add that and use it in the existing DA tests. Differential Revision: https://reviews.llvm.org/D56386 llvm-svn: 350624	2019-01-08 14:06:58 +00:00
Francis Visoiu Mistrih	7a6d7672c1	[X86][Darwin] Emit compact-unwind for register-sized stack adjustments For stack frames on the size of a register in x86, a code size optimization emits "push rax/eax" instead of "sub" for stack allocation. For example: foo: .cfi_startproc BB#0: pushq %rax Ltmp0: .cfi_def_cfa_offset 16 ... .cfi_endproc However, we are falling back to DWARF in this case because we cannot encode %rax as a saved register. This requirement is wrong, since we don't care about the contents of %rax, it is the equivalent of a sub. In order to specify that we care about the contents of %rax, we would need a .cfi_offset %rax, <offset>. It's also overzealous in the case where there are pushes for callee saved registers followed by a "push rax/eax" instead of "sub", in which case we should also be able to encode the callee saved regs and everything else using compact unwind. Patch authored by Bruno Cardoso Lopes. Differential Revision: https://reviews.llvm.org/D13793 llvm-svn: 350623	2019-01-08 13:53:15 +00:00
Lama Saba	32f08399eb	Revert "Revert "Resubmit rL345008 "Split MachinePipeliner code into header and cpp files""" This reverts commit rL350497 reported remaining issues seem to be unrelated to modules or this change. more info: https://reviews.llvm.org/D56084 llvm-svn: 350621	2019-01-08 13:30:36 +00:00
Tim Northover	964eea7ad2	AArch64: avoid splitting vector truncating stores. We have code to split vector splats (of zero and non-zero) for performance reasons, but it ignores the fact that a store might be truncating. Actually, truncating stores are formed for vNi8 and vNi16 types. Since the truncation is from a legal type, the size of the store is always <= 64-bits and so they don't actually benefit from being split up anyway, so this patch just disables that transformation. llvm-svn: 350620	2019-01-08 13:30:27 +00:00
Benjamin Kramer	a480523ce9	[GlobalISel] Fix unused variable warning in Release builds. llvm-svn: 350618	2019-01-08 12:54:26 +00:00
Sam Parker	53000a74a5	[ARM] Add missing patterns for DSP muls Using a PatLeaf for sext_16_node allowed matching smulbb and smlabb instructions once the operands had been sign extended. But we also need to use sext_inreg operands along with sext_16_node to catch a few more cases that enable use to remove the unnecessary sxth. Differential Revision: https://reviews.llvm.org/D55992 llvm-svn: 350613	2019-01-08 10:12:36 +00:00
Matt Arsenault	c765240060	AMDGPU/GlobalISel: Introduce vcc reg bank I'm not entirely sure this is the correct thing to do with the global isel philosophy, but I think this is necessary to handle how differently SGPRs are used normally vs. from a condition. For example, it makes sense to allow a copy from a VGPR to an SGPR, but it makes no sense to allow a copy from VGPRs to SGPRs used as select mask. This avoids regbankselecting strange code with a truncate feeding directly into a condition field. Now a copy is forced from sgpr(s1) to vcc, which is more sensible to handle. Some of these issues could probably avoided with making enough operations resulting in i1 illegal. I think we can't avoid this register bank for legality. For example, an i1 and where one source is from a truncate, and one source is a compare needs some kind of copy inserted to make sure both are in condition registers. llvm-svn: 350611	2019-01-08 06:30:53 +00:00
Thomas Lively	6a87ddac9a	[WebAssembly] Massive instruction renaming Summary: An automated renaming of all the instructions listed at https://github.com/WebAssembly/spec/issues/884#issuecomment-426433329 as well as some similarly-named identifiers. Reviewers: aheejin, dschuff, aardappel Subscribers: sbc100, jgravelle-google, eraman, sunfish, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D56338 llvm-svn: 350609	2019-01-08 06:25:55 +00:00
Robert Widmann	616ed17221	[LLVM-C] Allow For Creating a BasicBlock without a Parent Function Summary: Add a utility function for creating a basic block without a parent function. A useful operation for compilers that need to synthesize and conditionally insert code without having to bother with appending and immediately unlinking a block. Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D56279 llvm-svn: 350608	2019-01-08 06:24:19 +00:00
Robert Widmann	40dc48be0e	[LLVM-C] Allow Specifying Signedness in Int Cast Summary: Fix an old outstanding problem with the int cast builder binding always assuming the cast is signed by introducing a new LLVMBuildIntCast2 operation and deprecating the old prototype. Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D56280 llvm-svn: 350607	2019-01-08 06:23:22 +00:00
Mandeep Singh Grang	f286bee9fe	[MC] [AArch64] Support resolving signed fixups for :abs_g0_s: etc. Summary: This patch is a follow-up to D55896. Reviewers: efriedma, mstorsjo Reviewed By: efriedma Subscribers: javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D56029 llvm-svn: 350606	2019-01-08 04:48:00 +00:00
Chris Kennelly	a97cad4642	[NFC] Remove empty line as a test commit. llvm-svn: 350605	2019-01-08 04:04:51 +00:00
Matt Arsenault	a1515d2d33	AMDGPU/GlobalISel: Legalize concat_vectors llvm-svn: 350598	2019-01-08 01:30:02 +00:00
Matt Arsenault	376f2ef2f0	Fix typos llvm-svn: 350597	2019-01-08 01:25:47 +00:00
Heejin Ahn	e95056d69c	[WebAssembly] Move CFG-changing passes before RegStackify Summary: FixIrreducibleControlFlow and LateEHPrepare both possibly modify CFG and create new registers. There seems to be no reason these passes go after register-related optimization passes (PrepareForLiveIntervals, OptimizeLiveIntervals, StoreResults, RegStackify, and RegColoring), and this also possibly create new optimization opportunities. I think we should put all current and future optimization passes before RegStackify (and related passes) unless there's a reason not to. Reviewers: kripken Subscribers: dschuff, sbc100, sunfish, jgravelle-google, llvm-commits Differential Revision: https://reviews.llvm.org/D56356 llvm-svn: 350596	2019-01-08 01:25:12 +00:00
Matt Arsenault	adc40baa29	RegBankSelect: Fix copy insertion point for terminators If a copy was needed to handle the condition of brcond, it was being inserted before the defining instruction. Add tests for iterator edge cases. I find the existing code here suspect for the case where it's looking for terminators that modify the register. It's going to insert a copy in the middle of the terminators, which isn't allowed (it might be necessary to have a COPY_terminator if anybody actually needs this). Also legalize brcond for AMDGPU. llvm-svn: 350595	2019-01-08 01:22:47 +00:00
Heejin Ahn	8e2bac8e7f	[WebAssembly] Use 'I' multiclass template for br_table (NFC) Summary: We don't need to explicitly use `NI` anymore because we now don't use `let` statements within the definitions. Reviewers: aardappel Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D56376 llvm-svn: 350594	2019-01-08 01:15:15 +00:00
Matt Arsenault	ae6f1e07fc	AMDGPU/GlobalISel: Disallow VGPR->SCC copies This fixes using scalar adds when only the carry in is a VGPR using greedy regbankselect. llvm-svn: 350593	2019-01-08 01:13:20 +00:00
Matt Arsenault	68c668a5f3	AMDGPU/GlobalISel: RegBankSelect for carry-in I'm not sure we should be allowing the truncate to s1 for the inputs. It may be necessary to create a new VCC reg bank. llvm-svn: 350592	2019-01-08 01:09:09 +00:00
Matt Arsenault	2cc15b67b7	AMDGPU/GlobalISel: RegBankSelect for add/sub with carry out llvm-svn: 350589	2019-01-08 01:03:58 +00:00
Matt Arsenault	299302fbe7	AMDGPU/GlobalISel: InstrMapping for G_UNMERGE_VALUES llvm-svn: 350588	2019-01-08 00:46:19 +00:00
Wei Mi	2645fd0ece	[RegisterCoalescer] dst register's live interval needs to be updated when merging a src register in ToBeUpdated set. This is to fix PR40061 related with https://reviews.llvm.org/rL339035. In https://reviews.llvm.org/rL339035, live interval of source pseudo register in rematerialized copy may be saved in ToBeUpdated set and its update may be postponed. In PR40061, %t2 = %t1 is rematerialized and %t1 is added into toBeUpdated set to postpone its live interval update. After the rematerialization, the live interval of %t1 is larger than necessary. Then %t1 is merged into %t3 and %t1 gets removed. After the merge, %t3 contains live interval larger than necessary. Because %t3 is not in toBeUpdated set, its live interval is not updated after register coalescing and it will break some assumption in regalloc. The patch requires the live interval of destination register in a merge to be updated if the source register is in ToBeUpdated. Differential revision: https://reviews.llvm.org/D55867 llvm-svn: 350586	2019-01-08 00:26:11 +00:00
Davide Italiano	bf1fdb852f	[Verifier] Reject invalid type for DILocalVariable. Reviewers: aprantl Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D56414 llvm-svn: 350578	2019-01-07 23:09:09 +00:00
Craig Topper	486313b5f7	Recommit r350554 "[X86] Remove AVX512VBMI2 concat and shift intrinsics. Replace with target independent funnel shift intrinsics." The MSVC limit we hit on AutoUpgrade.cpp has been worked around for now. llvm-svn: 350567	2019-01-07 21:00:32 +00:00
Martin Storsjo	93a7137c0a	[ObjectYAML] [COFF] Support multiple symbols with the same name Differential Revision: https://reviews.llvm.org/D56294 llvm-svn: 350566	2019-01-07 20:55:33 +00:00
Craig Topper	81fe1fbf4a	[X86][AutoUpgrade] Make some tweaks to reduce the number of nested if/else in the intrinsic upgrade code to avoid an MSVC compiler limit. MSVC has a nesting limit of around 110-130. An if/else if/else if counts against this next level. The autoupgrade code consists a long chain of these checking matches against strings. This commit moves some code to a helper function to move out a large if/else chain that was inside of one of the blocks into a separate function. There are more of these we could move or we could change some to lookup tables. I've also merged together a few similar blocks in the outer chain. This should buy us some margin for a little bit. llvm-svn: 350564	2019-01-07 20:13:45 +00:00
Craig Topper	fad1589f39	Revert r350554 "[X86] Remove AVX512VBMI2 concat and shift intrinsics. Replace with target independent funnel shift intrinsics." The AutoUpgrade.cpp if/else cascade hit an MSVC limit again. llvm-svn: 350562	2019-01-07 19:39:05 +00:00
Alina Sbirlea	12bbb4fe8d	[MemorySSA] Add SkipSelfWalker. Summary: Add implementation of SkipSelfWalker. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D56285 llvm-svn: 350561	2019-01-07 19:38:47 +00:00
Craig Topper	826f44b550	[TargetLowering][AMDGPU] Remove the SimplifyDemandedBits function that takes a User and OpIdx. Stop using it in AMDGPU target for simplifyI24. As we saw in D56057 when we tried to use this function on X86, it's unsafe. It allows the operand node to have multiple users, but doesn't prevent recursing past the first node when it does have multiple users. This can cause other simplifications earlier in the graph without regard to what bits are needed by the other users of the first node. Ideally all we should do to the first node if it has multiple uses is bypass it when its not needed by the user we started from. Doing any other transformation that SimplifyDemandedBits can do like turning ZEXT/SEXT into AEXT would result in an increase in instructions. Fortunately, we already have a function that can do just that, GetDemandedBits. It will only make transformations that involve bypassing a node. This patch changes AMDGPU's simplifyI24, to use a combination of GetDemandedBits to handle the multiple use simplifications. And then uses the regular SimplifyDemandedBits on each operand to handle simplifications allowed when the operand only has a single use. Unfortunately, GetDemandedBits simplifies constants more aggressively than SimplifyDemandedBits. This caused the -7 constant in the changed test to be simplified to remove the upper bits. I had to modify computeKnownBits to account for this by ignoring the upper 8 bits of the input. Differential Revision: https://reviews.llvm.org/D56087 llvm-svn: 350560	2019-01-07 19:30:43 +00:00
Alina Sbirlea	bc8aa24c2f	[MemorySSA] Refactor CachingWalker. Summary: Refactor caching walker to make creating a walker that skips the starting access strightforward. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, llvm-commits, jfb Differential Revision: https://reviews.llvm.org/D55957 llvm-svn: 350558	2019-01-07 19:22:37 +00:00
Craig Topper	9c4f7e9147	[X86] Remove AVX512VBMI2 concat and shift intrinsics. Replace with target independent funnel shift intrinsics. Differential Revision: https://reviews.llvm.org/D56377 llvm-svn: 350554	2019-01-07 19:10:12 +00:00
Diogo N. Sampaio	f192cdb5c9	[ARM] ComputeKnownBits to handle extract vectors This patch adds the sign/zero extension done by vgetlane to ARM computeKnownBitsForTargetNode. Differential revision: https://reviews.llvm.org/D56098 llvm-svn: 350553	2019-01-07 19:01:47 +00:00
Alina Sbirlea	f723020456	[MemorySSA] Extend the clobber walker with the option to skip the starting access. Summary: The option enables loop transformations to hoist accesses that do not have clobbers in the loop. If the clobber queries skips the starting access, the result may be outside the loop instead of the header Phi. Adding the walker that uses this option in a separate patch. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D55944 llvm-svn: 350551	2019-01-07 18:40:27 +00:00
Nikita Popov	8dd19ed3ec	Revert "[DemandedBits] Use SetVector for Worklist" This reverts commit r350547. Seeing assertion failures on clang tests. llvm-svn: 350549	2019-01-07 18:15:11 +00:00
Nikita Popov	353d92decb	[DemandedBits] Use SetVector for Worklist DemandedBits currently uses a simple vector for the worklist, which means that instructions may be inserted multiple times into it. Especially in combination with the deep lattice, this may cause instructions too be recomputed very often. To avoid this, switch to a SetVector. Differential Revision: https://reviews.llvm.org/D56362 llvm-svn: 350547	2019-01-07 18:03:36 +00:00
Rhys Perry	f77e2e8406	AMDGPU: test for uniformity of branch instruction, not its condition Summary: If a divergent branch instruction is marked as divergent by propagation rule 2 in DivergencePropagator::exploreSyncDependency() and its condition is uniform, that branch would incorrectly be assumed to be uniform. Reviewers: arsenm, tstellar Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D56331 llvm-svn: 350532	2019-01-07 15:52:28 +00:00
Alexandre Ganea	90f4b94da3	[CodeView] More appropriate name and type for a Microsoft precompiled headers parameter. NFC llvm-svn: 350520	2019-01-07 13:53:16 +00:00
Matt Arsenault	7a27c1886f	AMDGPU: Remove v16i8 from register classes llvm-svn: 350518	2019-01-07 13:31:55 +00:00
Matt Arsenault	369acb8470	AMDGPU: Remove VS/SV mappings from select These would violate the constant bus restriction llvm-svn: 350517	2019-01-07 13:21:36 +00:00
Chandler Carruth	90c09232a2	[CallSite removal] Move the rest of IR implementation code away from `CallSite`. With this change, the remaining `CallSite` usages are just for implementing the wrapper type itself. This does update the C API but leaves the names of that API alone and only updates their implementation. Differential Revision: https://reviews.llvm.org/D56184 llvm-svn: 350509	2019-01-07 07:31:49 +00:00
Chandler Carruth	57578aaf96	[CallSite removal] Port `IndirectCallSiteVisitor` to use `CallBase` and update client code. Also rename it to use the more generic term `call` instead of something that could be confused with a praticular type. Differential Revision: https://reviews.llvm.org/D56183 llvm-svn: 350508	2019-01-07 07:15:51 +00:00
Chandler Carruth	fee1a04d04	[CallSite removal] Move the verifier to use `CallBase` instead of the `CallSite` wrapper. Mostly mechanical, but I've tried to tidy up code where it made sense to do so. Differential Revision: https://reviews.llvm.org/D56143 llvm-svn: 350507	2019-01-07 07:02:34 +00:00
Chandler Carruth	363ac68374	[CallSite removal] Migrate all Alias Analysis APIs to use the newly minted `CallBase` class instead of the `CallSite` wrapper. This moves the largest interwoven collection of APIs that traffic in `CallSite`s. While a handful of these could have been migrated with a minorly more shallow migration by converting from a `CallSite` to a `CallBase`, it hardly seemed worth it. Most of the APIs needed to migrate together because of the complex interplay of AA APIs and the fact that converting from a `CallBase` to a `CallSite` isn't free in its current implementation. Out of tree users of these APIs can fairly reliably migrate with some combination of `.getInstruction()` on the `CallSite` instance and casting the resulting pointer. The most generic form will look like `CS` -> `cast_or_null<CallBase>(CS.getInstruction())` but in most cases there is a more elegant migration. Hopefully, this migrates enough APIs for users to fully move from `CallSite` to the base class. All of the in-tree users were easily migrated in that fashion. Thanks for the review from Saleem! Differential Revision: https://reviews.llvm.org/D55641 llvm-svn: 350503	2019-01-07 05:42:51 +00:00
Craig Topper	6ffeeb705f	[X86] Add support for matching vector funnel shift to AVX512VBMI2 instructions. Summary: AVX512VBMI2 supports a funnel shift by immediate and a funnel shift by a variable vector. Reviewers: spatel, RKSimon Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D56361 llvm-svn: 350498	2019-01-06 18:10:18 +00:00
Lama Saba	f385c21f79	Revert "Resubmit rL345008 "Split MachinePipeliner code into header and cpp files"" This reverts commit rL350493 issues related to modules still appear in http://green.lab.llvm.org/green/job/lldb-cmake llvm-svn: 350497	2019-01-06 16:39:14 +00:00
Sanjay Patel	8f12e8f3f6	[x86] explicitly set cost of integer add/sub There are no test changes here in the existing cost model regression tests because integer add/sub have a default legal cost of 1 already. This would break, however, if we custom lower those ops because the default cost model assumes that custom-lowered ops are more expensive. This is similar to the change in rL350403. See discussion in D56011 for more details. When we enhance that patch to handle integer ops, we need this cost model change to avoid unintended diffs here from the custom lowering. llvm-svn: 350496	2019-01-06 16:21:42 +00:00
Lama Saba	ea9d555b83	Resubmit rL345008 "Split MachinePipeliner code into header and cpp files" Resubmitted in rL345290 and reverted in rL350345 due to failures in http://green.lab.llvm.org/green/job/lldb-cmake/ Resubmitting after a workaround to lldb-cmake failure was committed in rL350346, more info in https://reviews.llvm.org/D56084 llvm-svn: 350493	2019-01-06 15:45:40 +00:00
Craig Topper	57fc891c1b	[LegalizeVectorOps] Add FSHL/FSHR to the list of vector operations that should be handled. The FSHL/FSHR nodes are handled in the expand function, but they need to also be listed in the code that queries for the operation action too. llvm-svn: 350490	2019-01-06 07:06:35 +00:00
Craig Topper	1187991bcf	[X86][AsmParser] Don't allow X86::DX in CheckBaseRegAndIndexRegAndScale. This was here because out and in instructions allow '(%dx)' even though its not a memory reference. To handle this we build a special operand for the DX register reference before we get to the call to CheckBaseRegAndIndexRegAndScale. So we no longer need this special case. llvm-svn: 350483	2019-01-05 23:30:28 +00:00
Craig Topper	d0ba531a0c	[X86] Use two pmovmskbs in combineBitcastvxi1 for (i64 (bitcast (v64i1 (truncate (v64i8)))) on KNL. llvm-svn: 350481	2019-01-05 22:42:58 +00:00
Craig Topper	46f8b4a11e	[X86] Allow combinevxi1Bitcast to use pmovmskb on avx512 targets if the input is a truncate from v16i8/v32i8. This is especially helpful on targets without avx512bw since we don't have a good way to convert from v16i8/v32i8 to v16i1/v32i1 for the truncate anyway. If we're just going to convert it to a GPR we might as well use pmovmskb to accomplish both. llvm-svn: 350480	2019-01-05 21:40:07 +00:00
Stanislav Mekhanoshin	35a3a3bd11	Added single use check to ShrinkDemandedConstant Fixes cvt_f32_ubyte combine. performCvtF32UByteNCombine() could shrink source node to demanded bits only even if there are other uses. Differential Revision: https://reviews.llvm.org/D56289 llvm-svn: 350475	2019-01-05 19:20:00 +00:00
Craig Topper	3f48dbf72e	[X86] Allow LowerTRUNCATE to use PACKUS/PACKSS for v16i16->v16i8 truncate when -mprefer-vector-width-256 is in effect and BWI is not available. llvm-svn: 350473	2019-01-05 18:48:11 +00:00
Nikita Popov	65038515ee	[InstCombine] Relax cttz/ctlz with select on zero The cttz/ctlz intrinsics have a parameter specifying whether the result is undefined for zero. cttz(x, false) can be relaxed to cttz(x, true) if x is known non-zero, and in fact such an optimization is already performed. However, this currently doesn't work if x is non-zero as a result of a select rather than an explicit branch. This patch adds handling for this case, thus allowing x != 0 ? cttz(x, false) : y to simplify to x != 0 ? cttz(x, true) : y. Differential Revision: https://reviews.llvm.org/D55786 llvm-svn: 350463	2019-01-05 09:48:16 +00:00
Easwaran Raman	366a873f14	[Inliner] Optimize shouldBeDeferred This has some minor optimizations to shouldBeDeferred. This is not strictly NFC because the early exit inside the loop assumes TotalSecondaryCost is monotonically non-decreasing, which is not true if the threshold used by CostAnalyzer is negative. AFAICT the thresholds do not go below 0 for the default values of the various options we use. llvm-svn: 350456	2019-01-05 02:26:29 +00:00
Craig Topper	45ec002e25	[X86] Require second operand of X86vshiftuniform to be an integer. NFC We don't need to require the first operand to be an integer because we already said it was the same type as the result which we also constrained to an integer. llvm-svn: 350455	2019-01-05 01:40:29 +00:00
Evgeniy Stepanov	0184c53cbd	Revert "Revert "[hwasan] Android: Switch from TLS_SLOT_TSAN(8) to TLS_SLOT_SANITIZER(6)"" This reapplies commit r348983. llvm-svn: 350448	2019-01-05 00:44:58 +00:00
Rong Xu	b5fa0a89b2	[PGO] Use SourceFileName rather module name in PGOFuncName In LTO or Thin-lto mode (though linker plugin), the module names are of temp file names which are different for different compilations. Using SourceFileName avoids the issue. This should not change any functionality for current PGO as all the current callers of getPGOFuncName() is before LTO. llvm-svn: 350442	2019-01-04 22:54:03 +00:00
Nikita Popov	c35b4a37ba	[X86] Fix warning; NFC llvm-svn: 350437	2019-01-04 21:41:35 +00:00
Vyacheslav Zakharin	0a6f86c54b	Update the pr_datasz of .note.gnu.property section. Patch by Xiang Zhang. Differential Revision: https://reviews.llvm.org/D56080 llvm-svn: 350436	2019-01-04 21:25:01 +00:00
Nikita Popov	6658fce4fc	[BDCE] Remove dead uses of arguments In addition to finding dead uses of instructions, also find dead uses of function arguments, and replace them with zero as well. I'm changing the way the known bits are computed here to remove the coupling between the transfer function and the algorithm. It previously relied on the first op being visited first and computing known bits -- unless the first op is not an instruction, in which case they're computed on the second op. I could have adjusted this to check for "instruction or argument", but I think it's better to avoid the repeated calculation with an explicit flag. Differential Revision: https://reviews.llvm.org/D56247 llvm-svn: 350435	2019-01-04 21:21:43 +00:00
Evandro Menezes	9f53bea536	[AArch64] Adjust the cost model for Exynos M3 Improve the modeling of ASIMD loads and stores. llvm-svn: 350434	2019-01-04 21:02:25 +00:00
Craig Topper	cfeb1cf9af	[X86] Add INSERT_SUBVECTOR to ComputeNumSignBits This adds support for calculating sign bits of insert_subvector. I based it on the computeKnownBits. My motivating case is propagating sign bits information across basic blocks on AVX targets where concatenating using insert_subvector is common. Differential Revision: https://reviews.llvm.org/D56283 llvm-svn: 350432	2019-01-04 20:50:59 +00:00
Peter Collingbourne	87f477b5e4	hwasan: Implement lazy thread initialization for the interceptor ABI. The problem is similar to D55986 but for threads: a process with the interceptor hwasan library loaded might have some threads started by instrumented libraries and some by uninstrumented libraries, and we need to be able to run instrumented code on the latter. The solution is to perform per-thread initialization lazily. If a function needs to access shadow memory or add itself to the per-thread ring buffer its prologue checks to see whether the value in the sanitizer TLS slot is null, and if so it calls __hwasan_thread_enter and reloads from the TLS slot. The runtime does the same thing if it needs to access this data structure. This change means that the code generator needs to know whether we are targeting the interceptor runtime, since we don't want to pay the cost of lazy initialization when targeting a platform with native hwasan support. A flag -fsanitize-hwaddress-abi={interceptor,platform} has been introduced for selecting the runtime ABI to target. The default ABI is set to interceptor since it's assumed that it will be more common that users will be compiling application code than platform code. Because we can no longer assume that the TLS slot is initialized, the pthread_create interceptor is no longer necessary, so it has been removed. Ideally, lazy initialization should only cost one instruction in the hot path, but at present the call may cause us to spill arguments to the stack, which means more instructions in the hot path (or theoretically in the cold path if the spills are moved with shrink wrapping). With an appropriately chosen calling convention for the per-thread initialization function (TODO) the hot path should always need just one instruction and the cold path should need two instructions with no spilling required. Differential Revision: https://reviews.llvm.org/D56038 llvm-svn: 350429	2019-01-04 19:27:04 +00:00
Teresa Johnson	853b962416	[ThinLTO] Handle chains of aliases At -O0, globalopt is not run during the compile step, and we can have a chain of an alias having an immediate aliasee of another alias. The summaries are constructed assuming aliases in a canonical form (flattened chains), and as a result only the base object but no intermediate aliases were preserved. Fix by adding a pass that canonicalize aliases, which ensures each alias is a direct alias of the base object. Reviewers: pcc, davidxl Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits Differential Revision: https://reviews.llvm.org/D54507 llvm-svn: 350423	2019-01-04 19:04:54 +00:00
Sanjay Patel	6153565511	[x86] lower extracted fadd/fsub to horizontal vector math; 2nd try The 1st try for this was at rL350369, but it caused IR-level diffs because our cost models differentiate custom vs. legal/promote lowering. So that was reverted at rL350373. The cost models were fixed independently at rL350403, so this is effectively the same patch as last time. Original commit message: This would show up if we fix horizontal reductions to narrow as they go along, but it's an improvement for size and/or Jaguar (fast-hops) independent of that. We need to do this late to not interfere with other pattern matching of larger horizontal sequences. We can extend this to integer ops in a follow-up patch. Differential Revision: https://reviews.llvm.org/D56011 llvm-svn: 350421	2019-01-04 17:48:13 +00:00
Vedant Kumar	a1778df474	[CodeExtractor] Do not extract unsafe lifetime markers Lifetime markers which reference inputs to the extraction region are not safe to extract. Example ('rhs' will be extracted): ``` entry: +------------+ \| x = alloca \| \| y = alloca \| +------------+ / \ lhs: rhs: +-------------------+ +-------------------+ \| lifetime_start(x) \| \| lifetime_start(x) \| \| use(x) \| \| lifetime_start(y) \| \| lifetime_end(x) \| \| use(x, y) \| \| lifetime_start(y) \| \| lifetime_end(y) \| \| use(y) \| \| lifetime_end(x) \| \| lifetime_end(y) \| +-------------------+ +-------------------+ ``` Prior to extraction, the stack coloring pass sees that the slots for 'x' and 'y' are in-use at the same time. After extraction, the coloring pass infers that 'x' and 'y' are not in-use concurrently, because markers from 'rhs' are no longer available to help decide otherwise. This leads to a miscompile, because the stack slots actually are in-use concurrently in the extracted function. Fix this by moving lifetime start/end markers for memory regions defined in the calling function around the call to the extracted function. Fixes llvm.org/PR39671 (rdar://45939472). Differential Revision: https://reviews.llvm.org/D55967 llvm-svn: 350420	2019-01-04 17:43:22 +00:00
Sanjay Patel	722466e1f1	[InstCombine] reduce raw IR narrowing rotate patterns to funnel shift Similar to rL350199 - there are no known analysis/codegen holes for funnel shift intrinsics now, so we can canonicalize the 6+ regular instructions to funnel shift to improve vectorization, inlining, unrolling, etc. llvm-svn: 350419	2019-01-04 17:38:12 +00:00
John Brawn	39ac159c24	[LICM] Adjust how moving the re-hoist point works In some cases the order that we hoist instructions in means that when rehoisting (which uses the same order as hoisting) we can rehoist to a block A, then a block B, then block A again. This currently causes an assertion failure as it expects that when changing the hoist point it only ever moves to a block that dominates the hoist point being moved from. Fix this by moving the re-hoist point when it doesn't dominate the dominator of hoisted instruction, or in other words when it wouldn't dominate the uses of the instruction being rehoisted. Differential Revision: https://reviews.llvm.org/D55266 llvm-svn: 350408	2019-01-04 17:12:09 +00:00
Nirav Dave	1468d6e1c5	Undo r350355 "[X86] Remove terrible DX Register parsing hack in parse operand. NFCI." Add missing test case and update comments. llvm-svn: 350406	2019-01-04 17:11:15 +00:00
Simon Pilgrim	c2054144ee	[CostModel][X86] Fix SSE1 FADD/FSUB costs Noticed in D56011 - handle the case that scalar fp ops are quicker on P3 than P4 Add the other costs so that we're not relying on the default "is legal/custom" cost logic. llvm-svn: 350403	2019-01-04 16:55:57 +00:00
Ranjeet Singh	107dd2565c	Revert patches 348835 and 348571 because they're causing code size performance regressions. llvm-svn: 350402	2019-01-04 16:39:10 +00:00
Simon Pilgrim	9f4dea8c06	[X86] Add VPSLLI/VPSRLI ((X >>u C1) << C2) SimplifyDemandedBits combine Repeat of the generic SimplifyDemandedBits shift combine llvm-svn: 350399	2019-01-04 15:43:43 +00:00
Andrea Di Biagio	3f4b54850f	[MCA] Improved handling of in-order issue/dispatch resources. Added field 'MustIssueImmediately' to the instruction descriptor of instructions that only consume in-order issue/dispatch processor resources. This speeds up queries from the hardware Scheduler, and gives an average ~5% speedup on a release build. No functional change intended. llvm-svn: 350397	2019-01-04 15:08:38 +00:00
Florian Hahn	7902405c42	[ValueTracking] Fix a misuse of APInt in GetPointerBaseWithConstantOffset GetPointerBaseWithConstantOffset include this code, where ByteOffset and GEPOffset are both of type llvm::APInt : ByteOffset += GEPOffset.getSExtValue(); The problem with this line is that getSExtValue() returns an int64_t, but the += matches an overload for uint64_t. The problem is that the resulting APInt is no longer considered to be signed. That in turn causes assertion failures later on if the relevant pointer type is > 64 bits in width and the GEPOffset was negative. Changing it to ByteOffset += GEPOffset.sextOrTrunc(ByteOffset.getBitWidth()); resolves the issue and explicitly performs the sign-extending or truncation. Additionally, instead of asserting later if the result is > 64 bits, it breaks out of the loop in that case. See also https://reviews.llvm.org/D24729 https://reviews.llvm.org/D24772 This commit must be merged after D38662 in order for the test to pass. Patch by Michael Ferguson <mpfergu@gmail.com>. Reviewers: reames, sanjoy, hfinkel Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D38501 llvm-svn: 350395	2019-01-04 14:53:22 +00:00
Andrea Di Biagio	7bec693433	[MCA] Store extra information about processor resources in the ResourceManager. Method ResourceManager::use() is responsible for updating the internal state of used processor resources, as well as notifying resource groups that contain used resources. Before this patch, method 'use()' didn't know how to quickly obtain the set of groups that contain a particular resource unit. It had to discover groups by perform a potentially slow search (done by iterating over the set of processor resource descriptors). With this patch, the relationship between resource units and groups is stored in the ResourceManager. That means, method 'use()' no longer has to search for groups. This gives an average speedup of ~4-5% on a release build. This patch also adds extra code comments in ResourceManager.h to better describe the resource mask layout, and how resouce indices are computed from resource masks. llvm-svn: 350387	2019-01-04 12:31:14 +00:00
Richard Trieu	e1fef949ae	[WebAssembly] Split the checking from the sorting logic. Move the check for -1 and identical values outside the vector sorting code. Compare functions need to be able to compare identical elements to be conforming. llvm-svn: 350379	2019-01-04 06:49:24 +00:00
Xin Tong	47beee2f3f	[memcpyopt] Remove a few unnecessary isVolatile() checks. NFC We already checked for isSimple() on the store. llvm-svn: 350378	2019-01-04 02:13:22 +00:00
Craig Topper	6265a15f2e	[X86] Add post-isel peephole to fold KAND+KORTEST into KTEST if only the zero flag is used. Doing this late so we will prefer to fold the AND into a masked comparison first. That can be better for the live range of the mask register. Differential Revision: https://reviews.llvm.org/D56246 llvm-svn: 350374	2019-01-04 00:10:58 +00:00
Sanjay Patel	26ce9c38a7	revert r350369: [x86] lower extracted fadd/fsub to horizontal vector math There are non-codegen tests that need to be updated with this code change. llvm-svn: 350373	2019-01-04 00:02:02 +00:00
Sanjay Patel	ef4afca2ad	[x86] lower extracted fadd/fsub to horizontal vector math This would show up if we fix horizontal reductions to narrow as they go along, but it's an improvement for size and/or Jaguar (fast-hops) independent of that. We need to do this late to not interfere with other pattern matching of larger horizontal sequences. We can extend this to integer ops in a follow-up patch. Differential Revision: https://reviews.llvm.org/D56011 llvm-svn: 350369	2019-01-03 23:16:19 +00:00
Heejin Ahn	777d01c756	[WebAssembly] Optimize Irreducible Control Flow Summary: Irreducible control flow is not that rare, e.g. it happens in malloc and 3 other places in the libc portions linked in to a hello world program. This patch improves how we handle that code: it emits a br_table to dispatch to only the minimal necessary number of blocks. This reduces the size of malloc by 33%, and makes it comparable in size to asm2wasm's malloc output. Added some tests, and verified this passes the emscripten-wasm tests run on the waterfall (binaryen2, wasmobj2, other). Reviewers: aheejin, sunfish Subscribers: mgrang, jgravelle-google, sbc100, dschuff, llvm-commits Differential Revision: https://reviews.llvm.org/D55467 Patch by Alon Zakai (kripken) llvm-svn: 350367	2019-01-03 23:10:11 +00:00
Wouter van Oortmerssen	820c6263d9	[WebAssembly] Fixed disassembler not knowing about new brlist operand Summary: The previously introduced new operand type for br_table didn't have a disassembler implementation, causing an assert. Reviewers: dschuff, aheejin Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D56227 llvm-svn: 350366	2019-01-03 23:01:30 +00:00
Wouter van Oortmerssen	9843295608	[WebAssembly] Made InstPrinter more robust Summary: Instead of asserting on certain kinds of malformed instructions, it now still print, but instead adds an annotation indicating the problem, and/or indicates invalid_type etc. We're using the InstPrinter from many contexts that can't always guarantee values are within range (e.g. the disassembler), where having output is more valueable than asserting. Reviewers: dschuff, aheejin Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D56223 llvm-svn: 350365	2019-01-03 22:59:59 +00:00
Nirav Dave	8de916d1a4	[X86] Remove terrible DX Register parsing hack in parse operand. NFCI. Fold hack special casing of (%dx) operand parsing into the related hack for out/in instruction parsing. llvm-svn: 350355	2019-01-03 21:46:30 +00:00
Sanjay Patel	9633d76a40	[DAGCombiner][x86] scalarize binop followed by extractelement As noted in PR39973 and D55558: https://bugs.llvm.org/show_bug.cgi?id=39973 ...this is a partial implementation of a fold that we do as an IR canonicalization in instcombine: // extelt (binop X, Y), Index --> binop (extelt X, Index), (extelt Y, Index) We want to have this in the DAG too because as we can see in some of the test diffs (reductions), the pattern may not be visible in IR. Given that this is already an IR canonicalization, any backend that would prefer a vector op over a scalar op is expected to already have the reverse transform in DAG lowering (not sure if that's a realistic expectation though). The transform is limited with a TLI hook because there's an existing transform in CodeGenPrepare that tries to do the opposite transform. Differential Revision: https://reviews.llvm.org/D55722 llvm-svn: 350354	2019-01-03 21:31:16 +00:00
Alexander Timofeev	993e2798fd	[AMDGPU] Fix scalar operand folding bug that causes SHOC performance regression. Detailed description: SIFoldOperands::foldInstOperand iterates over the operand uses calling the function that changes def-use iteratorson the way. As a result loop exits immediately when def-use iterator is changed. Hence, the operand is folded to the very first use instruction only. This makes VGPR live along the whole basic block and increases register pressure significantly. The performance drop observed in SHOC DeviceMemory test is caused by this bug. Proposed fix: collect uses to separate container for further processing in another loop. Testing: make check-llvm SHOC performance test. Reviewers: rampitec, ronlieb Differential Revision: https://reviews.llvm.org/D56161 llvm-svn: 350350	2019-01-03 19:55:32 +00:00
Anna Thomas	a470aa6701	[UnrollRuntime] Move the DomTree verification under expensive checks Suggested by Hal as done in r349871. llvm-svn: 350349	2019-01-03 19:43:33 +00:00
Stefan Granitz	a9b7ca472d	Revert "Resubmit rL345008 "Split MachinePipeliner code into header and cpp files"" This reverts commit r350290. llvm-svn: 350345	2019-01-03 19:09:24 +00:00
Kristina Brooks	e434280f3d	[MCStreamer] Use report_fatal_error in EmitRawTextImpl Use report_fatal_error in MCStreamer::EmitRawTextImpl instead of using errs() and explain the rationale behind it not being llvm_unreachable() to save confusion for any future maintainers. Differential Revision: https://reviews.llvm.org/D56245 llvm-svn: 350342	2019-01-03 18:42:31 +00:00
Anna Thomas	0785e7307e	[UnrollRuntime] Add DomTree verification under debug mode NFC: This adds the dom tree verification under debug mode at a point just before we start unrolling the loop. This allows us to verify dom tree at a state where it is much smaller and before the unrolling actually happens. This also implies we do not need to run -verify-dom-info everytime to see if the DT is in a valid state when we transform the loop for runtime unrolling. llvm-svn: 350334	2019-01-03 17:44:44 +00:00
Evandro Menezes	0f67746c92	[AArch64] Add new scheduling predicates Add new scheduling predicates to identify the ASIMD loads and stores using the post indexed addressing mode. llvm-svn: 350332	2019-01-03 17:28:09 +00:00
Andrea Di Biagio	b284054b26	[MCA] Improve code comment and reuse an helper function in ResourceManager. NFCI llvm-svn: 350322	2019-01-03 14:47:46 +00:00
Alex Bradbury	2ba76be882	[RISCV][MC] Accept %lo and %pcrel_lo on operands to li This matches GNU assembler behaviour. llvm-svn: 350321	2019-01-03 14:41:41 +00:00
Philip Pfaffe	b39a97c8f6	[NewPM] Port Msan Summary: Keeping msan a function pass requires replacing the module level initialization: That means, don't define a ctor function which calls __msan_init, instead just declare the init function at the first access, and add that to the global ctors list. Changes: - Pull the actual sanitizer and the wrapper pass apart. - Add a newpm msan pass. The function pass inserts calls to runtime library functions, for which it inserts declarations as necessary. - Update tests. Caveats: - There is one test that I dropped, because it specifically tested the definition of the ctor. Reviewers: chandlerc, fedor.sergeev, leonardchan, vitalybuka Subscribers: sdardis, nemanjai, javed.absar, hiraditya, kbarton, bollu, atanasyan, jsji Differential Revision: https://reviews.llvm.org/D55647 llvm-svn: 350305	2019-01-03 13:42:44 +00:00
Simon Pilgrim	c2aadfaaad	[SLPVectorizer] Flag ADD/SUB SSAT/USAT intrinsics trivially vectorizable (PR40123) Enables SLP vectorization for the SSE2 PADDS/PADDUS/PSUBS/PSUBUS style intrinsics llvm-svn: 350300	2019-01-03 12:18:23 +00:00
Diogo N. Sampaio	8786a946d8	[ARM] Add command-line option for SB SB (Speculative Barrier) is only mandatory from 8.5 onwards but is optional from Armv8.0-A. This patch adds a command line option to enable SB, as it was previously only possible to enable by selecting -march=armv8.5-a. This patch also renames FeatureSpecRestrict to FeatureSB. Reviewed By: olista01, LukeCheeseman Differential Revision: https://reviews.llvm.org/D55990 llvm-svn: 350299	2019-01-03 12:09:12 +00:00
Simon Pilgrim	d824f99a6c	[X86] Add ADD/SUB SSAT/USAT vector costs (PR40123) Costs for real SSE2 instructions llvm-svn: 350295	2019-01-03 11:38:42 +00:00
Piotr Sobczak	3abef8f9ea	[AMDGPU] Change section name with metadata access Summary: The commit rL348922 introduced a means to set Metadata section kind for a global variable, if its explicit section name was prefixed with ".AMDGPU.metadata.". This patch changes that prefix to ".AMDGPU.comment.", as "metadata" in the section name might lead to ambiguity with metadata used by AMD PAL runtime. Change-Id: Idd4748800d6fe801441d91595fc21e5a4171e668 Reviewers: kzhuravl Reviewed By: kzhuravl Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D56197 llvm-svn: 350292	2019-01-03 11:22:58 +00:00
Lama Saba	4d752a88e8	Resubmit rL345008 "Split MachinePipeliner code into header and cpp files" The commit caused unclear failures in http://green.lab.llvm.org/green//job/lldb-cmake/ will revert if the error reappears Differential Revision: https://reviews.llvm.org/D56084 llvm-svn: 350290	2019-01-03 10:03:54 +00:00
Markus Lavin	72b9deb21f	[CodeGen] Skip over dbg-instr in twoaddr pass A DBG_VALUE between a two-address instruction and a following COPY would prevent rescheduleMIBelowKill optimization inside TwoAddressInstructionPass. Differential Revision: https://reviews.llvm.org/D55987 llvm-svn: 350289	2019-01-03 08:36:06 +00:00
Martin Storsjo	74e7d26090	[llvm-readobj] [COFF] Print the symbol index for relocations There can be multiple local symbols with the same name (for e.g. comdat sections), and thus the symbol name itself isn't enough to disambiguate symbols. Differential Revision: https://reviews.llvm.org/D56140 llvm-svn: 350288	2019-01-03 08:08:23 +00:00
Kristina Brooks	bbbec9daa4	Don't go over 80 chars in MCStreamer.cpp. NFC. Fixing up style issues around the area to prepare for a larger differential. llvm-svn: 350286	2019-01-03 06:06:38 +00:00
QingShan Zhang	f24ec7bdd0	[Power9] Enable the Out-of-Order scheduling model for P9 hw When switched to the MI scheduler for P9, the hardware is modeled as out of order. However, inside the MI Scheduler algorithm, we still use the in-order scheduling model as the MicroOpBufferSize isn't set. The MI scheduler take it as the hw cannot buffer the op. So, only when all the available instructions issued, the pending instruction could be scheduled. That is not true for our P9 hw in fact. This patch is trying to enable the Out-of-Order scheduling model. The buffer size 44 is picked from the P9 hw spec, and the perf test indicate that, its value won't hurt the cpu2017. With this patch, there are 3 specs improved over 3% and 1 spec deg over 3%. The detail is as follows: x264_r: +6.95% cactuBSSN_r: +6.94% lbm_r: +4.11% xz_r: -3.85% And the GEOMEAN for all the C/C++ spec in spec2017 is about 0.18% improved. Reviewer: Nemanjai Differential Revision: https://reviews.llvm.org/D55810 llvm-svn: 350285	2019-01-03 05:04:18 +00:00
Pete Cooper	697281df42	Teach ObjCARC optimizer about equivalent PHIs when eliminating autoreleaseRV/retainRV pairs OptimizeAutoreleaseRVCall skips optimizing llvm.objc.autoreleaseReturnValue if it sees a user which is llvm.objc.retainAutoreleasedReturnValue, and if they have equivalent arguments (either identical or equivalent PHIs). It then assumes that ObjCARCOpt::OptimizeRetainRVCall will optimize the pair instead. Trouble is, ObjCARCOpt::OptimizeRetainRVCall doesn't know about equivalent PHIs so optimizes in a different way and we are left with an unoptimized llvm.objc.autoreleaseReturnValue. This teaches ObjCARCOpt::OptimizeRetainRVCall to also understand PHI equivalence. rdar://problem/47005143 Reviewed By: ahatanak Differential Revision: https://reviews.llvm.org/D56235 llvm-svn: 350284	2019-01-03 01:38:08 +00:00
Robert Widmann	7882b283cd	[LLVM-C] Expand LLVMRelocMode Summary: Add read[only\|write] PIC relocation models to the C API and teach the TargetMachine API about it. Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D56187 llvm-svn: 350279	2019-01-03 00:33:44 +00:00
Craig Topper	df5304d8de	[X86] Add load folding support to the custom isel we do for X86ISD::UMUL/SMUL. The peephole pass isn't always able to fold the load because it can't commute the implicit usage of AL/AX/EAX/RAX. llvm-svn: 350272	2019-01-02 23:24:08 +00:00
Wouter van Oortmerssen	ad72f68501	[WebAssembly] made assembler parse block_type Summary: This was previously ignored and an incorrect value generated. Also fixed Disassembler's handling of block_type. Reviewers: dschuff, aheejin Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D56092 llvm-svn: 350270	2019-01-02 23:23:51 +00:00
Xin Tong	33e3b4b9b3	[ThinLTO] Scan all variants of vague symbol for reachability. Summary: Alias can make one (but not all) live, we still need to scan all others if this symbol is reachable from somewhere else. Reviewers: tejohnson, grimar Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D56117 llvm-svn: 350269	2019-01-02 23:18:20 +00:00
Pete Cooper	8d58048024	Fix assert in ObjCARC optimizer when deleting retainBlock of null or undef. The caller to EraseInstruction had this conditional: // ARC calls with null are no-ops. Delete them. if (IsNullOrUndef(Arg)) but the assert inside EraseInstruction only allowed ConstantPointerNull and not undef or bitcasts. This adds support for both of these cases. rdar://problem/47003805 llvm-svn: 350261	2019-01-02 21:00:02 +00:00
Nikita Popov	cc6ef7f153	[BDCE] Remove instructions without demanded bits If an instruction has no demanded bits, remove it directly during BDCE, instead of leaving it for something else to clean up. Differential Revision: https://reviews.llvm.org/D56185 llvm-svn: 350257	2019-01-02 20:02:14 +00:00
Pawel Bylica	119aa8fa5f	Format AggresiveInstCombine.cpp. NFC llvm-svn: 350255	2019-01-02 19:51:46 +00:00
Craig Topper	9d4860ec4e	[X86] Remove X86ISD::INC/DEC. Just select them from X86ISD::ADD/SUB at isel time INC/DEC are pretty much the same as ADD/SUB except that they don't update the C flag. This patch removes the special nodes and just pattern matches from ADD/SUB during isel if the C flag isn't being used. I had to avoid selecting DEC is the result isn't used. This will become a SUB immediate which will turned into a CMP later by optimizeCompareInstr. This lead to the one test change where we use a CMP instead of a DEC for an overflow intrinsic since we only checked the flag. This also exposed a hole in our RMW flag matching use of hasNoCarryFlagUses. Our root node for the match is a store and there's no guarantee that all the flag users have been selected yet. So hasNoCarryFlagUses needs to check copyToReg and machine opcodes, but it also needs to check for the pre-match SETCC, SETCC_CARRY, BRCOND, and CMOV opcodes. Differential Revision: https://reviews.llvm.org/D55975 llvm-svn: 350245	2019-01-02 19:01:05 +00:00
Zachary Turner	ba797b6dae	[MS Demangler] Add a flag for dumping types without tag specifier. Sometimes it's useful to be able to output demangled names without tag specifiers like "struct", "class", etc. This patch adds a flag enabling this. llvm-svn: 350241	2019-01-02 18:33:12 +00:00
Craig Topper	8dd7bd2cd7	[DAGCombiner] After performing the division by constant optimization for a DIV or REM node, replace the users of the corresponding REM or DIV node if it exists. Currently we expand the two nodes separately. This gives DAG combiner an opportunity to optimize the expanded sequence taking into account only one set of users. When we expand the other node we'll create the expansion again, but might not be able to optimize it the same way. So the nodes won't CSE and we'll have two similarish sequences in the same basic block. By expanding both nodes at the same time we'll avoid prematurely optimizing the expansion until both the division and remainder have been replaced. Improves the test case from PR38217. There may be additional opportunities after this. Differential Revision: https://reviews.llvm.org/D56145 llvm-svn: 350239	2019-01-02 18:19:07 +00:00
Craig Topper	3109f3a4ab	[LegalizeIntegerTypes] When promoting the result of an extract_vector_elt also promote the input type if necessary By also promoting the input type we get a better idea for what scalar type to use. This can provide better results if the result of the extract is sign extended. What was previously happening is that the extract result would be legalized, sometime later the input of the sign extend would be legalized using the result of the extract. Then later the extract input would be legalized forcing a truncate into the input of the sign extend using a replace all uses. This requires DAG combine to combine out the sext/truncate pair. But sometimes we visited the truncate first and messed things up before the sext could be combined. By creating the extract with the correct scalar type when we create legalize the result type, the truncate will be added right away. Then when the sign_extend input is legalized it will create an any_extend of the truncate which can be optimized by getNode to maybe remove the truncate. And then a sign_extend_inreg. Now DAG combine doesn't have to worry about getting rid of the extend. This fixes the regression on X86 in D56156. Differential Revision: https://reviews.llvm.org/D56176 llvm-svn: 350236	2019-01-02 17:58:30 +00:00
Craig Topper	c562fae02b	[DAGCombiner][X86][PowerPC] Teach visitSIGN_EXTEND_INREG to fold (sext_in_reg (aext/sext x)) -> (sext x) when x has more than 1 sign bit and the sext_inreg is from one of them. If x has multiple sign bits than it doesn't matter which one we extend from so we can sext from x's msb instead. The X86 setcc-combine.ll changes are a little weird. It appears we ended up with a (sext_inreg (aext (trunc (extractelt)))) after type legalization. The sext_inreg+aext now gets optimized by this combine to leave (sext (trunc (extractelt))). Then we visit the trunc before we visit the sext. This ends up changing the truncate to an extractvectorelt from a bitcasted vector. I have a follow up patch to fix this. Differential Revision: https://reviews.llvm.org/D56156 llvm-svn: 350235	2019-01-02 17:58:27 +00:00
Wei Mi	ecc89b76cb	[PowerPC] Remove SeenUse check when optimizing conditional branch in PPCPreEmitPeephole pass. PPCPreEmitPeephole will convert a BC to B when the conditional branch is based on a constant CR by CRSET or CRUNSET. This is added in https://reviews.llvm.org/rL343100. When the conditional branch is known to be always taken, all branches will be removed and a new unconditional branch will be inserted. However, when SeenUse is false the original patch will not remove the branches, but still insert the new unconditional branch, update the successors and create inconsistent IR. Compiling the synthetic testcase included can show the problem we run into. The patch simply removes the SeenUse condition when adding branches into InstrsToErase set. Differential Revision: https://reviews.llvm.org/D56041 llvm-svn: 350223	2019-01-02 17:07:23 +00:00
Simon Pilgrim	d8125726d5	[X86] Support SHLD/SHRD masked shift-counts (PR34641) Peek through shift modulo masks while matching double shift patterns. I was hoping to delay this until I could remove the X86 code with generic funnel shift matching (PR40081) but this will do for now. Differential Revision: https://reviews.llvm.org/D56199 llvm-svn: 350222	2019-01-02 17:05:37 +00:00
Hal Finkel	4f2381440d	[BasicAA] Support arbitrary pointer sizes (and fix an overflow bug) Motivated by the discussion in D38499, this patch updates BasicAA to support arbitrary pointer sizes by switching most remaining non-APInt calculations to use APInt. The size of these APInts is set to the maximum pointer size (maximum over all address spaces described by the data layout string). Most of this translation is straightforward, but this patch contains a fix for a bug that revealed itself during this translation process. In order for test/Analysis/BasicAA/gep-and-alias.ll to pass, which is run with 32-bit pointers, the intermediate calculations must be performed using 64-bit integers. This is because, as noted in the patch, when GetLinearExpression decomposes an expression into C1V+C2, and we then multiply this by Scale, and distribute, to get (C1Scale)V + C2Scale, it can be the case that, even through C1V+C2 does not overflow for relevant values of V, (C2Scale) can overflow. If this happens, later logic will draw invalid conclusions from the (base) offset value. Thus, when initially applying the APInt conversion, because the maximum pointer size in this test is 32 bits, it started failing. Suspicious, I created a 64-bit version of this test (included here), and that failed (miscompiled) on trunk for a similar reason (the multiplication can overflow). After fixing this overflow bug, the first test case (at least) in Analysis/BasicAA/q.bad.ll started failing. This is also a 32-bit test, and was relying on having 64-bit intermediate values to have BasicAA return an accurate result. In order to fix this problem, and because I believe that it is not uncommon to use i64 indexing expressions in 32-bit code (especially portable code using int64_t), it seems reasonable to always use at least 64-bit integers. In this way, we won't regress our analysis capabilities (and there's a command-line option added, so experimenting with this should be easy). As pointed out by Eli during the review, there are other potential overflow conditions that this patch does not address. Fixing those is left to follow-up work. Patch by me with contributions from Michael Ferguson (mferguson@cray.com). Differential Revision: https://reviews.llvm.org/D38662 llvm-svn: 350220	2019-01-02 16:28:09 +00:00
Philip Pfaffe	6bc98ad7e8	Extend Module::getOrInsertGlobal to control the construction of the GlobalVariable Summary: Extend Module::getOrInsertGlobal to accept a callback for creating a new GlobalVariable if necessary instead of calling the GV constructor directly using default arguments. Additionally overload getOrInsertGlobal for the previous default behavior. Reviewers: chandlerc Subscribers: hiraditya, llvm-commits, bollu Differential Revision: https://reviews.llvm.org/D56130 llvm-svn: 350219	2019-01-02 15:41:47 +00:00
Andrea Di Biagio	0682afbaee	[MCA] Minor refactoring of method DefaultResourceStrategy::select. NFCI Common code used by the default resource strategy to select pipeline resources has been moved to an helper function. The new selection logic has been slightly rewritten to get rid of a redundant zero check on the `ReadyMask` value. Before this patch, method select internally called function `PowerOf2Floor` to compute the next ready pipeline resource. However, `PowerOf2Floor` forces an implicit (redundant) zero check on the input value. By construction, `ReadyMask` can never be zero. This patch replaces the call to `PowerOf2Floor` with an equivalent block of code which avoids the redundant zero check. This gives a minor 3-3.5% speedup on a release build. No functional change intended. llvm-svn: 350218	2019-01-02 15:40:52 +00:00
Piotr Sobczak	378131bae0	[AMDGPU] Handle OR as operand of raw load/store Summary: Use isBaseWithConstantOffset() which handles OR as an operand to llvm.amdgcn.raw.buffer.load and llvm.amdgcn.raw.buffer.store. Change-Id: Ifefb9dc5ded8710d333df07ab1900b230e33539a Reviewers: nhaehnle, mareko, arsenm Reviewed By: arsenm Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D55999 llvm-svn: 350208	2019-01-02 09:47:41 +00:00
Craig Topper	f7cc7e3201	[X86] Remove the separate SMUL8/UMUL8 X86ISD opcodes by merging with SMUL/UMUL. Remove the second result from X86ISD::UMUL. All of these use custom isel so we can pretty easily detect the differences in the custom code in X86ISelDAGToDAG. The ISD opcodes just need to express the desired semantics not the details of how they would be selected by isel. So unifying them lets us remove the special casing from lowering. llvm-svn: 350206	2019-01-02 06:40:11 +00:00
Craig Topper	d4db122483	[X86] Allow LowerSELECT and LowerBRCOND to directly lower i8 UMULO/SMULO. These require a different X86ISD node to be created than i16/i32/i64. I guess no one wanted to add the special code for that except in LowerXALUO. But now LowerXALUO, LowerSELECT, and LowerBRCOND all use a common helper function so they all share the special code. Unfortunately, there are no test changes because we seem to correct the miss in a DAG combine later. I did verify it manually using test cases from xmulo.ll llvm-svn: 350205	2019-01-02 05:46:03 +00:00
Sanjay Patel	654e6aabb9	[InstCombine] canonicalize raw IR rotate patterns to funnel shift The final piece of IR-level analysis to allow this was committed with: rL350188 Using the intrinsics should improve transforms based on cost models like vectorization and inlining. The backend should be prepared too, so we can now canonicalize more sequences of shift/logic to the intrinsics and know that the end result should be equal or better to the original code even if the target does not have an actual rotate instruction. llvm-svn: 350199	2019-01-01 21:51:39 +00:00
Craig Topper	00b390a000	[X86] Factor the core code out of LowerXALUO into a helper function. Use it in LowerBRCOND and LowerSELECT to avoid some duplicated code. This makes it easier to keep the LowerBRCOND and LowerSELECT code in sync with LowerXALUO so they always pick the same operation for overflowing instructions. This is inspired by the helper functions used by ARM and AArch64 for the same purpose. The test change is because LowerSELECT was not in sync with LowerXALUO with regard to INC/DEC for SADDO/SSUBO. llvm-svn: 350198	2019-01-01 19:34:11 +00:00
Robert Widmann	db5b537f1e	[LLVM-C] bool -> LLVMBool llvm-svn: 350197	2019-01-01 19:03:37 +00:00
Robert Widmann	5d1dfa3eb6	[LLVM-C] Add Accessors for Discarding Value Names in the IR Summary: Add accessors so the performance improvement from this setting is accessible to third parties. Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D56179 llvm-svn: 350196	2019-01-01 18:56:51 +00:00
Sanjay Patel	738a863648	[x86] move/rename helper for horizontal op codegen; NFC Preliminary commit as suggested in D56011. llvm-svn: 350193	2019-01-01 16:08:36 +00:00
Nikita Popov	bc9986e9ad	Reapply "[BDCE][DemandedBits] Detect dead uses of undead instructions" This (mostly) fixes https://bugs.llvm.org/show_bug.cgi?id=39771. BDCE currently detects instructions that don't have any demanded bits and replaces their uses with zero. However, if an instruction has multiple uses, then some of the uses may be dead (have no demanded bits) even though the instruction itself is still live. This patch extends DemandedBits/BDCE to detect such uses and replace them with zero. While this will not immediately render any instructions dead, it may lead to simplifications (in the motivating case, by converting a rotate into a simple shift), break dependencies, etc. The implementation tries to strike a balance between analysis power and complexity/memory usage. Originally I wanted to track demanded bits on a per-use level, but ultimately we're only really interested in whether a use is entirely dead or not. I'm using an extra set to track which uses are dead. However, as initially all uses are dead, I'm not storing uses those user is also dead. This case is checked separately instead. The previous attempt to land this lead to miscompiles, because cases where uses were initially dead but were later found to be live during further analysis were not always correctly removed from the DeadUses set. This is fixed now and the added test case demanstrates such an instance. Differential Revision: https://reviews.llvm.org/D55563 llvm-svn: 350188	2019-01-01 10:05:26 +00:00
Ayonam Ray	e00606a1b2	Reversing the commit in revision 350186. Revision causes regression in 4 tests. llvm-svn: 350187	2019-01-01 07:28:55 +00:00
Ayonam Ray	c471bb2e67	Omit range checks from jump tables when lowering switches with unreachable default During the lowering of a switch that would result in the generation of a jump table, a range check is performed before indexing into the jump table, for the switch value being outside the jump table range and a conditional branch is inserted to jump to the default block. In case the default block is unreachable, this conditional jump can be omitted. This patch implements omitting this conditional branch for unreachable defaults. Review Reference: D52002 llvm-svn: 350186	2019-01-01 06:37:50 +00:00
Chen Zheng	4952e668f8	[InstCombine] canonicalize MUL with NEG operand -X * Y --> -(X * Y) X * -Y --> -(X * Y) Differential Revision: https://reviews.llvm.org/D55961 llvm-svn: 350185	2019-01-01 01:09:20 +00:00
Craig Topper	ed3ffae4a4	[SelectionDAG] Add SIGN_EXTEND_VECTOR_INREG support to computeKnownBits. Differential Revision: https://reviews.llvm.org/D56168 llvm-svn: 350179	2018-12-31 19:09:30 +00:00
Craig Topper	bb0873cf46	[X86] Add X86ISD::VSRAI to computeKnownBitsForTargetNode. Differential Revision: https://reviews.llvm.org/D56169 llvm-svn: 350178	2018-12-31 19:09:27 +00:00
Simon Pilgrim	f2b9d10477	Keep tablegen commands in alphabetical order. NFCI. Mentioned on D56167. llvm-svn: 350176	2018-12-31 14:51:53 +00:00
Martin Storsjo	74d93f9b24	[AArch64] Accept "sve" as arch feature in assembler Differential Revision: https://reviews.llvm.org/D56128 llvm-svn: 350174	2018-12-31 10:22:04 +00:00
Alexander Potapenko	cea4f83371	[MSan] Handle llvm.is.constant intrinsic MSan used to report false positives in the case the argument of llvm.is.constant intrinsic was uninitialized. In fact checking this argument is unnecessary, as the intrinsic is only used at compile time, and its value doesn't depend on the value of the argument. llvm-svn: 350173	2018-12-31 09:42:23 +00:00
Craig Topper	802c4979ae	[DAGCombiner] Add missing one use check on the shuffle in the bitcast(shuffle(bitcast(s0),bitcast(s1))) -> shuffle(s0,s1) transform. Found while trying out some other changes so I don't really have a test case. llvm-svn: 350172	2018-12-31 05:40:46 +00:00
Martin Storsjo	2018777836	[AArch64] Implement the .arch_extension directive Differential Revision: https://reviews.llvm.org/D56131 llvm-svn: 350169	2018-12-30 21:06:32 +00:00
Kang Zhang	9d78c60bf4	[PowerPC] Fix machine verify pass error for PATCHPOINT pseudo instruction that bad machine code Summary: For SDAG, we pretend patchpoints aren't special at all until we emit the code for the pseudo. Then the verifier runs and it seems like we have a use of an undefined register (the register will be reserved later, but the verifier doesn't know that). So this patch call setUsesTOCBasePtr before emit the code for the pseudo, so verifier can know X2 is a reserved register. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D56148 llvm-svn: 350165	2018-12-30 15:13:51 +00:00
David Bolvansky	90004149cc	[NFC] Fixed extra semicolon warning -This line, and those below, will be ignored-- M lib/Support/Error.cpp llvm-svn: 350162	2018-12-30 13:18:17 +00:00
Kang Zhang	4aa6453767	[PowerPC] Fix ADDE, SUBE do not know how to promote operator Summary: This patch is created to fix the Bugzilla bug 39815: https://bugs.llvm.org/show_bug.cgi?id=39815 This patch is to support promotion integer result for the instruction ADDE, SUBE. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D56119 llvm-svn: 350161	2018-12-30 07:48:09 +00:00
Craig Topper	a32e353afa	[X86] Don't mark SEXTLOAD from v4i8/v4i16/v8i8 as Custom on pre-sse4.1. This seems to be getting in the way more than its helping. This does mean we stop scalarizing some cases, but I'm not convinced the scalarization was really better. Some of the changes to vsel-cmp-load.ll are a regression but D56156 should fix it. llvm-svn: 350159	2018-12-30 03:05:07 +00:00
Craig Topper	f237ce159e	[X86] Add custom type legalization for SIGN_EXTEND_VECTOR_INREG from 16i16/v32i8 to v4i64 when v4i64 needs splitting. This allows us to sign extend to v4i32 first. And then share that extension to implement the final steps to v4i64 using a pcmpgt and punpckl and punpckh. We already do something similar for SIGN_EXTEND with -x86-experimental-vector-widening-legalization. llvm-svn: 350158	2018-12-30 02:30:34 +00:00
Nemanja Ivanovic	0dad994a10	[PowerPC][NFC] Macro for register set defs for the Asm Parser We have some unfortunate code in the back end that defines a bunch of register sets for the Asm Parser. Every time another class is needed in the parser, we have to add another one of those definitions with explicit lists of registers. This NFC patch simply provides macros to use to condense that code a little bit. Differential revision: https://reviews.llvm.org/D54433 llvm-svn: 350156	2018-12-29 16:13:11 +00:00
Nemanja Ivanovic	0f7715afe1	[PowerPC] Complete the custom legalization of vector int to fp conversion A recent patch has added custom legalization of vector conversions of v2i16 -> v2f64. This just rounds it out for other types where the input vector has an illegal (narrower) type than the result vector. Specifically, this will handle the following conversions: v2i8 -> v2f64 v4i8 -> v4f32 v4i16 -> v4f32 Differential revision: https://reviews.llvm.org/D54663 llvm-svn: 350155	2018-12-29 13:40:48 +00:00
Nemanja Ivanovic	3c7ac649ec	[PowerPC] Fix CR Bit spill pseudo expansion The current CRBIT spill pseudo-op expansion creates a KILL instruction that kills the CRBIT and defines the enclosing CR field. However, this paints a false picture to the register allocator that all bits in the CR field are killed so copies of other bits out of the field become dead and removable. This changes the expansion to preserve the KILL flag on the CRBIT as an implicit use and to treat the CR field as an undef input. Thanks to Hal Finkel for the review and Uli Weigand for implementation input. Differential revision: https://reviews.llvm.org/D55996 llvm-svn: 350153	2018-12-29 11:43:54 +00:00
Simon Atanasyan	a6424e7c4e	[mips] Show an error on attempt to use 64-bit PC-relative relocation The following code requests 64-bit PC-relative relocations unsupported by MIPS ABI. Now it triggers an assertion. It's better to show an error message. ``` foo: .quad bar - foo ``` llvm-svn: 350152	2018-12-29 10:10:02 +00:00
Simon Atanasyan	b243d8d42a	[mips] Show a regular error message on attempt to use one byte relocation llvm-svn: 350151	2018-12-29 10:09:55 +00:00
Max Kazantsev	201534d753	Drop SE cache early because loop parent can change in LoopSimplifyCFG llvm-svn: 350145	2018-12-29 04:26:22 +00:00
Heejin Ahn	4d98dfb67d	[WebAssembly] Fix comments in ExplicitLocals (NFC) llvm-svn: 350144	2018-12-29 02:42:04 +00:00
Richard Trieu	a87b70d1db	Add vtable anchor to classes. llvm-svn: 350142	2018-12-29 02:02:13 +00:00
Craig Topper	0a6cec6f9f	[X86] Don't mark SEXTLOAD v4i8->v4i64 and v8i8->v8i64 as custom under vector widening legalization. This was tricking us into making these operations and then letting them get scalarized later. But I can't prove that the scalarized version is actually better. llvm-svn: 350141	2018-12-29 01:17:11 +00:00
Craig Topper	f814d28eb3	[X86] Directly emit X86ISD::PMULUDQ from the ReplaceNodeResults handling of v2i8/v2i16/v2i32 multiply. Previously we emitted a multiply and some masking that was supposed to matched to PMULUDQ, but the masking could sometimes be removed before we got a chance to match it. So instead just emit the PMULUDQ directly. Remove the DAG combine that was added when the ReplaceNodeResults code was originally added. Add a new DAG combine to avoid regressions in shrink_vmul.ll Some of the shrink_vmul.ll test cases now pick PMULUDQ instead of PMADDWD/PMULLD, but I think this should be an improvement on most CPUs. I think all of this can go away if/when we switch to -x86-experimental-vector-widening-legalization llvm-svn: 350134	2018-12-28 19:19:39 +00:00
Anna Thomas	98743fa77a	[UnrollRuntime] NFC: Add comment and verify LCSSA Added -verify-loop-lcssa to test cases. Updated comments in ConnectProlog. llvm-svn: 350131	2018-12-28 18:52:16 +00:00
Diogo N. Sampaio	9123f82cc4	[AArch64] Add command-line option for SB SB (Speculative Barrier) is only mandatory from 8.5 onwards but is optional from Armv8.0-A. This patch adds a command line option to enable SB, as it was previously only possible to enable by selecting -march=armv8.5-a. This patch also moves to FeatureSB the old FeatureSpecRestrict. Reviewers: pbarrio, olista01, t.p.northover, LukeCheeseman Differential Revision: https://reviews.llvm.org/D55921 llvm-svn: 350126	2018-12-28 17:14:58 +00:00
Hiroshi Inoue	1ea98f040e	[PowerPC] handle ISD:TRUNCATE in BitPermutationSelector This is the last one in a series of patches to support better code generation for bitfield insert. BitPermutationSelector already support ISD::ZERO_EXTEND but not TRUNCATE. This patch adds support for ISD:TRUNCATE in BitPermutationSelector. For example of this test case, struct s64b { int a:4; int b:16; int c:24; }; void bitfieldinsert64b(struct s64b *p, unsigned char v) { p->b = v; } the selection DAG loos like: t14: i32,ch = load<(load 4 from %ir.0)> t0, t2, undef:i64 t18: i32 = and t14, Constant:i32<-1048561> t4: i64,ch = CopyFromReg t0, Register:i64 %1 t22: i64 = AssertZext t4, ValueType:ch:i8 t23: i32 = truncate t22 t16: i32 = shl nuw nsw t23, Constant:i32<4> t19: i32 = or t18, t16 t20: ch = store<(store 4 into %ir.0)> t14:1, t19, t2, undef:i64 By handling truncate in the BitPermutationSelector, we can use information from AssertZext when selecting t19 and skip the mask operation corresponding to t18. So the generated sequences with and without this patch are without this patch rlwinm 5, 5, 0, 28, 11 # corresponding to t18 rlwimi 5, 4, 4, 20, 27 with this patch rlwimi 5, 4, 4, 12, 27 Differential Revision: https://reviews.llvm.org/D49076 llvm-svn: 350118	2018-12-28 08:00:39 +00:00
Max Kazantsev	530ff8f3cc	Temporarily disable term folding in LoopSimplifyCFG, add tests llvm-svn: 350117	2018-12-28 06:22:39 +00:00
Max Kazantsev	80e4b40f3e	[LoopSimplifyCFG] Delete dead blocks in RPO Deletion of dead blocks in arbitrary order may lead to failure of assertion in `DeleteDeadBlock` that requires that we have deleted all predecessors before we can delete the current block. We should instead delete them in RPO order. llvm-svn: 350116	2018-12-28 06:08:51 +00:00
QingShan Zhang	f2d9df61c7	[PowerPC] Remove the implicit use of the register if it is replaced by Imm If we are changing the MI operand from Reg to Imm, we need also handle its implicit use if have. Differential Revision: https://reviews.llvm.org/D56078 llvm-svn: 350115	2018-12-28 03:38:09 +00:00
Zi Xuan Wu	5187444345	[NFC] clang-format functions related to r350113 llvm-svn: 350114	2018-12-28 02:45:17 +00:00
Zi Xuan Wu	a02a3feecf	[PowerPC] Fix assert from machine verify pass that atomic pseudo expanding causes mismatched register class For atomic value operand which less than 4 bytes need to be masked. And the related operation to calculate the newvalue can be done in 32 bit gprc. So just use gprc for mask and value calculation. Differential Revision: https://reviews.llvm.org/D56077 llvm-svn: 350113	2018-12-28 02:12:55 +00:00
Chen Zheng	5ede950df9	[PowerPC] fix register class after converting X-FORM instruction to D-FORM instruction Differential Revision: https://reviews.llvm.org/D55806 llvm-svn: 350111	2018-12-28 01:02:35 +00:00
Chandler Carruth	05b5bd8b85	[CallSite removal] Add and flesh out APIs on the new `CallBase` base class that previously were only available on the `CallSite` wrapper. Summary: This will make migrating code easier and generally seems like a good collection of API improvements. Some of these APIs seem like more consistent / better naming of existing ones. I've retained the old names for migration simplicit and am just adding the new ones in this commit. I'll try to garbage collect these once CallSite is gone. Subscribers: sanjoy, mcrosier, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D55638 llvm-svn: 350109	2018-12-27 23:40:17 +00:00
Craig Topper	787ad92bf6	[X86] Remove check that avoids creating PMULDQ with illegal types. Rely on SplitOpsAndApply to legalize it. Create PMULDQ/PMULUDQ as long as the number of elements is a power of 2. This seems to give some improvements in our ability to use SimplifyDemandedBits. llvm-svn: 350084	2018-12-27 03:37:04 +00:00
Craig Topper	a8f07e51f9	[X86] Factor the core code out of LowerSETCC into a helper that can create CMP/BT/PTEST/KORTEST etc. without making an X86ISD::SETCC node. NFCI Make each of the helper functions only return their comparison node and the condition code. Leave X86ISD::SETCC creation to the LowerSETCC function itself. Looking into whether we can use this code directly in BRCOND and SELECT lowering instead of going through LowerSETCC which creates an X86ISD::SETCC node we need to look through. llvm-svn: 350082	2018-12-27 01:50:40 +00:00
Craig Topper	4f1ef9fc0f	[X86] Merge getBitTestCondition into LowerAndToBT. Don't create X86ISD::SETCC node in the merged function. NFCI Only one of the 3 callers of LowerAndToBT need the SETCC node. Two of them have to look through it to find the operands they really need. Instead create it after the one call that needs it. LowerAndToBT now returns both the BT node and the X86 specific condition code separately. llvm-svn: 350081	2018-12-27 01:50:38 +00:00
Wouter van Oortmerssen	f227621036	[WebAssembly] Added basic support for if/else/end_if in MC layer. Summary: These instructions are currently unused in our backend, but for completeness it is good to support them, so they can be used with the assembler in hand-written code. Tests are very basic, signature support missing much like other blocks. Reviewers: dschuff, aheejin Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D55973 llvm-svn: 350079	2018-12-26 22:55:26 +00:00
Wouter van Oortmerssen	29c6ce5879	[WebAssembly] Make assembler check for proper nesting of control flow. Summary: It does so using a simple nesting stack, and gives clear errors upon violation. This is unique to wasm, since most CPUs do not have any nested constructs. Had to add an end of file check to the general assembler for this. Note: if/else/end instructions are not currently supported in our tablegen defs, so these tests will be enabled in a follow-up. They already pass the nesting check. Reviewers: dschuff, aheejin Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D55797 llvm-svn: 350078	2018-12-26 22:46:18 +00:00
Heejin Ahn	ce1d50f9d7	[WebAssembly] Delete an unnecessary line in RegStackify `OneUseInst` is set outside of the loop before and `OneUse` does not change throughout the loop, so this line is not necessary. llvm-svn: 350076	2018-12-26 22:33:35 +00:00
Heejin Ahn	99d3946398	[WebAssembly] Fix typos in comments in RegStackify (NFC) llvm-svn: 350075	2018-12-26 22:27:46 +00:00
Craig Topper	c9a6000755	[LoopIdiomRecognize] Add CTTZ support Summary: Existing LIR recognizes CTLZ where shifting input variable right until it is zero. (Shift-Until-Zero idiom) This commit: 1. Augments Shift-Until-Zero idiom to recognize CTTZ where input variable is shifted left. 2. Prepare for BitScan idiom recognition. Patch by Yuanfang Chen (tabloid.adroit) Reviewers: craig.topper, evstupac Reviewed By: craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D55876 llvm-svn: 350074	2018-12-26 21:59:48 +00:00
Reid Kleckner	c168c6f86f	[codeview] Check if this 'this' type of a method is a pointer Fixes crash reported after r347354 for frontends that don't always emit 'this' pointers for methods. Now we will silently produce debug info that makes functions like this look like static methods, which seems reasonable. llvm-svn: 350073	2018-12-26 21:52:17 +00:00
Justin Lebar	49fac56ea3	[NVPTX] Allow libcalls that are defined in the current module. The patch adds a possibility to make library calls on NVPTX. An important thing about library functions - they must be defined within the current module. This basically should guarantee that we produce a valid PTX assembly (without calls to not defined functions). The one who wants to use the libcalls is probably will have to link against compiler-rt or any other implementation. Currently, it's completely impossible to make library calls because of error LLVM ERROR: Cannot select: i32 = ExternalSymbol '...'. But we can lower ExternalSymbol to TargetExternalSymbol and verify if the function definition is available. Also, there was an issue with a DAG during legalisation. When we expand instruction into libcall, the inner call-chain isn't being "integrated" into outer chain. Since the last "data-flow" (call retval load) node is located in call-chain earlier than CALLSEQ_END node, the latter becomes a leaf and therefore a dead node (and is being removed quite fast). Proposed here solution relies on another data-flow pseudo nodes (ProxyReg) which purpose is only to keep CALLSEQ_END at legalisation and instruction selection phases - we remove the pseudo instructions before register scheduling phase. Patch by Denys Zariaiev! Differential Revision: https://reviews.llvm.org/D34708 llvm-svn: 350069	2018-12-26 19:12:31 +00:00
Max Kazantsev	28298e9647	[NFC] Use utility function for guards detection llvm-svn: 350064	2018-12-26 08:22:25 +00:00
Petar Avramovic	09dff33349	[MIPS GlobalISel] Select G_SELECT Add widen scalar for type index 1 (i1 condition) for G_SELECT. Select G_SELECT for pointer, s32(integer) and smaller low level types on MIPS32. Differential Revision: https://reviews.llvm.org/D56001 llvm-svn: 350063	2018-12-25 14:42:30 +00:00
Max Kazantsev	9b25bf3960	[NFC] Reuse variables instead of re-calling getParent llvm-svn: 350062	2018-12-25 07:20:06 +00:00
Kang Zhang	d501a1e596	[PowerPC] Fix the bug of ISD::ADDE to set its second return type to glue Summary: This patch is to fix the bug imported by rL341634. In above submit , the the return type of ISD::ADDE is 14224: SDVTList VTs = DAG.getVTList(MVT::i64, MVT::i64), but in fact, the second return type of ISD::ADDE should be MVT::Glue not MVT::i64. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D55977 llvm-svn: 350061	2018-12-25 03:29:51 +00:00
Craig Topper	0229da8f07	[X86] Use GetDemandedBits to simplify the operands of PMULDQ/PMULUDQ. This is an alternative to what I attempted in D56057. GetDemandedBits is a special version of SimplifyDemandedBits that allows simplifications even when the operand has other uses. GetDemandedBits will only do simplifications that allow a node to be bypassed. It won't create new nodes or alter any of the other users. I had to add support for bypassing SIGN_EXTEND_INREG to GetDemandedBits. Based on a patch that Simon Pilgrim sent me in email. Fixes PR40142. llvm-svn: 350059	2018-12-24 19:40:20 +00:00
Eugene Leviant	4dc3a3f746	[HWASAN] Instrument memorty intrinsics by default Differential revision: https://reviews.llvm.org/D55926 llvm-svn: 350055	2018-12-24 16:02:48 +00:00
Max Kazantsev	0b455c2b71	Revert rL350048 and rL350050 These patches have broken almost all buildbots on test DebugInfo/X86/addr_comments.ll. Reverting to green. llvm-svn: 350052	2018-12-24 10:30:04 +00:00
David Blaikie	5353e64935	Fix build - follow-up to r350048 which broke headerless (v4) address pool llvm-svn: 350050	2018-12-24 07:56:40 +00:00
Max Kazantsev	edabb9ae56	[LoopSimplifyCFG] Delete dead exiting edges This patch teaches LoopSimplifyCFG to remove dead exiting edges from loops. Differential Revision: https://reviews.llvm.org/D54025 Reviewed By: fedor.sergeev llvm-svn: 350049	2018-12-24 07:41:33 +00:00
David Blaikie	e20bf9ab91	DebugInfo: Use assembly label arithmetic for address pool size for easier reading/editing llvm-svn: 350048	2018-12-24 07:35:10 +00:00
David Blaikie	d671eb7e7c	DebugInfo: Add assembly comments for debug_addr contribution header fields llvm-svn: 350047	2018-12-24 07:09:50 +00:00
David Blaikie	b917c3a41a	llvm-dwarfdump: Skip address index info (and dump only the address, if found) when non-verbose dumping addrx forms There's a few bugs here still - demonstrated with FIXITs in the test. llvm-svn: 350046	2018-12-24 06:52:31 +00:00
Max Kazantsev	347c583772	Return "[LoopSimplifyCFG] Delete dead in-loop blocks" The underlying bug that caused the revert should be fixed by rL348567. Differential Revision: https://reviews.llvm.org/D54023 llvm-svn: 350045	2018-12-24 06:06:17 +00:00
George Burgess IV	7e12875c89	[LoopIdioms] More LocationSize::precise annotations; NFC Both of these places reference memset-like loops. Memset is precise. Trying to keep these patches super small so they're easily post-commit verifiable, as requested in D44748. llvm-svn: 350044	2018-12-24 05:55:50 +00:00
Craig Topper	0adc3fe9e7	[X86] Remove unused variables left after r350041. NFC llvm-svn: 350043	2018-12-24 05:45:45 +00:00
George Burgess IV	610c76534f	[SelectionDAGBuilder] Use ::precise LocationSizes; NFC More migration so we can disable the implicit int -> LocationSize conversion. All of these are either scatter/gather'ed vector instructions, or direct loads. Hence, they're all precise. Perhaps if we see way more getTypeStoreSize calls, we can make a getTypeStoreLocationSize (or similar) as a wrapper that applies this ::precise. Doesn't appear that it's a good idea to make getTypeStoreSize return a LocationSize itself, however. llvm-svn: 350042	2018-12-24 05:34:21 +00:00
Craig Topper	d8217b23ff	[X86] Move the optimization that turns 'CMP (AND+IMM64), 0' into SRL/SHL+TEST to X86ISelDAGToDAG. This cleans more code out of EmitTest. llvm-svn: 350041	2018-12-24 05:27:13 +00:00
Craig Topper	e8c50fc6af	[X86] Remove the ANDN check from EmitTest. Remove the TESTmr isel patterns and add another postprocessing combine for TESTrr+ANDrm->TESTmr. We already have a postprocessing combine for TESTrr+ANDrr->TESTrr. With this we can give ANDN a chance to match first. And clean it up during post processing if we ended up with just a regular AND. This is another step towards my plan to gut EmitTest and do more flag handling during isel matching or by using optimizeCompare. llvm-svn: 350038	2018-12-24 01:10:13 +00:00
Sanjay Patel	93f1074677	[DAGCombiner] limit shuffle to extend transform (PR40146) It's dangerous to knowingly create an illegal vector type no matter what stage of combining we're in. This prevents the missed folding/scalarization seen in: https://bugs.llvm.org/show_bug.cgi?id=40146 llvm-svn: 350034	2018-12-23 20:48:31 +00:00
Sanjay Patel	9933574ac3	[DAGCombiner] allow hoisting vector bitwise logic ahead of extends llvm-svn: 350032	2018-12-23 19:58:16 +00:00
Simon Atanasyan	4553cdafe1	[ORC] Rename register in the OrcMips64 resolver code comments. NFC The `fp` and `s8` register names are synonyms. But `fp` better reflects a purpose of the register. llvm-svn: 350023	2018-12-23 12:05:04 +00:00
Simon Atanasyan	15b68b87d5	[ORC] clang-format OrcMips32 and OrcMips64 code. NFC llvm-svn: 350022	2018-12-23 12:05:00 +00:00
Simon Atanasyan	da29981707	[ORC] Remove redundant instruction from MIPS resolver code. NFC It's redundant to restore the `$a3` register twice. llvm-svn: 350021	2018-12-23 12:04:55 +00:00
George Burgess IV	5e4a03a089	[MemCpyOpt] Use LocationSize instead of ints; NFC Trying to keep these patches super small so they're easily post-commit verifiable, as requested in D44748. srcSize is derived from the size of an alloca, and we quit out if the size of that is > the size of the thing we're copying to. Hence, we should always copy everything over, so these sizes are precise. Don't make srcSize itself a LocationSize, since optionality isn't helpful, and we do some comparisons against other sizes elsewhere in that function. llvm-svn: 350019	2018-12-23 06:40:39 +00:00
Craig Topper	006bac6880	[X86] Return false from hasAndNotCompare if the comparision value is a constant. We won't end up using an ANDN instruction in this case so we should generate the same code we do for pre-BMI targets. llvm-svn: 350018	2018-12-23 05:52:55 +00:00
George Burgess IV	69952979da	[MemoryLocation] Use LocationSize instead of ints; NFC Trying to keep these patches super small so they're easily post-commit verifiable, as requested in D44748. This one sadly isn't super small, but all of the changes here are either to: - libfuncs that are passed a constant size (memcpy, memset, ...) - instructions that store/load a constant size So they have to be precise llvm-svn: 350017	2018-12-23 03:36:44 +00:00
George Burgess IV	8c5413f3f7	[Loads] Use LocationSize instead of ints; NFC Keeping these patches super small so they're easily post-commit verifiable, as requested in D44748. This tries to find literal loads/stores of the given type, so this has to be precise. llvm-svn: 350016	2018-12-23 03:10:56 +00:00
George Burgess IV	1329cf1791	[Lint] Use LocationSize instead of ints; NFC Keeping these patches super small so they're easily post-commit verifiable, as requested in D44748. llvm-svn: 350015	2018-12-23 02:50:08 +00:00
George Burgess IV	685e781d55	[AAEval] Use LocationSize instead of ints; NFC Keeping these patches super small so they're easily post-commit verifiable, as requested in D44748. llvm-svn: 350014	2018-12-23 02:39:58 +00:00
Craig Topper	3cc92a28ce	[X86] Fix an old FIXME about folding the zero constant into the OR instruction we use for sequentially consistent fence in 32-bit mode without SSE2. llvm-svn: 350013	2018-12-23 01:54:43 +00:00
David Blaikie	2a38c17b34	DebugInfo: Accurately propagate the section used by a relocation when accessing ranges defined by low/high_pc This is difficult/not possible to test in LLVM, but is visible as a crash in LLD when parsing DWARF to generate gdb-index. This function is called by llvm-dwarfdump when parsing high_pc for non-verbose output (to print the actual high_pc rather than the low_pc relative value), but in that case llvm-dwarfdump doesn't print section names (if it did, it would hit this problem). We could add some other features to llvm-dwarfdump to expose this, but nothing really springs to my mind. I will add a test to lld, though. llvm-svn: 350010	2018-12-22 22:20:40 +00:00
David Blaikie	25179613f6	llvm-dwarfdump: Dump the section name/number for addr attributes llvm-svn: 350009	2018-12-22 20:34:58 +00:00
George Burgess IV	640be69249	[Analysis] More LocationSize cleanup; NFC Keeping these patches super small so they're easily post-commit verifiable, as requested in D44748. llvm-svn: 350008	2018-12-22 18:23:21 +00:00
Sanjay Patel	4b537aaf6d	[DAGCombiner] allow narrowing of add followed by truncate trunc (add X, C ) --> add (trunc X), C' If we're throwing away the top bits of an 'add' instruction, do it in the narrow destination type. This makes the truncate-able opcode list identical to the sibling transform done in IR (in instcombine). This change used to show regressions for x86, but those are gone after D55494. This gets us closer to deleting the x86 custom function (combineTruncatedArithmetic) that does almost the same thing. Differential Revision: https://reviews.llvm.org/D55866 llvm-svn: 350006	2018-12-22 17:10:31 +00:00
Sanjay Patel	52c02d70e2	[x86] add load fold patterns for movddup with vzext_load The missed load folding noticed in D55898 is visible independent of that change either with an adjusted IR pattern to start or with AVX2/AVX512 (where the build vector becomes a broadcast first; movddup is not produced until we get into isel via tablegen patterns). Differential Revision: https://reviews.llvm.org/D55936 llvm-svn: 350005	2018-12-22 16:59:02 +00:00
David Blaikie	9efb0153f0	llvm-dwarfdump: Remove extraneous space between '(' and 'indexed' When dumping string or address indexes llvm-svn: 349997	2018-12-22 08:43:08 +00:00
David Blaikie	c04d2bf22a	llvm-dwarfdump: Print the section name/number for addr_index attributes (addr attributes coming shortly) llvm-svn: 349996	2018-12-22 08:33:55 +00:00
David Blaikie	87ae80fb2f	DebugInfo: Refactor named section dumping into a reusable helper Currently the section name (& possibly number) is only printed on addresses in ranges - but no reason it couldn't also be displayed on other addresses (like low/high PC). Refactor in that direction by pulling out the section lookup and name ambiguity dumping logic into a reusable helper. llvm-svn: 349995	2018-12-22 08:23:10 +00:00
David Blaikie	e4e0b9f48f	DebugInfo: Remove extra attribute lookup llvm-svn: 349985	2018-12-22 02:24:13 +00:00
Craig Topper	1f02ac3451	[X86] FixupLEAs, reduce number of calls to getOperand and use X86::AddrBaseReg/AddrIndexReg, etc. instead of hardcoded constants. Makes the code a little more readable. llvm-svn: 349983	2018-12-22 01:34:47 +00:00
Justin Lebar	7f41fe3a58	[NVPTX] Reduce stack size in NVPTXAsmPrinter::doInitialization(). NVPTXAsmPrinter::doInitialization() was creating an NVPTXSubtarget on the stack. This object is huge, about 80kb. Also it's slow to create. And it's all redundant; we have one in NVPTXTargetMachine anyway! llvm-svn: 349982	2018-12-22 01:30:37 +00:00
David Blaikie	219c6bd388	libDebugInfo: Refactor error handling in range list parsing Propagate the llvm::Error a little further up. This is NFC for llvm-dwarfdump in this change, but allows ld.lld to emit more precise error messages about which object and archive the erroneous DWARF is in. llvm-svn: 349978	2018-12-22 00:31:02 +00:00
Reid Kleckner	98bbd07cc3	[MC] Enable .file support on COFF and diagnose it on unsupported targets Summary: The "single parameter" .file directive appears to be an ELF-only feature that is intended to insert the main source filename into the string table table. I noticed that if you assemble an ELF .s file for COFF, typically it will assert right away on a .file directive near the top of the file. My first change was to make this emit a proper error in the asm parser so that we don't assert so easily. However, COFF actually does have some support for this directive, and if you emit an object file, llvm-mc does not assert. When emitting a COFF object, MC will take those file names and create "debug" symbol table entries for them. I'm not familiar with these kinds of symbol table entries, and I'm not aware of any users of them, but @compnerd added them a while ago. They don't introduce absolute paths, and most main source file paths are short enough that this extra entry shouldn't cause any problems, so I enabled the flag in MCAsmInfoCOFF that indicates that it's supported. This has the side effect of adding an extra debug symbol to every object produced by clang, which is a pretty big functional change. My question is, should we keep the functionality or remove it in the name of symbol table minimalism? Reviewers: mstorsjo, compnerd Subscribers: hiraditya, compnerd, llvm-commits Differential Revision: https://reviews.llvm.org/D55900 llvm-svn: 349976	2018-12-21 23:35:48 +00:00
Mircea Trofin	499a66ecc0	Silence warning in assert introduced in rL349973. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D56030 llvm-svn: 349975	2018-12-21 23:02:10 +00:00
Mircea Trofin	b53eeb6f4c	[llvm] API for encoding/decoding DWARF discriminators. Summary: Added a pair of APIs for encoding/decoding the 3 components of a DWARF discriminator described in http://lists.llvm.org/pipermail/llvm-dev/2016-October/106532.html: the base discriminator, the duplication factor (useful in profile-guided optimization) and the copy index (used to identify copies of code in cases like loop unrolling) The encoding packs 3 unsigned values in 32 bits. This CL addresses 2 issues: - communicates overflow back to the user - supports encoding all 3 components together. Current APIs assume a sequencing of events. For example, creating a new discriminator based on an existing one by changing the base discriminator was not supported. Reviewers: davidxl, danielcdh, wmi, dblaikie Reviewed By: dblaikie Subscribers: zzheng, dmgreen, aprantl, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D55681 llvm-svn: 349973	2018-12-21 22:48:50 +00:00
David Blaikie	c3f30a7fc6	Reapply: DebugInfo: Assume an absence of ranges or high_pc on a CU means the CU is empty (devoid of code addresses) Originally committed in r349333, reverted in r349353. GCC emitted these unconditionally on/before 4.4/March 2012 Clang emitted these unconditionally on/before 3.5/March 2014 This improves performance when parsing CUs (especially those using split DWARF) that contain no code ranges (such as the mini CUs that may be created by ThinLTO importing - though generally they should be/are avoided, especially for Split DWARF because it produces a lot of very small CUs, which don't scale well in a bunch of other ways too (including size)). The revert was due to a (Google internal) test that had some checked in old object files missing DW_AT_ranges. That's since been fixed. llvm-svn: 349968	2018-12-21 22:25:01 +00:00
Vedant Kumar	b264d69de7	[IR] Add Instruction::isLifetimeStartOrEnd, NFC Instruction::isLifetimeStartOrEnd() checks whether an Instruction is an llvm.lifetime.start or an llvm.lifetime.end intrinsic. This was suggested as a cleanup in D55967. Differential Revision: https://reviews.llvm.org/D56019 llvm-svn: 349964	2018-12-21 21:49:40 +00:00
Craig Topper	e58cd9cbc6	[X86] Add isel patterns to match BMI/TBMI instructions when lowering has turned the root nodes into one of the flag producing binops. This fixes the patterns that have or/and as a root. 'and' is handled differently since thy usually have a CMP wrapped around them. I had to look for uses of the CF flag because all these nodes have non-standard CF flag behavior. A real or/xor would always clear CF. In practice we shouldn't be using the CF flag from these nodes as far as I know. Differential Revision: https://reviews.llvm.org/D55813 llvm-svn: 349962	2018-12-21 21:42:43 +00:00
Sanjay Patel	47a6129e26	[DAGCombiner] simplify code leading to scalarizeExtractedVectorLoad; NFC llvm-svn: 349958	2018-12-21 21:26:30 +00:00
Craig Topper	62ec024d3b	[X86] Don't allow optimizeCompareInstr to replace a CMP with BEXTR if the sign flag is used. The BEXTR instruction documents the SF bit as undefined. The TBM BEXTR instruction has the same issue, but I'm not sure how to test it. With the control being an immediate we can determine the sign bit is 0 or the BEXTR would have been removed. Fixes PR40060 Differential Revision: https://reviews.llvm.org/D55807 llvm-svn: 349956	2018-12-21 21:16:26 +00:00
Changpeng Fang	6f539294b5	AMDGPU: Don't peel of the offset if the resulting base could possibly be negative in Indirect addressing. Summary: Don't peel of the offset if the resulting base could possibly be negative in Indirect addressing. This is because the M0 field is of unsigned. This patch achieves the similar goal as https://reviews.llvm.org/D55241, but keeps the optimization if the base is known unsigned. Reviewers: arsemn Differential Revision: https://reviews.llvm.org/D55568 llvm-svn: 349951	2018-12-21 20:57:34 +00:00
Armando Montanez	4cc2113114	[TextAPI][elfabi] Fix YAML support for weak symbols Weak symbols are supposed to be supported in the ELF TextAPI implementation, but the YAML handler didn't read or write the `Weak` member of ELFSymbol. This change adds the YAML mapping and updates tests to ensure correct behavior. Differential Revision: https://reviews.llvm.org/D56020 llvm-svn: 349950	2018-12-21 20:45:58 +00:00
Reid Kleckner	84a2d29681	[BasicAA] Fix AA bug on dynamic allocas and stackrestore Summary: BasicAA has special logic for unescaped allocas, which normally applies equally well to dynamic and static allocas. However, llvm.stackrestore has the power to end the lifetime of dynamic allocas, without referring to them directly. stackrestore is already marked with the most conservative memory modification attributes, but because the alloca is not escaped, the normal logic produces incorrect results. I think BasicAA needs a special case here to teach it about the relationship between dynamic allocas and stackrestore. Fixes PR40118 Reviewers: gbiv, efriedma, george.burgess.iv Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D55969 llvm-svn: 349945	2018-12-21 19:59:03 +00:00
Anna Thomas	18be3cb606	[RuntimeUnrolling] NFC: Add TODO and comments in connectProlog Currently, runtime unrolling does not support loops where multiple exiting blocks exit to the latchExit. Added TODO and other code clarifications for ConnectProlog code. llvm-svn: 349944	2018-12-21 19:45:05 +00:00
Sanjay Patel	80187b8a17	[x86] add movddup specialization for build vector lowering (PR37502) This is admittedly a narrow fix for the problem: https://bugs.llvm.org/show_bug.cgi?id=37502 ...but as the XOP restriction shows, it's a maze to get this right. In the motivating example, note that we have movddup before SSE4.1 and again with AVX2. That's because insertps isn't available pre-SSE41 and vbroadcast is (more generally) available with AVX2 (and the splat is reduced to movddup via isel pattern). Differential Revision: https://reviews.llvm.org/D55898 llvm-svn: 349937	2018-12-21 18:48:32 +00:00
Florian Hahn	8c9f865e3d	[ARM] Set Defs = [CPSR] for COPY_STRUCT_BYVAL, as it clobbers CPSR. Fixes PR35023. Reviewers: MatzeB, t.p.northover, sunfish, qcolombet, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D55909 llvm-svn: 349935	2018-12-21 18:07:10 +00:00
Jessica Paquette	453ab1db5b	[GlobalISel][AArch64] Add support for widening G_FCEIL This adds support for widening G_FCEIL in LegalizerHelper and AArch64LegalizerInfo. More specifically, it teaches the AArch64 legalizer to widen G_FCEIL from a 16-bit float to a 32-bit float when the subtarget doesn't support full FP 16. This also updates AArch64/f16-instructions.ll to show that we perform the correct transformation. llvm-svn: 349927	2018-12-21 17:05:26 +00:00
Evandro Menezes	96c11eceb2	[AArch64] Refactor Exynos predicate (NFC) Change order of conditions in predicate. llvm-svn: 349918	2018-12-21 15:51:34 +00:00
Simon Pilgrim	aa53fc15b7	[XCore] Always use the version of computeKnownBits that returns a value. NFCI. Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output paramater version. llvm-svn: 349915	2018-12-21 15:35:32 +00:00
Simon Pilgrim	7787a02b23	[Sparc] Always use the version of computeKnownBits that returns a value. NFCI. Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output paramater version. llvm-svn: 349914	2018-12-21 15:32:36 +00:00
Simon Pilgrim	3c157d3fa3	[AMDGPU] Always use the version of computeKnownBits that returns a value. NFCI. Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output paramater version. llvm-svn: 349912	2018-12-21 15:29:47 +00:00
Simon Pilgrim	ca8bca2ad3	[WebAssembly] Always use the version of computeKnownBits that returns a value. NFCI. Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output paramater version. llvm-svn: 349911	2018-12-21 15:25:37 +00:00
Simon Pilgrim	d800ee4861	[ARM] Always use the version of computeKnownBits that returns a value. NFCI. Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output paramater version. llvm-svn: 349909	2018-12-21 15:15:38 +00:00
Simon Pilgrim	148957f336	[AArch64] Always use the version of computeKnownBits that returns a value. NFCI. Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output paramater version. llvm-svn: 349908	2018-12-21 15:05:10 +00:00
Simon Pilgrim	911dce2f30	[SelectionDAG] Always use the version of computeKnownBits that returns a value. NFCI. Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output paramater version. llvm-svn: 349907	2018-12-21 14:56:18 +00:00
Simon Pilgrim	2482c51e99	[SystemZ] Always use the version of computeKnownBits that returns a value. NFCI. Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output paramater version. llvm-svn: 349906	2018-12-21 14:50:54 +00:00
Simon Pilgrim	d43bdc715c	[Lanai] Always use the version of computeKnownBits that returns a value. NFCI. Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output paramater version. llvm-svn: 349905	2018-12-21 14:48:35 +00:00
Simon Pilgrim	af1ab22a76	[PPC] Always use the version of computeKnownBits that returns a value. NFCI. Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output paramater version. llvm-svn: 349903	2018-12-21 14:32:39 +00:00
Simon Pilgrim	57733507fe	[X86] Always use the version of computeKnownBits that returns a value. NFCI. Continues the work started by @bogner in rL340594 to remove uses of the old KnownBits output paramater version. llvm-svn: 349902	2018-12-21 14:25:14 +00:00
Fedor Sergeev	2d94c2265e	[NewPM] -print-module-scope -print-after now prints module even after invalidated Loop/SCC -print-after IR printing generally can not print the IR unit (Loop or SCC) which has just been invalidated by the pass. However, when working in -print-module-scope mode even if Loop was invalidated there is still a valid module that we can print. Since we can not access invalidated IR unit from AfterPassInvalidated instrumentation point we can remember the module to be printed before pass. This change introduces BeforePass instrumentation that stores all the information required for module printing into the stack and then after pass (in AfterPassInvalidated) just print whatever has been placed on stack. Reviewed By: philip.pfaffe Differential Revision: https://reviews.llvm.org/D55278 llvm-svn: 349896	2018-12-21 11:49:05 +00:00
Luke Cheeseman	41a9e53500	[Dwarf/AArch64] Return address signing B key dwarf support - When signing return addresses with -msign-return-address=<scope>{+<key>}, either the A key instructions or the B key instructions can be used. To correctly authenticate the return address, the unwinder/debugger must know which key was used to sign the return address. - When and exception is thrown or a break point reached, it may be necessary to unwind the stack. To accomplish this, the unwinder/debugger must be able to first authenticate an the return address if it has been signed. - To enable this, the augmentation string of CIEs has been extended to allow inclusion of a 'B' character. Functions that are signed using the B key variant of the instructions should have and FDE whose associated CIE has a 'B' in the augmentation string. - One must also be able to preserve these semantics when first stepping from a high level language into assembly and then, as a second step, into an object file. To achieve this, I have introduced a new assembly directive '.cfi_b_key_frame ', that tells the assembler the current frame uses return address signing with the B key. - This ensures that the FDE is associated with a CIE that has 'B' in the augmentation string. Differential Revision: https://reviews.llvm.org/D51798 llvm-svn: 349895	2018-12-21 10:45:08 +00:00
Simon Pilgrim	5d403f6bf8	[X86][SSE] Auto upgrade PADDS/PSUBS intrinsics to SADD_SAT/SSUB_SAT generic intrinsics (llvm) This auto upgrades the signed SSE saturated math intrinsics to SADD_SAT/SSUB_SAT generic intrinsics. Clang counterpart: https://reviews.llvm.org/D55890 Differential Revision: https://reviews.llvm.org/D55894 llvm-svn: 349892	2018-12-21 09:04:14 +00:00
Thomas Lively	b6dac89c87	[WebAssembly] Fix invalid machine instrs in -O0, verify in tests Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D55956 llvm-svn: 349889	2018-12-21 06:58:15 +00:00
Matt Arsenault	3eae3c4590	AMDGPU/GlobalISel: RegBankSelect for amdgcn.wqm.vote llvm-svn: 349882	2018-12-21 03:20:54 +00:00
Matt Arsenault	f4c21c575a	AMDGPU/GlobalISel: RegBankSelect for some fp ops llvm-svn: 349880	2018-12-21 03:14:45 +00:00
Matt Arsenault	bee2ad7185	AMDGPU/GlobalISel: Redo legality for build_vector It seems better to avoid using the callback if possible since there are coverage assertions which are disabled if this is used. Also fix missing tests. Only test the legal cases since it seems legalization for build_vector is quite lacking. llvm-svn: 349878	2018-12-21 03:03:11 +00:00
Reid Kleckner	b894ecf903	[memcpyopt] Add debug logs when forwarding memcpy src to dst llvm-svn: 349873	2018-12-21 01:41:20 +00:00
Eli Friedman	3af2f53456	[LoopUnroll] Don't verify domtree by default with +Asserts. This verification is linear in the size of the function, so it can cause a quadratic compile-time explosion in a function with many loops to unroll. Differential Revision: https://reviews.llvm.org/D54732 llvm-svn: 349871	2018-12-21 01:28:49 +00:00
Craig Topper	54f1a7be13	[X86] Refactor hasNoCarryFlagUses and hasNoSignFlagUses in X86ISelDAGToDAG.cpp to tranlate opcode to condition code using the helpers in X86InstrInfo.cpp. This shortens the switches in X86ISelDAGToDAG.cpp to only need to check condition code instead of a list of opcodes. This also fixes a bug where the memory forms of SETcc were missing from hasNoCarryFlagUses. llvm-svn: 349868	2018-12-21 01:14:25 +00:00
Craig Topper	e0cff10289	[X86] Add memory forms of some SETCC instructions to hasNoCarryFlagUses. Found while working on another patch llvm-svn: 349867	2018-12-21 01:14:23 +00:00
Eli Friedman	b1bbd5dca3	[ARM] Complete the Thumb1 shift+and->shift+shift transforms. This saves materializing the immediate. The additional forms are less common (they don't usually show up for bitfield insert/extract), but they're still relevant. I had to add a new target hook to prevent DAGCombine from reversing the transform. That isn't the only possible way to solve the conflict, but it seems straightforward enough. Differential Revision: https://reviews.llvm.org/D55630 llvm-svn: 349857	2018-12-20 23:39:54 +00:00
Tom Stellard	2f44fbe936	cmake: Remove add_llvm_loadable_module() Summary: This function is very similar to add_llvm_library(), so this patch merges it into add_llvm_library() and replaces all calls to add_llvm_loadable_module(lib ...) with add_llvm_library(lib MODULE ...) Reviewers: philip.pfaffe, beanz, chandlerc Reviewed By: philip.pfaffe Subscribers: chapuni, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D51748 llvm-svn: 349839	2018-12-20 22:04:08 +00:00
Jessica Paquette	a6b9c68a85	[GlobalISel][AArch64] Add G_FCEIL to isPreISelGenericFloatingPointOpcode If you don't do this, then if you hit a G_LOAD in getInstrMapping, you'll end up with GPRs on the G_FCEIL instead of FPRs. This causes a fallback. Add it to the switch, and add a test verifying that this happens. llvm-svn: 349822	2018-12-20 21:14:15 +00:00
David Blaikie	b3c56af49b	DebugInfo: Fix for missing comp_dir handling with r349207 When deciding lazily whether a CU would be split or non-split I accidentally dropped some handling for the line tables comp_dir (by doing it lazily it was too late to be handled properly by the MC line table code). Move that bit of the code back to the non-lazy place. llvm-svn: 349819	2018-12-20 20:46:55 +00:00
Eli Friedman	48397102d0	[MC] [AArch64] Correctly resolve ":abs_g1:3" etc. We have to treat constructs like this as if they were "symbolic", to use the correct codepath to resolve them. This mostly only affects movz etc. because the other uses of classifySymbolRef conservatively treat everything that isn't a constant as if it were a symbol. Differential Revision: https://reviews.llvm.org/D55906 llvm-svn: 349800	2018-12-20 19:46:14 +00:00
Eli Friedman	4648209e16	[MC] [AArch64] Support resolving fixups for abs_g0 etc. This requires a bit more code than other fixups, to distingush between abs_g0/abs_g1/etc. Actually, I think some of the other fixups are missing some checks, but I won't try to address that here. I haven't seen any real-world code that uses a construct like this, but it clearly should work, and we're considering using it in the implementation of localescape/localrecover on Windows (see https://reviews.llvm.org/D53540). I've verified that binutils produces the same code as llvm-mc for the testcase. This currently doesn't include support for the *_s variants (that requires a bit more work to set the opcode). Differential Revision: https://reviews.llvm.org/D55896 llvm-svn: 349799	2018-12-20 19:38:07 +00:00
Simon Pilgrim	2a25360ae3	[X86] Auto upgrade XOP/AVX512 rotation intrinsics to generic funnel shift intrinsics (llvm) This emits FSHL/FSHR generic intrinsics for the XOP VPROT and AVX512 VPROL/VPROR rotation intrinsics. Clang counterpart: https://reviews.llvm.org/D55937 Differential Revision: https://reviews.llvm.org/D55938 llvm-svn: 349795	2018-12-20 19:01:07 +00:00
Florian Hahn	ef307b8c26	[LAA] Avoid generating RT checks for known deps preventing vectorization. If we found unsafe dependences other than 'unknown', we already know at compile time that they are unsafe and the runtime checks should always fail. So we can avoid generating them in those cases. This should have no negative impact on performance as the runtime checks that would be created previously should always fail. As a sanity check, I measured the test-suite, spec2k and spec2k6 and there were no regressions. Reviewers: Ayal, anemet, hsaito Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D55798 llvm-svn: 349794	2018-12-20 18:49:09 +00:00
Michael Trent	f44d830e2d	Add PLATFORM constants for iOS, tvOS, and watchOS simulators Summary: Add PLATFORM constants for iOS, tvOS, and watchOS simulators, as well as human readable names for these constants, to the Mach-O file format header files. rdar://46854119 Reviewers: ab, davide Reviewed By: ab, davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D55905 llvm-svn: 349779	2018-12-20 17:51:17 +00:00
Yonghong Song	821c93d556	[BPF] Disable relocation for .BTF.ext section Build llvm with assertion on, and then build bcc against this llvm. Run any bcc tool with debug=8 (turning on -g for clang compilation), you will get the following assertion errors, /home/yhs/work/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp:888: void llvm::RuntimeDyldELF::resolveBPFRelocation(const llvm::SectionEntry&, uint64_t, uint64_t, uint32_t, int64_t): Assertion `Value <= (4294967295U)' failed. The .BTF.ext ELF section uses Fixup's to get the instruction offsets. The data width of the Fixup is 4 bytes since we only need the insn offset within the section. This caused the above error though since R_BPF_64_32 expects 4-byte value and the Runtime Dyld tried to resolve the actual insn address which is 8 bytes. Actually the offset within the section is all what we need. Therefore, there is no need to perform any kind of relocation for .BTF.ext section and such relocation will actually cause incorrect result. This patch changed BPFELFObjectWriter::getRelocType() such that for Fixup Kind FK_Data_4, if the relocation Target is a temporary symbol, let us skip the relocation (ELF::R_BPF_NONE). Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 349778	2018-12-20 17:40:23 +00:00
Brock Wyma	b17464e4e8	[CodeView] Emit global variables within lexical scopes to limit visibility Emit static locals within the correct lexical scope so variables with the same name will not confuse the debugger into getting the wrong value. Differential Revision: https://reviews.llvm.org/D55336 llvm-svn: 349777	2018-12-20 17:33:45 +00:00
Michael Kruse	199427100b	[InstCombine] Preserve access-group metadata. Preserve llvm.access.group metadata when combining store instructions. This was forgotten in r349725. Fixes llvm.org/PR40117 llvm-svn: 349774	2018-12-20 17:11:02 +00:00
Krzysztof Parzyszek	30c42e2ab6	[Hexagon] Add patterns for funnel shifts llvm-svn: 349770	2018-12-20 16:39:20 +00:00
Simon Pilgrim	b208255fe0	[SelectionDAGBuilder] Enable funnel shift building to custom rotates This patch enables funnel shift -> rotate building for all ROTL/ROTR custom/legal operations. AFAICT X86 was the last target that was missing modulo support (PR38243), but I've tried to CC stakeholders for every target that has ROTL/ROTR custom handling for their final OK. Differential Revision: https://reviews.llvm.org/D55747 llvm-svn: 349765	2018-12-20 14:56:44 +00:00
Alex Bradbury	eb3a64a4da	[RISCV] Properly evaluate fixup_riscv_pcrel_lo12 This is a update to D43157 to correctly handle fixup_riscv_pcrel_lo12. Notable changes: Rebased onto trunk Handle and test S-type Test case pcrel-hilo.s is merged into relocations.s D43157 description: VK_RISCV_PCREL_LO has to be handled specially. The MCExpr inside is actually the location of an auipc instruction with a VK_RISCV_PCREL_HI fixup pointing to the real target. Differential Revision: https://reviews.llvm.org/D54029 Patch by Chih-Mao Chen and Michael Spencer. llvm-svn: 349764	2018-12-20 14:52:15 +00:00
Simon Pilgrim	09c081176a	[X86][AVX512] Don't custom lower v16i8 rotations. As discussed on D55747, the expansion to (wider) shifts is better on all AVX512 cases, not just BWI. llvm-svn: 349763	2018-12-20 14:38:35 +00:00
Ulrich Weigand	380bece7af	[SystemZ] "Generic" vector assembler instructions shoud clobber CC There are several vector instructions which may or may not set the condition code register, depending on the value of an argument. For codegen, we use two versions of the instruction, one that sets CC and one that doesn't, which hard-code appropriate values of that argument. But we also have a "generic" version of the instruction that is used for the assembler/disassembler. These generic versions should always be considered to clobber CC just to be safe. llvm-svn: 349761	2018-12-20 14:24:17 +00:00
Ulrich Weigand	44d37ae38c	[SystemZ] Make better use of VLLEZ This patch fixes two deficiencies in current code that recognizes the VLLEZ idiom: - For the floating-point versions, we have ISel patterns that match on a bitconvert as the top node. In more complex cases, that bitconvert may already have been merged into something else. Fix the patterns to match the inner nodes instead. - For the 64-bit integer versions, depending on the surrounding code, we may get either a DAG tree based on JOIN_DWORDS or one based on INSERT_VECTOR_ELT. Use a PatFrags to simply match both variants. llvm-svn: 349749	2018-12-20 13:05:03 +00:00
Ulrich Weigand	8bb46b0f01	[SystemZ] Make better use of VGEF/VGEG Current code in SystemZDAGToDAGISel::tryGather refuses to perform any transformation if the Load SDNode has more than one use. This (erronously) counts uses of the chain result, which prevents the optimization in many cases unnecessarily. Fixed by this patch. llvm-svn: 349748	2018-12-20 13:01:20 +00:00
Clement Courbet	36a3480385	Re-land r349731 "[CodeGen][ExpandMemcmp] Add an option for allowing overlapping loads. Update PPC ir following GEP->bitcat to bitcat->GEP->bitcat change. llvm-svn: 349747	2018-12-20 13:01:04 +00:00
Ulrich Weigand	f43b510015	[SystemZ] Make better use of VLDEB We already have special code (DAG combine support for FP_ROUND) to recognize cases where we an use a vector version of VLEDB to perform two floating-point truncates in parallel, but equivalent support for VLEDB (vector floating-point extends) has been missing so far. This patch adds corresponding DAG combine support for FP_EXTEND. llvm-svn: 349746	2018-12-20 12:59:05 +00:00
George Rimar	6367d7a6d1	[yaml2obj/obj2yaml] - Support dumping/parsing ABI version. These tools were assuming ABI version is 0, that is not always true. Patch teaches them to work with that field. Differential revision: https://reviews.llvm.org/D55884 llvm-svn: 349737	2018-12-20 10:43:49 +00:00
Piotr Sobczak	deaacc17fe	[InstCombine][AMDGPU] Handle more buffer intrinsics Summary: Include the following intrinsics in the InsctCombine simplification: * amdgcn_raw_buffer_load * amdgcn_raw_buffer_load_format * amdgcn_struct_buffer_load * amdgcn_struct_buffer_load_format Change-Id: I14deceff74bcb21179baf6aa6e94bf39e7d63d5d Reviewers: arsenm Reviewed By: arsenm Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D55882 llvm-svn: 349735	2018-12-20 10:08:18 +00:00
Alexander Potapenko	0e3b85a730	[MSan] Don't emit __msan_instrument_asm_load() calls LLVM treats void* pointers passed to assembly routines as pointers to sized types. We used to emit calls to __msan_instrument_asm_load() for every such void*, which sometimes led to false positives. A less error-prone (and truly "conservative") approach is to unpoison only assembly output arguments. llvm-svn: 349734	2018-12-20 10:05:00 +00:00
Clement Courbet	e22cf4d7cb	Revert r349731 "[CodeGen][ExpandMemcmp] Add an option for allowing overlapping loads." Forgot to update PowerPC tests for the GEP->bitcast change. llvm-svn: 349733	2018-12-20 09:58:33 +00:00
Clement Courbet	d4bd3eb85d	[NFC] Fix trailing comma after function. lib/Analysis/VectorUtils.cpp:482:2: warning: extra ‘;’ [-Wpedantic] llvm-svn: 349732	2018-12-20 09:20:07 +00:00
Clement Courbet	1bb6e1b0f2	[CodeGen][ExpandMemcmp] Add an option for allowing overlapping loads. Summary: This allows expanding {7,11,13,14,15,21,22,23,25,26,27,28,29,30,31}-byte memcmp in just two loads on X86. These were previously calling memcmp. Reviewers: spatel, gchatelet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D55263 llvm-svn: 349731	2018-12-20 09:13:47 +00:00
Eugene Leviant	2d98eb1b2e	[HWASAN] Add support for memory intrinsics Differential revision: https://reviews.llvm.org/D55117 llvm-svn: 349728	2018-12-20 09:04:33 +00:00
Kang Zhang	ca8db48974	[PowerPC] Implement the isSelectSupported() target hook Summary: PowerPC has scalar selects (isel) and vector mask selects (xxsel). But PowerPC does not have vector CR selects, PowerPC does not support scalar condition selects on vectors. In addition to implementing this hook, isSelectSupported() should return false when the SelectSupportKind is ScalarCondVectorVal, so that predictable selects are converted into branch sequences. Reviewed By: steven.zhang, hfinkel Differential Revision: https://reviews.llvm.org/D55754 llvm-svn: 349727	2018-12-20 06:19:59 +00:00
Craig Topper	bd788ce5db	[DAGCombiner] Fix a place that was creating a SIGN_EXTEND with an extra operand. llvm-svn: 349726	2018-12-20 05:28:06 +00:00
Michael Kruse	978ba61536	Introduce llvm.loop.parallel_accesses and llvm.access.group metadata. The current llvm.mem.parallel_loop_access metadata has a problem in that it uses LoopIDs. LoopID unfortunately is not loop identifier. It is neither unique (there's even a regression test assigning the some LoopID to multiple loops; can otherwise happen if passes such as LoopVersioning make copies of entire loops) nor persistent (every time a property is removed/added from a LoopID's MDNode, it will also receive a new LoopID; this happens e.g. when calling Loop::setLoopAlreadyUnrolled()). Since most loop transformation passes change the loop attributes (even if it just to mark that a loop should not be processed again as llvm.loop.isvectorized does, for the versioned and unversioned loop), the parallel access information is lost for any subsequent pass. This patch unlinks LoopIDs and parallel accesses. llvm.mem.parallel_loop_access metadata on instruction is replaced by llvm.access.group metadata. llvm.access.group points to a distinct MDNode with no operands (avoiding the problem to ever need to add/remove operands), called "access group". Alternatively, it can point to a list of access groups. The LoopID then has an attribute llvm.loop.parallel_accesses with all the access groups that are parallel (no dependencies carries by this loop). This intentionally avoid any kind of "ID". Loops that are clones/have their attributes modifies retain the llvm.loop.parallel_accesses attribute. Access instructions that a cloned point to the same access group. It is not necessary for each access to have it's own "ID" MDNode, but those memory access instructions with the same behavior can be grouped together. The behavior of llvm.mem.parallel_loop_access is not changed by this patch, but should be considered deprecated. Differential Revision: https://reviews.llvm.org/D52116 llvm-svn: 349725	2018-12-20 04:58:07 +00:00
Thomas Lively	feb18fe927	[WebAssembly] Emit a splat for v128 IMPLICIT_DEF Summary: This is a code size savings and is also important to get runnable code while engines do not support v128.const. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D55910 llvm-svn: 349724	2018-12-20 04:20:32 +00:00
Amara Emerson	321bfb210a	Fix build errors introduced by r349712 on aarch64 bots. llvm-svn: 349723	2018-12-20 03:27:42 +00:00
Thomas Lively	8dbf29af95	[WebAssembly] Gate unimplemented SIMD ops on flag Summary: Gates v128.const, f32x4.sqrt, f32x4.div, i8x16.extract_lane_u, and i16x8.extract_lane_u on the --wasm-enable-unimplemented-simd flag, since these ops are not implemented yet in V8. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D55904 llvm-svn: 349720	2018-12-20 02:10:22 +00:00
Matt Arsenault	4339883710	AMDGPU: Make i1/i64/v2i32 and/or/xor legal The 64-bit types do depend on the register bank, but that's another issue to deal with later. llvm-svn: 349716	2018-12-20 01:35:49 +00:00
Matt Arsenault	8cc98bee8a	AMDGPU/GlobalISel: Fix ValueMapping tables for i1 This was incorrectly selecting SGPR for any i1 values, e.g. G_TRUNC to i1 from a VGPR was still an SGPR. llvm-svn: 349715	2018-12-20 01:33:43 +00:00
Craig Topper	9ca2f5605e	[X86] Disable custom widening of signed/unsigned add/sub saturation intrinsics under -x86-experimental-vector-widening-legalization. Generic legalization should take care of this. llvm-svn: 349714	2018-12-20 01:32:06 +00:00
Amara Emerson	8cb186ce17	[AArch64][GlobalISel] Implement selection og G_MERGE of two s32s into s64. This code pattern is an unfortunate side effect of the way some types get split at call lowering. Ideally we'd either not generate it at all or combine it away in the legalizer artifact combiner. Until then, add selection support anyway which is a significant proportion of our current fallbacks on CTMark. rdar://46491420 llvm-svn: 349712	2018-12-20 01:11:04 +00:00
Matt Arsenault	dff33c38e1	AMDGPU/GlobalISel: RegBankSelect for fp conversions llvm-svn: 349709	2018-12-20 00:37:02 +00:00
Matt Arsenault	36d4092173	AMDGPU/GlobalISel: Legality/regbankselect for atomicrmw/atomic_cmpxchg llvm-svn: 349708	2018-12-20 00:33:49 +00:00
Vitaly Buka	07a55f27dc	[asan] Undo special treatment of linkonce_odr and weak_odr Summary: On non-Windows these are already removed by ShouldInstrumentGlobal. On Window we will wait until we get actual issues with that. Reviewers: pcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D55899 llvm-svn: 349707	2018-12-20 00:30:27 +00:00
Vitaly Buka	d414e1bbb5	[asan] Prevent folding of globals with redzones Summary: ICF prevented by removing unnamed_addr and local_unnamed_addr for all sanitized globals. Also in general unnamed_addr is not valid here as address now is important for ODR violation detector and redzone poisoning. Before the patch ICF on globals caused: 1. false ODR reports when we register global on the same address more than once 2. globals buffer overflow if we fold variables of smaller type inside of large type. Then the smaller one will poison redzone which overlaps with the larger one. Reviewers: eugenis, pcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D55857 llvm-svn: 349706	2018-12-20 00:30:18 +00:00
Matt Davis	87b2268c0c	[DwarfExpression] Fix a typo in a doxygen comment. NFC. llvm-svn: 349703	2018-12-20 00:01:57 +00:00
Craig Topper	217b3b20d8	[X86] Remove TLI variable from ReplaceNodeResults. NFC We're already in X86TargetLowering which is a derived class of TargetLowering. We can just call methods directly. llvm-svn: 349695	2018-12-19 23:13:03 +00:00
Rhys Perry	3931ad38b9	AMDGPU: Add patterns for v4i16/v4f16 -> v4i16/v4f16 bitcasts Reviewers: arsenm, tstellar Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D55058 llvm-svn: 349694	2018-12-19 22:53:33 +00:00
Eli Friedman	a69084ffa8	[CodeGenPrepare] Fix bad IR created by large offset GEP splitting. Creating the IR builder, then modifying the CFG, leads to an IRBuilder where the BB and insertion point are inconsistent, so new instructions have the wrong parent. Modified an existing test because the test wasn't covering anything useful (the "invoke" was not actually an invoke by the time we hit the code in question). Differential Revision: https://reviews.llvm.org/D55729 llvm-svn: 349693	2018-12-19 22:52:04 +00:00
Rhys Perry	972273d1d3	Fix test commit Seems that was actually a eight space tab... llvm-svn: 349690	2018-12-19 22:33:42 +00:00
Rhys Perry	111bf831de	Test commit Replace tab with 4 spaces. llvm-svn: 349689	2018-12-19 22:26:51 +00:00
Evandro Menezes	374ccf6768	[AArch64] Improve Exynos predicates Expand the predicate `ExynosResetPred` to include all forms of immediate moves. llvm-svn: 349686	2018-12-19 22:24:36 +00:00
Evandro Menezes	ff827d737a	[AArch64] Use canonical copy idiom Use only the canonical form of the alias for register transfers in the `IsCopyIdiomPred` predicate. llvm-svn: 349685	2018-12-19 22:24:31 +00:00
Nikita Popov	3817ee7908	Revert "[BDCE][DemandedBits] Detect dead uses of undead instructions" This reverts commit r349674. It causes a failure in test-suite enc-3des.execution_time. llvm-svn: 349684	2018-12-19 22:09:02 +00:00
Reid Kleckner	ed3ef41711	[llvm-ar] Simplify string table get-or-insert pattern with .insert, NFC llvm-svn: 349681	2018-12-19 20:54:06 +00:00
Nikita Popov	649e125451	[BDCE][DemandedBits] Detect dead uses of undead instructions This (mostly) fixes https://bugs.llvm.org/show_bug.cgi?id=39771. BDCE currently detects instructions that don't have any demanded bits and replaces their uses with zero. However, if an instruction has multiple uses, then some of the uses may be dead (have no demanded bits) even though the instruction itself is still live. This patch extends DemandedBits/BDCE to detect such uses and replace them with zero. While this will not immediately render any instructions dead, it may lead to simplifications (in the motivating case, by converting a rotate into a simple shift), break dependencies, etc. The implementation tries to strike a balance between analysis power and complexity/memory usage. Originally I wanted to track demanded bits on a per-use level, but ultimately we're only really interested in whether a use is entirely dead or not. I'm using an extra set to track which uses are dead. However, as initially all uses are dead, I'm not storing uses those user is also dead. This case is checked separately instead. The test case has a couple of cases that are not simplified yet. In particular, we're only looking at uses of instructions right now. I think it would make sense to also extend this to arguments. Furthermore DemandedBits doesn't yet know some of the tricks that InstCombine does for the demanded bits or bitwise or/and/xor in combination with known bits information. Differential Revision: https://reviews.llvm.org/D55563 llvm-svn: 349674	2018-12-19 19:56:21 +00:00
Craig Topper	d16da2b479	[X86] Remove a bunch of 'else' after returns in reduceVMULWidth. NFC This reduces indentation and makes it obvious this function always returns something. llvm-svn: 349671	2018-12-19 19:39:34 +00:00
David Blaikie	ac69af7ad6	llvm-dwarfdump: Improve/fix pretty printing of array dimensions This is to address post-commit feedback from Paul Robinson on r348954. The original commit misinterprets count and upper bound as the same thing (I thought I saw GCC producing an upper bound the same as Clang's count, but GCC correctly produces an upper bound that's one less than the count (in C, that is, where arrays are zero indexed)). I want to preserve the C-like output for the common case, so in the absence of a lower bound the count (or one greater than the upper bound) is rendered between []. In the trickier cases, where a lower bound is specified, a half-open range is used (eg: lower bound 1, count 2 would be "[1, 3)" and an unknown parts use a '?' (eg: "[1, ?)" or "[?, 7)" or "[?, ? + 3)"). Reviewers: aprantl, probinson, JDevlieghere Differential Revision: https://reviews.llvm.org/D55721 llvm-svn: 349670	2018-12-19 19:34:24 +00:00
Matthew Voss	62fcfc5adb	[ThinLTO] Remove dllimport attribute from locally defined symbols Summary: The LTO/ThinLTO driver currently creates invalid bitcode by setting symbols marked dllimport as dso_local. The compiler often has access to the definition (often dllexport) and the declaration (often dllimport) of an object at link-time, leading to a conflicting declaration. This patch resolves the inconsistency by removing the dllimport attribute. Reviewers: tejohnson, pcc, rnk, echristo Reviewed By: rnk Subscribers: dmikulin, wristow, mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, dang, llvm-commits Differential Revision: https://reviews.llvm.org/D55627 llvm-svn: 349667	2018-12-19 19:07:45 +00:00
Jessica Paquette	3560e93dc1	[GlobalISel][AArch64] Add support for @llvm.ceil This adds a G_FCEIL generic instruction and uses it in AArch64. This adds selection for floating point ceil where it has a supported, dedicated instruction. Other cases aren't handled here. It updates the relevant gisel tests and adds a select-ceil test. It also adds a check to arm64-vcvt.ll which ensures that we don't fall back when we run into one of the relevant cases. llvm-svn: 349664	2018-12-19 19:01:36 +00:00
Craig Topper	84a00bd98a	[X86] Don't match TESTrr from (cmp (and X, Y), 0) during isel. Defer to post processing The (cmp (and X, Y) 0) pattern is greedy and ends up forming a TESTrr and consuming the and when it might be better to use one of the BMI/TBM like BLSR or BLSI. This patch moves removes the pattern from isel and adds a post processing check to combine TESTrr+ANDrr into just a TESTrr. With this patch we are able to select the BMI/TBM instructions, but we'll also emit a TESTrr when the result is compared to 0. In many cases the peephole pass will be able to use optimizeCompareInstr to remove the TEST, but its probably not perfect. Differential Revision: https://reviews.llvm.org/D55870 llvm-svn: 349661	2018-12-19 18:49:13 +00:00
Craig Topper	291470347a	[X86] Fix assert fails in pass X86AvoidSFBPass Fixes https://bugs.llvm.org/show_bug.cgi?id=38743 The function removeRedundantBlockingStores is supposed to remove any blocking stores contained in each other in lockingStoresDispSizeMap. But it currently looks only at the previous one, which will miss some cases that result in assert. This patch refine the function to check all previous layouts until find the uncontained one. So all redundant stores will be removed. Patch by Pengfei Wang Differential Revision: https://reviews.llvm.org/D55642 llvm-svn: 349660	2018-12-19 18:45:57 +00:00
Evandro Menezes	5d409b2278	[AArch64] Improve the Exynos M3 pipeline model llvm-svn: 349652	2018-12-19 17:37:51 +00:00
Anton Afanasyev	ce28791e20	Test commit Fix typos. llvm-svn: 349644	2018-12-19 17:18:40 +00:00
Sanjay Patel	798c5982a0	[ValueTracking] remove unused parameters from helper functions; NFC llvm-svn: 349641	2018-12-19 16:49:18 +00:00
Yonghong Song	7b410ac352	[BPF] Generate BTF DebugInfo under BPF target This patch implements BTF (BPF Type Format). The BTF is the debug info format for BPF, introduced in the below linux patch: `69b693f0ae (diff-06fb1c8825f653d7e539058b72c83332)` and further extended several times, e.g., https://www.spinics.net/lists/netdev/msg534640.html https://www.spinics.net/lists/netdev/msg538464.html https://www.spinics.net/lists/netdev/msg540246.html The main advantage of implementing in LLVM is: . better integration/deployment as no extra tools are needed. . bpf JIT based compilation (like bcc, bpftrace, etc.) can get BTF without much extra effort. . BTF line_info needs selective source codes, which can be easily retrieved when inside the compiler. This patch implemented BTF generation by registering a BPF specific DebugHandler in BPFAsmPrinter. Signed-off-by: Yonghong Song <yhs@fb.com> Differential Revision: https://reviews.llvm.org/D55752 llvm-svn: 349640	2018-12-19 16:40:25 +00:00
Peter Wu	f0ad811b54	[Object] Deduplicate long archive member names Summary: Import libraries as created by llvm-dlltool always use the same archive member name for every object file (namely, the DLL library name). Ensure that long names are not repeatedly stored in the string table. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D55860 llvm-svn: 349637	2018-12-19 16:15:05 +00:00
Simon Pilgrim	7bfbf3caa4	[X86][SSE] Auto upgrade PADDUS/PSUBUS intrinsics to UADD_SAT/USUB_SAT generic intrinsics (llvm) Now that we use the generic ISD opcodes, we can use the generic intrinsics directly as well. This fixes the poor fast-isel codegen by not expanding to an easily broken IR code sequence. I'm intending to deal with the signed saturation equivalents as well. Clang counterpart: https://reviews.llvm.org/D55879 Differential Revision: https://reviews.llvm.org/D55855 llvm-svn: 349630	2018-12-19 14:43:36 +00:00
Simon Pilgrim	2ae3a91656	[SelectionDAG] Optional handling of UNDEF elements in matchBinaryPredicate (part 2 of 2) Now that SimplifyDemandedBits/SimplifyDemandedVectorElts is simplifying vector elements, we're seeing more constant BUILD_VECTOR containing undefs. This patch provides opt-in support for UNDEF elements in matchBinaryPredicate, passing NULL instead of the result ConstantSDNode* argument. I've updated the (or (and X, c1), c2) -> (and (or X, c2), c1\|c2) fold to demonstrate its use, which I believe is safe for undef cases. Differential Revision: https://reviews.llvm.org/D55822 llvm-svn: 349629	2018-12-19 14:09:38 +00:00
Simon Pilgrim	47ff0431e9	[SelectionDAG] Optional handling of UNDEF elements in matchBinaryPredicate (part 1 of 2) Now that SimplifyDemandedBits/SimplifyDemandedVectorElts is simplifying vector elements, we're seeing more constant BUILD_VECTOR containing undefs. This patch provides opt-in support for UNDEF elements in matchBinaryPredicate, passing NULL instead of the result ConstantSDNode* argument. Differential Revision: https://reviews.llvm.org/D55822 llvm-svn: 349628	2018-12-19 14:09:09 +00:00
Simon Pilgrim	6c95bea072	[TargetLowering] Fix propagation of undefs in zero extension ops (PR40091) As described on PR40091, we have several places where zext (and zext_vector_inreg) fold an undef input into an undef output. For zero extensions this is incorrect as the output should guarantee to least have the new upper bits set to zero. SimplifyDemandedVectorElts is the worst offender (and its the most likely to cause new undefs to appear) but DAGCombiner's tryToFoldExtendOfConstant has a similar issue. Thanks to @dmgreen for catching this. Differential Revision: https://reviews.llvm.org/D55883 llvm-svn: 349625	2018-12-19 13:37:59 +00:00
Nico Weber	f7cf1a1a73	Let TableGen write output only if it changed, instead of doing so in cmake, attempt 2 This relands r330742: """ Let TableGen write output only if it changed, instead of doing so in cmake. Removes one subprocess and one temp file from the build for each tablegen invocation. No intended behavior change. """ In particular, if you see rebuilds after this change that you didn't see before this change, that's unintended and it's fine to revert this change again (but let me know). r330742 got reverted because some people reported that llvm-tblgen ran on every build after it. This could happen if the depfile output got deleted without deleting the main .inc output. To fix, make TableGen always write the depfile, but keep writing the main .inc output only if it has changed. This matches what we did in cmake before. Differential Revision: https://reviews.llvm.org/D55842 llvm-svn: 349624	2018-12-19 13:35:53 +00:00
Nicolai Haehnle	8d5e974076	AMDGPU: Use an ABS32_LO relocation for SCRATCH_RSRC_DWORD1 Summary: Using HI here makes no logical sense, since the dword is only 32 bits to begin with. Current Mesa master does not look at the relocation type at all, so this change is fine. Future Mesa will rely on this, however. Change-Id: I91085707834c4ac0370926602b93c94b90e44cb1 Reviewers: arsenm, rampitec, mareko Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D55369 llvm-svn: 349620	2018-12-19 11:55:03 +00:00
Simon Pilgrim	2072b5afbe	[SelectionDAG] Optional handling of UNDEF elements in matchUnaryPredicate Now that SimplifyDemandedBits/SimplifyDemandedVectorElts are simplifying vector elements, we're seeing more constant BUILD_VECTOR containing UNDEFs. This patch provides opt-in handling of UNDEF elements in matchUnaryPredicate, passing NULL instead of the ConstantSDNode* argument. I've updated SelectionDAG::simplifyShift to demonstrate its use. Differential Revision: https://reviews.llvm.org/D55819 llvm-svn: 349616	2018-12-19 10:41:06 +00:00
Carl Ritson	c521ac3a44	AMDGPU/InsertWaitcnts: Update VGPR/SGPR bounds when brackets are merged Summary: Fix an issue where VGPR/SGPR bounds are not properly extended when brackets are merged. This manifests as missing waitcnt insertions when multiple brackets are forwarded to a successor block and the first forward has lower VGPR/SGPR bounds. Irreducible loop test has been extended based on a CTS failure detected for GFX9. Reviewers: nhaehnle Reviewed By: nhaehnle Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D55602 llvm-svn: 349611	2018-12-19 10:17:49 +00:00
Diana Picus	6c35a1e5af	[ARM GlobalISel] Support G_CONSTANT for Thumb2 All we have to do is mark it as legal. This allows us to select a lot of new patterns handled by TableGen. This patch adds tests for them and splits up the existing test file for binary operators into 2 files, one for arithmetic ops and one for logical ones. llvm-svn: 349610	2018-12-19 09:55:10 +00:00
Matt Arsenault	b110e2277c	AMDGPU/GlobalISel: Regbankselect for fsub llvm-svn: 349608	2018-12-19 09:07:58 +00:00
Martin Storsjo	e84a0b5a9e	[llvm-objcopy] Initial COFF support This is an initial implementation of no-op passthrough copying of COFF with objcopy. Differential Revision: https://reviews.llvm.org/D54939 llvm-svn: 349605	2018-12-19 07:24:38 +00:00
Kewen Lin	a6247e7cf4	[PowerPC]Exploit P9 vabsdu for unsigned vselect patterns For type v4i32/v8ii16/v16i8, do following transforms: (vselect (setcc a, b, setugt), (sub a, b), (sub b, a)) -> (vabsd a, b) (vselect (setcc a, b, setuge), (sub a, b), (sub b, a)) -> (vabsd a, b) (vselect (setcc a, b, setult), (sub b, a), (sub a, b)) -> (vabsd a, b) (vselect (setcc a, b, setule), (sub b, a), (sub a, b)) -> (vabsd a, b) Differential Revision: https://reviews.llvm.org/D55812 llvm-svn: 349599	2018-12-19 03:04:07 +00:00
Evandro Menezes	f03c45d582	[AArch64] Simplify the Exynos M3 pipeline model llvm-svn: 349569	2018-12-18 23:19:57 +00:00
Evandro Menezes	4e39fa4474	[AArch64] Fix instructions order (NFC) llvm-svn: 349568	2018-12-18 23:19:55 +00:00
Yonghong Song	61b189e06f	[DebugInfo] Move several private headers to include directory This patch moved the following files in lib/CodeGen/AsmPrinter/ AsmPrinterHandler.h DbgEntityHistoryCalculator.h DebugHandlerBase.h to include/llvm/CodeGen directory. Such a change will enable Target to extend DebugHandlerBase and emit Target specific debug info sections. Signed-off-by: Yonghong Song <yhs@fb.com> Differential Revision: https://reviews.llvm.org/D55755 llvm-svn: 349564	2018-12-18 23:10:17 +00:00
Pete Cooper	a3e0be109c	Preserve the linkage for objc* intrinsics as clang will set them to weak_external in some cases Clang uses weak linkage for objc runtime functions when they are not available on the platform. The intrinsic has this linkage so we just need to pass that on to the runtime call. llvm-svn: 349559	2018-12-18 22:42:08 +00:00
Pete Cooper	d0ffdf8782	Add nonlazybind to objc_retain/objc_release when converting from intrinsics. For performance reasons, clang set nonlazybind on these functions. Now that we are using intrinsics instead of runtime calls, we should set this attribute when creating the runtime functions. llvm-svn: 349558	2018-12-18 22:31:34 +00:00
Florian Hahn	485f2826ba	[LAA] Introduce enum for vectorization safety status (NFC). This patch adds a VectorizationSafetyStatus enum, which will be extended in a follow up patch to distinguish between 'safe with runtime checks' and 'known unsafe' dependences. Reviewers: anemet, anna, Ayal, hsaito Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D54892 llvm-svn: 349556	2018-12-18 22:25:11 +00:00
Vitaly Buka	4e4920694c	[asan] Restore ODR-violation detection on vtables Summary: unnamed_addr is still useful for detecting of ODR violations on vtables Still unnamed_addr with lld and --icf=safe or --icf=all can trigger false reports which can be avoided with --icf=none or by using private aliases with -fsanitize-address-use-odr-indicator Reviewers: eugenis Reviewed By: eugenis Subscribers: kubamracek, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D55799 llvm-svn: 349555	2018-12-18 22:23:30 +00:00
Pete Cooper	f86db5ce9e	Rewrite objc intrinsics to runtime methods in PreISelIntrinsicLowering instead of SDAG. SelectionDAG currently changes these intrinsics to function calls, but that won't work for other ISel's. Also we want to eventually support nonlazybind and weak linkage coming from the front-end which we can't do in SelectionDAG. llvm-svn: 349552	2018-12-18 22:20:03 +00:00
Martin Storsjo	df20c666d6	[AArch64] Avoid crashing on .seh directives in assembly Differential Revision: https://reviews.llvm.org/D55670 llvm-svn: 349549	2018-12-18 22:10:17 +00:00
Kuba Mracek	3760fc9f3d	[asan] In llvm.asan.globals, allow entries to be non-GlobalVariable and skip over them Looks like there are valid reasons why we need to allow bitcasts in llvm.asan.globals, see discussion at https://github.com/apple/swift-llvm/pull/133. Let's look through bitcasts when iterating over entries in the llvm.asan.globals list. Differential Revision: https://reviews.llvm.org/D55794 llvm-svn: 349544	2018-12-18 21:20:17 +00:00
Evandro Menezes	3753c25b8c	[llvm-mca] Dump mask in hex Dump the resources masks as hexadecimal. llvm-svn: 349536	2018-12-18 20:45:50 +00:00
Pete Cooper	be4f571107	Change the objc ARC optimizer to use the new objc.* intrinsics We're moving ARC optimisation and ARC emission in clang away from runtime methods and towards intrinsics. This is the part which actually uses the intrinsics in the ARC optimizer when both analyzing the existing calls and emitting new ones. Differential Revision: https://reviews.llvm.org/D55348 Reviewers: ahatanak llvm-svn: 349534	2018-12-18 20:32:49 +00:00
Craig Topper	18a9d545e1	[X86] Add BSR to isUseDefConvertible. We already had BSF here as part of __builtin_ffs improvements and I was just wondering yesterday whether we should have BSR there. This addresses one issue from PR40090. llvm-svn: 349531	2018-12-18 20:03:54 +00:00
Nikita Popov	20853a7807	[InstCombine] Simplify cttz/ctlz + icmp eq/ne into mask check Checking whether a number has a certain number of trailing / leading zeros means checking whether it is of the form XXXX1000 / 0001XXXX, which can be done with an and+icmp. Related to https://bugs.llvm.org/show_bug.cgi?id=28668. As a next step, this can be extended to non-equality predicates. Differential Revision: https://reviews.llvm.org/D55745 llvm-svn: 349530	2018-12-18 19:59:50 +00:00
Farhana Aleen	59ee2c5362	[AMDGPU] Removed the unnecessary operand size-check-assert from processBaseWithConstOffset(). Summary: 32bit operand sizes are guaranteed by the opcode check AMDGPU::V_ADD_I32_e64 and AMDGPU::V_ADDC_U32_e64. Therefore, we don't any additional operand size-check-assert. Author: FarhanaAleen llvm-svn: 349529	2018-12-18 19:58:39 +00:00
David Blaikie	693f617763	DebugInfo: Fix missing local imported entities after r349207 Post commit review/bug reported by Pavel Labath - thanks! llvm-svn: 349528	2018-12-18 19:40:22 +00:00
Florian Hahn	5c014037b3	[SCCP] Get rid of redundant call for getPredicateInfoFor (NFC). We can use the result fetched a few lines above. llvm-svn: 349527	2018-12-18 19:37:07 +00:00
Craig Topper	8434ef7d1e	[X86] Don't use SplitOpsAndApply to create ISD::UADDSAT/ISD::USUBSAT nodes. Let type legalization and op legalization deal with it. Now that we've switched to target independent nodes we can rely on generic infrastructure to do the legalization for us. llvm-svn: 349526	2018-12-18 19:29:08 +00:00
Sanjay Patel	e51d5bdb3c	[InstCombine] refactor isCheapToScalarize(); NFC As the FIXME indicates, this has the potential to go overboard. So I'm not sure if it's even worth keeping this vs. iteratively doing simple matches, but we might as well clean it up. llvm-svn: 349523	2018-12-18 19:07:38 +00:00
Nikita Popov	f6058ff140	[X86] Use SADDSAT/SSUBSAT instead of ADDS/SUBS Migrate the X86 backend from X86ISD opcodes ADDS and SUBS to generic ISD opcodes SADDSAT and SSUBSAT. This also improves scodegen for @llvm.sadd.sat() and @llvm.ssub.sat() intrinsics. This is a followup to D55787 and part of PR40056. Differential Revision: https://reviews.llvm.org/D55833 llvm-svn: 349520	2018-12-18 18:28:22 +00:00
Craig Topper	20a6db5a84	[X86] Create PSUBUS from (add (umax X, C), -C) InstCombine seems to canonicalize or PSUB patter into a max with the cosntant and an add with an inverse of the constant. This patch recognizes this pattern and turns it into PSUBUS. Future work could improve undef element handling. Fixes some of PR40053 Differential Revision: https://reviews.llvm.org/D55780 llvm-svn: 349519	2018-12-18 18:26:25 +00:00
Alexandre Ganea	b536bf5299	Buildfix for r345516 (Clang compilation failing). llvm-svn: 349518	2018-12-18 18:23:36 +00:00
Alexandre Ganea	b67d91e090	[llvm-symbolizer] Omit stderr output when symbolizing a crash Differential revision: https://reviews.llvm.org/D55723 llvm-svn: 349516	2018-12-18 18:13:13 +00:00
Michael Berg	c6a5245cf7	Add FMF management to common fp intrinsics in GlobalIsel Summary: This the initial code change to facilitate managing FMF flags from Instructions to MI wrt Intrinsics in Global Isel. Eventually the GlobalObserver interface will be added as well, where FMF additions can be tracked for the builder and CSE. Reviewers: aditya_nandakumar, bogner Reviewed By: bogner Subscribers: rovka, kristof.beyls, javed.absar Differential Revision: https://reviews.llvm.org/D55668 llvm-svn: 349514	2018-12-18 17:54:52 +00:00
Michael Kruse	d4eb13c880	[LoopVectorize] Rename pass options. NFC. Rename: NoUnrolling to InterleaveOnlyWhenForced and AlwaysVectorize to !VectorizeOnlyWhenForced Contrary to what the name 'AlwaysVectorize' suggests, it does not unconditionally vectorize all loops, but applies a cost model to determine whether vectorization is profitable to all loops. Hence, passing false will disable the cost model, except when a loop is marked with llvm.loop.vectorize.enable. The 'OnlyWhenForced' suffix (suggested by @hfinkel in D55716) better matches this behavior. Similarly, 'NoUnrolling' disables the profitability cost model for interleaving (a term to distinguish it from unrolling by the LoopUnrollPass); rename it for consistency. Differential Revision: https://reviews.llvm.org/D55785 llvm-svn: 349513	2018-12-18 17:46:09 +00:00
Simon Pilgrim	1411917431	[X86][SSE] Don't use 'sign bit select' vXi8 ROTL lowering for constant rotation amounts Noticed by @spatel on D55747 - we get much better codegen if we use the regular shift expansion. llvm-svn: 349510	2018-12-18 17:31:11 +00:00
Michael Kruse	3284775b70	[LoopUnroll] Honor '#pragma unroll' even with -fno-unroll-loops. When using clang with `-fno-unroll-loops` (implicitly added with `-O1`), the LoopUnrollPass is not not added to the (legacy) pass pipeline. This also means that it will not process any loop metadata such as llvm.loop.unroll.enable (which is generated by #pragma unroll or WarnMissedTransformationsPass emits a warning that a forced transformation has not been applied (see https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181210/610833.html). Such explicit transformations should take precedence over disabling heuristics. This patch unconditionally adds LoopUnrollPass to the optimizing pipeline (that is, it is still not added with `-O0`), but passes a flag indicating whether automatic unrolling is dis-/enabled. This is the same approach as LoopVectorize uses. The new pass manager's pipeline builder has no option to disable unrolling, hence the problem does not apply. Differential Revision: https://reviews.llvm.org/D55716 llvm-svn: 349509	2018-12-18 17:16:05 +00:00
Simon Pilgrim	e9effe9744	[X86][SSE] Don't use 'sign bit select' vXi8 ROTL lowering for splat rotation amounts Noticed by @spatel on D55747 - we get much better codegen if we use the regular shift expansion. llvm-svn: 349500	2018-12-18 16:02:23 +00:00
Petar Avramovic	0a5e4eb776	[MIPS GlobalISel] Select G_SDIV, G_UDIV, G_SREM and G_UREM Add support for s64 libcalls for G_SDIV, G_UDIV, G_SREM and G_UREM and use integer type of correct size when creating arguments for CLI.lowerCall. Select G_SDIV, G_UDIV, G_SREM and G_UREM for types s8, s16, s32 and s64 on MIPS32. Differential Revision: https://reviews.llvm.org/D55651 llvm-svn: 349499	2018-12-18 15:59:51 +00:00
Nikita Popov	665ab08178	[X86] Use UADDSAT/USUBSAT instead of ADDUS/SUBUS Replace the X86ISD opcodes ADDUS and SUBUS with generic ISD opcodes UADDSAT and USUBSAT. As a side-effect, this also makes codegen for the @llvm.uadd.sat and @llvm.usub.sat intrinsics reasonable. This only replaces use in the X86 backend, and does not move any of the ADDUS/SUBUS X86 specific combines into generic codegen. Differential Revision: https://reviews.llvm.org/D55787 llvm-svn: 349481	2018-12-18 13:23:03 +00:00
Nikita Popov	a7d2a235bb	[SelectionDAG][X86] Fix [US](ADD\|SUB)SAT vector legalization, add tests Integer result promotion needs to use the scalar size, and we need support for result widening. This is in preparation for D55787. llvm-svn: 349480	2018-12-18 13:22:53 +00:00
Petar Avramovic	150fd430f6	[MIPS GlobalISel] ClampScalar G_AND G_OR and G_XOR Add narrowScalar for G_AND and G_XOR. Legalize G_AND G_OR and G_XOR for types other then s32 with clampScalar on MIPS32. Differential Revision: https://reviews.llvm.org/D55362 llvm-svn: 349475	2018-12-18 11:36:14 +00:00
Luke Cheeseman	f57d7d8237	[AArch64] - Return address signing dwarf support - Reapply changes intially introduced in r343089 - The archtecture info is no longer loaded whenever a DWARFContext is created - The runtimes libraries (santiziers) make use of the dwarf context classes but do not intialise the target info - The architecture of the object can be obtained without loading the target info - Adding a method to the dwarf context to get this information and multiplex the string printing later on Differential Revision: https://reviews.llvm.org/D55774 llvm-svn: 349472	2018-12-18 10:37:42 +00:00
Dylan McKay	f920da009e	[IPO][AVR] Create new Functions in the default address space specified in the data layout This modifies the IPO pass so that it respects any explicit function address space specified in the data layout. In targets with nonzero program address spaces, all functions should, by default, be placed into the default program address space. This is required for Harvard architectures like AVR. Without this, the functions will be marked as residing in data space, and thus not be callable. This has no effect to any in-tree official backends, as none use an explicit program address space in their data layouts. Patch by Tim Neumann. llvm-svn: 349469	2018-12-18 09:52:52 +00:00
Matt Arsenault	c94e26c71d	AMDGPU: Legalize/regbankselect frame_index llvm-svn: 349468	2018-12-18 09:46:13 +00:00
Matt Arsenault	c0ea221068	AMDGPU: Legalize/regbankselect fma llvm-svn: 349467	2018-12-18 09:39:56 +00:00
Simon Pilgrim	af6fbbf18b	[TargetLowering] Fallback from SimplifyDemandedVectorElts to SimplifyDemandedBits For opcodes not covered by SimplifyDemandedVectorElts, SimplifyDemandedBits might be able to help now that it supports demanded elts as well. llvm-svn: 349466	2018-12-18 09:33:25 +00:00
Tim Northover	856628f707	SROA: preserve alignment tags on loads and stores. When splitting up an alloca's uses we were dropping any explicit alignment tags, which means they default to the ABI-required default alignment and this can cause miscompiles if the real value was smaller. Also refactor the TBAA metadata into a parent class since it's shared by both children anyway. llvm-svn: 349465	2018-12-18 09:29:39 +00:00
Matt Arsenault	1ac38ba73f	GlobalISel: Improve crash on invalid mapping If NumBreakDowns is 0, BreakDown is null. This trades a null dereference with an assert somewhere else. llvm-svn: 349464	2018-12-18 09:27:29 +00:00
Matt Arsenault	e01e7c81f2	AMDGPU/GlobalISel: Legalize/regbankselect fneg/fabs/fsub llvm-svn: 349463	2018-12-18 09:19:03 +00:00
Simon Pilgrim	8488a44c34	[X86][SSE] Move VSRAI sign extend in reg fold into SimplifyDemandedBits (VSRAI (VSHLI X, C1), C1) --> X iff NumSignBits(X) > C1 This works better as part of SimplifyDemandedBits than part of the general combine. llvm-svn: 349462	2018-12-18 09:11:34 +00:00
Simon Pilgrim	26c630f416	[X86][SSE] Replace (VSRLI (VSRAI X, Y), 31) -> (VSRLI X, 31) fold. This fold was incredibly specific - replace with a SimplifyDemandedBits fold to remove a VSRAI if only the original sign bit is demanded (its guaranteed to stay the same). Test change is merely a rescheduling. llvm-svn: 349459	2018-12-18 08:55:47 +00:00
Kristof Beyls	e66bc1f756	Introduce control flow speculation tracking pass for AArch64 The pass implements tracking of control flow miss-speculation into a "taint" register. That taint register can then be used to mask off registers with sensitive data when executing under miss-speculation, a.k.a. "transient execution". This pass is aimed at mitigating against SpectreV1-style vulnarabilities. At the moment, it implements the tracking of miss-speculation of control flow into a taint register, but doesn't implement a mechanism yet to then use that taint register to mask off vulnerable data in registers (something for a follow-on improvement). Possible strategies to mask out vulnerable data that can be implemented on top of this are: - speculative load hardening to automatically mask of data loaded in registers. - using intrinsics to mask of data in registers as indicated by the programmer (see https://lwn.net/Articles/759423/). For AArch64, the following implementation choices are made. Some of these are different than the implementation choices made in the similar pass implemented in X86SpeculativeLoadHardening.cpp, as the instruction set characteristics result in different trade-offs. - The speculation hardening is done after register allocation. With a relative abundance of registers, one register is reserved (X16) to be the taint register. X16 is expected to not clash with other register reservation mechanisms with very high probability because: . The AArch64 ABI doesn't guarantee X16 to be retained across any call. . The only way to request X16 to be used as a programmer is through inline assembly. In the rare case a function explicitly demands to use X16/W16, this pass falls back to hardening against speculation by inserting a DSB SYS/ISB barrier pair which will prevent control flow speculation. - It is easy to insert mask operations at this late stage as we have mask operations available that don't set flags. - The taint variable contains all-ones when no miss-speculation is detected, and contains all-zeros when miss-speculation is detected. Therefore, when masking, an AND instruction (which only changes the register to be masked, no other side effects) can easily be inserted anywhere that's needed. - The tracking of miss-speculation is done by using a data-flow conditional select instruction (CSEL) to evaluate the flags that were also used to make conditional branch direction decisions. Speculation of the CSEL instruction can be limited with a CSDB instruction - so the combination of CSEL + a later CSDB gives the guarantee that the flags as used in the CSEL aren't speculated. When conditional branch direction gets miss-speculated, the semantics of the inserted CSEL instruction is such that the taint register will contain all zero bits. One key requirement for this to work is that the conditional branch is followed by an execution of the CSEL instruction, where the CSEL instruction needs to use the same flags status as the conditional branch. This means that the conditional branches must not be implemented as one of the AArch64 conditional branches that do not use the flags as input (CB(N)Z and TB(N)Z). This is implemented by ensuring in the instruction selectors to not produce these instructions when speculation hardening is enabled. This pass will assert if it does encounter such an instruction. - On function call boundaries, the miss-speculation state is transferred from the taint register X16 to be encoded in the SP register as value 0. Future extensions/improvements could be: - Implement this functionality using full speculation barriers, akin to the x86-slh-lfence option. This may be more useful for the intrinsics-based approach than for the SLH approach to masking. Note that this pass already inserts the full speculation barriers if the function for some niche reason makes use of X16/W16. - no indirect branch misprediction gets protected/instrumented; but this could be done for some indirect branches, such as switch jump tables. Differential Revision: https://reviews.llvm.org/D54896 llvm-svn: 349456	2018-12-18 08:50:02 +00:00
Martin Storsjo	8f0cb9c3a8	[AArch64] [MinGW] Allow enabling SEH exceptions The default still is dwarf, but SEH exceptions can now be enabled optionally for the MinGW target. Differential Revision: https://reviews.llvm.org/D55748 llvm-svn: 349451	2018-12-18 08:32:37 +00:00
Kewen Lin	44ace92596	[PowerPC] Exploit power9 new instruction setb Check the expected pattens feeding to SELECT_CC like: (select_cc lhs, rhs, 1, (sext (setcc [lr]hs, [lr]hs, cc2)), cc1) (select_cc lhs, rhs, -1, (zext (setcc [lr]hs, [lr]hs, cc2)), cc1) (select_cc lhs, rhs, 0, (select_cc [lr]hs, [lr]hs, 1, -1, cc2), seteq) (select_cc lhs, rhs, 0, (select_cc [lr]hs, [lr]hs, -1, 1, cc2), seteq) Further transform the sequence to comparison + setb if hits. Differential Revision: https://reviews.llvm.org/D53275 llvm-svn: 349445	2018-12-18 07:53:26 +00:00
Craig Topper	1ff7356f96	[X86] Const correct some helper functions X86InstrInfo.cpp. NFC llvm-svn: 349440	2018-12-18 04:58:05 +00:00
Artur Pilipenko	2a0146e0fd	[CaptureTracking] Pass MaxUsesToExplore from wrappers to the actual implementation This is a follow up for rL347910. In the original patch I somehow forgot to pass the limit from wrappers to the function which actually does the job. llvm-svn: 349438	2018-12-18 03:32:33 +00:00
Kewen Lin	3dac1252da	[PowerPC] Improve vec_abs on P9 Improve the current vec_abs support on P9, generate ISD::ABS node for vector types, combine ABS node to VABSD node for some special cases to make use of P9 VABSD* insns, do custom lowering to vsub(vneg later)+vmax if it has no combination opportunity. Differential Revision: https://reviews.llvm.org/D54783 llvm-svn: 349437	2018-12-18 03:16:43 +00:00
Eli Friedman	f457470286	[Support] Fix GNU/kFreeBSD build Patch by James Clarke. Differential Revision: https://reviews.llvm.org/D55296 llvm-svn: 349434	2018-12-18 01:38:20 +00:00
Reid Kleckner	4ab50b858e	[codeview] Update comment on aligning symbol records llvm-svn: 349433	2018-12-18 01:36:06 +00:00
Reid Kleckner	53ce05960e	[codeview] Align symbol records to save 441MB during linking clang.pdb In PDBs, symbol records must be aligned to four bytes. However, in the object file, symbol records may not be aligned. MSVC does not pad out symbol records to make sure they are aligned. That means the linker has to do extra work to insert the padding. Currently, LLD calculates the required space with alignment, and copies each record one at a time while padding them out to the correct size. It has a fast path that avoids this copy when the records are already aligned. This change fixes a bug in that codepath so that the copy is actually saved, and tweaks LLVM's symbol record emission to align symbol records. Here's how things compare when doing a plain clang Release+PDB build: - objs are 0.65% bigger (negligible) - link is 3.3% faster (negligible) - saves allocating 441MB - new LLD high water mark is ~1.05GB llvm-svn: 349431	2018-12-18 01:14:05 +00:00
David Blaikie	c4e08feb00	Recommit r348806: DebugInfo: Use symbol difference for CU length to simplify assembly reading/editing Mucking about simplifying a test case ( https://reviews.llvm.org/D55261 ) I stumbled across something I've hit before - that LLVM's (GCC's does too, FWIW) assembly output includes a hardcode length for a DWARF unit in its header. Instead we could emit a label difference - making the assembly easier to read/edit (though potentially at a slight (I haven't tried to observe it) performance cost of delaying/sinking the length computation into the MC layer). Fix: Predicated all the changes (including creating the labels, even if they aren't used/needed) behind the NVPTX useSectionsAsReferences, avoiding emitting labels in NVPTX where ptxas can't parse them. Reviewers: JDevlieghere, probinson, ABataev Differential Revision: https://reviews.llvm.org/D55281 llvm-svn: 349430	2018-12-18 01:06:09 +00:00
Joel E. Denny	e2afb61499	[FileCheck] Annotate input dump (final tweaks) Apply final suggestions from probinson for this patch series plus a few more tweaks: * Improve various docs, for MatchType in particular. * Rename some members of MatchType. The main problem was that the term "final match" became a misnomer when CHECK-COUNT-<N> was created. * Split InputStartLine, etc. declarations into multiple lines. Differential Revision: https://reviews.llvm.org/D55738 Reviewed By: probinson llvm-svn: 349425	2018-12-18 00:03:51 +00:00
Joel E. Denny	96f0e84ccf	[FileCheck] Annotate input dump (7/7) This patch implements annotations for diagnostics reporting CHECK-NOT failed matches. These diagnostics are enabled by -vv. As for diagnostics reporting failed matches for other directives, these annotations mark the search ranges using `X~~`. The difference here is that failed matches for CHECK-NOT are successes not errors, so they are green not red when colors are enabled. For example: ``` $ FileCheck -dump-input=help The following description was requested by -dump-input=help to explain the input annotations printed by -dump-input=always and -dump-input=fail: - L: labels line number L of the input file - T:L labels the only match result for a pattern of type T from line L of the check file - T:L'N labels the Nth match result for a pattern of type T from line L of the check file - ^~~ marks good match (reported if -v) - !~~ marks bad match, such as: - CHECK-NEXT on same line as previous match (error) - CHECK-NOT found (error) - CHECK-DAG overlapping match (discarded, reported if -vv) - X~~ marks search range when no match is found, such as: - CHECK-NEXT not found (error) - CHECK-NOT not found (success, reported if -vv) - CHECK-DAG not found after discarded matches (error) - ? marks fuzzy match when no match is found - colors success, error, fuzzy match, discarded match, unmatched input If you are not seeing color above or in input dumps, try: -color $ FileCheck -vv -dump-input=always check5 < input5 \|& sed -n '/^<<<</,$p' <<<<<< 1: abcdef check:1 ^~~ not:2 X~~ 2: ghijkl not:2 ~~~ check:3 ^~~ 3: mnopqr not:4 X~~~~~ 4: stuvwx not:4 ~~~~~~ 5: eof:4 ^ >>>>>> $ cat check5 CHECK: abc CHECK-NOT: foobar CHECK: jkl CHECK-NOT: foobar $ cat input5 abcdef ghijkl mnopqr stuvwx ``` Reviewed By: george.karpenkov, probinson Differential Revision: https://reviews.llvm.org/D53899 llvm-svn: 349424	2018-12-18 00:03:36 +00:00
Joel E. Denny	f7c1c4d8a4	[FileCheck] Annotate input dump (6/7) This patch implements input annotations for diagnostics reporting CHECK-DAG discarded matches. These diagnostics are enabled by -vv. These annotations mark discarded match ranges using `!~~` because they are bad matches even though they are not errors. CHECK-DAG discarded matches create another case where there can be multiple match results for the same directive. For example: ``` $ FileCheck -dump-input=help The following description was requested by -dump-input=help to explain the input annotations printed by -dump-input=always and -dump-input=fail: - L: labels line number L of the input file - T:L labels the only match result for a pattern of type T from line L of the check file - T:L'N labels the Nth match result for a pattern of type T from line L of the check file - ^~~ marks good match (reported if -v) - !~~ marks bad match, such as: - CHECK-NEXT on same line as previous match (error) - CHECK-NOT found (error) - CHECK-DAG overlapping match (discarded, reported if -vv) - X~~ marks search range when no match is found, such as: - CHECK-NEXT not found (error) - CHECK-DAG not found after discarded matches (error) - ? marks fuzzy match when no match is found - colors success, error, fuzzy match, discarded match, unmatched input If you are not seeing color above or in input dumps, try: -color $ FileCheck -vv -dump-input=always check4 < input4 \|& sed -n '/^<<<</,$p' <<<<<< 1: abcdef dag:1 ^~~~ dag:2'0 !~~~ discard: overlaps earlier match 2: cdefgh dag:2'1 ^~~~ check:3 X~ error: no match found >>>>>> $ cat check4 CHECK-DAG: abcd CHECK-DAG: cdef CHECK: efgh $ cat input4 abcdef cdefgh ``` This shows that the line 3 CHECK fails to match even though its pattern appears in the input because its search range starts after the line 2 CHECK-DAG's match range. The trouble might be that the line 2 CHECK-DAG's match range is later than expected because its first match range overlaps with the line 1 CHECK-DAG match range and thus is discarded. Because `!~~` for CHECK-DAG does not indicate an error, it is not colored red. Instead, when colors are enabled, it is colored cyan, which suggests a match that went cold. Reviewed By: george.karpenkov, probinson Differential Revision: https://reviews.llvm.org/D53898 llvm-svn: 349423	2018-12-18 00:03:19 +00:00
Joel E. Denny	7df86967b4	[FileCheck] Annotate input dump (5/7) This patch implements input annotations for diagnostics enabled by -v, which report good matches for directives. These annotations mark match ranges using `^~~`. For example: ``` $ FileCheck -dump-input=help The following description was requested by -dump-input=help to explain the input annotations printed by -dump-input=always and -dump-input=fail: - L: labels line number L of the input file - T:L labels the only match result for a pattern of type T from line L of the check file - T:L'N labels the Nth match result for a pattern of type T from line L of the check file - ^~~ marks good match (reported if -v) - !~~ marks bad match, such as: - CHECK-NEXT on same line as previous match (error) - CHECK-NOT found (error) - X~~ marks search range when no match is found, such as: - CHECK-NEXT not found (error) - ? marks fuzzy match when no match is found - colors success, error, fuzzy match, unmatched input If you are not seeing color above or in input dumps, try: -color $ FileCheck -v -dump-input=always check3 < input3 \|& sed -n '/^<<<</,$p' <<<<<< 1: abc foobar def check:1 ^~~ not:2 !~~~~~ error: no match expected check:3 ^~~ >>>>>> $ cat check3 CHECK: abc CHECK-NOT: foobar CHECK: def $ cat input3 abc foobar def ``` -vv enables these annotations for FileCheck's implicit EOF patterns as well. For an example where EOF patterns become relevant, see patch 7 in this series. If colors are enabled, `^~~` is green to suggest success. -v plus color enables highlighting of input text that has no final match for any expected pattern. The highlight uses a cyan background to suggest a cold section. This highlighting can make it easier to spot text that was intended to be matched but that failed to be matched in a long series of good matches. CHECK-COUNT-<num> good matches are another case where there can be multiple match results for the same directive. Reviewed By: george.karpenkov, probinson Differential Revision: https://reviews.llvm.org/D53897 llvm-svn: 349422	2018-12-18 00:03:03 +00:00
Joel E. Denny	0e7e3fa0e9	[FileCheck] Annotate input dump (4/7) This patch implements input annotations for diagnostics that report unexpected matches for CHECK-NOT. Like wrong-line matches for CHECK-NEXT, CHECK-SAME, and CHECK-EMPTY, these annotations mark match ranges using red `!~~` to indicate bad matches that are errors. For example: ``` $ FileCheck -dump-input=help The following description was requested by -dump-input=help to explain the input annotations printed by -dump-input=always and -dump-input=fail: - L: labels line number L of the input file - T:L labels the only match result for a pattern of type T from line L of the check file - T:L'N labels the Nth match result for a pattern of type T from line L of the check file - !~~ marks bad match, such as: - CHECK-NEXT on same line as previous match (error) - CHECK-NOT found (error) - X~~ marks search range when no match is found, such as: - CHECK-NEXT not found (error) - ? marks fuzzy match when no match is found - colors error, fuzzy match If you are not seeing color above or in input dumps, try: -color $ FileCheck -v -dump-input=always check3 < input3 \|& sed -n '/^<<<</,$p' <<<<<< 1: abc foobar def not:2 !~~~~~ error: no match expected >>>>>> $ cat check3 CHECK: abc CHECK-NOT: foobar CHECK: def $ cat input3 abc foobar def ``` Reviewed By: george.karpenkov, probinson Differential Revision: https://reviews.llvm.org/D53896 llvm-svn: 349421	2018-12-18 00:02:47 +00:00
Joel E. Denny	cadfcef493	[FileCheck] Annotate input dump (3/7) This patch implements input annotations for diagnostics that report wrong-line matches for the directives CHECK-NEXT, CHECK-SAME, and CHECK-EMPTY. Instead of the usual `^~~`, which is used by later patches for good matches, these annotations use `!~~` to mark the bad match ranges so that this category of errors is visually distinct. Because such matches are errors, these annotates are red when colors are enabled. For example: ``` $ FileCheck -dump-input=help The following description was requested by -dump-input=help to explain the input annotations printed by -dump-input=always and -dump-input=fail: - L: labels line number L of the input file - T:L labels the only match result for a pattern of type T from line L of the check file - T:L'N labels the Nth match result for a pattern of type T from line L of the check file - !~~ marks bad match, such as: - CHECK-NEXT on same line as previous match (error) - X~~ marks search range when no match is found, such as: - CHECK-NEXT not found (error) - ? marks fuzzy match when no match is found - colors error, fuzzy match If you are not seeing color above or in input dumps, try: -color $ FileCheck -v -dump-input=always check2 < input2 \|& sed -n '/^<<<</,$p' <<<<<< 1: foo bar next:2 !~~ error: match on wrong line >>>>>> $ cat check2 CHECK: foo CHECK-NEXT: bar $ cat input2 foo bar ``` Reviewed By: george.karpenkov, probinson Differential Revision: https://reviews.llvm.org/D53894 llvm-svn: 349420	2018-12-18 00:02:22 +00:00
Joel E. Denny	2c007c807d	[FileCheck] Annotate input dump (2/7) This patch implements input annotations for diagnostics that suggest fuzzy matches for directives for which no matches were found. Instead of using the usual `^~~`, which is used by later patches for good matches, these annotations use `?` so that fuzzy matches are visually distinct. No tildes are included as these diagnostics (independently of this patch) currently identify only the start of the match. For example: ``` $ FileCheck -dump-input=help The following description was requested by -dump-input=help to explain the input annotations printed by -dump-input=always and -dump-input=fail: - L: labels line number L of the input file - T:L labels the only match result for a pattern of type T from line L of the check file - T:L'N labels the Nth match result for a pattern of type T from line L of the check file - X~~ marks search range when no match is found - ? marks fuzzy match when no match is found - colors error, fuzzy match If you are not seeing color above or in input dumps, try: -color $ FileCheck -v -dump-input=always check1 < input1 \|& sed -n '/^<<<</,$p' <<<<<< 1: ; abc def 2: ; ghI jkl next:3'0 X~~~~~~~~ error: no match found next:3'1 ? possible intended match >>>>>> $ cat check1 CHECK: abc CHECK-SAME: def CHECK-NEXT: ghi CHECK-SAME: jkl $ cat input1 ; abc def ; ghI jkl ``` This patch introduces the concept of multiple "match results" per directive. In the above example, the first match result for the CHECK-NEXT directive is the failed match, for which the annotation shows the search range. The second match result is the fuzzy match. Later patches will introduce other cases of multiple match results per directive. When colors are enabled, `?` is colored magenta. That is, it doesn't indicate the actual error, which a red `X~~` marker indicates, but its color suggests it's closely related. Reviewed By: george.karpenkov, probinson Differential Revision: https://reviews.llvm.org/D53893 llvm-svn: 349419	2018-12-18 00:02:04 +00:00
Joel E. Denny	3c5d267eb7	[FileCheck] Annotate input dump (1/7) Extend FileCheck to dump its input annotated with FileCheck's diagnostics: errors, good matches if -v, and additional information if -vv. The goal is to make it easier to visualize FileCheck's matching behavior when debugging. Each patch in this series implements input annotations for a particular category of FileCheck diagnostics. While the first few patches alone are somewhat useful, the annotations become much more useful as later patches implement annotations for -v and -vv diagnostics, which show the matching behavior leading up to the error. This first patch implements boilerplate plus input annotations for error diagnostics reporting that no matches were found for a directive. These annotations mark the search ranges of the failed directives. Instead of using the usual `^~~`, which is used by later patches for good matches, these annotations use `X~~` so that this category of errors is visually distinct. For example: ``` $ FileCheck -dump-input=help The following description was requested by -dump-input=help to explain the input annotations printed by -dump-input=always and -dump-input=fail: - L: labels line number L of the input file - T:L labels the match result for a pattern of type T from line L of the check file - X~~ marks search range when no match is found - colors error If you are not seeing color above or in input dumps, try: -color $ FileCheck -v -dump-input=always check1 < input1 \|& sed -n '/^Input file/,$p' Input file: <stdin> Check file: check1 -dump-input=help describes the format of the following dump. Full input was: <<<<<< 1: ; abc def 2: ; ghI jkl next:3 X~~~~~~~~ error: no match found >>>>>> $ cat check1 CHECK: abc CHECK-SAME: def CHECK-NEXT: ghi CHECK-SAME: jkl $ cat input1 ; abc def ; ghI jkl ``` Some additional details related to the boilerplate: * Enabling: The annotated input dump is enabled by `-dump-input`, which can also be set via the `FILECHECK_OPTS` environment variable. Accepted values are `help`, `always`, `fail`, or `never`. As shown above, `help` describes the format of the dump. `always` is helpful when you want to investigate a successful FileCheck run, perhaps for an unexpected pass. `-dump-input-on-failure` and `FILECHECK_DUMP_INPUT_ON_FAILURE` remain as a deprecated alias for `-dump-input=fail`. * Diagnostics: The usual diagnostics are not suppressed in this mode and are printed first. For brevity in the example above, I've omitted them using a sed command. Sometimes they're perfectly sufficient, and then they make debugging quicker than if you were forced to hunt through a dump of long input looking for the error. If you think they'll get in the way sometimes, keep in mind that it's pretty easy to grep for the start of the input dump, which is `<<<`. * Colored Annotations: The annotated input is colored if colors are enabled (enabling colors can be forced using -color). For example, errors are red. However, as in the above example, colors are not vital to reading the annotations. I don't know how to test color in the output, so any hints here would be appreciated. Reviewed By: george.karpenkov, zturner, probinson Differential Revision: https://reviews.llvm.org/D52999 llvm-svn: 349418	2018-12-18 00:01:39 +00:00
Peter Collingbourne	d3a3e4b46d	hwasan: Move ctor into a comdat. Differential Revision: https://reviews.llvm.org/D55733 llvm-svn: 349413	2018-12-17 22:56:34 +00:00
Simon Pilgrim	7e2975a44c	[X86][SSE] Improve immediate vector shift known bits handling. Convert VSRAI to VSRLI is the sign bit is known zero and improve KnownBits output for all shift instruction. Fixes the poor codegen comments in D55768. llvm-svn: 349407	2018-12-17 22:09:47 +00:00
Wouter van Oortmerssen	d3c544aa6e	[WebAssembly] Fix assembler parsing of br_table. Summary: We use `variable_ops` in the tablegen defs to denote the list of branch targets in `br_table`, but unlike other uses of `variable_ops` (e.g. call) the these branch targets need to actually be encoded in the instruction. The existing tables for `variable_ops` cause not operands to be accepted by the assembly matcher. Following the example of ARM: `2cc0a7da87/lib/Target/ARM/ARMInstrInfo.td (L550-L555)` we introduce a new operand type to capture this list, and we use the same {} syntax as ARM as well to differentiate them from regular integer operands. Also removed definition and use of TSFlags in tablegen defs, since `br_table` now has a non-variable_ops immediate operand, so the previous logic of only the variable_ops arguments being labels didn't make sense anymore. Reviewers: dschuff, aheejin, sunfish Subscribers: javed.absar, sbc100, jgravelle-google, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D55401 llvm-svn: 349405	2018-12-17 22:04:44 +00:00
Craig Topper	8c9d772991	[X86] Add T1MSKC and TZMSK to isDefConvertible used by optimizeCompareInstr. These seem to have been missed when the other TBM instructions were added. llvm-svn: 349404	2018-12-17 21:50:06 +00:00
Reid Kleckner	94ee0728e5	[codeview] Flush labels before S_DEFRANGE* fragments This was a pre-existing bug that could be triggered with assembly like this: .p2align 2 .LtmpN: .cv_def_range "..." I noticed this when attempting to change clang to emit aligned symbol records. llvm-svn: 349403	2018-12-17 21:49:35 +00:00
Simon Pilgrim	6b5e0b7b2b	[X86][SSE] Split SimplifyDemandedBitsForTargetNode X86ISD::VSRLI/VSRAI handling. First step towards adding more capable combines to fix comments in D55768. llvm-svn: 349400	2018-12-17 21:36:17 +00:00
Sanjay Patel	200885e654	[AggressiveInstCombine] convert rotate with guard branch into funnel shift (PR34924) Now, that we have funnel shift intrinsics, it should be safe to convert this form of rotate to it. In the worst case (a target that doesn't have rotate instructions), we will expand this into a branch-less sequence of ALU ops (neg/and/and/lshr/shl/or) in the backend, so it's still very likely to be a perf improvement over the original code. The motivating source code pattern for this is shown in: https://bugs.llvm.org/show_bug.cgi?id=34924 Background: I looked at several different options before deciding where to try this - instcombine, simplifycfg, CGP - because it doesn't fit cleanly anywhere AFAIK. The backend (CGP, SDAG, GlobalIsel?) is too late for what we're trying to accomplish. We want to have the IR converted before we reach things like vectorization because the reduced code can make a loop much simpler to transform. Technically, this could be included in instcombine, but it's a large pattern match that includes control-flow, so it just felt wrong to stuff into there (although I have a draft of that patch). Similarly, this could be part of simplifycfg, but all of this pattern matching is a stretch. So we're left with our relatively new dumping ground for homeless transforms: aggressive-instcombine. This only runs at -O3, but that seems like a reasonable limitation given that source code has many options to avoid this pattern (including the recently added clang intrinsics for rotates). I'm including a PhaseOrdering test because we require the teamwork of 3 passes (aggressive-instcombine, instcombine, simplifycfg) to get this into the minimal IR form that we want. That test shows a bug with the new pass manager that's independent of this change (but it will be masked if we canonicalize harder to funnel shift intrinsics in instcombine). Differential Revision: https://reviews.llvm.org/D55604 llvm-svn: 349396	2018-12-17 21:14:51 +00:00
Krzysztof Parzyszek	5852aa44ae	[SDAG] Clarify the origin of chain in REG_SEQUENCE in comment, NFC llvm-svn: 349391	2018-12-17 20:30:20 +00:00
Craig Topper	15b7246935	[SelectionDAG] Fix noop detection for vectors in AssertZext/AssertSext in getNode The assertion type is always supposed to be a scalar type. So if the result VT of the assertion is a vector, we need to get the scalar VT before we can compare them. Similarly for the assert above it. I don't have a test case because I don't know of any place we violate this today. A coworker found this while trying to use r347287 on the 6.0 branch without also having r336868 llvm-svn: 349390	2018-12-17 20:29:13 +00:00
Sanjay Patel	1a6e9ec434	[InstCombine] don't widen an arbitrary sequence of vector ops (PR40032) The problem is shown specifically for a case with vector multiply here: https://bugs.llvm.org/show_bug.cgi?id=40032 ...and this might mask the original backend bug for ARM shown in: https://bugs.llvm.org/show_bug.cgi?id=39967 As the test diffs here show, we were (and probably still aren't) doing these kinds of transforms in a principled way. We are producing more or equal wide instructions than we started with in some cases, so we still need to restrict/correct other transforms from overstepping. If there are perf regressions from this change, we can either carve out exceptions to the general IR rules, or improve the backend to do these transforms when we know the transform is profitable. That's probably similar to a change like D55448. Differential Revision: https://reviews.llvm.org/D55744 llvm-svn: 349389	2018-12-17 20:27:43 +00:00
Craig Topper	728cbc0378	Convert (CMP (srl/shl X, C), 0) to (CMP (and X, C'), 0) when only the zero flag is used. This allows a TEST to be used and can be combined with any AND that may already exist as an input to the shift. This was already done in EmitTest, but was easily tricked by multiple uses because the setcc might be used by multiple instructions. Once the SETCC and users are legalized then we can look for the shift to be used by a single CMP, but the CMP itself can have multiple users. This appears to fix the case in PR39968. llvm-svn: 349385	2018-12-17 20:02:16 +00:00
JF Bastien	1811217e4d	NFC: remove unused variable D55768 removed its use. llvm-svn: 349377	2018-12-17 19:03:24 +00:00
Simon Pilgrim	9274f17a5e	[TargetLowering] Add DemandedElts mask to SimplifyDemandedBits (PR40000) This is an initial patch to add the necessary support for a DemandedElts argument to SimplifyDemandedBits, more closely matching computeKnownBits and to help improve vector codegen. I've added only a small amount of the changes necessary to get at least one test to update - a lot more can be done but I'd like to add these methodically with proper test coverage, at the same time the hope is to slowly move some/all of SimplifyDemandedVectorElts into SimplifyDemandedBits as well. Differential Revision: https://reviews.llvm.org/D55768 llvm-svn: 349374	2018-12-17 18:43:43 +00:00
Nikita Popov	221f3fc750	[InstSimplify] Simplify saturating add/sub + icmp If a saturating add/sub has one constant operand, then we can determine the possible range of outputs it can produce, and simplify an icmp comparison based on that. The implementation is based on a similar existing mechanism for simplifying binary operator + icmps. Differential Revision: https://reviews.llvm.org/D55735 llvm-svn: 349369	2018-12-17 17:45:18 +00:00
Tim Northover	256a16d031	FastIsel: take care to update iterators when removing instructions. We keep a few iterators into the basic block we're selecting while performing FastISel. Usually this is fine, but occasionally code wants to remove already-emitted instructions. When this happens we have to be careful to update those iterators so they're not pointint at dangling memory. llvm-svn: 349365	2018-12-17 17:25:53 +00:00
Zachary Turner	1b9a938b9a	Add missing include file. llvm-svn: 349363	2018-12-17 16:42:26 +00:00
Zachary Turner	bb3d7e565f	[PDB] Add some helper functions for working with scopes. llvm-svn: 349361	2018-12-17 16:15:36 +00:00
Zachary Turner	b472512a77	[MS Demangler] Add a helper function to print a Node as a string. llvm-svn: 349359	2018-12-17 16:14:50 +00:00
Petar Avramovic	f9c9bc09ab	[MIPS GlobalISel] Remove switch statement (fix r349346 for MSVC) Temporarily remove switch statement without any case labels in function legalizeCustom in order to fix r349346 for MSVC. llvm-svn: 349356	2018-12-17 15:12:53 +00:00
Tim Northover	ae3b66b7b0	ARM: use acquire/release instruction variants when available. These features (fairly) recently got split out into their own feature, so we should make CodeGen use them when available. The main change here is that the check used to be based on the triple, but now it's based on CPU features. llvm-svn: 349355	2018-12-17 15:05:32 +00:00
Andrea Di Biagio	4c73711069	[MCA] Add support for BeginGroup/EndGroup. llvm-svn: 349354	2018-12-17 14:27:33 +00:00
Eric Liu	6c933a2bed	Revert "DebugInfo: Assume an absence of ranges or high_pc on a CU means the CU is empty (devoid of code addresses)" This reverts commit r349333. It caused internal test to fail. I have sent more information to the author. llvm-svn: 349353	2018-12-17 14:14:40 +00:00
Andrea Di Biagio	4506067593	[MCA] Don't assume that createMCInstrAnalysis() always returns a valid pointer. Class InstrBuilder wrongly assumed that llvm targets were always able to return a non-null pointer when createMCInstrAnalysis() was called on them. This was causing crashes when simulating executions for targets that don't provide an MCInstrAnalysis object. This patch fixes the issue by making MCInstrAnalysis optional. llvm-svn: 349352	2018-12-17 14:00:37 +00:00
Petar Avramovic	b8276f2280	[MIPS GlobalISel] Lower G_UADDE and narrowScalar G_ADD Lower G_UADDE and legalize G_ADD using narrowScalar on MIPS32. Differential Revision: https://reviews.llvm.org/D54580 llvm-svn: 349346	2018-12-17 12:31:07 +00:00
Alexandros Lamprineas	490ae11717	[AArch64] Re-run load/store optimizer after aggressive tail duplication The Load/Store Optimizer runs before Machine Block Placement. At O3 the Tail Duplication Threshold is set to 4 instructions and this can create new opportunities for the Load/Store Optimizer. It seems worthwhile to run it once again. llvm-svn: 349338	2018-12-17 10:45:43 +00:00
David Blaikie	884deed1b3	DebugInfo: Assume an absence of ranges or high_pc on a CU means the CU is empty (devoid of code addresses) GCC emitted these unconditionally on/before 4.4/March 2012 Clang emitted these unconditionally on/before 3.5/March 2014 This improves performance when parsing CUs (especially those using split DWARF) that contain no code ranges (such as the mini CUs that may be created by ThinLTO importing - though generally they should be/are avoided, especially for Split DWARF because it produces a lot of very small CUs, which don't scale well in a bunch of other ways too (including size)). llvm-svn: 349333	2018-12-17 08:27:19 +00:00
Clement Courbet	cc5e6a72de	[llvm-mca] Move llvm-mca library to llvm/lib/MCA. Summary: See PR38731. Reviewers: andreadb Subscribers: mgorny, javed.absar, tschuett, gbedwell, andreadb, RKSimon, llvm-commits Differential Revision: https://reviews.llvm.org/D55557 llvm-svn: 349332	2018-12-17 08:08:31 +00:00
Craig Topper	fa4907d671	[X86] Fix bad operand lookup for cmov introduced in r349315 The CC is operand 2 not operand 3. llvm-svn: 349330	2018-12-17 06:40:35 +00:00
Davide Italiano	e41e1d015f	[EarlyCSE] If DI can't be salvaged, mark it as unavailable. Fixes PR39874. llvm-svn: 349323	2018-12-17 01:42:39 +00:00
Simon Pilgrim	d0c9e43b1c	[X86] Pull out constant splat rotation detection. We had 3 different approaches - consistently use getTargetConstantBitsFromNode and allow undef elts. llvm-svn: 349319	2018-12-16 19:46:04 +00:00
Craig Topper	10f8892837	[X86] Remove truncation handling from EmitTest. Replace it with a DAG combine. I'd like to try to move a lot of the flag matching out of EmitTest and push it to isel or isel preprocessing. This is a step towards that. The test-shrink-bug.ll changie is an improvement because we are no longer interfering with test shrink handling in isel. The pr34137.ll change is a regression, but the IR came from -O0 and was not reduced by InstCombine. So it contains a lot of redundancies like duplicate loads that made it combine poorly. llvm-svn: 349315	2018-12-16 18:35:55 +00:00
Sanjay Patel	13ac2f15b0	[x86] increment/decrement constant vector with min/max in vsetcc lowering (PR39859) This is part of fixing PR39859: https://bugs.llvm.org/show_bug.cgi?id=39859 We have a crippled vector ISA, so we have to invert a typical fold and create min/max here. As discussed in the bug report, we can probably do better by using saturating subtract when it's available, but we should have this improvement for the min/max patterns regardless. Alive proofs: https://rise4fun.com/Alive/zsf https://rise4fun.com/Alive/Qrl Differential Revision: https://reviews.llvm.org/D55515 llvm-svn: 349304	2018-12-16 15:05:48 +00:00
Sanjay Patel	f24900b934	[DAGCombiner] allow hoisting vector bitwise logic ahead of truncates The transform performs a bitwise logic op in a wider type followed by truncate when both inputs are truncated from the same source type: logic_op (truncate x), (truncate y) --> truncate (logic_op x, y) There are a bunch of other checks that should prevent doing this when it might be harmful. We already do this transform for scalars in this spot. The vector limitation was shared with a check for the case when the operands are extended. I'm not sure if that limit is needed either, but that would be a separate patch. Differential Revision: https://reviews.llvm.org/D55448 llvm-svn: 349303	2018-12-16 14:57:04 +00:00
Simon Pilgrim	0ef977b83d	[SelectionDAG] Add FSHL/FSHR support to computeKnownBits Also exposes an issue in DAGCombiner::visitFunnelShift where we were assuming the shift amount had the result type (after legalization it'll have the targets shift amount type). llvm-svn: 349298	2018-12-16 13:33:37 +00:00
Simon Pilgrim	52c982406e	[X86] Begin cleaning up combineOr -> SHLD/SHRD. NFCI. In preparation for converting to funnel shifts. llvm-svn: 349286	2018-12-15 21:11:49 +00:00
Simon Pilgrim	ef7b5949e5	[X86] Lower to SHLD/SHRD on slow machines for optsize Use consistent rules for when to lower to SHLD/SHRD for slow machines - fixes a weird issue where funnel shift gets expanded but then X86ISelLowering's combineOr sees the optsize and combines to SHLD/SHRD, but now with the modulo amount guard...... llvm-svn: 349285	2018-12-15 19:43:44 +00:00
Kamil Rytarowski	21e270a479	Add NetBSD support in needsRuntimeRegistrationOfSectionRange. Use linker script magic to get data/cnts/name start/end. llvm-svn: 349277	2018-12-15 16:51:35 +00:00
Kamil Rytarowski	15ae738bc8	Register kASan shadow offset for NetBSD/amd64 The NetBSD x86_64 kernel uses the 0xdfff900000000000 shadow offset. llvm-svn: 349276	2018-12-15 16:32:41 +00:00
Dinar Temirbulatov	8c8724dd0d	[CodeGen] Enhance machine PHIs optimization Summary: Make machine PHIs optimization to work for single value register taken from several different copies. This is the first step to fix PR38917. This change allows to get rid of redundant PHIs (see opt_phis2.mir test) to make the subsequent optimizations (like CSE) possible and simpler. For instance, before this patch the code like this: %b = COPY %z ... %a = PHI %bb1, %a; %bb2, %b could be optimized to: %a = %b but the code like this: %c = COPY %z ... %b = COPY %z ... %a = PHI %bb1, %a; %bb2, %b; %bb3, %c would remain unchanged. With this patch the latter case will be optimized: %a = %z```. Committed on behalf of: Anton Afanasyev anton.a.afanasyev@gmail.com Reviewers: RKSimon, MatzeB Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54839 llvm-svn: 349271	2018-12-15 14:37:01 +00:00
Simon Pilgrim	9831d4058c	Fix -Wunused-variable warning. NFCI. llvm-svn: 349265	2018-12-15 12:25:22 +00:00
Simon Pilgrim	1e1fd9c761	[TargetLowering] Add ISD::OR + ISD::XOR handling to SimplifyDemandedVectorElts Differential Revision: https://reviews.llvm.org/D55600 llvm-svn: 349264	2018-12-15 11:36:36 +00:00
Florian Hahn	abe32c9125	[SILoadStoreOptimizer] Use std::abs to avoid truncation. Using regular abs() causes the following warning error: absolute value function 'abs' given an argument of type 'int64_t' (aka 'long') but has parameter of type 'int' which may cause truncation of value [-Werror,-Wabsolute-value] (uint32_t)abs(Dist) > MaxDist) { ^ lib/Target/AMDGPU/SILoadStoreOptimizer.cpp:1369:19: note: use function 'std::abs' instead which causes a bot to fail: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/18284/steps/bootstrap%20clang/logs/stdio llvm-svn: 349224	2018-12-15 01:32:58 +00:00
Craig Topper	1fc257d97f	[X86] Rename hasNoSignedComparisonUses to hasNoSignFlagUses. Add the instruction that only modify the O flag to the waiver list. The only caller of this turns CMP with 0 into TEST. CMP with 0 and TEST both set OF to 0 so we should have no issues with instructions that only use OF. Though I don't think there's any reason we would read just OF after a compare with 0 anyway. So this probably isn't an observable change. llvm-svn: 349223	2018-12-15 01:07:19 +00:00
Craig Topper	5c304eac41	[X86] Make hasNoCarryFlagUses/hasNoSignedComparisonUses take an SDValue that indicates which result is the flag result. NFCI hasNoCarryFlagUses hardcoded that the flag result is 1 and used that to filter which uses were of interest. hasNoSignedComparisonUses just assumes the only result is flags and checks whether any user of the node is a CopyToReg instruction. After this patch we now do a result number check in both and rely on the caller to provide the result number. This shouldn't change behavior it was just an odd difference between the two functions that I noticed. llvm-svn: 349222	2018-12-15 01:07:16 +00:00
Heejin Ahn	feef720bb8	[WebAssembly] Check if the section order is correct Summary: This patch checks if the section order is correct when reading a wasm object file in `WasmObjectFile` and converting YAML to wasm object in yaml2wasm. (It is not possible to check when reading YAML because it is handled exclusively by the YAML reader.) This checks the ordering of all known sections (core sections + known custom sections). This also adds section ID DataCount section that will be scheduled to be added in near future. Reviewers: sbc100 Subscribers: dschuff, mgorny, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D54924 llvm-svn: 349221	2018-12-15 00:58:12 +00:00
Florian Hahn	c214bc2b8d	[NewGVN] Update use counts for SSA copies when replacing them by their operands. The current code relies on LeaderUseCount to determine if we can remove an SSA copy, but in that the LeaderUseCount does not refer to the SSA copy. If a SSA copy is a dominating leader, we use the operand as dominating leader instead. This means we removed a user of a ssa copy and we should decrement its use count, so we can remove the ssa copy once it becomes dead. Fixes PR38804. Reviewers: efriedma, davide Reviewed By: davide Differential Revision: https://reviews.llvm.org/D51595 llvm-svn: 349217	2018-12-15 00:32:38 +00:00
Vedant Kumar	9d1827331f	[Util] Refer to [s\|z]exts of args when converting dbg.declares (fix PR35400) When converting dbg.declares, if the described value is a [s\|z]ext, refer to the ext directly instead of referring to its operand. This fixes a narrowing bug (the debugger got the sign of a variable wrong, see llvm.org/PR35400). The main reason to refer to the ext's operand was that an optimization may remove the ext itself, leading to a dropped variable. Now that InstCombine has been taught to use replaceAllDbgUsesWith (r336451), this is less of a concern. Other passes can/should adopt this API as needed to fix dropped variable bugs. Differential Revision: https://reviews.llvm.org/D51813 llvm-svn: 349214	2018-12-15 00:03:33 +00:00
Artem Belevich	6d74bd638a	[NVPTX] Lower instructions that expand into libcalls. The change is an effort to split and refactor abandoned D34708 into smaller parts. Here the behaviour of unsupported instructions is changed to match the behaviour of explicit intrinsics calls. Currently LLVM crashes with: > Assertion getInstruction() && "Not a call or invoke instruction!" failed. With this patch LLVM produces a more sensible error message: > Cannot select: ... i32 = ExternalSymbol'__foobar' Author: Denys Zariaiev <denys.zariaiev@gmail.com> Differential Revision: https://reviews.llvm.org/D55145 llvm-svn: 349213	2018-12-14 23:53:06 +00:00
David Blaikie	560ff35592	DebugInfo: Avoid using split DWARF when the split unit would be empty. In ThinLTO many split CUs may be effectively empty because of the lack of support for cross-unit references in split DWARF. Using a split unit in those cases is just a waste/overhead - and turned out to be one contributor to a significant symbolizer performance issue when global variable debug info was being imported (see r348416 for the primary fix) due to symbolizers seeing CUs with no ranges, assuming there might still be addresses covered and walking into the split CU to see if there are any ranges (when that split CU was in a DWP file, that meant loading the DWP and its index, the index was extra large because of all these fractured/empty CUs... and so was very expensive to load). (the 3rd fix which will follow, is to assume that a CU with no ranges is empty rather than merely missing its CU level range data - and to not walk into its DIEs (split or otherwise) in search of address information that is generally not present) llvm-svn: 349207	2018-12-14 22:44:46 +00:00
Reid Kleckner	5bf71d1127	[codeview] Add begin/endSymbolRecord helpers, NFC Previously beginning a symbol record was excessively verbose. Now it's a bit simpler. This follows the same pattern as begin/endCVSubsection. llvm-svn: 349205	2018-12-14 22:40:28 +00:00
David Blaikie	61c127c1ad	DebugInfo: Move addAddrBase from DwarfUnit to DwarfCompileUnit Only CUs need an address table reference. llvm-svn: 349203	2018-12-14 22:34:03 +00:00
Krzysztof Parzyszek	26d994f56e	[Hexagon] Add patterns for shifts of v2i16 This fixes https://llvm.org/PR39983. llvm-svn: 349202	2018-12-14 22:33:48 +00:00
Volkan Keles	574d737e06	[GlobalISel] LegalizerHelper: Implement fewerElementsVector for G_LOAD/G_STORE Reviewers: aemerson, dsanders, bogner, paquette, aditya_nandakumar Reviewed By: dsanders Subscribers: rovka, kristof.beyls, javed.absar, tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D53728 llvm-svn: 349200	2018-12-14 22:11:20 +00:00
Krzysztof Parzyszek	c0fc0a9775	[Hexagon] Use IMPLICIT_DEF to any-extend 32-bit values to 64 bits llvm-svn: 349199	2018-12-14 22:05:44 +00:00
Farhana Aleen	ce095c564a	[AMDGPU] Promote constant offset to the immediate by finding a new base with 13bit constant offset from the nearby instructions. Summary: Promote constant offset to immediate by recomputing the relative 13bit offset from nearby instructions. E.g. s_movk_i32 s0, 0x1800 v_add_co_u32_e32 v0, vcc, s0, v2 v_addc_co_u32_e32 v1, vcc, 0, v6, vcc s_movk_i32 s0, 0x1000 v_add_co_u32_e32 v5, vcc, s0, v2 v_addc_co_u32_e32 v6, vcc, 0, v6, vcc global_load_dwordx2 v[5:6], v[5:6], off global_load_dwordx2 v[0:1], v[0:1], off => s_movk_i32 s0, 0x1000 v_add_co_u32_e32 v5, vcc, s0, v2 v_addc_co_u32_e32 v6, vcc, 0, v6, vcc global_load_dwordx2 v[5:6], v[5:6], off global_load_dwordx2 v[0:1], v[5:6], off offset:2048 Author: FarhanaAleen Reviewed By: arsenm, rampitec Subscribers: llvm-commits, AMDGPU Differential Revision: https://reviews.llvm.org/D55539 llvm-svn: 349196	2018-12-14 21:13:14 +00:00
Krzysztof Parzyszek	6b01d35497	[SDAG] Ignore chain operand in REG_SEQUENCE when emitting instructions llvm-svn: 349186	2018-12-14 20:14:12 +00:00
Evandro Menezes	ea9d90083f	[AArch64] Simplify the scheduling predicates (NFC) The instruction encodings make it unnecessary to distinguish extended W-form from X-form instructions. llvm-svn: 349185	2018-12-14 20:04:58 +00:00
Michael Kruse	ea9ef34558	[TransformWarning] Do not warn missed transformations in optnone functions. Optimization transformations are intentionally disabled by the 'optnone' function attribute. Therefore do not warn if transformation metadata is still present. Using the legacy pass manager structure, the `skipFunction` method takes care for the optnone attribute (already called before this patch). For the new pass manager, there is no equivalent, so we check for the 'optnone' attribute manually. Differential Revision: https://reviews.llvm.org/D55690 llvm-svn: 349184	2018-12-14 19:45:43 +00:00
Michael Kruse	5948b7f30f	[Transforms] Preserve metadata when converting invoke to call. The `changeToCall` function did not preserve the invoke's metadata. Currently, there is probably no metadata that depends on being applied on a CallInst or InvokeInst. Therefore we can replace the instruction's metadata. This fixes http://llvm.org/PR39994 Suggested-by: Moritz Kreutzer <moritz.kreutzer@siemens.com> Differential Revision: https://reviews.llvm.org/D55666 llvm-svn: 349170	2018-12-14 18:15:11 +00:00
Zachary Turner	8fb9a71dde	[MS Demangler] Fail gracefully on invalid pointer types. Once we detect a 'P', we know we a pointer type is upcoming, so we make some assumptions about the output that follows. If those assumptions didn't hold, we would assert. Instead, we should fail gracefully and propagate the error up. llvm-svn: 349169	2018-12-14 18:10:13 +00:00
Daniel Sanders	629db5d8e5	[globalisel][combiner] Make the CombinerChangeObserver a MachineFunction::Delegate Summary: This allows us to register it with the MachineFunction delegate and be notified automatically about erasure and creation of instructions. However, we still need explicit notification for modifications such as those caused by setReg() or replaceRegWith(). There is a catch with this though. The notification for creation is delivered before any operands can be added. While appropriate for scheduling combiner work. This is unfortunate for debug output since an opcode by itself doesn't provide sufficient information on what happened. As a result, the work list remembers the instructions (when debug output is requested) and emits a more complete dump later. Another nit is that the MachineFunction::Delegate provides const pointers which is inconvenient since we want to use it to schedule future modification. To resolve this GISelWorkList now has an optional pointer to the MachineFunction which describes the scope of the work it is permitted to schedule. If a given MachineInstr* is in this function then it is permitted to schedule work to be performed on the MachineInstr's. An alternative to this would be to remove the const from the MachineFunction::Delegate interface, however delegates are not permitted to modify the MachineInstr's they receive. In addition to this, the observer has three interface changes. * erasedInstr() is now erasingInstr() to indicate it is about to be erased but still exists at the moment. * changingInstr() and changedInstr() have been added to report changes before and after they are made. This allows us to trace the changes in the debug output. * As a convenience changingAllUsesOfReg() and finishedChangingAllUsesOfReg() will report changingInstr() and changedInstr() for each use of a given register. This is primarily useful for changes caused by MachineRegisterInfo::replaceRegWith() With this in place, both combine rules have been updated to report their changes to the observer. Finally, make some cosmetic changes to the debug output and make Combiner and CombinerHelp Reviewers: aditya_nandakumar, bogner, volkan, rtereshin, javed.absar Reviewed By: aditya_nandakumar Subscribers: mgorny, rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D52947 llvm-svn: 349167	2018-12-14 17:50:14 +00:00
Zachary Turner	2cd3286ed2	Fix a crash in llvm-undname with invalid types. llvm-svn: 349165	2018-12-14 17:43:56 +00:00
Ehsan Amiri	de1742c284	NFC. Adding an empty line to test the updated commit credentials. llvm-svn: 349158	2018-12-14 16:19:02 +00:00
Scott Linder	de6beb02a5	Implement -frecord-command-line (-frecord-gcc-switches) Implement options in clang to enable recording the driver command-line in an ELF section. Implement a new special named metadata, llvm.commandline, to support frontends embedding their command-line options in IR/ASM/ELF. This differs from the GCC implementation in some key ways: * In GCC there is only one command-line possible per compilation-unit, in LLVM it mirrors llvm.ident and multiple are allowed. * In GCC individual options are separated by NULL bytes, in LLVM entire command-lines are separated by NULL bytes. The advantage of the GCC approach is to clearly delineate options in the face of embedded spaces. The advantage of the LLVM approach is to support merging multiple command-lines unambiguously, while handling embedded spaces with escaping. Differential Revision: https://reviews.llvm.org/D54487 Clang Differential Revision: https://reviews.llvm.org/D54489 llvm-svn: 349155	2018-12-14 15:38:15 +00:00
John Brawn	1d0d86ae40	[RegAllocGreedy] IMPLICIT_DEF values shouldn't prefer registers It costs nothing to spill an IMPLICIT_DEF value (the only spill code that's generated is a KILL of the value), so when creating split constraints if the live-out value is IMPLICIT_DEF the exit constraint should be DontCare instead of PrefReg. Differential Revision: https://reviews.llvm.org/D55652 llvm-svn: 349151	2018-12-14 14:07:57 +00:00
Diana Picus	02c8343c75	[ARM GlobalISel] Thumb2: casts between int and ptr Mark as legal and add tests. Nothing special to do. llvm-svn: 349147	2018-12-14 13:45:38 +00:00
Diana Picus	813af0d283	[ARM GlobalISel] Minor refactoring. NFCI Refactor the ARMInstructionSelector to cache some opcodes in the constructor instead of checking all the time if we're in ARM or Thumb mode. llvm-svn: 349143	2018-12-14 12:37:24 +00:00
Diana Picus	14dc3b2959	[ARM GlobalISel] Allow simple binary ops in Thumb2 Mark G_ADD, G_SUB, G_MUL, G_AND, G_OR and G_XOR as legal for both ARM and Thumb2. Extract the legalizer tests for these opcodes into another file. Add tests for the instruction selector. llvm-svn: 349142	2018-12-14 11:58:14 +00:00
Craig Topper	257ce3871e	[DAGCombiner][X86] Prevent visitSIGN_EXTEND from returning N when (sext (setcc)) already has the target desired type for the setcc Summary: If the setcc already has the target desired type we can reach the getSetCC/getSExtOrTrunc after the MatchingVecType check with the exact same types as the nodes we started with. This causes those causes VsetCC to be CSEd to N0 and the getSExtOrTrunc will CSE to N. When we return N, the caller will think that meant we called CombineTo and did our own worklist management. But that's not what happened. This prevents target hooks from being called for the node. To fix this, I've now returned SDValue if the setcc is already the desired type. But to avoid some regressions in X86 I've had to disable one of the target combines that wasn't being reached before in the case of a (sext (setcc)). If we get vector widening legalization enabled that entire function will be deleted anyway so hopefully this is only for the short term. Reviewers: RKSimon, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D55459 llvm-svn: 349137	2018-12-14 08:28:24 +00:00
Fangrui Song	d2ed5be815	[Object] Rename getRelrRelocationType to getRelativeRelocationType Summary: The two utility functions were added in D47919 to support SHT_RELR. However, these are just relative relocations types and are't necessarily be named Relr. Reviewers: phosek, dberris Reviewed By: dberris Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D55691 llvm-svn: 349133	2018-12-14 07:46:58 +00:00
Petr Hosek	0c02306dbd	[llvm-xray] Use correct variable name This fixes the compiler error introduced in r349129. llvm-svn: 349130	2018-12-14 06:06:19 +00:00
Petr Hosek	27e2f2014a	[llvm-xray] Store offset pointers in temporaries DataExtractor::getU64 modifies the OffsetPtr which also pass to RelocateOrElse which breaks on Windows. This addresses the issue introduced in r349120. Differential Revision: https://reviews.llvm.org/D55689 llvm-svn: 349129	2018-12-14 05:56:20 +00:00
Petr Hosek	493a082483	[llvm-xray] Support for PIE When the instrumented binary is linked as PIE, we need to apply the relative relocations to sleds. This is handled by the dynamic linker at runtime, but when processing the file we have to do it ourselves. Differential Revision: https://reviews.llvm.org/D55542 llvm-svn: 349120	2018-12-14 01:37:56 +00:00
Alex Lorenz	afa75d7843	[macho] save the SDK version stored in module metadata into the version min and build version load commands in the object file This commit introduces a new metadata node called "SDK Version". It will be set by the frontend to mark the platform SDK (macOS/iOS/etc) version which was used during that particular compilation. This node is used when machine code is emitted, by either saving the SDK version into the appropriate macho load command (version min/build version), or by emitting the assembly for these load commands with the SDK version specified as well. The assembly for both load commands is extended by allowing it to contain the sdk_version X, Y [, Z] trailing directive to represent the SDK version respectively. rdar://45774000 Differential Revision: https://reviews.llvm.org/D55612 llvm-svn: 349119	2018-12-14 01:14:10 +00:00
Sanjay Patel	093ab45d4c	[DAGCombiner] clean up visitEXTRACT_VECTOR_ELT This isn't quite NFC, but I don't know how to expose any outward diffs from these changes. Mostly, this was confusing because it used 'VT' to refer to the operand type rather the usual type of the input node. There's also a large block at the end that is dedicated solely to matching loads, but that wasn't obvious. This could probably be split up into separate functions to make it easier to see. It's still not clear to me when we make certain transforms because the legality and constant conditions are intertwined in a way that might be improved. llvm-svn: 349095	2018-12-14 00:09:08 +00:00
Craig Topper	178abc59ac	[X86] Demote EmitTest to a helper function of EmitCmp. Route all callers except EmitCmp through EmitCmp. This requires the two callers to manifest a 0 to make EmitCmp call EmitTest. I'm looking into changing how we combine TEST and flag setting instructions to not be part of lowering. And instead be part of DAG combine or isel. Which will mean EmitTest will probably become gutted and maybe disappear entirely. llvm-svn: 349094	2018-12-13 23:55:30 +00:00
Evgeniy Stepanov	eb238ecf0f	Revert "[hwasan] Android: Switch from TLS_SLOT_TSAN(8) to TLS_SLOT_SANITIZER(6)" Breaks sanitizer-android buildbot. This reverts commit af8443a984c3b491c9ca2996b8d126ea31e5ecbe. llvm-svn: 349092	2018-12-13 23:47:50 +00:00
Evandro Menezes	6fe51ac973	[AArch64] Fix Exynos predicates (NFC) Fix the logic in the definition of the `ExynosShiftExPred` as a more specific version of `ExynosShiftPred`. But, since `ExynosShiftExPred` is not used yet, this change has NFC. llvm-svn: 349091	2018-12-13 23:19:46 +00:00
Wei Mi	66c6c5abea	[SampleFDO] handle ProfileSampleAccurate when initializing function entry count ProfileSampleAccurate is used to indicate the profile has exact match to the code to be optimized. Previously ProfileSampleAccurate is handled in ProfileSummaryInfo::isColdCallSite and ProfileSummaryInfo::isColdBlock. A better solution is to initialize function entry count to 0 when ProfileSampleAccurate is true, so we don't have to handle ProfileSampleAccurate in multiple places. Differential Revision: https://reviews.llvm.org/D55660 llvm-svn: 349088	2018-12-13 21:51:42 +00:00
Aakanksha Patil	bc568766b2	Revert r348971: [AMDGPU] Support for "uniform-work-group-size" attribute This patch breaks RADV (and probably RadeonSI as well) llvm-svn: 349084	2018-12-13 21:23:12 +00:00
Matt Arsenault	934e534c47	AMDGPU/GlobalISel: Legalize/regbankselect block_addr llvm-svn: 349081	2018-12-13 20:34:15 +00:00
Nikita Popov	dc73a6edde	Reapply "[MemCpyOpt] memset->memcpy forwarding with undef tail" Currently memcpyopt optimizes cases like memset(a, byte, N); memcpy(b, a, M); to memset(a, byte, N); memset(b, byte, M); if M <= N. Often this allows further simplifications down the line, which drop the first memset entirely. This patch extends this optimization for the case where M > N, but we know that the bytes a[N..M] are undef due to alloca/lifetime.start. This situation arises relatively often for Rust code, because Rust does not initialize trailing structure padding and loves to insert redundant memcpys. This also fixes https://bugs.llvm.org/show_bug.cgi?id=39844. The previous version of this patch did not perform dependency checking properly: While the dependency is checked at the position of the memset, the used size must be that of the memcpy. Previously the size of the memset was used, which missed modification in the region MemSetSize..CopySize, resulting in miscompiles. The added tests cover variations of this issue. Differential Revision: https://reviews.llvm.org/D55120 llvm-svn: 349078	2018-12-13 20:04:27 +00:00
Easwaran Raman	5a7056fa03	[ThinLTO] Compute synthetic function entry count Summary: This patch computes the synthetic function entry count on the whole program callgraph (based on module summary) and writes the entry counts to the summary. After function importing, this count gets attached to the IR as metadata. Since it adds a new field to the summary, this bumps up the version. Reviewers: tejohnson Subscribers: mehdi_amini, inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D43521 llvm-svn: 349076	2018-12-13 19:54:27 +00:00
Mircea Trofin	41c729e78e	[llvm] Address base discriminator overflow in X86DiscriminateMemOps Summary: Macros are expanded on a single line. In case of large expansions, with sufficiently many instructions with memory operands (and when -fdebug-info-for-profiling is requested), we may be unable to generate new base discriminator values - new values overflow (base discriminators may not be larger than 2^12). This CL warns instead of asserting in such a case. A subsequent CL will add APIs to check for overflow before creating new debug info. See https://bugs.llvm.org/show_bug.cgi?id=39890 Reviewers: davidxl, wmi, gbedwell Reviewed By: davidxl Subscribers: aprantl, llvm-commits Differential Revision: https://reviews.llvm.org/D55643 llvm-svn: 349075	2018-12-13 19:40:59 +00:00
Jordan Rupprecht	4888c4aba5	[llvm-size][libobject] Add explicit "inTextSegment" methods similar to "isText" section methods to calculate size correctly. Summary: llvm-size uses "isText()" etc. which seem to indicate whether the section contains code-like things, not whether or not it will actually go in the text segment when in a fully linked executable. The unit test added (elf-sizes.test) shows some types of sections that cause discrepencies versus the GNU size tool. llvm-size is not correctly reporting sizes of things mapping to text/data segments, at least for ELF files. This fixes pr38723. Reviewers: echristo, Bigcheese, MaskRay Reviewed By: MaskRay Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54369 llvm-svn: 349074	2018-12-13 19:40:12 +00:00
Davide Italiano	9737096bb1	[LoopUtils] Use i32 instead of `void`. The actual type of the first argument of the @dbg intrinsic doesn't really matter as we're setting it to `undef`, but the bitcode reader is picky about `void` types. llvm-svn: 349069	2018-12-13 18:37:23 +00:00
Francis Visoiu Mistrih	91e69d8a92	[MachO][TLOF] Add support for local symbols in the indirect symbol table On 32-bit archs, before, we would assume that an indirect symbol will never have local linkage. This can lead to miscompiles where the symbol's value would be 0 and the linker would use that value, because the indirect symbol table would contain the value `INDIRECT_SYMBOL_LOCAL` for that specific symbol. Differential Revision: https://reviews.llvm.org/D55573 llvm-svn: 349060	2018-12-13 17:23:30 +00:00
Sanjay Patel	791ae69afe	[DAGCombiner] after simplifying demanded elements of vector operand of extract, revisit the extract; 2nd try This is a retry of rL349051 (reverted at rL349056). I changed the check for dead-ness from number of uses to an opcode test for DELETED_NODE based on existing similar code. Differential Revision: https://reviews.llvm.org/D55655 llvm-svn: 349058	2018-12-13 17:05:01 +00:00
Simon Pilgrim	b5aaa673c6	[X86][SSE] Add SSE vector imm/var shift support to SimplifyDemandedVectorEltsForTargetNode llvm-svn: 349057	2018-12-13 16:39:29 +00:00
Sanjay Patel	c56f5728ee	revert rL349051: [DAGCombiner] after simplifying demanded elements of vector operand of extract, revisit the extract This causes an address sanitizer bot failure: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/27187/steps/check-llvm%20asan/logs/stdio llvm-svn: 349056	2018-12-13 16:32:44 +00:00
Simon Pilgrim	b0b2f1503a	[X86][SSE] Fix all remaining modulo vector rotation amounts (PR38243) There's still a couple of minor SimplifyDemandedElts regressions in some of the shift amount splats that will be fixed in future patches. llvm-svn: 349052	2018-12-13 15:50:31 +00:00
Sanjay Patel	a7b115b392	[DAGCombiner] after simplifying demanded elements of vector operand of extract, revisit the extract Differential Revision: https://reviews.llvm.org/D55655 llvm-svn: 349051	2018-12-13 15:44:26 +00:00
Daniel Cederman	77611426e1	[Sparc] Add membar assembler tags Summary: The Sparc V9 membar instruction can enforce different types of memory orderings depending on the value in its immediate field. In the architectural manual the type is selected by combining different assembler tags into a mask. This patch adds support for these tags. Reviewers: jyknight, venkatra, brad Reviewed By: jyknight Subscribers: fedor.sergeev, jrtc27, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D53491 llvm-svn: 349048	2018-12-13 15:29:12 +00:00
Simon Pilgrim	ba91ff4a86	[X86][SSE] Fix modulo rotation amounts for v8i16/v16i16/v4i32 (PR38243) llvm-svn: 349047	2018-12-13 15:23:09 +00:00
Daniel Cederman	b5d284408e	[Sparc] Use float register for integer constrained with "f" in inline asm Summary: Constraining an integer value to a floating point register using "f" causes an llvm_unreachable to trigger. This patch allows i32 integers to be placed in a single precision float register and i64 integers to be placed in a double precision float register. This matches the behavior of GCC. For other types the llvm_unreachable is removed to instead trigger an error message that points out the offending line. Reviewers: jyknight, venkatra Reviewed By: jyknight Subscribers: eraman, fedor.sergeev, jrtc27, llvm-commits Differential Revision: https://reviews.llvm.org/D51614 llvm-svn: 349045	2018-12-13 15:13:29 +00:00
Jinsong Ji	c7b43b94ce	[PowerPC][NFC] Sorting out Pseudo related classes to avoid confusion There are several Pseudo in PowerPC backend. eg: * ISel Pseudo-instructions , which has let usesCustomInserter=1 in td ExpandISelPseudos -> EmitInstrWithCustomInserter will deal with them. * Post-RA pseudo instruction, which has let isPseudo = 1 in td, or Standard pseudo (SUBREG_TO_REG,COPY etc.) ExpandPostRAPseudos -> expandPostRAPseudo will expand them * Multi-instruction pseudo operations will expand them PPCAsmPrinter::EmitInstruction * Pseudo instruction in CodeEmitter, which has encoding of 0. Currently, in td files, especially PPCInstrVSX.td, we did not distinguish Post-RA pseudo instruction and Pseudo instruction in CodeEmitter very clearly. This patch is to * Rename Pseudo<> class to PPCEmitTimePseudo, which means encoding of 0 in CodeEmitter * Introduce new class PPCPostRAExpPseudo <> for previous PostRA Pseudo * Introduce new class PPCCustomInserterPseudo <> for previous Isel Pseudo Differential Revision: https://reviews.llvm.org/D55143 llvm-svn: 349044	2018-12-13 15:12:57 +00:00
Daniel Sanders	b51480ff3e	[mir] Fix uninitialized variable in r349035 noticed by clang-atom-d525-fedora-rel and 3 other bots llvm-svn: 349043	2018-12-13 15:05:27 +00:00
Simon Pilgrim	7c84f7ae3a	[X86][SSE] Merge the vXi16/vXi32 vector rotation expansion cases. NFCI. Merged the repeated code into a single if(). llvm-svn: 349040	2018-12-13 14:51:28 +00:00
Jonas Paulsson	e79b1b986d	[SystemZ] Pass copy-hinted regs first from getRegAllocationHints(). When computing register allocation hints for a GRX32Bit register, make sure that any of the hinted registers that are also copy hints are returned first in the list. Review: Ulrich Weigand. llvm-svn: 349037	2018-12-13 14:37:05 +00:00
Daniel Sanders	9f3cf55e63	[mir] Serialize DILocation inline when not possible to use a metadata reference Summary: Sometimes MIR-level passes create DILocations that were not present in the LLVM-IR. For example, it may merge two DILocations together to produce a DILocation that points to line 0. Previously, the address of these DILocations were printed which prevented the MIR from being read back into LLVM. With this patch, DILocations will use metadata references where possible and fall back on serializing them inline like so: MOV32mr %stack.0.x.addr, 1, _, 0, _, %0, debug-location !DILocation(line: 1, scope: !15) Reviewers: aprantl, vsk, arphaman Reviewed By: aprantl Subscribers: probinson, llvm-commits Tags: #debug-info Differential Revision: https://reviews.llvm.org/D55243 llvm-svn: 349035	2018-12-13 14:25:27 +00:00
Simon Pilgrim	320fd7383f	[X86][BWI] Don't custom lower vXi8 rotations. We always expand to shifts anyhow - test changes are just different scheduling only. llvm-svn: 349034	2018-12-13 13:44:33 +00:00
Chen Zheng	9c6fa536e0	[PowerPC] intrinsic llvm.eh.sjlj.setjmp should not have flag isBarrier. Differential Revision: https://reviews.llvm.org/D55499 llvm-svn: 349029	2018-12-13 12:25:20 +00:00
Simon Pilgrim	ab973a45b9	[DAGCombine] Moved X86 rotate_amount % bitwidth == 0 early out to DAGCombiner Remove common code from custom lowering (code is still safe if somehow a zero value gets used). llvm-svn: 349028	2018-12-13 12:23:32 +00:00
Diana Picus	99cd644b6c	[ARM GlobalISel] Support exts and truncs for Thumb2 Mark G_SEXT, G_ZEXT and G_ANYEXT to 32 bits as legal and add support for them in the instruction selector. This uses handwritten code again because the patterns that are generated with TableGen are tuned for what the DAG combiner would produce and not for simple sext/zext nodes. Luckily, we only need to update the opcodes to use the Thumb2 variants, everything else can be reused from ARM. llvm-svn: 349026	2018-12-13 12:06:54 +00:00
Simon Pilgrim	77fc551d1a	[TargetLowering] Add ISD::ROTL/ROTR vector expansion Move existing rotation expansion code into TargetLowering and set it up for vectors as well. Ideally this would share more of the funnel shift expansion, but we handle the shift amount modulo quite differently at the moment. Begun removing x86 vector rotate custom lowering to use the expansion. llvm-svn: 349025	2018-12-13 11:20:48 +00:00
Alex Bradbury	919f5fb8ca	[RISCV] Add support for the various RISC-V FMA instruction variants Adds support for the various RISC-V FMA instructions (fmadd, fmsub, fnmsub, fnmadd). The criteria for choosing whether a fused add or subtract is used, as well as whether the product is negated or not, is whether some of the arguments to the llvm.fma.* intrinsic are negated or not. In the tests, extraneous fadd instructions were added to avoid the negation being performed using a xor trick, which prevented the proper FMA forms from being selected and thus tested. The FMA instruction patterns might seem incorrect (e.g., fnmadd: -rs1 * rs2 - rs3), but they should be correct. The misleading names were inherited from MIPS, where the negation happens after computing the sum. The llvm.fmuladd.* intrinsics still do not generate RISC-V FMA instructions, as that depends on TargetLowering::isFMAFasterthanFMulAndFAdd. Some comments in the test files about what type of instructions are there tested were updated, to better reflect the current content of those test files. Differential Revision: https://reviews.llvm.org/D54205 Patch by Luís Marques. llvm-svn: 349023	2018-12-13 10:49:05 +00:00
Arnaud A. de Grandmaison	dfe861087d	[AArch64] Catch some more CMN opportunities. Fixes https://bugs.llvm.org/show_bug.cgi?id=33486 llvm-svn: 349022	2018-12-13 10:31:32 +00:00
Clement Courbet	76f4ae1092	[CodeGen] Allow mempcy/memset to generate small overlapping stores. Summary: All targets either just return false here or properly model `Fast`, so I don't think there is any reason to prevent CodeGen from doing the right thing here. Subscribers: nemanjai, javed.absar, eraman, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D55365 llvm-svn: 349016	2018-12-13 09:56:19 +00:00
Vitaly Buka	a257639a69	[asan] Don't check ODR violations for particular types of globals Summary: private and internal: should not trigger ODR at all. unnamed_addr: current ODR checking approach fail and rereport false violation if a linker merges such globals linkonce_odr, weak_odr: could cause similar problems and they are already not instrumented for ELF. Reviewers: eugenis, kcc Subscribers: kubamracek, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D55621 llvm-svn: 349015	2018-12-13 09:47:39 +00:00
Matt Arsenault	577b9fc543	AMDGPU/GlobalISel: Legalize f64 fadd/fmul llvm-svn: 349014	2018-12-13 08:27:48 +00:00
Matt Arsenault	f38f483bef	AMDGPU/GlobalISel: RegBankSelect some simple operations llvm-svn: 349012	2018-12-13 08:23:51 +00:00
Craig Topper	a048d58de7	[X86] Remove assert leftover from when i1 was a legal type. Add more accurate assert. NFC llvm-svn: 349007	2018-12-13 06:14:25 +00:00
Stanislav Mekhanoshin	d933c2ced7	[AMDGPU] Fix build failure, second attempt Some compilers complain that variable is captured and some complain when it is not. Switch to [&]. llvm-svn: 349006	2018-12-13 05:52:11 +00:00
Stanislav Mekhanoshin	5225746e03	[AMDGPU] Fix build failure Fixed error 'lambda capture 'CondReg' is not required to be captured for this use'. llvm-svn: 349005	2018-12-13 05:21:25 +00:00
Stanislav Mekhanoshin	6071e1aa58	[AMDGPU] Simplify negated condition Optimize sequence: %sel = V_CNDMASK_B32_e64 0, 1, %cc %cmp = V_CMP_NE_U32 1, %1 $vcc = S_AND_B64 $exec, %cmp S_CBRANCH_VCC[N]Z => $vcc = S_ANDN2_B64 $exec, %cc S_CBRANCH_VCC[N]Z It is the negation pattern inserted by DAGCombiner::visitBRCOND() in the rebuildSetCC(). Differential Revision: https://reviews.llvm.org/D55402 llvm-svn: 349003	2018-12-13 03:17:40 +00:00
David L. Jones	54c01ad6a9	Revert r348645 - "[MemCpyOpt] memset->memcpy forwarding with undef tail" This revision caused trucated memsets for structs with padding. See: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181210/610520.html llvm-svn: 349002	2018-12-13 03:15:11 +00:00
Davide Italiano	8ee59ca653	[LoopUtils] Prefer a set over a map. NFCI. llvm-svn: 348999	2018-12-13 01:11:52 +00:00
Shoaib Meenai	96929fdd42	[Support] Fix FileNameLength passed to SetFileInformationByHandle The rename_internal function used for Windows has a minor bug where the filename length is passed as a character count instead of a byte count. Windows internally ignores this field, but other tools that hook NT api's may use the documented behavior: MSDN documentation specifying the size should be in bytes: https://docs.microsoft.com/en-us/windows/desktop/api/winbase/ns-winbase-_file_rename_info Patch by Ben Hillis. Differential Revision: https://reviews.llvm.org/D55624 llvm-svn: 348995	2018-12-13 00:08:25 +00:00
Daniel Sanders	d001e0e0f4	[globalisel] Add GISelChangeObserver::changingInstr() Summary: In addition to knowing that an instruction is changed. It's also useful to know when it's about to change. For example, it might print the instruction so you can track the changes in a debug log, it might remove it from some queue while it's being worked on, or it might want to change several instructions as a single transaction and act on all the changes at once. Added changingInstr() to all existing uses of changedInstr() Reviewers: aditya_nandakumar Reviewed By: aditya_nandakumar Subscribers: rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D55623 llvm-svn: 348992	2018-12-12 23:48:13 +00:00
Sam Clegg	03801256d8	[WebAssembly] Update dylink section parsing This updates the format of the dylink section in accordance with recent "spec" change: https://github.com/WebAssembly/tool-conventions/pull/77 Differential Revision: https://reviews.llvm.org/D55609 llvm-svn: 348989	2018-12-12 23:40:58 +00:00
Davide Italiano	744c3c327f	[LoopDeletion] Update debug values after loop deletion. When loops are deleted, we don't keep track of variables modified inside the loops, so the DI will contain the wrong value for these. e.g. int b() { int i; for (i = 0; i < 2; i++) ; patatino(); return a; -> 6 patatino(); 7 return a; 8 } 9 int main() { b(); } (lldb) frame var i (int) i = 0 We mark instead these values as unavailable inserting a @llvm.dbg.value(undef to make sure we don't end up printing an incorrect value in the debugger. We could consider doing something fancier, for, e.g. constants, in the future. PR39868. rdar://problem/46418795) Differential Revision: https://reviews.llvm.org/D55299 llvm-svn: 348988	2018-12-12 23:32:35 +00:00
Nikita Popov	36e03ac6ee	[InstCombine] Fix negative GEP offset evaluation for 32-bit pointers This fixes https://bugs.llvm.org/show_bug.cgi?id=39908. The evaluateGEPOffsetExpression() function simplifies GEP offsets for use in comparisons against zero, basically by converting XScale+Offset==0 to X+Offset/Scale==0 if Scale divides Offset. However, before this is done, Offset is masked down to the pointer size. This results in incorrect results for negative Offsets, because we basically end up dividing the 32-bit offset zero* extended to 64-bit bits (rather than sign extended). Fix this by explicitly sign extending the truncated value. Differential Revision: https://reviews.llvm.org/D55449 llvm-svn: 348987	2018-12-12 23:19:03 +00:00
Ryan Prichard	e028c818f5	[hwasan] Android: Switch from TLS_SLOT_TSAN(8) to TLS_SLOT_SANITIZER(6) Summary: The change is needed to support ELF TLS in Android. See D55581 for the same change in compiler-rt. Reviewers: srhines, eugenis Reviewed By: eugenis Subscribers: srhines, llvm-commits Differential Revision: https://reviews.llvm.org/D55592 llvm-svn: 348983	2018-12-12 22:45:06 +00:00
Daniel Sanders	91dfdd5734	[globalisel] Rename GISelChangeObserver's erasedInstr() to erasingInstr() and related nits. NFC Summary: There's little of interest that can be done to an already-erased instruction. You can't inspect it, write it to a debug log, etc. It ought to be notification that we're about to erase it. Rename the function to clarify the timing of the event and reflect current usage. Also fixed one case where we were trying to print an erased instruction. Reviewers: aditya_nandakumar Reviewed By: aditya_nandakumar Subscribers: rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D55611 llvm-svn: 348976	2018-12-12 21:32:01 +00:00
Craig Topper	d1c61861dd	[X86] Don't emit MULX by default with BMI2 MULX has somewhat improved register allocation constraints compared to the legacy MUL instruction. Both output registers are encoded instead of fixed to EAX/EDX, but EDX is used as input. It also doesn't touch flags. Unfortunately, the encoding is longer. Prefering it whenever BMI2 is enabled is probably not optimal. Choosing it should somehow be a function of register allocation constraints like converting adds to three address. gcc and icc definitely don't pick MULX by default. Not sure what if any rules they have for using it. Differential Revision: https://reviews.llvm.org/D55565 llvm-svn: 348975	2018-12-12 21:21:31 +00:00
Aakanksha Patil	729309cc89	[AMDGPU] Support for "uniform-work-group-size" attribute Updated the annotate-kernel-features pass to support the propagation of uniform-work-group attribute from the kernel to the called functions. Once this pass is run, all kernels, even the ones which initially did not have the attribute, will be able to indicate weather or not they have uniform work group size depending on the value of the attribute. Differential Revision: https://reviews.llvm.org/D50200 llvm-svn: 348971	2018-12-12 20:49:17 +00:00
David Blaikie	023674a9e4	DebugInfo/DWARF: Pretty print subroutine types Doesn't handle varargs and other fun things, but it's a start. (also doesn't print these strictly as valid C++ when it's a pointer to function, it'll print as "void(int)" instead of "void ()(int)") llvm-svn: 348965	2018-12-12 19:53:03 +00:00
Scott Linder	f5b36e56fb	[AMDGPU] Emit MessagePack HSA Metadata for v3 code object Continue to present HSA metadata as YAML in ASM and when output by tools (e.g. llvm-readobj), but encode it in Messagepack in the code object. Differential Revision: https://reviews.llvm.org/D48179 llvm-svn: 348963	2018-12-12 19:39:27 +00:00
David Blaikie	3f8f004daf	DebugInfo/DWARF: Improve dumping of pointers to members ('int foo::' rather than 'int') llvm-svn: 348962	2018-12-12 19:34:02 +00:00
David Blaikie	815cffaad8	DebugInfo/DWARF: Refactor type dumping to dump types, rather than DIEs that reference types This lays the foundation for dumping types not referenced by DW_AT_type attributes (in the near-term, that'll be DW_AT_containing_type for a DW_TAG_ptr_to_member_type - in the future, potentially dumping the pretty printed name next to the DW_TAG for the type, rather than only when the type is referenced from elsewhere) llvm-svn: 348961	2018-12-12 19:33:08 +00:00
David Blaikie	92b5493a14	DebugInfo/DWARF: Refactor getAttributeValueAsReferencedDie to accept a DWARFFormValue Save searching for the attribute again when you already have the DWARFFormValue at hand. llvm-svn: 348960	2018-12-12 19:23:55 +00:00
Craig Topper	4937adf75f	[X86] Emit SBB instead of SETCC_CARRY from LowerSELECT. Break false dependency on the SBB input. I'm hoping we can just replace SETCC_CARRY with SBB. This is another step towards that. I've explicitly used zero as the input to the setcc to avoid a false dependency that we've had with the SETCC_CARRY. I changed one of the patterns that used NEG to instead use an explicit compare with 0 on the LHS. We needed the zero anyway to avoid the false dependency. The negate would clobber its input register. By using a CMP we can avoid that which could be useful. Differential Revision: https://reviews.llvm.org/D55414 llvm-svn: 348959	2018-12-12 19:20:21 +00:00
Florian Hahn	81a22d32f7	[ConstantFold] Use getMinSignedBits for APInt in isIndexInRangeOfArrayType. Indices for getelementptr can be signed so we should use getMinSignedBits instead of getActiveBits here. The function later calls getSExtValue to get the int64_t value, which also checks getMinSignedBits. This fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=11647. Reviewers: mssimpso, efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D55536 llvm-svn: 348957	2018-12-12 18:55:14 +00:00
David Blaikie	73066d60f1	llvm-dwarfdump: Dump array dimensions in stringified type names llvm-svn: 348954	2018-12-12 18:46:25 +00:00
Simon Pilgrim	eb508f8ccb	[SelectionDAG] Add a generic isSplatValue function This patch introduces a generic function to determine whether a given vector type is known to be a splat value for the specified demanded elements, recursing up the DAG looking for BUILD_VECTOR or VECTOR_SHUFFLE splat patterns. It also keeps track of the elements that are known to be UNDEF - it returns true if all the demanded elements are UNDEF (as this may be useful under some circumstances), so this needs to be handled by the caller. A wrapper variant is also provided that doesn't take the DemandedElts or UndefElts arguments for cases where we just want to know if the SDValue is a splat or not (with/without UNDEFS). I had hoped to completely remove the X86 local version of this function, but I'm seeing some regressions in shift/rotate codegen that will take a little longer to fix and I hope to get this in sooner so I can continue work on PR38243 which needs more capable splat detection. Differential Revision: https://reviews.llvm.org/D55426 llvm-svn: 348953	2018-12-12 18:32:29 +00:00
Artem Belevich	f802b9324a	[NVPTX] do not rely on cached subtarget info. If a module has function references, but no functions themselves, we may end up never calling runOnMachineFunction and therefore would never initialize nvptxSubtarget field which would eventually cause a crash. Instead of relying on nvptxSubtarget being initialized by one of the methods, retrieve subtarget info directly. Differential Revision: https://reviews.llvm.org/D55580 llvm-svn: 348952	2018-12-12 18:31:04 +00:00
Sanjay Patel	44eaa492b8	[x86] allow 8-bit adds to be promoted by convertToThreeAddress() to form LEA This extends the code that handles 16-bit add promotion to form LEA to also allow 8-bit adds. That allows us to combine add ops with register moves and save some instructions. This is another step towards allowing add truncation in generic DAGCombiner (see D54640). Differential Revision: https://reviews.llvm.org/D55494 llvm-svn: 348946	2018-12-12 17:58:27 +00:00
Michael Kruse	7244852557	[Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes. When multiple loop transformation are defined in a loop's metadata, their order of execution is defined by the order of their respective passes in the pass pipeline. For instance, e.g. #pragma clang loop unroll_and_jam(enable) #pragma clang loop distribute(enable) is the same as #pragma clang loop distribute(enable) #pragma clang loop unroll_and_jam(enable) and will try to loop-distribute before Unroll-And-Jam because the LoopDistribute pass is scheduled after UnrollAndJam pass. UnrollAndJamPass only supports one inner loop, i.e. it will necessarily fail after loop distribution. It is not possible to specify another execution order. Also,t the order of passes in the pipeline is subject to change between versions of LLVM, optimization options and which pass manager is used. This patch adds 'followup' attributes to various loop transformation passes. These attributes define which attributes the resulting loop of a transformation should have. For instance, !0 = !{!0, !1, !2} !1 = !{!"llvm.loop.unroll_and_jam.enable"} !2 = !{!"llvm.loop.unroll_and_jam.followup_inner", !3} !3 = !{!"llvm.loop.distribute.enable"} defines a loop ID (!0) to be unrolled-and-jammed (!1) and then the attribute !3 to be added to the jammed inner loop, which contains the instruction to distribute the inner loop. Currently, in both pass managers, pass execution is in a fixed order and UnrollAndJamPass will not execute again after LoopDistribute. We hope to fix this in the future by allowing pass managers to run passes until a fixpoint is reached, use Polly to perform these transformations, or add a loop transformation pass which takes the order issue into account. For mandatory/forced transformations (e.g. by having been declared by #pragma omp simd), the user must be notified when a transformation could not be performed. It is not possible that the responsible pass emits such a warning because the transformation might be 'hidden' in a followup attribute when it is executed, or it is not present in the pipeline at all. For this reason, this patche introduces a WarnMissedTransformations pass, to warn about orphaned transformations. Since this changes the user-visible diagnostic message when a transformation is applied, two test cases in the clang repository need to be updated. To ensure that no other transformation is executed before the intended one, the attribute `llvm.loop.disable_nonforced` can be added which should disable transformation heuristics before the intended transformation is applied. E.g. it would be surprising if a loop is distributed before a #pragma unroll_and_jam is applied. With more supported code transformations (loop fusion, interchange, stripmining, offloading, etc.), transformations can be used as building blocks for more complex transformations (e.g. stripmining+stripmining+interchange -> tiling). Reviewed By: hfinkel, dmgreen Differential Revision: https://reviews.llvm.org/D49281 Differential Revision: https://reviews.llvm.org/D55288 llvm-svn: 348944	2018-12-12 17:32:52 +00:00
Wei Mi	7da5a08e1a	[SampleFDO] Extend profile-sample-accurate option to cover isFunctionColdInCallGraph For SampleFDO, when a callsite doesn't appear in the profile, it will not be marked as cold callsite unless the option -profile-sample-accurate is specified. But profile-sample-accurate doesn't cover function isFunctionColdInCallGraph which is used to decide whether a function should be put into text.unlikely section, so even if the user knows the profile is accurate and specifies profile-sample-accurate, those functions not appearing in the sample profile are still not be put into text.unlikely section right now. The patch fixes that. Differential Revision: https://reviews.llvm.org/D55567 llvm-svn: 348940	2018-12-12 17:09:27 +00:00
Neil Henning	76504a4c5e	[AMDGPU] Extend the SI Load/Store optimizer to combine more things. I've extended the load/store optimizer to be able to produce dwordx3 loads and stores, This change allows many more load/stores to be combined, and results in much more optimal code for our hardware. Differential Revision: https://reviews.llvm.org/D54042 llvm-svn: 348937	2018-12-12 16:15:21 +00:00
Simon Atanasyan	fa020082e4	[mips] Enable using of integrated assembler in all cases. llvm-svn: 348934	2018-12-12 15:32:03 +00:00
Simon Pilgrim	f6c898e12f	[TargetLowering] Add ISD::AND handling to SimplifyDemandedVectorElts If either of the operand elements are zero then we know the result element is going to be zero (even if the other element is undef). Differential Revision: https://reviews.llvm.org/D55558 llvm-svn: 348926	2018-12-12 13:43:07 +00:00
Piotr Sobczak	3732b4ce25	[AMDGPU] Set metadata access for explicit section Summary: This patch provides a means to set Metadata section kind for a global variable, if its explicit section name is prefixed with ".AMDGPU.metadata." This could be useful to make the global variable go to an ELF section without any section flags set. Reviewers: dstuttard, tpr, kzhuravl, nhaehnle, t-tye Reviewed By: dstuttard, kzhuravl Subscribers: llvm-commits, arsenm, jvesely, wdng, yaxunl, t-tye Differential Revision: https://reviews.llvm.org/D55267 llvm-svn: 348922	2018-12-12 11:20:04 +00:00
Diana Picus	59720b422a	[ARM GlobalISel] Select load/store for Thumb2 Unfortunately we can't use TableGen for this because it doesn't yet support predicates on the source pattern root. Therefore, add a bit of handwritten code to the instruction selector to handle the most basic cases. Also mark them as legal and extract their legalizer test cases to a new test file. llvm-svn: 348920	2018-12-12 10:32:15 +00:00
Jonas Paulsson	896775c2d3	[SystemZ] Minor cleanup of SchedModels Some fixes of a few InstRWs for z13 and z14. Review: Ulrich Weigand llvm-svn: 348917	2018-12-12 08:26:24 +00:00
Mikael Holmen	c06b01cb22	Fix compiler warning about unused variable [NFC] llvm-svn: 348913	2018-12-12 06:33:45 +00:00
Leonard Chan	118e53fd63	[Intrinsic] Signed Fixed Point Multiplication Intrinsic Add an intrinsic that takes 2 signed integers with the scale of them provided as the third argument and performs fixed point multiplication on them. This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics. Differential Revision: https://reviews.llvm.org/D54719 llvm-svn: 348912	2018-12-12 06:29:14 +00:00
Craig Topper	1fe466689b	[X86] Combine vpmovdw+vpacksswb into vpmovdb. This is similar to the combine we already have for vpmovdw+vpackuswb. llvm-svn: 348910	2018-12-12 05:56:01 +00:00
Florian Hahn	cc419ad7df	[ConstantInt] Check active bits before calling getZExtValue. Without this check, we hit an assertion in getZExtValue, if the constant value does not fit into an uint64_t. As getZExtValue returns an uint64_t, should we update getAggregateElement to take an uin64_t as well? This fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=6109. Reviewers: efriedma, craig.topper, spatel Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D55547 llvm-svn: 348906	2018-12-12 02:22:12 +00:00
Nathan Lanza	893083ae5e	Implement IMAGE_REL_AMD64_SECREL for RuntimeDyldCOFFX86_64 lldb on Windows uses the ExecutionEngine for expression evaluation and hits the llvm_unreachable due to this relocation. Thus, implement the relocation and add a test to verify it's function. llvm-svn: 348904	2018-12-12 00:04:06 +00:00
Reid Kleckner	9571c806c5	[codeview] Look through typedefs in getCompleteTypeIndex Summary: Any time a symbol record, whether it's S_UDT, S_LOCAL, or S_[GL]DATA32, references a record type, it should use the complete type index, even if there's a typedef in the way. Fixes the compiler part of PR39853. Reviewers: zturner, aganea Subscribers: hiraditya, arphaman, llvm-commits Differential Revision: https://reviews.llvm.org/D55236 llvm-svn: 348902	2018-12-11 23:07:39 +00:00

... 10 11 12 13 14 ...

120049 Commits