llvm-project

Commit Graph

Author	SHA1	Message	Date
Hideto Ueno	cbab334e40	[Attributor] Deduce "noalias" attribute Summary: This patch adds very basic deduction for noalias. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Tags: LLVM Differential Revision: https://reviews.llvm.org/D66207 llvm-svn: 370295	2019-08-29 05:52:00 +00:00
Craig Topper	1ec5c204b8	[X86] Add a DAG combine to combine INSERTPS and VBROADCAST of a scalar load. Remove corresponding isel patterns. We had an isel pattern to perform this, but its better to do it in DAG combine as a simplification. This also fixes the lack of patterns for AVX512 targets. llvm-svn: 370294	2019-08-29 05:48:48 +00:00
Craig Topper	1aadf6f39f	[X86] Make inline assembly 'x' and 'v' constraints work for f128. Including a type legalizer fix to make bitcast operand promotion work correctly when getSoftenedFloat returns f128 instead of i128. Fixes PR43157 llvm-svn: 370293	2019-08-29 05:13:56 +00:00
Florian Hahn	3177b92231	[LoopUnroll] Use Lazy strategy for DTU used for MergeBlockIntoPredecessor. We do not access the DT in the loop, so we do not have to apply updates eagerly. We can apply them lazyly and flush them after we are done merging blocks. As follow-up work, we might be able to use the DTU above as well, instead of manually updating the DT. This brings the example from PR43134 from ~100s to ~4s for a relase + assertions build on my machine. Reviewers: efriedma, kuhar, asbirlea, brzycki Reviewed By: kuhar, brzycki Differential Revision: https://reviews.llvm.org/D66911 llvm-svn: 370292	2019-08-29 04:26:29 +00:00
Johannes Doerfert	bf112139ac	[Attributor] Improve messages in iteration verify mode When we now verify the iteration count we will see the actual count and the expected count before the assertion is triggered. llvm-svn: 370285	2019-08-29 01:29:44 +00:00
Johannes Doerfert	62a9c1da78	[Attributor][Fix] Indicate change correctly llvm-svn: 370283	2019-08-29 01:26:58 +00:00
Johannes Doerfert	1aac182f31	[Attributor] Fix typo llvm-svn: 370282	2019-08-29 01:26:09 +00:00
Matt Arsenault	216d8ff60b	AMDGPU: Don't use frame virtual registers SGPR spills aren't really handled after SILowerSGPRSpills. In order to directly control what happens if the scavenger needs to spill, the scavenger needs to be used directly. There is an alternative to spilling in these contexts anyway since the frame register can be increment and restored. This does present another possible issue if spilling is needed for the unused carry out if an add is needed. I think this can be avoided by using a scalar add (although that clobbers SCC, which happens anyway). llvm-svn: 370281	2019-08-29 01:13:47 +00:00
Matt Arsenault	8ec5c10042	GlobalISel/TableGen: Handle setcc patterns This is a special case because one node maps to two different G_ instructions, and the operand order is changed. This mostly enables G_FCMP for AMDPGPU. G_ICMP is still manually selected for now since it has the SALU and VALU complication to deal with. llvm-svn: 370280	2019-08-29 01:13:41 +00:00
Craig Topper	af364131af	[X86] Fix a couple isel patterns to not shrink a volatile load. Also add a FIXME because I'm not sure why these patterns exist. Looks like a missing combine. And another FIXME because the AVX512 equivalent one of the patterns is missing. llvm-svn: 370276	2019-08-28 23:45:10 +00:00
Shiva Chen	b39876d8cd	[RISCV] Avoid generating AssertZext for LP64 ABI when lowering floating LibCall The patch fixed the issue that RV64 didn't clear the upper bits when return complex floating value with lp64 ABI. float _Complex complex_add(float _Complex a, float _Complex b) { return a + b; } RealResult = zero_extend(RealA + RealB) ImageResult = ImageA + ImageB Return (RealResult \| (ImageResult << 32)) The patch introduces shouldExtendTypeInLibCall target hook to suppress the AssertZext generation when lowering floating LibCall. Thanks to Eli's comments from the Bugzilla https://bugs.llvm.org/show_bug.cgi?id=42820 Differential Revision: https://reviews.llvm.org/D65497 llvm-svn: 370275	2019-08-28 23:40:37 +00:00
Heejin Ahn	d85fd5a3f4	[WebAssembly] Add atomic.fence instruction Summary: This adds `atomic.fence` instruction: https://github.com/WebAssembly/threads/blob/master/proposals/threads/Overview.md#fence-operator And we now emit the new `atomic.fence` instruction for multithread fences, rather than the prevous `atomic.rmw` hack. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, jfb, tlively, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66794 llvm-svn: 370272	2019-08-28 23:13:43 +00:00
Simon Atanasyan	027f1da010	[mips] Add an empty line to separate different patterns. NFC llvm-svn: 370269	2019-08-28 22:32:16 +00:00
Simon Atanasyan	59bb3609fa	[mips] Fix 64-bit address loading in case of applying 32-bit mask to the result If result of 64-bit address loading combines with 32-bit mask, LLVM tries to optimize the code and remove "redundant" loading of upper 32-bits of the address. It leads to incorrect code on MIPS64 targets. MIPS backend creates the following chain of commands to load 64-bit address in the `MipsTargetLowering::getAddrNonPICSym64` method: ``` (add (shl (add (shl (add %highest(sym), %higher(sym)), 16), %hi(sym)), 16), %lo(%sym)) ``` If the mask presents, LLVM decides to optimize the chain of commands. It really does not make sense to load upper 32-bits because the 0x0fffffff mask anyway clears them. After removing redundant commands we get this chain: ``` (add (shl (%hi(sym), 16), %lo(%sym)) ``` There is no patterns matched `(MipsHi (i64 symbol))`. Due a bug in `SYM_32` predicate definition, backend incorrectly selects a pattern for a 32-bit symbols and uses the `lui` instruction for loading `%hi(sym)`. As a result we get incorrect set of instructions with unnecessary 16-bit left shifting: ``` lui at,0x0 R_MIPS_HI16 foo dsll at,at,0x10 daddiu at,at,0 R_MIPS_LO16 foo ``` This patch resolves two problems: - Fix `SYM_32/SYM_64` predicates to prevent selection of patterns dedicated to 32-bit symbols in case of using N64 ABI. - Add missed patterns for 64-bit symbols for `%hi/%lo`. Fix PR42736. Differential Revision: https://reviews.llvm.org/D66228 llvm-svn: 370268	2019-08-28 22:32:10 +00:00
Artur Pilipenko	925afc1ce7	Fix for "DICompileUnit not listed in llvm.dbg.cu" verification error after ... ...cloning a function from a different module Currently when a function with debug info is cloned from a different module, the cloned function may have hanging DICompileUnits, so that the module with the cloned function fails debug info verification. The proposed fix inserts all DICompileUnits reachable from the cloned function to "llvm.dbg.cu" metadata operands of the cloned function module. Reviewed By: aprantl, efriedma Differential Revision: https://reviews.llvm.org/D66510 Patch by Oleg Pliss (Oleg.Pliss@azul.com) llvm-svn: 370265	2019-08-28 21:27:50 +00:00
Julian Lettner	3ae9b9d5e4	[ASan] Make insertion of version mismatch guard configurable By default ASan calls a versioned function `__asan_version_mismatch_check_vXXX` from the ASan module constructor to check that the compiler ABI version and runtime ABI version are compatible. This ensures that we get a predictable linker error instead of hard-to-debug runtime errors. Sometimes, however, we want to skip this safety guard. This new command line option allows us to do just that. rdar://47891956 Reviewed By: kubamracek Differential Revision: https://reviews.llvm.org/D66826 llvm-svn: 370258	2019-08-28 20:40:55 +00:00
James Y Knight	f025968bcc	Ignore object files that lack coverage information. Before this change, if multiple binary files were presented, all of them must have been instrumented or the load would fail with coverage_map_error::no_data_found. Patch by Dean Sturtevant. Differential Revision: https://reviews.llvm.org/D66763 llvm-svn: 370257	2019-08-28 20:35:50 +00:00
Scott Linder	04f6f25421	[AMDGPU] Fix bug when calculating user_spgr_count for Code Object V3 assembler Stop counting explicitly disabled user_spgr's in the user_sgpr_count field of the kernel descriptor. Differential Revision: https://reviews.llvm.org/D66900 llvm-svn: 370250	2019-08-28 19:38:15 +00:00
Sanjay Patel	2d4b6777c4	[InstCombine] clean up wrap propagation for reassociated ops; NFCI Always true/false checks were flagged by static analysis; https://bugs.llvm.org/show_bug.cgi?id=43143 I have not confirmed the logic difference in propagating nsw vs. nuw, but presumably we would have noticed a bug by now if that was wrong. llvm-svn: 370248	2019-08-28 18:58:06 +00:00
Pirama Arumuga Nainar	19205abaaa	[ValueMapper] NFC: Remove dead code to pause metadata mapping Summary: This functionality was added when Mapper::mapMetadata was recursive. It is no longer needed after r265456, which switched it to be iterative. Reviewers: dexonsmith, srhines Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66860 llvm-svn: 370236	2019-08-28 17:43:14 +00:00
Johannes Doerfert	f7ca0fe1c8	[Attributor] Regularly clear dependences to remove spurious ones As dependences between abstract attributes can become stale, e.g., if one was sufficient to imply another one at some point but it has since been wakened to the point it is not usable for the formerly implied one. To weed out spurious dependences, and thereby eliminate unneeded updates, we introduce an option to determine how often the dependence cache is cleared and recomputed during the fixpoint iteration. Note that the initial value was determined such that we see a positive result on our tests. Differential Revision: https://reviews.llvm.org/D63315 llvm-svn: 370230	2019-08-28 16:58:52 +00:00
Kevin P. Neal	ddf13c00ed	[FPEnv] Add fptosi and fptoui constrained intrinsics. This implements constrained floating point intrinsics for FP to signed and unsigned integers. Quoting from D32319: The purpose of the constrained intrinsics is to force the optimizer to respect the restrictions that will be necessary to support things like the STDC FENV_ACCESS ON pragma without interfering with optimizations when these restrictions are not needed. Reviewed by: Andrew Kaylor, Craig Topper, Hal Finkel, Cameron McInally, Roman Lebedev, Kit Barton Approved by: Craig Topper Differential Revision: http://reviews.llvm.org/D63782 llvm-svn: 370228	2019-08-28 16:33:36 +00:00
Jessica Paquette	af0bd41e06	[AArch64][GlobalISel] Fall back when translating musttail calls These are currently translated as normal functions calls in AArch64. Until we have proper tail call lowering, we shouldn't translate these. Differential Revision: https://reviews.llvm.org/D66842 llvm-svn: 370225	2019-08-28 16:19:01 +00:00
Simon Pilgrim	1d8a886c59	Reduce scope of variable only used in a local pattern match. NFCI. llvm-svn: 370224	2019-08-28 16:06:33 +00:00
Craig Topper	f79d8a064c	[InstCombine] Disable recursion in foldGEPICmp for vector pointer GEPs Due to missing vector support in this function, recursion can generate worse code in some cases. llvm-svn: 370221	2019-08-28 15:40:34 +00:00
Simon Pilgrim	b569624049	Fix uninitialized variable warning in cppcheck. NFCI. InstCombiner::MaxArraySizeForCombine is set outside the constructor so we need to ensure it has a default initialization value. llvm-svn: 370220	2019-08-28 15:19:49 +00:00
David Bolvansky	af118bb6d0	[NFC] Added a comment to avoid possible confusion llvm-svn: 370217	2019-08-28 15:04:48 +00:00
Ryan Taylor	3b1459ed7c	[AMDGPU] Adjust number of SGPRs available in Calling Convention This reduces the number of SGPRs due to some concerns about running out of SGPRs if you make all the SGPRs that aren't reserved available for the calling convention. Change-Id: Idb4ca4dc72f5b6808cb524ff7270915a8de5b4c1 llvm-svn: 370215	2019-08-28 15:00:45 +00:00
Simon Pilgrim	316bfb0f48	Remove duplicate 'BitWidth' variable. NFCI. llvm-svn: 370212	2019-08-28 14:37:44 +00:00
Johannes Doerfert	07a5c129c6	[Attributor] Restrict liveness and return information to functions Summary: Until we have proper call-site information we should not recompute liveness and return information for each call site. This patch directly uses the function versions and introduces TODOs at the usage sites. The required iterations to get to the fixpoint are most of the time reduced by this change and we always avoid work duplication. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66562 llvm-svn: 370208	2019-08-28 14:09:14 +00:00
Simon Pilgrim	284118ce3b	InstCombiner::visitSelectInst - rename Pred to MinMaxPred to stop shadow variable warning. NFCI. We have a lot of Predicate variables, all similarly named.... llvm-svn: 370207	2019-08-28 14:05:38 +00:00
Vlad Tsyrklevich	b8a96f4bf5	Reland "[yaml2obj] - Don't allow setting StOther and Other/Visibility at the same time." This relands this commit, I mistakenly reverted the original change thinking it was the cause of the observed MSan failures but it was not. llvm-svn: 370206	2019-08-28 14:04:09 +00:00
Hans Wennborg	cff90f07cb	[SelectionDAG] Don't generate libcalls for wide shifts on Windows (PR42711) Neither libgcc or compiler-rt are usually used on Windows, so these functions can't be called. Differential revision: https://reviews.llvm.org/D66880 llvm-svn: 370204	2019-08-28 13:55:10 +00:00
Vlad Tsyrklevich	aba62e9c00	Revert "[yaml2obj] - Don't allow setting StOther and Other/Visibility at the same time." This reverts commit r370032, it was causing check-llvm failures on sanitizer-x86_64-linux-bootstrap-msan llvm-svn: 370198	2019-08-28 13:15:08 +00:00
Simon Pilgrim	14e07d7f4b	[DAGCombine] Fix cppcheck shadow variable warning. NFCI. We already have an outer Ops variable. llvm-svn: 370197	2019-08-28 12:48:41 +00:00
Simon Atanasyan	f46ba4f077	[mips] Use less registers to load address of TargetExternalSymbol There is no pattern matched `add hi, (MipsLo texternalsym)`. As a result, loading an address of 32-bit symbol requires two registers and one more additional instruction: ``` addiu $1, $zero, %lo(foo) lui $2, %hi(foo) addu $25, $2, $1 ``` This patch adds the missed pattern and enables generation more effective set of instructions: ``` lui $1, %hi(foo) addiu $25, $1, %lo(foo) ``` Differential Revision: https://reviews.llvm.org/D66771 llvm-svn: 370196	2019-08-28 12:35:53 +00:00
Amaury Sechet	4f4387dd12	[TargetLowering] Add buildLegalVectorShuffle facility to help build legal shuffles Summary: There are at least 2 ways to express the same shuffle. Various pieces of code explicit check for both option, but other places do not when they would benefit from doing it. This patches refactor the codebase to use buildLegalVectorShuffle in order to make that behavior more consistent. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66804 llvm-svn: 370190	2019-08-28 12:00:06 +00:00
Simon Pilgrim	c5b38e2869	[DAGCombine] Remove LoadedSlice::Cost default 'ForCodeSize' constructor arguments. NFCI. These were always being passed in and it allowed me to add the explicit tag to stop a cppcheck warning about 1 argument constructors. llvm-svn: 370189	2019-08-28 11:50:36 +00:00
David Green	379f6186dd	[ARM] Move MVEVPTBlockPass to a separate file. NFC This just pulls the MVEVPTBlockPass into a separate file, as opposed to being wrapped up in Thumb2ITBlockPass. Differential revision: https://reviews.llvm.org/D66579 llvm-svn: 370187	2019-08-28 11:37:31 +00:00
David Green	1c5b143c99	[MVE] VMOVX patterns This adds fp16 VMOVX patterns, using the same patterns as rL362482 with some adjustments for MVE. It allows us to move fp16 registers without going into and out of gprs. VMOVX is able to move the top bits from a fp16 in a fp reg into the bottom bits of another register, zeroing the rest. This can be used for odd MVE register lanes. The top bits are not read by fp16 instructions, so no move is required there if we are dealing with even lanes. Differential revision: https://reviews.llvm.org/D66793 llvm-svn: 370184	2019-08-28 10:13:23 +00:00
Hans Wennborg	0af82068a8	[LLVM-C] Fix ByVal Attribute crashing With the introduction of the typed byval attribute change there was no way that the LLVM-C API could create the correct class Attribute. If a program that uses the C API creates a ByVal attribute and annotates a function with that attribute LLVM will crash when it assembles or write that module containing the function out as bitcode. This change is a minimal fix to at least allow code to work, this is because the byval change is on the 9.0 and I don't want to introduce new LLVM-C API this late in the release cycle. By Jakob Bornecrantz! Differential revision: https://reviews.llvm.org/D66144 llvm-svn: 370176	2019-08-28 09:21:56 +00:00
Ayal Zaks	d15df0ede5	[LV] Fold tail by masking - handle reductions Allow vectorizing loops that have reductions when tail is folded by masking. A select is introduced in VPlan, choosing between the last value carried by the loop-exit/live-out instruction of the reduction, and the penultimate value carried by the reduction phi, according to the "i < n" mask of fold-tail. This select replaces the last value as the live-out value of the loop. Differential Revision: https://reviews.llvm.org/D66720 llvm-svn: 370173	2019-08-28 09:02:23 +00:00
Sam Parker	a761ba0f2d	[ARM][ParallelDSP] Change search for muls rL369567 reverted a couple of recent changes made to ARMParallelDSP because of a miscompilation error: PR43073. The issue stemmed from an underlying bug that was caused by adding muls into a reduction before it was proved that they could be executed in parallel with another mul. Most of the changes here are from the previously reverted commits. The additional changes have been made area: 1) The Search function now doesn't insert any muls into the Reduction object. That now happens once the search has successfully finished. 2) For any muls added into the reduction but that weren't paired, we accumulate their values as an input into the smlad. Differential Revision: https://reviews.llvm.org/D66660 llvm-svn: 370171	2019-08-28 08:51:13 +00:00
David Bolvansky	05bda8b4e5	Annotate return values of allocation functions with dereferenceable_or_null Summary: Example define dso_local noalias i8* @_Z6maixxnv() local_unnamed_addr #0 { entry: %call = tail call noalias dereferenceable_or_null(64) i8* @malloc(i64 64) #6 ret i8* %call } Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: aaron.ballman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66651 llvm-svn: 370168	2019-08-28 08:28:20 +00:00
Yi Kong	b9d87b9528	[llvm-objdump] Add the missing ARMv8 subarch detection Differential Revision: https://reviews.llvm.org/D66849 llvm-svn: 370163	2019-08-28 06:37:22 +00:00
Fangrui Song	6964027315	[LoopFusion] Fix another -Wunused-function in -DLLVM_ENABLE_ASSERTIONS=off build llvm-svn: 370156	2019-08-28 03:12:40 +00:00
Matt Arsenault	a8bbcbd006	AMDGPU/GlobalISel: Fix constraining scalar and/or/xor If the result register already had a register class assigned, the sources may not have been properly constrained. llvm-svn: 370150	2019-08-28 02:11:03 +00:00
Vlad Tsyrklevich	57076d3199	Revert "Change the X86 datalayout to add three address spaces for 32 bit signed," This reverts commit r370083 because it caused check-lld failures on sanitizer-x86_64-linux-fast. llvm-svn: 370142	2019-08-28 01:08:54 +00:00
Matt Arsenault	5c7e96dc26	AMDGPU/GlobalISel: Implement addrspacecast for 32-bit constant addrspace llvm-svn: 370140	2019-08-28 00:58:24 +00:00
Philip Reames	93a26ec98d	[NFC] Assert preconditions and merge all users into one codepath in Loads.cpp llvm-svn: 370128	2019-08-27 23:36:31 +00:00
Craig Topper	5bbb604bb5	[InstCombine] Disable some portions of foldGEPICmp for GEPs that return a vector of pointers. Fix other portions. llvm-svn: 370114	2019-08-27 21:38:56 +00:00
Luis Marques	c894c6c983	[RISCV] Implement RISCVRegisterInfo::getPointerRegClass Fixes bug 43041 Differential Revision: https://reviews.llvm.org/D66752 llvm-svn: 370113	2019-08-27 21:37:57 +00:00
Amara Emerson	e20b91c265	[GlobalISel] Replace hard coded dynamic alloca handling with G_DYN_STACKALLOC. This change moves the actual stack pointer manipulation into the legalizer, available to targets via lower(). The codegen is slightly different because we're using explicit masks instead of G_PTRMASK, and using G_SUB rather than adding a negative amount via G_GEP. Differential Revision: https://reviews.llvm.org/D66678 llvm-svn: 370104	2019-08-27 19:54:27 +00:00
Philip Reames	2694522f13	[Loads/SROA] Remove blatantly incorrect code and fix a bug revealed in the process The code we had isSafeToLoadUnconditionally was blatantly wrong. This function takes a "Size" argument which is supposed to describe the span loaded from. Instead, the code use the size of the pointer passed (which may be unrelated!) and only checks that span. For any Size > LoadSize, this can and does lead to miscompiles. Worse, the generic code just a few lines above correctly handles the cases which are valid. So, let's delete said code. Removing this code revealed two issues: 1) As noted by jdoerfert the removed code incorrectly handled external globals. The test update in SROA is to stop testing incorrect behavior. 2) SROA was confusing bytes and bits, but this wasn't obvious as the Size parameter was being essentially ignored anyway. Fixed. Differential Revision: https://reviews.llvm.org/D66778 llvm-svn: 370102	2019-08-27 19:34:43 +00:00
Matt Arsenault	2910184936	DAG: computeNumSignBits for MUL Copied directly from the IR version. Most of the testcases I've added for this are somewhat problematic because they really end up testing the yet to be implemented version for MUL_I24/MUL_U24. llvm-svn: 370099	2019-08-27 19:05:33 +00:00
Jason Liu	7c72e82b25	[XCOFF][AIX] Generate symbol table entries with llvm-readobj Summary: This patch implements main entry and auxiliary entries of symbol table generation for llvm-readobj on AIX. The source code of aix_xcoff_xlc_test8.o (compile with xlc) is: -bash-4.2$ cat test8.c extern int i; extern int TestforXcoff; extern int fun(int i); static int static_i; char* p="abcd"; int fun1(int j) { static_i++; j++; j=j+*p; return j; } int main() { i++; fun(i); return fun1(i); } Patch provided by DiggerLin Differential Revision: https://reviews.llvm.org/D65240 llvm-svn: 370097	2019-08-27 18:54:46 +00:00
Praveen Velliengiri	3b1b56d3fb	[ORCv2] - New Speculate Query Implementation Summary: This patch introduces, SequenceBBQuery - new heuristic to find likely next callable functions it tries to find the blocks with calls in order of execution sequence of Blocks. It still uses BlockFrequencyAnalysis to find high frequency blocks. For a handful of hottest blocks (plan to customize), the algorithm traverse and discovered the caller blocks along the way to Entry Basic Block and Exit Basic Block. It uses Block Hint, to stop traversing the already visited blocks in both direction. It implicitly assumes that once the block is visited during discovering entry or exit nodes, revisiting them again does not add much. It also branch probability info (cached result) to traverse only hot edges (planned to customize) from hot blocks. Without BPI, the algorithm mostly return's all the blocks in the CFG with calls. It also changes the heuristic queries, so they don't maintain states. Hence it is safe to call from multiple threads. It also implements, new instrumentation to avoid jumping into JIT on every call to the function with the help _orc_speculate.decision.block and _orc_speculate.block. "Speculator Registration Mechanism is also changed" - kudos to @lhames Open to review, mostly looking to change implementation of SequeceBBQuery heuristics with good data structure choices. Reviewers: lhames, dblaikie Reviewed By: lhames Subscribers: mgorny, hiraditya, mgrang, llvm-commits, lhames Tags: #speculative_compilation_in_orc, #llvm Differential Revision: https://reviews.llvm.org/D66399 llvm-svn: 370092	2019-08-27 18:23:36 +00:00
Andrea Di Biagio	2f51a43f8c	[Tblgen][MCA] Add the ability to mark groups as LoadQueue and StoreQueue. NFCI Before this patch, users were not allowed to optionally mark processor resource groups as load/store queues. That is because tablegen class MemoryQueue was originally declared as expecting a ProcResource template argument (instead of a more generic ProcResourceKind). That was an oversight, since the original intention from D54957 was to let user mark any processor resource as either load/store queue. This patch adds the ability to use processor resource groups in MemoryQueue definitions. This is not a user visible change. Differential Revision: https://reviews.llvm.org/D66810 llvm-svn: 370091	2019-08-27 18:20:34 +00:00
Matt Arsenault	ff07631b48	AMDGPU: Add amdgpu-32bit-address-high-bits to MIR serialization llvm-svn: 370089	2019-08-27 18:18:38 +00:00
Matt Arsenault	0c096da02f	AMDGPU: Fix crash from inconsistent register types for v3i16/v3f16 This is something of a workaround since computeRegisterProperties seems to be doing the wrong thing. llvm-svn: 370086	2019-08-27 17:51:56 +00:00
Amy Huang	1299945b81	Change the X86 datalayout to add three address spaces for 32 bit signed, 32 bit unsigned, and 64 bit pointers. llvm-svn: 370083	2019-08-27 17:46:53 +00:00
Craig Topper	fc1f08c2f2	[X86] Remove encoding information from the TAILJMP instructions that are lowered by MCInstLowering. Fix LowerPATCHABLE_TAIL_CALL to also convert them to regular JMP/JCC instructions There are 5 instructions here that are converted from TAILJMP opcodes to regular JMP/JCC opcodes during MCInstLowering. So normally there encoding information isn't used. The exception being when XRay wraps them in PATCHABLE_TAIL_CALL. For the ones that weren't already handled in MCInstLowering, add handling for those and remove their encoding information. This patch fixes PATCHABLE_TAIL_CALL to do the same opcode conversion as the regular lowering patch. Then removes the encoding information. Differential Revision: https://reviews.llvm.org/D66561 llvm-svn: 370079	2019-08-27 17:24:23 +00:00
Lang Hames	c48f1f6da6	[JITLink][ORC] Track eh-frame section size for registration/deregistration. On MachO, processing of the eh-frame section should stop if the end of the __eh_frame section is reached, regardless of whether or not there is a null CFI length field at the end of the section. This patch tracks the eh-frame section size and threads it through the appropriate APIs so that processing can be terminated correctly. No testcase yet: This patch is all API plumbing (rather than modification of linked memory) which the existing infrastructure does not provide a way of testing. Committing without a testcase until I have an idea of how to write one. llvm-svn: 370074	2019-08-27 15:50:32 +00:00
Lang Hames	70e158e09e	[JITLink] Don't under-align zero-fill sections. If content sections have lower alignment than zero-fill sections then bump the overall segment alignment to avoid under-aligning the zero-fill sections. llvm-svn: 370072	2019-08-27 15:22:23 +00:00
Sanjay Patel	b516f1afdd	[DAGCombiner] cancel fnegs from multiplied operands of FMA (-X) * (-Y) + Z --> X * Y + Z This is a missing optimization that shows up as a potential regression in D66050, so we should solve it first. We appear to be partly missing this fold in IR as well. We do handle the simpler case already: (-X) * (-Y) --> X * Y And it might be beneficial to make the constraint less conservative (eg, if both operands are cheap, but not necessarily cheaper), but that causes infinite looping for the existing fmul transform. Differential Revision: https://reviews.llvm.org/D66755 llvm-svn: 370071	2019-08-27 15:17:46 +00:00
Jason Liu	fc056950aa	Handle local commons for XCOFF object file writing Summary: Adds support for emitting common local global symbols to an XCOFF object file. Local commons are emitted into the .bss section with a storage class of C_HIDEXT. Patch by: daltenty Reviewers: sfertile, hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D66097 llvm-svn: 370070	2019-08-27 15:14:45 +00:00
Jinsong Ji	7f536bcf22	Revert "[CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks" This reverts commit `b3d258fc44`. @skatkov is reporting crash in D63972#1646303 Contacted @ZhangKang, and revert the commit on behalf of him. llvm-svn: 370069	2019-08-27 14:59:08 +00:00
Petar Avramovic	4a2a653288	[MIPS GlobalISel] ClampScalar G_SHL, G_ASHR and G_LSHR ClampScalar G_SHL, G_ASHR and G_LSHR to s32 for MIPS32. Differential Revision: https://reviews.llvm.org/D66533 llvm-svn: 370067	2019-08-27 14:41:44 +00:00
Petar Avramovic	a393238422	[GlobalISel] Factor narrowScalar for G_ASHR and G_LSHR. NFC Main difference is in the way Hi for Long shift (HiL) is made. G_LSHR fills HiL with zeros, while G_ASHR fills HiL with sign bit value. Differential Revision: https://reviews.llvm.org/D66589 llvm-svn: 370064	2019-08-27 14:33:05 +00:00
Petar Avramovic	d568ed40e0	[GlobalISel] Fix narrowScalar for shifts to match algorithm from SDAG Fix typos. Use Hi and Lo prefixes for Or instead of LHS and RHS to match names of surrounding variables. Differential Revision: https://reviews.llvm.org/D66587 llvm-svn: 370062	2019-08-27 14:22:32 +00:00
Amaury Sechet	f28dee2cff	[DAGCombiner] Add node to the worklist in topological order in parallelizeChainedStores Summary: As per title. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66659 llvm-svn: 370056	2019-08-27 13:27:57 +00:00
Simon Pilgrim	8912e2af39	[X86][AVX] Add SimplifyDemandedVectorElts support for KSHIFTL/KSHIFTR Differential Revision: https://reviews.llvm.org/D66527 llvm-svn: 370055	2019-08-27 13:13:17 +00:00
Cullen Rhodes	2ba5d64a80	[IntrinsicEmitter] Support scalable vectors in intrinsics Summary: This patch adds support for scalable vectors in intrinsics, enabling intrinsics such as the following to be defined: declare <vscale x 4 x i32> @llvm.something.nxv4i32(<vscale x 4 x i32>) Support for this is implemented by defining a new type descriptor for scalable vectors and adding mangling support for scalable vector types in the name mangling scheme used by 'any' types in intrinsic signatures. Tests have been added for IRBuilder to test scalable vectors work as expected when using intrinsics through this interface. This required implementing an intrinsic that is explicitly defined with scalable vectors, e.g. LLVMType<nxv4i32>, an SVE floating-point convert intrinsic was used for this. The behaviour of the overloaded type LLVMScalarOrSameVectorWidth with scalable vectors is tested using the existing masked load intrinsic. Also added an .ll test to test the Verifier catches a bad intrinsic argument when passing a fixed-width predicate (mask) to the masked.load intrinsic where a scalable is expected. Patch by Paul Walker Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D65930 llvm-svn: 370053	2019-08-27 12:57:09 +00:00
Pavel Labath	b1f29cec25	Add error handling to the DataExtractor class Summary: This is motivated by D63591, where we realized that there isn't a really good way of telling whether a DataExtractor is reading actual data, or is it just returning default values because it reached the end of the buffer. This patch resolves that by providing a new "Cursor" class. A Cursor object encapsulates two things: - the current position/offset in the DataExtractor - an error object Storing the error object inside the Cursor enables one to use the same pattern as the std::{io}stream API, where one can blindly perform a sequence of reads and only check for errors once at the end of the operation. Similarly to the stream API, as soon as we encounter one error, all of the subsequent operations are skipped (return default values) too, even if the would suceed with clear error state. Unlike the std::stream API (but in line with other llvm APIs), we force the error state to be checked through usage of llvm::Error. Reviewers: probinson, dblaikie, JDevlieghere, aprantl, echristo Subscribers: kristina, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63713 llvm-svn: 370042	2019-08-27 11:24:08 +00:00
Amaury Sechet	a1e5ef3fd4	[DAGCombiner] Add node to the worklist in topological order after relegalization. Summary: As per title. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66702 llvm-svn: 370040	2019-08-27 11:06:09 +00:00
David Bolvansky	0c2692108c	[InstCombine] Fold select with ctlz to cttz Summary: Handle pattern [0]: int ctz(unsigned int a) { int c = __clz(a & -a); return a ? 31 - c : c; } In reality, the compiler can generate much better code for cttz, so fold away this pattern. https://godbolt.org/z/c5kPtV [0] https://community.arm.com/community-help/f/discussions/2114/count-trailing-zeros Reviewers: spatel, nikic, lebedev.ri, dmgreen, hfinkel Reviewed By: hfinkel Subscribers: hfinkel, javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66308 llvm-svn: 370037	2019-08-27 10:22:40 +00:00
Tim Northover	a7f226f9db	AArch64: avoid creating cycle in DAG for post-increment NEON ops. Inserting a value into Visited has the effect of terminating a search for predecessors if that node is seen. This is legitimate for the base address, and acts as a slight performance optimization, but the vector-building node can be paert of a legitimate cycle so we shouldn't stop searching there. PR43056. llvm-svn: 370036	2019-08-27 10:21:11 +00:00
George Rimar	7a2e21d9f4	[yaml2obj] - Don't allow setting StOther and Other/Visibility at the same time. This is a follow up discussed in the comments of D66583. Currently, if for example, we have both StOther and Other set in YAML document for a symbol, then yaml2obj reports an "unknown key 'Other'" error. It happens because 'mapOptional()' is never called for 'Other/Visibility' in this case, leaving those unhandled. This message does not describe the reason of the error well. This patch fixes it. Differential revision: https://reviews.llvm.org/D66642 llvm-svn: 370032	2019-08-27 09:58:39 +00:00
Craig Topper	243ede9970	[SelectionDAGBuilder] Hide existence of ConstantDataVector vector from visitGetElementPtr. ConstantDataVector is a specialized verison of ConstantVector that stores data in a packed array of bits instead of as individual pointers to other Constants. But we really shouldn't expose that if we can void it. And we should handle regular ConstantVector equally well. This removes a dyn_cast to ConstantDataVector and just calls getSplatValue directly on a Constant* if the type is a vector. llvm-svn: 370018	2019-08-27 06:39:50 +00:00
Craig Topper	4a3f62f9fd	[SelectionDAGBuilder] Fix typo in comment. NFC llvm-svn: 370017	2019-08-27 06:38:51 +00:00
Johannes Doerfert	39681e733c	[Attributor] Introduce an API to delete stuff Summary: During the fixpoint iteration, including the manifest stage, we should not delete stuff as other abstract attributes might have a reference to the value. Through the API this can now be done safely at the very end. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66779 llvm-svn: 370014	2019-08-27 04:57:54 +00:00
Philip Reames	20650eda99	[NFC] Replace the FIXME I added in rL369989 with a comment clarifying the current code The current approach is restrictive (as all of geps must be multiples of the alignment), but correct. llvm-svn: 370013	2019-08-27 04:52:35 +00:00
Richard Trieu	58e67b8aa3	Revert r369927 - [DAGCombiner] Remove a bunch of redundant AddToWorklist calls. This change causes instrumented builds of Clang to have a fatal error in the backend. https://reviews.llvm.org/D66537 has the details. llvm-svn: 370006	2019-08-27 02:04:11 +00:00
Pengfei Wang	564fb58a32	[WinEH] Allocate space in funclets stack to save XMM CSRs Summary: This is an alternate approach to D63396 Currently funclets reuse the same stack slots that are used in the parent function for saving callee-saved xmm registers. If the parent function modifies a callee-saved xmm register before an excpetion is thrown, the catch handler will overwrite the original saved value. This patch allocates space in funclets stack for saving callee-saved xmm registers and uses RSP instead RBP to access memory. Signed-off-by: Pengfei Wang <pengfei.wang@intel.com> Reviewers: rnk, RKSimon, craig.topper, annita.zhang, LuoYuanke, andrew.w.kaylor Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66596 Signed-off-by: Pengfei Wang <pengfei.wang@intel.com> llvm-svn: 370005	2019-08-27 01:53:24 +00:00
Alina Sbirlea	228ffac678	[MemorySSA] Fix insertUse. Actually call the renamePass on inserted Phis. Fixes PR42940. Subscribers: llvm-commits llvm-svn: 369997	2019-08-27 00:34:47 +00:00
Matt Arsenault	0a6564980b	AMDGPU: Combine directly on mul24 intrinsics The problem these are supposed to work around can occur before the intrinsics are lowered into the nodes. Try to directly simplify them so they are matched before the bit assert operations can be optimized out. llvm-svn: 369994	2019-08-27 00:18:09 +00:00
Matt Arsenault	3b95986a32	AMDGPU: Run AMDGPUCodeGenPrepare after scalar opts The mul24 matching could interfere with SLSR and the other addressing mode related passes. This probably is not the optimal placement, but is an intermediate step. This should probably be moved after all the generic IR passes, particularly LSR. Moving this after LSR seems to help in some cases, and hurts others. As-is in this patch, in idiv-licm, it saves 1-2 instructions inside some of the loop bodies, but increases the number in others. Moving this later helps these loops. In the new lsr tests in mul24-pass-ordering, the intrinsic prevents introducing more instructions in the loop preheader, so moving this later ends up hurting them. This shouldn't be any worse than before the intrinsics were introduced in r366094, and LSR should probably be smarter. I think it's because it doesn't know the and inside the loop will be folded away. llvm-svn: 369991	2019-08-27 00:08:31 +00:00
Philip Reames	2f858c2e91	Reorganize code and add a fixme to point out a bug in existing code [NFC] llvm-svn: 369989	2019-08-26 23:57:27 +00:00
Simon Atanasyan	d5918edf0d	[mips] Fix indentation. NFC llvm-svn: 369983	2019-08-26 22:40:34 +00:00
Simon Atanasyan	ac64924a55	[mips] clang-format the code. NFC llvm-svn: 369982	2019-08-26 22:40:28 +00:00
Craig Topper	6db7f492d9	[X86] Delay combineIncDecVector until after op legalization. Probably better to keep add over sub in early DAG combines. It might make sense to push this to lowering or delay it all the way to isel. But this was the simplest change. llvm-svn: 369981	2019-08-26 22:17:54 +00:00
Vitaly Buka	aeca56964f	msan, codegen, instcombine: Keep more lifetime markers used for msan Reviewers: eugenis Subscribers: hiraditya, cfe-commits, #sanitizers, llvm-commits Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D66695 llvm-svn: 369979	2019-08-26 22:15:50 +00:00
Heejin Ahn	173a3a54bb	[WebAssembly] Fix SSA rebuilding in SjLj transformation Summary: Previously we skipped uses within the same BB as a def when rebuilding SSA after SjLj transformation. For example, before transformation, ``` for.cond: %0 = phi i32 [ %var, %for.inc ] ... %var = ... br label %for.inc for.inc: ; preds = %for.cond call i32 @setjmp(...) br %for.cond ``` In this BB, %var should be defined in all paths from %for.inc to make %0 valid. In the input it was true; %for.inc's only predecessor was %for.cond. But after SjLj transformation, it is possible that %for.inc has other predecessors that are reachable without reaching %for.cond. ``` entry.split: ... br i1 %a, label %bb.1, label %for.inc for.cond: %0 = phi i32 [ %var, %for.inc ] ... ; Not valid! %var = ... br label %for.inc for.inc: ; preds = %for.cond, %entry.split call i32 @setjmp(...) ... br %for.cond ``` In this case, we can't use %var in the `phi` instruction in %for.cond, because %var is not defined in all paths through %for.inc (If the control flow is %entry -> %entry.split -> %for.inc -> %for.cond, %var has not been defined until we reach the `phi`). But the previous code excluded users within the same BB, skipping instructions within the same BB so they are not rewritten properly. User instructions within the same BB also should be candidates for rewriting if they are _before_ the original definition. Fixes PR43097. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66729 llvm-svn: 369978	2019-08-26 21:51:35 +00:00
Evgeniy Stepanov	ed4fefb0df	[hwasan] Fix test failure in r369721. Try harder to emulate "old runtime" in the test. To get the old behavior with the new runtime library, we need both disable personality function wrapping and enable landing pad instrumentation. llvm-svn: 369977	2019-08-26 21:44:55 +00:00
Lang Hames	8853ac7e02	[ORC] Make sure that queries on emitted-but-not-ready symbols fail correctly. In r369808 the failure scheme for ORC symbols was changed to make MaterializationResponsibility objects responsible for failing the symbols they represented. This simplifies error logic in the case where symbols are still covered by a MaterializationResponsibility, but left a gap in error handling: Symbols that have been emitted but are not yet ready (due to a dependence on some unemitted symbol) are not covered by a MaterializationResponsibility object. Under the scheme introduced in r369808 such symbols would be moved to the error state, but queries on those symbols were never notified. This led to deadlocks when such symbols were failed. This commit updates error logic to immediately fail queries on any symbol that has already been emitted if one of its dependencies fails. llvm-svn: 369976	2019-08-26 21:42:51 +00:00
Lang Hames	8ec9661870	[ORC] Fix an overly aggressive assert. Symbols that have not been queried will not have MaterializingInfo entries, so remove the assert that all failed symbols should have these entries. Also updates the loop to only remove entries that were found earlier. llvm-svn: 369975	2019-08-26 21:42:47 +00:00
Shafik Yaghmour	90e00bd8f3	Debug Info: Support for DW_AT_export_symbols for anonymous structs This implements the DWARF 5 feature described in: http://dwarfstd.org/ShowIssue.php?issue=141212.1 To support recognizing anonymous structs: struct A { struct { // Anonymous struct int y; }; } a This patch adds support for the new flag in constructTypeDIE(...) and test to verify this change. Differential Revision: https://reviews.llvm.org/D66605 llvm-svn: 369969	2019-08-26 20:59:44 +00:00
Vedant Kumar	58a0714885	[DWARF] Rename getDwarf5OrGNUCallSite{Attr,Tag}, NFC llvm-svn: 369967	2019-08-26 20:53:34 +00:00
Vedant Kumar	533dd0214c	[DWARF] Pick the DWARF5 OP_entry_value opcode on Darwin Use the GNU extension for OP_entry_value consistently (i.e. whenever GNU extensions are used for TAG_call_site). llvm-svn: 369966	2019-08-26 20:53:12 +00:00
Philip Reames	cf3b555973	Add a clarify comment for meaning of SafePointes [NFC] Extracted from D66688 as requested. llvm-svn: 369962	2019-08-26 20:48:35 +00:00
Roland Froese	18db4e9ae1	Recommit [PowerPC] Update P9 vector costs for insert/extract Now that the v1i128 smin regression has been fixed, recommit the P9 cost updates from D60160. llvm-svn: 369952	2019-08-26 19:26:08 +00:00
Philip Reames	b92c971099	[InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null for vectors Extend the transform introduced in https://reviews.llvm.org/D66608 to work for vector geps as well. Differential Revision: https://reviews.llvm.org/D66671 llvm-svn: 369949	2019-08-26 19:11:49 +00:00
Krzysztof Parzyszek	9e0feaf562	[Hexagon] Improve generated code for test-if-bit-clear llvm-svn: 369947	2019-08-26 19:08:08 +00:00
Johannes Doerfert	b504eb8bb5	[Attributor] Adjust and test the iteration bound of tests Summary: Try to verify how many iterations we need for a fixpoint in our tests. This patch adjust the way we count to make it easier to follow. It also adjusts the bounds to actually account for a fixpoint and not only the minimum number to pass all checks. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66757 llvm-svn: 369945	2019-08-26 18:55:47 +00:00
Craig Topper	36d1588f01	[X86] Add a hack to combinePMULDQ to manually turn SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG inputs into an ANY_EXTEND_VECTOR_INREG style shuffle ANY_EXTEND_VECTOR_INREG isn't currently marked Legal which prevents SimplifyDemandedBits from turning SIGN/ZERO_EXTEND_VECTOR_INREG into it after op legalization. And even if we did make it Legal, combineExtInVec doesn't do shuffle combining on the VECTOR_INREG nodes until AVX1. This patch adds a quick hack to combinePMULDQ to directly emit a vector shuffle corresponding to an ANY_EXTEND_VECTOR_INREG operation. This avoids both of those issues without creating any other regressions on our tests. The xop-ifma.ll change here also showed up when I tried to resurrect D56306 and seemed to be the only improvement that patch creates now. This is a more direct way to get the benefit. Differential Revision: https://reviews.llvm.org/D66436 llvm-svn: 369942	2019-08-26 18:23:26 +00:00
Craig Topper	846429de74	[DAGCombiner][X86] Teach SimplifyVBinOp to fold VBinOp (concat X, undef/constant), (concat Y, undef/constant) -> concat (VBinOp X, Y), VecC This improves the combine I included in D66504 to handle constants in the upper operands of the concat. If we can constant fold them away we can pull the concat after the bin op. This helps with chains of madd reductions on X86 from loop unrolling. The loop madd reduction pattern creates pmaddwd with half the width of the add that follows it using zeroes to fill the upper bits. If we have two of these added together we can pull the zeroes through the accumulating add and then shrink it. Differential Revision: https://reviews.llvm.org/D66680 llvm-svn: 369937	2019-08-26 17:59:11 +00:00
Johannes Doerfert	a4a308cc25	[Attributor] Further cut down on non-determinism llvm-svn: 369936	2019-08-26 17:51:23 +00:00
Johannes Doerfert	19b0043641	[Attributor] Allow explicit dependence tracking By default, the Attributor tracks potential dependences between abstract attributes based on the issued Attributor::getAAFor queries. This simplifies the development of new abstract attributes but it can also lead to spurious dependences that might increase compile time and make internalization harder (D63312). With this patch, abstract attributes can opt-out of implicit dependence tracking and instead register dependences explicitly. It is up to the implementation to make sure all existing dependences are registered. Differential Revision: https://reviews.llvm.org/D63314 llvm-svn: 369935	2019-08-26 17:48:05 +00:00
Amaury Sechet	b7075e40f3	[DAGCombiner] Remove a bunch of redundant AddToWorklist calls. Summary: This comes as a first step toward processing the DAG nodes in topological orders. Doing so ensure that arguments of a node are combined before the node itself is combined, which exposes ore opportunities for optimization and/or reduce the amount of patterns a node has to match for. DAGCombiner adding nodes to the worklist is various places causes the nodes to be in a different order from what is expected. In addition, this is reduant because these nodes end up being added to the worklist anyways due to the machinery at line 1621. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66537 llvm-svn: 369927	2019-08-26 17:02:12 +00:00
Wei Mi	077a9c7053	[SampleFDO] Extract the code calling each section reader to readOneSection. This is a followup of https://reviews.llvm.org/D66513. The code calling each section reader should be put into a separate function (readOneSection), so SampleProfileExtBinaryReader can override it. Otherwise, the base class SampleProfileExtBinaryBaseReader will need to be aware of all different kinds of section readers. That is not right. Differential Revision: https://reviews.llvm.org/D66693 llvm-svn: 369919	2019-08-26 15:54:16 +00:00
Bjorn Pettersson	d804bd17de	[LoopUnroll] Handle certain PHIs in full unrolling properly Summary: When reconstructing the CFG of the loop after unrolling, LoopUnroll could in some cases remove the phi operands of loop-carried values instead of preserving them, resulting in undef phi values after loop unrolling. When doing this reconstruction, avoid removing incoming phi values for phis in the successor blocks if the successor is the block we are jumping to anyway. Patch-by: ebevhan Reviewers: fhahn, efriedma Reviewed By: fhahn Subscribers: bjope, lebedev.ri, zzheng, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66334 llvm-svn: 369886	2019-08-26 09:29:53 +00:00
Craig Topper	b8b90ac1c5	[X86][DAGCombiner] Teach narrowShuffle to use concat_vectors instead of inserting into undef Summary: Concat_vectors is more canonical during early DAG combine. For example, its what's used by SelectionDAGBuilder when converting IR shuffles into SelectionDAG shuffles when element counts between inputs and mask don't match. We also have combines in DAGCombiner than can pull concat_vectors through a shuffle. See partitionShuffleOfConcats. So it seems like concat_vectors is a better operation to use here. I had to teach DAGCombiner's SimplifyVBinOp to also handle concat_vectors with undef. I haven't checked yet if we can remove the INSERT_SUBVECTOR version in there or not. I didn't want to mess with the other caller of getShuffleHalfVectors that's used during shuffle lowering where insert_subvector probably is what we want to produce so I've enabled this via a boolean passed to the function. Reviewers: spatel, RKSimon Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66504 llvm-svn: 369872	2019-08-25 17:59:49 +00:00
Xing Xue	ef039a3ccd	[PowerPC][AIX] Adds support for writing the .data section in assembly files Summary: Adds support for generating the .data section in assembly files for global variables with a non-zero initialization. The support for writing the .data section in XCOFF object files will be added in a follow-on patch. Any relocations are not included in this patch. Reviewers: hubert.reinterpretcast, sfertile, jasonliu, daltenty, Xiangling_L Reviewed by: hubert.reinterpretcast Subscribers: nemanjai, hiraditya, kbarton, MaskRay, jsji, wuzish, shchenz, DiggerLin, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66154 llvm-svn: 369869	2019-08-25 15:17:25 +00:00
Benjamin Kramer	55e8c91dd5	[AMDGPU] Downgrade from StringLiteral to const char* in an attempt to make GCC 5 happy llvm-svn: 369867	2019-08-25 12:47:31 +00:00
Nikita Popov	aa71c977ba	[SDAG] Fold umul_lohi with 0 or 1 multiplicand These can turn up during multiplication legalization. In principle these should also apply to smul_lohi, but I wasn't able to figure out how to produce those with the necessary operands. Differential Revision: https://reviews.llvm.org/D66380 llvm-svn: 369864	2019-08-25 08:04:22 +00:00
Craig Topper	1abe162a9a	[X86] Teach -Os immediate sharing code to not count constant uses that will become INC/DEC. INC/DEC don't use an immediate so we don't need to count it. We also shouldn't use the custom isel for it. Fixes PR42998. llvm-svn: 369863	2019-08-25 05:22:40 +00:00
Nilanjana Basu	7da6f432d8	Removing block comments from CodeView records in assembly files & related code cleanup llvm-svn: 369860	2019-08-25 01:09:11 +00:00
Craig Topper	cc4b0596b1	[X86] Add isel patterns to match vpdpwssd avx512vnni instruction from add+pmaddwd nodes. llvm-svn: 369859	2019-08-24 23:14:57 +00:00
Matt Arsenault	c6ab2b4fed	AMDGPU: Preserve value name when inserting mul24 intrinsic llvm-svn: 369857	2019-08-24 22:17:10 +00:00
Matt Arsenault	b3dd381a73	AMDGPU: Introduce a flag to disable mul24 intrinsic formation llvm-svn: 369856	2019-08-24 22:14:41 +00:00
Benjamin Kramer	d5e60669c4	[TLI] Simplify code. NFCI. llvm-svn: 369854	2019-08-24 17:30:12 +00:00
Benjamin Kramer	7043477042	Fix some accidental global initializers by using StringLiteral instead of StringRef llvm-svn: 369850	2019-08-24 15:24:25 +00:00
Benjamin Kramer	16b322914a	Use a bit of relaxed constexpr to make FeatureBitset costant intializable This requires std::intializer_list to be a literal type, which it is starting with C++14. The downside is that std::bitset is still not constexpr-friendly so this change contains a re-implementation of most of it. Shrinks clang by ~60k. llvm-svn: 369847	2019-08-24 15:02:44 +00:00
Roman Lebedev	9cf08c6de1	[Constant] Add 'isElementWiseEqual()' method Promoting it from InstCombine's tryToReuseConstantFromSelectInComparison(). Return true if this constant and a constant 'Y' are element-wise equal. This is identical to just comparing the pointers, with the exception that for vectors, if only one of the constants has an `undef` element in some lane, the constants still match. llvm-svn: 369842	2019-08-24 06:49:51 +00:00
Roman Lebedev	de19f749e0	[InstCombine] matchThreeWayIntCompare(): commutativity awareness Summary: `matchThreeWayIntCompare()` looks for ``` select i1 (a == b), i32 Equal, i32 (select i1 (a < b), i32 Less, i32 Greater) ``` but both of these selects/compares can be in it's commuted form, so out of 8 variants, only the two most basic ones is handled. This fixes regression being introduced in D66232. Reviewers: spatel, nikic, efriedma, xbolva00 Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66607 llvm-svn: 369841	2019-08-24 06:49:36 +00:00
Roman Lebedev	2c75fe7f2a	[InstCombine] Try to reuse constant from select in leading comparison Summary: If we have e.g.: ``` %t = icmp ult i32 %x, 65536 %r = select i1 %t, i32 %y, i32 65535 ``` the constants `65535` and `65536` are suspiciously close. We could perform a transformation to deduplicate them: ``` Name: ult %t = icmp ult i32 %x, 65536 %r = select i1 %t, i32 %y, i32 65535 => %t.inv = icmp ugt i32 %x, 65535 %r = select i1 %t.inv, i32 65535, i32 %y ``` https://rise4fun.com/Alive/avb While this may seem esoteric, this should certainly be good for vectors (less constant pool usage) and for opt-for-size - need to have only one constant. But the real fun part here is that it allows further transformation, in particular it finishes cleaning up the `clamp` folding, see e.g. `canonicalize-clamp-with-select-of-constant-threshold-pattern.ll`. We start with e.g. ``` %dont_need_to_clamp_positive = icmp sle i32 %X, 32767 %dont_need_to_clamp_negative = icmp sge i32 %X, -32768 %clamp_limit = select i1 %dont_need_to_clamp_positive, i32 -32768, i32 32767 %dont_need_to_clamp = and i1 %dont_need_to_clamp_positive, %dont_need_to_clamp_negative %R = select i1 %dont_need_to_clamp, i32 %X, i32 %clamp_limit ``` without this patch we currently produce ``` %1 = icmp slt i32 %X, 32768 %2 = icmp sgt i32 %X, -32768 %3 = select i1 %2, i32 %X, i32 -32768 %R = select i1 %1, i32 %3, i32 32767 ``` which isn't really a `clamp` - both comparisons are performed on the original value, this patch changes it into ``` %1.inv = icmp sgt i32 %X, 32767 %2 = icmp sgt i32 %X, -32768 %3 = select i1 %2, i32 %X, i32 -32768 %R = select i1 %1.inv, i32 32767, i32 %3 ``` and then the magic happens! Some further transform finishes polishing it and we finally get: ``` %t1 = icmp sgt i32 %X, -32768 %t2 = select i1 %t1, i32 %X, i32 -32768 %t3 = icmp slt i32 %t2, 32767 %R = select i1 %t3, i32 %t2, i32 32767 ``` which is beautiful and just what we want. Proofs for `getFlippedStrictnessPredicateAndConstant()` for de-canonicalization: https://rise4fun.com/Alive/THl Proofs for the fold itself: https://rise4fun.com/Alive/THl Reviewers: spatel, dmgreen, nikic, xbolva00 Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66232 llvm-svn: 369840	2019-08-24 06:49:25 +00:00
Craig Topper	dd2cf78381	[X86] Add an assert to mark more code that needs to be removed when the vector widening legalization switch is removed again. llvm-svn: 369837	2019-08-24 05:59:46 +00:00
Fangrui Song	eb70ac0249	[LoopFusion] Fix -Wunused-function in -DLLVM_ENABLE_ASSERTIONS=off build llvm-svn: 369836	2019-08-24 02:50:42 +00:00
Amara Emerson	3f6dd0c588	[GlobalISel] Introduce a G_DYN_STACKALLOC opcode to represent dynamic allocas. This just adds the opcode and verifier, it will be used to replace existing dynamic alloca handling in a subsequent patch. Differential Revision: https://reviews.llvm.org/D66677 llvm-svn: 369833	2019-08-24 02:25:56 +00:00
Guillaume Chatelet	0b6563e8a2	[LLVM][NFC] Removing unused functions Summary: Removes a not so useful function from DataLayout and cleans up Support/MathExtras.h Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66691 llvm-svn: 369824	2019-08-23 23:19:25 +00:00
Stanislav Mekhanoshin	b37d6a750a	[AMDGPU] Check for immediate SrcC in mfma in AsmParser Differential Revision: https://reviews.llvm.org/D66674 llvm-svn: 369819	2019-08-23 22:22:49 +00:00
Stanislav Mekhanoshin	e6e1c4eac0	[AMDGPU] w/a for gfx908 mfma SrcC literal HW bug gfx908 ignores an mfma if SrcC is a literal. Differential Revision: https://reviews.llvm.org/D66670 llvm-svn: 369818	2019-08-23 22:22:29 +00:00
Stanislav Mekhanoshin	8fe1245a0f	[AMDGPU] w/a for gfx908 mfma SrcC literal HW bug gfx908 ignores an mfma if SrcC is a literal. Differential Revision: https://reviews.llvm.org/D66670 llvm-svn: 369816	2019-08-23 22:09:58 +00:00
Peter Collingbourne	5b31ac5096	hwasan: Fix use of uninitialized memory. Reported by e.g. http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/23071/steps/build%20with%20ninja/logs/stdio llvm-svn: 369815	2019-08-23 21:37:20 +00:00
Guillaume Chatelet	b7be5b9095	[LLVM][NFC] remove unused fields Summary: Here is the commit introducing the fields https://github.com/llvm/llvm-project/commit/cf6749e4c091 It dates back from 2006 and was used by AArch64 backend. There is no more reference to these fields in the whole codebase so I think it's fine. Reviewers: courbet Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66683 llvm-svn: 369810	2019-08-23 20:49:06 +00:00
Lang Hames	7371fb4229	[ORC] Remove query dependencies when symbols are resolved. If the dependencies are not removed then a late failure (one symbol covered by the query failing after others have already been resolved) can result in an attempt to detach the query from already finalized symbol, resulting in an assert/crash. This patch fixes the issue by removing query dependencies in JITDylib::resolve for symbols that meet the required state. llvm-svn: 369809	2019-08-23 20:37:32 +00:00
Lang Hames	e00585c77c	[ORC] Fix a FIXME: Propagate errors to dependencies. When symbols are failed (via MaterializationResponsibility::failMaterialization) any symbols depending on them will now be moved to an error state. Attempting to resolve or emit a symbol in the error state (via the notifyResolved or notifyEmitted methods on MaterializationResponsibility) will result in an error. If notifyResolved or notifyEmitted return an error due to failure of a dependence then the caller should log or discard the error and call failMaterialization to propagate the failure to any queries waiting on the symbols being resolved/emitted (plus their dependencies). llvm-svn: 369808	2019-08-23 20:37:31 +00:00
Jessica Paquette	83fe56b3b9	[AArch64][GlobalISel] Import XRO load/store patterns instead of custom selection Instead of using custom C++ in `earlySelect` for loads and stores, just import the patterns. Remove `earlySelectLoad`, since we can just import the work it's doing. Some minor changes to how `ComplexRendererFns` are returned for the XRO addressing modes. If you add immediates in two steps, sometimes they are not imported properly and you only end up with one immediate. I'm not sure if this is intentional. - Update load-addressing-modes.mir to include the instructions we can now import. - Add a similar test, store-addressing-modes.mir to show which store opcodes we currently import, and show that we can pull in shifts etc. - Update arm64-fastisel-gep-promote-before-add.ll to use FastISel instead of GISel. This test failed with GISel because GISel folds the gep into the load. The test checks that FastISel doesn't fold non-pointer-width adds into loads. GISel on the other hand, produces a G_CONSTANT of -128 for the add, and then a G_GEP, which must be pointer-width. Note that we don't get STRBRoX right now. It seems like the importer can't handle `FPR8Op:{ *:[Untyped] }:$Rt` source operands. So, those are not currently supported. Differential Revision: https://reviews.llvm.org/D66679 llvm-svn: 369806	2019-08-23 20:31:34 +00:00
Volkan Keles	277631e3b8	[GlobalISel] Legalizer: Retry combining illegal artifacts as long as there new artifacts Summary: Currently, Legalizer aborts if it’s unable to legalize artifacts. However, it’s possible to combine them after processing the rest of the instruction because the legalization is likely to generate more artifacts that allow ArtifactCombiner to combine away them. Instead, move illegal artifacts to another list called RetryList and wait until all of the instruction in InstList are legalized. After that, check if there is any new artifacts and try to combine them again if that’s the case. If not, abort. The idea is similar to D59339, but the approach is a bit different. This patch fixes the issue described above, but the legalizer still may be unable to handle some cases depending on when to legalize artifacts. So, in the long run, we probably need a different legalization strategy that handles this dependency in a better way. Reviewers: dsanders, aditya_nandakumar, qcolombet, arsenm, aemerson, paquette Reviewed By: dsanders Subscribers: jvesely, wdng, nhaehnle, rovka, javed.absar, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65894 llvm-svn: 369805	2019-08-23 20:30:35 +00:00
Johannes Doerfert	5a5a139917	[Attributor] Manifest alignment in load and store instructions Summary: We can now manifest alignment information in load/store instructions if the pointer is known to have a better alignment. Reviewers: uenoku, sstefan1, lebedev.ri Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66567 llvm-svn: 369804	2019-08-23 20:20:10 +00:00
Benjamin Kramer	dc5f805d31	Do a sweep of symbol internalization. NFC. llvm-svn: 369803	2019-08-23 19:59:23 +00:00
Craig Topper	bc173d4c51	[X86] Move a transform out of combineConcatVectorOps so we don't prematurely turn CONCAT_VECTORS into INSERT_SUBVECTORS. CONCAT_VECTORS and INSERT_SUBVECTORS can both call combineConcatVectorOps, but we shouldn't produce INSERT_SUBVECTORS from there. We should keep CONCAT_VECTORS until vector legalization. Noticed while looking at the madd_quad_reduction test from madd.ll llvm-svn: 369802	2019-08-23 19:52:24 +00:00
Wei Mi	be9073249e	[SampleFDO] Add ExtBinary format to support extension of binary profile. This is a patch split from https://reviews.llvm.org/D66374. It tries to add a new format of profile called ExtBinary. The format adds a section header table to the profile and organize the profile in sections, so the future extension like adding a new section or extending an existing section will be easier while keeping backward compatiblity feasible. Differential Revision: https://reviews.llvm.org/D66513 llvm-svn: 369798	2019-08-23 19:05:30 +00:00
Roland Froese	b4051e57b1	[PowerPC] Expand v1i128 smin The smin opcode and friends for v1i128 are incorrectly marked as legal for PPC. Change them to expand. Differential Revision: https://reviews.llvm.org/D64960 llvm-svn: 369797	2019-08-23 19:04:47 +00:00
Philip Reames	9cb059fdcc	Fix a bug in just submitted rL369789 Started implementing the vector case and realized the scalar case hadn't handled the GEP producing a different type than the base correctly. It's entertaining seeing what slips through review when we're focused on the 'hard' parts. :( Also adding an extra vector test as it happened to be in workspace and wasn't worth separating. llvm-svn: 369795	2019-08-23 18:27:57 +00:00
Matt Arsenault	2fd1afe8ef	RegScavenger: Use Register llvm-svn: 369794	2019-08-23 18:25:34 +00:00
Craig Topper	bccd183217	[X86] Mark VPDPWSSD and VPDPWSSDS as commutable. Add stack folding tests. llvm-svn: 369792	2019-08-23 18:05:37 +00:00
Manoj Gupta	30232770fb	Revert r369233. This breaks building of some projects like libfuse and alsa-lib that now fail when linking. Error details in PR43092. llvm-svn: 369790	2019-08-23 18:01:13 +00:00
Philip Reames	5b02cfa0b3	[InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null This generalizes the isGEPKnownNonNull rule from ValueTracking to apply when we do not know if the base is non-null, and thus need to replace one condition with another. The core notion is that since an inbounds GEP can only form null if the base pointer is null and the offset is zero. However, if the offset is non-zero, the the "inbounds" marker makes the result poison. Thus, we're free to ignore the case where the offset is non-zero. Similarly, there's no case under which a non-null base can result in a null result without generating poison. Differential Revision: https://reviews.llvm.org/D66608 llvm-svn: 369789	2019-08-23 17:58:58 +00:00
Johannes Doerfert	22e6e108e1	[BasicAA] Use dereferenceability to reason about aliasing Summary: We already use the fact that an object with known size X does not alias another objection of size Y > X before. With this commit, we use dereferenceability information to determine a lower bound for Y and not only rely on the user provided query size. The result for @global_and_deref_arg_2() and @local_and_deref_ret_2() in test/Analysis/BasicAA/dereferenceable.ll improved with this patch. Reviewers: asbirlea, chandlerc, hfinkel, sanjoy Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66157 llvm-svn: 369786	2019-08-23 17:56:10 +00:00
Johannes Doerfert	23400e618b	[Attributor] Manifest constant return values Summary: If the unique return value is a constant we now replace call uses with that constant. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66551 llvm-svn: 369785	2019-08-23 17:41:37 +00:00
Johannes Doerfert	785fad3202	[Attributor] Deal with shrinking dereferenceability in a loop Summary: If we have a loop in which the dereferenceability of a pointer decreases we did slowly decrease it iteration by iteration, leading to a timeout. With this patch we detect such circular reasoning and indicate a fixpoint early. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66558 llvm-svn: 369784	2019-08-23 17:29:23 +00:00
Nathan Huckleberry	5808077bc6	Allow Compiler.h to be included in C files and fix fallthrough warnings Summary: Since clang does not support comment style fallthrough annotations these should be switched to macros defined in Compiler.h. This requires some fixing to Compiler.h. Original patch: https://reviews.llvm.org/D66487 Reviewers: nickdesaulniers, aaron.ballman, xbolva00, rsmith Reviewed By: nickdesaulniers, aaron.ballman, rsmith Subscribers: rsmith, sfertile, ormris, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66609 llvm-svn: 369782	2019-08-23 17:25:21 +00:00
Craig Topper	e7211bb567	[SelectionDAG][X86] Enable iX SimplifyDemandedBits to vXi1 SimplifyDemandedVectorElts simplification. Add a hack to X86 to avoid a regression Patch showing the effect of enabling bool vector oversimplification. Non-VLX builds can simplify a kshift shuffle, but VLX builds simplify: insert_subvector v8i zeroinitializer, v2i --> insert_subvector v8i undef, v2i Preventing the removal of the AND to clear the upper bits of result Differential Revision: https://reviews.llvm.org/D53022 llvm-svn: 369780	2019-08-23 17:14:58 +00:00
Jeremy Morse	0ae5498146	[DebugInfo] Remove invalidated locations during LiveDebugValues LiveDebugValues gives variable locations to blocks, but it should also take away. There are various circumstances where a variable location is known until a loop backedge with a different location is detected. In those circumstances, where there's no agreement on the variable location, it should be undef / removed, otherwise we end up picking a location that's valid on some loop iterations but not others. However, LiveDebugValues doesn't currently do this, see the new testcase attached. Without this patch, the location of !3 is assumed to be %bar through the loop. Once it's added to the In-Locations list, it's never removed, even though the later dbg.value(0... of !3 makes the location un-knowable. This patch checks during block-location-joining to see whether any previously-present locations have been removed in a predecessor. If they have, the live-ins have changed, and the block needs reprocessing. Similarly, in transferTerminator, assign rather than \|= the Out-Locations after processing a block, as we may have deleted some previously valid locations. This will mean that LiveDebugValues performs more propagation -- but that's necessary for it being correct. Differential Revision: https://reviews.llvm.org/D66599 llvm-svn: 369778	2019-08-23 16:33:42 +00:00
Sanjay Patel	5a5d44e801	[SLP] use range-for loops, fix formatting; NFC These are part of D57059, but that patch doesn't apply cleanly to trunk at this point, so we might as well remove some of the noise. llvm-svn: 369776	2019-08-23 16:22:32 +00:00
Cameron McInally	688f3bc240	[Reassoc] Small fix to support unary FNeg in NegateValue(...) Differential Revision: https://reviews.llvm.org/D66612 llvm-svn: 369772	2019-08-23 15:49:38 +00:00
Johannes Doerfert	2f2d7c3add	[Attributor][Fix] Deal with "growing" dereferenceability Summary: If we have a negative inbounds offset dereferenceabily "grows". However, until we do not handle the overflow that can occur in the dereferenceable bytes and the problem with loops, we simply do not grow the state. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66557 llvm-svn: 369771	2019-08-23 15:45:46 +00:00
Johannes Doerfert	deb9ea3a8c	[Attributor][NFCI] Avoid lookups when resolving returned values If the number of potentially returned values not change since the last traversal we do not need to visit the returned values again. This works as we only add values to the returned values set now. Differential Revision: https://reviews.llvm.org/D66484 llvm-svn: 369770	2019-08-23 15:42:19 +00:00
Sanjay Patel	9182467886	[SLP] fix formatting; NFC These are part of D57059, but that patch doesn't apply cleanly to trunk at this point, so we might as well remove some of the noise. llvm-svn: 369769	2019-08-23 15:26:12 +00:00
Johannes Doerfert	9543f1498c	[Attributor] FIX: Treat new attributes as changed ones Summary: When we have new attributes and we end the fixpoint iteration because the iteration limit is reached, we need to treat the new ones as if they changed in the last iteration, as they might have. This adds a test for which we should not derive anything regardless of the iteration limit, e.g., if we abort there should not be any attributes manifested in the IR. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66549 llvm-svn: 369768	2019-08-23 15:24:57 +00:00
Johannes Doerfert	695089ecfb	[Attributor][NFCI] Try to avoid potential non-deterministic behavior This commit replaces sets with set vectors in an effort to make the behavior of the Attributor deterministic. llvm-svn: 369767	2019-08-23 15:23:49 +00:00
Teresa Johnson	ea314fd476	[ThinLTO] Fix handling of weak interposable symbols Summary: Keep aliasees alive if their alias is live, otherwise we end up with an alias to a declaration, which is invalid. This can happen when the aliasee is weak and non-prevailing. This fix exposed the fact that we were then attempting to internalize the weak symbol, which was not exported as it was not prevailing. We should not internalize interposable symbols in general, unless this is the prevailing copy, since it can lead to incorrect inlining and other optimizations. Most of the changes in this patch are due to the restructuring required to pass down the prevailing callback. Finally, while implementing the test cases, I found that in the case of a weak aliasee that is still marked not live because its alias isn't live, after dropping the definition we incorrectly marked the declaration with weak linkage when resolving prevailing symbols in the module. This was due to some special case handling for symbols marked WeakLinkage in the summary located before instead of after a subsequent check for the symbol being a declaration. It turns out that we don't actually need this special case handling any more (looking back at the history, when that was added the code was structured quite differently) - we will correctly mark with weak linkage further below when the definition hasn't been dropped. Fixes PR42542. Reviewers: pcc Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66264 llvm-svn: 369766	2019-08-23 15:18:58 +00:00
Johannes Doerfert	a5b10b464e	[MustExec] Add a generic "must-be-executed-context" explorer Given an instruction I, the MustBeExecutedContextExplorer allows to easily traverse instructions that are guaranteed to be executed whenever I is. For now, these instruction have to be statically "after" I, in the same or different basic blocks. This patch also adds a pass which prints the must-be-executed-context for each instruction in a module. It is used to test the MustBeExecutedContextExplorer, for now on the examples given in the class comment of the MustBeExecutedIterator. Differential Revision: https://reviews.llvm.org/D65186 llvm-svn: 369765	2019-08-23 15:17:27 +00:00
Simon Atanasyan	5f7d6ac7bf	[mips] Reduce number of instructions used for loading a global symbol's value Now `lw/sw $reg, sym+offset` pseudo instructions for global symbol `sym` are lowering into the following three instructions. ``` lw $reg, %got(symbol)($gp) addiu $reg, $reg, offset lw/sw $reg, 0($reg) ``` It's possible to reduce the number of instructions by taking the offset in account in the final `lw/sw` command. This patch implements that optimization. ``` lw $reg, %got(symbol)($gp) lw/sw $reg, offset($reg) ``` Differential Revision: https://reviews.llvm.org/D66553 llvm-svn: 369756	2019-08-23 13:36:24 +00:00
Simon Atanasyan	58492b1895	[mips] Do not include offset into `%got` expression for global symbols Now pseudo instruction `la $6, symbol+8($6)` is expanding into the following chain of commands: ``` lw $1, %got(symbol+8)($gp) addiu $1, $1, 8 addu $6, $1, $6 ``` This is incorrect. When a linker handles the `R_MIPS_GOT16` relocation, it does not expect to get any addend and breaks on assertion. Otherwise it has to create new GOT entry for each unique "sym + offset" pair. Offset for a global symbol should be added to result of loading GOT entry by a separate `add` command. The patch fixes the problem by stripping off an offset from the expression passed to the `%got`. That's interesting that even current code inserts a separate `add` command. Differential Revision: https://reviews.llvm.org/D66552 llvm-svn: 369755	2019-08-23 13:36:14 +00:00
Simon Pilgrim	c88408cf85	Use VT::getHalfNumVectorElementsVT helpers in a few places. NFCI. llvm-svn: 369751	2019-08-23 12:37:09 +00:00
Andrea Di Biagio	8e9af64da6	[X86][BtVer2] Add a read-advance to every implicit register use of CMPXCHG8B/16B. This is a follow up of r369642. This patch assigns a ReadAfterLd to every implicit register use of instruction CMPXCHG8B and instruction CMPXCHG16B. Perf micro-benchmarks show that implicit registers are read after 3cy from the start of execution. llvm-svn: 369750	2019-08-23 12:19:45 +00:00
Andrea Di Biagio	1630f64e2f	[X86][BtVer2] Fix latency of ALU RMW instructions. Excluding ADC/SBB and the bit-test instructions (BTR/BTS/BTC), the observed latency of all other RMW integer arithmetic/logic instructions is 6cy and not 5cy. Example (ADD): ``` addb $0, (%rsp) # Latency: 6cy addb $7, (%rsp) # Latency: 6cy addb %sil, (%rsp) # Latency: 6cy addw $0, (%rsp) # Latency: 6cy addw $511, (%rsp) # Latency: 6cy addw %si, (%rsp) # Latency: 6cy addl $0, (%rsp) # Latency: 6cy addl $511, (%rsp) # Latency: 6cy addl %esi, (%rsp) # Latency: 6cy addq $0, (%rsp) # Latency: 6cy addq $511, (%rsp) # Latency: 6cy addq %rsi, (%rsp) # Latency: 6cy ``` The same latency profile applies to SUB/AND/OR/XOR/INC/DEC. The observed latency of ADC/SBB is 7-8cy. So we need a different write to model those. Latency of BTS/BTR/BTC is not fixed by this patch (they are much slower than what the model for btver2 currently reports). Differential Revision: https://reviews.llvm.org/D66636 llvm-svn: 369748	2019-08-23 11:34:10 +00:00
Martin Storsjo	8dbdb1c2a2	[llvm-dlltool] Make sure to strip decorations from ExtName for renamed exports ExtName should not be decorated, just like Name. This avoids double decoration on symbols in import libraries that use = for renaming functions. (Weak aliases, which use ==, worked fine with respect to decoration.) Differential Revision: https://reviews.llvm.org/D66617 llvm-svn: 369747	2019-08-23 11:18:11 +00:00
Simon Pilgrim	04906ef1f2	[DAGCombine] GetNegatedExpression - add FMA\FMAD support If the accumulator and either of the multiply operands are negatable then we can we negate the entire expression. Differential Revision: https://reviews.llvm.org/D63141 llvm-svn: 369746	2019-08-23 10:49:46 +00:00
Jay Foad	eac23862a8	[AMDGPU] gfx10 atomic optimizer changes. Summary: Add support for gfx10, where all DPP operations are confined to work within a single row of 16 lanes, and wave32. Reviewers: arsenm, sheredom, critson, rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, t-tye, hiraditya, jfb, dstuttard, tpr, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65644 llvm-svn: 369745	2019-08-23 10:07:43 +00:00
George Rimar	668b11b2c8	[yaml2obj] - Allow setting the symbol st_other field to any integer. st_other field of a symbol usually contains its visibility. Other bits are usually 0, though some targets, like MIPS can set them using the named bit field values. Problem is that there is no way to set an arbitrary value now, though that might be useful for our test cases. In this patch I introduced a way to set st_other to any numeric value using the new StOther field. I added a test and simplified the existent one to show the effect/benefit Differential revision: https://reviews.llvm.org/D66583 llvm-svn: 369742	2019-08-23 09:31:07 +00:00
Craig Topper	4deb388bca	[X86] Make combineLoopSADPattern use CONCAT_VECTORS instead of INSERT_SUBVECTORS for widening with zeros. CONCAT_VECTORS is more canonical for the early DAG combine runs until we start getting into the op legalization phases. llvm-svn: 369734	2019-08-23 06:08:33 +00:00
Craig Topper	bdceb9fb14	[X86] Improve lowering of v2i32 SAD handling in combineLoopSADPattern. For v2i32 we only feed 2 i8 elements into the psadbw instructions with 0s in the other 14 bytes. The resulting psadbw instruction will produce zeros in bits [127:16] of the output. We need to take the result and feed it to a v2i32 add where the first element includes bits [15:0] of the sad result. The other element should be zero. Prior to this patch we were using a truncate to take 0 from bits 95:64 of the psadbw. This results in a pshufd to move those bits to 63:32. But since we also have zeroes in bits 63:32 of the psadbw output, we should just take those bits. The previous code probably worked better with promoting legalization, but now we use widening legalization. I've preserved the old behavior if -x86-experimental-vector-widening-legalization=false until we get that option removed. llvm-svn: 369733	2019-08-23 05:33:27 +00:00
Philip Reames	2a52583d67	[IndVars] Fix a bug noticed by inspection We were computing the loop exit value, but not ensuring the addrec belonged to the loop whose exit value we were computing. I couldn't actually trip this; the test case shows the basic setup which might trip this, but none of the variations I've tried actually do. llvm-svn: 369730	2019-08-23 04:03:23 +00:00
Fangrui Song	3fc933af8b	[AlignmentFromAssumptions] getNewAlignmentDiff(): use getURemExpr() The alignment is calculated incorrectly, thus sometimes it doesn't generate aligned mov instructions, as shown by the example below: ``` // b.cc typedef long long index; extern "C" index g_tid; extern "C" index g_num; void add3(float* __restrict__ a, float* __restrict__ b, float* __restrict__ c) { index n = 641024; index m = 161024; index k = 41024; index tid = g_tid; index num = g_num; __builtin_assume_aligned(a, 32); __builtin_assume_aligned(b, 32); __builtin_assume_aligned(c, 32); for (index i0=tidk; i0<m; i0+=numk) for (index i1=0; i1<nm; i1+=m) for (index i2=0; i2<k; i2++) c[i1+i0+i2] = b[i0+i2] + a[i1+i0+i2]; } ``` Compile with `clang b.cc -Ofast -march=skylake -mavx2 -S` ``` vmovaps -224(%rdi,%rbx,4), %ymm0 vmovups -192(%rdi,%rbx,4), %ymm1 # should be movaps vmovups -160(%rdi,%rbx,4), %ymm2 # should be movaps vmovups -128(%rdi,%rbx,4), %ymm3 # should be movaps vaddps -224(%rsi,%rbx,4), %ymm0, %ymm0 vaddps -192(%rsi,%rbx,4), %ymm1, %ymm1 vaddps -160(%rsi,%rbx,4), %ymm2, %ymm2 vaddps -128(%rsi,%rbx,4), %ymm3, %ymm3 vmovaps %ymm0, -224(%rdx,%rbx,4) vmovups %ymm1, -192(%rdx,%rbx,4) # should be movaps vmovups %ymm2, -160(%rdx,%rbx,4) # should be movaps vmovups %ymm3, -128(%rdx,%rbx,4) # should be movaps ``` Differential Revision: https://reviews.llvm.org/D66575 Patch by Dun Liang llvm-svn: 369723	2019-08-23 02:17:04 +00:00
Peter Collingbourne	21a1814417	hwasan: Untag unwound stack frames by wrapping personality functions. One problem with untagging memory in landing pads is that it only works correctly if the function that catches the exception is instrumented. If the function is uninstrumented, we have no opportunity to untag the memory. To address this, replace landing pad instrumentation with personality function wrapping. Each function with an instrumented stack has its personality function replaced with a wrapper provided by the runtime. Functions that did not have a personality function to begin with also get wrappers if they may be unwound past. As the unwinder calls personality functions during stack unwinding, the original personality function is called and the function's stack frame is untagged by the wrapper if the personality function instructs the unwinder to keep unwinding. If unwinding stops at a landing pad, the function is still responsible for untagging its stack frame if it resumes unwinding. The old landing pad mechanism is preserved for compatibility with old runtimes. Differential Revision: https://reviews.llvm.org/D66377 llvm-svn: 369721	2019-08-23 01:28:44 +00:00
Sam Clegg	90b6bb75e8	[MC] Minor cleanup to MCFixup::Kind handling. NFC. Prefer `MCFixupKind` where possible and add getTargetKind() to convert to `unsigned` when needed rather than scattering cast operators around the place. Differential Revision: https://reviews.llvm.org/D59890 llvm-svn: 369720	2019-08-23 01:00:55 +00:00
Peter Collingbourne	2452d7030b	IR. Change strip* family of functions to not look through aliases. I noticed another instance of the issue where references to aliases were being replaced with aliasees, this time in InstCombine. In the instance that I saw it turned out to be only a QoI issue (a symbol ended up being missing from the symbol table due to the last reference to the alias being removed, preventing HWASAN from symbolizing a global reference), but it could easily have manifested as incorrect behaviour. Since this is the third such issue encountered (previously: D65118, D65314) it seems to be time to address this common error/QoI issue once and for all and make the strip* family of functions not look through aliases. Includes a test for the specific issue that I saw, but no doubt there are other similar bugs fixed here. As with D65118 this has been tested to make sure that the optimization isn't load bearing. I built Clang, Chromium for Linux, Android and Windows as well as the test-suite and there were no size regressions. Differential Revision: https://reviews.llvm.org/D66606 llvm-svn: 369697	2019-08-22 19:56:14 +00:00
Benjamin Kramer	b3a991df3c	Fight a bit against global initializers. NFC. llvm-svn: 369695	2019-08-22 19:43:27 +00:00
Matt Arsenault	fba82858f2	GlobalISel: Don't create G_UADDE with constant false carry in The x86 tests are now broken (in paticular add-scalar.ll now hits the DAG fallback) due to not handling G_UADDO. The DAG x86 backend has a custom lowering for this, so that will need to be implemented. llvm-svn: 369673	2019-08-22 17:29:17 +00:00
Francis Visoiu Mistrih	5b5ee61b5f	[MachO][TLOF] Use hasLocalLinkage to determine if indirect symbol is local Local symbols in the indirect symbol table contain the value `INDIRECT_SYMBOL_LOCAL` and the corresponding __pointers entry must contain the address of the target. In r349060, I added support for local symbols in the indirect symbol table, which was checking if the symbol `isDefined` && `!isExternal` to determine if the symbol is local or not. It turns out that `isDefined` will return false if the user of the symbol comes before its definition, and we'll again generate .long 0 which will be the symbol at the adress 0x0. Instead of doing that, use GlobalValue::hasLocalLinkage() to check if the symbol is local. Differential Revision: https://reviews.llvm.org/D66563 llvm-svn: 369671	2019-08-22 16:59:00 +00:00
Craig Topper	898a0e9b84	[X86] Remove MCInstLower code that drops operands from some CALL and TAILJMP instructions. Add asserts to verify operand count It appears the FIXME here was handled at some point. r159728 from 2012 seems to be at least aportion of fixing it. Differential Revision: https://reviews.llvm.org/D66570 llvm-svn: 369665	2019-08-22 16:23:35 +00:00
Guozhi Wei	51f48295cb	[MBP] Disable aggressive loop rotate in plain mode Patch https://reviews.llvm.org/D43256 introduced more aggressive loop layout optimization which depends on profile information. If profile information is not available, the statically estimated profile information(generated by BranchProbabilityInfo.cpp) is used. If user program doesn't behave as BranchProbabilityInfo.cpp expected, the layout may be worse. To be conservative this patch restores the original layout algorithm in plain mode. But user can still try the aggressive layout optimization with -force-precise-rotation-cost=true. Differential Revision: https://reviews.llvm.org/D65673 llvm-svn: 369664	2019-08-22 16:21:32 +00:00
Amaury Sechet	95cf66de7c	[DAGCombiner] Remove explicit call to AddToWorklist in sqrt and reciprocal computations Summary: These nodes end up being processed regardless due to DAGCombiner ensuring arguments are processed. This changes the order in which nodes are processed, which fixes an issue on PowerPC. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri, mcberg2017, stefanp, hfinkel Subscribers: nemanjai, MaskRay, jsji, steven.zhang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66548 llvm-svn: 369662	2019-08-22 15:35:45 +00:00
Andrea Di Biagio	c9649eb9da	[X86][BtVer2] Fix latency/throughput of scalar integer MUL instructions. Single operand MUL instructions that implicitly set EAX have the following latency/throughput profile (see below): imul %cl # latency: 3cy - uOPs: 1 - 1 JMul imul %cx # latency: 3cy - uOPs: 3 - 3 JMul imul %ecx # latency: 3cy - uOPs: 2 - 2 JMul imul %rcx # latency: 6cy - uOPs: 2 - 4 JMul mul %cl # latency: 3cy - uOPs: 1 - 1 JMul mul %cx # latency: 3cy - uOPs: 3 - 3 JMul mul %ecx # latency: 3cy - uOPs: 2 - 2 JMul mul %rcx # latency: 6cy - uOPs: 2 - 4 JMul Excluding the 64bit variant, which has a latency of 6cy, every other instruction has a latency of 3cy. However, the number of decoded macro-opcodes (as well as the resource cyles) depend on the MUL size. The two operand MULs have a more predictable profile (see below): imul %dx, %dx # latency: 3cy - uOPs: 1 - 1 JMul imul %edx, %edx # latency: 3cy - uOPs: 1 - 1 JMul imul %rdx, %rdx # latency: 6cy - uOPs: 1 - 4 JMul imul $3, %dx, %dx # latency: 4cy - uOPs: 2 - 2 JMul imul $3, %ecx, %ecx # latency: 3cy - uOPs: 1 - 1 JMul imul $3, %rdx, %rdx # latency: 6cy - uOPs: 1 - 4 JMul This patch updates the values in the Jaguar scheduling model and regenerates llvm-mca tests. Differential Revision: https://reviews.llvm.org/D66547 llvm-svn: 369661	2019-08-22 15:20:16 +00:00
Sean Fertile	5f85a7b1cf	[PowerPC] Add combined ELF ABI and 32/64 bit queries to the subtarget. [NFC] A lot of places in the code combine checks for both ABI (SVR4/Darwin/AIX) and addressing mode (64-bit vs 32-bit). In an attempt to make some of the code more readable I've added a couple functions that combine checking for the ELF abi and 64-bit/32-bit code at once. As we add more AIX support I intend to add similar functions for the AIX ABI. Differential Revision: https://reviews.llvm.org/D65814 llvm-svn: 369658	2019-08-22 15:11:28 +00:00
Sean Fertile	18fd1b0b49	[PowerPC][XCOFF][MC] Explicitly set containing csect on symbols. [NFC] Previously we would get the csect a symbol was contained in through its fragment. This works only if we are writing an object file, and only for defined symbols. To fix this we set the contating csect explicitly on the MCSymbolXCOFF object. Differential Revision: https://reviews.llvm.org/D66032 llvm-svn: 369657	2019-08-22 15:11:23 +00:00
Hideto Ueno	70576cac52	[Attributor][NFC] Move DerefState to header and use StateWrapper Summary: In D65402, I want to get DerefState from AADereferenceable but it was not allowed. This patch moves DerefState definition into Attributor.h and makes AADerefenceable inherit StateWrapper. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66585 llvm-svn: 369653	2019-08-22 14:18:29 +00:00
Jinsong Ji	545e993b8b	[SlotIndexes] Add print-slotindexes to disable printing slotindexes Summary: When we print the IR with --print-after/before-*, SlotIndexes will be printed whenever available (We haven't freed it). This introduces some noises when we try to compare the IR among different optimizations. eg: -print-before=machine-cp will print SlotIndexes for 1st machine-cp pass, but NOT for 2nd machine-cp; -print-after=machine-cp will NOT print SlotIndexes for both machine-cp passes. So SlotIndexes in 1st pass introduce noises when differing these IRs. This patch introduces an option to hide indexes. Reviewers: stoklund, thegameg, qcolombet Reviewed By: thegameg Subscribers: hiraditya, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66500 llvm-svn: 369650	2019-08-22 13:44:47 +00:00
Andrea Di Biagio	589cb004de	[MCA] consistently use MCPhysReg instead of unsigned as register type. NFCI llvm-svn: 369648	2019-08-22 13:32:17 +00:00
George Rimar	91208447d0	[yaml2obj] - Lookup relocation symbols in dynamic symbol when .dynsym referenced. This fixes https://bugs.llvm.org/show_bug.cgi?id=40337. Previously, it was always assumed that relocations referenced symbols in the static symbol table. Now, if the Link field references a section called ".dynsym" it will look up these symbols in the dynamic symbol table. This patch is heavily based on D59097 by James Henderson Differential revision: https://reviews.llvm.org/D66532 llvm-svn: 369645	2019-08-22 12:39:56 +00:00
Andrea Di Biagio	c6744055ad	[X86][BtVer2] Fix latency and throughput of XCHG and XADD. On Jaguar, XCHG has a latency of 1cy and decodes to 2 macro-opcodes. Maximum throughput for XCHG is 1 IPC. The byte exchange has worse latency and decodes to 1 extra uOP; maximum observed throughput is 0.5 IPC. ``` xchgb %cl, %dl # Latency: 2cy - uOPs: 3 - 2 ALU xchgw %cx, %dx # Latency: 1cy - uOPs: 2 - 2 ALU xchgl %ecx, %edx # Latency: 1cy - uOPs: 2 - 2 ALU xchgq %rcx, %rdx # Latency: 1cy - uOPs: 2 - 2 ALU ``` The reg-mem forms of XCHG are atomic operations with an observed latency of 16cy. The resource usage is similar to the XCHGrr variants. The biggest difference is obviously the bus-locking, which prevents the LS to issue other memory uOPs in parallel until the unlocking store uOP is executed. ``` xchgb %cl, (%rsp) # Latency: 16cy - uOPs: 3 - ECX latency: 11cy xchgw %cx, (%rsp) # Latency: 16cy - uOPs: 3 - ECX latency: 11cy xchgl %ecx, (%rsp) # Latency: 16cy - uOPs: 3 - ECX latency: 11cy xchgq %rcx, (%rsp) # Latency: 16cy - uOPs: 3 - ECX latency: 11cy ``` The exchanged in/out register operand becomes available after 11cy from the start of execution. Added test xchg.s to verify that we correctly see that register write committed in 11cy (and not 16cy). Reg-reg XADD instructions have the same latency/throughput than the byte exchange (register-register variant). ``` xaddb %cl, %dl # latency: 2cy - uOPs: 3 - 3 ALU xaddw %cx, %dx # latency: 2cy - uOPs: 3 - 3 ALU xaddl %ecx, %edx # latency: 2cy - uOPs: 3 - 3 ALU xaddq %rcx, %rdx # latency: 2cy - uOPs: 3 - 3 ALU ``` The non-atomic RM variants have a latency of 11cy, and decode to 4 macro-opcodes. They still consume 2 ALU pipes, and the exchange in/out register operand becomes available in 3cy (it matches the 'load-to-use latency'). ``` xaddb %cl, (%rsp) # latency: 11cy - uOPs: 4 - 3 ALU xaddw %cx, (%rsp) # latency: 11cy - uOPs: 4 - 3 ALU xaddl %ecx, (%rsp) # latency: 11cy - uOPs: 4 - 3 ALU xaddq %rcx, (%rsp) # latency: 11cy - uOPs: 4 - 3 ALU ``` The atomic XADD variants execute in 16cy. The in/out register operand is available after 11cy from the start of execution. ``` lock xaddb %cl, (%rsp) # latency: 16cy - uOPs: 4 - 3 ALU -- ECX latency: 11cy lock xaddw %cx, (%rsp) # latency: 16cy - uOPs: 4 - 3 ALU -- ECX latency: 11cy lock xaddl %ecx, (%rsp) # latency: 16cy - uOPs: 4 - 3 ALU -- ECX latency: 11cy lock xaddq %rcx, (%rsp) # latency: 16cy - uOPs: 4 - 3 ALU -- ECX latency: 11cy ``` Added test xadd.s to verify those latencies as well as read-advance values. Differential Revision: https://reviews.llvm.org/D66535 llvm-svn: 369642	2019-08-22 11:32:47 +00:00
Simon Pilgrim	6dd51c2f19	[MVT] Add MVT equivalent to EVT::getHalfNumVectorElementsVT() helper. NFCI. Allows for some cleanup in a lot of SSE/AVX vector splitting code llvm-svn: 369640	2019-08-22 11:14:30 +00:00
Sam Tebbs	a69d9d6156	Reapply: [ARM] Fix lsrl with a 128/256 bit shift amount or a shift of 32 The CodeGen/Thumb2/mve-vaddv.ll test needed to be amended to reflect the changes from the above patch. This reverts commit `cd53ff6`, reapplying `7c6b229`. llvm-svn: 369638	2019-08-22 10:29:20 +00:00
Serguei Katkov	036e636aa7	[Loop Peeling] Fix silly bug in metadata update. We must update loop metedata before we moved to parent loop if it is present. llvm-svn: 369637	2019-08-22 10:06:46 +00:00
Hans Wennborg	cd53ff6c0d	Revert r369626 "[ARM] Fix lsrl with a 128/256 bit shift amount or a shift of 32" It broke the bots, see e.g. http://lab.llvm.org:8011/builders/clang-cuda-build/builds/36275/ > This patch fixes shifts by a 128/256 bit shift amount. It also fixes > codegen for shifts of 32 by delegating to LLVM's default optimisation > instead of emitting a long shift. > > Tests that used to generate long shifts of 32 are updated to check for the > more optimised codegen. > > Differential revision: https://reviews.llvm.org/D66519 > > llvm-svn: 369626 llvm-svn: 369636	2019-08-22 09:16:53 +00:00
Craig Topper	d420616313	[X86] Lower the cost of v2i32->v2f64 sint_to_fp under vector widening legalization. I don't really understand the costs we're using for fp_to_sint, but prior to widening legalization we used 20 as the cost for this via the v2i64->v2f64 entry. That number seems better than the 40 we got with widening legalization. So now we need either a v2i32->v2f64 entry or a v4i32->v2f64 entry depending on whether AVX is enabled or not since we skip the first SSE2 table look up under AVX. llvm-svn: 369628	2019-08-22 08:18:45 +00:00
Pavel Labath	1b30ea2c50	[Support] Improve readNativeFile(Slice) interface Summary: There was a subtle, but pretty important difference between the Slice and regular versions of this function. The Slice function was zero-initializing the rest of the buffer when the read syscall returned less bytes than expected, while the regular function did not. This patch removes the inconsistency by making both functions not zero-initialize the buffer. The zeroing code is moved to the MemoryBuffer class, which is currently the only user of this code. This makes the API more consistent, and the code shorter. While in there, I also refactor the functions to return the number of bytes through the regular return value (via Expected<size_t>) instead of a separate by-ref argument. Reviewers: aganea, rnk Subscribers: kristina, Bigcheese, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66471 llvm-svn: 369627	2019-08-22 08:13:30 +00:00
Sam Tebbs	7c6b229204	[ARM] Fix lsrl with a 128/256 bit shift amount or a shift of 32 This patch fixes shifts by a 128/256 bit shift amount. It also fixes codegen for shifts of 32 by delegating to LLVM's default optimisation instead of emitting a long shift. Tests that used to generate long shifts of 32 are updated to check for the more optimised codegen. Differential revision: https://reviews.llvm.org/D66519 llvm-svn: 369626	2019-08-22 08:12:06 +00:00
Shiva Chen	72a41e7b0d	[TargetLowering] Remove optional arguments passing to makeLibCall The patch introduces MakeLibCallOptions struct as suggested by @efriedma on D65497. The struct contain argument flags which will pass to makeLibCall function. The patch should not has any functionality changes. Differential Revision: https://reviews.llvm.org/D65795 llvm-svn: 369622	2019-08-22 04:59:43 +00:00
Pengfei Wang	7630e24492	[X86] Making X86OptimizeLEAs pass public. NFC Reviewers: wxiao3, LuoYuanke, andrew.w.kaylor, craig.topper, annita.zhang, liutianle, pengfei, xiangzhangllvm, RKSimon, spatel, andreadb Reviewed By: RKSimon Subscribers: andreadb, hiraditya, llvm-commits Tags: #llvm Patch by Gen Pei (gpei) Differential Revision: https://reviews.llvm.org/D65933 llvm-svn: 369612	2019-08-22 02:29:27 +00:00
Fangrui Song	246750c2a9	[COFF] Fix section name for constants larger than 64 bits on Windows APIntToHexString returns wrong value ("0000000000000000ffffffffffffffff") for integer larger than 64 bits, and thus TargetLoweringObjectFileCOFF::getSectionForConstant returns same section name for all numbers larger than 64 bits. This patch tries to fix it. Differential Revision: https://reviews.llvm.org/D66458 Patch by Senran Zhang llvm-svn: 369610	2019-08-22 01:48:34 +00:00
Cyndy Ishida	9443d0e2c0	[Object] FIX: update PlatformKind name in TapiFile Buildbots that use GCC failed to compile because overwritten namespace with variable name llvm-svn: 369602	2019-08-21 23:57:57 +00:00
Cyndy Ishida	c20d1f90b5	[Object] Add tapi files to object Summary: The intention for this is to allow reading and printing symbols out from llvm-nm. Tapi file, and Tapi universal follow a similiar format to their respective MachO Object format. The tests are dependent on llvm-nm processing tbd files which is why its in D66160 Reviewers: ributzka, steven_wu, lhames Reviewed By: ributzka, lhames Subscribers: mgorny, hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66159 llvm-svn: 369600	2019-08-21 23:30:53 +00:00
Craig Topper	78e6507b0a	[X86] Correct the scheduler classes for TAILJMP and TCRETURN CodeGenOnly instructions. We had an odd combination of WriteJump applied to some memory instructions and WriteJumpLd applied to register and immediate instructions. Thsi should hopefully assign them all correctly. llvm-svn: 369599	2019-08-21 23:17:52 +00:00
Craig Topper	303bbc3be2	[X86] Replace a couple hardcoded '5's with X86::AddrNumOperands for readability. NFC llvm-svn: 369598	2019-08-21 22:40:07 +00:00
Luis Marques	f7cdff4ffd	[RISCV] Remove fix introduced by r369573, superseded by r369580 llvm-svn: 369590	2019-08-21 22:02:56 +00:00
Johannes Doerfert	d98f975089	[Attributor] Fix: Gracefully handle non-instruction users Function can have users that are not instructions, e.g., bitcasts. For now, we simply give up when we see them. llvm-svn: 369588	2019-08-21 21:48:56 +00:00
Greg Clayton	bf9ee07afa	Add FileWriter to GSYM and encode/decode functions to AddressRange and AddressRanges The full GSYM patch started with: https://reviews.llvm.org/D53379 This patch add the ability to encode data using the new llvm::gsym::FileWriter class. FileWriter is a simplified binary data writer class that doesn't require targets, target definitions, architectures, or require any other optional compile time libraries to be enabled via the build process. This class needs the ability to seek to different spots in the binary data that it produces to fix up offsets and sizes in GSYM data. It currently uses std::ostream over llvm::raw_ostream because llvm::raw_ostream doesn't support seeking which is required when encoding and decoding GSYM data. AddressRange objects are encoded and decoded to be relative to a base address. This will be the FunctionInfo's start address if the AddressRange is directly contained in a FunctionInfo, or a base address of the containing parent AddressRange or AddressRanges. This allows address ranges to be efficiently encoded using ULEB128 encodings as we encode the offset and size of each range instead of full addresses. This also makes encoded addresses easy to relocate as we just need to relocate one base address. Differential Revision: https://reviews.llvm.org/D63828 llvm-svn: 369587	2019-08-21 21:48:11 +00:00
Luis Marques	4f488b594a	[RISCV] Fix use of side-effects in asserts in decoder functions llvm-svn: 369580	2019-08-21 21:11:37 +00:00
Cyndy Ishida	359840a6e4	[BinaryFormat] Teach identify_magic about Tapi files. Summary: Tapi files are YAML files that start with the !tapi tag. The only execption are TBD v1 files, which don't have a tag. In that case we have to scan a little further and check if the first key "archs" exists. This is the first patch in a series of patches to add libObject support for text-based dynamic library (.tbd) files. This patch is practically exactly the same as D37820, that was never pushed to master, and is needed for future commits related to reading tbd files for llvm-nm Reviewers: ributzka, steven_wu, bollu, espindola, jfb, shafik, jdoerfert Reviewed By: steven_wu Subscribers: dexonsmith, llvm-commits Tags: #llvm, #clang, #sanitizers, #lldb, #libc, #openmp Differential Revision: https://reviews.llvm.org/D66149 llvm-svn: 369579	2019-08-21 21:00:16 +00:00
Johannes Doerfert	5427aa843b	[Attributor][NFC] Fix copy & paste error llvm-svn: 369577	2019-08-21 20:57:20 +00:00
Johannes Doerfert	2db8528fb4	[Attributor][NFC] Remove leftover semicolon llvm-svn: 369576	2019-08-21 20:56:56 +00:00
Johannes Doerfert	d410805d57	[Attributor] Use existing unreachable instead of introducing new ones So far we split the unreachable off and placed a new one, this is not necessary. llvm-svn: 369575	2019-08-21 20:56:41 +00:00
Richard Smith	b73cd33625	Fix -Werror=unused-variable error after r369528. llvm-svn: 369573	2019-08-21 20:42:37 +00:00
Florian Hahn	b5e52bfd83	[GVN] Do PHI translations across all edges between the load and the unavailable pred. Currently we do not properly translate addresses with PHIs if LoadBB != LI->getParent(), because PHITranslateAddr expects a direct predecessor as argument, because it considers all instructions outside of the current block to not requiring translation. The amount of cases that trigger this should be very low, as most single predecessor blocks should be folded into their predecessor by GVN before we actually start with value numbering. It is still not guaranteed to happen, so we should do PHI translation along all edges between the loads' block and the predecessor where we have to place a load. There are a few test cases showing current limits of the PHI translation, which could be improved later. Reviewers: spatel, reames, efriedma, john.brawn Reviewed By: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D65020 llvm-svn: 369570	2019-08-21 20:06:50 +00:00
Aaron Ballman	6a29ff1754	Revert r369549 as it broke the bots. http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/13605/ llvm-svn: 369569	2019-08-21 20:00:41 +00:00
Nico Weber	ed18e70c86	Revert r367389 (and follow-up r368404); it caused PR43073. llvm-svn: 369567	2019-08-21 19:53:42 +00:00
Sam Clegg	dde8a25a4b	[WebAssembly] Handle aliases in WebAssemblyFixFunctionBitcasts Fixes: https://github.com/emscripten-core/emscripten/issues/8770 Differential Revision: https://reviews.llvm.org/D66508 llvm-svn: 369566	2019-08-21 19:52:33 +00:00
Craig Topper	3f59bfd5be	[MVT] Add v16f16 and v32f16 vectors. I might look at improving PR43065 which will require being able to mark a 256 and 512 bit vector of f16 as Legal. Differential Revision: https://reviews.llvm.org/D66515 llvm-svn: 369565	2019-08-21 19:14:48 +00:00
Simon Atanasyan	159f621c5c	[mips] Replace call `expandLoadAddress` by `loadAndAddSymbolAddress`. NFC In case of expanding `lw/sw $reg, symbol($reg)` instruction for PIC it's enough to call the `loadAndAddSymbolAddress` method. Additional work performed by the `expandLoadAddress` is not required here. llvm-svn: 369563	2019-08-21 18:54:51 +00:00
Simon Atanasyan	bb2f857247	[mips] Remove duplicated case from the `StringSwitch`. NFC llvm-svn: 369562	2019-08-21 18:54:41 +00:00
Amaury Sechet	c0f190a048	[DAGCombiner] Remove mostly redundant calls to AddToWorklist Summary: These calls change the order in which some nodes are processed and so have an effect on codegen. The change in fixup-bw-copy.ll is due to (and (load anyext)) gets transformed into (load zext) while previously the and was removed by SimplifyDemandedBits, so the (load anyext) remained. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66543 llvm-svn: 369561	2019-08-21 18:51:08 +00:00
Florian Hahn	969b3e6a8f	[BitcodeReader] Check if we can create a null constant for type. We cannot create null constants for certain types, e.g. VoidTy, FunctionTy or LabelTy. getNullValue asserts if we pass in an unsupported type. We should also check for opaque types, but I'm not sure how. This fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14795. Reviewers: t.p.northover, jfb, vsk Reviewed By: vsk Tags: #llvm Differential Revision: https://reviews.llvm.org/D65897 llvm-svn: 369557	2019-08-21 18:20:11 +00:00
Nathan Huckleberry	01a413695c	Fix -Wimplicit-fallthrough warnings in regcomp.c Summary: Since clang does not support comment style fallthrough annotations these should be switched. Reviewers: aaron.ballman, nickdesaulniers, xbolva00 Reviewed By: aaron.ballman, nickdesaulniers, xbolva00 Subscribers: xbolva00, nickdesaulniers, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66487 llvm-svn: 369549	2019-08-21 17:07:43 +00:00
Alina Sbirlea	7425179fee	[LoopPassManager + MemorySSA] Only enable use of MemorySSA for LPMs known to preserve it. Summary: Add a flag to the FunctionToLoopAdaptor that allows enabling MemorySSA only for the loop pass managers that are known to preserve it. If an LPM is known to have only loop transforms that all preserve MemorySSA, then use MemorySSA if `EnableMSSALoopDependency` is set. If an LPM has loop passes that do not preserve MemorySSA, then the flag passed is `false`, regardless of the value of `EnableMSSALoopDependency`. When using a custom loop pass pipeline via `passes=...`, use keyword `loop` vs `loop-mssa` to use MemorySSA in that LPM. If a loop that does not preserve MemorySSA is added while using the `loop-mssa` keyword, that's an error. Add the new `loop-mssa` keyword to a few tests where a difference occurs when enabling MemorySSA. Reviewers: chandlerc Subscribers: mehdi_amini, Prazek, george.burgess.iv, sanjoy.google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66376 llvm-svn: 369548	2019-08-21 17:00:57 +00:00
Matt Arsenault	954a012b4c	GlobalISel: Implement moreElementsVector for G_UNMERGE_VALUES sources This is necessary for handling <3 x s16> on AMDGPU, assuming this should be handled as 2 separate legalization actions. The alternative would be for fewerElementsVector to handle 3->2. llvm-svn: 369547	2019-08-21 16:59:10 +00:00
David Green	717feabdf0	[ARM] Formatting for ARMInstrMVE.td. NFC This is just some formatting cleanup, prior to the masked load and store patch in D66534. llvm-svn: 369545	2019-08-21 16:20:35 +00:00
Philip Reames	764b0fd5a3	[instcombine] icmp eq/ne (sub C, Y), C -> icmp eq/ne Y, 0 Noticed while looking at pr43028. llvm-svn: 369541	2019-08-21 15:51:57 +00:00
Nilanjana Basu	ac3851c434	Improving CodeView debug info type record's inline comments llvm-svn: 369533	2019-08-21 15:19:58 +00:00
Alexander Timofeev	78347c979e	[AMDGPU] Prevent VGPR copies from moving across the EXEC mask definitions Differential Revision: https://reviews.llvm.org/D63731 Reviewers: qcolombet, rampitec llvm-svn: 369532	2019-08-21 15:15:04 +00:00
Guillaume Chatelet	1c18a9cb9e	[LLVM][Alignment] Introduce Alignment In MachineFrameInfo Summary: This is patch is part of a serie to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: jfb Subscribers: hiraditya, dexonsmith, llvm-commits, courbet Tags: #llvm Differential Revision: https://reviews.llvm.org/D65800 llvm-svn: 369531	2019-08-21 14:29:30 +00:00
Igor Kudrin	ed413074f2	[DWARF] Adjust return type of DWARFUnit::getLength(). DWARFUnitHeader::getLength() returns uint64_t. DWARFUnit::getLength() should do the same. Differential Revision: https://reviews.llvm.org/D66472 llvm-svn: 369529	2019-08-21 14:10:57 +00:00
Luis Marques	c3bf3d14ea	[RISCV] Add support for RVC HINT instructions The hint instructions are enabled by default (if the standard C extension is enabled). To disable them pass -mattr=-rvc-hints. Differential Revision: https://reviews.llvm.org/D62592 llvm-svn: 369528	2019-08-21 14:00:58 +00:00
Amaury Sechet	045f33aec9	[DAGCombiner] Various nits. NFC llvm-svn: 369520	2019-08-21 12:01:37 +00:00
Sanjay Patel	e728259278	[InstCombine] narrow icmp with extended operands of different widths An intermediate extend is used to widen the narrow operand to the width of the other (wider) operand. At that point, we have the same logic as the existing transform that was restricted to folds of equal width zext/sext. This mostly solves PR42700: https://bugs.llvm.org/show_bug.cgi?id=42700 llvm-svn: 369519	2019-08-21 11:56:08 +00:00
Pavel Labath	82275ec51d	MinidumpYAML: move serialization code to MinidumpEmitter.cpp Summary: The code for serializing minidumps was living in MinidumpYAML.cpp so that it would be accessible from unit tests. While this had its advantages, it was also unfortunate because it broke symmetry with all other yaml2obj serializers. Fortunately, nowadays all of yaml2obj is a library, so we don't need to do anything special. This patch improves the code consistency by moving the serialization code to MinidumpEmitter.cpp to match the style used in other backends. It also removes the writeAsBinary entry point in favor of the more general convertYAML interface. This patch is just massaging the code a bit. There shouldn't be any functional change here. Reviewers: jhenderson, abrachet Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66474 llvm-svn: 369517	2019-08-21 11:30:48 +00:00
Petar Avramovic	7f581df649	[MIPS GlobalISel] NarrowScalar G_ZEXTLOAD and G_SEXTLOAD NarrowScalar G_ZEXTLOAD and G_SEXTLOAD to s32 for MIPS32. Differential Revision: https://reviews.llvm.org/D66205 llvm-svn: 369512	2019-08-21 09:43:20 +00:00
Petar Avramovic	e406aa791c	[MIPS GlobalISel] NarrowScalar G_ZEXT and G_SEXT NarrowScalar G_ZEXT and G_SEXT to s32 for MIPS32. Differential Revision: https://reviews.llvm.org/D66204 llvm-svn: 369511	2019-08-21 09:35:02 +00:00
Petar Avramovic	61bf2675b9	[MIPS GlobalISel] Consider type1 when legalizing shifts after r351882 r351882 allows different type for shift amount then result and value being shifted. Fix MIPS Legalizer rules to take r351882 into account. Differential Revision: https://reviews.llvm.org/D66203 llvm-svn: 369510	2019-08-21 09:31:29 +00:00
Petar Avramovic	5b4c5c2c54	[MIPS GlobalISel] NarrowScalar G_TRUNC Add NarrowScalar for G_TRUNC when NarrowTy is half the size of source. NarrowScalar G_TRUNC to s32 for MIPS32. Differential Revision: https://reviews.llvm.org/D66202 llvm-svn: 369509	2019-08-21 09:26:39 +00:00
Jeremy Morse	67443c3c6e	[DebugInfo] Avoid dropping location info across block boundaries LiveDebugValues propagates variable locations between blocks by creating new DBG_VALUE insts in the successors, then interpreting them when it passes back through the block at a later time. However, this flushes out any extra information about the location that LiveDebugValues holds: for example, connections between variable locations such as discussed in D65368. And as reported in PR42772 this causes us to lose track of the fact that a spill-location is actually a spill, not a register location. This patch fixes that by deferring the creation of propagated DBG_VALUEs until after propagation has completed: instead location propagation occurs only by sharing location ID numbers between blocks. Differential Revision: https://reviews.llvm.org/D66412 llvm-svn: 369508	2019-08-21 09:22:31 +00:00
Luke Cheeseman	71d38b3c62	[AArch64] Update MTE system register encodings The encodings for the system registers TFSRE0_EL1, TFSR_EL1 TFSR_EL2, TFSR_EL3 and TFSR_EL12 have been changed so that they consistently have CRn=5 and CRm=6 as per https://developer.arm.com/docs/ddi0487/latest. Differential Revision: https://reviews.llvm.org/D65442 llvm-svn: 369505	2019-08-21 09:09:56 +00:00
Serge Guelton	d1262a6e91	Be explicit about Windows coff name trailing character policy It's okay to not copy the trailing zero of a windows section/symbol name. This is compatible with strncpy behavior but gcc doesn't know that and throws an invalid warning. Encode this behavior in a proper function. Differential Revision: https://reviews.llvm.org/D66420 llvm-svn: 369501	2019-08-21 07:54:42 +00:00
Vitaly Buka	5d84a67ce0	Fix 'fall through' annotation llvm-svn: 369490	2019-08-21 04:05:34 +00:00
Amara Emerson	56606a4db3	[AArch64][GlobalISel] Add support for narrowScalar of G_ZEXT We do this by merging the source with the high bits set to 0. Differential Revision: https://reviews.llvm.org/D66181 llvm-svn: 369480	2019-08-21 00:12:37 +00:00
Sean Fertile	9467734a1c	Fix assert in XCOFFObjectWriter related to program code csects. Removed code that added program code csects to a collection as part of addressing review comments, but I failed to update an assert affected by the change before commiting. llvm-svn: 369471	2019-08-20 23:24:47 +00:00
Stefan Stipanovic	26121ae4d0	[Attributor] Liveness for internal functions. For an internal function, if all its call sites are dead, the body of the function is considered dead. Reviewers: jdoerfert, uenoku Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D66155 llvm-svn: 369470	2019-08-20 23:16:57 +00:00
Alexandre Ganea	21e9603030	[Sanitizer] Remove unused functions Differential Revision: https://reviews.llvm.org/D66503 llvm-svn: 369468	2019-08-20 22:56:40 +00:00
Daniel Sanders	a16bd4f9f2	[RISCV GlobalISel] Adding initial GlobalISel infrastructure Summary: Add an initial GlobalISel skeleton for RISCV. It can only run ir translator for `ret void`. Patch by Andrew Wei Reviewers: asb, sabuasal, apazos, lenary, simoncook, lewis-revill, edward-jones, rogfer01, xiangzhai, rovka, Petar.Avramovic, mgorny, dsanders Reviewed By: dsanders Subscribers: pzheng, s.egerton, dsanders, hiraditya, rbar, johnrusso, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, psnobl, benna, Jim, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65219 llvm-svn: 369467	2019-08-20 22:53:24 +00:00
Alina Sbirlea	2863721f05	[MemorySSA] Make Phi cleanups consistent. Summary: Make Phi cleanups consistent: remove self as a trivial Phi and recurse to potentially remove other trivial phis. Reviewers: george.burgess.iv Subscribers: Prazek, sanjoy.google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66454 llvm-svn: 369466	2019-08-20 22:47:58 +00:00
Jessica Paquette	e6c299b983	[AArch64][GlobalISel] Select logical_imm32 and logical_imm64 patterns Add a GlobalISel equivalent for the logical_imm32_XFORM and logical_imm64_XFORM SDNodeXForms in AArch64InstrFormats.td. - Add select-logical-imm.mir, which contains tests for each imported pattern. - Update select-pr32733.mir and select-scalar-shift-imm.mir, since they now select instructions of this form. Differential Revision: https://reviews.llvm.org/D66162 llvm-svn: 369465	2019-08-20 22:31:25 +00:00
Alina Sbirlea	1c528e8f1b	[MemorySSA] Fix existing phis when inserting defs. Summary: When inserting a new Def, and inserting Phis in the IDF when needed, also mark the already existing Phis in the IDF as non-optimized, since these may need fixing as well. In the test attached, there is a Phi in the IDF that happens to be trivial, and is wrongfully removed by the call to getLastDef that follows. This is a valid situation and the existing IDF Phis need to marked as "may need fixing" as well. Resolves PR43044. Reviewers: george.burgess.iv Subscribers: Prazek, sanjoy.google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66495 llvm-svn: 369464	2019-08-20 22:29:06 +00:00
Sean Fertile	89463fcfc7	Remove assert with tautological compare from XCOFFObjectWriter. Remove assert of 'Sec->getCSectType() <= 0x07u' added in r369454, since its always true. llvm-svn: 369462	2019-08-20 22:23:34 +00:00
Jessica Paquette	9a95e79b1b	[AArch64][GlobalISel] Select patterns which use shifted register operands This adds GlobalISel equivalents for the following from AArch64InstrFormats: - arith_shifted_reg32 - arith_shifted_reg64 And partial support for - logical_shifted_reg32 - logical_shifted_reg32 The only thing missing for the logical cases is support for rotates. Other than the missing support, the transformation is identical for the arithmetic shifted register and the logical shifted register. Lots of tests here: - Add select-arith-shifted-reg.mir to show that we correctly select add and sub instructions which use this pattern. - Add select-logical-shifted-reg.mir to cover patterns which are not shared between the arithmetic and logical cases. - Update addsub-shifted.ll to show that we correctly fold shifts into adds/subs. - Update eon.ll to show that we can select the eon instruction by folding xors. Differential Revision: https://reviews.llvm.org/D66163 llvm-svn: 369460	2019-08-20 22:18:06 +00:00
Craig Topper	ba375263e8	[DAGCombiner][X86] Teach visitCONCAT_VECTORS to combine (concat_vectors (concat_vectors X, Y), undef)) -> (concat_vectors X, Y, undef, undef) I also had to add a new combine to X86's combineExtractSubvector to prevent a regression. This helps our vXi1 code see the full concat operation and allow it optimize undef to a zero if there is already a zero in the concat. This helped us use a movzx instead of an AND in some of the tests. In those tests, one concat comes from SelectionDAGBuilder and the second comes from type legalization of v4i1->i4 bitcasts which uses an additional concat. Though these changes weren't my original motivation. I'm looking at making X86ISelLowering's narrowShuffle emit a concat_vectors instead of an insert_subvector since concat_vectors is more canonical during early DAG combine. This patch helps prevent a regression from my experiments with that. Differential Revision: https://reviews.llvm.org/D66456 llvm-svn: 369459	2019-08-20 22:12:50 +00:00
Reid Kleckner	22fb734907	Revert [WinEH] Allocate space in funclets stack to save XMM CSRs This reverts r367088 (git commit `9ad565f70e`) And the follow up fix r368631 / `e9865b9b31` llvm-svn: 369457	2019-08-20 22:08:57 +00:00
Sean Fertile	1e46d4cec5	Adds support for writing the .bss section for XCOFF object files. Adds Wrapper classes for MCSymbol and MCSection into the XCOFF target object writer. Also adds a class to represent the top-level sections, which we materialize in the ObjectWriter. executePostLayoutBinding will map all csects into the appropriate container depending on its storage mapping class, and map all symbols into their containing csect. Once all symbols have been processed we - Assign addresses and symbol table indices. - Calaculte section sizes. - Build the section header table. - Assign the sections raw-pointer value for non-virtual sections. Since the .bss section is virtual, writing the header table is enough to add support. Writing of a sections raw data, or of any relocations is not included in this patch. Testing is done by dumping the section header table, but it needs to be extended to include dumping the symbol table once readobj support for dumping auxiallary entries lands. Differential Revision: https://reviews.llvm.org/D65159 llvm-svn: 369454	2019-08-20 22:03:18 +00:00
Michael Liao	a99086dbdd	[Attributor] Remove unused variable. NFC. llvm-svn: 369444	2019-08-20 21:02:31 +00:00
Wenlei He	5adace352d	[AutoFDO] Make call targets order deterministic for sample profile Summary: StringMap is used for storing call target to frequency map for AutoFDO. However the iterating order of StringMap is non-deterministic, which leads to non-determinism in AutoFDO profile output. Now new API getSortedCallTargets and SortCallTargets are added for deterministic ordering and output. Roundtrip test for text profile and binary profile is added. Reviewers: wmi, davidxl, danielcdh Subscribers: hiraditya, mgrang, llvm-commits, twoh Tags: #llvm Differential Revision: https://reviews.llvm.org/D66191 llvm-svn: 369440	2019-08-20 20:52:00 +00:00
Craig Topper	3a2b08e6c9	[X86] Add a DAG combine to transform (i8 (bitcast (v8i1 (extract_subvector (v16i1 X), 0)))) -> (i8 (trunc (i16 (bitcast (v16i1 X))))) on KNL target Without AVX512DQ we don't have KMOVB so we can't really copy 8-bits of a k-register to a GPR. We have to copy 16 bits instead. We do this even if the DAG copy is from v8i1->v16i1. If we detect the (i8 (bitcast (v8i1 (extract_subvector (v16i1 X), 0)))) we should rewrite the types to match the copy we do support. By doing this, we can help known bits to propagate without losing the upper 8 bits of the input to the extract_subvector. This allows some zero extends to be removed since we have an isel pattern to use kmovw for (zero_extend (i16 (bitcast (v16i1 X))). Differential Revision: https://reviews.llvm.org/D66489 llvm-svn: 369434	2019-08-20 20:20:04 +00:00
Craig Topper	250951abf5	[X86] Add isel patterns for (i64 (zext (i8 (bitcast (v16i1 X))))) to use a KMOVW and a SUBREG_TO_REG. Similar for i8 and anyextend. We already had patterns for extending to i32 to take advantage of the impliciting zeroing of the upper bits of a 32-bit GPR that is done by KMOVW/KMOVB. But the extend might be all the way to i64, in which case the existing patterns would fail and we'd get a KMOVW/B followed by a MOVZX. By adding patterns for i64 we can use the fact that KMOVW/B zero the upper bits of the 32-bit GPR and the normal property that 32-bit GPR writes implicitly zero the upper 32-bits of the full 64-bit GPR. The anyextend patterns are slightly different since we don't care about the upper zeros. For the i8->i64 I think this avoids selecting the anyextend as a MOVZX to prevent a partial register issue that doesn't exist. For i16->i64 I think we would have just emitted an insert_subreg on top of the extract_subreg that the vXi16->i16 bitcast pattern emits. The register coalescer or peephole pass should combine those, but this saves that work and makes i8/16 consistent. llvm-svn: 369431	2019-08-20 19:43:48 +00:00
Martin Storsjo	514f3a122d	[TargetMachine] Don't try to create COFFSTUB references on windows on non-COFF This avoids spurious relocation types for windows/elf targets. Differential Revision: https://reviews.llvm.org/D66401 llvm-svn: 369426	2019-08-20 18:58:05 +00:00
Sam Clegg	cf2b8722d4	[WebAssembly][lld] Fix crash when applying relocations to debug sections Debug sections are special in that they can contain relocations against symbols that are not present in the final output (i.e. not live). However it is also possible to have R_WASM_TABLE_INDEX relocations against symbols that don't have a table index assigned (since they are not address taken by actual code. Fixes: https://github.com/emscripten-core/emscripten/issues/9023 Differential Revision: https://reviews.llvm.org/D66435 llvm-svn: 369423	2019-08-20 18:39:24 +00:00
Sanjay Patel	292b1087f4	[InstCombine] add helper function for icmp+zext/sext; NFC llvm-svn: 369421	2019-08-20 18:15:17 +00:00
Matt Arsenault	4b7fc85c0b	Revert "AMDGPU: Fix iterator error when lowering SI_END_CF" This reverts r367500 and r369203. This is causing various test failures. llvm-svn: 369417	2019-08-20 17:45:25 +00:00
Andrea Di Biagio	2e897a94f5	[X86][BtVer2] Use ReadAfterLd entries for the register operands of CMPXCHG. This is a follow-up of r369365. llvm-svn: 369412	2019-08-20 17:05:56 +00:00
Sanjay Patel	2e68e4d60e	[InstCombine] make fold for icmp with sext more efficient; NFC We were creating 2 instructions and relying on a subsequent fold to invert a not(icmp). Create the final icmp directly instead. llvm-svn: 369411	2019-08-20 17:03:22 +00:00
Craig Topper	22ac9f396f	[X86] Use isNullConstant instead of getConstantOperandVal == 0. NFC llvm-svn: 369410	2019-08-20 16:55:12 +00:00
Sam Tebbs	dcfc2d40d3	[ARM] Select vaddva This patch adds vaddva selection. Differential revision: https://reviews.llvm.org/D66410 llvm-svn: 369404	2019-08-20 16:33:34 +00:00
Aditya Nandakumar	08bd080872	[GlobalISel] Handle multiple registers in dbg.value intrinsic https://reviews.llvm.org/D66077 The value passed into dbg.value may relate to multiple registers, each of which need a DBG_VALUE. This fix calls MIRBuilder.buildDirectDbgValue for each register. Without this, IR passed in from flang-compiler/flang may fail an assertion in getOrCreateVReg. Patch by : peterwaller-arm. llvm-svn: 369403	2019-08-20 16:28:37 +00:00
Thomas Raoux	be699bf389	[CodeGen] Add a pass to do block predication on SSA machine IR. For targets requiring aggressive scheduling and/or software pipeline we need to apply predication before preRA scheduling. This adds a pass re-using the early if-cvt infrastructure but generating predicated instructions instead of speculatively executing instructions. It allows doing if conversion on blocks containing instructions with side-effects. The pass re-use the target hook from postRA if-conversion to let the target decide on the heuristic to apply. Differential Revision: https://reviews.llvm.org/D66190 llvm-svn: 369395	2019-08-20 15:54:59 +00:00
Sanjay Patel	a90ee0eeb6	[InstCombine] improve readability for icmp with cast folds; NFC 1. Update function name and stale code comments. 2. Use variable names that are less ambiguous. 3. Move operand checks into the function as early exits. llvm-svn: 369390	2019-08-20 14:56:44 +00:00
Jinsong Ji	cda334ba54	[BlockExtractor] Avoid assert with wrong line format Summary: When the line format is wrong, we may end up accessing out of bound memory. eg: the test with invalide line will cause assert. Assertion `idx < size()' failed The fix is to report fatal when we found mismatched line format. Reviewers: qcolombet, volkan Reviewed By: qcolombet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66444 llvm-svn: 369389	2019-08-20 14:46:02 +00:00
Andrea Di Biagio	16111d3795	[X86][BtVer2] Fix latency and throughput of atomic INC/DEC/NEG/NOT. Latency and throughput of LOCK INC/DEC/NEG/NOT is always 19cy. Number of uOPs is still 1. Differential Revision: https://reviews.llvm.org/D66469 llvm-svn: 369388	2019-08-20 14:31:27 +00:00
Sanjay Patel	f99d254aae	[InstCombine] simplify min/max of min/max with same operands (PR35607) This is the original integer variant requested in: https://bugs.llvm.org/show_bug.cgi?id=35607 As noted in the TODO and several similar TODOs around this block, we could do this in instsimplify, but then it would cost more because we would be trying to match min/max via ValueTracking in 2 different places. There are 4 commuted variants for each of smin/smax/umin/umax that are not matched here. There are also icmp predicate variants that are not included in the affected test file because they are already handled by instsimplify by folding the final icmp to true/false. https://rise4fun.com/Alive/3KVc Name: smax(smax, smin) %c1 = icmp slt i32 %x, %y %c2 = icmp slt i32 %y, %x %min = select i1 %c1, i32 %x, i32 %y %max = select i1 %c2, i32 %x, i32 %y %c3 = icmp sgt i32 %max, %min %r = select i1 %c3, i32 %max, i32 %min => %r = %max Name: smin(smax, smin) %c1 = icmp slt i32 %x, %y %c2 = icmp slt i32 %y, %x %min = select i1 %c1, i32 %x, i32 %y %max = select i1 %c2, i32 %x, i32 %y %c3 = icmp sgt i32 %max, %min %r = select i1 %c3, i32 %min, i32 %max => %r = %min Name: umax(umax, umin) %c1 = icmp ult i32 %x, %y %c2 = icmp ult i32 %y, %x %min = select i1 %c1, i32 %x, i32 %y %max = select i1 %c2, i32 %x, i32 %y %c3 = icmp ult i32 %min, %max %r = select i1 %c3, i32 %max, i32 %min => %r = %max Name: umin(umax, umin) %c1 = icmp ult i32 %x, %y %c2 = icmp ult i32 %y, %x %min = select i1 %c1, i32 %x, i32 %y %max = select i1 %c2, i32 %x, i32 %y %c3 = icmp ult i32 %min, %max %r = select i1 %c3, i32 %min, i32 %max => %r = %min llvm-svn: 369386	2019-08-20 13:39:17 +00:00
Igor Kudrin	59d5abaa71	[DWARF] Fix reading 64-bit DWARF type units. The type_offset field is 8 bytes long in DWARF64. The patch extends TypeOffset to uint64_t and fixes its reading. The patch also fixes checking of TypeOffset bounds as it was inaccurate in DWARF64 case. Differential Revision: https://reviews.llvm.org/D66465 llvm-svn: 369378	2019-08-20 12:52:32 +00:00
Alex Bradbury	7cb3cd34e8	[RISCV] Implement getExprForFDESymbol to ensure RISCV_32_PCREL is used for the FDE location Follow binutils in using RISCV_32_PCREL for the FDE initial location. As explained in the relevant binutils commit <`a6cbf936e3`>, the ADD/SUB pair of relocations is problematic in the presence of linker relaxation. This patch has the same end goal as D64715 but includes test changes and avoids adding a new global VariantKind to MCExpr.h (preferring RISCVMCExpr VKs like the rest of the RISC-V backend). Differential Revision: https://reviews.llvm.org/D66419 llvm-svn: 369375	2019-08-20 12:32:31 +00:00
Pavel Labath	51d7398f63	Recommit "MemoryBuffer: Add a missing error-check to getOpenFileImpl" This recommits r368977, which was reverted in r369027 due to test failures in lldb. The cause of this was different behavior of readNativeFileSlice on windows and unix. These have been addressed in r369269. The original commit message was: In case the function was called with a desired read size and the file was not an "mmap()" candidate, the function was falling back to a "pread()", but it was failing to check the result of that system call. This meant that the function would return "success" even though the read operation failed, and it returned a buffer full of uninitialized memory. Reviewers: rnk, dblaikie Subscribers: kristina, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66224 llvm-svn: 369370	2019-08-20 12:08:52 +00:00
Andrea Di Biagio	b1bdd97a26	[X86][Btver2] Fix latency and throughput of CMPXCHG instructions. On Jaguar, CMPXCHG has a latency of 11cy, and a maximum throughput of 0.33 IPC. Throughput is superiorly limited to 0.33 because of the implicit in/out dependency on register EAX. In the case of repeated non-atomic CMPXCHG with the same memory location, store-to-load forwarding occurs and values for sequent loads are quickly forwarded from the store buffer. Interestingly, the functionality in LLVM that computes the reciprocal throughput doesn't seem to know about RMW instructions. That functionality only looks at the "consumed resource cycles" for the throughput computation. It should be fixed/improved by a future patch. In particular, for RMW instructions, that logic should also take into account for the write latency of in/out register operands. An atomic CMPXCHG has a latency of ~17cy. Throughput is also limited to ~17cy/inst due to cache locking, which prevents other memory uOPs to start executing before the "lock releasing" store uOP. CMPXCHG8rr and CMPXCHG8rm are treated specially because they decode to one less macro opcode. Their latency tend to be the same as the other RR/RM variants. RR variants are relatively fast 3cy (but still microcoded - 5 macro opcodes). CMPXCHG8B is 11cy and unfortunately doesn't seem to benefit from store-to-load forwarding. That means, throughput is clearly limited by the in/out dependency on GPR registers. The uOP composition is sadly unknown (due to the lack of PMCs for the Integer pipes). I have reused the same mix of consumed resource from the other CMPXCHG instructions for CMPXCHG8B too. LOCK CMPXCHG8B is instead 18cycles. CMPXCHG16B is 32cycles. Up to 38cycles when the LOCK prefix is specified. Due to the in/out dependencies, throughput is limited to 1 instruction every 32 (or 38) cycles dependeing on whether the LOCK prefix is specified or not. I wouldn't be surprised if the microcode for CMPXCHG16B is similar to 2x microcode from CMPXCHG8B. So, I have speculatively set the JALU01 consumption to 2x the resource cycles used for CMPXCHG8B. The two new hasLockPrefix() functions are used by the btver2 scheduling model check if a MCInst/MachineInst has a LOCK prefix. Calls to hasLockPrefix() have been encoded in predicates of variant scheduling classes that describe lat/thr of CMPXCHG. Differential Revision: https://reviews.llvm.org/D66424 llvm-svn: 369365	2019-08-20 10:23:55 +00:00
Seiya Nuta	522377494b	[yaml2obj/obj2yaml][MachO] Allow setting custom section data Reviewers: alexshap, jhenderson, rupprecht Reviewed By: alexshap, jhenderson Subscribers: abrachet, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65799 llvm-svn: 369348	2019-08-20 08:49:07 +00:00
Fangrui Song	2682340cdf	[MC] Delete an overload of MCExpr::evaluateKnownAbsolute and its associated hack The hack dated back to 2010 (r121076) and was documented by r122144: // FIXME: The use if InSet = Addrs is a hack. Setting InSet causes us // absolutize differences across sections and that is what the MachO writer // uses Addrs for. llvm-svn: 369337	2019-08-20 07:42:04 +00:00
Fangrui Song	f182617352	[Attributor] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after r369331 llvm-svn: 369334	2019-08-20 07:21:43 +00:00
Craig Topper	1ada137854	[X86] Add back the -x86-experimental-vector-widening-legalization comand line flag and all associated code, but leave it enabled by default Google is reporting performance issues with the new default behavior and have asked for a way to switch back to the old behavior while we investigate and make fixes. I've restored all of the code that had since been removed and added additional checks of the command flag onto code paths that are not otherwise guarded by a check of getTypeAction. I've also modified the cost model tables to hopefully get us back to the previous costs. Hopefully we won't need to support this for very long since we have no test coverage of the old behavior so we can very easily break it. llvm-svn: 369332	2019-08-20 06:58:00 +00:00
Johannes Doerfert	12cbbab9d9	[Attributor] Create abstract attributes on-demand Before, we create the set of abstract attributes initially and then dealt with the fact hat a lookup could fail, e.g., return a nullptr. This patch will ensure we always return a valid object from a lookup, allowing us not only to remove the nullptr checks but also to grow the set of abstract attributes "in-flight" on-demand. One can now start from those that have the best chance of improving performance without the need to specify all they might depend on. While this introduces some boilerplate, the usage of attributes is much easier and cleaner now. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66276 llvm-svn: 369331	2019-08-20 06:15:50 +00:00
Johannes Doerfert	169af994bc	[Attributor][NFC] Cleanup statistics code llvm-svn: 369330	2019-08-20 06:09:56 +00:00
Johannes Doerfert	cfcca1a5b1	[Attributor] Use structured deduction for AADereferenceable Summary: This is analogous to D66128 but for AADereferenceable. We have the logic concentrated in the floating value updateImpl and we use the combiner helper classes for arguments and return values. The regressions will go away with "on-demand" attribute creation. Improvements are already visible in the existing tests. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66272 llvm-svn: 369329	2019-08-20 06:08:35 +00:00
Johannes Doerfert	b9b8791fed	[Attributor] Use structured deduction for AANonNull Summary: What D66126 did for AAAlign, this patch does for AANonNull. Agian, the logic becomes more concise and localized. Again, returned poiners are not annotated properly but that will not be an issue if this lands with the "on-demand" generation of attributes. First improvements due to the genericValueTraversal are already visible. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66128 llvm-svn: 369328	2019-08-20 06:02:39 +00:00
Johannes Doerfert	028b2aa56a	[Attributor] Fix the "clamp" operator The clamp operator should not take the known of the given state as the known is potentially based on assumed information. This also adds TODOs to guide improvements. llvm-svn: 369327	2019-08-20 05:57:01 +00:00
Thomas Raoux	a08e139d50	[NFC] Test commit, fix some comment spelling. llvm-svn: 369326	2019-08-20 05:21:27 +00:00
Karl-Johan Karlsson	40da6be2bd	[AsmPrinter] Remove const qualifier from EmitBasicBlockStart. Overriders may want to modify state in it. AMDGPU wants to, but has to make its members mutable in order to do so. Besides, EmitBasicBlockEnd is not const, so why should Start be? Patch by Bevin Hansson. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D66341 llvm-svn: 369325	2019-08-20 05:13:57 +00:00
Fangrui Song	ce21c3e12c	MCAsmMacro: add `#if !defined(NDEBUG) \|\| defined(LLVM_ENABLE_DUMP)` to some dump() declarations llvm-svn: 369324	2019-08-20 04:14:43 +00:00
Fangrui Song	e828ce1b88	[WebAssembly][MC] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after r369317 llvm-svn: 369318	2019-08-20 02:02:57 +00:00
Sam Clegg	ecc5e8084f	[WebAssembly][MC] Simplify WasmObjectWriter::recordRelocation. NFC. WebAssembly doesn't support PC relative relocation or relocation expressions that can't be reduced to single symbol. The only support for we have for fixups involving two symbols are when both symbols are defined and withing the same section. In this case evaluateFixup will already have evaluated to the expression before calling recordRelocation. llvm-svn: 369317	2019-08-20 00:33:50 +00:00
Dinar Temirbulatov	081c57989e	[SLP][NFC] Avoid repetitive calls to getSameOpcode() We can avoid repetitive calls getSameOpcode() for already known tree elements by keeping MainOp and AltOp in TreeEntry. Differential Revision: https://reviews.llvm.org/D64700 llvm-svn: 369315	2019-08-20 00:22:04 +00:00
Hubert Tong	71974b5175	[cmake] Link in LLVMPasses due to dependency by LLVMOrcJIT; NFC Summary: rL367756 (`f5c40cb`) increases the dependency of LLVMOrcJIT on LLVMPasses. In particular, symbols defined in LLVMPasses that are referenced by the destructor of `PassBuilder` are now referenced by LLVMOrcJIT through `Speculation.cpp.o`. We believe that referencing symbols defined in LLVMPasses in the destructor of `PassBuilder` is valid, and that adding to the set of such symbols is legitimate. To support such cases, this patch adds LLVMPasses to the set of libraries being linked when linking in LLVMOrcJIT causes such symbols from LLVMPasses to be referenced. Reviewers: Whitney, anhtuyen, pree-jackie Reviewed By: pree-jackie Subscribers: mgorny, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66441 llvm-svn: 369310	2019-08-19 23:12:48 +00:00
Anton Afanasyev	3f3a2573c3	[Support][Time profiler] Make FE codegen blocks to be inside frontend blocks Summary: Add `Frontend` time trace entry to `HandleTranslationUnit()` function. Add test to check all codegen blocks are inside frontend blocks. Also, change `--time-trace-granularity` option a bit to make sure very small time blocks are outputed to json-file when using `--time-trace-granularity=0`. This fixes http://llvm.org/pr41969 Reviewers: russell.gallop, lebedev.ri, thakis Reviewed By: russell.gallop Subscribers: vsapsai, aras-p, lebedev.ri, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D63325 llvm-svn: 369308	2019-08-19 22:58:26 +00:00
Matthias Gehre	5b3275e56f	[ORC] fix use-after-free detected by -Wreturn-stack-address Summary: llvm/lib/ExecutionEngine/Orc/Layer.cpp:53:12: warning: returning address of local temporary object [-Wreturn-stack-address] In ``` StringRef IRMaterializationUnit::getName() const { [...] return TSM.withModuleDo( [](const Module &M) { return M.getModuleIdentifier(); }); ``` `getModuleIdentifier()` returns a `const std::string &`, but the implicit return type of the lambda is `std::string` by value, and thus the returned `StringRef` refers to a temporary `std::string`. Detect by annotating `llvm::StringRef` with `[[gsl::Pointer]]`. Reviewers: lhames, sgraenitz Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66440 llvm-svn: 369306	2019-08-19 21:59:44 +00:00
Johannes Doerfert	8b962f2814	[CaptureTracker] Let subclasses provide dereferenceability information Summary: CaptureTracker subclasses might have better dereferenceability information which allows null pointer checks to be no-capturing. The first user will be D59922. Reviewers: sanjoy, hfinkel, aykevl, sstefan1, uenoku, xbolva00 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66371 llvm-svn: 369305	2019-08-19 21:56:38 +00:00
Johannes Doerfert	de7674ce76	Recommit "[Attributor] Fix: Do not partially resolve returned calls." This reverts commit `b1752f670f`. Fixed the issue with a different commit, reapply this one as it was, afaik, not broken. llvm-svn: 369303	2019-08-19 21:35:31 +00:00
Evgeniy Stepanov	55ccd16354	Refactor isPointerOffset (NFC). Summary: Simplify the API using Optional<> and address comments in https://reviews.llvm.org/D66165 Reviewers: vitalybuka Subscribers: hiraditya, llvm-commits, ostannard, pcc Tags: #llvm Differential Revision: https://reviews.llvm.org/D66317 llvm-svn: 369300	2019-08-19 21:08:04 +00:00
Vyacheslav Zakharin	f7229ac7d8	Fixed placement of llvm.global_dtors on Windows. Differential revision: https://reviews.llvm.org/D66373 llvm-svn: 369299	2019-08-19 21:07:03 +00:00
Evgeniy Stepanov	50affbe47f	MemTag: stack initializer merging. Summary: MTE provides instructions to update memory tags and data at the same time. This change makes use of those to generate more compact code for stack variable tagging + initialization. We collect memory store and memset instructions following an alloca or a lifetime.start call, and replace them with the corresponding MTE intrinsics. Since the intrinsics work on 16-byte aligned chunks, the stored values are combined as necessary. Reviewers: pcc, vitalybuka, ostannard Subscribers: srhines, javed.absar, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66167 llvm-svn: 369297	2019-08-19 20:47:09 +00:00
Benjamin Kramer	928071ae4e	[Support] Replace sys::Mutex with their standard equivalents. Only use a recursive mutex if it can be locked recursively. llvm-svn: 369295	2019-08-19 19:49:57 +00:00
Johannes Doerfert	056f1b5cc7	Re-apply fixed "[Attributor] Fix: Make sure we set the changed flag" This reverts commit `cedd0d9a6e`. Re-apply the original commit but make sure the variables are initialized (even if they are not used) so UBSan is not complaining. llvm-svn: 369294	2019-08-19 19:14:10 +00:00
Sam Clegg	19bf637eb1	[WebAssembly][MC] Allow empty assembly functions Differential Revision: https://reviews.llvm.org/D66434 llvm-svn: 369292	2019-08-19 19:04:54 +00:00
Alina Sbirlea	1a3fdaf6a6	[MemorySSA] Rename uses when inserting memory uses. Summary: When inserting uses from outside the MemorySSA creation, we don't normally need to rename uses, based on the assumption that there will be no inserted Phis (if Def existed that required a Phi, that Phi already exists). However, when dealing with unreachable blocks, MemorySSA will optimize away Phis whose incoming blocks are unreachable, and these Phis end up being re-added when inserting a Use. There are two potential solutions here: 1. Analyze the inserted Phis and clean them up if they are unneeded (current method for cleaning up trivial phis does not cover this) 2. Leave the Phi in place and rename uses, the same way as whe inserting defs. This patch use approach 2. Resolves first test in PR42940. Reviewers: george.burgess.iv Subscribers: Prazek, sanjoy.google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66033 llvm-svn: 369291	2019-08-19 18:57:40 +00:00
Craig Topper	a0d92c7262	[X86] Teach lowerV4I32Shuffle to only use broadcasts if the mask has more than one undef element. Prioritize shifts over broadcast in lowerV8I16Shuffle. The motivating case are the changes in vector-reduce-add.ll where we were doing extra work in the scalar domain instead of shuffling. There may be some one use check that needs to be looked into there, but this patch sidesteps the issue by avoiding broadcasts that aren't really broadcasting. Differential Revision: https://reviews.llvm.org/D66071 llvm-svn: 369287	2019-08-19 18:15:50 +00:00
Craig Topper	93c2787193	[CGP] Remove ModifiedDT from the makeBitReverse loop I don't think anything in this loop modifies the control flow and we don't restart any iteration after setting the flag. This code was added in http://reviews.llvm.org/D16893 but looking at the test case added there the code that caused the dominator tree to change was merging blocks with their predecessor not the bitreverse optimization. Differential Revision: https://reviews.llvm.org/D66366 llvm-svn: 369283	2019-08-19 18:02:24 +00:00
Pavel Labath	08c77b97c0	Filesystem/Windows: fix inconsistency in readNativeFileSlice API Summary: The windows version implementation of readNativeFileSlice, was trying to match the POSIX behavior of not treating EOF as an error, but it was only handling the case of reading from a pipe. Attempting to read past the end of a regular file returns a slightly different error code, which needs to be handled too. This patch adds ERROR_HANDLE_EOF to the list of error codes to be treated as an end of file, and adds some unit tests for the API. This issue was found while attempting to land D66224, which caused a bunch of lldb tests to start failing on windows. Reviewers: rnk, aganea Subscribers: kristina, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66344 llvm-svn: 369269	2019-08-19 15:40:49 +00:00
Roman Lebedev	edfaee0811	[TargetLowering] x s% C == 0 fold: vector divisor with INT_MIN handling Summary: The general fold is only valid for positive divisors. Which effectively means, it is invalid for `INT_MIN` divisors, and we currently bailout if we see them. But that is too strict, we can just fix-up the results. For that, let's do a second computation 'in parallel': ``` Name: srem -> and Pre: isPowerOf2(C) %o = srem i8 %X, C %r = icmp eq %o, 0 => %n = and i8 %X, C-1 %r = icmp eq %n, 0 ``` https://rise4fun.com/Alive/Sup And then just blend results: if the divisor was `INT_MIN`, pick the value we got via bit-test, else pick the value from general fold. There's interesting observation - `ISD::ROTR` is set to `LegalizeAction::Expand` before AVX512, so we should not treat `INT_MIN` divisor as even; and as it can be seen while `@test_srem_odd_even_one` improves on all run-lines, `@test_srem_odd_even_INT_MIN` only improves for AVX512. Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66300 llvm-svn: 369268	2019-08-19 15:01:42 +00:00
Serge Guelton	a023d6b7de	[nfc] Silent gcc warning llvm-svn: 369266	2019-08-19 14:40:33 +00:00
George Rimar	9d5e8a476f	[Object/COFF.h] - Stop returning std::error_code in a few methods. NFCI. There are 4 methods that return std::error_code now, though they do not have to because they are always succeed. I refactored them. This allows to simplify the code in tools a bit. llvm-svn: 369263	2019-08-19 14:32:23 +00:00
Jinsong Ji	0776da5236	[PeepholeOptimizer] Don't assume bitcast def always has input Summary: If we have a MI marked with bitcast bits, but without input operands, PeepholeOptimizer might crash with assert. eg: If we apply the changes in PPCInstrVSX.td as in this patch: [(set v4i32:$XT, (bitconvert (v16i8 immAllOnesV)))]>; We will get assert in PeepholeOptimizer. ``` llvm-lit llvm-project/llvm/test/CodeGen/PowerPC/build-vector-tests.ll -v llvm-project/llvm/include/llvm/CodeGen/MachineInstr.h:417: const llvm::MachineOperand &llvm::MachineInstr::getOperand(unsigned int) const: Assertion `i < getNumOperands() && "getOperand() out of range!"' failed. ``` The fix is to abort if we found out of bound access. Reviewers: qcolombet, MatzeB, hfinkel, arsenm Reviewed By: qcolombet Subscribers: wdng, arsenm, steven.zhang, wuzish, nemanjai, hiraditya, kbarton, MaskRay, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65542 llvm-svn: 369261	2019-08-19 14:19:04 +00:00
Alex Bradbury	1c1f8f215d	[RISCV] Don't force absolute FK_Data_X fixups to relocs The current behavior of shouldForceRelocation forces relocations for the majority of fixups when relaxation is enabled. This makes sense for fixups which incorporate symbols but is unnecessary for simple data fixups where the fixup target is already resolved to an absolute value. Differential Revision: https://reviews.llvm.org/D63404 Patch by Edward Jones. llvm-svn: 369257	2019-08-19 13:23:02 +00:00
David Stenberg	88df53e6ea	[DebugInfo] Allow bundled calls in the MIR's call site info Summary: Extend the MIR parser and writer so that the call site information can refer to calls that are bundled. Reviewers: aprantl, asowda, NikolaPrica, djtodoro, ivanbaev, vsk Reviewed By: aprantl Subscribers: arsenm, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D66145 llvm-svn: 369256	2019-08-19 12:41:22 +00:00
Sanjay Patel	b38bac3699	[SLP] reduce duplicated code; NFC llvm-svn: 369250	2019-08-19 11:39:56 +00:00
Fangrui Song	d9a071c54b	[MC] Simplify ELFObjectWriter::recordRelocation. NFC llvm-svn: 369248	2019-08-19 10:05:59 +00:00
Jeremy Morse	176bbd5cde	[DebugInfo] Make postra sinking of DBG_VALUEs subregister-safe Currently the machine instruction sinker identifies DBG_VALUE insts that also need to sink by comparing register numbers. Unfortunately this isn't safe, because (after register allocation) a DBG_VALUE may read a register that aliases what's being sunk. To fix this, identify the DBG_VALUEs that need to sink by recording & examining their register units. Register units gives us the following guarantee: "Two registers overlap if and only if they have a common register unit" [MCRegisterInfo.h] Thus we can always identify aliasing DBG_VALUEs if the set of register units read by the DBG_VALUE, and the register units of the instruction being sunk, intersect. (MachineSink already uses classes like "LiveRegUnits" for determining sinking validity anyway). The test added checks for super and subregister DBG_VALUE reads of a sunk copy being sunk as well. Differential Revision: https://reviews.llvm.org/D58191 llvm-svn: 369247	2019-08-19 09:53:07 +00:00
Sam Tebbs	f312c1ecf4	[ARM] Add support for MVE vaddv This patch adds vecreduce_add and the relevant instruction selection for vaddv. Differential revision: https://reviews.llvm.org/D66085 llvm-svn: 369245	2019-08-19 09:38:28 +00:00
David Green	2bfc13fde1	[ARM] MVE sext costs This adds some sext costs for MVE, taken from the length of assembly sequences that we currently generate. Differential Revision: https://reviews.llvm.org/D66010 llvm-svn: 369244	2019-08-19 09:13:22 +00:00
David L. Jones	cedd0d9a6e	Revert [Attributor] Fix: Make sure we set the changed flag This reverts r369159 (git commit `cbaf1fdea2`) r369160 caused a test to fail under UBSAN. See thread on llvm-commits. llvm-svn: 369241	2019-08-19 08:00:08 +00:00
Fangrui Song	b127771f7d	[MC] Delete unnecessary diagnostic: "No relocation available to represent this relative expression" Replace - error: No relocation available to represent this relative expression with + error: symbol 'undef' can not be undefined in a subtraction expression or + error: Cannot represent a difference across sections Keep !IsPcRel as an assertion after the two diagnostic checks are done. llvm-svn: 369239	2019-08-19 07:59:35 +00:00
David L. Jones	b1752f670f	Revert [Attributor] Fix: Do not partially resolve returned calls. This reverts r369160 (git commit `f72d9b1c97`) r369160 caused some tests to fail under UBSAN. See thread on llvm-commits. llvm-svn: 369236	2019-08-19 07:16:24 +00:00
Fangrui Song	38426c114f	[MC] Don't emit .symver redirected symbols to the symbol table GNU as keeps the original symbol in the symbol table for defined @ and @@, but suppresses it in other cases (@@@ or undefined). The original symbol is usually undesired: In a shared object, the original symbol can be localized with a version script, but it is hard to remove/localize in an archive: 1) a post-processing step removes the undesired original symbol 2) consumers (executable) of the archive are built with the version script Moreover, it can cause linker issues like binutils PR/18703 if the original symbol name and the base name of the versioned symbol is the same (both ld.bfd and gold have some code to work around defined @ and @@). In lld, if it sees f and f@v1: --version-script =(printf 'v1 {};') => f and f@v1 --version-script =(printf 'v1 { f; };') => f@v1 and f@@v1 It can be argued that @@@ added on 2000-11-13 corrected the @ and @@ mistake. This patch catches some more multiple version errors (defined @ and @@), and consistently suppress the original symbol. This addresses all the problems listed above. If the user wants other aliases to the versioned symbol, they can copy the original symbol to other symbol names with .set directive, e.g. .symver f, f@v1 # emit f@v1 but not f into .symtab .set f_impl, f # emit f_impl into .symtab llvm-svn: 369233	2019-08-19 06:17:30 +00:00
Craig Topper	ebb7ddc633	[X86] Teach lower1BitShuffle to match right shifts with upper zero elements on types that don't natively support KSHIFT. We can support these by widening to a supported type, then shifting all the way to the left and then back to the right to ensure that we shift in zeroes. llvm-svn: 369232	2019-08-19 05:45:39 +00:00
Craig Topper	e47437a6ef	[X86] Fix the lower1BitShuffle code added in r369215 to correctly pass the widened vector to the KSHIFT node. Not sure how to test this as we have tests that exercise this code, but nothing failed for the types not matching. Since all the k-registers use equivalent register classes everything just ends up working. llvm-svn: 369228	2019-08-19 04:08:44 +00:00
Craig Topper	269c6b1c15	[X86] Teach lower1BitShuffle to match KSHIFTR that doesn't use Zeroable and only relies on undef. This allows us to widen the type when the KSHIFTR instruction doesn't exist for the type. If we need to shift in zeroes into the upper elements we would need more work to guarantee zeroes when widening. llvm-svn: 369227	2019-08-19 04:08:40 +00:00
Craig Topper	2eb7951da3	[X86] Teach lower1BitShuffle to recognize padding a subvector with zeros with V2 as the source and V1 as the zero vector. Shuffle canonicalization can swap the sources so the zero vector might be V1 and the subvector that's being padded can be V2. llvm-svn: 369226	2019-08-19 00:39:22 +00:00
Craig Topper	2ee46c7c4b	[X86] Add a special case to LowerCONCAT_VECTORSvXi1 to handle concatenating zero vectors followed by one non-zero vector followed by undef vectors. For such a case we should only need a KSHIFTL, but we were previously generating a KSHIFTL followed by a KSHIFTR because we mistakenly believed we need to zero the undef elements. llvm-svn: 369224	2019-08-18 23:30:11 +00:00
Craig Topper	388b8dd94a	[X86] Replace uses of getZeroVector for vXi1 vectors with DAG.getConstant. vXi1 vectors don't need special handling. llvm-svn: 369222	2019-08-18 23:30:03 +00:00
Craig Topper	9e074c06fe	[X86] Improve lower1BitShuffle handling for KSHIFTL on narrow vectors. We can insert the value into a larger legal type and shift that by the desired amount. llvm-svn: 369215	2019-08-18 18:52:46 +00:00
Simon Pilgrim	63b3c56fca	Fix signed/unsigned comparison warning. NFCI. llvm-svn: 369213	2019-08-18 17:26:30 +00:00
Simon Pilgrim	fee2546f3f	[X86] isTargetShuffleEquivalent - add BUILD_VECTOR matching Add similar functionality to isShuffleEquivalent - if the mask elements don't match, try matching the BUILD_VECTOR scalars instead. As target shuffles need to handle SM_Sentinel values, this can get a bit tricky, so commit just adds actual mask element index handling - full SM_SentinelZero support will be added when the need arises. Also, enables support in matchVectorShuffleWithPACK llvm-svn: 369212	2019-08-18 17:15:26 +00:00
Simon Pilgrim	a66edd86e2	[X86] isTargetShuffleEquivalent - early out on illegal shuffle masks. NFCI. Simplifies shuffle mask comparisons by just bailing out if the shuffle mask has any out of range values - will make an upcoming patch much simpler. llvm-svn: 369211	2019-08-18 16:37:58 +00:00
Roman Lebedev	9b957d3321	[InstCombine] Cherry-pick NFC cleanups of foldShiftIntoShiftInAnotherHandOfAndInICmp() from D66383 llvm-svn: 369207	2019-08-18 12:26:33 +00:00
Craig Topper	74168ded03	[TargetLowering] Teach computeRegisterProperties to only widen v3i16/v3f16 vectors to the next power of 2 type if that's legal. These were recently made simple types. This restores their behavior back to something like their EVT legalization. We might be able to fix the code in type legalization where the assert was failing, but I didn't investigate too much as I had already looked at the computeRegisterProperties code during the review for v3i16/v3f16. Most of the test changes restore the X86 codegen back to what it looked like before the recent change. The test case in vec_setcc.ll and is a reduced version of the reproducer from the fuzzer. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=16490 llvm-svn: 369205	2019-08-18 06:28:06 +00:00
Craig Topper	f43106e341	[SelectionDAG] Add a node creation debug message to getMachineNode. llvm-svn: 369204	2019-08-18 06:28:00 +00:00
Matt Arsenault	479f3bdb2c	AMDGPU: Fix iterator error when lowering SI_END_CF If the instruction is the last in the block, there is no next instruction but the iteration still needs to look at the new block. llvm-svn: 369203	2019-08-18 00:20:44 +00:00
Matt Arsenault	cfdc2b9bd9	AMDGPU: Disambiguate v3f16 format in load/store tables Currently the searchable tables report the number of dwords. These round to the same number for 3 and 4 component d16 instructions. Change this to report the number of elements so this isn't ambiguous. llvm-svn: 369202	2019-08-18 00:20:43 +00:00
Craig Topper	31f829f0cd	[X86] Add a one use check to the combineStore code that handles v16i16->v16i8 truncate+store by extending to v16i32 and then emitting a v16i32->v16i8 truncstore. This prevent us from emitting a separate truncate and a truncating store instruction. llvm-svn: 369200	2019-08-17 22:46:15 +00:00
Yonghong Song	a8dad5c79b	[BPF] Fix bpf llvm-objdump issues. Commit https://reviews.llvm.org/D57939 ("[DWARF] Refactor RelocVisitor and fix computation of SHT_RELA-typed relocation entries) made a change for relocation resolution when operating on an object file. The change unfortunately broke BPF as given SymbolValue (S) and Addent (A), previously relocation is resolved to S + A and after the change, it is resolved to S This patch fixed the issue by resolving relocation correctly. It looks not all relocation resolution reaches here and I did not trace down exactly when. But I do find if the object file includes codes in two different ELF sections than default ".text", the above bug will be triggered. This patch included a trivial two function source code to demonstrate this issue. The relocation for .debug_loc is resolved incorrectly due to this and llvm-objdump cannot display source annotated assembly. Differential Revision: https://reviews.llvm.org/D66372 llvm-svn: 369199	2019-08-17 22:12:00 +00:00
Kang Zhang	b3d258fc44	[CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks Summary: Fix a bug of preducessors. In `block-placement` pass, it will create some patterns for unconditional we can do the simple early retrun. But the `early-ret` pass is before `block-placement`, we don't want to run it again. This patch is to do the simple early return to optimize the blocks at the last of `block-placement`. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D63972 llvm-svn: 369191	2019-08-17 14:37:05 +00:00
Paul Walker	26295676a4	Revert Revert [AArch64InstrInfo] Stop getInstSizeInBytes returning non-zero for meta instructions. This reverts r369132 (git commit `19301d75f0`) llvm-svn: 369186	2019-08-17 09:22:36 +00:00
Paul Walker	93c7a4a47c	Revert [AArch64InstrInfo] Stop getInstSizeInBytes returning non-zero for meta instructions. This reverts r369133 (git commit `2632c677f8`) llvm-svn: 369185	2019-08-17 09:22:28 +00:00
Alina Sbirlea	f92109dc01	[MemorySSA] Loop passes should mark MSSA preserved when available. This patch applies only to the new pass manager. Currently, when MSSA Analysis is available, and pass to each loop pass, it will be preserved by that loop pass. Hence, mark the analysis preserved based on that condition, vs the current `EnableMSSALoopDependency`. This leaves the global flag to affect only the entry point in the loop pass manager (in FunctionToLoopPassAdaptor). llvm-svn: 369181	2019-08-17 01:02:12 +00:00
Sanjay Patel	a53ad0e157	Revert r367891 - "[InstCombine] combine mul+shl separated by zext" This reverts commit `5dbb90bfe1`. As noted in the post-commit thread for r367891, this can create a multiply that is lowered to a libcall that may not exist. We need to improve the backend decomposition for integer multiply before trying to re-land this (if it's still worthwhile after doing the backend work). llvm-svn: 369174	2019-08-16 23:36:28 +00:00
Jian Cai	16fa8b0970	Reland "[ARM] push LR before __gnu_mcount_nc" This relands r369147 with fixes to unit tests. https://reviews.llvm.org/D65019 llvm-svn: 369173	2019-08-16 23:30:16 +00:00
Amara Emerson	57ec292ab8	[AArch64][GlobalISel] Fix an assertion during G_UNMERGE selection for s128 types. llvm-svn: 369172	2019-08-16 23:23:40 +00:00
Sanjay Patel	acceedb15f	[CodeGenPrepare] Fix use-after-free If OptimizeExtractBits() encountered a shift instruction with no operands at all, it would erase the instruction, but still return false. This previously didn’t matter because its caller would always return after processing the instruction, but https://reviews.llvm.org/D63233 changed the function’s caller to fall through if it returned false, which would then cause a use-after-free detectable by ASAN. This change makes OptimizeExtractBits return true if it removes a shift instruction with no users, terminating processing of the instruction. Patch by: @brentdax (Brent Royal-Gordon) Differential Revision: https://reviews.llvm.org/D66330 llvm-svn: 369168	2019-08-16 23:10:34 +00:00
Jordan Rupprecht	d0797ece46	Revert [X86] SimplifyDemandedVectorElts - attempt to recombine target shuffle using DemandedElts mask (reapplied) This reverts r368662 (git commit `1a8d790cf5`) The compile-time regression repro is in https://bugs.llvm.org/show_bug.cgi?id=43024 llvm-svn: 369167	2019-08-16 23:08:56 +00:00
Eli Friedman	eaff844fe9	[ARM] Preserve liveness in ARMConstantIslands. We currently don't use liveness information after this point, but it can be useful to catch bugs using -verify-machineinstrs, and optimizations could potentially use this information in the future. Differential Revision: https://reviews.llvm.org/D66319 llvm-svn: 369162	2019-08-16 22:20:14 +00:00
Johannes Doerfert	f72d9b1c97	[Attributor] Fix: Do not partially resolve returned calls. By partially resolving returned calls we did not record that they were not fully resolved which caused odd behavior down the line. We could also end up with some, but not all, returned values of the callee in the returned values map of the caller, another odd behavior we want to avoid. llvm-svn: 369160	2019-08-16 21:59:52 +00:00
Johannes Doerfert	cbaf1fdea2	[Attributor] Fix: Make sure we set the changed flag The flag was updated before we actually run the visitor callback so we might miss updates. llvm-svn: 369159	2019-08-16 21:55:01 +00:00
Johannes Doerfert	17cb918536	[CaptureTracking] Allow null to be in either icmp operand Summary: Before we required the comparison against null to be "canonical", hence null to be operand #1. This patch allows null to be in either operand, similar to the handling of loaded globals that follows. Reviewers: sanjoy, hfinkel, aykevl, sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66321 llvm-svn: 369158	2019-08-16 21:53:49 +00:00
Johannes Doerfert	6dedc78d9d	[Attributor] Add all missing attribute definitions/symbols As a preparation to "on-demand" abstract attribute generation we need implementations for all attributes (as they can be queried and then created on-demand where we now fail to find one). Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66129 llvm-svn: 369155	2019-08-16 21:31:11 +00:00
Jonas Devlieghere	f4bdbea02f	[RWMutex] Simplify availability check Check for the actual version number for the scenarios where the macOS version isn't available (__MAC_10_12). llvm-svn: 369154	2019-08-16 21:25:40 +00:00
Craig Topper	a17d1d2250	[X86] Use Register/MCRegister in more places in X86 This was a quick pass through some obvious places. I haven't tried the clang-tidy check. I also replaced the zeroes in getX86SubSuperRegister with X86::NoRegister which is the real sentinel name. Differential Revision: https://reviews.llvm.org/D66363 llvm-svn: 369151	2019-08-16 20:50:23 +00:00
Jian Cai	2d957cfe02	Revert "[ARM] push LR before __gnu_mcount_nc" This reverts commit `f4cf3b9593`. llvm-svn: 369149	2019-08-16 20:40:21 +00:00
Jian Cai	f4cf3b9593	[ARM] push LR before __gnu_mcount_nc Push LR register before calling __gnu_mcount_nc as it expects the value of LR register to be the top value of the stack on ARM32. Differential Revision: https://reviews.llvm.org/D65019 llvm-svn: 369147	2019-08-16 20:21:08 +00:00
Johannes Doerfert	234eda563d	[Attributor] Towards a more structured deduction pattern Summary: This is the first commit aiming to structure the attribute deduction. The base idea is that we have default propagation patterns as listed below on top of which we can add specific, e.g., context sensitive, logic. Deduction patterns used in this patch: - argument states are determined from call site argument states, see AAAlignArgument and AAArgumentFromCallSiteArguments. - call site argument states are determined as if they were floating values, see AAAlignCallSiteArgument and AAAlignFloating. - floating value states are determined by traversing the def-use chain and combining the states determined for the leaves, see AAAlignFloating and genericValueTraversal. - call site return states are determined from function return states, see AAAlignCallSiteReturned and AACallSiteReturnedFromReturned. - function return states are determined from returned value states, see AAAlignReturned and AAReturnedFromReturnedValues. Through this strategy all logic for alignment is concentrated in the AAAlignFloating::updateImpl method. Note: This commit works on its own but is part of a larger change that involves "on-demand" creation of abstract attributes that will participate in the fixpoint iteration. Without this part, we sometimes do not have an AAAlign abstract attribute to query, loosing information we determined before. All tests have appropriate FIXMEs and the information will be recovered once we added all parts. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66126 llvm-svn: 369144	2019-08-16 19:51:23 +00:00
Johannes Doerfert	66cf87e290	[Attributor][NFC] Introduce aliases for call site attributes Until we have call site specific liveness and/or value information there is no need to do call site specific deduction. Though, we need the symbols in follow up patches that make Attributor::getAAFor return a reference. llvm-svn: 369143	2019-08-16 19:49:00 +00:00
Johannes Doerfert	fe6dbadc0d	[Attributor] Introduce initialize calls and move code to keep attributes concise Summary: This patch should not change the behavior except that the added initialize methods might indicate an optimistic fixpoint earlier. The code movement is done to keep the attribute definitions in a single block where it makes sense. No functional changes intended there. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66258 llvm-svn: 369142	2019-08-16 19:36:17 +00:00
Lang Hames	9bb9a0c10b	[ORC] Remove some stray debugging output accidentally left in r368707 llvm-svn: 369141	2019-08-16 19:33:37 +00:00
Sanjay Patel	39eb2324f7	[InstCombine] canonicalize a scalar-select-of-vectors to vector select This pattern may arise more frequently with an enhancement to SLP vectorization suggested in PR42755: https://bugs.llvm.org/show_bug.cgi?id=42755 ...but we should handle this pattern to make things easier for the backend either way. For all in-tree targets that I looked at, codegen for typical vector sizes looks better when we change to a vector select, so this is safe to do without a cost model (in other words, as a target-independent canonicalization). For example, if the condition of the select is a scalar, we end up with something like this on x86: vpcmpgtd %xmm0, %xmm1, %xmm0 vpextrb $12, %xmm0, %eax testb $1, %al jne LBB0_2 ## %bb.1: vmovaps %xmm3, %xmm2 LBB0_2: vmovaps %xmm2, %xmm0 Rather than the splat-condition variant: vpcmpgtd %xmm0, %xmm1, %xmm0 vpshufd $255, %xmm0, %xmm0 ## xmm0 = xmm0[3,3,3,3] vblendvps %xmm0, %xmm2, %xmm3, %xmm0 Differential Revision: https://reviews.llvm.org/D66095 llvm-svn: 369140	2019-08-16 18:51:30 +00:00
Evgeniy Stepanov	187c63f145	Escape % in printf format string. Fixes branch-relax-block-size.mir on the ASan builder. llvm-svn: 369138	2019-08-16 18:23:54 +00:00
Guanzhong Chen	b1cb9fd1aa	[WebAssembly] Forbid use of EM_ASM with setjmp/longjmp Summary: We tried to support EM_ASM with setjmp/longjmp in binaryen. But with dynamic linking thrown into the mix, the code is no longer understandable and cannot be maintained. We also discovered more bugs in the EM_ASM handling code. To ensure maintainability and correctness of the binaryen code, EM_ASM will no longer be supported with setjmp/longjmp. This is probably fine since the support was added recently and haven't be published. Reviewers: tlively, sbc100, jgravelle-google, kripken Reviewed By: tlively, kripken Subscribers: dschuff, hiraditya, aheejin, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66356 llvm-svn: 369137	2019-08-16 18:21:08 +00:00
Simon Pilgrim	63b78b678b	[X86] resolveTargetShuffleInputs - add DemandedElts variant. NFCI. Nothing calls this yet, everything still goes through the non (all) DemandedElts wrapper. llvm-svn: 369136	2019-08-16 18:13:22 +00:00
Amara Emerson	c809230a69	[AArch64][GlobalISel] Lower G_SHUFFLE_VECTOR with 1 elt src and 1 elt mask. Again, it's weird that these are allowed. Since lowering support was added in r368709 we started crashing on compiling the neon intrinsics test in the test suite. This fixes the lowering to fold the 1 elt src/mask case into copies. llvm-svn: 369135	2019-08-16 18:06:53 +00:00
Simon Pilgrim	8ff1b7de4d	[X86] combineExtractWithShuffle - handle extract(truncate(x), 0) Eventually we need to generalize combineExtractWithShuffle to handle all faux shuffles and handle truncate (and X86ISD::VTRUNC etc.) there, but we're not ready yet (still creates nodes on the fly, incomplete DemandedElts support, bad use of recursive Depth limit). llvm-svn: 369134	2019-08-16 17:35:08 +00:00
Paul Walker	2632c677f8	[AArch64InstrInfo] Stop getInstSizeInBytes returning non-zero for meta instructions. Recommit with fixes for mac builders. Summary: AArch64InstrInfo::getInstSizeInBytes is incorrectly treating meta instructions (e.g. CFI_INSTRUCTION) as normal instructions and giving them a size of 4. This results in branch relaxation calculating block sizes wrong. Branch relaxation also considers alignment and thus a single mistake can result in later blocks being incorrectly sized even when they themselves do not contain meta instructions. The net result is we might not relax a branch whose destination is not within range. Reviewers: nickdesaulniers, peter.smith Reviewed By: peter.smith Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66337 > llvm-svn: 369111 llvm-svn: 369133	2019-08-16 17:29:53 +00:00
Paul Walker	19301d75f0	Revert [AArch64InstrInfo] Stop getInstSizeInBytes returning non-zero for meta instructions. This reverts r369111 (git commit `3ccee5f7c4`) llvm-svn: 369132	2019-08-16 17:29:42 +00:00
Vasileios Porpodas	1d254f3dae	[SLPVectorizer] Make the scheduler aware of the TreeEntry operands. Summary: The scheduler's dependence graph gets the use-def dependencies by accessing the operands of the instructions in a bundle. However, buildTree_rec() may change the order of the operands in TreeEntry, and the scheduler is currently not aware of this. This is not causing any functional issues currently, because reordering is restricted to the operands of a single instruction. Once we support operand reordering across multiple TreeEntries, as shown here: http://www.llvm.org/devmtg/2019-04/slides/Poster-Porpodas-Supernode_SLP.pdf , the scheduler will need to get the correct operands from TreeEntry and not from the individual instructions. In short, this patch: - Connects the scheduler's bundle with the corresponding TreeEntry. It introduces new TE and Lane fields in ScheduleData. - Moves the location where the operands of the TreeEntry are initialized. This used to take place in newTreeEntry() setting one operand at a time, but is now moved pre-order just before the recursion of buildTree_rec(). This is required because the scheduler needs to access both operands of the TreeEntry in tryScheduleBundle(). - Updates the scheduler to access the instruction operands through the TreeEntry operands instead of accessing the instruction operands directly. Reviewers: ABataev, RKSimon, dtemirbulatov, Ayal, dorit, hfinkel Reviewed By: ABataev Subscribers: hiraditya, llvm-commits, lebedev.ri, rcorcs Tags: #llvm Differential Revision: https://reviews.llvm.org/D62432 llvm-svn: 369131	2019-08-16 17:21:18 +00:00
Simon Pilgrim	3a8c698771	[X86] Alphabetize pass initialization definitions. NFCI. llvm-svn: 369126	2019-08-16 16:41:38 +00:00
Guozhi Wei	e03f6a1631	[CodeGen/Analysis] Intrinsic llvm.assume should not block tail call optimization In function Analysis.cpp:isInTailCallPosition, instructions between call and ret are checked to see if they block tail call optimization. If an instruction is an intrinsic call, only llvm.lifetime_end is allowed and other intrinsic functions block tail call. When compiling tcmalloc, we found llvm.assume between a hot function call and ret, it blocks the optimization. But llvm.assume doesn't generate instructions, it should not block tail call. Differential Revision: https://reviews.llvm.org/D66096 llvm-svn: 369125	2019-08-16 16:26:12 +00:00
Krzysztof Parzyszek	ac83aab035	[Hexagon] Generate min/max instructions for 64-bit vectors llvm-svn: 369124	2019-08-16 16:16:27 +00:00
Sander de Smalen	f28e1128d9	Relanding r368987 [AArch64] Change location of frame-record within callee-save area. Changes: There was a condition for `!NeedsFrameRecord` missing in the assert. The assert in question has changed to: + assert((!RPI.isPaired() \|\| !NeedsFrameRecord \|\| RPI.Reg2 != AArch64::FP \|\| + RPI.Reg1 == AArch64::LR) && + "FrameRecord must be allocated together with LR"); This addresses PR43016. llvm-svn: 369122	2019-08-16 15:42:28 +00:00
Evandro Menezes	05e9c2ac2e	[InstCombine] Simplify pow(2.0, itofp(y)) to ldexp(1.0, y) Simplify `pow(2.0, itofp(y))` to `ldexp(1.0, y)`. Differential revision: https://reviews.llvm.org/D65979 llvm-svn: 369120	2019-08-16 15:33:41 +00:00
Cyndy Ishida	5f865ecf06	[TextAPI] Update reader to be supported by lib/Object Summary: To be able to use the TextAPI/Reader for tbd file consumption (by libObject) it gets passed a MemoryBufferRef which isn't castable to MemoryBuffer. Updated the tests to expect that input as well. Reviewers: ributzka, steven_wu Reviewed By: steven_wu Subscribers: hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66147 llvm-svn: 369119	2019-08-16 15:30:48 +00:00
David Green	b782e61e47	[ARM] MVE sext of a load is free MVE also has some sext of loads, which will be free just as scalar instructions are. Differential Revision: https://reviews.llvm.org/D66008 llvm-svn: 369118	2019-08-16 15:13:37 +00:00
Roman Lebedev	16244fccfe	[InstCombine] Shift amount reassociation in bittest: trunc-of-shl (PR42399) Summary: This is continuation of D63829 / https://bugs.llvm.org/show_bug.cgi?id=42399 I thought naive pattern would solve my issue, but nope, it involved truncation, thus more folds needed.. This isn't really the fold i'm interested in, i need trunc-of-lshr, but i'we decided to start with `shl` because it's simpler. In this case, no extra legality checks are needed: https://rise4fun.com/Alive/CAb We should be careful about not increasing instruction count, since we need to produce `zext` because `and` is done in wider type. Reviewers: spatel, nikic, xbolva00 Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66057 llvm-svn: 369117	2019-08-16 15:10:41 +00:00
Luis Marques	fa06e95898	[RISCV] Convert registers from unsigned to Register Only in public interfaces that have not yet been converted should there remain registers with unsigned type. Differential Revision: https://reviews.llvm.org/D66252 llvm-svn: 369114	2019-08-16 14:27:50 +00:00
Paul Walker	3ccee5f7c4	[AArch64InstrInfo] Stop getInstSizeInBytes returning non-zero for meta instructions. Summary: AArch64InstrInfo::getInstSizeInBytes is incorrectly treating meta instructions (e.g. CFI_INSTRUCTION) as normal instructions and giving them a size of 4. This results in branch relaxation calculating block sizes wrong. Branch relaxation also considers alignment and thus a single mistake can result in later blocks being incorrectly sized even when they themselves do not contain meta instructions. The net result is we might not relax a branch whose destination is not within range. Reviewers: nickdesaulniers, peter.smith Reviewed By: peter.smith Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66337 llvm-svn: 369111	2019-08-16 14:17:52 +00:00
Simon Pilgrim	9da4989c52	[X86] Remove unused include. NFCI. We don't use anything from TargetOptions.h directly and its included via TargetLowering.h anyhow. llvm-svn: 369110	2019-08-16 14:05:46 +00:00
David Green	6e1ac42474	[ARM] Correct register for narrowing and widening MVE loads and stores. The widening and narrowing MVE instructions like VLDRH.32 are only permitted to use low tGPR registers. This means that if they are used for a stack slot, where the register used is only decided during frame setup, we need to be able to correctly pick a thumb1 register over a normal GPR. This attempts to add the required logic into eliminateFrameIndex and rewriteT2FrameIndex, only picking the FrameReg if it is a valid register for the operands register class, and picking a valid scratch register for the register class. Differential Revision: https://reviews.llvm.org/D66285 llvm-svn: 369108	2019-08-16 13:42:39 +00:00
Florian Hahn	403e85cbc5	Revert [CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks This reverts r368997 (git commit `2a903c0b67`) It looks like this commit adds invalid predecessors to MBBs. The example below fails the verifier after MachineBlockPlacement (run llc -verify-machineinstrs): @global.4 = external constant i8* declare i32 @zot(...) define i16* @snork.67() personality i8* bitcast (i32 (...)* @zot to i8) { bb: invoke void undef() to label %bb5 unwind label %bb4 bb4: ; preds = %bb %tmp = landingpad { i8, i32 } catch i8* null unreachable bb5: ; preds = %bb %tmp6 = load i32, i32* null, align 4 %tmp7 = icmp eq i32 %tmp6, 0 br i1 %tmp7, label %bb14, label %bb8 bb8: ; preds = %bb11, %bb5 invoke void undef() to label %bb9 unwind label %bb11 bb9: ; preds = %bb8 %tmp10 = invoke i16* undef() to label %bb14 unwind label %bb11 bb11: ; preds = %bb9, %bb8 %tmp12 = landingpad { i8, i32 } cleanup catch i8 bitcast (i8** @global.4 to i8) %tmp13 = icmp ult i64 undef, undef br i1 %tmp13, label %bb8, label %bb14 bb14: ; preds = %bb11, %bb9, %bb5 %tmp15 = phi i16 [ null, %bb5 ], [ null, %bb11 ], [ %tmp10, %bb9 ] ret i16* %tmp15 } llvm-svn: 369104	2019-08-16 13:19:29 +00:00
Bjorn Pettersson	9dddd26e31	[DAGCombiner] Add simple folds for SMULFIX/UMULFIX/SMULFIXSAT Summary: Add the following DAGCombiner folds for mulfix being one of SMULFIX/UMULFIX/SMULFIXSAT: (mulfix x, undef, scale) -> 0 (mulfix x, 0, scale) -> 0 Also added canonicalization of constants to RHS. Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66052 llvm-svn: 369103	2019-08-16 13:16:48 +00:00
David Green	8c2c5f5045	[ARM] Don't pretend we know how to generate MVE VLDn We don't yet know how to generate these instructions for MVE. And in the case of VLD3, we don't even have the instruction. For the moment don't tell the vectoriser that we have VLD4, just to end up serialising the results. Differential Revision: https://reviews.llvm.org/D66009 llvm-svn: 369101	2019-08-16 13:06:49 +00:00
Lewis Revill	d3f774d33c	[RISCV] Allow parsing of bare symbols with offsets This patch allows symbols followed by an expression for an offset to be parsed as bare symbols. Differential Revision: https://reviews.llvm.org/D57332 llvm-svn: 369097	2019-08-16 12:00:56 +00:00
Benjamin Kramer	31a47f9890	Revert "[CallGraph] Refine call graph for indirect calls with !callees metadata" This reverts commit r369025. Crashes clang, test case is on the mailing list. llvm-svn: 369096	2019-08-16 10:59:18 +00:00
Lewis Revill	7abf863f76	[RISCV] Lower inline asm constraint A for RISC-V This allows arguments with the constraint A to be lowered to input nodes for RISC-V, which implies a memory address stored in a register. This patch adds the minimal amount of code required to get operands with the right constraints to compile. https://reviews.llvm.org/D54296 llvm-svn: 369095	2019-08-16 10:28:34 +00:00
Simon Pilgrim	59894d4668	[SLPVectorizer] Silence null dereference warning. NFCI. cppcheck + MSVC analyzer both over zealously warn that we might dereference a null Bundle pointer - add an assertion to check for null to silence the warning, plus its a good idea to check that we succeeded in finding a schedule bundle anyway.... llvm-svn: 369094	2019-08-16 10:28:23 +00:00
Jeremy Morse	8b593480d3	[DebugInfo] Handle complex expressions with spills in LiveDebugValues In r369026 we disabled spill-recognition in LiveDebugValues for anything that has a complex expression. This is because it's hard to recover the complex expression once the spill location is baked into it. This patch re-enables spill-recognition and slightly adjusts the DBG_VALUE insts that LiveDebugValues tracks: instead of tracking the last DBG_VALUE for a variable, it tracks the last _unspilt_ DBG_VALUE. The spill-restore code is then able to access and copy the original complex expression; but the rest of LiveDebugValues has to be aware of the slight semantic shift, and produce a new spilt location if a spilt location is propagated between blocks. The test added produces an incorrect variable location (see FIXME), which will be the subject of future work. Differential Revision: https://reviews.llvm.org/D65368 llvm-svn: 369092	2019-08-16 10:04:17 +00:00
Tim Northover	22970d66be	AssumptionCache: remove old affected values after RAUW. If they're left in the cache then they can't be removed efficiently when the cache is notified to unlink a @llvm.assume call, and that can lead to values from different functions entirely remaining there. llvm-svn: 369091	2019-08-16 09:34:27 +00:00
Florian Hahn	75be1a9e58	[ValueTracking] Fix recurrence detection to check both PHI operands. Summary: Currently we fail to compute known bits for recurrences where the first incoming value is the start value of the recurrence. Instead of exiting the loop when the first incoming value is not the step of the recurrence, continue to check the second incoming value. The original code uses a loop to handle both cases, but incorrectly exits instead of continuing. Reviewers: lebedev.ri, spatel, nikic Reviewed By: lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66216 llvm-svn: 369088	2019-08-16 09:15:02 +00:00
Craig Topper	120cffccf8	[X86] Manually reimplement getTargetInsertSubreg in X86DAGToDAGISel::matchBitExtract so we can call insertDAGNode on the target constant. This is needed to maintain the topological sort order. Fixes PR42992. llvm-svn: 369084	2019-08-16 04:47:44 +00:00
Igor Kudrin	a33004aca7	Remove the temporary code. NFC. That should have been done in rL368156 but somehow was missed. llvm-svn: 369082	2019-08-16 03:40:04 +00:00
Nico Weber	ee96499a42	Revert r368987, it caused PR43016. llvm-svn: 369080	2019-08-16 02:21:21 +00:00
Jonas Devlieghere	de0ce98abe	[DebugLine] Don't try to guess the path style In r368879 I made an attempt to guess the path style from the files in the line table. After some consideration I now think this is a poor idea. This patch undoes that behavior and instead adds an optional argument to specify the path style. This allows us to make that decision elsewhere where we have more information. In case of LLDB based on the Unit. llvm-svn: 369072	2019-08-15 23:53:15 +00:00
Volkan Keles	0ae6006bee	[GlobalISel] CSEMIRBuilder: Add support for G_GEP Summary: This patch adds G_GEP to `shouldCSEOpc` so that it can be CSEd. It also refactors `translateGetElementPtr` by replacing `createGenericVirtualRegister` calls with types. Reviewers: aditya_nandakumar, arsenm, dsanders, paquette, aemerson Reviewed By: aditya_nandakumar Subscribers: wdng, rovka, javed.absar, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66316 llvm-svn: 369070	2019-08-15 23:45:45 +00:00
Eli Friedman	9b9a308452	[ARM][LowOverheadLoops] Fix generated code for "revert". Two issues: 1. t2CMPri shouldn't use CPSR if it isn't predicated. This doesn't really have any visible effect at the moment, but it might matter in the future. 2. The t2CMPri generated for t2WhileLoopStart might need to use a register that isn't LR. My team found this because we have a patch to track register liveness late in the pass pipeline. I'll look into upstreaming it to help catch issues like this earlier. Differential Revision: https://reviews.llvm.org/D66243 llvm-svn: 369069	2019-08-15 23:35:53 +00:00
Jonas Devlieghere	6d6babf745	[Support] Re-introduce the RWMutexImpl for macOS < 10.12 In r369018, Benjamin replaced the custom RWMutex implementation with their C++14 counterpart. Unfortunately, std::shared_timed_mutex is only available on macOS 10.12 and later. This prevents LLVM from compiling even on newer versions of the OS when you have an older deployment target. This patch reintroduced the old RWMutexImpl but guards it by the macOS availability macro. Differential revision: https://reviews.llvm.org/D66313 llvm-svn: 369064	2019-08-15 23:07:20 +00:00
Evgeniy Stepanov	75344955fc	Move isPointerOffset function to ValueTracking (NFC). Summary: To be reused in MemTag sanitizer. Reviewers: pcc, vitalybuka, ostannard Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66165 llvm-svn: 369062	2019-08-15 22:58:28 +00:00
Philip Reames	5c38ca3534	[SDAG] Minor code cleanup/standardization of atomic accessors [NFC] llvm-svn: 369057	2019-08-15 22:21:14 +00:00
Evgeniy Stepanov	10ce5f88d1	Add missing MIR serialization text for AArch64II::MO_TAGGED. Reviewers: pcc Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66312 llvm-svn: 369053	2019-08-15 22:03:55 +00:00
Alina Sbirlea	79ff20428e	[MemorySSA] Remove restrictive asserts. The verification I added has overly restrictive asserts. Unreachable blocks can have any incoming value in practice, after an update due to a "replaceAllUses" call when the repalced entry is LiveOnEntry. llvm-svn: 369050	2019-08-15 21:20:08 +00:00
Daniel Sanders	0c47611131	Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Partial reverts in: X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned& MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register PPCFastISel.cpp - No Register::operator-=() PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned& MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor Manual fixups in: ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned& HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register. PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned& Depends on D65919 Reviewers: arsenm, bogner, craig.topper, RKSimon Reviewed By: arsenm Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65962 llvm-svn: 369041	2019-08-15 19:22:08 +00:00
Krzysztof Parzyszek	8e987702b1	[Hexagon] Fix instruction selection for vselect v4i8 llvm-svn: 369040	2019-08-15 19:20:09 +00:00
Matt Arsenault	1f2b727298	MVT: Add v3i16/v3f16 vectors AMDGPU has some buffer intrinsics which theoretically could use this. Some of the generated tables include the 3 and 4 element vector versions of these rounded to 64-bits, which is ambiguous. Add these to help the table disambiguate these. Assertion change is for the path odd sized vectors now take for R600. v3i16 is widened to v4i16, which then needs to be promoted to v4i32. llvm-svn: 369038	2019-08-15 18:58:25 +00:00
Philip Reames	d202899431	[NFC] Add a couple of dump routines for RegisterPressure helper classes llvm-svn: 369037	2019-08-15 18:49:39 +00:00
Florian Hahn	3f2850bc60	[ValueTracking] Look through ptrmask intrinsics during getUnderlyingObject. Reviewers: nlopes, efriedma, hfinkel, sanjoy, aqjune, jdoerfert Reviewed By: jdoerfert Subscribers: jdoerfert, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61669 llvm-svn: 369036	2019-08-15 18:39:56 +00:00
Craig Topper	2a372ba534	[X86] Add custom type legalization for bitcasting mmx to v2i32/v4i16/v8i8 to use movq2dq instead of going through memory. llvm-svn: 369031	2019-08-15 18:23:37 +00:00
Benjamin Kramer	2e62396c2f	Link libpthread into LLVMCore.so After r369018 the compiler can inline pthread calls into users of RWMutex. llvm-svn: 369029	2019-08-15 18:06:30 +00:00
Pavel Labath	11d9e46f8e	Revert "MemoryBuffer: Add a missing error-check to getOpenFileImpl" This reverts commit r368977 because it broke a couple of tests in lldb. llvm-svn: 369027	2019-08-15 17:52:40 +00:00
Jeremy Morse	c476124bc8	[DebugInfo] Avoid crash from dropped fragments in LiveDebugValues This patch avoids a crash caused by DW_OP_LLVM_fragments being dropped from DIExpressions by LiveDebugValues spill-restore code. The appearance of a previously unseen fragment configuration confuses LDV, as documented in PR42773, and reproduced by the test function this patch adds (Crashes on a x86_64 debug build). To avoid this, on spill restore, we now use fragment information from the spilt-location-expression. In addition, when spilling, we now don't spill any DBG_VALUE with a complex expression, as it can't be safely restored and will definitely lead to an incorrect variable location. The discussion of this is in D65368. Differential Revision: https://reviews.llvm.org/D66284 llvm-svn: 369026	2019-08-15 17:49:46 +00:00
Mark Lacey	626ed22fbe	[CallGraph] Refine call graph for indirect calls with !callees metadata For indirect call sites having a small set of possible callees, !callees metadata can be used to indicate what those callees are. This patch updates the call graph and lazy call graph analyses so that they consider this metadata when encountering call sites. For the call graph, it adds a new external call graph node to the graph for each unique !callees metadata node. A call graph edge connects an indirect call site with the external node associated with the !callees metadata that is attached to it. And there is an edge from this external node to each of the callees indicated by the metadata. Similarly, for the lazy call graph, the patch adds Ref edges from a caller to the possible callees indicated by the metadata. The primary purpose of the patch is to facilitate iterating over the functions in a module such that all of the callees indicated by a given !callees metadata node will be visited prior to the functions containing call sites annotated by that node. This property is required by optimizations performing a bottom-up traversal of the SCC DAG. For example, the inliner can be made to inline through an indirect call. If the call site is annotated with !callees metadata, this patch ensures that the inliner will have visited all of the callees prior to the caller, allowing it to reliably compute the cost of inlining one or more of the potential callees. Original patch by @mssimpso. I've made some small changes to get it to apply, build, and pass tests on the top of tree, as well as some minor tweaks to formatting and functionality. Subscribers: mehdi_amini, hiraditya, llvm-commits, mssimpso Tags: #llvm Differential Revision: https://reviews.llvm.org/D39339 llvm-svn: 369025	2019-08-15 17:47:53 +00:00
Taewook Oh	213d8a9f13	[NewPM][PassInstrumentation] IR printing support for (Thin)LTO Summary: IR printing has not been correctly supported with (Thin)LTO if the new pass manager is enabled. Previously we only get outputs from backend(codegen) passes, as they are still under legacy pass manager even when the new pass manager is enabled. This patch addresses the issue and enables IR printing for optimization passes with new pass manager + (Thin)LTO setting. Reviewers: fedor.sergeev, philip.pfaffe Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66253 llvm-svn: 369024	2019-08-15 17:47:44 +00:00
Craig Topper	6eebd2bcd7	[X86] Improve cost model for subvector extraction of less than 128-bit vectors Now that we're using widening legalization. We need to improve our extract_subvector cost model for these types. This patch begins by modeling these as a subvector extract followed by a permute. I've left FIXMEs in the code for future improvements. Differential Revision: https://reviews.llvm.org/D65892 llvm-svn: 369022	2019-08-15 17:29:42 +00:00
Benjamin Kramer	8d3a1523dd	[Support] Base RWMutex on std::shared_timed_mutex (C++14) This should have the same semantics. We use std::shared_mutex instead on MSVC and C++17, std::shared_timed_mutex is less efficient than our custom implementation on Windows, std::shared_mutex should be faster. llvm-svn: 369018	2019-08-15 16:55:23 +00:00
Krzysztof Parzyszek	8460301d58	[Hexagon] Generate vector min/max for HVX llvm-svn: 369014	2019-08-15 16:13:17 +00:00
Jonas Devlieghere	0eaee545ee	[llvm] Migrate llvm::make_unique to std::make_unique Now that we've moved to C++14, we no longer need the llvm::make_unique implementation from STLExtras.h. This patch is a mechanical replacement of (hopefully) all the llvm::make_unique instances across the monorepo. llvm-svn: 369013	2019-08-15 15:54:37 +00:00
Andrea Di Biagio	3de2f0330f	[MCA] Slightly refactor class RetireControlUnit, and add the ability to override the mask of used buffered resources in class mca::Instruction. NFCI This patch teaches the RCU how to peek 'next' RCUTokens. A new method has been added to the RetireControlUnit class with the goal of minimizing the complexity of follow-up patches that will enable macro-fusion support in mca. This patch also adds method Instruction::getNumMicroOpcodes() to simplify common interactions with the instruction descriptor (a pattern quite common in some pipeline stages). Added the ability to override the default set of consumed scheduler resources (this -again- is to simplify future patches that add support for macro-op fusion). No functional change intended. llvm-svn: 369010	2019-08-15 15:27:40 +00:00
Simon Pilgrim	d4df81f463	Remove SmallBitVector.h include. NFCI. SmallBitVector/BitVector types aren't used at all in the cpp file. llvm-svn: 369008	2019-08-15 14:40:37 +00:00
Simon Pilgrim	983e9118a2	Remove BitVector.h include. NFCI. BitVector type isn't used at all in the cpp file. llvm-svn: 369007	2019-08-15 14:39:28 +00:00
Jinsong Ji	9fd81dc139	[PowerPC] Use xxleqv to set all one vector IMM(-1). Summary: xxspltib/vspltisb are 3 cycle PM instructions, xxleqv is 2 cycle ALU instruction. We should use xxleqv to set all one vectors. Reviewers: hfinkel, nemanjai, steven.zhang Subscribers: hiraditya, kbarton, MaskRay, shchenz, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65529 llvm-svn: 369006	2019-08-15 14:32:51 +00:00
Simon Pilgrim	ed804dad1e	[DAGCombine] MergeConsecutiveStores - fix cppcheck/MSVC extension warning. NFCI. Set the StartIdx type to size_t so that it matches the StoreNodes SmallVector size() and index types. Silences the MSVC analyzer warning that unsigned increment might overflow before exceeding size_t on 64-bit targets - this isn't likely to happen but it means we use consistent types and reduces the warning "noise" a little. llvm-svn: 368998	2019-08-15 13:07:14 +00:00
Kang Zhang	2a903c0b67	[CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks Summary: This patch has trigger a bug of r368339, and the r368339 has been reverted, So upstream this patch again. In `block-placement` pass, it will create some patterns for unconditional we can do the simple early retrun. But the `early-ret` pass is before `block-placement`, we don't want to run it again. This patch is to do the simple early return to optimize the blocks at the last of `block-placement`. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D63972 llvm-svn: 368997	2019-08-15 13:05:16 +00:00
David Green	3a99101812	[ARM] Fix alignment checks for BE VLDRH We need to allow any alignment at least 2, not just exactly 2, so that the big endian loads and stores can be selected successfully. I've also added extra BE testing for the load and store tests. Thanks to Oliver for the report. Differential Revision: https://reviews.llvm.org/D66222 llvm-svn: 368996	2019-08-15 12:54:47 +00:00
Sanjay Patel	57d459309d	[SDAG][x86] check for relaxed math when matching an FP reduction If the last step in an FP add reduction allows reassociation and doesn't care about -0.0, then we are free to recognize that computation as a reduction that may reorder the intermediate steps. This is requested directly by PR42705: https://bugs.llvm.org/show_bug.cgi?id=42705 and solves PR42947 (if horizontal math instructions are actually faster than the alternative): https://bugs.llvm.org/show_bug.cgi?id=42947 Differential Revision: https://reviews.llvm.org/D66236 llvm-svn: 368995	2019-08-15 12:43:15 +00:00
Andrea Di Biagio	7aa0dbb664	[MCA] Slightly refactor the logic in ResourceManager. NFCI This patch slightly changes the API in the attempt to simplify resource buffer queries. It is done in preparation for a patch that will enable support for macro fusion. llvm-svn: 368994	2019-08-15 12:39:55 +00:00
Florian Hahn	fd72bf21c9	[ValueTracking] Add MustPreserveNullness arg to functions analyzing calls. (NFC) Some uses of getArgumentAliasingToReturnedPointer and isIntrinsicReturningPointerAliasingArgumentWithoutCapturing require the calls/intrinsics to preserve the nullness of the argument. For alias analysis, the nullness property does not really come into play. This patch explicitly sets it to true. In D61669, the alias analysis uses will be switched to not require preserving nullness. Reviewers: nlopes, efriedma, hfinkel, sanjoy, aqjune, jdoerfert Reviewed By: jdoerfert Tags: #llvm Differential Revision: https://reviews.llvm.org/D64150 llvm-svn: 368993	2019-08-15 12:13:02 +00:00
David Green	0ff2296a49	[ARM] MVE predicate store patterns Stack loads and stores were already working, but direct stores were not. This adds the patterns for them, same as predicate loads. Differential Revision: https://reviews.llvm.org/D66213 llvm-svn: 368988	2019-08-15 10:41:42 +00:00
Sander de Smalen	643adb5576	[AArch64] Change location of frame-record within callee-save area. This patch changes the location of the frame-record (FP, LR) to the bottom of the callee-saved area. According to the AAPCS the location of the frame-record within the stackframe is unspecified (section 5.2.3 The Frame Pointer), so the compiler should be free to choose a different location. The reason for changing the location of the frame-record is to prepare the frame for allocating an SVE area below the callee-saves. This way the compiler can use the VL-scaled addressing modes to directly access SVE objects from the frame-pointer. : : \| stack \| \| stack \| \| args \| \| args \| +-------+ +-------+ \| x30 \| \| x19 \| \| x29 \| \| x20 \| FP -> \|- - - -\| \| x21 \| \| x19 \| ==> \| x22 \| \| x20 \| \|- - - -\| \| x21 \| \| x30 \| \| x22 \| \| x29 \| +-------+ +-------+ <- FP \|///////\| \|///////\| // realignment gap \|- - - -\| \|- - - -\| \|spills/\| \|spills/\| \| locals\| \| locals\| SP -> +-------+ +-------+ <- SP Things to point out: - The algorithm to find a paired register should be prevented from accidentally pairing some callee-saved register with LR that is not FP, since they should always be paired together when the frame has a frame-record. - For Darwin platforms the location of the frame-record is unchanged, since the unwind encoding does not allow for encoding this position dynamically and other tools currently depend on the former layout. Reviewers: efriedma, rovka, rengolin, thegameg, greened, t.p.northover Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D65653 llvm-svn: 368987	2019-08-15 10:34:16 +00:00
Florian Hahn	de1d6c8220	Add ptrmask intrinsic This patch adds a ptrmask intrinsic which allows masking out bits of a pointer that must be zero when accessing it, because of ABI alignment requirements or a restriction of the meaningful bits of a pointer through the data layout. This avoids doing a ptrtoint/inttoptr round trip in some cases (e.g. tagged pointers) and allows us to not lose information about the underlying object. Reviewers: nlopes, efriedma, hfinkel, sanjoy, jdoerfert, aqjune Reviewed by: sanjoy, jdoerfert Differential Revision: https://reviews.llvm.org/D59065 llvm-svn: 368986	2019-08-15 10:12:26 +00:00
Sven van Haastregt	0096d1938e	[Support] Fix Wundef warning llvm-svn: 368984	2019-08-15 10:05:22 +00:00
David Green	04f2f32869	[ARM] MVE trunc to i1 vectors This adds patterns for selecting trunc instructions from full vectors to i1's vectors. Differential Revision: https://reviews.llvm.org/D66201 llvm-svn: 368981	2019-08-15 09:26:51 +00:00
Pavel Labath	46bfdb956c	MemoryBuffer: Add a missing error-check to getOpenFileImpl Summary: In case the function was called with a desired read size and the file was not an "mmap()" candidate, the function was falling back to a "pread()", but it was failing to check the result of that system call. This meant that the function would return "success" even though the read operation failed, and it returned a buffer full of uninitialized memory. Reviewers: rnk, dblaikie Subscribers: kristina, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66224 llvm-svn: 368977	2019-08-15 08:20:15 +00:00
Dorit Nuzman	d57d73daed	[LV] fold-tail predication should be respected even with assume_safety assume_safety implies that loads under "if's" can be safely executed speculatively (unguarded, unmasked). However this assumption holds only for the original user "if's", not those introduced by the compiler, such as the fold-tail "if" that guards us from loading beyond the original loop trip-count. Currently the combination of fold-tail and assume-safety pragmas results in ignoring the fold-tail predicate that guards the loads, generating unmasked loads. This patch fixes this behavior. Differential Revision: https://reviews.llvm.org/D66106 Reviewers: Ayal, hsaito, fhahn llvm-svn: 368973	2019-08-15 07:12:14 +00:00
Craig Topper	1e246b20c0	[X86] Add isel pattern to match VZEXT_MOVL and a v2i64 scalar_to_vector bitcasted from x86mmx to MOVQ2DQ. We already had the pattern for just the scalar to vector and bitcast, but not the case where we wanted zeroes in the high half of the xmm. llvm-svn: 368972	2019-08-15 06:46:30 +00:00
Craig Topper	e6409602a1	[X86] Make sure load is non-volatile in the MMX_X86movdq2q (loadv2i64) isel pattern. This pattern will narrow the load so we should make sure its not volatile. llvm-svn: 368971	2019-08-15 06:46:26 +00:00
Craig Topper	dbcbbf5658	[X86] Remove unneeded isel pattern for v4f32->v4i32 fp_to_sint and conversion to MMX. fp_to_sint is turned into X86cvttp2si during isel preprocessing. The other redundant isel patterns were removed previously, but I missed this one because its in the MMX td file. llvm-svn: 368968	2019-08-15 05:52:02 +00:00
Craig Topper	a57734ba4e	[X86] Disable custom type legalization for v2i32/v4i16/v8i8->i64. The default legalization can take care of this. llvm-svn: 368967	2019-08-15 05:51:58 +00:00
Craig Topper	57286afe4e	[X86] Disable custom type legalization for v2i32/v4i16/v8i8->f64 bitcast. The generic legalization handles this in the same way so just use that. llvm-svn: 368966	2019-08-15 05:51:54 +00:00
Craig Topper	ba39fcd8c6	[X86] Remove some unreachable code from LowerBITCAST. llvm-svn: 368965	2019-08-15 05:51:50 +00:00
Michael Pozulp	9abf668c08	[llvm-objdump] Add warning messages if disassembly + source for problematic inputs Summary: Addresses https://bugs.llvm.org/show_bug.cgi?id=41905 Reviewers: jhenderson, rupprecht, grimar Reviewed By: jhenderson, grimar Subscribers: RKSimon, MaskRay, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62462 llvm-svn: 368963	2019-08-15 05:15:22 +00:00
Craig Topper	14f7560020	[X86] Remove some dead code and combine some repeated code that's left. If the width is 256 bits, then we must have AVX so the else here was unnecessary. Once that's removed then the >= 256 bit code is identical to the 128 bit code with a different VT so combine them. llvm-svn: 368956	2019-08-15 04:07:43 +00:00
Jonas Devlieghere	ed3b6d1bb2	Revert "Expose TailCallKind via the LLVM C API" This is failing on several build bots. Reverting as discussed in https://reviews.llvm.org/D66061. llvm-svn: 368953	2019-08-15 03:49:51 +00:00
Gor Nishanov	efe0093404	[coroutine] Fixes "cannot move instruction since its users are not dominated by CoroBegin" problem. Summary: Fixes https://bugs.llvm.org/show_bug.cgi?id=36578 and https://bugs.llvm.org/show_bug.cgi?id=36296. Supersedes: https://reviews.llvm.org/D55966 One of the fundamental transformation that CoroSplit pass performs before splitting the coroutine is to find which values need to survive between suspend and resume and provide a slot for them in the coroutine frame to spill and restore the value as needed. Coroutine frame becomes available once the storage for it was allocated and that point is marked in the pre-split coroutine with a llvm.coro.begin intrinsic. FE normally puts all of the user-authored code that would be accessing those values after llvm.coro.begin, however, sometimes instructions accessing those values would end up prior to coro.begin. For example, writing out a value of the parameter into the alloca done by the FE or instructions that are added by the optimization passes such as SROA when it rewrites allocas. Prior to this change, CoroSplit pass would try to move instructions that may end up accessing the values in the coroutine frame after CoroBegin. However it would run into problems (report_fatal_error) if some of the values would be used both in the allocation function (for example allocator is passed as a parameter to a coroutine) and in the use-authored body of the coroutine. To handle this case and to simplify the instruction moving logic, this change removes all of the instruction moving. Instead, we only change the uses of the spilled values that are dominated by coro.begin and leave other instructions intact. Before: ``` %var = alloca i32 %1 = getelementptr .. %var; ; will move this one after coro.begin %f = call i8* @llvm.coro.begin( ``` After: ``` %var = alloca i32 %1 = getelementptr .. %var; stays put %f = call i8* @llvm.coro.begin( ``` If we discover that there is a potential write into an alloca, prior to coro.begin we would copy its value from the alloca into the spill slot in the coroutine frame. Before: ``` %var = alloca i32 store .. %var ; will move this one after coro.begin %f = call i8* @llvm.coro.begin( ``` After: ``` %var = alloca i32 store .. %var ;stays put %f = call i8* @llvm.coro.begin( %tmp = load %var store %tmp, %spill.slot.for.var ``` Note: This change does not handle array allocas as that is something that C++ FE does not produce, but, it can be added in the future if need arises Reviewers: llvm-commits, modocache, ben-clayton, tks2103, rjmccall Reviewed By: modocache Subscribers: bartdesmet Differential Revision: https://reviews.llvm.org/D66230 llvm-svn: 368949	2019-08-15 00:48:51 +00:00
Robert Widmann	708c4605a1	Expose TailCallKind via the LLVM C API Summary: This exposes `CallInst`'s tail call kind via new `LLVMGetTailCallKind` and `LLVMSetTailCallKind` functions. The motivation for this is to be able to see `musttail` for languages that require mandatory tail calls for correctness. Today only the weaker `LLVMSetTail` is exposed and there is no way to set `GuaranteedTailCallOpt` via the C API. Reviewers: CodaFi, jyknight, deadalnix, rnk Reviewed By: CodaFi Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66061 llvm-svn: 368945	2019-08-14 23:54:35 +00:00
Johannes Doerfert	54f6be7b83	[Attributor] Try to fix "missing field 'RetInsts' initializer" warning http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/35674/steps/build_Lld/logs/stdio llvm-svn: 368938	2019-08-14 22:32:29 +00:00
Johannes Doerfert	5304b72a81	[Attributor][NFC] Make debug output consistent llvm-svn: 368931	2019-08-14 22:04:28 +00:00
Philip Reames	7b0515176b	[SCEV] Rename getMaxBackedgeTakenCount to getConstantMaxBackedgeTakenCount [NFC] llvm-svn: 368930	2019-08-14 21:58:13 +00:00
Johannes Doerfert	4395b31d99	[Attributor][NFC] Try to eliminate warnings (debug build + fall through) llvm-svn: 368928	2019-08-14 21:46:28 +00:00
Johannes Doerfert	17b578bc75	[Attributor][NFC] Introduce statistics macros for new positions llvm-svn: 368927	2019-08-14 21:46:25 +00:00
Craig Topper	e7ea06b7d2	[SelectionDAGBuilder] Teach gather/scatter getUniformBase to look through vector zeroinitializer indices in addition to scalar zeroes. llvm-svn: 368926	2019-08-14 21:38:56 +00:00
Johannes Doerfert	e1e844d6b0	[Attributor][NFC] Add merge/join/clamp operators to the IntegerState Differential Revision: https://reviews.llvm.org/D66146 llvm-svn: 368925	2019-08-14 21:35:20 +00:00
Johannes Doerfert	6a1274a52e	[Attributor] Use the AANoNull attribute directly in AADereferenceable Summary: Instead of constantly keeping track of the nonnull status with the dereferenceable information we can simply query the nonnull attribute whenever we need the information (debug + manifest). Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66113 llvm-svn: 368924	2019-08-14 21:31:32 +00:00
Amara Emerson	1222cfd5fe	[AArch64][GlobalISel] Custom selection for s8 load acquire. Implement this single atomic load instruction so that we can compile stack protector code. Differential Revision: https://reviews.llvm.org/D66245 llvm-svn: 368923	2019-08-14 21:30:30 +00:00
Johannes Doerfert	def9928204	[Attributor] Use liveness during the creation of AAReturnedValues Summary: As one of the first attributes, and one of the complex ones, AAReturnedValues was not using liveness but we filtered the result after the fact. This change adds liveness usage during the creation. The algorithm is also improved and shorter. The new algorithm will collect returned values over time using the generic facilities that work with liveness already, e.g., genericValueTraversal which does not look at dead PHI node predecessors. A test to show how this leads to better results is included. Note: Unresolved calls and resolved calls are now tracked explicitly. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66120 llvm-svn: 368922	2019-08-14 21:29:37 +00:00
Johannes Doerfert	9a1a1f96d9	[Attributor] Do not update or manifest dead attributes Summary: If the associated context instruction is assumed dead we do not need to update or manifest the state. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66116 llvm-svn: 368921	2019-08-14 21:25:08 +00:00
Johannes Doerfert	710ebb03ed	[Attributor] Use IRPosition consistently Summary: The next attempt to clean up the Attributor interface before we grow it further. Before, we used a combination of two values (associated + anchor) and an argument number (or -1) to determine a location. This was very fragile. The new system uses exclusively IR positions and we restrict the generation of IR positions to special constructor methods that verify internal constraints we have. This will catch misuse early. The auto-conversion, e.g., in getAAFor, is now performed through the SubsumingPositionIterator. This iterator takes an IR position and allows to visit all IR positions that "subsume" the given one, e.g., function attributes "subsume" argument attributes of that function. For a detailed breakdown see the class comment of SubsumingPositionIterator. This patch also introduces the IRPosition::getAttrs() to extract IR attributes at a certain position. The method knows how to look up in different positions that are equivalent, e.g., the argument position for call site arguments. We also introduce three new positions kinds such that we have all IR positions where attributes can be placed and one for "floating" values. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65977 llvm-svn: 368919	2019-08-14 21:18:01 +00:00
Dinar Temirbulatov	da0435a690	[SLP][NFC] Use pointers to address to ScalarToTreeEntry elements, instead of indexes. llvm-svn: 368906	2019-08-14 19:46:50 +00:00
Sanjay Patel	ecccf29e6c	[SDAG] move variable closer to use; NFC llvm-svn: 368905	2019-08-14 19:46:15 +00:00
Jan Korous	14230f9926	[Support][NFC] Fix error message for posix_spawn_file_actions_addopen failed call Seems like a copy-paste from couple lines above. llvm-svn: 368899	2019-08-14 18:30:18 +00:00
Philip Reames	6cca3ad43e	[RLEV] Rewrite loop exit values for multiple exit loops w/o overall loop exit count We already supported rewriting loop exit values for multiple exit loops, but if any of the loop exits were not computable, we gave up on all loop exit values. This patch generalizes the existing code to handle individual computable loop exits where possible. As discussed in the review, this is a starting point for figuring out a better API. The code is a bit ugly, but getting it in lets us test as we go. Differential Revision: https://reviews.llvm.org/D65544 llvm-svn: 368898	2019-08-14 18:27:57 +00:00
Matt Arsenault	dbc1f207fa	InferAddressSpaces: Move target intrinsic handling to TTI I'm planning on handling intrinsics that will benefit from checking the address space enums. Don't bother moving the address collection for now, since those won't need th enums. llvm-svn: 368895	2019-08-14 18:13:00 +00:00
Matt Arsenault	0eac2a2963	InferAddressSpaces: Remove unnecessary check for ConstantInt The IR is invalid if this isn't a constant since immarg was added. llvm-svn: 368893	2019-08-14 18:01:42 +00:00
Taewook Oh	df7022825c	[DebugInfo] Consider debug label scope has an extra lexical block file Summary: There are places where a case that debug label scope has an extra lexical block file is not considered properly. The modified test won't pass without this patch. Reviewers: aprantl, HsiangKai Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66187 llvm-svn: 368891	2019-08-14 17:58:45 +00:00
David Bolvansky	f94460d4b6	[SLC] Dereferenceable annonation - handle valid null pointers Reviewers: jdoerfert, reames Reviewed By: jdoerfert Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66161 llvm-svn: 368884	2019-08-14 17:15:20 +00:00
Jonas Devlieghere	c0a9b1edca	[DebugLine] Improve path handling. After switching over LLDB's line table parser to libDebugInfo, we noticed two regressions on the Windows bot. The problem is that when obtaining a file from the line table prologue, we append paths without specifying a path style. This leads to incorrect results on Windows for debug info containing Posix paths: 0x0000000000201000: /tmp\b.c, is_start_of_statement = TRUE This patch is an attempt to fix that by guessing the path style whenever possible. Differential revision: https://reviews.llvm.org/D66227 llvm-svn: 368879	2019-08-14 17:00:10 +00:00
David Bolvansky	0e0fbae1a4	[BuildLibCalls] Noalias annotation Summary: I think this is better solution than annotating callsites in IC/SLC. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: MaskRay, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66217 llvm-svn: 368875	2019-08-14 16:50:06 +00:00
Bill Wendling	cc2bebe039	Ignore indirect branches from callbr. Summary: We can't speculate around indirect branches: indirectbr and invoke. The callbr instruction needs to be included here. Reviewers: nickdesaulniers, manojgupta, chandlerc Reviewed By: chandlerc Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66200 llvm-svn: 368873	2019-08-14 16:44:07 +00:00
Thomas Lively	de0133eaa2	[WebAssembly] Stop unrolling SIMD shifts since they are fixed in V8 Summary: Fixes PR42973. Tests don't change because simd-arith.ll tests behavior on unimplemented-simd128, which does not include any temporary workarounds such as the one removed in this revision. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66166 llvm-svn: 368868	2019-08-14 16:24:37 +00:00
Craig Topper	3e44d96170	[X86] Use PSADBW for v8i8 addition reductions. Improves the 8 byte case from PR42674. Differential Revision: https://reviews.llvm.org/D66069 llvm-svn: 368864	2019-08-14 15:57:29 +00:00
Xiangling Liao	49661f94c8	[NFC][AIX] Change assertion Address one left comment on https://reviews.llvm.org/D63547. A minor change for assertion. Differential Revision: https://reviews.llvm.org/D63547 llvm-svn: 368860	2019-08-14 14:57:25 +00:00
Craig Topper	30d3e9c395	[X86][CostModel] Adjust the costs of ZERO_EXTEND/SIGN_EXTEND with less than 128-bit inputs Now that we legalize by widening, the element types here won't change. Previously these were modeled as the elements being widened and then the instruction might become an AND or SHL/ASHR pair. But now they'll become something like a ZERO_EXTEND_VECTOR_INREG/SIGN_EXTEND_VECTOR_INREG. For AVX2, when the destination type is legal its clear the cost should be 1 since we have extend instructions that can produce 256 bit vectors from less than 128 bit vectors. I'm a little less sure about AVX1 costs, but I think the ones I changed were definitely too high, but they might still be too high. Differential Revision: https://reviews.llvm.org/D66169 llvm-svn: 368858	2019-08-14 14:52:39 +00:00
Craig Topper	8c545168ee	[X86] Add llvm_unreachable to a switch that covers all expected values. llvm-svn: 368857	2019-08-14 14:51:19 +00:00
Jinsong Ji	e71db6584d	[PowerPC][NFC] Consolidate duplicate XX3Form_SetZero and XX3Form_Zero. Rename one to XX3Form_SameOp, remove the other one. llvm-svn: 368856	2019-08-14 14:16:26 +00:00
Jason Liu	8fc095d453	[AIX] Add call lowering for parameters that could pass onto FPRs Summary: This patch adds call lowering functionality to enable passing parameters onto floating point registers when needed. Differential Revision: https://reviews.llvm.org/D63654 llvm-svn: 368855	2019-08-14 14:13:11 +00:00
Pavel Labath	0d802a4923	Revert "raw_ostream: add operator<< overload for std::error_code" This reverts commit r368849, because it breaks some bots (e.g. llvm-clang-x86_64-win-fast). It turns out this is not as NFC as we had hoped, because operator== will consider two std::error_codes to be distinct even though they both hold "success" values if they have different categories. llvm-svn: 368854	2019-08-14 13:59:04 +00:00
Pavel Labath	40837e97b1	raw_ostream: add operator<< overload for std::error_code Summary: The main motivation for this is unit tests, which contain a large macro for pretty-printing std::error_code, and this macro is duplicated in every file that needs to do this. However, the functionality may be useful elsewhere too. In this patch I have reimplemented the existing ASSERT_NO_ERROR macros to reuse the new functionality, but I have kept the macro (as a one-liner) as it is slightly more readable than ASSERT_EQ(..., std::error_code()). Reviewers: sammccall, ilya-biryukov Subscribers: zturner, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65643 llvm-svn: 368849	2019-08-14 13:33:28 +00:00
Jeremy Morse	90c2794bfc	[DebugInfo] MCP: collect and update DBG_VALUEs encountered in local block MCP currently uses changeDebugValuesDefReg / collectDebugValues to find debug users of a register, however those functions assume that all DBG_VALUEs immediately follow the specified instruction, which isn't necessarily true. This is going to become very often untrue when we turn off CodeGenPrepare::placeDbgValues. Instead of calling changeDebugValuesDefReg on an instruction to change its debug users, in this patch we instead collect DBG_VALUEs of copies as we iterate over insns, and update the debug users of copies that are made dead. This isn't a non-functional change, because MCP will now update DBG_VALUEs that aren't immediately after a copy, but refer to the same register. I've hijacked the regression test for PR38773 to test for this new behaviour, an entirely new test seemed overkill. Differential Revision: https://reviews.llvm.org/D56265 llvm-svn: 368835	2019-08-14 12:20:02 +00:00
Fangrui Song	4c8deb6172	[IR] Simplify removeDeadConstantUsers. NFC llvm-svn: 368833	2019-08-14 11:38:45 +00:00
Simon Pilgrim	828a89e244	Fix "not all control paths return a value" MSVC warnings. NFCI. llvm-svn: 368831	2019-08-14 11:31:05 +00:00
Simon Pilgrim	3f40bdb558	Fix "not all control paths return a value" MSVC warning. NFCI. llvm-svn: 368830	2019-08-14 11:29:56 +00:00
Simon Pilgrim	8bba4798c2	Fix "not all control paths return a value" MSVC warnings. NFCI. llvm-svn: 368829	2019-08-14 11:29:16 +00:00
George Rimar	bcc00e1afb	Recommit r368812 "[llvm/Object] - Convert SectionRef::getName() to return Expected<>" Changes: no changes. A fix for the clang code will be landed right on top. Original commit message: SectionRef::getName() returns std::error_code now. Returning Expected<> instead has multiple benefits. For example, it forces user to check the error returned. Also Expected<> may keep a valuable string error message, what is more useful than having a error code. (Object\invalid.test was updated to show the new messages printed.) This patch makes a change for all users to switch to Expected<> version. Note: in a few places the error returned was ignored before my changes. In such places I left them ignored. My intention was to convert the interface used, and not to improve and/or the existent users in this patch. (Though I think this is good idea for a follow-ups to revisit such places and either remove consumeError calls or comment each of them to clarify why it is OK to have them). Differential revision: https://reviews.llvm.org/D66089 llvm-svn: 368826	2019-08-14 11:10:11 +00:00
Fangrui Song	8caa0aaa4d	[AsmPrinter] Delete redundant .type foo, @function when emitting an ifunc In MCAsmStreamer: .type foo,@function # <--- this is redundant .type foo,@gnu_indirect_function In MCELFStreamer, the latter STT_GNU_IFUNC overrides STT_FUNC. llvm-svn: 368823	2019-08-14 10:30:27 +00:00
Roman Lebedev	32f1e1a01d	[InstCombine] Refactor getFlippedStrictnessPredicateAndConstant() out of canonicalizeCmpWithConstant(), NFCI I'd like to use it elsewhere, hopefully without reinventing the wheel. No functional change intended so far. llvm-svn: 368820	2019-08-14 09:57:20 +00:00
George Rimar	468919e182	Revert r368812 "[llvm/Object] - Convert SectionRef::getName() to return Expected<>" It broke clang BB: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/16455 llvm-svn: 368813	2019-08-14 08:56:55 +00:00
George Rimar	a0c6a35714	[llvm/Object] - Convert SectionRef::getName() to return Expected<> SectionRef::getName() returns std::error_code now. Returning Expected<> instead has multiple benefits. For example, it forces user to check the error returned. Also Expected<> may keep a valuable string error message, what is more useful than having a error code. (Object\invalid.test was updated to show the new messages printed.) This patch makes a change for all users to switch to Expected<> version. Note: in a few places the error returned was ignored before my changes. In such places I left them ignored. My intention was to convert the interface used, and not to improve and/or the existent users in this patch. (Though I think this is good idea for a follow-ups to revisit such places and either remove consumeError calls or comment each of them to clarify why it is OK to have them). Differential revision: https://reviews.llvm.org/D66089 llvm-svn: 368812	2019-08-14 08:46:54 +00:00
Dorit Nuzman	491ca2425d	[LV] Fold-tail flag This is the compiler-flag equivalent of the Predicate pragma (https://reviews.llvm.org/D65197), to direct the vectorizer to fold the remainder-loop into the main-loop using predication. Differential Revision: https://reviews.llvm.org/D66108 Reviewers: Ayal, hsaito, fhahn, SjoerdMeije llvm-svn: 368801	2019-08-14 05:22:20 +00:00
David L. Jones	d4edd9d97e	Revert '[LICM] Make Loop ICM profile aware' and 'Fix pass dependency for LICM' This reverts r368526 (git commit `7e71aa24bc`) This reverts r368542 (git commit `cb5a90fd31`) llvm-svn: 368800	2019-08-14 04:50:33 +00:00
John McCall	a318c55073	Coroutines: adjust for SVN r358739 CallSite has been removed in favour of CallBase. Adjust the coroutine split to account for that. llvm-svn: 368798	2019-08-14 03:54:25 +00:00
John McCall	3bbf207fbc	Don't run a full verifier pass in coro-splitting's private pipeline. Potentially addresses rdar://49022293. llvm-svn: 368797	2019-08-14 03:54:18 +00:00
John McCall	5f60b68c68	Remove unreachable blocks before splitting a coroutine. The suspend-crossing algorithm is not correct in the presence of uses that cannot be reached on some successor path from their defs. llvm-svn: 368796	2019-08-14 03:54:13 +00:00
John McCall	2133feec93	Support swifterror in coroutine lowering. The support for swifterror allocas should work in all lowerings. The support for swifterror arguments only really works in a lowering with prototypes where you can ensure that the prototype also has a swifterror argument; I'm not really sure how it could possibly be made to work in the switch lowering. llvm-svn: 368795	2019-08-14 03:54:05 +00:00
John McCall	d47801e718	In coro.retcon lowering, don't explode if the optimizer messes around with the linkage of the prototype or the exact types of the yielded values. llvm-svn: 368793	2019-08-14 03:53:52 +00:00
John McCall	ac40483276	Fix a use-after-free in the coro.alloca treatment. llvm-svn: 368792	2019-08-14 03:53:46 +00:00
John McCall	62a5dde0c2	Add intrinsics for doing frame-bound dynamic allocations within a coroutine. These rely on having an allocator provided to the coroutine and thus, for now, only work in retcon lowerings. llvm-svn: 368791	2019-08-14 03:53:40 +00:00
John McCall	137b50f0c3	Guard dumps in the coro intrinsic validation logic behind NDEBUG checks. dump() is not guaranteed to be defined in all builds. llvm-svn: 368790	2019-08-14 03:53:31 +00:00
John McCall	3829214185	Generalize llvm.coro.suspend.retcon to allow an arbitrary number of arguments to be passed back to the continuation function. llvm-svn: 368789	2019-08-14 03:53:26 +00:00
John McCall	94010b2b7f	Extend coroutines to support a "returned continuation" lowering. A quick contrast of this ABI with the currently-implemented ABI: - Allocation is implicitly managed by the lowering passes, which is fine for frontends that are fine with assuming that allocation cannot fail. This assumption is necessary to implement dynamic allocas anyway. - The lowering attempts to fit the coroutine frame into an opaque, statically-sized buffer before falling back on allocation; the same buffer must be provided to every resume point. A buffer must be at least pointer-sized. - The resume and destroy functions have been combined; the continuation function takes a parameter indicating whether it has succeeded. - Conversely, every suspend point begins its own continuation function. - The continuation function pointer is directly returned to the caller instead of being stored in the frame. The continuation can therefore directly destroy the frame when exiting the coroutine instead of having to leave it in a defunct state. - Other values can be returned directly to the caller instead of going through a promise allocation. The frontend provides a "prototype" function declaration from which the type, calling convention, and attributes of the continuation functions are taken. - On the caller side, the frontend can generate natural IR that directly uses the continuation functions as long as it prevents IPO with the coroutine until lowering has happened. In combination with the point above, the frontend is almost totally in charge of the ABI of the coroutine. - Unique-yield coroutines are given some special treatment. llvm-svn: 368788	2019-08-14 03:53:17 +00:00
Aditya Nandakumar	c65ac865c3	[GlobalISel]: Fix lowering of G_Shuffle_vector where we pick up the wrong source index https://reviews.llvm.org/D66182 llvm-svn: 368781	2019-08-14 01:23:33 +00:00
Amara Emerson	2a312fc989	[AArch64][GlobalISel] RBS: Treat s128s like vectors when unmerging. The destinations should be FPRs (for now). Differential Revision: https://reviews.llvm.org/D66184 llvm-svn: 368775	2019-08-13 23:51:20 +00:00
Eli Friedman	b5eb3e1e82	[AArch64] Remove incorrect usage of MONonTemporal. This has no effect at the moment, but might matter if we try to implement non-temporal loads in the future. llvm-svn: 368770	2019-08-13 23:12:14 +00:00
Aditya Nandakumar	615eee6402	[GlobalISel]: Fix lowering of G_SHUFFLE_VECTOR with scalar sources https://reviews.llvm.org/D66171 llvm-svn: 368753	2019-08-13 21:49:11 +00:00
Xiangling Liao	a8c624a1c4	[AIX]Lowering global address for 32/64bit small/large code models This patch implements global address lowering for 32/64 bit with small/large code models. 1.For 32bit large code model on AIX, there are newly added pseudo opcode LWZtocL & ADDIStocHA32, the support of which on MC layer will be provided by future patches. 2.The default code model on AIX should be small code model. 3.Since AIX does not have medium code model, "report_fatal_error" when users specify it. Differential Revision: https://reviews.llvm.org/D63547 llvm-svn: 368744	2019-08-13 20:29:01 +00:00
Tim Renouf	10db641aab	[AMDGPU] Fix to 'Fold readlane from copy of SGPR or imm' That change (r363670) could leave a copy from vgpr to sgpr. Fixed. Differential Revision: https://reviews.llvm.org/D66133 Change-Id: I00c3fe6fda2e8e1e36f53195b881b1449c777ea4 llvm-svn: 368736	2019-08-13 18:57:55 +00:00
David Green	a655393f17	[ARM] Add MVE beats vector cost model The MVE architecture has the idea of "beats", where a vector instruction can be executed over several ticks of the architecture. This adds a similar system into the Arm backend cost model, multiplying the cost of all vector instructions by a factor. This factor essentially becomes the expected difference between scalar code and vector code, on average. MVE Vector instructions can also overlap so the a true cost of them is often lower. But equally scalar instructions can in some situations be dual issued, or have other optimisations such as unrolling or make use of dsp instructions. The default is chosen as 2. This should not prevent vectorisation is a most cases (as the vector instructions will still be doing at least 4 times the work), but it will help prevent over vectorising in cases where the benefits are less likely. This adds things so far to the obvious places in ARMTargetTransformInfo, and updates a few related costs like not treating float instructions as cost 2 just because they are floats. Differential Revision: https://reviews.llvm.org/D66005 llvm-svn: 368733	2019-08-13 18:12:08 +00:00
Wenlei He	d328954467	[llvm-profdata] Profile dump for compact binary format Summary: Fix "llvm-profdata show" so it can work with compact binary format profile. The change is to mark all functions "used" so SampleProfileReaderCompactBinary::read will read in all profiles available for dumping. The function names will be MD5 hash for compact binary format. Reviewers: wmi, davidxl, danielcdh Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65162 llvm-svn: 368731	2019-08-13 17:56:08 +00:00
Steven Wu	9e51fb6c57	[AutoUpgrader] Make ArcRuntime Autoupgrader more conservative Summary: This is a tweak to r368311 and r368646 which auto upgrades the calls to objc runtime functions to objc runtime intrinsics, in order to make sure that the auto upgrader does not trigger with up-to-date bitcode. It is possible for bitcode that is up-to-date to contain direct calls to objc runtime function and those are not inserted by compiler as part of ARC and they should not be upgraded. Now auto upgrader only triggers as when the old style of ARC marker is used so it is guaranteed that it won't trigger on update-to-date bitcode. This also means it won't do this upgrade for bitcode from llvm-8 and llvm-9, which preserves the behavior of those releases. Ideally they should be upgraded as well but it is more important to make sure AutoUpgrader will not trigger on up-to-date bitcode. Reviewers: ahatanak, rjmccall, dexonsmith, pete Reviewed By: dexonsmith Subscribers: hiraditya, jkorous, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66153 llvm-svn: 368730	2019-08-13 17:52:21 +00:00
Heejin Ahn	64517a6419	Use Register over unsigned in LateEHPrepare (NFC) Summary: While D65962 is pending for review, I landed D65475 that added one more use of `unsigned`. Changed it to `Register`. Reviewers: dsanders Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66064 llvm-svn: 368727	2019-08-13 17:35:44 +00:00
David Bolvansky	038d604f4f	[SimplifyLibCalls] Add noalias from known callsites Summary: Should be fine for memcpy, strcpy, strncpy. Reviewers: jdoerfert, efriedma Reviewed By: jdoerfert Subscribers: uenoku, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66135 llvm-svn: 368724	2019-08-13 17:18:46 +00:00
Nikita Popov	2a4f26b4c2	[ValueTracking] Improve reverse assumption inference Use isGuaranteedToTransferExecutionToSuccessor() instead of isSafeToSpeculativelyExecute() when seeing whether we can propagate the information in an assume backwards in isValidAssumeForContext(). The latter is more general - it also allows arbitrary loads/stores - and is also the condition we want: if our assume is guaranteed to execute, its condition not holding would be UB. Original patch by arielb1. Differential Revision: https://reviews.llvm.org/D37215 llvm-svn: 368723	2019-08-13 17:15:42 +00:00
Hubert Tong	0996705009	Reland r368691: "[AIX] Implement LR prolog/epilog save/restore" Trying again with the code changes (and not just the new test). Summary: This patch fixes the offsets of fields in the stack frame linkage save area for AIX. Reviewers: sfertile, hubert.reinterpretcast, jasonliu, Xiangling_L, xingxue, ZarkoCA, daltenty Reviewed By: hubert.reinterpretcast Subscribers: wuzish, nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64424 Patch by Chris Bowler! llvm-svn: 368721	2019-08-13 17:05:53 +00:00
David Tenty	9bf01e53a3	[NFC][AIX] Use assert instead of llvm_unreachable Addresses post-commit comments on https://reviews.llvm.org/D64825. Use assert instead of llvm_unreachable to check if invalid csect types are being generated. Use report_fatal_error on unimplemented XCOFF features. Differential Revision: https://reviews.llvm.org/D64825 llvm-svn: 368720	2019-08-13 17:04:51 +00:00
Jonas Devlieghere	57ae300562	[Dwarf] Complete the list of type tags. An incorrect verification error revealed that the list of type tags was incomplete. This patch adds the missing types by adding a tag kind to the Dwarf.def file, which is used by the `isType` function. A test was added for the original verification error. Differential revision: https://reviews.llvm.org/D65914 llvm-svn: 368718	2019-08-13 17:00:54 +00:00
David Bolvansky	90a30fdcc3	[SLC] Improve dereferenceable bytes annotation llvm-svn: 368715	2019-08-13 16:44:16 +00:00
Matt Arsenault	28215caa60	GlobalISel: Partially implement fewerElementsVector G_UNMERGE_VALUES Odd sized vectors aren't handled yet. llvm-svn: 368713	2019-08-13 16:26:28 +00:00
Momchil Velikov	114c37e72a	[ARM] Fix detection of duplicates when parsing reg list operands Differential Revision: https://reviews.llvm.org/D65957 llvm-svn: 368712	2019-08-13 16:13:00 +00:00
Momchil Velikov	f990e4a4c7	[ARM] Fix encoding of APSR in CLRM instruction The APSR is encoded by setting bit 15 in the register list of the CLRM instruction (cf. https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf). Differential Revision: https://reviews.llvm.org/D65873 llvm-svn: 368711	2019-08-13 16:12:46 +00:00
Matt Arsenault	690645bda0	GlobalISel: Implement lower for G_SHUFFLE_VECTOR llvm-svn: 368709	2019-08-13 16:09:07 +00:00
Lang Hames	52a34a78d9	[ORC] Refactor definition-generation, add a generator for static libraries. This patch replaces the JITDylib::DefinitionGenerator typedef with a class of the same name, and adds support for attaching a sequence of DefinitionGeneration objects to a JITDylib. This patch also adds a new definition generator, StaticLibraryDefinitionGenerator, that can be used to add symbols fom a static library to a JITDylib. An object from the static library will be added (via a supplied ObjectLayer reference) whenever a symbol from that object is referenced. To enable testing, lli is updated to add support for the --extra-archive option when running in -jit-kind=orc-lazy mode. llvm-svn: 368707	2019-08-13 16:05:18 +00:00
Matt Arsenault	0a04a06250	GlobalISel: Add more verifier checks for G_SHUFFLE_VECTOR llvm-svn: 368705	2019-08-13 15:52:21 +00:00
Matt Arsenault	5af9cf042f	GlobalISel: Change representation of shuffle masks Currently shufflemasks get emitted as any other constant, and you end up with a bunch of virtual registers of G_CONSTANT with a G_BUILD_VECTOR. The AArch64 selector then asserts on anything that doesn't fit this pattern. This isn't an ideal representation, and should avoid legalization and have fewer opportunities for a representational error. Rather than invent a new shuffle mask operand type, similar to what ShuffleVectorSDNode does, just track the original IR Constant mask operand. I don't completely like the idea of adding another link to the IR, but MIR is already quite dependent on IR constants already, and this will allow sharing the shuffle mask utility functions with the IR. llvm-svn: 368704	2019-08-13 15:34:38 +00:00
Roman Lebedev	676594305a	[CodeGen][SelectionDAG] More efficient code for X % C == 0 (SREM case) Summary: This implements an optimization described in Hacker's Delight 10-17: when `C` is constant, the result of `X % C == 0` can be computed more cheaply without actually calculating the remainder. The motivation is discussed here: https://bugs.llvm.org/show_bug.cgi?id=35479. One huge caveat: this signed case is only valid for positive divisors. While we can freely negate negative divisors, we can't negate `INT_MIN`, so for now if `INT_MIN` is encountered, we bailout. As a follow-up, it should be possible to handle that more gracefully via extra `and`+`setcc`+`select`. This passes llvm's test-suite, and from cursory(!) cross-examination the folds (the assembly) match those of GCC, and manual checking via alive did not reveal any issues (other than the `INT_MIN` case) Reviewers: RKSimon, spatel, hermord, craig.topper, xbolva00 Reviewed By: RKSimon, xbolva00 Subscribers: xbolva00, thakis, javed.absar, hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65366 llvm-svn: 368702	2019-08-13 14:57:37 +00:00
Roman Lebedev	f4de7eda4a	[TargetLowering][NFC] prepareUREMEqFold(): fixup comment The comment initially matched the code, but the code was incorrect and was fixed after the initial revert back back when it was introduced, but the comment was never updated. llvm-svn: 368701	2019-08-13 14:57:08 +00:00
Roman Lebedev	73f702ff19	[InstCombine] Non-canonical clamp-like pattern handling Summary: Given a pattern like: ``` %old_cmp1 = icmp slt i32 %x, C2 %old_replacement = select i1 %old_cmp1, i32 %target_low, i32 %target_high %old_x_offseted = add i32 %x, C1 %old_cmp0 = icmp ult i32 %old_x_offseted, C0 %r = select i1 %old_cmp0, i32 %x, i32 %old_replacement ``` it can be rewritten as more canonical pattern: ``` %new_cmp1 = icmp slt i32 %x, -C1 %new_cmp2 = icmp sge i32 %x, C0-C1 %new_clamped_low = select i1 %new_cmp1, i32 %target_low, i32 %x %r = select i1 %new_cmp2, i32 %target_high, i32 %new_clamped_low ``` Iff `-C1 s<= C2 s<= C0-C1` Also, `ULT` predicate can also be `UGE`; or `UGT` iff `C0 != -1` (+invert result) Also, `SLT` predicate can also be `SGE`; or `SGT` iff `C2 != INT_MAX` (+invert result) If `C1 == 0`, then all 3 instructions must be one-use; else at most either `%old_cmp1` or `%old_x_offseted` can have extra uses. NOTE: if we could reuse `%old_cmp1` as one of the comparisons we'll have to build, this could be less limiting. So there are two icmp's, each one with 3 predicate variants, so there are 9 fold variants: \| \| ULT \| UGE \| UGT \| \| SLT \| https://rise4fun.com/Alive/yIJ \| https://rise4fun.com/Alive/5BfN \| https://rise4fun.com/Alive/INH \| \| SGE \| https://rise4fun.com/Alive/hd8 \| https://rise4fun.com/Alive/Abk \| https://rise4fun.com/Alive/PlzS \| \| SGT \| https://rise4fun.com/Alive/VYG \| https://rise4fun.com/Alive/oMY \| https://rise4fun.com/Alive/KrzC \| {F9730206} This fold was brought up in https://reviews.llvm.org/D65148#1603922 by @dmgreen, and is needed to unblock that patch. This patch requires D65530. Reviewers: spatel, nikic, xbolva00, dmgreen Reviewed By: spatel Subscribers: hiraditya, llvm-commits, dmgreen Tags: #llvm Differential Revision: https://reviews.llvm.org/D65765 llvm-svn: 368687	2019-08-13 12:49:28 +00:00
Roman Lebedev	0410489a34	[InstCombine][NFC] Rename IsFreeToInvert() -> isFreeToInvert() for consistency As per https://reviews.llvm.org/D65530#inline-592325 llvm-svn: 368686	2019-08-13 12:49:16 +00:00
Roman Lebedev	2635c324da	[InstCombine] foldXorOfICmps(): don't give up on non-single-use ICmp's if all users are freely invertible Summary: This is rather unconventional.. As the comment there says, we don't have much folds for xor-of-icmps, we try to turn them into an and-of-icmps, for which we have plenty of folds. But if the ICmp we need to invert is not single-use - we give up. As discussed in https://reviews.llvm.org/D65148#1603922, we may have a non-canonical CLAMP pattern, with bit match and select-of-threshold that we'll potentially clamp. As it can be seen in `canonicalize-clamp-with-select-of-constant-threshold-pattern.ll`, out of all 8 variations of the pattern, only two are not canonicalized into the variant with and+icmp instead of bit math. The reason is because the ICmp we need to invert is not single-use - we give up. We indeed can't perform this fold at will, the general rule is that we should not increase instruction count in InstCombine, But we wouldn't end up increasing instruction count if we can adapt every other user to the inverted value. This way the `not` we create will get folded, and in the end the instruction count did not increase. For that, of course, we need to look at the users of a Value, which is again rather unconventional for InstCombine :S Thus i'm proposing to be a little bit more insistive in `foldXorOfICmps()`. The alternatives would be to not create that `not`, but add duplicate code to manually invert all users; or to add some even less general combine to handle some more specific pattern[s]. Reviewers: spatel, nikic, RKSimon, craig.topper Reviewed By: spatel Subscribers: hiraditya, jdoerfert, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65530 llvm-svn: 368685	2019-08-13 12:49:06 +00:00
Simon Pilgrim	e7b350a5d1	[X86] XFormVExtractWithShuffleIntoLoad - handle shuffle mask scaling If the target shuffle mask is from a wider type, attempt to scale the mask so that the extraction can attempt to peek through. Fixes the regression mentioned in rL368662 Reapplying this as rL368308 had to be reverted as part of rL368660 to revert rL368276 llvm-svn: 368663	2019-08-13 11:11:42 +00:00
Simon Pilgrim	1a8d790cf5	[X86] SimplifyDemandedVectorElts - attempt to recombine target shuffle using DemandedElts mask (reapplied) If we don't demand all elements, then attempt to combine to a simpler shuffle. At the moment we can only do this if Depth == 0 as combineX86ShufflesRecursively uses Depth to track whether the shuffle has really changed or not - we'll need to change this before we can properly start merging combineX86ShufflesRecursively into SimplifyDemandedVectorElts. The insertps-combine.ll regression is because XFormVExtractWithShuffleIntoLoad can't see through shuffles of different widths - this will be fixed in a follow-up commit. Reapplying this as rL368307 had to be reverted as part of rL368660 to revert rL368276 llvm-svn: 368662	2019-08-13 10:51:39 +00:00
Hans Wennborg	5390d25f2b	Revert r368276 "[TargetLowering] SimplifyDemandedBits - call SimplifyMultipleUseDemandedBits for ISD::EXTRACT_VECTOR_ELT" This introduced a false positive MemorySanitizer warning about use of uninitialized memory in a vectorized crc function in Chromium. That suggests maybe something is not right with this transformation. See https://crbug.com/992853#c7 for a reproducer. This also reverts the follow-up commits r368307 and r368308 which depended on this. > This patch attempts to peek through vectors based on the demanded bits/elt of a particular ISD::EXTRACT_VECTOR_ELT node, allowing us to avoid dependencies on ops that have no impact on the extract. > > In particular this helps remove some unnecessary scalar->vector->scalar patterns. > > The wasm shift patterns are annoying - @tlively has indicated that the wasm vector shift codegen are to be refactored in the near-term and isn't considered a major issue. > > Differential Revision: https://reviews.llvm.org/D65887 llvm-svn: 368660	2019-08-13 09:33:25 +00:00
David Bolvansky	39130314fe	[SimplifyLibCalls] Add dereferenceable bytes from known callsites Summary: int mm(char a, char b) { return memcmp(a,b,16); } Currently: define dso_local i32 @mm(i8* nocapture readonly %a, i8* nocapture readonly %b) local_unnamed_addr #1 { entry: %call = tail call i32 @memcmp(i8* %a, i8* %b, i64 16) ret i32 %call } After patch: define dso_local i32 @mm(i8* nocapture readonly %a, i8* nocapture readonly %b) local_unnamed_addr #1 { entry: %call = tail call i32 @memcmp(i8* dereferenceable(16) %a, i8* dereferenceable(16) %b, i64 16) ret i32 %call } Reviewers: jdoerfert, efriedma Reviewed By: jdoerfert Subscribers: javed.absar, spatel, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66079 llvm-svn: 368657	2019-08-13 09:11:49 +00:00
Qiu Chaofan	4fb99a3330	[PowerPC] Fix ICE when truncating some vectors The legalizer would hit an assertion on PowerPC platform when truncating a vector whose size is not power of 2. This patch is to add a check to prevent vectors with such odd-size elements from being custom lowered. Reviewed By: Hal Finkel Differential Revision: https://reviews.llvm.org/D65261 llvm-svn: 368654	2019-08-13 07:53:29 +00:00
Amara Emerson	72c81b94cb	[AArch64][GlobalISel] Replace explicit vreg creation with implicit using SrcOp. NFC. llvm-svn: 368653	2019-08-13 06:55:32 +00:00
Amara Emerson	e14c91b71a	[GlobalISel] Make the InstructionSelector instance non-const, allowing state to be maintained. Currently we can't keep any state in the selector object that we get from subtarget. As a result we have to plumb through all our variables through multiple functions. This change makes it non-const and adds a virtual init() method to allow further state to be captured for each target. AArch64 makes use of this in this patch to cache a call to hasFnAttribute() which is expensive to call, and is used on each selection of G_BRCOND. Differential Revision: https://reviews.llvm.org/D65984 llvm-svn: 368652	2019-08-13 06:26:59 +00:00
Serge Pavlov	2a09b9acfb	Added unit tests to check supported rounding modes Also added fixed misspelled metadata name. Differential Revision: https://reviews.llvm.org/D66073 llvm-svn: 368650	2019-08-13 05:21:18 +00:00
Aditya Nandakumar	70fdfed45f	[GlobalISel]: Add KnownBits for G_XOR https://reviews.llvm.org/D66119 llvm-svn: 368648	2019-08-13 04:32:33 +00:00
Yevgeny Rouban	8b996dc16e	Verifier: check prof branch_weights This patch is to check some of constraints on !pro branch_weights metadata: https://llvm.org/docs/BranchWeightMetadata.html Reviewers: asbirlea, reames, chandlerc Reviewed By: reames Differential Revision: https://reviews.llvm.org/D61179 llvm-svn: 368647	2019-08-13 04:03:38 +00:00
Akira Hatanaka	3c7c053145	Do not call replaceAllUsesWith to upgrade calls to ARC runtime functions to intrinsic calls This fixes a bug in r368311. It turns out that the ARC runtime functions in the IR can have pointer parameter types that are not i8* or i8**. Instead of RAUWing normal functions with intrinsics, manually bitcast the arguments before passing them to the intrinsic functions and bitcast the return value back to the type of the original call instruction. This recommits r368634, which was reverted in r368637. The loop in the patch was iterating over uses of a function and deleting function calls inside it, which caused bots to crash. rdar://problem/54125406 Differential Revision: https://reviews.llvm.org/D66047 llvm-svn: 368646	2019-08-13 01:23:06 +00:00
Stanislav Mekhanoshin	438315bf69	[AMDGPU] Fix msan failure in printf lowering llvm-svn: 368645	2019-08-13 01:07:27 +00:00
Daniel Sanders	a58a27513b	Eliminate implicit Register->unsigned conversions in VirtRegMap. NFC Summary: This was mostly an experiment to assess the feasibility of completely eliminating a problematic implicit conversion case in D61321 in advance of landing that* but it also happens to align with the goal of propagating the use of Register/MCRegister instead of unsigned so I believe it makes sense to commit it. The overall process for eliminating the implicit conversions from Register/MCRegister -> unsigned was to: 1. Add an explicit conversion to support genuinely required conversions to unsigned. For example, using them as an index for IndexedMap. Sadly it's not possible to have an explicit and implicit conversion to the same type and only deprecate the implicit one so I called the explicit conversion get(). 2. Temporarily annotate the implicit conversion to unsigned with LLVM_ATTRIBUTE_DEPRECATED to make them visible 3. Eliminate implicit conversions by propagating Register/MCRegister/ explicit-conversions appropriately 4. Remove the deprecation added in 2. * My conclusion is that it isn't feasible as there's too much code to update in one go. Depends on D65678 Reviewers: arsenm Subscribers: MatzeB, wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65685 llvm-svn: 368643	2019-08-13 00:55:24 +00:00
Akira Hatanaka	c109808982	Revert "Do not call replaceAllUsesWith to upgrade calls to ARC runtime functions" This reverts commit r368634 because it broke a bot. llvm-svn: 368637	2019-08-13 00:20:36 +00:00
Eric Christopher	4acb4ee767	Move findBBwithCalls to the file it's used in to avoid unused function warnings. llvm-svn: 368636	2019-08-13 00:05:01 +00:00
Akira Hatanaka	6817ce24c1	Do not call replaceAllUsesWith to upgrade calls to ARC runtime functions to intrinsic calls This fixes a bug in r368311. It turns out that the ARC runtime functions in the IR can have pointer parameter types that are not i8* or i8**. Instead of RAUWing normal functions with intrinsics, manually bitcast the arguments before passing them to the intrinsic functions and bitcast the return value back to the type of the original call instruction. rdar://problem/54125406 llvm-svn: 368634	2019-08-12 23:53:23 +00:00
Stanislav Mekhanoshin	5b32752d10	[AMDGPU] removed unused functions from printf lowering Differential Revision: https://reviews.llvm.org/D66117 llvm-svn: 368633	2019-08-12 23:32:35 +00:00
Reid Kleckner	e9865b9b31	[WinEH] Fix catch block parent frame pointer offset r367088 made it so that funclets store XMM registers into their local frame instead of storing them to the parent frame. However, that change forgot to update the parent frame pointer offset for catch blocks. This change does that. Fixes crashes when an exception is rethrown in a catch block that saves XMMs, as described in https://crbug.com/992860. llvm-svn: 368631	2019-08-12 23:02:00 +00:00
Juergen Ributzka	b978c51ce4	[TextAPI] Fix & Add tests for tbd files version 3. - There was a simple typo in TextStub code that prevented version 3 files to be read. - Included a version 3 unit test to handle the differences in the format. - Also a typo in Error.h inside the comments. https://reviews.llvm.org/D66041 This patch is from Cyndy Ishida <cyndy_ishida@apple.com>. llvm-svn: 368630	2019-08-12 23:01:07 +00:00
Daniel Sanders	3836874dbb	[risc-v] Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Depends on D65919 Reviewers: lenary Subscribers: jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision for full review was: https://reviews.llvm.org/D65962 llvm-svn: 368629	2019-08-12 22:41:02 +00:00
Daniel Sanders	5ae66e56cf	[aarch64] Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Manual fixups in: AArch64InstrInfo.cpp - genFusedMultiply() now takes a Register* instead of unsigned* AArch64LoadStoreOptimizer.cpp - Ternary operator was ambiguous between Register/MCRegister. Settled on Register Depends on D65919 Reviewers: aemerson Subscribers: jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision for full review was: https://reviews.llvm.org/D65962 llvm-svn: 368628	2019-08-12 22:40:53 +00:00
Daniel Sanders	05c145d694	[webassembly] Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Reviewers: aheejin Subscribers: jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision for whole review: https://reviews.llvm.org/D65962 llvm-svn: 368627	2019-08-12 22:40:45 +00:00
Stanislav Mekhanoshin	ef8f1c473a	[AMDGPU] Use PredicateControl in MIMGBaseOpcode. NFC. This is infrastructural, will be needed for future work. For some reason it was only used in MIMG_NoSampler, while needed everywere we use MIMGBaseOpcode if we want to use predicates. Differential Revision: https://reviews.llvm.org/D66115 llvm-svn: 368626	2019-08-12 22:32:21 +00:00
Johannes Doerfert	26e58466de	[Attributor] Use the cached data layout directly This removes the warning by using the new DL member. It also simplifies the code. llvm-svn: 368625	2019-08-12 22:21:09 +00:00
Craig Topper	e07e593782	[X86] Allow combineTruncateWithSat to use pack instructions for i16->i8 without AVX512BW. We need AVX512BW to be able to truncate an i16 vector. If we don't have that we have to extend i16->i32, then trunc, i32->i8. But we won't be able to remove the min/max if we do that. At least not without more special handling. llvm-svn: 368623	2019-08-12 22:18:23 +00:00
Johannes Doerfert	acc8079f8e	[Attributor][NFC] Add IntegerState raw_ostream << operator llvm-svn: 368622	2019-08-12 22:07:34 +00:00
Johannes Doerfert	ece8190497	[Attributor] Make the InformationCache an Attributor member The functionality is not changed but the interfaces are simplified and repetition is removed. llvm-svn: 368621	2019-08-12 22:05:53 +00:00
Aditya Nandakumar	55371e697c	[GISel]: Fix a bug in KnownBits where we should have been using SizeInBits https://reviews.llvm.org/D66039 We were using getIndexSize instead of getIndexSizeInBits(). Added test case for G_PTRTOINT and G_INTTOPTR. llvm-svn: 368618	2019-08-12 21:28:12 +00:00
Craig Topper	0761a38e8a	[X86] Remove unreachable code from LowerTRUNCATE. NFC All three 256->128 bit cases were already handled above. Noticed while looking at the coverage report. llvm-svn: 368609	2019-08-12 19:26:45 +00:00
Craig Topper	a3605baaff	[X86] Add a paranoia type check to the code that detects AVG patterns from truncating stores. If we're after type legalize, we should make sure we won't create a store with an illegal type when we separate the AVG pattern from the truncating store. I don't know of a way to fail for this today. Just noticed while I was in the vicinity. llvm-svn: 368608	2019-08-12 19:26:37 +00:00
Craig Topper	1b02909847	[X86] Simplify creation of saturating truncating stores. We just need to check if the truncating store is legal instead of going through isSATValidOnAVX512Subtarget. llvm-svn: 368607	2019-08-12 19:26:30 +00:00
Craig Topper	3f4e9b156d	[X86] Replace call to isTruncStoreLegalOrCustom with isTruncStoreLegal. NFC We have no custom trunc stores on X86. llvm-svn: 368606	2019-08-12 19:26:22 +00:00
Wenlei He	4b99b58a84	[ThinLTO][AutoFDO] Fix memory corruption due to race condition from thin backends Summary: This commit fixed a race condition from multi-threaded thinLTO backends that causes non-deterministic memory corruption for a data structure used only by AutoFDO with compact binary profile. GUIDToFuncNameMap, a static data member of type DenseMap in FunctionSamples is used as a per-module mapping from function name MD5 to name string when input AutoFDO profile is in compact binary format. However with ThinLTO, we can have parallel backends modifying and accessing the class static map concurrently. The fix is to make GUIDToFuncNameMap a member of SampleProfileLoader instead of a file static data. Reviewers: wmi, davidxl, danielcdh Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65848 llvm-svn: 368596	2019-08-12 17:45:14 +00:00
Craig Topper	09d5d15339	[X86] Disable use of zmm registers for varargs musttail calls under prefer-vector-width=256 and min-legal-vector-width=256. Under this config, the v16f32 type we try to use isn't to a register class so the getRegClassFor call will fail. llvm-svn: 368594	2019-08-12 17:43:26 +00:00
David Green	86876422ef	[ARM] sext of a load is free This teaches the cost model that the sext or zext of a load is going to be free. Differential Revision: https://reviews.llvm.org/D66006 llvm-svn: 368593	2019-08-12 17:39:56 +00:00
Stanislav Mekhanoshin	4c9c98f36b	[AMDGPU] Printf runtime binding pass This pass is a port of the according pass from the HSAIL compiler. It parses printf calls and setup runtime printf buffer. After that it copies printf arguments to the buffer and fills in module metadata for runtime. Differential Revision: https://reviews.llvm.org/D24035 llvm-svn: 368592	2019-08-12 17:12:29 +00:00
David Green	3e39f39ad9	[ARM] MVE shuffle broadcast costs A VDUP will perform a vector broadcast in a single instruction. Update the cost model for MVE accordingly. Code originally by David Sherwood. Differential Revision: https://reviews.llvm.org/D63448 llvm-svn: 368589	2019-08-12 16:54:07 +00:00
David Green	83bbfaa5e4	[ARM] Put some of the TTI costmodel behind hasNeon calls. This puts some of the calls in ARMTargetTransformInfo.cpp behind hasNeon() checks, now that we have MVE, and updates all the tests accordingly. Differential Revision: https://reviews.llvm.org/D63447 llvm-svn: 368587	2019-08-12 15:59:52 +00:00
Sean Fertile	29141da75e	[XCOFF] Use a single symbolic constant for the size of an embeded name. [NFC] Convert SymbolNameSize and SectionNameSize into just `NameSize`. The length of a name embeded in a symbol table entry or section header table entry is length 8 for Sections, Symbols and Files. No need to have a distinct constant for each one. Also removes the Size argument to 'generateStringRef' as the size is always 'XCOFF::NameSize'. llvm-svn: 368584	2019-08-12 15:27:40 +00:00
Hans Wennborg	a45f301f7a	Revert r368339 "[MBP] Disable aggressive loop rotate in plain mode" It caused assertions to fire when building Chromium: lib/CodeGen/LiveDebugValues.cpp:331: bool {anonymous}::LiveDebugValues::OpenRangesSet::empty() const: Assertion `Vars.empty() == VarLocs.empty() && "open ranges are inconsistent"' failed. See https://crbug.com/992871#c3 for how to reproduce. > Patch https://reviews.llvm.org/D43256 introduced more aggressive loop layout optimization which depends on profile information. If profile information is not available, the statically estimated profile information(generated by BranchProbabilityInfo.cpp) is used. If user program doesn't behave as BranchProbabilityInfo.cpp expected, the layout may be worse. > > To be conservative this patch restores the original layout algorithm in plain mode. But user can still try the aggressive layout optimization with -force-precise-rotation-cost=true. > > Differential Revision: https://reviews.llvm.org/D65673 llvm-svn: 368579	2019-08-12 14:23:13 +00:00
Kang Zhang	489efc68a5	Revert r368565: [CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks llvm-svn: 368574	2019-08-12 14:00:31 +00:00
Sam Elliott	fee242aed4	[RISCV] Fix ICE in isDesirableToCommuteWithShift Summary: Ana Pazos reported a bug where we were not checking that an APInt would fit into 64-bits before calling `getSExtValue()`. This caused asserts when compiling large constants, such as i128s, as happens when compiling compiler-rt. This patch adds a testcase and makes the callback less error-prone. Reviewers: apazos, asb, luismarques Reviewed By: luismarques Subscribers: hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66081 llvm-svn: 368572	2019-08-12 13:51:00 +00:00
David Bolvansky	20d37fab82	[InstCombine] x /c fabs(x) -> copysign(1.0, x) Summary: x / fabs(x) -> copysign(1.0, x) fabs(x) / x -> copysign(1.0, x) Reviewers: spatel, foad, RKSimon, efriedma Reviewed By: spatel Subscribers: lebedev.ri, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65898 llvm-svn: 368570	2019-08-12 13:43:35 +00:00
David Stenberg	9b29ec58b7	[DebugInfo] Remove call sites when eliminating unreachable blocks Summary: When eliminating an unreachable block we must remove any call site information for calls residing in the block. This was originally found on a downstream target, and the attached x86 test case was produced by hand-modifying some MIR. Reviewers: aprantl, asowda, NikolaPrica, djtodoro, ivanbaev, vsk Reviewed By: NikolaPrica, vsk Subscribers: vsk, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D64500 llvm-svn: 368566	2019-08-12 13:22:29 +00:00
Kang Zhang	342fb0db6d	[CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks Summary: In `block-placement` pass, it will create some patterns for unconditional we can do the simple early retrun. But the `early-ret` pass is before `block-placement`, we don't want to run it again. This patch is to do the simple early return to optimize the blocks at the last of `block-placement`. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D63972 llvm-svn: 368565	2019-08-12 13:15:31 +00:00
Hans Wennborg	5b96d4655c	Revert r368509 "[CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks" > In `block-placement` pass, it will create some patterns for unconditional we can do the simple early retrun. > But the `early-ret` pass is before `block-placement`, we don't want to run it again. > This patch is to do the simple early return to optimize the blocks at the last of `block-placement`. > > Reviewed By: efriedma > > Differential Revision: https://reviews.llvm.org/D63972 This also revertes follow-ups r368514 and r368532. llvm-svn: 368560	2019-08-12 12:43:51 +00:00
Simon Pilgrim	182249daee	[X86][SSE] ComputeKnownBits - add basic PSADBW handling llvm-svn: 368558	2019-08-12 12:19:19 +00:00
Roman Lebedev	ccdad6ef48	[InstCombine] foldShiftIntoShiftInAnotherHandOfAndInICmp(): avoid constantexpr pitfail (PR42962) Instead of matching value and then blindly casting to BinaryOperator just to get the opcode, just match instruction and do no cast. Fixes https://bugs.llvm.org/show_bug.cgi?id=42962 llvm-svn: 368554	2019-08-12 11:28:02 +00:00
Simon Pilgrim	05e8209e33	[TargetLowering] SimplifyDemandedBits - call SimplifyMultipleUseDemandedBits for ISD::TRUNCATE llvm-svn: 368553	2019-08-12 10:56:05 +00:00
Pengfei Wang	e28cbbd5d4	[X86] Support -march=tigerlake Support -march=tigerlake for x86. Compare with Icelake Client, It include 4 more new features ,they are avx512vp2intersect, movdiri, movdir64b, shstk. Patch by Xiang Zhang (xiangzhangllvm) Differential Revision: https://reviews.llvm.org/D65840 llvm-svn: 368543	2019-08-12 01:29:46 +00:00
Wenlei He	cb5a90fd31	Fix pass dependency for LICM Expected to address buildbot failure http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/16285 caused by D65060. llvm-svn: 368542	2019-08-11 22:54:05 +00:00
Bjorn Pettersson	27038a3780	[SelectionDAG] Widen vector results of SMULFIX/UMULFIX/SMULFIXSAT Summary: After the commits that changed x86 backend to widen vectors instead of using promotion some of our downstream tests started to fail. It was noticed that WidenVectorResult has been missing support for SMULFIX/UMULFIX/SMULFIXSAT. This patch adds the missing functionality. Reviewers: craig.topper, RKSimon Reviewed By: craig.topper Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66051 llvm-svn: 368540	2019-08-11 19:27:06 +00:00
Craig Topper	ce6a2cf966	[X86] Simplify some of the type checks in combineSubToSubus. If we have SSE2 we can handle any i8/i16 type and let type legalization deal with it. llvm-svn: 368538	2019-08-11 17:36:49 +00:00
Craig Topper	637964bfd8	[X86] Don't use SplitOpsAndApply for ISD::USUBSAT. Target independent type legalization and custom lowering should be able to handle it. llvm-svn: 368537	2019-08-11 17:36:45 +00:00
Kang Zhang	b1a62d168f	[NFC][CodeGen] Use while loop instead for loop in MachineBlockPlacement::optimizeBranches() This will pass EXPENSIVE check. llvm-svn: 368532	2019-08-11 12:58:50 +00:00
David Green	11c4602fce	[MVE] Don't try to unroll vectorised MVE loops Due to the nature of the beat system in the MVE architecture, along with tail predication and low-overhead loops, unrolling has less benefit compared to normal loops. You can not, for example, hide the latency of a load with other instructions as you can for scalar code. Preventing unrolling also makes the code easier to read and reason about. So if a loop contains vector code, don't enable the runtime unrolling. At least for the time being. Differential Revision: https://reviews.llvm.org/D65803 llvm-svn: 368530	2019-08-11 08:53:18 +00:00
David Green	44f8d635e2	[ARM] Permit auto-vectorization using MVE With enough codegen complete, we can now correctly report the number and size of vector registers for MVE, allowing auto vectorisation. This also allows FP auto-vectorization for MVE without -Ofast/-ffast-math, due to support for IEEE FP arithmetic and parity between scalar and vector FP behaviour. Patch by David Sherwood. Differential Revision: https://reviews.llvm.org/D63728 llvm-svn: 368529	2019-08-11 08:42:57 +00:00
Heejin Ahn	831efe0e0f	Fix __clang_call_termiante's argument for foreign exceptions Summary: When exceptions are repeatedly thrown in the middle of handling another exception, we call `__clang_call_terminate` with the exception pointer (i32) as an argument. But in case of foreign exceptions, we don't have the pointer, so we call the function with 0. (This requires `__clang_call_terminate` can deal with 0 argument, which will be done later) But previously the 0 argument was not added as a `i32.const 0` but an immediate by mistake, causing the `call` instruction to take not an i32 but rather an exnref, because an `exnref` is left on top of the value stack if `br_on_exn` is not taken. ``` block i32 br_on_exn 0, __cpp_exception ;; exnref is on top of stack now i32.const 0 ;; This was missing! call __clang_call_terminate unreachable end call __clang_call_terminate ;; This takes i32 extracted by br_on_exn ``` Reviewers: dschuff Subscribers: sbc100, jgravelle-google, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65475 llvm-svn: 368527	2019-08-11 06:24:07 +00:00
Wenlei He	7e71aa24bc	[LICM] Make Loop ICM profile aware Summary: Hoisting/sinking instruction out of a loop isn't always beneficial. Hoisting an instruction from a cold block inside a loop body out of the loop could hurt performance. This change makes Loop ICM profile aware - it now checks block frequency to make sure hoisting/sinking anly moves instruction to colder block. Test Plan: ninja check Reviewers: asbirlea, sanjoy, reames, nikic, hfinkel, vsk Reviewed By: asbirlea Subscribers: fhahn, vsk, davidxl, xbolva00, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65060 llvm-svn: 368526	2019-08-11 06:05:35 +00:00
Wenlei He	d664072dd5	Revert "test commit" This reverts commit ad92a4a2769425ad0d39ac1dbb6282f6f51a1af7. llvm-svn: 368525	2019-08-11 05:59:20 +00:00
Wenlei He	7bd327da00	test commit llvm-svn: 368524	2019-08-11 05:50:28 +00:00
Craig Topper	9758e0e1bf	[X86] Remove some more code from combineShuffle that is no longer needed with widening legalization. llvm-svn: 368523	2019-08-11 02:17:18 +00:00
Craig Topper	0f74b82ef1	[X86] Remove some code from combineShuffle that seems largely unnecessary with widening legalization. The test case that changed is probably better served through allowing combineTruncatedArithmetic to create narrow vectors. It also appears InstCombine would have simplified this test case to remove the zext and trunc anyway. llvm-svn: 368522	2019-08-11 02:08:38 +00:00
Roman Lebedev	96474d17c6	[InstCombine][NFC] Use SimplifyAddInst() instead of SimplifyBinOp(Instruction::BinaryOps::Add, ) llvm-svn: 368521	2019-08-10 19:29:10 +00:00
Roman Lebedev	a8d20b4467	[InstCombine] Shift amount reassociation in bittest: relax one-use check when shifting constant If one of the values being shifted is a constant, since the new shift amount is known-constant, the new shift will end up being constant-folded so, we don't need that one-use restriction then. llvm-svn: 368519	2019-08-10 19:28:54 +00:00
Roman Lebedev	64fe806c4e	[InstCombine] Shift amount reassociation in bittest: drop pointless one-use restriction That one-use restriction is not needed for correctness - we have already ensured that one of the shifts will go away, so we know we won't increase the instruction count. So there is no need for that restriction. llvm-svn: 368518	2019-08-10 19:28:44 +00:00
Simon Pilgrim	ec128709f0	[X86][SSE] Lower shuffle as ANY_EXTEND_VECTOR_INREG On SSE41+ targets we always lower vector shuffles to ZERO_EXTEND_VECTOR_INREG, even if we don't need the extended bits. This patch relaxes this so that we lower to ANY_EXTEND_VECTOR_INREG if we can, meaning that shuffle combines have a better idea of what elements need to be kept zero. This helps the multiple reduction code as we can now combine away a lot more of the pack+extend codes. Differential Revision: https://reviews.llvm.org/D65741 llvm-svn: 368515	2019-08-10 16:46:07 +00:00
Kang Zhang	555f7495df	[NFC][CodeGen] Modify the PI++ to ++PI in MachineBlockPlacement::optimizeBranches() llvm-svn: 368514	2019-08-10 16:23:17 +00:00
Sanjay Patel	21c15ef384	[Reassociate] try harder to convert negative FP constants to positive This is an extension of a transform that tries to produce positive floating-point constants to improve canonicalization (and hopefully lead to more reassociation and CSE). The original patches were: D4904 D5363 (rL221721) But as the test diffs show, these were limited to basic patterns by walking from an instruction to its single user rather than recursively moving up the def-use sequence. No fast-math is required here because we're only rearranging implicit FP negations in intermediate ops. A motivating bug is: https://bugs.llvm.org/show_bug.cgi?id=32939 Differential Revision: https://reviews.llvm.org/D65954 llvm-svn: 368512	2019-08-10 13:17:54 +00:00
Kang Zhang	36cd84bdd9	[CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks Summary: In `block-placement` pass, it will create some patterns for unconditional we can do the simple early retrun. But the `early-ret` pass is before `block-placement`, we don't want to run it again. This patch is to do the simple early return to optimize the blocks at the last of `block-placement`. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D63972 llvm-svn: 368509	2019-08-10 09:58:52 +00:00
Craig Topper	74c43a2277	[X86] Match the IR pattern form movmsk on SSE1 only targets where v4i32 isn't legal Summary: This patch adds a special DAG combine for SSE1 to recognize the IR pattern InstCombine gives us for movmsk. This only does the recognition for a few cases where its obvious the input won't be scalarized resulting in building a vector just do to the movmsk. I've made it separate from our existing matching for movmsk since that's called in multiple places and I didn't spend time to see if the other callers would make sense here. Plus the restrictions and additional checks would complicate that. This fixes the case from PR42870. Buts its probably still broken the presence of logic ops feeding the movmsk pattern which would further hide the v4f32 type. Reviewers: spatel, RKSimon, xbolva00 Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65689 llvm-svn: 368506	2019-08-10 07:51:13 +00:00
Craig Topper	a8e5e73711	[X86] Improve the diagnostic for larger than 4-bit immediate for vpermil2pd/ps. Only allow MCConstantExprs. llvm-svn: 368505	2019-08-10 04:28:52 +00:00
Luo, Yuanke	c6c86f4f81	[X86] Fix stack probe issue on windows32. Summary: On windows if the frame size exceed 4096 bytes, compiler need to generate a call to _alloca_probe. X86CallFrameOptimization pass changes the reserved stack size and cause of stack probe function not be inserted. This patch fix the issue by detecting the call frame size, if the size exceed 4096 bytes, drop X86CallFrameOptimization. Reviewers: craig.topper, wxiao3, annita.zhang, rnk, RKSimon Reviewed By: rnk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65923 llvm-svn: 368503	2019-08-10 02:49:02 +00:00
Fedor Sergeev	92e160abab	[MemDep] allow to select block-scan-limit when constructing MemoryDependenceAnalysis Introducing non-global control for default block-scan-limit in MemDep analysis. Useful when there are many compilations per initialized LLVM instance (e.g. JIT). Reviewed By: asbirlea Tags: #llvm Differential Revision: https://reviews.llvm.org/D65806 llvm-svn: 368502	2019-08-10 01:23:38 +00:00
Peter Collingbourne	0e497d1554	cfi-icall: Allow the jump table to be optionally made non-canonical. The default behavior of Clang's indirect function call checker will replace the address of each CFI-checked function in the output file's symbol table with the address of a jump table entry which will pass CFI checks. We refer to this as making the jump table `canonical`. This property allows code that was not compiled with ``-fsanitize=cfi-icall`` to take a CFI-valid address of a function, but it comes with a couple of caveats that are especially relevant for users of cross-DSO CFI: - There is a performance and code size overhead associated with each exported function, because each such function must have an associated jump table entry, which must be emitted even in the common case where the function is never address-taken anywhere in the program, and must be used even for direct calls between DSOs, in addition to the PLT overhead. - There is no good way to take a CFI-valid address of a function written in assembly or a language not supported by Clang. The reason is that the code generator would need to insert a jump table in order to form a CFI-valid address for assembly functions, but there is no way in general for the code generator to determine the language of the function. This may be possible with LTO in the intra-DSO case, but in the cross-DSO case the only information available is the function declaration. One possible solution is to add a C wrapper for each assembly function, but these wrappers can present a significant maintenance burden for heavy users of assembly in addition to adding runtime overhead. For these reasons, we provide the option of making the jump table non-canonical with the flag ``-fno-sanitize-cfi-canonical-jump-tables``. When the jump table is made non-canonical, symbol table entries point directly to the function body. Any instances of a function's address being taken in C will be replaced with a jump table address. This scheme does have its own caveats, however. It does end up breaking function address equality more aggressively than the default behavior, especially in cross-DSO mode which normally preserves function address equality entirely. Furthermore, it is occasionally necessary for code not compiled with ``-fsanitize=cfi-icall`` to take a function address that is valid for CFI. For example, this is necessary when a function's address is taken by assembly code and then called by CFI-checking C code. The ``__attribute__((cfi_jump_table_canonical))`` attribute may be used to make the jump table entry of a specific function canonical so that the external code will end up taking a address for the function that will pass CFI checks. Fixes PR41972. Differential Revision: https://reviews.llvm.org/D65629 llvm-svn: 368495	2019-08-09 22:31:59 +00:00
Sanjay Patel	26b2c11451	[DAGCombiner] exclude x*2.0 from normal negation profitability rules This is the codegen part of fixing: https://bugs.llvm.org/show_bug.cgi?id=32939 Even with the optimal/canonical IR that is ideally created by D65954, we would reverse that transform in DAGCombiner and end up with the same asm on AArch64 or x86. I see 2 options for trying to correct this: 1. Limit isNegatibleForFree() by special-casing the fmul pattern (this patch). 2. Avoid creating (fmul X, 2.0) in the 1st place by adding a special-case transform to SelectionDAG::getNode() and/or SelectionDAGBuilder::visitFMul() that matches the transform done by DAGCombiner. This seems like the less intrusive patch, but if there's some other reason to prefer 1 option over the other, we can change to the other option. Differential Revision: https://reviews.llvm.org/D66016 llvm-svn: 368490	2019-08-09 21:37:32 +00:00
Daniel Sanders	e9a57c2b23	[globalisel] Add G_SEXT_INREG Summary: Targets often have instructions that can sign-extend certain cases faster than the equivalent shift-left/arithmetic-shift-right. Such cases can be identified by matching a shift-left/shift-right pair but there are some issues with this in the context of combines. For example, suppose you can sign-extend 8-bit up to 32-bit with a target extend instruction. %1:_(s32) = G_SHL %0:_(s32), i32 24 # (I've inlined the G_CONSTANT for brevity) %2:_(s32) = G_ASHR %1:_(s32), i32 24 %3:_(s32) = G_ASHR %2:_(s32), i32 1 would reasonably combine to: %1:_(s32) = G_SHL %0:_(s32), i32 24 %2:_(s32) = G_ASHR %1:_(s32), i32 25 which no longer matches the special case. If your shifts and extend are equal cost, this would break even as a pair of shifts but if your shift is more expensive than the extend then it's cheaper as: %2:_(s32) = G_SEXT_INREG %0:_(s32), i32 8 %3:_(s32) = G_ASHR %2:_(s32), i32 1 It's possible to match the shift-pair in ISel and emit an extend and ashr. However, this is far from the only way to break this shift pair and make it hard to match the extends. Another example is that with the right known-zeros, this: %1:_(s32) = G_SHL %0:_(s32), i32 24 %2:_(s32) = G_ASHR %1:_(s32), i32 24 %3:_(s32) = G_MUL %2:_(s32), i32 2 can become: %1:_(s32) = G_SHL %0:_(s32), i32 24 %2:_(s32) = G_ASHR %1:_(s32), i32 23 All upstream targets have been configured to lower it to the current G_SHL,G_ASHR pair but will likely want to make it legal in some cases to handle their faster cases. To follow-up: Provide a way to legalize based on the constant. At the moment, I'm thinking that the best way to achieve this is to provide the MI in LegalityQuery but that opens the door to breaking core principles of the legalizer (legality is not context sensitive). That said, it's worth noting that looking at other instructions and acting on that information doesn't violate this principle in itself. It's only a violation if, at the end of legalization, a pass that checks legality without being able to see the context would say an instruction might not be legal. That's a fairly subtle distinction so to give a concrete example, saying %2 in: %1 = G_CONSTANT 16 %2 = G_SEXT_INREG %0, %1 is legal is in violation of that principle if the legality of %2 depends on %1 being constant and/or being 16. However, legalizing to either: %2 = G_SEXT_INREG %0, 16 or: %1 = G_CONSTANT 16 %2:_(s32) = G_SHL %0, %1 %3:_(s32) = G_ASHR %2, %1 depending on whether %1 is constant and 16 does not violate that principle since both outputs are genuinely legal. Reviewers: bogner, aditya_nandakumar, volkan, aemerson, paquette, arsenm Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, kristof.beyls, javed.absar, hiraditya, jrtc27, atanasyan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61289 llvm-svn: 368487	2019-08-09 21:11:20 +00:00
Eric Christopher	db2f17d362	Remove variable only used in an assert. llvm-svn: 368486	2019-08-09 21:02:47 +00:00
Craig Topper	6cb05ca044	[X86] Remove custom handling for extloads from LowerLoad. We don't appear to need this with widening legalization. llvm-svn: 368479	2019-08-09 20:27:22 +00:00
Bill Wendling	79176a2542	[CodeGen] Require a name for a block addr target Summary: A block address may be used in inline assembly. In which case it requires a name so that the asm parser has something to parse. Creating a name for every block address is a large hammer, but is necessary because at the point when a temp symbol is created we don't necessarily know if it's used in inline asm. This ensures that it exists regardless. Reviewers: nickdesaulniers, craig.topper Subscribers: nathanchance, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65352 llvm-svn: 368478	2019-08-09 20:18:30 +00:00
Bill Wendling	1b10438875	[MC] Don't recreate a label if it's already used Summary: This patch keeps track of MCSymbols created for blocks that were referenced in inline asm. It prevents creating a new symbol which doesn't refer to the block. Inline asm may have a reference to a label. The asm parser however doesn't recognize it as a label and tries to create a new symbol. The result being that instead of the original symbol (e.g. ".Ltmp0") the parser replaces it in the inline asm with the new one (e.g. ".Ltmp00") without updating it in the symbol table. So the machine basic block retains the "old" symbol (".Ltmp0"), but the inline asm uses the new one (".Ltmp00"). Reviewers: nickdesaulniers, craig.topper Subscribers: nathanchance, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65304 llvm-svn: 368477	2019-08-09 20:16:31 +00:00
Evandro Menezes	59fbe516bd	[InstCombine] Refactor optimizeExp2() (NFC) Refactor `LibCallSimplifier::optimizeExp2()` to use the new `emitBinaryFloatFnCall()` version that fetches the function name from TLI. llvm-svn: 368457	2019-08-09 17:22:56 +00:00
Evandro Menezes	8a21214174	[Transforms] Add a emitBinaryFloatFnCall() version that fetches the function name from TLI Add the counterpart to a similar function for single operands. Differential revision: https://reviews.llvm.org/D65976 llvm-svn: 368453	2019-08-09 17:06:46 +00:00
Sunil Srivastava	27f6f2f88b	Print reasonable representations of type names in llvm-nm, readelf and readobj For type values that do not have proper names, print reasonable representation in llvm-nm, llvm-readobj and llvm-readelf, matching GNU tools.s Fixes PR41713. Differential Revision: https://reviews.llvm.org/D65537 llvm-svn: 368451	2019-08-09 16:54:51 +00:00
Evandro Menezes	c6c00cdf2e	[Transforms] Rename hasUnaryFloatFn() and getUnaryFloatFn() (NFC) Rename `hasUnaryFloatFn()` to `hasFloatFn()` and `getUnaryFloatFn()` to `getFloatFnName()`. llvm-svn: 368449	2019-08-09 16:04:18 +00:00
Sanjay Patel	0b4ae34c2f	[DAGCombiner] remove redundant fold for X*1.0; NFC This is handled at node creation time (similar to X/1.0) after: rL357029 (no fast-math-flags needed) llvm-svn: 368443	2019-08-09 14:30:59 +00:00
Jinsong Ji	6349ce5ca5	[MachinePipeliner] Avoid indeterminate order in FuncUnitSorter Summary: This is exposed by adding a new testcase in PowerPC in https://reviews.llvm.org/rL367732 The testcase got different output on different platform, hence breaking buildbots. The problem is that we get differnt FuncUnitOrder when calculateResMII. The root cause is: 1. Two MachineInstr might get SAME priority(MFUsx) from minFuncUnits. 2. Current comparison operator() will return `MFUs1 > MFUs2`. 3. We use iterators for MachineInstr, so the input to FuncUnitSorter might be different on differnt platform due to the iterator nature. So for two MI with same MFU, their order is actually depends on the iterator order, which is platform (implemtation) dependent. This is risky, and may cause cross-compiling problems. The fix is to check make sure we assign a determine order when they are equal. Reviewers: bcahoon, hfinkel, jmolloy Subscribers: nemanjai, hiraditya, MaskRay, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65992 llvm-svn: 368441	2019-08-09 14:10:57 +00:00
Whitney Tsang	dd3b6498b0	Title: Loop Cache Analysis Summary: Implement a new analysis to estimate the number of cache lines required by a loop nest. The analysis is largely based on the following paper: Compiler Optimizations for Improving Data Locality By: Steve Carr, Katherine S. McKinley, Chau-Wen Tseng http://www.cs.utexas.edu/users/mckinley/papers/asplos-1994.pdf The analysis considers temporal reuse (accesses to the same memory location) and spatial reuse (accesses to memory locations within a cache line). For simplicity the analysis considers memory accesses in the innermost loop in a loop nest, and thus determines the number of cache lines used when the loop L in loop nest LN is placed in the innermost position. The result of the analysis can be used to drive several transformations. As an example, loop interchange could use it determine which loops in a perfect loop nest should be interchanged to maximize cache reuse. Similarly, loop distribution could be enhanced to take into consideration cache reuse between arrays when distributing a loop to eliminate vectorization inhibiting dependencies. The general approach taken to estimate the number of cache lines used by the memory references in the inner loop of a loop nest is: Partition memory references that exhibit temporal or spatial reuse into reference groups. For each loop L in the a loop nest LN: a. Compute the cost of the reference group b. Compute the 'cache cost' of the loop nest by summing up the reference groups costs For further details of the algorithm please refer to the paper. Authored By: etiotto Reviewers: hfinkel, Meinersbur, jdoerfert, kbarton, bmahjour, anemet, fhahn Reviewed By: Meinersbur Subscribers: reames, nemanjai, MaskRay, wuzish, Hahnfeld, xusx595, venkataramanan.kumar.llvm, greened, dmgreen, steleman, fhahn, xblvaOO, Whitney, mgorny, hiraditya, mgrang, jsji, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D63459 llvm-svn: 368439	2019-08-09 13:56:29 +00:00
Simon Pilgrim	60394f47b0	[X86][SSE] Swap X86ISD::BLENDV inputs with an inverted selection mask (PR42825) As discussed on PR42825, if we are inverting the selection mask we can just swap the inputs and avoid the inversion. Differential Revision: https://reviews.llvm.org/D65522 llvm-svn: 368438	2019-08-09 12:44:20 +00:00
Sanjay Patel	991834a516	[GlobalOpt] prevent crashing on large integer types (PR42932) This is a minimal fix (copy the predicate for the assert) to prevent the crashing seen in: https://bugs.llvm.org/show_bug.cgi?id=42932 ...when converting a constant integer of arbitrary width to uint64_t. Differential Revision: https://reviews.llvm.org/D65970 llvm-svn: 368437	2019-08-09 12:43:25 +00:00
Simon Atanasyan	242c5a70d4	[Mips][Codegen] Fix fast-isel mixing of FGR64 and AFGR64 registers Fast-isel was picking AFGR64 register class for processing call arguments when +fp64 options was used. We simply check is option +fp64 is used and pick appropriate register. Patch by Mirko Brkusanin. Differential Revision: https://reviews.llvm.org/D65886 llvm-svn: 368433	2019-08-09 12:02:32 +00:00
Andrea Di Biagio	cbec9af6bf	[MCA] Add flag -show-encoding to llvm-mca. Flag -show-encoding enables the printing of instruction encodings as part of the the instruction info view. Example (with flags -mtriple=x86_64-- -mcpu=btver2): Instruction Info: [1]: #uOps [2]: Latency [3]: RThroughput [4]: MayLoad [5]: MayStore [6]: HasSideEffects (U) [7]: Encoding Size [1] [2] [3] [4] [5] [6] [7] Encodings: Instructions: 1 2 1.00 4 c5 f0 59 d0 vmulps %xmm0, %xmm1, %xmm2 1 4 1.00 4 c5 eb 7c da vhaddps %xmm2, %xmm2, %xmm3 1 4 1.00 4 c5 e3 7c e3 vhaddps %xmm3, %xmm3, %xmm4 In this example, column Encoding Size is the size in bytes of the instruction encoding. Column Encodings reports the actual instruction encodings as byte sequences in hex (objdump style). The computation of encodings is done by a utility class named mca::CodeEmitter. In future, I plan to expose the CodeEmitter to the instruction builder, so that information about instruction encoding sizes can be used by the simulator. That would be a first step towards simulating the throughput from the decoders in the hardware frontend. Differential Revision: https://reviews.llvm.org/D65948 llvm-svn: 368432	2019-08-09 11:26:27 +00:00
Pablo Barrio	3cdd586be2	[AArch64] Set pref. func. align to 8 bytes on Neoverse E1 & Cortex-A65 Summary: The Arm Neoverse E1 and Cortex-A65 Software Optimization Guide [1][2], Section "4.7 Branch instruction alignment" state: "It is preferable for branch targets, including subroutine entry points, to be placed on aligned 64-bit boundaries to maximize instruction fetch efficiency." This patch sets the preferred function alignment on Neoverse E1 and Cortex-A65 to 2^3=8B. This was already the case in some Cortex-A CPUs such as Cortex-A53. [1] https://developer.arm.com/docs/swog466751/latest/arm-neoversetm-e1-core-software-optimization-guide [2] https://developer.arm.com/docs/swog010045/latest/arm-cortex-a65-core-software-optimization-guide Reviewers: dmgreen, fhahn, samparker Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65937 llvm-svn: 368431	2019-08-09 11:05:15 +00:00
Tim Northover	01eb869114	AArch64: support TLS on Darwin platforms in GlobalISel. All TLS access on Darwin is in the "general dynamic" form where we call a function to resolve the address, so implementation is pretty simple. llvm-svn: 368418	2019-08-09 09:32:38 +00:00
Tim Northover	e1a5f668b3	GlobalISel: pack various parameters for lowerCall into a struct. I've now needed to add an extra parameter to this call twice recently. Not only is the signature getting extremely unwieldy, but just updating all of the callsites and implementations is a pain. Putting the parameters in a struct sidesteps both issues. llvm-svn: 368408	2019-08-09 08:26:38 +00:00
Sam Parker	0dba791a25	[ARM][ParallelDSP] Replace SExt uses As loads are combined and widened, we replaced their sext users operands whereas we should have been replacing the uses of the sext. I've added a load of tests, with only a few of them originally causing assertion failures, the rest improve pattern coverage. Differential Revision: https://reviews.llvm.org/D65740 llvm-svn: 368404	2019-08-09 07:48:50 +00:00
Bjorn Pettersson	d218a3326e	[InstSimplify] Report "Changed" also when only deleting dead instructions Summary: Make sure that we report that changes has been made by InstSimplify also in situations when only trivially dead instructions has been removed. If for example a call is removed the call graph must be updated. Bug seem to have been introduced by llvm-svn r367173 (commit `02b9e45a7e`), since the code in question was rewritten in that commit. Reviewers: spatel, chandlerc, foad Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65973 llvm-svn: 368401	2019-08-09 07:08:25 +00:00
Craig Topper	6179175551	[X86] Remove code that expands truncating stores from combineStore. We shouldn't form trunc stores that need to be expanded now that we are using widening legalization. llvm-svn: 368400	2019-08-09 06:59:53 +00:00
Craig Topper	7e33f11ba7	[X86] Remove stale FIXME from combineMaskedStore. NFC I believe PR34584 was tracking that FIXME, but its since been closed and a test case was added. llvm-svn: 368397	2019-08-09 05:55:41 +00:00
Craig Topper	8c5c09780d	[X86] Remove DAG combine expansion of extending masked load and truncating masked store. The only way to generate these was through promoting legalization of narrow vectors, but we widen those types now. So we shouldn't produce these nodes. llvm-svn: 368396	2019-08-09 05:53:37 +00:00
Craig Topper	509c8774fa	[X86] Remove handler for (U/S)(ADD/SUB)SAT from ReplaceNodeResults. Remove TypeWidenVector check from code that handles X86ISD::VPMADDWD and X86ISD::AVG. More unneeded code since we now legalize narrow vectors by widening. llvm-svn: 368395	2019-08-09 05:17:52 +00:00
Craig Topper	824961824f	[X86] Remove ISD::SETCC handling from ReplaceNodeResults. This is no longer needed since we widen v2i32 instead of promoting. llvm-svn: 368394	2019-08-09 05:17:48 +00:00
Craig Topper	ef5b435b00	[X86] Simplify ISD::LOAD handling in ReplaceNodeResults and ISD::STORE handling in LowerStore now that v2i32 is widened to v4i32. llvm-svn: 368390	2019-08-09 03:09:43 +00:00
Craig Topper	0da681a2be	[X86] Merge v2f32 and v2i32 gather/scatter handling in ReplaceNodeResults/LowerMSCATTER now that v2i32 is also widened like v2f32. llvm-svn: 368389	2019-08-09 03:09:28 +00:00
Craig Topper	6f81db0f68	[X86] Now unreachable handling for f64->v2i32/v4i16/v8i8 bitcasts from ReplaceNodeResults. We rely on the generic type legalizer for this now. llvm-svn: 368388	2019-08-09 03:09:19 +00:00
Craig Topper	d871f638d7	[X86] Simplify ReplaceNodeResults handling for FP_TO_SINT/UINT for vectors to only handle widening. llvm-svn: 368387	2019-08-09 03:09:10 +00:00
Craig Topper	0bd44d59db	[X86] Simplify ReplaceNodeResults handling for SIGN_EXTEND/ZERO_EXTEND/TRUNCATE for vectors to only handle widening. llvm-svn: 368386	2019-08-09 03:08:54 +00:00
Craig Topper	cdb9a8ebd8	[X86] Simplify ReplaceNodeResults handling for UDIV/UREM/SDIV/SREM for vectors to only handle widening. llvm-svn: 368385	2019-08-09 03:08:45 +00:00
Craig Topper	35848345f0	[X86] Remove vector promotion handling from the ReplaceNodeResults ISD::MUL handling code. We now widen illegal vector types so we don't need this anymore. llvm-svn: 368384	2019-08-09 03:08:28 +00:00
David Blaikie	0fcc1f7bac	DebugInfo/DWARF: Provide some (pretty half-hearted) error handling access when parsing units This isn't the most robust error handling API, but does allow clients to opt-in to getting Errors they can handle. I suspect the long-term solution would be to move away from the lazy unit parsing and have an explicit step that parses the unit and then allows access to the other APIs that require a parsed unit. llvm-dwarfdump could be expanded to use this (or newer/better API) to demonstrate the benefit of it - but for now lld will use this in a follow-up cl which ensures lld can exit non-zero on errors like this (& provide more descriptive diagnostics including which object file the error came from). (error access to later errors when parsing nested DIEs would be good too - but, again, exposing that without it being a hassle for every consumer may be tricky) llvm-svn: 368377	2019-08-09 01:14:33 +00:00
Akira Hatanaka	3e61ed0299	Change the return type of UpgradeARCRuntimeCalls to void Nothing is using the function return. llvm-svn: 368367	2019-08-08 23:33:17 +00:00
David Blaikie	5b9508396c	Remove else-after-return llvm-svn: 368364	2019-08-08 23:17:23 +00:00
Peter Collingbourne	bb17e46644	Linker: Add support for GlobalIFunc. GlobalAlias and GlobalIFunc ought to be treated the same by the IR linker, so we can generalize the code to be in terms of their common base class GlobalIndirectSymbol. Differential Revision: https://reviews.llvm.org/D55046 llvm-svn: 368357	2019-08-08 22:09:18 +00:00
Cameron McInally	8416f20f2f	[LICM] Support unary FNeg in LICM Differential Revision: https://reviews.llvm.org/D65908 llvm-svn: 368350	2019-08-08 21:38:31 +00:00
Craig Topper	c49d3e6c4d	[X86] Improve codegen of v8i64->v8i16 and v16i32->v16i8 truncate with avx512vl, avx512bw, min-legal-vector-width<=256 and prefer-vector-width=256 Under this configuration we'll want to split the v8i64 or v16i32 into two vectors. The default legalization will try to truncate each of those 256-bit pieces one step to 128-bit, concatenate those, then truncate one more time from the new 256 to 128 bits. With this patch we now truncate the two splits to 64-bits then concatenate those. We have to do this two different ways depending on whether have widening legalization enabled. Without widening legalization we have to manually construct X86ISD::VTRUNC to prevent the ISD::TRUNCATE with a narrow result being promoted to 128 bits with a larger element type than what we want followed by something like a pshufb to grab the lower half of each element to finish the job. With widening legalization we just get the right thing. When we switch to widening by default we can just delete the other code path. Differential Revision: https://reviews.llvm.org/D65626 llvm-svn: 368349	2019-08-08 21:36:47 +00:00
Craig Topper	9158e54270	[SelectionDAG][X86] Move setcc mask splitting for mload/mstore/mgather/mscatter from DAGCombiner to the type legalizer. We may be able to look to how VSELECT is handled to further improve this, but this appears to be neutral or an improvement on the test cases we have. llvm-svn: 368344	2019-08-08 21:14:08 +00:00
Craig Topper	bce4d79f37	[LegalizeTypes] Remove SplitVSETCC helper and just call SplitVecRes_SETCC. llvm-svn: 368343	2019-08-08 21:13:58 +00:00

... 11 12 13 14 15 ...

126735 Commits