llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	d918d5b36b	[InstCombine] Improve the expansion in SimplifyUsingDistributiveLaws to handle cases where one side doesn't simplify, but the other side resolves to an identity value Summary: If one side simplifies to the identity value for inner opcode, we can replace the value with just the operation that can't be simplified. I've removed a couple now unneeded special cases in visitAnd and visitOr. There are probably other cases I missed. Reviewers: spatel, majnemer, hfinkel, dberlin Reviewed By: spatel Subscribers: grandinj, llvm-commits, spatel Differential Revision: https://reviews.llvm.org/D35451 llvm-svn: 308111	2017-07-15 21:49:49 +00:00
Sanjay Patel	3437ee2740	[InstCombine] improve (1 << x) & 1 --> zext(x == 0) folding 1. Add a one-use check to prevent increasing instruction count. 2. Generalize the pattern matching to include vector types. llvm-svn: 308105	2017-07-15 17:26:01 +00:00
Sanjay Patel	55b9f88ecc	[InstCombine] allow (0 - x) & 1 --> x & 1 for vectors llvm-svn: 308098	2017-07-15 15:29:47 +00:00
Sanjay Patel	27339133a7	[InstCombine] remove dead code/tests; NFCI These patterns and tests were added to InstSimplify with: https://reviews.llvm.org/rL303004 llvm-svn: 308096	2017-07-15 15:01:33 +00:00
Chandler Carruth	d78a38ed2e	Revert r308078 (and subsequent tweak in r308079) which introduces a test that appears to exhibit non-determinism and is flaking on the bots pretty consistently. r308078: [ThinLTO] Ensure we always select the same function copy to import r308079: Require asserts in new test that uses debug flag llvm-svn: 308095	2017-07-15 13:50:26 +00:00
Florian Hahn	ad993521ac	[LoopInterchange] Add some optimization remarks. Reviewers: anemet, karthikthecool, blitz.opensource Reviewed By: anemet Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35122 llvm-svn: 308094	2017-07-15 13:13:19 +00:00
Dinar Temirbulatov	3c64077c82	[SLPVectorizer] Add an extra parameter to tryScheduleBundle function, NFCI. llvm-svn: 308081	2017-07-15 05:43:54 +00:00
Teresa Johnson	82b4fb1afe	[ThinLTO] Ensure we always select the same function copy to import Summary: Check if the first eligible callee is under the instruction threshold. Checking this on the first eligible callee ensures that we don't end up selecting different callees to import when we invoke this routine with different thresholds due to reaching the callee via paths that are shallower or hotter (when there are multiple copies, i.e. with weak or linkonce linkage). We don't want to leave the decision of which copy to import up to the backend. Reviewers: mehdi_amini Subscribers: inglorion, fhahn, llvm-commits Differential Revision: https://reviews.llvm.org/D35436 llvm-svn: 308078	2017-07-15 04:53:05 +00:00
Geoff Berry	f7d5daa0c0	[EarlyCSE] Handle calls with no MemorySSA info. Summary: When checking for memory dependencies between calls using MemorySSA, handle cases where the calls have no MemoryAccess associated with them because the AA analysis being used has determined that the call does not read/write memory. Fixes PR33756 Reviewers: dberlin, davide Subscribers: mcrosier, llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D35317 llvm-svn: 308051	2017-07-14 20:13:21 +00:00
Haicheng Wu	476adcca6b	[JumpThreading] Add a pattern to TryToUnfoldSelectInCurrBB() Add the following pattern to TryToUnfoldSelectInCurrBB() bb: %p = phi [0, %bb1], [1, %bb2], [0, %bb3], [1, %bb4], ... %c = cmp %p, 0 %s = select %c, trueval, falseval The Select in the above pattern will be unfolded and then jump-threaded. The current implementation does not allow CMP in the middle of PHI and Select. Differential Revision: https://reviews.llvm.org/D34762 llvm-svn: 308050	2017-07-14 19:16:47 +00:00
Jakub Kuderski	b292c22c8d	[Dominators] Make IsPostDominator a template parameter Summary: DominatorTreeBase used to have IsPostDominators (bool) member to indicate if the tree is a dominator or a postdominator tree. This made it possible to switch between the two 'modes' at runtime, but it isn't used in practice anywhere. This patch makes IsPostDominator a template argument. This way, it is easier to switch between different algorithms at compile-time based on this argument and design external utilities around it. It also makes it impossible to incidentally assign a postdominator tree to a dominator tree (and vice versa), and to further simplify template code in GenericDominatorTreeConstruction. Reviewers: dberlin, sanjoy, davide, grosser Reviewed By: dberlin Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35315 llvm-svn: 308040	2017-07-14 18:26:09 +00:00
Sanjay Patel	3f4db3ea97	[InstCombine] convert bitwise (in)equality checks to logical ops (PR32401) As discussed in: https://bugs.llvm.org/show_bug.cgi?id=32401 we have a backend transform to undo this: https://reviews.llvm.org/rL299542 when it's likely that the xor version leads to better codegen, but we want this form in IR for better analysis and simplification potential. llvm-svn: 308031	2017-07-14 15:09:49 +00:00
Max Kazantsev	f80ffa1a78	[IRCE] Fix corner case with Start = INT_MAX When iterating through loop for (int i = INT_MAX; i > 0; i--) We fail to generate the pre-loop for it. It happens because we use the overflown value in a comparison predicate when identifying whether or not we need it. In old logic, we used SLE predicate against Greatest value which exceeds all seen values of the IV and might be overflown. Now we use the GreatestSeen value of this IV with SLT predicate. Also added a test that ensures that a pre-loop is generated for such loops. Differential Revision: https://reviews.llvm.org/D35347 llvm-svn: 308001	2017-07-14 06:35:03 +00:00
Dinar Temirbulatov	21599fe2de	[SLPVectorizer] Add an extra parameter to alreadyVectorized function, NFCI. llvm-svn: 307996	2017-07-14 03:48:29 +00:00
Simon Pilgrim	f32f4be957	Fix unused variable warning on EXPENSIVE_CHECKS release builds. NFCI. llvm-svn: 307929	2017-07-13 17:10:12 +00:00
Davide Italiano	c3dc055780	Reapply [GlobalOpt] Remove unreachable blocks before optimizing a function. This commit reapplies r307215 now that we found out and fixed the cause of the cfi test failure (in r307871). llvm-svn: 307920	2017-07-13 15:40:59 +00:00
Anna Thomas	ec9b326569	[RuntimeUnrolling] Update DomTree correctly when exit blocks have successors Summary: When we runtime unroll with multiple exit blocks, we also need to update the immediate dominators of the immediate successors of the exit blocks. Reviewers: reames, mkuper, mzolotukhin, apilipenko Reviewed by: mzolotukhin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35304 llvm-svn: 307909	2017-07-13 13:21:23 +00:00
Xinliang David Li	f564c6959e	[PGO] Enhance pgo counter promotion This is an incremental change to the promotion feature. There are two problems with the current behavior: 1) loops with multiple exiting blocks are totally disabled 2) a counter update can only be promoted one level up in the loop nest -- which does not help much for short trip-count inner loops inside high trip-count outer loops. Due to this limitation, we still saw very large profile count fluctuations from run to run for the affected loops which are usually very hot. This patch adds support for promoting counters iteratively across the loop nest. It also turns on the promotion for loops with multiple exiting blocks (with a limit). For single-threaded applications, the performance impact is flat on average. For instance, dealII improves, but povray regresses. llvm-svn: 307863	2017-07-12 23:27:44 +00:00
Anna Thomas	8e431a9851	[LoopUnrollRuntime] NFC: Refactored safety checks of unrolling multi-exit loop Refactored the code and separated out a function `canSafelyUnrollMultiExitLoop` to reduce redundant checks and make it easier to add profitability heuristics later. Added tests to runtime unrolling to make sure that unrolling for multi-exit loops is not done unless the option -unroll-runtime-multi-exit is true. llvm-svn: 307843	2017-07-12 20:55:43 +00:00
Sam Clegg	fd5ab25ae1	Remove unneeded use of #undef DEBUG_TYPE. NFC Where it is needed (at the end of headers that define it), be consistent about its use. Also fix a few header guards that I found in the process. Differential Revision: https://reviews.llvm.org/D34916 llvm-svn: 307840	2017-07-12 20:49:21 +00:00
Michael Kuperstein	fdb46b2fb4	[LV] Don't allow outside uses of IVs if the SCEV is predicated on loop conditions. This fixes PR33706. Differential Revision: https://reviews.llvm.org/D35227 llvm-svn: 307837	2017-07-12 19:53:55 +00:00
Jakub Kuderski	b323f4f173	[LoopRotate] Fix DomTree update logic for unreachable nodes. Fix PR33701. Summary: LoopRotate manually updates the DomTree by iterating over all predecessors of a basic block and computing the Nearest Common Dominator. When a predecessor happens to be unreachable, `DT.findNearestCommonDominator` returns nullptr. This patch teaches LoopRotate to handle this case and fixes PR33701 (https://bugs.llvm.org/show_bug.cgi?id=33701). In the future, LoopRotate should be taught to use the new incremental API for updating the DomTree. Reviewers: dberlin, davide, uabelho, grosser Subscribers: efriedma, mzolotukhin Differential Revision: https://reviews.llvm.org/D35074 llvm-svn: 307828	2017-07-12 18:42:16 +00:00
Peter Collingbourne	cacac6a104	LowerTypeTests: When importing functions skip definitions where the summary contains a decl. This normally indicates mixed CFI + non-CFI compilation, and will result in us treating the function in the same way as a function defined outside of the LTO unit. Part of PR33752. Differential Revision: https://reviews.llvm.org/D35281 llvm-svn: 307744	2017-07-12 00:39:12 +00:00
Davide Italiano	b8ad3eebca	[IPO] Temporarily rollback r307215. [GlobalOpt] Remove unreachable blocks before optimizing a function. While the change is presumably correct, it exposes a latent bug in DI which breaks one of the CFI checks. I'll analyze it further and try to understand what's going on. llvm-svn: 307729	2017-07-11 23:10:17 +00:00
Konstantin Zhuravlyov	bb80d3e1d3	Enhance synchscope representation OpenCL 2.0 introduces the notion of memory scopes in atomic operations to global and local memory. These scopes restrict how synchronization is achieved, which can result in improved performance. This change extends existing notion of synchronization scopes in LLVM to support arbitrary scopes expressed as target-specific strings, in addition to the already defined scopes (single thread, system). The LLVM IR and MIR syntax for expressing synchronization scopes has changed to use syncscope("<scope>"), where <scope> can be "singlethread" (this replaces singlethread keyword), or a target-specific name. As before, if the scope is not specified, it defaults to CrossThread/System scope. Implementation details: - Mapping from synchronization scope name/string to synchronization scope id is stored in LLVM context; - CrossThread/System and SingleThread scopes are pre-defined to efficiently check for known scopes without comparing strings; - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in the bitcode. Differential Revision: https://reviews.llvm.org/D21723 llvm-svn: 307722	2017-07-11 22:23:00 +00:00
Anna Thomas	bafe766f5d	[LoopUnrollRuntime] NFC: Add some debugging trace messages for why loop wasn't unrolled. llvm-svn: 307705	2017-07-11 20:44:37 +00:00
Davide Italiano	ee1c82112e	[NewGVN] Check for congruency of memory accesses. This is fine as nothing in the code relies on leader and memory leader being the same for a given congruency class. Ack'ed by Dan. Fixes PR33720. llvm-svn: 307699	2017-07-11 19:49:12 +00:00
Davide Italiano	67b0e53dc1	[NewGVN] Fix an innocent typo I found while debugging PR33720. llvm-svn: 307694	2017-07-11 19:19:45 +00:00
Davide Italiano	fb4544cd15	[NewGVN] Clarify the function invariants formatting them properly. llvm-svn: 307692	2017-07-11 19:15:36 +00:00
Evgeniy Stepanov	3d5ea713f7	[msan] Only check shadow memory for operands that are sized. Fixes PR33347: https://bugs.llvm.org/show_bug.cgi?id=33347. Differential Revision: https://reviews.llvm.org/D35160 Patch by Matt Morehouse. llvm-svn: 307684	2017-07-11 18:13:52 +00:00
Anna Thomas	5526a33f4f	[LoopUnrollRuntime] Avoid multi-exit nested loop with epilog generation The loop structure for the outer loop does not contain the epilog preheader when we try to unroll inner loop with multiple exits and epilog code is generated. For now, we just bail out in such cases. Added a test case that shows the problem. Without this bailout, we would trip on assert saying LCSSA form is incorrect for outer loop. llvm-svn: 307676	2017-07-11 17:16:33 +00:00
Dinar Temirbulatov	09b6779709	[SLPVectorizer] Revert change in cancelScheduling with referencing to FirstInBundle, NFCI. llvm-svn: 307667	2017-07-11 15:54:50 +00:00
Hiroshi Inoue	0ca79dcf4b	fix typos in comments; NFC llvm-svn: 307626	2017-07-11 06:04:59 +00:00
Chandler Carruth	01f0c8a8c4	[PM/ThinLTO] Fix PR33536, a bug where the ThinLTO bitcode writer was querying for analysis results on a function declaration rather than a definition. The only reason this worked previously is by chance -- because the way we got alias analysis results with the legacy PM, we happened to not compute a dominator tree and so we happened to not hit an assert even though it didn't make any real sense. Now we bail out before trying to compute alias analysis so that we don't hit these asserts. llvm-svn: 307625	2017-07-11 05:39:20 +00:00
Leo Li	93abd7d915	[ConstantHoisting] Remove duplicate logic in constant hoisting Summary: As mentioned in https://reviews.llvm.org/D34576, the checks in `collectConstantCandidates` can be replaced by using `llvm::canReplaceOperandWithVariable`. The only special case is that `collectConstantCandidates` returns false for all `IntrinsicInst` but it is safe for us to collect constant candidates from `IntrinsicInst`. Reviewers: pirama, efriedma, srhines Reviewed By: efriedma Subscribers: llvm-commits, javed.absar Differential Revision: https://reviews.llvm.org/D34921 llvm-svn: 307587	2017-07-10 20:45:34 +00:00
Davide Italiano	a7a77540ef	[NewGVN] Simplify a lambda a little bit. NFCI. llvm-svn: 307586	2017-07-10 20:45:00 +00:00
Serge Guelton	f6329ec2e9	Fix invalid cast in instcombine UMul/ZExt idiom Fixes https://bugs.llvm.org/show_bug.cgi?id=25454 Do not assume IRBuilder creates Instruction where it can create Value. Do not assume idiom operands are constant, leave generalisation to the IRBuilder. Differential Revision: https://reviews.llvm.org/D35114 llvm-svn: 307554	2017-07-10 16:51:40 +00:00
Anna Thomas	70ffd65ca9	[LoopUnrollRuntime] Remove strict assert about VMap requirement When unrolling with multiple exits, which is under an off-by-default option, the assert that checks for a VMap entry in loop exit values is too strong (it asserts that if a VMap entry does not exist, the value should be a constant). However, values derived from constants or from values outside the loop do not have a VMap entry either. Removed the assert and added a testcase showcasing the property for non-constant values. llvm-svn: 307542	2017-07-10 15:29:38 +00:00
Mikael Holmen	e0ced14449	[ArgumentPromotion] Change use of removed argument in llvm.dbg.value to undef Summary: This solves PR33641. When removing a dead argument we must also handle possibly existing calls to llvm.dbg.value that use the removed argument. Now we change the use of the otherwise dead argument to an undef for some other pass to cleanup later. If the calls are left untouched, they will later on cause errors: "function-local metadata used in wrong function" since the ArgumentPromotion rewrites the code by creating a new function with the wanted signature, but the metadata is not recreated so the new function may then erroneously use metadata from the old function. Reviewers: mstorsjo, rnk, arsenm Reviewed By: rnk Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D34874 llvm-svn: 307521	2017-07-10 06:07:24 +00:00
Craig Topper	fde4723ebe	[IR] Add Type::isIntOrIntVectorTy(unsigned) similar to the existing isIntegerTy(unsigned), but also works for vectors. llvm-svn: 307492	2017-07-09 07:04:03 +00:00
Craig Topper	95d2347ae1	[IR] Make use of Type::isPtrOrPtrVectorTy/isIntOrIntVectorTy/isFPOrFPVectorTy to shorten code. NFC llvm-svn: 307491	2017-07-09 07:04:00 +00:00
Hiroshi Inoue	713b5ba2de	fix trivial typos; NFC sucessor -> successor llvm-svn: 307488	2017-07-09 05:54:44 +00:00
Chandler Carruth	bd9c29039e	[PM] Finish implementing and fix a chain of bugs uncovered by testing the invalidation propagation logic from an SCC to a Function. I wrote the infrastructure to test this but didn't actually use it in the unit test where it was designed to be used. =[ My bad. Once I actually added it to the test case I discovered that it also hadn't been properly implemented, so I've implemented it. The logic in the FAM proxy for an SCC pass to propagate invalidation follows the same ideas as the FAM proxy for a Module pass, but the implementation is a bit different to reflect the fact that it is forwarding just for an SCC. However, implementing this correctly uncovered a surprising "bug" (it was conservatively correct but relatively very expensive) in how we handle invalidation when splitting one SCC into multiple SCCs. We did an eager invalidation when in reality we should be deferring invalidation for the current SCC to the CGSCC pass manager and just invalidating the newly constructed SCCs. Otherwise we end up invalidating too much too soon. This was exposed by the inliner test case that I've updated. Now, we invalidate just the split off '(test1_f)' SCC when doing the CG update, and then the inliner finishes and invalidates the '(test1_g, test1_h)' SCC's analyses. The first few attempts at fixing this hit still more bugs, but all of those are covered by existing tests. For example, the inliner should also preserve the FAM proxy to avoid unnecessary invalidation, and this is safe because the CG update routines it uses handle any necessary adjustments to the FAM proxy. Finally, the unittests for the CGSCC pass manager needed a bunch of updates where we weren't correctly preserving the FAM proxy because it hadn't been fully implemented and failing to preserve it didn't matter. 
Note that this doesn't yet fix the current crasher due to MemSSA finding a stale dominator tree, but without this the fix to that crasher doesn't really make any sense when testing because it relies on the proxy behavior. llvm-svn: 307487	2017-07-09 03:59:31 +00:00
Craig Topper	e79b3e7d9a	[InstCombine] Speculatively implement a fix for what might be the root cause of PR33721 by making sure that we have integer types before doing select C, -1, 0 -> sext C to int I recently changed m_One and m_AllOnes to use Constant::isOneValue/isAllOnesValue which work on floating point values too. The original implementation looked specifically for ConstantInt scalars and splats. So I'm guessing we are accidentally trying to issue sext/zexts on floating point types now. Hopefully I figure out how to reproduce the failure from the PR soon. llvm-svn: 307486	2017-07-09 03:25:17 +00:00
Max Kazantsev	b9edcbcb1d	Re-enable "[IndVars] Canonicalize comparisons between non-negative values and indvars" The patch was reverted due to a bug. The bug was that if the IV is the 2nd operand of the icmp instruction, then the "Pred" variable gets swapped and differs from the instruction's predicate. In this patch we use the original predicate to do the transformation. Also added a test case that exercises this situation. Differential Revision: https://reviews.llvm.org/D35107 llvm-svn: 307477	2017-07-08 17:17:30 +00:00
Craig Topper	bb4069e439	[InstCombine] Make InstCombine's IRBuilder be passed by reference everywhere Previously the InstCombiner class contained a pointer to an IR builder that had been passed to the constructor. Sometimes this would be passed to helper functions as either a pointer or the pointer would be dereferenced to be passed by reference. This patch makes it a reference everywhere including the InstCombiner class itself so there is no more inconsistency. This is a large, but mechanical patch. I've done very minimal formatting changes on it despite what clang-format wanted to do. llvm-svn: 307451	2017-07-07 23:16:26 +00:00
Dehao Chen	64c46574b0	Increase the import-threshold for critical functions. Summary: For iterative sample-pgo, if a hot call site is inlined in the profiling binary, we should inline it before profile annotation in the backend. Before that, the compile phase first collects all GUIDs that need to be imported and creates a virtual "hot" call edge in the summary. However, "hot" is not good enough to guarantee the callsites get inlined. This patch introduces a "critical" call edge, and assigns a much higher importing threshold for those edges. Reviewers: tejohnson Reviewed By: tejohnson Subscribers: sanjoy, mehdi_amini, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D35096 llvm-svn: 307439	2017-07-07 21:01:00 +00:00
Anna Thomas	e3872003d0	[LoopUnrollRuntime] Support multiple exit blocks unrolling when prolog remainder generated With the NFC refactoring in rL307417 (git SHA 987dd01), all the logic is in place to support multiple exit/exiting blocks when prolog remainder is generated. This patch removed the assert that multiple exit blocks unrolling is only supported when epilog remainder is generated. Also, added test runs and checks with PROLOG prefix in runtime-loop-multiple-exits.ll test cases. llvm-svn: 307435	2017-07-07 20:12:32 +00:00
Davide Italiano	4eb210bdeb	[Local] Update the comment for removeUnreachableBlocks. It referenced a wrong function name, and didn't mention what the second argument did. This should be slightly more accurate now. llvm-svn: 307425	2017-07-07 18:54:14 +00:00
Gor Nishanov	8cdf648795	[cloning] Do not duplicate types when cloning functions Summary: This is an addon to the change rl304488 cloning fixes. (Originally rl304226 reverted rl304228 and reapplied rl304488 https://reviews.llvm.org/D33655) rl304488 works great when DILocalVariables that comes from the inlined function has a 'unique-ed' type, but, in the case when the variable type is distinct we will create a second DILocalVariable in the scope of the original function that was inlined. Consider cloning of the following function: ``` define private void @f() !dbg !5 { %1 = alloca i32, !dbg !11 call void @llvm.dbg.declare(metadata i32* %1, metadata !14, metadata !12), !dbg !18 ret void, !dbg !18 } !14 = !DILocalVariable(name: "inlined", scope: !15, file: !6, line: 5, type: !17) ; came from an inlined function !15 = distinct !DISubprogram(name: "inlined", linkageName: "inlined", scope: null, file: !6, line: 8, type: !7, isLocal: true, isDefinition: true, scopeLine: 9, isOptimized: false, unit: !0, variables: !16) !16 = !{!14} !17 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "some_struct", size: 32, align: 32) ``` Without this fix, when function 'f' is cloned, we will create another DILocalVariable for "inlined", due to its type being distinct. 
``` define private void @f.1() !dbg !23 { %1 = alloca i32, !dbg !26 call void @llvm.dbg.declare(metadata i32* %1, metadata !28, metadata !12), !dbg !30 ret void, !dbg !30 } !14 = !DILocalVariable(name: "inlined", scope: !15, file: !6, line: 5, type: !17) !15 = distinct !DISubprogram(name: "inlined", linkageName: "inlined", scope: null, file: !6, line: 8, type: !7, isLocal: true, isDefinition: true, scopeLine: 9, isOptimized: false, unit: !0, variables: !16) !16 = !{!14} !17 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "some_struct", size: 32, align: 32) ; !28 = !DILocalVariable(name: "inlined", scope: !15, file: !6, line: 5, type: !29) ; OOPS second DILocalVariable !29 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "some_struct", size: 32, align: 32) ``` Now we have two DILocalVariables for "inlined" within the same scope. This results in an assert in AsmPrinter/DwarfDebug.h:131: void llvm::DbgVariable::addMMIEntry(const llvm::DbgVariable &): Assertion `V.Var == Var && "conflicting variable"' failed. (Full example: See: https://bugs.llvm.org/show_bug.cgi?id=33492) In this change we prevent duplication of types so that when the metadata for a DILocalVariable is cloned it will get uniqued to the same metadata node as the original variable. Reviewers: loladiro, dblaikie, aprantl, echristo Reviewed By: loladiro Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D35106 llvm-svn: 307418	2017-07-07 18:24:20 +00:00
Anna Thomas	734ab3f75c	[LoopUnrollRuntime] NFC: use the precomputed loop exit in ConnectProlog Minor refactoring to use the preexisting loop exit that's already calculated. We do not need to recompute the loop exit in ConnectProlog. Apart from avoiding redundant computation, this is required for supporting multiple loop exits when Prolog remainder loops are generated. llvm-svn: 307417	2017-07-07 18:05:28 +00:00
Yaxun Liu	b909f11a31	[InferAddressSpaces] Fix assertion about null pointer InferAddressSpaces does not check address space in collectFlatAddressExpressions, which causes values with non flat address space put into Postorder and causes assertion in cloneValueWithNewAddressSpace. This patch fixes assertion in OpenCL 2.0 conformance test generic_address_space subtest for amdgcn target. Differential Revision: https://reviews.llvm.org/D34991 llvm-svn: 307349	2017-07-07 02:40:13 +00:00
Sean Fertile	9cd1cdf814	Extend memcpy expansion in Transform/Utils to handle wider operand types. Adds loop expansions for known-size and unknown-sized memcpy calls, allowing the target to provide the operand types through TTI callbacks. The default values for the TTI callbacks use int8 operand types and matches the existing behaviour if they aren't overridden by the target. Differential revision: https://reviews.llvm.org/D32536 llvm-svn: 307346	2017-07-07 02:00:06 +00:00
Evgeniy Stepanov	7d3eeaaa96	Revert r307342, r307343. Revert "Copy arguments passed by value into explicit allocas for ASan." Revert "[asan] Add end-to-end tests for overflows of byval arguments." Build failure on lldb-x86_64-ubuntu-14.04-buildserver. Test failure on clang-cmake-aarch64-42vma and sanitizer-x86_64-linux-android. llvm-svn: 307345	2017-07-07 01:31:23 +00:00
Evgeniy Stepanov	2a7a4bc1c9	Copy arguments passed by value into explicit allocas for ASan. ASan determines the stack layout from alloca instructions. Since arguments marked as "byval" do not have an explicit alloca instruction, ASan does not produce red zones for them. This commit produces an explicit alloca instruction and copies the byval argument into the allocated memory so that red zones are produced. Patch by Matt Morehouse. Differential revision: https://reviews.llvm.org/D34789 llvm-svn: 307342	2017-07-07 00:48:25 +00:00
Wei Mi	7586755013	[ConstHoisting] Turn on consthoist-with-block-frequency by default. Using profile information to guide consthoisting is generally helpful for performance, so the patch turns it on by default. No compile time or perf regression were found using spec2000 and spec2006 on x86. Some significant improvement (>20%) was seen on internal benchmarks. Differential Revision: https://reviews.llvm.org/D35063 llvm-svn: 307338	2017-07-07 00:11:05 +00:00
Craig Topper	cb22039bee	[InstCombine] No need to pass DataLayout to helper functions if we're passing the InstCombiner object. We can just ask it for the DataLayout. NFC llvm-svn: 307333	2017-07-06 23:18:43 +00:00
Craig Topper	4853c4304b	[InstCombine] Remove unused arguments from some helper functions. NFC llvm-svn: 307332	2017-07-06 23:18:42 +00:00
Craig Topper	2bb9f0f620	[InstCombine] Change a couple helper functions to only take the IRBuilder as an argument and not the whole InstCombiner object. NFC llvm-svn: 307331	2017-07-06 23:18:41 +00:00
Wei Mi	20526b2725	[ConstHoisting] choose to hoist when frequency is the same. The patch is to adjust the strategy of frequency based consthoisting: Previously when the candidate block has the same frequency with the existing blocks containing a const, it will not hoist the const to the candidate block. For that case, now we change the strategy to hoist the const if only existing blocks have more than one block member. This is helpful for reducing code size. Differential Revision: https://reviews.llvm.org/D35084 llvm-svn: 307328	2017-07-06 22:32:27 +00:00
Davide Italiano	f4891d29f8	[lib/LTO] Add a comment to explain where we set the linkage in the summary. Pointed out by Teresa! llvm-svn: 307305	2017-07-06 20:04:20 +00:00
Davide Italiano	6a5fbe52fa	[LTO] Fix the interaction between linker redefined symbols and ThinLTO This is the same as r304719 but for ThinLTO. The substantial difference is that in this case we don't have whole visibility, just the summary. In the LTO case, when we got the resolution for the input file we could just see if the linker told us whether a symbol was linker redefined (using --wrap or --defsym) and switch the linkage directly for the GV. Here, we have the summary. So, we record that the linkage changed from <whatever it was> to $weakany to prevent IPOs across this symbol boundary and actually just switch the linkage at FunctionImport time. This patch should also fix the lld bits (as all the scaffolding for communicating if a symbol is linker redefined should be there & should be the same), but I'll make sure to add some tests there as well. Fixes PR33192. Differential Revision: https://reviews.llvm.org/D35064 llvm-svn: 307303	2017-07-06 19:58:26 +00:00
Craig Topper	e9bf7ebacf	[InstCombine] Remove include of DIBuilder.h and Dwarf.h as they don't appear to be necessary. llvm-svn: 307295	2017-07-06 18:47:47 +00:00
Leo Li	5499b1b8be	Modify constraints in `llvm::canReplaceOperandWithVariable` Summary: `Instruction::Switch`: only first operand can be set to a non-constant value. `Instruction::InsertValue` both the first and the second operand can be set to a non-constant value. `Instruction::Alloca` return true for non-static allocation. Reviewers: efriedma Reviewed By: efriedma Subscribers: srhines, pirama, llvm-commits Differential Revision: https://reviews.llvm.org/D34905 llvm-svn: 307294	2017-07-06 18:47:05 +00:00
Craig Topper	ca2c87653c	[Constants] Replace calls to ConstantInt::equalsInt(0)/equalsInt(1) with isZero and isOne. NFCI llvm-svn: 307293	2017-07-06 18:39:49 +00:00
Craig Topper	79ab643da8	[Constants] If we already have a ConstantInt*, prefer to use isZero/isOne/isMinusOne instead of isNullValue/isOneValue/isAllOnesValue inherited from Constant. NFCI Going through the Constant methods requires redetermining that the Constant is a ConstantInt and then calling isZero/isOne/isMinusOne. llvm-svn: 307292	2017-07-06 18:39:47 +00:00
Anna Thomas	eb6d5d1950	[LoopUnrollRuntime] Bailout when multiple exiting blocks to the unique latch exit block Currently, we do not support multiple exiting blocks to the latch exit block. However, this bailout wasn't triggered when we had a unique exit block (which is the latch exit), with multiple exiting blocks to that unique exit. Moved the bailout so that it's triggered in both cases and added testcase. llvm-svn: 307291	2017-07-06 18:39:26 +00:00
Craig Topper	47c8f66997	[InstCombine] Remove Builder argument from InstCombiner::tryFactorization. NFC Builder is already a member of the InstCombiner class so we can use it without passing it. llvm-svn: 307290	2017-07-06 18:35:52 +00:00
Craig Topper	dfd01ea9ed	[SimplifyCFG] Move a portion of an if statement that should already be implied to an assert Summary: In this code we got to Dom by following the predecessor link of BB. So it stands to reason that BB should also show up as a successor of Dom's terminator right? There isn't a way to have the CFG connect in only one direction is there? Reviewers: jmolloy, davide, mcrosier Reviewed By: mcrosier Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D35025 llvm-svn: 307276	2017-07-06 16:29:43 +00:00
Craig Topper	95e4142f94	[InstCombine] Change helper method to a file local static method. NFC llvm-svn: 307275	2017-07-06 16:24:23 +00:00
Craig Topper	fc42acef92	[InstCombine] Clarify comment to mention other transform that it does. NFC llvm-svn: 307274	2017-07-06 16:24:22 +00:00
Craig Topper	22795de20a	[InstCombine] Add single use checks to SimplifyBSwap to ensure we are really saving instructions Bswap isn't a simple operation so we need to make sure we are really removing a call to it before doing these simplifications. For the case when both LHS and RHS are bswaps I've allowed it to be moved if either LHS or RHS has a single use since that at least allows us to move it later where it might find another bswap to combine with and it decreases the use count on the other side so maybe the other user can be optimized. Differential Revision: https://reviews.llvm.org/D34974 llvm-svn: 307273	2017-07-06 16:24:21 +00:00
Craig Topper	3e1909d797	[InstCombine] Don't create extra ConstantInt objects in foldSelectICmpAnd. NFCI Instead just use APInt objects and only create a ConstantInt at the end if we need it for the Offset. llvm-svn: 307270	2017-07-06 15:58:54 +00:00
Wei Mi	90707394e3	[LSR] Narrow search space by filtering non-optimal formulae with the same ScaledReg and Scale. When the formulae search space is huge, LSR uses a series of heuristics to keep pruning the search space until the number of possible solutions is within a certain limit. The big hammer of the series of heuristics is NarrowSearchSpaceByPickingWinnerRegs, which picks the register which is used by the most LSRUses and deletes the other formulae which don't use the register. This is an effective way to prune the search space, but quite often not a good way to keep the best solution. We have seen cases where the heuristic pruned the best formula candidate out of the search space. To relieve the problem, we introduce a new heuristic called NarrowSearchSpaceByFilterFormulaWithSameScaledReg. The basic idea is that, in order to reduce the search space while keeping the best formula, we want to keep as many formulae with different Scale and ScaledReg as possible. That is because the central idea of LSR is to choose a group of loop induction variables and use those induction variables to represent LSRUses. An induction variable candidate is often represented by the Scale and ScaledReg in a formula. If we have more formulae with different ScaledReg and Scale to choose from, we have a better opportunity to find the best solution. That is why we believe pruning the search space by only keeping the best formula with the same Scale and ScaledReg should be more effective than PickingWinnerReg. We use two criteria to choose the best formula with the same Scale and ScaledReg. The first criterion is to select the formula using fewer non-shared registers, and the second criterion is to select the formula with the lower cost from RateFormula. The patch implements the heuristic before NarrowSearchSpaceByPickingWinnerRegs, which is the last resort. Testing shows we get 1.8% and 2% on two internal benchmarks on x86. llvm nightly testsuite performance is neutral. We also tried lsr-exp-narrow and it didn't help on the two improved internal cases we saw. Differential Revision: https://reviews.llvm.org/D34583 llvm-svn: 307269	2017-07-06 15:52:14 +00:00
Max Kazantsev	98838527c6	Revert "Revert "Revert "[IndVars] Canonicalize comparisons between non-negative values and indvars""" It appears that the problem is still there. Needs more analysis to understand why the SaturatingMultiply test fails. llvm-svn: 307249	2017-07-06 10:47:13 +00:00
Max Kazantsev	c8db20b78c	Revert "Revert "[IndVars] Canonicalize comparisons between non-negative values and indvars"" It seems that the patch was reverted by mistake. Clang testing showed failure of the MathExtras.SaturatingMultiply test, however I was unable to reproduce the issue on the fresh code base and was able to confirm that the transformation introduced by the change does not happen in the said test. This gives a strong confidence that the actual reason of the failure of the initial patch was somewhere else, and that problem now seems to be fixed. Re-submitting the change to confirm that. llvm-svn: 307244	2017-07-06 09:57:41 +00:00
Frederich Munch	52dfcd18d1	Avoid constructing GlobalExtensions only to find out it is empty. Summary: GlobalExtensions is dereferenced twice, once for iteration and then a check if it is empty. As a ManagedStatic this dereference forces its construction which is unnecessary. Reviewers: efriedma, davide, mehdi_amini Reviewed By: mehdi_amini Subscribers: chapuni, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D33381 llvm-svn: 307229	2017-07-06 00:09:09 +00:00
Davide Italiano	7dd0694f96	[GlobalOpt] Remove unreachable blocks before optimizing a function. LLVM's definition of dominance allows instructions that are cyclic in unreachable blocks, e.g.: %pat = select i1 %condition, @global, i16* %pat because any instruction dominates an instruction in a block that's not reachable from entry. So, remove unreachable blocks from the function, because a) there's no point in analyzing them and b) GlobalOpt should otherwise grow some more complicated logic to break these cycles. Differential Revision: https://reviews.llvm.org/D35028 llvm-svn: 307215	2017-07-05 22:28:28 +00:00
Craig Topper	cc418b656a	[InstCombine] Use CmpInst::Predicate with m_Cmp instead of ICmpInst::Predicate. NFC There isn't really an ICmpInst version so we're just accessing the CmpInst version through inheritance. llvm-svn: 307199	2017-07-05 20:31:00 +00:00
Dinar Temirbulatov	b78adec638	[SLPVectorizer] Add an extra parameter to cancelScheduling function, NFCI. llvm-svn: 307158	2017-07-05 13:53:03 +00:00
David Green	b26a0a460c	[IndVarSimplify] Add AShr exact flags using induction variables ranges. This adds exact flags to AShr/LShr flags where we can statically prove it is valid using the range of induction variables. This allows further optimisations to remove extra loads. Differential Revision: https://reviews.llvm.org/D34207 llvm-svn: 307157	2017-07-05 13:25:58 +00:00
Max Kazantsev	ebe56283bc	Revert "[IndVars] Canonicalize comparisons between non-negative values and indvars" This patch seems to cause failures of test MathExtras.SaturatingMultiply on multiple buildbots. Reverting until the reason of that is clarified. Differential Revision: https://reviews.llvm.org/rL307126 llvm-svn: 307135	2017-07-05 09:44:41 +00:00
Max Kazantsev	80bc4a5554	[IndVars] Canonicalize comparisons between non-negative values and indvars If there is an IndVar which is known to be non-negative, and there is a value which is also non-negative, then signed and unsigned comparisons between them produce the same result. Both of those can be seen in the same loop. To allow other optimizations to simplify them, we turn all instructions like %c = icmp slt i32 %iv, %b to %c = icmp ult i32 %iv, %b if both %iv and %b are known to be non-negative. Differential Revision: https://reviews.llvm.org/D34979 llvm-svn: 307126	2017-07-05 06:38:49 +00:00
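The [IndVars] entry above rests on one arithmetic fact: when both operands are non-negative, signed and unsigned less-than give the same answer. A minimal Python sketch that checks this exhaustively for 8-bit values follows. The helper names (`to_signed`, `icmp_slt`, `icmp_ult`) are hypothetical stand-ins for the IR predicates, not LLVM APIs, and the 8-bit width is an assumption chosen only to keep the search small.

```python
BITS = 8                      # assumed width, small enough to enumerate
MASK = (1 << BITS) - 1

def to_signed(v):
    """Interpret a BITS-wide bit pattern as a two's-complement integer."""
    v &= MASK
    return v - (1 << BITS) if v & (1 << (BITS - 1)) else v

def icmp_slt(a, b):
    """Signed less-than on BITS-wide bit patterns (models `icmp slt`)."""
    return to_signed(a) < to_signed(b)

def icmp_ult(a, b):
    """Unsigned less-than on BITS-wide bit patterns (models `icmp ult`)."""
    return (a & MASK) < (b & MASK)

def predicates_agree_when_nonnegative():
    """True if slt and ult agree for every pair of non-negative values."""
    non_negative = [v for v in range(1 << BITS) if to_signed(v) >= 0]
    return all(icmp_slt(a, b) == icmp_ult(a, b)
               for a in non_negative for b in non_negative)

def negative_operand_counterexample():
    """0xFF is -1 when signed: -1 < 0 holds, but 255 < 0 does not."""
    return icmp_slt(0xFF, 0), icmp_ult(0xFF, 0)
```

The counterexample shows why the non-negativity precondition on both `%iv` and `%b` matters: with a negative operand the two predicates can disagree.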
Anna Thomas	ada4ddc0bc	[LoopDeletion] NFC: Add loop being analyzed debug statement llvm-svn: 307096	2017-07-04 17:00:03 +00:00
Anna Thomas	90f69abc8b	[LoopDeletion] NFC: Add debug statements to the optimization We have a DEBUG option for loop deletion, but no related debug messages. Added some debug messages to state why loop deletion failed. llvm-svn: 307078	2017-07-04 14:05:19 +00:00
Craig Topper	0f746c2793	[InstCombine] Add TODOs for a couple things that should maybe be in InstSimplify instead. NFC llvm-svn: 307065	2017-07-04 06:50:48 +00:00
Florian Hahn	4eeff394d3	[LoopInterchange] Add more debug messages to currentLimitations(). Summary: This makes it easier to find out which limitation prevented this pass from doing its work. Reviewers: karthikthecool, mzolotukhin, efriedma, mcrosier Reviewed By: mcrosier Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D34940 llvm-svn: 307035	2017-07-03 15:32:00 +00:00
Benjamin Kramer	fb620493e1	Revert "[GVN] Recommit the patch "Add phi-translate support in scalarpre"." This reverts commit r306313. This breaks selfhost at -O3 and PR33652. Let me know if you need additional information on reproducing the issue. llvm-svn: 307021	2017-07-03 12:23:10 +00:00
Craig Topper	8036970008	[InstCombine] Add a TODO for a probable missing single use check. NFC Will try to fix it soon, but in case I forget. llvm-svn: 307003	2017-07-03 05:54:16 +00:00
Craig Topper	766ce6e9cf	[InstCombine] Support BITWISE_OP( BSWAP(x), CONSTANT ) -> BSWAP( BITWISE_OP(x, BSWAP(CONSTANT) ) ) for splat vectors. llvm-svn: 307002	2017-07-03 05:54:15 +00:00
Craig Topper	32fce4d647	[InstCombine] Remove support for BITWISE_OP(CONSTANT, BSWAP(x)) -> BSWAP(OP(BSWAP(CONSTANT), x)). Constants were already canonicalized to the right hand side before we got here. llvm-svn: 307000	2017-07-03 05:54:13 +00:00
Craig Topper	1e4643a98e	[InstCombine] Support BITWISE_OP(BSWAP(A),BSWAP(B))->BSWAP(BITWISE_OP(A, B)) for vectors. llvm-svn: 306999	2017-07-03 05:54:13 +00:00
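The three bswap entries above rely on the fact that a byte swap only permutes bit positions, so it commutes with and/or/xor. A minimal Python sketch of the scalar identities follows; the commits themselves concern vector and splat-vector support, which this sketch does not model. The names `bswap32`, `OPS`, and `SAMPLES` are hypothetical, and the sample values are arbitrary.

```python
import operator

def bswap32(x):
    """Reverse the byte order of a 32-bit value."""
    return int.from_bytes((x & 0xFFFFFFFF).to_bytes(4, "little"), "big")

# The bitwise operations the folds apply to.
OPS = (operator.and_, operator.or_, operator.xor)

# Arbitrary 32-bit sample values, including edge cases.
SAMPLES = (0x00000000, 0x000000FF, 0x12345678,
           0x80000001, 0xDEADBEEF, 0xFFFFFFFF)

def constant_fold_holds():
    """OP(bswap(x), C) == bswap(OP(x, bswap(C))) on all samples."""
    return all(op(bswap32(x), c) == bswap32(op(x, bswap32(c)))
               for op in OPS for x in SAMPLES for c in SAMPLES)

def two_bswap_fold_holds():
    """OP(bswap(a), bswap(b)) == bswap(OP(a, b)) on all samples."""
    return all(op(bswap32(a), bswap32(b)) == bswap32(op(a, b))
               for op in OPS for a in SAMPLES for b in SAMPLES)
```

Both rewrites reduce the number of byte swaps in the expression, which is also why the single-use checks mentioned in other entries matter: the saving only materializes if the original bswap calls actually go away.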
Craig Topper	c6948c25cc	[InstCombine] Remove an if that should have been guaranteed by the caller. Replace with an assert. NFC llvm-svn: 306997	2017-07-03 05:54:11 +00:00
Simon Pilgrim	df2657ac2d	[InstCombine] Use m_BitReverse pattern match helper. NFCI. llvm-svn: 306986	2017-07-02 16:31:16 +00:00
Sanjay Patel	b51e072d35	[InstCombine] fix crash when folding cmp+bswap vector We assumed the constant was a scalar when creating the replacement operand. Also, improve tests for this fold and move the tests for this fold to their own file. I'll move the related and missing tests to this file as a follow-up. llvm-svn: 306985	2017-07-02 16:05:11 +00:00
Sanjay Patel	c3d5cf0bb7	[InstCombine] look through bswap/bitreverse for equality comparisons I noticed this missed bswap optimization in the CGP memcmp() expansion, and then I saw that we don't have the fold in InstCombine. Differential Revision: https://reviews.llvm.org/D34763 llvm-svn: 306980	2017-07-02 14:34:50 +00:00
Hiroshi Inoue	bb703e8960	fix trivial typos; NFC suport -> support llvm-svn: 306968	2017-07-02 03:24:54 +00:00
Craig Topper	f60ab47098	[InstCombine] Fold (a | b) ^ (~a | ~b) --> ~(a ^ b) and (a & b) ^ (~a & ~b) --> ~(a ^ b) Summary: I came across this while thinking about what would happen if one of the operands in this xor pattern was itself an inverted (A & ~B) ^ (~A & B) -> (A ^ B). The patterns here assume that the (~a | ~b) will be demorganed to ~(a & b) first. Though I wonder if there's a multiple use case that would prevent the demorgan. Reviewers: spatel Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34870 llvm-svn: 306967	2017-07-02 01:15:51 +00:00
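The two xor folds in the entry above can be checked by brute force. By De Morgan, `~a | ~b` equals `~(a & b)`, and xor-ing `a | b` with the complement of `a & b` gives the complement of `a ^ b`; the second fold follows the same way. A minimal Python sketch over all 8-bit pairs follows. The width and the helper names (`not8`, `or_fold_holds`, `and_fold_holds`) are assumptions for illustration only.

```python
MASK = 0xFF  # assumed 8-bit width so every pair can be enumerated

def not8(x):
    """Bitwise NOT truncated to 8 bits."""
    return ~x & MASK

def or_fold_holds():
    """(a | b) ^ (~a | ~b) == ~(a ^ b) for every 8-bit a, b."""
    return all(((a | b) ^ (not8(a) | not8(b))) == not8(a ^ b)
               for a in range(256) for b in range(256))

def and_fold_holds():
    """(a & b) ^ (~a & ~b) == ~(a ^ b) for every 8-bit a, b."""
    return all(((a & b) ^ (not8(a) & not8(b))) == not8(a ^ b)
               for a in range(256) for b in range(256))
```

Since the identities are purely bitwise, each bit position behaves independently, so the 8-bit check carries over to wider integer types.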
Davide Italiano	e3f7dda1fb	[CodeExtractor] Remove unneeded and commented out debugging stmts. llvm-svn: 306966	2017-07-02 00:07:18 +00:00
Hiroshi Inoue	ef1c2ba22a	fix trivial typos, NFC llvm-svn: 306952	2017-07-01 07:12:15 +00:00
Davide Italiano	9282f1aece	[Cloner] Re-map simplified cloned instructions. This commit pretty much rolls back the logic added in r306495 as in the testcase provided we simplify an `icmp` looking through a PHI that hasn't been mapped yet. I think instsimplify shouldn't do threading over select/phis or just looking through phis in general, but this is what we have now. Also, add a test to prevent this from happening in case somebody wants to modify this code again. Briefly discussed with Kyle Butt (thanks Kyle!). llvm-svn: 306938	2017-07-01 03:29:33 +00:00
Teresa Johnson	32d95742b8	Recommit "r306541 - Add zero-length check to memcpy/memset load store loop expansion" With fix for use-after-free errors. We can't add the new branch and remove the old one until we are done with the Builder constructed for the block. llvm-svn: 306937	2017-07-01 03:24:10 +00:00
Teresa Johnson	c12306c0ad	Revert "r306473 - re-commit r306336: Enable vectorizer-maximize-bandwidth by default." This still breaks PPC tests we have. I'll forward reproduction instructions to dehao. llvm-svn: 306936	2017-07-01 03:24:09 +00:00
Teresa Johnson	eb4fba9d61	re-commit r306336: Enable vectorizer-maximize-bandwidth by default. Differential Revision: https://reviews.llvm.org/D33341 llvm-svn: 306935	2017-07-01 03:24:08 +00:00
Teresa Johnson	de56903bde	revert r306336 for breaking ppc test. llvm-svn: 306934	2017-07-01 03:24:07 +00:00
Teresa Johnson	1fbaffeba1	Enable vectorizer-maximize-bandwidth by default. Summary: vectorizer-maximize-bandwidth is generally useful in terms of performance. I've tested the impact of changing this to default on speccpu benchmarks on sandybridge machines. The result shows non-negative impact: spec/2006/fp/C++/444.namd 26.84 -0.31% spec/2006/fp/C++/447.dealII 46.19 +0.89% spec/2006/fp/C++/450.soplex 42.92 -0.44% spec/2006/fp/C++/453.povray 38.57 -2.25% spec/2006/fp/C/433.milc 24.54 -0.76% spec/2006/fp/C/470.lbm 41.08 +0.26% spec/2006/fp/C/482.sphinx3 47.58 -0.99% spec/2006/int/C++/471.omnetpp 22.06 +1.87% spec/2006/int/C++/473.astar 22.65 -0.12% spec/2006/int/C++/483.xalancbmk 33.69 +4.97% spec/2006/int/C/400.perlbench 33.43 +1.70% spec/2006/int/C/401.bzip2 23.02 -0.19% spec/2006/int/C/403.gcc 32.57 -0.43% spec/2006/int/C/429.mcf 40.35 +0.27% spec/2006/int/C/445.gobmk 26.96 +0.06% spec/2006/int/C/456.hmmer 24.4 +0.19% spec/2006/int/C/458.sjeng 27.91 -0.08% spec/2006/int/C/462.libquantum 57.47 -0.20% spec/2006/int/C/464.h264ref 46.52 +1.35% geometric mean +0.29% The regression on 453.povray seems real, but is due to secondary effects as all hot functions are bit-identical with and without the flag. I started this patch to consult upstream opinions on this. It will be greatly appreciated if the community can help test the performance impact of this change on other architectures so that we can decide if this should be target-dependent. Reviewers: hfinkel, mkuper, davidxl, chandlerc Reviewed By: chandlerc Subscribers: rengolin, sanjoy, javed.absar, bjope, dorit, magabari, RKSimon, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33341 llvm-svn: 306933	2017-07-01 03:24:06 +00:00
Dinar Temirbulatov	2fb1075f14	[SLPVectorizer] Add isOdd() helper function, NFCI. llvm-svn: 306887	2017-06-30 21:16:26 +00:00
Craig Topper	bcf511c0da	[InstCombine] Replace an unnecessary use of a matcher with just an isa and a cast. NFC We aren't looking through any levels of IR here so I don't think we need the power of a matcher or the temporary variable it requires. llvm-svn: 306885	2017-06-30 21:09:34 +00:00
Ayal Zaks	2ff59d4350	[LV] Sink casts to unravel first order recurrence Check if a single cast is preventing handling a first-order-recurrence Phi, because the scheduling constraints it imposes on the first-order-recurrence shuffle are infeasible; but they can be made feasible by moving the cast downwards. Record such casts and move them when vectorizing the loop. Differential Revision: https://reviews.llvm.org/D33058 llvm-svn: 306884	2017-06-30 21:05:06 +00:00
Sumanth Gundapaneni	5372f0a73e	[SimplifyCFG] Update the name of switch generated lookup table. This patch appends the name of the function to the switch generated lookup table. This will ease the visual debugging in identifying the function the table is generated from. Differential Revision: https://reviews.llvm.org/D34817 llvm-svn: 306867	2017-06-30 20:00:01 +00:00
Simon Pilgrim	77c3c5f9b8	[InstCombine] Add m_BitReverse pattern match helper. NFCI. llvm-svn: 306860	2017-06-30 18:58:29 +00:00
Anna Thomas	e5e5e59d8b	[RuntimeUnrolling] Add logic for loops with multiple exit blocks Summary: Runtime unrolling is done for loops with a single exit block and a single exiting block (and this exiting block should be the latch block). This patch adds logic to support unrolling in the presence of multiple exit blocks (which also means multiple exiting blocks). Currently this is under an off-by-default option and is supported when epilog code is generated. Support in presence of prolog code will be in a future patch (we just need to add more tests, and update comments). This patch is essentially an implementation patch. I have not added any heuristic (in terms of branches added or code size) to decide when this should be enabled. Reviewers: mkuper, sanjoy, reames, evstupac Reviewed by: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33001 llvm-svn: 306846	2017-06-30 17:57:07 +00:00
Nikolai Bozhenov	bde9b14c6f	Revert of r306525: "Canonicalize clamp of float types to minmax" llvm-svn: 306815	2017-06-30 10:39:09 +00:00
Ayal Zaks	8d26f0a602	[LV] Optimize for size when vectorizing loops with tiny trip count It may be detrimental to vectorize loops with very small trip count, as various costs of the vectorized loop body as well as enclosing overheads including runtime tests and scalar iterations may outweigh the gains of vectorizing. The current cost model measures the cost of the vectorized loop body only, expecting it will amortize other costs, and loops with known or expected very small trip counts are not vectorized at all. This patch allows loops with very small trip counts to be vectorized, but under OptForSize constraints, which ensure the cost of the loop body is dominant, having no runtime guards nor scalar iterations. Patch inspired by D32451. Differential Revision: https://reviews.llvm.org/D34373 llvm-svn: 306803	2017-06-30 08:02:35 +00:00
Craig Topper	880bf82685	[InstCombine] In foldXorToXor, move the commutable matcher from the LHS match to the RHS match. No meaningful change intended. There are two conditions ORed here with similar checks and each contains two matches that must be true for the if to succeed. With the commutable match on the first half of the OR, both ifs basically have the same first part and only the second part distinguishes them. With this change we move the commutable match to the second half and make the first half unique. This caused some tests to change because we now produce a commuted result, but this shouldn't matter in practice. llvm-svn: 306800	2017-06-30 07:37:41 +00:00
Chandler Carruth	3545a9e1f9	Remove the BBVectorize pass. It served us well, helped kick-start much of the vectorization efforts in LLVM, etc. Its time has come and past. Back in 2014: http://lists.llvm.org/pipermail/llvm-dev/2014-November/079091.html Time to actually let go and move forward. =] I've updated the release notes both about the removal and the deprecation of the corresponding C API. llvm-svn: 306797	2017-06-30 07:09:08 +00:00
Daniel Jasper	3b704ceba1	Revert "r306541 - Add zero-length check to memcpy/memset load store loop expansion" Segfaults in non-optimized builds. I'll get a stack trace and a reproducer to Teresa. llvm-svn: 306793	2017-06-30 06:37:33 +00:00
Daniel Jasper	5ce1ce742e	Revert "r306473 - re-commit r306336: Enable vectorizer-maximize-bandwidth by default." This still breaks PPC tests we have. I'll forward reproduction instructions to dehao. llvm-svn: 306792	2017-06-30 06:32:21 +00:00
Max Kazantsev	8d0322e612	[SCEV] Use depth limit instead of local cache for SExt and ZExt In rL300494 there was an attempt to deal with excessive compile time on invocations of getSign/ZeroExtExpr using local caching. This approach only helps if we request the same SCEV multiple times throughout recursion. But in the bug PR33431 we see a case where we request different values all the time, so caching does not help and the size of the cache grows enormously. In this patch we remove the local cache for these methods and add the recursion depth limit instead, as we do for arithmetic. This gives us a guarantee that the invocation sequence is limited and reasonably short. Differential Revision: https://reviews.llvm.org/D34273 llvm-svn: 306785	2017-06-30 05:04:09 +00:00
Eric Christopher	a95aac3751	Reduce indenting and clean up comparisons around sign bit. llvm-svn: 306781	2017-06-30 01:57:48 +00:00
Eric Christopher	710c1c8faa	Reduce the complexity of the signbit/branch test functions. llvm-svn: 306779	2017-06-30 01:35:31 +00:00
Dehao Chen	2f31d0d86e	Hook the sample PGO machinery in the new PM Summary: This patch hooks up SampleProfileLoaderPass with the new PM. Reviewers: chandlerc, davidxl, davide, tejohnson Reviewed By: chandlerc, tejohnson Subscribers: tejohnson, llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D34720 llvm-svn: 306763	2017-06-29 23:33:05 +00:00
Dinar Temirbulatov	f05c73c132	[SLPVectorizer] Moving Entry->NeedToGather check out of inner loop, since it is invariant there. NFCI. llvm-svn: 306749	2017-06-29 21:56:33 +00:00
Sam Clegg	3d65030c45	Remove `inline` keyword from inline `classof` methods The style guide states that the explicit `inline` should not be used with inline methods. classof is a very common inline method with a fair amount of inconsistency: $ git grep classof ./include | grep inline | wc -l 230 $ git grep classof ./include | grep -v inline | wc -l 257 I chose to target this method rather than the larger change since this method is easily cargo-culted (I did it at least once). I considered doing the larger change and removing all occurrences but that would be a much larger change. Differential Revision: https://reviews.llvm.org/D33906 llvm-svn: 306731	2017-06-29 19:35:17 +00:00
Xin Tong	02008c30b5	Remove useless header. NFC llvm-svn: 306712	2017-06-29 17:48:12 +00:00
Leo Li	20fbad9307	[ConstantHoisting] Avoid hoisting constants in GEPs that index into a struct type. Summary: Indices for GEPs that index into a struct type should always be constants. This added more checks in `collectConstantCandidates:` which make sure constants for GEP pointer type are not hoisted. This fixed Bug https://bugs.llvm.org/show_bug.cgi?id=33538 Reviewers: ributzka, rnk Reviewed By: ributzka Subscribers: efriedma, llvm-commits, srhines, javed.absar, pirama Differential Revision: https://reviews.llvm.org/D34576 llvm-svn: 306704	2017-06-29 17:03:34 +00:00
Daniel Berlin	b7df17ec59	PredicateInfo: Use OrderedInstructions instead of our homemade version. llvm-svn: 306703	2017-06-29 17:01:14 +00:00
Daniel Berlin	b779db7ebc	NewGVN: Remove useless test in addPhiOfOps. llvm-svn: 306702	2017-06-29 17:01:10 +00:00
Daniel Berlin	7c757aee38	Remove unneeded else from OrderedInstructions::dominates. llvm-svn: 306701	2017-06-29 17:01:03 +00:00
Dinar Temirbulatov	7b96266a16	[SLPVectorizer] Introducing getTreeEntry() helper function [NFC] Differential Revision: https://reviews.llvm.org/D34756 llvm-svn: 306655	2017-06-29 08:46:18 +00:00
Craig Topper	798a19ab8e	[InstCombine] In visitXor, use m_Not on the instruction itself instead of looking for all ones in Op1. This is consistent with 3 other not checks before this one. NFCI llvm-svn: 306617	2017-06-29 00:07:08 +00:00
Keno Fischer	a236dae5d1	[InstCombine] Retain TBAA when narrowing memory accesses Summary: As discussed on the mailing list it is legal to propagate TBAA to loads/stores from/to smaller regions of a larger load tagged with TBAA. Do so for (load->extractvalue)=>(gep->load) and similar foldings. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D31954 llvm-svn: 306615	2017-06-28 23:36:40 +00:00
Ayal Zaks	d9bc43ef2a	[LV] Fix PR33613 - retain order of insertelement per part r306381 caused PR33613, by reversing the order in which insertelements were generated per unroll part. This patch fixes PR33613 by retaining this order, placing each set of insertelements per part immediately after the last scalar being packed for this part. Includes a test case derived from PR33613. Reference: https://bugs.llvm.org/show_bug.cgi?id=33613 Differential Revision: https://reviews.llvm.org/D34760 llvm-svn: 306575	2017-06-28 17:59:33 +00:00
Geoff Berry	b0573547f6	[LoopUnroll] Fix bug in computeUnrollCount causing it to not honor MaxCount Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper Subscribers: mcrosier, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D34532 llvm-svn: 306564	2017-06-28 17:01:15 +00:00
Sanjay Patel	4e96f19052	[InstCombine] use local variable to reduce code; NFCI llvm-svn: 306560	2017-06-28 16:39:06 +00:00
Geoff Berry	66d9bdbca8	[LoopUnroll] Pass SCEV to getUnrollingPreferences hook. NFCI. Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper Subscribers: jholewinski, arsenm, mzolotukhin, nemanjai, nhaehnle, javed.absar, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D34531 llvm-svn: 306554	2017-06-28 15:53:17 +00:00
Teresa Johnson	538b8d25f0	Add zero-length check to memcpy/memset load store loop expansion Summary: I was testing using this expansion logic in other cases besides NVPTX, and found some runtime failures due to the lack of a check for a zero length memcpy/memset before the loop. There is already such a check in the memmove expansion code though. Reviewers: hfinkel Subscribers: jholewinski, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D34707 llvm-svn: 306541	2017-06-28 13:07:37 +00:00
Nikolai Bozhenov	b01e6b5a52	[InstCombine] Canonicalize clamp of float types to minmax in fast mode. Summary: This commit allows matchSelectPattern to recognize clamp of float arguments in the presence of FMF the same way as already done for integers. This case is a little different though. With integers, given the min/max pattern is recognized, DAGBuilder starts selecting MIN/MAX "automatically". That is not the case for float, because for them only full FMINNAN/FMINNUM/FMAXNAN/FMAXNUM ISD nodes exist and they do care about NaNs. On the other hand, some backends (e.g. X86) have only FMIN/FMAX nodes that do not care about NaNs, and the former NAN/NUM nodes are illegal, thus selection is not happening. So I decided to do this kind of transformation in IR (InstCombiner) instead of complicating the logic in the backend. Reviewers: spatel, jmolloy, majnemer, efriedma, craig.topper Reviewed By: efriedma Subscribers: hiraditya, javed.absar, n.bozhenov, llvm-commits Patch by Andrei Elovikov <andrei.elovikov@intel.com> Differential Revision: https://reviews.llvm.org/D33186 llvm-svn: 306525	2017-06-28 09:26:20 +00:00
Max Kazantsev	6c466a376e	[IRCE][NFC] Better get SCEV for 1 in calculateSubRanges A slightly more efficient way to get constant, we avoid resolving in getSCEV and excessive invocations, and we don't create a ConstantInt if 'true' branch is taken. Differential Revision: https://reviews.llvm.org/D34672 llvm-svn: 306503	2017-06-28 04:57:45 +00:00
Kyle Butt	f73c8a06a9	Inlining: Don't re-map simplified cloned instructions. When simplifying an instruction that has been re-mapped, it should never simplify to an instruction in the original function. In the edge case where we are inlining a function into itself, the existing code led to incorrect behavior. Replace the incorrect code with an assert verifying that we never expect simplification to produce an instruction in the old function, unless the functions are the same. Differential Revision: https://reviews.llvm.org/D33850 llvm-svn: 306495	2017-06-28 01:41:25 +00:00
Peter Collingbourne	92648c25a4	Bitcode: Write the irsymtab to disk. Differential Revision: https://reviews.llvm.org/D33973 llvm-svn: 306487	2017-06-27 23:50:11 +00:00
Geoff Berry	2573a19fe6	[EarlyCSE][MemorySSA] Enable MemorySSA in function-simplification pass of EarlyCSE. llvm-svn: 306477	2017-06-27 22:25:02 +00:00
Dehao Chen	920d022519	re-commit r306336: Enable vectorizer-maximize-bandwidth by default. Differential Revision: https://reviews.llvm.org/D33341 llvm-svn: 306473	2017-06-27 22:05:58 +00:00
Craig Topper	5fe0197622	[InstCombine] Propagate nsw flag when turning mul by pow2 into shift when the constant is a vector splat or the scalar bit width is larger than 64-bits The check to see if we can propagate the nsw flag used m_ConstantInt(uint64_t*&) which doesn't work with splat vectors and has a restriction that the bitwidth of the ConstantInt must be 64 bits or less. This patch changes it to use m_APInt to remove both of these issues. Differential Revision: https://reviews.llvm.org/D34699 llvm-svn: 306457	2017-06-27 19:57:53 +00:00
Serge Guelton	7bc405aa4c	[CodeExtractor] Prevent extraction of block involving blockaddress BlockAddress are only valid within their function context, which does not interact well with CodeExtractor. Detect this case and prevent it. Differential Revision: https://reviews.llvm.org/D33839 llvm-svn: 306448	2017-06-27 18:57:53 +00:00
Yaxun Liu	7c44f340de	[SROA] Fix APInt size when alloca address space is not 0 SROA assumes alloca address space is 0, which causes assertion. This patch fixes that. Differential Revision: https://reviews.llvm.org/D34104 llvm-svn: 306440	2017-06-27 18:26:06 +00:00
Sanjay Patel	7227276d41	[InstCombine] canonicalize icmp predicate feeding select This canonicalization was suggested in D33172 as a way to make InstCombine behavior more uniform. We have this transform for icmp+br, so unless there's some reason that icmp+select should be treated differently, we should do the same thing here. The benefit comes from increasing the chances of creating identical instructions. This is shown in the tests in logical-select.ll (PR32791). InstCombine doesn't fold those directly, but EarlyCSE can simplify the identical cmps, and then InstCombine can fold the selects together. The possible regression for the tests in select.ll raises questions about poison/undef: http://lists.llvm.org/pipermail/llvm-dev/2017-May/113261.html ...but that transform is just as likely to be triggered by this canonicalization as it is to be missed, so we're just pointing out a commutation deficiency in the pattern matching: https://reviews.llvm.org/rL228409 Differential Revision: https://reviews.llvm.org/D34242 llvm-svn: 306435	2017-06-27 17:53:22 +00:00
Dehao Chen	66131665c4	Enable ICP for AutoFDO. Summary: AutoFDO should have ICP enabled. Reviewers: davidxl Reviewed By: davidxl Subscribers: sanjoy, mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D34662 llvm-svn: 306429	2017-06-27 17:23:33 +00:00
Anna Thomas	dc935a6eb6	[LoopUnrollRuntime] Use SCEV exit count for calculating trip count. NFCI Instead of getBackEdgeTakenCount, use getExitCount on the latch exiting block (which is proven to be the only exiting block in the loop to be unrolled). llvm-svn: 306410	2017-06-27 14:14:35 +00:00
Ayal Zaks	fc1e210d44	Recommitting 306331. Undoing revert 306338 after fixed bug: add metadata to the load instead of the reverse shuffle added to it, retaining the original ValueMap implementation. llvm-svn: 306381	2017-06-27 08:41:19 +00:00
Chandler Carruth	3f81d8024c	[SROA] Fix PR32902 by more carefully propagating !nonnull metadata. This is based heavily on the work done in D34285. I mostly wanted to do test cleanup for the author to save them some time, but I had a really hard time understanding why it was so hard to write better test cases for these issues. The problem is that because SROA does a second rewrite of the loads and because we don't propagate !nonnull for non-pointer loads, we first introduced invalid !nonnull metadata and then stripped it back off just in time to avoid most ways of this PR manifesting. Moving to the more careful utility only fixes this by changing the predicate to look at the new load's type rather than the target type. However, that does fix the bug, and the utility is much nicer including adding range metadata to model the nonnull property after a conversion to an integer. However, we have bigger problems because we don't actually propagate range metadata, and the utility to do this extracted from instcombine isn't really in good shape to do this currently. It only handles the case of copying range metadata from an integer load to a pointer load. It doesn't even handle the trivial cases of propagating from one integer load to another when they are the same width! This utility will need to be beefed up prior to using in this location to get the metadata to fully survive. And even then, we need to go and teach things to turn the range metadata into an assume the way we do with nonnull so that when we promote an integer we don't lose the information. All of this will require a new test case that looks kind-of like `preserve-nonnull.ll` does here but focuses on range metadata. It will also likely require more testing because it needs to correctly handle changes to the integer width, especially as SROA actively tries to change the integer width! Last but not least, I'm a little worried about hooking the range metadata up here because the instcombine logic for converting from a range metadata to a nonnull metadata node seems broken in the face of non-zero address spaces where null is not mapped to the integer `0`. So that probably needs to get fixed with test cases both in SROA and in instcombine to cover it. But this does extract the core PR fix from D34285 of preventing the !nonnull metadata from being propagated in a broken state just long enough to feed into promotion and crash value tracking. On D34285 there is some discussion of zero-extend handling because it isn't necessary. First, the new load size covers all of the non-undef (ie, possibly initialized) bits. This may even extend past the original alloca if loading those bits could produce valid data. The only way it's valid for us to zero-extend an integer load in SROA is if the original code had a zero extend or those bits were undef. And we get to assume things like undef never satisfies nonnull, so non undef bits can participate here. No need to special case the zero-extend handling, it just falls out correctly. The original credit goes to Ariel Ben-Yehuda! I'm mostly landing this to save a few rounds of trivial edits fixing style issues and test case formulation. Differential Revision: D34285 llvm-svn: 306379	2017-06-27 08:32:03 +00:00
Mikael Holmen	37b5120a9a	[Reassociate] Make sure EraseInst sets MadeChange Summary: EraseInst didn't report that it made IR changes through MadeChange. It is essential that changes to the IR are reported correctly, since for example ReassociatePass::run() will indicate that all analyses are preserved otherwise. And the CGPassManager determines if the CallGraph is up-to-date based on status from InstructionCombiningPass::runOnFunction(). Reviewers: craig.topper, rnk, davide Reviewed By: rnk, davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34616 llvm-svn: 306368	2017-06-27 05:32:13 +00:00
Dehao Chen	8b7effb344	revert r306336 for breaking ppc test. llvm-svn: 306344	2017-06-26 23:05:35 +00:00
Ayal Zaks	3923c0c46b	reverting 306331. Causes TBAA metadata to be generated on reverse shuffles, investigating. llvm-svn: 306338	2017-06-26 22:26:54 +00:00
Dehao Chen	79655792cc	Enable vectorizer-maximize-bandwidth by default. Summary: vectorizer-maximize-bandwidth is generally useful in terms of performance. I've tested the impact of changing this to default on speccpu benchmarks on sandybridge machines. The result shows non-negative impact: spec/2006/fp/C++/444.namd 26.84 -0.31% spec/2006/fp/C++/447.dealII 46.19 +0.89% spec/2006/fp/C++/450.soplex 42.92 -0.44% spec/2006/fp/C++/453.povray 38.57 -2.25% spec/2006/fp/C/433.milc 24.54 -0.76% spec/2006/fp/C/470.lbm 41.08 +0.26% spec/2006/fp/C/482.sphinx3 47.58 -0.99% spec/2006/int/C++/471.omnetpp 22.06 +1.87% spec/2006/int/C++/473.astar 22.65 -0.12% spec/2006/int/C++/483.xalancbmk 33.69 +4.97% spec/2006/int/C/400.perlbench 33.43 +1.70% spec/2006/int/C/401.bzip2 23.02 -0.19% spec/2006/int/C/403.gcc 32.57 -0.43% spec/2006/int/C/429.mcf 40.35 +0.27% spec/2006/int/C/445.gobmk 26.96 +0.06% spec/2006/int/C/456.hmmer 24.4 +0.19% spec/2006/int/C/458.sjeng 27.91 -0.08% spec/2006/int/C/462.libquantum 57.47 -0.20% spec/2006/int/C/464.h264ref 46.52 +1.35% geometric mean +0.29% The regression on 453.povray seems real, but is due to secondary effects as all hot functions are bit-identical with and without the flag. I started this patch to consult upstream opinions on this. It will be greatly appreciated if the community can help test the performance impact of this change on other architectures so that we can decide if this should be target-dependent. Reviewers: hfinkel, mkuper, davidxl, chandlerc Reviewed By: chandlerc Subscribers: rengolin, sanjoy, javed.absar, bjope, dorit, magabari, RKSimon, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33341 llvm-svn: 306336	2017-06-26 21:41:09 +00:00
Ayal Zaks	e7e15d186b	[LV] Changing the interface of ValueMap, NFC. Instead of providing access to the internal MapStorage holding all Values associated with a given Key, used for setting or resetting them all together, ValueMap keeps its MapStorage internal; its new interface allows getting, setting or resetting a single Value, per part or per part-and-lane. Follows the discussion in https://reviews.llvm.org/D32871. Differential Revision: https://reviews.llvm.org/D34473 llvm-svn: 306331	2017-06-26 21:03:51 +00:00
Wei Mi	71f06420e4	[GVN] Recommit the patch "Add phi-translate support in scalarpre". The recommit fixes three bugs: The first one is to use CurrentBlock instead of PREInstr's Parent as param of performScalarPREInsertion because the Parent of a clone instruction may be uninitialized. The second one is to stop PRE when CurrentBlock to its predecessor is a backedge and an operand of CurInst is defined inside of CurrentBlock. The same value defined inside of loop in last iteration cannot be regarded as available. The third one is an out-of-bound array access in a flipped if guard. Right now scalarpre doesn't have phi-translate support, so it will miss some simple pre opportunities. Like the following testcase, current scalarpre cannot recognize the last "a * b" is fully redundant because a and b used by the last "a * b" expr are both defined by phis. long a[100], b[100], g1, g2, g3; __attribute__((pure)) long goo(); void foo(long a, long b, long c, long d) { g1 = a * b; if (__builtin_expect(g2 > 3, 0)) { a = c; b = d; g2 = a * b; } g3 = a * b; // fully redundant. } The patch adds phi-translate support in scalarpre. This is only a temporary solution before the newpre based on newgvn is available. llvm-svn: 306313	2017-06-26 18:16:10 +00:00
Chandler Carruth	2abb65ae11	[InstCombine] Factor the logic for propagating !nonnull and !range metadata out of InstCombine and into helpers. NFC, this just exposes the logic used by InstCombine when propagating metadata from one load instruction to another. The plan is to use this in SROA to address PR32902. If anyone has better ideas about how to factor this or name variables, I'm all ears, but this seemed like a pretty good start and lets us make progress on the PR. This is based on a patch by Ariel Ben-Yehuda (D34285). llvm-svn: 306267	2017-06-26 03:31:31 +00:00
Chandler Carruth	4a000883c7	[LoopSimplify] Re-instate r306081 with a bug fix w.r.t. indirectbr. This was reverted in r306252, but I already had the bug fixed and was just trying to form a test case. The original commit factored the logic for forming dedicated exits inside of LoopSimplify into a helper that could be used elsewhere and with an approach that required fewer intermediate data structures. See that commit for full details including the change to the statistic, etc. The code looked fine to me and my reviewers, but in fact didn't handle indirectbr correctly -- it left the 'InLoopPredecessors' vector dirty. If you have code that looks just right, you can end up leaking these predecessors into a subsequent rewrite, and crash deep down when trying to update PHI nodes for predecessors that don't exist. I've added an assert that makes the bug much more obvious, and then changed the code to reliably clear the vector so we don't get this bug again in some other form as the code changes. I've also added a test case that does manage to catch this while also giving some nice positive coverage in the face of indirectbr. The real code that found this came out of what I think is CPython's interpreter loop, but any code with really "creative" interpreter loops mixing indirectbr and other exit paths could manage to tickle the bug. It was hard to reduce the original test case because in addition to having a particular pattern of IR, the whole thing depends on the order of the predecessors which in turn depends on use list order. The test case added here was designed so that in multiple different predecessor orderings it should always end up going down the same path and tripping the same bug. I hope. At least, it tripped it for me without manipulating the use list order which is better than anything bugpoint could do... llvm-svn: 306257	2017-06-25 22:45:31 +00:00
Anna Thomas	e7cb633d29	[LoopDeletion] NFC: Move phi node value setting into prepass Recommit NFC patch (rL306157) where I missed incrementing the basic block iterator, which caused loop deletion tests to hang due to infinite loop. Had reverted it in rL306162. rL306157 commit message: Currently, the implementation of delete dead loops has a special case when the loop being deleted is never executed. This special case (updating of exit block's incoming values for phis) can be run as a prepass for non-executable loops before performing the actual deletion. llvm-svn: 306254	2017-06-25 21:13:58 +00:00
Daniel Jasper	4c6cd4ccb7	Revert "[LoopSimplify] Factor the logic to form dedicated exits into a utility." This leads to a segfault. Chandler already has a test case and should be able to recommit with a fix soon. llvm-svn: 306252	2017-06-25 17:58:25 +00:00
Sanjay Patel	2f3ead7adc	[InstCombine] add (sext i1 X), 1 --> zext (not X) http://rise4fun.com/Alive/i8Q A narrow bitwise logic op is obviously better than math for value tracking, and zext is better than sext. Typically, the 'not' will be folded into an icmp predicate. The IR difference would even survive through codegen for x86, so we would see worse code: https://godbolt.org/g/C14HMF one_or_zero(int, int): # @one_or_zero(int, int) xorl %eax, %eax cmpl %esi, %edi setle %al retq one_or_zero_alt(int, int): # @one_or_zero_alt(int, int) xorl %ecx, %ecx cmpl %esi, %edi setg %cl movl $1, %eax subl %ecx, %eax retq llvm-svn: 306243	2017-06-25 14:15:28 +00:00
Xinliang David Li	b67530e9b9	[PGO] Implement profile counter register promotion Differential Revision: http://reviews.llvm.org/D34085 llvm-svn: 306231	2017-06-25 00:26:43 +00:00
Hiroshi Inoue	b300824ee7	fix trivial typos in comment, NFC dereferencable -> dereferenceable llvm-svn: 306210	2017-06-24 15:43:33 +00:00
Craig Topper	7b66ffe875	[ValueTracking][InstCombine] Use m_Shr instead of m_CombineOr(m_LShr, m_AShr). NFC llvm-svn: 306205	2017-06-24 06:24:04 +00:00
Craig Topper	72ee6945af	[Analysis][Transforms] Use commutable matchers instead of m_CombineOr in a few places. NFC llvm-svn: 306204	2017-06-24 06:24:01 +00:00
Vitaly Buka	df19ad456e	[InstCombine] Don't replace allocas with smaller globals Summary: InstCombine replaces large allocas with small global constants, causing buffer overflows on valid code, see PR33372. This fix permits this optimization only if the global is dereferenceable for the alloca size. Fixes PR33372 Reviewers: eugenis, majnemer, chandlerc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34311 llvm-svn: 306194	2017-06-24 01:35:19 +00:00
Anna Thomas	77a2e6b198	Revert "[LoopDeletion] NFC: Move phi node value setting into prepass" This reverts commit r306157. It caused some timeouts in clang tests. Perhaps unreachable loops have far too many phi nodes. Reverting and investigating. llvm-svn: 306162	2017-06-23 21:30:48 +00:00
Anna Thomas	a43b387f27	[LoopDeletion] NFC: Move phi node value setting into prepass Currently, the implementation of delete dead loops has a special case when the loop being deleted is never executed. This special case (updating of exit block's incoming values for phis) can be run as a prepass for non-executable loops before performing the actual deletion. llvm-svn: 306157	2017-06-23 20:38:50 +00:00
Craig Topper	68ed55e06a	[CorrelatedValuePropagation] Fix typo in comment sense->since. NFC llvm-svn: 306152	2017-06-23 20:28:40 +00:00
Craig Topper	29cdfe2cd9	[CorrelatedValuePropagation] Remove comment about iterating switch cases in reverse order. This is no longer being done after r298791. NFC llvm-svn: 306151	2017-06-23 20:28:35 +00:00
Anna Thomas	91eed9ac1a	[RuntimeLoopUnrolling] Rename exit block and move assert earlier. NFC The single exit block allowed in runtime unrolling is guaranteed to be the Latch's successor, so rename it as LatchExitBlock. llvm-svn: 306105	2017-06-23 14:28:01 +00:00
Anna Thomas	d67165c93c	[InstCombine] Recognize and simplify three way comparison idioms Summary: Many languages have a three way comparison idiom where comparing two values produces not a boolean, but a tri-state value. Typical values (e.g. as used in the lcmp/fcmp bytecodes from Java) are -1 for less than, 0 for equality, and +1 for greater than. We actually do a great job already of converting three way comparisons into binary comparisons when the result produced has a single use. Unfortunately, such values can have more than one use, and in that case, our existing optimizations break down. The patch adds a peephole which converts a three-way compare + test idiom into a binary comparison on the original inputs. It focuses on replacing the test on the result of the three way compare and does nothing about removing the three way compare itself. That's left to other optimizations (which do actually kick in commonly.) We currently recognize one idiom on signed integer compare. In the future, we plan to recognize and simplify other comparison idioms on other signed/unsigned datatypes such as floats, vectors etc. This is a resurrection of Philip Reames' original patch: https://reviews.llvm.org/D19452 Reviewers: majnemer, apilipenko, reames, sanjoy, mkazantsev Reviewed by: mkazantsev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34278 llvm-svn: 306100	2017-06-23 13:41:45 +00:00
Craig Topper	2c20c42cb6	[JumpThreading] Teach jump threading how to analyze (and (cmp A, C1), (cmp A, C2)) after InstCombine has turned it into (cmp (add A, C3), C4) Currently JumpThreading can use LazyValueInfo to analyze an 'and' or 'or' of compare if the compare is fed by a livein of a basic block. This can be used to prove the condition can't be met for some predecessor and the jump from that predecessor can be moved to the false path of the condition. But if the compare is something that InstCombine turns into an add and a single compare, it can't be analyzed because the livein is now an input to the add and not the compare. This patch adds a new method to LVI to get a ConstantRange on an edge. Then we teach jump threading to detect the add livein feeding a compare and to get the ConstantRange and propagate it. Differential Revision: https://reviews.llvm.org/D33262 llvm-svn: 306085	2017-06-23 05:41:35 +00:00
Craig Topper	7927996140	[JumpThreading] Use some temporary variables to reduce the number of times we call the same methods. NFC A future patch will add even more uses of these variables. llvm-svn: 306084	2017-06-23 05:41:32 +00:00
Chandler Carruth	4ab0f4910a	[LoopSimplify] Factor the logic to form dedicated exits into a utility. I want to use the same logic as LoopSimplify to form dedicated exits in another pass (SimpleLoopUnswitch) so I wanted to factor it out here. I also noticed that there is a pretty significantly more efficient way to implement this than the way the code in LoopSimplify worked. We don't need to actually retain the set of unique exit blocks, we can just rewrite them as we find them and use only a set to deduplicate. This did require changing one part of LoopSimplify to not re-use the unique set of exits, but it only used it to check that there was a single unique exit. That part of the code is about to walk the exiting blocks anyways, so it seemed better to rewrite it to use those exiting blocks to compute this property on-demand. I also had to ditch a statistic, but it doesn't seem terribly valuable. Differential Revision: https://reviews.llvm.org/D34049 llvm-svn: 306081	2017-06-23 04:03:04 +00:00
Eric Christopher	5a7c2f1700	Remove the LoadCombine pass. It was never enabled and is unsupported. Based on discussions with the author on mailing lists. llvm-svn: 306067	2017-06-22 22:58:12 +00:00
Anna Thomas	72c90c87f8	[LoopDeletion] Update exits correctly when multiple duplicate edges from an exiting block Summary: Currently, we incorrectly update exit blocks of loops when there are multiple edges from a single exiting block to the exit block. This can happen when we have switches as the terminator of the exiting blocks. The fix here is to correctly update the phi nodes in the exit block, and remove all incoming values except for one which is from the preheader. Note: Currently, this error can manifest only while deleting non-executed loops. However, it is possible to trigger this error in invariant loops, once we enhance the logic around the exit conditions for the loop check. Reviewers: chandlerc, dberlin, sanjoy, efriedma Reviewed by: efriedma Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D34516 llvm-svn: 306048	2017-06-22 20:20:56 +00:00
Craig Topper	dffbbcb3fd	[InstCombine] Teach foldSelectICmpAndOr to recognize (select (icmp slt (trunc (X)), 0), Y, (or Y, C2)) Summary: InstCombine likes to turn (icmp eq (and X, C1), 0) into (icmp slt (trunc (X)), 0) sometimes. This breaks foldSelectICmpAndOr's ability to recognize (select (icmp eq (and X, C1), 0), Y, (or Y, C2))->(or (shl (and X, C1), C3), y). This patch tries to recover this. I had to flip around some of the early out checks so that I could create a new And instruction during the compare processing without it possibly never getting used. Reviewers: spatel, majnemer, davide Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34184 llvm-svn: 306029	2017-06-22 16:23:30 +00:00
Craig Topper	0de5e6a729	[InstCombine] Add one use checks to or/and->xnor folding If the components of the and/or had multiple uses, this transform created an additional instruction. This patch makes sure we remove one of the components. Differential Revision: https://reviews.llvm.org/D34498 llvm-svn: 306027	2017-06-22 16:12:02 +00:00
Sanjay Patel	d1e811979c	[InstCombine] reverse bitcast + bitwise-logic canonicalization (PR33138) There are 2 parts to this patch made simultaneously to avoid a regression. We're reversing the canonicalization that moves bitwise vector ops before bitcasts. We're moving bitwise vector ops after bitcasts instead. That's the 1st and 3rd hunks of the patch. The motivation is that there's only one fold that currently depends on the existing canonicalization (see next), but there are many folds that would automatically benefit from the new canonicalization. PR33138 ( https://bugs.llvm.org/show_bug.cgi?id=33138 ) shows why/how we have these patterns in IR. There's an or(and,andn) pattern that requires an adjustment in order to continue matching to 'select' because the bitcast changes position. This match is unfortunately complicated because it requires 4 logic ops with optional bitcast and sext ops. Test diffs: 1. The bitcast.ll and bitcast-bigendian.ll changes show the most basic difference - bitcast comes before logic. 2. There are also tests with no diffs in bitcast.ll that verify that we're still doing folds that were enabled by the previous canonicalization. 3. icmp-xor-signbit.ll shows the payoff. We don't need to adjust existing icmp patterns to look through bitcasts. 4. logical-select.ll contains several tests for the or(and,andn) --> select fold to verify that we are still handling those cases. The lone diff shows the movement of the bitcast from the new canonicalization rule. Differential Revision: https://reviews.llvm.org/D33517 llvm-svn: 306011	2017-06-22 15:46:54 +00:00
Sanjay Patel	e800df8eac	[InstCombine] add peekThroughBitcast() helper; NFC This is an NFC portion of D33517. We have similar helpers in the backend. llvm-svn: 306008	2017-06-22 15:28:01 +00:00
Diana Picus	b512e91515	Revert "Enable vectorizer-maximize-bandwidth by default." This reverts commit r305960 because it broke self-hosting on AArch64. llvm-svn: 305990	2017-06-22 10:00:28 +00:00
Sam Clegg	705f798bff	Mark dump() methods as const. NFC Add const qualifier to any dump() method where adding one was trivial. Differential Revision: https://reviews.llvm.org/D34481 llvm-svn: 305963	2017-06-21 22:19:17 +00:00
Dehao Chen	014db29b89	Enable vectorizer-maximize-bandwidth by default. Summary: vectorizer-maximize-bandwidth is generally useful in terms of performance. I've tested the impact of changing this to default on speccpu benchmarks on sandybridge machines. The result shows non-negative impact: spec/2006/fp/C++/444.namd 26.84 -0.31% spec/2006/fp/C++/447.dealII 46.19 +0.89% spec/2006/fp/C++/450.soplex 42.92 -0.44% spec/2006/fp/C++/453.povray 38.57 -2.25% spec/2006/fp/C/433.milc 24.54 -0.76% spec/2006/fp/C/470.lbm 41.08 +0.26% spec/2006/fp/C/482.sphinx3 47.58 -0.99% spec/2006/int/C++/471.omnetpp 22.06 +1.87% spec/2006/int/C++/473.astar 22.65 -0.12% spec/2006/int/C++/483.xalancbmk 33.69 +4.97% spec/2006/int/C/400.perlbench 33.43 +1.70% spec/2006/int/C/401.bzip2 23.02 -0.19% spec/2006/int/C/403.gcc 32.57 -0.43% spec/2006/int/C/429.mcf 40.35 +0.27% spec/2006/int/C/445.gobmk 26.96 +0.06% spec/2006/int/C/456.hmmer 24.4 +0.19% spec/2006/int/C/458.sjeng 27.91 -0.08% spec/2006/int/C/462.libquantum 57.47 -0.20% spec/2006/int/C/464.h264ref 46.52 +1.35% geometric mean +0.29% The regression on 453.povray seems real, but is due to secondary effects as all hot functions are bit-identical with and without the flag. I started this patch to consult upstream opinions on this. It will be greatly appreciated if the community can help test the performance impact of this change on other architectures so that we can decide if this should be target-dependent. Reviewers: hfinkel, mkuper, davidxl, chandlerc Reviewed By: chandlerc Subscribers: rengolin, sanjoy, javed.absar, bjope, dorit, magabari, RKSimon, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33341 llvm-svn: 305960	2017-06-21 22:01:32 +00:00
Craig Topper	34caf5396f	[Reassociate] Use early returns in a couple places to reduce indentation and improve readability. NFC llvm-svn: 305946	2017-06-21 19:39:35 +00:00
Craig Topper	99a2e89920	[Reassociate] Const correct a helper function. NFC llvm-svn: 305945	2017-06-21 19:39:33 +00:00
Craig Topper	a074c101e5	[InstCombine] Cleanup using commutable matchers. Make a couple helper methods standalone static functions. Put 'if' around variable declaration instead of after. NFC llvm-svn: 305941	2017-06-21 18:57:00 +00:00
Dehao Chen	50f2aa19e8	Do not inline recursive direct calls in sample loader pass. Summary: r305009 disables recursive inlining for indirect calls in sample loader pass. The same logic applies to direct recursive calls. Reviewers: iteratee, davidxl Reviewed By: iteratee Subscribers: sanjoy, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D34456 llvm-svn: 305934	2017-06-21 17:57:43 +00:00
Craig Topper	5b173f2bb3	[InstCombine] Add range metadata to cttz/ctlz/ctpop intrinsic calls based on known bits Summary: I noticed that passing known bits across these intrinsics isn't great at capturing the information we really know. Turning known bits of the input into known bits of a count output isn't able to convey a lot of what we really know. This patch adds range metadata to these intrinsics based on the known bits. Currently the patch punts if we already have range metadata present. Reviewers: spatel, RKSimon, davide, majnemer Reviewed By: RKSimon Subscribers: sanjoy, hfinkel, llvm-commits Differential Revision: https://reviews.llvm.org/D32582 llvm-svn: 305927	2017-06-21 16:32:35 +00:00
Craig Topper	ae86cc725d	[InstCombine] Don't let folding (select (icmp eq (and X, C1), 0), Y, (or Y, C2)) create more instructions than it removes Summary: Previously this folding had no checks to see if it was going to result in fewer instructions. This was pointed out during the review of D34184. This patch adds code to count how many instructions it's going to create vs how many it's going to remove so we can make a proper decision. Reviewers: spatel, majnemer Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34437 llvm-svn: 305926	2017-06-21 16:07:13 +00:00
Craig Topper	cbac691c4b	[Reassociate] Support xor reassociating for splat vectors Summary: This patch adds support for xors of splat vectors. Reviewers: mcrosier Reviewed By: mcrosier Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34354 llvm-svn: 305925	2017-06-21 16:07:09 +00:00
Davide Italiano	0ec715be1f	[NewGVN] Fix a bug that made the store verifier less effective. We weren't actually checking for duplicated stores, as the condition was always actually false. This was found by Coverity, and I have no clue how to trigger this in real-world code (although I tried for a bit). llvm-svn: 305867	2017-06-20 22:57:40 +00:00
Sanjay Patel	4ccbd58d70	[InstCombine] fix code/test comments for r305792; NFC These diffs were in the last version of the patch in D33342, but I accidentally committed the previous rev. llvm-svn: 305793	2017-06-20 12:45:46 +00:00
Sanjay Patel	adca825dc1	[InstCombine] try to canonicalize xor-of-icmps to and-of-icmps We have a large portfolio of folds for and-of-icmps and or-of-icmps in InstSimplify and InstCombine, but hardly anything for xor-of-icmps. Rather than trying to rethink and translate all of those folds, we can use the truth table definition of xor: X ^ Y --> (X | Y) & !(X & Y) ...to see if we can convert the xor to and/or and then use the existing folds. http://rise4fun.com/Alive/J9v Differential Revision: https://reviews.llvm.org/D33342 llvm-svn: 305792	2017-06-20 12:40:55 +00:00
Vedant Kumar	b5794ca90c	[ProfileData] PR33517: Check for failure of symtab creation With PR33517, it became apparent that symbol table creation can fail when presented with malformed inputs. This patch makes that sort of error detectable, so llvm-cov etc. can fail more gracefully. Specifically, we now check that function names within the symbol table aren't empty. Testing: check-{llvm,clang,profile}, some unit test updates. llvm-svn: 305765	2017-06-20 01:38:56 +00:00
Ana Pazos	f731bde064	[PATCH] [PGO] Fixed cast operation in MemIntrinsicVisitor::instrumentOneMemIntrinsic. Reviewers: xur, efriedma, davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34293 llvm-svn: 305737	2017-06-19 20:04:33 +00:00
Taewook Oh	9083547ae3	Improve profile-guided heuristics to use estimated trip count. Summary: Existing heuristic uses the ratio between the function entry frequency and the loop invocation frequency to find cold loops. However, even if the loop executes frequently, if it has a small trip count per each invocation, vectorization is not beneficial. On the other hand, even if the loop invocation frequency is much smaller than the function invocation frequency, if the trip count is high it is still beneficial to vectorize the loop. This patch uses estimated trip count computed from the profile metadata as a primary metric to determine coldness of the loop. If the estimated trip count cannot be computed, it falls back to the original heuristics. Reviewers: Ayal, mssimpso, mkuper, danielcdh, wmi, tejohnson Reviewed By: tejohnson Subscribers: tejohnson, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D32451 llvm-svn: 305729	2017-06-19 18:48:58 +00:00
Bjorn Pettersson	475fcd9cd8	[InstCombine] Make sure AddReachableCodeToWorklist sets MadeIRChange Summary: Some optimizations in AddReachableCodeToWorklist did not update the MadeIRChange state. This could happen both when removing trivially dead instructions (DCE) and at constant folds. It is essential that changes to the IR are reported correctly, since for example InstCombinePass::run() will indicate that all analyses are preserved otherwise. And the CGPassManager determines if the CallGraph is up-to-date based on status from InstructionCombiningPass::runOnFunction(). The new test case early_dce_clobbers_callgraph.ll is a reproducer for some asserts that started to trigger after changes in the inliner in r305245. With this patch the test case passes again. Reviewers: sanjoy, craig.topper, dblaikie Reviewed By: craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34346 llvm-svn: 305725	2017-06-19 18:00:27 +00:00
Hans Wennborg	ca69fc1cb7	Revert r304824 "Fix PR23384 (part 3 of 3)" This seems to be interacting badly with ASan somehow, causing false reports of heap-buffer overflows: PR33514. > Summary: > The patch makes instruction count the highest priority for > LSR solution for X86 (previously registers had highest priority). > > Reviewers: qcolombet > > Differential Revision: http://reviews.llvm.org/D30562 > > From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 305720	2017-06-19 17:57:15 +00:00
Davide Italiano	daa9c0e403	[NewGVN] Simplify findConditionEquivalence(). NFCI. llvm-svn: 305707	2017-06-19 16:46:15 +00:00
Dinar Temirbulatov	e2c6991c07	Remove brackets, NFC. llvm-svn: 305706	2017-06-19 16:44:07 +00:00
Craig Topper	a7529b68cc	[InstCombine] Cleanup some duplicated one use checks Summary: These 4 patterns have the same one use check repeated twice for each. Once without a cast and once with. But the cast has no effect on what method is called. For the OR case I believe it is always profitable regardless of the number of uses since we'll never increase the instruction count. For the AND case I believe it is profitable if the pair of xors has one use such that we'll get rid of it completely. Or if the C value is something freely invertible, in which case the not doesn't cost anything. Reviewers: spatel, majnemer Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34308 llvm-svn: 305705	2017-06-19 16:23:49 +00:00
Craig Topper	ef85498e05	[Reassociate] Support some reassociation of vector xors Summary: Currently we don't try to do anything with vector xors. This patch adds support for removing duplicate pairs from a chain of vector xors as it's pretty easy to support. We still don't try to combine the xors with and/ors, but I might try that in a future patch. Reviewers: mcrosier, davide, resistor Reviewed By: mcrosier Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34338 llvm-svn: 305704	2017-06-19 16:23:46 +00:00
Craig Topper	4350734d36	[Reassociate] Make one of the helper methods static because it doesn't use any class variables. NFC llvm-svn: 305703	2017-06-19 16:23:43 +00:00
Anna Thomas	7949f4529a	[JumpThreading][LVI] Invalidate LVI information after blocks are merged Summary: After a single predecessor is merged into a basic block, we need to invalidate the LVI information for the new merged block, when LVI is not provably true for all of the instructions in the new block. The test cases added show the correct LVI information using the LVI printer pass. Reviewers: reames, dberlin, davide, sanjoy Reviewed by: dberlin, davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34108 llvm-svn: 305699	2017-06-19 15:23:33 +00:00
Xin Tong	b412831d11	[TRE] Improve code motion in TRE, use AA to tell whether a load can be moved before a call that writes to memory. Summary: use AA to tell whether a load can be moved before a call that writes to memory. Reviewers: dberlin, davide, sanjoy, hfinkel Reviewed By: hfinkel Subscribers: hfinkel, llvm-commits Differential Revision: https://reviews.llvm.org/D34115 llvm-svn: 305698	2017-06-19 15:21:18 +00:00
Daniel Berlin	36b08b2088	NewGVN: Fix PR 33461, caused by slightly overzealous verification. llvm-svn: 305657	2017-06-19 00:24:00 +00:00
Craig Topper	d96177cf72	[Reassociate] Use APInt::isNullValue() instead of comparing with 0. NFC This should compile to slightly better code. llvm-svn: 305651	2017-06-18 18:15:38 +00:00
Xin Tong	9d2a5b1cf7	Add argmemonly attribute to strlen and wcslen, i.e. they only read memory (string) passed to them. Summary: This allows strlen to be moved out of the loop in case its argument is not modified in the loop in LICM. Reviewers: hfinkel, davide, sanjoy, dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34323 llvm-svn: 305641	2017-06-18 03:10:26 +00:00
Sanjoy Das	b70ddd8901	[SROA] Add support for non-integral pointers Summary: C.f. http://llvm.org/docs/LangRef.html#non-integral-pointer-type Reviewers: chandlerc, loladiro Reviewed By: loladiro Subscribers: reames, loladiro, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D32203 llvm-svn: 305639	2017-06-17 20:28:13 +00:00
Xin Tong	025780ba6e	[TRE] Add assertion for folding trivial return block llvm-svn: 305637	2017-06-17 16:55:12 +00:00
Xin Tong	d5b4d0b53a	[TRE] Update comments. NFC llvm-svn: 305636	2017-06-17 16:18:36 +00:00
Wei Mi	c7ba876323	Revert rL305578. There is still some buildbot failure to be fixed. llvm-svn: 305603	2017-06-16 23:14:35 +00:00
Anna Thomas	6bc14c65ad	[InstCombine] Set correct insertion point for selects generated while folding phis Summary: When we fold vector constants that are operands of phi's that feed into select, we need to set the correct insertion point for the new selects that get generated. The correct insertion point is the incoming block for the phi. Such cases can occur with patch r298845, which fixed folding of vector constants, but the new selects could be inserted incorrectly (as the added test case shows). Reviewers: majnemer, spatel, sanjoy Reviewed by: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34162 llvm-svn: 305591	2017-06-16 21:08:37 +00:00
Davide Italiano	ec5b0257bf	[SCCP] Simplify the code a bit. NFCI. llvm-svn: 305583	2017-06-16 20:50:31 +00:00
Davide Italiano	0b1190aa8d	[SCCP] Clarify a comment about unhandled instructions. llvm-svn: 305579	2017-06-16 20:27:17 +00:00
Wei Mi	a2493b6ad9	[GVN] Recommit the patch "Add phi-translate support in scalarpre". The recommit fixes two bugs: The first one is to use CurrentBlock instead of PREInstr's Parent as param of performScalarPREInsertion because the Parent of a clone instruction may be uninitialized. The second one is to stop PRE when CurrentBlock to its predecessor is a backedge and an operand of CurInst is defined inside of CurrentBlock. The same value defined inside of loop in last iteration cannot be regarded as available. Right now scalarpre doesn't have phi-translate support, so it will miss some simple PRE opportunities. Like the following testcase, current scalarpre cannot recognize the last "a * b" is fully redundant because a and b used by the last "a * b" expr are both defined by phis. long a[100], b[100], g1, g2, g3; __attribute__((pure)) long goo(); void foo(long a, long b, long c, long d) { g1 = a * b; if (__builtin_expect(g2 > 3, 0)) { a = c; b = d; g2 = a * b; } g3 = a * b; // fully redundant. } The patch adds phi-translate support in scalarpre. This is only a temporary solution before the newpre based on newgvn is available. Differential Revision: https://reviews.llvm.org/D32252 llvm-svn: 305578	2017-06-16 20:21:01 +00:00
Davide Italiano	d95d871af0	[SCCP] Remove redundant instruction visitors. Whenever we don't know what to do with an instruction, we send it to overdefined anyway. llvm-svn: 305575	2017-06-16 19:43:57 +00:00
Xinliang David Li	c3f8e83253	Fix function name /NFC llvm-svn: 305564	2017-06-16 16:54:13 +00:00
Daniel Neilson	3faabbbe85	[Atomics] Rename and change prototype for atomic memcpy intrinsic Summary: Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html This change is to alter the prototype for the atomic memcpy intrinsic. The prototype itself is being changed to more closely resemble the semantics and parameters of the llvm.memcpy intrinsic -- to ease later combination of the llvm.memcpy and atomic memcpy intrinsics. Furthermore, the name of the atomic memcpy intrinsic is being changed to make it clear that it is not a generic atomic memcpy, but specifically a memcpy that is unordered atomic. Reviewers: reames, sanjoy, efriedma Reviewed By: reames Subscribers: mzolotukhin, anna, llvm-commits, skatkov Differential Revision: https://reviews.llvm.org/D33240 llvm-svn: 305558	2017-06-16 14:43:59 +00:00
Craig Topper	da6ea0d3e8	[InstCombine] Fold (!iszero(A & K1) & !iszero(A & K2)) -> (A & (K1 | K2)) == (K1 | K2) if K1 and K2 are a 1-bit mask Summary: This is the demorganed version of the case we already handle for the OR of iszero. Reviewers: spatel Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34244 llvm-svn: 305548	2017-06-16 05:10:37 +00:00
Craig Topper	4d76da4fc9	[CorrelatedValuePropagation] Remove superfluous semicolon. NFC llvm-svn: 305538	2017-06-16 01:53:20 +00:00
Evgeniy Stepanov	4d4ee93d25	[cfi] CFI-ICall for ThinLTO. Implement ControlFlowIntegrity for indirect function calls in ThinLTO. Design follows the RFC in llvm-dev, see https://groups.google.com/d/msg/llvm-dev/MgUlaphu4Qc/kywu0AqjAQAJ llvm-svn: 305533	2017-06-16 00:18:29 +00:00
Xinliang David Li	eea0ade2eb	[PartialInlining] Code Refactoring This is a NFC code refactoring and interface cleanup. This paves the way to enable outlining-only mode for the partial inliner. llvm-svn: 305530	2017-06-15 23:56:59 +00:00
Craig Topper	2ba991ff2c	[InstCombine] Add two FIXMEs for bad single use checks. NFC llvm-svn: 305510	2017-06-15 21:38:48 +00:00
Teresa Johnson	152277952e	Split PGO memory intrinsic optimization into its own source file Summary: Split the PGOMemOPSizeOpt pass out from IndirectCallPromotion.cpp into its own file. Reviewers: davidxl Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D34248 llvm-svn: 305501	2017-06-15 20:23:57 +00:00
Craig Topper	f2d3e6d3d5	[InstCombine] Make the context instruction parameter of foldOrOfICmps a reference to discourage passing nullptr and to remove the '&' from all of the call sites. NFC llvm-svn: 305493	2017-06-15 19:09:51 +00:00
Craig Topper	6eec9e21a5	[InstCombine] Handle (iszero(A & K1) | iszero(A & K2)) -> (A & (K1 | K2)) != (K1 | K2) when one of the Ands is commuted relative to the other Currently we expect A to be on the same side in both Ands but nothing guarantees that. While there also switch to using matchers for some of the code. Differential Revision: https://reviews.llvm.org/D34230 llvm-svn: 305487	2017-06-15 17:55:20 +00:00
Max Kazantsev	dc80366d52	[ScalarEvolution] Apply Depth limit to getMulExpr This is a fix for PR33292 that shows a case of extremely long compilation of a single .c file with clang, with most time spent within SCEV. We have a mechanism of limiting recursion depth for getAddExpr to avoid long analysis in SCEV. However, there are calls from getAddExpr to getMulExpr and back that do not propagate the info about depth. As a result of this, a chain getAddExpr -> ... -> getAddExpr -> getMulExpr -> getAddExpr -> ... -> getAddExpr can be extremely long, with every segment of getAddExpr's being up to max depth long. This leads either to long compilation or crash by stack overflow. We face this situation while analyzing big SCEVs in the test of PR33292. This patch applies the same limit on max expression depth for getAddExpr and getMulExpr. Differential Revision: https://reviews.llvm.org/D33984 llvm-svn: 305463	2017-06-15 11:48:21 +00:00
George Karpenkov	406c113103	Fixing section name for Darwin platforms for sanitizer coverage On Darwin, section names have a 16char length limit. llvm-svn: 305429	2017-06-14 23:40:25 +00:00
Daniel Berlin	6d2db9edb2	PredicateInfo: Don't insert conditional info when a conditional branch jumps to the same target regardless of condition llvm-svn: 305416	2017-06-14 21:19:52 +00:00
Daniel Berlin	51e878e01d	NewGVN: This is wrong by inspection, it will not cause an issue currently due to other limitations, i believe. This also means i can't make a test for it. llvm-svn: 305415	2017-06-14 21:19:28 +00:00
Davide Italiano	0dc4778067	[EarlyCSE] Make PhiToCheck in removeMSSA() a set. This way we end up not looking at PHI args already removed. MemSSA now goes through the updater so we can prune it to avoid having redundant MemoryPHI arguments, but that doesn't quite work for the general case. Discussed with Daniel Berlin, fixes PR33406. llvm-svn: 305409	2017-06-14 19:29:53 +00:00
Frederich Munch	dceb612eeb	Hide dbgs() stream for when built with -fmodules. Summary: Make DebugCounter::print and dump methods to be const correct. Reviewers: aprantl Reviewed By: aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34214 llvm-svn: 305408	2017-06-14 19:16:22 +00:00
Vedant Kumar	9c056c9e1b	[InstrProf] Don't take the address of alwaysinline available_externally functions Doing so breaks compilation of the following C program (under -fprofile-instr-generate): __attribute__((always_inline)) inline int foo() { return 0; } int main() { return foo(); } At link time, we fail because taking the address of an available_externally function creates an undefined external reference, which the TU cannot provide. Emitting the function definition into the object file at all appears to be a violation of the langref: "Globals with 'available_externally' linkage are never emitted into the object file corresponding to the LLVM module." Differential Revision: https://reviews.llvm.org/D34134 llvm-svn: 305327	2017-06-13 22:12:35 +00:00
Teresa Johnson	8015f88525	[PGO] Update VP metadata after memory intrinsic optimization Summary: Leave an updated VP metadata on the fallback memcpy intrinsic after specialization. This can be used for later possible expansion based on the average of the remaining values. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34164 llvm-svn: 305321	2017-06-13 20:44:08 +00:00
Frederich Munch	6391c7e2a1	Revert r305313 & r305303, self-hosting build-bot isn’t liking it. llvm-svn: 305318	2017-06-13 19:05:24 +00:00
Frederich Munch	4c73b40dca	Force RegisterStandardPasses to construct std::function in the IPO library. Summary: Fixes an issue using RegisterStandardPasses from a statically linked object before PassManagerBuilder::addGlobalExtension is called from a dynamic library. Reviewers: efriedma, theraven Reviewed By: efriedma Subscribers: mehdi_amini, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D33515 llvm-svn: 305303	2017-06-13 16:48:41 +00:00
David Blaikie	6d0f39476a	Inliner: Avoid calling shouldInline until it's absolutely necessary This restores the order of evaluation (& conditionalized evaluation) of isTriviallyDeadInstruction, InlineHistoryIncludes, and shouldInline (with the addition of a shouldInline call after isTriviallyDeadInstruction) from before r305245. llvm-svn: 305267	2017-06-13 02:24:09 +00:00
George Burgess IV	f613749382	Fix signed/unsigned comparison warning; NFC llvm-svn: 305262	2017-06-13 01:28:49 +00:00
David Blaikie	ae8c4af4ac	Inliner: Don't remove calls to readnone+nounwind (but not always_inline) functions in the AlwaysInliner llvm-svn: 305245	2017-06-12 23:01:17 +00:00
Anna Thomas	4b027e8f89	[RS4GC] Drop invalid metadata after pointers are relocated Summary: After RS4GC, we should drop metadata that is no longer valid. Such metadata is used by optimizations scheduled after RS4GC, and can cause a miscompile. One such metadata is invariant.load which is used by LICM sinking transform. After rewriting statepoints, the address of a load may be relocated. With invariant.load metadata on a load instruction, LICM sinking assumes the loaded value (from a dereferenceable address) to be invariant, and rematerializes the load operand and the load at the exit block. This transforms the IR to have an unrelocated use of the address after a statepoint, which is incorrect. Other metadata we conservatively remove are related to dereferenceability and noalias metadata. This patch drops such metadata on store and load instructions after rewriting statepoints. Reviewers: reames, sanjoy, apilipenko Reviewed by: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33756 llvm-svn: 305234	2017-06-12 21:26:53 +00:00
Sanjay Patel	2e33bbaff0	[InstCombine] lshr (sext iM X to iN), N-M --> zext (ashr X, min(N-M, M-1)) to iN This is a follow-up to https://reviews.llvm.org/D33879 / https://reviews.llvm.org/rL304939 , and was discussed in https://reviews.llvm.org/D33338. We prefer this form because a narrower shift may be cheaper, and we can more easily fold a zext than a sext. http://rise4fun.com/Alive/slVe Name: shz %s = sext i8 %x to i12 %r = lshr i12 %s, 4 => %a = ashr i8 %x, 4 %r = zext i8 %a to i12 llvm-svn: 305190	2017-06-12 14:23:43 +00:00
Xinliang David Li	7ed6cd32ea	[PartialInlining] Support shrinkwrap life_range markers Differential Revision: http://reviews.llvm.org/D33847 llvm-svn: 305170	2017-06-11 20:46:05 +00:00
Geoff Berry	3cca1da20c	[EarlyCSE] Add option to use MemorySSA for function simplification run of EarlyCSE (off by default). Summary: Use MemorySSA for memory dependency checking in the EarlyCSE pass at the start of the function simplification portion of the pipeline. We rely on the fact that GVNHoist runs just after this pass of EarlyCSE to amortize the MemorySSA construction cost since GVNHoist uses MemorySSA and EarlyCSE preserves it. This is turned off by default. A follow-up change will turn it on to allow for easier reversion in case it breaks something. llvm-svn: 305146	2017-06-10 15:20:03 +00:00
Andrew Kaylor	647025f9e1	[InstSimplify] Don't constant fold or DCE calls that are marked nobuiltin Differential Revision: https://reviews.llvm.org/D33737 llvm-svn: 305132	2017-06-09 23:18:11 +00:00
Yaxun Liu	6455b0dbf3	[SROA] Fix APInt size when load/store have different address space Currently there is a bug in SROA::presplitLoadsAndStores which causes assertion in GEPOperator::accumulateConstantOffset. Basically it does not consider the situation that the pointer operand of load or store may be in a non-zero address space and its size may be different from the size of a pointer in address space 0. This patch fixes assertion when compiling Blender Cycles kernels for amdgpu backend. Differential Revision: https://reviews.llvm.org/D33298 llvm-svn: 305107	2017-06-09 20:46:29 +00:00
Keno Fischer	5329174cb1	[Sink] Fix predicate in legality check Summary: isSafeToSpeculativelyExecute is the wrong predicate to use here. All that checks for is whether it is safe to hoist a value due to unaligned/un-dereferenceable accesses. However, not only are we doing sinking rather than hoisting, our concern is that the location we're loading from may have been modified. Instead forbid sinking any load across a critical edge. Reviewers: majnemer Subscribers: davide, llvm-commits Differential Revision: https://reviews.llvm.org/D33179 llvm-svn: 305102	2017-06-09 19:31:10 +00:00
Sanjay Patel	70db424601	[SimplifyLibCalls] fix formatting; NFC llvm-svn: 305081	2017-06-09 14:22:03 +00:00
Serguei Katkov	38414b57f9	[IndVars] Add an option to be able to disable LFTR This change adds an option disable-lftr to be able to disable Linear Function Test Replace optimization. By default option is off so current behavior is not changed. Reviewers: reames, sanjoy, wmi, andreadb, apilipenko Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33979 llvm-svn: 305055	2017-06-09 06:11:59 +00:00
George Burgess IV	a20352e13e	[LoopVectorize] Don't preserve nsw/nuw flags on shrunken ops. If we're shrinking a binary operation, it may be the case that the new operation wraps where the old didn't. If this happens, the behavior should be well-defined. So, we can't always carry wrapping flags with us when we shrink operations. If we do, we get incorrect optimizations in cases like: void foo(const unsigned char *from, unsigned char *to, int n) { for (int i = 0; i < n; i++) to[i] = from[i] - 128; } which gets optimized to: void foo(const unsigned char *from, unsigned char *to, int n) { for (int i = 0; i < n; i++) to[i] = from[i] | 128; } Because: - InstCombine turned `sub i32 %from.i, 128` into `add nuw nsw i32 %from.i, 128`. - LoopVectorize vectorized the add to be `add nuw nsw <16 x i8>` with a vector full of `i8 128`s - InstCombine took advantage of the fact that the newly-shrunken add "couldn't wrap", and changed the `add` to an `or`. InstCombine seems happy to figure out whether we can add nuw/nsw on its own, so I just decided to drop the flags. There are already a number of places in LoopVectorize where we rely on InstCombine to clean up. llvm-svn: 305053	2017-06-09 03:56:15 +00:00
David Blaikie	cb9327b02d	Inliner: Don't touch indirect calls Other comments/implications are that this isn't intended behavior (nor preserved/reimplemented in the new inliner) & complicates fixing the 'inlining' of trivially dead calls without consulting the cost function first. llvm-svn: 305052	2017-06-09 03:29:20 +00:00
Craig Topper	a420562257	[InstCombine] Pass a proper context instruction to all of the calls into InstSimplify Summary: This matches the behavior we already had for compares and makes us consistent everywhere. Reviewers: dberlin, hfinkel, spatel Reviewed By: dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33604 llvm-svn: 305049	2017-06-09 03:21:29 +00:00
Evgeniy Stepanov	d02dbf6b1c	[CFI] Remove LinkerSubsectionsViaSymbols. Since D17854 LinkerSubsectionsViaSymbols is unnecessary. It is interfering with ThinLTO implementation of CFI-ICall, where the aliases used on the !LinkerSubsectionsViaSymbols branch are needed to export jump tables to ThinLTO backends. This is the second attempt to land this change after fixing PR33316. llvm-svn: 305031	2017-06-08 23:38:22 +00:00
Craig Topper	2aa4d39f5e	[ExtractGV] Fix the doxygen comment on the constructor and the class to refer to global values instead of functions. While there fix an 80 column violation. NFC llvm-svn: 305030	2017-06-08 23:38:19 +00:00
Peter Collingbourne	e357fbd243	Write summaries for merged modules when splitting modules for ThinLTO. This is to prepare to allow for dead stripping of globals in the merged modules. Differential Revision: https://reviews.llvm.org/D33921 llvm-svn: 305027	2017-06-08 23:01:49 +00:00
Kostya Serebryany	2c2fb8896b	[sanitizer-coverage] one more flavor of coverage: -fsanitize-coverage=inline-8bit-counters. Experimental so far, not documenting yet. Reapplying revisions 304630, 304631, 304632, 304673, see PR33308 llvm-svn: 305026	2017-06-08 22:58:19 +00:00
Dehao Chen	e2a428bad7	Do not early-inline recursive calls in sample profile loader. Summary: Early-inlining of recursive calls makes the code size bloat exponentially. We should disable it. Reviewers: davidxl, dnovillo, iteratee Reviewed By: iteratee Subscribers: iteratee, llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D34017 llvm-svn: 305009	2017-06-08 20:11:57 +00:00
Galina Kistanova	e128958552	Changed a comparison operator for std::stable_sort to implement strict weak ordering. This is a temporary fix which needs additional work, as it triggers a test3 failure. test3 is commented out till then. llvm-svn: 304993	2017-06-08 17:27:40 +00:00
Nirav Dave	62fb8498d3	InferAddressSpaces: Avoid assertion failure with replacing identical cloned constexpr Have cloneConstantExprWithNewAddressSpaces return nullptr when returning initial ConstantExpr. Reviewers: arsenm Subscribers: jholewinski, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D33995 llvm-svn: 304975	2017-06-08 13:20:55 +00:00
John Brawn	da4a68a1d2	[BPI] Don't assume that strcmp returning >0 is more likely than <0 The zero heuristic assumes that integers are more likely positive than negative, but this also has the effect of assuming that strcmp return values are more likely positive than negative. Given that for nonzero strcmp return values it's the ordering of arguments that determines the sign of the result there's no reason to assume that's true. Fix this by inspecting the LHS of the compare and using TargetLibraryInfo to decide if it's strcmp-like, and if so only assume that nonzero is more likely than zero i.e. strings are more often different than the same. This causes a slight code generation change in the spec2006 benchmark 403.gcc, but with no noticeable performance impact. The intent of this patch is to allow better optimisation of dhrystone on Cortex-M cpus, but currently it won't as there are also some changes that need to be made to if-conversion. Differential Revision: https://reviews.llvm.org/D33934 llvm-svn: 304970	2017-06-08 09:44:40 +00:00
Sanjay Patel	66f7fdb300	[InstCombine] fold lshr (sext X), C1 --> zext (lshr X, C2) This was discussed in D33338. We have larger pattern-matching ending in a truncate that we can reduce or remove by handling these smaller patterns first. Further motivation is that narrower shift ops are easier for value tracking and zext is better than sext. http://rise4fun.com/Alive/rhh Name: boolshift %sext = sext i1 %x to i8 %r = lshr i8 %sext, 7 => %r = zext i1 %x to i8 Name: noboolshift %sext = sext i3 %x to i8 %r = lshr i8 %sext, 7 => %sh = lshr i3 %x, 2 %r = zext i3 %sh to i8 Differential Revision: https://reviews.llvm.org/D33879 llvm-svn: 304939	2017-06-07 20:32:08 +00:00
Xinliang David Li	4f49bee764	Fix builtin_expect lowering bug PR33346 Skip cases when expected value is not constant int. llvm-svn: 304933	2017-06-07 18:32:24 +00:00
Peter Collingbourne	aaae7eed5c	LowerTypeTests: Generate simpler IR for br(llvm.type.test, then, else). This makes it so that the code quality for CFI checks when compiling with -O2 and linking with --lto-O0 is similar to that of the rest of the code. Reduces the size of a chrome binary built with -O2/--lto-O0 by about 750KB. Differential Revision: https://reviews.llvm.org/D33925 llvm-svn: 304921	2017-06-07 15:49:14 +00:00
Craig Topper	73ba1c84be	[InstCombine][InstSimplify] Use APInt::isNullValue/isOneValue to reduce compiled code for comparing APInts with 0 and 1. NFC These methods are specifically optimized to only count leading zeros without an additional uint64_t compare. llvm-svn: 304876	2017-06-07 07:40:37 +00:00
Craig Topper	29c282eac8	[InstCombine] Fix two asserts that were accidentally checking that an APInt pointer is non-zero instead of checking that the APInt itself is non-zero. I believe this code used to use APInt references which would have worked. But then they were changed to pointers to allow m_APInt to be used. llvm-svn: 304875	2017-06-07 07:40:29 +00:00
Zachary Turner	264b5d9e88	Move Object format code to lib/BinaryFormat. This creates a new library called BinaryFormat that has all of the headers from llvm/Support containing structure and layout definitions for various types of binary formats like dwarf, coff, elf, etc as well as the code for identifying a file from its magic. Differential Revision: https://reviews.llvm.org/D33843 llvm-svn: 304864	2017-06-07 03:48:56 +00:00
Evgeny Stupachenko	3b88291581	Fix PR23384 (part 3 of 3) Summary: The patch makes instruction count the highest priority for LSR solution for X86 (previously registers had highest priority). Reviewers: qcolombet Differential Revision: http://reviews.llvm.org/D30562 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 304824	2017-06-06 20:04:16 +00:00
Daniel Berlin	eafdd862e5	NewGVN: Fix PR/33187. This is a bug caused by two things: 1. When there is no perfect iteration order, we can't let phi nodes put themselves in terms of things that come later in the iteration order, or we will endlessly cycle (the normal RPO algorithm clears the hashtable to avoid this issue). 2. We are sometimes erasing the wrong expression (causing pessimism) because our equality says loads and stores are the same. We introduce an exact equality function and use it when erasing to make sure we erase only identical expressions, not equivalent ones. llvm-svn: 304807	2017-06-06 17:15:28 +00:00
Anna Thomas	b2a212c070	[Atomics][LoopIdiom] Recognize unordered atomic memcpy Summary: Expanding the loop idiom test for memcpy to also recognize unordered atomic memcpy. The only difference in recognizing an unordered atomic memcpy instead of a normal memcpy is that the loads and/or stores involved are unordered atomic operations. Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html Patch by Daniel Neilson! Reviewers: reames, anna, skatkov Reviewed By: reames, anna Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33243 llvm-svn: 304806	2017-06-06 16:45:25 +00:00
Anna Thomas	7218032019	[IRCE] Canonicalize pre/post loops after the blocks are added into parent loop Summary: We were canonicalizing the pre loop (into loop-simplify form) before the post loop blocks were added into parent loop. This is incorrect when IRCE is done on a subloop. The post-loop blocks are created, but not yet added to the parent loop. So, loop-simplification on the pre-loop incorrectly updates LoopInfo. This patch corrects the ordering so that pre and post loop blocks are added to parent loop (if any), and then the loops are canonicalized to LCSSA and LoopSimplifyForm. Reviewers: reames, sanjoy, apilipenko Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33846 llvm-svn: 304800	2017-06-06 14:54:01 +00:00
Chandler Carruth	6bda14b313	Sort the remaining #include lines in include/... and lib/.... I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is entirely mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787	2017-06-06 11:49:48 +00:00
Xin Tong	9d6f08a8d4	Add a dominance check interface that uses caching for instructions within same basic block. Summary: This problem stems from the fact that instructions are allocated using new in LLVM, i.e. there is no relationship that can be derived by just looking at the pointer value. This interface dispatches to appropriate dominance check given 2 instructions, i.e. in case the instructions are in the same basic block, ordered basicblock (with instruction numbering and caching) are used. Otherwise, dominator tree is used. This is a preparation patch for https://reviews.llvm.org/D32720 Reviewers: dberlin, hfinkel, davide Subscribers: davide, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D33380 llvm-svn: 304764	2017-06-06 02:34:41 +00:00
Evgeny Stupachenko	f2b3b467e5	Fix PR23384 (part 2 of 3) NFC Summary: The patch moves LSR cost comparison to target part. Reviewers: qcolombet Differential Revision: http://reviews.llvm.org/D30561 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 304750	2017-06-05 23:37:00 +00:00
Evgeny Stupachenko	4d94e99446	LSR: Calculate instruction cost only if InsnsCost is set to true (NFC) Summary: The patch guards all instruction cost calculations with the InsnsCost (-lsr-insns-cost) option. Currently even if the option is set to false we calculate and print (in debug mode) instruction costs. Reviewers: qcolombet Differential Revision: http://reviews.llvm.org/D33914 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 304746	2017-06-05 22:44:18 +00:00
Sven van Haastregt	78819e0fd4	[InstCombine] Fix extractelement use before def This fixes a bug that can cause extractelements with operands that haven't been defined yet to be inserted at a wrong point when optimising insertelements. Patch by Karl Hylen. Differential Revision: https://reviews.llvm.org/D33449 llvm-svn: 304701	2017-06-05 09:18:10 +00:00
Renato Golin	cdf840fd38	Revert "[sanitizer-coverage] one more flavor of coverage: -fsanitize-coverage=inline-8bit-counters. Experimental so far, not documenting yet." This reverts commit r304630, as it broke ARM/AArch64 bots for 2 days. llvm-svn: 304698	2017-06-05 07:35:52 +00:00
Ayal Zaks	ab32aff838	[LV] Make scalarizeInstruction() non-virtual. NFC. Following the request made in https://reviews.llvm.org/D32871, scalarizeInstruction() which is no longer overridden by InnerLoopUnroller is hereby made non-virtual in InnerLoopVectorizer. Should have been part of r297580 originally. llvm-svn: 304685	2017-06-04 13:29:51 +00:00
Craig Topper	0799ff9e64	[InstCombine] Add support for simplifying ctlz/cttz intrinsics based on known bits. llvm-svn: 304669	2017-06-03 18:50:32 +00:00
Galina Kistanova	e9cacb6ae8	Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC. llvm-svn: 304638	2017-06-03 05:19:32 +00:00
Galina Kistanova	55344aba7e	Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC. llvm-svn: 304637	2017-06-03 05:19:10 +00:00
Galina Kistanova	96d51f5bcb	Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC. llvm-svn: 304636	2017-06-03 05:18:46 +00:00
Kostya Serebryany	f7db346cdf	[sanitizer-coverage] one more flavor of coverage: -fsanitize-coverage=inline-8bit-counters. Experimental so far, not documenting yet. llvm-svn: 304630	2017-06-03 01:35:47 +00:00
Evgeniy Stepanov	704003ea3d	Revert "[CFI] Remove LinkerSubsectionsViaSymbols." This reverts commit r304582: breaks cfi-devirt :: anon-namespace.cpp on Darwin. llvm-svn: 304626	2017-06-03 00:46:27 +00:00
Alexey Bataev	e4e5923ef1	[SLP] Improve comments and naming of functions/variables/members, NFC. Fixed some comments, added an additional description of the algorithms, improved readability of the code. Differential revision: https://reviews.llvm.org/D33320 llvm-svn: 304616	2017-06-03 00:08:21 +00:00
Kostya Serebryany	aed6ba770c	[sanitizer-coverage] refactor the code to make it easier to add more sections in future. NFC llvm-svn: 304610	2017-06-02 23:13:44 +00:00
Alexey Bataev	03ca396b95	Revert "[SLP] Improve comments and naming of functions/variables/members, NFC." This reverts commit 6e311de8b907aa20da9a1a13ab07c3ce2ef4068a. llvm-svn: 304609	2017-06-02 23:09:15 +00:00
Philip Reames	b70cecd60a	[Statepoint] Be consistent about using deopt naming [NFCI] We'd called this "vm state" in the early days, but have long since standardized on calling it "deopt" in line with the operand bundle tag. Fix a few cases we'd missed. llvm-svn: 304607	2017-06-02 23:03:26 +00:00
Xinliang David Li	5fdc75aea1	Fix debug build test failure llvm-svn: 304600	2017-06-02 22:38:48 +00:00
Xinliang David Li	0b7d858fa3	[PartialInlining] Minor cost analysis tuning Also added a test option and 2 cost analysis related tests. llvm-svn: 304599	2017-06-02 22:08:04 +00:00
David Blaikie	6aeacaa527	FunctionAttrs: Skip it if the effective SCC (ignoring optnone functions) is empty Minor optimization but mostly simplifies my debugging so I'm not dealing with empty SCCNodeSets while investigating issues in this optimization. llvm-svn: 304597	2017-06-02 21:24:17 +00:00
Alexey Bataev	2c08fde9e5	[SLP] Improve comments and naming of functions/variables/members, NFC. Summary: Fixed some comments, added an additional description of the algorithms, improved readability of the code. Reviewers: anemet Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33320 llvm-svn: 304593	2017-06-02 20:39:27 +00:00
Keno Fischer	514a6a54e7	[SROA] Fix crash due to bad bitcast Summary: As shown in the test case, SROA was crashing when trying to split stores (to the alloca) of loads (from anywhere), because it assumed the pointer operand to the loads and stores had to have the same address space. This isn't the case. Make sure to use the correct pointer type for both the load and the store. Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D32593 llvm-svn: 304585	2017-06-02 19:04:17 +00:00
Evgeniy Stepanov	63f056327d	[CFI] Remove LinkerSubsectionsViaSymbols. Since D17854 LinkerSubsectionsViaSymbols is unnecessary. It is interfering with ThinLTO implementation of CFI-ICall, where the aliases used on the !LinkerSubsectionsViaSymbols branch are needed to export jump tables to ThinLTO backends. llvm-svn: 304582	2017-06-02 18:45:14 +00:00
Evgeniy Stepanov	b933ad3a77	Skip CFI for dead functions. Differential Revision: https://reviews.llvm.org/D33805 llvm-svn: 304578	2017-06-02 18:24:23 +00:00
Sanjay Patel	ce241f48c5	[InstCombine] fix icmp with not op and constant to work with splat vector constant llvm-svn: 304562	2017-06-02 16:29:41 +00:00
Sanjay Patel	4dc85eb75a	[InstCombine] improve perf by not creating a known non-canonical instruction Op1 (RHS) is a constant, so putting it on the LHS makes us churn through visitICmp an extra time to canonicalize it: INSTCOMBINE ITERATION #1 on cmpnot IC: ADDING: 3 instrs to worklist IC: Visiting: %notx = xor i8 %x, -1 IC: Visiting: %cmp = icmp sgt i8 %notx, 42 IC: Old = %cmp = icmp sgt i8 %notx, 42 New = <badref> = icmp sgt i8 -43, %x IC: ADD: %cmp = icmp sgt i8 -43, %x IC: ERASE %1 = icmp sgt i8 %notx, 42 IC: ADD: %notx = xor i8 %x, -1 IC: DCE: %notx = xor i8 %x, -1 IC: ERASE %notx = xor i8 %x, -1 IC: Visiting: %cmp = icmp sgt i8 -43, %x IC: Mod = %cmp = icmp sgt i8 -43, %x New = %cmp = icmp slt i8 %x, -43 IC: ADD: %cmp = icmp slt i8 %x, -43 IC: Visiting: %cmp = icmp slt i8 %x, -43 IC: Visiting: ret i1 %cmp If we create the swapped ICmp directly, we go faster: INSTCOMBINE ITERATION #1 on cmpnot IC: ADDING: 3 instrs to worklist IC: Visiting: %notx = xor i8 %x, -1 IC: Visiting: %cmp = icmp sgt i8 %notx, 42 IC: Old = %cmp = icmp sgt i8 %notx, 42 New = <badref> = icmp slt i8 %x, -43 IC: ADD: %cmp = icmp slt i8 %x, -43 IC: ERASE %1 = icmp sgt i8 %notx, 42 IC: ADD: %notx = xor i8 %x, -1 IC: DCE: %notx = xor i8 %x, -1 IC: ERASE %notx = xor i8 %x, -1 IC: Visiting: %cmp = icmp slt i8 %x, -43 IC: Visiting: ret i1 %cmp llvm-svn: 304558	2017-06-02 16:11:14 +00:00
Gor Nishanov	053d2d24f7	[coroutines] PR33271: Remove stray coro.save intrinsics during CoroSplit Summary: Optimization passes may remove llvm.coro.suspend intrinsic while leaving matching llvm.coro.save intrinsic orphaned. Make sure we clean up orphaned coro.saves. The bug manifested with a crash similar to this: ``` llvm_unreachable("Unknown type!"); llvm::MVT::getVT (Ty=0x489518, HandleUnknown=false) llvm::EVT::getEVT llvm::TargetLoweringBase::getValueType llvm::ComputeValueVTs llvm::SelectionDAGBuilder::visitTargetIntrinsic ``` Reviewers: GorNishanov Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D33817 llvm-svn: 304518	2017-06-02 02:18:36 +00:00
Xinliang David Li	621e8dcf1f	[Profile] Enhance expect lowering to handle correlated branches builtin_expect applied on && or || expressions were not handled properly before. With this patch, the problem is fixed. Differential Revision: http://reviews.llvm.org/D33164 llvm-svn: 304517	2017-06-02 02:09:31 +00:00
Philip Reames	ae80045deb	[RS4GC] Comment clarification llvm-svn: 304514	2017-06-02 01:52:06 +00:00
Davide Italiano	1dd5558e52	[PM] GVNSink is off by default, fix an obvious typo. llvm-svn: 304497	2017-06-01 23:47:53 +00:00
Xinliang David Li	d6cfba2a02	Fix compiler_rt buildbot failure llvm-svn: 304489	2017-06-01 23:05:11 +00:00
Keno Fischer	fa635d730f	Reapply "[Cloning] Take another pass at properly cloning debug info" This was rL304226, reverted in 304228 due to a clang assertion failure on the build bots. That problem should have been addressed by clang commit rL304470. llvm-svn: 304488	2017-06-01 23:02:12 +00:00
Evgeniy Stepanov	56584bbf16	(NFC) Track global summary liveness in GVFlags. Replace GVFlags::LiveRoot with GVFlags::Live and use that instead of all the DeadSymbols sets. This is refactoring in order to make liveness information available in the RegularLTO pipeline. llvm-svn: 304466	2017-06-01 20:30:06 +00:00
Xinliang David Li	ee8d6acb1f	[Profile] Fix builtin_expect lowering bug The lowerer wrongly assumes the ICMP instruction 1) always has a constant operand; 2) the operand has value 0. It also assumes the expected value can only be one, thus other values other than one will be considered 'zero'. This leads to wrong profile annotation when other integer values are used other than 0, 1 in the comparison or in the expect intrinsic. Also missing is handling of equal predicate. This patch fixes all the above problems. Differential Revision: http://reviews.llvm.org/D33757 llvm-svn: 304453	2017-06-01 19:05:55 +00:00
Xinliang David Li	0a0acbcf78	[PartialInlining] Emit branch info and profile data as remarks This allows us to collect profile statistics to tune static branch prediction. Differential Revision: http://reviews.llvm.org/D33746 llvm-svn: 304452	2017-06-01 18:58:50 +00:00
Mandeep Singh Grang	33a1b73600	[PredicateInfo] Fix non-determinism in codegen uncovered by reverse iterating SmallPtrSet Summary: Sort OpsToRename before iterating to make iteration order deterministic. Thanks to Daniel Berlin for the sorting logic. Reviewers: dberlin, RKSimon, efriedma, davide Reviewed By: dberlin, davide Subscribers: sanjoy, davide, llvm-commits Differential Revision: https://reviews.llvm.org/D33265 llvm-svn: 304447	2017-06-01 18:36:24 +00:00
Tim Shen	6b41141863	[ThinLTO] Migrate ThinLTOBitcodeWriter to the new PM. Summary: Also see D33429 for other ThinLTO + New PM related changes. Reviewers: davide, chandlerc, tejohnson Subscribers: mehdi_amini, Prazek, cfe-commits, inglorion, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D33525 llvm-svn: 304378	2017-06-01 01:02:12 +00:00
Xinliang David Li	32c5e809be	[PartialInlining] Reduce outlining overhead by removing unneeded live-out(s) Differential Revision: http://reviews.llvm.org/D33694 llvm-svn: 304375	2017-06-01 00:12:41 +00:00
Wei Mi	0bd3f41588	Revert rL304050. It may break sanitizer bootstrap. Revert it for now while investigating. llvm-svn: 304350	2017-05-31 21:29:33 +00:00
Reid Kleckner	5fbdd17714	[IR] Add additional addParamAttr/removeParamAttr to AttributeList API Summary: Fairly straightforward patch to fill in some of the holes in the attributes API with respect to accessing parameter/argument attributes. The patch aims to step further towards encapsulating the idx+FirstArgIndex pattern to access these attributes to within the AttributeList. Patch by Daniel Neilson! Reviewers: rnk, chandlerc, pete, javed.absar, reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33355 llvm-svn: 304329	2017-05-31 19:23:09 +00:00
Kostya Serebryany	53b34c8443	[sanitizer-coverage] remove stale code (old coverage); llvm part llvm-svn: 304319	2017-05-31 18:27:33 +00:00
Anna Thomas	777bb90bdc	Revert "[Atomics][LoopIdiom] Recognize unordered atomic memcpy" This reverts commit r304310. It caused build failures in polly and mingw due to undefined reference to llvm::RTLIB::getMEMCPY_ELEMENT_ATOMIC. llvm-svn: 304315	2017-05-31 17:20:51 +00:00
Zaara Syeda	3a7578c658	[PPC] Inline expansion of memcmp This patch does an inline expansion of memcmp. It changes the memcmp library call into an inline expansion when the size is known at compile time and is under a target specified threshold. This expansion is implemented in CodeGenPrepare and expands into straight line code. The target specifies a maximum load size and the expansion works by using this size to load the two sources, compare, and exit early if a difference is found. It also has a special case when the memcmp result is used in a compare to zero equality. Differential Revision: https://reviews.llvm.org/D28637 llvm-svn: 304313	2017-05-31 17:12:38 +00:00
Anna Thomas	056c009f1b	[Atomics][LoopIdiom] Recognize unordered atomic memcpy Summary: Expanding the loop idiom test for memcpy to also recognize unordered atomic memcpy. The only difference between recognizing an unordered atomic memcpy and a normal memcpy is that the loads and/or stores involved are unordered atomic operations. Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html Patch by Daniel Neilson! Reviewers: reames, anna, skatkov Reviewed By: reames Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33243 llvm-svn: 304310	2017-05-31 16:39:52 +00:00
Gor Nishanov	2bc782d8da	[coroutines] Call initializePass in coroutine pass constructors Summary: Fixes: https://bugs.llvm.org/show_bug.cgi?id=33226 Reviewers: chandlerc, davide, majnemer, dblaikie Reviewed By: chandlerc Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D33701 llvm-svn: 304277	2017-05-31 03:12:42 +00:00
Daniel Berlin	be3e7ba45e	NewGVN: Fix PR 33185 by checking whether we need to recursively generate a phi of ops, which we don't currently support. llvm-svn: 304272	2017-05-31 01:47:32 +00:00
Xinliang David Li	74480adafd	[PartialInlining] Shrinkwrap allocas with live range contained in outline region. Differential Revision: http://reviews.llvm.org/D33618 llvm-svn: 304245	2017-05-30 21:22:18 +00:00
Matthew Simpson	646475a9bc	[LV] Reapply r303763 with fix for PR33193 r303763 caused build failures in some out-of-tree tests due to an assertion in TTI. The original patch updated cost estimates for induction variable update instructions marked for scalarization. However, it didn't consider that the incoming value of an induction variable phi node could be a cast instruction. This caused queries for cast instruction costs with a mix of vector and scalar types. This patch includes a fix for cast instructions and the test case from PR33193. The fix was suggested by Jonas Paulsson <paulsson@linux.vnet.ibm.com>. Reference: https://bugs.llvm.org/show_bug.cgi?id=33193 Original Differential Revision: https://reviews.llvm.org/D33457 llvm-svn: 304235	2017-05-30 19:55:57 +00:00
Keno Fischer	3fa5db4c04	Revert "[Cloning] Take another pass at properly cloning debug info" At least one build bot is complaining. Will investigate after lunch. llvm-svn: 304228	2017-05-30 18:56:26 +00:00
Keno Fischer	945dc1d2d1	[Cloning] Take another pass at properly cloning debug info Summary: In rL302576, DISubprograms gained the constraint that !dbg attachments to functions must have a 1:1 mapping to DISubprograms. As part of that change, the function cloning support was adjusted to attempt to enforce this invariant during cloning. However, there were several problems with the implementation. Some of these were fixed in rL304079. However, there was a more fundamental problem with these changes, namely that it bypasses the metadata value map, causing the cloned metadata to be a mix of metadata pointing to the new subprogram (where manual code was added to fix those up) and the old subprogram (where this was not the case). This mismatch could cause a number of different assertion failures in the DWARF emitter. Some of these are given at https://github.com/JuliaLang/julia/issues/22069, but some others have been observed as well. Attempt to rectify this by partially reverting the manual DI metadata fixup, and instead using the standard value map approach. To retain the desired semantics of not duplicating the compilation unit and inlined subprograms, explicitly freeze these in the value map. Reviewers: dblaikie, aprantl, GorNishanov, echristo Reviewed By: aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33655 llvm-svn: 304226	2017-05-30 18:28:30 +00:00
Daniel Berlin	2aa5dc1589	NewGVN: Compute hash value of expression on demand and use it in inequality testing. llvm-svn: 304195	2017-05-30 06:58:18 +00:00
Daniel Berlin	c8ed40400c	NewGVN: Fix PR33194, memory corruption by putting temporary instructions in tables sometimes. llvm-svn: 304194	2017-05-30 06:42:29 +00:00
Joerg Sonnenberger	9375a25342	Revert r303763, results in asserts i.e. while building Ruby. llvm-svn: 304179	2017-05-29 22:52:17 +00:00
Hiroshi Inoue	ac9cd3080d	[trivial] fix a typo in comment, NFC llvm-svn: 304139	2017-05-29 08:37:42 +00:00
Gor Nishanov	ffbeb22b6f	Cloning: Fix debug info cloning Summary: I believe https://reviews.llvm.org/rL302576 introduced two bugs: 1) It produces duplicate distinct variables for every dbg.value describing the same variable. To fix the problem I switched from getDistinct() to get() in DebugLoc.cpp: auto reparentVar = [&](DILocalVariable Var) { return DILocalVariable::getDistinct( 2) It passes NewFunction plain name as a linkagename parameter to Subprogram constructor. Breaks assert in: || DeclLinkageName.empty()) || LinkageName == DeclLinkageName) && "decl has a linkage name and it is different"' failed. #9 0x00007f5010261b75 llvm::DwarfUnit::applySubprogramDefinitionAttributes(llvm::DISubprogram const*, llvm::DIE&) /home/gor/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp:1173:3 # (Edit: reproducer added) Here is how https://reviews.llvm.org/rL302576 broke coroutine debug info. Coroutine body of the original function is split into several parts by cloning and removing unneeded code. All parts describe the original function and variables present in the original function. For a simple case, prior to Split, original function has these two blocks: ``` PostSpill: ; preds = %AllocaSpillBB call void @llvm.dbg.value(metadata i32 %x, i64 0, metadata !14, metadata !15), !dbg !13 store i32 %x, i32* %x.addr, align 4 ... and sw.epilog: ; preds = %sw.bb %x.addr.reload.addr = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i32 0, i32 4, !dbg !20 %4 = load i32, i32* %x.addr.reload.addr, align 4, !dbg !20 call void @llvm.dbg.value(metadata i32 %4, i64 0, metadata !14, metadata !15), !dbg !13 !14 = !DILocalVariable(name: "x", arg: 1, scope: !6, file: !7, line: 55, type: !11) ``` Note that in the two blocks, different expressions represent the same original user variable X. Before rL302576, for every cloned function there was exactly one cloned DILocalVariable(name: "x" as in: ``` define i8* @f(i32 %x) #0 !dbg !6 { ... 
!6 = distinct !DISubprogram(name: "f", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, flags: DIFlagPrototyped, ... !14 = !DILocalVariable(name: "x", arg: 1, scope: !6, file: !7, line: 55, type: !11) define internal fastcc void @f.resume(%f.Frame* %FramePtr) #0 !dbg !25 { ... !25 = distinct !DISubprogram(name: "f", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, flags: DIFlagPrototyped, isOptimized: false, unit: !0, variables: !2) !28 = !DILocalVariable(name: "x", arg: 1, scope: !25, file: !7, line: 55, type: !11) ``` After rL302576, for every cloned function there were as many DILocalVariable(name: "x" as there were "call void @llvm.dbg.value" for that variable. This was causing asserts in VerifyDebugInfo and AssemblyPrinter. Example: ``` !27 = distinct !DISubprogram(name: "f", linkageName: "f.resume", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, !29 = distinct !DILocalVariable(name: "x", arg: 1, scope: !27, file: !7, line: 55, type: !11) !39 = distinct !DILocalVariable(name: "x", arg: 1, scope: !27, file: !7, line: 55, type: !11) !41 = distinct !DILocalVariable(name: "x", arg: 1, scope: !27, file: !7, line: 55, type: !11) ``` Second problem: Prior to rL302576, all clones were described by DISubprogram referring to original function. ``` define i8* @f(i32 %x) #0 !dbg !6 { ... !6 = distinct !DISubprogram(name: "f", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, flags: DIFlagPrototyped, define internal fastcc void @f.resume(%f.Frame* %FramePtr) #0 !dbg !25 { ... !25 = distinct !DISubprogram(name: "f", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, flags: DIFlagPrototyped, ``` After rL302576, DISubprogram for clones is of two minds, plain name refers to the original name, linkageName refers to plain name of the clone. 
``` !27 = distinct !DISubprogram(name: "f", linkageName: "f.resume", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, ``` I think the assumption in AsmPrinter is that both name and linkageName should refer to the same entity. It asserts here when they are not: ``` || DeclLinkageName.empty()) || LinkageName == DeclLinkageName) && "decl has a linkage name and it is different"' failed. #9 0x00007f5010261b75 llvm::DwarfUnit::applySubprogramDefinitionAttributes(llvm::DISubprogram const*, llvm::DIE&) /home/gor/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp:1173:3 ``` After this fix, behavior (with respect to coroutines) reverts to exactly as it was before and therefore making them debuggable again, or even more importantly, compilable, with "-g" Reviewers: dblaikie, echristo, aprantl Reviewed By: dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33614 llvm-svn: 304079	2017-05-27 19:41:09 +00:00
Gor Nishanov	9c6ac6138d	[coroutines] Define getPassName() for coroutine passes Reviewers: GorNishanov Reviewed By: GorNishanov Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D33622 llvm-svn: 304065	2017-05-27 05:54:30 +00:00
Vitaly Buka	a637489ef1	[PartialInlining] Replace delete with unique_ptr in computeCallsiteToProfCountMap Reviewers: davidxl Reviewed By: davidxl Subscribers: vsk, llvm-commits Differential Revision: https://reviews.llvm.org/D33220 llvm-svn: 304064	2017-05-27 05:32:09 +00:00
Wei Mi	5bbb5aafc1	[GVN] Recommit the patch "Add phi-translate support in scalarpre". The recommit is to fix a bug about ExtractValue and InsertValue ops. For those ops, some varargs inside GVN::Expression are not value numbers but raw index numbers. It is wrong to do phi-translate for raw index numbers, and the fix is to stop doing that. Right now scalarpre doesn't have phi-translate support, so it will miss some simple pre opportunities. Like the following testcase, current scalarpre cannot recognize the last "a * b" is fully redundant because a and b used by the last "a * b" expr are both defined by phis. long a[100], b[100], g1, g2, g3; __attribute__((pure)) long goo(); void foo(long a, long b, long c, long d) { g1 = a * b; if (__builtin_expect(g2 > 3, 0)) { a = c; b = d; g2 = a * b; } g3 = a * b; // fully redundant. } The patch adds phi-translate support in scalarpre. This is only a temporary solution before the newpre based on newgvn is available. Differential Revision: https://reviews.llvm.org/D32252 llvm-svn: 304050	2017-05-27 00:54:19 +00:00
Benjamin Kramer	debb3c35e0	Make helper functions static. NFC. llvm-svn: 304029	2017-05-26 20:09:00 +00:00
Peter Collingbourne	7730b24448	PMB: Run the whole-program-devirt pass during LTO at --lto-O0. The whole-program-devirt pass needs to run at -O0 because only it knows about the llvm.type.checked.load intrinsic: it needs to both lower the intrinsic itself and handle it in the summary. Differential Revision: https://reviews.llvm.org/D33571 llvm-svn: 304019	2017-05-26 18:27:13 +00:00
Craig Topper	d45185f231	[InstCombine] Pass the DominatorTree, AssumptionCache, and context instruction to a few calls to isKnownPositive, isKnownNegative, and isKnownNonZero Every other place in InstCombine that uses these methods in ValueTracking already pass this information. This makes the remaining sites consistent. Differential Revision: https://reviews.llvm.org/D33567 llvm-svn: 304018	2017-05-26 18:23:57 +00:00
Wei Mi	3250ae3f7c	Revert rL303923 since it broke the sanitizer bootstrap build bot. llvm-svn: 303969	2017-05-26 05:42:50 +00:00
Craig Topper	d4039f7283	[InstCombine] Add an InstCombine specific wrapper around isKnownToBeAPowerOfTwo to shorten code. NFC We have wrappers for several other ValueTracking methods that take care of passing all of the analysis and assumption cache parameters. This extends it to isKnownToBeAPowerOfTwo. llvm-svn: 303924	2017-05-25 21:51:12 +00:00
Wei Mi	fd257fa7bf	[GVN] Add phi-translate support in scalarpre. Right now scalarpre doesn't have phi-translate support, so it will miss some simple pre opportunities. Like the following testcase, current scalarpre cannot recognize the last "a * b" is fully redundant because a and b used by the last "a * b" expr are both defined by phis. long a[100], b[100], g1, g2, g3; __attribute__((pure)) long goo(); void foo(long a, long b, long c, long d) { g1 = a * b; if (__builtin_expect(g2 > 3, 0)) { a = c; b = d; g2 = a * b; } g3 = a * b; // fully redundant. } The patch adds phi-translate support in scalarpre. This is only a temporary solution before the newpre based on newgvn is available. Differential Revision: https://reviews.llvm.org/D32252 llvm-svn: 303923	2017-05-25 21:49:02 +00:00
Daniel Berlin	e67c322260	NewGVN: Fix PR 33119, PR 33129, due to regressed undef handling Fix PR33120 and others by eliminating self-cycles a different way. llvm-svn: 303875	2017-05-25 15:44:20 +00:00
Artur Pilipenko	315eafc339	[InstCombine] Teach isAllocSiteRemovable to look through addrspacecasts Reviewed By: reames Differential Revision: https://reviews.llvm.org/D28565 llvm-svn: 303870	2017-05-25 15:14:48 +00:00
Sanjay Patel	5150612012	[InstCombine] make icmp-mul fold more efficient There's probably a lot more like this (see also comments in D33338 about responsibility), but I suspect we don't usually get a visible manifestation. Given the recent interest in improving InstCombine efficiency, another potential micro-opt that could be repeated several times in this function: morph the existing icmp pred/operands instead of creating a new instruction. llvm-svn: 303860	2017-05-25 14:13:57 +00:00
James Molloy	dc2d64bc35	[GVNSink] Pacify MSVC Don't convert an unsigned to a pointer for a sentinel, use a size_t instead. llvm-svn: 303855	2017-05-25 13:14:10 +00:00
James Molloy	2a237f19f1	[GVNSink] Don't define operator<< in NDEBUG Without debug macros enabled, the raw_ostream operator<< overload is unused. llvm-svn: 303852	2017-05-25 13:11:18 +00:00
James Molloy	a929063233	[GVNSink] GVNSink pass This patch provides an initial prototype for a pass that sinks instructions based on GVN information, similar to GVNHoist. It is not yet ready for committing but I've uploaded it to gather some initial thoughts. This pass attempts to sink instructions into successors, reducing static instruction count and enabling if-conversion. We use a variant of global value numbering to decide what can be sunk. Consider: [ %a1 = add i32 %b, 1 ] [ %c1 = add i32 %d, 1 ] [ %a2 = xor i32 %a1, 1 ] [ %c2 = xor i32 %c1, 1 ] \ / [ %e = phi i32 %a2, %c2 ] [ add i32 %e, 4 ] GVN would number %a1 and %c1 differently because they compute different results - the VN of an instruction is a function of its opcode and the transitive closure of its operands. This is the key property for hoisting and CSE. What we want when sinking however is for a numbering that is a function of the uses of an instruction, which allows us to answer the question "if I replace %a1 with %c1, will it contribute in an equivalent way to all successive instructions?". The (new) PostValueTable class in GVN provides this mapping. This pass has already shown some really impressive improvements, especially for code size, on internal benchmarks, so I have high hopes it can replace all the sinking logic in SimplifyCFG. Differential revision: https://reviews.llvm.org/D24805 llvm-svn: 303850	2017-05-25 12:51:11 +00:00
Chandler Carruth	dd2e275a47	[PM/Unswitch] Fix a bug in the domtree update logic for the new unswitch pass. The original logic only considered direct successors of the hoisted domtree nodes, but that isn't really enough. If there are other basic blocks that are completely within the subtree, their successors could just as easily be impacted by the hoisting. The more I think about it, the more I think the correct update here is to hoist every block on the dominance frontier which has an idom in the chain we hoist across. However, this is subtle enough that I'd definitely appreciate some more eyes on it. Sadly, if this is the correct algorithm, it requires computing a (highly localized) dominance frontier. I've done this in the simplest (IE, least code) way I could come up with, but that may be too naive. Suggestions welcome here, dominance update algorithms are not an area I've studied much, so I don't have strong opinions. In good news, with this patch, turning on simple unswitch passes the LLVM test suite for me with asserts enabled. Differential Revision: https://reviews.llvm.org/D32740 llvm-svn: 303843	2017-05-25 06:33:36 +00:00
Chandler Carruth	29c22d2835	[LegacyPM] Make the 'addLoop' method accept a loop to add rather than having it internally allocate the loop. This is a much more flexible API and necessary in the new loop unswitch to reasonably support both new and old PMs in common code. It also just seems like a cleaner separation of concerns. NFC, this should just be a pure refactoring. Differential Revision: https://reviews.llvm.org/D33528 llvm-svn: 303834	2017-05-25 03:01:31 +00:00
George Karpenkov	a1c532784d	Fix coverage check for full post-dominator basic blocks. Coverage instrumentation which does not instrument full post-dominators and full-dominators may skip valid paths, as the reasoning for skipping blocks may become circular. This patch fixes that, by only skipping full post-dominators with multiple predecessors, as such predecessors by definition can not be full-dominators. llvm-svn: 303827	2017-05-25 01:41:46 +00:00
Gor Nishanov	1fbc01f70f	[coroutines] CoroFrame.cpp conform to coding convention (s/repeat/Repeat) (NFC) llvm-svn: 303826	2017-05-25 01:07:10 +00:00
Gor Nishanov	0ea1863b27	[coroutines] Relocate instructions that maybe spilled after coro.begin Summary: Frontend generates store instructions after allocas, for example: ``` define i8* @f(i64 %this) "coroutine.presplit"="1" personality i32 0 { entry: %this.addr = alloca i64 store i64 %this, i64* %this.addr .. %hdl = call i8* @llvm.coro.begin(token %id, i8* %alloc) ``` Such instructions may require spilling into coro.frame, but, coro-frame address is only available after coro.begin and thus needs to be moved after coro.begin. The only instructions that should not be moved are the arguments of coro.begin and all of their operands. Reviewers: GorNishanov, majnemer Reviewed By: GorNishanov Subscribers: llvm-commits, EricWF Differential Revision: https://reviews.llvm.org/D33527 llvm-svn: 303825	2017-05-25 00:46:20 +00:00
Gor Nishanov	1f72d75714	[coroutines] Allow rematerialization up to 4 times. Remove incorrect assert Reviewers: majnemer Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D33524 llvm-svn: 303819	2017-05-24 23:01:02 +00:00
Sanjay Patel	07b1ba54b5	[InstCombine] use m_APInt to allow icmp-mul-mul vector fold The swapped operands in the first test is a manifestation of an inefficiency for vectors that doesn't exist for scalars because the IRBuilder checks for an all-ones mask for scalars, but not vectors. llvm-svn: 303818	2017-05-24 22:58:17 +00:00
Craig Topper	2f9c6dafe3	[InstCombine] Merge together the SimplifyDemandedUseBits implementations for ZExt and Trunc. NFC While there avoid resizing the DemandedMask twice. Make a copy into a separate variable instead. This potentially removes an allocation on large bit widths. With the use of the zextOrTrunc methods on APInt and KnownBits these can be made almost source identical. The only difference is the zeroing of the upper bits for ZExt. This is similar to how it's done in computeKnownBits in ValueTracking. llvm-svn: 303791	2017-05-24 18:40:25 +00:00
Teresa Johnson	cd2aa0d2e4	Fix a couple of typos in memory intrinsic optimization output (NFC) s/instrinsic/intrinsic llvm-svn: 303782	2017-05-24 17:55:25 +00:00
Craig Topper	1c660dbea6	[InstCombine] Use less bitwise operations to handle Instruction::SExt in SimplifyDemandedUseBits. Other improvements. The current code created a NewBits mask and used it as a mask several times. One of them just before a call to trunc making it unnecessary. A call to getActiveBits can get us the same information for the case. We also ORed with this mask later when we should have just sign extended the known bits. We also called trunc on the guaranteed to be zero KnownZeros/Ones masks entering this code. Creating appropriately sized temporary APInts is probably better. Differential Revision: https://reviews.llvm.org/D32098 llvm-svn: 303779	2017-05-24 17:33:30 +00:00
Craig Topper	8205a1a9b6	[ValueTracking] Convert most of the calls to computeKnownBits to use the version that returns the KnownBits object. This continues the changes started when computeSignBit was replaced with this new version of computeKnownBits. Differential Revision: https://reviews.llvm.org/D33431 llvm-svn: 303773	2017-05-24 16:53:07 +00:00
Matthew Simpson	d6f179cad6	[LV] Update type in cost model for scalarization For non-uniform instructions marked for scalarization, we should update `VectorTy` when computing instruction costs to reflect the scalar type. In addition to determining instruction costs, this type is also used to signal that all instructions in the loop will be scalarized. This currently affects memory instructions and non-pointer induction variables and their updates. (We also mark GEPs scalar after vectorization, but their cost is computed together with memory instructions.) For scalarized induction updates, this patch also scales the scalar cost by the vectorization factor, corresponding to each induction step. llvm-svn: 303763	2017-05-24 15:26:15 +00:00
Jonas Paulsson	8624b7e1ce	[LoopVectorizer] Let target prefer scalar addressing computations. The loop vectorizer usually vectorizes any instruction it can and then extracts the elements for a scalarized use. On SystemZ, all elements containing addresses must be extracted into address registers (GRs). Since this extraction is not free, it is better to have the address in a suitable register to begin with. By forcing address arithmetic instructions and loads of addresses to be scalar after vectorization, two benefits result: * No need to extract the register * LSR optimizations trigger (LSR isn't handling vector addresses currently) Benchmarking shows improvements on SystemZ with this new behaviour. Any other target could try this by returning false in the new hook prefersVectorizedAddressing(). Review: Renato Golin, Elena Demikhovsky, Ulrich Weigand https://reviews.llvm.org/D32422 llvm-svn: 303744	2017-05-24 13:42:56 +00:00
Davide Italiano	fd9100e056	[NewGVN] Update additionalUsers when we simplify to a value. Otherwise we don't revisit an instruction that could be simplified, and when we verify, we discover there's something that changed, i.e. what we had wasn't a maximal fixpoint. Fixes PR32836. llvm-svn: 303715	2017-05-24 02:30:24 +00:00
George Karpenkov	018472c34a	Revert "Disable coverage opt-out for strong postdominator blocks." This reverts commit 2ed06f05fc10869dd1239cff96fcdea2ee8bf4ef. Buildbots do not like this on Linux. llvm-svn: 303710	2017-05-24 00:29:12 +00:00
Davide Italiano	c4861adad9	[SCCP] Use the `hasAddressTaken()` version defined in `Function`. Instead of using the SCCP homegrown one. We should eventually make the private SCCP version disappear, but that won't be today. PR33143 tracks this issue. Add braces for consistency while here. No functional change intended. llvm-svn: 303706	2017-05-23 23:59:23 +00:00
Davide Italiano	7bf95b964f	[LIR] Use the new `getRecurrenceVar()` helper. NFCI. llvm-svn: 303704	2017-05-23 23:51:54 +00:00
Davide Italiano	4bc91190ea	[LIR] Strengthen the check for recurrence variable in popcnt/CTLZ. Fixes PR33114. Differential Revision: https://reviews.llvm.org/D33420 llvm-svn: 303700	2017-05-23 22:32:56 +00:00
George Karpenkov	9017ca290a	Disable coverage opt-out for strong postdominator blocks. Coverage instrumentation has an optimization not to instrument extra blocks, if the pass is already "accounted for" by a successor/predecessor basic block. However (https://github.com/google/sanitizers/issues/783) this reasoning may become circular, which stops valid paths from having coverage. In the worst case this can cause fuzzing to stop working entirely. This change simplifies logic to something which trivially can not have such circular reasoning, as losing valid paths does not seem like a good trade-off for a ~15% decrease in the # of instrumented basic blocks. llvm-svn: 303698	2017-05-23 21:58:54 +00:00
Sanjay Patel	d3106add77	[InstCombine] allow icmp-xor folds for vectors (PR33138) This fixes the first part of: https://bugs.llvm.org/show_bug.cgi?id=33138 More work is needed for the bitcasted variant. llvm-svn: 303660	2017-05-23 17:29:58 +00:00
Reid Kleckner	8bf67fe98f	[IR] Switch AttributeList to use an array for O(1) access Summary: Before this change, AttributeLists stored a pair of index and AttributeSet. This is memory efficient if most arguments do not have attributes. However, it requires doing a search over the pairs to test an argument or function attribute. Profiling shows that this loop was 0.76% of the time in 'opt -O2' of sqlite3.c, because LLVM constantly tests values for nullability. This was worth about 2.5% of mid-level optimization cycles on the sqlite3 amalgamation. Here are the full perf results: https://reviews.llvm.org/P7995 Here are just the before and after cycle counts: ``` $ perf stat -r 5 ./opt_before -O2 sqlite3.bc -o /dev/null 13,274,181,184 cycles # 3.047 GHz ( +- 0.28% ) $ perf stat -r 5 ./opt_after -O2 sqlite3.bc -o /dev/null 12,906,927,263 cycles # 3.043 GHz ( +- 0.51% ) ``` This patch does not change the indices used to query attributes, as requested by reviewers. Tracking whether an index is usable for array indexing is a huge pain that affects many of the internal APIs, so it would be good to come back later and do a cleanup to remove this internal adjustment. Reviewers: pete, chandlerc Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D32819 llvm-svn: 303654	2017-05-23 17:01:48 +00:00
Anna Thomas	c07d5544dd	[JumpThreading] Safely replace uses of condition This patch builds over https://reviews.llvm.org/rL303349 and replaces the use of the condition only if it is safe to do so. We should not blindly RAUW the condition if experimental.guard or assume is a use of that condition. This is because LVI may have used the guard/assume to identify the value of the condition, and RAUWing will fold the guard/assume and uses before the guards/assumes. Reviewers: sanjoy, reames, trentxintong, mkazantsev Reviewed by: sanjoy, reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33257 llvm-svn: 303633	2017-05-23 13:36:25 +00:00
Craig Topper	7e0aeeb884	[KnownBits] Use !hasConflict() in asserts in place of Zero & One == 0 or similar. NFC llvm-svn: 303614	2017-05-23 07:18:37 +00:00
Ayal Zaks	589e1d9610	[LV] Report multiple reasons for not vectorizing under allowExtraAnalysis The default behavior of -Rpass-analysis=loop-vectorizer is to report only the first reason encountered for not vectorizing, if one is found, at which time the vectorizer aborts its handling of the loop. This patch allows multiple reasons for not vectorizing to be identified and reported, at the potential expense of additional compile-time, under allowExtraAnalysis which can currently be turned on by Clang's -fsave-optimization-record and opt's -pass-remarks-missed. Removed from LoopVectorizationLegality::canVectorize() the redundant checking and reporting if we CantComputeNumberOfIterations, as LAI::canAnalyzeLoop() also does that. This redundancy is caught by a lit test once multiple reasons are reported. Patch initially developed by Dror Barak. Differential Revision: https://reviews.llvm.org/D33396 llvm-svn: 303613	2017-05-23 07:08:02 +00:00
Teresa Johnson	525dcb617b	Fix update VP metadata after inlining for instrumentation PGO Summary: With instrumentation profiling, when updating the VP metadata after an inline, VP metadata on the inlined copy was inadvertently having all counts zeroed out. This was causing indirect calls from code inlined during the call step to be marked as cold in the ThinLTO summaries and not imported. The CallerBFI needs to be passed down so that the CallSiteCount can be computed from the profile summary info. With Sample PGO this was working since the count is extracted from the branch weight metadata on the call being inlined (even before we stopped looking at metadata for non-sample PGO in r302844 this largely wasn't working for instrumentation PGO since only promoted indirect calls would be getting inlined and have the metadata). Added an instrumentation PGO test and renamed the sample PGO test. Reviewers: danielcdh, eraman Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D33389 llvm-svn: 303574	2017-05-22 20:28:18 +00:00
Xinliang David Li	126157c3b4	[PartialInlining] Add internal options to enable partial inlining in pass pipeline (off by default) 1. Legacy: -mllvm -enable-partial-inlining 2. New: -mllvm -enable-npm-partial-inlining -fexperimental-new-pass-manager Differential Revision: http://reviews.llvm.org/D33382 llvm-svn: 303567	2017-05-22 16:41:57 +00:00
Artur Pilipenko	edee25152b	[LoopPredication] NFC. Add extra debug output in case we fail to parse the range check llvm-svn: 303544	2017-05-22 12:06:57 +00:00
Artur Pilipenko	c488dfabac	[LoopPredication] NFC. Move a nested struct declaration before the fields, clang-format a bit This will simplify the diff for an upcoming review. llvm-svn: 303543	2017-05-22 12:01:32 +00:00
Craig Topper	2b1fc32f22	[InstCombine] Cleanup the interface for overflow checks Summary: Fix naming conventions and const correctness. This completes the changes made in rL303029. Patch by Yoav Ben-Shalom. Reviewers: craig.topper Reviewed By: craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33377 llvm-svn: 303529	2017-05-22 06:25:31 +00:00
Craig Topper	e777fed152	[SimplifyCFG] Prevent a few APInt copies on method calls that return const reference. NFCI llvm-svn: 303523	2017-05-22 00:49:35 +00:00
Craig Topper	aaef41f71b	[KnownBits] Use isNegative/isNonNegative to shorten some code. NFC llvm-svn: 303522	2017-05-22 00:49:33 +00:00
Daniel Berlin	d130b6c27d	NewGVN: Fix PR 33116, the memoryphi version of bug 32838. llvm-svn: 303521	2017-05-21 23:41:58 +00:00
Daniel Berlin	0207cca8e0	NewGVN: Cleanup some repeated code using some templated helpers llvm-svn: 303520	2017-05-21 23:41:56 +00:00
Daniel Berlin	0193997b7e	NewGVN: Fix printing of simplified expression llvm-svn: 303519	2017-05-21 23:41:53 +00:00
Davide Italiano	21a49dcdf1	[InstCombine] Take into account the size in sext->lshr->trunc patterns. Otherwise we end up miscompiling, transforming: define i8 @tinky() { %sext = sext i1 1 to i16 %hibit = lshr i16 %sext, 15 %tr = trunc i16 %hibit to i8 ret i8 %tr } into: %sext = sext i1 1 to i8 ret i8 %sext and the first gets folded to ret i8 1, while the second gets folded to ret i8 -1. Eventually we should get rid of this transform entirely, but for now, this at least fixes a known correctness bug. Differential Revision: https://reviews.llvm.org/D33338 llvm-svn: 303513	2017-05-21 20:30:27 +00:00
Xin Tong	9fbfeefadf	Revert "Add pthread_self function prototype and make it speculatable." This reverts commit 143d7445b5dfa2f6d6c45bdbe0433d9fc531be21. Build breaking llvm-svn: 303496	2017-05-21 00:37:55 +00:00
Xin Tong	75af3af957	Add pthread_self function prototype and make it speculatable. Summary: This allows pthread_self to be pulled out of a loop by LICM. Reviewers: hfinkel, arsenm, davide Reviewed By: davide Subscribers: davide, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D32782 llvm-svn: 303495	2017-05-20 22:40:25 +00:00
Davide Italiano	9a0f542db6	[NewGVN] Create a StoreExpression instead of a VariableExpression. In the case where we have an operand defined by a load of the same memory location. Historically this was a VariableExpression because we wanted to make sure they ended up in the same class, but if we create the right expression, they end up in the same class anyway. Fixes PR32897. Thanks to Dan for the detailed discussion and the fix suggestion. llvm-svn: 303475	2017-05-20 00:46:54 +00:00
Davide Italiano	888965c8a2	[NewGVN] Get rid of an assertion. This was here because we don't want to switch leaders too much, in order to avoid fixpoint(ing) issues, but it's not clear whether it matters in practice. A first step towards fixing PR32897. llvm-svn: 303473	2017-05-20 00:24:04 +00:00
Adrian Prantl	660437975b	Revert "ThinLTO: Verify bitcode before launching the ThinLTOCodeGenerator." This reverts commit r303438 while deliberating buildbot breakage. llvm-svn: 303467	2017-05-19 23:32:21 +00:00
Matthias Braun	50ec0b5dce	SimplifyLibCalls: Optimize wcslen Refactor the strlen optimization code to work for both strlen and wcslen. This especially helps with programs in the wild where people pass L"string"s to const std::wstring& function parameters and the wstring constructor gets inlined. This also fixes a lingering API problem/bug in getConstantStringInfo() where zeroinitializers would always give you an empty string (without a length) back regardless of the actual length of the initializer which did not work well in the TrimAtNul==false causing the PR mentioned below. Note that the fixed getConstantStringInfo() needed fixes to SelectionDAG memcpy lowering and may lead to some cases for out-of-bounds zeroinitializer accesses not getting optimized anymore. So some code with UB may produce out of bound memory reads now instead of just producing zeros. The refactoring "accidentally" fixes http://llvm.org/PR32124 Differential Revision: https://reviews.llvm.org/D32839 llvm-svn: 303461	2017-05-19 22:37:09 +00:00
Daniel Berlin	e021d2d629	NewGVN: Fix PR32838. This is a complicated bug involving two issues: 1. What do we do with phi nodes when we prove all arguments are not live? 2. When is it safe to use value leaders to determine if we can ignore an argument? llvm-svn: 303453	2017-05-19 20:22:20 +00:00
Daniel Berlin	b527b2cf13	Last of the major pieces to NewGVN - yay! Summary: NewGVN: Handle equivalence between phi of ops and op of phis. This makes our GVN mostly-complete. It would be complete, modulo some deliberate choices we make. This means it detects roughly all Herbrand equivalences in polynomial time, including cases notoriously hard for other GVNs to detect. It also detects a very large swath of the cases we currently rely on instcombine to detect that involve folding upwards through phis. Fixes PR 31125, 31463, PR 31868 Reviewers: davide Subscribers: Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D32151 llvm-svn: 303444	2017-05-19 19:01:27 +00:00
Daniel Berlin	ff15200b1d	NewGVN: Get rid of most dominating leader check llvm-svn: 303443	2017-05-19 19:01:24 +00:00
Anna Thomas	ae3f752f36	[NFC][loopIdiom] Clang format change rL303434 llvm-svn: 303439	2017-05-19 18:00:30 +00:00
Adrian Prantl	f9ab9bfc39	ThinLTO: Verify bitcode before launching the ThinLTOCodeGenerator. rdar://problem/31233625 Differential Revision: https://reviews.llvm.org/D33151 llvm-svn: 303438	2017-05-19 17:55:02 +00:00
Anna Thomas	5ecb8f7593	[LoopIdiom] Refactor return value of isLegalStore [NFC] Summary: This NFC simply refactors the return value of LoopIdiomRecognize::isLegalStore() from bool to an enumeration, and removes the return-through-parameter mechanism that the function was using. This function is constructed such that it will only ever recognize a single store idiom (memset, memset_pattern, or memcpy), and never a combination of these. As such it makes much more sense for the return value to be the single idiom that the store matches, rather than having a separate argument-return for each idiom -- it's cleaner, and makes it clearer that only a single idiom can be matched. Patch by Daniel Neilson! Reviewers: anna, sanjoy, davide, haicheng Reviewed By: anna, haicheng Subscribers: haicheng, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D33359 llvm-svn: 303434	2017-05-19 17:05:36 +00:00
Artur Pilipenko	a6c278049a	[LoopPredication] NFC. Extract LoopICmp struct and parseLoopICmp helper llvm-svn: 303427	2017-05-19 14:02:46 +00:00
Artur Pilipenko	6780ba65b9	[LoopPredication] NFC. Extract LoopPredication::expandCheck helper llvm-svn: 303426	2017-05-19 14:00:58 +00:00
Artur Pilipenko	aab28666bc	[LoopPredication] NFC. Extract CanExpand helper lambda llvm-svn: 303425	2017-05-19 14:00:04 +00:00
Artur Pilipenko	46c4e0a4bf	[LoopPredication] NFC. Add an early exit if there are no guards in the loop llvm-svn: 303424	2017-05-19 13:59:34 +00:00
Amara Emerson	4d33c86359	Fix vector pass-through value being unused in IRBuilder::CreateMaskedGather Also s/0/nullptr in the call site in LV. llvm-svn: 303416	2017-05-19 10:40:18 +00:00
Davide Italiano	ee49f4943c	[NewGVN] Delete the old store when we find congruent to a load. (or non-store, more in general). Fixes PR33086. Caught by the store verifier. llvm-svn: 303406	2017-05-19 04:06:10 +00:00
Davide Italiano	eab0de2b82	[NewGVN] Break infinite recursion in singleReachablePHIPath(). We can have cycles between PHIs and this causes singleReachablePHIPath() to call itself indefinitely (until we run out of stack). The proper solution would be that of computing SCCs, but it's not worth it for now, so just keep a visited set and give up when we find a cycle. Thanks to Dan for the discussion/help with this. Fixes PR33014. llvm-svn: 303393	2017-05-18 23:22:44 +00:00
Davide Italiano	a76e5fa111	[NewGVN] Replace predicate info leftovers. This time with an additional fix, i.e. we remove the dead @llvm.ssa.copy instruction. llvm-svn: 303385	2017-05-18 21:43:23 +00:00
Sanjay Patel	5e456b943a	[InstCombine] add helper to foldXorOfICmps(); NFCI Also, fix the old-style capitalization of the related functions and move them to the 'private' section of the class since they are just helpers of the visit* functions. As shown in the post-commit comments for D32143, we are missing folds for xor-of-icmps. llvm-svn: 303381	2017-05-18 20:53:16 +00:00
Reid Kleckner	96ab8726a3	[IR] De-virtualize ~Value to save a vptr Summary: Implements PR889 Removing the virtual table pointer from Value saves 1% of RSS when doing LTO of llc on Linux. The impact on time was positive, but too noisy to conclusively say that performance improved. Here is a link to the spreadsheet with the original data: https://docs.google.com/spreadsheets/d/1F4FHir0qYnV0MEp2sYYp_BuvnJgWlWPhWOwZ6LbW7W4/edit?usp=sharing This change makes it invalid to directly delete a Value, User, or Instruction pointer. Instead, such code can be rewritten to a null check and a call Value::deleteValue(). Value objects tend to have their lifetimes managed through iplist, so for the most part, this isn't a big deal. However, there are some places where LLVM deletes values, and those places had to be migrated to deleteValue. I have also created llvm::unique_value, which has a custom deleter, so it can be used in place of std::unique_ptr<Value>. I had to add the "DerivedUser" Deleter escape hatch for MemorySSA, which derives from User outside of lib/IR. Code in IR cannot include MemorySSA headers or call the MemoryAccess object destructors without introducing a circular dependency, so we need some level of indirection. Unfortunately, no class derived from User may have any virtual methods, because adding a virtual method would break User::getHungOffOperands(), which assumes that it can find the use list immediately prior to the User object. I've added a static_assert to the appropriate OperandTraits templates to help people avoid this trap. Reviewers: chandlerc, mehdi_amini, pete, dberlin, george.burgess.iv Reviewed By: chandlerc Subscribers: krytarowski, eraman, george.burgess.iv, mzolotukhin, Prazek, nlewycky, hans, inglorion, pcc, tejohnson, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D31261 llvm-svn: 303362	2017-05-18 17:24:10 +00:00
Wei Mi	8848c1e3c7	[LSR] Call canonicalize after we generate a new Formula in GenerateTruncates. Fix PR33077. The testcase in PR33077 generates a LSR Use Formula with two SCEVAddRecExprs for the same loop. Such uncommon formula will become non-canonical after GenerateTruncates adds sign extension to the ScaledReg of the Formula, and it will break the assertion that every Formula to be inserted is canonical. The fix is to call canonicalize for the raw Formula generated by GenerateTruncates before inserting it. llvm-svn: 303361	2017-05-18 17:21:22 +00:00
Anna Thomas	7bca59152a	[JumpThreading] Dont RAUW condition incorrectly Summary: We have a bug when RAUWing the condition if experimental.guard or assumes is a use of that condition. This is because LazyValueInfo may have used the guards/assumes to identify the value of the condition at the end of the block. RAUW replaces the uses at the guard/assume as well as uses before the guard/assume. Both of these are incorrect. For now, disable RAUW for conditions and fix the logic as a next step: https://reviews.llvm.org/D33257 Reviewers: sanjoy, reames, trentxintong Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33279 llvm-svn: 303349	2017-05-18 13:12:18 +00:00
Craig Topper	8a950275f7	[Statistics] Add a method to atomically update a statistic that contains a maximum Summary: There are several places in the codebase that try to calculate a maximum value in a Statistic object. We currently do this in one of two ways: MaxNumFoo = std::max(MaxNumFoo, NumFoo); or MaxNumFoo = (MaxNumFoo > NumFoo) ? MaxNumFoo : NumFoo; The first version reads from MaxNumFoo one time and unconditionally writes to it. The second version possibly reads it twice depending on the result of the first compare. But we have no way of knowing if the value was changed by another thread between the reads and the writes. This patch adds a method to the Statistic object that can ensure that we only store if our value is the max and the previous max didn't change after we read it. If it changed we'll recheck if our value should still be the max or not and try again. This spawned from an audit I'm trying to do of all places we use the implicit conversion to unsigned on the Statistics objects. See my previous thread on llvm-dev https://groups.google.com/forum/#!topic/llvm-dev/yfvxiorKrDQ Reviewers: dberlin, chandlerc, hfinkel, dblaikie Reviewed By: chandlerc Subscribers: llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D33301 llvm-svn: 303318	2017-05-18 00:51:39 +00:00
Craig Topper	48187cffe2	[Statistics] Use Statistic::operator+= instead of adding and assigning separately. I believe this technically fixes a multithreaded race condition in this code. But my primary concern was as part of looking at removing the ability to treat Statistics like a plain unsigned. There are many weird operations on Statistics in the codebase. llvm-svn: 303314	2017-05-17 23:22:10 +00:00
Sanjay Patel	ba212c241a	[InstCombine] handle icmp i1 X, C early to avoid creating an unknown pattern The missing optimization for xor-of-icmps still needs to be added, but by being more efficient (not generating unnecessary logic ops with constants) we avoid the bug. See discussion in post-commit comments: https://reviews.llvm.org/D32143 llvm-svn: 303312	2017-05-17 22:29:40 +00:00
Sanjay Patel	e5747e3cbd	[InstCombine] move icmp bool canonicalizations to helper; NFC As noted in the post-commit comments in D32143, we should be catching the constant operand cases sooner to be more efficient and less likely to expose a missing fold. llvm-svn: 303309	2017-05-17 22:15:07 +00:00
Sanjay Patel	b2e7003103	[InstCombine] add isCanonicalPredicate() helper function and use it; NFCI There should be a slight efficiency improvement from handling icmp/fcmp with one matcher and reducing duplicated code. The larger motivation is that there are questions about how predicate canonicalization is handled, and the refactoring should make it easier if we want to change any of that behavior. 1. As noted in the code comment, we've chosen 3 of the 16 FCMP preds as not canonical. Why those 3? It goes back to rL32751 from what I can tell, but I'm not sure if there's a justification for that rule. 2. We currently do not canonicalize integer select conditions. Should we use the same rule that applies to branches for selects? 3. We currently do canonicalize some FP select conditions, and those rules would conflict with the rule shown here. Should one or both be changed? No-functional-change-intended, but adding tests anyway because there's no coverage for most of the predicates. Differential Revision: https://reviews.llvm.org/D33247 llvm-svn: 303261	2017-05-17 14:21:19 +00:00
Gor Nishanov	db38485588	[coroutines] Handle spills before catchswitch If we need to spill the result of the PHI instruction, we insert the spill after all of the PHIs and EHPads, however, in a catchswitch block there is no room to insert the spill. Make room by splitting away catchswitch into a separate block. Before the fix: catch.dispatch: %val = phi i32 [ 1, %if.then ], [ 2, %if.else ] %switch = catchswitch within none [label %catch] unwind label %cleanuppad After: catch.dispatch: %val = phi i32 [ 1, %if.then ], [ 2, %if.else ] %tok = cleanuppad within none [] ; spill goes here cleanupret from %tok unwind label %catch.dispatch.switch catch.dispatch.switch: %switch = catchswitch within none [label %catch] unwind label %cleanuppad https://reviews.llvm.org/D31846 llvm-svn: 303232	2017-05-17 03:09:22 +00:00
Francis Visoiu Mistrih	b52e036600	BitVector: add iterators for set bits Differential revision: https://reviews.llvm.org/D32060 llvm-svn: 303227	2017-05-17 01:07:53 +00:00
Eugene Zelenko	a369a45746	[ADT] Fix some Clang-tidy modernize-use-using warnings; other minor fixes (NFC). llvm-svn: 303221	2017-05-16 23:10:25 +00:00
Davide Italiano	79eb3b0366	[IR] Prefer use_empty() to !hasNUsesOrMore(1) for clarity. llvm-svn: 303218	2017-05-16 22:38:40 +00:00
Evgeny Stupachenko	cc19560253	The patch excludes a case from zero check skip in CTLZ idiom recognition (r303102). Summary: The following case: i = 1; if(n) while (n >>= 1) i++; use(i); Was converted to: i = 1; if(n) i += builtin_ctlz(n >> 1, false); use(i); Which is not correct. The patch makes it: i = 1; if(n) i += builtin_ctlz(n >> 1, true); use(i); From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 303212	2017-05-16 21:44:59 +00:00
Dmitry Mikulin	fce148c568	In debug builds, a non-trivial amount of time is spent in InstCombine processing @llvm.dbg.* calls in visitCallInst(). They can be safely ignored. llvm-svn: 303202	2017-05-16 20:08:49 +00:00
Daniel Berlin	6c66e9a22a	NewGVN: Only do something in verifyStoreExpressions if assertions are enabled, to avoid unused code warnings. llvm-svn: 303201	2017-05-16 20:02:45 +00:00
Daniel Berlin	4540357240	NewGVN: Fix PR 33051 by making sure we remove old store expressions from the ExpressionToClass mapping. llvm-svn: 303200	2017-05-16 19:58:47 +00:00
Matthew Simpson	af60af1ed5	Revert 303174, 303176, and 303178 These commits are breaking the bots. Reverting to investigate. llvm-svn: 303182	2017-05-16 15:50:30 +00:00
Matthew Simpson	b7b5d55c38	[LV] Avoid potential division by zero when selecting IC llvm-svn: 303174	2017-05-16 14:43:55 +00:00
Gor Nishanov	23453c11ff	[coroutines] Handle unwind edge splitting Summary: RewritePHIs algorithm used in building of CoroFrame inserts a placeholder ``` %placeholder = phi [%val] ``` on every edge leading to a block starting with PHI node with multiple incoming edges, so that if one of the incoming values was spilled and needs to be reloaded, we have a place to insert a reload. We use SplitEdge helper function to split the incoming edge. SplitEdge function does not deal with unwind edges coming into a block with an EHPad. This patch adds an ehAwareSplitEdge function that can correctly split the unwind edge. For landing pads, we clone the landing pad into every edge block and replace the original landing pad with a PHI collecting the values from all incoming landing pads. For WinEH pads, we keep the original EHPad in place and insert cleanuppad/cleanupret in the edge blocks. Reviewers: majnemer, rnk Reviewed By: majnemer Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D31845 llvm-svn: 303172	2017-05-16 14:11:39 +00:00
Craig Topper	064adc6bfa	[CorrelatedValuePropagation] Don't use -> to call a static method of ConstantRange. NFC llvm-svn: 303147	2017-05-16 07:05:38 +00:00
Daniel Berlin	629e1ff6e6	NewGVN: Use StoreExpression StoredValue instead of looking it up again, since it was already looked up when it was created llvm-svn: 303144	2017-05-16 06:06:15 +00:00
Daniel Berlin	abd632dfeb	NewGVN: Formatting fixes llvm-svn: 303143	2017-05-16 06:06:12 +00:00
Davide Italiano	a641842845	Revert "[NewGVN] Replace predicate info leftovers." It's breaking the bots. llvm-svn: 303142	2017-05-16 05:51:21 +00:00
Davide Italiano	331058fcc4	[NewGVN] Replace predicate info leftovers. Fixes PR32945. Differential Revision: https://reviews.llvm.org/D33226 llvm-svn: 303141	2017-05-16 05:23:23 +00:00
Peter Collingbourne	6f0ecca3b5	IR: Give function GlobalValue::getRealLinkageName() a less misleading name: dropLLVMManglingEscape(). This function gives the wrong answer on some non-ELF platforms in some cases. The function that does the right thing lives in Mangler.h. To try to discourage people from using this function, give it a different name. Differential Revision: https://reviews.llvm.org/D33162 llvm-svn: 303134	2017-05-16 00:39:01 +00:00
Xinliang David Li	8726d91d29	Fix memory leak llvm-svn: 303126	2017-05-15 22:43:52 +00:00
David Blaikie	441cfee780	PR32288: Describe a bool parameter's DWARF location with a simple register There's no need (& a bit incorrect) to mask off the high bits of the register reference when describing a simple bool value. Reviewers: aprantl Differential Revision: https://reviews.llvm.org/D31062 llvm-svn: 303117	2017-05-15 21:34:01 +00:00
Adam Nemet	e29686e5c1	[SLP] Enable 64-bit wide vectorization on AArch64 ARM Neon has native support for half-sized vector registers (64 bits). This is beneficial for example for 2D and 3D graphics. This patch adds the option to lower MinVecRegSize from 128 via a TTI in the SLP Vectorizer. * Performance Analysis This change was motivated by some internal benchmarks but it is also beneficial on SPEC and the LLVM testsuite. The results are with -O3 and PGO. A negative percentage is an improvement. The testsuite was run with a sample size of 4. SPEC * CFP2006/482.sphinx3 -3.34% A pretty hot loop is SLP vectorized resulting in nice instruction reduction. This used to be a +22% regression before rL299482. * CFP2000/177.mesa -3.34% * CINT2000/256.bzip2 +6.97% My current plan is to extend the fix in rL299482 to i16 which brings the regression down to +2.5%. There are also other problems with the codegen in this loop so there is further room for improvement. ** LLVM testsuite * SingleSource/Benchmarks/Misc/ReedSolomon -10.75% There are multiple small SLP vectorizations outside the hot code. It's a bit surprising that it adds up to 10%. Some of this may be code-layout noise. * MultiSource/Benchmarks/VersaBench/beamformer/beamformer -8.40% The opt-viewer screenshot can be seen at F3218284. We start at a colder store but the tree leads us into the hottest loop. * MultiSource/Applications/lambda-0.1.3/lambda -2.68% * MultiSource/Benchmarks/Bullet/bullet -2.18% This is using 3D vectors. * SingleSource/Benchmarks/Shootout-C++/Shootout-C++-lists +6.67% Noise, binary is unchanged. * MultiSource/Benchmarks/Ptrdist/anagram/anagram +4.90% There is an additional SLP in the cold code. The test runs for ~1sec and prints out over 2000 lines. This is most likely noise. * MultiSource/Applications/aha/aha +1.63% * MultiSource/Applications/JM/lencod/lencod +1.41% * SingleSource/Benchmarks/Misc/richards_benchmark +1.15% Differential Revision: https://reviews.llvm.org/D31965 llvm-svn: 303116	2017-05-15 21:15:01 +00:00
Evgeniy Stepanov	b56012b548	[asan] Better workaround for gold PR19002. See the comment for more details. Test in a follow-up CFE commit. llvm-svn: 303113	2017-05-15 20:43:42 +00:00
Davide Italiano	cff8a34716	[NewGVN] Remove unused setDefiningExpr(). NFCI. llvm-svn: 303107	2017-05-15 19:35:40 +00:00
Sanjay Patel	878715f978	[InstCombine] restrict icmp fold with 2 sdiv exact operands (PR32949) This is the InstCombine counterpart to D32954. I added some comments about the code duplication in: rL302436 Alive-based verification: http://rise4fun.com/Alive/dPw This is a 2nd fix for the problem reported in: https://bugs.llvm.org/show_bug.cgi?id=32949 Differential Revision: https://reviews.llvm.org/D32970 llvm-svn: 303105	2017-05-15 19:27:53 +00:00
Evgeny Stupachenko	2fecd38ab8	The patch adds CTLZ idiom recognition. Summary: The following loops should be recognized: i = 0; while (n) { n = n >> 1; i++; body(); } use(i); And replaced with builtin_ctlz(n) if body() is empty or for CPUs that have CTLZ instruction converted to countable: for (j = 0; j < builtin_ctlz(n); j++) { n = n >> 1; i++; body(); } use(builtin_ctlz(n)); Reviewers: rengolin, joerg Differential Revision: http://reviews.llvm.org/D32605 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 303102	2017-05-15 19:08:56 +00:00
Davide Italiano	6e7a212748	[NewGVN] Fix verification of MemoryPhis in verifyMemoryCongruency(). verifyMemoryCongruency() filters out trivially dead MemoryDef(s), as we find them immediately dead, before moving from TOP to a new congruence class. This fixes the same problem for PHI(s) skipping MemoryPhis if all the operands are dead. Differential Revision: https://reviews.llvm.org/D33044 llvm-svn: 303100	2017-05-15 18:50:53 +00:00
Sanjay Patel	941e8dfcbf	[InstCombine] use m_OneUse to reduce code; NFCI llvm-svn: 303090	2017-05-15 18:08:17 +00:00
Craig Topper	1a36b7d836	[ValueTracking] Replace all uses of ComputeSignBit with computeKnownBits. This patch finishes off the conversion of ComputeSignBit to computeKnownBits. Differential Revision: https://reviews.llvm.org/D33166 llvm-svn: 303035	2017-05-15 06:39:41 +00:00
Craig Topper	bb9737247a	[InstCombine] Merge duplicate functionality between InstCombine and ValueTracking Summary: Merge overflow computation for signed add, appearing both in InstCombine and ValueTracking. As part of the merge, cleanup the interface for overflow checks in InstCombine. Patch by Yoav Ben-Shalom. Reviewers: craig.topper, majnemer Reviewed By: craig.topper Subscribers: takuto.ikuta, llvm-commits Differential Revision: https://reviews.llvm.org/D32946 llvm-svn: 303029	2017-05-15 02:44:08 +00:00
Craig Topper	26c4159956	[InstCombine] Remove 'return' of a called function that also returned void. NFC llvm-svn: 303028	2017-05-15 02:30:27 +00:00
Xinliang David Li	392e975693	Fix test failure on windows -- do not return deleted func llvm-svn: 302999	2017-05-14 02:54:02 +00:00
Simon Pilgrim	7d62e4b455	[LoopOptimizer][Fix]PR32859, PR24738 The Loop vectorizer pass introduced undef value while it is fixing output of LCSSA form. Here it is: before: %e.0.ph = phi i32 [ 0, %for.inc.2.i ] after: %e.0.ph = phi i32 [ 0, %for.inc.2.i ], [ undef, %middle.block ] and after this change we have: %e.0.ph = phi i32 [ 0, %for.inc.2.i ] %e.0.ph = phi i32 [ 0, %for.inc.2.i ], [ 0, %middle.block ] Committed on behalf of @dtemirbulatov Differential Revision: https://reviews.llvm.org/D33055 llvm-svn: 302988	2017-05-13 13:25:57 +00:00
Craig Topper	935f7b050f	[InstCombine] Prevent InstCombine from triggering an extra iteration if something changed in the initial Worklist creation Summary: If the Worklist build causes an IR change this change flag currently factors into the flag for running another iteration of the iteration loop. But only changes during processing should trigger another loop. This patch captures the worklist creation change flag into the outside the loop flag currently used for DbgDeclares and only sends that flag up to the caller. Rerunning the loop only depends on IC.run() now. This uses the debug output of InstCombine to determine if one or two iterations run. I couldn't think of a better way to detect it since the second spurious iteration shouldn't make any visible changes. Just wasted computation. I can do a pre-commit of the test case with the CHECK-NOT as a CHECK if this is an ok way to check this. This is a subset of D31678 as I'm still not sure how to verify the analysis behavior for that. Reviewers: davide, majnemer, spatel, chandlerc Reviewed By: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32453 llvm-svn: 302982	2017-05-13 06:56:04 +00:00
Xinliang David Li	66bdfca77a	[PartialInlining] Profile based cost analysis Implemented frequency based cost/saving analysis and related options. The pass is now in a state ready to be turne on in the pipeline (in follow up). Differential Revision: http://reviews.llvm.org/D32783 llvm-svn: 302967	2017-05-12 23:41:43 +00:00
Craig Topper	8df66c602a	[KnownBits] Add bit counting methods to KnownBits struct and use them where possible This patch adds min/max population count, leading/trailing zero/one bit counting methods. The min methods return answers based on bits that are known without considering unknown bits. The max methods give answers taking into account the largest count that unknown bits could give. Differential Revision: https://reviews.llvm.org/D32931 llvm-svn: 302925	2017-05-12 17:20:30 +00:00
Davide Italiano	c43a9f80ed	[NewGVN] Improve debug output a bit. NFCI. While debugging a predicate info problem, I noticed this was missing a newline, making the debug output slightly less readable. llvm-svn: 302908	2017-05-12 15:28:12 +00:00
Davide Italiano	b60f6e0550	[NewGVN] Format an assertion and fix a typo. NFCI. llvm-svn: 302906	2017-05-12 15:25:56 +00:00
Davide Italiano	41f5c7bcba	[NewGVN] Don't incorrectly reset the memory leader. This code was missing a check for stores, so we were thinking the congruency class didn't have any memory members, and reset the memory leader. Differential Revision: https://reviews.llvm.org/D33056 llvm-svn: 302905	2017-05-12 15:22:45 +00:00
Chandler Carruth	d869b18826	[PM/Unswitch] Teach the new simple loop unswitch to handle loop invariant PHI inputs and to rewrite PHI nodes during the actual unswitching. The checking is quite easy, but rewriting the PHI nodes is somewhat surprisingly challenging. This should handle both branches and switches. I think this is now a full featured trivial unswitcher, and more full featured than the trivial cases in the old pass while still being (IMO) somewhat simpler in how it works. Next up is to verify its correctness in more widespread testing, and then to add non-trivial unswitching. Thanks to Davide and Sanjoy for the excellent review. There is one remaining question that I may address in a follow-up patch (see the review thread for details) but it isn't related to the functionality specifically. Differential Revision: https://reviews.llvm.org/D32699 llvm-svn: 302867	2017-05-12 02:19:59 +00:00
Adam Nemet	0aca09fc6c	[SLP] Emit optimization remarks The approach I followed was to emit the remark after getTreeCost concludes that SLP is profitable. I initially tried emitting them after the vectorizeRootInstruction calls in vectorizeChainsInBlock but I vaguely remember missing a few cases for example in HorizontalReduction::tryToReduce. ORE is placed in BoUpSLP so that it's available from everywhere (notably HorizontalReduction::tryToReduce). We use the first instruction in the root bundle as the locator for the remark. In order to get a sense of how far the tree is spanning I've included the size of the tree in the remark. This is not perfect of course but it gives you at least a rough idea about the tree. Then you can follow up with -view-slp-tree to really see the actual tree. llvm-svn: 302811	2017-05-11 17:06:17 +00:00
Ayal Zaks	58b28d549a	[LV] Refactor ILV.vectorize{Loop}() by introducing LVP.executePlan(); NFC Introduce LoopVectorizationPlanner.executePlan(), replacing ILV.vectorize() and refactoring ILV.vectorizeLoop(). Method collectDeadInstructions() is moved from ILV to LVP. These changes facilitate building VPlans and using them to generate code, following https://reviews.llvm.org/D28975 and its tentative breakdown. Method ILV.createEmptyLoop() is renamed ILV.createVectorizedLoopSkeleton() to improve clarity; its contents remain intact. Differential Revision: https://reviews.llvm.org/D32200 llvm-svn: 302790	2017-05-11 11:36:33 +00:00
Alexander Potapenko	a658ae8fe2	[msan] Fix PR32842 It turned out that MSan was incorrectly calculating the shadow for int comparisons: it was done by truncating the result of (Shadow1 OR Shadow2) to i1, effectively rendering all bits except LSB useless. This approach doesn't work e.g. in the case where the values being compared are even (i.e. have the LSB of the shadow equal to zero). Instead, if CreateShadowCast() has to cast a bigger int to i1, we replace the truncation with an ICMP to 0. This patch doesn't affect the code generated for SPEC 2006 binaries, i.e. there's no performance impact. For the test case reported in PR32842 MSan with the patch generates a slightly more efficient code: orq %rcx, %rax jne .LBB0_6 , instead of: orl %ecx, %eax testb $1, %al jne .LBB0_6 llvm-svn: 302787	2017-05-11 11:07:48 +00:00
Serge Guelton	f4dc59ba8e	Remove spurious cast of nullptr. NFC. Conversion rules allow automatic casting of nullptr to any pointer type. llvm-svn: 302780	2017-05-11 08:53:00 +00:00
Sanjay Patel	40a87a909b	[InstCombine] remove fold that swaps xor/or with constants; NFCI // (X ^ C1) | C2 --> (X | C2) ^ (C1&~C2) This canonicalization was added at: https://reviews.llvm.org/rL7264 By moving xors out/down, we can more easily combine constants. I'm adding tests that do not change with this patch, so we can verify that those kinds of transforms are still happening. This is no-functional-change-intended because there's a later fold: // (X^C)|Y -> (X|Y)^C iff Y&C == 0 ...and demanded-bits appears to guarantee that any fold that would have hit the fold we're removing here would be caught by that 2nd fold. Similar reasoning was used in: https://reviews.llvm.org/rL299384 The larger motivation for removing this code is that it could interfere with the fix for PR32706: https://bugs.llvm.org/show_bug.cgi?id=32706 Ie, we're not checking if the 'xor' is actually a 'not', so we could reverse a 'not' optimization and cause an infinite loop by altering an 'xor X, -1'. Differential Revision: https://reviews.llvm.org/D33050 llvm-svn: 302733	2017-05-10 21:33:55 +00:00
Davide Italiano	dc435325a8	[NewGVN] Introduce a definesNoMemory() helper and use it. This is nice as is, but it will be used in my next patch to fix a bug. Suggested by Daniel Berlin. llvm-svn: 302714	2017-05-10 19:57:43 +00:00
Teresa Johnson	94624aca2a	Ensure non-null ProfileSummaryInfo passed to ModuleSummaryIndex builder This fixes a ubsan bot failure after r302597, which made getProfileCount non-static, but ended up invoking it on a null ProfileSummaryInfo object in some cases from buildModuleSummaryIndex. Most testing passed because the non-static getProfileCount currently doesn't access any member variables, but I found this when testing a follow on patch (D32877) that adds a member variable access. llvm-svn: 302705	2017-05-10 18:52:16 +00:00
Sanjay Patel	2e069f250a	[InstCombine] add (ashr (shl i32 X, 31), 31), 1 --> and (not X), 1 This is another step towards favoring 'not' ops over random 'xor' in IR: https://bugs.llvm.org/show_bug.cgi?id=32706 This transformation may have occurred in longer IR sequences using computeKnownBits, but that could be much more expensive to calculate. As the scalar result shows, we do not currently favor 'not' in all cases. The 'not' created by the transform is transformed again (unnecessarily). Vectors don't have this problem because vectors are (wrongly) excluded from several other combines. llvm-svn: 302659	2017-05-10 13:56:52 +00:00
Serge Guelton	778ece82ae	Use explicit false instead of casted nullptr. NFC. llvm-svn: 302656	2017-05-10 13:24:17 +00:00
Chandler Carruth	f3bd8ddedb	Revert r301950: SpeculativeExecution: Stop using whitelist for costs This pass doesn't correctly handle testing for when it is legal to hoist arbitrary instructions. The whitelist happens to make it safe, so before it is removed the pass's legality checks will need to be enhanced. Details have been added to the code review thread for the patch. llvm-svn: 302640	2017-05-10 12:30:07 +00:00
Amara Emerson	836b0f48c1	Add a late IR expansion pass for the experimental reduction intrinsics. This pass uses a new target hook to decide whether or not to expand a particular intrinsic to the shufflevector sequence. Differential Revision: https://reviews.llvm.org/D32245 llvm-svn: 302631	2017-05-10 09:42:49 +00:00
Sanjay Patel	4133d4a56e	[InstCombine] add helper function for add X, C folds; NFCI llvm-svn: 302605	2017-05-10 00:07:16 +00:00
Easwaran Raman	f5f9160072	[ProfileSummary] Make getProfileCount a non-static member function. This change is required because the notion of count is different for sample profiling and getProfileCount will need to determine the underlying profile type. Differential revision: https://reviews.llvm.org/D33012 llvm-svn: 302597	2017-05-09 23:21:10 +00:00
Peter Collingbourne	c3d677f9d9	FunctionImport: Simplify function llvm::thinLTOInternalizeModule. NFCI. llvm-svn: 302595	2017-05-09 22:43:31 +00:00
Keno Fischer	06f962c1e8	[GVN] Fix a crash on encountering non-integral pointers Summary: This fixes the immediate crash caused by introducing an incorrect inttoptr before attempting the conversion. There may still be a legality check missing somewhere earlier for non-integral pointers, but this change seems necessary in any case. Reviewers: sanjoy, dberlin Reviewed By: dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32623 llvm-svn: 302587	2017-05-09 21:07:20 +00:00
Matthew Simpson	78fd46b230	[AArch64] Consider widening instructions in cost calculations The AArch64 instruction set has a few "widening" instructions (e.g., uaddl, saddl, uaddw, etc.) that take one or more doubleword operands and produce quadword results. The operands are automatically sign- or zero-extended as appropriate. However, in LLVM IR, these extends are explicit. This patch updates TTI to consider these widening instructions as single operations whose cost is attached to the arithmetic instruction. It marks extends that are part of a widening operation "free" and applies a sub-target specified overhead (zero by default) to the arithmetic instructions. Differential Revision: https://reviews.llvm.org/D32706 llvm-svn: 302582	2017-05-09 20:18:12 +00:00
Sanjay Patel	7caaa79879	[InstCombine] clean up matchDeMorgansLaws(); NFCI The motivation for getting rid of dyn_castNotVal is to allow fixing: https://bugs.llvm.org/show_bug.cgi?id=32706 So this was supposed to be functional-change-intended for the case of inverting constants and applying DeMorgan. However, I can't find any cases where that pattern will actually get to matchDeMorgansLaws() because we have other folds in visitAnd/visitOr that do the same thing. So this ends up just being a clean-up patch with slight efficiency improvement, but no-functional-change-intended. llvm-svn: 302581	2017-05-09 20:05:05 +00:00
Davide Italiano	b7a6698ae9	[NewGVN] Simplify a DEBUG() statement. NFCI. llvm-svn: 302579	2017-05-09 20:02:48 +00:00
Adrian Prantl	c10d0e5ccd	Make it illegal for two Functions to point to the same DISubprogram As recently discussed on llvm-dev [1], this patch makes it illegal for two Functions to point to the same DISubprogram and updates FunctionCloner to also clone the debug info of a function to conform to the new requirement. To simplify the implementation it also factors out the creation of inlineAt locations from the Inliner into a general-purpose utility in DILocation. [1] http://lists.llvm.org/pipermail/llvm-dev/2017-May/112661.html <rdar://problem/31926379> Differential Revision: https://reviews.llvm.org/D32975 This reapplies r302469 with a fix for a bot failure (reparentDebugInfo now checks for the case the orig and new function are identical). llvm-svn: 302576	2017-05-09 19:47:37 +00:00
Piotr Padlewski	d979c1f806	NFC: refactor replaceDominatedUsesWith Summary: Since I will post patch with some changes to replaceDominatedUsesWith, it would be good to avoid duplicating code again. Reviewers: davide, dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32798 llvm-svn: 302575	2017-05-09 19:39:44 +00:00
Serge Guelton	e38003f839	Suppress all uses of LLVM_END_WITH_NULL. NFC. Use variadic templates instead of relying on <cstdarg> + sentinel. This enforces better type checking and makes code more readable. Differential Revision: https://reviews.llvm.org/D32541 llvm-svn: 302571	2017-05-09 19:31:13 +00:00
Davide Italiano	63998ec3c8	[NewGVN] Explain why sorting by pointer values doesn't introduce non-determinism. Thanks to Eli for pointing out in a post-commit review comment. llvm-svn: 302566	2017-05-09 18:29:37 +00:00
Davide Italiano	d6bb8cab03	[NewGVN] Fix a consistent order for phi nodes operands. The way we currently define congruency for two PHIExpression(s) is: 1) The operands to the phi functions are congruent 2) The PHIs are defined in the same BasicBlock. NewGVN works under the assumption that phi operands are in predecessor order, or at least in some consistent order. OTOH, the following is valid IR: patatino: %meh = phi i16 [ %0, %winky ], [ %conv1, %tinky ] %banana = phi i16 [ %0, %tinky ], [ %conv1, %winky ] br label %end and the in-memory representations of the two SSA registers have an inconsistent order. This violation of NewGVN assumptions results in two PHIs being found congruent when they're not. While we think it's useful to always have a consistent order enforced, let's fix this in NewGVN by sorting uses in predecessor order before creating a PHI expression. Differential Revision: https://reviews.llvm.org/D32990 llvm-svn: 302552	2017-05-09 16:58:28 +00:00
Daniel Berlin	6604a2ffbb	NewGVN: Make all of symbolic evaluation logically const. llvm-svn: 302550	2017-05-09 16:40:04 +00:00
Sanjay Patel	6844e21f59	[InstCombineCasts] Fix checks in sext->lshr->trunc pattern. The comment says to avoid the case where zero bits are shifted into the truncated value, but the code checks that the shift is smaller than the truncated value instead of the number of bits added by the sign extension. Fixing this allows a shift by more than the value size to be introduced, which is undefined behavior, so the shift is capped at the value size minus one, which has the expected behavior of filling the value with the sign bit. Patch by Jacob Young! Differential Revision: https://reviews.llvm.org/D32285 llvm-svn: 302548	2017-05-09 16:24:59 +00:00
Hans Wennborg	66fb0d9768	Revert r302469 "Make it illegal for two Functions to point to the same DISubprogram" This caused PR32977. Original commit message: > Make it illegal for two Functions to point to the same DISubprogram > > As recently discussed on llvm-dev [1], this patch makes it illegal for > two Functions to point to the same DISubprogram and updates > FunctionCloner to also clone the debug info of a function to conform > to the new requirement. To simplify the implementation it also factors > out the creation of inlineAt locations from the Inliner into a > general-purpose utility in DILocation. > > [1] http://lists.llvm.org/pipermail/llvm-dev/2017-May/112661.html > <rdar://problem/31926379> > > Differential Revision: https://reviews.llvm.org/D32975 llvm-svn: 302533	2017-05-09 14:44:15 +00:00
Anna Thomas	0691483435	[LV] Fix insertion point for shuffle vectors in first order recurrence Summary: In first order recurrence vectorization, when the previous value is a phi node, we need to set the insertion point to the first non-phi node. We can have the previous value being a phi node, due to the generation of new IVs as part of trunc optimization [1]. [1] https://reviews.llvm.org/rL294967 Reviewers: mssimpso, mkuper Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D32969 llvm-svn: 302532	2017-05-09 14:29:33 +00:00
Amara Emerson	cf9daa33a7	Introduce experimental generic intrinsics for horizontal vector reductions. - This change allows targets to opt-in to using them instead of the log2 shufflevector algorithm. - The SLP and Loop vectorizers have the common code to do shuffle reductions factored out into LoopUtils, and now have a unified interface for generating reductions regardless of the preference of the target. LoopUtils now uses TTI to determine what kind of reductions the target wants to handle. - For CodeGen, basic legalization support is added. Differential Revision: https://reviews.llvm.org/D30086 llvm-svn: 302514	2017-05-09 10:43:25 +00:00
Sanjoy Das	76bbdd1a16	[InstNamer] Use range-for llvm-svn: 302481	2017-05-08 23:18:43 +00:00
Sanjoy Das	0ac3bf1cc8	[InstNamer] Don't check type of arguments (they're never void) llvm-svn: 302480	2017-05-08 23:18:39 +00:00
Sanjoy Das	7be961b081	Delete trailing whitespace llvm-svn: 302479	2017-05-08 23:18:36 +00:00
Adrian Prantl	200a5ef526	Make it illegal for two Functions to point to the same DISubprogram As recently discussed on llvm-dev [1], this patch makes it illegal for two Functions to point to the same DISubprogram and updates FunctionCloner to also clone the debug info of a function to conform to the new requirement. To simplify the implementation it also factors out the creation of inlineAt locations from the Inliner into a general-purpose utility in DILocation. [1] http://lists.llvm.org/pipermail/llvm-dev/2017-May/112661.html <rdar://problem/31926379> Differential Revision: https://reviews.llvm.org/D32975 llvm-svn: 302469	2017-05-08 21:17:08 +00:00
Sanjay Patel	a1c8814891	[InstCombine] add folds for not-of-shift-right This is another step towards getting rid of dyn_castNotVal, so we can recommit: https://reviews.llvm.org/rL300977 As the tests show, we were missing the lshr case for constants and both ashr/lshr vector splat folds. The ashr case with constant was being performed inefficiently in 2 steps. It's also possible there was a latent bug in that case because we can't do that fold if the constant is positive: http://rise4fun.com/Alive/Bge llvm-svn: 302465	2017-05-08 20:49:59 +00:00
Davide Italiano	aa42a10051	[PartialInlining] Capture by reference rather than by value. llvm-svn: 302464	2017-05-08 20:44:01 +00:00
Sanjay Patel	2a06263036	[InstCombine] use local variable to reduce code duplication; NFCI llvm-svn: 302438	2017-05-08 16:33:42 +00:00
Sanjay Patel	2df38a80f1	[InstCombine/InstSimplify] add comments about code duplication; NFC llvm-svn: 302436	2017-05-08 16:21:55 +00:00
Craig Topper	7e3e7afca8	[ConstantRange][SimplifyCFG] Add a helper method to allow SimplifyCFG to determine if a ConstantRange has more than 8 elements without requiring an allocation if the ConstantRange is 64-bits wide. Previously SimplifyCFG used getSetSize which returns an APInt that is 1 bit wider than the ConstantRange's bit width. In the reasonably common case that the ConstantRange is 64-bits wide, this requires returning a 65-bit APInt. APInt's can only store 64-bits without a memory allocation so this is inefficient. The new method takes the 8 as an input and tells if the range contains more than that many elements without requiring any wider math. llvm-svn: 302385	2017-05-07 22:22:11 +00:00
Sanjay Patel	599e65b1ff	[InstSimplify] use ConstantRange to simplify or-of-icmps We can simplify (or (icmp X, C1), (icmp X, C2)) to 'true' or one of the icmps in many cases. I had to check some of these with Alive to prove to myself it's right, but everything seems to check out. Eg, the deleted code in instcombine was completely ignoring predicates with mismatched signedness. This is a follow-up to: https://reviews.llvm.org/rL301260 https://reviews.llvm.org/D32143 llvm-svn: 302370	2017-05-07 15:11:40 +00:00
Kostya Serebryany	424bfed693	[sanitizer-coverage] implement -fsanitize-coverage=no-prune,... instead of a hidden -mllvm flag. llvm part. llvm-svn: 302319	2017-05-05 23:14:40 +00:00
Craig Topper	a49e768977	Fix spelling error in command line option description. NFC llvm-svn: 302311	2017-05-05 22:31:11 +00:00
Matthias Braun	60b40b8fec	TargetLibraryInfo: Introduce wcslen wcslen is part of the C99 and C++98 standards. - This introduces the function to TargetLibraryInfo. - Also set attributes for wcslen in llvm::inferLibFuncAttributes(). Differential Revision: https://reviews.llvm.org/D32837 llvm-svn: 302278	2017-05-05 20:25:50 +00:00
Craig Topper	f0aeee01c3	[KnownBits] Add wrapper methods for setting and clearing all bits in the underlying APInts in KnownBits. This adds routines for resetting KnownBits to unknown, making the value all zeros or all ones. It also adds methods for querying if the value is zero, all ones or unknown. Differential Revision: https://reviews.llvm.org/D32637 llvm-svn: 302262	2017-05-05 17:36:09 +00:00
Craig Topper	fc481e5eb7	[Float2Int] Replace a ConstantRange copy with a move. Remove an extra call to MapVector::find. llvm-svn: 302256	2017-05-05 17:09:29 +00:00
Aditya Kumar	1c42d135e1	[LoopIdiom] check for safety while expanding Loop Idiom recognition was generating memset in a case that would result in generating a division operation to an unsafe location. Differential Revision: https://reviews.llvm.org/D32674 llvm-svn: 302238	2017-05-05 14:49:45 +00:00
Evgeniy Stepanov	9aff829f78	Remap metadata attached to global variables. Fix for PR32577. Global variables may have !associated metadata, which includes a reference to another global. It needs remapping. llvm-svn: 302203	2017-05-04 23:29:39 +00:00
Craig Topper	1f673d4450	[JumpThreading] When processing compares, explicitly check that the result type is not a vector rather than check for it being an integer. Compares always return a scalar integer or vector of integers. isIntegerTy returns false for vectors, but that's not completely obvious. So using isVectorTy is less confusing. llvm-svn: 302198	2017-05-04 21:45:49 +00:00
Craig Topper	930689ada4	[JumpThreading] Change a dyn_cast that is already protected by an isa check to a static cast. Combine it with another static cast. NFC Differential Revision: https://reviews.llvm.org/D32874 llvm-svn: 302197	2017-05-04 21:45:45 +00:00
Craig Topper	5974dadc69	[Float2Int] Remove return of ConstantRange from seen method. Nothing uses it so it just creates and discards a ConstantRange object for no reason. llvm-svn: 302193	2017-05-04 21:29:45 +00:00
Peter Collingbourne	9667b91b13	Re-apply r302108, "IR: Use pointers instead of GUIDs to represent edges in the module summary. NFCI." with a fix for the clang backend. llvm-svn: 302176	2017-05-04 18:03:25 +00:00
Davide Italiano	94bf7846fd	[NewGVN] Remove unneeded newline and format assertions. NFCI. llvm-svn: 302173	2017-05-04 17:26:15 +00:00
Eric Liu	f6039f255e	Revert "IR: Use pointers instead of GUIDs to represent edges in the module summary. NFCI." This reverts commit r302108. This causes a crash in clang bootstrap with LTO. Contacted the author in the original commit. llvm-svn: 302140	2017-05-04 11:49:39 +00:00
Martin Storsjo	e81233d0ed	[ArgPromotion] Fix a truncated variable This fixes a regression since SVN rev 273808 (which was supposed to not change functionality). The regression caused miscompilations (noted in the wild when targeting AArch64) on platforms with 32 bit long. Differential Revision: https://reviews.llvm.org/D32850 llvm-svn: 302137	2017-05-04 10:54:35 +00:00
Jonas Paulsson	8bf1fdcc91	Use right function in LoopVectorize. - unsigned AS = getMemInstAlignment(I); + unsigned AS = getMemInstAddressSpace(I); Review: Hal Finkel llvm-svn: 302114	2017-05-04 05:31:56 +00:00
Peter Collingbourne	5f85a9deda	IR: Use pointers instead of GUIDs to represent edges in the module summary. NFCI. When profiling a no-op incremental link of Chromium I found that the functions computeImportForFunction and computeDeadSymbols were consuming roughly 10% of the profile. The goal of this change is to improve the performance of those functions by changing the map lookups that they were previously doing into pointer dereferences. This is achieved by changing the ValueInfo data structure to be a pointer to an element of the global value map owned by ModuleSummaryIndex, and changing reference lists in the GlobalValueSummary to hold ValueInfos instead of GUIDs. This means that a ValueInfo will take a client directly to the summary list for a given GUID. Differential Revision: https://reviews.llvm.org/D32471 llvm-svn: 302108	2017-05-04 03:36:16 +00:00
Craig Topper	cff357c322	[InstCombine][KnownBits] Use KnownBits better to detect nsw adds Change checkRippleForAdd from a heuristic to a full check - if it is provable that the add does not overflow return true, otherwise false. Patch by Yoav Ben-Shalom Differential Revision: https://reviews.llvm.org/D32686 llvm-svn: 302093	2017-05-03 23:22:46 +00:00
Craig Topper	8189a87a1e	[KnownBits] Add methods for determining if KnownBits is a constant value This patch adds isConstant and getConstant for determining if KnownBits represents a constant value and to retrieve the value. Use them to simplify code. Differential Revision: https://reviews.llvm.org/D32785 llvm-svn: 302091	2017-05-03 23:12:29 +00:00
Craig Topper	d938fd1397	[KnownBits] Add zext, sext, and trunc methods to KnownBits This patch adds zext, sext, and trunc methods to KnownBits and uses them where possible. Differential Revision: https://reviews.llvm.org/D32784 llvm-svn: 302088	2017-05-03 22:07:25 +00:00
Xin Tong	46fb813ac3	[TailCallElim] Remove an unused argument. NFCI llvm-svn: 302080	2017-05-03 20:37:07 +00:00
Anna Thomas	f475fa3575	Avoid warning of unused variable in release builds. NFC llvm-svn: 302068	2017-05-03 19:25:04 +00:00
Sanjoy Das	23f314d04f	Fix typos in comment llvm-svn: 302063	2017-05-03 18:29:34 +00:00
Anna Thomas	d4c0295cc8	Fix PPC64 warning for missing parentheses. NFC. llvm-svn: 302061	2017-05-03 18:25:43 +00:00
Reid Kleckner	a0b45f4bfc	[IR] Abstract away ArgNo+1 attribute indexing as much as possible Summary: Do three things to help with that: - Add AttributeList::FirstArgIndex, which is an enumerator currently set to 1. It allows us to change the indexing scheme with fewer changes. - Add addParamAttr/removeParamAttr. This just shortens addAttribute call sites that would otherwise need to spell out FirstArgIndex. - Remove some attribute-specific getters and setters from Function that take attribute list indices. Most of these were only used from BuildLibCalls, and doesNotAlias was only used to test or set if the return value is malloc-like. I'm happy to split the patch, but I think they are probably easier to review when taken together. This patch should be NFC, but it sets the stage to change the indexing scheme to this, which is more convenient when indexing into an array: 0: func attrs 1: retattrs 2...: arg attrs Reviewers: chandlerc, pete, javed.absar Subscribers: david2050, llvm-commits Differential Revision: https://reviews.llvm.org/D32811 llvm-svn: 302060	2017-05-03 18:17:31 +00:00
Anna Thomas	ac0ec2240b	[RuntimeLoopUnroller] Add assert that we don't unroll non-rotated loops Summary: Cloning basic blocks in the loop for runtime loop unroller depends on loop being in rotated form (i.e. loop latch target is the exit block). Assert that this is true, so that callers of runtime loop unroller pass in canonical loops. The single caller of this function has that check recently added: https://reviews.llvm.org/rL301239 Reviewers: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32801 llvm-svn: 302058	2017-05-03 17:43:59 +00:00
Anna Thomas	53c8d95c85	[Loop Deletion] Delete loops that are never executed Summary: Currently, loop deletion deletes loops where the only values that are used outside the loop are loop-invariant. This patch adds logic to delete loops where the loop is proven to be never executed (i.e. the only predecessor of the loop preheader has a constant conditional branch as terminator, and the preheader is not the taken target). This will remove loops that become dead after loop-unswitching generates constant conditional branches. The next steps are: 1. moving the loop deletion implementation to LoopUtils. 2. Add logic in loop-simplifyCFG which will support changing conditional constant branches to unconditional branches. If loops become unreachable in this process, they can be removed using `deleteDeadLoop` function. Reviewers: chandlerc, efriedma, sanjoy, reames Reviewed by: sanjoy Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D32494 llvm-svn: 302015	2017-05-03 11:47:11 +00:00
Matt Arsenault	6a288c1e32	Replace hardcoded intrinsic list with speculatable attribute. No change in which intrinsics should be speculated. llvm-svn: 301995	2017-05-03 02:26:10 +00:00
Reid Kleckner	ee4930b688	Re-land r301697 "[IR] Make add/remove Attributes use AttrBuilder instead of AttributeList" This time, I fixed, built, and tested clang. This reverts r301712. llvm-svn: 301981	2017-05-02 22:07:37 +00:00
Davide Italiano	839c7e6cfb	[NewGVN] Fix typo and format comment. NFCI. llvm-svn: 301974	2017-05-02 21:11:40 +00:00
Xinliang David Li	ab8722f80a	[PartialInlining] Add more early filtering This is a follow up to the previous inline cost patch for quicker filtering. llvm-svn: 301959	2017-05-02 18:43:21 +00:00
Matt Arsenault	9ac7d6be3c	SpeculativeExecution: Stop using whitelist for costs Just let TTI's cost do this instead of arbitrarily restricting this. llvm-svn: 301950	2017-05-02 18:02:18 +00:00
Sanjay Patel	6381db18fe	[InstCombine] don't use DeMorgan's Law on integer constants (2nd try) This was originally checked in here: https://reviews.llvm.org/rL301923 And reverted here: https://reviews.llvm.org/rL301924 Because there's a clang test that would fail after this. I fixed/removed the offending CHECK lines in: https://reviews.llvm.org/rL301928 So let's try this again. Original commit message: This is the fold that causes the infinite loop in BoringSSL (https://github.com/google/boringssl/blob/master/crypto/cipher/e_rc2.c) when we fix instcombine demanded bits to prefer 'not' ops as in https://reviews.llvm.org/D32255. There are 2 or 3 problems with dyn_castNotVal, and I don't think we can reinstate https://reviews.llvm.org/D32255 until dyn_castNotVal is completely eliminated. 1. As shown here, it transforms 'not' into random xor. This transform is harmful to SCEV and codegen because 'not' can often be folded while random xor cannot. 2. It does not transform vector constants. This is actually a good thing, but if you don't believe the above argument, then we shouldn't have excluded vectors. 3. It tries to avoid transforming not(not(X)). That's nice, but it doesn't match the greedy nature of instcombine. If we DeMorganize a pattern that has an extra 'not' in it: ~(~(~X) & Y) --> (~X | ~Y) That's just another case of DeMorgan, so we should trust that we'll fold that pattern too: (~X | ~ Y) --> ~(X & Y) Differential Revision: https://reviews.llvm.org/D32665 llvm-svn: 301929	2017-05-02 15:31:40 +00:00
Sanjay Patel	da0b4deafa	revert r301923 : [InstCombine] don't use DeMorgan's Law on integer constants There's a clang test that is wrongly using -O1 and failing after this commit. llvm-svn: 301924	2017-05-02 14:48:23 +00:00
Sanjay Patel	096a981982	[InstCombine] don't use DeMorgan's Law on integer constants This is the fold that causes the infinite loop in BoringSSL (https://github.com/google/boringssl/blob/master/crypto/cipher/e_rc2.c) when we fix instcombine demanded bits to prefer 'not' ops as in D32255. There are 2 or 3 problems with dyn_castNotVal, and I don't think we can reinstate D32255 until dyn_castNotVal is completely eliminated. 1. As shown here, it transforms 'not' into random xor. This transform is harmful to SCEV and codegen because 'not' can often be folded while random xor cannot. 2. It does not transform vector constants. This is actually a good thing, but if you don't believe the above argument, then we shouldn't have excluded vectors. 3. It tries to avoid transforming not(not(X)). That's nice, but it doesn't match the greedy nature of instcombine. If we DeMorganize a pattern that has an extra 'not' in it: ~(~(~X) & Y) --> (~X | ~Y) That's just another case of DeMorgan, so we should trust that we'll fold that pattern too: (~X | ~ Y) --> ~(X & Y) Differential Revision: https://reviews.llvm.org/D32665 llvm-svn: 301923	2017-05-02 14:31:30 +00:00
Xinliang David Li	6133846be1	[PartialInlining] Hook up inline cost analysis Differential Revision: http://reviews.llvm.org/D32666 llvm-svn: 301894	2017-05-02 02:44:14 +00:00
Xin Tong	a41bf70bea	Empty Space. NFC llvm-svn: 301878	2017-05-01 23:08:19 +00:00
Davide Italiano	2dfd46bf08	[NewGVN] Don't derive incorrect implications. In the testcase attached, we believe %tmp1 implies %tmp4. where: br i1 %tmp1, label %bb2, label %bb7 br i1 %tmp4, label %bb5, label %bb7 because while looking at PredicateInfo stuffs we end up calling isImpliedTrueByMatchingCmp() with the arguments backwards. Differential Revision: https://reviews.llvm.org/D32718 llvm-svn: 301849	2017-05-01 22:26:28 +00:00
Sanjay Patel	59d0aeaafe	[InstCombine] check one-use before applying DeMorgan nor/nand folds If we have ~(~X & Y), it only makes sense to transform it to (X | ~Y) when we do not need the intermediate (~X & Y) value. In that case, we would need an extra instruction to generate ~Y + 'or' (as shown in the test changes). It's ok if we have multiple uses of ~X or Y, however. In those cases, we may not reduce the instruction count or critical path, but we might improve throughput because we can generate ~X and ~Y in parallel. Whether that actually makes perf sense or not for a target is something we can't answer in IR. Differential Revision: https://reviews.llvm.org/D32703 llvm-svn: 301848	2017-05-01 22:25:42 +00:00
Peter Collingbourne	a992f53099	IPO: Add missing build dep. llvm-svn: 301835	2017-05-01 20:57:20 +00:00
Peter Collingbourne	c15d60b772	Object: Remove ModuleSummaryIndexObjectFile class. Differential Revision: https://reviews.llvm.org/D32195 llvm-svn: 301832	2017-05-01 20:42:32 +00:00
Xin Tong	a4b9b9f42a	Take indirect branch into account as well when folding. We may not be able to rewrite indirect branch target, but we also want to take it into account when folding, i.e. if it and all its successor's predecessors go to the same destination, we can fold, i.e. no need to thread. llvm-svn: 301816	2017-05-01 17:15:37 +00:00
Sanjoy Das	e6bca0eecb	Rename WeakVH to WeakTrackingVH; NFC This relands r301424. llvm-svn: 301812	2017-05-01 17:07:49 +00:00
Xin Tong	99dce428bc	[JumpThread] Add some assertions for expected ConstantInt/BlockAddress llvm-svn: 301808	2017-05-01 16:19:59 +00:00
Xin Tong	21f8ac235e	[JumpThread] Do RAUW in case Cond folds to a constant in the CFG Summary: [JumpThread] Do RAUW in case Cond folds to a constant in the CFG Reviewers: sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32407 llvm-svn: 301804	2017-05-01 15:34:17 +00:00
Sanjoy Das	08989c7ecd	Rename isKnownNotFullPoison to programUndefinedIfPoison; NFC Summary: programUndefinedIfPoison makes more sense, given what the function does; and I'm about to add a function with a name similar to isKnownNotFullPoison (so do the rename to avoid confusion). Reviewers: broune, majnemer, bjarke.roune Reviewed By: broune Subscribers: mcrosier, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D30444 llvm-svn: 301776	2017-04-30 19:41:19 +00:00
Craig Topper	ca48af3c87	[KnownBits] Add methods for determining if the known bits represent a negative/nonnegative number and add methods for changing the negative/nonnegative state Summary: This patch adds isNegative, isNonNegative for querying whether the sign bit is known. It also adds makeNegative and makeNonNegative for controlling the sign bit. Reviewers: RKSimon, spatel, davide Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32651 llvm-svn: 301747	2017-04-29 16:43:11 +00:00
Akira Hatanaka	6fdcb3c2ce	[ObjCARC] Do not move a release between a call and a retainAutoreleasedReturnValue that retains the returned value. This commit fixes a bug in ARC optimizer where it moves a release between a call and a retainAutoreleasedReturnValue, causing the returned object to be released before the retainAutoreleasedReturnValue can retain it. This commit accomplishes that by doing a lookahead and checking whether the call prevents the release from moving upwards. In the long term, we should treat the region between the retainAutoreleasedReturnValue and the call as a critical section and disallow moving anything there (possibly using operand bundles). rdar://problem/20449878 llvm-svn: 301724	2017-04-29 00:23:11 +00:00
Davide Italiano	0aaa96a07b	[LoopUnswitch] Make DEBUG output more readable (part 2). I fixed my miscompile in r301722 and I hope I don't have to take a look at this code again now that Chandler has a new LoopUnswitch pass, but maybe this could be of use for somebody else in the meanwhile. llvm-svn: 301723	2017-04-29 00:18:26 +00:00
Davide Italiano	534e314356	[LoopUnswitch] Don't remove instructions with side effects. This fixes PR32818. Differential Revision: https://reviews.llvm.org/D32664 llvm-svn: 301722	2017-04-29 00:12:18 +00:00
Hans Wennborg	0f88d863b4	Revert r301697 "[IR] Make add/remove Attributes use AttrBuilder instead of AttributeList" This broke the Clang build. (Clang-side patch missing?) Original commit message: > [IR] Make add/remove Attributes use AttrBuilder instead of > AttributeList > > This change cleans up call sites and avoids creating temporary > AttributeList objects. > > NFC llvm-svn: 301712	2017-04-28 23:01:32 +00:00
Matt Arsenault	e0f9e984fd	InferAddressSpaces: Search constant expressions for addrspacecasts These are pretty common when using local memory, and the 64-bit generic addressing is much more expensive to compute. llvm-svn: 301711	2017-04-28 22:52:41 +00:00
Matt Arsenault	c20ccd2c02	InferAddressSpaces: Avoid looking up deleted values While looking at pure addressing expressions, it's possible for the value to appear later in Postorder. I haven't been able to come up with a testcase where this exhibits an actual issue, but if you insert a dump before the value map lookup, a few testcases crash. llvm-svn: 301705	2017-04-28 22:18:19 +00:00
Matt Arsenault	a1e734050c	InferAddressSpaces: Infer from just addrspacecasts Eliminates some more cases where some subset of the addressing computation remains flat. Some cases with addrspacecasts in nested constant expressions are still left behind however. llvm-svn: 301704	2017-04-28 22:18:08 +00:00
Daniel Berlin	98a1de85cb	LoopRotate: Fix use after scope bug llvm-svn: 301702	2017-04-28 22:05:55 +00:00
Reid Kleckner	608c8b63b3	[IR] Make add/remove Attributes use AttrBuilder instead of AttributeList This change cleans up call sites and avoids creating temporary AttributeList objects. NFC llvm-svn: 301697	2017-04-28 21:48:28 +00:00
Davide Italiano	e27cb87754	[LoopUnswitch] Make DEBUG output more readable. While debugging a miscompile I realized loopunswitch doesn't put newlines when printing the instruction being replaced. Ending up with a single line with many instructions replaced isn't the best for readability and/or mental sanity. llvm-svn: 301692	2017-04-28 21:30:50 +00:00
Reid Kleckner	859f8b544a	Make getParamAlignment use argument numbers The method is called "get Param Alignment", and is only used for return values exactly once, so it should take argument indices, not attribute indices. Avoids confusing code like: IsSwiftError = CS->paramHasAttr(ArgIdx, Attribute::SwiftError); Alignment = CS->getParamAlignment(ArgIdx + 1); Add getRetAlignment to handle the one case in Value.cpp that wants the return value alignment. This is a potentially breaking change for out-of-tree backends that do their own call lowering. llvm-svn: 301682	2017-04-28 20:34:27 +00:00
Daniel Berlin	4d0fe64ae3	Kill off the old SimplifyInstruction API by converting remaining users. llvm-svn: 301673	2017-04-28 19:55:38 +00:00
Davide Italiano	b6681e2b4e	[IPO/MergeFunctions] This function is used only under DEBUG(). llvm-svn: 301672	2017-04-28 19:39:45 +00:00
Reid Kleckner	99351967c7	[RS4GC] Simplify attribute handling code NFC Avoids use of AttributeList::getNumSlots, making it easier to change the underlying implementation. llvm-svn: 301671	2017-04-28 19:22:40 +00:00
Reid Kleckner	6652a52e2b	Use Argument::hasAttribute and AttributeList::ReturnIndex more This eliminates many extra 'Idx' induction variables in loops over arguments in CodeGen/ and Target/. It also reduces the number of places where we assume that ReturnIndex is 0 and that we should add one to argument numbers to get the corresponding attribute list index. NFC llvm-svn: 301666	2017-04-28 18:37:16 +00:00
Adrian Prantl	109b236850	Clean up DIExpression::prependDIExpr a little. (NFC) llvm-svn: 301662	2017-04-28 17:51:05 +00:00
Craig Topper	24db6b800f	[APInt] Add clearSignBit method. Use it and setSignBit in a few places. NFCI llvm-svn: 301656	2017-04-28 16:58:05 +00:00
Teresa Johnson	51177295c4	Memory intrinsic value profile optimization: Avoid divide by 0 Summary: Skip memops if the total value profiled count is 0, we can't correctly scale up the counts and there is no point anyway. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32624 llvm-svn: 301645	2017-04-28 14:30:54 +00:00
Andrew Ng	03e35b6bc0	[DebugInfo][X86] Improve X86 Optimize LEAs handling of debug values. This is a follow up to the fix in r298360 to improve the handling of debug values when redundant LEAs are removed. The fix in r298360 effectively discarded the debug values. This patch now attempts to preserve the debug values by using the DWARF DW_OP_stack_value operation via prependDIExpr. Moved functions appendOffset and prependDIExpr from Local.cpp to DebugInfoMetadata.cpp and made them available as static member functions of DIExpression. Differential Revision: https://reviews.llvm.org/D31604 llvm-svn: 301630	2017-04-28 08:44:30 +00:00
Max Kazantsev	531db9a504	[EarlyCSE] Mark the condition of assume intrinsic as true EarlyCSE should not just ignore assumes. It should use the fact that its condition is true for all dominated instructions. Reviewers: sanjoy, reames, apilipenko, anna, skatkov Reviewed By: reames, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32482 llvm-svn: 301625	2017-04-28 06:25:39 +00:00
Max Kazantsev	0589d9fa0f	[EarlyCSE] Remove guards with conditions known to be true If a condition is calculated only once, and there are multiple guards on this condition, we should be able to remove all guards dominated by the first of them. This patch allows EarlyCSE to try to find the condition of a guard among the known values, and if it is true, remove the guard. Otherwise we keep the guard and mark its condition as 'true' for future consideration. Reviewers: sanjoy, reames, apilipenko, skatkov, anna, dberlin Reviewed By: reames, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32476 llvm-svn: 301623	2017-04-28 06:05:48 +00:00
Craig Topper	24e71017aa	[APInt] Use inplace shift methods where possible. NFCI llvm-svn: 301612	2017-04-28 03:36:24 +00:00
Davide Italiano	81a26da1e5	[SROA] Fix nondeterminism exposed by Simon's r299221. Use a SmallSetVector instead of a SmallPtrSet as iterating over the latter is not stable ('<' relies on addresses). llvm-svn: 301599	2017-04-27 23:09:01 +00:00
Sanjay Patel	73d8c43da8	[InstCombine] fix matcher to bind to specific operand (PR32830) Matching any random value would be very wrong: https://bugs.llvm.org/show_bug.cgi?id=32830 llvm-svn: 301594	2017-04-27 21:55:03 +00:00
Evgeniy Stepanov	964f4663c4	[asan] Fix dead stripping of globals on Linux. Use a combination of !associated, comdat, @llvm.compiler.used and custom sections to allow dead stripping of globals and their asan metadata. Sometimes. Currently this works on LLD, which supports SHF_LINK_ORDER with sh_link pointing to the associated section. This also works on BFD, which seems to treat comdats as all-or-nothing with respect to linker GC. There is a weird quirk where the "first" global in each link is never GC-ed because of the section symbols. At this moment it does not work on Gold (as in the globals are never stripped). This is a second re-land of r298158. This time, this feature is limited to -fdata-sections builds. llvm-svn: 301587	2017-04-27 20:27:27 +00:00
Evgeniy Stepanov	716f0ff222	[asan] Put ctor/dtor in comdat. When possible, put ASan ctor/dtor in comdat. The only reason not to is global registration, which can be TU-specific. This is not the case when there are no instrumented globals. This is also limited to ELF targets, because MachO does not have comdat, and COFF linkers may GC comdat constructors. The benefit of this is a lot less __asan_init() calls: one per DSO instead of one per TU. It's also necessary for the upcoming gc-sections-for-globals change on Linux, where multiple references to section start symbols trigger quadratic behaviour in gold linker. This is a second re-land of r298756. This time with a flag to disable the whole thing to avoid a bug in the gold linker: https://sourceware.org/bugzilla/show_bug.cgi?id=19002 llvm-svn: 301586	2017-04-27 20:27:23 +00:00
Chandler Carruth	1353f9a48b	[PM/LoopUnswitch] Introduce a new, simpler loop unswitch pass. Currently, this pass only focuses on trivial loop unswitching. At that reduced problem it remains significantly better than the current loop unswitch: - Old pass is worse than cubic complexity. New pass is (I think) linear. - New pass is much simpler in its design by focusing on full unswitching. (See below for details on this). - New pass doesn't carry state for thresholds between pass iterations. - New pass doesn't carry state for correctness (both miscompile and infloop) between pass iterations. - New pass produces substantially better code after unswitching. - New pass can handle more trivial unswitch cases. - New pass doesn't recompute the dominator tree for the entire function and instead incrementally updates it. I've ported all of the trivial unswitching test cases from the old pass to the new one to make sure that major functionality isn't lost in the process. For several of the test cases I've worked to improve the precision and rigor of the CHECKs, but for many I've just updated them to handle the new IR produced. My initial motivation was the fact that the old pass carried state in very unreliable ways between pass iterations, and these mechanisms were incompatible with the new pass manager. However, I discovered many more improvements to make along the way. This pass makes two very significant assumptions that enable most of these improvements: 1) Focus on full unswitching -- that is, completely removing whatever control flow construct is being unswitched from the loop. In the case of trivial unswitching, this means removing the trivial (exiting) edge. In non-trivial unswitching, this means removing the branch or switch itself. This is in opposition to partial unswitching where some part of the unswitched control flow remains in the loop. Partial unswitching only really applies to switches and to folded branches.
These are very similar to full unrolling and partial unrolling. The full form is an effective canonicalization, the partial form needs a complex cost model, cannot be iterated, isn't canonicalizing, and should be a separate pass that runs very late (much like unrolling). 2) Leverage LLVM's Loop machinery to the fullest. The original unswitch dates from a time when a great deal of LLVM's loop infrastructure was missing, ineffective, and/or unreliable. As a consequence, a lot of complexity was added which we no longer need. With these two overarching principles, I think we can build a fast and effective unswitcher that fits in well in the new PM and in the canonicalization pipeline. Some of the remaining functionality around partial unswitching may not be relevant today (not many test cases or benchmarks I can find) but if they are I'd like to add support for them as a separate layer that runs very late in the pipeline. Purely to make reviewing and introducing this code more manageable, I've split this into first a trivial-unswitch-only pass and in the next patch I'll add support for full non-trivial unswitching against a fixed threshold, exactly like full unrolling. I even plan to re-use the unrolling thresholds, as these are incredibly similar cost tradeoffs: we're cloning a loop body in order to end up with simplified control flow. We should only do that when the total growth is reasonably small. One of the biggest changes with this pass compared to the previous one is that previously, each individual trivial exiting edge from a switch was unswitched separately as a branch. Now, we unswitch the entire switch at once, with cases going to the various destinations. This lets us unswitch multiple exiting edges in a single operation and also avoids numerous extremely bad behaviors, where we would introduce 1000s of branches to test for thousands of possible values, all of which would take the exact same exit path bypassing the loop. 
Now we will use a switch with 1000s of cases that can be efficiently lowered into a jumptable. This avoids relying on somehow forming a switch out of the branches or getting horrible code if that fails for any reason. Another significant change is that this pass actively updates the CFG based on unswitching. For trivial unswitching, this is actually very easy because of the definition of loop simplified form. Doing this makes the code coming out of loop unswitch dramatically more friendly. We still should run loop-simplifycfg (at the least) after this to clean up, but it will have to do a lot less work. Finally, this pass makes much fewer attempts to simplify instructions based on the unswitch. Something like loop-instsimplify, instcombine, or GVN can be used to do increasingly powerful simplifications based on the now dominating predicate. The old simplifications are things that something like loop-instsimplify should get today or a very, very basic loop-instcombine could get. Keeping that logic separate is a big simplifying technique. Most of the code in this pass that isn't in the old one has to do with achieving specific goals: - Updating the dominator tree as we go - Unswitching all cases in a switch in a single step. I think it is still shorter than just the trivial unswitching code in the old pass despite having this functionality. Differential Revision: https://reviews.llvm.org/D32409 llvm-svn: 301576	2017-04-27 18:45:20 +00:00
Eli Friedman	10ab923b32	[GlobalOpt] Correctly update metadata when localizing a global. Just calling dropAllReferences leaves pointers to the ConstantExpr behind, so we would eventually crash with a null pointer dereference. Differential Revision: https://reviews.llvm.org/D32551 llvm-svn: 301575	2017-04-27 18:39:08 +00:00
Teresa Johnson	f9ea176f05	Memory intrinsic value profile optimization: Improve debug output (NFC) Summary: Misc improvements to debug output. Fix a couple typos and also dump the value profile before we make any profitability checks. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32607 llvm-svn: 301574	2017-04-27 18:25:22 +00:00
Xinliang David Li	d21601a929	[PartialInlining]: Improve partial inlining to handle complex conditions Differential Revision: http://reviews.llvm.org/D32249 llvm-svn: 301561	2017-04-27 16:34:00 +00:00
Craig Topper	9474e9b6c8	[InstCombine] Use APInt bit counting methods to avoid a temporary APInt. NFC llvm-svn: 301516	2017-04-27 04:51:25 +00:00
Chandler Carruth	c246a4c973	Disable GVN Hoist due to still more bugs being found in it. There is also a discussion about exactly what we should do prior to re-enabling it. The current bug is http://llvm.org/PR32821 and the discussion about this is in the review thread for r300200. llvm-svn: 301505	2017-04-27 00:28:03 +00:00
Davide Italiano	d7b2a9981c	[LibCallsShrinkWrap] Remove an unnecessary class member variable. llvm-svn: 301477	2017-04-26 21:28:40 +00:00
Davide Italiano	11817ba2ea	[LibCallsShrinkWrap] More descriptive assertion messages. Fix a typo while I'm here. llvm-svn: 301474	2017-04-26 21:21:02 +00:00
Davide Italiano	3c3785fd1f	[LibCallsShrinkWrap] Remove some temporary cl::opt(s). The pass has been on and working for a while. llvm-svn: 301473	2017-04-26 21:19:05 +00:00
Davide Italiano	6abada8ab8	[LibCallsShrinkWrap] Teach the pass how to preserve the dominator. llvm-svn: 301471	2017-04-26 21:05:40 +00:00
Daniel Berlin	ede130d490	NewGVN: Use new SimplifyQuery based API llvm-svn: 301466	2017-04-26 20:56:14 +00:00
Daniel Berlin	2c75c63063	InstCombine: Use the new SimplifyQuery versions of Simplify*. Use AssumptionCache, DominatorTree, TargetLibraryInfo everywhere. llvm-svn: 301464	2017-04-26 20:56:07 +00:00
Daniel Berlin	c9f0a4f1ec	CorrelatedValuePropagation: Rename a variable for consistency llvm-svn: 301435	2017-04-26 17:41:46 +00:00
Craig Topper	b45eabcf82	[ValueTracking] Introduce a KnownBits struct to wrap the two APInts for computeKnownBits This patch introduces a new KnownBits struct that wraps the two APInts used by computeKnownBits. This allows us to treat them as more of a unit. Initially I've just altered the signatures of computeKnownBits and InstCombine's simplifyDemandedBits to pass a KnownBits reference instead of two separate APInt references. I'll do similar to the SelectionDAG version of computeKnownBits/simplifyDemandedBits as a separate patch. I've added a constructor that allows initializing both APInts to the same bit width with a starting value of 0. This reduces the repeated pattern of initializing both APInts. One place default constructed the APInts so I added a default constructor for those cases. Going forward I would like to add more methods that will work on the pairs. For example trunc, zext, and sext occur on both APInts together in several places. We should probably add a clear method that can be used to clear both pieces. Maybe a method to check for conflicting information. A method to return (Zero|One) so we don't write it out everywhere. Maybe a method for (Zero|One).isAllOnesValue() to determine if all bits are known. I'm sure there are many other methods we can come up with. Differential Revision: https://reviews.llvm.org/D32376 llvm-svn: 301432	2017-04-26 16:39:58 +00:00
Sanjoy Das	2cbeb00f38	Reverts commit r301424, r301425 and r301426 Commits were: "Use WeakVH instead of WeakTrackingVH in AliasSetTracker's UnkownInsts" "Add a new WeakVH value handle; NFC" "Rename WeakVH to WeakTrackingVH; NFC" The changes assumed pointers are 8 byte aligned on all architectures. llvm-svn: 301429	2017-04-26 16:37:05 +00:00
Matthew Simpson	9eed0bee3d	[LV] Handle external uses of floating-point induction variables Reference: https://bugs.llvm.org/show_bug.cgi?id=32758 Differential Revision: https://reviews.llvm.org/D32445 llvm-svn: 301428	2017-04-26 16:23:02 +00:00
Sanjoy Das	01de557738	Rename WeakVH to WeakTrackingVH; NFC Summary: I plan to use WeakVH to mean "nulls itself out on deletion, but does not track RAUW" in a subsequent commit. Reviewers: dblaikie, davide Reviewed By: davide Subscribers: arsenm, mehdi_amini, mcrosier, mzolotukhin, jfb, llvm-commits, nhaehnle Differential Revision: https://reviews.llvm.org/D32266 llvm-svn: 301424	2017-04-26 16:20:52 +00:00
Haojian Wu	e43db0a834	Fix unused-variable warning caused by r301407. llvm-svn: 301411	2017-04-26 14:31:05 +00:00
Daniel Berlin	62aee14978	Convert LoopRotation to use SimplifyQuery version of SimplifyInstruction. Add AssumptionCache, DominatorTree, TLI if available. llvm-svn: 301407	2017-04-26 13:52:18 +00:00
Daniel Berlin	954006fde8	Convert SimplifyInstructions to use the SimplifyQuery version of SimplifyInstruction llvm-svn: 301406	2017-04-26 13:52:16 +00:00
Daniel Berlin	9bae449d78	Convert CVP to use SimplifyQuery version of SimplifyInstruction. Add AssumptionCache, DominatorTree, TLI if available. llvm-svn: 301405	2017-04-26 13:52:13 +00:00
Filipe Cabecinhas	92dc348773	Simplify the CFG after loop pass cleanup. Summary: Otherwise we might end up with some empty basic blocks or single-entry-single-exit basic blocks. This fixes PR32085 Reviewers: chandlerc, danielcdh Subscribers: mehdi_amini, RKSimon, llvm-commits Differential Revision: https://reviews.llvm.org/D30468 llvm-svn: 301395	2017-04-26 12:02:41 +00:00
Matthias Braun	c36a78c3f3	SimplifyLibCalls: Fix crash on memset(notmalloc()) rdar://31520787 llvm-svn: 301352	2017-04-25 19:44:25 +00:00
Stanislav Mekhanoshin	f2db5434be	Skip bitcasts while looking for GEP in LoadStoreVectorizer Differential Revision: https://reviews.llvm.org/D32101 llvm-svn: 301343	2017-04-25 18:00:08 +00:00
Craig Topper	09a5878d33	[InstCombine] Remove redundant code from SimplifyUsingDistributiveLaws The code I've removed here exists in ExpandBinOp in InstSimplify which we call into before SimplifyUsingDistributiveLaws. The code in InstSimplify looks to have been copied from here. I verified this code doesn't fire on any lit tests. Not that that proves its definitely dead. Differential Revision: https://reviews.llvm.org/D32472 llvm-svn: 301341	2017-04-25 17:54:12 +00:00
Craig Topper	f3dbd17d0a	[APInt] Use isSubsetOf, intersects, and bit counting methods to reduce temporary APInts This patch uses various APInt methods to reduce temporary APInt creation. This should be all of the unrelated cleanups that got buried in D32376(creating a KnownBits struct) as well as some pointed out by Simon during the review of that. Plus a few improvements to use counting instead of masking. I've left out any places where we do something like (KnownZero & KnownOne) != 0 as I plan to add a helper method to KnownBits to ask that question and didn't want to thrash that code an additional time. Differential Revision: https://reviews.llvm.org/D32495 llvm-svn: 301338	2017-04-25 17:46:30 +00:00
Davide Italiano	058abf1f61	[PM] Run IndirectCallPromotion only when PGO is enabled. Differential Revision: https://reviews.llvm.org/D32465 llvm-svn: 301327	2017-04-25 16:54:45 +00:00
Craig Topper	7603dce6b2	[InstCombine] Remove superfluous curly braces around a single line if body. NFC llvm-svn: 301326	2017-04-25 16:48:19 +00:00
Craig Topper	ba01143193	[InstCombine] Add missing commute handling to (A | B) & (B ^ (~A)) -> (A & B) The matching here wasn't able to handle all the possible commutes. It always assumed the not would be on the left of the xor, but that's not guaranteed. Differential Revision: https://reviews.llvm.org/D32474 llvm-svn: 301316	2017-04-25 15:19:04 +00:00
Andrew Ng	1606fc0bf9	[SimplifyLibCalls] Fix infinite loop with fast-math optimization. One of the fast-math optimizations is to replace calls to standard double functions with their float equivalents, e.g. exp -> expf. However, this can cause infinite loops for the following: float expf(float val) { return (float) exp((double) val); } A similar inline declaration exists in the MinGW-w64 math.h header file which when compiled with -O2/3 and fast-math generates infinite loops. So this fix checks that the calling function to the standard double function that is being replaced does not match the float equivalent. Differential Revision: https://reviews.llvm.org/D31806 llvm-svn: 301304	2017-04-25 12:36:14 +00:00
Craig Topper	c4b48a32f0	[InstCombine] Use commutable matchers to reduce some code. NFC llvm-svn: 301294	2017-04-25 06:02:11 +00:00
Gil Rapaport	860f0a2bad	[LV] Remove redundant basic block split This patch is part of D28975's breakdown. Generating the control-flow to guard predicated instructions is modified to only use SplitBlockAndInsertIfThen() for producing the if-then construct. Differential Revision: https://reviews.llvm.org/D32224 llvm-svn: 301293	2017-04-25 05:57:22 +00:00
Xinliang David Li	f12a0faf88	[CodeExtractor]: Fixup use refs of the old phi. Differential Revision: http://reviews.llvm.org/D32468 llvm-svn: 301291	2017-04-25 04:51:19 +00:00
Akira Hatanaka	490397fc08	[ObjCARC] Do not sink an objc_retain past a clang.arc.use. We need to do this to prevent a miscompile which sinks an objc_retain past an objc_release that releases the object objc_retain retains. This happens because the top-down and bottom-up traversals each determines the insert point for retain or release individually without knowing where the other instruction is moved. For example, when the following IR is fed to the ARC optimizer, the top-down traversal decides to insert objc_retain right before objc_release and the bottom-up traversal decides to insert objc_release right after clang.arc.use. (IR before ARC optimizer) %11 = call i8* @objc_retain(i8* %10) call void (...) @clang.arc.use(%0* %5) call void @llvm.dbg.value(...) call void @objc_release(i8* %6) This reverses the order of objc_release and objc_retain, which causes the object to be destructed prematurely. (IR after ARC optimizer) call void (...) @clang.arc.use(%0* %5) call void @objc_release(i8* %6) call void @llvm.dbg.value(...) %11 = call i8* @objc_retain(i8* %10) rdar://problem/30530580 llvm-svn: 301289	2017-04-25 04:06:35 +00:00
Davide Italiano	5b65f12bfa	[SimplifyLibCalls] Remove a cl::opt that's been `true` for a long time. llvm-svn: 301288	2017-04-25 03:48:47 +00:00
Matt Arsenault	6d7f01e3d8	InferAddressSpaces: Use reference arguments instead of pointers llvm-svn: 301276	2017-04-24 23:42:41 +00:00
Matt Arsenault	e8d0539f20	InferAddressSpaces: Remove redundant assert This is just asserting all the operations are handled in the switch, which the unreachable already handles. llvm-svn: 301270	2017-04-24 23:02:57 +00:00
Sanjay Patel	35c362ebbb	[InstSimplify] use ConstantRange to simplify more and-of-icmps We can simplify (and (icmp X, C1), (icmp X, C2)) to one of the icmps in many cases. I had to check some of these with Alive to prove to myself it's right, but everything seems to check out. Eg, the code in instcombine was completely ignoring predicates with mismatched signedness. Handling or-of-icmps would be a follow-up step. Differential Revision: https://reviews.llvm.org/D32143 llvm-svn: 301260	2017-04-24 21:52:39 +00:00
Teresa Johnson	b2c390e9f5	Update profile during memory intrinsic optimization Summary: Ensure that the new merge BB (which contains the rest of the original BB after the mem op being optimized) gets a profile frequency, in case there are additional mem ops later in the BB. Otherwise they get skipped as the merge BB looks cold. Reviewers: davidxl, xur Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32447 llvm-svn: 301244	2017-04-24 20:30:42 +00:00
Matt Arsenault	4474652c95	Revert "StructurizeCFG: Directly invert cmp instructions" This reverts commit r300732. This breaks a few tests. I think the problem is related to adding more uses of the condition that don't yet exist at this point. llvm-svn: 301242	2017-04-24 20:25:01 +00:00
Davide Italiano	ca81fbcadb	[LoopUnroll] Remove spurious newline. Eli pointed out in the review, but I didn't squash the two commits correctly. Pointy-hat to me. llvm-svn: 301241	2017-04-24 20:17:38 +00:00
Davide Italiano	0f62eea7ff	[LoopUnroll] Don't try to unroll non canonical loops. The current Loop Unroll implementation works with loops having a single latch that contains a conditional branch to a block outside the loop (the other successor is, by definition of latch, the header). If this precondition doesn't hold, avoid unrolling the loop as the code is not ready to handle such circumstances. Differential Revision: https://reviews.llvm.org/D32261 llvm-svn: 301239	2017-04-24 20:14:11 +00:00
Sanjoy Das	206f65c049	[LIR] Obey non-integral pointer semantics Summary: See http://llvm.org/docs/LangRef.html#non-integral-pointer-type Reviewers: haicheng Reviewed By: haicheng Subscribers: mcrosier, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D32196 llvm-svn: 301238	2017-04-24 20:12:10 +00:00
Evgeniy Stepanov	9e536081fe	[asan] Let the frontend disable gc-sections optimization for asan globals. Also extend -asan-globals-live-support flag to all binary formats. llvm-svn: 301226	2017-04-24 19:34:13 +00:00
Mandeep Singh Grang	799a2edb3d	[SimplifyCFG] Fix for non-determinism in codegen Summary: This patch fixes issues in codegen uncovered due to https://reviews.llvm.org/D26718 Reviewers: majnemer, chenli, davide Reviewed By: davide Subscribers: davide, arsenm, llvm-commits Differential Revision: https://reviews.llvm.org/D26726 llvm-svn: 301222	2017-04-24 19:20:45 +00:00
Evgeniy Stepanov	58ccc0949a	Revert "Compute safety information in a much finer granularity." Use-after-free in llvm::isGuaranteedToExecute. llvm-svn: 301214	2017-04-24 18:25:07 +00:00
Sanjay Patel	0889225f51	[InstSimplify] move (A & ~B) | (A ^ B) -> (A ^ B) from InstCombine This is a straight cut and paste, but there's a bigger problem: if this fold exists for simplifyOr, there should be a DeMorganized version for simplifyAnd. But more than that, we have a patchwork of ad hoc logic optimizations in InstCombine. There should be some structure to ensure that we're not missing sibling folds across and/or/xor. llvm-svn: 301213	2017-04-24 18:24:36 +00:00
Adrian Prantl	f2c7997013	Use DW_OP_stack_value when reconstructing variable values with arithmetic. When the location description of a source variable involves arithmetic on the value itself, it needs to be marked with DW_OP_stack_value since it is not describing the variable's location, but rather its value. This is a follow-up to r297971 and fixes the source testcase quoted in the comment in debuginfo-dce.ll. rdar://problem/30725338 This reapplies r301093 without modifications. llvm-svn: 301210	2017-04-24 18:11:42 +00:00
Matt Arsenault	02907f3039	InstCombine: Fix assert when reassociating fsub with undef There is logic to track the expected number of instructions produced. It thought in this case an instruction would be necessary to negate the result, but here it folded into a ConstantExpr fneg when the non-undef value operand was cancelled out by the second fsub. I'm not sure why we don't fold constant FP ops with undef currently, but I think that would also avoid this problem. llvm-svn: 301199	2017-04-24 17:24:37 +00:00
Xin Tong	a266923d57	Compute safety information in a much finer granularity. Summary: Instead of keeping a variable indicating whether there are early exits in the loop, we keep all the early exits. This improves LICM's ability to move instructions out of the loop based on is-guaranteed-to-execute. I am going to update compilation time as well soon. Reviewers: hfinkel, sanjoy, efriedma, mkuper Reviewed By: hfinkel Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D32433 llvm-svn: 301196	2017-04-24 17:12:22 +00:00
Nicolai Haehnle	9c66185315	InstCombine/AMDGPU: Fix constant folding of llvm.amdgcn.{icmp,fcmp} Summary: The return value of these intrinsics should always have 0 bits for inactive threads. This means that when all arguments are constant and the comparison evaluates to true, the intrinsic should return the current exec mask. Fixes some GL_ARB_shader_ballot tests. Reviewers: arsenm Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D32344 llvm-svn: 301195	2017-04-24 17:08:43 +00:00
Xinliang David Li	db8d09b6c2	[PartialInine]: add triaging options There are more bugs (runtime failures) triggered when partial inlining is turned on. Add options to help triaging problems. llvm-svn: 301148	2017-04-23 23:39:04 +00:00
Sanjay Patel	e0c26e0640	[InstCombine] add/move folds for [not]-xor We handled all of the commuted variants for plain xor already, although they were scattered around and sometimes folded less efficiently using distributive laws. We had no folds for not-xor. Handling all of these patterns consistently is part of trying to reinstate: https://reviews.llvm.org/rL300977 llvm-svn: 301144	2017-04-23 22:00:02 +00:00
Xinliang David Li	15744ad87b	[PartialInlining] Add optimization remark support Differential Revision: http://reviews.llvm.org/D32387 llvm-svn: 301143	2017-04-23 21:40:58 +00:00
Xin Tong	f98602a1ab	[JumpThread] We want to fold (not thread) when all predecessor go to single BB's successor. Summary: In case all predecessor go to a single successor of current BB. We want to fold (not thread). I failed to update the phi nodes properly in the last patch https://reviews.llvm.org/rL300657. Phi nodes values are per predecessor in LLVM. Reviewers: sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32400 llvm-svn: 301139	2017-04-23 20:56:29 +00:00
Xin Tong	b7b081262a	Correct grammar. NFC llvm-svn: 301135	2017-04-23 17:36:25 +00:00
Sanjay Patel	d13b0bfdac	[InstCombine] add pattern matches for commuted variants of xor-to-xor There's probably some better way to write this that eliminates the code duplication without hurting readability, but at least this eliminates the logic holes and is hopefully slightly more efficient than creating new instructions. llvm-svn: 301129	2017-04-23 16:03:00 +00:00
Renato Golin	4abfb3d741	Revert "[APInt] Fix a few places that use APInt::getRawData to operate within the normal API." This reverts commit r301105, 4, 3 and 1, as a follow up of the previous revert, which broke even more bots. For reference: Revert "[APInt] Use operator<<= where possible. NFC" Revert "[APInt] Use operator<<= instead of shl where possible. NFC" Revert "[APInt] Use ashInPlace where possible." PR32754. llvm-svn: 301111	2017-04-23 12:15:30 +00:00
Craig Topper	5f68af0806	[APInt] Use operator<<= instead of shl where possible. NFC llvm-svn: 301103	2017-04-23 05:18:31 +00:00
Davide Italiano	5da7090256	[ThinLTO/Summary] Rename anonymous globals as last action ... ... in the per-TU -O0 pipeline. The problem is that there could be passes registered using `addExtensionsToPM()` introducing unnamed globals. Asan is an example, but there may be others. Building cppcheck with `-flto=thin` and `-fsanitize=address` triggers an assertion while we're reading bitcode (in lib/LTO), as the BitcodeReader assumes there are no unnamed globals (because the namer has run). Unfortunately I wasn't able to find an easy way to test this. I added a comment in the hope nobody moves this again. llvm-svn: 301102	2017-04-23 04:49:34 +00:00
Adrian Prantl	4677205010	Revert "Use DW_OP_stack_value when reconstructing variable values with arithmetic." This reverts commit r301093 while investigating stage2 bot breakage. llvm-svn: 301099	2017-04-23 00:44:40 +00:00
Adrian Prantl	a2d25ac14a	Use DW_OP_stack_value when reconstructing variable values with arithmetic. When the location description of a source variable involves arithmetic on the value itself, it needs to be marked with DW_OP_stack_value since it is not describing the variable's location, but rather its value. This is a follow-up to r297971 and fixes the source testcase quoted in the comment in debuginfo-dce.ll. rdar://problem/30725338 llvm-svn: 301093	2017-04-22 20:54:06 +00:00
Xinliang David Li	016a82ba51	[PartialInlining] Using existing hasAddressTaken interface to legality check/NFC llvm-svn: 301090	2017-04-22 19:24:19 +00:00
Sanjay Patel	3b863f8a1e	[InstCombine] use 'match' to reduce code; NFCI The later uses of dyn_castNotVal in this block are either incomplete (doesn't handle vector constants) or overstepping (shouldn't handle constants at all), but this first use is just unnecessary. 'I' is obviously not a constant, and it can't be a not-of-a-not because that would already be instsimplified. llvm-svn: 301088	2017-04-22 18:05:35 +00:00
Artur Pilipenko	0632bdc648	Fix for PR32740 - Invalid floating type, unreachable between r300969 and r301029 The bug was introduced by r301018 "[InstCombine] fadd double (sitofp x), y check that the promotion is valid". The patch didn't expect that fadd can be on vectors not necessarily scalars. Add vector support along with the test. llvm-svn: 301070	2017-04-22 07:24:52 +00:00
Matt Arsenault	01d17e7c5f	LowerSwitch: Fix producing invalid IR on unreachable code If a switch was in an unreachable block that branched to a block with a phi, it would leave phis with missing predecessors. llvm-svn: 301064	2017-04-21 23:54:12 +00:00
Matt Arsenault	c07bda7b87	InferAddressSpaces: Infer for just GEPs Fixes leaving intermediate flat addressing computations where a GEP instruction's source is a constant expression. Still leaves behind a trivial addrspacecast + gep pair that instcombine is able to handle, which ideally could be folded here directly. llvm-svn: 301044	2017-04-21 21:35:04 +00:00
Xinliang David Li	0e9f6df169	[PartialInliner] Partial inliner needs to check use kind before transformation Differential Revision: https://reviews.llvm.org/D32373 llvm-svn: 301042	2017-04-21 21:20:56 +00:00
Sanjay Patel	8ce1d4cbe1	[InstCombine] revert r300977 and r301021 This can cause an inf-loop. Investigating... llvm-svn: 301035	2017-04-21 20:29:17 +00:00
Adrian Prantl	1a18f1ad10	typo llvm-svn: 301030	2017-04-21 20:06:41 +00:00
Sanjay Patel	0f001a4701	[InstCombine] use isSubsetOf() for efficiency C | ~D == -1 ~(C | ~D) == 0 ~C & D == 0 D & ~C == 0 D.isSubsetOf(C) llvm-svn: 301021	2017-04-21 19:16:52 +00:00
Artur Pilipenko	134d94f9a3	[InstCombine] fadd double (sitofp x), y check that the promotion is valid When doing these transformations, check that the result of integer addition is representable in the FP type. (fadd double (sitofp x), fpcst) --> (sitofp (add int x, intcst)) (fadd double (sitofp x), (sitofp y)) --> (sitofp (add int x, y)) This is a fix for https://bugs.llvm.org//show_bug.cgi?id=27036 Reviewed By: andrew.w.kaylor, scanon, spatel Differential Revision: https://reviews.llvm.org/D31182 llvm-svn: 301018	2017-04-21 18:45:25 +00:00
Craig Topper	7af078847c	[SimplifyCFG] Fix the determination of PostBB in conditional store merging to handle the targets on the second branch being commuted Currently we choose PostBB as the single successor of QFB, but it's possible that QTB's single successor is QFB which would make QFB the correct choice. Differential Revision: https://reviews.llvm.org/D32323 llvm-svn: 300992	2017-04-21 15:53:42 +00:00
Wei Mi	337d4d95c2	[ConstHoisting] Add BFI in constanthoisting pass and select the best insertion places based on it. The existing constant hoisting pass will merge a group of constants in a small range and hoist the const materialization code to the common dominator of their uses. However, if the uses are all in cold paths, the existing implementation may hoist the materialization code from cold paths to a hot place. This may hurt performance. The patch introduces BFI to the pass and selects the best insertion places based on it. The change is controlled by an option consthoist-with-block-frequency which is off by default for now. Differential Revision: https://reviews.llvm.org/D28962 llvm-svn: 300989	2017-04-21 15:50:16 +00:00
Matthew Simpson	e2037d24f9	[LV] Model if-converted phi node costs Phi nodes in non-header blocks are converted to select instructions after if-conversion. This patch updates the cost model to account for the selects. Differential Revision: https://reviews.llvm.org/D31906 llvm-svn: 300980	2017-04-21 14:14:54 +00:00
Sanjay Patel	347b54b093	[InstCombine] prefer xor with -1 because 'not' is easier to understand (PR32706) This matches the demanded bits behavior in the DAG and should fix: https://bugs.llvm.org/show_bug.cgi?id=32706 Differential Revision: https://reviews.llvm.org/D32255 llvm-svn: 300977	2017-04-21 14:03:54 +00:00
Davide Italiano	fa15de34b7	[PartialInliner] Fix crash when inlining functions with unreachable blocks. CodeExtractor looks up the dominator node corresponding to return blocks when splitting them. If one of these blocks is unreachable, there's no node in the Dom and CodeExtractor crashes because it doesn't check for domtree node validity. In theory, we could add just a check for skipping null DTNodes in `splitReturnBlock` but the fix I propose here is slightly different. To the best of my knowledge, unreachable blocks are irrelevant for the algorithm, therefore we can just skip them when building the candidate set in the constructor. Differential Revision: https://reviews.llvm.org/D32335 llvm-svn: 300946	2017-04-21 04:25:00 +00:00
Davide Italiano	059574c537	[CodeExtractor] Remove an unneeded level of indirection. NFCI. llvm-svn: 300931	2017-04-21 00:21:09 +00:00
Craig Topper	358cd9ae3a	[InstCombine] Remove the zextOrTrunc from ShrinkDemandedConstant. The demanded mask and the constant should always be the same width for all callers today. Also stop copying the demanded mask as it's passed in. We should avoid allocating memory unless we are going to do something. The final AND to create the new constant will take care of it. llvm-svn: 300927	2017-04-20 23:58:27 +00:00
Sanjay Patel	cc663b82fa	[InstCombine] function names start with lower-case letter; NFC Forgot to make this fix with the signature change in r300911. llvm-svn: 300912	2017-04-20 22:37:01 +00:00
Sanjay Patel	c9485ca895	[InstCombine] allow shl+shr demanded bits folds with splat constants llvm-svn: 300911	2017-04-20 22:33:54 +00:00
Xinliang David Li	99e3ca1526	Use basicblock split block utility function Instead of calling BasicBlock::SplitBasicBlock directly in CodeExtractor. Differential Revision: https://reviews.llvm.org/D32308 llvm-svn: 300899	2017-04-20 21:40:22 +00:00
Sanjay Patel	3e1ae72fcf	[InstCombine] allow shl demanded bits folds with splat constants More fixes are needed to enable the helper SimplifyShrShlDemandedBits(). llvm-svn: 300898	2017-04-20 21:33:02 +00:00
Craig Topper	ff23889609	[InstCombine] Use APInt::intersects and APInt::isSubsetOf to improve a few more places in SimplifyDemandedBits. llvm-svn: 300896	2017-04-20 21:24:37 +00:00
Sanjay Patel	fb5b3e773a	[InstCombine] allow ashr/lshr demanded bits folds with splat constants llvm-svn: 300888	2017-04-20 20:59:02 +00:00
Craig Topper	17f37ba3b9	[InstCombine] Use APInt::isSubsetOf to simplify some code in SimplifyDemandedBits. NFC This allows us to use less temporary APInt for And and Invert operations. llvm-svn: 300885	2017-04-20 20:47:35 +00:00
Craig Topper	0ec3f2f39a	[InstCombine] Remove redundant code from SimplifyDemandedBits handling for Or. The code above it is equivalent if you work through the bitwise math. llvm-svn: 300876	2017-04-20 19:31:22 +00:00
Davide Italiano	b965121ba8	[CodeExtractor] Remove a bunch of unneeded constructors. Differential Revision: https://reviews.llvm.org/D32305 llvm-svn: 300869	2017-04-20 18:33:40 +00:00
Craig Topper	bcfd2d1789	[APInt] Rename getSignBit to getSignMask getSignBit is a static function that creates an APInt with only the sign bit set. getSignMask seems like a better name to convey its functionality. In fact several places use it and then store in an APInt named SignMask. Differential Revision: https://reviews.llvm.org/D32108 llvm-svn: 300856	2017-04-20 16:56:25 +00:00
Craig Topper	a8129a1122	[APInt] Add isSubsetOf method that can check if one APInt is a subset of another without creating temporary APInts This question comes up in many places in SimplifyDemandedBits. This makes it easy to ask without allocating additional temporary APInts. The BitVector class provides a similar functionality through its (IMHO badly named) test(const BitVector&) method. Though its output polarity is reversed. I've provided one example use case in this patch. I plan to do more as a follow up. Differential Revision: https://reviews.llvm.org/D32258 llvm-svn: 300851	2017-04-20 16:17:13 +00:00
Craig Topper	83dc1c60aa	In SimplifyDemandedUseBits, use computeKnownBits directly to handle Constants Currently we don't explicitly process ConstantDataSequential, ConstantAggregateZero, or ConstantVector, or Undef before applying the Depth limit. Instead they occur after the depth check in the non-instruction path. For the constant types that we do handle, the code is replicated from computeKnownBits. This patch fixes the missing constant handling and reduces the amount of code by just using computeKnownBits directly for any type of Constant. Differential Revision: https://reviews.llvm.org/D32123 llvm-svn: 300849	2017-04-20 16:14:58 +00:00
Reid Kleckner	f1de9e83c2	[DAE] Simplify attribute list creation, NFC Removes a use of getSlotAttributes, which I intend to change. llvm-svn: 300795	2017-04-19 23:45:45 +00:00
Reid Kleckner	0a5ed3d5dc	[GlobalOpt] Simplify attribute code stripping nest, NFC llvm-svn: 300787	2017-04-19 23:26:44 +00:00
Reid Kleckner	aa0cec7d6d	Simplify test for sret attribute in instcombine This change is correct because the verifier requires that at most one argument be marked 'sret'. NFC, removes a use of AttributeList slot APIs. llvm-svn: 300784	2017-04-19 23:17:47 +00:00
Kostya Serebryany	c5d3d49034	[sanitizer-coverage] remove some more stale code llvm-svn: 300778	2017-04-19 22:42:11 +00:00
Evgeniy Stepanov	7c9b086ef5	Remove two unused variables (-Werror). llvm-svn: 300777	2017-04-19 22:27:23 +00:00
Kostya Serebryany	be87d480ff	[sanitizer-coverage] remove stale code llvm-svn: 300769	2017-04-19 21:48:09 +00:00
Craig Topper	9b71a402c2	[APInt] Cast calls to add/sub/mul overflow methods to void if only their overflow bool out param is used. This is preparation for a clang change to improve the [[nodiscard]] warning to not be ignored on methods that return a class marked [[nodiscard]] that are defined in the class itself. See D32207. We should consider adding wrapper methods to APInt that return the overflow flag directly and discard the APInt result. This would eliminate the void casts and the need to create a bool before the call to pass to the out param. llvm-svn: 300758	2017-04-19 21:09:45 +00:00
Matt Arsenault	d3406bc45c	StructurizeCFG: Directly invert cmp instructions The most common case for a branch condition is a single use compare. Directly invert the branch predicate rather than adding a lot of xor i1 true which the DAG will have to fold later. This produces nicer to read structurizer output. This produces some random changes in codegen due to the DAG swapping branch conditions itself, and then does a poor job of dealing with those inverts. llvm-svn: 300732	2017-04-19 18:29:07 +00:00
Sanjoy Das	5945447d84	[GVN] Don't coerce non-integral pointers to integers or vice versa Summary: See http://llvm.org/docs/LangRef.html#non-integral-pointer-type The NewGVN test does not fail without these changes (perhaps it does try to coerce pointers <-> integers to begin with?), but I added the test case anyway. Reviewers: dberlin Subscribers: mcrosier, llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D32208 llvm-svn: 300730	2017-04-19 18:21:09 +00:00
Reid Kleckner	9d16fa09c6	Prefer addAttr(Attribute::AttrKind) over the AttributeList overload This should simplify the call sites, which typically want to tweak one attribute at a time. It should also avoid creating ephemeral AttributeLists that live forever. llvm-svn: 300718	2017-04-19 17:28:52 +00:00
Davide Italiano	ffcb4df204	[InstCombine] Reduce visitLoadInst() code duplication. NFCI. llvm-svn: 300717	2017-04-19 17:26:57 +00:00
Chandler Carruth	ae3386aa74	Revert r300657 due to crashes in stage2 of bootstraps: http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/2476/steps/build-stage2-LLVMgold.so/logs/stdio http://bb.pgr.jp/builders/clang-3stage-x86_64-linux/builds/15036/steps/build_llvmclang/logs/stdio I've updated the commit thread, reverting to get the bots back to green. Original commit summary: [JumpThread] We want to fold (not thread) when all predecessor go to single BB's successor. llvm-svn: 300662	2017-04-19 06:23:20 +00:00
Xin Tong	636a332906	[JumpThread] We want to fold (not thread) when all predecessor go to single BB's successor. Summary: In case all predecessor go to a single successor of current BB. We want to fold (not thread). Reviewers: efriedma, sanjoy Reviewed By: sanjoy Subscribers: dberlin, majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D30869 llvm-svn: 300657	2017-04-19 05:15:57 +00:00
Sanjoy Das	f09c1e346e	Add a getPointerOperandType() helper to LoadInst and StoreInst; NFC I will use this in a later change. llvm-svn: 300613	2017-04-18 22:00:54 +00:00
Davide Italiano	80fe987b42	[LoopReroll] Prefer hasNUses/hasNUses or more as they're cheaper. NFCI. llvm-svn: 300607	2017-04-18 21:42:21 +00:00
Daniel Berlin	9d0042b47c	NewGVN: Fix memory congruence verification. The return true should be a return false. Merge the appropriate if statements so it doesn't happen again. llvm-svn: 300584	2017-04-18 20:15:47 +00:00
Easwaran Raman	76aba5f6d7	[SLP vectorizer] Allow phi node reordering in tryToVectorizeList. In tryToVectorizeList, under a very limited circumstance (when entered from tryToVectorizePair), the values may be reordered (swapped) and the SLP tree is built with the new order. This extends that to the case when starting from phis in vectorizeChainsInBlock when there are exactly two phis. The textual order of phi nodes shouldn't really matter. Without this change, the loop body in the accompanying test case is fully vectorized when we swap the order of the phis but not with this order. While this doesn't solve the phi-ordering problem in a general way (for more than 2 phis), this is a simple fix that piggybacks on an existing mechanism and is useful in cases like multiplying two complex numbers. Differential revision: https://reviews.llvm.org/D32065 llvm-svn: 300574	2017-04-18 18:16:57 +00:00
Craig Topper	fc947bcfba	[APInt] Use lshrInPlace to replace lshr where possible This patch uses lshrInPlace to replace code where the object that lshr is called on is being overwritten with the result. This adds an lshrInPlace(const APInt &) version as well. Differential Revision: https://reviews.llvm.org/D32155 llvm-svn: 300566	2017-04-18 17:14:21 +00:00
Daniel Berlin	ec9deb7f54	NewGVN: Don't waste time value numbering unreachable blocks llvm-svn: 300565	2017-04-18 17:06:11 +00:00
Zvi Rackover	d942397e24	LoopRerollPass: Prefer Value::hasOneUse() over Value::getNumUses(). NFC. getNumUses() can be more expensive as it iterates over all list's elements. llvm-svn: 300558	2017-04-18 14:55:43 +00:00
Gil Rapaport	fb1d915ab2	[LV] Cache block mask values This patch is part of D28975's breakdown. Add caching for block masks similar to the cache already used for edge masks, replacing generation per user with reusing the first generated value which dominates all uses. Differential Revision: https://reviews.llvm.org/D32054 llvm-svn: 300557	2017-04-18 14:43:43 +00:00
Nikolai Bozhenov	9e4a1c39db	[GVNHoist] Mark GlobalsAA as preserved by GVNHoist. Reviewers: sebpop, hiraditya Reviewed By: sebpop Subscribers: n.bozhenov, llvm-commits Differential Revision: https://reviews.llvm.org/D32158 Patch by Andrei Elovikov <andrei.elovikov@intel.com> llvm-svn: 300552	2017-04-18 13:25:49 +00:00
Andrea Di Biagio	517e3fc34c	[SampleProfile] Don't assert when printing the DebugLoc of a branch. NFC. llvm-svn: 300544	2017-04-18 11:27:58 +00:00
Andrea Di Biagio	e3edef0977	[SampleProfile] Skip intrinsic calls when visiting callsites in InlineHotFunctions. Before this patch, we always called method 'findCalleeFunctionSamples()' on intrinsic calls. However, intrinsic calls like llvm.dbg.value() are not viable candidates for obvious reasons. No functional change intended. Differential Revision: https://reviews.llvm.org/D32008 llvm-svn: 300541	2017-04-18 10:08:53 +00:00
Adrian Prantl	6825fb64e9	PR32382: Fix emitting complex DWARF expressions. The DWARF specification knows 3 kinds of non-empty simple location descriptions: 1. Register location descriptions - describe a variable in a register - consist of only a DW_OP_reg 2. Memory location descriptions - describe the address of a variable 3. Implicit location descriptions - describe the value of a variable - end with DW_OP_stack_value & friends The existing DwarfExpression code is pretty much ignorant of these restrictions. This used to not matter because we only emitted very short expressions that we happened to get right by accident. This patch makes DwarfExpression aware of the rules defined by the DWARF standard and now chooses the right kind of location description for each expression being emitted. This would have been an NFC commit (for the existing testsuite) if not for the way that clang describes captured block variables. Based on how the previous code in LLVM emitted locations, DW_OP_deref operations that should have come at the end of the expression are put at its beginning. Fixing this means changing the semantics of DIExpression, so this patch bumps the version number of DIExpression and implements a bitcode upgrade. There are two major changes in this patch: I had to fix the semantics of dbg.declare for describing function arguments. After this patch a dbg.declare always takes the address of a variable as the first argument, even if the argument is not an alloca. When lowering a DBG_VALUE, the decision of whether to emit a register location description or a memory location description depends on the MachineLocation — register machine locations may get promoted to memory locations based on their DIExpression. (Future) optimization passes that want to salvage implicit debug location for variables may do so by appending a DW_OP_stack_value. 
For example: DBG_VALUE, [RBP-8] --> DW_OP_fbreg -8 DBG_VALUE, RAX --> DW_OP_reg0 +0 DBG_VALUE, RAX, DIExpression(DW_OP_deref) --> DW_OP_reg0 +0 All testcases that were modified were regenerated from clang. I also added source-based testcases for each of these to the debuginfo-tests repository over the last week to make sure that no synchronized bugs slip in. The debuginfo-tests compile from source and run the debugger. https://bugs.llvm.org/show_bug.cgi?id=32382 <rdar://problem/31205000> Differential Revision: https://reviews.llvm.org/D31439 llvm-svn: 300522	2017-04-18 01:21:53 +00:00
Dehao Chen	1ea8bd8109	Build SymbolMap in SampleProfileLoader to help matching function names with suffix. Summary: If there is a suffix added to the function name (e.g. module hash added by thinLTO), we will not be able to find a match in the profile as the suffix does not exist in the profile. This patch builds a map from function name to Function *. The map includes the entry for the stripped function name so that inlineHotFunctions can find the corresponding function to promote/inline. Reviewers: davidxl, dnovillo, tejohnson Reviewed By: davidxl Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D31952 llvm-svn: 300507	2017-04-17 22:23:05 +00:00
Craig Topper	c228068d90	[SimplifyCFG] Use hasNUses instead of comparing getNumUses to a constant. The use list is a linked list so getNumUses requires a linear scan through the whole list. hasNUses will stop scanning at N and see if that is the end. llvm-svn: 300505	2017-04-17 22:13:00 +00:00
Davide Italiano	cdc937d0fc	[InstCombine] Matchers work with both ConstExpr and Instructions. So, `cast<Instruction>` is not guaranteed to succeed. Change the code so that we create a new constant and use it in the newly created instruction, as it's done in other places in InstCombine. OK'ed by Sanjay/Craig. Fixes PR32686. llvm-svn: 300495	2017-04-17 20:49:50 +00:00
Peter Collingbourne	a0f371a106	Bitcode: Add a string table to the bitcode format. Add a top-level STRTAB block containing a string table blob, and start storing strings for module codes FUNCTION, GLOBALVAR, ALIAS, IFUNC and COMDAT in the string table. This change allows us to share names between globals and comdats as well as between modules, and improves the efficiency of loading bitcode files by no longer using a bit encoding for symbol names. Once we start writing the irsymtab to the bitcode file we will also be able to share strings between it and the module. On my machine, link time for Chromium for Linux with ThinLTO decreases by about 7% for no-op incremental builds or about 1% for full builds. Total bitcode file size decreases by about 3%. As discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2017-April/111732.html Differential Revision: https://reviews.llvm.org/D31838 llvm-svn: 300464	2017-04-17 17:51:36 +00:00
Craig Topper	d23004c37b	Introduce APInt::isSignBitSet/isSignBitClear. Use isSignBitSet in place of isNegative in known bits tracking. This makes statements like KnownZero.isNegative() (which means the value we're tracking is positive) less confusing. llvm-svn: 300457	2017-04-17 16:38:20 +00:00
Matt Arsenault	7205f3c2e4	AMDGPU: SimplifyDemandedElts for image intrinsics Causes some VGPR usage improvements in shaderdb, but introduces some SGPR spilling regressions due to random scheduling changes later. llvm-svn: 300453	2017-04-17 15:12:44 +00:00
Davide Italiano	ce161a7812	[LCSSA] Don't insert tokens into the worklist at all. We're gonna skip them anyway, so there's no point in inserting them in the first place. llvm-svn: 300452	2017-04-17 14:32:05 +00:00
Max Kazantsev	751579cac0	[LoopPeeling] Get rid of Phis that become invariant after N steps This patch is a generalization of the improvement introduced in rL296898. Previously, we were able to peel one iteration of a loop to get rid of a Phi that becomes an invariant on the 2nd iteration. In the more general case, if a Phi becomes invariant after N iterations, we can peel N times and turn it into an invariant. In order to do this, for every Phi in the loop's header we define the Invariant Depth value, which is calculated as follows: Given %x = phi <Inputs from above the loop>, ..., [%y, %back.edge]. If %y is a loop invariant, then Depth(%x) = 1. If %y is a Phi from the loop header, Depth(%x) = Depth(%y) + 1. Otherwise, Depth(%x) is infinite. Notice that if we peel a loop, all Phis with Depth = 1 become invariants, and all other Phis with finite depth decrease the depth by 1. Thus, peeling the first N iterations allows us to turn all Phis with Depth <= N into invariants. Reviewers: reames, apilipenko, mkuper, skatkov, anna, sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31613 llvm-svn: 300446	2017-04-17 09:52:02 +00:00
Max Kazantsev	8ed6b66d85	[LoopPeeling] Fix condition for phi-eliminating peeling When peeling loops based on phis becoming invariants, we make a wrong loop size check. UP.Threshold should be compared against the total number of instructions after the transformation, which is equal to 2 * LoopSize in case of peeling one iteration. We should also check that the maximum allowed number of peeled iterations is not zero. Reviewers: sanjoy, anna, reames, mkuper Reviewed By: mkuper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31753 llvm-svn: 300441	2017-04-17 05:38:28 +00:00
Craig Topper	218a359fbd	[InstCombine] Simplify 1/X for vectors. llvm-svn: 300439	2017-04-17 03:41:47 +00:00
Craig Topper	1a18a7c51e	[InstCombine] Add support for vector srem->urem. llvm-svn: 300437	2017-04-17 01:51:24 +00:00
Craig Topper	f248468359	[InstCombine] Add support for turning vector sdiv into udiv. llvm-svn: 300435	2017-04-17 01:51:19 +00:00
Davide Italiano	ee654bf5f1	[LCSSA] Simplify a loop. NFCI. llvm-svn: 300433	2017-04-17 00:02:45 +00:00
Craig Topper	da886c665b	[InstCombine][ValueTracking] When computing known bits for Srem make sure we don't compute known bits for the LHS twice. If we already called computeKnownBits for the RHS being a constant power of 2, we've already computed everything we can and should just stop. I think previously we would still recurse if we had determined the result was negative or had not determined the sign bit at all. llvm-svn: 300432	2017-04-16 21:46:12 +00:00
Davide Italiano	dd37c67d81	[LCSSA] Fix non-determinism due to iterating over a SmallPtrSet. Use a SmallSetVector instead. llvm-svn: 300431	2017-04-16 21:07:04 +00:00
Craig Topper	0d304f01b4	[InstCombine] In SimplifyDemandedUseBits, don't bother to mask known bits of constants with DemandedMask. Just because we didn't demand them doesn't mean they aren't known. llvm-svn: 300430	2017-04-16 20:55:58 +00:00
Michael Zuckerman	16b20d2fc5	[X86][X86 intrinsics] Folding cmp(sub(a,b),0) into cmp(a,b) optimization This patch adds a new optimization (folding cmp(sub(a,b),0) into cmp(a,b)) to the instCombineCall pass and was written specifically for X86 CMP intrinsics. Differential Revision: https://reviews.llvm.org/D31398 llvm-svn: 300422	2017-04-16 13:26:08 +00:00
Sanjay Patel	ef9f586bb2	[InstCombine] allow (X != C1 && X != C2) and similar patterns to match splat vector constants llvm-svn: 300402	2017-04-15 17:55:06 +00:00
Vedant Kumar	1a6a2b642b	[ProfileData] Unify getInstrProfSectionName helpers This is a version of D32090 that unifies all of the `getInstrProfSectionName` helper functions. (Note: the build failures which D32090 would have addressed were fixed with r300352.) We should unify these helper functions because they are hard to use in their current form. E.g we recently introduced more helpers to fix section naming for COFF files. This scheme doesn't totally succeed at hiding low-level details about section naming, so we should switch to an API that is easier to maintain. This is not an NFC commit because it fixes llvm-cov's testing support for COFF files (this falls out of the API change naturally). This is an area where we lack tests -- I will see about adding one as a follow up. Testing: check-clang, check-profile, check-llvm. Differential Revision: https://reviews.llvm.org/D32097 llvm-svn: 300381	2017-04-15 00:09:57 +00:00
Craig Topper	9a458cd517	[InstCombine] MakeAnd/Or/Xor handling to reuse previous APInt computations When checking if we should return a constant, we create some temporary APInts to see if we know all bits. But the exact computations we do are needed in several other locations in the same code. This patch moves them to named temporaries so we can reuse them. Ideally we'd write directly to KnownZero/One, but we currently seem to only write those variables after all the simplifications checks and I didn't want to change that with this patch. Differential Revision: https://reviews.llvm.org/D32094 llvm-svn: 300376	2017-04-14 22:34:14 +00:00
Reid Kleckner	fb502d2f5e	[IR] Make paramHasAttr to use arg indices instead of attr indices This avoids the confusing 'CS.paramHasAttr(ArgNo + 1, Foo)' pattern. Previously we were testing return value attributes with index 0, so I introduced hasReturnAttr() for that use case. llvm-svn: 300367	2017-04-14 20:19:02 +00:00
Sanjay Patel	7cfe41659c	[InstCombine] (X != C1 && X != C2) --> (X | (C1 ^ C2)) != C2 ...when C1 differs from C2 by one bit and C1 <u C2: http://rise4fun.com/Alive/Vuo And move related folds to a helper function. This reduces code duplication and will make it easier to remove the scalar-only restriction as a follow-up step. llvm-svn: 300364	2017-04-14 19:23:50 +00:00
Craig Topper	fb71b7d3e0	[InstCombine] Support folding a subtract with a constant LHS into a phi node We currently only support folding a subtract into a select but not a PHI. This fixes that. I had to fix an assumption in FoldOpIntoPhi that assumed the PHI node was always in operand 0. Now we pass it in like we do for FoldOpIntoSelect. But we still require some dancing to find the Constant when we create the BinOp or ConstantExpr. This is based code is similar to what we do for selects. Since I touched all call sites, this also renames FoldOpIntoPhi to foldOpIntoPhi to match coding standards. Differential Revision: https://reviews.llvm.org/D31686 llvm-svn: 300363	2017-04-14 19:20:12 +00:00
Craig Topper	c22c7b1459	[InstCombine] Refactor SimplifyUsingDistributiveLaws to more explicitly skip code when LHS/RHS aren't BinaryOperators Currently this code always makes 2 or 3 calls to tryFactorization regardless of whether the LHS/RHS are BinaryOperators. We make 3 calls when both operands are BinaryOperators with the same opcode. Or surprisingly, when neither are BinaryOperators. This is because getBinOpsForFactorization returns Instruction::BinaryOpsEnd when the operand is not a BinaryOperator. If both LHS and RHS are not BinaryOperators then they both have an Opcode of Instruction::BinaryOpsEnd. When this happens we rely on tryFactorization to early out due to A/B/C/D being null. Similar behavior occurs for the other calls, we rely on getBinOpsForFactorization having made A/B or C/D null to get tryFactorization to early out. We also rely on these null checks to check the result of getIdentityValue and early out for it. This patches refactors this to pull these checks up to SimplifyUsingDistributiveLaws so we don't rely on BinaryOpsEnd as a sentinel or this A/B/C/D null behavior. I think this makes this code easier to reason about. Should also give a tiny performance improvement for cases where the LHS or RHS isn't a BinaryOperator. Differential Revision: https://reviews.llvm.org/D31913 llvm-svn: 300353	2017-04-14 17:55:41 +00:00
Davide Italiano	91239088a1	[FunctionImport] assert(false) -> llvm_unreachable(). NFCI. llvm-svn: 300344	2017-04-14 17:22:02 +00:00
Sanjoy Das	e3a15e832c	Tighten the API for ScalarEvolutionNormalization llvm-svn: 300331	2017-04-14 15:49:59 +00:00
Sanjoy Das	ac9f3ea0b4	Remove NormalizeAutodetect; NFC It is cleaner to have a callback based system where the logic of whether an add recurrence is normalized or not lives on IVUsers. This is one step in a multi-step cleanup. llvm-svn: 300330	2017-04-14 15:49:53 +00:00
Gil Rapaport	334f8fbe47	[LV] Remove implicit single basic block assumption This patch is part of D28975's breakdown - no change in output intended. LV's code currently assumes the vectorized loop is a single basic block up until predicateInstructions() is called. This patch removes two manifestations of this assumption (loop phi incoming values, dominator tree update) by replacing the use of vectorLoopBody with the vectorized loop's latch/header. Differential Revision: https://reviews.llvm.org/D32040 llvm-svn: 300310	2017-04-14 07:30:23 +00:00
Craig Topper	c9a4fc0750	[InstCombine] Use APInt::setSignBit and APInt::isNegative(). NFC llvm-svn: 300305	2017-04-14 05:09:04 +00:00
Xinliang David Li	9a71766751	Fix test failure on windows: pass module to getInstrProfXXName calls llvm-svn: 300302	2017-04-14 03:03:24 +00:00
Daniel Berlin	2f72b19b05	NewGVN: Don't propagate over phi backedges where undef causes us to have >1 value, unless we can prove the phi node is cycle free. Fixes PR 32607. llvm-svn: 300299	2017-04-14 02:53:37 +00:00
Xinliang David Li	57dea2d359	[Profile] PE binary coverage bug fix PR/32584 Differential Revision: https://reviews.llvm.org/D32023 llvm-svn: 300277	2017-04-13 23:37:12 +00:00
Reid Kleckner	f021fab2af	[IR] Make getParamAttributes take argument numbers, not ArgNo+1 Add hasParamAttribute() and use it instead of hasAttribute(ArgNo+1, Kind) everywhere. The fact that the AttributeList index for an argument is ArgNo+1 should be a hidden implementation detail. NFC llvm-svn: 300272	2017-04-13 23:12:13 +00:00
Craig Topper	e7563f8dda	[InstCombine] Use APInt::getBitsSetFrom instead of inverting the result of getLowBitsSet. NFC llvm-svn: 300265	2017-04-13 21:49:48 +00:00
Davide Italiano	af36d02430	[LCSSA] Efficiently compute blocks dominating at least one exit. For LCSSA purposes, loop BBs not dominating any of the exits aren't interesting, as none of the values defined in these blocks can be used outside the loop. The way the code computed this information was by comparing each BB of the loop with each of the exit blocks and ask the dominator tree about their dominance relation. This is slow. A more efficient way, implemented here, is that of starting from the exit blocks and walking the dom upwards until we hit an header. By transitivity, all the blocks we encounter in our path dominate an exit. For the testcase provided in PR31851, this reduces compile time on `opt -O2` by ~25%, going from 1m47s to 1m22s. Thanks to Dan/MichaelZ for discussions/suggesting the approach/review. Differential Revision: https://reviews.llvm.org/D31843 llvm-svn: 300255	2017-04-13 20:36:59 +00:00
Richard Smith	6c2615177b	Revert accidentally-committed files in r300252. llvm-svn: 300253	2017-04-13 20:31:21 +00:00
Richard Smith	55bd375b69	Remove all allocation and divisions from GreatestCommonDivisor Switch from Euclid's algorithm to Stein's algorithm for computing GCD. This avoids the (expensive) APInt division operation in favour of bit operations. Remove all memory allocation from within the GCD loop by tweaking our `lshr` implementation so it can operate in-place. Differential Revision: https://reviews.llvm.org/D31968 llvm-svn: 300252	2017-04-13 20:29:59 +00:00
Reid Kleckner	257cb4e099	[InstCombine] Fix !prof metadata preservation for invokes Summary: Bug noticed by inspection. Extend the test to handle invokes as well as calls, and rewrite it to not depend on the inliner and other passes. Also simplify the call site replacement code with CallSite, similar to what I did to dead arg elimination and arg promotion (rL300235 and rL300229). Reviewers: danielcdh, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32041 llvm-svn: 300251	2017-04-13 20:26:38 +00:00
Davide Italiano	0b30227f75	[LCSSA] Assert that we always have a valid loop. We could otherwise add BBs not belonging to a loop in `formLCSSA` and later crash when trying to iterate the loop blocks. llvm-svn: 300244	2017-04-13 20:05:37 +00:00
Davide Italiano	549078d1ab	[LCSSA] Remove spurious whitespaces. NFCI. llvm-svn: 300243	2017-04-13 20:02:27 +00:00
Davide Italiano	5129951296	[LCSSA] Use `auto` when the type is obvious. NFCI. llvm-svn: 300242	2017-04-13 20:01:30 +00:00
Dehao Chen	2c7ca9b5df	SamplePGO: convert callsite samples map key from callsite_location to callsite_location+callee_name Summary: For iterative SamplePGO, an indirect call can be speculatively promoted to multiple direct calls and get inlined. All these promoted direct calls will share the same callsite location (offset+discriminator). With the current implementation, we cannot distinguish between different promotion candidates and its inlined instance. This patch adds callee_name to the key of the callsite sample map. And added helper functions to get all inlined callee samples for a given callsite location. This helps the profile annotator promote correct targets and inline it before annotation, and ensures all indirect call targets to be annotated correctly. Reviewers: davidxl, dnovillo Reviewed By: davidxl Subscribers: andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D31950 llvm-svn: 300240	2017-04-13 19:52:10 +00:00
Anna Thomas	dcdb325fee	[LV] Fix the vector code generation for first order recurrence Summary: In first order recurrences where phi's are used outside the loop, we should generate an additional vector.extract of the second last element from the vectorized phi update. This is because we require the phi itself (which is the value at the second last iteration of the vector loop) and not the phi's update within the loop. Also fix the code gen when we just unroll, but don't vectorize. Fixes PR32396. Reviewers: mssimpso, mkuper, anemet Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D31979 llvm-svn: 300238	2017-04-13 18:59:25 +00:00
Sanjay Patel	445d03bf00	[InstCombine] fold X == 0 || X == -1 to one compare (PR32524) This is effectively a retry of: https://reviews.llvm.org/rL299851 but now we have tests and an assert to make sure the bug that was exposed with that attempt will not happen again. I'll fix the code duplication and missing sibling fold next, but I want to make this change as small as possible to reduce risk since I messed it up last time. This should fix: https://bugs.llvm.org/show_bug.cgi?id=32524 llvm-svn: 300236	2017-04-13 18:47:06 +00:00
Reid Kleckner	aea2a28098	[DAE] Simplify call site replacement code with CallSite NFC llvm-svn: 300235	2017-04-13 18:42:03 +00:00
Reid Kleckner	c3fae796fd	[InstCombine] Simplify attribute code with new AttributeList::get NFC llvm-svn: 300230	2017-04-13 18:11:03 +00:00
Reid Kleckner	3a1150352d	[ArgPromotion] Don't drop !prof metadata on promoted calls Noticed by inspection while doing attribute work. DAE, InstCombineCalls, and ArgPromotion have a fair amount of duplicated code for hacking on call sites, and you can find bugs by comparing them. Add a test case for this. llvm-svn: 300229	2017-04-13 18:10:30 +00:00
Sanjay Patel	9745d24a66	[InstCombine] use similar ops for related folds; NFCI It's less efficient to produce 'ule' than 'ult' since we know we're going to canonicalize to 'ult', but we shouldn't have duplicated code for these folds. As a trade-off, this was a pretty terrible way to make a '2'. :) if (LHSC == SubOne(RHSC)) AddC = ConstantExpr::getSub(AddOne(RHSC), LHSC); The next steps are to share the code to fix PR32524 and add the missing 'and' fold that was left out when PR14708 was fixed: https://bugs.llvm.org/show_bug.cgi?id=14708 llvm-svn: 300222	2017-04-13 17:36:24 +00:00
Sanjay Patel	a8ebb46e0e	[InstCombine] fix assert to not always be true llvm-svn: 300202	2017-04-13 16:05:01 +00:00
Geoff Berry	85a530fb59	Re-apply "[GVNHoist] Move GVNHoist to function simplification part of pipeline." This reverts commit r296872 now that PR32153 has been fixed. llvm-svn: 300200	2017-04-13 15:36:25 +00:00
Ayal Zaks	cd712b6c49	[LV] Refactor ILV to provide vectorizeInstruction(); NFC Refactoring InnerLoopVectorizer's vectorizeBlockInLoop() to provide vectorizeInstruction(). Aligning DeadInstructions with its only user. Facilitates driving the transformation by VPlan - follows https://reviews.llvm.org/D28975 and its tentative breakdown. Differential Revision: https://reviews.llvm.org/D31997 llvm-svn: 300183	2017-04-13 09:07:23 +00:00
Reid Kleckner	7f72033e1c	[IR] Take func, ret, and arg attrs separately in AttributeList::get This seems like a much more natural API, based on Derek Schuff's comments on r300015. It further hides the implementation detail of AttributeList that function attributes come last and appear at index ~0U, which is easy for the user to screw up. git diff says it saves code as well: 97 insertions(+), 137 deletions(-) This also makes it easier to change the implementation, which I want to do next. llvm-svn: 300153	2017-04-13 00:58:09 +00:00
Craig Topper	c75f94bfa5	[InstCombine] Teach SimplifyMultipleUseDemandedBits to handle And/Or/Xor known bits using the LHS/RHS known bits it already acquired without recursing back into computeKnownBits. This replicates the known bits and constant creation code from the single use case for these instructions and adds it here. The computeKnownBits and constant creation code for other instructions is now in the default case of the opcode switch. llvm-svn: 300094	2017-04-12 19:32:47 +00:00
Craig Topper	cf3641fd57	[InstCombine] Remove unreachable code for turning an And where all demanded bits on both sides are known to be zero into a constant 0. We already handled a superset check that included the known ones too and folded to a constant that may include ones. But it can also handle the case of no ones. llvm-svn: 300093	2017-04-12 19:08:03 +00:00
Sanjay Patel	6e41018942	[InstCombine] fix wrong undef handling when converting select to shuffle As discussed in: https://bugs.llvm.org/show_bug.cgi?id=32486 ...the canonicalization of vector select to shufflevector does not hold up when undef elements are present in the condition vector. Try to make the undef handling clear in the code and the LangRef. Differential Revision: https://reviews.llvm.org/D31980 llvm-svn: 300092	2017-04-12 18:39:53 +00:00
Craig Topper	f35a7f7b49	[InstCombine] In SimplifyMultipleUseDemandedBits, use a switch instead of cascaded ifs on opcode. NFC llvm-svn: 300085	2017-04-12 18:25:25 +00:00
Craig Topper	9a51c7f343	[InstCombine] Teach SimplifyDemandedInstructionBits that even if we reach an instruction that has multiple uses, if we know all the bits for the demanded bits for this context we can go ahead and create a constant. Currently if we reach an instruction with multiples uses we know we can't do any optimizations to that instruction itself since we only have the demanded bits for one of the users. But if we know all of the bits are zero/one for that one user we can still go ahead and create a constant to give to that user. This might then reduce the instruction to having a single use and allow additional optimizations on the other path. This picks up an additional case that r300075 didn't catch. Differential Revision: https://reviews.llvm.org/D31552 llvm-svn: 300084	2017-04-12 18:17:46 +00:00
Craig Topper	b0076fe8b4	[InstCombine] Move portion of SimplifyDemandedUseBits that deals with instructions with multiple uses out to a separate method. NFCI llvm-svn: 300082	2017-04-12 18:05:21 +00:00
Craig Topper	845033a6c9	Teach SimplifyDemandedUseBits that adding or subtractings 0s from every bit below the highest demanded bit can be simplified If we are adding/subtractings 0s below the highest demanded bit we can just use the other operand and remove the operation. My primary motivation is observing that we can call ShrinkDemandedConstant for the add/sub and create a 0 constant, rather than removing the add completely. In the case I saw, we modified the constant on an add instruction to a 0, but the add is not put into the worklist. So we didn't revisit it until the next InstCombine iteration. This caused an IR modification to remove add and a subsequent iteration to be ran. With this change we get bypass the add in the first iteration and prevent the second iteration from changing anything. Differential Revision: https://reviews.llvm.org/D31120 llvm-svn: 300075	2017-04-12 16:49:59 +00:00
Sanjay Patel	33439f982b	[InstCombine] morph an existing instruction instead of creating a new one One potential way to make InstCombine (very slightly?) faster is to recycle instructions when possible instead of creating new ones. It's not explicitly stated AFAIK, but we don't consider this an "InstSimplify". We could, however, make a new layer to house transforms like this if that makes InstCombine more manageable (just throwing out an idea; not sure how much opportunity is actually here). Differential Revision: https://reviews.llvm.org/D31863 llvm-svn: 300067	2017-04-12 15:11:33 +00:00
Jonas Paulsson	22776892c9	[SLPVectorizer] Pass the right type argument to getCmpSelInstrCost() In getEntryCost(), make the scalar type for a compare instruction that of the operands, not i1. This is needed in order to call getCmpSelInstrCost() for a compare in a sensible way, the same way as the LoopVectorizer does. New test: test/Transforms/SLPVectorizer/SystemZ/SLP-cmp-cost-query.ll Review: Matthew Simpson https://reviews.llvm.org/D31601 llvm-svn: 300061	2017-04-12 13:29:25 +00:00
Jonas Paulsson	592dbea779	[LoopVectorizer] Improve handling of branches during cost estimation. The cost for a branch after vectorization is very different depending on if the vectorizer will if-convert the block (branch is eliminated), or if scalarized and predicated blocks will be produced (branch duplicated before each block). There is also the case of remaining scalar branches, such as the back-edge branch. This patch handles these cases differently with TTI based cost estimates. Review: Matthew Simpson https://reviews.llvm.org/D31175 llvm-svn: 300058	2017-04-12 13:13:15 +00:00
Jonas Paulsson	da74ed42da	[LoopVectorizer, TTI] New method supportsEfficientVectorElementLoadStore() Since SystemZ supports vector element load/store instructions, there is no need for extracts/inserts if a vector load/store gets scalarized. This patch lets Target specify that it supports such instructions by means of a new TTI hook that defaults to false. The use for this is in the LoopVectorizer getScalarizationOverhead() method, which will with this patch produce a smaller sum for a vector load/store on SystemZ. New test: test/Transforms/LoopVectorize/SystemZ/load-store-scalarization-cost.ll Review: Adam Nemet https://reviews.llvm.org/D30680 llvm-svn: 300056	2017-04-12 12:41:37 +00:00
Jonas Paulsson	fccc7d66c3	[SystemZ] TargetTransformInfo cost functions implemented. getArithmeticInstrCost(), getShuffleCost(), getCastInstrCost(), getCmpSelInstrCost(), getVectorInstrCost(), getMemoryOpCost(), getInterleavedMemoryOpCost() implemented. Interleaved access vectorization enabled. BasicTTIImpl::getCastInstrCost() improved to check for legal extending loads, in which case the cost of the z/sext instruction becomes 0. Review: Ulrich Weigand, Renato Golin. https://reviews.llvm.org/D29631 llvm-svn: 300052	2017-04-12 11:49:08 +00:00
Bjorn Pettersson	4af0593ecc	[LoadCombine] Avoid analysing dead basic blocks Summary: Dead basic blocks may be forming a loop, for which SSA form is fulfilled, but with a circular def-use chain. LoadCombine could enter an infinite loop when analysing such dead code. This patch solves the problem by simply avoiding to analyse all basic blocks that aren't forward reachable, from function entry, in LoadCombine. Fixes https://bugs.llvm.org/show_bug.cgi?id=27065 Reviewers: mehdi_amini, chandlerc, grosser, Bigcheese, davide Reviewed By: davide Subscribers: dberlin, zzheng, bjope, grandinj, Ka-Ka, materi, jholewinski, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D31032 llvm-svn: 300034	2017-04-12 08:07:55 +00:00
Chandler Carruth	927d8e610a	[IR] Redesign the case iterator in SwitchInst to actually be an iterator and to expose a handle to represent the actual case rather than having the iterator return a reference to itself. All of this allows the iterator to be used with common STL facilities, standard algorithms, etc. Doing this exposed some missing facilities in the iterator facade that I've fixed and required some work to the actual iterator to fully support the necessary API. Differential Revision: https://reviews.llvm.org/D31548 llvm-svn: 300032	2017-04-12 07:27:28 +00:00
Craig Topper	b5194eeebf	[InstCombine][IR] Add a commutable BinOp matcher. Use it to reduce some code. NFC llvm-svn: 300030	2017-04-12 05:49:28 +00:00
Bob Haarman	4075ccc717	ThinLTOBitcodeWriter: keep comdats together, rename if leader is renamed Summary: COFF requires that every comdat contain a symbol with the same name as the comdat. ThinLTOBitcodeWriter renames symbols, which may cause this requirement to be violated. This change avoids such violations by renaming comdats if their leaders are renamed. It also keeps comdats together when splitting modules. Reviewers: pcc, mehdi_amini, tejohnson Reviewed By: pcc Subscribers: rnk, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D31963 llvm-svn: 300019	2017-04-12 01:43:07 +00:00
Reid Kleckner	c2cb560045	[IR] Add AttributeSet to hide AttributeSetNode* again, NFC Summary: For now, it just wraps AttributeSetNode*. Eventually, it will hold AvailableAttrs as an inline bitset, and adding and removing enum attributes will be super cheap. This sinks AttributeSetNode back down to lib/IR/AttributeImpl.h. Reviewers: pete, chandlerc Subscribers: llvm-commits, jfb Differential Revision: https://reviews.llvm.org/D31940 llvm-svn: 300014	2017-04-12 00:38:00 +00:00
Evgeniy Stepanov	90fd87303c	[asan] Give global metadata private linkage. Internal linkage preserves names like "__asan_global_foo" which may account to 2% of unstripped binary size. llvm-svn: 299995	2017-04-11 22:28:13 +00:00
Anna Thomas	00dc1b74b7	[LV] Avoid vectorizing first order recurrence when phi uses are outside loop In the vectorization of first order recurrence, we vectorize such that the last element in the vector will be the one extracted to pass into the scalar remainder loop. However, this is not true when there is a phi (other than the primary induction variable) is used outside the loop. In such a case, we need the value from the second last iteration (i.e. the phi value), not the last iteration (which would be the phi update). I've added a test case for this. Also see PR32396. A follow up patch would generate the correct code gen for such cases, and turn this vectorization on. Differential Revision: https://reviews.llvm.org/D31910 Reviewers: mssimpso llvm-svn: 299985	2017-04-11 21:02:00 +00:00
Daniel Berlin	554dcd8c89	MemorySSA: Move to Analysis, from Transforms/Utils. It's used as Analysis, it has Analysis passes, and once NewGVN is made an Analysis, this removes the cross dependency from Analysis to Transform/Utils. NFC. llvm-svn: 299980	2017-04-11 20:06:36 +00:00
Andrea Di Biagio	8e26936bfd	[AddDiscriminators] Assign discriminators to MemIntrinsic calls. Before this patch, pass AddDiscriminators always avoided to assign discriminators to intrinsic calls. This was done mainly for two reasons: 1) We wanted to minimize the number of based discriminators used. 2) We wanted to avoid non-deterministic discriminator assignment for different debug levels. Unfortunately, that approach was problematic for MemIntrinsic calls. MemIntrinsic calls can be split by SROA into loads and stores, and each new load/store instruction would obtain the debug location from the original intrinsic call. If we don't assign a discriminator to MemIntrinsic calls, then we cannot correctly set the discriminator for the newly created loads and stores. This may have a negative impact on the basic block weight computation performed by the SampleLoader. This patch fixes the issue by letting MemIntrinsic calls have a discriminator. Differential Revision: https://reviews.llvm.org/D31900 llvm-svn: 299972	2017-04-11 19:07:30 +00:00
Craig Topper	957a94cc03	Fix spelling compliment->complement. Mostly refering to 2s complement. NFC llvm-svn: 299970	2017-04-11 18:47:58 +00:00
Craig Topper	271b2245f4	[InstCombine] Use ConstantExpr::getBinOpIdentity to implement getIdentityValue. This removes a TODO in getIdentityValue and may allow some transforms to occur earlier. But I was unable to find any transforms we didn't already handle. llvm-svn: 299966	2017-04-11 17:42:40 +00:00
Sanjay Patel	28611acef9	revert r299851 - [InstCombine] fix matching of or-of-icmps constants (PR32524) This is a candidate culprit for multiple bot fails, so reverting pending investigation. llvm-svn: 299955	2017-04-11 15:57:32 +00:00
Serge Guelton	59a2d7b909	Module::getOrInsertFunction is using C-style vararg instead of variadic templates. From a user prospective, it forces the use of an annoying nullptr to mark the end of the vararg, and there's not type checking on the arguments. The variadic template is an obvious solution to both issues. Differential Revision: https://reviews.llvm.org/D31070 llvm-svn: 299949	2017-04-11 15:01:18 +00:00
Geoff Berry	9d597adde4	[GVNHoist] Re-enable GVNHoist by default Turn GVNHoist back on by default now that PR32153 has been fixed. llvm-svn: 299944	2017-04-11 14:36:30 +00:00
Keno Fischer	30779772cf	[StripDeadDebug/DIFinder] Track inlined SPs Summary: In rL299692 I improved strip-dead-debug-info's ability to drop CUs that are not referenced from the current module. However, in doing so I neglected to realize that some SPs could be referenced entirely from inlined functions. It appears I was not the only one to make this mistake, because DebugInfoFinder, doesn't find those SPs either. Fix this in DebugInfoFinder and then use that to make sure not to drop those CUs in strip-dead-debug-info. Reviewers: aprantl Reviewed By: aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31904 llvm-svn: 299936	2017-04-11 13:32:11 +00:00
Diana Picus	b050c7fbe0	Revert "Turn some C-style vararg into variadic templates" This reverts commit r299925 because it broke the buildbots. See e.g. http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/6008 llvm-svn: 299928	2017-04-11 10:07:12 +00:00
Serge Guelton	5fd75fb72e	Turn some C-style vararg into variadic templates Module::getOrInsertFunction is using C-style vararg instead of variadic templates. From a user prospective, it forces the use of an annoying nullptr to mark the end of the vararg, and there's not type checking on the arguments. The variadic template is an obvious solution to both issues. llvm-svn: 299925	2017-04-11 08:36:52 +00:00
Sylvestre Ledru	06faa9bf32	Simplify the code and remove dead code Summary: Fix coverity cid 1374240 Reviewers: dberlin Reviewed By: dberlin Differential Revision: https://reviews.llvm.org/D31928 llvm-svn: 299924	2017-04-11 08:21:27 +00:00
Craig Topper	8c75adf95b	[InstCombine] Refinement of r299915. Only consider a ConstantVector for Neg if all the elements are Undef or ConstantInt. llvm-svn: 299917	2017-04-11 06:32:48 +00:00
Craig Topper	18f9e424e7	[InstCombine] Support weird size element types in dyn_castNegVal. llvm-svn: 299915	2017-04-11 05:42:47 +00:00
Hal Finkel	b63ed91549	[LICM] Hoist fp division from the loops and replace by a reciprocal When allowed, we can hoist a division out of a loop in favor of a multiplication by the reciprocal. Fixes PR32157. Patch by vit9696! Differential Revision: https://reviews.llvm.org/D30819 llvm-svn: 299911	2017-04-11 02:22:54 +00:00
Daniel Berlin	bf80cfe6b6	Revert "NewGVN: Don't propagate over phi backedges where undef causes us to have >1 value." It's not ready yet this was an accidental commit :( This reverts r299903 llvm-svn: 299904	2017-04-11 00:07:26 +00:00
Daniel Berlin	3938111fe7	NewGVN: Don't propagate over phi backedges where undef causes us to have >1 value. Fixes PR 32607. llvm-svn: 299903	2017-04-11 00:02:38 +00:00
Reid Kleckner	eb9dd5b87f	Reland "[IR] Make AttributeSetNode public, avoid temporary AttributeList copies" This re-lands r299875. I introduced a bug in Clang code responsible for replacing K&R, no prototype declarations with a real function definition with a prototype. The bug was here: // Collect any return attributes from the call. - if (oldAttrs.hasAttributes(llvm::AttributeList::ReturnIndex)) - newAttrs.push_back(llvm::AttributeList::get(newFn->getContext(), - oldAttrs.getRetAttributes())); + newAttrs.push_back(oldAttrs.getRetAttributes()); Previously getRetAttributes() carried AttributeList::ReturnIndex in its AttributeList. Now that we return the AttributeSetNode* directly, it no longer carries that index, and we call this overload with a single node: AttributeList::get(LLVMContext&, ArrayRef<AttributeSetNode*>) That aborted with an assertion on x86_32 targets. I added an explicit triple to the test and added CHECKs to help find issues like this in the future sooner. llvm-svn: 299899	2017-04-10 23:31:05 +00:00
Davide Italiano	f58a30236b	[NewGVN] Surround with parens to clarify allegedly ambiguous precedence. This Placates GCC7 with -Werror. Also, clang-format the assertions while I'm here. llvm-svn: 299895	2017-04-10 23:08:35 +00:00
Davide Italiano	fa6a0a819d	[MemorySSA] We don't need to compute dominator levels anymore. Differential Revision: https://reviews.llvm.org/D31818 llvm-svn: 299893	2017-04-10 22:44:46 +00:00
Matt Arsenault	3c1fc768ed	Allow DataLayout to specify addrspace for allocas. LLVM makes several assumptions about address space 0. However, alloca is presently constrained to always return this address space. There's no real way to avoid using alloca, so without this there is no way to opt out of these assumptions. The problematic assumptions include: - That the pointer size used for the stack is the same size as the code size pointer, which is also the maximum sized pointer. - That 0 is an invalid, non-dereferencable pointer value. These are problems for AMDGPU because alloca is used to implement the private address space, which uses a 32-bit index as the pointer value. Other pointers are 64-bit and behave more like LLVM's notion of generic address space. By changing the address space used for allocas, we can change our generic pointer type to be LLVM's generic pointer type which does have similar properties. llvm-svn: 299888	2017-04-10 22:27:50 +00:00
Dehao Chen	d4a3397861	Emit less compiler optimization remarks in samplepgo to reduce a call to findCalleeFunctionSamples which is going to be refactored. Summary: Now the SamplePGO support is more stable, we do not need so many verbose optimization remarks emitted. Reviewers: dnovillo, davidxl Reviewed By: davidxl Subscribers: fhahn, llvm-commits Differential Revision: https://reviews.llvm.org/D31826 llvm-svn: 299883	2017-04-10 20:49:16 +00:00
Geoff Berry	635e505675	[GVNHoist] Call isGuaranteedToTransferExecutionToSuccessor on each instruction w.r.t. https://bugs.llvm.org/show_bug.cgi?id=32153 The consensus seems to be isGuaranteedToTransferExecutionToSuccessor should be called for each function. Patch by Aditya Kumar Differential Revision: https://reviews.llvm.org/D31035 llvm-svn: 299882	2017-04-10 20:45:17 +00:00
Evgeniy Stepanov	ed7fce7c84	Revert "[asan] Put ctor/dtor in comdat." This reverts commit r299696, which is causing mysterious test failures. llvm-svn: 299880	2017-04-10 20:36:36 +00:00
Evgeniy Stepanov	ba7c2e9661	Revert "[asan] Fix dead stripping of globals on Linux." This reverts commit r299697, which caused a big increase in object file size. llvm-svn: 299879	2017-04-10 20:36:30 +00:00
Reid Kleckner	211b1f324f	Revert "[IR] Make AttributeSetNode public, avoid temporary AttributeList copies" This reverts r299875. A Linux bot came back with a test failure: http://bb.pgr.jp/builders/test-clang-i686-linux-RA/builds/741/steps/test_clang/logs/Clang%20%3A%3A%20CodeGen__2006-05-19-SingleEltReturn.c llvm-svn: 299878	2017-04-10 20:34:19 +00:00
Reid Kleckner	324c99dee5	[IR] Make AttributeSetNode public, avoid temporary AttributeList copies Summary: AttributeList::get(Fn|Ret|Param)Attributes no longer creates a temporary AttributeList just to hide the AttributeSetNode type. I've also added a factory method to create AttributeLists from a parallel array of AttributeSetNodes. I think this simplifies construction of AttributeLists when rewriting function prototypes. Previously we would test if a particular index had attributes, and conditionally add a temporary attribute list to a vector. Now the attribute set vector is parallel to the argument vector that these passes already construct. My long term vision is to wrap AttributeSetNode* inside an AttributeSet type that holds the enum attributes, but that will come in a follow up change. I haven't done any performance measurements for this change because profiling hasn't shown that any of the affected code is hot. Reviewers: pete, chandlerc, sanjoy, hfinkel Reviewed By: pete Subscribers: jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D31198 llvm-svn: 299875	2017-04-10 20:18:10 +00:00
Sanjay Patel	e4159d2238	[InstCombine] improve variable names; NFCI llvm-svn: 299871	2017-04-10 19:38:36 +00:00
Matt Arsenault	daa08875b3	[MemCpyOpt] Only replace memcpy with bitcast if address spaces match Patch by James Price llvm-svn: 299866	2017-04-10 19:00:25 +00:00
Daniel Berlin	74603a68ef	MemorySSA: Make lifetime starts defs for mustaliased pointers Summary: While we don't want them aliasing with other pointers, there seems to be no point in not having them clobber must-aliased pointers. If some day we split the aliasing and ordering chains, we'd make this not aliasing but an ordering barrier (i.e., it doesn't affect its memory, but we can't hoist it above it). Reviewers: hfinkel, george.burgess.iv Subscribers: Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D31865 llvm-svn: 299865	2017-04-10 18:46:00 +00:00
Craig Topper	0d830ff7bf	[InstCombine] Use commutable matchers and m_OneUse in visitSub to shorten code. Add missing test cases. In one case I removed commute handling for a multiply with a constant since we'll eventually get the constant on the right hand side. llvm-svn: 299863	2017-04-10 18:09:25 +00:00
Craig Topper	98851adc2a	[InstCombine] Use m_c_Add to shorten some code. Add testcases for this fold since they were missing. NFC llvm-svn: 299853	2017-04-10 16:59:40 +00:00
Sanjay Patel	570e35c157	[InstCombine] fix matching of or-of-icmps constants (PR32524) Also, make the same change in and-of-icmps and remove a hack for detecting that case. Finally, add some FIXME comments because the code duplication here is awful. This should fix the remaining IR problem noted in: https://bugs.llvm.org/show_bug.cgi?id=32524 llvm-svn: 299851	2017-04-10 16:55:57 +00:00
Craig Topper	3eec73e20b	[InstCombine] Support folding of add instructions with vector constants into select operations We currently only fold scalar add of constants into selects. This improves this to support vectors too. Differential Revision: https://reviews.llvm.org/D31683 llvm-svn: 299847	2017-04-10 16:40:00 +00:00
Craig Topper	31cc143b51	[InstCombine] Use commutable and/or/xor matchers to simplify some code Summary: This is my first time using the commutable matchers so wanted to make sure I was doing it right. Are there any other matcher tricks to further shrink this? Can we commute the whole match so we don't have to LHS and RHS separately? Reviewers: davide, spatel Reviewed By: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31680 llvm-svn: 299840	2017-04-10 07:13:40 +00:00
Craig Topper	838d13e7ee	[InstCombine] Make sure we preserve fast math flags when folding fp instructions into phi nodes Summary: I noticed in the select folding code that we copied fast math flags, but did not do the same for the similar handling in phi nodes. This patch fixes that to do the same thing as select Reviewers: spatel, davide, majnemer, hfinkel Reviewed By: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31690 llvm-svn: 299838	2017-04-10 07:00:10 +00:00
Craig Topper	d8840d7b10	[InstCombine] use m_c_And and m_c_Xor to handle commuted versions of a transform. llvm-svn: 299837	2017-04-10 06:53:28 +00:00
Craig Topper	7639460367	[InstCombine] Remove unnecessary dyn_cast to BinaryOperator around some matcher checks in visitXor. The matchers themselves should be enough. llvm-svn: 299835	2017-04-10 06:53:23 +00:00
Craig Topper	4738321f0c	[InstCombine] Make the (A|B)^B -> A & ~B transform code consistent with the very similar (A&B)^B -> ~A & B code. This should be NFC except for the addition of hasOneUse check. I think this code is still overly complicated and should use matchers, but first I wanted to make it consistent. llvm-svn: 299834	2017-04-10 06:53:21 +00:00
Craig Topper	4f16d82d6b	[InstCombine] Use m_OneUse to shorten some code. NFC llvm-svn: 299833	2017-04-10 06:53:19 +00:00
Xin Tong	34888c08bc	[SCCP] Resolve indirect branch target when possible. Summary: Resolve indirect branch target when possible. This potentially eliminates more basicblocks and result in better evaluation for phi and other things. Reviewers: davide, efriedma, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30322 llvm-svn: 299830	2017-04-10 00:33:25 +00:00
Sanjay Patel	16a054d5c7	[InstCombine] remove dead cases from icmp pair switches; NFCI "PredicatesFoldable" returns false for signed/unsigned mismatched pairs, so these cases should never exist. We'll default to 'unreachable' on those predicate combos instead. Most of what's left in these switches belongs in InstSimplify (and may already be there), so there's probably more that can be done to reduce this code. llvm-svn: 299829	2017-04-09 21:51:34 +00:00
Davide Italiano	612d5a9c5c	[Mem2Reg] Remove AliasSetTracker updating logic from the pass. No caller has been passing it for a long time. llvm-svn: 299827	2017-04-09 20:47:14 +00:00
Hal Finkel	a9d67cf601	[MemorySSA] Fix use of pointsToConstantMemory in isUseTriviallyOptimizableToLiveOnEntry In isUseTriviallyOptimizableToLiveOnEntry, pointsToConstantMemory needs to be called on the load's pointer operand, not on the result of the load (which might not even be a pointer). llvm-svn: 299823	2017-04-09 12:57:50 +00:00
Craig Topper	afa07c5ef6	[InstCombine] Extend some OR combines to support vectors. This adds support for these combines for vectors (X^C)|Y -> (X|Y)^C iff Y&C == 0 Y|(X^C) -> (X|Y)^C iff Y&C == 0 llvm-svn: 299822	2017-04-09 06:12:41 +00:00
Craig Topper	e63c21b1ba	[InstCombine] Extend a canonicalization check to apply to vector constants too. llvm-svn: 299821	2017-04-09 06:12:39 +00:00
Craig Topper	437c97622b	[InstCombine] Use the SubOne helper function to shorten some code. NFC llvm-svn: 299819	2017-04-09 06:12:34 +00:00
Craig Topper	9d1821b262	[InstCombine] rename variable for easier reading; NFC We usually give constants a 'C' somewhere in the name... llvm-svn: 299818	2017-04-09 06:12:31 +00:00
Gor Nishanov	bfb2a9db31	[coroutines] Make CoroSplit pass deterministic coro-split-after-phi.ll test was flaky due to non-determinism in the coroutine frame construction that was sorting the spill vector using a pointer to a def as a part of the key. The sorting was intended to make sure that spills for the same def are kept together, however, we populate the vector by processing defs in order, so the spill entries will end up together anyway. This change removes spill sorting and restores the determinism in the test. llvm-svn: 299809	2017-04-08 00:49:46 +00:00
Evgeniy Stepanov	349adbacca	[cfi] Take over existing __cfi_check in CrossDSOCFI. https://reviews.llvm.org/D31796 will emit a dummy __cfi_check in the frontend. llvm-svn: 299805	2017-04-07 23:00:20 +00:00
Daniel Berlin	a823656ce7	NewGVN: Make CongruenceClass a real class in preparation for splitting NewGVN into analysis and eliminator. llvm-svn: 299792	2017-04-07 18:38:09 +00:00
Gor Nishanov	138ad6c9c0	[coroutines] Insert spills of PHI instructions correctly Summary: Fix a bug where we were inserting a spill in between the PHIs in the beginning of the block. Consider this fragment: ``` begin: %phi1 = phi i32 [ 0, %entry ], [ 2, %alt ] %phi2 = phi i32 [ 1, %entry ], [ 3, %alt ] %sp1 = call i8 @llvm.coro.suspend(token none, i1 false) switch i8 %sp1, label %suspend [i8 0, label %resume i8 1, label %cleanup] resume: call i32 @print(i32 %phi1) ``` Unless we are spilling the argument or result of the invoke, we were always inserting the spill immediately following the instruction. The fix adds a check that if the spilled instruction is a PHI Node, select an appropriate insert point with `getFirstInsertionPt()` that skips all the PHI Nodes and EH pads. Reviewers: majnemer, rnk Reviewed By: rnk Subscribers: qcolombet, EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D31799 llvm-svn: 299771	2017-04-07 14:16:49 +00:00
Matthew Simpson	11fe2e9f2b	Reapply r298620: [LV] Vectorize GEPs This patch reapplies r298620. The original patch was reverted because of two issues. First, the patch exposed a bug in InstCombine that caused the Chromium builds to fail (PR32414). This issue was fixed in r299017. Second, the patch introduced a bug in the vectorizer's scalars analysis that caused test suite builds to fail on SystemZ. The scalars analysis was too aggressive and marked a memory instruction scalar, even though it was going to be vectorized. This issue has been fixed in the current patch and several new test cases for the scalars analysis have been added. llvm-svn: 299770	2017-04-07 14:15:34 +00:00
Craig Topper	33e0dbcc58	[InstCombine] Handle more commuted cases of ((A & B) | ~A) -> (~A | B) llvm-svn: 299747	2017-04-07 07:32:00 +00:00
Daniel Berlin	d952ceae2f	AliasAnalysis: Be less conservative about volatile than atomic. Summary: getModRefInfo is meant to answer the question "what impact does this instruction have on a given memory location" (not even another instruction). Long debate on this on IRC comes to the conclusion the answer should be "nothing special". That is, a noalias volatile store does not affect a memory location just by being volatile. Note: DSE and GVN and memdep currently believe this, because memdep just goes behind AA's back after it says "modref" right now. see line 635 of memdep. Prior to this patch we would get modref there, then check aliasing, and if it said noalias, we would continue. getModRefInfo already has this same AA check, it just wasn't being used because volatile was lumped in with ordering. (I am separately testing whether this code in memdep is now dead except for the invariant load case) Reviewers: jyknight, chandlerc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31726 llvm-svn: 299741	2017-04-07 01:28:36 +00:00
Craig Topper	72a622cac7	[InstCombine] Add more commuted patterns to support folding ((~A & B) | A) -> (A | B). llvm-svn: 299737	2017-04-07 00:29:47 +00:00
Craig Topper	a521c30dc6	[InstCombine] Remove testing assert I accidentally left in r299710. llvm-svn: 299715	2017-04-06 21:29:43 +00:00
Craig Topper	b4da6840d8	[InstCombine] When checking to see if we can turn subtracts of 2^n - 1 into xor, we only need to call computeKnownBits on the RHS not the whole subtract. While there use isMask instead of isPowerOf2(C+1) Calling computeKnownBits on the RHS should allow us to recurse one step further. isMask is equivalent to the isPowerOf2(C+1) except in the case where C is all ones. But that was already handled earlier by creating a not which is an Xor with all ones. So this should be fine. llvm-svn: 299710	2017-04-06 21:06:03 +00:00
Rong Xu	2bf4c59025	[PGO] Preserve GlobalsAA in pgo-memop-opt pass. Preserve GlobalsAA analysis in memory intrinsic calls optimization based on profiled size. llvm-svn: 299707	2017-04-06 20:56:00 +00:00
Craig Topper	7226d796aa	[InstCombine] Remove redundant combine from visitAnd This combine is fully handled by SimplifyDemandedInstructionBits as of r299658 where I fixed this code to ensure the Add/Sub had only a single user. Otherwise it would fire and create additional instructions. That fix resulted in an improvement to code generated for tsan which is why I committed it before deleting. Differential Revision: https://reviews.llvm.org/D31543 llvm-svn: 299704	2017-04-06 20:41:48 +00:00
Mehdi Amini	db11fdfda5	Revert "Turn some C-style vararg into variadic templates" This reverts commit r299699, the examples need to be updated. llvm-svn: 299702	2017-04-06 20:23:57 +00:00
Mehdi Amini	579540a8f7	Turn some C-style vararg into variadic templates Module::getOrInsertFunction is using C-style vararg instead of variadic templates. From a user perspective, it forces the use of an annoying nullptr to mark the end of the vararg, and there's no type checking on the arguments. The variadic template is an obvious solution to both issues. Patch by: Serge Guelton <serge.guelton@telecom-bretagne.eu> Differential Revision: https://reviews.llvm.org/D31070 llvm-svn: 299699	2017-04-06 20:09:31 +00:00
Evgeniy Stepanov	6c3a8cbc4d	[asan] Fix dead stripping of globals on Linux. Use a combination of !associated, comdat, @llvm.compiler.used and custom sections to allow dead stripping of globals and their asan metadata. Sometimes. Currently this works on LLD, which supports SHF_LINK_ORDER with sh_link pointing to the associated section. This also works on BFD, which seems to treat comdats as all-or-nothing with respect to linker GC. There is a weird quirk where the "first" global in each link is never GC-ed because of the section symbols. At this moment it does not work on Gold (as in the globals are never stripped). This is a re-land of r298158 rebased on D31358. This time, asan.module_ctor is put in a comdat as well to avoid quadratic behavior in Gold. llvm-svn: 299697	2017-04-06 19:55:17 +00:00
Evgeniy Stepanov	5dfe420d10	[asan] Put ctor/dtor in comdat. When possible, put ASan ctor/dtor in comdat. The only reason not to is global registration, which can be TU-specific. This is not the case when there are no instrumented globals. This is also limited to ELF targets, because MachO does not have comdat, and COFF linkers may GC comdat constructors. The benefit of this is a lot less __asan_init() calls: one per DSO instead of one per TU. It's also necessary for the upcoming gc-sections-for-globals change on Linux, where multiple references to section start symbols trigger quadratic behaviour in gold linker. This is a rebase of r298756. llvm-svn: 299696	2017-04-06 19:55:13 +00:00
Evgeniy Stepanov	039af609f1	[asan] Delay creation of asan ctor. Create the constructor in the module pass. This is needed for the GC-friendly globals change, where the constructor can be put in a comdat in some cases, but we don't know about that in the function pass. This is a rebase of r298731 which was reverted due to a false alarm. llvm-svn: 299695	2017-04-06 19:55:09 +00:00
Keno Fischer	bacc64b5fa	[StripDeadDebugInfo] Drop dead CUs entirely Summary: Prior to this while it would delete the dead DIGlobalVariables, it would leave dead DICompileUnits and everything referenced therefrom. For a big bitcode file with thousands of compile units those dead nodes easily outnumbered the real ones. Clean that up. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D31720 llvm-svn: 299692	2017-04-06 19:26:22 +00:00
Daniel Berlin	21279bd37a	NewGVN: Rename some functions for consistency llvm-svn: 299685	2017-04-06 18:52:58 +00:00
Daniel Berlin	08fe6e0f74	NewGVN: Fixup some small issues llvm-svn: 299684	2017-04-06 18:52:55 +00:00
Daniel Berlin	5845e0549e	NewGVN: Fix a small formatting issue in performSymbolicLoadEvaluation. llvm-svn: 299683	2017-04-06 18:52:53 +00:00
Daniel Berlin	1316a94ebc	NewGVN: This patch makes memory congruence work for all types of memorydefs, not just stores. Along the way, we audit and fixup issues about how we were tracking memory leaders, and improve the verifier to notice more memory congruency issues. llvm-svn: 299682	2017-04-06 18:52:50 +00:00
Craig Topper	3fc1225c18	[InstCombine] Fix a case where we weren't checking that an instruction had a single use resulting in extra instructions being created. llvm-svn: 299658	2017-04-06 16:42:46 +00:00
Daniel Berlin	d7a7ae061f	MemorySSA: Remove MemorySSA walker caching. Summary: Remove all the caching the clobber walker does, and that the caching walker does. With the patch to enable storing clobbering access results for stores, I can find no improvement with the cache turned on (and a number of degradations, both time and memory, from the cost of caching; for a large program I have, we do millions of lookups and inserts with zero hits). I haven't tried to rename or simplify the walker otherwise yet. (Appreciate some perf testing on this past my own testing) Reviewers: george.burgess.iv, davide Subscribers: Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D31576 llvm-svn: 299578	2017-04-05 19:01:58 +00:00
Sanjay Patel	50c82c4395	[InstCombine] add fold for icmp with or mask of low bits (PR32542) We already have these 'and' folds: // X & -C == -C -> X > u ~C // X & -C != -C -> X <= u ~C // iff C is a power of 2 ...but we were missing the 'or' siblings. http://rise4fun.com/Alive/n6 This should improve: https://bugs.llvm.org/show_bug.cgi?id=32524 ...but there are 2 or more other pieces to fix still. Differential Revision: https://reviews.llvm.org/D31712 llvm-svn: 299570	2017-04-05 17:57:05 +00:00
Sanjay Patel	519a87a468	[InstCombine] fix formatting and variable names; NFCI There must be some opportunity to refactor big chunks of nearly duplicated code in FoldOrOfICmps / FoldAndOfICmps. Also, none of this works with vectors, but it should. llvm-svn: 299568	2017-04-05 17:38:34 +00:00
Daniel Berlin	3082b8e062	MemorySSA: Fix and use optimized_def_chain llvm-svn: 299566	2017-04-05 17:26:25 +00:00
Akira Hatanaka	75be84f3c2	[ObjCArc] Do not dereference an invalidated iterator. Fix a bug in ARC contract pass where an iterator that pointed to a deleted instruction was dereferenced. It appears that tryToContractReleaseIntoStoreStrong was incorrectly assuming that a call to objc_retain would not immediately follow a call to objc_release. rdar://problem/25276306 llvm-svn: 299507	2017-04-05 03:44:09 +00:00
Bob Haarman	6de8134784	ThinLTOBitcodeWriter: handle aliases first in filterModule Summary: This change fixes a "local linkage requires default visibility" assert when attempting to build LLVM with ThinLTO on Windows. Reviewers: pcc, tejohnson, mehdi_amini Reviewed By: pcc Subscribers: llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D31632 llvm-svn: 299491	2017-04-05 00:42:07 +00:00
Daniel Berlin	e33bc31df4	Re-apply MemorySSA: Add support for caching clobbering access in stores with some fixes. Summary: This enables us to cache the clobbering access for stores, despite the fact that we can't rewrite the use-def chains themselves. Early testing shows that, after this change, for larger testcases, it will be a significant net positive (memory and time) to remove the walker caching. Reviewers: george.burgess.iv, davide Subscribers: Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D31567 llvm-svn: 299486	2017-04-04 23:43:10 +00:00
Daniel Berlin	f49d4c45a1	Revert "MemorySSA: Add support for caching clobbering access in stores" This reverts revision r299322. llvm-svn: 299485	2017-04-04 23:43:04 +00:00
Sanjay Patel	0bf0abedf6	[InstCombine] rename variable for easier reading; NFC We usually give constants a 'C' somewhere in the name... llvm-svn: 299474	2017-04-04 22:06:03 +00:00
Craig Topper	c745b6a1f6	[InstCombine] Turn subtract of vectors of i1 into xor like we do for scalar i1. Matches what we already do for add. llvm-svn: 299472	2017-04-04 21:44:56 +00:00
Craig Topper	86173600ec	[InstCombine] Support folding and/or/xor with a constant vector RHS into selects and phis Currently we only fold with ConstantInt RHS. This generalizes to any Constant RHS. Differential Revision: https://reviews.llvm.org/D31610 llvm-svn: 299466	2017-04-04 20:26:25 +00:00
Rong Xu	48596b6f7a	[PGO] Memory intrinsic calls optimization based on profiled size This patch optimizes two memory intrinsic operations: memset and memcpy based on the profiled size of the operation. The high level transformation is like: mem_op(..., size) ==> switch (size) { case s1: mem_op(..., s1); goto merge_bb; case s2: mem_op(..., s2); goto merge_bb; ... default: mem_op(..., size); goto merge_bb; } merge_bb: Differential Revision: http://reviews.llvm.org/D28966 llvm-svn: 299446	2017-04-04 16:42:20 +00:00
Craig Topper	e06b6bcfa1	[InstCombine] Use setAllBits in place of getAllOnesValue since we know the bitwidths are the same. NFCI llvm-svn: 299413	2017-04-04 05:03:02 +00:00
Zvi Rackover	82bf48d8b9	InstCombine: Use the InstSimplify hook for shufflevector Summary: Start using the recently added InstSimplify hook for shuffles in the respective InstCombine visitor. Reviewers: spatel, RKSimon, craig.topper, majnemer Reviewed By: majnemer Subscribers: majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D31526 llvm-svn: 299412	2017-04-04 04:47:57 +00:00
Craig Topper	1604f0773b	[InstCombine] Remove canonicalization for (X & C1) | C2 --> (X | C2) & (C1|C2) when C1 & C2 have common bits. It turns out that SimplifyDemandedInstructionBits will get called earlier and remove bits from C1 first. Effectively doing (X & (C1&C2)) | C2. So by the time it got to this check there could be no common bits. I think the DAGCombiner has the same check but its check can be executed because it handles demanded bits later. I'll look at it next. llvm-svn: 299384	2017-04-03 20:41:47 +00:00
Craig Topper	3882613956	[DAGCombine][InstCombine] Fix inverted if condition in equivalent comments in DAGCombine and InstCombine. NFC llvm-svn: 299378	2017-04-03 19:18:48 +00:00
Craig Topper	79120e80b8	Revert r299337 "[InstCombine] Remove redundant combine from visitAnd" One of the tsan bots started failing at this commit. I don't see anything obviously wrong with the commit so trying this to see if it recovers. Failing log: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/6792 llvm-svn: 299366	2017-04-03 17:22:23 +00:00
Sanjay Patel	77bf622db6	[InstCombine] fix formatting for foldLogOpOfMaskedICmps and related bits; NFCI 1. Improve enum, function, and variable names. 2. Improve comments. 3. Fix variable capitalization. 4. Run clang-format. As an existing code comment suggests, this should work with vector types / splat constants too, so making this look right first will reduce the diffs needed for that change. llvm-svn: 299365	2017-04-03 16:53:12 +00:00
Craig Topper	d33ee1b960	[APInt] Move isMask and isShiftedMask out of APIntOps and into the APInt class. Implement them without memory allocation for multiword This moves the isMask and isShiftedMask functions to be class methods. They now use the MathExtras.h function for single word size and leading/trailing zeros/ones or countPopulation for the multiword size. The previous implementation made multiple temporary memory allocations to do the bitwise arithmetic operations to match the MathExtras.h implementation. Differential Revision: https://reviews.llvm.org/D31565 llvm-svn: 299362	2017-04-03 16:34:59 +00:00
Craig Topper	d0b053d229	[InstCombine] Make foldOpWithConstantIntoOperand take a BinaryOperator instead of a generic Instruction. It blindly assumes there are two operands so make it explicit. llvm-svn: 299351	2017-04-03 07:08:08 +00:00
Craig Topper	07944f891c	[InstCombine] Remove a And transform that should be handled by SimplifyDemandedInstructionBits. NFCI llvm-svn: 299349	2017-04-03 06:02:09 +00:00
Craig Topper	70e4f434ae	[InstCombine] Make InstCombiner::OptAndOp take a BinaryOperator instead of an Instruction. The callers have already performed the necessary cast before calling. This allows us to remove a comment that says the instruction must be a BinaryOperator and make it explicit in the argument type. Had to add a default case to the switch because BinaryOperator::getOpcode() returns a BinaryOps enum. llvm-svn: 299339	2017-04-02 17:57:30 +00:00
Craig Topper	d133591a7e	[InstCombine] Remove redundant combine from visitAnd As far as I can tell this combine is fully handled by SimplifyDemandedInstructionBits. I was only looking at this because it is the only user of APIntOps::isShiftedMask which is itself broken. As demonstrated by r299187. I was going to fix isShiftedMask and needed to make sure we had coverage for the new cases it would expose to this combine. But looks like we can nuke it instead. Differential Revision: https://reviews.llvm.org/D31543 llvm-svn: 299337	2017-04-02 17:34:30 +00:00
Daniel Berlin	07daac8a36	NewGVN: Handle coercion of constant stores, loads, memory insts. Summary: Depends on D30928. This adds support for coercion of stores and memory instructions that do not require insertion to process. Another few tests down. I added the relevant tests from rle.ll Reviewers: davide Subscribers: llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D30929 llvm-svn: 299330	2017-04-02 13:23:44 +00:00
Nikolai Bozhenov	fca527af5c	[BypassSlowDivision] Do not bypass division of hash-like values Disable bypassing if one of the operands looks like a hash value. Slow division often occurs in hashtable implementations and fast division is never taken there because a hash value is extremely unlikely to have enough upper bits set to zero. A value is considered to be hash-like if it is produced by 1) XOR operation 2) Multiplication by a constant wider than the shorter type 3) PHI node with all incoming values being hash-like Differential Revision: https://reviews.llvm.org/D28200 llvm-svn: 299329	2017-04-02 13:14:30 +00:00
Daniel Berlin	8a00270838	MemorySSA: Add support for caching clobbering access in stores Summary: This enables us to cache the clobbering access for stores, despite the fact that we can't rewrite the use-def chains themselves. Early testing shows that, after this change, for larger testcases, it will be a significant net positive (memory and time) to remove the walker caching. Reviewers: george.burgess.iv, davide Subscribers: Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D31567 llvm-svn: 299322	2017-04-02 05:09:15 +00:00
Daniel Berlin	9a9c9ff260	NewGVN: Don't try to kill off the stored value of stores when processing the congruence class of the store. Because we use the stored value of a store as the def, it isn't dead just because it appears as a def when it comes from a store. Note: I have not hit any cases with the memory code as it is where this breaks anything, just because of what memory congruences we actually allow. In a followup that improves memory congruence, this bug actually breaks real stuff (but the verifier catches it). llvm-svn: 299300	2017-04-01 09:44:33 +00:00
Daniel Berlin	9b4984926c	NewGVN: Clean up GVNExpression memory hierarchy, restructure hash computation a bit so we don't have to redefine it for loads, stores, and calls llvm-svn: 299299	2017-04-01 09:44:29 +00:00
Daniel Berlin	871ecd90ca	NewGVN: Use def_chain iterator in singleReachablePhiPath instead of recursion llvm-svn: 299298	2017-04-01 09:44:24 +00:00
Daniel Berlin	07275c3065	Move def_chain iterator to MemorySSA.h so it can be reused llvm-svn: 299297	2017-04-01 09:44:19 +00:00
Daniel Berlin	d042031f0f	MemorySSA: Push const correctness further. llvm-svn: 299295	2017-04-01 09:01:12 +00:00
Daniel Berlin	7500c5641e	MemorySSA: Kill the WalkTargetCache now that we have getBlockDefs. llvm-svn: 299294	2017-04-01 08:59:45 +00:00
Craig Topper	47fd2de304	[APInt] Fix bugs in isShiftedMask to match behavior of the similar function in MathExtras.h This removes a parameter from the routine that was responsible for a lot of the issue. It was a bit count that had to be set to the BitWidth of the APInt and would get passed to getLowBitsSet. This guaranteed the call to getLowBitsSet would create an all ones value. This was then compared to (V | (V-1)). So the only shifted masks we detected had to have the MSB set. The one in tree user is a transform in InstCombine that never fires due to earlier transforms covering the case better. I've submitted a patch to remove it completely, but for now I've just adapted it to the new interface for isShiftedMask. llvm-svn: 299273	2017-03-31 22:23:42 +00:00
Craig Topper	e625d74271	[InstCombine] When adding an Instruction and its Users to the worklist at the same time, make sure we put the Users in first. Then put in the instruction. This way we ensure we immediately revisit the instruction and do any additional optimizations before visiting the users. Otherwise we might visit the users, then the instruction, then users again, then instruction again. llvm-svn: 299267	2017-03-31 21:35:30 +00:00
Craig Topper	885fa12e8a	[APInt] Remove shift functions from APIntOps namespace. Replace the few users with the APInt class methods. NFCI llvm-svn: 299248	2017-03-31 20:01:16 +00:00
Joerg Sonnenberger	28bed106e0	Do not translate rint into nearbyint, but truncate it like nearbyint. A common way to implement nearbyint is by fiddling with the floating point environment and calling rint. This is used at least by the BSD libm and musl. As such, canonicalizing the latter to the former will create infinite loops for libm and generally pessimize performance, at least when the generic C versions are used. This change preserves the rint in the libcall translation and also handles the domain truncation logic, so that rint with float argument will be reduced to rintf etc. llvm-svn: 299247	2017-03-31 19:58:07 +00:00
Dehao Chen	fed890ea3a	Fix the InstCombine to preserve the VP metadata and set correct call count. Summary: Currently the VP metadata was dropped when InstCombine converts a call to direct call. This patch converts the VP metadata to branch_weights so that its hotness is recorded. Reviewers: eraman, davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31344 llvm-svn: 299228	2017-03-31 15:59:52 +00:00
Mikael Holmen	79235bd4d8	[Scalarizer] Handle scalar arguments in vector GEP Summary: Triggered by commit r298620: "[LV] Vectorize GEPs". If we encounter a vector GEP with scalar arguments, we splat the scalar into a vector of appropriate size before we scatter the argument. Reviewers: arsenm, mehdi_amini, bkramer Reviewed By: arsenm Subscribers: bjope, mssimpso, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D31416 llvm-svn: 299186	2017-03-31 06:29:49 +00:00
Peter Collingbourne	6b193966ac	ThinLTOBitcodeWriter: Use Module::global_values(). NFCI. llvm-svn: 299132	2017-03-30 23:43:08 +00:00
Craig Topper	79e5bc528d	[InstCombine] Fix typo last->least. NFC llvm-svn: 299123	2017-03-30 22:28:55 +00:00
Matt Arsenault	79f837c254	AMDGPU: Add all atomicrmw fields to atomic.inc/dec Add scope, order, isVolatile llvm-svn: 299122	2017-03-30 22:21:40 +00:00
Hongbin Zheng	bfd7c38de7	[SimplifyIndvar] Replace the sdiv used by IV if we can prove both of its operands are non-negative Since there is no sdiv in SCEV, an 'udiv' is a better canonical form than an 'sdiv' as the user of induction variable Differential Revision: https://reviews.llvm.org/D31488 llvm-svn: 299118	2017-03-30 21:56:56 +00:00
Simon Pilgrim	68168d17b9	Spelling mistakes in comments. NFCI. Based on corrections mentioned in patch for clang for PR27635 llvm-svn: 299072	2017-03-30 12:59:53 +00:00
Matthew Simpson	c8f0aeccda	[InstCombine] Correct the check for vector GEPs Some of the GEP combines (e.g., descaling) can't handle vector GEPs. We have an existing check that attempts to bail out if given a vector GEP. However, the check only tests the GEP's pointer operand. A GEP results in a vector of pointers if at least one of its operands is vector-typed (e.g., its pointer operand could be a scalar, but its index could be a vector). We should just check the type of the GEP itself. This should fix PR32414. Reference: https://bugs.llvm.org/show_bug.cgi?id=32414 Differential Revision: https://reviews.llvm.org/D31470 llvm-svn: 299017	2017-03-29 18:23:08 +00:00
Filipe Cabecinhas	8b94273fe6	Cleanup in preparation for D30703. NFCI Make the enumerators follow the coding convention and start with OW_... llvm-svn: 298996	2017-03-29 14:42:27 +00:00
Anna Thomas	923e574bff	[InstCombine] For select rule, use positive check of constant int for select operand. NFCI llvm-svn: 298906	2017-03-28 09:32:24 +00:00
Alex Shlyapnikov	bbd5cc63d7	Revert "[asan] Delay creation of asan ctor." Speculative revert. Some libfuzzer tests are affected. This reverts commit r298731. llvm-svn: 298890	2017-03-27 23:11:50 +00:00
Alex Shlyapnikov	09171aa31f	Revert "[asan] Put ctor/dtor in comdat." Speculative revert, some libfuzzer tests are affected. This reverts commit r298756. llvm-svn: 298889	2017-03-27 23:11:47 +00:00
Matthew Simpson	b8ff4a4a70	[LV] Transform truncations of non-primary induction variables The vectorizer tries to replace truncations of induction variables with new induction variables having the smaller type. After r295063, this optimization was applied to all integer induction variables, including non-primary ones. When optimizing the truncation of a non-primary induction variable, we still need to transform the new induction so that it has the correct start value. This should fix PR32419. Reference: https://bugs.llvm.org/show_bug.cgi?id=32419 llvm-svn: 298882	2017-03-27 20:07:38 +00:00
Anna Thomas	f57ae33381	[InstCombine] Avoid incorrect folding of select into phi nodes when incoming element is a vector type Summary: We are incorrectly folding selects into phi nodes when the incoming value of a phi node is a constant vector. This optimization is done in `FoldOpIntoPhi` when the select condition is a phi node with constant incoming values. Without the fix, we are miscompiling (i.e. incorrectly folding the select into the phi node) when the vector contains non-zero elements. This patch fixes the miscompile and we will correctly fold based on the select vector operand (see added test cases). Reviewers: majnemer, sanjoy, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31189 llvm-svn: 298845	2017-03-27 13:52:51 +00:00
Serge Pavlov	b71bb80c2d	[LoopUnroll] Remap references in peeled iteration References in cloned blocks must be remapped prior to dominator calculation. Differential Revision: https://reviews.llvm.org/D31281 llvm-svn: 298811	2017-03-26 16:46:53 +00:00
Joerg Sonnenberger	fa7367428a	Split the SimplifyCFG pass into two variants. The first variant contains all current transformations except transforming switches into lookup tables. The second variant contains all current transformations. The switch-to-lookup-table conversion results in code that is more difficult to analyze and optimize by other passes. Most importantly, it can inhibit Dead Code Elimination. As such it is often beneficial to only apply this transformation very late. A common example is inlining, which can often result in range restrictions for the switch expression. Changes in execution time according to LNT: SingleSource/Benchmarks/Misc/fp-convert +3.03% MultiSource/Benchmarks/ASC_Sequoia/CrystalMk/CrystalMk -11.20% MultiSource/Benchmarks/Olden/perimeter/perimeter -10.43% and a couple of smaller changes. For perimeter it also results in a 2.6% smaller binary. Differential Revision: https://reviews.llvm.org/D30333 llvm-svn: 298799	2017-03-26 06:44:08 +00:00
Chandler Carruth	0d256c0f5d	[IR] Make SwitchInst::CaseIt almost a normal iterator. This moves it to the iterator facade utilities giving it full random access semantics, etc. It can also now be used with standard algorithms like std::all_of and std::any_of and range adaptors like llvm::reverse. Also make the semantics of iterating match what every other iterator uses and forbid decrementing past the begin iterator. This was used as a hacky way to work around iterator invalidation. However, every instance trying to do this failed to actually avoid touching invalid iterators despite the clear documentation that the removed and all subsequent iterators become invalid including the end iterator. So I've added a return of the next iterator to removeCase and rewritten the loops that were doing this to correctly follow the iterator pattern of either incrementing or removing and assigning fresh values to the iterator and the end. In one case we were trying to go backwards to make this cleaner but it doesn't actually work. I've made that code match the code we use everywhere else to remove cases as we iterate. This changes the order of cases in one test output and I moved that test to CHECK-DAG so it wouldn't care -- the order isn't semantically meaningful anyways. llvm-svn: 298791	2017-03-26 02:49:23 +00:00
Craig Topper	47596dd4cc	[InstCombine] Change the interface of SimplifyDemandedBits so that it takes the instruction and operand instead of the Use. The first thing it did was get the User for the Use to get the instruction back. This requires looking through the Uses for the User using the waymarking walk. That's pretty fast, but it's probably still better to just pass the Instruction we already had. llvm-svn: 298772	2017-03-25 06:52:52 +00:00
Davide Italiano	e9781e7b2f	[NewGVN] Adjust NDEBUG markers. This avoids 'used but not defined' warnings in Release builds with GCC. llvm-svn: 298760	2017-03-25 02:40:02 +00:00
Evgeniy Stepanov	71bb8f1ad0	[asan] Put ctor/dtor in comdat. When possible, put ASan ctor/dtor in comdat. The only reason not to is global registration, which can be TU-specific. This is not the case when there are no instrumented globals. This is also limited to ELF targets, because MachO does not have comdat, and COFF linkers may GC comdat constructors. The benefit of this is a lot less __asan_init() calls: one per DSO instead of one per TU. It's also necessary for the upcoming gc-sections-for-globals change on Linux, where multiple references to section start symbols trigger quadratic behaviour in gold linker. llvm-svn: 298756	2017-03-25 01:01:11 +00:00
Craig Topper	8fbb74b5b2	Revert r298711 "[InstCombine] Provide a way to calculate KnownZero/One for Add/Sub in SimplifyDemandedUseBits without recursing into ComputeKnownBits" Tsan bot is failing. llvm-svn: 298745	2017-03-24 22:12:10 +00:00
Ivan Krasin	c2124e185c	Revert r298620: [LV] Vectorize GEPs Reason: breaks linking Chromium with LLD + ThinLTO (a pass crashes) LLVM bug: https://bugs.llvm.org//show_bug.cgi?id=32413 Original change description: [LV] Vectorize GEPs This patch adds support for vectorizing GEPs. Previously, we only generated vector GEPs on-demand when creating gather or scatter operations. All GEPs from the original loop were scalarized by default, and if a pointer was to be stored to memory, we would have to build up the pointer vector with insertelement instructions. With this patch, we will vectorize all GEPs that haven't already been marked for scalarization. The patch refines collectLoopScalars to more exactly identify the scalar GEPs. The function now more closely resembles collectLoopUniforms. And the patch moves vector GEP creation out of vectorizeMemoryInstruction and into the main vectorization loop. The vector GEPs needed for gather and scatter operations will have already been generated before vectoring the memory accesses. Original Differential Revision: https://reviews.llvm.org/D30710 llvm-svn: 298735	2017-03-24 20:49:43 +00:00
Evgeniy Stepanov	64e872a91f	[asan] Delay creation of asan ctor. Create the constructor in the module pass. This is needed for the GC-friendly globals change, where the constructor can be put in a comdat in some cases, but we don't know about that in the function pass. llvm-svn: 298731	2017-03-24 20:42:15 +00:00
Matt Arsenault	4c7795dd31	AMDGPU: Fold rcp/rsq of undef to undef llvm-svn: 298725	2017-03-24 19:04:57 +00:00
Matt Arsenault	18bb24a1be	TTI: Split IsSimple in MemIntrinsicInfo All this did before was assert in EarlyCSE. llvm-svn: 298724	2017-03-24 18:56:43 +00:00
Teresa Johnson	428b9e0627	[ThinLTO] Correct counting of functions in inliner stats Summary: Declarations need to be filtered out when counting functions. Reviewers: eraman Subscribers: Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D31336 llvm-svn: 298720	2017-03-24 17:59:06 +00:00
Craig Topper	d4521c2fc2	[InstCombine] Provide a way to calculate KnownZero/One for Add/Sub in SimplifyDemandedUseBits without recursing into ComputeKnownBits SimplifyDemandedUseBits for Add/Sub already recursed down LHS and RHS for simplifying bits. If that didn't provide any simplifications we fall back to calling computeKnownBits which will recurse again. Instead just take the known bits for LHS and RHS we already have and call into a new function in ValueTracking that can calculate the known bits given the LHS/RHS bits. llvm-svn: 298711	2017-03-24 16:56:51 +00:00
Benjamin Kramer	46f5e2c47b	Make GCC happy again. llvm-svn: 298702	2017-03-24 14:15:35 +00:00
Daniel Berlin	ffc30781f4	NewGVN: Small cleanup of two dominance related functions to make them easier to understand. llvm-svn: 298692	2017-03-24 06:33:51 +00:00
Daniel Berlin	0e9001131d	NewGVN: Small cleanup of useless expression deletion, and don't uselessly create two expressions in symbolic store evaluation. llvm-svn: 298691	2017-03-24 06:33:48 +00:00
Daniel Berlin	9d0796e5d0	NewGVN: Fix PR32403 - Handling of undef in phis was not quite correct due to LLVM's view of phi nodes. It would cause NewGVN not to fixpoint in some interesting edge cases. llvm-svn: 298687	2017-03-24 05:30:34 +00:00
Craig Topper	36f2e0eee8	[InstCombine] Use range-based for loop. NFC llvm-svn: 298680	2017-03-24 02:58:02 +00:00
Craig Topper	df73e7c5b7	[InstCombine] Fix 80 column violation I accidentally introduced. NFC llvm-svn: 298679	2017-03-24 02:57:59 +00:00
Reid Kleckner	392f062675	[sancov] Don't instrument blocks with no insertion point This prevents crashes when attempting to instrument functions containing C++ try. Sanitizer coverage will still fail at runtime when an exception is thrown through a sancov instrumented function, but that seems marginally better than what we have now. The full solution is to color the blocks in LLVM IR and only instrument blocks that have an unambiguous color, using the appropriate token. llvm-svn: 298662	2017-03-23 23:30:41 +00:00
Dehao Chen	722e94061b	Set the prof weight correctly for call instructions in DeadArgumentElimination. Summary: In DeadArgumentElimination, the call instructions will be replaced. We also need to set the prof weights so that function inlining can find the correct profile. Reviewers: eraman Reviewed By: eraman Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31143 llvm-svn: 298660	2017-03-23 23:26:00 +00:00
Bryant Wong	def79b21e4	[MetaRenamer] Don't rename library functions. Library functions can have specific semantics that affect the behavior of certain passes. DSE, for instance, gives special treatment to malloc-ed pointers but not to pointers returned from an equivalently typed (but differently named) function. MetaRenamer ought not to alter program semantics, so library functions must remain untouched. Reviewers: mehdi_amini, majnemer, chandlerc, davide Reviewed By: davide Subscribers: davide, llvm-commits Differential Revision: https://reviews.llvm.org/D31304 llvm-svn: 298659	2017-03-23 23:21:07 +00:00
Dehao Chen	8c88671985	Disable loop unrolling and icp in SamplePGO ThinLTO compile phase Summary: loop unrolling and icp will make the sample profile annotation much harder in the backend. So disable these 2 optimization in the ThinLTO compile phase. Will add a test in cfe in a separate patch. Reviewers: tejohnson Reviewed By: tejohnson Subscribers: mehdi_amini, llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D31217 llvm-svn: 298646	2017-03-23 21:20:05 +00:00
Craig Topper	74494d0179	[InstCombine] Remove some code from visitAnd that dealt with trying to reduce the LHS of a sub to 0. This should now be fully handled by SimplifyDemandedInstructionBits now. Now that we call ShrinkDemandedConstant on the RHS of sub this should be taken care of. This code doesn't trigger on any in tree regressions, but did before ShrinkDemandedConstant was added to the RHS. llvm-svn: 298644	2017-03-23 21:00:13 +00:00
Teresa Johnson	0c6a4ff8dc	[ThinLTO] Add support for emitting minimized bitcode for thin link Summary: The cumulative size of the bitcode files for a very large application can be huge, particularly with -g. In a distributed build environment, all of these files must be sent to the remote build node that performs the thin link step, and this can exceed size limits. The thin link actually only needs the summary along with a bitcode symbol table. Until we have a proper bitcode symbol table, simply stripping the debug metadata results in significant size reduction. Add support for an option to additionally emit minimized bitcode modules, just for use in the thin link step, which for now just strips all debug metadata. I plan to add a cc1 option so this can be invoked easily during the compile step. However, care must be taken to ensure that these minimized thin link bitcode files produce the same index as with the original bitcode files, as these original bitcode files will be used in the backends. Specifically: 1) The module hash used for caching is typically produced by hashing the written bitcode, and we want to include the hash that would correspond to the original bitcode file. This is because we want to ensure that changes in the stripped portions affect caching. Added plumbing to emit the same module hash in the minimized thin link bitcode file. 2) The module paths in the index are constructed from the module ID of each thin linked bitcode, and typically is automatically generated from the input file path. This is the path used for finding the modules to import from, and obviously we need this to point to the original bitcode files. Added gold-plugin support to take a suffix replacement during the thin link that is used to override the identifier on the MemoryBufferRef constructed from the loaded thin link bitcode file. The assumption is that the build system can specify that the minimized bitcode file has a name that is similar but uses a different suffix (e.g. out.thinlink.bc instead of out.o). Added various tests to ensure that we get identical index files out of the thin link step. Reviewers: mehdi_amini, pcc Subscribers: Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D31027 llvm-svn: 298638	2017-03-23 19:47:39 +00:00
Matthew Simpson	4e7b71bc86	[LV] Vectorize GEPs This patch adds support for vectorizing GEPs. Previously, we only generated vector GEPs on-demand when creating gather or scatter operations. All GEPs from the original loop were scalarized by default, and if a pointer was to be stored to memory, we would have to build up the pointer vector with insertelement instructions. With this patch, we will vectorize all GEPs that haven't already been marked for scalarization. The patch refines collectLoopScalars to more exactly identify the scalar GEPs. The function now more closely resembles collectLoopUniforms. And the patch moves vector GEP creation out of vectorizeMemoryInstruction and into the main vectorization loop. The vector GEPs needed for gather and scatter operations will have already been generated before vectoring the memory accesses. Differential Revision: https://reviews.llvm.org/D30710 llvm-svn: 298620	2017-03-23 16:29:58 +00:00
Matthew Simpson	1fb4064531	[LV] Delete unneeded scalar GEP creation code The code for generating scalar base pointers in vectorizeMemoryInstruction is not needed. We currently scalarize all GEPs and maintain the scalarized values in VectorLoopValueMap. The GEP cloning in this unneeded code is the same as that in scalarizeInstruction. The test cases that changed as a result of this patch changed because we were able to reuse the scalarized GEP that we previously generated instead of cloning a new one. Differential Revision: https://reviews.llvm.org/D30587 llvm-svn: 298615	2017-03-23 16:07:21 +00:00
Dehao Chen	53a0c082d2	Do not set branch weight if the branch weight annotation is present. Summary: ThinLTO will annotate the CFG twice. If the branch weight is set by the first annotation, we should not set the branch weight again in the second annotation because the first annotation is more accurate as there is less optimization that could affect debug info accuracy. Reviewers: tejohnson, davidxl Reviewed By: tejohnson Subscribers: mehdi_amini, aprantl, llvm-commits Differential Revision: https://reviews.llvm.org/D31228 llvm-svn: 298602	2017-03-23 14:43:10 +00:00
Luqman Aden	3f807c91dc	Preserve nonnull metadata on Loads through SROA & mem2reg. Summary: https://llvm.org/bugs/show_bug.cgi?id=31142 : SROA was dropping the nonnull metadata on loads from allocas that got optimized out. This patch simply preserves nonnull metadata on loads through SROA and mem2reg. Reviewers: chandlerc, efriedma Reviewed By: efriedma Subscribers: hfinkel, spatel, efriedma, arielb1, davide, llvm-commits Differential Revision: https://reviews.llvm.org/D27114 llvm-svn: 298540	2017-03-22 19:16:39 +00:00
Peter Collingbourne	f7691d8b41	IPO: Const correctness for summaries passed into passes. Pass const qualified summaries into importers and unqualified summaries into exporters. This lets us const-qualify the summary argument to thinBackend. Differential Revision: https://reviews.llvm.org/D31230 llvm-svn: 298534	2017-03-22 18:22:59 +00:00
Peter Collingbourne	9a3f97977f	IR: Fix a race condition in type id clients of ModuleSummaryIndex. Add a const version of the getTypeIdSummary accessor that avoids mutating the TypeIdMap. Differential Revision: https://reviews.llvm.org/D31226 llvm-svn: 298531	2017-03-22 18:04:39 +00:00
Sanjay Patel	2f602cea41	[InstCombine] canonicalize insertelement of scalar constant ahead of insertelement of variable insertelement (insertelement X, Y, IdxC1), ScalarC, IdxC2 --> insertelement (insertelement X, ScalarC, IdxC2), Y, IdxC1 As noted in the code comment and seen in the test changes, the motivation is that by pulling constant insertion up, we may be able to constant fold some insertelement instructions. Differential Revision: https://reviews.llvm.org/D31196 llvm-svn: 298520	2017-03-22 17:10:44 +00:00
Evgeny Astigeevich	7823c66e05	r286814 resulted in CallPenalty being subtracted twice: - First time, during calculation of the cost in InlineCost.cpp - Second time, during calculation of the cost in Inliner.cpp This patch fixes this. Differential Revision: https://reviews.llvm.org/D31137 llvm-svn: 298496	2017-03-22 12:01:57 +00:00
Craig Topper	07f2915ad8	[InstCombine] Teach SimplifyDemandedUseBits to shrink Constants on the left side of subtracts Summary: Subtracts can have constants on the left side, but we don't shrink them based on demanded bits. This patch fixes that to match the right hand side. Reviewers: davide, majnemer, spatel, sanjoy, hfinkel Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31119 llvm-svn: 298478	2017-03-22 04:03:53 +00:00
George Burgess IV	56c7e88c2c	Let llvm.objectsize be conservative with null pointers This adds a parameter to @llvm.objectsize that makes it return conservative values if it's given null. This fixes PR23277. Differential Revision: https://reviews.llvm.org/D28494 llvm-svn: 298430	2017-03-21 20:08:59 +00:00
Dehao Chen	9907e9d860	Do not inline hot callsites for samplepgo in thinlto compile phase. Summary: Because SamplePGO passes will be invoked twice in ThinLTO build: once at compile phase, the other at backend. We want to make sure the IR at the 2nd phase matches the hot part in profile, thus we do not want to inline hot callsites in the first phase. Reviewers: tejohnson, eraman Reviewed By: tejohnson Subscribers: mehdi_amini, llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D31201 llvm-svn: 298428	2017-03-21 19:55:36 +00:00
Reid Kleckner	b518054b87	Rename AttributeSet to AttributeList Summary: This class is a list of AttributeSetNodes corresponding the function prototype of a call or function declaration. This class used to be called ParamAttrListPtr, then AttrListPtr, then AttributeSet. It is typically accessed by parameter and return value index, so "AttributeList" seems like a more intuitive name. Rename AttributeSetImpl to AttributeListImpl to follow suit. It's useful to rename this class so that we can rename AttributeSetNode to AttributeSet later. AttributeSet is the set of attributes that apply to a single function, argument, or return value. Reviewers: sanjoy, javed.absar, chandlerc, pete Reviewed By: pete Subscribers: pete, jholewinski, arsenm, dschuff, mehdi_amini, jfb, nhaehnle, sbc100, void, llvm-commits Differential Revision: https://reviews.llvm.org/D31102 llvm-svn: 298393	2017-03-21 16:57:19 +00:00
Yi Kong	019e1c4f99	Test commit access Remove some trailing whitespaces. llvm-svn: 298379	2017-03-21 14:49:19 +00:00
Artur Pilipenko	4cc6130f52	NFC. InstCombiner::visitFAdd extract LHSIntVal/RHSIntVal local variables llvm-svn: 298359	2017-03-21 11:32:15 +00:00

19208 Commits