llvm-project

Commit Graph

Author	SHA1	Message	Date
Weiming Zhao	984f1dc338	Fix DebugLoc propagation for unreachable LoadInst Summary: Currently, when GVN creates a load and when InstCombine creates a new store for unreachable Load, the DebugLoc info gets lost. Reviewers: dberlin, davide, aprantl Reviewed By: aprantl Subscribers: davide, llvm-commits Differential Revision: https://reviews.llvm.org/D34639 llvm-svn: 308404	2017-07-19 01:27:24 +00:00
Vitaly Buka	74443f0778	[asan] Copy arguments passed by value into explicit allocas for ASan Summary: ASan determines the stack layout from alloca instructions. Since arguments marked as "byval" do not have an explicit alloca instruction, ASan does not produce red zones for them. This commit produces an explicit alloca instruction and copies the byval argument into the allocated memory so that red zones are produced. Submitted on behalf of @morehouse (Matt Morehouse) Reviewers: eugenis, vitalybuka Reviewed By: eugenis Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D34789 llvm-svn: 308387	2017-07-18 22:28:03 +00:00
Davide Italiano	6da7db31df	[TRE] Simplify canTRE() a bit using all_of(). NFCI. This has a ~11 years old FIXME, which may not be true today. We might consider removing this code altogether. llvm-svn: 308319	2017-07-18 15:42:59 +00:00
Alexander Potapenko	9385aaa848	[sancov] Fix PR33732 Coverage hooks that take less-than-64-bit-integers as parameters need the zeroext parameter attribute (http://llvm.org/docs/LangRef.html#paramattrs) to make sure they are properly extended by the x86_64 ABI. llvm-svn: 308296	2017-07-18 11:47:56 +00:00
Max Kazantsev	2c627a97fd	[IRCE] Recognize loops with ne/eq latch conditions In some particular cases eq/ne conditions can be turned into equivalent slt/sgt conditions. This patch teaches parseLoopStructure to handle some of these cases. Differential Revision: https://reviews.llvm.org/D35010 llvm-svn: 308264	2017-07-18 04:53:48 +00:00
Martin Storsjo	2f24e93481	[AArch64] Extend CallingConv::X86_64_Win64 to AArch64 as well Rename the enum value from X86_64_Win64 to plain Win64. The symbol exposed in the textual IR is changed from 'x86_64_win64cc' to 'win64cc', but the numeric value is kept, keeping support for old bitcode. Differential Revision: https://reviews.llvm.org/D34474 llvm-svn: 308208	2017-07-17 20:05:19 +00:00
Teresa Johnson	f9dc3deaa6	Revert "Restore with fix "[ThinLTO] Ensure we always select the same function copy to import"" This reverts commit r308114 (and follow on fixes to test). There is a linking failure in a ThinLTO bot: http://green.lab.llvm.org/green/job/clang-stage2-configure-Rthinlto_build/3663/ (and undefined reference). It seems like it must be a second order effect of the heuristic change I made, and may take some time to try to reproduce locally and track down. Therefore, reverting for now. llvm-svn: 308206	2017-07-17 19:25:38 +00:00
Simon Pilgrim	7ac3efacc0	Remove unnecessary cast. NFCI. llvm-svn: 308166	2017-07-17 09:35:03 +00:00
Davide Italiano	579064e2c1	[InstCombine] Don't violate dominance when replacing instructions. Differential Revision: https://reviews.llvm.org/D35376 llvm-svn: 308144	2017-07-16 18:56:30 +00:00
Craig Topper	2072aca51c	[InstCombine] Move (0 - x) & 1 --> x & 1 to SimplifyDemandedUseBits. This removes a dedicated matcher and allows us to support more than just an AND masking the lower bit. llvm-svn: 308124	2017-07-16 05:37:58 +00:00
Teresa Johnson	a7660b0127	Restore with fix "[ThinLTO] Ensure we always select the same function copy to import" This restores r308078/r308079 with a fix for bot non-determinisim (make sure we run llvm-lto in single threaded mode so the debug output doesn't get interleaved). llvm-svn: 308114	2017-07-15 22:58:06 +00:00
Craig Topper	d918d5b36b	[InstCombine] Improve the expansion in SimplifyUsingDistributiveLaws to handle cases where one side doesn't simplify, but the other side resolves to an identity value Summary: If one side simplifies to the identity value for inner opcode, we can replace the value with just the operation that can't be simplified. I've removed a couple now unneeded special cases in visitAnd and visitOr. There are probably other cases I missed. Reviewers: spatel, majnemer, hfinkel, dberlin Reviewed By: spatel Subscribers: grandinj, llvm-commits, spatel Differential Revision: https://reviews.llvm.org/D35451 llvm-svn: 308111	2017-07-15 21:49:49 +00:00
Sanjay Patel	3437ee2740	[InstCombine] improve (1 << x) & 1 --> zext(x == 0) folding 1. Add a one-use check to prevent increasing instruction count. 2. Generalize the pattern matching to include vector types. llvm-svn: 308105	2017-07-15 17:26:01 +00:00
Sanjay Patel	55b9f88ecc	[InstCombine] allow (0 - x) & 1 --> x & 1 for vectors llvm-svn: 308098	2017-07-15 15:29:47 +00:00
Sanjay Patel	27339133a7	[InstCombine] remove dead code/tests; NFCI These patterns and tests were added to InstSimplify with: https://reviews.llvm.org/rL303004 llvm-svn: 308096	2017-07-15 15:01:33 +00:00
Chandler Carruth	d78a38ed2e	Revert r308078 (and subsequent tweak in r308079) which introduces a test that appears to exhibit non-determinism and is flaking on the bots pretty consistently. r308078: [ThinLTO] Ensure we always select the same function copy to import r308079: Require asserts in new test that uses debug flag llvm-svn: 308095	2017-07-15 13:50:26 +00:00
Florian Hahn	ad993521ac	[LoopInterchange] Add some optimization remarks. Reviewers: anemet, karthikthecool, blitz.opensource Reviewed By: anemet Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35122 llvm-svn: 308094	2017-07-15 13:13:19 +00:00
Dinar Temirbulatov	3c64077c82	[SLPVectorizer] Add an extra parameter to tryScheduleBundle function, NFCI. llvm-svn: 308081	2017-07-15 05:43:54 +00:00
Teresa Johnson	82b4fb1afe	[ThinLTO] Ensure we always select the same function copy to import Summary: Check if the first eligible callee is under the instruction threshold. Checking this on the first eligible callee ensures that we don't end up selecting different callees to import when we invoke this routine with different thresholds due to reaching the callee via paths that are shallower or hotter (when there are multiple copies, i.e. with weak or linkonce linkage). We don't want to leave the decision of which copy to import up to the backend. Reviewers: mehdi_amini Subscribers: inglorion, fhahn, llvm-commits Differential Revision: https://reviews.llvm.org/D35436 llvm-svn: 308078	2017-07-15 04:53:05 +00:00
Geoff Berry	f7d5daa0c0	[EarlyCSE] Handle calls with no MemorySSA info. Summary: When checking for memory dependencies between calls using MemorySSA, handle cases where the calls have no MemoryAccess associated with them because the AA analysis being used has determined that the call does not read/write memory. Fixes PR33756 Reviewers: dberlin, davide Subscribers: mcrosier, llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D35317 llvm-svn: 308051	2017-07-14 20:13:21 +00:00
Haicheng Wu	476adcca6b	[JumpThreading] Add a pattern to TryToUnfoldSelectInCurrBB() Add the following pattern to TryToUnfoldSelectInCurrBB() bb: %p = phi [0, %bb1], [1, %bb2], [0, %bb3], [1, %bb4], ... %c = cmp %p, 0 %s = select %c, trueval, falseval The Select in the above pattern will be unfolded and then jump-threaded. The current implementation does not allow CMP in the middle of PHI and Select. Differential Revision: https://reviews.llvm.org/D34762 llvm-svn: 308050	2017-07-14 19:16:47 +00:00
Jakub Kuderski	b292c22c8d	[Dominators] Make IsPostDominator a template parameter Summary: DominatorTreeBase used to have IsPostDominators (bool) member to indicate if the tree is a dominator or a postdominator tree. This made it possible to switch between the two 'modes' at runtime, but it isn't used in practice anywhere. This patch makes IsPostDominator a template argument. This way, it is easier to switch between different algorithms at compile-time based on this argument and design external utilities around it. It also makes it impossible to incidentally assign a postdominator tree to a dominator tree (and vice versa), and to further simplify template code in GenericDominatorTreeConstruction. Reviewers: dberlin, sanjoy, davide, grosser Reviewed By: dberlin Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35315 llvm-svn: 308040	2017-07-14 18:26:09 +00:00
Sanjay Patel	3f4db3ea97	[InstCombine] convert bitwise (in)equality checks to logical ops (PR32401) As discussed in: https://bugs.llvm.org/show_bug.cgi?id=32401 we have a backend transform to undo this: https://reviews.llvm.org/rL299542 when it's likely that the xor version leads to better codegen, but we want this form in IR for better analysis and simplification potential. llvm-svn: 308031	2017-07-14 15:09:49 +00:00
Max Kazantsev	f80ffa1a78	[IRCE] Fix corner case with Start = INT_MAX When iterating through loop for (int i = INT_MAX; i > 0; i--) We fail to generate the pre-loop for it. It happens because we use the overflown value in a comparison predicate when identifying whether or not we need it. In old logic, we used SLE predicate against Greatest value which exceeds all seen values of the IV and might be overflown. Now we use the GreatestSeen value of this IV with SLT predicate. Also added a test that ensures that a pre-loop is generated for such loops. Differential Revision: https://reviews.llvm.org/D35347 llvm-svn: 308001	2017-07-14 06:35:03 +00:00
Dinar Temirbulatov	21599fe2de	[SLPVectorizer] Add an extra parameter to alreadyVectorized function, NFCI. llvm-svn: 307996	2017-07-14 03:48:29 +00:00
Simon Pilgrim	f32f4be957	Fix unused variable warning on EXPENSIVE_CHECKS release builds. NFCI. llvm-svn: 307929	2017-07-13 17:10:12 +00:00
Davide Italiano	c3dc055780	Reapply [GlobalOpt] Remove unreachable blocks before optimizing a function. This commit reapplies r307215 now that we found out and fixed the cause of the cfi test failure (in r307871). llvm-svn: 307920	2017-07-13 15:40:59 +00:00
Anna Thomas	ec9b326569	[RuntimeUnrolling] Update DomTree correctly when exit blocks have successors Summary: When we runtime unroll with multiple exit blocks, we also need to update the immediate dominators of the immediate successors of the exit blocks. Reviewers: reames, mkuper, mzolotukhin, apilipenko Reviewed by: mzolotukhin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35304 llvm-svn: 307909	2017-07-13 13:21:23 +00:00
Xinliang David Li	f564c6959e	[PGO] Enhance pgo counter promotion This is an incremental change to the promotion feature. There are two problems with the current behavior: 1) loops with multiple exiting blocks are totally disabled 2) a counter update can only be promoted one level up in the loop nest -- which does help much for short trip count inner loops inside a high trip-count outer loops. Due to this limitation, we still saw very large profile count fluctuations from run to run for the affected loops which are usually very hot. This patch adds the support for promotion counters iteratively across the loop nest. It also turns on the promotion for loops with multiple exiting blocks (with a limit). For single-threaded applications, the performance impact is flat on average. For instance, dealII improves, but povray regresses. llvm-svn: 307863	2017-07-12 23:27:44 +00:00
Anna Thomas	8e431a9851	[LoopUnrollRuntime] NFC: Refactored safety checks of unrolling multi-exit loop Refactored the code and separated out a function `canSafelyUnrollMultiExitLoop` to reduce redundant checks and make it easier to add profitability heuristics later. Added tests to runtime unrolling to make sure that unrolling for multi-exit loops is not done unless the option -unroll-runtime-multi-exit is true. llvm-svn: 307843	2017-07-12 20:55:43 +00:00
Sam Clegg	fd5ab25ae1	Remove unneeded use of #undef DEBUG_TYPE. NFC Where is is needed (at the end of headers that define it), be consistent about its use. Also fix a few header guards that I found in the process. Differential Revision: https://reviews.llvm.org/D34916 llvm-svn: 307840	2017-07-12 20:49:21 +00:00
Michael Kuperstein	fdb46b2fb4	[LV] Don't allow outside uses of IVs if the SCEV is predicated on loop conditions. This fixes PR33706. Differential Revision: https://reviews.llvm.org/D35227 llvm-svn: 307837	2017-07-12 19:53:55 +00:00
Jakub Kuderski	b323f4f173	[LoopRotate] Fix DomTree update logic for unreachable nodes. Fix PR33701. Summary: LoopRotate manually updates the DoomTree by iterating over all predecessors of a basic block and computing the Nearest Common Dominator. When a predecessor happens to be unreachable, `DT.findNearestCommonDominator` returns nullptr. This patch teaches LoopRotate to handle this case and fixes [[ https://bugs.llvm.org/show_bug.cgi?id=33701 \| PR33701 ]]. In the future, LoopRotate should be taught to use the new incremental API for updating the DomTree. Reviewers: dberlin, davide, uabelho, grosser Subscribers: efriedma, mzolotukhin Differential Revision: https://reviews.llvm.org/D35074 llvm-svn: 307828	2017-07-12 18:42:16 +00:00
Peter Collingbourne	cacac6a104	LowerTypeTests: When importing functions skip definitions where the summary contains a decl. This normally indicates mixed CFI + non-CFI compilation, and will result in us treating the function in the same way as a function defined outside of the LTO unit. Part of PR33752. Differential Revision: https://reviews.llvm.org/D35281 llvm-svn: 307744	2017-07-12 00:39:12 +00:00
Davide Italiano	b8ad3eebca	[IPO] Temporarily rollback r307215. [GlobalOpt] Remove unreachable blocks before optimizing a function. While the change is presumably correct, it exposes a latent bug in DI which breaks on of the CFI checks. I'll analyze it further and try to understand what's going on. llvm-svn: 307729	2017-07-11 23:10:17 +00:00
Konstantin Zhuravlyov	bb80d3e1d3	Enhance synchscope representation OpenCL 2.0 introduces the notion of memory scopes in atomic operations to global and local memory. These scopes restrict how synchronization is achieved, which can result in improved performance. This change extends existing notion of synchronization scopes in LLVM to support arbitrary scopes expressed as target-specific strings, in addition to the already defined scopes (single thread, system). The LLVM IR and MIR syntax for expressing synchronization scopes has changed to use syncscope("<scope>"), where <scope> can be "singlethread" (this replaces singlethread keyword), or a target-specific name. As before, if the scope is not specified, it defaults to CrossThread/System scope. Implementation details: - Mapping from synchronization scope name/string to synchronization scope id is stored in LLVM context; - CrossThread/System and SingleThread scopes are pre-defined to efficiently check for known scopes without comparing strings; - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in the bitcode. Differential Revision: https://reviews.llvm.org/D21723 llvm-svn: 307722	2017-07-11 22:23:00 +00:00
Anna Thomas	bafe766f5d	[LoopUnrollRuntime] NFC: Add some debugging trace messages for why loop wasn't unrolled. llvm-svn: 307705	2017-07-11 20:44:37 +00:00
Davide Italiano	ee1c82112e	[NewGVN] Check for congruency of memory accesses. This is fine as nothing in the code relies on leader and memory leader being the same for a given congruency class. Ack'ed by Dan. Fixes PR33720. llvm-svn: 307699	2017-07-11 19:49:12 +00:00
Davide Italiano	67b0e53dc1	[NewGVN] Fix an innocent typo I found while debugging PR33720. llvm-svn: 307694	2017-07-11 19:19:45 +00:00
Davide Italiano	fb4544cd15	[NewGVN] Clarify the function invariants formatting them properly. llvm-svn: 307692	2017-07-11 19:15:36 +00:00
Evgeniy Stepanov	3d5ea713f7	[msan] Only check shadow memory for operands that are sized. Fixes PR33347: https://bugs.llvm.org/show_bug.cgi?id=33347. Differential Revision: https://reviews.llvm.org/D35160 Patch by Matt Morehouse. llvm-svn: 307684	2017-07-11 18:13:52 +00:00
Anna Thomas	5526a33f4f	[LoopUnrollRuntime] Avoid multi-exit nested loop with epilog generation The loop structure for the outer loop does not contain the epilog preheader when we try to unroll inner loop with multiple exits and epilog code is generated. For now, we just bail out in such cases. Added a test case that shows the problem. Without this bailout, we would trip on assert saying LCSSA form is incorrect for outer loop. llvm-svn: 307676	2017-07-11 17:16:33 +00:00
Dinar Temirbulatov	09b6779709	[SLPVectorizer] Revert change in cancelScheduling with referencing to FirstInBundle, NFCI. llvm-svn: 307667	2017-07-11 15:54:50 +00:00
Hiroshi Inoue	0ca79dcf4b	fix typos in comments; NFC llvm-svn: 307626	2017-07-11 06:04:59 +00:00
Chandler Carruth	01f0c8a8c4	[PM/ThinLTO] Fix PR33536, a bug where the ThinLTO bitcode writer was querying for analysis results on a function declaration rather than a definition. The only reason this worked previously is by chance -- because the way we got alias analysis results with the legacy PM, we happened to not compute a dominator tree and so we happened to not hit an assert even though it didn't make any real sense. Now we bail out before trying to compute alias analysis so that we don't hit these asserts. llvm-svn: 307625	2017-07-11 05:39:20 +00:00
Leo Li	93abd7d915	[ConstantHoisting] Remove dupliate logic in constant hoisting Summary: As metioned in https://reviews.llvm.org/D34576, checkings in `collectConstantCandidates` can be replaced by using `llvm::canReplaceOperandWithVariable`. The only special case is that `collectConstantCandidates` return false for all `IntrinsicInst` but it is safe for us to collect constant candidates from `IntrinsicInst`. Reviewers: pirama, efriedma, srhines Reviewed By: efriedma Subscribers: llvm-commits, javed.absar Differential Revision: https://reviews.llvm.org/D34921 llvm-svn: 307587	2017-07-10 20:45:34 +00:00
Davide Italiano	a7a77540ef	[NewGVN] Simplify a lambda a little bit. NFCI. llvm-svn: 307586	2017-07-10 20:45:00 +00:00
Serge Guelton	f6329ec2e9	Fix invalid cast in instcombine UMul/ZExt idiom Fixes https://bugs.llvm.org/show_bug.cgi?id=25454 Do not assume IRBuilder creates Instruction where it can create Value. Do not assume idiom operands are constant, leave generalisation ot the IRBuilder. Differential Revision: https://reviews.llvm.org/D35114 llvm-svn: 307554	2017-07-10 16:51:40 +00:00
Anna Thomas	70ffd65ca9	[LoopUnrollRuntime] Remove strict assert about VMap requirement When unrolling under multiple exits which is under off-by-default option, the assert that checks for VMap entry in loop exit values is too strong. (assert if VMap entry did not exist, the value should be a constant). However, values derived from constants or from values outside loop, does not have a VMap entry too. Removed the assert and added a testcase showcasing the property for non-constant values. llvm-svn: 307542	2017-07-10 15:29:38 +00:00
Mikael Holmen	e0ced14449	[ArgumentPromotion] Change use of removed argument in llvm.dbg.value to undef Summary: This solves PR33641. When removing a dead argument we must also handle possibly existing calls to llvm.dbg.value that use the removed argument. Now we change the use of the otherwise dead argument to an undef for some other pass to cleanup later. If the calls are left untouched, they will later on cause errors: "function-local metadata used in wrong function" since the ArgumentPromotion rewrites the code by creating a new function with the wanted signature, but the metadata is not recreated so the new function may then erroneously use metadata from the old function. Reviewers: mstorsjo, rnk, arsenm Reviewed By: rnk Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D34874 llvm-svn: 307521	2017-07-10 06:07:24 +00:00
Craig Topper	fde4723ebe	[IR] Add Type::isIntOrIntVectorTy(unsigned) similar to the existing isIntegerTy(unsigned), but also works for vectors. llvm-svn: 307492	2017-07-09 07:04:03 +00:00
Craig Topper	95d2347ae1	[IR] Make use of Type::isPtrOrPtrVectorTy/isIntOrIntVectorTy/isFPOrFPVectorTy to shorten code. NFC llvm-svn: 307491	2017-07-09 07:04:00 +00:00
Hiroshi Inoue	713b5ba2de	fix trivial typos; NFC sucessor -> successor llvm-svn: 307488	2017-07-09 05:54:44 +00:00
Chandler Carruth	bd9c29039e	[PM] Finish implementing and fix a chain of bugs uncovered by testing the invalidation propagation logic from an SCC to a Function. I wrote the infrastructure to test this but didn't actually use it in the unit test where it was designed to be used. =[ My bad. Once I actually added it to the test case I discovered that it also hadn't been properly implemented, so I've implemented it. The logic in the FAM proxy for an SCC pass to propagate invalidation follows the same ideas as the FAM proxy for a Module pass, but the implementation is a bit different to reflect the fact that it is forwarding just for an SCC. However, implementing this correctly uncovered a surprising "bug" (it was conservatively correct but relatively very expensive) in how we handle invalidation when splitting one SCC into multiple SCCs. We did an eager invalidation when in reality we should be deferring invaliadtion for the current SCC to the CGSCC pass manager and just invaliating the newly constructed SCCs. Otherwise we end up invalidating too much too soon. This was exposed by the inliner test case that I've updated. Now, we invalidate just the split off '(test1_f)' SCC when doing the CG update, and then the inliner finishes and invalidates the '(test1_g, test1_h)' SCC's analyses. The first few attempts at fixing this hit still more bugs, but all of those are covered by existing tests. For example, the inliner should also preserve the FAM proxy to avoid unnecesasry invalidation, and this is safe because the CG update routines it uses handle any necessary adjustments to the FAM proxy. Finally, the unittests for the CGSCC pass manager needed a bunch of updates where we weren't correctly preserving the FAM proxy because it hadn't been fully implemented and failing to preserve it didn't matter. Note that this doesn't yet fix the current crasher due to MemSSA finding a stale dominator tree, but without this the fix to that crasher doesn't really make any sense when testing because it relies on the proxy behavior. llvm-svn: 307487	2017-07-09 03:59:31 +00:00
Craig Topper	e79b3e7d9a	[InstCombine] Speculatively implement a fix for what might be the root cause of PR33721 by making sure that we have integer types before doing select C, -1, 0 -> sext C to int I recently changed m_One and m_AllOnes to use Constant::isOneValue/isAllOnesValue which work on floating point values too. The original implementation looked specifically for ConstantInt scalars and splats. So I'm guessing we are accidentally trying to issue sext/zexts on floating point types now. Hopefully I figure out how to reproduce the failure from the PR soon. llvm-svn: 307486	2017-07-09 03:25:17 +00:00
Max Kazantsev	b9edcbcb1d	Re-enable "[IndVars] Canonicalize comparisons between non-negative values and indvars" The patch was reverted due to a bug. The bug was that if the IV is the 2nd operand of the icmp instruction, then the "Pred" variable gets swapped and differs from the instruction's predicate. In this patch we use the original predicate to do the transformation. Also added a test case that exercises this situation. Differentian Revision: https://reviews.llvm.org/D35107 llvm-svn: 307477	2017-07-08 17:17:30 +00:00
Craig Topper	bb4069e439	[InstCombine] Make InstCombine's IRBuilder be passed by reference everywhere Previously the InstCombiner class contained a pointer to an IR builder that had been passed to the constructor. Sometimes this would be passed to helper functions as either a pointer or the pointer would be dereferenced to be passed by reference. This patch makes it a reference everywhere including the InstCombiner class itself so there is more inconsistency. This a large, but mechanical patch. I've done very minimal formatting changes on it despite what clang-format wanted to do. llvm-svn: 307451	2017-07-07 23:16:26 +00:00
Dehao Chen	64c46574b0	Increase the import-threshold for crtical functions. Summary: For interative sample-pgo, if a hot call site is inlined in the profiling binary, we should inline it in before profile annotation in the backend. Before that, the compile phase first collects all GUIDs that needs to be imported and creates virtual "hot" call edge in the summary. However, "hot" is not good enough to guarantee the callsites get inlined. This patch introduces "critical" call edge, and assign much higher importing threshold for those edges. Reviewers: tejohnson Reviewed By: tejohnson Subscribers: sanjoy, mehdi_amini, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D35096 llvm-svn: 307439	2017-07-07 21:01:00 +00:00
Anna Thomas	e3872003d0	[LoopUnrollRuntime] Support multiple exit blocks unrolling when prolog remainder generated With the NFC refactoring in rL307417 (git SHA 987dd01), all the logic is in place to support multiple exit/exiting blocks when prolog remainder is generated. This patch removed the assert that multiple exit blocks unrolling is only supported when epilog remainder is generated. Also, added test runs and checks with PROLOG prefix in runtime-loop-multiple-exits.ll test cases. llvm-svn: 307435	2017-07-07 20:12:32 +00:00
Davide Italiano	4eb210bdeb	[Local] Update the comment for removeUnreachableBlocks. It referenced a wrong function name, and didn't mention what the second argument did. This should be slightly more accurate now. llvm-svn: 307425	2017-07-07 18:54:14 +00:00
Gor Nishanov	8cdf648795	[cloning] Do not duplicate types when cloning functions Summary: This is an addon to the change rl304488 cloning fixes. (Originally rl304226 reverted rl304228 and reapplied rl304488 https://reviews.llvm.org/D33655) rl304488 works great when DILocalVariables that comes from the inlined function has a 'unique-ed' type, but, in the case when the variable type is distinct we will create a second DILocalVariable in the scope of the original function that was inlined. Consider cloning of the following function: ``` define private void @f() !dbg !5 { %1 = alloca i32, !dbg !11 call void @llvm.dbg.declare(metadata i32* %1, metadata !14, metadata !12), !dbg !18 ret void, !dbg !18 } !14 = !DILocalVariable(name: "inlined", scope: !15, file: !6, line: 5, type: !17) ; came from an inlined function !15 = distinct !DISubprogram(name: "inlined", linkageName: "inlined", scope: null, file: !6, line: 8, type: !7, isLocal: true, isDefinition: true, scopeLine: 9, isOptimized: false, unit: !0, variables: !16) !16 = !{!14} !17 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "some_struct", size: 32, align: 32) ``` Without this fix, when function 'f' is cloned, we will create another DILocalVariable for "inlined", due to its type being distinct. ``` define private void @f.1() !dbg !23 { %1 = alloca i32, !dbg !26 call void @llvm.dbg.declare(metadata i32* %1, metadata !28, metadata !12), !dbg !30 ret void, !dbg !30 } !14 = !DILocalVariable(name: "inlined", scope: !15, file: !6, line: 5, type: !17) !15 = distinct !DISubprogram(name: "inlined", linkageName: "inlined", scope: null, file: !6, line: 8, type: !7, isLocal: true, isDefinition: true, scopeLine: 9, isOptimized: false, unit: !0, variables: !16) !16 = !{!14} !17 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "some_struct", size: 32, align: 32) ; !28 = !DILocalVariable(name: "inlined", scope: !15, file: !6, line: 5, type: !29) ; OOPS second DILocalVariable !29 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "some_struct", size: 32, align: 32) ``` Now we have two DILocalVariable for "inlined" within the same scope. This result in assert in AsmPrinter/DwarfDebug.h:131: void llvm::DbgVariable::addMMIEntry(const llvm::DbgVariable &): Assertion `V.Var == Var && "conflicting variable"' failed. (Full example: See: https://bugs.llvm.org/show_bug.cgi?id=33492) In this change we prevent duplication of types so that when a metadata for DILocalVariable is cloned it will get uniqued to the same metadate node as an original variable. Reviewers: loladiro, dblaikie, aprantl, echristo Reviewed By: loladiro Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D35106 llvm-svn: 307418	2017-07-07 18:24:20 +00:00
Anna Thomas	734ab3f75c	[LoopUnrollRuntime] NFC: use the precomputed loop exit in ConnectProlog Minor refactoring to use the preexisting loop exit that's already calculated. We do not need to recompute the loop exit in ConnectProlog. Apart from avoiding redundant computation, this is required for supporting multiple loop exits when Prolog remainder loops are generated. llvm-svn: 307417	2017-07-07 18:05:28 +00:00
Yaxun Liu	b909f11a31	[InferAddressSpaces] Fix assertion about null pointer InferAddressSpaces does not check address space in collectFlatAddressExpressions, which causes values with non flat address space put into Postorder and causes assertion in cloneValueWithNewAddressSpace. This patch fixes assertion in OpenCL 2.0 conformance test generic_address_space subtest for amdgcn target. Differential Revision: https://reviews.llvm.org/D34991 llvm-svn: 307349	2017-07-07 02:40:13 +00:00
Sean Fertile	9cd1cdf814	Extend memcpy expansion in Transform/Utils to handle wider operand types. Adds loop expansions for known-size and unknown-sized memcpy calls, allowing the target to provide the operand types through TTI callbacks. The default values for the TTI callbacks use int8 operand types and matches the existing behaviour if they aren't overridden by the target. Differential revision: https://reviews.llvm.org/D32536 llvm-svn: 307346	2017-07-07 02:00:06 +00:00
Evgeniy Stepanov	7d3eeaaa96	Revert r307342, r307343. Revert "Copy arguments passed by value into explicit allocas for ASan." Revert "[asan] Add end-to-end tests for overflows of byval arguments." Build failure on lldb-x86_64-ubuntu-14.04-buildserver. Test failure on clang-cmake-aarch64-42vma and sanitizer-x86_64-linux-android. llvm-svn: 307345	2017-07-07 01:31:23 +00:00
Evgeniy Stepanov	2a7a4bc1c9	Copy arguments passed by value into explicit allocas for ASan. ASan determines the stack layout from alloca instructions. Since arguments marked as "byval" do not have an explicit alloca instruction, ASan does not produce red zones for them. This commit produces an explicit alloca instruction and copies the byval argument into the allocated memory so that red zones are produced. Patch by Matt Morehouse. Differential revision: https://reviews.llvm.org/D34789 llvm-svn: 307342	2017-07-07 00:48:25 +00:00
Wei Mi	7586755013	[ConstHoisting] Turn on consthoist-with-block-frequency by default. Using profile information to guide consthoisting is generally helpful for performance, so the patch turns it on by default. No compile time or perf regression were found using spec2000 and spec2006 on x86. Some significant improvement (>20%) was seen on internal benchmarks. Differential Revision: https://reviews.llvm.org/D35063 llvm-svn: 307338	2017-07-07 00:11:05 +00:00
Craig Topper	cb22039bee	[InstCombine] No need to pass DataLayout to helper functions if we're passing the InstCombiner object. We can just ask it for the DataLayout. NFC llvm-svn: 307333	2017-07-06 23:18:43 +00:00
Craig Topper	4853c4304b	[InstCombine] Remove unused arguments from some helper functions. NFC llvm-svn: 307332	2017-07-06 23:18:42 +00:00
Craig Topper	2bb9f0f620	[InstCombine] Change a couple helper functions to only take the IRBuilder as an argument and not the whole InstCombiner object. NFC llvm-svn: 307331	2017-07-06 23:18:41 +00:00
Wei Mi	20526b2725	[ConstHoisting] choose to hoist when frequency is the same. The patch is to adjust the strategy of frequency based consthoisting: Previously when the candidate block has the same frequency with the existing blocks containing a const, it will not hoist the const to the candidate block. For that case, now we change the strategy to hoist the const if only existing blocks have more than one block member. This is helpful for reducing code size. Differential Revision: https://reviews.llvm.org/D35084 llvm-svn: 307328	2017-07-06 22:32:27 +00:00
Davide Italiano	f4891d29f8	[lib/LTO] Add a comment to explain where we set the linkage in the summary. Pointed out by Teresa! llvm-svn: 307305	2017-07-06 20:04:20 +00:00
Davide Italiano	6a5fbe52fa	[LTO] Fix the interaction between linker redefined symbols and ThinLTO This is the same as r304719 but for ThinLTO. The substantial difference is that in this case we don't have whole visibility, just the summary. In the LTO case, when we got the resolution for the input file we could just see if the linker told us whether a symbol was linker redefined (using --wrap or --defsym) and switch the linkage directly for the GV. Here, we have the summary. So, we record that the linkage changed from <whatever it was> to $weakany to prevent IPOs across this symbol boundaries and actually just switch the linkage at FunctionImport time. This patch should also fixes the lld bits (as all the scaffolding for communicating if a symbol is linker redefined should be there & should be the same), but I'll make sure to add some tests there as well. Fixes PR33192. Differential Revision: https://reviews.llvm.org/D35064 llvm-svn: 307303	2017-07-06 19:58:26 +00:00
Craig Topper	e9bf7ebacf	[InstCombine] Remove include of DIBuilder.h and Dwarf.h as they don't appear to be necessary. llvm-svn: 307295	2017-07-06 18:47:47 +00:00
Leo Li	5499b1b8be	Modify constraints in `llvm::canReplaceOperandWithVariable` Summary: `Instruction::Switch`: only first operand can be set to a non-constant value. `Instruction::InsertValue` both the first and the second operand can be set to a non-constant value. `Instruction::Alloca` return true for non-static allocation. Reviewers: efriedma Reviewed By: efriedma Subscribers: srhines, pirama, llvm-commits Differential Revision: https://reviews.llvm.org/D34905 llvm-svn: 307294	2017-07-06 18:47:05 +00:00
Craig Topper	ca2c87653c	[Constants] Replace calls to ConstantInt::equalsInt(0)/equalsInt(1) with isZero and isOne. NFCI llvm-svn: 307293	2017-07-06 18:39:49 +00:00
Craig Topper	79ab643da8	[Constants] If we already have a ConstantInt*, prefer to use isZero/isOne/isMinusOne instead of isNullValue/isOneValue/isAllOnesValue inherited from Constant. NFCI Going through the Constant methods requires redetermining that the Constant is a ConstantInt and then calling isZero/isOne/isMinusOne. llvm-svn: 307292	2017-07-06 18:39:47 +00:00
Anna Thomas	eb6d5d1950	[LoopUnrollRuntime] Bailout when multiple exiting blocks to the unique latch exit block Currently, we do not support multiple exiting blocks to the latch exit block. However, this bailout wasn't triggered when we had a unique exit block (which is the latch exit), with multiple exiting blocks to that unique exit. Moved the bailout so that it's triggered in both cases and added testcase. llvm-svn: 307291	2017-07-06 18:39:26 +00:00
Craig Topper	47c8f66997	[InstCombine] Remove Builder argument from InstCombiner::tryFactorization. NFC Builder is already a member of the InstCombiner class so we can use it with passing it. llvm-svn: 307290	2017-07-06 18:35:52 +00:00
Craig Topper	dfd01ea9ed	[SimplifyCFG] Move a portion of an if statement that should already be implied to an assert Summary: In this code we got to Dom by following the predecessor link of BB. So it stands to reason that BB should also show up as a successor of Dom's terminator right? There isn't a way to have the CFG connect in only one direction is there? Reviewers: jmolloy, davide, mcrosier Reviewed By: mcrosier Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D35025 llvm-svn: 307276	2017-07-06 16:29:43 +00:00
Craig Topper	95e4142f94	[InstCombine] Change helper method to a file local static method. NFC llvm-svn: 307275	2017-07-06 16:24:23 +00:00
Craig Topper	fc42acef92	[InstCombine] Clarify comment to mention other transform that it does. NFC llvm-svn: 307274	2017-07-06 16:24:22 +00:00
Craig Topper	22795de20a	[InstCombine] Add single use checks to SimplifyBSwap to ensure we are really saving instructions Bswap isn't a simple operation so we need to make sure we are really removing a call to it before doing these simplifications. For the case when both LHS and RHS are bswaps I've allowed it to be moved if either LHS or RHS has a single use since that at least allows us to move it later where it might find another bswap to combine with and it decreases the use count on the other side so maybe the other user can be optimized. Differential Revision: https://reviews.llvm.org/D34974 llvm-svn: 307273	2017-07-06 16:24:21 +00:00
Craig Topper	3e1909d797	[InstCombine] Don't create extra ConstantInt objects in foldSelectICmpAnd. NFCI Instead just use APInt objects and only create a ConstantInt at the end if we need it for the Offset. llvm-svn: 307270	2017-07-06 15:58:54 +00:00
Wei Mi	90707394e3	[LSR] Narrow search space by filtering non-optimal formulae with the same ScaledReg and Scale. When the formulae search space is huge, LSR uses a series of heuristic to keep pruning the search space until the number of possible solutions are within certain limit. The big hammer of the series of heuristics is NarrowSearchSpaceByPickingWinnerRegs, which picks the register which is used by the most LSRUses and deletes the other formulae which don't use the register. This is a effective way to prune the search space, but quite often not a good way to keep the best solution. We saw cases before that the heuristic pruned the best formula candidate out of search space. To relieve the problem, we introduce a new heuristic called NarrowSearchSpaceByFilterFormulaWithSameScaledReg. The basic idea is in order to reduce the search space while keeping the best formula, we want to keep as many formulae with different Scale and ScaledReg as possible. That is because the central idea of LSR is to choose a group of loop induction variables and use those induction variables to represent LSRUses. An induction variable candidate is often represented by the Scale and ScaledReg in a formula. If we have more formulae with different ScaledReg and Scale to choose, we have better opportunity to find the best solution. That is why we believe pruning search space by only keeping the best formula with the same Scale and ScaledReg should be more effective than PickingWinnerReg. And we use two criteria to choose the best formula with the same Scale and ScaledReg. The first criteria is to select the formula using less non shared registers, and the second criteria is to select the formula with less cost got from RateFormula. The patch implements the heuristic before NarrowSearchSpaceByPickingWinnerRegs, which is the last resort. Testing shows we get 1.8% and 2% on two internal benchmarks on x86. llvm nightly testsuite performance is neutral. We also tried lsr-exp-narrow and it didn't help on the two improved internal cases we saw. Differential Revision: https://reviews.llvm.org/D34583 llvm-svn: 307269	2017-07-06 15:52:14 +00:00
Max Kazantsev	98838527c6	Revert "Revert "Revert "[IndVars] Canonicalize comparisons between non-negative values and indvars""" It appears that the problem is still there. Needs more analysis to understand why SaturatedMultiply test fails. llvm-svn: 307249	2017-07-06 10:47:13 +00:00
Max Kazantsev	c8db20b78c	Revert "Revert "[IndVars] Canonicalize comparisons between non-negative values and indvars"" It seems that the patch was reverted by mistake. Clang testing showed failure of the MathExtras.SaturatingMultiply test, however I was unable to reproduce the issue on the fresh code base and was able to confirm that the transformation introduced by the change does not happen in the said test. This gives a strong confidence that the actual reason of the failure of the initial patch was somewhere else, and that problem now seems to be fixed. Re-submitting the change to confirm that. llvm-svn: 307244	2017-07-06 09:57:41 +00:00
Frederich Munch	52dfcd18d1	Avoid constructing GlobalExtensions only to find out it is empty. Summary: GlobalExtensions is dereferenced twice, once for iteration and then a check if it is empty. As a ManagedStatic this dereference forces it's construction which is unnecessary. Reviewers: efriedma, davide, mehdi_amini Reviewed By: mehdi_amini Subscribers: chapuni, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D33381 llvm-svn: 307229	2017-07-06 00:09:09 +00:00
Davide Italiano	7dd0694f96	[GlobalOpt] Remove unreachable blocks before optimizing a function. LLVM's definition of dominance allows instructions that are cyclic in unreachable blocks, e.g.: %pat = select i1 %condition, @global, i16* %pat because any instruction dominates an instruction in a block that's not reachable from entry. So, remove unreachable blocks from the function, because a) there's no point in analyzing them and b) GlobalOpt should otherwise grow some more complicated logic to break these cycles. Differential Revision: https://reviews.llvm.org/D35028 llvm-svn: 307215	2017-07-05 22:28:28 +00:00
Craig Topper	cc418b656a	[InstCombine] Use CmpInst::Predicate with m_Cmp instead of ICmpInst::Predicate. NFC There isn't really an ICmpInst version so we're just accessing the CmpInst version through inheritance. llvm-svn: 307199	2017-07-05 20:31:00 +00:00
Dinar Temirbulatov	b78adec638	[SLPVectorizer] Add an extra parameter to cancelScheduling function, NFCI. llvm-svn: 307158	2017-07-05 13:53:03 +00:00
David Green	b26a0a460c	[IndVarSimplify] Add AShr exact flags using induction variables ranges. This adds exact flags to AShr/LShr flags where we can statically prove it is valid using the range of induction variables. This allows further optimisations to remove extra loads. Differential Revision: https://reviews.llvm.org/D34207 llvm-svn: 307157	2017-07-05 13:25:58 +00:00
Max Kazantsev	ebe56283bc	Revert "[IndVars] Canonicalize comparisons between non-negative values and indvars" This patch seems to cause failures of test MathExtras.SaturatingMultiply on multiple buildbots. Reverting until the reason of that is clarified. Differential Revision: https://reviews.llvm.org/rL307126 llvm-svn: 307135	2017-07-05 09:44:41 +00:00
Max Kazantsev	80bc4a5554	[IndVars] Canonicalize comparisons between non-negative values and indvars -If there is a IndVar which is known to be non-negative, and there is a value which is also non-negative, then signed and unsigned comparisons between them produce the same result. Both of those can be seen in the same loop. To allow other optimizations to simplify them, we turn all instructions like %c = icmp slt i32 %iv, %b to %c = icmp ult i32 %iv, %b if both %iv and %b are known to be non-negative. Differential Revision: https://reviews.llvm.org/D34979 llvm-svn: 307126	2017-07-05 06:38:49 +00:00
Anna Thomas	ada4ddc0bc	[LoopDeletion] NFC: Add loop being analyzed debug statement llvm-svn: 307096	2017-07-04 17:00:03 +00:00
Anna Thomas	90f69abc8b	[LoopDeletion] NFC: Add debug statements to the optimization We have a DEBUG option for loop deletion, but no related debug messages. Added some debug messages to state why loop deletion failed. llvm-svn: 307078	2017-07-04 14:05:19 +00:00
Craig Topper	0f746c2793	[InstCombine] Add TODOs for a couple things that should maybe be in InstSimplify instead. NFC llvm-svn: 307065	2017-07-04 06:50:48 +00:00
Florian Hahn	4eeff394d3	[LoopInterchange] Add more debug messages to currentLimitations(). Summary: This makes it easier to find out which limitation prevented this pass from doing its work. Reviewers: karthikthecool, mzolotukhin, efriedma, mcrosier Reviewed By: mcrosier Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D34940 llvm-svn: 307035	2017-07-03 15:32:00 +00:00
Benjamin Kramer	fb620493e1	Revert "[GVN] Recommit the patch "Add phi-translate support in scalarpre"." This reverts commit r306313. This breaks selfhost at -O3 and PR33652. Let me know if you need additional information on reproducing the issue. llvm-svn: 307021	2017-07-03 12:23:10 +00:00
Craig Topper	8036970008	[InstCombine] Add a TODO for a probable missing single use check. NFC Will try to fix it soon, but in case I forget. llvm-svn: 307003	2017-07-03 05:54:16 +00:00
Craig Topper	766ce6e9cf	[InstCombine] Support BITWISE_OP( BSWAP(x), CONSTANT ) -> BSWAP( BITWISE_OP(x, BSWAP(CONSTANT) ) ) for splat vectors. llvm-svn: 307002	2017-07-03 05:54:15 +00:00
Craig Topper	32fce4d647	[InstCombine] Remove support for BITWISE_OP(CONSTANT, BSWAP(x)) -> BSWAP(OP(BSWAP(CONSTANT), x)). Constants were already canonicalized to the right hand side before we got here. llvm-svn: 307000	2017-07-03 05:54:13 +00:00
Craig Topper	1e4643a98e	[InstCombine] Support BITWISE_OP(BSWAP(A),BSWAP(B))->BSWAP(BITWISE_OP(A, B)) for vectors. llvm-svn: 306999	2017-07-03 05:54:13 +00:00
Craig Topper	c6948c25cc	[InstCombine] Remove an if that should have been guaranteed by the caller. Replace with an assert. NFC llvm-svn: 306997	2017-07-03 05:54:11 +00:00
Simon Pilgrim	df2657ac2d	[InstCombine] Use m_BitReverse pattern match helper. NFCI. llvm-svn: 306986	2017-07-02 16:31:16 +00:00
Sanjay Patel	b51e072d35	[InstCombine] fix crash when folding cmp+bswap vector We assumed the constant was a scalar when creating the replacement operand. Also, improve tests for this fold and move the tests for this fold to their own file. I'll move the related and missing tests to this file as a follow-up. llvm-svn: 306985	2017-07-02 16:05:11 +00:00
Sanjay Patel	c3d5cf0bb7	[InstCombine] look through bswap/bitreverse for equality comparisons I noticed this missed bswap optimization in the CGP memcmp() expansion, and then I saw that we don't have the fold in InstCombine. Differential Revision: https://reviews.llvm.org/D34763 llvm-svn: 306980	2017-07-02 14:34:50 +00:00
Hiroshi Inoue	bb703e8960	fix trivial typos; NFC suport -> support llvm-svn: 306968	2017-07-02 03:24:54 +00:00
Craig Topper	f60ab47098	[InstCombine] Fold (a \| b) ^ (~a \| ~b) --> ~(a ^ b) and (a & b) ^ (~a & ~b) --> ~(a ^ b) Summary: I came across this while thinking about what would happen if one of the operands in this xor pattern was itself a inverted (A & ~B) ^ (~A & B)-> (A^B). The patterns here assume that the (~a \| ~b) will be demorganed to ~(a & b) first. Though I wonder if there's a multiple use case that would prevent the demorgan. Reviewers: spatel Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34870 llvm-svn: 306967	2017-07-02 01:15:51 +00:00
Davide Italiano	e3f7dda1fb	[CodeExtractor] Remove unneded and commented out debugging stmts. llvm-svn: 306966	2017-07-02 00:07:18 +00:00
Hiroshi Inoue	ef1c2ba22a	fix trivial typos, NFC llvm-svn: 306952	2017-07-01 07:12:15 +00:00
Davide Italiano	9282f1aece	[Cloner] Re-map simplfied cloned instructions. This commit pretty much rolls back the logic added in r306495 as in the testcase provided we simplify an `icmp` looking through a PHI that hasn't been mapped yet. I think instsimplify shouldn't do threading over select/phis or just looking through phis in general, but this is what we have now. Also, add a test to prevent this from happening in case somebody wants to modify this code again. Briefly discussed with Kyle Butt (thanks Kyle!). llvm-svn: 306938	2017-07-01 03:29:33 +00:00
Teresa Johnson	32d95742b8	Recommit "r306541 - Add zero-length check to memcpy/memset load store loop expansion"" With fix for use-after-free errors. We can't add the new branch and remove the old one until we are done with the Builder constructed for the block. llvm-svn: 306937	2017-07-01 03:24:10 +00:00
Teresa Johnson	c12306c0ad	Revert "r306473 - re-commit r306336: Enable vectorizer-maximize-bandwidth by default." This still breaks PPC tests we have. I'll forward reproduction instructions to dehao. llvm-svn: 306936	2017-07-01 03:24:09 +00:00
Teresa Johnson	eb4fba9d61	re-commit r306336: Enable vectorizer-maximize-bandwidth by default. Differential Revision: https://reviews.llvm.org/D33341 llvm-svn: 306935	2017-07-01 03:24:08 +00:00
Teresa Johnson	de56903bde	revert r306336 for breaking ppc test. llvm-svn: 306934	2017-07-01 03:24:07 +00:00
Teresa Johnson	1fbaffeba1	Enable vectorizer-maximize-bandwidth by default. Summary: vectorizer-maximize-bandwidth is generally useful in terms of performance. I've tested the impact of changing this to default on speccpu benchmarks on sandybridge machines. The result shows non-negative impact: spec/2006/fp/C++/444.namd 26.84 -0.31% spec/2006/fp/C++/447.dealII 46.19 +0.89% spec/2006/fp/C++/450.soplex 42.92 -0.44% spec/2006/fp/C++/453.povray 38.57 -2.25% spec/2006/fp/C/433.milc 24.54 -0.76% spec/2006/fp/C/470.lbm 41.08 +0.26% spec/2006/fp/C/482.sphinx3 47.58 -0.99% spec/2006/int/C++/471.omnetpp 22.06 +1.87% spec/2006/int/C++/473.astar 22.65 -0.12% spec/2006/int/C++/483.xalancbmk 33.69 +4.97% spec/2006/int/C/400.perlbench 33.43 +1.70% spec/2006/int/C/401.bzip2 23.02 -0.19% spec/2006/int/C/403.gcc 32.57 -0.43% spec/2006/int/C/429.mcf 40.35 +0.27% spec/2006/int/C/445.gobmk 26.96 +0.06% spec/2006/int/C/456.hmmer 24.4 +0.19% spec/2006/int/C/458.sjeng 27.91 -0.08% spec/2006/int/C/462.libquantum 57.47 -0.20% spec/2006/int/C/464.h264ref 46.52 +1.35% geometric mean +0.29% The regression on 453.povray seems real, but is due to secondary effects as all hot functions are bit-identical with and without the flag. I started this patch to consult upstream opinions on this. It will be greatly appreciated if the community can help test the performance impact of this change on other architectures so that we can decided if this should be target-dependent. Reviewers: hfinkel, mkuper, davidxl, chandlerc Reviewed By: chandlerc Subscribers: rengolin, sanjoy, javed.absar, bjope, dorit, magabari, RKSimon, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33341 llvm-svn: 306933	2017-07-01 03:24:06 +00:00
Dinar Temirbulatov	2fb1075f14	[SLPVectorizer] Add isOdd() helper function, NFCI. llvm-svn: 306887	2017-06-30 21:16:26 +00:00
Craig Topper	bcf511c0da	[InstCombine] Replace an unnecessary use of a matcher with just an isa and a cast. NFC We aren't looking through any levels of IR here so I don't think we need the power of a matcher or the temporary variable it requires. llvm-svn: 306885	2017-06-30 21:09:34 +00:00
Ayal Zaks	2ff59d4350	[LV] Sink casts to unravel first order recurrence Check if a single cast is preventing handling a first-order-recurrence Phi, because the scheduling constraints it imposes on the first-order-recurrence shuffle are infeasible; but they can be made feasible by moving the cast downwards. Record such casts and move them when vectorizing the loop. Differential Revision: https://reviews.llvm.org/D33058 llvm-svn: 306884	2017-06-30 21:05:06 +00:00
Sumanth Gundapaneni	5372f0a73e	[SimplifyCFG] Update the name of switch generated lookup table. This patch appends the name of the function to the switch generated lookup table. This will ease the visual debugging in identifying the function the table is generated from. Differential Revision: https://reviews.llvm.org/D34817 llvm-svn: 306867	2017-06-30 20:00:01 +00:00
Simon Pilgrim	77c3c5f9b8	[InstCombine] Add m_BitReverse pattern match helper. NFCI. llvm-svn: 306860	2017-06-30 18:58:29 +00:00
Anna Thomas	e5e5e59d8b	[RuntimeUnrolling] Add logic for loops with multiple exit blocks Summary: Runtime unrolling is done for loops with a single exit block and a single exiting block (and this exiting block should be the latch block). This patch adds logic to support unrolling in the presence of multiple exit blocks (which also means multiple exiting blocks). Currently this is under an off-by-default option and is supported when epilog code is generated. Support in presence of prolog code will be in a future patch (we just need to add more tests, and update comments). This patch is essentially an implementation patch. I have not added any heuristic (in terms of branches added or code size) to decide when this should be enabled. Reviewers: mkuper, sanjoy, reames, evstupac Reviewed by: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33001 llvm-svn: 306846	2017-06-30 17:57:07 +00:00
Nikolai Bozhenov	bde9b14c6f	Revert of r306525: "Canonicalize clamp of float types to minmax" llvm-svn: 306815	2017-06-30 10:39:09 +00:00
Ayal Zaks	8d26f0a602	[LV] Optimize for size when vectorizing loops with tiny trip count It may be detrimental to vectorize loops with very small trip count, as various costs of the vectorized loop body as well as enclosing overheads including runtime tests and scalar iterations may outweigh the gains of vectorizing. The current cost model measures the cost of the vectorized loop body only, expecting it will amortize other costs, and loops with known or expected very small trip counts are not vectorized at all. This patch allows loops with very small trip counts to be vectorized, but under OptForSize constraints, which ensure the cost of the loop body is dominant, having no runtime guards nor scalar iterations. Patch inspired by D32451. Differential Revision: https://reviews.llvm.org/D34373 llvm-svn: 306803	2017-06-30 08:02:35 +00:00
Craig Topper	880bf82685	[InstCombine] In foldXorToXor, move the commutable matcher from the LHS match to the RHS match. No meaningful change intended. There are two conditions ORed here with similar checks and each contain two matches that must be true for the if to succeed. With the commutable match on the first half of the OR then both ifs basically have the same first part and only the second part distinguishs. With this change we move the commutable match to second half and make the first half unique. This caused some tests to change because we now produce a commuted result, but this shouldn't matter in practice. llvm-svn: 306800	2017-06-30 07:37:41 +00:00
Chandler Carruth	3545a9e1f9	Remove the BBVectorize pass. It served us well, helped kick-start much of the vectorization efforts in LLVM, etc. Its time has come and past. Back in 2014: http://lists.llvm.org/pipermail/llvm-dev/2014-November/079091.html Time to actually let go and move forward. =] I've updated the release notes both about the removal and the deprecation of the corresponding C API. llvm-svn: 306797	2017-06-30 07:09:08 +00:00
Daniel Jasper	3b704ceba1	Revert "r306541 - Add zero-length check to memcpy/memset load store loop expansion" Segfaults in non-optimized builds. I'll get a stack trace and a reproducer to Teresa. llvm-svn: 306793	2017-06-30 06:37:33 +00:00
Daniel Jasper	5ce1ce742e	Revert "r306473 - re-commit r306336: Enable vectorizer-maximize-bandwidth by default." This still breaks PPC tests we have. I'll forward reproduction instructions to dehao. llvm-svn: 306792	2017-06-30 06:32:21 +00:00
Max Kazantsev	8d0322e612	[SCEV] Use depth limit instead of local cache for SExt and ZExt In rL300494 there was an attempt to deal with excessive compile time on invocations of getSign/ZeroExtExpr using local caching. This approach only helps if we request the same SCEV multiple times throughout recursion. But in the bug PR33431 we see a case where we request different values all the time, so caching does not help and the size of the cache grows enormously. In this patch we remove the local cache for this methods and add the recursion depth limit instead, as we do for arithmetics. This gives us a guarantee that the invocation sequence is limited and reasonably short. Differential Revision: https://reviews.llvm.org/D34273 llvm-svn: 306785	2017-06-30 05:04:09 +00:00
Eric Christopher	a95aac3751	Reduce indenting and clean up comparisons around sign bit. llvm-svn: 306781	2017-06-30 01:57:48 +00:00
Eric Christopher	710c1c8faa	Reduce the complexity of the signbit/branch test functions. llvm-svn: 306779	2017-06-30 01:35:31 +00:00
Dehao Chen	2f31d0d86e	Hook the sample PGO machinery in the new PM Summary: This patch hooks up SampleProfileLoaderPass with the new PM. Reviewers: chandlerc, davidxl, davide, tejohnson Reviewed By: chandlerc, tejohnson Subscribers: tejohnson, llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D34720 llvm-svn: 306763	2017-06-29 23:33:05 +00:00
Dinar Temirbulatov	f05c73c132	[SLPVectorizer] Moving Entry->NeedToGather check out of inner loop, since it is invariant there. NFCI. llvm-svn: 306749	2017-06-29 21:56:33 +00:00
Sam Clegg	3d65030c45	Remove `inline` keyword from inline `classof` methods The style guide states that the explicit `inline` should not be used with inline methods. classof is very common inline method with a fair amount on inconsistency: $ git grep classof ./include \| grep inline \| wc -l 230 $ git grep classof ./include \| grep -v inline \| wc -l 257 I chose to target this method rather the larger change since this method is easily cargo-culted (I did it at least once). I considered doing the larger change and removing all occurrences but that would be a much larger change. Differential Revision: https://reviews.llvm.org/D33906 llvm-svn: 306731	2017-06-29 19:35:17 +00:00
Xin Tong	02008c30b5	Remove useless header. NFC llvm-svn: 306712	2017-06-29 17:48:12 +00:00
Leo Li	20fbad9307	[ConstantHoisting] Avoid hoisting constants in GEPs that index into a struct type. Summary: Indices for GEPs that index into a struct type should always be constants. This added more checks in `collectConstantCandidates:` which make sure constants for GEP pointer type are not hoisted. This fixed Bug https://bugs.llvm.org/show_bug.cgi?id=33538 Reviewers: ributzka, rnk Reviewed By: ributzka Subscribers: efriedma, llvm-commits, srhines, javed.absar, pirama Differential Revision: https://reviews.llvm.org/D34576 llvm-svn: 306704	2017-06-29 17:03:34 +00:00
Daniel Berlin	b7df17ec59	PredicateInfo: Use OrderedInstructions instead of our homemade version. llvm-svn: 306703	2017-06-29 17:01:14 +00:00
Daniel Berlin	b779db7ebc	NewGVN: Remove useless test in addPhiOfOps. llvm-svn: 306702	2017-06-29 17:01:10 +00:00
Daniel Berlin	7c757aee38	Remove unneeded else from OrderedInstructions::dominates. llvm-svn: 306701	2017-06-29 17:01:03 +00:00
Dinar Temirbulatov	7b96266a16	[SLPVectorizer] Introducing getTreeEntry() helper function [NFC] Differential Revision: https://reviews.llvm.org/D34756 llvm-svn: 306655	2017-06-29 08:46:18 +00:00
Craig Topper	798a19ab8e	[InstCombine] In visitXor, use m_Not on the instruction itself instead of looking for all ones in Op1. This is consistent with 3 other not checks before this one. NFCI llvm-svn: 306617	2017-06-29 00:07:08 +00:00
Keno Fischer	a236dae5d1	[InstCombine] Retain TBAA when narrowing memory accesses Summary: As discussed on the mailing list it is legal to propagate TBAA to loads/stores from/to smaller regions of a larger load tagged with TBAA. Do so for (load->extractvalue)=>(gep->load) and similar foldings. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D31954 llvm-svn: 306615	2017-06-28 23:36:40 +00:00
Ayal Zaks	d9bc43ef2a	[LV] Fix PR33613 - retain order of insertelement per part r306381 caused PR33613, by reversing the order in which insertelements were generated per unroll part. This patch fixes PR33613 by retraining this order, placing each set of insertelements per part immediately after the last scalar being packed for this part. Includes a test case derived from PR33613. Reference: https://bugs.llvm.org/show_bug.cgi?id=33613 Differential Revision: https://reviews.llvm.org/D34760 llvm-svn: 306575	2017-06-28 17:59:33 +00:00
Geoff Berry	b0573547f6	[LoopUnroll] Fix bug in computeUnrollCount causing it to not honor MaxCount Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper Subscribers: mcrosier, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D34532 llvm-svn: 306564	2017-06-28 17:01:15 +00:00
Sanjay Patel	4e96f19052	[InstCombine] use local variable to reduce code; NFCI llvm-svn: 306560	2017-06-28 16:39:06 +00:00
Geoff Berry	66d9bdbca8	[LoopUnroll] Pass SCEV to getUnrollingPreferences hook. NFCI. Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper Subscribers: jholewinski, arsenm, mzolotukhin, nemanjai, nhaehnle, javed.absar, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D34531 llvm-svn: 306554	2017-06-28 15:53:17 +00:00
Teresa Johnson	538b8d25f0	Add zero-length check to memcpy/memset load store loop expansion Summary: I was testing using this expansion logic in other cases besides NVPTX, and found some runtime failures due to the lack of a check for a zero length memcpy/memset before the loop. There is already such a check in the memmove expansion code though. Reviewers: hfinkel Subscribers: jholewinski, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D34707 llvm-svn: 306541	2017-06-28 13:07:37 +00:00
Nikolai Bozhenov	b01e6b5a52	[InstCombine] Canonicalize clamp of float types to minmax in fast mode. Summary: This commit allows matchSelectPattern to recognize clamp of float arguments in the presence of FMF the same way as already done for integers. This case is a little different though. With integers, given the min/max pattern is recognized, DAGBuilder starts selecting MIN/MAX "automatically". That is not the case for float, because for them only full FMINNAN/FMINNUM/FMAXNAN/FMAXNUM ISD nodes exist and they do care about NaNs. On the other hand, some backends (e.g. X86) have only FMIN/FMAX nodes that do not care about NaNS and the former NAN/NUM nodes are illegal thus selection is not happening. So I decided to do such kind of transformation in IR (InstCombiner) instead of complicating the logic in the backend. Reviewers: spatel, jmolloy, majnemer, efriedma, craig.topper Reviewed By: efriedma Subscribers: hiraditya, javed.absar, n.bozhenov, llvm-commits Patch by Andrei Elovikov <andrei.elovikov@intel.com> Differential Revision: https://reviews.llvm.org/D33186 llvm-svn: 306525	2017-06-28 09:26:20 +00:00
Max Kazantsev	6c466a376e	[IRCE][NFC] Better get SCEV for 1 in calculateSubRanges A slightly more efficient way to get constant, we avoid resolving in getSCEV and excessive invocations, and we don't create a ConstantInt if 'true' branch is taken. Differential Revision: https://reviews.llvm.org/D34672 llvm-svn: 306503	2017-06-28 04:57:45 +00:00

1 2 3 4 5 ...

18469 Commits