Commit Graph

1530 Commits

Author SHA1 Message Date
Sanjay Patel 84dca494b1 don't repeat function names in comments; NFC
llvm-svn: 248166
2015-09-21 15:33:26 +00:00
Simon Pilgrim 996725eb17 [InstCombine] Use SimplifyDemandedVectorEltsLow helper function. NFCI.
Use the SimplifyDemandedVectorEltsLow helper function introduced in D12680.

llvm-svn: 248089
2015-09-19 11:41:53 +00:00
David Majnemer 47ce0b81b0 [InstCombine] FoldICmpCstShrCst failed for ashr when comparing against -1
(icmp eq (ashr C1, %V) -1) may have multiple answers if C1 is not a
power of two and has the sign bit set.

This fixes PR24873.

llvm-svn: 248074
2015-09-19 00:48:31 +00:00
David Majnemer e5977ebecc [InstCombine] FoldICmpCstShrCst didn't handle icmps of -1 in the ashr case correctly
llvm-svn: 248073
2015-09-19 00:48:26 +00:00
Larisse Voufo 532bf7153c Clean up: Refactoring the hardcoded value of 6 for FindAvailableLoadedValue()'s parameter MaxInstsToScan. (Complete version of r247497. See D12886)
llvm-svn: 248022
2015-09-18 19:14:35 +00:00
Simon Pilgrim 61116ddc7b [InstCombine] Added vector demanded bits support for SSE4A EXTRQ/INSERTQ instructions
The SSE4A instructions EXTRQ/INSERTQ only use the lower 64-bits (or less) for many of their input vector operands and all of them have undefined upper 64-bits results.

Differential Revision: http://reviews.llvm.org/D12680

llvm-svn: 247934
2015-09-17 20:32:45 +00:00
Sanjoy Das e5f4889ba9 [InstCombine] Optimize icmp slt signum(x), 1 --> icmp slt x, 1
Summary:
`signum(x)` is sometimes implemented as `(x >> 63) | (-x >>> 63)` (for
an `i64` `x`).  This change adds a matcher for that pattern, and an
instcombine rule to optimize `signum(x) s< 1`.

Later, we can also consider optimizing:

  icmp slt signum(x), 0 --> icmp slt x, 0
  icmp sle signum(x), 1 --> true

etc.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12703

llvm-svn: 247846
2015-09-16 20:41:29 +00:00
Larisse Voufo 6b867c7254 Revert "Clean up: Refactoring the hardcoded value of 6 for FindAvailableLoadedValue()'s parameter MaxInstsToScan." for preliminary community discussion (See. D12886)
llvm-svn: 247716
2015-09-15 19:14:05 +00:00
Arch D. Robison 8ed0854f55 Broaden optimization of fcmp ([us]itofp x, constant) by instcombine.
The patch extends the optimization to cases where the constant's
magnitude is so small or large that the rounding of the conversion
is irrelevant.  The "so small" case includes negative zero.

Differential review: http://reviews.llvm.org/D11210

llvm-svn: 247708
2015-09-15 17:51:59 +00:00
Chen Li 0d043b52eb [InstCombineCalls] Use isKnownNonNullAt() to check nullness of passing arguments at callsite
Summary: This patch replaces isKnownNonNull() with isKnownNonNullAt() when checking nullness of passing arguments at callsite. In this way it can handle cases where the argument does not have nonnull attribute but has a dominating null check from the CFG. It also adds assertions in isKnownNonNull() and isKnownNonNullFromDominatingCondition() to make sure the value checked is pointer type (as defined in LLVM document). These assertions might trip failures in things which are not  covered under llvm/test, but fixes should be pretty obvious. 

Reviewers: reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12779

llvm-svn: 247587
2015-09-14 18:10:43 +00:00
Simon Pilgrim 48ffca0f47 Fixed unused variable warning.
llvm-svn: 247505
2015-09-12 14:00:17 +00:00
Simon Pilgrim 20c607b110 [InstCombine] CVTPH2PS Vector Demanded Elements + Constant Folding
Improved InstCombine support for CVTPH2PS (F16C half 2 float conversion):

<4 x float> @llvm.x86.vcvtph2ps.128(<8 x i16>) - only uses the bottom 4 i16 elements for the conversion.

Added constant folding support.

Differential Revision: http://reviews.llvm.org/D12731

llvm-svn: 247504
2015-09-12 13:39:53 +00:00
Larisse Voufo f57162b6e7 Clean up: Refactoring the hardcoded value of 6 for FindAvailableLoadedValue()'s parameter MaxInstsToScan.
llvm-svn: 247497
2015-09-12 01:41:55 +00:00
Bruce Mitchener e9ffb45b60 Fix typos.
Summary: This fixes a variety of typos in docs, code and headers.

Subscribers: jholewinski, sanjoy, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D12626

llvm-svn: 247495
2015-09-12 01:17:08 +00:00
Sanjay Patel 41c739b3fa typo; NFC
llvm-svn: 247454
2015-09-11 19:29:18 +00:00
Mehdi Amini 2bd08527ff Revert "[InstCombineCalls] Use isKnownNonNullAt() to check nullness of passing arguments at callsite"
This reverts commit r247356.

Breaks test/Transforms/InstCombine/pr8547.ll with:

Wrong types for attribute: byval inalloca nest noalias nocapture nonnull readnone readonly sret dereferenceable(1) dereferenceable_or_null(1)
  %call = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([10 x i8], [10 x i8]* @.str, i64 0, i64 0), i32 nonnull %conv2) #0
LLVM ERROR: Broken function found, compilation aborted!

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 247371
2015-09-11 01:33:48 +00:00
Chen Li a29c612ddd [InstCombineCalls] Use isKnownNonNullAt() to check nullness of passing arguments at callsite
Summary: This patch replaces isKnownNonNull() with isKnownNonNullAt() when checking nullness of passing arguments at callsite. In this way it can handle cases where the argument does not have nonnull attribute but has a dominating null check from the CFG.

Reviewers: reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12779

llvm-svn: 247356
2015-09-10 23:04:49 +00:00
Chen Li 32a51416e5 [InstCombineCalls] Use isKnownNonNullAt() to check nullness of gc.relocate return value
Summary: This patch replaces isKnownNonNull() with isKnownNonNullAt() when checking nullness of gc.relocate return value. In this way it can handle cases where the relocated value does not have nonnull attribute but has a dominating null check from the CFG.

Reviewers: reames

Subscribers: llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D12772

llvm-svn: 247353
2015-09-10 22:35:41 +00:00
Jakub Kuderski 58ea4eeb9e There is a trunc(lshr (zext A), Cst) optimization in InstCombineCasts that
removes cast by performing the lshr on smaller types. However, currently there
is no trunc(lshr (sext A), Cst) variant.
This patch add such optimization by transforming trunc(lshr (sext A), Cst)
to ashr A, Cst.

Differential Revision: http://reviews.llvm.org/D12520

llvm-svn: 247271
2015-09-10 11:31:20 +00:00
David Majnemer d34dbf07bd Revert trunc(lshr (sext A), Cst) to ashr A, Cst
This reverts commit r246997, it introduced a regression (PR24763).

llvm-svn: 247180
2015-09-09 20:20:08 +00:00
Chandler Carruth 7b560d40bd [PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible
with the new pass manager, and no longer relying on analysis groups.

This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:

- FunctionAAResults is a type-erasing alias analysis results aggregation
  interface to walk a single query across a range of results from
  different alias analyses. Currently this is function-specific as we
  always assume that aliasing queries are *within* a function.

- AAResultBase is a CRTP utility providing stub implementations of
  various parts of the alias analysis result concept, notably in several
  cases in terms of other more general parts of the interface. This can
  be used to implement only a narrow part of the interface rather than
  the entire interface. This isn't really ideal, this logic should be
  hoisted into FunctionAAResults as currently it will cause
  a significant amount of redundant work, but it faithfully models the
  behavior of the prior infrastructure.

- All the alias analysis passes are ported to be wrapper passes for the
  legacy PM and new-style analysis passes for the new PM with a shared
  result object. In some cases (most notably CFL), this is an extremely
  naive approach that we should revisit when we can specialize for the
  new pass manager.

- BasicAA has been restructured to reflect that it is much more
  fundamentally a function analysis because it uses dominator trees and
  loop info that need to be constructed for each function.

All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.

The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.

This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.

Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.

One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.

Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.

Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.

Differential Revision: http://reviews.llvm.org/D12080

llvm-svn: 247167
2015-09-09 17:55:00 +00:00
Sanjay Patel 6eccf487c9 don't repeat function names in comments; NFC
llvm-svn: 247154
2015-09-09 15:24:36 +00:00
Sanjay Patel e283441836 function names start with a lower case letter; NFC
llvm-svn: 247150
2015-09-09 14:54:29 +00:00
Sanjay Patel 2fbab9d893 don't repeat function names in comments; NFC
llvm-svn: 247148
2015-09-09 14:34:26 +00:00
Sanjay Patel b54e62fe17 refactor matches for De Morgan's Laws; NFCI
llvm-svn: 247061
2015-09-08 20:14:13 +00:00
Sanjay Patel 1854927556 remove function names from comments; NFC
llvm-svn: 247043
2015-09-08 18:24:36 +00:00
Jakub Kuderski 7cd4810021 There is a trunc(lshr (zext A), Cst) optimization in InstCombineCasts that
removes cast by performing the lshr on smaller types. However, currently there
is no trunc(lshr (sext A), Cst) variant.
This patch add such optimization by transforming trunc(lshr (sext A), Cst)
to ashr A, Cst.

Differential Revision: http://reviews.llvm.org/D12520

llvm-svn: 246997
2015-09-08 10:03:17 +00:00
David Majnemer 135ca40a7d [InstCombine] Don't divide by zero when evaluating a potential transform
Trivial multiplication by zero may survive the worklist.  We tried to
reassociate the multiplication with a division instruction, causing us
to divide by zero; bail out instead.

This fixes PR24726.

llvm-svn: 246939
2015-09-06 06:49:59 +00:00
David Majnemer daa24b9789 [InstCombine] Don't assume m_Mul gives back an Instruction
This fixes PR24713.

llvm-svn: 246933
2015-09-05 20:44:56 +00:00
Sanjoy Das 6f5dca70ed [InstCombine] Fix PR24605.
PR24605 is caused due to an incorrect insert point in instcombine's IR
builder.  When simplifying

  %t = add X Y
  ...
  %m = icmp ... %t

the replacement for %t should be placed before %t, not before %m, as
there could be a use of %t between %t and %m.

llvm-svn: 246315
2015-08-28 19:09:31 +00:00
Sanjoy Das c86c162a58 Re-apply r245635, "[InstCombine] Transform A & (L - 1) u< L --> L != 0"
The original checkin was buggy, this change has a fix.

Original commit message:

[InstCombine] Transform A & (L - 1) u< L --> L != 0

Summary:

This transform is never a pessimization at the IR level (since it
replaces an `icmp` with another), and has potentiall payoffs:

 1. It may make the `icmp` fold away or become loop invariant.
 2. It may make the `A & (L - 1)` computation dead.

This shows up in Java, in range checks generated by array accesses of
the form `a[i & (a.length - 1)]`.

Reviewers: reames, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12210

llvm-svn: 245753
2015-08-21 22:22:37 +00:00
NAKAMURA Takumi 6a6232818d Revert r245635, "[InstCombine] Transform A & (L - 1) u< L --> L != 0"
It caused miscompilation in clang.

llvm-svn: 245678
2015-08-21 07:46:07 +00:00
Sanjoy Das e472d8a57a [InstCombine] Transform A & (L - 1) u< L --> L != 0
Summary:
This transform is never a pessimization at the IR level (since it
replaces an `icmp` with another), and has potentiall payoffs:

 1. It may make the `icmp` fold away or become loop invariant.
 2. It may make the `A & (L - 1)` computation dead.

This shows up in Java, in range checks generated by array accesses of
the form `a[i & (a.length - 1)]`.

Reviewers: reames, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12210

llvm-svn: 245635
2015-08-20 22:31:55 +00:00
Adrian Prantl cbdfdb74d3 Rename Instruction::dropUnknownMetadata() to dropUnknownNonDebugMetadata()
and make it always preserve debug locations, since all callers wanted this
behavior anyway.

This is addressing a post-commit review feedback for r245589.

NFC (inside the LLVM tree).

llvm-svn: 245622
2015-08-20 22:00:30 +00:00
Adrian Prantl baf90fc265 Fix a bug that caused SimplifyCFG to drop DebugLocs.
Instruction::dropUnknownMetadata(KnownSet) is supposed to preserve all
metadata in KnownSet, but the condition for DebugLocs was inverted.

Most users of dropUnknownMetadata() actually worked around this by not
adding LLVMContext::MD_dbg to their list of KnowIDs.
This is now made explicit.

llvm-svn: 245589
2015-08-20 18:24:02 +00:00
Balaram Makam ccf59731e3 Optimize bitwise even/odd test (-x&1 -> x&1) to not use negation.
Summary: We know that -x & 1 is equivalent to x & 1, avoid using negation for testing if a negative integer is even or odd.

Reviewers: majnemer

Subscribers: junbuml, mssimpso, gberry, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D12156

llvm-svn: 245569
2015-08-20 15:35:00 +00:00
Sanjay Patel 1cd6d88e4d use minSize wrapper; NFCI
These were missed when other uses were switched over:
http://llvm.org/viewvc/llvm-project?view=revision&revision=243994

llvm-svn: 245311
2015-08-18 16:44:23 +00:00
David Majnemer 8ed559ad22 Revert "[InstCombinePHI] Partial simplification of identity operations."
This reverts commit r244887, it caused PR24470.

llvm-svn: 245194
2015-08-17 03:11:26 +00:00
David Majnemer dfa3b09541 [InstCombine] Replace an and+icmp with a trunc+icmp
Bitwise arithmetic can obscure a simple sign-test.  If replacing the
mask with a truncate is preferable if the type is legal because it
permits us to rephrase the comparison more explicitly.

llvm-svn: 245171
2015-08-16 07:09:17 +00:00
Nick Lewycky 8075fd22b9 Fix a crash where a utility function wasn't aware of fcmp vectors and created a value with the wrong type. Fixes PR24458!
llvm-svn: 245119
2015-08-14 22:46:49 +00:00
Charlie Turner 6153698f26 [InstCombinePHI] Partial simplification of identity operations.
Consider this code:

BB:
  %i = phi i32 [ 0, %if.then ], [ %c, %if.else ]
  %add = add nsw i32 %i, %b
  ...

In this common case the add can be moved to the %if.else basic block, because
adding zero is an identity operation. If we go though %if.then branch it's
always a win, because add is not executed; if not, the number of instructions
stays the same.

This pattern applies also to other instructions like sub, shl, shr, ashr | 0,
mul, sdiv, div | 1.

Patch by Jakub Kuderski!

llvm-svn: 244887
2015-08-13 12:38:58 +00:00
Simon Pilgrim becd5e8abd [InstCombine] SSE/AVX vector shifts demanded shift amount bits
Most SSE/AVX (non-constant) vector shift instructions only use the lower 64-bits of the 128-bit shift amount vector operand, this patch calls SimplifyDemandedVectorElts to optimize for this.

I had to refactor some of my recent InstCombiner work on the vector shifts to avoid quite a bit of duplicate code, it means that SimplifyX86immshift now (re)decodes the type of shift.

Differential Revision: http://reviews.llvm.org/D11938

llvm-svn: 244872
2015-08-13 07:39:03 +00:00
Simon Pilgrim 93f59f53ca unused variable warning fix.
llvm-svn: 244725
2015-08-12 08:23:36 +00:00
Simon Pilgrim 8c049d5c03 [InstCombine] Move SSE/AVX vector blend folding to instcombiner
As discussed in D11886, this patch moves the SSE/AVX vector blend folding to instcombiner from PerformINTRINSIC_WO_CHAINCombine (which allows us to remove this completely).

InstCombiner already had partial support for this, I just had to add support for zero (ConstantAggregateZero) masks and also the case where both selection inputs were the same (allowing us to ignore the mask).

I also moved all the relevant combine tests into InstCombine/blend_x86.ll

Differential Revision: http://reviews.llvm.org/D11934

llvm-svn: 244723
2015-08-12 08:08:56 +00:00
Sanjoy Das 827529e7a0 Fix PR24354.
`InstCombiner::OptimizeOverflowCheck` was asserting an
invariant (operands to binary operations are ordered by decreasing
complexity) that wasn't really an invariant.  Fix this by instead having
`InstCombiner::OptimizeOverflowCheck` establish the invariant if it does
not hold.

llvm-svn: 244676
2015-08-11 21:33:55 +00:00
James Molloy 134bec2722 Add support for floating-point minnum and maxnum
The select pattern recognition in ValueTracking (as used by InstCombine
and SelectionDAGBuilder) only knew about integer patterns. This teaches
it about minimum and maximum operations.

matchSelectPattern() has been extended to return a struct containing the
existing Flavor and a new enum defining the pattern's behavior when
given one NaN operand.

C minnum() is defined to return the non-NaN operand in this case, but
the idiomatic C "a < b ? a : b" would return the NaN operand.

ARM and AArch64 at least have different instructions for these different cases.

llvm-svn: 244580
2015-08-11 09:12:57 +00:00
Simon Pilgrim a3a72b41de [InstCombine] Move SSE2/AVX2 arithmetic vector shift folding to instcombiner
As discussed in D11760, this patch moves the (V)PSRA(WD) arithmetic shift-by-constant folding to InstCombine to match the logical shift implementations.

Differential Revision: http://reviews.llvm.org/D11886

llvm-svn: 244495
2015-08-10 20:21:15 +00:00
Benjamin Kramer df005cbe19 Fix some comment typos.
llvm-svn: 244402
2015-08-08 18:27:36 +00:00
David Majnemer 60c994b985 [InstCombine] Don't try to sink EH pad instructions
Found by inspection, this change should not effect the existing
landingpad behavior.

llvm-svn: 244391
2015-08-08 03:51:49 +00:00
Simon Pilgrim 3815c16bf8 [InstCombine] Fix SSE2/AVX2 vector logical shift by constant
This patch fixes the sse2/avx2 vector shift by constant instcombine call to correctly deal with the fact that the shift amount is formed from the entire lower 64-bit and not just the lowest element as it currently assumes.

e.g.

%1 = tail call <4 x i32> @llvm.x86.sse2.psrl.d(<4 x i32> %v, <4 x i32> <i32 15, i32 15, i32 15, i32 15>)

In this case, (V)PSRLD doesn't perform a lshr by 15 but in fact attempts to shift by 64424509455 ((15 << 32) | 15) - giving a zero result.

In addition, this review also recognizes shift-by-zero from a ConstantAggregateZero type (PR23821).

Differential Revision: http://reviews.llvm.org/D11760

llvm-svn: 244341
2015-08-07 18:22:50 +00:00
Pete Cooper ebcd748927 Convert a bunch of loops to foreach. NFC.
After r244074, we now have a successors() method to iterate over
all the successors of a TerminatorInst.  This commit changes a bunch
of eligible loops to use it.

llvm-svn: 244260
2015-08-06 20:22:46 +00:00
Simon Pilgrim 18617d193f Fixed line endings.
llvm-svn: 244021
2015-08-05 08:18:00 +00:00
Simon Pilgrim dcfd7a3fba [InstCombine] Moved SSE vector shift constant folding into its own helper function. NFCI.
This will make some upcoming bugfixes + improvements easier to manage.

llvm-svn: 243962
2015-08-04 07:49:58 +00:00
Sanjay Patel d411114e77 fix formatting; NFC
llvm-svn: 243424
2015-07-28 15:38:43 +00:00
Simon Pilgrim 074c0d97dc Fixed signed/unsigned comparison warning.
llvm-svn: 243306
2015-07-27 19:07:15 +00:00
Simon Pilgrim 15c0a59463 [InstCombine][X86][SSE] Replace sign/zero extension intrinsics with native IR
Now that we are generating sane codegen for vector sext/zext nodes on SSE targets, this patch uses instcombine to replace the SSE41/AVX2 pmovsx and pmovzx intrinsics with the equivalent native IR code.

Differential Revision: http://reviews.llvm.org/D11503

llvm-svn: 243303
2015-07-27 18:52:15 +00:00
Simon Pilgrim 54fcd62c6f [InstCombine][SSE4A] Standardized references to Length/Width and Index/Start to match AMD docs. NFCI.
llvm-svn: 243226
2015-07-25 20:41:00 +00:00
David Majnemer 33b6f82e72 [InstCombine] Generalize sub of selects optimization to all BinaryOperators
This exposes further optimization opportunities if the selects are
correlated.

llvm-svn: 242235
2015-07-14 22:39:23 +00:00
David Majnemer 599ca4426c [InstSimplify] Teach InstSimplify how to simplify extractelement
llvm-svn: 242008
2015-07-13 01:15:53 +00:00
David Majnemer 25a796e148 [InstSimplify] Teach InstSimplify how to simplify extractvalue
llvm-svn: 242007
2015-07-13 01:15:46 +00:00
Bjorn Steinbrink a6b929dfe2 [InstCombine] Actually combine AA metadata when replacing one load with another
Fixes PR24083

llvm-svn: 241955
2015-07-10 22:30:17 +00:00
Benjamin Kramer f4ebfa3ae1 [InstSimplify] Fold away ord/uno fcmps when nnan is present.
This is important to fold away the slow case of complex multiplies
emitted by clang.

llvm-svn: 241911
2015-07-10 14:02:02 +00:00
Bjorn Steinbrink 8350534772 [InstCombine] Employ AliasAnalysis in FindAvailableLoadedValue
llvm-svn: 241887
2015-07-10 06:55:49 +00:00
Bjorn Steinbrink a91fd0998f [InstCombine] Properly combine metadata when replacing a load with another
Not doing this can lead to misoptimizations down the line, e.g. because
of range metadata on the replacing load excluding values that are valid
for the load that is being replaced.

llvm-svn: 241886
2015-07-10 06:55:44 +00:00
Jingyue Wu 5e34ce33f5 [InstCombine] call SimplifyICmpInst with correct context
Summary:
Fixes PR23809. Without passing the context to SimplifyICmpInst, we would
use the assume to prove that the condition feeding the assume is
trivially true (see isValidAssumeForContext in ValueTracking.cpp),
causing the removal of the assume which may be useful for later
optimizations.

Test Plan: pr23800.ll

Reviewers: hfinkel, majnemer

Reviewed By: hfinkel

Subscribers: henryhu, llvm-commits, wengxt, broune, meheff, eliben

Differential Revision: http://reviews.llvm.org/D10695

llvm-svn: 240683
2015-06-25 20:14:47 +00:00
Sanjay Patel 6a24811d87 fix typo; NFC
llvm-svn: 240480
2015-06-23 23:26:22 +00:00
Sanjay Patel 9b7e6776a1 don't repeat function names in comments; NFC
llvm-svn: 240478
2015-06-23 23:05:08 +00:00
Alexander Kornienko f00654e31b Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)
Apparently, the style needs to be agreed upon first.

llvm-svn: 240390
2015-06-23 09:49:53 +00:00
David Majnemer 726901b638 [InstCombine] Optimize subtract of selects into a select of a sub
This came up when examining some code generated by clang's IRGen for
certain member pointers.

llvm-svn: 240369
2015-06-23 02:49:24 +00:00
Alexander Kornienko 70bc5f1398 Fixed/added namespace ending comments using clang-tidy. NFC
The patch is generated using this command:

tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
  -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
  llvm/lib/


Thanks to Eugene Kosov for the original patch!

llvm-svn: 240137
2015-06-19 15:57:42 +00:00
David Majnemer 7fddeccb8b Move the personality function from LandingPadInst to Function
The personality routine currently lives in the LandingPadInst.

This isn't desirable because:
- All LandingPadInsts in the same function must have the same
  personality routine.  This means that each LandingPadInst beyond the
  first has an operand which produces no additional information.

- There is ongoing work to introduce EH IR constructs other than
  LandingPadInst.  Moving the personality routine off of any one
  particular Instruction and onto the parent function seems a lot better
  than have N different places a personality function can sneak onto an
  exceptional function.

Differential Revision: http://reviews.llvm.org/D10429

llvm-svn: 239940
2015-06-17 20:52:32 +00:00
Philip Reames c25df11614 Reapply 239795 - [InstCombine] Propagate non-null facts to call parameters
The original change broke clang side tests.  I will be submitting those momentarily.  This change includes post commit feedback on the original change from from Pete Cooper.

Original Submission comments:
If a parameter to a function is known non-null, use the existing parameter attributes to record that fact at the call site. This has no optimization benefit by itself - that I know of - but is an enabling change for http://reviews.llvm.org/D9129.

Differential Revision: http://reviews.llvm.org/D9132

llvm-svn: 239849
2015-06-16 20:24:25 +00:00
Philip Reames 1a6305f313 Revert 239795
I forgot to update some clang test cases.  I'll fix and resubmit tomorrow.

llvm-svn: 239800
2015-06-16 01:20:53 +00:00
Philip Reames dfc29fba60 [InstCombine] Propagate non-null facts to call parameters
If a parameter to a function is known non-null, use the existing parameter attributes to record that fact at the call site. This has no optimization benefit by itself - that I know of - but is an enabling change for http://reviews.llvm.org/D9129.

Differential Revision: http://reviews.llvm.org/D9132

llvm-svn: 239795
2015-06-16 00:43:54 +00:00
David Majnemer 3f0fb98d01 [InstCombine, InstSimplify] Move xforms from Combine to Simplify
There were several SelectInst combines that always returned an existing
instruction instead of modifying an old one or creating a new one.
These are prime candidates for moving to InstSimplify.

llvm-svn: 239229
2015-06-06 22:40:21 +00:00
David Majnemer 468f670021 [InstCombine] Don't miscompile select to poison
If we have (select a, b, c), it is sometimes valid to simplify this to a
single select operand.  However, doing so is only valid if the
computation doesn't inject poison into the computation.

It might be helpful to consider the following example:
  (select (icmp ne %i, INT_MAX), (add nsw %i, 1), INT_MIN)

The select is equivalent to (add %i, 1) but not (add nsw %i, 1).

Self hosting on x86_64 revealed that this occurs very, very rarely so
bailing out is hopefully pretty reasonable.

llvm-svn: 239215
2015-06-06 02:30:43 +00:00
Renato Golin 3dabb23384 Revert "[InstCombine] Rephrase fix to SimplifyWithOpReplaced"
This reverts commit r239141. This commit was an attempt to reintroduce
a previous patch that broke many self-hosting bots with clang timeouts,
but it still has slowdown issues, at least  on ARM, increasing the
compilation time (stage 2, clang's) by 5x.

llvm-svn: 239175
2015-06-05 18:24:12 +00:00
Sanjoy Das c80dad6f18 [InstCombine][NFC] Add a ``break;`` statement.
This change is NFC because both the ``break;`` and the fall through end
up returning immediately. However, this helps clarify intent and also
ensures correctness in case more ``case`` blocks are added later.

llvm-svn: 239172
2015-06-05 18:04:46 +00:00
Sanjoy Das 72cb5e1087 [InstCombine] Fix PR23751.
PR23751 was caused by a missing ``break;`` in r234388.

llvm-svn: 239171
2015-06-05 18:04:42 +00:00
David Majnemer 6d8081835d [InstCombine] Rephrase fix to SimplifyWithOpReplaced
I don't have the IR which is causing the build bot breakage but I can
postulate as to why they are timing out:
1. SimplifyWithOpReplaced was stripping flags from the simplified value.
2. visitSelectInstWithICmp was overriding SimplifyWithOpReplaced because
   it's simplification wasn't correct.
3. InstCombine would revisit the add instruction and note that it can
   rederive the flags.
4. By modifying the value, we chose to revisit instructions which reuse
   the value.  One of the instructions is the original select, causing
   LLVM to never reach fixpoint.

Instead, strip the flags only when we are sure we are going to perform
the simplification.

llvm-svn: 239141
2015-06-05 09:57:57 +00:00
Daniel Jasper 917fa5ee66 Revert "[InstCombine] Don't miscompile safe increment idiom"
This is breaking a lot of build bots and is causing very long-running
compiles (infinite loops)?

Likely, we shouldn't return nullptr?

llvm-svn: 239139
2015-06-05 09:31:20 +00:00
David Majnemer 00f7d9ecc8 [InstCombine] Don't miscompile safe increment idiom
We cleverly handle cases where computation done in one argument of a select
instruction is suitable for the other operand, thus obviating the need
of the select and the comparison.  However, the other operand cannot
have flags.

This fixes PR23757.

llvm-svn: 239115
2015-06-04 23:11:30 +00:00
Benjamin Kramer f5e2fc474d Replace push_back(Constructor(foo)) with emplace_back(foo) for non-trivial types
If the type isn't trivially moveable emplace can skip a potentially
expensive move. It also saves a couple of characters.


Call sites were found with the ASTMatcher + some semi-automated cleanup.

memberCallExpr(
    argumentCountIs(1), callee(methodDecl(hasName("push_back"))),
    on(hasType(recordDecl(has(namedDecl(hasName("emplace_back")))))),
    hasArgument(0, bindTemporaryExpr(
                       hasType(recordDecl(hasNonTrivialDestructor())),
                       has(constructExpr()))),
    unless(isInTemplateInstantiation()))

No functional change intended.

llvm-svn: 238602
2015-05-29 19:43:39 +00:00
David Majnemer dd04352558 [InstCombine] Fold IntToPtr and PtrToInt into preceding loads.
Currently we only fold a BitCast into a Load when the BitCast is its
only user.

Do the same for any no-op cast.

Differential Revision: http://reviews.llvm.org/D9152

llvm-svn: 238452
2015-05-28 18:39:17 +00:00
David Majnemer 4c3753c4d4 [InstCombine] Don't eagerly propagate nsw for A*B+A*C => A*(B+C)
InstCombine transforms A *nsw B +nsw A *nsw C to A *nsw (B + C).
This is incorrect -- e.g. if A = -1, B = 1, C = INT_SMAX. Then
nothing in the LHS overflows, but the multiplication in RHS overflows.

We need to first make sure that we won't multiple by INT_SMAX + 1.

Test case `add_of_mul` contributed by Sanjoy Das.

This fixes PR23635.

Differential Revision: http://reviews.llvm.org/D9629

llvm-svn: 238066
2015-05-22 23:02:11 +00:00
David Majnemer 1503258157 [InstSimplify] Handle some overflow intrinsics in InstSimplify
This change does a few things:
- Move some InstCombine transforms to InstSimplify
- Run SimplifyCall from within InstCombine::visitCallInst
- Teach InstSimplify to fold [us]mul_with_overflow(X, undef) to 0.

llvm-svn: 237995
2015-05-22 03:56:46 +00:00
David Majnemer 27e89ba24c [InstCombine] X - 0 is equal to X, not undef
A refactoring made @llvm.ssub.with.overflow.i32(i32 %X, i32 0) transform
into undef instead of %X.

This fixes PR23624.

llvm-svn: 237968
2015-05-21 23:04:21 +00:00
James Molloy 2b21a7cf36 Reapply r237539 with a fix for the Chromium build.
Make sure if we're truncating a constant that would then be sign extended
that the sign extension of the truncated constant is the same as the
original constant.

> Canonicalize min/max expressions correctly.
>
> This patch introduces a canonical form for min/max idioms where one operand
> is extended or truncated. This often happens when the other operand is a
> constant. For example:
>
> %1 = icmp slt i32 %a, i32 0
> %2 = sext i32 %a to i64
> %3 = select i1 %1, i64 %2, i64 0
>
> Would now be canonicalized into:
>
> %1 = icmp slt i32 %a, i32 0
> %2 = select i1 %1, i32 %a, i32 0
> %3 = sext i32 %2 to i64
>
> This builds upon a patch posted by David Majenemer
> (https://www.marc.info/?l=llvm-commits&m=143008038714141&w=2). That pass
> passively stopped instcombine from ruining canonical patterns. This
> patch additionally actively makes instcombine canonicalize too.
>
> Canonicalization of expressions involving a change in type from int->fp
> or fp->int are not yet implemented.

llvm-svn: 237821
2015-05-20 18:41:25 +00:00
Hans Wennborg 2f21b8760e Revert r237539: "Reapply r237520 with another fix for infinite looping"
This caused PR23583.

llvm-svn: 237739
2015-05-19 23:06:30 +00:00
David Blaikie ff6409d096 Simplify IRBuilder::CreateCall* by using ArrayRef+initializer_list/braced init only
llvm-svn: 237624
2015-05-18 22:13:54 +00:00
James Molloy 53958e187a Reapply r237520 with another fix for infinite looping
SimplifyDemandedBits was "simplifying" a constant by removing just sign bits.
This caused a canonicalization race between different parts of instcombine.

Fix and regression test added - third time lucky?

llvm-svn: 237539
2015-05-17 08:27:27 +00:00
James Molloy e8698ae3e1 Revert commits r237521 and r237520.
The AArch64 LNT bot is unhappy - I've found that the problem is in
SimpliftDemandedBits, but that's going to require another code review
so reverting in the meantime.

llvm-svn: 237528
2015-05-16 21:27:14 +00:00
James Molloy b5aa200a33 Reapply r237453 with a fix for the test timeouts.
The test timeouts were due to instcombine fighting itself. Regression test added.
Original log message:

Canonicalize min/max expressions correctly.

This patch introduces a canonical form for min/max idioms where one operand
is extended or truncated. This often happens when the other operand is a
constant. For example:

  %1 = icmp slt i32 %a, i32 0
    %2 = sext i32 %a to i64
      %3 = select i1 %1, i64 %2, i64 0

Would now be canonicalized into:

  %1 = icmp slt i32 %a, i32 0
    %2 = select i1 %1, i32 %a, i32 0
      %3 = sext i32 %2 to i64

This builds upon a patch posted by David Majenemer
(https://www.marc.info/?l=llvm-commits&m=143008038714141&w=2). That pass
passively stopped instcombine from ruining canonical patterns. This
patch additionally actively makes instcombine canonicalize too.

Canonicalization of expressions involving a change in type from int->fp
or fp->int are not yet implemented.

llvm-svn: 237520
2015-05-16 13:10:45 +00:00
James Molloy 1675b4a57f Revert "Canonicalize min/max expressions correctly."
This reverts r237453 - it was causing timeouts on some bots. Reverting
while I investigate (it's probably InstCombine fighting itself...)

llvm-svn: 237458
2015-05-15 17:45:09 +00:00
James Molloy 6edf0b4cd4 Canonicalize min/max expressions correctly.
This patch introduces a canonical form for min/max idioms where one operand
is extended or truncated. This often happens when the other operand is a
constant. For example:

  %1 = icmp slt i32 %a, i32 0
  %2 = sext i32 %a to i64
  %3 = select i1 %1, i64 %2, i64 0

Would now be canonicalized into:

  %1 = icmp slt i32 %a, i32 0
  %2 = select i1 %1, i32 %a, i32 0
  %3 = sext i32 %2 to i64

This builds upon a patch posted by David Majenemer
(https://www.marc.info/?l=llvm-commits&m=143008038714141&w=2). That pass
passively stopped instcombine from ruining canonical patterns. This
patch additionally actively makes instcombine canonicalize too.

Canonicalization of expressions involving a change in type from int->fp
or fp->int are not yet implemented.

llvm-svn: 237453
2015-05-15 16:10:59 +00:00
Jingyue Wu ca32190379 [ValueTracking] refactor: extract method haveNoCommonBitsSet
Summary:
Extract method haveNoCommonBitsSet so that we don't have to duplicate this logic in
InstCombine and SeparateConstOffsetFromGEP.

This patch also makes SeparateConstOffsetFromGEP more precise by passing
DominatorTree to computeKnownBits.

Test Plan: value-tracking-domtree.ll that tests ValueTracking indeed leverages dominating conditions

Reviewers: broune, meheff, majnemer

Reviewed By: majnemer

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D9734

llvm-svn: 237407
2015-05-14 23:53:19 +00:00
Pete Cooper 833f34d837 Convert PHI getIncomingValue() to foreach over incoming_values(). NFC.
We already had a method to iterate over all the incoming values of a PHI.  This just changes all eligible code to use it.

Ineligible code included anything which cared about the index, or was also trying to get the i'th incoming BB.

llvm-svn: 237169
2015-05-12 20:05:31 +00:00
Sanjoy Das 89c5491a72 [RewriteStatepointsForGC] Fix a bug on creating gc_relocate for pointer to vector of pointers
Summary:
In RewriteStatepointsForGC pass, we create a gc_relocate intrinsic for
each relocated pointer, and the gc_relocate has the same type with the
pointer. During the creation of gc_relocate intrinsic, llvm requires to
mangle its type. However, llvm does not support mangling of all possible
types. RewriteStatepointsForGC will hit an assertion failure when it
tries to create a gc_relocate for pointer to vector of pointers because
mangling for vector of pointers is not supported.

This patch changes the way RewriteStatepointsForGC pass creates
gc_relocate. For each relocated pointer, we erase the type of pointers
and create an unified gc_relocate of type i8 addrspace(1)*. Then a
bitcast is inserted to convert the gc_relocate to the correct type. In
this way, gc_relocate does not need to deal with different types of
pointers and the unsupported type mangling is no longer a problem. This
change would also ease further merge when LLVM erases types of pointers
and introduces an unified pointer type.

Some minor changes are also introduced to gc_relocate related part in
InstCombineCalls, CodeGenPrepare, and Verifier accordingly.

Patch by Chen Li!

Reviewers: reames, AndyAyers, sanjoy

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9592

llvm-svn: 237009
2015-05-11 18:49:34 +00:00
James Molloy 71b91c2dba Rip min/max pattern matching out of InstCombine and into
ValueTracking.

This matching functionality is useful in more than just InstCombine, so
make it available in ValueTracking.

NFC.

llvm-svn: 236998
2015-05-11 14:42:20 +00:00
Hal Finkel f0d68d788b [InstCombine/PowerPC] Fix single-precision QPX load/store replacement
The QPX single-precision load/store intrinsics have implied
truncation/extension from/to the declared value type of <4 x double> to the
memory type of <4 x float>. When we can prove the alignment of the pointer
argument, and thus replace the intrinsic with a regular load or store, we need
to load or store the correct data type (<4 x float>) instead of (<4 x double>).

llvm-svn: 236973
2015-05-11 06:37:03 +00:00