* SubOne/AddOne functions always return ConstantInt, declare them as such
* Pull code for handling setcc X, cst, where cst is at the end of the range,
or cc is LE or GE up earlier in visitSetCondInst. This reduces #iterations
in some cases.
* Fold: (div X, C1) op C2 -> range check, implementing div.ll:test6 - test9.
llvm-svn: 16588
This takes something like this:
%A = phi int [ 3, %cond_false.0 ], [ 2, %endif.0.i ], [ 2, %endif.1.i ]
%B = div int %tmp.243, 4
and turns it into:
%A = phi int [ 3/4, %cond_false.0 ], [ 2/4, %endif.0.i ], [ 2/4, %endif.1.i ]
which is later simplified (in this case) into %A = 0.
This triggers thousands of times in spec, for example, 269 times in 176.gcc.
This is tested by InstCombine/add.ll:test23 and set.ll:test18.
llvm-svn: 16582
Instcombine (setcc (truncate X), C1).
This occurs THOUSANDS of times in many benchmarks. Particularlly common
seem to be things like (seteq (cast bool X to int), int 0)
This turns it into (seteq bool %X, false), which then becomes (not %X).
llvm-svn: 16567
This is important for several reasons:
1. Benchmarks have lots of code that looks like this (perlbmk in particular):
%tmp.2.i = setne int %tmp.0.i, 128 ; <bool> [#uses=1]
%tmp.6343 = seteq int %tmp.0.i, 1 ; <bool> [#uses=1]
%tmp.63 = and bool %tmp.2.i, %tmp.6343 ; <bool> [#uses=1]
we now fold away the setne, a clear improvement.
2. In the more important cases, such as (X >= 10) & (X < 20), we now produce
smaller code: (X-10) < 10.
3. Perhaps the nicest effect of this patch is that it really helps out the
code generators. In particular, for a 'range test' like the above,
instead of generating this on X86 (the difference on PPC is even more
pronounced):
cmp %EAX, 50
setge %CL
cmp %EAX, 100
setl %AL
and %CL, %AL
cmp %CL, 0
we now generate this:
add %EAX, -50
cmp %EAX, 50
Furthermore, this causes setcc's to be folded into branches more often.
These combinations trigger dozens of times in the spec benchmarks, particularly
in 176.gcc, 186.crafty, 253.perlbmk, 254.gap, & 099.go.
llvm-svn: 16559
Implement (setcc (shl X, C1), C2) folding.
The second one occurs several dozen times in spec. The first was added
just in case. :)
These are tested by shift.ll:test2[12], and div.ll:test5
llvm-svn: 16549
This latent bug was exposed by recent changes, and is tested as:
llvm/test/Regression/Transforms/InstCombine/2004-09-28-BadShiftAndSetCC.llx
llvm-svn: 16546
triggers often, for example:
6x in povray, 1x in gzip, 279x in gcc, 1x in crafty, 8x in eon, 11x in perlbmk,
362x in gap, 4x in vortex, 14 in m88ksim, 211x in 126.gcc, 1x in compress,
11x in ijpeg, and 4x in 147.vortex.
llvm-svn: 16521
Move include/Config and include/Support into include/llvm/Config,
include/llvm/ADT and include/llvm/Support. From here on out, all LLVM
public header files must be under include/llvm/.
llvm-svn: 16137
assumed that a constant on the RHS of a multiplication was either an
IntConstant or an FPConstant. It checked for an IntConstant and then,
if it did not find one, did a hard cast to an FPConstant. That code
would crash if the RHS were a ConstantExpr that was neither an
IntConstant nor an FPConstant. This version replaces the hard cast
with a dyn_cast. It performs the same way for IntConstants and
FPConstants but does nothing, instead of crashing, for constant
expressions.
The regression test for this change is 2004-07-27-ConstantExprMul.ll.
llvm-svn: 15291
* Test for whether bits are shifted out during the optzn.
If so, the fold is illegal, though it can be handled explicitly for setne/seteq
This fixes the miscompilation of 254.gap last night, which was a latent bug
exposed by other optimizer improvements.
llvm-svn: 15085
actually care about. Someday when the cast instruction is gone, we can do
better here, but this will do for now. This implements
instcombine/cast.ll:test17/18 as well.
llvm-svn: 15018
"load (cast foo)". This allows us to compile C++ code like this:
class Bclass {
public: virtual int operator()() { return 666; }
};
class Dclass: public Bclass {
public: virtual int operator()() { return 667; }
} ;
int main(int argc, char** argv) {
Dclass x;
return x();
}
Into this:
int %main(int %argc, sbyte** %argv) {
entry:
call void %__main( )
ret int 667
}
Instead of this:
int %main(int %argc, sbyte** %argv) {
entry:
%x = alloca "struct.std::bad_typeid" ; <"struct.std::bad_typeid"*> [#uses=3]
call void %__main( )
%tmp.1.i.i = getelementptr "struct.std::bad_typeid"* %x, uint 0, uint 0, uint 0 ; <int (...)***> [#uses=1]
store int (...)** getelementptr ([3 x int (...)*]* %vtable for Bclass, int 0, long 2), int (...)*** %tmp.1.i.i
%tmp.3.i = getelementptr "struct.std::bad_typeid"* %x, int 0, uint 0, uint 0 ; <int (...)***> [#uses=1]
store int (...)** getelementptr ([3 x int (...)*]* %vtable for Dclass, int 0, long 2), int (...)*** %tmp.3.i
%tmp.5 = load int ("struct.std::bad_typeid"*)** cast (int (...)** getelementptr ([3 x int (...)*]* %vtable for Dclass, int 0, long 2) to int
("struct.std::bad_typeid"*)**) ; <int ("struct.std::bad_typeid"*)*> [#uses=1]
%tmp.6 = call int %tmp.5( "struct.std::bad_typeid"* %x ) ; <int> [#uses=1]
ret int %tmp.6
ret int 0
}
In order words, we now resolve the virtual function call.
llvm-svn: 14783
186.crafty, fhourstones and 132.ijpeg.
Bugpoint makes really nasty miscompilations embarassingly easy to find. It
narrowed it down to the instcombiner and this testcase (from fhourstones):
bool %l7153_l4706_htstat_loopentry_2E_4_no_exit_2E_4(int* %i, [32 x int]* %works, int* %tmp.98.out) {
newFuncRoot:
%tmp.96 = load int* %i ; <int> [#uses=1]
%tmp.97 = getelementptr [32 x int]* %works, long 0, int %tmp.96 ; <int*> [#uses=1]
%tmp.98 = load int* %tmp.97 ; <int> [#uses=2]
%tmp.99 = load int* %i ; <int> [#uses=1]
%tmp.100 = and int %tmp.99, 7 ; <int> [#uses=1]
%tmp.101 = seteq int %tmp.100, 7 ; <bool> [#uses=2]
%tmp.102 = cast bool %tmp.101 to int ; <int> [#uses=0]
br bool %tmp.101, label %codeRepl4.exitStub, label %codeRepl3.exitStub
codeRepl4.exitStub: ; preds = %newFuncRoot
store int %tmp.98, int* %tmp.98.out
ret bool true
codeRepl3.exitStub: ; preds = %newFuncRoot
store int %tmp.98, int* %tmp.98.out
ret bool false
}
... which only has one combination performed on it:
$ llvm-as < t.ll | opt -instcombine -debug | llvm-dis
IC: Old = %tmp.101 = seteq int %tmp.100, 7 ; <bool> [#uses=1]
New = setne int %tmp.100, 0 ; <bool>:<badref> [#uses=0]
IC: MOD = br bool %tmp.101, label %codeRepl3.exitStub, label %codeRepl4.exitStub
IC: MOD = %tmp.97 = getelementptr [32 x int]* %works, uint 0, int %tmp.96 ; <int*> [#uses=1]
It doesn't get much better than this. :)
llvm-svn: 14109
collapse this:
bool %le(int %A, int %B) {
%c1 = setgt int %A, %B
%tmp = select bool %c1, int 1, int 0
%c2 = setlt int %A, %B
%result = select bool %c2, int -1, int %tmp
%c3 = setle int %result, 0
ret bool %c3
}
into:
bool %le(int %A, int %B) {
%c3 = setle int %A, %B ; <bool> [#uses=1]
ret bool %c3
}
which is handy, because the Java FE makes these sequences all over the place.
This is tested as: test/Regression/Transforms/InstCombine/JavaCompare.ll
llvm-svn: 14086
This code hadn't been updated after the "structs with more than 256 elements"
related changes to the GEP instruction. Also it was not handling the
ConstantAggregateZero class.
Now it does!
llvm-svn: 13834
into (X & (C2 << C1)) != (C3 << C1), where the shift may be either left or
right and the compare may be any one.
This triggers 1546 times in 176.gcc alone, as it is a common pattern that
occurs for bitfield accesses.
llvm-svn: 13740
in the size calculation.
This is not something you want to see:
Loop Unroll: F[main] Loop %no_exit Loop Size = 2 Trip Count = 2147483648 - UNROLLING!
The problem was that 2*2147483648 == 0.
Now we get:
Loop Unroll: F[main] Loop %no_exit Loop Size = 2 Trip Count = 2147483648 - TOO LARGE: 4294967296>100
Thanks to some anonymous person playing with the demo page that repeatedly
caused zion to go into swapping land. That's one way to ensure you'll get
a quick bugfix. :)
Testcase here: Transforms/LoopUnroll/2004-05-13-DontUnrollTooMuch.ll
llvm-svn: 13564
%tmp.0 = getelementptr [50 x sbyte]* %ar, uint 0, int 5 ; <sbyte*> [#uses=2]
%tmp.7 = getelementptr sbyte* %tmp.0, int 8 ; <sbyte*> [#uses=1]
together. This patch actually allows us to simplify and generalize the code.
llvm-svn: 13415
is only used by a cast, and the casted type is the same size as the original
allocation, it would eliminate the cast by folding it into the allocation.
Unfortunately, it was placing the new allocation instruction right before
the cast, which could pull (for example) alloca instructions into the body
of a function. This turns statically allocatable allocas into expensive
dynamically allocated allocas, which is bad bad bad.
This fixes the problem by placing the new allocation instruction at the same
place the old one was, duh. :)
llvm-svn: 13289
This transforms code like this:
%C = or %A, %B
%D = select %cond, %C, %A
into:
%C = select %cond, %B, 0
%D = or %A, %C
Since B is often a constant, the select can often be eliminated. In any case,
this reduces the usage count of A, allowing subsequent optimizations to happen.
This xform applies when the operator is any of:
add, sub, mul, or, xor, and, shl, shr
llvm-svn: 12800
#1 is to unconditionally strip constantpointerrefs out of
instruction operands where they are absolutely pointless and inhibit
optimization. GRRR!
#2 is to implement InstCombine/getelementptr_const.ll
llvm-svn: 12519
(A <setcc1> B) logicalop (A <setcc2> B) -> (A <setcc3> B) or true or false
Where setcc[123] is one of the 6 setcc instructions, and logicalop is one of: And, Or, Xor
llvm-svn: 7828
of codes. For example,
short kernel (short t1) {
t1 >>= 8; t1 <<= 8;
return t1;
}
became:
short %kernel(short %t1.1) {
%tmp.3 = shr short %t1.1, ubyte 8 ; <short> [#uses=1]
%tmp.5 = cast short %tmp.3 to int ; <int> [#uses=1]
%tmp.7 = shl int %tmp.5, ubyte 8 ; <int> [#uses=1]
%tmp.8 = cast int %tmp.7 to short ; <short> [#uses=1]
ret short %tmp.8
}
before, now it becomes:
short %kernel(short %t1.1) {
%tmp.3 = shr short %t1.1, ubyte 8 ; <short> [#uses=1]
%tmp.8 = shl short %tmp.3, ubyte 8 ; <short> [#uses=1]
ret short %tmp.8
}
which will become:
short %kernel(short %t1.1) {
%tmp.3 = and short %t1.1, 0xFF00
ret short %tmp.3
}
This implements cast-set.ll:test4 and test5
llvm-svn: 7290
IC: (X ^ C1) | C2 --> (X | C2) ^ (C1&~C2)
We are now guaranteed that all 'or's will be inside of 'and's, and all 'and's
will be inside of 'xor's, if the second operands are constants.
llvm-svn: 7264
This hunk:
- } else if (Src->getNumOperands() == 2 && Src->use_size() == 1) {
+ } else if (Src->getNumOperands() == 2) {
Allows GEP folding to be more aggressive, which reduces the number of instructions
and can dramatically speed up BasicAA in some cases.
llvm-svn: 6286
allows optimization of this:
int %test4(int %A, int %B) {
%a = xor int %A, -1
%c = and int %a, 5 ; 5 = ~c2
%d = xor int %c, -1
ret int %d
}
into this:
int %test4(int %A, int %B) { ; No predecessors!
%c.demorgan = or int %A, -6 ; <int> [#uses=1]
ret int %c.demorgan
}
llvm-svn: 5736
* A & ~A == 0
* A / (2^c) == A >> c if unsigned
* 0 / A == 0
* 1.0 * A == A
* A * (2^c) == A << c
* A ^ ~A == -1
* A | ~A == -1
* 0 % X = 0
* A % (2^c) == A & (c-1) if unsigned
* A - (A & B) == A & ~B
* -1 - A == ~A
llvm-svn: 5587
* Renamed StatisticReporter.h/cpp to Statistic.h/cpp
* Broke constructor to take two const char * arguments instead of one, so
that indendation can be taken care of automatically.
* Sort the list by pass name when printing
* Make sure to print all statistics as a group, instead of randomly when
the statistics dtors are called.
* Updated ProgrammersManual with new semantics.
llvm-svn: 4002
* New ReplaceInstUsesWith function to factor out tons of common code
This needs to be used more in the future still, but it's a good start
* New InsertNewInstBefore to allow multi-instruction replacements
* Change getMaxValue functions to isAllOnesValue function, which doesn't
have to CREATE/lookup a new constant. Also the name is accurate
* Add new isMaxValue, isMinValue, isMaxValueMinusOne, isMinValuePlusOne
functions: This should be moved to Constant* classes eventually
* Implement xor X, ALLONES -> not X
* Fold ALL setcc's of booleans away
* Handle various SetCC's for integers against values at the end of their
ranges, possibly off by one. This implements the setcc-strength-reduce.ll
testcase.
llvm-svn: 3286
- Reenable gep (gep x) -> x
- Make instcombine do dead instruction elimination where it's really
easy. Now visitors don't have to ensure they aren't not processing
dead instructions.
llvm-svn: 3222
* Add new RegisterOpt/RegisterAnalysis templates for registering passes that
are to show up in opt or analyze
* Register Analyses now
* Change optimizations to use RegisterOpt instead of RegisterPass
* Add support for different "PassType's"
* Add new RegisterOpt/RegisterAnalysis templates for registering passes that
are to show up in opt or analyze
* Register Analyses now
* Change optimizations to use RegisterOpt instead of RegisterPass
* Remove getPassName implementations from various subclasses
llvm-svn: 3113
"This testcase caused instcombine to fail because it got the same instruction on
it's worklist more than once (which is ok), but then deleted the instruction.
Since the inst stayed on the worklist, as soon as it came back up to be
processed, bad things happened, and opt asserted."
llvm-svn: 2623
- Rename runOnMethod to runOnFunction
* Transform getAnalysisUsageInfo into getAnalysisUsage
- Method is now const
- It now takes one AnalysisUsage object to fill in instead of 3 vectors
to fill in
- Pass's now specify which other passes they _preserve_ not which ones
they modify (be conservative!)
- A pass can specify that it preserves all analyses (because it never
modifies the underlying program)
* s/Method/Function/g in other random places as well
llvm-svn: 2333
We now use an InstVisitor to delegate to different cases that we are
interested in handling. We also fix the FIXME's by adding users to the
worklist when appropriate.
llvm-svn: 2292
llvm/Support/CFG.h
* Make pred & succ iterators for intervals global functions
* Add #includes that are now neccesary because BasicBlock.h doesn't include
InstrTypes.h anymore
llvm-svn: 1750
Method::inst_* is now in llvm/Support/InstIterator.h
GraphTraits specializations for BasicBlock and Methods are now in llvm/Support/CFG.h
llvm-svn: 1746