isel has its own particular features that it wants in the CFG, in order to
reduce the number of times a constant is computed, etc. Make sure that we
clean up the CFG before doing any other things for isel. Doing so can
dramatically reduce the number of split edges and reduce the number of
places that constants get computed. For example, this shrinks
CodeGen/Generic/phi-immediate-factoring.ll from 44 to 37 instructions on X86,
and from 21 to 17 MBB's in the output. This is primarily a code size win,
not a performance win.
This implements CodeGen/Generic/phi-immediate-factoring.ll and PR1296.
llvm-svn: 35575
1. Line out nested call of APInt::zext/trunc.
2. Make more use of APInt::getHighBitsSet/getLowBitsSet.
3. Use APInt[] operator instead of expression like "APIntVal & SignBit".
llvm-svn: 35444
2. Use APInt[] instead of "X & SignBit".
3. Clean up some codes.
4. Make the expression like "ShiftAmt = ShiftAmtC->getZExtValue()" safe.
llvm-svn: 35424
1. Line out nested use of zext/trunc.
2. Make more use of getHighBitsSet/getLowBitsSet.
3. Use APInt[] != 0 instead of "(APInt & SignBit) != 0".
llvm-svn: 35408
When converting an add/xor/and triplet into a trunc/sext, only do so if the
intermediate integer type is a bitwidth that the targets can handle.
llvm-svn: 35400
original and new instruction. A slight performance hit with ostringstream
but it is only for debug.
Also, clean up an uninitialized variable warning noticed in a release build.
llvm-svn: 35358
Fix SingleSource/Regression/C/2003-05-21-UnionBitFields.c by changing a
getHighBitsSet call to getLowBitsSet call that was incorrectly converted
from the original lshr constant expression.
llvm-svn: 35348
Remove a use of getLowBitsSet that caused the mask used for replacement of
shl/lshr pairs with an AND instruction to be computed incorrectly. Its not
clear exactly why this is the case. This solves the disappearing shifts
problem, but it doesn't fix Regression/C/2003-05-21-UnionBitFields. It
seems there is more going on.
llvm-svn: 35342
* Don't assume shift amounts are <= 64 bits
* Avoid creating an extra APInt in SubOne and AddOne by using -- and ++
* Add another use of getLowBitsSet
* Convert a series of if statements to a switch
llvm-svn: 35339
using the facilities of APInt. While this duplicates a tiny fraction of
the constant folding code, it also makes the code easier to read and
avoids large ConstantExpr overhead for simple, known computations.
llvm-svn: 35335
* Convert the last use of a uint64_t that should have been an APInt.
* Change ComputeMaskedBits to have a const reference argument for the Mask
so that recursions don't cause unneeded temporaries. This causes temps
to be needed in other places (where the mask has to change) but this
change optimizes for the recursion which is more frequent.
* Remove two instances of &ing a Mask with getAllOnesValue. Its not
needed any more because APInt is accurate in its bit computations.
* Start using the getLowBitsSet and getHighBits set methods on APInt
instead of shifting. This makes it more clear in the code what is
going on.
llvm-svn: 35321
* APIntify visitAdd and visitSelectInst
* Remove unused uint64_t versions of utility functions that have been
replaced with APInt versions.
This completes most of the changes for APIntification of InstCombine. This
passes llvm-test and llvm/test/Transforms/InstCombine/APInt.
Patch by Zhou Sheng.
llvm-svn: 35287
* Re-enable the APInt version of MaskedValueIsZero.
* APIntify the Comput{Un}SignedMinMaxValuesFromKnownBits functions
* APIntify visitICmpInst.
llvm-svn: 35270
Analyze GEPs. If the indices are all zero, transfer whether the pointer is
known to be not null through the GEP.
Add a few more cases for xor and shift instructions.
llvm-svn: 35257
* Fix some indentation and comments in InsertRangeTest
* Add an "IsSigned" parameter to AddWithOverflow and make it handle signed
additions. Also, APIntify this function so it works with any bitwidth.
* For the icmp pred ([us]div %X, C1), C2 transforms, exit early if the
div instruction's RHS is zero.
* Finally, for icmp pred (sdiv %X, C1), -C2, fix an off-by-one error. The
HiBound needs to be incremented in order to get the range test correct.
llvm-svn: 35247
Add new micro-optimizations.
Add icmp predicate snuggling. Given %x ULT 4, "icmp ugt %x, 2" becomes
"icmp eq %x, 3". This doesn't apply in any non-trivial cases yet due to missing
support for NE values in ValueRanges.
llvm-svn: 35119
1. Replace getSignedMinValue() with getSignBit() for better code readability.
2. Replace APIntOps::shl() with operator<<= for convenience.
3. Make APInt construction more effective.
llvm-svn: 35060
Provide an APIntified version of MaskedValueIsZero. This will (temporarily)
cause a "defined but not used" message from the compiler. It will be used
in the next patch in this series.
Patch by Sheng Zhou.
llvm-svn: 35019
Add a new ComputeMaskedBits function that is APIntified. We'll slowly
convert things over to use this version. When its all done, we'll remove
the existing version.
llvm-svn: 35018
scalarrepl things down to elements, but mem2reg can't promote elements that
are memset/memcpy'd. Until then, the code is disabled "0 &&".
llvm-svn: 34924
would scan the entire loop body, then scan all users of instructions in the
loop, looking for users outside the loop. Now, since we know that the
loop is in LCSSA form, we know that any users outside the loop will be LCSSA
phi nodes. Just scan them.
This speeds up indvars significantly.
llvm-svn: 34898
the order that instcombine processed instructions in the testcase. The end
result is that instcombine finished with:
define i16 @test1(i16 %a) {
%tmp = zext i16 %a to i32 ; <i32> [#uses=2]
%tmp21 = lshr i32 %tmp, 8 ; <i32> [#uses=1]
%tmp5 = shl i32 %tmp, 8 ; <i32> [#uses=1]
%tmp.upgrd.32 = or i32 %tmp21, %tmp5 ; <i32> [#uses=1]
%tmp.upgrd.3 = trunc i32 %tmp.upgrd.32 to i16 ; <i16> [#uses=1]
ret i16 %tmp.upgrd.3
}
which can't get matched as a bswap.
This patch makes instcombine more sophisticated about removing truncating
casts, allowing it to turn this into:
define i16 @test2(i16 %a) {
%tmp211 = lshr i16 %a, 8
%tmp52 = shl i16 %a, 8
%tmp.upgrd.323 = or i16 %tmp211, %tmp52
ret i16 %tmp.upgrd.323
}
which then matches as bswap. This fixes bswap.ll and implements
InstCombine/cast2.ll:test[12]. This also implements cast elimination of
add/sub.
llvm-svn: 34870
a value from the worklist required scanning the entire worklist to remove all
entries. We now use a combination map+vector to prevent duplicates from
happening and prevent the scan. This speeds up instcombine on a large file
from the llvm-gcc bootstrap from 189.7s to 4.84s in a debug build and from
5.04s to 1.37s in a release build.
llvm-svn: 34848
First round of ConstantRange changes. This makes all CR constructors use
only APInt and not use ConstantInt. Clients are adjusted accordingly.
llvm-svn: 34756
as its main datastructure. There are many improvements yet to be made, but
this speeds up opt --std-compile-opts on 447.dealII by 7.3%.
llvm-svn: 34193
This happened because deadargelim now causes VMCore to auto-rename every
function that it hacks arguments out of. Because it hacks arguments out of
functions in a non-deterministic order, this caused the resultant numbering
to be nondet. The fix is to just be careful to not rename functions!
llvm-svn: 34005
BBNumbers. Instead of using a bi-directional mapping, just use a single
densemap. This speeds up mem2reg on 176.gcc by 8%, from 1.3489 to
1.2485s.
llvm-svn: 33940
the Transforms library. This reduces debug library size by 132 KB, debug
binary size by 376 KB, and reduces link time for llvm tools slightly.
llvm-svn: 33939
This patch replaces the SymbolTable class with ValueSymbolTable which does
not support types planes. This means that all symbol names in LLVM must now
be unique. The patch addresses the necessary changes to deal with this and
removes code no longer needed as a result. This completes the bulk of the
changes for this PR. Some cleanup patches will follow.
llvm-svn: 33918
Learn from sext and zext. The destination value falls within the range of the
source type.
Generalize properties regarding constant ints.
Get smarter about marking blocks as unreachable. If 1 >= 2 in order for this
block to execute, then it isn't reachable.
llvm-svn: 33889
Make the Module's dependent library use a std::vector instead of SetVector
adjust #includes in .cpp files because SetVector.h is no longer included.
llvm-svn: 33855
This feature is needed in order to support shifts of more than 255 bits
on large integer types. This changes the syntax for llvm assembly to
make shl, ashr and lshr instructions look like a binary operator:
shl i32 %X, 1
instead of
shl i32 %X, i8 1
Additionally, this should help a few passes perform additional optimizations.
llvm-svn: 33776
pessimization where instcombine can sink a load (good for code size) that
prevents an alloca from being promoted by mem2reg (bad for everything).
llvm-svn: 33771
updating. These were exposed by Devang's recent passmgr changes (with
non-default passorderings) because now the inliner can be interleved with
the LCSSA pass.
llvm-svn: 33760
ConstantFoldInstOperands/ConstantFoldCall to take a pointer to an array
of operands + size, instead of an std::vector.
In some cases, switch to using a SmallVector instead of a vector.
This allows us to get rid of some special case gross code that was there
to avoid the cost of constructing a vector.
llvm-svn: 33670
This occurs in C++ code like:
#include <iostream>
#include <iterator>
int a[] = { 1, 2, 3, 4, 5 };
int main() {
using namespace std;
copy(a, a + sizeof(a)/sizeof(a[0]), ostream_iterator<int>(cout, "\n"));
return 0;
}
Before we would decide the loop trip count is:
sdiv (i32 sub (i32 ptrtoint (i32* getelementptr ([5 x i32]* @a, i32 0, i32 5) to i32), i32 ptrtoint ([5 x i32]* @a to i32)), i32 4)
Now we decide it is "5". Amazing.
This code will need to be refactored, but I'm doing that as a separate
commit.
llvm-svn: 33665
Fix initializeConstant, now initializeInt. Fixes major performance
bottleneck.
X == Y || X->DominatedBy(Y) is redundant. Remove the X == Y part.
Fix crasher in makeEqual where getOrInsertNode would add a new constant,
producing an NE relationship between the two members we're trying to make
equal. This now allows us to mark more BBs as unreachable.
llvm-svn: 33612
1. New parameter attribute called 'inreg'. It has meaning "place this
parameter in registers, if possible". This is some generalization of
gcc's regparm(n) attribute. It's currently used only in X86-32 backend.
2. Completely rewritten CC handling/lowering code inside X86 backend.
Merged stdcall + c CCs and fastcall + fast CC.
3. Dropped CSRET CC. We cannot add struct return variant for each
target-specific CC (e.g. stdcall + csretcc and so on).
4. Instead of CSRET CC introduced 'sret' parameter attribute. Setting in
on first attribute has meaning 'This is hidden pointer to structure
return. Handle it gently'.
5. Fixed small bug in llvm-extract + add new feature to
FunctionExtraction pass, which relinks all internal-linkaged callees
from deleted function to external linkage. This will allow further
linking everything together.
NOTEs: 1. Documentation will be updated soon.
2. llvm-upgrade should be improved to translate csret => sret.
Before this, there will be some unexpected test fails.
llvm-svn: 33597
The Module::setEndianness and Module::setPointerSize methods have been
removed. Instead you can get/set the DataLayout. Adjust thise accordingly.
llvm-svn: 33530
changes: (1) don't special case for i1 any more, (2) use the new
TargetData::getTypeSizeInBits method to ensure source and dest are the
same bit width.
llvm-svn: 33427
We only want to do this if the src and destination types have the same
bit width. This patch uses TargetData::getTypeSizeInBits() instead of
making a special case for integer types and avoiding the transform if
they don't match.
llvm-svn: 33414
This is the final patch for this PR. It implements some minor cleanup
in the use of IntegerType, to wit:
1. Type::getIntegerTypeMask -> IntegerType::getBitMask
2. Type::Int*Ty changed to IntegerType* from Type*
3. ConstantInt::getType() returns IntegerType* now, not Type*
This also fixes PR1120.
Patch by Sheng Zhou.
llvm-svn: 33370
transform. Change some variable names so it is clear what is source and
what is dest of the cast. Also, add an assert to ensure that the integer
to integer case is asserting if the bitwidths are different. This prevents
illegal casts from being formed and catches bitwidth bugs sooner.
llvm-svn: 33337
because TargetData::getTypeSize() returns the same for i1 and i8. This fix
is not right for the full generality of bitwise types, but it fixes the
regression.
llvm-svn: 33237
the basic block and is stable across runs in gdb or valgrind.
Make Node::update handle edges which dominate and are tighter than
existing edges.
Replace makeEqual's "squeeze theorem" code. Fixes miscompilation.
Gate the calls to defToOps and opsToDef. Before this, we were getting IG
edges about values which weren't even defined in the dominated area. This
reduces the size of the IG by about half.
llvm-svn: 33236
rename Type::getIntegralTypeMask to Type::getIntegerTypeMask.
This makes naming much more consistent. For example, there are now no longer any
instances of IntegerType that are not considered isInteger! :)
llvm-svn: 33225
that properties were being applied where they didn't belong. Fixes crash
in new MiBench testcase.
Also mark debugging code as such in #ifdef.
llvm-svn: 33177
Implement the arbitrary bit-width integer feature. The feature allows
integers of any bitwidth (up to 64) to be defined instead of just 1, 8,
16, 32, and 64 bit integers.
This change does several things:
1. Introduces a new Derived Type, IntegerType, to represent the number of
bits in an integer. The Type classes SubclassData field is used to
store the number of bits. This allows 2^23 bits in an integer type.
2. Removes the five integer Type::TypeID values for the 1, 8, 16, 32 and
64-bit integers. These are replaced with just IntegerType which is not
a primitive any more.
3. Adjust the rest of LLVM to account for this change.
Note that while this incremental change lays the foundation for arbitrary
bit-width integers, LLVM has not yet been converted to actually deal with
them in any significant way. Most optimization passes, for example, will
still only deal with the byte-width integer types. Future increments
will rectify this situation.
llvm-svn: 33113
recommended that getBoolValue be replaced with getZExtValue and that
get(bool) be replaced by get(const Type*, uint64_t). This implements
those changes.
llvm-svn: 33110
This patch converts getPrimitiveSize to getPrimitiveSizeInBits where it is
appropriate to do so (comparison of integer primitive types).
llvm-svn: 33012
Enable complex addressing modes on 64-bit platforms involving two induction
variables by keeping a size and scale in 64-bits not 32.
Patch by Dan Gohman.
llvm-svn: 33011
Take an incremental step towards type plane elimination. This change
separates types from values in the symbol tables by finally making use
of the TypeSymbolTable class. This yields more natural interfaces for
dealing with types and unclutters the SymbolTable class.
llvm-svn: 32956
This patch replaces signed integer types with signless ones:
1. [US]Byte -> Int8
2. [U]Short -> Int16
3. [U]Int -> Int32
4. [U]Long -> Int64.
5. Removal of isSigned, isUnsigned, getSignedVersion, getUnsignedVersion
and other methods related to signedness. In a few places this warranted
identifying the signedness information from other sources.
llvm-svn: 32785
Fix this by ensuring that a bitcast is inserted to do sign switching. This
is only temporarily needed as the merging of signed and unsigned is next
on the SignlessTypes plate.
llvm-svn: 32757
This patch removes the SetCC instructions and replaces them with the ICmp
and FCmp instructions. The SetCondInst instruction has been removed and
been replaced with ICmpInst and FCmpInst.
llvm-svn: 32751
code to handle instructions as well, so that we properly fold things like
X & undef -> 0.
This fixes Transforms/SCCP/2006-12-19-UndefBug.ll
llvm-svn: 32715
creation. These changes are still temporary but at least this pushes
knowledge of signedness out closer to where it can be determined properly
and allows signedness to be removed from VMCore.
llvm-svn: 32654
rework the hacks that had us passing OStream in. We pass in std::ostream*
instead, check for null, and then dispatch to the correct print() method.
llvm-svn: 32636
used to determine whether a ZExt or SExt cast is performed. Instead, pass
an "isSigned" bool to the function and determine its value from the opcode
of the cast involved.
Also, clean up some cruft from previous patches.
llvm-svn: 32548
The cast patch introduced the possibility that the wrong cast opcode
could be used and that this transform could trigger on different kinds
of cast operations. This patch rectifies that.
llvm-svn: 32538