- Calling getUser in a loop is much more expensive than iterating over a few instructions.
- Use it instead of the open-coded loop in AddrModeMatcher.
- 5% speedup on ARMDisassembler.cpp Release builds.
llvm-svn: 145810
Some code want to check that *any* call within a function has the 'returns
twice' attribute, not just that the current function has one.
llvm-svn: 142221
profile metadata at the same time. Use it to preserve metadata attached
to a branch when re-writing it in InstCombine.
Add metadata to the canonicalize_branch InstCombine test, and check that
it is tranformed correctly.
Reviewed by Nick Lewycky!
llvm-svn: 142168
They are not in sync now, for example Bitcast would show up as LLVMCall.
So instead introduce 2 functions that map to and from the opcodes in the C
bindings.
llvm-svn: 141290
with a vector condition); such selects become VSELECT codegen nodes.
This patch also removes VSETCC codegen nodes, unifying them with SETCC
nodes (codegen was actually often using SETCC for vector SETCC already).
This ensures that various DAG combiner optimizations kick in for vector
comparisons. Passes dragonegg bootstrap with no testsuite regressions
(nightly testsuite as well as "make check-all"). Patch mostly by
Nadav Rotem.
llvm-svn: 139159
init.trampoline and adjust.trampoline intrinsics, into two intrinsics
like in GCC. While having one combined intrinsic is tempting, it is
not natural because typically the trampoline initialization needs to
be done in one function, and the result of adjust trampoline is needed
in a different (nested) function. To get around this llvm-gcc hacks the
nested function lowering code to insert an additional parent variable
holding the adjust.trampoline result that can be accessed from the child
function. Dragonegg doesn't have the luxury of tweaking GCC code, so it
stored the result of adjust.trampoline in the memory GCC set aside for
the trampoline itself (this is always available in the child function),
and set up some new memory (using an alloca) to hold the trampoline.
Unfortunately this breaks Go which allocates trampoline memory on the
heap and wants to use it even after the parent has exited (!). Rather
than doing even more hacks to get Go working, it seemed best to just use
two intrinsics like in GCC. Patch mostly by Sanjoy Das.
llvm-svn: 139140
call. The call may be in the same BB as the landingpad instruction. If that's
the case, then inserting the loads after the landingpad inst, but before the
extractvalues, causes undefined behavior.
llvm-svn: 139088
slots. This fixes a bug where the number of nodes coming into the PHI node may
not equal the number of predecessors. E.g., two or more landingpad instructions
may require a PHI before reaching the eh.exception and eh.selector instructions.
llvm-svn: 139035
Perform the upgrading in steps.
* First, create a map of the invokes to the EH intrinsics.
* Next, take that mapping and determine if the invoke's unwind destination has a
single predecessor. If not, then create a new empty block to hold the new
landingpad instruction.
* Create a landingpad instruction into the uwnind destination. Fill it with the
values from the old selector. Map the old intrinsic calls to the new
landingpad values (there may be multiple landingpad instructions per instrinic
call pairs).
* Go through the old intrinsic calls, create a PHI node when necessary, and then
replace their values with the new values from the landingpad instructions.
* Delete all dead instructions.
* ???
* Profit!
llvm-svn: 138990
This upgrade suffers from the problems of the old EH scheme - i.e., that the
calls to llvm.eh.exception() and llvm.eh.selector() can wander off and get
lost. It makes a valiant effort to reclaim these little lost lambs.
This is a first draft, so it hasn't yet been hooked up to the parser.
llvm-svn: 138602
getFirstInsertionPt() returns an iterator to the first insertion point in a
basic block. This is after all PHIs and any other instruction which is required
to be at the top of the basic block (like LandingPadInst).
llvm-svn: 137744
This caused a race condition where a thread calls ~LLVMContextImpl which calls
Module::dropAllReferences which calls begin() on an empty ilist that would
create the sentinel, which racily accesses the global context.
This can not be fixed by locking inside createSentinel because the lock would
need to be shared with all users of the global context, including those that
reside outside LLVM's own code.
llvm-svn: 137546
of the instruction.
Note that this change affects the existing non-atomic load and store
instructions; the parser now accepts both forms, and the change is noted
in the release notes.
llvm-svn: 137527
This implements the 'landingpad' instruction. It's used to indicate that a basic
block is a landing pad. There are several restrictions on its use (see
LangRef.html for more detail). These restrictions allow the exception handling
code to gather the information it needs in a much more sane way.
This patch has the definition, implementation, C interface, parsing, and bitcode
support in it.
llvm-svn: 137501
Frontends(eg. clang) might pass incomplete form of IR, to step off the way beyond iterator end. In the case I had met, it took infinite loop due to meeting bogus PHInode.
Thanks to Jay Foad and John McCall.
llvm-svn: 137175
This adds the 'resume' instruction class, IR parsing, and bitcode reading and
writing. The 'resume' instruction resumes propagation of an existing (in-flight)
exception whose unwinding was interrupted with a 'landingpad' instruction (to be
added later).
llvm-svn: 136589
working on x86 (at least for trivial testcases); other architectures will
need more work so that they actually emit the appropriate instructions for
orderings stricter than 'monotonic'. (As far as I can tell, the ARM, PPC,
Mips, and Alpha backends need such changes.)
llvm-svn: 136457
specified in the same file that the library itself is created. This is
more idiomatic for CMake builds, and also allows us to correctly specify
dependencies that are missed due to bugs in the GenLibDeps perl script,
or change from compiler to compiler. On Linux, this returns CMake to
a place where it can relably rebuild several targets of LLVM.
I have tried not to change the dependencies from the ones in the current
auto-generated file. The only places I've really diverged are in places
where I was seeing link failures, and added a dependency. The goal of
this patch is not to start changing the dependencies, merely to move
them into the correct location, and an explicit form that we can control
and change when necessary.
This also removes a serialization point in the build because we don't
have to scan all the libraries before we begin building various tools.
We no longer have a step of the build that regenerates a file inside the
source tree. A few other associated cleanups fall out of this.
This isn't really finished yet though. After talking to dgregor he urged
switching to a single CMake macro to construct libraries with both
sources and dependencies in the arguments. Migrating from the two macros
to that style will be a follow-up patch.
Also, llvm-config is still generated with GenLibDeps.pl, which means it
still has slightly buggy dependencies. The internal CMake
'llvm-config-like' macro uses the correct explicitly specified
dependencies however. A future patch will switch llvm-config generation
(when using CMake) to be based on these deps as well.
This may well break Windows. I'm getting a machine set up now to dig
into any failures there. If anyone can chime in with problems they see
or ideas of how to solve them for Windows, much appreciated.
llvm-svn: 136433
'atomicrmw' instructions, which allow representing all the current atomic
rmw intrinsics.
The allowed operands for these instructions are heavily restricted at the
moment; we can probably loosen it a bit, but supporting general
first-class types (where it makes sense) might get a bit complicated,
given how SelectionDAG works.
As an initial cut, these operations do not support specifying an alignment,
but it would be possible to add if we think it's useful. Specifying an
alignment lower than the natural alignment would be essentially
impossible to support on anything other than x86, but specifying a greater
alignment would be possible. I can't think of any useful optimizations which
would use that information, but maybe someone else has ideas.
Optimizer/codegen support coming soon.
llvm-svn: 136404
* InvokeInst: Get the landingpad instruction associated with this invoke.
* LandingPadInst: A method to reserve extra space for clauses.
llvm-svn: 136325
an assert on Darwin llvm-gcc builds.
Assertion failed: (castIsValid(op, S, Ty) && "Invalid cast!"), function Create, file /Users/buildslave/zorg/buildbot/smooshlab/slave-0.8/build.llvm-gcc-i386-darwin9-RA/llvm.src/lib/VMCore/Instructions.cpp, li\
ne 2067.
etc.
http://smooshlab.apple.com:8013/builders/llvm-gcc-i386-darwin9-RA/builds/2354
--- Reverse-merging r134893 into '.':
U include/llvm/Target/TargetData.h
U include/llvm/DerivedTypes.h
U tools/bugpoint/ExtractFunction.cpp
U unittests/Support/TypeBuilderTest.cpp
U lib/Target/ARM/ARMGlobalMerge.cpp
U lib/Target/TargetData.cpp
U lib/VMCore/Constants.cpp
U lib/VMCore/Type.cpp
U lib/VMCore/Core.cpp
U lib/Transforms/Utils/CodeExtractor.cpp
U lib/Transforms/Instrumentation/ProfilingUtils.cpp
U lib/Transforms/IPO/DeadArgumentElimination.cpp
U lib/CodeGen/SjLjEHPrepare.cpp
--- Reverse-merging r134888 into '.':
G include/llvm/DerivedTypes.h
U include/llvm/Support/TypeBuilder.h
U include/llvm/Intrinsics.h
U unittests/Analysis/ScalarEvolutionTest.cpp
U unittests/ExecutionEngine/JIT/JITTest.cpp
U unittests/ExecutionEngine/JIT/JITMemoryManagerTest.cpp
U unittests/VMCore/PassManagerTest.cpp
G unittests/Support/TypeBuilderTest.cpp
U lib/Target/MBlaze/MBlazeIntrinsicInfo.cpp
U lib/Target/Blackfin/BlackfinIntrinsicInfo.cpp
U lib/VMCore/IRBuilder.cpp
G lib/VMCore/Type.cpp
U lib/VMCore/Function.cpp
G lib/VMCore/Core.cpp
U lib/VMCore/Module.cpp
U lib/AsmParser/LLParser.cpp
U lib/Transforms/Utils/CloneFunction.cpp
G lib/Transforms/Utils/CodeExtractor.cpp
U lib/Transforms/Utils/InlineFunction.cpp
U lib/Transforms/Instrumentation/GCOVProfiling.cpp
U lib/Transforms/Scalar/ObjCARC.cpp
U lib/Transforms/Scalar/SimplifyLibCalls.cpp
U lib/Transforms/Scalar/MemCpyOptimizer.cpp
G lib/Transforms/IPO/DeadArgumentElimination.cpp
U lib/Transforms/IPO/ArgumentPromotion.cpp
U lib/Transforms/InstCombine/InstCombineCompares.cpp
U lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
U lib/Transforms/InstCombine/InstCombineCalls.cpp
U lib/CodeGen/DwarfEHPrepare.cpp
U lib/CodeGen/IntrinsicLowering.cpp
U lib/Bitcode/Reader/BitcodeReader.cpp
llvm-svn: 134949
patch brings numerous advantages to LLVM. One way to look at it
is through diffstat:
109 files changed, 3005 insertions(+), 5906 deletions(-)
Removing almost 3K lines of code is a good thing. Other advantages
include:
1. Value::getType() is a simple load that can be CSE'd, not a mutating
union-find operation.
2. Types a uniqued and never move once created, defining away PATypeHolder.
3. Structs can be "named" now, and their name is part of the identity that
uniques them. This means that the compiler doesn't merge them structurally
which makes the IR much less confusing.
4. Now that there is no way to get a cycle in a type graph without a named
struct type, "upreferences" go away.
5. Type refinement is completely gone, which should make LTO much MUCH faster
in some common cases with C++ code.
6. Types are now generally immutable, so we can use "Type *" instead
"const Type *" everywhere.
Downsides of this patch are that it removes some functions from the C API,
so people using those will have to upgrade to (not yet added) new API.
"LLVM 3.0" is the right time to do this.
There are still some cleanups pending after this, this patch is large enough
as-is.
llvm-svn: 134829
"Reinstate r133435 and r133449 (reverted in r133499) now that the clang
self-hosted build failure has been fixed (r133512)."
Due to some additional warnings.
llvm-svn: 133700
representing a constant reference to ValType. Normally this is just
"const ValType &", but when ValType is a std::vector we want to use
ArrayRef as the reference type.
llvm-svn: 133611
Change PHINodes to store simple pointers to their incoming basic blocks,
instead of full-blown Uses.
Note that this loses an optimization in SplitCriticalEdge(), because we
can no longer walk the use list of a BasicBlock to find phi nodes. See
the comment I removed starting "However, the foreach loop is slow for
blocks with lots of predecessors".
Extend replaceAllUsesWith() on a BasicBlock to also update any phi
nodes in the block's successors. This mimics what would have happened
when PHINodes were proper Users of their incoming blocks. (Note that
this only works if OldBB->replaceAllUsesWith(NewBB) is called when
OldBB still has a terminator instruction, so it still has some
successors.)
llvm-svn: 133435
Change various bits of code to make better use of the existing PHINode
API, to insulate them from forthcoming changes in how PHINodes store
their operands.
llvm-svn: 133434
I don't think the AugmentedUse struct buys us much, either in
correctness or in ease of use. Ditch it, and simplify Use::getUser() and
User::allocHungoffUses().
llvm-svn: 133433
all over the place in different styles and variants. Standardize on two
preferred entrypoints: one that takes a StructType and ArrayRef, and one that
takes StructType and varargs.
In cases where there isn't a struct type convenient, we now add a
ConstantStruct::getAnon method (whose name will make more sense after a few
more patches land).
It would be "really really nice" if the ConstantStruct::get and
ConstantVector::get methods didn't make temporary std::vectors.
llvm-svn: 133412
for pre-2.9 bitcode files. We keep x86 unaligned loads, movnt, crc32, and the
target indep prefetch change.
As usual, updating the testsuite is a PITA.
llvm-svn: 133337
optimizations when emitting calls to the function; instead those calls may
use faster relocations which require the function to be immediately resolved
upon loading the dynamic object featuring the call. This is useful when it
is known that the function will be called frequently and pervasively and
therefore there is no merit in delaying binding of the function.
Currently only implemented for x86-64, where it turns into a call through
the global offset table.
Patch by Dan Gohman, who assures me that he's going to add LangRef documentation
for this once it's committed.
llvm-svn: 133080
Unfortunately we can't follow what the rest of the language does (wrapping it
in double-quotes) because that would cause an ambiguity with metadata strings,
so instead we escape any unusual characters with \xx escaping.
llvm-svn: 133050
Added asserts whenever attempting to use a potentially
uninitialized pass. This helps people trying to develop a new pass and
people trying to understand the bug reports filed by the former people.
llvm-svn: 132520
than either the primitive size or the element primitive size (in the case
of vectors), simplify the vector logic. No functionality change. There
is some distracting churn in the patch because I lined up comments better
while there - sorry about that.
llvm-svn: 131533
happily accept things like "sext <2 x i32> to <999 x i64>". It would
also accept "sext <2 x i32> to i64", though the verifier would catch
that later. Fixed by having castIsValid check that vector lengths match
except when doing a bitcast. (2) When creating a cast instruction, check
that the cast is valid (this was already done when creating constexpr
casts). While there, replace getScalarSizeInBits (used to allow more
vector casts) with getPrimitiveSizeInBits in getCastOpcode and isCastable
since vector to vector casts are now handled explicitly by passing to the
element types; i.e. this bit should result in no functional change.
llvm-svn: 131532
can be used to turn a <4 x i64> into a <4 x i32> but getCastOpcode would assert
if you passed these types to it. Note that this strictly extends the previous
functionality: if getCastOpcode previously accepted two vector types (i.e. didn't
assert) then it still will and returns the same opcode (BitCast). That's because
before it would only accept vectors with the same bitwidth, and the new code only
touches vectors with the same length. However if two vectors have both the same
bitwidth and the same length then their element types have the same bitwidth, so
the new logic will return BitCast as before.
llvm-svn: 131530
Still to do:
- Allow replacing / removing passes (infrastructure there, just needs an infrastructure exposed)
- Defining sets of passes to be added or removed as a group
- Extending the support to allow user-defined groups of optimisations
- Allow plugins to be specified for loading automatically (e.g. from plugins.conf or some similar mechanism)
Reviewed by Nick Lewycky.
llvm-svn: 131155
Now that we have a first-class way to represent unaligned loads, the unaligned
load intrinsics are superfluous.
First part of <rdar://problem/8460511>.
llvm-svn: 129401
--- Reverse-merging r129235 into '.':
D test/Feature/bb_attrs.ll
U include/llvm/BasicBlock.h
U include/llvm/Bitcode/LLVMBitCodes.h
U lib/VMCore/AsmWriter.cpp
U lib/VMCore/BasicBlock.cpp
U lib/AsmParser/LLParser.cpp
U lib/AsmParser/LLLexer.cpp
U lib/AsmParser/LLToken.h
U lib/Bitcode/Reader/BitcodeReader.cpp
U lib/Bitcode/Writer/BitcodeWriter.cpp
llvm-svn: 129259
* Add a "landing pad" attribute to the BasicBlock.
* Modify the bitcode reader and writer to handle said attribute.
Later: The verifier will ensure that the landing pad attribute is used in the
appropriate manner. I.e., not applied to the entry block, and applied only to
basic blocks that are branched to via a `dispatch' instruction.
(This is a work-in-progress.)
llvm-svn: 129235
had gotten out of sync: isCastable didn't think it was possible to
cast the x86_mmx type to anything, while it did think it possible
to cast an i64 to x86_mmx.
llvm-svn: 128705
was lowering them to sext / uxt + mul instructions. Unfortunately the
optimization passes may hoist the extensions out of the loop and separate them.
When that happens, the long multiplication instructions can be broken into
several scalar instructions, causing significant performance issue.
Note the vmla and vmls intrinsics are not added back. Frontend will codegen them
as intrinsics vmull* + add / sub. Also note the isel optimizations for catching
mul + sext / zext are not changed either.
First part of rdar://8832507, rdar://9203134
llvm-svn: 128502
It generates output that lools like
8 times line number info lost by Scalar Replacement of Aggregates (SSAUp)
1 times line number info lost by Simplify well-known library calls
12 times variable info lost by Jump Threading
llvm-svn: 127381
the value splatted into every element. Extend this to getTrue and getFalse which
by providing new overloads that take Types that are either i1 or <N x i1>. Use
it in InstCombine to add vector support to some code, fixing PR8469!
llvm-svn: 127116
uses.
The result produced by the streamer is used to give the linker more accurate
information and to add to llvm.compiler.used. The second improvement removes
the need for the user to add __attribute__((used)) to functions only used in
inline asm. The first one lets us build firefox with LTO on Darwin :-)
llvm-svn: 126830
It caused a crash in MultiSource/Benchmarks/Bullet.
Opt hit an assertion with "opt -std-compile-opts" because
Constant::getAllOnesValue doesn't know how to handle floats.
This patch added a test to reproduce the problem and a check that the
destination vector is of integer type.
Thank you Benjamin!
llvm-svn: 125459