This adds functionality to remove sign/zero extension during indvars
without generating a canonical IV and rewriting all IV users. It's
disabled by default, so it should have no effect on codegen. Work in progress.
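
For illustration only (this is not taken from the change itself), the kind of
extension in question can be shown at the source level. Assuming a 64-bit
target, a 32-bit induction variable used as an array index conceptually needs
a sign extension on every iteration, while a hand-widened IV does not. The
function names are hypothetical.

```c
#include <stddef.h>

/* Narrow IV: i is a 32-bit int, so on a 64-bit target each a[i]
   conceptually sign-extends i before the address computation. */
long sum_narrow(const int *a, int n) {
    long s = 0;
    for (int i = 0; i < n; ++i)
        s += a[i];
    return s;
}

/* Hand-widened IV: the index already has pointer width, so no
   per-iteration extension is needed. The result is the same. */
long sum_wide(const int *a, int n) {
    long s = 0;
    for (long i = 0; i < (long)n; ++i)
        s += a[i];
    return s;
}
```

Both functions compute the same sum; only the width of the loop counter differs.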
llvm-svn: 130829
Only create a canonical IV for the backedge-taken count if it will
actually be used by LinearFunctionTestReplace. Also includes some related
cleanup in preparation for reducing dependence on canonical IVs.
No significant effect on x86 or arm in the test-suite.
llvm-svn: 130799
model constants which can be added to base registers via add-immediate
instructions which don't require an additional register to materialize
the immediate.
llvm-svn: 130743
This obviously helps a lot if the division would be turned into a libcall
(think i64 udiv on i386), but div is also one of the few remaining instructions
on modern CPUs that become more expensive when the bitwidth gets bigger.
This also helps register pressure on i386 when dividing chars: divb needs
two 8-bit parts of a 16-bit register as input, where divl uses two registers.

  int foo(unsigned char a) { return a/10; }
  int bar(unsigned char a, unsigned char b) { return a/b; }

compiles into (x86_64)

  _foo:
          imull   $205, %edi, %eax
          shrl    $11, %eax
          ret
  _bar:
          movzbl  %dil, %eax
          divb    %sil, %al
          movzbl  %al, %eax
          ret
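
The multiply-and-shift in _foo can be checked exhaustively. The sketch below
is only an illustration with made-up function names; it relies on the operand
being known to fit in 8 bits, which is why the constants 205 and 11 are enough.

```c
#include <stdint.h>

/* Reference result: an ordinary division by 10. */
unsigned div10_ref(uint8_t a) {
    return a / 10u;
}

/* Same result via multiply and shift, mirroring imull $205 / shrl $11.
   205/2048 is slightly above 1/10; for 8-bit inputs the error never
   reaches the next integer. */
unsigned div10_mulshift(uint8_t a) {
    return ((unsigned)a * 205u) >> 11;
}

/* Counts the 8-bit inputs on which the two versions disagree. */
unsigned count_div10_mismatches(void) {
    unsigned mismatches = 0;
    for (unsigned a = 0; a <= 255u; ++a) {
        if (div10_ref((uint8_t)a) != div10_mulshift((uint8_t)a))
            ++mismatches;
    }
    return mismatches;
}
```

The same constants would not be valid for a full 32-bit dividend, which needs a
wider multiplier.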
llvm-svn: 130615
This shouldn't happen in practice because the icmp would be a constant.
Add a check so we don't miscompile code if something goes wrong.
llvm-svn: 130446
between two reads (threading).
Fix an off-by-one in the indirect counter table that I meant to revert after an
earlier experiment. Whoops!
Implement GCOV_PREFIX. Doesn't handle GCOV_PREFIX_STRIP yet.
Fix an off-by-one in string emission. Extra whoops!
Tolerate DISubprograms that have null Function*'s attached to them. I don't yet
understand what this means, but it happens when you have a global static with
a non-trivial constructor/destructor.
Fix a crash on switch statements with a single successor (default-only).
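
A rough sketch of what honoring GCOV_PREFIX amounts to, assuming plain
concatenation of the prefix and the original output path. The helper names are
hypothetical and this is not the runtime's actual code; GCOV_PREFIX_STRIP is
not handled here either.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: writes prefix followed by path into out.
   A NULL prefix is treated as empty. Returns 0 on success and -1 if
   the result does not fit in out_size bytes. */
int prefixed_path(const char *prefix, const char *path,
                  char *out, size_t out_size) {
    if (prefix == NULL)
        prefix = "";
    int n = snprintf(out, out_size, "%s%s", prefix, path);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}

/* Hypothetical wrapper that takes the prefix from the environment. */
int gcov_output_path(const char *path, char *out, size_t out_size) {
    return prefixed_path(getenv("GCOV_PREFIX"), path, out, out_size);
}
```

With GCOV_PREFIX=/tmp/cov, a path such as /build/foo.gcda would become
/tmp/cov/build/foo.gcda under this assumption.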
llvm-svn: 130443
a nice and tidy:
%x1 = load i32* %0, align 4
%1 = icmp eq i32 %x1, 1179403647
br i1 %1, label %if.then, label %if.end
instead of doing lots of loads and branches. May the FreeBSD bootloader
long fit in its allocated space.
llvm-svn: 130416
wider load would allow elimination of subsequent loads, and when the wider
load is still a native integer type. This eliminates a ton of loads on
various benchmarks involving struct fields, though it is somewhat hobbled
by clang not being very aggressive about field alignment.
This is yet another step along the way towards resolving PR6627.
llvm-svn: 130390
Modified LinearFunctionTestReplace to push the condition on the dead
list instead of eagerly deleting it. This can cause unnecessary
IV rewrites, which should have no effect on codegen and will not be an
issue once we stop generating canonical IVs.
llvm-svn: 130340
effective in avoiding recomputation of LCSSA form; the widespread
use of instsimplify (which looks through phi nodes) means it was
not preserving LCSSA form anyway; and instcombine is no longer
scheduled in the middle of the loop passes so this doesn't matter
anymore.
llvm-svn: 130301
when X has multiple uses. This is useful for exposing secondary optimizations,
but the X86 backend isn't ready for this when X has a single use. For example,
this can disable load folding.
This is inching towards resolving PR6627.
llvm-svn: 130238
Add support for switch and indirectbr edges. This works by densely numbering
all blocks which have such terminators, and then separately numbering the
possible successors. Each predecessor writes down its number; the successor
knows its own number (as a ConstantInt) and passes that, along with a pointer
to the number the predecessor wrote down, to the runtime, which looks up the
counter in a per-function table.
Coverage data should now be functional, but I haven't tested it on anything
other than my 2-file synthetic test program for coverage.
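
The numbering scheme described above can be sketched roughly as follows. All
names, the table sizes, and the flat table layout are hypothetical and chosen
for illustration; they do not reflect the actual runtime interface.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical function shape: 2 blocks ending in switch/indirectbr,
   3 possible successor blocks. */
enum { NUM_PREDS = 2, NUM_SUCCS = 3 };

/* Per-function table with one counter per (predecessor, successor) edge. */
static uint64_t edge_counters[NUM_PREDS * NUM_SUCCS];

/* Slot a predecessor writes its dense number into before branching. */
static uint32_t pred_slot;

/* Called on entry to a successor. succ is the successor's own constant
   number; pred points at the number the predecessor wrote down. */
void count_indirect_edge(uint32_t succ, const uint32_t *pred) {
    uint32_t p = *pred;
    if (p < NUM_PREDS && succ < NUM_SUCCS)
        ++edge_counters[(size_t)p * NUM_SUCCS + succ];
}

/* Reads back the counter for one edge. */
uint64_t edge_count(uint32_t pred, uint32_t succ) {
    return edge_counters[(size_t)pred * NUM_SUCCS + succ];
}
```

In this sketch, predecessor 1 would store 1 into pred_slot before its switch,
and successor 2 would call count_indirect_edge(2, &pred_slot) on entry.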
llvm-svn: 130186
return it as a clobber. This allows GVN to do smart things.
Enhance GVN to be smart about the case when a small load is clobbered
by a larger overlapping load. In this case, forward the value. This
allows us to compile stuff like this:

  int test(void *P) {
    int tmp = *(unsigned int*)P;
    return tmp+*((unsigned char*)P+1);
  }

into:

  _test:                                  ## @test
          movl    (%rdi), %ecx
          movzbl  %ch, %eax
          addl    %ecx, %eax
          ret

which has one load. We already handled the case where the smaller
load was from a must-aliased base pointer.
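
At the source level, the forwarding corresponds roughly to the rewrite below.
This is a hand-written illustration, not GVN's implementation: the function
names are made up, unsigned arithmetic is used to sidestep signed overflow, and
the shift is picked for the host byte order (8 on little-endian, which matches
the %ch extract in the assembly above).

```c
#include <stdint.h>
#include <string.h>

/* Two loads: the 32-bit word, then the byte at offset 1. */
unsigned two_loads(const void *P) {
    uint32_t word;
    uint8_t byte;
    memcpy(&word, P, sizeof word);
    memcpy(&byte, (const unsigned char *)P + 1, sizeof byte);
    return word + byte;
}

/* Bit position of the byte at offset 1 within a loaded 32-bit word:
   8 on little-endian hosts, 16 on big-endian hosts. */
static unsigned byte1_shift(void) {
    const uint32_t probe = 1;
    return *(const unsigned char *)&probe ? 8u : 16u;
}

/* One load: the byte is extracted from the word already loaded. */
unsigned one_load(const void *P) {
    uint32_t word;
    memcpy(&word, P, sizeof word);
    return word + ((word >> byte1_shift()) & 0xffu);
}
```

Both functions return the same value for any 4-byte buffer; the second touches
memory once.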
llvm-svn: 130180
necessary since gcov counts transitions between blocks. It can't see if you've
run every line in a straight-line function, so we add an edge for it to notice.
llvm-svn: 129905
Break the arc-profile code out to a function like the notes emission code is,
and reorder the functions in the file.
The only functionality change is that we no longer modify the Module when the
Module has no debug info to use.
llvm-svn: 129631
instruction around, reducing work.
Greatly simplify handling of debug instructions. There is no need to
build up a vector of them and then move them into the one predecessor
if we're processing a block. Instead just rescan the block and *copy*
them into the pred. If a block gets merged into multiple preds, this
will retain more debug info.
llvm-svn: 129502
the same allocation size but different primitive sizes (e.g., <3xi32> and
<4xi32>). When ScalarRepl promotes them, it can't use a bit cast but
should use a shuffle vector instead.
llvm-svn: 129472
will allow multiple contexts with different loop unroll parameters to run. This is a minor change and has no effect
on existing applications.
llvm-svn: 129449
Now that we have a first-class way to represent unaligned loads, the unaligned
load intrinsics are superfluous.
First part of <rdar://problem/8460511>.
llvm-svn: 129401
Use debug info in the IR to find the directory/file:line:col. Each time that location changes, bump a counter.
Unlike the existing profiling system, we don't try to look at argv[], and thus don't require main() to be present in the IR. This matches GCC's technique where you specify the profiling flag when producing each .o file.
The runtime library is minimal, currently just calling printf at program shutdown time. The API is designed to make it possible to emit GCOV data later on.
llvm-svn: 129340
reassociation opportunities are exposed. This fixes a bug where
the nested reassociation expects the IR to be consistent,
but it isn't, because the outer reassociation has disconnected
some of the operands. rdar://9167457
llvm-svn: 129324
mean that it has to be ConstantArray of ConstantStruct. We might have
ConstantAggregateZero, at either level, so don't crash on that.
Also, semi-deprecate the sentinel value. The linker isn't aware of sentinels so
we end up with the two lists appended, each with their "sentinels" on them.
Different parts of LLVM treated sentinels differently, so make them all just
ignore the single entry and continue on with the rest of the list.
llvm-svn: 129307
is equivalent to any other relevant value; it isn't true in general.
If it is equivalent, the LoopPromoter will tell the AST the equivalence.
Also, delete the PreheaderLoad if it is unused.
Chris, since you were the last one to make major changes here, can you check
that this is sane?
llvm-svn: 129049
space info. We crash with an assert in this case. This change checks that the
address space of the bitcasted pointer is the same as the gep ptr.
llvm-svn: 128884
after the given instruction; make sure to handle that case correctly.
(It's difficult to trigger; the included testcase involves a dead
block, but I don't think that's a requirement.)
While I'm here, get rid of the unnecessary warning about
SimplifyInstructionsInBlock, since it should work correctly as far as I know.
llvm-svn: 128782
It's possible to craft an input that hits the recursion limits in a way
that SimplifyDemandedBits doesn't simplify the icmp but ComputeMaskedBits
can infer which bits are zero.
No test case as it depends on too many other things. Fixes PR9609.
llvm-svn: 128777