David Greene
dc276c315c
Reapply 122341 to fix PR8199 now that clang changes are in.
...
llvm-svn: 122754
2011-01-03 17:30:25 +00:00
Chris Lattner
9e5e9ed79a
earlycse can do trivial with-a-block dead store
...
elimination as well. This deletes 60 stores in 176.gcc
that largely come from bitfield code.
llvm-svn: 122736
2011-01-03 04:17:24 +00:00
Chris Lattner
e0e32a9ef0
now that loads are in their own table, we can implement
...
store->load forwarding. This allows EarlyCSE to zap 600 more
loads from 176.gcc.
llvm-svn: 122732
2011-01-03 03:46:34 +00:00
Chris Lattner
0446bb23f8
add a testcase for readonly call CSE
...
llvm-svn: 122730
2011-01-03 03:33:47 +00:00
Chris Lattner
b9a8efc960
Teach EarlyCSE to do trivial CSE of loads and read-only calls.
...
On 176.gcc, this catches 13090 loads and calls, and increases the
number of simple instructions CSE'd from 29658 to 36208.
llvm-svn: 122727
2011-01-03 03:18:43 +00:00
Chris Lattner
8fac5db251
add DEBUG and -stats output to earlycse.
...
Teach it to CSE the rest of the non-side-effecting instructions.
llvm-svn: 122716
2011-01-02 23:19:45 +00:00
Chris Lattner
18ae5436b1
Enhance earlycse to do CSE of casts, instsimplify and die.
...
Add a testcase.
llvm-svn: 122715
2011-01-02 23:04:14 +00:00
Chris Lattner
9c69406f2b
fix a miscompilation of tramp3d-v4: when forming a memcpy, we have to make
...
sure that the loop we're promoting into a memcpy doesn't mutate the input
of the memcpy. Before we were just checking that the dest of the memcpy
wasn't mod/ref'd by the loop.
llvm-svn: 122712
2011-01-02 21:14:18 +00:00
Chris Lattner
5702a43c09
If a loop iterates exactly once (has backedge count = 0) then don't
...
mess with it. We'd rather peel/unroll it than convert all of its
stores into memsets.
llvm-svn: 122711
2011-01-02 20:24:21 +00:00
Benjamin Kramer
25e6e06e42
Try to reuse the value when lowering memset.
...
This allows us to compile:
void test(char *s, int a) {
__builtin_memset(s, a, 15);
}
into 1 mul + 3 stores instead of 3 muls + 3 stores.
llvm-svn: 122710
2011-01-02 19:57:05 +00:00
Benjamin Kramer
2fdea4c8f1
Lower the i8 extension in memset to a multiply instead of a potentially long series of shifts and ors.
...
We could implement a DAGCombine to turn x * 0x0101 back into logic operations
on targets that doesn't support the multiply or it is slow (p4) if someone cares
enough.
Example code:
void test(char *s, int a) {
__builtin_memset(s, a, 4);
}
before:
_test: ## @test
movzbl 8(%esp), %eax
movl %eax, %ecx
shll $8, %ecx
orl %eax, %ecx
movl %ecx, %eax
shll $16, %eax
orl %ecx, %eax
movl 4(%esp), %ecx
movl %eax, 4(%ecx)
movl %eax, (%ecx)
ret
after:
_test: ## @test
movzbl 8(%esp), %eax
imull $16843009, %eax, %eax ## imm = 0x1010101
movl 4(%esp), %ecx
movl %eax, 4(%ecx)
movl %eax, (%ecx)
ret
llvm-svn: 122707
2011-01-02 19:44:58 +00:00
Chris Lattner
8455b6e45e
enhance loop idiom recognition to scan *all* unconditionally executed
...
blocks in a loop, instead of just the header block. This makes it more
aggressive, able to handle Duncan's Ada examples.
llvm-svn: 122704
2011-01-02 19:01:03 +00:00
Duncan Sands
64f1c0dcda
Fix PR8702 by not having LoopSimplify claim to preserve LCSSA form. As described
...
in the PR, the pass could break LCSSA form when inserting preheaders. It probably
would be easy enough to fix this, but since currently we always go into LCSSA form
after running this pass, doing so is not urgent.
llvm-svn: 122695
2011-01-02 13:38:21 +00:00
Chris Lattner
ddf58010bd
Allow loop-idiom to run on multiple BB loops, but still only scan the loop
...
header for now for memset/memcpy opportunities. It turns out that loop-rotate
is successfully rotating loops, but *DOESN'T MERGE THE BLOCKS*, turning "for
loops" into 2 basic block loops that loop-idiom was ignoring.
With this fix, we form many *many* more memcpy and memsets than before, including
on the "history" loops in the viterbi benchmark, which look like this:
for (j=0; j<MAX_history; ++j) {
history_new[i][j+1] = history[2*i][j];
}
Transforming these loops into memcpy's speeds up the viterbi benchmark from
11.98s to 3.55s on my machine. Woo.
llvm-svn: 122685
2011-01-02 07:58:36 +00:00
Chris Lattner
85b6d81d41
teach loop idiom recognition to form memcpy's from simple loops.
...
llvm-svn: 122678
2011-01-02 03:37:56 +00:00
Chris Lattner
1903c42b97
fix a globalopt crash on two Adobe-C++ testcases that the recent
...
loop idiom pass exposed.
llvm-svn: 122674
2011-01-01 22:31:46 +00:00
Rafael Espindola
abe3eaa481
Fix darwin bots.
...
llvm-svn: 122672
2011-01-01 21:58:41 +00:00
Rafael Espindola
d606e54757
Add support for the 'H' modifier.
...
llvm-svn: 122667
2011-01-01 20:58:46 +00:00
Anton Korobeynikov
879be84aa5
Update the test
...
llvm-svn: 122666
2011-01-01 20:57:26 +00:00
Chris Lattner
a3514441e0
add a validity check that was missed, fixing a crash on the
...
new testcase.
llvm-svn: 122662
2011-01-01 20:12:04 +00:00
Duncan Sands
772749aea1
Revert commit 122654 at the request of Chris, who reckons that instsimplify
...
is the wrong hammer for this nail, and is probably right.
llvm-svn: 122661
2011-01-01 20:08:02 +00:00
Chris Lattner
91a4435875
improve validity check to handle constant-trip-count loops more
...
aggressively. In practice, this doesn't help anything though,
see the todo.
llvm-svn: 122660
2011-01-01 19:54:22 +00:00
Chris Lattner
8b3baf6d75
implement the "no aliasing accesses in loop" safety check. This pass
...
should be correct now.
llvm-svn: 122659
2011-01-01 19:39:01 +00:00
Rafael Espindola
3686473578
Fix PR8878.
...
llvm-svn: 122658
2011-01-01 19:05:35 +00:00
Duncan Sands
e3c539581c
Fix a README item by having InstructionSimplify do a mild form of value
...
numbering, in which it considers (for example) "%a = add i32 %x, %y" and
"%b = add i32 %x, %y" to be equal because the operands are equal and the
result of the instructions only depends on the values of the operands.
This has almost no effect (it removes 4 instructions from gcc-as-one-file),
and perhaps slows down compilation: I measured a 0.4% slowdown on the large
gcc-as-one-file testcase, but it wasn't statistically significant.
llvm-svn: 122654
2011-01-01 16:12:09 +00:00
Che-Liang Chiou
5451fc9195
ptx: remove reg-reg addressing mode and st.const
...
llvm-svn: 122653
2011-01-01 11:58:58 +00:00
Che-Liang Chiou
15e8d2c5e7
ptx: add store instruction
...
llvm-svn: 122652
2011-01-01 10:50:37 +00:00
Nick Lewycky
ee0432ce08
Add another non-commutable instruction that gas accepts commuted forms for.
...
Fixes PR8861.
llvm-svn: 122641
2010-12-30 22:10:49 +00:00
Che-Liang Chiou
3ee0501338
ptx: add state spaces
...
llvm-svn: 122638
2010-12-30 10:41:27 +00:00
Daniel Dunbar
ab14a6f174
MC/Mach-O/Thumb: Set the thumb bit in the symbol table.
...
llvm-svn: 122630
2010-12-29 14:14:06 +00:00
Rafael Espindola
46a5b05207
Correctly encode pcrel|indirect.
...
llvm-svn: 122624
2010-12-29 04:31:26 +00:00
NAKAMURA Takumi
a834008959
test/Transforms/ConstProp/logicaltest.ll: FileCheck-ize.
...
llvm-svn: 122620
2010-12-29 03:58:56 +00:00
NAKAMURA Takumi
dbacfb73e0
test/CodeGen/X86/negative-sin.ll: FileCheck-ize.
...
llvm-svn: 122619
2010-12-29 03:58:47 +00:00
NAKAMURA Takumi
74835a22d8
test/CodeGen/X86/fp-in-intregs.ll: FileCheck-ize.
...
llvm-svn: 122618
2010-12-29 03:58:36 +00:00
Rafael Espindola
563301dfdb
Fix bug when trying to output uint16_t or uint32_t.
...
llvm-svn: 122615
2010-12-29 02:30:49 +00:00
Rafael Espindola
290d71671e
Implement cfi_def_cfa. Also don't convert to dwarf reg numbers twice. Looks
...
like 6 is a fixed point of that and so the previous tests were OK :-)
llvm-svn: 122614
2010-12-29 01:42:56 +00:00
Rafael Espindola
426e68f793
Implement cfi_def_cfa_register.
...
llvm-svn: 122612
2010-12-29 00:26:06 +00:00
Rafael Espindola
86d347dd31
Initial .cfi_offset implementation.
...
llvm-svn: 122611
2010-12-29 00:09:59 +00:00
Rafael Espindola
6bbfb6c06c
Don't produce a "DW_CFA_advance_loc 0".
...
llvm-svn: 122609
2010-12-28 23:38:03 +00:00
Rafael Espindola
85d91982ca
Implement .cfi_remember_state and .cfi_restore_state.
...
llvm-svn: 122602
2010-12-28 18:36:23 +00:00
Rafael Espindola
736a35d9ab
Relax address updates in the eh_frame section.
...
llvm-svn: 122591
2010-12-28 05:39:27 +00:00
Rafael Espindola
a75b87b55a
Start adding basic support for emitting the call frame instructions.
...
llvm-svn: 122590
2010-12-28 04:15:37 +00:00
Rafael Espindola
1de2dd0e5e
Add support for .cfi_lsda.
...
llvm-svn: 122584
2010-12-27 15:56:22 +00:00
Daniel Dunbar
a895c69431
MC/Mach-O/Thumb: Select appropriate relocation types for Thumb.
...
llvm-svn: 122583
2010-12-27 14:49:49 +00:00
Rafael Espindola
8fc59a682f
Handle reloc_riprel_4byte_movq_load. Should make the bots happy.
...
llvm-svn: 122579
2010-12-27 02:03:24 +00:00
Rafael Espindola
2ac8355ecd
Add support for the same encodings of the personality function that gnu as
...
supports.
llvm-svn: 122577
2010-12-27 00:36:05 +00:00
Chris Lattner
29e14edc8d
implement enough of the memset inference algorithm to recognize and insert
...
memsets. This is still missing one important validity check, but this is enough
to compile stuff like this:
void test0(std::vector<char> &X) {
for (std::vector<char>::iterator I = X.begin(), E = X.end(); I != E; ++I)
*I = 0;
}
void test1(std::vector<int> &X) {
for (long i = 0, e = X.size(); i != e; ++i)
X[i] = 0x01010101;
}
With:
$ clang t.cpp -S -o - -O2 -emit-llvm | opt -loop-idiom | opt -O3 | llc
to:
__Z5test0RSt6vectorIcSaIcEE: ## @_Z5test0RSt6vectorIcSaIcEE
## BB#0: ## %entry
subq $8, %rsp
movq (%rdi), %rax
movq 8(%rdi), %rsi
cmpq %rsi, %rax
je LBB0_2
## BB#1: ## %bb.nph
subq %rax, %rsi
movq %rax, %rdi
callq ___bzero
LBB0_2: ## %for.end
addq $8, %rsp
ret
...
__Z5test1RSt6vectorIiSaIiEE: ## @_Z5test1RSt6vectorIiSaIiEE
## BB#0: ## %entry
subq $8, %rsp
movq (%rdi), %rax
movq 8(%rdi), %rdx
subq %rax, %rdx
cmpq $4, %rdx
jb LBB1_2
## BB#1: ## %for.body.preheader
andq $-4, %rdx
movl $1, %esi
movq %rax, %rdi
callq _memset
LBB1_2: ## %for.end
addq $8, %rsp
ret
llvm-svn: 122573
2010-12-26 23:42:51 +00:00
Chris Lattner
6cf8d6cc6e
start using irbuilder to make mem intrinsics in a few passes.
...
llvm-svn: 122572
2010-12-26 22:57:41 +00:00
Rafael Espindola
9ae2d05d45
Add support for @note. Patch by Jörg Sonnenberger.
...
llvm-svn: 122568
2010-12-26 21:30:59 +00:00
Rafael Espindola
9141b611ad
Add basic support for .cfi_personality.
...
llvm-svn: 122566
2010-12-26 20:20:31 +00:00