Chris Lattner
470ab00c76
A better fix for my previous patch, MOVZQI2PQIrr just requires SSE2.
...
llvm-svn: 49986
2008-04-20 05:52:46 +00:00
Chris Lattner
a124f1e219
Not all x86-64 machines have sse3 apparently.
...
llvm-svn: 49985
2008-04-20 05:47:56 +00:00
Chris Lattner
50fb77f829
rename *.llx -> *.ll
...
llvm-svn: 49970
2008-04-19 22:29:10 +00:00
Evan Cheng
5102bd9359
64-bit atomic operations.
...
llvm-svn: 49949
2008-04-19 02:30:38 +00:00
Evan Cheng
7e4a55bc58
Be more careful with insert_subreg and extract_subreg where either source or destination operand has already been coalesced with another register that's defined by a insert_subreg or extract_subreg.
...
llvm-svn: 49843
2008-04-17 07:58:04 +00:00
Evan Cheng
c8c3a899c0
Fix a sub-register indice propagation bug.
...
llvm-svn: 49832
2008-04-17 00:06:42 +00:00
Evan Cheng
147cb764b5
Don't forget about sub-register indices when rematting instructions.
...
llvm-svn: 49830
2008-04-16 23:44:44 +00:00
Evan Cheng
59aa126e48
After reading memory that's already freed.
...
llvm-svn: 49810
2008-04-16 20:24:25 +00:00
Evan Cheng
7b989d853e
Really test what's intended.
...
llvm-svn: 49802
2008-04-16 18:21:55 +00:00
Evan Cheng
e45b8f89c5
Rewrite LiveVariable liveness computation. The new implementation is much simplified. It eliminated the nasty recursive routines and removed the partial def / use bookkeeping. There is also potential for performance improvement by replacing the conservative handling of partial physical register definitions. The code is currently disabled until live interval analysis is taught of the name scheme.
...
This patch also fixed a couple of nasty corner cases.
llvm-svn: 49784
2008-04-16 09:46:40 +00:00
Dan Gohman
d43d3beeb0
Add support for the form of the SSE41 extractps instruction that
...
puts its result in a 32-bit GPR.
llvm-svn: 49762
2008-04-16 02:32:24 +00:00
Dan Gohman
8c99ccaf96
Recreate the size SDNode instead of reusing the old one in the x86
...
memcpy lowering code; this ensures that the size node has the desired
result type. This fixes a regression from r49572 with @llvm.memcpy.i64
on x86-32.
llvm-svn: 49761
2008-04-16 01:32:32 +00:00
Dan Gohman
01a5d36d9d
Add movd instructions to move from MMX registers
...
to 64-bit GPR registers on x86-64.
llvm-svn: 49757
2008-04-15 23:55:07 +00:00
Dan Gohman
4370f26750
Treat EntryToken nodes as "passive" so that they aren't added to the
...
ScheduleDAG; they don't correspond to any actual instructions so they
don't need to be scheduled.
This fixes a bug where the EntryToken was being scheduled multiple
times in some cases, though it ended up not causing any trouble because
EntryToken doesn't expand into anything. With this fixed the schedulers
reliably schedule the expected number of units, so we can check this
with an assertion.
This requires a tweak to test/CodeGen/X86/loop-hoist.ll because it
ends up getting scheduled differently in a trivial way, though it was
enough to fool the prcontext+grep that the test does.
llvm-svn: 49701
2008-04-15 01:22:18 +00:00
Dan Gohman
b37dab1360
Upgrade these tests for the current intrinsic prototypes.
...
llvm-svn: 49669
2008-04-14 18:19:18 +00:00
Dale Johannesen
ea3aa5bf11
Remove -unwind-tables-optional everywhere, since
...
this is now the default.
llvm-svn: 49667
2008-04-14 17:56:54 +00:00
Arnold Schwaighofer
634fc9a33a
This patch corrects the handling of byval arguments for tailcall
...
optimized x86-64 (and x86) calls so that they work (... at least for
my test cases).
Should fix the following problems:
Problem 1: When i introduced the optimized handling of arguments for
tail called functions (using a sequence of copyto/copyfrom virtual
registers instead of always lowering to top of the stack) i did not
handle byval arguments correctly e.g they did not work at all :).
Problem 2: On x86-64 after the arguments of the tail called function
are moved to their registers (which include ESI/RSI etc), tail call
optimization performs byval lowering which causes xSI,xDI, xCX
registers to be overwritten. This is handled in this patch by moving
the arguments to virtual registers first and after the byval lowering
the arguments are moved from those virtual registers back to
RSI/RDI/RCX.
llvm-svn: 49584
2008-04-12 18:11:06 +00:00
Dan Gohman
544ab2c50b
Drop ISD::MEMSET, ISD::MEMMOVE, and ISD::MEMCPY, which are not Legal
...
on any current target and aren't optimized in DAGCombiner. Instead
of using intermediate nodes, expand the operations, choosing between
simple loads/stores, target-specific code, and library calls,
immediately.
Previously, the code to emit optimized code for these operations
was only used at initial SelectionDAG construction time; now it is
used at all times. This fixes some cases where rep;movs was being
used for small copies where simple loads/stores would be better.
This also cleans up code that checks for alignments less than 4;
let the targets make that decision instead of doing it in
target-independent code. This allows x86 to use rep;movs in
low-alignment cases.
Also, this fixes a bug that resulted in the use of rep;stos for
memsets of 0 with non-constant memory size when the alignment was
at least 4. It's better to use the library in this case, which
can be significantly faster when the size is large.
This also preserves more SourceValue information when memory
intrinsics are lowered into simple loads/stores.
llvm-svn: 49572
2008-04-12 04:36:06 +00:00
Dan Gohman
8c7cf88f7e
Fix a bug that prevented x86-64 from using rep.movsq for
...
8-byte-aligned data.
llvm-svn: 49571
2008-04-12 02:35:39 +00:00
Evan Cheng
33281864c1
If a PHI node has a single implicit_def source, replace it with an implicit_def instead of a copy.
...
llvm-svn: 49543
2008-04-11 17:54:45 +00:00
Evan Cheng
b53d560150
New test.
...
llvm-svn: 49514
2008-04-10 23:49:09 +00:00
Evan Cheng
16ea87d6ee
A copy instruction may use a register multiple times on some targets. Change them all.
...
llvm-svn: 49491
2008-04-10 18:38:47 +00:00
Chris Lattner
ad75302497
Fix the x86-64 side of PR2108 by adding a v2f64 version of
...
MOVZQI2PQIrr. This would be better handled as a dag combine
(with the goal of eliminating the bitconvert) but I don't know
how to do that safely. Thoughts welcome.
llvm-svn: 49463
2008-04-10 05:13:43 +00:00
Evan Cheng
9d339849ee
Teach branch folding pass about implicit_def instructions. Unfortunately we can't just eliminate them since register scavenger expects every register use to be defined. However, we can delete them when there are no intra-block uses. Carefully removing some implicit def's which enable more blocks to be optimized away.
...
llvm-svn: 49461
2008-04-10 02:32:10 +00:00
Evan Cheng
c8eeb752a3
- More aggressively coalescing away copies whose source is defined by an implicit_def.
...
- Added insert_subreg coalescing support.
llvm-svn: 49448
2008-04-09 20:57:25 +00:00
Evan Cheng
aa3b55f842
Missed a hasInterval check.
...
llvm-svn: 49415
2008-04-09 01:30:15 +00:00
Dale Johannesen
5169fa17b5
Rename -disable-required-unwind-tables to -unwind-tables-optional.
...
llvm-svn: 49391
2008-04-08 18:10:08 +00:00
Dale Johannesen
399c0e63af
Missed one.
...
llvm-svn: 49365
2008-04-08 00:14:59 +00:00
Dale Johannesen
da298f9107
Add -disable-required-unwind-tables to tests
...
that need it (usually, grepping for some string
found in unwind info)
llvm-svn: 49364
2008-04-08 00:14:17 +00:00
Evan Cheng
7b93268305
Fix test.
...
llvm-svn: 49343
2008-04-07 17:02:18 +00:00
Chris Lattner
5bd2f736e6
fix this testcase to pass and remove a duplicate instance of itself.
...
llvm-svn: 49281
2008-04-06 21:39:17 +00:00
Torok Edwin
613d7afe64
Prefer to expand mask for xor to -1, so we have a chance to turn it into a not.
...
If it cannot be expanded, it will keep the old behaviour and try to shrink the constant.
Part of enhancement for PR2191.
llvm-svn: 49280
2008-04-06 21:23:02 +00:00
Evan Cheng
b5fdc923d3
1. IMPLICIT_DEF can *re-define* any register.
...
2. Coalescer can now create an interesting situation where a register def can
reaches itself without being killed.
llvm-svn: 49246
2008-04-05 01:27:09 +00:00
Evan Cheng
f77b5ef3d0
Favors pshufd over shufps when shuffling elements from one vector. pshufd is faster than shufps.
...
llvm-svn: 49244
2008-04-05 00:30:36 +00:00
Evan Cheng
4b9a2c0b59
New test case.
...
llvm-svn: 49190
2008-04-03 21:25:03 +00:00
Dale Johannesen
5316aebeb6
Testcase for EH with functions whose names are stripped.
...
llvm-svn: 49111
2008-04-02 20:16:41 +00:00
Dan Gohman
980d7200c1
Speculatively micro-optimize memory-zeroing calls on Darwin 10.
...
llvm-svn: 49048
2008-04-01 20:38:36 +00:00
Evan Cheng
0bd72c5ccd
More soft fp fixes.
...
llvm-svn: 49016
2008-04-01 02:18:22 +00:00
Evan Cheng
86e476b7cb
Unbreak ARM / Thumb soft FP support.
...
llvm-svn: 49012
2008-04-01 01:50:16 +00:00
Dale Johannesen
0de94a1712
Mark functions in some tests as 'nounwind'. Generating
...
EH info for these functions causes the tests to fail for
random reasons (e.g. looking for 'or' or counting lines
with asm-printer; labels count as lines.)
llvm-svn: 49003
2008-03-31 23:20:09 +00:00
Evan Cheng
e4f77c69ac
It's not safe to fold a load from GV stub or constantpool into a two-address use.
...
llvm-svn: 49002
2008-03-31 23:19:51 +00:00
Dan Gohman
f549b26254
Fix a DAGCombiner optimization to respect volatile qualification.
...
llvm-svn: 48994
2008-03-31 20:32:52 +00:00
Dan Gohman
fd2eb00cc2
Fix a tokenfactor node to use the load chain rather than the
...
load value. This fixes PR2177.
llvm-svn: 48932
2008-03-28 23:45:16 +00:00
Evan Cheng
5832410d77
Fix a memory bug: increment an iterator of a deleted machine instr.
...
llvm-svn: 48853
2008-03-27 01:27:25 +00:00
Evan Cheng
db390694ff
One more coalescer fix wrt deadness propagation.
...
llvm-svn: 48837
2008-03-26 20:15:49 +00:00
Evan Cheng
289ba4f335
Avoid commuting a def MI in order to coalesce a copy instruction away if any use of the same val# is a copy instruction that has already been coalesced.
...
llvm-svn: 48833
2008-03-26 19:03:01 +00:00
Dale Johannesen
ad6c23d5e9
Use ## for comment delimiter on darwin x86-32, so
...
llvm's output .s files will go through gcc -std=c99
without triggering preprocesser errors. Approach
suggested by Daveed Vandevoorde.
llvm-svn: 48808
2008-03-25 23:29:30 +00:00
Evan Cheng
df1690dc7c
Handle a special case xor undef, undef -> 0. Technically this should be transformed to undef. But this is such a common idiom (misuse) we are going to handle it.
...
llvm-svn: 48792
2008-03-25 20:08:07 +00:00
Dan Gohman
883cbfd0ba
Add CMP32mr and friends to the load-unfolding table. Among
...
other things, this allows the scheduler to unfold a load operand
in the 2008-01-08-SchedulerCrash.ll testcase, so it now successfully
clones the comparison to avoid a pushf+popf.
llvm-svn: 48777
2008-03-25 16:53:19 +00:00
Tanya Lattner
8bf97c2324
Byebye llvm-upgrade!
...
llvm-svn: 48762
2008-03-25 04:26:08 +00:00
Evan Cheng
7d564c3b4a
lastRegisterUse() should ignore identity copies. Those will be erased.
...
llvm-svn: 48759
2008-03-25 02:02:19 +00:00
Bill Wendling
6306183df3
Use the bit size of the operand instead of the hard-coded 32 to generate the
...
mask.
llvm-svn: 48750
2008-03-24 23:16:37 +00:00
Evan Cheng
615488ab45
- SSE4.1 extractfps extracts a f32 into a gr32 register. Very useful! Not. Fix the instruction specification and teaches lowering code to use it only when the only use is a store instruction.
...
llvm-svn: 48746
2008-03-24 21:52:23 +00:00
Dan Gohman
d8ea040c31
APIntify SelectionDAG's EXTRACT_ELEMENT code.
...
llvm-svn: 48726
2008-03-24 16:38:05 +00:00
Bill Wendling
7e2a6c4112
New testcase.
...
llvm-svn: 48697
2008-03-22 22:27:01 +00:00
Evan Cheng
31604a62f6
Teach DAG combiner to commute commutable binary nodes in order to achieve sdisel CSE.
...
llvm-svn: 48673
2008-03-22 01:55:50 +00:00
Dan Gohman
a25dde6fee
Handle getresult instructions in different basic blocks
...
from their aggregate operands by moving the getresult
instructions.
llvm-svn: 48657
2008-03-21 21:01:32 +00:00
Chris Lattner
5abbe6cef5
Add support for calls that return two FP values in
...
ST(0)/ST(1).
llvm-svn: 48634
2008-03-21 06:38:26 +00:00
Chris Lattner
7e59a30e9f
disable a bogus assertion.
...
llvm-svn: 48633
2008-03-21 06:01:05 +00:00
Chris Lattner
b6f04a3e0a
Enable support for returning two long-double values in ST(0)/ST(1).
...
This allows us to compile fp-stack-2results.ll into:
_test:
fldz
fld1
ret
which returns 1 in ST(0) and 0 in ST(1). This is needed for x86-64
_Complex long double.
llvm-svn: 48632
2008-03-21 05:57:20 +00:00
Evan Cheng
92b4488202
Undo 48570. Correctly match mmx shift instructions with an immediate operand.
...
llvm-svn: 48627
2008-03-21 00:40:09 +00:00
Evan Cheng
7a3e750fd2
Fix this xform: (sra (shl X, m), result_size) -> (sign_extend (trunc (shl X, result_size - n - m)))
...
llvm-svn: 48578
2008-03-20 02:18:41 +00:00
Scott Michel
bbaf3edace
Add more patterns to match in the integer comparison test harnesses.
...
Fix bugs encountered, mostly due to range matching for immediates;
the CellSPU's 10-bit immediates are sign extended, covering a
larger range of unsigned values.
llvm-svn: 48575
2008-03-20 00:51:36 +00:00
Evan Cheng
bbba76fc99
Add intrinsics to match mmx shift builtin's with immediate operand.
...
llvm-svn: 48569
2008-03-19 23:38:52 +00:00
Dan Gohman
b9056838d2
Add support for multiple return values for the PPC target by
...
converting call result lowering to use the CallingConvLowering
infastructure.
llvm-svn: 48552
2008-03-19 21:39:28 +00:00
Christopher Lamb
8fe9109469
Fix X86's isTruncateFree to not claim that truncate to i1 is free. This fixes Bill's testcase that failed for r48491.
...
llvm-svn: 48542
2008-03-19 08:30:06 +00:00
Evan Cheng
56e9e57d28
Fixed a coalescer bug caused by a typo.
...
llvm-svn: 48526
2008-03-19 02:26:36 +00:00
Evan Cheng
44c0b4f754
Fix live variables issues:
...
1. If part of a register is re-defined, an implicit kill and an implicit def are added to denote read / mod / write. However, this should only be necessary if the register is actually read later. This is a performance issue.
2. If a sub-register is being defined, and it doesn't have a previous use, do not add a implicit kill to the last use of a super-register:
= EAX, AX<imp-use,kill>
...
AX =
In this case, EAX is live but AX is killed, this is wrong and will cause the coalescer to do bad things.
llvm-svn: 48521
2008-03-19 00:52:20 +00:00
Evan Cheng
484064370a
Fix a x86-64 isel lowering bug that's been around forever. A x86-64 varargs function implicitly reads X86::AL, don't clobber it!
...
llvm-svn: 48515
2008-03-18 23:36:35 +00:00
Bill Wendling
43784cc27d
It might be nice to have this run as x86 on non-x86 platforms...
...
llvm-svn: 48511
2008-03-18 22:38:22 +00:00
Bill Wendling
efb4d9ef80
Temporarily revert r48491. It's breaking test/CodeGen/X86/xorl.ll.
...
llvm-svn: 48510
2008-03-18 22:29:51 +00:00
Dale Johannesen
12c76db312
Make conversions of i8/i16 to ppcf128 work.
...
llvm-svn: 48493
2008-03-18 17:28:38 +00:00
Christopher Lamb
3e408d4d82
Target independent DAG transform to use truncate for field extraction + sign extend on targets where this is profitable. Passes nightly on x86-64.
...
llvm-svn: 48491
2008-03-18 16:46:39 +00:00
Evan Cheng
d096ec0a86
Rewrite code that propagate isDead information after a dead copy is coalesced. This remove some ugly spaghetti code and fixed a number of subtle bugs.
...
llvm-svn: 48490
2008-03-18 08:26:47 +00:00
Chris Lattner
3b79fdcae5
ensure we continue matching x86-64 rotates.
...
llvm-svn: 48437
2008-03-17 01:35:03 +00:00
Evan Cheng
84aec09fdb
Fix PR2138. Apparently any modification to a std::multimap (including remove entries for a different key) can invalidate multimap iterators.
...
llvm-svn: 48371
2008-03-14 20:44:01 +00:00
Dan Gohman
b72127ac4c
More APInt-ification.
...
llvm-svn: 48344
2008-03-13 22:13:53 +00:00
Evan Cheng
442d708bfb
New test case.
...
llvm-svn: 48338
2008-03-13 08:05:02 +00:00
Evan Cheng
ecde45ecb5
A test case I forgot to check in.
...
llvm-svn: 48335
2008-03-13 06:42:46 +00:00
Evan Cheng
5c26bde55e
TwoAddressInstructionPass enhancement. After it converts a two address instruction into a 3-address one, sink it past the instruction that kills the read-mod-write register if its definition is used past the kill. This reduces the number of live register by one.
...
llvm-svn: 48333
2008-03-13 06:37:55 +00:00
Evan Cheng
65e9d5f1a8
Experimental scheduler change to schedule / coalesce the copies added for function livein's. Take 2008-03-10-RegAllocInfLoop.ll, the schedule looks like this after these copies are inserted:
...
entry: 0x12049d0, LLVM BB @0x1201fd0, ID#0:
Live Ins: %EAX %EDX %ECX
%reg1031<def> = MOVPC32r 0
%reg1032<def> = ADD32ri %reg1031, <es:_GLOBAL_OFFSET_TABLE_>, %EFLAGS<imp-def>
%reg1028<def> = MOV32rr %EAX
%reg1029<def> = MOV32rr %EDX
%reg1030<def> = MOV32rr %ECX
%reg1027<def> = MOV8rm %reg0, 1, %reg0, 0, Mem:LD(1,1) [0x1201910 + 0]
%reg1025<def> = MOV32rr %reg1029
%reg1026<def> = MOV32rr %reg1030
%reg1024<def> = MOV32rr %reg1028
The copies unnecessarily increase register pressure and it will end up requiring a physical register to be spilled.
With -schedule-livein-copies:
entry: 0x12049d0, LLVM BB @0x1201fa0, ID#0:
Live Ins: %EAX %EDX %ECX
%reg1031<def> = MOVPC32r 0
%reg1032<def> = ADD32ri %reg1031, <es:_GLOBAL_OFFSET_TABLE_>, %EFLAGS<imp-def>
%reg1024<def> = MOV32rr %EAX
%reg1025<def> = MOV32rr %EDX
%reg1026<def> = MOV32rr %ECX
%reg1027<def> = MOV8rm %reg0, 1, %reg0, 0, Mem:LD(1,1) [0x12018e0 + 0]
Much better!
llvm-svn: 48307
2008-03-12 22:19:41 +00:00
Dan Gohman
f7492cf0ec
Fix this test on hosts that don't have sse2.
...
llvm-svn: 48296
2008-03-12 20:40:51 +00:00
Dan Gohman
35f8f07c00
Make this test x86-specific for now; targets that don't use
...
the automated CallingConv code to handle return values typically
don't support multiple return values.
llvm-svn: 48265
2008-03-12 00:25:14 +00:00
Dan Gohman
8a361c1f92
Basic feature test for multiple return values in codegen.
...
llvm-svn: 48260
2008-03-11 23:53:16 +00:00
Anton Korobeynikov
80b53b8f6b
Testcase for PR2137
...
llvm-svn: 48258
2008-03-11 22:43:42 +00:00
Anton Korobeynikov
6f51973734
Update testcase for recent aliases change
...
llvm-svn: 48250
2008-03-11 21:42:20 +00:00
Dan Gohman
6616836e71
Add a test to ensure that all-ones vectors are materialized with pcmpeqd.
...
llvm-svn: 48247
2008-03-11 21:37:00 +00:00
Dan Gohman
44b4c07cd1
Use the correct value for InSignBit.
...
llvm-svn: 48245
2008-03-11 21:29:43 +00:00
Chris Lattner
8abed80a69
Implement basic support for the 'f' register class constraint. This basically
...
works, but probably won't if you mix it with 't' or 'u' yet.
llvm-svn: 48243
2008-03-11 19:50:13 +00:00
Dale Johannesen
1b12647544
The feature this is testing did not work in the general case,
...
and has been removed.
llvm-svn: 48232
2008-03-11 17:48:26 +00:00
Evan Cheng
34e5b34426
Learn how to xfail a test.
...
llvm-svn: 48219
2008-03-11 07:51:31 +00:00
Evan Cheng
e88a625ecd
When the register allocator runs out of registers, spill a physical register around the def's and use's of the interval being allocated to make it possible for the interval to target a register and spill it right away and restore a register for uses. This likely generates terrible code but is before than aborting.
...
llvm-svn: 48218
2008-03-11 07:19:34 +00:00
Evan Cheng
756f9d9c51
XFAIL due to Dale's change.
...
llvm-svn: 48216
2008-03-11 07:15:44 +00:00
Dan Gohman
d6819da453
Generalize ExpandIntToFP to handle the case where the operand is legal
...
and it's the result that requires expansion. This code is a little confusing
because the TargetLoweringInfo tables for [US]INT_TO_FP use the operand type
(the integer type) rather than the result type.
llvm-svn: 48206
2008-03-11 01:59:03 +00:00
Scott Michel
92275427e5
- Style cleanup in IA64ISelLowering.h: add 'virtual' keyword for consistency.
...
- Add test pattern matching in CellSPU's icmp32.ll test harness
- Fix CellSPU fcmp.ll-generated assert.
llvm-svn: 48197
2008-03-10 23:49:09 +00:00
Chris Lattner
7362d38391
Don't emit FP_REG_KILL into a block that just returns. Nothing
...
can be live out of the block anyway, so it isn't needed.
llvm-svn: 48192
2008-03-10 23:34:12 +00:00
Dan Gohman
f4300950f1
Implement more support for fp-to-i128 and i128-to-fp conversions.
...
llvm-svn: 48189
2008-03-10 23:03:31 +00:00
Bill Wendling
24155f7381
Update llc flags for PPC register scavenger.
...
llvm-svn: 48187
2008-03-10 22:59:08 +00:00
Dan Gohman
272e234477
Fix mul expansion to check the correct number of bits for
...
zero extension when checking if an unsigned multiply is
safe.
llvm-svn: 48171
2008-03-10 20:42:19 +00:00
Dale Johannesen
fe2c0e2dca
These tests don't work unless SSE2 is active.
...
Judging from the checking comments this is intentional,
so add the flag (makes them pass on non-x86 host).
llvm-svn: 48157
2008-03-10 17:33:57 +00:00