Benjamin Kramer
bde9176663
Fix typos found by http://github.com/lyda/misspell-check
...
llvm-svn: 157885
2012-06-02 10:20:22 +00:00
Hans Wennborg
789acfb63d
Implement the local-dynamic TLS model for x86 (PR3985)
...
This implements codegen support for accesses to thread-local variables
using the local-dynamic model, and adds a clean-up pass so that the base
address for the TLS block can be re-used between local-dynamic accesses on
an execution path.
llvm-svn: 157818
2012-06-01 16:27:21 +00:00
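Note on r157818: a minimal sketch of source code that would exercise the local-dynamic model, assuming a GCC- or Clang-compatible compiler on an ELF target. The tls_model attribute is the documented GNU variable attribute; the variable and function names are hypothetical and are not taken from the patch or its tests. With two local-dynamic variables touched on one path, the TLS block base only needs to be computed once, which is the reuse the clean-up pass is described as enabling.

```cpp
#include <cassert>

// Two thread-local counters that request the local-dynamic TLS model.
// Names are illustrative only.
static thread_local int hits __attribute__((tls_model("local-dynamic"))) = 0;
static thread_local int misses __attribute__((tls_model("local-dynamic"))) = 0;

// Both accesses sit on the same execution path, so a compiler that supports
// the model may look up the TLS block base once and address each variable
// by its offset from that base.
int recordBoth() {
  ++hits;
  ++misses;
  return hits + misses;
}

// Small usage check; counters start at zero in each thread.
void checkRecordBoth() {
  int first = recordBoth();
  int second = recordBoth();
  assert(second == first + 2);
}
```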
Jakob Stoklund Olesen
4f203ea34b
Add support for return value promotion in X86 calling conventions.
...
Patch by Yiannis Tsiouris!
llvm-svn: 157757
2012-05-31 17:28:20 +00:00
Justin Holewinski
aa58397b3c
Change interface for TargetLowering::LowerCallTo and TargetLowering::LowerCall
...
to pass around a struct instead of a large set of individual values. This
cleans up the interface and allows more information to be added to the struct
for future targets without requiring changes to each and every target.
NV_CONTRIB
llvm-svn: 157479
2012-05-25 16:35:28 +00:00
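Note on r157479: the interface change described above is an instance of the parameter-object pattern. The sketch below is a toy illustration only. CallInfoSketch, its fields, and both functions are hypothetical names invented here, not the LLVM declarations, and the "slot count" they compute is a placeholder for real lowering work.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the struct the commit describes; the field
// names are illustrative and are not taken from the LLVM sources.
struct CallInfoSketch {
  std::string callee;
  std::vector<int> args;
  unsigned callingConv = 0;
  bool isVarArg = false;
  bool isTailCall = false;
};

// Old style: every new piece of information adds a parameter to every
// target's override.
inline unsigned lowerCallOld(const std::string &callee,
                             const std::vector<int> &args,
                             unsigned callingConv, bool isVarArg,
                             bool isTailCall) {
  (void)callee; (void)callingConv; (void)isVarArg;
  // Toy result: one slot per argument plus one unless it is a tail call.
  return static_cast<unsigned>(args.size()) + (isTailCall ? 0u : 1u);
}

// New style: the signature stays fixed when fields are added to the struct.
inline unsigned lowerCallNew(const CallInfoSketch &info) {
  return static_cast<unsigned>(info.args.size()) + (info.isTailCall ? 0u : 1u);
}

// Both styles agree on a sample call.
inline void checkLowerCall() {
  CallInfoSketch info;
  info.args = {1, 2, 3};
  assert(lowerCallNew(info) ==
         lowerCallOld(info.callee, info.args, info.callingConv,
                      info.isVarArg, info.isTailCall));
}
```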
Craig Topper
53b4b73be9
Fix constant used for pshufb mask when lowering v16i8 shuffles. Bug introduced in r157043. Fixes PR12908.
...
llvm-svn: 157236
2012-05-22 06:09:38 +00:00
Craig Topper
e88f2fd4f7
Allow 256-bit shuffles to still be split even if only half of the shuffle comes from two 128-bit pieces.
...
llvm-svn: 157175
2012-05-21 06:40:16 +00:00
Nadav Rotem
c93e91da27
On Haswell, prefer storing YMM registers using a single instruction.
...
llvm-svn: 157129
2012-05-19 20:30:08 +00:00
Nadav Rotem
900c7cb7ce
Add support for additional in-reg vbroadcast patterns
...
llvm-svn: 157127
2012-05-19 19:57:37 +00:00
Craig Topper
0cf4038c59
Simplify code a bit. No functional change intended.
...
llvm-svn: 157044
2012-05-18 07:07:36 +00:00
Craig Topper
92db928ee9
Simplify handling of v16i8 shuffles and fix a missed optimization.
...
llvm-svn: 157043
2012-05-18 06:42:06 +00:00
Hans Wennborg
f9d0e44b82
Implement initial-exec TLS model for 32-bit PIC x86
...
This fixes a TODO from 2007 :) Previously, LLVM would emit the wrong
code here (see the update to test/CodeGen/X86/tls-pie.ll).
llvm-svn: 156611
2012-05-11 10:11:01 +00:00
Nadav Rotem
1a65397017
Fix merge-typo and cleanup
...
llvm-svn: 156541
2012-05-10 12:50:02 +00:00
Nadav Rotem
15946e50c1
AVX2: Add an additional broadcast idiom.
...
llvm-svn: 156540
2012-05-10 12:39:13 +00:00
Nadav Rotem
b86a3fb8d0
Generate AVX/AVX2 shuffles even when there is a memory op somewhere else in the program.
...
Starting r155461 we are able to select patterns for vbroadcast even when the load op is used by other users.
Fix PR11900.
llvm-svn: 156539
2012-05-10 12:22:05 +00:00
Chad Rosier
d8287fec17
Fix a regression from r147481. This combine should only happen if there is a
...
single use.
rdar://11360370
llvm-svn: 156316
2012-05-07 18:47:44 +00:00
Manman Ren
ef4e0479ec
X86: optimization for -(x != 0)
...
This patch will optimize -(x != 0) on X86
FROM
cmpl $0x01,%edi
sbbl %eax,%eax
notl %eax
TO
negl %edi
sbbl %eax, %eax
In order to generate negl, I added patterns in Target/X86/X86InstrCompiler.td:
def : Pat<(X86sub_flag 0, GR32:$src), (NEG32r GR32:$src)>;
rdar: 10961709
llvm-svn: 156312
2012-05-07 18:06:23 +00:00
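Note on r156312: the two sequences in the message can be checked against each other with a small model of the carry flag. This is a sketch of the arithmetic only, not of the instruction selector; the function names are hypothetical. It assumes the usual x86 semantics: CMP sets CF on unsigned borrow, NEG sets CF unless its operand is zero, and SBB of a register with itself yields 0 or all ones depending on CF.

```cpp
#include <cassert>
#include <cstdint>

// Source-level expression: -(x != 0), i.e. all ones for nonzero x, else 0.
inline std::uint32_t reference(std::uint32_t x) {
  return x != 0 ? 0xFFFFFFFFu : 0u;
}

// Old sequence: cmpl $1, x ; sbbl eax, eax ; notl eax
inline std::uint32_t cmpSbbNotModel(std::uint32_t x) {
  bool carry = x < 1u;                  // CMP borrows only when x == 0
  std::uint32_t eax = carry ? 0xFFFFFFFFu : 0u;  // SBB r, r gives -CF
  return ~eax;                          // NOT flips the mask
}

// New sequence: negl x ; sbbl eax, eax
inline std::uint32_t negSbbModel(std::uint32_t x) {
  bool carry = x != 0;                  // NEG sets CF unless x == 0
  return carry ? 0xFFFFFFFFu : 0u;      // SBB r, r gives -CF
}

// Both models match the reference on a few sample inputs.
inline void checkNegSbb() {
  const std::uint32_t samples[] = {0u, 1u, 2u, 0x80000000u, 0xFFFFFFFFu};
  for (std::uint32_t x : samples) {
    assert(cmpSbbNotModel(x) == reference(x));
    assert(negSbbModel(x) == reference(x));
  }
}
```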
Craig Topper
00a1e6d48b
Use MVT instead of EVT as the argument to all the shuffle decode functions. Simplify some of the decode functions.
...
llvm-svn: 156268
2012-05-06 19:46:21 +00:00
Craig Topper
804be3b546
Add VPERMQ/VPERMPD to the list of target specific shuffles that can be looked through for DAG combine purposes.
...
llvm-svn: 156266
2012-05-06 18:54:26 +00:00
Benjamin Kramer
e31f31e5c0
Add a new target hook "predictableSelectIsExpensive".
...
This will be used to determine whether it's profitable to turn a select into a
branch when the branch is likely to be predicted.
Currently enabled for everything but Atom on X86 and Cortex-A9 devices on ARM.
I'm not entirely happy with the name of this flag, suggestions welcome ;)
llvm-svn: 156233
2012-05-05 12:49:14 +00:00
Craig Topper
bdd2e34b1f
Fix some loops to match coding standards. No functional change intended.
...
llvm-svn: 156159
2012-05-04 06:39:13 +00:00
Craig Topper
d4d3237bb8
Fix up some spacing. No functional change.
...
llvm-svn: 156158
2012-05-04 06:18:33 +00:00
Craig Topper
e2ae413746
Simplify broadcast lowering code. No functional change intended.
...
llvm-svn: 156157
2012-05-04 05:49:51 +00:00
Craig Topper
42f2182366
Allow v16i16 and v32i8 shuffles to be rewritten as narrower shuffles.
...
llvm-svn: 156156
2012-05-04 04:44:49 +00:00
Craig Topper
59063c0a3d
Simplify shuffle narrowing code a bit. No functional change intended.
...
llvm-svn: 156154
2012-05-04 04:08:44 +00:00
Craig Topper
242183834a
Use 'unsigned' instead of 'int' in a few places dealing with counts of vector elements.
...
llvm-svn: 156060
2012-05-03 07:26:59 +00:00
Craig Topper
315a5cc789
Fix 256-bit vpshuflw and vpshufhw immediate encoding to handle undefs in the lower half correctly. Missed in r155982.
...
llvm-svn: 156059
2012-05-03 07:12:59 +00:00
Preston Gurd
926afd7401
For Intel Atom, use ILP scheduling always, instead of ILP for 64 bit
...
and Hybrid for 32 bit, since benchmarks show ILP scheduling is better
most of the time.
llvm-svn: 156028
2012-05-02 22:02:02 +00:00
Manman Ren
f02efc8731
Revert r155853
...
The commit is intended to fix rdar://10961709.
But it is the root cause of PR12720.
Revert it for now.
llvm-svn: 155992
2012-05-02 15:24:32 +00:00
Craig Topper
c73bc39c22
Add support for selecting AVX2 vpshuflw and vpshufhw. Add decoding support for AsmPrinter.
...
llvm-svn: 155982
2012-05-02 08:03:44 +00:00
Manman Ren
425a55c1ce
X86: optimization for max-like struct
...
This patch will optimize the following cases on X86
(a > b) ? (a-b) : 0
(a >= b) ? (a-b) : 0
(b < a) ? (a-b) : 0
(b <= a) ? (a-b) : 0
FROM
movl %edi, %ecx
subl %esi, %ecx
cmpl %edi, %esi
movl $0, %eax
cmovll %ecx, %eax
TO
xorl %eax, %eax
subl %esi, %edi
cmovll %eax, %edi
movl %edi, %eax
rdar: 10734411
llvm-svn: 155919
2012-05-01 17:16:15 +00:00
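Note on r155919: the rewrite works because SUB sets the same flags as a CMP of the same operands, so the separate compare is redundant. The sketch below models both forms in plain C++; the function names are hypothetical, and the subtraction is done in unsigned arithmetic so that the model wraps like the hardware instead of relying on signed overflow.

```cpp
#include <cassert>
#include <cstdint>

// Wrapping 32-bit subtraction, as SUB performs it.
inline std::int32_t wrappingSub(std::int32_t a, std::int32_t b) {
  return static_cast<std::int32_t>(static_cast<std::uint32_t>(a) -
                                   static_cast<std::uint32_t>(b));
}

// Source-level form: (a > b) ? (a - b) : 0
inline std::int32_t selectForm(std::int32_t a, std::int32_t b) {
  return (a > b) ? wrappingSub(a, b) : 0;
}

// Model of the new sequence: xor / sub / cmovl / mov.
inline std::int32_t subCmovForm(std::int32_t a, std::int32_t b) {
  std::int32_t zero = 0;                  // xorl %eax, %eax
  std::int32_t diff = wrappingSub(a, b);  // subl: result and flags together
  bool less = a < b;                      // the signed "less" condition from SUB
  if (less) diff = zero;                  // cmovll: replace with 0 when a < b
  return diff;                            // movl %edi, %eax
}

// When a == b the difference is already 0, so > and >= give the same result.
inline void checkSubCmov() {
  assert(subCmovForm(4, 4) == selectForm(4, 4));
  assert(subCmovForm(9, 2) == selectForm(9, 2));
  assert(subCmovForm(2, 9) == selectForm(2, 9));
}
```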
Manman Ren
4f4d5c8fc8
X86: optimization for -(x != 0)
...
This patch will optimize -(x != 0) on X86
FROM
cmpl $0x01,%edi
sbbl %eax,%eax
notl %eax
TO
negl %edi
sbbl %eax, %eax
llvm-svn: 155853
2012-04-30 22:51:25 +00:00
Chad Rosier
d427d51c2b
Tidy up. No functional change intended.
...
llvm-svn: 155832
2012-04-30 17:47:15 +00:00
Craig Topper
55b3990837
No need to normalize index before calling Extract128BitVector
...
llvm-svn: 155811
2012-04-30 05:17:10 +00:00
Jakub Staszak
da03f3ba64
Remove unneeded casts. No functionality change.
...
llvm-svn: 155800
2012-04-29 20:52:53 +00:00
Craig Topper
3b94fa63d6
Simplify code a bit. No functional change intended.
...
llvm-svn: 155798
2012-04-29 20:22:05 +00:00
Craig Topper
0fa6c7e593
Use 'unsigned' instead of 'int' in several places when retrieving number of vector elements.
...
llvm-svn: 155742
2012-04-27 22:54:43 +00:00
Chad Rosier
32c2178ef3
Add x86-specific DAG combine to simplify:
...
x == -y --> x+y == 0
x != -y --> x+y != 0
On x86, the generated code goes from
negl %esi
cmpl %esi, %edi
je .LBB0_2
to
addl %esi, %edi
je .L4
This case is correctly handled for ARM with "cmn".
Patch by Manman Ren.
rdar://11245199
PR12545
llvm-svn: 155739
2012-04-27 22:33:25 +00:00
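Note on r155739: the equivalence behind this combine holds in wrapping (modular) arithmetic, which is what the hardware ADD and NEG implement. A minimal sketch, using unsigned 32-bit values to stand in for register contents; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Original form: negate y, then compare (negl + cmpl).
inline bool compareWithNegate(std::uint32_t x, std::uint32_t y) {
  return x == (0u - y);
}

// Combined form: add and test the result against zero (addl sets ZF).
inline bool compareSumWithZero(std::uint32_t x, std::uint32_t y) {
  return (x + y) == 0u;
}

// The two forms agree, including at the value whose negation is itself.
inline void checkNegateCompare() {
  const std::uint32_t samples[] = {0u, 1u, 5u, 0x80000000u, 0xFFFFFFFBu};
  for (std::uint32_t x : samples)
    for (std::uint32_t y : samples)
      assert(compareWithNegate(x, y) == compareSumWithZero(x, y));
}
```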
Craig Topper
42cd8d2c00
Tidy up spacing.
...
llvm-svn: 155733
2012-04-27 21:05:09 +00:00
Benjamin Kramer
913da4b261
X86: Don't emit conditional floating point moves when targeting pre-pentiumpro architectures.
...
* Model FPSW (the FPU status word) as a register.
* Add ISel patterns for the FUCOM*, FNSTSW and SAHF instructions.
* During Legalize/Lowering, build a node sequence to transfer the comparison
result from FPSW into EFLAGS. If you're wondering about the right-shift: That's
an implicit sub-register extraction (%ax -> %ah) which is handled later on by
the instruction selector.
Fixes PR6679. Patch by Christoph Erhardt!
llvm-svn: 155704
2012-04-27 12:07:43 +00:00
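Note on r155704: the FPSW-to-EFLAGS transfer can be modelled as a bit shuffle. FNSTSW stores the 16-bit status word in AX, the right shift by 8 corresponds to reading AH, and SAHF copies AH into the low flags. The sketch below assumes the standard x87 layout (C0 at bit 8, C2 at bit 10, C3 at bit 14) and the SAHF mapping (AH bit 0 to CF, bit 2 to PF, bit 6 to ZF). The struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// The three EFLAGS bits that carry the comparison result after SAHF.
struct FlagsSketch {
  bool cf;
  bool pf;
  bool zf;
};

// Models FNSTSW %ax followed by SAHF.
inline FlagsSketch flagsFromStatusWord(std::uint16_t fpsw) {
  std::uint8_t ah = static_cast<std::uint8_t>(fpsw >> 8);  // %ax -> %ah
  FlagsSketch f;
  f.cf = (ah & 0x01) != 0;  // C0 lands in CF
  f.pf = (ah & 0x04) != 0;  // C2 lands in PF
  f.zf = (ah & 0x40) != 0;  // C3 lands in ZF
  return f;
}

// An unordered FUCOM result sets C3, C2 and C0, so all three flags are set.
inline void checkUnordered() {
  FlagsSketch f = flagsFromStatusWord(0x4500);
  assert(f.cf && f.pf && f.zf);
}
```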
Craig Topper
5ff6dc34b9
Use vector_shuffles instead of target specific unpack nodes for AVX ZERO_EXTEND/ANY_EXTEND combine. These will be converted to target specific nodes during lowering. This is more consistent with other code.
...
llvm-svn: 155537
2012-04-25 06:39:39 +00:00
Nadav Rotem
7b7b99c74a
AVX2: The BLENDPW instruction selects between vectors of v16i16 using an i8
...
immediate. We can't use it here because the shuffle code does not check that
the lower part of the word is identical to the upper part.
llvm-svn: 155440
2012-04-24 11:27:53 +00:00
Craig Topper
0b65c40821
Remove dangling spaces. Fix some other formatting.
...
llvm-svn: 155429
2012-04-24 06:36:35 +00:00
Craig Topper
6f2a535de2
Simplify code a bit and make it compile better. Remove unused parameters.
...
llvm-svn: 155428
2012-04-24 06:02:29 +00:00
Nadav Rotem
3f8acfc3c4
Optimize the vector UINT_TO_FP, SINT_TO_FP and FP_TO_SINT operations where the integer type is i8 (commonly used in graphics).
...
llvm-svn: 155397
2012-04-23 21:53:37 +00:00
Craig Topper
153bb34a3c
Use MVT instead of EVT through all of LowerVECTOR_SHUFFLEtoBlend and not just the switch. Saves a little bit of binary size.
...
llvm-svn: 155339
2012-04-23 07:36:33 +00:00
Craig Topper
0a2c809d09
Make getZeroVector and getOnesVector more alike as far as how they detect 128-bit versus 256-bit vectors. Be explicit about both sizes and use llvm_unreachable. Similar changes to getLegalSplat.
...
llvm-svn: 155337
2012-04-23 07:24:41 +00:00
Craig Topper
2bbe8bcf4e
Tidy up by removing some 'else' after 'return'
...
llvm-svn: 155336
2012-04-23 06:57:04 +00:00
Craig Topper
5c51eeecfc
Tidy up spacing in LowerVECTOR_SHUFFLEtoBlend. Remove code that checks if shuffle operand has a different type than the shuffle result since it can never happen.
...
llvm-svn: 155333
2012-04-23 06:38:28 +00:00
Craig Topper
a52f0d09b6
Add a couple llvm_unreachables.
...
llvm-svn: 155332
2012-04-23 03:42:40 +00:00
Craig Topper
984dc015ae
Remove some tab characters.
...
llvm-svn: 155331
2012-04-23 03:28:34 +00:00