llvm-project

Commit Graph

Author	SHA1	Message	Date
Rafael Espindola	d173f4237d	Avoid a hard coded constant. llvm-svn: 68603	2009-04-08 08:09:33 +00:00
Dan Gohman	ad3e549a53	Implement support for modeling implicit-zero-extension on x86-64 with SUBREG_TO_REG, teach SimpleRegisterCoalescing to coalesce SUBREG_TO_REG instructions (which are similar to INSERT_SUBREG instructions), and teach the DAGCombiner to take advantage of this on targets which support it. This eliminates many redundant zero-extension operations on x86-64. This adds a new TargetLowering hook, isZExtFree. It's similar to isTruncateFree, except it only applies to actual definitions, and not no-op truncates which may not zero the high bits. Also, this adds a new optimization to SimplifyDemandedBits: transform operations like x+y into (zext (add (trunc x), (trunc y))) on targets where all the casts are no-ops. In contexts where the high part of the add is explicitly masked off, this allows the mask operation to be eliminated. Fix the DAGCombiner to avoid undoing these transformations to eliminate casts on targets where the casts are no-ops. Also, this adds a new two-address lowering heuristic. Since two-address lowering runs before coalescing, it helps to be able to look through copies when deciding whether commuting and/or three-address conversion are profitable. Also, fix a bug in LiveInterval::MergeInClobberRanges. It didn't handle the case that a clobber range extended both before and beyond an existing live range. In that case, multiple live ranges need to be added. This was exposed by the new subreg coalescing code. Remove 2008-05-06-SpillerBug.ll. It was bugpoint-reduced, and the spiller behavior it was looking for no longer occurs with the new instruction selection. llvm-svn: 68576	2009-04-08 00:15:30 +00:00
Bill Wendling	4aa25b79f9	Temporarily revert r68552. This was causing a failure in the self-hosting LLVM builds. --- Reverse-merging (from foreign repository) r68552 into '.': U test/CodeGen/X86/tls8.ll U test/CodeGen/X86/tls10.ll U test/CodeGen/X86/tls2.ll U test/CodeGen/X86/tls6.ll U lib/Target/X86/X86Instr64bit.td U lib/Target/X86/X86InstrSSE.td U lib/Target/X86/X86InstrInfo.td U lib/Target/X86/X86RegisterInfo.cpp U lib/Target/X86/X86ISelLowering.cpp U lib/Target/X86/X86CodeEmitter.cpp U lib/Target/X86/X86FastISel.cpp U lib/Target/X86/X86InstrInfo.h U lib/Target/X86/X86ISelDAGToDAG.cpp U lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.cpp U lib/Target/X86/AsmPrinter/X86IntelAsmPrinter.cpp U lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.h U lib/Target/X86/AsmPrinter/X86IntelAsmPrinter.h U lib/Target/X86/X86ISelLowering.h U lib/Target/X86/X86InstrInfo.cpp U lib/Target/X86/X86InstrBuilder.h U lib/Target/X86/X86RegisterInfo.td llvm-svn: 68560	2009-04-07 22:35:25 +00:00
Rafael Espindola	1edda06792	Reduce code duplication on the TLS implementation. This introduces a small regression on the generated code quality in the case we are just computing addresses, not loading values. Will work on it and on X86-64 support. llvm-svn: 68552	2009-04-07 21:37:46 +00:00
Mon P Wang	9c186c5d27	Added an x86 dag combine to increase the chances to use a movq for v2i64 on x86-32. llvm-svn: 68368	2009-04-03 02:43:30 +00:00
Chris Lattner	d2eb0a63a1	silence warning in release-asserts build. llvm-svn: 68253	2009-04-01 22:14:45 +00:00
Evan Cheng	d9d6e427d6	i128 shift libcalls are not available on x86. llvm-svn: 68133	2009-03-31 19:38:51 +00:00
Dan Gohman	6b42dfddf4	Reapply 68073, with fixes. EH Landing-pad basic blocks are not entered via fall-through. Don't miss fallthroughs from blocks terminated by conditional branches. Also, move isOnlyReachableByFallthrough out of line. llvm-svn: 68129	2009-03-31 18:39:13 +00:00
Rafael Espindola	9277379fc0	remove unused arguments. llvm-svn: 68109	2009-03-31 16:16:57 +00:00
Bill Wendling	6afae239c2	Really temporarily revert r68073. llvm-svn: 68100	2009-03-31 08:42:40 +00:00
Bill Wendling	b8017e02ca	Oy! When reverting r68073, I added in experimental code. Sorry... llvm-svn: 68099	2009-03-31 08:41:31 +00:00
Bill Wendling	c4b08e5eb0	Revert r68073. It's causing a failure in the Apple-style builds. llvm-svn: 68092	2009-03-31 08:26:26 +00:00
Evan Cheng	885bc6de52	X86 address mode isel tweak. If the base of the address is also used by a CopyToReg (i.e. it's likely live-out), do not fold the sub-expressions into the addressing mode to avoid computing the address twice. The CopyToReg use will be isel'ed to a LEA, re-use it for address instead. This is not yet enabled. llvm-svn: 68082	2009-03-31 01:13:53 +00:00
Dan Gohman	adccd30533	Except in asm-verbose mode, avoid printing labels for blocks that are only reachable via fall-through edges. This dramatically reduces the number of labels printed, and thus also the number of labels the assembler must parse and remember. llvm-svn: 68073	2009-03-30 22:55:17 +00:00
Evan Cheng	a84a318873	When optimizing a mul by immediate into two, the resulting muls should get an x86-specific node to prevent the dag combiner from hacking on them further. llvm-svn: 68066	2009-03-30 21:36:47 +00:00
Anton Korobeynikov	7c5f3c40ca	Do not propagate ELF-specific stuff (data.rel) into other targets. This simplifies code and also ensures correctness. llvm-svn: 68032	2009-03-30 15:27:43 +00:00
Anton Korobeynikov	c247fd396c	Add data.rel stuff llvm-svn: 68031	2009-03-30 15:27:03 +00:00
Rafael Espindola	1f11c3c36f	Use array_lengthof llvm-svn: 67950	2009-03-28 19:02:18 +00:00
Rafael Espindola	6ff3dabbb4	Have only one definition of X86AddrNumOperands. llvm-svn: 67949	2009-03-28 18:55:31 +00:00
Rafael Espindola	c2a17d3022	Make code a bit less brittle by not hardcoding the number of operands in an address in so many places. llvm-svn: 67945	2009-03-28 17:03:24 +00:00
Evan Cheng	fd81c73cde	Optimize some 64-bit multiplication by constants into two lea's or one lea + shl since imulq is slow (latency 5). e.g. x * 40 => shlq $3, %rdi leaq (%rdi,%rdi,4), %rax This has the added benefit of allowing more multiply to be folded into addressing mode. e.g. a * 24 + b => leaq (%rdi,%rdi,2), %rax leaq (%rsi,%rax,8), %rax llvm-svn: 67917	2009-03-28 05:57:29 +00:00
Rafael Espindola	705f2a6cd2	Avoid hardcoding that X86 addresses have 4 operands. llvm-svn: 67848	2009-03-27 15:57:50 +00:00
Rafael Espindola	227815437a	Use less hard coded constants to make the code less brittle. llvm-svn: 67846	2009-03-27 15:45:05 +00:00
Rafael Espindola	e728019392	I am trying to add a segment to the X86 addresses matching to improve TLS support (see http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090309/075220.html), but that code is VERY brittle. This patch just makes it a bit more resistant. llvm-svn: 67843	2009-03-27 15:26:30 +00:00
Evan Cheng	d88ebc352c	-no-implicit-float means explicit fp operations are legal. llvm-svn: 67784	2009-03-26 23:06:32 +00:00
Bill Wendling	aa28be652c	Pull transform from target-dependent code into target-independent code. llvm-svn: 67742	2009-03-26 06:14:09 +00:00
Bill Wendling	94f299f2c5	Match this pattern so that we can generate simpler code: %a = ... %b = and i32 %a, 2 %c = srl i32 %b, 1 %d = br i32 %c, into %a = ... %b = and %a, 2 %c = X86ISD::CMP %b, 0 %d = X86ISD::BRCOND %c ... This applies only when the AND constant value has one bit set and the SRL constant is equal to the log2 of the AND constant. The back-end is smart enough to convert the result into a TEST/JMP sequence. llvm-svn: 67728	2009-03-26 01:47:50 +00:00
Bill Wendling	189d67181c	Doxygen-ify comments. llvm-svn: 67727	2009-03-26 01:46:56 +00:00
Evan Cheng	5e5a63cf8f	CodeGen still defaults to non-verbose asm, but llc now overrides it and defaults to verbose. llvm-svn: 67668	2009-03-25 01:47:28 +00:00
Evan Cheng	9966403e90	Don't print global names twice with -asm-verbose. llvm-svn: 67667	2009-03-25 01:08:42 +00:00
Dan Gohman	efd2d44aa5	I was convinced that it's ok to allow a second i8 return value to be returned in DL. LLVM's multiple-return-value support is not ABI-conforming; front-ends that wish to have code emitted that conforms to an ABI are currently expected to make arrangements for this on their own rather than assuming that multiple-return-values will automatically do the right thing. This commit doesn't fundamentally change this situation. llvm-svn: 67588	2009-03-24 01:04:34 +00:00
Evan Cheng	a774a99245	Do not emit comments unless -asm-verbose. llvm-svn: 67580	2009-03-24 00:17:40 +00:00
Dan Gohman	4a683478d5	Correct some comments. Operand numbers start at 0. llvm-svn: 67518	2009-03-23 15:40:10 +00:00
Evan Cheng	968c3b0d6e	Model an inline asm constraint which ties an input to an output register as a machine operand TIED_TO constraint. This eliminates the need to pre-allocate registers for these. This also allows the register allocator to eliminate the unneeded copies. llvm-svn: 67512	2009-03-23 08:01:15 +00:00
Dan Gohman	772de0ae2d	Fix a grammaro in a comment that Bill noticed. llvm-svn: 67507	2009-03-23 05:02:44 +00:00
Dan Gohman	70d9929def	Add comments explaining why there's only one register for i8 return values. llvm-svn: 67502	2009-03-23 04:28:24 +00:00
Nick Lewycky	bfd4ad67c7	Remove strange extra semicolons. llvm-svn: 67287	2009-03-19 05:51:39 +00:00
Chris Lattner	a6bed3e950	Disable the "call to immediate" optimization on x86-64. It is not safe in general because the immediate could be an arbitrary value that does not fit in a 32-bit pcrel displacement. Conservatively fall back to loading the value into a register and calling through it. We still do the optzn on X86-32. llvm-svn: 67142	2009-03-18 00:43:52 +00:00
Dan Gohman	d6e571b202	Recognize bswapl as bswap too. llvm-svn: 67072	2009-03-17 02:45:40 +00:00
Dan Gohman	77a9279d80	Recognize "bswapq" as an alternate spelling for the bswap instruction. llvm-svn: 67071	2009-03-17 02:17:27 +00:00
Dan Gohman	f98cd1b48a	Use %rip-relative addressing on x86-64 whenever practical, as it has a smaller encoding than absolute addressing. llvm-svn: 67002	2009-03-14 02:33:41 +00:00
Dan Gohman	2293eb6037	Don't forego folding of loads into 64-bit adds when the other operand is a signed 32-bit immediate. Unlike with the 8-bit signed immediate case, it isn't actually smaller to fold a 32-bit signed immediate instead of a load. In fact, it's larger in the case of 32-bit unsigned immediates, because they can be materialized with movl instead of movq. llvm-svn: 67001	2009-03-14 02:07:16 +00:00
Dan Gohman	a62e4ab690	Improve FastISel's handling of truncates to i1, and implement ptrtoint and inttoptr in X86FastISel. These casts aren't always handled in the generic FastISel code because X86 sometimes needs custom code to do truncation and zero-extension. llvm-svn: 66988	2009-03-13 23:53:06 +00:00
Dan Gohman	c0bb959591	Fix FastISel's assumption that i1 values are always zero-extended by inserting explicit zero extensions where necessary. Included is a testcase where SelectionDAG produces a virtual register holding an i1 value which FastISel previously mistakenly assumed to be zero-extended. llvm-svn: 66941	2009-03-13 20:42:20 +00:00
Rafael Espindola	997b74ac61	add 8 and 16 bit TLS moves. add a fixme note on how to remove code duplication. llvm-svn: 66932	2009-03-13 19:39:55 +00:00
Rafael Espindola	71144973f3	Improve sext and zext of TLS variables. llvm-svn: 66922	2009-03-13 18:37:06 +00:00
Chris Lattner	3fb71c8f49	generalize this code so that fast isel handles integer truncates to i1, which codegen to the same thing as integer truncates to i8 (the top bits are just undefined). This implements rdar://6667338 llvm-svn: 66902	2009-03-13 16:36:42 +00:00
Bill Wendling	798fd56d0f	These instructions have special lowering that may lower them to SSE instructions. Prevent that if we don't want implicit uses of SSE. llvm-svn: 66877	2009-03-13 08:41:47 +00:00
Evan Cheng	1fb8aedd1e	Fix some significant problems with constant pools that resulted in unnecessary paddings between constant pool entries, larger than necessary alignments (e.g. 8 byte alignment for .literal4 sections), and potentially other issues. 1. ConstantPoolSDNode alignment field is log2 value of the alignment requirement. This is not consistent with other SDNode variants. 2. MachineConstantPool alignment field is also a log2 value. 3. However, some places are creating ConstantPoolSDNode with alignment value rather than log2 values. This creates entries with artificially large alignments, e.g. 256 for SSE vector values. 4. Constant pool entry offsets are computed when they are created. However, asm printer groups them by sections. That means the offsets are no longer valid. However, asm printer uses them to determine size of padding between entries. 5. Asm printer uses expensive data structure multimap to track constant pool entries by sections. 6. Asm printer iterates over SmallPtrSet when it's emitting constant pool entries. This is non-deterministic. Solutions: 1. ConstantPoolSDNode alignment field is changed to keep non-log2 value. 2. MachineConstantPool alignment field is also changed to keep non-log2 value. 3. Functions that create ConstantPool nodes are passing in non-log2 alignments. 4. MachineConstantPoolEntry no longer keeps an offset field. It's replaced with an alignment field. Offsets are not computed when constant pool entries are created. They are computed on the fly in asm printer and JIT. 5. Asm printer uses cheaper data structure to group constant pool entries. 6. Asm printer computes entry offsets after grouping is done. 7. Change JIT code to compute entry offsets on the fly. llvm-svn: 66875	2009-03-13 07:51:59 +00:00
Chris Lattner	99cc133710	generalize the previous code to use the full generality of LEA for i32/i64 expressions (we could also do i16 on cpus where i16 lea is fast, but I didn't add this). On the example, we now generate: _test: movl 4(%esp), %eax cmpl $42, (%eax) setl %al movzbl %al, %eax leal 4(%eax,%eax,8), %eax ret instead of: _test: movl 4(%esp), %eax cmpl $41, (%eax) movl $4, %ecx movl $13, %eax cmovg %ecx, %eax ret llvm-svn: 66869	2009-03-13 05:53:31 +00:00
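
Two of the commits listed above (r67917 and r66869) replace a multiply or a conditional move with LEA arithmetic. The Python sketch below is not LLVM code; it only models the instruction sequences quoted in those commit messages so the arithmetic can be checked. The helper names (`lea`, `mul40`, `mul24_add`, `select_cmov`, `select_lea`) are hypothetical, and the 64-bit mask is an assumption used for every sequence, including the 32-bit `leal` one. The source of `_test` is not shown in the log, so the selected values 13 and 4 are read off the quoted assembly.

```python
MASK64 = (1 << 64) - 1  # assumption: model 64-bit register wraparound


def lea(base, index, scale, disp=0):
    """Model x86 LEA: disp + base + index*scale, truncated to 64 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 scale must be 1, 2, 4, or 8")
    return (disp + base + index * scale) & MASK64


def mul40(x):
    # r67917: shlq $3, %rdi ; leaq (%rdi,%rdi,4), %rax
    t = (x << 3) & MASK64   # t = 8*x
    return lea(t, t, 4)     # t + 4*t = 40*x


def mul24_add(a, b):
    # r67917: leaq (%rdi,%rdi,2), %rax ; leaq (%rsi,%rax,8), %rax
    t = lea(a, a, 2)        # a + 2*a = 3*a
    return lea(b, t, 8)     # b + 8*(3*a) = a*24 + b


def select_cmov(x):
    # r66869, old sequence: cmpl $41 ; cmovg selects 4 when x > 41, else 13
    return 4 if x > 41 else 13


def select_lea(x):
    # r66869, new sequence: cmpl $42 ; setl ; movzbl ; leal 4(%eax,%eax,8)
    cond = 1 if x < 42 else 0          # setl + movzbl yield 0 or 1
    return lea(cond, cond, 8, disp=4)  # 4 + 9*cond
```

Under these assumptions, `mul40(x)` agrees with `x * 40` modulo 2^64, and `select_lea` returns the same value as `select_cmov` for every integer input, since `x < 42` is the negation of `x > 41`.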


4154 Commits