llvm-project

Commit Graph

Author	SHA1	Message	Date
Jakob Stoklund Olesen	2e22e6a361	%RCX is not a function live-out in eh.return functions. The function live-out registers must be live at all function returns, and %RCX is only used by eh.return. When a function also has a normal return, only %RAX holds a return value. This fixes PR13188. llvm-svn: 159116	2012-06-24 15:53:01 +00:00
NAKAMURA Takumi	704de074b8	llvm/lib: [CMake] Add explicit dependency to intrinsics_gen. llvm-svn: 159112	2012-06-24 13:32:01 +00:00
Craig Topper	fd5e6e7db1	Remove intrinsic specific instructions for (V)CVTPS2DQ and replace with patterns. llvm-svn: 159109	2012-06-24 07:07:16 +00:00
Craig Topper	b925230fb1	Remove intrinsic specific instructions for (V)CVTPS2DQ and replace with patterns. llvm-svn: 159108	2012-06-24 06:55:37 +00:00
Craig Topper	f48ec7a708	Fix build failures from r159106. llvm-svn: 159107	2012-06-24 06:08:31 +00:00
Craig Topper	bab2b89944	Remove intrinsic specific instructions for CVTPD2PS and replace with just patterns. llvm-svn: 159106	2012-06-24 05:44:31 +00:00
Craig Topper	3cee08ce7d	Remove intrinsic specific instructions for CVTPD2DQ. Replace with patterns. llvm-svn: 159105	2012-06-24 05:33:24 +00:00
Pete Cooper	3c680dec8a	Remove code I'd been testing with but didn't mean to commit. Oops llvm-svn: 159094	2012-06-24 00:08:36 +00:00
Pete Cooper	fe212e762f	DAG legalisation can now handle illegal fma vector types by scalarisation llvm-svn: 159092	2012-06-24 00:05:44 +00:00
Craig Topper	a899cc15f1	Remove intrinsic specific instructions for (V)CVTDQ2PS. Use a Pat instead. llvm-svn: 159090	2012-06-23 22:33:14 +00:00
Craig Topper	7e9415220a	Make CVTDQ2PS instruction use SSE2 predicate instead of SSE1. No functional change because there are no patterns in the instructions. Also fix a typo in a comment. llvm-svn: 159087	2012-06-23 20:52:45 +00:00
Craig Topper	24e3418215	Move CVTPD2DQ to use SSE2 predicate instead of SSE3. Move DQ2PD and PD2DQ to the SSE2 section of the file. llvm-svn: 159086	2012-06-23 20:15:42 +00:00
Benjamin Kramer	53ffe55a66	Add a microoptimization note. llvm-svn: 159082	2012-06-23 15:19:31 +00:00
Hans Wennborg	cbe34b4cc9	Extend the IL for selecting TLS models (PR9788) This allows the user/front-end to specify a model that is better than what LLVM would choose by default. For example, a variable might be declared as @x = thread_local(initialexec) global i32 42 if it will not be used in a shared library that is dlopen'ed. If the specified model isn't supported by the target, or if LLVM can make a better choice, a different model may be used. llvm-svn: 159077	2012-06-23 11:37:03 +00:00
Craig Topper	8c03ea79c4	Use correct memory types for (V)CVTDQ2PD instructions. llvm-svn: 159075	2012-06-23 08:30:27 +00:00
Craig Topper	2361cd9897	Silence an unused variable warning on release builds. llvm-svn: 159074	2012-06-23 08:09:30 +00:00
Craig Topper	1cac50bc5e	Compress flags in X86 op folding to reduce space in static tables. llvm-svn: 159073	2012-06-23 08:01:18 +00:00
Craig Topper	d9c7d0dda4	Make helper method static since it doesn't use anything in the class. llvm-svn: 159071	2012-06-23 04:58:41 +00:00
Craig Topper	431f1e7192	Remove intrinsic specific instructions for 128-bit (V)CVTDQ2PD. Replace with intrinsic patterns. Mem forms omitted because the load size is only 64-bits. llvm-svn: 159070	2012-06-23 04:23:36 +00:00
Rafael Espindola	a3088f09b3	Handle aliases to tls variables in all architectures, not just x86. llvm-svn: 159058	2012-06-23 00:30:03 +00:00
Evan Cheng	68c2f9a9a7	(sub X, imm) gets canonicalized to (add X, -imm) There are patterns to handle immediates when they fit in the immediate field. e.g. %sub = add i32 %x, -123 => sub r0, r0, #123 Add patterns to catch immediates that do not fit but should be materialized with a single movw instruction rather than movw + movt pair. e.g. %sub = add i32 %x, -65535 => movw r1, #65535 sub r0, r0, r1 rdar://11726136 llvm-svn: 159057	2012-06-23 00:29:06 +00:00
Jim Grosbach	087affe2f3	ARM: Add a better diagnostic for some out of range immediates. As an example of how the custom DiagnosticType can be used to provide better operand-mismatch diagnostics, add a custom diagnostic for the imm0_15 operand class used for several system instructions. Update the tests to expect the improved diagnostic. rdar://8987109 llvm-svn: 159051	2012-06-22 23:56:48 +00:00
Hal Finkel	460e94d842	Add support for the PPC isel instruction. The isel (integer select) instruction is supported on the 440 and A2 embedded cores and on the POWER7. llvm-svn: 159045	2012-06-22 23:10:08 +00:00
Chad Rosier	f5cdea3d79	Whitespace. llvm-svn: 159035	2012-06-22 22:07:19 +00:00
Hal Finkel	8db5547252	Revert r158679 - use case is unclear (and it increases the memory footprint). Original commit message: Allow up to 64 functional units per processor itinerary. This patch changes the type used to hold the FU bitset from unsigned to uint64_t. This will be needed for some upcoming PowerPC itineraries. llvm-svn: 159027	2012-06-22 20:27:13 +00:00
Andrew Trick	9c302673b2	Use "NoItineraries" for processors with no itineraries. This makes it explicit when ScoreboardHazardRecognizer will be used. "GenericItineraries" would only make sense if it contained real itinerary values and still required ScoreboardHazardRecognizer. llvm-svn: 158963	2012-06-22 03:58:51 +00:00
Jakob Stoklund Olesen	321d41a871	Functions calling __builtin_eh_return must have a frame pointer. The code in X86TargetLowering::LowerEH_RETURN() assumes that a frame pointer exists, but the frame pointer was forced by the presence of llvm.eh.unwind.init which isn't guaranteed. If llvm.eh.unwind.init is actually required in functions calling eh.return (is it?), we should diagnose that instead of emitting bad machine code. This should fix the dragonegg-x86_64-linux-gcc-4.6-test bot. llvm-svn: 158961	2012-06-22 03:04:27 +00:00
Andrew Trick	77d0b88999	ARM scheduling fix: don't guess at implicit operand latency. This is a minor drive-by fix with no robust way to unit test. As an example see neon-div.ll: SU(16): %Q8<def> = VMOVLsv4i32 %D17, pred:14, pred:%noreg, %Q8<imp-use,kill> val SU(1): Latency=2 Reg=%Q8 ...should be latency=1 llvm-svn: 158960	2012-06-22 02:50:33 +00:00
Andrew Trick	3ccb1b8cf9	ARM scheduling fix: compute predicated implicit use properly. Minor drive by fix to cleanup latency computation. Calling getOperandLatency with a deliberately incorrect operand index does not give you the latency you want. llvm-svn: 158959	2012-06-22 02:50:31 +00:00
Lang Hames	b8650f106a	Rename -allow-excess-fp-precision flag to -fuse-fp-ops, and switch from a boolean flag to an enum: { Fast, Standard, Strict } (default = Standard). This option controls the creation by optimizations of fused FP ops that store intermediate results in higher precision than IEEE allows (e.g. FMAs). The behavior of this option is intended to match the behavior specified by a soon-to-be-introduced frontend flag: '-ffuse-fp-ops'. Fast mode - allows formation of fused FP ops whenever they're profitable. Standard mode - allow fusion only for 'blessed' FP ops. At present the only blessed op is the fmuladd intrinsic. In the future more blessed ops may be added. Strict mode - allow fusion only if/when it can be proven that the excess precision won't affect the result. Note: This option only controls formation of fused ops by the optimizers. Fused operations that are explicitly requested (e.g. FMA via the llvm.fma.* intrinsic) will always be honored, regardless of the value of this option. Internally TargetOptions::AllowExcessFPPrecision has been replaced by TargetOptions::AllowFPOpFusion. llvm-svn: 158956	2012-06-22 01:09:09 +00:00
Hal Finkel	0a479ae7d1	Convert the PPC backend to use the new FMA infrastructure. The existing contraction patterns are replaced with fma/fneg. Overall functionality should be the same. llvm-svn: 158955	2012-06-22 00:49:52 +00:00
Akira Hatanaka	765c312314	1. fix null program output after some other changes 2. re-enable null.ll test 3. fix some minor style violations Patch by Reed Kotler. llvm-svn: 158935	2012-06-21 20:39:10 +00:00
Hal Finkel	a86b0f20dd	Treat TargetGlobalAddress as a constant for the purpose of matching pre-inc stores on PPC. Thanks to Tobias von Koch for pointing out this problem. llvm-svn: 158932	2012-06-21 20:10:48 +00:00
Jack Carter	b2fd5f66b4	The inline asm operand modifier 'c' is supposed to be generic across architectures. It has the following description in the gnu sources: Substitute immediate value without immediate syntax Several Architectures such as x86 have local implementations of operand modifier 'c' which go beyond the above description slightly. To make use of the generic modifiers without overriding local implementation one can make a call to the base class method for AsmPrinter::PrintAsmOperand() in the locally derived method's "default" case in the switch statement. That way if it is already defined locally the generic version will never get called. This change is needed when test/CodeGen/generic/asm-large-immediate.ll failed on a native Mips board. The test was assuming a generic implementation was in place. Affected files: lib/Target/Mips/MipsAsmPrinter.cpp: Changed the default case to call the base method. lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp Added 'c' to the switch cases. test/CodeGen/Mips/asm-large-immediate.ll Mips compiled version of the generic one Contributor: Jack Carter llvm-svn: 158925	2012-06-21 17:14:46 +00:00
Lang Hames	90b2a4cbad	Add a missing llvm.fma -> VFNMS pattern to the ARM backend. llvm-svn: 158902	2012-06-21 06:10:00 +00:00
Akira Hatanaka	87505f46ac	Revert r158846. llvm-svn: 158855	2012-06-20 21:19:39 +00:00
Akira Hatanaka	da448fe0b1	In MipsDisassembler.cpp, instead of defining register class tables, use the ones that are generated by TableGen and are already available in MipsGenRegisterInfo.inc. Suggested by Jakob Stoklund Olesen. Also, fix bug in function DecodeAFGR64RegisterClass. Patch by Vladimir Medic. llvm-svn: 158846	2012-06-20 20:39:23 +00:00
Hal Finkel	ca542beffe	Add support for generating reg+reg (indexed) pre-inc loads on PPC. llvm-svn: 158823	2012-06-20 15:43:03 +00:00
Chandler Carruth	5c0997f066	Remove 'static' from inline functions defined in header files. There is a pretty staggering amount of this in LLVM's header files, this is not all of the instances I'm afraid. These include all of the functions that (in my build) are used by a non-static inline (or external) function. Specifically, these issues were caught by the new '-Winternal-linkage-in-inline' warning. I'll try to just clean up the remainder of the clearly redundant "static inline" cases on functions (not methods!) defined within headers if I can do so in a reliable way. There were even several cases of a missing 'inline' altogether, or my personal favorite "static bool inline". Go figure. ;] llvm-svn: 158800	2012-06-20 08:39:33 +00:00
Craig Topper	21d04fc118	Add predicate check around some patterns. llvm-svn: 158797	2012-06-20 07:30:23 +00:00
Craig Topper	3b662a6279	Add predicate check around some patterns. llvm-svn: 158795	2012-06-20 07:01:11 +00:00
Craig Topper	b9e8e18949	Don't insert 128-bit UNDEF into 256-bit vectors. Just keep the 256-bit vector. Original patch by Elena Demikhovsky. Tweaked by me to allow possibility of covering more cases. llvm-svn: 158792	2012-06-20 05:39:26 +00:00
Lang Hames	39fb1d08dc	Add DAG-combines for aggressive FMA formation. This patch adds DAG combines to form FMAs from pairs of FADD + FMUL or FSUB + FMUL. The combines are performed when: (a) Either AllowExcessFPPrecision option (-enable-excess-fp-precision for llc) OR UnsafeFPMath option (-enable-unsafe-fp-math) are set, and (b) TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) is true for the type of the FADD/FSUB, and (c) The FMUL only has one user (the FADD/FSUB). If your target has fast FMA instructions you can make use of these combines by overriding TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) to return true for types supported by your FMA instruction, and adding patterns to match ISD::FMA to your FMA instructions. llvm-svn: 158757	2012-06-19 22:51:23 +00:00
Jakob Stoklund Olesen	0f855e4263	Implement PPCInstrInfo::isCoalescableExtInstr(). The PPC::EXTSW instruction preserves the low 32 bits of its input, just like some of the x86 instructions. Use it to reduce register pressure when the low 32 bits have multiple uses. This requires a small change to PeepholeOptimizer since EXTSW takes a 64-bit input register. This is related to PR5997. llvm-svn: 158743	2012-06-19 21:14:34 +00:00
Jan Wen Voung	7f5d79f864	Have ARM ELF use correct reloc for "b" instr. The condition code didn't actually matter for arm "b" instructions, unlike "bl". It should just use the R_ARM_JUMP24 reloc. llvm-svn: 158722	2012-06-19 16:03:02 +00:00
Hal Finkel	d465810f7c	Mark most PPC register classes to avoid write-after-write. For processors with the G5-like instruction-grouping scheme, this helps avoid early group termination due to a write-after-write dependency within the group. It should also help on pipelined embedded cores. On POWER7, over the test suite, this gives an average 0.5% speedup. The largest speedups are: SingleSource/Benchmarks/Stanford/Quicksort - 33% MultiSource/Applications/d/make_dparser - 21% MultiSource/Benchmarks/FreeBench/analyzer/analyzer - 12% MultiSource/Benchmarks/MiBench/telecomm-FFT/telecomm-fft - 12% Largest slowdowns: SingleSource/Benchmarks/Stanford/Bubblesort - 23% MultiSource/Benchmarks/Prolangs-C++/city/city - 21% MultiSource/Benchmarks/BitBench/uuencode/uuencode - 16% MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec/mpeg2decode - 13% llvm-svn: 158719	2012-06-19 13:57:17 +00:00
Akira Hatanaka	9f96bb8619	Make MipsLongBranch::runOnMachineFunction return true. llvm-svn: 158702	2012-06-19 03:45:29 +00:00
Akira Hatanaka	9846239bbc	Use MachineBasicBlock::instr_iterator instead of MachineBasicBlock::iterator in MipsCodeEmitter.cpp. llvm-svn: 158701	2012-06-19 03:39:45 +00:00
Hal Finkel	1cc27e44a4	Add support for generating reg+reg preinc stores on PPC. PPC will now generate STWUX and friends. llvm-svn: 158698	2012-06-19 02:34:32 +00:00
Rafael Espindola	ca3e0ee8b3	Move the support for using .init_array from ARM to the generic TargetLoweringObjectFileELF. Use this to support it on X86. Unlike ARM, on X86 it is not easy to find out if .init_array should be used or not, so the decision is made via TargetOptions and defaults to off. Add a command line option to llc that enables it. llvm-svn: 158692	2012-06-19 00:48:28 +00:00
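
The immediate selection described in commit 68c2f9a9a7 can be illustrated with a rough sketch. This is not code from the LLVM tree: the function names, the fixed registers r0/r1, and the fallback branch are hypothetical, and the encodability check assumes the ARM-mode "modified immediate" rule (an 8-bit value rotated right by an even amount); Thumb2 uses a different encoding.

```python
def is_arm_modified_imm(value):
    """Return True if value is an 8-bit constant rotated right by an even
    amount, i.e. encodable as an ARM-mode data-processing immediate."""
    value &= 0xFFFFFFFF
    for rot in range(0, 32, 2):
        # Rotating left by `rot` undoes a rotate-right by `rot`.
        rotated = ((value << rot) | (value >> (32 - rot))) & 0xFFFFFFFF
        if rotated < 0x100:
            return True
    return False


def lower_add_negative_imm(neg_imm):
    """Pick an instruction sequence for `add r0, r0, neg_imm` (neg_imm < 0).

    Hypothetical helper: registers are fixed to r0/r1 for illustration.
    """
    imm = -neg_imm
    if is_arm_modified_imm(imm):
        # Fits the immediate field: a single sub with an immediate operand.
        return ["sub r0, r0, #%d" % imm]
    if imm <= 0xFFFF:
        # Does not fit, but a single movw can materialize it.
        return ["movw r1, #%d" % imm, "sub r0, r0, r1"]
    # Assumed fallback for this sketch: build the constant with movw + movt.
    lo, hi = imm & 0xFFFF, (imm >> 16) & 0xFFFF
    return ["movw r1, #%d" % lo, "movt r1, #%d" % hi, "sub r0, r0, r1"]


if __name__ == "__main__":
    for constant in (-123, -65535):
        print(constant, "=>", "; ".join(lower_add_negative_imm(constant)))
```

The two constants in the usage loop are the ones quoted in the commit message: -123 takes the immediate form, while -65535 takes the movw plus sub form.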

21581 Commits