llvm-project

Commit Graph

Author	SHA1	Message	Date
Chris Lattner	4cca620c18	remove two (useless) tests that use incorrect intrinsic prototypes, detected by the new intrinsic verifier. llvm-svn: 157543	2012-05-27 19:31:00 +00:00
Peter Collingbourne	4d358b55fa	Have getOrCreateSubprogramDIE store the DIE for a subprogram definition in the map before calling itself to retrieve the DIE for the declaration. Without this change, if this causes getOrCreateSubprogramDIE to be recursively called on the definition, it will create multiple DIEs for that definition. Fixes PR12831. llvm-svn: 157541	2012-05-27 18:36:44 +00:00
Benjamin Kramer	f2beccf6b4	SelectionDAGBuilder: When emitting small compare chains for switches order them by using edge weights. SimplifyCFG tends to form a lot of 2-3 case switches when merging branches. Move the most likely condition to the front so it is checked first and the others can be skipped. This is currently not as effective as it could be because SimplifyCFG destroys profiling metadata when merging branches and switches. Merging branch weight metadata is tricky though. This code touches at most 3 cases so I didn't use a proper sorting algorithm. llvm-svn: 157521	2012-05-26 20:01:32 +00:00
Duncan Sands	3c05cd3ea8	Since commit 157467, if reassociate isn't actually going to change an expression then it doesn't alter the instructions composing it, however it would continue to move the instructions to just before the expression root. Ensure it doesn't move them either, so now it really does nothing if there is nothing to do. That commit also ensured that nsw etc flags weren't cleared if the expression was not being changed. Tweak this a bit so that it doesn't clear flags on the initial part of a computation either if that part didn't change but later bits did. llvm-svn: 157518	2012-05-26 16:42:52 +00:00
Nuno Lopes	e9b0bdf804	bounds checking: add support for byval arguments llvm-svn: 157498	2012-05-25 21:15:17 +00:00
Justin Holewinski	c98041d4d9	[NVPTX] Add a new test case for the newly-enabled call handling NV_CONTRIB llvm-svn: 157485	2012-05-25 17:20:38 +00:00
Nuno Lopes	a6da3ff896	boundschecking: add support for select add experimental support for alloc_size metadata llvm-svn: 157481	2012-05-25 16:54:04 +00:00
NAKAMURA Takumi	3eca973bf8	test/CodeGen/X86/bigstructret.ll: Suppress one test. It is msvc-incompatible. (compatible to mingw32 and netbsd, though) llvm-svn: 157474	2012-05-25 15:40:54 +00:00
NAKAMURA Takumi	501dbd06ae	test/CodeGen/X86/bigstructret.ll: Relax stack offsets for hosts of stack-align=8, eg. win32 and netbsd. llvm-svn: 157471	2012-05-25 15:12:21 +00:00
Duncan Sands	bddfb2f96b	Make the reassociation pass more powerful so that it can handle expressions with arbitrary topologies (previously it would give up when hitting a diamond in the use graph for example). The testcase from PR12764 is now reduced from a pile of additions to the optimal 1617*%x0+208. In doing this I changed the previous strategy of dropping all uses for expression leaves to one of dropping all but one use. This works out more neatly (but required a bunch of tweaks) and is also safer: some recently fixed bugs during recursive linearization were because the linearization code thinks it completely owns a node if it has no uses outside the expression it is linearizing. But if the node was also in another expression that had been linearized (and thus all uses of the node from that expression dropped) then the conclusion that it is completely owned by the expression currently being linearized is wrong. Keeping one use from within each linearized expression avoids this kind of mistake. llvm-svn: 157467	2012-05-25 12:03:02 +00:00
Eli Friedman	315a0c79f3	Simplify code for calling a function where CanLowerReturn fails, fixing a small bug in the process. llvm-svn: 157446	2012-05-25 00:09:29 +00:00
Jakob Stoklund Olesen	36a5c8e550	Add support for range expressions in TableGen foreach loops. Like this: foreach i = 0-127 in ... Use braces for composite ranges: foreach i = {0-3,9-7} in ... llvm-svn: 157432	2012-05-24 22:17:39 +00:00
Jakob Stoklund Olesen	74fd80e8fc	Don't put TGParser scratch results in the output. Only fully expanded Records should go into RecordKeeper. llvm-svn: 157431	2012-05-24 22:17:36 +00:00
David Blaikie	c575c80c3b	Fix for CHECK-NOT misspelling. Patch by Nicklas Bo Jensen. llvm-svn: 157421	2012-05-24 22:08:29 +00:00
Justin Holewinski	907f7606f2	Remove the PTX back-end and all of its artifacts (triple, etc.) This back-end was deprecated in favor of the NVPTX back-end. NV_CONTRIB llvm-svn: 157417	2012-05-24 21:38:21 +00:00
Owen Anderson	921082b883	Teach tblgen's set theory "sequence" operator to support an optional stride operand. llvm-svn: 157416	2012-05-24 21:37:08 +00:00
Akira Hatanaka	a649cc75b3	Turn on mips16 pseudo op when compiling for mips16. Expand test case for this. Patch by Reed Kotler. llvm-svn: 157410	2012-05-24 18:37:43 +00:00
Akira Hatanaka	df98a7a34d	Enable Mips16 compiler to compile a null program. First code from the Mips16 compiler. Includes trivial test program. Patch by Reed Kotler. llvm-svn: 157408	2012-05-24 18:32:33 +00:00
Tobias Grosser	6b31d170a4	Add half support to LLVM (for OpenCL) Submitted by: Anton Lokhmotov <Anton.Lokhmotov@arm.com> Approved by: o Anton Korobeynikov o Micah Villmow o David Neto llvm-svn: 157393	2012-05-24 15:59:06 +00:00
Stepan Dyatkovskiy	183d18aa5a	PR1255 related changes (case ranges): LowerSwitch::Clusterify : main functionality was replaced with CRSBuilder::optimize, so a big part of Clusterify's code was reduced. test/Transform/LowerSwitch/feature.ll - this test was refactored: grep + count was replaced with FileCheck usage. llvm-svn: 157384	2012-05-24 09:33:20 +00:00
Jakob Stoklund Olesen	41ebcda8f4	Add a test case for global live range splitting. llvm-svn: 157357	2012-05-23 23:42:23 +00:00
Jakob Stoklund Olesen	0ce90494e6	Add a last resort tryInstructionSplit() to RAGreedy. Live ranges with a constrained register class may benefit from splitting around individual uses. It allows the remaining live range to use a larger register class where it may allocate. This is like spilling to a different register class. This is only attempted on constrained register classes. <rdar://problem/11438902> llvm-svn: 157354	2012-05-23 22:37:27 +00:00
Kaelyn Uhrain	4dbe0cd6dc	Fix typo in flag to opt, and also a CHECK-NEXT that doesn't follow a CHECK. The latter error was hidden by the former, and the test harness used by e.g. "make check" silently ignored that opt was printing an error message about an unknown flag instead of running on the test file. llvm-svn: 157341	2012-05-23 20:21:36 +00:00
Jakob Stoklund Olesen	5b8f476037	Correctly deal with identity copies in RegisterCoalescer. Now that the coalescer keeps live intervals and machine code in sync at all times, it needs to deal with identity copies differently. When merging two virtual registers, all identity copies are removed right away. This means that other identity copies must come from somewhere else, and they are going to have a value number. Deal with such copies by merging the value numbers before erasing the copy instruction. Otherwise, we leave dangling value numbers in the live interval. This fixes PR12927. llvm-svn: 157340	2012-05-23 20:21:06 +00:00
Chad Rosier	223faf719c	[arm-fast-isel] Add support for non-global callee. Patch by Jush Lu <jush.msn@gmail.com>. llvm-svn: 157336	2012-05-23 18:38:57 +00:00
Nuno Lopes	10287d839f	BoundsChecking: add a couple of simple tests and fix a bug in branch emission llvm-svn: 157329	2012-05-23 16:24:52 +00:00
Patrik Hägglund	8a1e316c15	Fix the inliner so that the optsize function attribute doesn't alter the inline threshold if the global inline threshold is lower (as for -Oz). Reviewed by Chandler Carruth and Bill Wendling. llvm-svn: 157323	2012-05-23 13:42:57 +00:00
Eric Christopher	c49643586b	Add support for C++11 enum classes in llvm. Part of rdar://11496790 llvm-svn: 157303	2012-05-23 00:09:20 +00:00
Andrew Trick	a7a3de1bcf	LSR fix: add a missing phi check during IV hoisting. Fixes PR12898: SCEVExpander crash. llvm-svn: 157263	2012-05-22 17:39:59 +00:00
Nuno Lopes	ad40c0a425	revert my previous patches that introduced an additional parameter to the objectsize intrinsic. After a lot of discussion, we realized it's not the best option for run-time bounds checking llvm-svn: 157255	2012-05-22 15:25:31 +00:00
Jakob Stoklund Olesen	924279ca0e	Only erase virtregs with no uses left. Also make sure registers aren't erased twice if the dead def mentions the register twice. This fixes PR12911. llvm-svn: 157254	2012-05-22 14:52:12 +00:00
Duncan Sands	4df5e96d3a	Fix PR12858, a crash due to GVN's PRE not fully removing an instruction from the leader table. That's because it wasn't expecting instructions to turn up as leader for a value number that is not its own, but equality propagation could create this situation. One solution is to have the leader table use a WeakVH but this slows down GVN by about 5%. Instead just have equality propagation not add instructions to the leader table, only constants and arguments. In theory this might cause GVN to run more (each time it changes something it runs again) but it doesn't seem to occur enough to cause a slow down. llvm-svn: 157251	2012-05-22 14:17:53 +00:00
Jim Grosbach	da04fa0d02	FileCheck'ize test, and add a bit to test for r157221. llvm-svn: 157222	2012-05-21 23:50:00 +00:00
Craig Topper	e88f2fd4f7	Allow 256-bit shuffles to still be split even if only half of the shuffle comes from two 128-bit pieces. llvm-svn: 157175	2012-05-21 06:40:16 +00:00
Peter Collingbourne	8eb05fd093	When legalising shifts, do not pre-build a list of operands which may be RAUW'd by the recursive call to LegalizeOps; instead, retrieve the other operands when calling UpdateNodeOperands. Fixes PR12889. llvm-svn: 157162	2012-05-20 18:36:15 +00:00
Hal Finkel	601f555eee	Add a missing PPC 64-bit stwu pattern. This seems to fix the remaining compile-time failures on PPC64 when compiling with -enable-ppc-preinc. llvm-svn: 157159	2012-05-20 17:11:24 +00:00
Jakob Stoklund Olesen	691ae3388f	Use the right register class for LDRrs. llvm-svn: 157152	2012-05-20 06:38:47 +00:00
Jakob Stoklund Olesen	4fd0e4f415	Transfer memory operands to the right instruction. They need to go on the PICLDR as the verifier points out. llvm-svn: 157151	2012-05-20 06:38:42 +00:00
Jakob Stoklund Olesen	1f1c6add10	Properly constrain register classes for sub-registers. Not all GR64 registers have sub_8bit sub-registers. llvm-svn: 157150	2012-05-20 06:38:37 +00:00
Jakob Stoklund Olesen	a103a516c6	Properly constrain register classes in 2-addr. X86 has 2-addr instructions with different constraints on the tied def and use operands. One is GR32, one is GR32_NOSP. llvm-svn: 157149	2012-05-20 06:38:32 +00:00
Peter Collingbourne	9a03c73297	Do not pass an invalid domtree to SimplifyInstruction from LoopUnswitch. Fixes PR12887. llvm-svn: 157140	2012-05-20 01:32:09 +00:00
Jakob Stoklund Olesen	a34a69ce0c	Fix 12892. Dead code elimination during coalescing could cause a virtual register to be split into connected components. The following rewriting would be confused about the already joined copies present in the code, but without a corresponding value number in the live range. Erase all joined copies instantly when joining intervals such that the MI and LiveInterval representations are always in sync. llvm-svn: 157135	2012-05-19 23:34:59 +00:00
Peter Collingbourne	97b1076435	Do not eliminate allocas whose alignment exceeds that of the copied-in constant, as a subsequent user may rely on over alignment. Fixes PR12885. llvm-svn: 157134	2012-05-19 22:52:10 +00:00
Jakob Stoklund Olesen	25ced18407	Erase joined copies immediately. The late dead code elimination is no longer necessary. The test changes are caused by a register hint that can be either %rdi or %rax. The choice depends on the use list order, which this patch changes. llvm-svn: 157131	2012-05-19 20:54:07 +00:00
Nadav Rotem	c93e91da27	On Haswell, prefer storing YMM registers using a single instruction. llvm-svn: 157129	2012-05-19 20:30:08 +00:00
Nadav Rotem	900c7cb7ce	Add support for additional in-reg vbroadcast patterns llvm-svn: 157127	2012-05-19 19:57:37 +00:00
Eric Christopher	b5cf66cda2	Actually support DW_TAG_rvalue_reference_type that we were trying to generate out of the front end. rdar://11479676 llvm-svn: 157094	2012-05-19 01:36:37 +00:00
Eric Christopher	bc5d24999c	Add support for the 'd' mips inline asm output modifier. Patch by Jack Carter. llvm-svn: 157093	2012-05-19 00:51:56 +00:00
Andrew Trick	7fa4e0fea6	SCEV: Add MarkPendingLoopPredicates to avoid recursive isImpliedCond. getUDivExpr attempts to simplify by checking for overflow. isLoopEntryGuardedByCond then evaluates the loop predicate which may lead to the same getUDivExpr causing endless recursion. Fixes PR12868: clang 3.2 segmentation fault. llvm-svn: 157092	2012-05-19 00:48:25 +00:00
Dan Gohman	14862c3141	Fix replacing all the users of objc weak runtime routines when deleting them. rdar://11434915. llvm-svn: 157080	2012-05-18 22:17:29 +00:00
Nuno Lopes	ac59380dfd	allow LazyValueInfo::getEdgeValue() to reason about multiple edges from the same switch instruction by doing union of ranges (which may still be conservative, but it's more aggressive than before) llvm-svn: 157071	2012-05-18 21:02:10 +00:00
Jim Grosbach	4b63d2ae1d	Refactor data-in-code annotations. Use a dedicated MachO load command to annotate data-in-code regions. This is the same format the linker produces for final executable images, allowing consistency of representation and use of introspection tools for both object and executable files. Data-in-code regions are annotated via ".data_region"/".end_data_region" directive pairs, with an optional region type. data_region_directive := ".data_region" { region_type } region_type := "jt8" | "jt16" | "jt32" | "jta32" end_data_region_directive := ".end_data_region" The previous handling of ARM-style "$d.*" labels was broken and has been removed. Specifically, it didn't handle ARM vs. Thumb mode when marking the end of the section. rdar://11459456 llvm-svn: 157062	2012-05-18 19:12:01 +00:00
Nuno Lopes	b63d6cdf79	add test case for bugfix in r157032 llvm-svn: 157058	2012-05-18 17:44:58 +00:00
Eric Christopher	9ca26cfb5f	Add support for the mips 'x' inline asm modifier. Patch by Jack Carter. llvm-svn: 157057	2012-05-18 17:39:35 +00:00
Joel Jones	f1c120e9ef	FileCheck-ify, apropos of nothing llvm-svn: 157051	2012-05-18 16:24:01 +00:00
Craig Topper	92db928ee9	Simplify handling of v16i8 shuffles and fix a missed optimization. llvm-svn: 157043	2012-05-18 06:42:06 +00:00
Evan Cheng	22d405f57b	Teach two-address pass to update the "source" map so it doesn't perform a non-profitable commute using outdated info. The test case would still fail because of poor pre-RA schedule. That will be fixed by MI scheduler. rdar://11472010 llvm-svn: 157038	2012-05-18 01:33:51 +00:00
Danil Malyshev	cd492b0a98	Temporarily disabled the MCJIT tests for Darwin, because RuntimeDyldMachO has problems with relocations for 32bit x86. llvm-svn: 157035	2012-05-18 00:30:58 +00:00
Kevin Enderby	badd100c26	Fixed a bug in llvm-objdump when disassembling using -macho option for a binary containing no symbols. Fixed the crash and fixed it not disassembling anything. llvm-svn: 157031	2012-05-18 00:13:56 +00:00
Jakob Stoklund Olesen	874e401382	Remove a test that was only testing for physreg joining. This is the same as the other tests: Clever tricks are required to make the arguments and return value line up in a single-instruction function. It rarely happens in real life. We have plenty other examples of this behavior. llvm-svn: 157030	2012-05-18 00:07:14 +00:00
Jakob Stoklund Olesen	589c6eb95c	Remove -join-physregs from the test suite. This option has been disabled for a while, and it is going away so I can clean up the coalescer code. The tests that required physreg joining to be enabled were almost all of the form "tiny function with interference between arguments and return value". Such functions are usually inlined in the real world. The problem exposed by phys_subreg_coalesce-3.ll is real, but fairly rare. llvm-svn: 157027	2012-05-17 23:44:19 +00:00
Kevin Enderby	f1b225d0e0	Fix the encoding of the armv7m (MClass) for MSR APSR writes which was missing the 0b10 mask encoding bits. Make MSR APSR writes without a _<bits> qualifier an alias for MSR APSR_nzcvq even though ARM has deprecated its use. Also add support for suffixes (_nzcvq, _g, _nzcvqg) for APSR versions. Some FIXMEs in the code for better error checking when versions shouldn't be used. rdar://11457025 llvm-svn: 157019	2012-05-17 22:18:01 +00:00
Danil Malyshev	7c5db45350	- Added ExecutionEngine/MCJIT tests - Added HOST_ARCH to Makefile.config.in The HOST_ARCH will be used by MCJIT tests filter, because MCJIT supported only x86 and ARM architectures now. llvm-svn: 157015	2012-05-17 21:07:47 +00:00
Tim Northover	af501a29d3	Remove incorrect pattern for ARM SMML instruction. Patch by Meador Inge. llvm-svn: 156989	2012-05-17 13:12:13 +00:00
Chandler Carruth	d8c08c2111	Teach the 'opt' tool about '-Os' and '-Oz', corresponding to the Clang options, to enable easier testing of the innards of LLVM that are enabled by such optimization strategies. Note that this doesn't provide the (much needed) function attribute support for -Oz (as opposed to -Os), but still seems like a positive step to better test the logic that Clang currently relies on. Patch by Patrik Hägglund. llvm-svn: 156913	2012-05-16 08:32:49 +00:00
Evan Cheng	58a95f0c8a	Avoid creating a cycle when folding load / op with flag / store. PR11451474. rdar://11451474 llvm-svn: 156896	2012-05-16 01:54:27 +00:00
Jakob Stoklund Olesen	984997b3a0	Enable sub-sub-register copy coalescing. It is now possible to coalesce weird skewed sub-register copies by picking a super-register class larger than both original registers. The included test case produces code like this: vld2.32 {d16, d17, d18, d19}, [r0]! vst2.32 {d18, d19, d20, d21}, [r0] We still perform interference checking as if it were a normal full copy join, so this is still quite conservative. In particular, the f1 and f2 functions in the included test case still have remaining copies because of false interference. llvm-svn: 156878	2012-05-15 23:31:35 +00:00
Kevin Enderby	a414bcc0e3	Add a test case for r156840, a fix to llvm-objdump when disassembling using -macho to disassemble the last symbol to the end of the section. llvm-svn: 156850	2012-05-15 20:20:50 +00:00
Sirish Pande	91856a1f15	Enable all Hexagon tests. llvm-svn: 156824	2012-05-15 16:13:12 +00:00
David Majnemer	a9330fe553	Teach SimplifyLibCalls about stpcpy. llvm-svn: 156815	2012-05-15 11:46:21 +00:00
Jakob Stoklund Olesen	dc2e0cd44a	Fix PR12821. RAFast must add an <imp-def> operand when it is rewriting a sub-register def that isn't a read-modify-write. llvm-svn: 156777	2012-05-14 21:10:25 +00:00
Chad Rosier	a968caf8e0	Move the capture analysis from MemoryDependencyAnalysis to a more general place so that it can be reused in MemCpyOptimizer. This analysis is needed to remove an unnecessary memcpy when returning a struct into a local variable. rdar://11341081 PR12686 llvm-svn: 156776	2012-05-14 20:35:04 +00:00
Brendon Cahoon	f6b687e5d1	Revert 156634 upon request until code improvement changes are made. llvm-svn: 156775	2012-05-14 19:35:42 +00:00
Dan Gohman	164fe18cfe	Rename @llvm.debugger to @llvm.debugtrap. llvm-svn: 156774	2012-05-14 18:58:10 +00:00
Rafael Espindola	47b7dac220	Add support for the .rept directive. Patch by Vladmir Sorokin. I added support for nesting. llvm-svn: 156714	2012-05-12 16:31:10 +00:00
Benjamin Kramer	6bee7f750d	ELF: Add support for the asm .version directive. llvm-svn: 156712	2012-05-12 14:30:47 +00:00
Benjamin Kramer	95d31bcba5	AsmParser: Add support for the .purgem directive. Based on a patch by Team PaX. llvm-svn: 156709	2012-05-12 11:21:46 +00:00
Benjamin Kramer	66b8d4d28f	AsmParser: ignore the .extern directive. llvm-svn: 156707	2012-05-12 11:18:59 +00:00
Benjamin Kramer	e297b9f506	AsmParser: Add support for .ifc and .ifnc directives. Based on a patch from PaX Team. llvm-svn: 156706	2012-05-12 11:18:51 +00:00
Benjamin Kramer	62c18b0881	AsmParser: Add support for .ifb and .ifnb directives. Based on a patch from PaX Team. llvm-svn: 156705	2012-05-12 11:18:42 +00:00
Stepan Dyatkovskiy	0beab5e1cd	Recommitted r156374 with critical fixes in BitcodeReader/Writer: Ordinary patch for PR1255. Added new case-ranges orientated methods for adding/removing cases in SwitchInst. After this patch cases will be internally represented as ConstantArray-s instead of ConstantInt, externally cases wrapped within the ConstantRangesSet object. Old methods of SwitchInst also work well, but are marked as deprecated. So on this stage we have no side effects except that I added support for case ranges in BitcodeReader/Writer, of course test for Bitcode is also added. Old "switch" format is also supported. llvm-svn: 156704	2012-05-12 10:48:17 +00:00
Jay Foad	ca0c499609	Teach Function::hasAddressTaken that BlockAddress doesn't really take the address of a function. llvm-svn: 156703	2012-05-12 08:30:16 +00:00
Sirish Pande	4bd20c50eb	Support for Hexagon feature, New Value Jump. llvm-svn: 156698	2012-05-12 05:10:30 +00:00
Akira Hatanaka	763ab85690	Fix test cases. llvm-svn: 156697	2012-05-12 03:25:16 +00:00
Akira Hatanaka	8f3573034b	Make the following changes in MipsAsmPrinter.cpp: - Remove code which lowers pseudo SETGP01. - Fix LowerSETGP01. The first two of the three instructions that are emitted to initialize the global pointer register now use register $2. - Stop emitting .cpload directive. llvm-svn: 156689	2012-05-12 00:48:43 +00:00
Akira Hatanaka	d918f77ba3	Insert instructions to the entry basic block which initializes the global pointer register. This is the first of the series of patches which clean up the way global pointer register is used. The patches will make the following improvements: - Make $gp an allocatable temporary register rather than reserving it. - Use a virtual register as the global pointer register and let the register allocator decide which register to assign to it or whether spill/reloads are needed. - Make sure $gp is valid at the entry of a called function, which is necessary for functions using lazy binding. - Remove the need for emitting .cprestore and .cpload directives. llvm-svn: 156671	2012-05-12 00:17:17 +00:00
Akira Hatanaka	0661b81bca	Do not replace operands of pseudo instructions with register $zero. llvm-svn: 156663	2012-05-11 23:22:18 +00:00
Akira Hatanaka	5d60c36f37	Use regular expression to match register names. llvm-svn: 156656	2012-05-11 23:00:40 +00:00
Chad Rosier	aa9cb9df59	[fast-isel] Add support for selecting @llvm.trap(). llvm-svn: 156646	2012-05-11 21:33:49 +00:00
Brendon Cahoon	31f8723ef3	Hexagon constant extender support. Patch by Jyotsna Verma. llvm-svn: 156634	2012-05-11 19:56:59 +00:00
Chad Rosier	3268692aa8	[fast-isel] Remove -disable-arm-fast-isel option. -fast-isel=0 suffices. Minor cleanup. llvm-svn: 156632	2012-05-11 19:40:25 +00:00
Chad Rosier	90f9afe659	[fast-isel] Cleaner fix for when we're unable to handle a non-double multi-reg retval. Hoists check before emitting the call to avoid unnecessary work. rdar://11430407 PR12796 llvm-svn: 156628	2012-05-11 18:51:55 +00:00
Nuno Lopes	e2cfd3ce95	objectsize: add a few more tests and fix a bug llvm-svn: 156625	2012-05-11 18:25:29 +00:00
Hans Wennborg	addad7388d	Fix test/CodeGen/X86/tls-pie.ll. llvm-svn: 156612	2012-05-11 10:19:54 +00:00
Hans Wennborg	f9d0e44b82	Implement initial-exec TLS model for 32-bit PIC x86 This fixes a TODO from 2007 :) Previously, LLVM would emit the wrong code here (see the update to test/CodeGen/X86/tls-pie.ll). llvm-svn: 156611	2012-05-11 10:11:01 +00:00
Silviu Baranga	ddc67a7655	Added the missing bit definition for the 4th bit of the STR (post reg) instruction. It is now set to 0. The patch also sets the unpredictable mask for SEL and SXTB-type instructions. llvm-svn: 156609	2012-05-11 09:28:27 +00:00
Silviu Baranga	5a719f9b9a	Fixed the LLVM ARM v7 assembler and instruction printer for 8-bit immediate offset addressing. The assembler and instruction printer were not properly handling the #-0 immediate. llvm-svn: 156608	2012-05-11 09:10:54 +00:00
Eli Friedman	e0a64d83fc	Fix a minor logic mistake transforming compares in instcombine. PR12514. llvm-svn: 156600	2012-05-11 01:32:59 +00:00
Manman Ren	dc8ad0058f	ARM: peephole optimization to remove cmp instruction This patch will optimize the following cases: sub r1, r3 | sub r1, imm cmp r3, r1 or cmp r1, r3 | cmp r1, imm bge L1 TO subs r1, r3 bge L1 or ble L1 If the branch instruction can use flag from "sub", then we can replace "sub" with "subs" and eliminate the "cmp" instruction. rdar: 10734411 llvm-svn: 156599	2012-05-11 01:30:47 +00:00
Dan Gohman	dfab443ae8	Define a new intrinsic, @llvm.debugger. It will be similar to __builtin_trap(), but it generates int3 on x86 instead of ud2. llvm-svn: 156593	2012-05-11 00:19:32 +00:00
Nuno Lopes	f573030391	objectsize: add support for GEPs with non-constant indexes add an additional parameter to InstCombiner::EmitGEPOffset() to force it to not emit operations with NUW flag llvm-svn: 156585	2012-05-10 23:17:35 +00:00
Eric Christopher	ed51b9ec0b	Add support for the 'X' inline asm operand modifier. Patch by Jack Carter. llvm-svn: 156577	2012-05-10 21:48:22 +00:00
Sirish Pande	69295b8963	Hexagon V5 FP Support. llvm-svn: 156568	2012-05-10 20:20:25 +00:00
Dan Gohman	ed7c24e2d9	Teach DeadStoreElimination to eliminate exit-block stores with phi addresses. llvm-svn: 156558	2012-05-10 18:57:38 +00:00
Manman Ren	b555b382bd	Revert: 156550 "ARM: peephole optimization to remove cmp instruction" This commit broke an external linux bot and gave a compile-time warning. llvm-svn: 156556	2012-05-10 18:49:43 +00:00
Nuno Lopes	300d629924	teach DSE and isInstructionTriviallyDead() about calloc llvm-svn: 156553	2012-05-10 17:14:00 +00:00
Joel Jones	7f04344b8b	formatting change: strip debug info from test llvm-svn: 156551	2012-05-10 16:55:31 +00:00
Manman Ren	c860887b2d	ARM: peephole optimization to remove cmp instruction This patch will optimize the following cases: sub r1, r3 | sub r1, imm cmp r3, r1 or cmp r1, r3 | cmp r1, imm bge L1 TO subs r1, r3 bge L1 or ble L1 If the branch instruction can use flag from "sub", then we can replace "sub" with "subs" and eliminate the "cmp" instruction. rdar: 10734411 llvm-svn: 156550	2012-05-10 16:48:21 +00:00
Joel Jones	3d90a9ae65	Fix a problem with incomplete equality testing of PHINodes in Instruction::IsIdenticalToWhenDefined. This manifested itself when inlining two calls to the same function. The inlined function had a switch statement that returned one of a set of global variables. Without this modification, the two phi instructions that chose values from the branches of the switch instruction inlined from the callee were considered equivalent and jump-threading replaced a load for the first switch value with a phi selecting from the second switch, thereby producing incorrect code. This patch has been tested with "make check-all", "lnt runtest nt", and llvm self-hosted, and on the original program that had this problem, wireshark. <rdar://problem/11025519> llvm-svn: 156548	2012-05-10 15:59:41 +00:00
Nadav Rotem	15946e50c1	AVX2: Add an additional broadcast idiom. llvm-svn: 156540	2012-05-10 12:39:13 +00:00
Nadav Rotem	b86a3fb8d0	Generate AVX/AVX2 shuffles even when there is a memory op somewhere else in the program. Starting r155461 we are able to select patterns for vbroadcast even when the load op is used by other users. Fix PR11900. llvm-svn: 156539	2012-05-10 12:22:05 +00:00
Dan Gohman	f8b19d09ba	Fix the objc_storeStrong recognizer to stop before walking off the end of a basic block if there's no store. llvm-svn: 156520	2012-05-09 23:08:33 +00:00
Nuno Lopes	7100f463b0	objectsize: refactor code a bit to enable future changes to support run-time information add support to compute allocation sizes at run-time if penalty > 1 (e.g., malloc(x), calloc(x, y), and VLAs) llvm-svn: 156515	2012-05-09 21:30:57 +00:00
Danil Malyshev	47aba39004	Added a regression test for the bug #9964 before closing it. This bug was fixed by Jim Grosbach in #138879, thanks Jim! llvm-svn: 156505	2012-05-09 19:07:04 +00:00
Nuno Lopes	01547b3ad2	change the objectsize intrinsic signature: add a 3rd parameter to denote the maximum runtime performance penalty that the user is willing to accept. This commit only adds the parameter. Code taking advantage of it will follow. llvm-svn: 156473	2012-05-09 15:52:43 +00:00
Filipe Cabecinhas	5c43305383	Fixed a typo llvm-svn: 156471	2012-05-09 14:43:50 +00:00
Akira Hatanaka	ca41d13bbd	Add another peephole pattern for conditional moves. llvm-svn: 156460	2012-05-09 02:29:29 +00:00
Akira Hatanaka	05b9dad1e6	Make register FP allocatable if the compiled function does not have dynamic allocas. llvm-svn: 156458	2012-05-09 01:38:13 +00:00
Akira Hatanaka	0a8ab718cb	Expand 64-bit shifts if target ABI is O32. llvm-svn: 156457	2012-05-09 00:55:21 +00:00
Dan Gohman	61708d37d6	Fix objc_storeStrong pattern matching to catch a potential use of the old value after the store but before it is released. This fixes rdar:/11116986. llvm-svn: 156442	2012-05-08 23:34:08 +00:00
Eric Christopher	4d25052a9a	Handle OpDeref in case it comes in as a register operand. Part of rdar://11352000 llvm-svn: 156405	2012-05-08 18:56:00 +00:00
Daniel Dunbar	d18888242e	Revert r156393, "[tests] Remove some remaining DejaGNU related cruft.", this patch wasn't ready yet. llvm-svn: 156395	2012-05-08 18:26:07 +00:00
Daniel Dunbar	898f02a613	[tests] Remove some remaining DejaGNU related cruft. llvm-svn: 156393	2012-05-08 18:11:49 +00:00
Duncan Sands	3bbb1d50df	Calling ReassociateExpression recursively is extremely dangerous since it will replace the operands of expressions with only one use with undef and generate a new expression for the original without using RAUW to update the original. Thus any copies of the original expression held in a vector may end up referring to some bogus value - and using a ValueHandle won't help since there is no RAUW. There is already a mechanism for getting the effect of recursion non-recursively: adding the value to be recursed on to RedoInsts. But it wasn't being used systematically. Have various places where recursion had snuck in at some point use the RedoInsts mechanism instead. Fixes PR12169. llvm-svn: 156379	2012-05-08 12:16:05 +00:00
Stepan Dyatkovskiy	5eafce5c88	Rejected r156374: Ordinary PR1255 patch. Due to clang-x86_64-debian-fnt buildbot failure. llvm-svn: 156377	2012-05-08 08:33:21 +00:00
Craig Topper	7daf897678	Remove 256-bit AVX non-temporal store intrinsics. Similar was previously done for 128-bit. llvm-svn: 156375	2012-05-08 06:58:15 +00:00
Stepan Dyatkovskiy	b6a4640163	Ordinary patch for PR1255. Added new case-ranges orientated methods for adding/removing cases in SwitchInst. After this patch cases will be internally represented as ConstantArray-s instead of ConstantInt, externally cases wrapped within the ConstantRangesSet object. Old methods of SwitchInst also work well, but are marked as deprecated. So on this stage we have no side effects except that I added support for case ranges in BitcodeReader/Writer, of course test for Bitcode is also added. Old "switch" format is also supported. llvm-svn: 156374	2012-05-08 06:36:08 +00:00
Owen Anderson	ab63d84252	Teach DAG combine to fold x-x to 0.0 when unsafe FP math is enabled. llvm-svn: 156324	2012-05-07 20:51:25 +00:00
Owen Anderson	f4f80e1f39	Teach reassociate to commute FMul's and FAdd's in order to canonicalize the order of their operands across instructions. This allows for greater CSE opportunities. llvm-svn: 156323	2012-05-07 20:47:23 +00:00
Chad Rosier	d8287fec17	Fix a regression from r147481. This combine should only happen if there is a single use. rdar://11360370 llvm-svn: 156316	2012-05-07 18:47:44 +00:00
Manman Ren	ef4e0479ec	X86: optimization for -(x != 0) This patch will optimize -(x != 0) on X86 FROM cmpl $0x01,%edi sbbl %eax,%eax notl %eax TO negl %edi sbbl %eax %eax In order to generate negl, I added patterns in Target/X86/X86InstrCompiler.td: def : Pat<(X86sub_flag 0, GR32:$src), (NEG32r GR32:$src)>; rdar: 10961709 llvm-svn: 156312	2012-05-07 18:06:23 +00:00
Eric Christopher	9c492e6ebf	Add support for the 'l' constraint. Patch by Jack Carter. llvm-svn: 156294	2012-05-07 06:25:15 +00:00
Eric Christopher	e3c494de82	Add support for the 'c' constraint. Patch by Jack Carter. llvm-svn: 156293	2012-05-07 06:25:10 +00:00
Eric Christopher	c18ae4a3b1	Add support for the 'P' constraint. Patch by Jack Carter. llvm-svn: 156292	2012-05-07 06:25:02 +00:00
Eric Christopher	470578a91b	Add support for the 'O' constraint. Patch by Jack Carter. llvm-svn: 156285	2012-05-07 05:46:48 +00:00
Eric Christopher	e07aa430b8	Add support for the 'N' inline asm constraint. Patch by Jack Carter. llvm-svn: 156284	2012-05-07 05:46:43 +00:00
Eric Christopher	1109b3406d	Add support for the 'L' inline asm constraint. Patch by Jack Carter. llvm-svn: 156283	2012-05-07 05:46:37 +00:00
Eric Christopher	3ff88a05b7	Add support for the inline asm constraint 'K'. llvm-svn: 156282	2012-05-07 05:46:29 +00:00
Craig Topper	d4e1894ec1	Add SSE4A MOVNTSS/MOVNTSD instructions. llvm-svn: 156281	2012-05-07 05:36:19 +00:00
Eric Christopher	7201e1b4b9	Support the 'J' constraint. Patch by Jack Carter. llvm-svn: 156280	2012-05-07 03:13:42 +00:00
Eric Christopher	1d6c89eea1	Add support for the 'I' inline asm constraint. Also add tests from the previous 2 patches. Patch by Jack Carter. llvm-svn: 156279	2012-05-07 03:13:32 +00:00
Benjamin Kramer	3d38c17b59	Switch the select to branch transformation on by default. The primitive conservative heuristic seems to give a slight overall improvement while not regressing stuff. Make it available to wider testing. If you notice any speed regressions (or significant code size regressions) let me know! llvm-svn: 156258	2012-05-06 14:25:16 +00:00
Benjamin Kramer	047d7ca0b1	CodeGenPrepare: Add a transform to turn selects into branches in some cases. This came up when a change in block placement formed a cmov and slowed down a hot loop by 50%: ucomisd (%rdi), %xmm0 cmovbel %edx, %esi cmov is a really bad choice in this context because it doesn't get branch prediction. If we emit it as a branch, an out-of-order CPU can do a better job (if the branch is predicted right) and avoid waiting for the slow load+compare instruction to finish. Of course it won't help if the branch is unpredictable, but those are really rare in practice. This patch uses a dumb conservative heuristic: it turns all cmovs that have one use and a direct memory operand into branches. cmovs usually save some code size, so we disable the transform in -Os mode. In-order architectures are unlikely to benefit as well; those are included in the "predictableSelectIsExpensive" flag. It would be better to reuse branch probability info here, but BPI doesn't support select instructions currently. It would make sense to use the same heuristics as the if-converter pass, which does the opposite direction of this transform. Test suite shows a small improvement here and there on corei7-level machines, but the actual results depend a lot on the used microarchitecture. The transformation is currently disabled by default and available by passing the -enable-cgp-select2branch flag to the code generator. Thanks to Chandler for the initial test case, and to him and Evan Cheng for providing me with comments and test-suite numbers that were more stable than mine :) llvm-svn: 156234	2012-05-05 12:49:22 +00:00
Stepan Dyatkovskiy	cb2a1a34e2	Small fix in InstCombineCasts.cpp. Restored "alloca + bitcast" reducing for case when alloca's size is calculated within the "add/sub/... nsw". Also added fix to 2011-06-13-nsw-alloca.ll test. llvm-svn: 156231	2012-05-05 07:09:40 +00:00
Justin Holewinski	ae556d3ef7	This patch adds a new NVPTX back-end to LLVM which supports code generation for NVIDIA PTX 3.0. This back-end will (eventually) replace the current PTX back-end, while maintaining compatibility with it. The new target machines are: nvptx (old ptx32) => 32-bit PTX nvptx64 (old ptx64) => 64-bit PTX The sources are based on the internal NVIDIA NVPTX back-end, and contain more functionality than the current PTX back-end currently provides. NV_CONTRIB llvm-svn: 156196	2012-05-04 20:18:50 +00:00
Sebastian Pop	2420e8b7d5	Added missing CMN case in Thumb2SizeReduction pass so that LLVM emits 16-bits encoding of CMN instructions. llvm-svn: 156195	2012-05-04 19:53:56 +00:00
Craig Topper	42f2182366	Allow v16i16 and v32i8 shuffles to be rewritten as narrower shuffles. llvm-svn: 156156	2012-05-04 04:44:49 +00:00
Kevin Enderby	914223010c	Fix issues with the ARM bl and blx thumb instructions and the J1 and J2 bits for the assembler and disassembler. Which were not being set/read correctly for offsets greater than 22 bits in some cases. Changes to lib/Target/ARM/ARMAsmBackend.cpp from Gideon Myles! llvm-svn: 156118	2012-05-03 22:41:56 +00:00
Nuno Lopes	d4cf35d775	remove calls to calloc if the allocated memory is not used (it was already being done for malloc) fix a few typos found by Chad in my previous commit llvm-svn: 156110	2012-05-03 22:08:19 +00:00
Sirish Pande	f8e5e3c072	Support for target dependent Hexagon VLIW packetizer. This patch creates and optimizes packets as per Hexagon ISA rules. llvm-svn: 156109	2012-05-03 21:52:53 +00:00
Nuno Lopes	d2b71e7fa9	add support for calloc to objectsize lowering llvm-svn: 156102	2012-05-03 21:19:58 +00:00
Silviu Baranga	9560af848c	Fixed disassembler for vstm/vldm ARM VFP instructions. llvm-svn: 156077	2012-05-03 16:38:40 +00:00
Craig Topper	315a5cc789	Fix 256-bit vpshuflw and vpshufhw immediate encoding to handle undefs in the lower half correctly. Missed in r155982. llvm-svn: 156059	2012-05-03 07:12:59 +00:00
Evan Cheng	b64e7b778b	Fix two-address pass's aggressive instruction commuting heuristics. It's meant to catch cases like: %reg1024<def> = MOV r1 %reg1025<def> = MOV r0 %reg1026<def> = ADD %reg1024, %reg1025 r0 = MOV %reg1026 By commuting ADD, it lets the coalescer eliminate all of the copies. However, there was a bug in the heuristics where it ended up commuting the ADD in: %reg1024<def> = MOV r0 %reg1025<def> = MOV 0 %reg1026<def> = ADD %reg1024, %reg1025 r0 = MOV %reg1026 That brought no benefit and instead ensured the last MOV would not be coalesced. rdar://11355268 llvm-svn: 156048	2012-05-03 01:45:13 +00:00
Owen Anderson	41b0665b5b	Teach DAGCombine the same multiply-by-1.0 folding trick when doing FMAs, just like it now knows for FMULs. llvm-svn: 156029	2012-05-02 22:17:40 +00:00
Owen Anderson	b5f167c660	Teach DAG combine that multiplication by 1.0 can always be constant folded. llvm-svn: 156023	2012-05-02 21:32:35 +00:00
Jim Grosbach	28b0b7279e	ARM: Add missing two-operand VBIC aliases. llvm-svn: 156019	2012-05-02 21:11:56 +00:00
Manman Ren	f02efc8731	Revert r155853 The commit is intended to fix rdar://10961709. But it is the root cause of PR12720. Revert it for now. llvm-svn: 155992	2012-05-02 15:24:32 +00:00
Bill Wendling	274ba89d77	The value held in the vector may be RAUW'ed by some of the canonicalization methods. Use a weak value handle to keep up with this. PR12245 llvm-svn: 155984	2012-05-02 09:59:45 +00:00
Richard Barton	0fc56890ba	Disallow YIELD and other allocated nop hints in pre-ARMv6 architectures. llvm-svn: 155983	2012-05-02 09:43:18 +00:00
Craig Topper	c73bc39c22	Add support for selecting AVX2 vpshuflw and vpshufhw. Add decoding support for AsmPrinter. llvm-svn: 155982	2012-05-02 08:03:44 +00:00
Bill Wendling	b6b50c6638	Strip the pointer casts off of allocas so that the selection DAG can find them. PR10799 llvm-svn: 155954	2012-05-01 22:50:45 +00:00
Jim Grosbach	1d20efb837	ARM: Add a few missing add->sub aliases w/ 'w' suffix. Aliases for adding a negative immediate when using an explicit 'w' suffix. E.g., adds.w r2, #-16 adds.w r2, r2, #-16 addw r2, #-16 addw r2, #-16 addw r2, r2, #-16 rdar://11330769 llvm-svn: 155946	2012-05-01 21:17:34 +00:00
Jim Grosbach	70bed4faaf	ARM: allow vanilla expressions for movw/movt. Expressions for movw/movt don't always have an :upper16: or :lower16: on them and that's ok. When they don't, it's just a plain [0-65536] immediate result, effectively the same as a :lower16: variant kind. rdar://10550147 llvm-svn: 155941	2012-05-01 20:43:21 +00:00
Jim Grosbach	758e0cc94a	MC: Unknown assembler directives are now hard errors. Previously, an unsupported/unknown assembler directive issued a warning. That's generally unsafe, and inconsistent with the behaviour of pretty much every system assembler. Now that the MC assemblers are mature enough to be the default on multiple targets, it's reasonable to issue errors for these. For target or platform directives that need to stay warnings, we should add explicit handlers for them in, e.g., ELFAsmParser.cpp, DarwinAsmParser.cpp, et. al., and issue the warning there. rdar://9246275 llvm-svn: 155926	2012-05-01 18:38:27 +00:00
Manman Ren	425a55c1ce	X86: optimization for max-like struct This patch will optimize the following cases on X86 (a > b) ? (a-b) : 0 (a >= b) ? (a-b) : 0 (b < a) ? (a-b) : 0 (b <= a) ? (a-b) : 0 FROM movl %edi, %ecx subl %esi, %ecx cmpl %edi, %esi movl $0, %eax cmovll %ecx, %eax TO xorl %eax, %eax subl %esi, %edi cmovll %eax, %edi movl %edi, %eax rdar: 10734411 llvm-svn: 155919	2012-05-01 17:16:15 +00:00
Alexey Samsonov	c4b3ad8195	X86: Use StackRegister instead of FrameRegister in getFrameIndexReference (to generate debug info for local variables) if stack needs realignment llvm-svn: 155917	2012-05-01 15:16:06 +00:00
Jay Foad	8fc810c2ef	Regression test for PR2960. llvm-svn: 155912	2012-05-01 11:11:34 +00:00
Nick Lewycky	78ee67e814	An instruction in a loop is not guaranteed to be executed just because the loop has no exit blocks. Fixes PR12706! llvm-svn: 155884	2012-05-01 04:03:01 +00:00
Lang Hames	3a90fabd85	Add support for llvm.arm.neon.vmull* intrinsics to InstCombine. Fixes <rdar://problem/11291436>. This is a second attempt at a fix for this, the first was r155468. Thanks to Chandler, Bob and others for the feedback that helped me improve this. llvm-svn: 155866	2012-05-01 00:20:38 +00:00
Manman Ren	4f4d5c8fc8	X86: optimization for -(x != 0) This patch will optimize -(x != 0) on X86 FROM cmpl $0x01,%edi sbbl %eax,%eax notl %eax TO negl %edi sbbl %eax %eax llvm-svn: 155853	2012-04-30 22:51:25 +00:00
Manman Ren	5b7e08c9d8	test/CodeGen/X86/select.ll: remove spaces llvm-svn: 155840	2012-04-30 18:54:27 +00:00
Derek Schuff	b051adf263	Fix fastcc structure return with fast-isel on x86-32 On x86-32, structure return via sret lets the callee pop the hidden pointer argument off the stack, which the caller then re-pushes. However if the calling convention is fastcc, then a register is used instead, and the caller should not adjust the stack. This is implemented with a check of IsTailCallConvention X86TargetLowering::LowerCall but is now checked properly in X86FastISel::DoSelectCall. (this time, actually commit what was reviewed!) llvm-svn: 155825	2012-04-30 16:57:15 +00:00
Bob Wilson	9245c93656	Don't introduce illegal types when creating vmull operations. <rdar://11324364> ARM BUILD_VECTORs created after type legalization cannot use i8 or i16 operands, since those types are not legal. Instead use i32 operands, which will be implicitly truncated by the BUILD_VECTOR to match the element type. llvm-svn: 155824	2012-04-30 16:53:34 +00:00
Duncan Sands	34c4869cf6	Just mark the sign bit as known zero, rather than any other irrelevant bits known zero in the LHS. Fixes PR12541. llvm-svn: 155818	2012-04-30 11:56:58 +00:00
Bill Wendling	bf4b9afbeb	Second attempt at PR12573: Allow the "SplitCriticalEdge" function to split the edge to a landing pad. If the pass is sure that it thinks it knows what it's doing, then it may go ahead and specify that the landing pad can have its critical edge split. The loop unswitch pass is one of these passes. It will split the critical edges of all edges coming from a loop to a landing pad not within the loop. Doing so will retain important loop analysis information, such as loop simplify. llvm-svn: 155817	2012-04-30 10:44:54 +00:00
Rafael Espindola	dd48931461	Make sure HoistInsertPosition finds a position that is dominated by all inputs. llvm-svn: 155809	2012-04-30 03:53:06 +00:00
Andrew Trick	833f04962a	Reapply 155668: Fix the SD scheduler to avoid gluing the same node twice. This time, also fix the caller of AddGlue to properly handle incomplete chains. AddGlue had failure modes, but shamefully hid them from its caller. Its luck ran out. Fixes rdar://11314175: BuildSchedUnits assert. llvm-svn: 155749	2012-04-28 01:03:23 +00:00
Jim Grosbach	c6f32b3295	ARM: Thumb add(sp plus register) asm constraints. Make sure when parsing the Thumb1 sp+register ADD instruction that the source and destination operands match. In thumb2, just use the wide encoding if they don't. In Thumb1, issue a diagnostic. rdar://11219154 llvm-svn: 155748	2012-04-27 23:51:36 +00:00
Derek Schuff	a99b168145	Revert r155745 llvm-svn: 155746	2012-04-27 23:37:41 +00:00
Derek Schuff	bbf8b83e90	Fix fastcc structure return with fast-isel on x86-32 On x86-32, structure return via sret lets the callee pop the hidden pointer argument off the stack, which the caller then re-pushes. However if the calling convention is fastcc, then a register is used instead, and the caller should not adjust the stack. This is implemented with a check of IsTailCallConvention X86TargetLowering::LowerCall but is now checked properly in X86FastISel::DoSelectCall. llvm-svn: 155745	2012-04-27 23:27:17 +00:00
Andrew Trick	7a773ec053	Temporarily revert r155668: Fix the SD scheduler to avoid gluing. This definitely caused regression with ARM -mno-thumb. llvm-svn: 155743	2012-04-27 22:55:59 +00:00
Chad Rosier	32c2178ef3	Add x86-specific DAG combine to simplify: x == -y --> x+y == 0 x != -y --> x+y != 0 On x86, the generated code goes from negl %esi cmpl %esi, %edi je .LBB0_2 to addl %esi, %edi je .L4 This case is correctly handled for ARM with "cmn". Patch by Manman Ren. rdar://11245199 PR12545 llvm-svn: 155739	2012-04-27 22:33:25 +00:00
Evan Cheng	73fd08d5bd	Make test less fragile. llvm-svn: 155732	2012-04-27 20:48:18 +00:00
Hal Finkel	27c3246169	Don't vectorize target-specific types (ppc_fp128, x86_fp80, etc.). Target specific types should not be vectorized. As a practical matter, these types are already register matched (at least in the x86 case), and codegen does not always work correctly (at least in the ppc case, and this is not worth fixing because ppc_fp128 is currently broken and will probably go away soon). llvm-svn: 155729	2012-04-27 19:34:00 +00:00
Lang Hames	ea001225c1	Fix the order of the operands in the llvm.fma intrinsic patterns for ARM, <rdar://problem/11325085>. llvm-svn: 155724	2012-04-27 18:51:24 +00:00
Dan Gohman	1ccecdb2fd	Reapply r155682, making constant folding more consistent, with a fix to work properly with how the code handles all-undef PHI nodes. llvm-svn: 155721	2012-04-27 17:50:22 +00:00
Richard Barton	82f95ea2ad	Fix ARM assembly parsing for upper case condition codes on IT instructions. llvm-svn: 155720	2012-04-27 17:34:01 +00:00
Benjamin Kramer	6cff5ad411	Missed some register numbers. llvm-svn: 155706	2012-04-27 12:21:46 +00:00
Benjamin Kramer	b1a17c425a	Update edis test for r155704. llvm-svn: 155705	2012-04-27 12:14:03 +00:00
Benjamin Kramer	913da4b261	X86: Don't emit conditional floating point moves when targeting pre-pentiumpro architectures. * Model FPSW (the FPU status word) as a register. * Add ISel patterns for the FUCOM, FNSTSW and SAHF instructions. During Legalize/Lowering, build a node sequence to transfer the comparison result from FPSW into EFLAGS. If you're wondering about the right-shift: That's an implicit sub-register extraction (%ax -> %ah) which is handled later on by the instruction selector. Fixes PR6679. Patch by Christoph Erhardt! llvm-svn: 155704	2012-04-27 12:07:43 +00:00
NAKAMURA Takumi	6008dfdb70	Revert r155682, "Use ConstantExpr::getExtractElement when constant-folding vectors" It broke stage2 build. stage1/clang sometimes crashed. llvm-svn: 155699	2012-04-27 07:59:20 +00:00
Kostya Serebryany	a1259778b4	[tsan] Atomic support for ThreadSanitizer, patch by Dmitry Vyukov llvm-svn: 155698	2012-04-27 07:31:53 +00:00
Craig Topper	e57b49ee16	Add mcpu to tests to prevent them from using AVX instructions on Sandy Bridge after r155618. llvm-svn: 155696	2012-04-27 07:11:58 +00:00
Evan Cheng	1ec87ee096	Implement a bastardized ABI. llvm-svn: 155686	2012-04-27 02:11:10 +00:00
Evan Cheng	f52003de56	- thumbv6 shouldn't imply +thumb2. Cortex-M0 doesn't support 32-bit Thumb2 instructions. - However, it does support dmb, dsb, isb, mrs, and msr. rdar://11331541 llvm-svn: 155685	2012-04-27 01:27:19 +00:00
Dan Gohman	90f3798f26	Use ConstantExpr::getExtractElement when constant-folding vectors instead of getAggregateElement. This has the advantage of being more consistent and allowing higher-level constant folding to proceed even if an inner extract element cannot be folded. Make ConstantFoldInstruction call ConstantFoldConstantExpression on the instruction's operands, making it more consistent with ConstantFoldConstantExpression itself. This makes sure that ConstantExprs get TargetData-aware folding before being handed off as operands for further folding. This causes more expressions to be folded, but due to a known shortcoming in constant folding, this currently has the side effect of stripping a few more nuw and inbounds flags in the non-targetdata side of constant-fold-gep.ll. This is mostly harmless. This fixes rdar://11324230. llvm-svn: 155682	2012-04-27 00:54:36 +00:00
Chad Rosier	7813dcee30	Add instcombine patterns for the following transformations: (x & y) | (x ^ y) -> x | y (x & y) + (x ^ y) -> x | y Patch by Manman Ren. rdar://10770603 llvm-svn: 155674	2012-04-26 23:29:14 +00:00
Andrew Trick	03fa574af5	Fix the SD scheduler to avoid gluing the same node twice. DAGCombine strangeness may result in multiple loads from the same offset. They both may try to glue themselves to another load. We could insist that the redundant loads glue themselves to each other, but the better fix is to bail out from bad gluing at the time we detect it. Fixes rdar://11314175: BuildSchedUnits assert. llvm-svn: 155668	2012-04-26 21:48:25 +00:00
Tim Northover	3de97b7a86	Use VLD1 in NEON extending-load patterns instead of VLDR. On some cores it's a bad idea for performance to mix VFP and NEON instructions and since these patterns are NEON anyway, the NEON load should be used. llvm-svn: 155630	2012-04-26 08:46:29 +00:00
Chandler Carruth	739ef80fd7	Teach the reassociate pass to fold chains of multiplies with repeated elements to minimize the number of multiplies required to compute the final result. This uses a heuristic to attempt to form near-optimal binary exponentiation-style multiply chains. While there are some cases it misses, it seems to do at least a decent job on a very diverse range of inputs. Initial benchmarks show no interesting regressions, and an 8% improvement on SPASS. Let me know if any other interesting results (in either direction) crop up! Credit to Richard Smith for the core algorithm, and helping code the patch itself. llvm-svn: 155616	2012-04-26 05:30:30 +00:00
Evan Cheng	8a8e9d1b63	Specify cpu to unbreak tests. llvm-svn: 155604	2012-04-26 01:38:10 +00:00
Evan Cheng	9f7ad310b5	If triple is armv7 / thumbv7 and a CPU is specified, do not automatically assume the feature set of v7a. This comes about if the user specifies something like -arch armv7 -mcpu=cortex-m3. We shouldn't be generating instructions such as uxtab in this case. rdar://11318438 llvm-svn: 155601	2012-04-26 01:13:36 +00:00
Jakob Stoklund Olesen	6eeeb7e19c	Try to fix llvm-arm-linux builder with -mcpu. llvm-svn: 155589	2012-04-25 21:22:33 +00:00
Preston Gurd	82cac0acc0	Trivial change to make the test use -mcpu=generic so as to avoid a failure if run on an Intel Atom with post RA instruction scheduling. llvm-svn: 155587	2012-04-25 21:04:54 +00:00
Chandler Carruth	eeb9e5810a	Actually delete now-empty file. llvm-svn: 155532	2012-04-25 02:30:00 +00:00
Lang Hames	2fd0c69125	Reverting r155468. Chris and Chandler have convinced me that it's dangerous and in poor taste. Talking through some alternate solutions with Chandler. llvm-svn: 155530	2012-04-25 02:16:54 +00:00
Akira Hatanaka	2020e27d6d	Do not use $gp as a dedicated global register if the target ABI is not O32. llvm-svn: 155522	2012-04-25 01:24:52 +00:00
Jim Grosbach	5117ef7453	ARM: improved assembler diagnostics for missing CPU features. When an instruction match is found, but the subtarget features it requires are not available (missing floating point unit, or thumb vs arm mode, for example), issue a diagnostic that identifies what the feature mismatch is. rdar://11257547 llvm-svn: 155499	2012-04-24 22:40:08 +00:00
Nadav Rotem	450d69a5ee	ConstantFoldSelectInstruction swapped the operands of the select. Fix 12592. Patch by Matt Pharr. llvm-svn: 155480	2012-04-24 20:18:49 +00:00
Nadav Rotem	d50c3b2c57	Fix the testcase. We do expect two vblendw on XMMs. llvm-svn: 155477	2012-04-24 19:57:38 +00:00
Nadav Rotem	edef71790b	Add a testcase for 155440 llvm-svn: 155475	2012-04-24 19:45:28 +00:00
Evan Cheng	2d14d8aca1	MachineBasicBlock::SplitCriticalEdge() should follow LLVM IR variant and refuse to break edge to EH landing pad. rdar://11300144 llvm-svn: 155470	2012-04-24 19:06:55 +00:00
Lang Hames	84531c2b5f	Add support for llvm.arm.neon.vmull* intrinsics to InstCombine. This fixes <rdar://problem/11291436>. llvm-svn: 155468	2012-04-24 18:58:36 +00:00
Chandler Carruth	aacb8a5809	Fix a crash on valid (if UB) bitcode that is produced for some global constants in C++11 mode. I have no idea why it required such particular circumstances to get here, the code seems clearly to rely upon unchecked assumptions. Specifically, when we decide to form an index into a struct type, we may have gone through (at least one) zero-length array indexing round, which would have left the offset un-adjusted, and thus not necessarily valid for use when indexing the struct type. This is just a canonicalization step, so the correct thing is to refuse to canonicalize nonsensical GEPs of this form. Implemented, and test case added. Fixes PR12642. Pair debugged and coded with Richard Smith. =] I credit him with most of the debugging, and preventing me from writing the wrong code. llvm-svn: 155466	2012-04-24 18:42:47 +00:00
Kevin Enderby	70be447e5c	Add missing test cases for ARM VLD3 (single 3-element structure to all lanes) instructions. llvm-svn: 155453	2012-04-24 17:45:56 +00:00
Kevin Enderby	c8d223e41e	Add missing test cases for ARM VLD4 (single 4-element structure to all lanes) instructions. llvm-svn: 155444	2012-04-24 15:55:00 +00:00
Nadav Rotem	aa3ff8da00	AVX: We lower VECTOR_SHUFFLE and BUILD_VECTOR nodes into vbroadcast instructions using the pattern (vbroadcast (i32load src)). In some cases, after we generate this pattern new users are added to the load node, which prevent the selection of the blend pattern. This commit provides fallback patterns which perform in-vector broadcast (using in-vector vbroadcast in AVX2 and pshufd on AVX1). llvm-svn: 155437	2012-04-24 11:07:03 +00:00
Bill Wendling	1981c0e533	FileCheck-ize tests. llvm-svn: 155434	2012-04-24 10:45:44 +00:00
Bill Wendling	4cf911c0cd	FileCheck-ize these tests. llvm-svn: 155433	2012-04-24 10:36:42 +00:00
Bill Wendling	cd6df16cb4	FileCheck-ize these tests. Harden some of them. llvm-svn: 155432	2012-04-24 09:15:38 +00:00
Nadav Rotem	3f8acfc3c4	Optimize the vector UINT_TO_FP, SINT_TO_FP and FP_TO_SINT operations where the integer type is i8 (commonly used in graphics). llvm-svn: 155397	2012-04-23 21:53:37 +00:00
Preston Gurd	9a0914753a	This patch fixes a problem which arose when using the Post-RA scheduler on X86 Atom. Some of our tests failed because the tail merging part of the BranchFolding pass was creating new basic blocks which did not contain live-in information. When the anti-dependency code in the Post-RA scheduler ran, it would sometimes rename the register containing the function return value because the fact that the return value was live-in to the subsequent block had been lost. To fix this, it is necessary to run the RegisterScavenging code in the BranchFolding pass. This patch makes sure that the register scavenging code is invoked in the X86 subtarget only when post-RA scheduling is being done. Post RA scheduling in the X86 subtarget is only done for Atom. This patch adds a new function to the TargetRegisterClass to control whether or not live-ins should be preserved during branch folding. This is necessary in order for the anti-dependency optimizations done during the PostRASchedulerList pass to work properly when doing Post-RA scheduling for the X86 in general and for the Intel Atom in particular. The patch adds and invokes the new function trackLivenessAfterRegAlloc() instead of using the existing requiresRegisterScavenging(). It changes BranchFolding.cpp to call trackLivenessAfterRegAlloc() instead of requiresRegisterScavenging(). It changes all the targets that implemented requiresRegisterScavenging() to also implement trackLivenessAfterRegAlloc(). It adds an assertion in the Post RA scheduler to make sure that post RA liveness information is available when it is needed. It changes the X86 break-anti-dependencies test to use -mcpu=atom, in order to avoid running into the added assertion. Finally, this patch restores the use of anti-dependency checking (which was turned off temporarily for the 3.1 release) for Intel Atom in the Post RA scheduler. Patch by Andy Zhang! Thanks to Jakob and Anton for their reviews. llvm-svn: 155395	2012-04-23 21:39:35 +00:00
Jim Grosbach	f6371b5238	ARM: Add testcases for two-operand variants of VSRA/VRSRA/VSRI. llvm-svn: 155391	2012-04-23 21:00:47 +00:00
Jim Grosbach	76cdd136bf	Add ARM mode tests for the NEON vector shift-accumulate tests. llvm-svn: 155390	2012-04-23 21:00:44 +00:00
Jim Grosbach	5c7e9e5e1b	Tidy up. Reformat for ease of reading. llvm-svn: 155389	2012-04-23 21:00:42 +00:00
Chandler Carruth	3c3bb55a85	Revert r155365, r155366, and r155367. All three of these have regression test suite failures. The failures occur at each stage, and only get worse, so I'm reverting all of them. Please resubmit these patches, one at a time, after verifying that the regression test suite passes. Never submit a patch without running the regression test suite. llvm-svn: 155372	2012-04-23 18:25:57 +00:00
Sirish Pande	a3f8ba2439	Hexagon V5 (floating point) support. llvm-svn: 155367	2012-04-23 17:49:40 +00:00
Sirish Pande	2c7bf00fba	Support for Hexagon architectural feature, new value jump. llvm-svn: 155366	2012-04-23 17:49:28 +00:00
Sirish Pande	6cd2251598	Support for Hexagon VLIW Packetizer. llvm-svn: 155365	2012-04-23 17:49:20 +00:00
Jakob Stoklund Olesen	43bcb970e5	Reapply r155136 after fixing PR12599. Original commit message: Defer some shl transforms to DAGCombine. The shl instruction is used to represent multiplication by a constant power of two as well as bitwise left shifts. Some InstCombine transformations would turn an shl instruction into a bit mask operation, making it difficult for later analysis passes to recognize the constant multiplication. Disable those shl transformations, deferring them to DAGCombine time. An 'shl X, C' instruction is now treated mostly the same way as 'mul X, C'. These transformations are deferred: (X >>? C) << C --> X & (-1 << C) (When X >> C has multiple uses) (X >>? C1) << C2 --> X << (C2-C1) & (-1 << C2) (When C2 > C1) (X >>? C1) << C2 --> X >>? (C1-C2) & (-1 << C2) (When C1 > C2) The corresponding exact transformations are preserved, just like div-exact + mul: (X >>?,exact C) << C --> X (X >>?,exact C1) << C2 --> X << (C2-C1) (X >>?,exact C1) << C2 --> X >>?,exact (C1-C2) The disabled transformations could also prevent the instruction selector from recognizing rotate patterns in hash functions and cryptographic primitives. I have a test case for that, but it is too fragile. llvm-svn: 155362	2012-04-23 17:39:52 +00:00
Elena Demikhovsky	6c6cdec3de	cleaned line endings in the newly added test file llvm-svn: 155315	2012-04-22 13:22:48 +00:00
Chandler Carruth	8ffa7c8afd	Tidy up this test more: 1) Make the checked assertions a bit more precise. We really want the canonical forms coming out of reassociate to be exactly what is expected. 2) Remove other passes, and switch the test to actually directly check that reassociate makes the important transforms and canonicalizations. 3) Fold in a related test case now that we're using FileCheck. Make the same tidying changes to it. llvm-svn: 155311	2012-04-22 10:11:26 +00:00
Chandler Carruth	f6f57535ed	FileCheck-ize a test, and tidy it up a touch. llvm-svn: 155310	2012-04-22 10:11:23 +00:00
Elena Demikhovsky	8d7e56c409	ZERO_EXTEND/SIGN_EXTEND/TRUNCATE optimization for AVX2 llvm-svn: 155309	2012-04-22 09:39:03 +00:00
Nadav Rotem	31caa27bf5	Teach getVectorTypeBreakdown about promotion of vectors in addition to widening of vectors. llvm-svn: 155296	2012-04-21 20:08:32 +00:00
Jakob Stoklund Olesen	d114da6004	Fix PR12599. The X86 target is editing the selection DAG while isel is selecting nodes following a topological ordering. When the DAG hacking triggers CSE, nodes can be deleted and bad things happen. llvm-svn: 155257	2012-04-20 23:36:09 +00:00
Jim Grosbach	2937df45a8	ARM: Update NEON assembly two-operand aliases. Use the new TwoOperandAliasConstraint to handle lots of the two-operand aliases for NEON instructions. There's still more to go, but this is a good chunk of them. llvm-svn: 155210	2012-04-20 18:12:54 +00:00
Manuel Klimek	e2f9a21db5	Removes json-bench from the test dependencies. llvm-svn: 155197	2012-04-20 13:45:49 +00:00
Jakob Stoklund Olesen	205ee3b389	Revert r155136 "Defer some shl transforms to DAGCombine." While the patch was perfect and defect free, it exposed a really nasty bug in X86 SelectionDAG that caused an llc crash when compiling lencod. I'll put the patch back in after fixing the SelectionDAG problem. llvm-svn: 155181	2012-04-20 00:38:45 +00:00
Jim Grosbach	9cc324d31a	ARM some VFP tblgen'erated two-operand aliases. llvm-svn: 155178	2012-04-20 00:15:00 +00:00
Jim Grosbach	86afe67e10	Tidy up. Formatting. llvm-svn: 155177	2012-04-20 00:14:57 +00:00
Dan Gohman	26aa827461	Avoid a bug in the path count computation, preventing an infinite loop repeatedly making the same change. This is for rdar://11256239. llvm-svn: 155160	2012-04-19 21:50:46 +00:00
Joel Jones	a7691f18a6	Test for the problem with xors being changed into ands when the set bits aren't the same for both args of the xor. This transformation is in the function TargetLowering::SimplifyDemandedBits in the file lib/CodeGen/SelectionDAG/TargetLowering.cpp. I have tested this test using a previous version of llc which has the defect and a version of llc which does not. I got the expected fail and pass, respectively. This test goes with rdar://11195364 and the check-in with the fix: svn r154955 llvm-svn: 155156	2012-04-19 20:54:44 +00:00
Michael J. Spencer	9125493efe	Remove llvm-ld and llvm-stub (which is only used by llvm-ld). llvm-ld is no longer useful and causes confusion and so it is being removed. * Does not work very well on Windows because it must call a gcc like driver to assemble and link. * Has lots of hard coded paths which are wrong on many systems. * Does not understand most of ld's options. * Can be partially replaced by llvm-link | opt | {llc | as, llc -filetype=obj} | ld, or fully replaced by Clang. I know of no production use of llvm-ld, and hacking use should be replaced by Clang's driver. llvm-svn: 155147	2012-04-19 19:27:54 +00:00
Jakob Stoklund Olesen	6b6c81e6b2	Defer some shl transforms to DAGCombine. The shl instruction is used to represent multiplication by a constant power of two as well as bitwise left shifts. Some InstCombine transformations would turn an shl instruction into a bit mask operation, making it difficult for later analysis passes to recognize the constant multiplication. Disable those shl transformations, deferring them to DAGCombine time. An 'shl X, C' instruction is now treated mostly the same way as 'mul X, C'. These transformations are deferred: (X >>? C) << C --> X & (-1 << C) (When X >> C has multiple uses) (X >>? C1) << C2 --> X << (C2-C1) & (-1 << C2) (When C2 > C1) (X >>? C1) << C2 --> X >>? (C1-C2) & (-1 << C2) (When C1 > C2) The corresponding exact transformations are preserved, just like div-exact + mul: (X >>?,exact C) << C --> X (X >>?,exact C1) << C2 --> X << (C2-C1) (X >>?,exact C1) << C2 --> X >>?,exact (C1-C2) The disabled transformations could also prevent the instruction selector from recognizing rotate patterns in hash functions and cryptographic primitives. I have a test case for that, but it is too fragile. llvm-svn: 155136	2012-04-19 16:46:26 +00:00
Jakob Stoklund Olesen	201ba5fa00	Extract the broken part of XFAILed test into its own file. llvm-svn: 155081	2012-04-19 00:20:38 +00:00
Jakob Stoklund Olesen	905969a1d4	FileCheckize llvm-svn: 155010	2012-04-18 17:01:26 +00:00
Jakob Stoklund Olesen	7ecc4e9bb3	Nobody likes shifty instructions, but that was a bit strong. llvm-svn: 155009	2012-04-18 16:44:44 +00:00
Silviu Baranga	ca45af9a75	Added support for disassembling unpredictable swp/swpb ARM instructions. llvm-svn: 155004	2012-04-18 14:18:57 +00:00
Silviu Baranga	d5c6a63a50	Fix the behavior of the disassembler when decoding unpredictable mrs instructions on ARM. Now the disassembler emits warnings instead of errors. llvm-svn: 155002	2012-04-18 14:09:07 +00:00
Silviu Baranga	41f1fcd80e	Added support for unpredictable mcrr/mcrr2/mrrc/mrrc2 ARM instructions in the disassembler. Since the unpredictability conditions are complex, C++ code was added to handle them. llvm-svn: 155001	2012-04-18 13:12:50 +00:00
Silviu Baranga	a2944116dc	Fixed decoding for the ARM cdp2 instruction. The restriction on the coprocessor number was removed for this instruction. llvm-svn: 155000	2012-04-18 13:02:55 +00:00
Silviu Baranga	9da1918c84	Add support for unpredictable cases of the cmp, tst, teq and cmnz ARM instructions in the disassembler. llvm-svn: 154999	2012-04-18 12:48:43 +00:00
Joe Groff	246034465c	FileCheckify, un-XFAIL SimplifyLibCalls/floor test Fixes build on MSVC llvm-svn: 154970	2012-04-18 00:36:07 +00:00
Joe Groff	3a940250bf	Move win32 SimplifyLibcall test under Transforms llvm-svn: 154967	2012-04-18 00:07:45 +00:00
Joe Groff	a81bcbb9bb	fix pr12559: mark unavailable win32 math libcalls also fix SimplifyLibCalls to use TLI rather than compile-time conditionals to enable optimizations on floor, ceil, round, rint, and nearbyint llvm-svn: 154960	2012-04-17 23:05:54 +00:00
Akira Hatanaka	71928e681b	Add disassembler to MIPS. Patch by Vladimir Medic. llvm-svn: 154935	2012-04-17 18:03:21 +00:00
Benjamin Kramer	7ce42c476a	Force cmov on test so block placement doesn't shuffle the code around. This made the test fail with -mcpu=generic (when building on a non-x86 host). llvm-svn: 154926	2012-04-17 13:55:23 +00:00
James Molloy	a9bcf20d22	Fix bad EXTRACT_SUBREG in instruction selection for extending-loads on NEON. llvm-svn: 154915	2012-04-17 08:18:00 +00:00
Benjamin Kramer	e364d195e9	Revert "SCEV: When expanding a GEP the final addition to the base pointer has NUW but not NSW." This isn't right either, reverting for now. llvm-svn: 154910	2012-04-17 06:33:57 +00:00
Andrew Trick	13840499df	Test cases that assume layout should use -disable-code-place. llvm-svn: 154908	2012-04-17 06:20:42 +00:00
Kevin Enderby	29ae538647	Fix ARM disassembly of VLD2 (single 2-element structure to all lanes) instructions with writebacks. And add a test case for all opcodes handled by DecodeVLD2DupInstruction() in ARMDisassembler.cpp. llvm-svn: 154884	2012-04-17 00:49:27 +00:00
Preston Gurd	e63746195d	temporarily XFAIL this test until post RA live-ins is properly enabled. llvm-svn: 154882	2012-04-17 00:21:35 +00:00
Chandler Carruth	1f05b5a4ec	Disable the atom scheduling test after r154874 broke it. llvm-svn: 154877	2012-04-16 23:11:39 +00:00
Jim Grosbach	2bf5f73977	ARM two-operand forms for vhadd and vhsub instructions. rdar://11252521 llvm-svn: 154875	2012-04-16 23:00:25 +00:00
Chandler Carruth	f594b178c6	Relax this test a touch to cope with different assembly variants. llvm-svn: 154870	2012-04-16 22:20:48 +00:00
Chandler Carruth	1f5580b6f3	Fix updateTerminator to be resilient to degenerate terminators where both fallthrough and a conditional branch target the same successor. Gracefully delete the conditional branch and introduce any unconditional branch needed to reach the actual successor. This fixes memory corruption in 2009-06-15-RegScavengerAssert.ll and possibly other tests. Also, while I'm here fix a latent bug I spotted by inspection. I never applied the same fundamental fix to this fallthrough successor finding logic that I did to the logic used when there are no conditional branches. As a consequence it would have selected landing pads had they been aligned in just the right way here. I don't have a test case as I spotted this by inspection, and the previous time I found this required half of TableGen's source code to produce it. =/ I hate backend bugs. ;] Thanks to Jim Grosbach for helping me reason through this and reviewing the fix. llvm-svn: 154867	2012-04-16 22:03:00 +00:00
Jim Grosbach	1e1d68f1b9	MC assembly parser handling for trailing comma in macro instantiation. A trailing comma means no argument at all (i.e., as if the comma were not present), not an empty argument to the invokee. rdar://11252521 llvm-svn: 154863	2012-04-16 21:18:49 +00:00
Jakob Stoklund Olesen	73d96651ab	FileCheckize these tests. Add an extra test to ldr_post with an immediate increment. llvm-svn: 154859	2012-04-16 20:56:42 +00:00
Jakob Stoklund Olesen	e8ee9d1c8c	Disable code placement for this test. It makes it less sensitive to small changes in heuristics. llvm-svn: 154857	2012-04-16 20:49:06 +00:00
Duncan Sands	9af6298293	Remove support for the special 'fast' value for fpmath accuracy for the moment. llvm-svn: 154850	2012-04-16 19:39:33 +00:00
Richard Smith	12da79b859	Fix incorrect atomics codegen introduced in r154705, and extend test to catch it. llvm-svn: 154845	2012-04-16 18:43:53 +00:00
Akira Hatanaka	4c0db08880	This patch fixes 3 problems: 1. CHECKNEXT was used instead of CHECK-NEXT which caused the line to be ignored which in turn hid the next 2 problems: 2. ('sh_offset', 0x{{{[0-9,a-f]+}}) had one too many leading curly braces and failed to do its job of accepting all hex digits and: 3. The check for the hex values for the code instructions didn't account for blank separators. Patch by Jack Carter. llvm-svn: 154842	2012-04-16 18:20:26 +00:00
Jim Grosbach	6068d0014a	ARM assembly two-operand forms for VRSHL. rdar://11252521 llvm-svn: 154840	2012-04-16 18:03:16 +00:00
Jim Grosbach	9bfe7054af	Tidy up. Test formatting. llvm-svn: 154839	2012-04-16 18:03:14 +00:00
Akira Hatanaka	3e9d81f47c	Do not add offset in applyFixup. This has already been accounted for in Value. llvm-svn: 154838	2012-04-16 18:00:19 +00:00
Jim Grosbach	cd1c000a9f	ARM two-operand aliases for VRHADD instructions. rdar://11252521 llvm-svn: 154832	2012-04-16 17:14:11 +00:00
Jim Grosbach	5b1910a741	Tidy up. Testcase formatting. llvm-svn: 154831	2012-04-16 17:14:07 +00:00
Bill Wendling	7e6be75e06	Move to X86 directory because this fails on non-X86 platforms. llvm-svn: 154825	2012-04-16 16:38:48 +00:00
Duncan Sands	05f4df8d72	Make it possible to indicate relaxed floating point requirements at the IR level through the use of 'fpmath' metadata. Currently this only provides a 'fpaccuracy' value, which may be a number in ULPs or the keyword 'fast', however the intent is that this will be extended with additional information about NaN's, infinities etc later. No optimizations have been hooked up to this so far. llvm-svn: 154822	2012-04-16 16:28:59 +00:00
Chandler Carruth	4190b507c5	Flip the new block-placement pass to be on by default. This is mostly to test the waters. I'd like to get results from FNT build bots and other bots running on non-x86 platforms. This feature has been pretty heavily tested over the last few months by me, and it fixes several of the execution time regressions caused by the inlining work by preventing inlining decisions from radically impacting block layout. I've seen very large improvements in yacr2 and ackermann benchmarks, along with the expected noise across all of the benchmark suite whenever code layout changes. I've analyzed all of the regressions and fixed them, or found them to be impossible to fix. See my email to llvmdev for more details. I'd like for this to be in 3.1 as it complements the inliner changes, but if any failures are showing up or anyone has concerns, it is just a flag flip and so can be easily turned off. I'm switching it on tonight to try and get at least one run through various folks' performance suites in case SPEC or something else has serious issues with it. I'll watch bots and revert if anything shows up. llvm-svn: 154816	2012-04-16 13:49:17 +00:00
Chandler Carruth	a355e7cf82	Remove an overly brittle test. This test will no longer be interesting once we start changing the block layout, so just nuke it. If anyone has ideas about how to craft a code layout agnostic form of the test please let me know. llvm-svn: 154815	2012-04-16 13:49:09 +00:00
Chandler Carruth	8c0b41d656	Add a somewhat hacky heuristic to do something different from whole-loop rotation. When there is a loop backedge which is an unconditional branch, we will end up with a branch somewhere no matter what. Try placing this backedge in a fallthrough position above the loop header as that will definitely remove at least one branch from the loop iteration, where whole loop rotation may not. I haven't seen any benchmarks where this is important but loop-blocks.ll tests for it, and so this will be covered when I flip the default. llvm-svn: 154812	2012-04-16 13:33:36 +00:00
Richard Barton	def81b9155	Add -disassemble support for -show-inst and -show-encode capability llvm-mc. Also refactor so all MC paraphernalia are created once for all uses as much as possible. The test change is to account for the fact that the default disassembler behaviour has changed with regards to specifying the assembly syntax to use. llvm-svn: 154809	2012-04-16 11:32:10 +00:00
Chandler Carruth	8c74c7b1c6	Tweak the loop rotation logic to check whether the loop is naturally laid out in a form with a fallthrough into the header and a fallthrough out of the bottom. In that case, leave the loop alone because any rotation will introduce unnecessary branches. If either side looks like it will require an explicit branch, then the rotation won't add any, do it to ensure the branch occurs outside of the loop (if possible) and maximize the benefit of the fallthrough in the bottom. llvm-svn: 154806	2012-04-16 09:31:23 +00:00
Hal Finkel	e0cf6397fd	Remove dead SD nodes after the combining pass. Fixes PR12201. llvm-svn: 154786	2012-04-16 03:33:22 +00:00
Chandler Carruth	ccc7e42b1f	Rewrite how machine block placement handles loop rotation. This is a complex change that resulted from a great deal of experimentation with several different benchmarks. The one which proved the most useful is included as a test case, but I don't know that it captures all of the relevant changes, as I didn't have specific regression tests for each, they were more the result of reasoning about what the old algorithm would possibly do wrong. I'm also failing at the moment to craft more targeted regression tests for these changes, if anyone has ideas, it would be welcome. The first big thing broken with the old algorithm is the idea that we can take a basic block which has a loop-exiting successor and a looping successor and use the looping successor as the layout top in order to get that particular block to be the bottom of the loop after layout. This happens to work in many cases, but not in all. The second big thing broken was that we didn't try to select the exit which fell into the nearest enclosing loop (to which we exit at all). As a consequence, even if the rotation worked perfectly, it would result in one of two bad layouts. Either the bottom of the loop would get fallthrough, skipping across a nearer enclosing loop and thereby making it discontiguous, or it would be forced to take an explicit jump over the nearest enclosing loop to reach its successor. The point of the rotation is to get fallthrough, so we need it to fallthrough to the nearest loop it can. The fix to the first issue is to actually layout the loop from the loop header, and then rotate the loop such that the correct exiting edge can be a fallthrough edge. This is actually much easier than I anticipated because we can handle all the hard parts of finding a viable rotation before we do the layout. We just store that, and then rotate after layout is finished.
No inner loops get split across the post-rotation backedge because we check for them when selecting the rotation. That fix exposed a latent problem with our exiting block selection -- we should allow the backedge to point into the middle of some inner-loop chain as there is no real penalty to it, the whole point is that it won't be a fallthrough edge. This may have blocked the rotation at all in some cases, I have no idea and no test case as I've never seen it in practice, it was just noticed by inspection. Finally, all of these fixes, and studying the loops they produce, highlighted another problem: in rotating loops like this, we sometimes fail to align the destination of these backwards jumping edges. Fix this by actually walking the backwards edges rather than relying on loopinfo. This fixes regressions on heapsort if block placement is enabled as well as lots of other cases where the previous logic would introduce an abundance of unnecessary branches into the execution. llvm-svn: 154783	2012-04-16 01:12:56 +00:00
Craig Topper	bfc9a5f7d3	Remove AVX2 vpermq and vpermpd intrinsics. These can now be handled with normal shuffle vectors. llvm-svn: 154778	2012-04-15 22:43:31 +00:00
Nadav Rotem	42bcd04ee3	Fix PR12529. The Vxx family of instructions are only supported by AVX. Use non-vex instructions for SSE4. llvm-svn: 154770	2012-04-15 19:36:44 +00:00
Nadav Rotem	02ef0c3524	When emulating vselect using OR/AND/XOR make sure to bitcast the result back to the original type. llvm-svn: 154764	2012-04-15 15:08:09 +00:00
Elena Demikhovsky	779a72b49e	Added VPERM optimization for AVX2 shuffles llvm-svn: 154761	2012-04-15 11:18:59 +00:00
Duncan Sands	34bd91a49f	Rename "fpaccuracy" metadata to the more generic "fpmath". That's because I'm thinking of generalizing it to be able to specify other freedoms beyond accuracy (such as that NaN's don't have to be respected). I'd like the 3.1 release (the first one with this metadata) to have the more generic name already rather than having to auto-upgrade it in 3.2. llvm-svn: 154744	2012-04-14 12:36:06 +00:00
Hal Finkel	83c9796033	Fix an error in BBVectorize important for vectorizing pointer types. When vectorizing pointer types it is important to realize that potential pairs cannot be connected via the address pointer argument of a load or store. This is because even after vectorization, the address is still a scalar because the address of the higher half of the pair is implicit from the address of the lower half (it need not be, and should not be, explicitly computed). llvm-svn: 154735	2012-04-14 07:32:50 +00:00
Hal Finkel	f589519a67	Enhance BBVectorize to more-properly handle pointer values and vectorize GEPs. llvm-svn: 154734	2012-04-14 07:32:43 +00:00
Richard Smith	3e8f1f6aea	Fix X86 codegen for 'atomicrmw nand' to generate x = ~(x & y), not x = ~x & y. llvm-svn: 154705	2012-04-13 22:47:00 +00:00
Hal Finkel	b2336a79f9	Add support to BBVectorize for vectorizing selects. llvm-svn: 154700	2012-04-13 20:45:45 +00:00
Evan Cheng	267a4ada52	On Darwin targets, only use vfma etc. if the source use fma() intrinsic explicitly. llvm-svn: 154689	2012-04-13 18:59:28 +00:00
Dan Gohman	e1e352af2b	Consider ObjC runtime calls objc_storeWeak and others which make a copy of their argument as "escape" points for objc_retainBlock optimization. This fixes rdar://11229925. llvm-svn: 154682	2012-04-13 18:28:58 +00:00
Sylvestre Ledru	a10d97ac91	Catch the Python exception when subprocess.Popen is failing. For example, if llc cannot be found, the full python stacktrace is displayed and no interesting information is provided. + fail the process when an exception occurs llvm-svn: 154665	2012-04-13 11:22:18 +00:00
Dan Gohman	de8d2c446b	Use the new Use-aware dominates method to apply the objc runtime library return value optimization for phi uses. Even when the phi itself is not dominated, the specific use may be dominated. llvm-svn: 154647	2012-04-13 01:08:28 +00:00
Dan Gohman	8478d76d64	Don't move objc_autorelease calls past autorelease pool boundaries when optimizing autorelease calls on phi nodes with null operands. This fixes rdar://11207070. llvm-svn: 154642	2012-04-13 00:59:57 +00:00
Sirish Pande	1d195b9c25	Disable Hexagon test temporarily. There is an assert at line 558 in ScheduleDAGInstrs::buildSchedGraph(AliasAnalysis *AA). This assert needs to be addressed for post RA scheduler. Until that assert is addressed, any passes that use post ra scheduler will fail. So, I am temporarily disabling the hexagon tests until that fix is in. The assert is as follows: assert(!MI->isTerminator() && !MI->isLabel() && "Cannot schedule terminators or labels!"); llvm-svn: 154617	2012-04-12 21:06:54 +00:00
Preston Gurd	2138ef6d3d	This patch improves the MCJIT runtime dynamic loader by adding new handling of zero-initialized sections, virtual sections and common symbols and preventing the loading of sections which are not required for execution such as debug information. Patch by Andy Kaylor! llvm-svn: 154610	2012-04-12 20:13:57 +00:00
Craig Topper	d0271b27cb	Fix 128-bit ptest intrinsics to take v2i64 instead of v4f32 since these are integer instructions. llvm-svn: 154580	2012-04-12 07:23:00 +00:00
Akira Hatanaka	c80ae58a5e	Revert changes that were accidentally committed. llvm-svn: 154563	2012-04-11 23:19:55 +00:00
Akira Hatanaka	1e962f250b	Fix string that is being checked. llvm-svn: 154547	2012-04-11 23:11:33 +00:00
Akira Hatanaka	47ad674f67	Emit neg.s or neg.d only if -enable-no-nans-fp-math is supplied by user, otherwise expand FNEG during legalization. llvm-svn: 154546	2012-04-11 22:59:08 +00:00
Akira Hatanaka	7f4c9d1429	Emit abs.s or abs.d only if -enable-no-nans-fp-math is supplied by user. Invalid operation is signaled if the operand of these instructions is NaN. llvm-svn: 154545	2012-04-11 22:49:04 +00:00
Kevin Enderby	72f18bbcff	Fixed a case of ARM disassembly getting an assert on a bad encoding of a VST instruction. llvm-svn: 154544	2012-04-11 22:40:17 +00:00
Akira Hatanaka	4f5c8421b3	Fix bugs in lowering of FCOPYSIGN nodes. - FCOPYSIGN nodes that have operands of different types were not handled. - Different code was generated depending on the endianness of the target. Additionally, code is added that emits INS and EXT instructions, if they are supported by target (they are R2 instructions). llvm-svn: 154540	2012-04-11 22:13:04 +00:00
Jim Grosbach	6e536de1a1	ARM 'vuzp.32 Dd, Dm' is a pseudo-instruction. While there is an encoding for it in VUZP, the result of that is undefined, so we should avoid it. Define the instruction as a pseudo for VTRN.32 instead, as the ARM ARM indicates. rdar://11222366 llvm-svn: 154511	2012-04-11 17:40:18 +00:00
Jim Grosbach	4640c8169f	ARM 'vzip.32 Dd, Dm' is a pseudo-instruction. While there is an encoding for it in VZIP, the result of that is undefined, so we should avoid it. Define the instruction as a pseudo for VTRN.32 instead, as the ARM ARM indicates. rdar://11221911 llvm-svn: 154505	2012-04-11 16:53:25 +00:00
Evan Cheng	5efc442290	Add more fused mul+add/sub patterns. rdar://10139676 llvm-svn: 154484	2012-04-11 06:59:47 +00:00
Nadav Rotem	9bc178ac5c	Reapply 154396 after fixing a test. Original message: Modify the code that lowers shuffles to blends from using blendvXX to vblendXX. blendV uses a register for the selection while Vblend uses an immediate. On sandybridge they still have the same latency and execute on the same execution ports. llvm-svn: 154483	2012-04-11 06:40:27 +00:00
Evan Cheng	48346c1cd9	Clean up ARM fused multiply + add/sub support some more: rename some isel predicates. Also remove NEON2 since it's not really useful and it is confusing. If NEON + VFP4 implies NEON2 but NEON2 doesn't imply NEON + VFP4, what does it really mean? rdar://10139676 llvm-svn: 154480	2012-04-11 05:33:07 +00:00
Evan Cheng	67a09fc397	Match (fneg (fma) to vfnma. rdar://10139676 llvm-svn: 154469	2012-04-11 01:21:25 +00:00
Charles Davis	74c282b5ef	Add retw and lretw instructions. Also, fix Intel syntax parsing for all ret instructions. llvm-svn: 154468	2012-04-11 01:10:53 +00:00
Evan Cheng	d0f61cbefe	Merge fma.ll into fusedMAC.ll llvm-svn: 154466	2012-04-11 01:03:11 +00:00
Kevin Enderby	d2980cd041	Fix ARM disassembly of VLD instructions with writebacks. And add a test case for all opcodes handled by DecodeVLDInstruction() in ARMDisassembler.cpp. llvm-svn: 154459	2012-04-11 00:25:40 +00:00
Jim Grosbach	ad66de155b	ARM add missing Thumb1 two-operand aliases for shift-by-immediate. rdar://11222742 llvm-svn: 154457	2012-04-11 00:15:16 +00:00
Evan Cheng	aca6c822e6	Fix a number of problems with ARM fused multiply add/subtract instructions. 1. The new instruction itinerary entries are not properly described. 2. The asm parser can't handle vfms and vfnms. 3. There were no assembler, disassembler test cases. 4. HasNEON2 has the wrong assembler predicate. rdar://10139676 llvm-svn: 154456	2012-04-11 00:13:00 +00:00
Jakob Stoklund Olesen	0bcf8f4bfb	Fix test to be register assignment invariant. llvm-svn: 154453	2012-04-11 00:00:24 +00:00
Owen Anderson	6f1ee1634d	Move the constant-folding support for FP_ROUND in SelectionDAG from the one-operand version of getNode() to the two-operand version, since it became a two-operand node at some point. Zap a testcase that this allows us to completely fold away. llvm-svn: 154447	2012-04-10 22:46:53 +00:00
Kostya Serebryany	5ba61ac651	[tsan] two more compile-time optimizations: - don't instrument reads from constant globals. Saves ~1.5% of instrumented instructions on CPU2006 (counting static instructions, not their execution). - don't instrument reads from vtable (which is a global constant too). Saves ~5%. I did not measure the run-time impact of this, but it is certainly non-negative. llvm-svn: 154444	2012-04-10 22:29:17 +00:00
Evan Cheng	d0007f3c83	Handle llvm.fma.* intrinsics. rdar://10914096 llvm-svn: 154439	2012-04-10 21:40:28 +00:00
Duncan Sands	4f53074cca	Add a comment noting that the fdiv -> fmul conversion won't generate multiplication by a denormal, and some tests checking that. llvm-svn: 154431	2012-04-10 20:35:27 +00:00
Eric Christopher	65ada95b84	Temporarily revert this patch to see if it brings the buildbots back. llvm-svn: 154425	2012-04-10 19:33:16 +00:00
Kostya Serebryany	bf2de80be6	[tsan] compile-time instrumentation: do not instrument a read if a write to the same temp follows in the same BB. Also add stats printing. On Spec CPU2006 this optimization saves roughly 4% of instrumented reads (which is 3% of all instrumented accesses): Writes : 161216 Reads : 446458 Reads-before-write: 18295 llvm-svn: 154418	2012-04-10 18:18:56 +00:00
Eric Christopher	e9abba71fe	To ensure that we have more accurate line information for a block don't elide the branch instruction if it's the only one in the block, otherwise it's ok. PR9796 and rdar://11215207 llvm-svn: 154417	2012-04-10 18:18:10 +00:00
Jim Grosbach	df5a244797	ARM fix cc_out operand handling for t2SUBrr instructions. We were incorrectly conflating some add variants which don't have a cc_out operand with the mirroring sub encodings, which do. Part of the awesome non-orthogonality legacy of thumb1. Similarly, handling of add/sub of an immediate was sometimes incorrectly removing the cc_out operand for add/sub register variants. rdar://11216577 llvm-svn: 154411	2012-04-10 17:31:55 +00:00
Nadav Rotem	f934f91709	Modify the code that lowers shuffles to blends from using blendvXX to vblendXX. blendv uses a register for the selection while vblend uses an immediate. On sandybridge they still have the same latency and execute on the same execution ports. llvm-svn: 154396	2012-04-10 14:33:13 +00:00
Anton Korobeynikov	4d1220de34	Transform div to mul with reciprocal only when fp imm is legal. This fixes PR12516 and uncovers one weird problem in legalize (worked around) llvm-svn: 154394	2012-04-10 13:22:49 +00:00
Duncan Sands	af06b26c8e	Express the number of ULPs in fpaccuracy metadata as a real rather than a rational number, eg as 2.5 rather than 5, 2. OK'd by Peter Collingbourne. llvm-svn: 154387	2012-04-10 08:22:43 +00:00
Andrew Trick	4442bfe559	Fix 12513: Loop unrolling breaks with indirect branches. Take this opportunity to generalize the indirectbr bailout logic for loop transformations. CFG transformations will never get indirectbr right, and there's no point trying. llvm-svn: 154386	2012-04-10 05:14:42 +00:00
Evan Cheng	0752624970	Add proper checks. llvm-svn: 154379	2012-04-10 03:15:42 +00:00
Evan Cheng	f8bad08001	Fix a long standing tail call optimization bug. When a libcall is emitted legalizer always use the DAG entry node. This is wrong when the libcall is emitted as a tail call since it effectively folds the return node. If the return node's input chain is not the entry (i.e. call, load, or store) use that as the tail call input chain. PR12419 rdar://9770785 rdar://11195178 llvm-svn: 154370	2012-04-10 01:51:00 +00:00
Rafael Espindola	1d9672bdce	Don't try to zExt just to check if an integer constant is zero, it might not fit in a i64. llvm-svn: 154364	2012-04-10 00:16:22 +00:00
Lang Hames	ec96cd0690	Test case for PR12495. llvm-svn: 154359	2012-04-09 23:58:59 +00:00
Akira Hatanaka	8483a6c47d	Have TargetLowering::getPICJumpTableRelocBase return a node that points to the GOT if jump table uses 64-bit gp-relative relocation. llvm-svn: 154341	2012-04-09 20:32:12 +00:00
Chad Rosier	e0e38f61a5	When performing a truncating store, it's possible to rearrange the data in-register, such that we can use a single vector store rather than a series of scalar stores. For func_4_8 the generated code vldr d16, LCPI0_0 vmov d17, r0, r1 vadd.i16 d16, d17, d16 vmov.u16 r0, d16[3] strb r0, [r2, #3] vmov.u16 r0, d16[2] strb r0, [r2, #2] vmov.u16 r0, d16[1] strb r0, [r2, #1] vmov.u16 r0, d16[0] strb r0, [r2] bx lr becomes vldr d16, LCPI0_0 vmov d17, r0, r1 vadd.i16 d16, d17, d16 vuzp.8 d16, d17 vst1.32 {d16[0]}, [r2, :32] bx lr I'm not fond of how this combine pessimizes 2012-03-13-DAGCombineBug.ll, but I couldn't think of a way to judiciously apply this combine. This ldrh r0, [r0, #4] strh r0, [r1] becomes vldr d16, [r0] vmov.u16 r0, d16[2] vmov.32 d16[0], r0 vuzp.16 d16, d17 vst1.32 {d16[0]}, [r1, :32] PR11158 rdar://10703339 llvm-svn: 154340	2012-04-09 20:32:02 +00:00
Rafael Espindola	8f62b3248e	Pattern match a setcc of boolean value with 0 as a truncate. llvm-svn: 154322	2012-04-09 16:06:03 +00:00
Nadav Rotem	fb7e2ae53c	Lower some x86 shuffle sequences to the vblend family of instructions. llvm-svn: 154313	2012-04-09 08:33:21 +00:00
Nadav Rotem	b801ca3976	Fix a bug in the lowering of broadcasts: ConstantPools need to use the target pointer type. Move NormalizeVectorShuffle and LowerVectorBroadcast into X86TargetLowering. llvm-svn: 154310	2012-04-09 07:45:58 +00:00
Chandler Carruth	3779ac10b4	Cleanup and relax a restriction on the matching of global offsets into x86 addressing modes. This allows PIE-based TLS offsets to fit directly into an addressing mode immediate offset, which is the last remaining code quality issue from PR12380. With this patch, that PR is completely fixed. To understand why this patch is correct to match these offsets into addressing mode immediates, break it down by cases: 1) 32-bit is trivially correct, and unmodified here. 2) 64-bit non-small mode is unchanged and never matches. 3) 64-bit small PIC code which is RIP-relative is handled specially in the match to try to fit RIP into the base register. If it fails, it now early exits. This behavior is unchanged by the patch. 4) 64-bit small non-PIC code which is not RIP-relative continues to work as it did before. The reason these immediates are safe is because the ABI ensures they fit in small mode. This behavior is unchanged. 5) 64-bit small PIC code which is not using RIP-relative addressing. This is the only case changed by the patch, and the primary place you see it is in TLS, either the win64 section offset TLS or Linux local-exec TLS model in a PIC compilation. Here the ABI again ensures that the immediates fit because we are in small mode, and any other operations required due to the PIC relocation model have been handled externally to the Wrapper node (extra loads etc are made around the wrapper node in ISelLowering). I've tested this as much as I can comparing it with GCC's output, and everything appears safe. I discussed this with Anton and it made sense to him at least at face value. That said, if there are issues with PIC code after this patch, yell and we can revert it. llvm-svn: 154304	2012-04-09 02:13:06 +00:00
Chandler Carruth	84b834267e	Fold 15 tiny test cases into a single file that implements the comprehensive testing of TLS codegen for x86. Convert all of the ones that were still using grep to use FileCheck. Remove some redundancies between them. Perhaps most interestingly expand the test cases so that they actually fully list the instruction snippet being tested. TLS operations are very narrowly defined, and so these seem reasonably stable. More importantly, the existing test cases already were crazy fine grained, expecting specific registers to be allocated. This just clarifies that no other instructions are expected, and fills in some crucial gaps that weren't being tested at all. This will make any subsequent changes to TLS much more clear during review. llvm-svn: 154303	2012-04-09 01:43:17 +00:00
Duncan Sands	2f1dc3814b	Only have codegen turn fdiv by a constant into fmul by the reciprocal when -ffast-math, i.e. don't just always do it if the reciprocal can be formed exactly. There is already an IR level transform that does that, and it does it more carefully. llvm-svn: 154296	2012-04-08 18:08:12 +00:00
Chandler Carruth	ede4a8aa2b	Teach LLVM about a PIE option which, when enabled on top of PIC, makes optimizations which are valid for position independent code being linked into a single executable, but not for such code being linked into a shared library. I discussed the design of this with Eric Christopher, and the decision was to support an optional bit rather than a completely separate relocation model. Fundamentally, this is still PIC relocation, its just that certain optimizations are only valid under a PIC relocation model when the resulting code won't be in a shared library. The simplest path to here is to expose a single bit option in the TargetOptions. If folks have different/better designs, I'm all ears. =] I've included the first optimization based upon this: changing TLS models to the *Exec models when PIE is enabled. This is the LLVM component of PR12380 and is all of the hard work. llvm-svn: 154294	2012-04-08 17:51:45 +00:00
Chandler Carruth	f82b0e2d29	Teach InstCombine to nuke a common alloca pattern -- an alloca which has GEPs, bit casts, and stores reaching it but no other instructions. These often show up during the iterative processing of the inliner, SROA, and DCE. Once we hit this point, we can completely remove the alloca. These were actually showing up in the final, fully optimized code in a bunch of inliner tests I've been working on, and notably they show up after LLVM finishes optimizing away all function calls involved in hash_combine(a, b). llvm-svn: 154285	2012-04-08 14:36:56 +00:00
Nadav Rotem	82609df647	AVX2: Build splat vectors by broadcasting a scalar from the constant pool. Previously we used three instructions to broadcast an immediate value into a vector register. On Sandybridge we continue to load the broadcasted value from the constant pool. llvm-svn: 154284	2012-04-08 12:54:54 +00:00
Bill Wendling	8c783d4122	Remove old 'grep' lines. llvm-svn: 154283	2012-04-08 11:53:54 +00:00
Bill Wendling	57f8e5ebe4	FileCheckize these testcases. llvm-svn: 154281	2012-04-08 11:00:38 +00:00
Nadav Rotem	71d07ae5cb	1. Remove the part of r153848 which optimizes shuffle-of-shuffle into a new shuffle node because it could introduce new shuffle nodes that were not supported efficiently by the target. 2. Add a more restrictive shuffle-of-shuffle optimization for cases where the second shuffle reverses the transformation of the first shuffle. llvm-svn: 154266	2012-04-07 21:19:08 +00:00
Duncan Sands	5f8397a934	Convert floating point division by a constant into multiplication by the reciprocal if converting to the reciprocal is exact. Do it even if inexact if -ffast-math. This substantially speeds up ac.f90 from the polyhedron benchmarks. llvm-svn: 154265	2012-04-07 20:04:00 +00:00
Chandler Carruth	28192c9398	Fix ValueTracking to conclude that debug intrinsics are safe to speculate. Without this, loop rotate (among many other places) would suddenly stop working in the presence of debug info. I found this looking at loop rotate, and have augmented its tests with a reduction out of a very hot loop in yacr2 where failing to do this rotation costs sometimes more than 10% in runtime performance, perturbing numerous downstream optimizations. This should have no impact on performance without debug info, but the change in performance when debug info is enabled can be extreme. As a consequence (and this is how I got to this yak) any profiling of performance problems should be treated with deep suspicion -- they may have been wildly inaccurate if debug info was enabled for profiling. =/ Just a heads up. llvm-svn: 154263	2012-04-07 19:22:18 +00:00
Benjamin Kramer	e1f4ca1b0f	SCEV: When expanding a GEP the final addition to the base pointer has NUW but not NSW. Found by inspection. llvm-svn: 154262	2012-04-07 17:19:26 +00:00
Alexis Hunt	78fce432b7	Make the test for r154235 more platform-independent with a shorter string. llvm-svn: 154243	2012-04-07 01:33:14 +00:00
Alexis Hunt	0235f684f0	Output UTF-8-encoded characters as identifier characters into assembly by default. This is a behaviour configurable in the MCAsmInfo. I've decided to turn it on by default in (possibly optimistic) hopes that most assemblers are reasonably sane. If this proves a problem, switching to default seems reasonable. I'm not sure if this is the opportune place to test, but it seemed good to make sure it was tested somewhere. llvm-svn: 154235	2012-04-07 00:37:53 +00:00
Akira Hatanaka	487e56763d	Add lines in global-address.ll to test N32 and N64 code generation. llvm-svn: 154202	2012-04-06 20:23:36 +00:00
Jakob Stoklund Olesen	967b86a0a2	Allow negative immediates in ARM and Thumb2 compares. ARM and Thumb2 mode can use cmn instructions to compare against negative immediates. Thumb1 mode can't. llvm-svn: 154183	2012-04-06 17:45:04 +00:00
Chandler Carruth	49da93396e	Sink the collection of return instructions until after all simplification has been performed. This is a bit less efficient (requires another ilist walk of the basic blocks) but shouldn't matter in practice. More importantly, it's just too much work to keep track of all the various ways the return instructions can be mutated while simplifying them. This fixes yet another crasher, reported by Daniel Dunbar. llvm-svn: 154179	2012-04-06 17:21:31 +00:00
Chandler Carruth	e547fefcb7	Tweak this test to ensure the inliner did indeed fire. Thanks to Richard Smith for pointing this out in review. llvm-svn: 154178	2012-04-06 17:21:28 +00:00
Craig Topper	bdc9f071a4	Test case for PR12413 llvm-svn: 154172	2012-04-06 14:38:25 +00:00
Craig Topper	447417c932	Allow 256-bit shuffles to be split if a 128-bit lane contains elements from a single source. This is a rewrite of the 256-bit shuffle splitting code based on similar code from legalize types. Fixes PR12413. llvm-svn: 154166	2012-04-06 07:45:23 +00:00
Craig Topper	4eb9616b24	Add the tests that were supposed to go with r153935 that I forgot to svn add llvm-svn: 154165	2012-04-06 07:09:59 +00:00
Chandler Carruth	17e335888c	Actually finish this sentence in the comment the way I intended. Thanks Matt for pointing this out. llvm-svn: 154158	2012-04-06 01:19:38 +00:00
Chandler Carruth	e41f6f4189	Sink the return instruction collection until after we're done deleting dead code, including dead return instructions in some cases. Otherwise, we end up having a bogus pointer to a return instruction that blows up much further down the road. It turns out that this pattern is both simpler to code, easier to update in the face of enhancements to the inliner cleanup, and likely cheaper given that it won't add dead instructions to the list. Thanks to John Regehr's numerous test cases for teasing this out. llvm-svn: 154157	2012-04-06 01:11:52 +00:00
Jim Grosbach	930f2f66e7	ARM assembly aliases for add negative immediates using sub. 'add r2, #-1024' should just use 'sub r2, #1024' rather than erroring out. Thumb1 aliases for adding a negative immediate to the stack pointer, also. rdar://11192734 llvm-svn: 154123	2012-04-05 20:57:13 +00:00
Akira Hatanaka	43fb2b2cea	Reapply test case in 154038, this time with triple to prevent the backend from emitting gp_rel relocation. llvm-svn: 154122	2012-04-05 20:44:35 +00:00
Eric Christopher	aec8a82694	Patch to set is_stmt a little better for prologue lines in a function. This enables debuggers to see what are interesting lines for a breakpoint rather than any line that starts a function. rdar://9852092 llvm-svn: 154120	2012-04-05 20:39:05 +00:00
Jakob Stoklund Olesen	37492eac8c	Don't break the IV update in TLI::SimplifySetCC(). LSR always tries to make the ICmp in the loop latch use the incremented induction variable. This allows the induction variable to be kept in a single register. When the induction variable limit is equal to the stride, SimplifySetCC() would break LSR's hard work by transforming: (icmp (add iv, stride), stride) --> (cmp iv, 0) This forced us to use lea for the IV update, preventing the simpler incl+cmp. <rdar://problem/7643606> <rdar://problem/11184260> llvm-svn: 154119	2012-04-05 20:30:20 +00:00
Dan Gohman	cc64bbca81	Fix accidentally inverted logic from r152803, and make the testcase slightly less trivial. This fixes rdar://11171718. llvm-svn: 154118	2012-04-05 20:27:21 +00:00
Silviu Baranga	af3c79f0ac	Added support for unpredictable ADC/SBC instructions on ARM, and also fixed some corner cases involving the PC register as an operand for these instructions. llvm-svn: 154101	2012-04-05 16:19:29 +00:00
Silviu Baranga	d365397daa	Added support for handling unpredictable arithmetic instructions on ARM. llvm-svn: 154100	2012-04-05 16:13:15 +00:00
James Molloy	1ea6473688	An oversight when applying the patches for r150956 and r150957 to a vanilla tree meant I forgot to svn add these testcases. Noticed while investigating PR12274! llvm-svn: 154090	2012-04-05 10:01:12 +00:00
Jim Grosbach	15c6884a4b	ARM assembly aliases for two-operand V[R]SHR instructions. rdar://11189467 llvm-svn: 154087	2012-04-05 07:23:53 +00:00
Jim Grosbach	3d00eecc53	ARM assembly parsing for 'msr' plain 'cpsr' operand. Plain 'cpsr' is an alias for 'cpsr_fc'. rdar://11153753 llvm-svn: 154080	2012-04-05 03:17:53 +00:00
Jakob Stoklund Olesen	f2390e8303	Pass the right sign to TLI->isLegalICmpImmediate. LSR can fold three addressing modes into its ICmpZero node: ICmpZero BaseReg + Offset => ICmp BaseReg, -Offset; ICmpZero -1*ScaleReg + Offset => ICmp ScaleReg, Offset; ICmpZero BaseReg + -1*ScaleReg => ICmp BaseReg, ScaleReg. The first two cases are only used if TLI->isLegalICmpImmediate() likes the offset. Make sure the right Offset sign is passed to this method in the second case. The ARM version is not symmetric. <rdar://problem/11184260> llvm-svn: 154079	2012-04-05 03:10:56 +00:00
Akira Hatanaka	121342fcc2	Reapply 154038 without the failing test. llvm-svn: 154062	2012-04-04 22:16:36 +00:00
Owen Anderson	4743c6e159	Revert r154038. It was causing make check failures. llvm-svn: 154054	2012-04-04 21:18:58 +00:00
Akira Hatanaka	9705c865d9	Fix LowerGlobalAddress to produce instructions with the correct relocation types for N32 ABI. Add new test case and update existing ones. llvm-svn: 154038	2012-04-04 19:02:38 +00:00
Akira Hatanaka	b3a2b8c199	Fix LowerConstantPool to produce instructions with the correct relocation types for N32 ABI and update test case. llvm-svn: 154034	2012-04-04 18:26:12 +00:00
Jakob Stoklund Olesen	0a5b72f0e4	Implement ARMBaseInstrInfo::commuteInstruction() for MOVCCr. A MOVCCr instruction can be commuted by inverting the condition. This can help reduce register pressure and remove unnecessary copies in some cases. <rdar://problem/11182914> llvm-svn: 154033	2012-04-04 18:23:42 +00:00
Akira Hatanaka	aeff24e424	Fix LowerBlockAddress to produce instructions with the correct relocation types for N32 ABI and update test case. llvm-svn: 154031	2012-04-04 18:22:53 +00:00
Hongbin Zheng	e1fd20172b	Add testcase for r154007: when a function has the optsize attribute, the loop should be unrolled according to the value of OptSizeUnrollThreshold. llvm-svn: 154014	2012-04-04 13:24:40 +00:00
Rafael Espindola	ba0a6cabb8	Always compute all the bits in ComputeMaskedBits. This allows us to keep passing reduced masks to SimplifyDemandedBits, but know about all the bits if SimplifyDemandedBits fails. This allows instcombine to simplify cases like the one in the included testcase. llvm-svn: 154011	2012-04-04 12:51:34 +00:00
Michael J. Spencer	22120c47a7	Add YAML parser to Support. llvm-svn: 153977	2012-04-03 23:09:22 +00:00
Pete Cooper	9511ec86f9	Add VSELECT to LegalizeVectorTypes::ScalariseVectorResult. Previously it would crash if it encountered a 1 element VSELECT. Solution is slightly more complicated than just creating a SELECT as we have to mask or sign extend the vector condition if it had different boolean contents from the scalar condition. Fixes <rdar://problem/11178095> llvm-svn: 153976	2012-04-03 22:57:55 +00:00
Eric Christopher	b81e2b403c	Fix thinko check for number of operands to be the one that actually might have more than 19 operands. Add a testcase to make sure I never screw that up again. Part of rdar://11026482 llvm-svn: 153961	2012-04-03 17:55:42 +00:00
Nadav Rotem	269703f983	Add an additional testcase which checks ops with multiple users. llvm-svn: 153939	2012-04-03 07:39:36 +00:00
Craig Topper	7629d63bc4	Add support for AVX enhanced comparison predicates. Patch from Kay Tiong Khoo. llvm-svn: 153935	2012-04-03 05:20:24 +00:00
Akira Hatanaka	d19f025374	Revert r153924. Delete test/MC/Disassembler/Mips and lib/Target/Mips/Disassembler. llvm-svn: 153926	2012-04-03 03:01:13 +00:00
Akira Hatanaka	55059262aa	Revert r153924. There were buildbot failures. llvm-svn: 153925	2012-04-03 02:51:09 +00:00
Akira Hatanaka	e2498d014b	MIPS disassembler support. Patch by Vladimir Medic. llvm-svn: 153924	2012-04-03 02:20:58 +00:00
Jakob Stoklund Olesen	291007b055	Allocate virtual registers in ascending order. This is just the fallback tie-breaker ordering, the main allocation order is still descending size. Patch by Shamil Kurmangaleev! llvm-svn: 153904	2012-04-02 22:30:39 +00:00
Lang Hames	aaafacd07e	During two-address lowering, rescheduling an instruction does not untie operands. Make TryInstructionTransform return false to reflect this. Fixes PR11861. llvm-svn: 153892	2012-04-02 19:58:43 +00:00
Rafael Espindola	2e5c58e77b	No need to run llvm-as. llvm-svn: 153890	2012-04-02 19:44:20 +00:00
Akira Hatanaka	b1f68f9696	Initial 64 bit direct object support. This patch allows llvm to recognize that a 64 bit object file is being produced and that the subsequently generated ELF header has the correct information. The test case checks for both big and little endian flavors. Patch by Jack Carter. llvm-svn: 153889	2012-04-02 19:25:22 +00:00
Stepan Dyatkovskiy	f62ffeca88	Fast fix for PR12343: http://llvm.org/bugs/show_bug.cgi?id=12343 We have no trivial way of splitting edges that go out of an indirect branch. We could do it with some tricks, but that should be discussed further. It is also still dangerous because indirect branches are difficult to control. The fix forbids this case for unswitching. llvm-svn: 153879	2012-04-02 17:16:45 +00:00
Silviu Baranga	ac37acd31b	Added fix in TableGen instruction decoder generation. The decoder now breaks for every leaf node. llvm-svn: 153874	2012-04-02 15:20:39 +00:00
Nadav Rotem	702f080767	Optimizing swizzles of complex shuffles may generate additional complex shuffles. Do not try to optimize swizzles of shuffles if the source shuffle has more than a single user, except when the source shuffle is also a swizzle. llvm-svn: 153864	2012-04-02 07:11:12 +00:00
Hal Finkel	322e41a914	Enable prefetch generation on PPC64. llvm-svn: 153851	2012-04-01 20:08:17 +00:00
Nadav Rotem	b078350872	This commit contains a few changes that had to go in together. 1. Simplify xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) (and also scalar_to_vector). 2. Xor/and/or are indifferent to the swizzle operation (shuffle of one src). Simplify xor/and/or (shuff(A), shuff(B)) -> shuff(op (A, B)) 3. Optimize swizzles of shuffles: shuff(shuff(x, y), undef) -> shuff(x, y). 4. Fix an X86ISelLowering optimization which was very bitcast-sensitive. Code which was previously compiled to this: movd (%rsi), %xmm0 movdqa .LCPI0_0(%rip), %xmm2 pshufb %xmm2, %xmm0 movd (%rdi), %xmm1 pshufb %xmm2, %xmm1 pxor %xmm0, %xmm1 pshufb .LCPI0_1(%rip), %xmm1 movd %xmm1, (%rdi) ret Now compiles to this: movl (%rsi), %eax xorl %eax, (%rdi) ret llvm-svn: 153848	2012-04-01 19:31:22 +00:00
Hal Finkel	9f9f8929ee	Add instruction itinerary for the PPC64 A2 core. This adds a full itinerary for IBM's PPC64 A2 embedded core. These cores form the basis for the CPUs in the new IBM BG/Q supercomputer. llvm-svn: 153842	2012-04-01 19:22:40 +00:00
Chandler Carruth	cdb1f8cff1	Add some more testing to cover the remaining two cases where always-inlining is disabled: recursive functions and indirectbr. llvm-svn: 153833	2012-04-01 10:36:17 +00:00
Chandler Carruth	c5bfb3c0f5	Fix a pretty scary bug I introduced into the always inliner with a single missing character. Somehow, this had gone untested. I've added tests for returns-twice logic specifically with the always-inliner that would have caught this, and fixed the bug. Thanks to Matt for the careful review and spotting this!!! =D llvm-svn: 153832	2012-04-01 10:21:05 +00:00
Chandler Carruth	1989bb9c43	Replace four tiny tests with various uses of grep and not with a single test and FileCheck. llvm-svn: 153831	2012-04-01 10:11:17 +00:00
Rafael Espindola	77242fa79e	Add a triple to the test. llvm-svn: 153818	2012-03-31 18:59:07 +00:00
Rafael Espindola	80c540e656	Teach CodeGen's version of computeMaskedBits to understand the range metadata. This is the CodeGen equivalent of r153747. I tested that there is no noticeable performance difference with any combination of -O0/-O2/-g when compiling gcc as a single compilation unit. llvm-svn: 153817	2012-03-31 18:14:00 +00:00
Chandler Carruth	0539c071ea	Initial commit for the rewrite of the inline cost analysis to operate on a per-callsite walk of the called function's instructions, in breadth-first order over the potentially reachable set of basic blocks. This is a major shift in how inline cost analysis works to improve the accuracy and rationality of inlining decisions. A brief outline of the algorithm this moves to: - Build a simplification mapping based on the callsite arguments to the function arguments. - Push the entry block onto a worklist of potentially-live basic blocks. - Pop the first block off of the front of the worklist (for breadth-first ordering) and walk its instructions using a custom InstVisitor. - For each instruction's operands, re-map them based on the simplification mappings available for the given callsite. - Compute any simplification possible of the instruction after re-mapping, and store that back into the simplification mapping. - Compute any bonuses, costs, or other impacts of the instruction on the cost metric. - When the terminator is reached, replace any conditional value in the terminator with any simplifications from the mapping we have, and add any successors which are not proven to be dead from these simplifications to the worklist. - Pop the next block off of the front of the worklist, and repeat. - As soon as the cost of inlining exceeds the threshold for the callsite, stop analyzing the function in order to bound cost. The primary goal of this algorithm is to perfectly handle dead code paths. We do not want any code in trivially dead code paths to impact inlining decisions. The previous metric was extremely flawed here, and would always subtract the average cost of two successors of a conditional branch when it was proven to become an unconditional branch at the callsite. 
There was no handling of wildly different costs between the two successors, which would cause inlining when the path actually taken was too large, and no inlining when the path actually taken was trivially simple. There was also no handling of the code path, only the immediate successors. These problems vanish completely now. See the added regression tests for the shiny new features -- we skip recursive function calls, SROA-killing instructions, and high cost complex CFG structures when dead at the callsite being analyzed. Switching to this algorithm required refactoring the inline cost interface to accept the actual threshold rather than simply returning a single cost. The resulting interface is pretty bad, and I'm planning to do lots of interface cleanup after this patch. Several other refactorings fell out of this, but I've tried to minimize them for this patch. =/ There is still more cleanup that can be done here. Please point out anything that you see in review. I've worked really hard to try to mirror at least the spirit of all of the previous heuristics in the new model. It's not clear that they are all correct any more, but I wanted to minimize the change in this single patch, it's already a bit ridiculous. One heuristic that is not yet mirrored is to allow inlining of functions with a dynamic alloca if the caller has a dynamic alloca. I will add this back, but I think the most reasonable way requires changes to the inliner itself rather than just the cost metric, and so I've deferred this for a subsequent patch. The test case is XFAIL-ed until then. As mentioned in the review mail, this seems to make Clang run about 1% to 2% faster in -O0, but makes its binary size grow by just under 4%. I've looked into the 4% growth, and it can be fixed, but requires changes to other parts of the inliner. llvm-svn: 153812	2012-03-31 12:42:41 +00:00
Chandler Carruth	6f202a7ced	Clean up the naming in this test. Someone pointed this out in review at one point, and I forgot to go back and clean it up. Sorry about that. =/ llvm-svn: 153801	2012-03-31 10:38:48 +00:00
Chandler Carruth	564b4ba704	FileCheck-ize this test, and generally tidy it up prior to changing things around. llvm-svn: 153799	2012-03-31 09:22:33 +00:00
Hal Finkel	5cad8742cc	Correctly vectorize powi. The powi intrinsic requires special handling because it always takes a single integer power regardless of the result type. As a result, we can vectorize only if the powers are equal. Fixes PR12364. llvm-svn: 153797	2012-03-31 03:38:40 +00:00
Jim Grosbach	fdaab531b7	ARM assembler should prefer non-aliased encoding of cmp. When an immediate is both a valid [t2_]so_imm and a [t2_]so_imm_neg, we want to use the non-negated form to make sure we prefer the normal encoding, not the aliased encoding via the negation of, e.g., 'cmp.w'. llvm-svn: 153770	2012-03-30 19:59:02 +00:00
Jim Grosbach	daa04130ed	ARM encoding for VSWP got the second operand incorrect. Make the non-tied register operand names line up with what the base class encoding handler expects. rdar://11157236 llvm-svn: 153766	2012-03-30 18:53:01 +00:00
Jim Grosbach	def5e34812	ARM integrated assembler: fix encoding choice for add/sub imm. For 'adds r2, r2, #56' outside of an IT block, the 16-bit encoding T2 can be used for this syntax. Prefer the narrow encoding when possible. rdar://11156277 llvm-svn: 153759	2012-03-30 17:20:40 +00:00
Jim Grosbach	199ab90946	ARM assembly parsing needs to be paranoid about negative immediates. Make sure to treat immediates as unsigned when doing relative comparisons. rdar://11153621 llvm-svn: 153753	2012-03-30 16:31:31 +00:00
James Molloy	fb5cd6085f	Ensure conditional BL instructions for ARM are given the fixup fixup_arm_condbranch. Patch by Tim Northover! llvm-svn: 153737	2012-03-30 09:15:32 +00:00
Evan Cheng	a40d40602c	ARM target should allow codegenprep to duplicate ret instructions to enable tailcall opt. rdar://11140249 llvm-svn: 153717	2012-03-30 01:24:39 +00:00
Bill Wendling	afe7ec7070	Testcase for r153710. llvm-svn: 153711	2012-03-30 00:26:54 +00:00
Bill Wendling	4f2a951275	Add testcase for r153705 llvm-svn: 153706	2012-03-30 00:05:02 +00:00
Lang Hames	323a5ced21	Change the constant in this testcase so that it results in a constant pool load. llvm-svn: 153704	2012-03-29 23:52:38 +00:00
Bill Wendling	76fdc4b885	Revert r153694. It was causing failures in the buildbots. llvm-svn: 153701	2012-03-29 23:23:59 +00:00
Chandler Carruth	d6735ce57a	Filecheck-ize this test so that it actually tests something reasonable. llvm-svn: 153697	2012-03-29 22:01:41 +00:00
Danil Malyshev	3548eaf896	Re-factored RuntimeDyld. Added ExecutionEngine/MCJIT tests. llvm-svn: 153694	2012-03-29 21:46:18 +00:00
Jim Grosbach	0b0298302c	ARM assembly 'cmp lr, #0' should not encode using 'cmn'. The CMP->CMN alias was matching for an immediate of zero when it should only match for negative values. rdar://11129224 llvm-svn: 153689	2012-03-29 21:19:52 +00:00
Lang Hames	dd1211b4e1	The shuffle scheduler is only available in asserts build - make misched-new.ll testcase require asserts. llvm-svn: 153687	2012-03-29 21:11:47 +00:00
Lang Hames	5569ce7d56	Make x86 REP_MOV* and REP_STO instructions use the correct operand sizes in 64-bit mode. llvm-svn: 153680	2012-03-29 19:54:28 +00:00
Akira Hatanaka	0603ad8c65	Expand FREM. llvm-svn: 153671	2012-03-29 18:43:11 +00:00
Jakob Stoklund Olesen	4e55044ff5	Don't PRE compares. CodeGenPrepare sinks compare instructions down to their uses to prevent live flags and predicate registers across basic blocks. PRE of a compare instruction prevents that, forcing the i1 compare result into a general purpose register. That is usually more expensive than the redundant compare PRE was trying to eliminate in the first place. llvm-svn: 153657	2012-03-29 17:22:39 +00:00
Joel Jones	68d59e8a90	For X86, change load/dec-or-inc/store into dec-or-inc, respectively. This is a code change to add support for changing instruction sequences of the form: load inc/dec of 8/16/32/64 bits store into the appropriate X86 inc/dec through memory instruction: inc[qlwb] / dec[qlwb] The checks that were in X86DAGToDAGISel::Select(SDNode *Node)>>ISD::STORE have been extracted to isLoadIncOrDecStore and reworked to use the better named wrappers for getOperand(unsigned) (e.g. getOffset()) and replaced Chain.getNode() with LoadNode. The comments have also been expanded. llvm-svn: 153635	2012-03-29 05:45:48 +00:00
Joel Jones	b474099e63	Reverted to revision 153616 to unblock build llvm-svn: 153623	2012-03-29 01:20:56 +00:00
Joel Jones	b88c81fe0f	For X86, change load/dec-or-inc/store into dec-or-inc, respectively. This is a code change to add support for changing instruction sequences of the form: load inc/dec of 8/16/32/64 bits store into the appropriate X86 inc/dec through memory instruction: inc[qlwb] / dec[qlwb] The checks that were in X86DAGToDAGISel::Select(SDNode *Node)>>ISD::STORE have been extracted to isLoadIncOrDecStore and reworked to use the better named wrappers for getOperand(unsigned) (e.g. getOffset()) and replaced Chain.getNode() with LoadNode. The comments have also been expanded. llvm-svn: 153617	2012-03-29 00:37:47 +00:00
Jakob Stoklund Olesen	b6a7a89289	Don't kill the base register when expanding strd. When an strd instruction doesn't get the registers it wants, it can be expanded into two str instructions. Make sure the first str doesn't kill the base register in the case where the base and data registers are identical: t2STRi12 %R0<kill>, %R0, 4, pred:14, pred:%noreg t2STRi12 %R2<kill>, %R0, 8, pred:14, pred:%noreg <rdar://problem/11101911> llvm-svn: 153611	2012-03-28 23:07:03 +00:00
Rafael Espindola	5054ee82cc	Handle intrinsics in GlobalsModRef. Fixes pr12351. llvm-svn: 153604	2012-03-28 21:31:24 +00:00
Jakob Stoklund Olesen	9e512120b7	Spill DPair registers, not just QPR. The arm_neon intrinsics can create virtual registers from the DPair register class which allows both even-odd and odd-even D-register pairs. This fixes PR12389. llvm-svn: 153603	2012-03-28 21:20:32 +00:00
Chad Rosier	e27081d348	Revert r153521 as it's causing large regressions on the nightly testers. Original commit message for r153521 (aka r153423): Use the new range metadata in computeMaskedBits and add a new optimization to instruction simplify that lets us remove an and when loading a boolean value. llvm-svn: 153587	2012-03-28 18:42:50 +00:00
Benjamin Kramer	aa9e4a5e59	GlobalOpt: If we have an inbounds GEP from a ConstantAggregateZero global that we just determined to be constant, replace all loads from it with a zero value. llvm-svn: 153576	2012-03-28 14:50:09 +00:00
Richard Barton	7ce39497b4	Fixup VST1.32 with writeback instruction. Also re-factor non-writeback version. llvm-svn: 153573	2012-03-28 10:18:11 +00:00
Chandler Carruth	772c88b887	Switch to WeakVHs in the value mapper, and aggressively prune dead basic blocks in the function cloner. This removes the last case of trivially dead code that I've been seeing in the wild getting inlined, analyzed, re-inlined, optimized, only to be deleted. Nukes a FIXME from the cleanup tests. llvm-svn: 153572	2012-03-28 08:38:27 +00:00
Eric Christopher	7285c7d51d	Fix the output of the DW_TAG_friend tag to include DW_AT_friend and not the rest of the member tag. Fixes PR11695 llvm-svn: 153570	2012-03-28 07:34:31 +00:00
Akira Hatanaka	e3c00e5b97	Fix test case. llvm-svn: 153555	2012-03-28 00:25:01 +00:00
Eric Christopher	d8abaf3fc4	Add a test for the previous commit. Also, remove two tests that were testing a) the wrong behavior or b) something that I'm already testing in the new test. llvm-svn: 153525	2012-03-27 18:35:57 +00:00
Chad Rosier	8e6dbccd03	Reapply r153423; the original commit was fine. The failing test, distray, had undefined behavior, which Rafael was kind enough to fix. Original commit message for r153423: Use the new range metadata in computeMaskedBits and add a new optimization to instruction simplify that lets us remove an and when loading a boolean value. llvm-svn: 153521	2012-03-27 17:44:52 +00:00
Evan Cheng	7fede87349	Post-ra LICM should take care not to hoist an instruction that would clobber a register that's read by the preheader terminator. rdar://11095580 llvm-svn: 153492	2012-03-27 01:50:58 +00:00
Evan Cheng	a2b48d985b	ARM has a peephole optimization which looks for a def / use pair. The def produces a 32-bit immediate which is consumed by the use. It tries to fold the immediate by breaking it into two parts and folding them into the immediate fields of two uses. e.g. movw r2, #40885 movt r3, #46540 add r0, r0, r3 => add.w r0, r0, #3019898880 add.w r0, r0, #30146560 ; However, this transformation is incorrect if the user produces a flag. e.g. movw r2, #40885 movt r3, #46540 adds r0, r0, r3 => add.w r0, r0, #3019898880 adds.w r0, r0, #30146560 Note the adds.w may not set the carry flag even if the original sequence would. rdar://11116189 llvm-svn: 153484	2012-03-26 23:31:00 +00:00
Andrew Trick	7004e4b95e	SCEV fix: Handle loop invariant loads. Fixes PR11882: NULL dereference in ComputeLoadConstantCompareExitLimit. llvm-svn: 153480	2012-03-26 22:33:59 +00:00
Andrew Trick	f62744bb0d	Unit test for PR11950: LSR crash. llvm-svn: 153472	2012-03-26 21:45:37 +00:00
Chad Rosier	08e57e5ccf	Revert r153423 as this is causing failures on our internal nightly testers. Original commit message: Use the new range metadata in computeMaskedBits and add a new optimization to instruction simplify that lets us remove an and when loading a boolean value. llvm-svn: 153452	2012-03-26 18:07:14 +00:00
Kostya Serebryany	6f8a776041	[tsan] treat vtable pointer updates in a special way (requires tbaa); fix a bug (forgot to return true after instrumenting); make sure the tsan tests are run llvm-svn: 153448	2012-03-26 17:35:03 +00:00
Benjamin Kramer	df2348ecf3	Remove stale CBackend tests. llvm-svn: 153433	2012-03-26 11:16:50 +00:00
Rafael Espindola	df9b4adb82	Use the new range metadata in computeMaskedBits and add a new optimization to instruction simplify that lets us remove an and when loading a boolean value. llvm-svn: 153423	2012-03-26 01:44:11 +00:00
Chandler Carruth	8059c84af1	Teach instsimplify how to simplify comparisons of pointers which are constant-offsets of a common base using the generic GEP-walking logic I added for computing pointer differences in the same situation. llvm-svn: 153419	2012-03-25 21:28:14 +00:00
Chandler Carruth	2741aae80b	Switch the pointer-difference simplification logic to only work with inbounds GEPs. This isn't really necessary for simplifying pointer differences, but I'm planning to re-use the same code to simplify pointer comparisons where it is necessary. Since real code almost exclusively uses inbounds GEPs, it doesn't seem worth it to support the extra complexity of turning it on and off. If anyone would like that back, feel free to shout. Note that instcombine will still catch any of these patterns. llvm-svn: 153418	2012-03-25 20:43:07 +00:00
Eli Bendersky	a77c95f317	This file is no longer needed (DejaGNU-isms removed from code) llvm-svn: 153412	2012-03-25 12:43:54 +00:00
Chandler Carruth	ef82cf5b1e	Teach the function cloner (and thus the inliner) to simplify PHINodes aggressively. There are lots of dire warnings about this being expensive that seem to predate switching to the TrackingVH-based value remapper that is automatically updated on RAUW. This makes it easy to not just prune single-entry PHIs, but to fully simplify PHIs, and to recursively simplify the newly inlined code to propagate PHINode simplifications. This introduces a bit of a thorny problem though. We may end up simplifying a branch condition to a constant when we fold PHINodes, and we would like to nuke any dead blocks resulting from this so that time isn't wasted continually analyzing them, but this isn't easy. Deleting basic blocks after they are fully cloned and mapped into the new function currently requires manually updating the value map. The last piece of the simplification-during-inlining puzzle will require either switching to WeakVH mappings or some other piece of refactoring. I've left a FIXME in the testcase about this. llvm-svn: 153410	2012-03-25 10:34:54 +00:00
Eli Bendersky	f33086052d	Continue cleanup of LIT, getting rid of the remaining artifacts from dejagnu * Removed test/lib/llvm.exp - it is no longer needed * Deleted the dg.exp reading code from test/lit.cfg. There are no dg.exp files left in the test suite so this code is no longer required. test/lit.cfg is now much shorter and clearer * Removed a lot of duplicate code in lit.local.cfg files that need access to the root configuration, by adding a "root" attribute to the TestingConfig object. This attribute is dynamically computed to provide the same information as was previously provided by the custom getRoot functions. * Documented the config.root attribute in docs/CommandGuide/lit.pod llvm-svn: 153408	2012-03-25 09:02:19 +00:00
Chandler Carruth	2121199241	Move the instruction simplification of callsite arguments in the inliner to instead rely on much more generic and powerful instruction simplification in the function cloner (and thus inliner). This teaches the pruning function cloner to use instsimplify rather than just the constant folder to fold values during cloning. This can simplify a large number of things that constant folding alone cannot begin to touch. For example, it will realize that 'or' and 'and' instructions with certain constant operands actually become constants regardless of what their other operand is. It also can thread back through the caller to perform simplifications that are only possible by looking up a few levels. In particular, GEPs and pointer testing tend to fold much more heavily with this change. This should (in some cases) have a positive impact on compile times with optimizations on because the inliner itself will simply avoid cloning a great deal of code. It already attempted to prune proven-dead code, but now it will use the stronger simplifications to prove more code dead. llvm-svn: 153403	2012-03-25 04:03:40 +00:00
Chandler Carruth	bc3bc9df2f	FileCheck-ize this test. Note the FIXME I've introduced here: we've regressed seriously here, we are no longer removing allocas during inline cleanup. This appears to be because of lifetime markers "using" them. =/ I'll look into this shortly. llvm-svn: 153394	2012-03-24 21:24:19 +00:00
Hal Finkel	e44eb28807	Fix small-integer VAARG on SVR4 ABI PPC64. The PPC64 SVR4 ABI requires integer stack arguments, and thus the var. args., that are smaller than 64 bits be zero extended to 64 bits. llvm-svn: 153373	2012-03-24 03:53:55 +00:00
Rafael Espindola	ef9f5504ea	First part of PR12251. Add documentation and verifier support for the range metadata. llvm-svn: 153359	2012-03-24 00:14:51 +00:00
Dan Gohman	e3ed2b0699	Don't convert objc_retainAutoreleasedReturnValue to objc_retain if it is retaining the return value of an invoke that it immediately follows. llvm-svn: 153344	2012-03-23 18:09:00 +00:00
Dan Gohman	5c70fadc17	It's not possible to insert code immediately after an invoke in the same basic block, and it's not safe to insert code in the successor blocks if the edges are critical edges. Splitting those edges is possible, but undesirable, especially on the unwind side. Instead, make the bottom-up code motion to consider invokes to be part of their successor blocks, rather than part of their parent blocks, so that it doesn't push code past them and onto the edges. This fixes PR12307. llvm-svn: 153343	2012-03-23 17:47:54 +00:00
Andrew Trick	d97b83e320	Remove -enable-lsr-nested in time for 3.1. Test cases have been removed but attached to open PR12330. llvm-svn: 153286	2012-03-22 22:42:45 +00:00
Andrew Trick	f2c7af53f3	Convert -indvars tests that rely on SCEV expansion to -loop-reduce tests. llvm-svn: 153259	2012-03-22 17:10:07 +00:00
Andrew Trick	b4f08cd6df	Remove tests: indvars trivially preserves GEPs now. llvm-svn: 153258	2012-03-22 17:09:46 +00:00
Andrew Trick	a8242b6a58	Remove test: trivial canonical IV test which is covered by other SCEV tests. llvm-svn: 153257	2012-03-22 17:09:34 +00:00
Andrew Trick	bd11257df7	Test scalar evolution directly instead of testing the result of canonical indvars. llvm-svn: 153256	2012-03-22 17:09:31 +00:00
Andrew Trick	db149f9e73	Remove redundant -enable-iv-rewrite=false flags from test cases. llvm-svn: 153255	2012-03-22 17:09:04 +00:00
Silviu Baranga	4afd7d2316	Added soft fail checks for the disassembler when decoding some corner cases of the STRD, STRH, LDRD, LDRH, LDRSH and LDRSB instructions on ARM. llvm-svn: 153252	2012-03-22 14:14:49 +00:00
Silviu Baranga	d213f2111a	Added soft fail cases for the disassembler when decoding LDRSBT, LDRHT or LDRSHT instruction on ARM llvm-svn: 153251	2012-03-22 13:24:43 +00:00
Silviu Baranga	a6ea32afdd	Added soft fail cases for the disassembler when decoding MUL instructions on ARM. llvm-svn: 153250	2012-03-22 13:14:39 +00:00
Chandler Carruth	e26dafeb79	Revert a series of commits to MCJIT to get the build working in CMake (and hopefully on Windows). The bots have been down most of the day because of this, and it's not clear to me what all will be required to fix it. The commits started with r153205, then r153207, r153208, and r153221. The first commit seems to be the real culprit, but I couldn't revert a smaller number of patches. When resubmitting, r153207 and r153208 should be folded into r153205, they were simple build fixes. llvm-svn: 153241	2012-03-22 05:44:06 +00:00
Chad Rosier	6a63a74113	[fast-isel] Fold "urem x, pow2" -> "and x, pow2-1". This should fix the 271% execution-time regression for nsieve-bits on the ARMv7 -O0 -g nightly tester. This may also improve compile-time on architectures that would otherwise generate a libcall for urem (e.g., ARM) or fall back to the DAG selector. rdar://10810716 llvm-svn: 153230	2012-03-22 00:21:17 +00:00
Andrew Trick	267b57de6f	misched: tag a few XFAILs that I plan to fix llvm-svn: 153222	2012-03-21 22:31:31 +00:00
Danil Malyshev	70186bef8b	Re-factored RuntimeDyld. Added ExecutionEngine/MCJIT tests. llvm-svn: 153221	2012-03-21 21:06:29 +00:00
Kevin Enderby	7e7d5eefb2	Fix ARM disassembly of VST1 and VST2 instructions with writeback. And add test case for all opcodes handled by DecodeVSTInstruction() in ARMDisassembler.cpp. llvm-svn: 153218	2012-03-21 20:54:32 +00:00
Joerg Sonnenberger	5463e66768	Fix generation of the address size override prefix. Add assertions for the invalid cases. At least 16bit operand in 64bit mode is currently not rejected in the parser. llvm-svn: 153166	2012-03-21 05:48:07 +00:00
Andrew Trick	e357cfa3db	I meant to disable this test, not XFAIL it llvm-svn: 153165	2012-03-21 05:18:53 +00:00
Andrew Trick	f0a517fec8	misched: beginning to add unit tests llvm-svn: 153163	2012-03-21 04:12:19 +00:00
Akira Hatanaka	0137dfe42a	Incremental big endian patch by Jack Carter. These changes allow us to compile big endian from the command line for 32 bit Mips targets. This patch will result in code and data actually being produced in the correct endianness. llvm-svn: 153153	2012-03-21 00:52:01 +00:00
Chad Rosier	cbf45a6d8a	Fix test case from r153135. llvm-svn: 153140	2012-03-20 21:49:54 +00:00
Chad Rosier	4106917355	[avx] Add patterns for combining vextractf128 + vmovaps/vmovups/vmovdqu to vextractf128 with 128-bit mem dest. Combines vextractf128 $0, %ymm0, %xmm0 vmovaps %xmm0, (%rdi) to vextractf128 $0, %ymm0, (%rdi) rdar://11082570 llvm-svn: 153139	2012-03-20 21:43:40 +00:00
Jim Grosbach	1283317db4	Assembler should accept redefinitions of unused variable symbols. rdar://11027851 llvm-svn: 153137	2012-03-20 21:33:21 +00:00
Andrew Trick	f7711010e1	LoopSimplify bug fix. Handle indirect loop back edges. Do not call SplitBlockPredecessors on a loop preheader when one of the predecessors is an indirectbr. Otherwise, you will hit this assert: !isa<IndirectBrInst>(Preds[i]->getTerminator()) && "Cannot split an edge from an IndirectBrInst" llvm-svn: 153134	2012-03-20 21:24:52 +00:00
Andrew Trick	9c45706baf	LSR: teach isSimplifiedLoopNest to handle PHI IVUsers. llvm-svn: 153132	2012-03-20 21:24:44 +00:00
Andrew Trick	3660735e18	LSR: fix IVUsers isSimplifiedLoopNest to perform a full domtree walk instead of skipping the current loop. My prior fix was incomplete because of an overzealous compile-time optimization: Better fix for: <rdar://problem/11049788> Segmentation fault: 11 in LoopStrengthReduce llvm-svn: 153131	2012-03-20 21:24:40 +00:00
Chad Rosier	5a6011267a	[avx] Move the vextractf128 patterns closer to the vextractf128 def. Remove whitespace from test case. No functional change intended. llvm-svn: 153103	2012-03-20 18:24:55 +00:00
Kevin Enderby	816ca27ef6	Fix assembling ARM vst2 instructions with double-spaced registers. llvm-svn: 153099	2012-03-20 17:41:51 +00:00
Jim Grosbach	997614f597	ARM non-scattered MachO relocations for movw/movt. Needed when building -mdynamic-no-pic code. rdar://10459256 llvm-svn: 153097	2012-03-20 17:25:45 +00:00
Chad Rosier	58a7c9fd3e	Fix test. llvm-svn: 153095	2012-03-20 17:20:46 +00:00
Chad Rosier	07a4cb9382	[avx] Adjust the VINSERTF128rm pattern to allow for unaligned loads. This results in things such as vmovups 16(%rdi), %xmm0 vinsertf128 $1, %xmm0, %ymm0, %ymm0 to be combined to vinsertf128 $1, 16(%rdi), %ymm0, %ymm0 rdar://11076953 llvm-svn: 153092	2012-03-20 17:08:51 +00:00
Silviu Baranga	32a49333ec	The ARM instructions that have an unpredictable behavior when the pc register operand is given now fail with soft fail. Modified the regression tests to reflect this. llvm-svn: 153089	2012-03-20 15:54:56 +00:00
Bill Wendling	7315c4b9cd	It's possible to have a constant expression whose size is quite big (e.g., i128). In that case, we may not be able to print out the MCExpr as an expression. For instance, we could have an MCExpr like this: 0xBEEF0000BEEF0000 | (0xBEEF0000BEEF0000 << 64) The MCExpr printer handles sizes up to 64-bits, but this expression would require 128-bits. In this situation, try to evaluate the constant expression and emit that as the value into 64-bit chunks. <rdar://problem/11070338> llvm-svn: 153081	2012-03-20 08:56:43 +00:00
Anton Korobeynikov	3edd854d64	Perform mul combine when multiplying with negative constants. Patch by Weiming Zhao! This fixes PR12212 llvm-svn: 153049	2012-03-19 19:19:50 +00:00
NAKAMURA Takumi	bed1cb1e13	llvm/test/DebugInfo: Move two tests to DebugInfo/X86. They are X86-dependent. llvm-svn: 153038	2012-03-19 16:16:03 +00:00
Preston Gurd	48ccc4df0b	This patch adds X86 instruction itineraries for non-pseudo opcodes in X86InstrCompiler.td. It also adds -mcpu=generic to the legalize-shift-64.ll test so the test will pass if run on an Intel Atom CPU, which would otherwise produce an instruction schedule which differs from that which the test expects. llvm-svn: 153033	2012-03-19 14:10:12 +00:00
Nick Lewycky	fa30607eca	Factor out the multiply analysis code in ComputeMaskedBits and apply it to the overflow checking multiply intrinsic as well. Add a test for this, updating the test from grep to FileCheck. llvm-svn: 153028	2012-03-18 23:28:48 +00:00
Jim Grosbach	2c8e0ac85c	MC asm parser macro argument count was wrong when empty. evaluated to '1' when the argument list was empty (should be '0'). rdar://11057257 llvm-svn: 152967	2012-03-17 00:11:42 +00:00
Jim Grosbach	905686a82a	ARM ldm/stm register lists can be out of order. It's not a good style idea, as the registers will be laid down in memory in numerical order, not the order they're in the list, but it's legal. vldm/vstm are stricter. rdar://11064740 llvm-svn: 152943	2012-03-16 20:48:38 +00:00
Bill Wendling	55b6b2b6a9	Revert r152907. llvm-svn: 152935	2012-03-16 18:20:54 +00:00
Bill Wendling	a2a26b546c	The alignment of the pointer part of the store instruction may have an alignment. If that's the case, then we want to make sure that we don't increase the alignment of the store instruction. Because if we increase it to be "more aligned" than the pointer, code-gen may use instructions which require a greater alignment than the pointer guarantees. <rdar://problem/11043589> llvm-svn: 152907	2012-03-16 07:40:08 +00:00
Chandler Carruth	b37fc13a36	Rip out support for 'llvm.noinline'. This thing has a strange history... It was added in 2007 as the first cut at supporting no-inline attributes, but we didn't have function attributes of any form at the time. However, it was added without any mention in the LangRef or other documentation. Later on, in 2008, Devang added function notes for 'inline=never' and then turned them into proper function attributes. From that point onward, as far as I can tell, the world moved on, and no one has touched 'llvm.noinline' in any meaningful way since. Its time has now come. We have had better mechanisms for doing this for a long time, all the frontends I'm aware of use them, and this is just holding back progress. Given that it was never a documented feature of the IR, I've provided no auto-upgrade support. If people know of real, in-the-wild bitcode that relies on this, yell at me and I'll add it, but I seriously doubt anyone cares. llvm-svn: 152904	2012-03-16 06:10:15 +00:00
Andrew Trick	070e540a3e	LSR fix: Add isSimplifiedLoopNest to IVUsers analysis. Only record IVUsers that are dominated by simplified loop headers. Otherwise SCEVExpander will crash while looking for a preheader. I previously tried to work around this in LSR itself, but that was insufficient. This way, LSR can continue to run if some uses are not in simple loops, as long as we don't attempt to analyze those users. Fixes <rdar://problem/11049788> Segmentation fault: 11 in LoopStrengthReduce llvm-svn: 152892	2012-03-16 03:16:56 +00:00
Eli Friedman	e06535b2f6	In InstCombiner::visitOr, make sure we reverse the operand swap used for checking for or-of-xor operations after those checks; a later check expects that any constant will be in Op1. PR12234. llvm-svn: 152884	2012-03-16 00:52:42 +00:00
Jim Grosbach	7cb9a13b02	ARM optional operand on MRC/MCR assembly instructions. rdar://11058464 llvm-svn: 152883	2012-03-16 00:45:58 +00:00
Jim Grosbach	24d90e2ddc	ARM vmrs system registers mvfr0 and mvfr1 handling. rdar://11058464 llvm-svn: 152881	2012-03-16 00:27:18 +00:00
Eric Christopher	a4a0cf8394	Do the right thing on NULL uint64 fields. Patch by Clemens Hammacher! Fixes PR12243 llvm-svn: 152880	2012-03-16 00:21:54 +00:00
Eric Christopher	7734ca2891	For types with a parent of the compile unit make sure and emit the DECL information. rdar://10855921 llvm-svn: 152876	2012-03-15 23:55:40 +00:00
Chad Rosier	26d05887d9	[fast-isel] Address Eli's comments for r152847. Specifically, add a test case and still allow immediate encoding, just not with cmn. rdar://11038907 llvm-svn: 152869	2012-03-15 22:54:20 +00:00
Jim Grosbach	d28888dd77	ARM case-insensitive checking for APSR_nzcv. rdar://11056591 llvm-svn: 152846	2012-03-15 21:34:14 +00:00
Matt Beaumont-Gay	18abf74edd	line endings llvm-svn: 152832	2012-03-15 20:24:29 +00:00
Lang Hames	c35ee8b54a	Use vmov.f32 to materialize f32 consts on ARM. This relaxes constraints on register allocation by allowing all 32 D-registers to be used. Patch by Cameron Zwarich. llvm-svn: 152824	2012-03-15 18:49:02 +00:00
Kristof Beyls	327d2f9da5	Fix VCVT decoding (between floating-point and fixed-point, Floating-point). Patch by Richard Barton. llvm-svn: 152814	2012-03-15 17:50:29 +00:00
Rafael Espindola	f58927855b	Short term fix for pr12270 before we change dominates to handle unreachable code. While here, reduce indentation. llvm-svn: 152803	2012-03-15 15:52:59 +00:00
Nadav Rotem	6fd1d32c63	When optimizing certain BUILD_VECTOR nodes into other BUILD_VECTOR nodes, add the new node into the work list because there is a potential for further optimizations. llvm-svn: 152784	2012-03-15 08:49:06 +00:00
Eric Christopher	7dd54fb695	Revert the removal of DW_AT_MIPS_linkage_name when we aren't putting out the DW_AT_name. Older gdbs unfortunately still use it to disambiguate member functions in templated classes (gdb.cp/templates.exp). rdar://11043421 (which is now deferred for a bit) llvm-svn: 152782	2012-03-15 08:19:33 +00:00
Chad Rosier	b9b73170e3	[avx] Add patterns for VINSERTF128rm. This results in things such as vmovaps -96(%rbx), %xmm1 vinsertf128 $1, %xmm1, %ymm0, %ymm0 to be combined to vinsertf128 $1, -96(%rbx), %ymm0, %ymm0 rdar://10643481 llvm-svn: 152762	2012-03-15 00:45:30 +00:00
Aaron Ballman	a733297fa6	Fixed a transform crash when setting a negative size value for memset. Fixes PR12202. llvm-svn: 152756	2012-03-15 00:05:31 +00:00
Chandler Carruth	4d1d34fbfc	Extend the inline cost calculation to account for bonuses due to correlated pairs of pointer arguments at the callsite. This is designed to recognize the common C++ idiom of begin/end pointer pairs when the end pointer is a constant offset from the begin pointer. With the C-based idiom of a pointer and size, the inline cost saw the constant size calculation, and this provides the same level of information for begin/end pairs. In order to propagate this information we have to search for candidate operations on a pair of pointer function arguments (or derived from them) which would be simplified if the pointers had a known constant offset. Then the callsite analysis looks for such pointer pairs in the argument list, and applies the appropriate bonus. This helps LLVM detect that half of bounds-checked STL algorithms (such as hash_combine_range, and some hybrid sort implementations) disappear when inlined with a constant size input. However, it's not a complete fix due to the inaccuracy of our cost metric for constants in general. I'm looking into that next. Benchmarks showed no significant code size change, and very minor performance changes. However, specific code such as hashing is showing significantly cleaner inlining decisions. llvm-svn: 152752	2012-03-14 23:19:53 +00:00
Dan Gohman	532fb8131b	When an invoke is marked with metadata indicating its unwind edge should be ignored by ARC optimization, don't insert new ARC runtime calls in the unwind destination. llvm-svn: 152748	2012-03-14 23:05:06 +00:00
Eric Christopher	a9916d0296	Remove the DW_AT_MIPS_linkage name attribute when we don't need it output (we're emitting a specification already and the information isn't changing). Saves 1% on the debug information for a build of llvm. Fixes rdar://11043421 llvm-svn: 152697	2012-03-14 02:59:17 +00:00
Evan Cheng	7bf83096df	DAG combine incorrectly optimizes (i32 vextract (v4i16 load $addr), c) to (i16 load $addr+c*sizeof(i16)) and replaces uses of (i32 vextract) with the i16 load. It should issue an extload instead: (i32 extload $addr+c*sizeof(i16)). rdar://11035895 llvm-svn: 152675	2012-03-13 22:00:52 +00:00
Kevin Enderby	1ef22f33d0	Change the X86 assembler to not require a segment register on string instruction's destination operand like it does for the source operand. Also fix a typo in the comment for X86AsmParser::isSrcOp(). llvm-svn: 152654	2012-03-13 19:47:55 +00:00
Chris Lattner	87fa77bd8a	enhance jump threading to preserve TBAA information when PRE'ing loads, fixing rdar://11039258, an issue that came up when inspecting clang's bootstrapped codegen. llvm-svn: 152635	2012-03-13 18:07:41 +00:00
Dan Gohman	eab06fa3c9	Teach globalopt how to evaluate an invoke with a non-void return type. llvm-svn: 152634	2012-03-13 18:01:37 +00:00
Duncan Sands	395ac42dd2	Generalize the "trunc(ptrtoint(x)) - trunc(ptrtoint(y)) -> trunc(ptrtoint(x-y))" optimization introduced by Chandler. llvm-svn: 152626	2012-03-13 14:07:05 +00:00
Eli Friedman	c8cbd06947	Fix regression from r151466: we can't replace uses of an instruction reachable from the entry block with uses of an instruction not reachable from the entry block. PR12231. llvm-svn: 152595	2012-03-13 01:06:07 +00:00
Kevin Enderby	987cef1fe2	Change the second line of the test added for r152414 to use CHECK-NEXT. Suggestion by Bill Wendling! llvm-svn: 152582	2012-03-12 21:38:09 +00:00
Kevin Enderby	fb3110b5d2	Added a missing error check for X86 assembly with mismatched base and index registers not both being 64-bit or both being 32-bit registers. llvm-svn: 152580	2012-03-12 21:32:09 +00:00
Kostya Serebryany	afbb65dee7	[asan] move x86-specific test to a separate X86 directory with a custom lit.local.cfg file llvm-svn: 152567	2012-03-12 18:49:11 +00:00
Chandler Carruth	595fda8466	When inlining a function and adding its inner call sites to the candidate set for subsequent inlining, try to simplify the arguments to the inner call site now that inlining has been performed. The goal here is to propagate and fold constants through deeply nested call chains. Without doing this, we lose the inliner bonus that should be applied because the arguments don't match the exact pattern the cost estimator uses. Reviewed on IRC by Benjamin Kramer. llvm-svn: 152556	2012-03-12 11:19:33 +00:00
Chandler Carruth	a0796555e2	Teach instsimplify how to constant fold pointer differences. Typically instcombine has handled this, but pointer differences show up in several contexts where we would like to get constant folding, and cannot afford to run instcombine. Specifically, I'm working on improving the constant folding of arguments used in inline cost analysis with instsimplify. Doing this in instsimplify implies some algorithm changes. We have to handle multiple layers of all-constant GEPs because instsimplify cannot fold them into a single GEP the way instcombine can. Also, we're only interested in all-constant GEPs. The result is that this doesn't really replace the instcombine logic, it's just complementary and focused on constant folding. Reviewed on IRC by Benjamin Kramer. llvm-svn: 152555	2012-03-12 11:19:31 +00:00
Chandler Carruth	6242a0f771	FileCheck-ize this test. llvm-svn: 152554	2012-03-12 11:19:28 +00:00
Andrew Trick	61d277f146	Move llc + target triple tests into X86 llvm-svn: 152502	2012-03-10 19:03:51 +00:00
Benjamin Kramer	fee6372daa	Don't try to filecheck bitcode. llvm-svn: 152498	2012-03-10 18:07:46 +00:00
Bill Wendling	0624d2a1ec	Make this transformation slightly less aggressive and more correct. The 'CmpInst::isFalseWhenEqual' function returns 'false' for values other than simply equality. For instance, it returns 'false' for <= or >=. This isn't the correct behavior for this transformation, which is checking for strict equality and non-equality. It was causing the gcc.c-torture/execute/frame-address.c test to fail because it would completely (and incorrectly) optimize a whole function into a 'ret i32 0'. llvm-svn: 152497	2012-03-10 17:56:03 +00:00
Bill Wendling	ebb10df441	Fix disasm of iret, sysexit, and sysret when displayed with Intel syntax. Patch by Kay Tiong Khoo! llvm-svn: 152487	2012-03-10 07:37:27 +00:00
Kevin Enderby	deed5aaa41	Add the missing call to Error when a bad X86 scale expression is parsed. llvm-svn: 152443	2012-03-09 22:24:10 +00:00
David Meyer	6c614bf717	Support reading GNU symbol versions in ELFObjectFile * Add enums and structures for GNU version information. * Implement extraction of that information on a per-symbol basis (ELFObjectFile::getSymbolVersion). * Implement a generic interface, GetELFSymbolVersion(), for getting the symbol version from the ObjectFile (hides the templating). * Have llvm-readobj print out the version, when available. * Add a test for the new feature: readobj-elf-versioning.test llvm-svn: 152436	2012-03-09 20:59:52 +00:00
Dan Gohman	500b598c5c	When identifying exit nodes for the reverse-CFG reverse-post-order traversal, consider nodes for which the only successors are backedges which the traversal is ignoring to be exit nodes. This fixes a problem where the bottom-up traversal was failing to visit split blocks along split loop backedges. This fixes rdar://10989035. llvm-svn: 152421	2012-03-09 18:50:52 +00:00
Kevin Enderby	014e1cde5f	Fix the x86 disassembler to at least print the lock prefix if it is the first prefix. Added a FIXME to remind us this still does not work when it is not the first prefix. llvm-svn: 152414	2012-03-09 17:52:49 +00:00
NAKAMURA Takumi	aebd3da46d	test/MC/X86/lit.local.cfg: Fix up to detect 'X86' in targets. llvm-svn: 152406	2012-03-09 14:52:38 +00:00
Duncan Sands	cca89124a2	Eliminate switch cases that can never match, for example removes all negative switch cases if the branch condition is known to be positive. Inspired by a recent improvement to GCC's VRP. llvm-svn: 152405	2012-03-09 13:45:18 +00:00
Chandler Carruth	783b7198b7	Undo a previous restriction on the inline cost calculation which Nick introduced. Specifically, there are cost reductions for all constant-operand icmp instructions against an alloca, regardless of whether the alloca will in fact be eligible for SROA. That means we don't want to abort the icmp reduction computation when we abort the SROA reduction computation. That in turn frees us from the need to keep a separate worklist and defer the ICmp calculations. Use this new-found freedom and some judicious function boundaries to factor the innards of computing the cost factor of any given instruction out of the loop over the instructions and into static helper functions. This greatly simplifies the code, and hopefully makes it more clear what is happening here. Reviewed by Eric Christopher. There is some concern that we'd like to ensure this doesn't get out of hand, and I plan to benchmark the effects of this change over the next few days along with some further fixes to the inline cost. llvm-svn: 152368	2012-03-09 02:49:36 +00:00
Chad Rosier	a281afc676	Fix a regression from r147481. Original commit message from r147481: DAGCombine for transforming 128->256 casts into a vmovaps, rather than a vxorps + vinsertf128 pair if the original vector came from a load. Fix: Unaligned loads need to generate a vmovups. rdar://10974078 llvm-svn: 152366	2012-03-09 02:00:48 +00:00
Benjamin Kramer	0ef86b0ea3	Remove the no longer existent psp triple from a test. The test fell back to the C backend, making it useless and it started to fail on configurations that don't build the C backend. llvm-svn: 152342	2012-03-08 21:22:27 +00:00
Akira Hatanaka	d60cb3822f	Test case for r152280, r152285 and r152290. llvm-svn: 152292	2012-03-08 03:32:42 +00:00
Rafael Espindola	bdd1258784	Use llvm-mc instead of llc. Patch by Jack Carter. llvm-svn: 152242	2012-03-07 20:58:59 +00:00
Jakob Stoklund Olesen	aa0f752fc8	Fix infinite loop in nested multiclasses. Patch by Michael Liao! llvm-svn: 152232	2012-03-07 16:39:35 +00:00
Eric Christopher	54cf8ff45e	Add the DW_AT_APPLE_runtime_class attribute to forward declarations as well as completely defined classes. This fixes rdar://10956070 llvm-svn: 152171	2012-03-07 00:15:19 +00:00
Evan Cheng	80893ce5f5	Extend r148086 to check for [r +/- reg] address mode. This fixes queens performance regression (due to increased register pressure from overly aggressive pre-inc formation). llvm-svn: 152162	2012-03-06 23:33:32 +00:00
Eli Friedman	de850676e0	Fix the operand ordering on aliases for shld and shrd. PR12173, part 2. llvm-svn: 152136	2012-03-06 19:58:46 +00:00
Kevin Enderby	520eb3ba8a	Fix a bug in the ARM disassembly of the neon VLD2 all lanes instruction. llvm-svn: 152127	2012-03-06 18:33:12 +00:00
Jakob Stoklund Olesen	d9b427ee65	Add <imp-def> operands when reloading into physregs. When an instruction only writes sub-registers, it is still necessary to add an <imp-def> operand for the super-register. When reloading into a virtual register, rewriting will add the operand, but when loading directly into a physical register, the <imp-def> operand is still necessary. llvm-svn: 152095	2012-03-06 02:48:17 +00:00
Lang Hames	718cfbe05a	Split fpscr into two registers: FPSCR and FPSCR_NZCV. The fpscr register contains both flags (set by FP operations/comparisons) and control bits. The control bits (FPSCR) should be reserved, since they're always available and needn't be defined before use. The flag bits (FPSCR_NZCV) should be unreserved so they can be hoisted by MachineCSE. This fixes PR12165. llvm-svn: 152076	2012-03-06 00:19:55 +00:00
Jim Grosbach	8dc347fc27	ARM vpush/vpop assembler mnemonics accept an optional size suffix. rdar://10988114 llvm-svn: 152068	2012-03-05 23:16:31 +00:00
Eli Friedman	a8b75ac798	Make sure we don't return bits outside the mask in ComputeMaskedBits. PR12189. llvm-svn: 152066	2012-03-05 23:09:40 +00:00
Jakob Stoklund Olesen	fcd435ee73	Remove a test case that no longer makes sense. This was testing the handling of sub-register coalescing followed by remat. The original problem was caused by the extra <imp-def> operands added by sub-register coalescing. Those <imp-def> operands are not added any longer, and the test case passes even when the original patch is reverted. llvm-svn: 152040	2012-03-05 19:10:13 +00:00
Sebastian Pop	957a6583f1	updated patch for the ARM fused multiply add/sub In this update: - I assumed neon2 does not imply vfpv4, but neon and vfpv4 imply neon2. - I kept setting .fpu=neon-vfpv4 code attribute because that is what the assembler understands. Patch by Ana Pazos <apazos@codeaurora.org> llvm-svn: 152036	2012-03-05 17:39:52 +00:00
Eli Friedman	a5a6d6aa8f	Make aliases for shld and shrd match gas. PR12173. llvm-svn: 152014	2012-03-05 04:31:54 +00:00
Jakob Stoklund Olesen	f729ceae04	Use <def,undef> operands when spilling NEON bundles. MachineOperands that define part of a virtual register must have an <undef> flag if they are not intended as read-modify-write operands. The old trick of adding an <imp-def> operand doesn't work any longer. Fixes PR12177. llvm-svn: 152008	2012-03-04 18:40:30 +00:00
Duncan Sands	4d928e7dff	Nick pointed out on IRC that GVN's propagateEquality wasn't propagating equalities into phi node operands for which the equality is known to hold in the incoming basic block. That's because replaceAllDominatedUsesWith wasn't handling phi nodes correctly in general (that this didn't give wrong results was just luck: the specific way GVN uses replaceAllDominatedUsesWith precluded wrong changes to phi nodes). llvm-svn: 152006	2012-03-04 13:25:19 +00:00
Bill Wendling	97b9359623	Do trivial CSE of dead BBs during codegen preparation. Some BBs can become dead after codegen preparation. If we delete them here, it could help enable tail-call optimizations later on. <rdar://problem/10256573> llvm-svn: 152002	2012-03-04 10:46:01 +00:00
Jakob Stoklund Olesen	a0bd36e3bc	Fix RA-dependent test. llvm-svn: 151958	2012-03-03 00:26:30 +00:00
Benjamin Kramer	d9d80b1dde	LVI: Recognize the form instcombine canonicalizes range checks into when forming constant ranges. This could probably be made a lot smarter, but this is a common case and doesn't require LVI to scan a lot of code. With this change CVP can optimize away the "shift == 0" case in Hashing.h that only gets hit when "shift" is in a range not containing 0. llvm-svn: 151919	2012-03-02 15:34:43 +00:00
Chad Rosier	f5e086f18e	Prevent obscure and incorrect tail-call optimization. In this instance we are generating the tail-call during legalizeDAG. The 2nd floor call can't be a tail call because it clobbers %xmm1, which is defined by the first floor call. The first floor call can't be a tail-call because it's not in the tail position. The only reasonable way I could think to fix this in a target-independent manner was to check for glue logic on the copy reg. rdar://10930395 llvm-svn: 151877	2012-03-02 02:50:46 +00:00
Eric Christopher	7524fe4551	Revert "Reorder the sections being output to reduce the number of assembler" The inline table needs to be constructed ahead of time so that it doesn't try to create new strings while we're emitting everything. This reverts commit a8ff9bccb399183cdd5f1c3cec2bda763664b4b0. llvm-svn: 151864	2012-03-02 00:30:24 +00:00
Evan Cheng	d12af5dc69	Neuter the optimization I implemented with r107852 and r108258 which turn some floating point equality comparisons into integer ones with -ffast-math. The issue is the optimization causes +0.0 != -0.0. Now the optimization is only done when one side is known to be 0.0. The other side's sign bit is masked off for the comparison. rdar://10964603 llvm-svn: 151861	2012-03-01 23:27:13 +00:00
Eric Christopher	66b0721014	Reorder the sections being output to reduce the number of assembler fixups that are being used to determine section offsets. Reduces the total number of fixups by 50% for a non-trivial testcase. Part of rdar://10413936 llvm-svn: 151852	2012-03-01 22:50:31 +00:00
David Meyer	c429b80da1	[Object] Add ObjectFile::getLoadName() for retrieving the soname/installname of a shared object. llvm-svn: 151845	2012-03-01 22:19:54 +00:00
Kevin Enderby	f0269b4270	Change ARMInstPrinter::printPredicateOperand() so it will not abort if it runs into the undefined 15 condition code value. llvm-svn: 151844	2012-03-01 22:13:02 +00:00
Akira Hatanaka	6bbe1f0d10	Fix bugs which were introduced when support for base+index floating point loads and stores was added. - SelectAddr should return false if Parent is an unaligned f32 load or store. - Only aligned load and store nodes should be matched to select reg+imm floating point instructions. - MIPS does not have support for f64 unaligned load or store instructions. llvm-svn: 151843	2012-03-01 22:12:30 +00:00
Preston Gurd	be1c875a1c	Trivial change to make the test use -mcpu=generic, so that the test will not fail when run on an Intel Atom processor, due to the Atom scheduler producing an instruction sequence that is different from that which is normally expected. llvm-svn: 151832	2012-03-01 19:57:20 +00:00
Chad Rosier	2913f500fa	Revert r151816 as Jim has the appropriate fix. llvm-svn: 151818	2012-03-01 17:41:19 +00:00
Chad Rosier	f0208ed76a	Fix testcases from r151807. llvm-svn: 151816	2012-03-01 17:31:30 +00:00
Jim Grosbach	394ad59d90	Add missing triple for tests. Make darwin bots happier. llvm-svn: 151813	2012-03-01 17:30:32 +00:00
James Molloy	f6298e9281	Fix a codegen fault in which log2 or exp2 could be dead-code eliminated even though they could have side effects. Only allow log2/exp2 to be converted to an intrinsic if they are declared "readnone". llvm-svn: 151807	2012-03-01 14:32:18 +00:00
NAKAMURA Takumi	74e736f0eb	llvm/test/CMakeLists.txt: Update dependencies to add llvm-readobj to "check". llvm-svn: 151795	2012-03-01 03:14:13 +00:00
David Meyer	2fc34c5f84	[Object] * Add begin_dynamic_table() / end_dynamic_table() private interface to ELFObjectFile. * Add begin_libraries_needed() / end_libraries_needed() interface to ObjectFile, for grabbing the list of needed libraries for a shared object or dynamic executable. * Implement this new interface completely for ELF, leave stubs for COFF and MachO. * Add 'llvm-readobj' tool for dumping ObjectFile information. llvm-svn: 151785	2012-03-01 01:36:50 +00:00
Lang Hames	76e66c31a0	Don't redundantly copy implicit operands when rematerializing. While we're at it - don't copy vreg implicit operands while rematerializing. This fixes PR12138. llvm-svn: 151779	2012-03-01 00:41:17 +00:00
Richard Trieu	37ddc0fab6	Fix flags for test in MC/MachO/ARM/empty-function-nop.ll llvm-svn: 151778	2012-03-01 00:29:09 +00:00
Benjamin Kramer	d05a0c6c42	LegalizeIntegerTypes: Reorder operations in the "big shift by small amount" optimization, making the lives of later passes easier. llvm-svn: 151722	2012-02-29 13:27:00 +00:00
Duncan Sands	bb2fe65542	Have GVN also do condition propagation when the right-hand side is not a constant. This fixes PR1768. llvm-svn: 151713	2012-02-29 11:12:03 +00:00
Bill Wendling	7f9f5680ca	Testcase for r151691. llvm-svn: 151694	2012-02-29 01:53:13 +00:00
Jim Grosbach	617f84ddbd	ARM: implement TargetInstrInfo::getNoopForMachoTarget(). Without this hook, functions w/ a completely empty body (including no epilogue) will cause an MCEmitter assertion failure. For example: define internal fastcc void @empty_function() { unreachable }. rdar://10947471 llvm-svn: 151673	2012-02-28 23:53:30 +00:00
David Meyer	1df4b84db4	In the ObjectFile interface, replace isInternal(), isAbsolute(), isGlobal(), and isWeak(), with a bitset of flags. llvm-svn: 151670	2012-02-28 23:47:53 +00:00
Rafael Espindola	c22c85c29c	On ELF, create relocations to the abbreviation and line sections when producing debug info for assembly files. We were already doing the right thing when producing debug info for C/C++. ELF linkers don't know dwarf, so they depend on these relocations to produce valid dwarf output. llvm-svn: 151655	2012-02-28 21:13:05 +00:00
Benjamin Kramer	0c281a7deb	LegalizeIntegerTypes: Reenable the large shift with small amount optimization. To avoid problems with zero shifts when getting the bits that move between words we use a trick: first shift by amount-1, then do another shift by one. When amount is 0 (and size 32) we first shift by 31, then by one, instead of by 32. Also fix a latent bug that emitted the low and high words in the wrong order when shifting right. Fixes PR12113. llvm-svn: 151637	2012-02-28 17:58:00 +00:00
Daniel Dunbar	ee7b899343	Revert r151623 "Some ARM implementations, e.g. A-series, do return stack prediction. ...", it is breaking the Clang build during the Compiler-RT part. llvm-svn: 151630	2012-02-28 15:36:07 +00:00
Nadav Rotem	875e463b19	Fix a bug in the code that builds SDNodes from vector GEPs. When the GEP index is a vector of pointers, the code that calculated the size of the element started from the vector type, and not the contained pointer type. As a result, instead of looking at the data element pointed by the vector, this code used the size of the vector. This works for 32bit members (on 32bit systems), but not for other types. Added code to peel the vector type and added a test. llvm-svn: 151626	2012-02-28 11:54:05 +00:00
Evan Cheng	87c7b09d8d	Some ARM implementations, e.g. A-series, do return stack prediction. That is, the processor keeps a return address stack (RAS) which stores the address and the instruction execution state of the instruction after a function-call type branch instruction. Calling a "noreturn" function with normal call instructions (e.g. bl) can corrupt the RAS and cause 100% return misprediction, so LLVM should use an unconditional branch instead, i.e. "mov lr, pc" followed by "b _foo". The "mov lr, pc" is issued in order to get a proper backtrace. rdar://8979299 llvm-svn: 151623	2012-02-28 06:42:03 +00:00
Pete Cooper	39b5255df4	Reverted r151620 - DSE: Shorten memset when a later store overwrites the start of it. There were all sorts of buildbot issues. llvm-svn: 151621	2012-02-28 05:06:24 +00:00
Pete Cooper	f3862f91de	DSE: Shorten memset when a later store overwrites the start of it llvm-svn: 151620	2012-02-28 04:27:10 +00:00
Akira Hatanaka	330d901ce3	Add support for floating point base register + offset register addressing mode load and store instructions. llvm-svn: 151611	2012-02-28 02:55:02 +00:00
Jakob Stoklund Olesen	4c5ad2b812	Handle regmasks in MachineCSE. Don't attempt to extend physreg live ranges across calls. <rdar://problem/10942095> llvm-svn: 151610	2012-02-28 02:08:50 +00:00
Jakob Stoklund Olesen	92c15b2b2c	Enable ARM base pointer when calling functions with large arguments. When an outgoing call takes more than 2k of arguments on the stack, we don't allocate that call frame in the prolog, but adjust the stack pointer immediately before the call instead. This causes problems with the emergency spill slot because PEI can't track stack pointer adjustments on the second pass, and if the outgoing arguments are too big, SP can't be used to reach the emergency spill slot at all. Work around these problems by ensuring there is a base or frame pointer that can be used to access the emergency spill slot. <rdar://problem/10917166> llvm-svn: 151604	2012-02-28 01:15:01 +00:00
Michael J. Spencer	8c4729fd44	[Object] Add {begin,end}_dynamic_symbols stubs and implementation for ELF. Add -D option to llvm-nm to dump dynamic symbols. Patch by David Meyer. llvm-svn: 151600	2012-02-28 00:40:37 +00:00
Bill Wendling	2b3f61af18	Add back removed code. It still causes LLVM to miscompile. But not having it breaks other things. llvm-svn: 151594	2012-02-27 23:48:30 +00:00
Preston Gurd	43b2506e32	test commit. llvm-svn: 151588	2012-02-27 23:31:51 +00:00

16835 Commits