llvm-project

Commit Graph

Author	SHA1	Message	Date
Jyotsna Verma	50ca6dd8a7	Hexagon: Use multiclass for absolute addressing mode stores. llvm-svn: 174412	2013-02-05 18:15:34 +00:00
Jakob Stoklund Olesen	eb1084ee54	Add a test case for PR14750. This was fixed by r174402. llvm-svn: 174405	2013-02-05 18:04:15 +00:00
Derek Schuff	90aa1d8abe	[MC] Bundle alignment: Invalidate relaxed fragments Currently, when a fragment is relaxed, its size is modified, but its offset is not (it gets laid out as a side effect of checking whether it needs relaxation), then all subsequent fragments are invalidated because their offsets need to change. When bundling is enabled, relaxed fragments need to get laid out again, because the increase in size may push it over a bundle boundary. So instead of only invalidating subsequent fragments, also invalidate the fragment that gets relaxed, which causes it to get laid out again. This patch also fixes some trailing whitespace and fixes the bundling-related debug output of MCFragments. llvm-svn: 174401	2013-02-05 17:55:27 +00:00
Tom Stellard	7d41161a2d	R600: Add tests for instruction predicates llvm-svn: 174393	2013-02-05 17:09:13 +00:00
Tom Stellard	2e5e7a5bef	R600: Emit function name in the AsmPrinter Emitting the function name allows us to check for it in the FileCheck tests so we can make sure FileCheck is checking the output of the correct function. llvm-svn: 174392	2013-02-05 17:09:11 +00:00
Jyotsna Verma	6f635b5488	Hexagon: Add V4 compare instructions. Enable relationship mapping for the existing instructions. llvm-svn: 174389	2013-02-05 16:42:24 +00:00
NAKAMURA Takumi	7ec43d9b37	Formatting. llvm-svn: 174380	2013-02-05 15:32:16 +00:00
NAKAMURA Takumi	6635fe56d3	llvm/test/Transforms/LoopVectorize/X86/vector_ptr_load_store.ll: "-debug" requires +Asserts. llvm-svn: 174379	2013-02-05 15:32:10 +00:00
Arnold Schwaighofer	22174f5d5a	Loop Vectorizer: Handle pointer stores/loads in getWidestType() In the loop vectorizer cost model, we used to ignore stores/loads of a pointer type when computing the widest type within a loop. This meant that if we had only stores/loads of pointers in a loop we would return a widest type of 8 bits (instead of 32 or 64 bit) and therefore a vector factor that was too big. Now, if we see a consecutive store/load of pointers we use the size of a pointer (from data layout). This problem occurred in SingleSource/Benchmarks/Shootout-C++/hash.cpp (reduced test case is the first test in vector_ptr_load_store.ll). radar://13139343 llvm-svn: 174377	2013-02-05 15:08:02 +00:00
NAKAMURA Takumi	3753b28cd2	Revert r174343, "When the target-independent DAGCombiner inferred a higher alignment for a load," It caused hangups in compiling clang/lib/Parse/ParseDecl.cpp and clang/lib/Driver/Tools.cpp in stage2 on some hosts. llvm-svn: 174374	2013-02-05 14:44:16 +00:00
Logan Chien	4b724429b8	Link .ARM.exidx with corresponding text section. The sh_link in the ELF section header of .ARM.exidx should be filled with the section index of the corresponding text section. llvm-svn: 174372	2013-02-05 14:18:59 +00:00
Arnold Schwaighofer	a804bbee9b	ARM cost model: Cost for scalar integer casts and floating point conversions Also adds some costs for vector integer float conversions. llvm-svn: 174371	2013-02-05 14:05:55 +00:00
Jack Carter	428a06cc75	This patch sets the Mips ELF header flag for MicroMips architectures. Contributor: Zoran Jovanovic llvm-svn: 174360	2013-02-05 09:30:03 +00:00
Jack Carter	9c1a027fe8	This patch sets the EmitAlias flag in td files and enables the instruction printer to print aliased instructions. Due to usage of RegisterOperands a change in common code (utils/TableGen/AsmWriterEmitter.cpp) is required to get the correct register value if it is a RegisterOperand. Contributor: Vladimir Medic llvm-svn: 174358	2013-02-05 08:32:10 +00:00
Eric Christopher	6a421a944d	Add support for testing the output of the abbrev table for the skeleton CU as part of the DWARF5 split dwarf proposal. llvm-svn: 174351	2013-02-05 07:32:00 +00:00
Eric Christopher	7a2cdf798b	Add support for emitting a stub DW_AT_GNU_dwo_id as part of the DWARF5 split dwarf proposal. llvm-svn: 174350	2013-02-05 07:31:55 +00:00
Michael Gottesman	e2376cdf71	Add code to GlobalVariable.h so that global variables marked as externally_initialized return false for hasDefiniteInitializer and hasUniqueInitializer. rdar://12580965. llvm-svn: 174345	2013-02-05 06:53:26 +00:00
Owen Anderson	a47fdbb032	When the target-independent DAGCombiner inferred a higher alignment for a load, it would replace the load with one with the higher alignment. However, it did not place the new load in the worklist, which prevented later DAG combines in the same phase (for example, target-specific combines) from ever seeing it. This patch corrects that oversight, and updates some tests whose output changed due to slightly different DAGCombine outputs. llvm-svn: 174343	2013-02-05 06:25:30 +00:00
Michael Gottesman	27e7ef326a	Added LLVM Asm/Bitcode Reader/Writer support for new IR keyword externally_initialized. llvm-svn: 174340	2013-02-05 05:57:38 +00:00
Manman Ren	86b1d868ba	[Stack Alignment] emit warning instead of a hard error Per discussion in rdar://13127907, we should emit a hard error only if people write code where the requested alignment is larger than achievable and assumes the low bits are zeros. A warning should be good enough when we are not sure if the source code assumes the low bits are zeros. rdar://13127907 llvm-svn: 174336	2013-02-04 23:45:08 +00:00
Jyotsna Verma	7ab68fbd1d	Hexagon: Add V4 combine instructions and some more Def Pats for V2. llvm-svn: 174331	2013-02-04 15:52:56 +00:00
Benjamin Kramer	c35d526489	Disable a couple more vector splat optimizations on PPC. I didn't see those because the test case used "not grep". FileCheck the test and XFAIL it, preserving the old optimization, so this can be fixed eventually. llvm-svn: 174330	2013-02-04 15:52:32 +00:00
Benjamin Kramer	2c9da989c2	X86: Open up some opportunities for constant folding by postponing shift lowering. Fixes PR15141. llvm-svn: 174327	2013-02-04 15:19:33 +00:00
Benjamin Kramer	548ffa274a	SelectionDAG: Teach FoldConstantArithmetic how to deal with vectors. This required disabling a PowerPC optimization that did the following: input: x = BUILD_VECTOR <i32 16, i32 16, i32 16, i32 16> lowered to: tmp = BUILD_VECTOR <i32 8, i32 8, i32 8, i32 8> x = ADD tmp, tmp The add now gets folded immediately and we're back at the BUILD_VECTOR we started from. I don't see a way to fix this currently so I left it disabled for now. Fix some trivially foldable X86 tests too. llvm-svn: 174325	2013-02-04 15:19:18 +00:00
Tim Northover	37b131f607	Update debugging test for change in expected metadata. llvm-svn: 174321	2013-02-04 12:15:00 +00:00
David Blaikie	2811f8ac28	[DebugInfo] remove more node indirection (this time from the subprogram's variable lists) llvm-svn: 174305	2013-02-04 05:56:36 +00:00
Arnold Schwaighofer	98f1012f9b	ARM cost model: Penalize insertelement into D subregisters Swift has a renaming dependency if we load into D subregisters. We don't have a way of distinguishing between insertelement operations of values from loads and other values. Therefore, we are pessimistic for now (The performance problem showed up in example 14 of gcc-loops). radar://13096933 llvm-svn: 174300	2013-02-04 02:52:05 +00:00
David Blaikie	33111dfea0	Remove the (apparently) unnecessary debug info metadata indirection. The main lists of debug info metadata attached to the compile_unit had an extra layer of metadata nodes they went through for no apparent reason. This patch removes that (& still passes just as much of the GDB 7.5 test suite). If anyone can show evidence as to why these extra metadata nodes are there I'm open to reverting this patch & documenting why they're there. llvm-svn: 174266	2013-02-02 05:56:24 +00:00
Reed Kotler	f8933f83f0	Start static relocation implementation for mips16. This checkin makes hello world work. llvm-svn: 174264	2013-02-02 04:07:35 +00:00
Manman Ren	053e4ff008	Removing ssp and uwtable from the testcase llvm-svn: 174259	2013-02-02 01:34:38 +00:00
Shuxin Yang	cadd8a068e	rdar://13126763 Fix a bug in DAGCombine. The symptom is mistakenly optimizing expression "x + x*x" into "x * 3.0". llvm-svn: 174239	2013-02-02 00:22:03 +00:00
Manman Ren	e697d3cd2e	[Dwarf] avoid emitting multiple AT_const_value for static members. Testing case is reduced from MultiSource/BenchMarks/Prolangs-C++/deriv1. rdar://problem/13071590 llvm-svn: 174235	2013-02-01 23:54:37 +00:00
Bill Schmidt	52742c25ae	LLVM enablement for some older PowerPC CPUs llvm-svn: 174230	2013-02-01 22:59:51 +00:00
Dan Gohman	9ee4bc1abc	Add a testcase for some past-the-end address subtleties. llvm-svn: 174210	2013-02-01 19:37:52 +00:00
David Sehr	8114a7a651	Two changes relevant to LEA and x32: 1) allows the use of RIP-relative addressing in 32-bit LEA instructions under x86-64 (ILP32 and LP64) 2) separates the size of address registers in 64-bit LEA instructions from control by ILP32/LP64. llvm-svn: 174208	2013-02-01 19:28:09 +00:00
Jyotsna Verma	10f5c2db4e	Hexagon: Test case to confirm generation of indexed loads with zero offset. llvm-svn: 174196	2013-02-01 16:40:06 +00:00
Benjamin Kramer	c05aa958b1	InstSimplify: stripAndComputeConstantOffsets can be called with vectors of pointers too. Prepare it for vectors of pointers and handle simple cases. We don't handle complicated cases because accumulateConstantOffset bails on pointer vectors. Fixes selfhost on i386. llvm-svn: 174179	2013-02-01 15:21:10 +00:00
Tim Northover	e3d4236402	Add explicit triples to AArch64 tests Only Linux is supported at the moment, and other platforms quickly fault. As a result these tests would fail on non-Linux hosts. It may be worth making the tests more generic again as more platforms are supported. llvm-svn: 174170	2013-02-01 11:40:47 +00:00
Nadav Rotem	4349f6963e	Revert r174152. The shift amount may overflow and in that case this transformation is illegal. llvm-svn: 174156	2013-02-01 07:59:33 +00:00
Nadav Rotem	1d584029ae	Optimize shift lefts of a constant by a value plus constant into a single shift. llvm-svn: 174152	2013-02-01 06:45:40 +00:00
Dan Gohman	b3e2d3a638	Rewrite instsimplify's handling of icmp on pointer values to remove the remaining use of AliasAnalysis concepts such as isIdentifiedObject to prove pointer inequality. @external_compare in test/Transforms/InstSimplify/compare.ll shows a simple case where a noalias argument can be equal to a global variable address, and while AliasAnalysis can get away with saying that these pointers don't alias, instsimplify cannot say that they are not equal. llvm-svn: 174122	2013-02-01 00:11:13 +00:00
Dan Gohman	995d40e1e2	An alloca can be equal to an argument. It can't alias an alloca, but it could be equal, since there's nothing preventing a caller from correctly predicting the stack location of an alloca. llvm-svn: 174119	2013-01-31 23:49:33 +00:00
Bill Wendling	1c7cc8ae90	Remove the AttrBuilder form of the Attribute::get creators. The AttrBuilder is for building a collection of attributes. The Attribute object holds only one attribute. So it's not really useful for the Attribute object to have a creator which takes an AttrBuilder. This has two fallouts: 1. The AttrBuilder no longer holds its internal attributes in a bit-mask form. 2. The attributes are now ordered alphabetically (hence why the tests have changed). llvm-svn: 174110	2013-01-31 23:16:25 +00:00
Tom Stellard	4926921bd4	R600: Fold clamp, neg, abs Patch by: Vincent Lejeune Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 174099	2013-01-31 22:11:54 +00:00
Manman Ren	aec2ce7db4	Linker: correctly link in dbg.declare This is a re-worked version of r174048. Given source IR: call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !14), !dbg !15 we used to generate call void @llvm.dbg.declare(metadata !27, metadata !28), !dbg !29 !27 = metadata !{null} With this patch, we will correctly generate call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !27), !dbg !28 Looking up %argc.addr in ValueMap will return null, since %argc.addr is already correctly set up, we can use identity mapping. rdar://problem/13089880 llvm-svn: 174093	2013-01-31 21:19:18 +00:00
Lang Hames	dd47804394	When lowering memcpys to loads and stores, make sure we don't promote alignments past the natural stack alignment. llvm-svn: 174085	2013-01-31 20:23:43 +00:00
Derek Schuff	b76ec3bb5e	[MC] bundle alignment: prevent padding instructions from crossing bundle boundaries llvm-svn: 174067	2013-01-31 17:00:03 +00:00
Tim Northover	e0e3aefdd3	Add AArch64 as an experimental target. This patch adds support for AArch64 (ARM's 64-bit architecture) to LLVM in the "experimental" category. Currently, it won't be built unless requested explicitly. This initial commit should have support for: + Assembly of all scalar (i.e. non-NEON, non-Crypto) instructions (except the late addition CRC instructions). + CodeGen features required for C++03 and C99. + Compilation for the "small" memory model: code+static data < 4GB. + Absolute and position-independent code. + GNU-style (i.e. "__thread") TLS. + Debugging information. The principal omission, currently, is performance tuning. This patch excludes the NEON support also reviewed due to an outbreak of batshit insanity in our legal department. That will be committed soon bringing the changes to precisely what has been approved. Further reviews would be gratefully received. llvm-svn: 174054	2013-01-31 12:12:40 +00:00
Pekka Jaaskelainen	995a3e731d	Made the min-trip-count-switch test X86-specific to avoid breakage with builds without X86-support. llvm-svn: 174052	2013-01-31 10:33:22 +00:00
Alexey Samsonov	5234a8ed9f	Revert r173946. This breaks compilation of googletest with Clang llvm-svn: 174048	2013-01-31 08:02:11 +00:00
Michael Gottesman	41e4ac4224	Filecheckized 2x tests in SimplifyCFG and removed their date prefix to fit with current llvm style for test names. llvm-svn: 174011	2013-01-31 01:04:23 +00:00
Eric Christopher	4e3e94c13d	Check and allow floating point registers to select the size of the register for inline asm. This conforms to how gcc allows for effective casting of inputs into gprs (fprs is already handled). llvm-svn: 174008	2013-01-31 00:50:46 +00:00
Eli Bendersky	6c84b90b70	Replace some more greps with FileChecks in tests llvm-svn: 174006	2013-01-31 00:44:12 +00:00
Eli Bendersky	a320e00e74	Rewrite this test properly with a FileCheck instead of greps llvm-svn: 173997	2013-01-31 00:11:52 +00:00
Dan Gohman	6a61fccb96	Fix ConstantFold's folding of icmp instructions to recognize that, for example, a one-past-the-end pointer from one global variable may be equal to the base pointer of another global variable. llvm-svn: 173995	2013-01-31 00:01:45 +00:00
Hal Finkel	e1df90958d	PPC QPX requires a 32-byte aligned stack On systems which support the QPX vector instructions, the stack must be 32-byte aligned. llvm-svn: 173993	2013-01-30 23:43:27 +00:00
Evan Cheng	9449ec956f	Forgot the test case before. llvm-svn: 173988	2013-01-30 22:57:00 +00:00
Hal Finkel	efb305e54c	Add definitions for the PPC a2q core marked as having QPX available This is the first commit of a large series which will add support for the QPX vector instruction set to the PowerPC backend. This instruction set is used on the IBM Blue Gene/Q supercomputers. llvm-svn: 173973	2013-01-30 21:17:42 +00:00
Manman Ren	81dcc62805	Linker: correctly link in dbg.declare Given source IR: call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !14), !dbg !15 we used to generate call void @llvm.dbg.declare(metadata !27, metadata !28), !dbg !29 !27 = metadata !{null} With this patch, we will correctly generate call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !27), !dbg !28 Looking up %argc.addr in ValueMap will return null, since %argc.addr is already correctly set up, we can use identity mapping. llvm-svn: 173946	2013-01-30 17:42:15 +00:00
Eli Bendersky	2e2ce49e59	Add a special ARM trap encoding for NaCl. More details in this thread: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130128/163783.html Patch by JF Bastien llvm-svn: 173943	2013-01-30 16:30:19 +00:00
Logan Chien	a436e4c7e4	Add missing header and test cases for r173939. llvm-svn: 173941	2013-01-30 15:48:50 +00:00
Nadav Rotem	513bd8a73c	InstCombine: canonicalize sext-and --> select sext-not-and --> select. Patch by Muhammad Tauqir Ahmad. llvm-svn: 173901	2013-01-30 06:35:22 +00:00
Saleem Abdulrasool	26127bd746	build: add --with-python option This adds a new --with-python option to allow configuration of the python binary for building. If not specified, $PATH will be searched for common python binary names (python, python2, python3). If specified, and the path is not executable, it will attempt to search $PATH. Signed-off-by: Saleem Abdulrasool <compnerd@compnerd.org> Reviewed-by: Eric Christopher <echristo@gmail.com>, Daniel Dunbar <daniel@zuster.org> llvm-svn: 173890	2013-01-30 04:07:37 +00:00
Jack Carter	718da0b53b	This patch implements runtime ARM specific setting of ELF header e_flags. Contributor: Jack Carter llvm-svn: 173885	2013-01-30 02:24:33 +00:00
Jack Carter	7f378104b6	This patch implements runtime Mips specific setting of ELF header e_flags. Contributor: Jack Carter llvm-svn: 173884	2013-01-30 02:16:36 +00:00
Jack Carter	1bd90ff6cc	This patch reworks how llvm targets set and update ELF header e_flags. Currently gathering information such as symbol, section and data is done by collecting it in an MCAssembler object. From MCAssembler and MCAsmLayout objects ELFObjectWriter::WriteObject() forms and streams out the ELF object file. This patch just adds a few members to the MCAssembler class to store and access the e_flag settings. It allows for runtime additions to the e_flag by assembler directives. The standalone assembler can get to MCAssembler from getParser().getStreamer().getAssembler(). This patch is the generic infrastructure and will be followed by patches for ARM and Mips for their target specific use. Contributor: Jack Carter llvm-svn: 173882	2013-01-30 02:09:52 +00:00
Akira Hatanaka	4385564e97	[mips] Test case for r173862. Patch by Sasa Stankovic. llvm-svn: 173863	2013-01-30 00:28:15 +00:00
Renato Golin	5e9d55eca0	Adding simple cast cost to ARM Changing ARMBaseTargetMachine to return ARMTargetLowering instead of the generic one (similar to x86 code). Tests showing which instructions were added to cast when necessary or cost zero when not. Downcast to 16 bits are not lowered in NEON, so costs are not there yet. llvm-svn: 173849	2013-01-29 23:31:38 +00:00
Michael J. Spencer	54b24e1000	[MC][COFF] Delay handling symbol aliases when writing Fixes PR14447 and PR9034. Patch by Nico Rieck! llvm-svn: 173839	2013-01-29 22:10:07 +00:00
Pekka Jaaskelainen	f50ab84bb1	LoopVectorize: convert TinyTripCountVectorThreshold constant to a command line switch. llvm-svn: 173837	2013-01-29 21:42:08 +00:00
David Blaikie	9a7a7a9a6f	Support artificial parameters in function types. Provides the functionality for Clang change r172911 - I just had this still lying around. llvm-svn: 173820	2013-01-29 19:35:24 +00:00
Tim Northover	a0edd3ee66	Fix 64-bit atomic operations in Thumb mode. The ARM and Thumb variants of LDREXD and STREXD have different constraints and take different operands. Previously the code expanding atomic operations didn't take this into account and asserted in Thumb mode. llvm-svn: 173780	2013-01-29 09:06:13 +00:00
Craig Topper	c048154b9b	Merge SSE and AVX shuffle instructions in the comment printer. llvm-svn: 173777	2013-01-29 07:54:31 +00:00
Bill Wendling	f2955aa3f2	Convert getAttributes() to return an AttributeSetNode. The AttributeSetNode contains all of the attributes. This removes one (hopefully last) use of the Attribute class as a container of multiple attributes. llvm-svn: 173761	2013-01-29 03:20:31 +00:00
Andrew Kaylor	6d8776a514	Add support for source and line information to IntelJITEventListener for object emitted by MCJIT. llvm-svn: 173712	2013-01-28 19:52:37 +00:00
Bill Schmidt	2e4ae4e154	This patch addresses bug 15031. The common code in the post-RA scheduler to break anti-dependencies on the critical path contained a flaw. In the reported case, an anti-dependency between the overlapping registers %X4 and %R4 exists: %X29<def> = OR8 %X4, %X4 %R4<def>, %X3<def,dead,tied3> = LBZU 1, %X3<kill,tied1> The unpatched code breaks the dependency by replacing %R4 and its uses with %R3, the first register on the available list. However, %R3 and %X3 overlap, so this creates two overlapping definitions on the same instruction. The fix is straightforward, preventing selection of a register that overlaps any other defined register on the same instruction. The test case is reduced from the bug report, and verifies that we no longer produce "lbzu 3, 1(3)" when breaking this anti-dependency. llvm-svn: 173706	2013-01-28 18:36:58 +00:00
Evgeniy Stepanov	6f85ef300d	[msan] Mostly disable msan-handle-icmp-exact. It is way too slow. Change the default option value to 0. Always do exact shadow propagation for unsigned ICmp with constants, it is cheap (under 1% cpu time) and required for correctness. llvm-svn: 173682	2013-01-28 11:42:28 +00:00
Craig Topper	5c683972bc	Fix 256-bit PALIGNR comment decoding to understand that it works on independent 256-bit lanes. llvm-svn: 173674	2013-01-28 07:41:18 +00:00
Richard Osborne	038d24f90c	[XCore] Add missing l2rus instructions. These instructions are not targeted by the compiler but they are needed for the MC layer. llvm-svn: 173634	2013-01-27 22:28:30 +00:00
Richard Osborne	f2ecd40929	[XCore] Add missing l2r instructions. These instructions are not targeted by the compiler but they are needed for the MC layer. llvm-svn: 173629	2013-01-27 21:26:02 +00:00
Richard Osborne	7fe8f63544	[XCore] Add missing 1r instructions. These instructions are not targeted by the compiler but they are needed for the MC layer. llvm-svn: 173624	2013-01-27 20:46:21 +00:00
Richard Osborne	8f56317287	[XCore] Add missing 0r instructions. These instructions are not targeted by the compiler but they are needed for the MC layer. llvm-svn: 173623	2013-01-27 20:42:57 +00:00
Benjamin Kramer	05cc93964a	When the legalizer is splitting vector shifts, the result may not have the right shift amount type. Fix that by adding a cast to the shift expander. This came up with vector shifts on sse-less X86 CPUs. <2 x i64> = shl <2 x i64> <2 x i64> -> i64,i64 = shl i64 i64; shl i64 i64 -> i32,i32,i32,i32 = shl_parts i32 i32 i64; shl_parts i32 i32 i64 Now we cast the last two i64s to the right type. Fixes the crash in PR14668. llvm-svn: 173615	2013-01-27 11:19:11 +00:00
Chandler Carruth	329b590e6e	Re-revert r173342, without losing the compile time improvements, flat out bug fixes, or functionality preserving refactorings. llvm-svn: 173610	2013-01-27 06:42:03 +00:00
David Blaikie	9f4b70dde0	PR14566: Debug Info: Removing top level lexical blocks This adds support for LLVM to accept metadata that doesn't include a top level lexical block in a function. Specifically LLVM couldn't handle this when there were file changes relating to these blocks. I've updated a few test cases to ensure other functionality (such as inlining) isn't affected by this change, but haven't pervasively updated all the test cases. llvm-svn: 173592	2013-01-26 21:55:23 +00:00
Benjamin Kramer	6a93596538	X86: Decode PALIGN operands so I don't have to do it in my head. llvm-svn: 173572	2013-01-26 13:31:37 +00:00
Benjamin Kramer	99c68dd964	X86: Do splat promotion later, so the optimizer can chew on it first. This catches many cases where we can emit a more efficient shuffle for a specific mask or when the mask contains undefs. Once the splat is lowered to unpacks we can't do that anymore. There is a possibility of moving the promotion after pshufb matching, but I'm not sure if pshufb with a mask loaded from memory is faster than 3 shuffles, so I avoided that for now. llvm-svn: 173569	2013-01-26 11:44:21 +00:00
Benjamin Kramer	7268a05178	FileCheckize and merge some tests. llvm-svn: 173568	2013-01-26 11:14:32 +00:00
Andrew Kaylor	9a8ff813f3	Add DIContext::getLineInfoForAddressRange() function and test. This function allows a caller to obtain a table of line information for a function using the function's address and size. llvm-svn: 173537	2013-01-26 00:28:05 +00:00
NAKAMURA Takumi	8653bcf024	llvm/test/CMakeLists.txt: Add a dependency to llvm-rtdyld in check-llvm. llvm-svn: 173528	2013-01-25 23:24:07 +00:00
Hal Finkel	4e5ca9e578	Initial implementation of PPCTargetTransformInfo This provides a place to add customized operation cost information and control some other target-specific IR-level transformations. The only non-trivial logic in this checkin assigns a higher cost to unaligned loads and stores (covered by the included test case). llvm-svn: 173520	2013-01-25 23:05:59 +00:00
Andrew Kaylor	d55d7019fc	Add support for applying in-memory relocations to the .debug_line section and, in the case of ELF files, using symbol addresses when available for relocations to the .debug_info section. Also extending the llvm-rtdyld tool to add the ability to dump line number information for testing purposes. llvm-svn: 173517	2013-01-25 22:50:58 +00:00
Reid Kleckner	1aa3784960	XFAIL close-stderr on win32 The test runner does not rewrite instances of /dev/null inside the quoted sh command. /dev/null does not exist, so opt will fail to open it, and return a non-zero exit code. llvm-svn: 173509	2013-01-25 22:12:54 +00:00
Reid Kleckner	0198e00318	Set the +x bit on two batch scripts Cygwin git-svn will faithfully forward the svn properties all the way down to the NTFS executable permission. Without the +x bit, tests using these scripts fail with "Access Denied". llvm-svn: 173508	2013-01-25 22:12:50 +00:00
Reid Kleckner	ab083f727b	FileCheck-ify some grep tests These tests in particular try to use escaped square brackets as an argument to grep, which is failing for me with native win32 python. It appears the backslash is being lost near the CreateProcess*() call. llvm-svn: 173506	2013-01-25 22:11:46 +00:00
Eli Bendersky	597fc1233a	In this patch, we teach X86_64TargetMachine that it has a ILP32 (defined by the x32 ABI) mode, in which case its pointers are 32-bits in size. This knowledge is also added to X86RegisterInfo that now returns the appropriate registers in getPointerRegClass. There are many outcomes to this change. In order to keep the patches separate and manageable, we start by focusing on some simple testable cases. The patch adds a test with passing a pointer to a function - focusing on the difference between the two data models for x86-64. Another test is added for handling of 'sret' arguments (and functionality is added in X86ISelLowering to make it work). A note on naming: the "x32 ABI" document refers to the AMD64 architecture (in LLVM it's distinguished by being is64Bits() in the x86 subtarget) with two variations: the LP64 (default) data model, and the ILP32 data model. This patch adds predicates to the subtarget which are consistent with this naming scheme. llvm-svn: 173503	2013-01-25 22:07:43 +00:00
Eli Bendersky	158ea095c0	Add back a RUN line removed by mistake by a previous commit llvm-svn: 173502	2013-01-25 21:58:09 +00:00
Richard Osborne	6b86eec819	Add instruction encodings / disassembly support for l4r instructions. llvm-svn: 173501	2013-01-25 21:55:32 +00:00
Eli Bendersky	e6abe83258	Now that llvm-dwarfdump supports flags to specify which DWARF section to dump, use them in tests that run llvm-dwarfdump. This is in order to make tests as specific as possible. llvm-svn: 173498	2013-01-25 21:44:53 +00:00
Hal Finkel	1a57ba57a2	Improve the !add TableGen test case. Suggested by Sean Silva. llvm-svn: 173481	2013-01-25 20:29:25 +00:00
Eli Bendersky	7a94daa170	Add command-line flags for DWARF dumping. Flags for dumping specific DWARF sections added in lib/DebugInfo and llvm-dwarfdump. llvm-svn: 173480	2013-01-25 20:26:43 +00:00
Richard Osborne	a19fa86a70	Add instruction encodings / disassembly support for l5r instructions. llvm-svn: 173479	2013-01-25 20:20:07 +00:00
Evgeniy Stepanov	fac8403249	[msan] Implement exact shadow propagation for relational ICmp. Only for integers, pointers, and vectors of those. No floats. Instrumentation seems very heavy, and may need to be replaced with some approximation in the future. llvm-svn: 173452	2013-01-25 15:31:10 +00:00
Hal Finkel	c7d4dc13a4	Add an addition operator to TableGen This adds an !add(a, b) operator to tablegen; this will be used to cleanup the PPC register definitions. llvm-svn: 173445	2013-01-25 14:49:08 +00:00
Silviu Baranga	3eb45a03af	Fixed the condition codes for the atomic64 min/umin code generation on ARM. If the subtraction of the higher 32 bit parts gives a 0 result, we need to do the store operation. llvm-svn: 173437	2013-01-25 10:39:49 +00:00
Andrew Trick	e2c3f5c982	MIsched: Improve the interface to SchedDFS analysis (subtrees). Allow the strategy to select SchedDFS. Allow the results of SchedDFS to affect initialization of the scheduler state. llvm-svn: 173425	2013-01-25 06:33:57 +00:00
Chandler Carruth	ceff222dea	Switch this code away from Value::isUsedInBasicBlock. That code either loops over instructions in the basic block or the use-def list of the value, neither of which are really efficient when repeatedly querying about values in the same basic block. What's more, we already know that the CondBB is small, and so we can do a much more efficient test by counting the uses in CondBB, and seeing if those account for all of the uses. Finally, we shouldn't blanket fail on any such instruction, instead we should conservatively assume that those instructions are part of the cost. Note that this actually fixes a bug in the pass because isUsedInBasicBlock has a really terrible bug in it. I'll fix that in my next commit, but the fix for it would make this code suddenly take the compile time hit I thought it already was taking, so I wanted to go ahead and migrate this code to a faster & better pattern. The bug in isUsedInBasicBlock was also causing other tests to test the wrong thing entirely: for example we weren't actually disabling speculation for floating point operations as intended (and tested), but the test passed because we failed to speculate them due to the isUsedInBasicBlock failure. llvm-svn: 173417	2013-01-25 05:40:09 +00:00
Andrew Trick	44f750a3e5	MISched: Add SchedDFSResult to ScheduleDAGMI to formalize the interface and allow other strategies to select it. llvm-svn: 173413	2013-01-25 04:01:04 +00:00
Jack Carter	07c818d2da	This patch implements parsing the .word directive for the Mips assembler. Contributor: Vladimir Medic llvm-svn: 173407	2013-01-25 01:31:34 +00:00
Akira Hatanaka	28aed9ca85	[mips] Set the neverHasSideEffects flag on some of the floating point instructions. llvm-svn: 173401	2013-01-25 00:20:39 +00:00
Benjamin Kramer	1c4e323fdd	Reapply chandlerc's r173342 now that the miscompile it was triggering is fixed. Original commit message: Plug TTI into the speculation logic, giving it a real cost interface that can be specialized by targets. The goal here is not to be more aggressive, but to just be more accurate with very obvious cases. There are instructions which are known to be truly free and which were not being modeled as such in this code -- see the regression test which is distilled from an inner loop of zlib. Everywhere the TTI cost model is insufficiently conservative I've added explicit checks with FIXME comments to go add proper modelling of these cost factors. If this causes regressions, the likely solution is to make TTI even more conservative in its cost estimates, but test cases will help here. llvm-svn: 173357	2013-01-24 16:44:25 +00:00
Benjamin Kramer	435eba09b7	ConstantFolding: Add a missing folding that leads to a miscompile. We use constant folding to see if an intrinsic evaluates to the same value as a constant that we know. If we don't take the undefinedness into account we get a value that doesn't match the actual implementation, and miscompiled code. This was uncovered by Chandler's simplifycfg changes. llvm-svn: 173356	2013-01-24 16:28:28 +00:00
Chandler Carruth	321c6a7c50	Revert r173342 temporarily. It appears to cause a very late miscompile of stage2 in a bootstrap. Still investigating.... llvm-svn: 173343	2013-01-24 13:24:24 +00:00
Chandler Carruth	5f4519309f	Plug TTI into the speculation logic, giving it a real cost interface that can be specialized by targets. The goal here is not to be more aggressive, but to just be more accurate with very obvious cases. There are instructions which are known to be truly free and which were not being modeled as such in this code -- see the regression test which is distilled from an inner loop of zlib. Everywhere the TTI cost model is insufficiently conservative I've added explicit checks with FIXME comments to go add proper modelling of these cost factors. If this causes regressions, the likely solution is to make TTI even more conservative in its cost estimates, but test cases will help here. llvm-svn: 173342	2013-01-24 12:39:29 +00:00
Chandler Carruth	01bffaad03	Address a large chunk of this FIXME by accumulating the cost for unfolded constant expressions rather than checking each one independently. llvm-svn: 173341	2013-01-24 12:05:17 +00:00
Chandler Carruth	8a21005cca	Switch the constant expression speculation cost evaluation away from a cost function that seems both a bit ad-hoc and also poorly suited to evaluating constant expressions. Notably, it is missing any support for trivial expressions such as 'inttoptr'. I could fix this routine, but it isn't clear to me all of the constraints its other users are operating under. The core protection that seems relevant here is avoiding the formation of a select instruction with a further chain of select operations in a constant expression operand. Just explicitly encode that constraint. Also, update the comments and organization here to make it clear where this needs to go -- this should be driven off of real cost measurements which take into account the number of constant expressions and the depth of the constant expression tree. llvm-svn: 173340	2013-01-24 11:53:01 +00:00
Kostya Serebryany	87191f6221	[asan] adaptive redzones for globals (the larger the global the larger is the redzone) llvm-svn: 173335	2013-01-24 10:35:40 +00:00
Reed Kotler	a2d76bce1f	The next phase of Mips16 hard float implementation. Allow Mips16 routines to call Mips32 routines that have abi requirements that either arguments or return values are passed in floating point registers. This handles only the pic case. We have not done non pic for Mips16 yet in any form. The libm functions are Mips32, so with this addition we have a complete Mips16 hard float implementation. We still are not able to completely mix Mips16 and Mips32 with hard float. That will be the next phase which will have several steps. For Mips32 to freely call Mips16 some stub functions must be created. llvm-svn: 173320	2013-01-24 04:24:02 +00:00
Benjamin Kramer	d9c3dabbba	ConstantFolding: Evaluate GEP indices in the index type. This fixes some edge cases that we would get wrong with uint64_ts. PR14986. llvm-svn: 173289	2013-01-23 20:41:05 +00:00
Richard Osborne	54e311821f	Add instruction encodings / disassembly support for l6r instructions. llvm-svn: 173288	2013-01-23 20:08:11 +00:00
Benjamin Kramer	e4c46fec73	Revert "InstCombine: Clean up weird code that talks about a modulus that's long gone." This causes crashes during the build of compiler-rt during selfhost. Add a testcase for coverage. llvm-svn: 173279	2013-01-23 17:52:29 +00:00
Bill Wendling	7c8f96a91b	Add the heuristic to differentiate SSPStrong from SSPRequired. The requirements of the strong heuristic are: * A Protector is required for functions which contain an array, regardless of type or length. * A Protector is required for functions which contain a structure/union which contains an array, regardless of type or length. Note, there is no limit to the depth of nesting. * A protector is required when the address of a local variable (i.e., stack based variable) is exposed. (E.g., such as through a local whose address is taken as part of the RHS of an assignment or a local whose address is taken as part of a function argument.) llvm-svn: 173231	2013-01-23 06:43:53 +00:00
Bill Wendling	d154e283f2	Add the IR attribute 'sspstrong'. SSPStrong applies a heuristic to insert stack protectors in these situations: * A Protector is required for functions which contain an array, regardless of type or length. * A Protector is required for functions which contain a structure/union which contains an array, regardless of type or length. Note, there is no limit to the depth of nesting. * A protector is required when the address of a local variable (i.e., stack based variable) is exposed. (E.g., such as through a local whose address is taken as part of the RHS of an assignment or a local whose address is taken as part of a function argument.) This patch implements the SSPStrong attribute to be equivalent to SSPRequired. This will change in a subsequent patch. llvm-svn: 173230	2013-01-23 06:41:41 +00:00
Nadav Rotem	ab3e698ee9	Add support for reverse pointer induction variables. These are loops that contain pointers that count backwards. For example, this is the hot loop in BZIP: do { m = --p; p = ( ... ); } while (--n); llvm-svn: 173219	2013-01-23 01:35:00 +00:00
Richard Osborne	1a06479f46	Add instruction encodings / disassembly support for u10 / lu10 instructions. llvm-svn: 173204	2013-01-22 22:55:04 +00:00
Michael Liao	3dffc5e2b7	Fix an issue of pseudo atomic instruction DAG schedule - Add list of physical registers clobbered in pseudo atomic insts Physical registers are clobbered when pseudo atomic instructions are expanded. Add them in clobber list to prevent DAG scheduler to mis-schedule them after these insns are declared side-effect free. - Add test case from Michael Kuperstein <michael.m.kuperstein@intel.com> llvm-svn: 173200	2013-01-22 21:47:38 +00:00
Kevin Enderby	81c944cadb	Add a warning when there is a macro definition that has named parameters but the body does not use them and it appears the body has positional parameters. This can cause unexpected results as in the added test case. The darwin version of gas(1), which only supported positional parameters, happened to ignore the named parameters. Now that we want to support both styles of macros we issue a warning in this specific case. rdar://12861644 llvm-svn: 173199	2013-01-22 21:44:53 +00:00
Akira Hatanaka	88c0ec826c	[mips] Implement MipsRegisterInfo::getRegPressureLimit. llvm-svn: 173197	2013-01-22 21:34:25 +00:00
Kevin Enderby	0017d8a469	Have the integrated assembler give an error if $1 is used as an identifier in an expression. Currently this bug causes the line to be ignored in a release build and an assert in a debug build. rdar://13062484 llvm-svn: 173195	2013-01-22 21:09:20 +00:00
Eli Bendersky	583860339b	Add forgotten test case for the x32 commit llvm-svn: 173181	2013-01-22 18:52:39 +00:00
Benjamin Kramer	fee7d21ae7	X86: Make sure we account for the FMA4 register immediate value, otherwise rip-rel relocations will be off by one byte. PR15040. llvm-svn: 173176	2013-01-22 18:05:59 +00:00
Dmitri Gribenko	44ee2e7a23	Tests: rewrite 'opt ... %s' to 'opt ... < %s' so that opt does not emit a ModuleID This is done to avoid odd test failures, like the one fixed in r171243. llvm-svn: 173163	2013-01-22 14:39:21 +00:00
Evgeniy Stepanov	c4415591ed	[msan] Do not insert check on volatile store. Volatile bitfields can cause valid stores of uninitialized bits. llvm-svn: 173153	2013-01-22 12:30:52 +00:00
Michael Gottesman	469a465fa8	This test is only supposed to test that the objc-arc alias analysis allows for gvn to perform certain optimizations. Thus the runline should only contain -objc-arc-aa, not the full -objc-arc. llvm-svn: 173126	2013-01-22 04:41:11 +00:00
Daniel Dunbar	34ea79f9a3	[MC/Mach-O] Load commands are supposed to be 8-byte aligned on 64-bit. llvm-svn: 173120	2013-01-22 03:42:49 +00:00
Andrew Trick	2d35fabc7d	Remove target triple from an LSR test. Manish already fixed this test to work with NoTTI. llvm-svn: 173110	2013-01-22 00:57:16 +00:00
Paul Redmond	9d86a4a3b6	Transform (sub 0, (zext bool to A)) to (sext bool to A) and (sub 0, (sext bool to A)) to (zext bool to A). Patch by Muhammad Ahmad Reviewed by Duncan Sands llvm-svn: 173093	2013-01-21 21:57:20 +00:00
Richard Osborne	9d3ec06ef8	Add instruction encodings / disassembly support for u6 / lu6 instructions. llvm-svn: 173086	2013-01-21 20:44:17 +00:00
Richard Osborne	6e58c6d86d	Add instruction encoding / disassembly support for ru6 / lru6 instructions. llvm-svn: 173085	2013-01-21 20:42:16 +00:00
Richard Osborne	4e69724869	Add instruction encodings / disassembly support for l2rus instructions. llvm-svn: 172987	2013-01-20 18:51:15 +00:00
Richard Osborne	9fbf57b26c	Add instruction encodings / disassembly support for l3r instructions. llvm-svn: 172986	2013-01-20 18:37:49 +00:00
Richard Osborne	f063fcee7a	Add instruction encodings / disassembler support for 2rus instructions. llvm-svn: 172985	2013-01-20 17:22:43 +00:00
Richard Osborne	3fb7395233	Add instruction encodings / disassembly support for 3r instructions. It is not possible to distinguish 3r instructions from 2r / rus instructions using only the fixed bits. Therefore if an instruction doesn't match the 2r / rus format try to decode it as a 3r instruction before returning Fail. llvm-svn: 172984	2013-01-20 17:18:47 +00:00
NAKAMURA Takumi	9439237063	llvm/test/CodeGen/X86/win_ftol2.ll: Add -cpu=generic to appease valgrind. On valgrind the processor is reported; Host CPU: athlon-fx llvm-svn: 172983	2013-01-20 15:40:02 +00:00
Nadav Rotem	9450fcfff1	Revert 172708. The optimization handles esoteric cases but adds a lot of complexity both to the X86 backend and to other backends. This optimization disables an important canonicalization of chains of SEXT nodes and makes SEXT and ZEXT asymmetrical. Disabling the canonicalization of consecutive SEXT nodes into a single node disables other DAG optimizations that assume that there is only one SEXT node. The AVX mask optimization is one example. Additionally this optimization does not update the cost model. llvm-svn: 172968	2013-01-20 08:35:56 +00:00
Nadav Rotem	c42f90b1f4	LoopVectorizer: Implement a new heuristic for selecting the unroll factor. We ignore the cpu frontend and focus on pipeline utilization. We do this because we don't have a good way to estimate the loop body size at the IR level. llvm-svn: 172964	2013-01-20 05:24:29 +00:00
Nadav Rotem	2169dbed2c	Change the cpu type in the test. llvm-svn: 172963	2013-01-20 05:20:56 +00:00
NAKAMURA Takumi	619ca0dc40	llvm/test/Other/close-stderr.ll: Mark this as XFAIL:valgrind. We got 127 instead of 1 here. llvm-svn: 172956	2013-01-20 03:35:39 +00:00
David Blaikie	a39a76efbc	The last of PR14471 - emission of constant floats llvm-svn: 172941	2013-01-20 01:18:01 +00:00
David Blaikie	b085931026	Fix a latent bug exposed by recent static member debug info changes. We weren't encoding boolean constants correctly due to modeling boolean as a signed type & then sign extending an i1 up to a byte & getting 255. llvm-svn: 172926	2013-01-19 23:00:25 +00:00
Benjamin Kramer	d455ed85d1	LoopVectorizer: Emit memory checks into their own basic block. This separates the check for "too few elements to run the vector loop" from the "memory overlap" check, giving a lot nicer code and allowing to skip the memory checks when we're not going to execute the vector code anyways. We still leave the decision of whether to emit the memory checks as branches or setccs, but it seems to be doing a good job. If ugly code pops up we may want to emit them as separate blocks too. Small speedup on MultiSource/Benchmarks/MallocBench/espresso. Most of this is legwork to allow multiple bypass blocks while updating PHIs, dominators and loop info. llvm-svn: 172902	2013-01-19 13:57:58 +00:00
Nadav Rotem	7b3120b9ae	On Sandybridge split unaligned 256bit stores into two xmm-sized stores. llvm-svn: 172894	2013-01-19 08:38:41 +00:00
Jakob Stoklund Olesen	ac6cfa41d6	Remove some register allocation order dependencies. llvm-svn: 172874	2013-01-19 00:03:32 +00:00
Nadav Rotem	7431211214	On Sandybridge loading unaligned 256bits using two XMM loads (vmovups and vinsertf128) is faster than using a single vmovups instruction. llvm-svn: 172868	2013-01-18 23:10:30 +00:00
Eric Christopher	e9ec2458e7	Split out DW_OP_addr for the split debug info DWARF5 proposal. llvm-svn: 172857	2013-01-18 22:11:33 +00:00
Jack Carter	c1b17ed2e1	This is a resubmittal. For some reason it broke the bots yesterday but I cannot reproduce the problem and have scrubbed my sources and even tested with llvm-lit -v --vg. Support for Mips register information sections. Mips ELF object files have a section that is dedicated to register use info. Some of this information such as the assumed Global Pointer value is used by the linker in relocation resolution. The register info file is .reginfo in o32 and .MIPS.options in 64 and n32 abi files. This patch contains the changes needed to create the sections, but leaves the actual register accounting for a future patch. Contributor: Jack Carter llvm-svn: 172847	2013-01-18 21:20:38 +00:00
Jack Carter	86c2c564ff	This is a resubmittal. For some reason it broke the bots yesterday but I cannot reproduce the problem and have scrubbed my sources and even tested with llvm-lit -v --vg. Removal of redundant code and formatting fixes. Contributors: Jack Carter/Vladimir Medic llvm-svn: 172842	2013-01-18 20:15:06 +00:00
Daniel Dunbar	9585612876	[MC/Mach-O] Implement integrated assembler support for linker options. - Also, fixup syntax errors in LangRef and missing newline in the MCAsmStreamer. llvm-svn: 172837	2013-01-18 19:37:00 +00:00
NAKAMURA Takumi	b72e763325	llvm/test/CodeGen/X86/Atomics-64.ll: Tweak for 2nd RUN not to overwrite %t. It sometimes causes spurious failure on lit win32. Feel free to prune or suppress each output. llvm-svn: 172823	2013-01-18 14:52:02 +00:00
Daniel Dunbar	eec0f32eea	[MC/Mach-O] Add support for linker options in Mach-O files. llvm-svn: 172779	2013-01-18 01:26:07 +00:00
Daniel Dunbar	16004b8324	[MC/Mach-O] Add AsmParser support for .linker_option directive. llvm-svn: 172778	2013-01-18 01:25:48 +00:00
Bill Wendling	da29e00578	Reverting r171325 & r172363. This was causing a mis-compile on the self-hosted LTO build bots. Okay, here's how to reproduce the problem: 1) Build a Release (or Release+Asserts) version of clang in the normal way. 2) Using the clang & clang++ binaries from (1), build a Release (or Release+Asserts) version of the same sources, but this time enable LTO --- specify the `-flto' flag on the command line. 3) Run the ARC migrator tests: $ arcmt-test --args -triple x86_64-apple-darwin10 -fsyntax-only -x objective-c++ ./src/tools/clang/test/ARCMT/cxx-rewrite.mm You'll see that the output isn't correct (the whitespace is off). The mis-compile is in the function `RewriteBuffer::RemoveText' in the clang/lib/Rewrite/Core/Rewriter.cpp file. When that function and RewriteRope.cpp are compiled with LTO and the `arcmt-test' executable is regenerated, you'll see the error. When those files are not LTO'ed, then the output of the `arcmt-test' is fine. It is really hard to get a testcase out of this. I'll file a PR with what I have currently. --- Reverse-merging r172363 into '.': U include/llvm/Analysis/MemoryBuiltins.h U lib/Analysis/MemoryBuiltins.cpp --- Reverse-merging r171325 into '.': U test/Transforms/InstCombine/objsize.ll G include/llvm/Analysis/MemoryBuiltins.h G lib/Analysis/MemoryBuiltins.cpp llvm-svn: 172756	2013-01-17 21:28:46 +00:00
Bill Schmidt	94b8cdbf55	Restore reverted test case, this time with REQUIRES: asserts llvm-svn: 172747	2013-01-17 19:46:51 +00:00
Bill Schmidt	b400204fd8	Remove bad test case llvm-svn: 172746	2013-01-17 19:39:36 +00:00
Bill Schmidt	dee1ef8f53	This patch fixes PR13626 by providing i128 support in the return calling convention. 128-bit integers are now properly returned in GPR3 and GPR4 on PowerPC. llvm-svn: 172745	2013-01-17 19:34:57 +00:00
Jyotsna Verma	9b60c1d171	Add indexed load/store instructions for offset validation check. This patch fixes bug 14902 - http://llvm.org/bugs/show_bug.cgi?id=14902 llvm-svn: 172737	2013-01-17 18:42:37 +00:00
Bill Schmidt	6b2940b01e	This patch fixes the PPC calling convention to handle returns of _Complex float and _Complex long double, by simply increasing the number of floating point registers available for return values. The test case verifies that the correct registers are loaded. llvm-svn: 172733	2013-01-17 17:45:19 +00:00
Elena Demikhovsky	f6a30e05d5	Optimization for the following SIGN_EXTEND pairs: v8i8 -> v8i64, v8i8 -> v8i32, v4i8 -> v4i64, v4i16 -> v4i64 for AVX and AVX2. Bug 14865. llvm-svn: 172708	2013-01-17 09:59:53 +00:00
Eric Christopher	4c7765f166	Fix the assembly and disassembly of DW_FORM_sec_offset. Found this by changing both the string of the dwo_name to be correct and the type of the statement list. Testcases all around. llvm-svn: 172699	2013-01-17 03:00:04 +00:00
Eric Christopher	1826617133	Add the DW_AT_GNU_addr_base for the skeleton cu. Add support for emitting the dwarf32 version of DW_FORM_sec_offset and correct disassembler support. llvm-svn: 172698	2013-01-17 02:59:59 +00:00
Jack Carter	2a74a87b71	This is a resubmittal. For some reason it broke the bots yesterday but I cannot reproduce the problem and have scrubbed my sources and even tested with llvm-lit -v --vg. The Mips RDHWR (Read Hardware Register) instruction was not tested for assembler or disassembler consumption. This patch adds that functionality. Contributor: Vladimir Medic llvm-svn: 172685	2013-01-17 00:28:20 +00:00
Daniel Dunbar	d77d9fb04d	[IR] Add 'Append' and 'AppendUnique' module flag behaviors. llvm-svn: 172659	2013-01-16 21:38:56 +00:00
Michael Gottesman	00dfc68c2d	Added test for r172599 which fixes bugzilla://14584,rdar://11744105. llvm-svn: 172656	2013-01-16 21:07:18 +00:00
Eric Christopher	69fc38f02f	Make this test X86 only. llvm-svn: 172652	2013-01-16 20:31:35 +00:00
Eric Christopher	45008a5688	Move this to X86. llvm-svn: 172651	2013-01-16 20:31:32 +00:00
Eric Christopher	ce26df829f	Add testcase missed yesterday from Paul Robinson. llvm-svn: 172646	2013-01-16 19:53:47 +00:00
Daniel Dunbar	0ec72bbc4d	[Linker] Change module flag linking to be more extensible. - Instead of computing a bunch of buckets of different flag types, just do an incremental link resolving conflicts as they arise. - This also has the advantage of making the link result deterministic and not dependent on map iteration order. llvm-svn: 172634	2013-01-16 18:39:23 +00:00
Kevin Enderby	e82ada6983	We want the dwarf AT_producer for assembly source files to match clang's AT_producer, which includes clang's version information so we can tell which version of the compiler was used. This is the first of two steps to allow us to do that. This is the llvm-mc change to provide a method to set the AT_producer string. The second step, coming soon to a clang near you, will have the clang driver pass the value of getClangFullVersion() via a flag when invoking the integrated assembler on assembly source files. rdar://12955296 llvm-svn: 172630	2013-01-16 17:46:23 +00:00
Peter Collingbourne	a51c6ed608	Introduce llvm::sys::getProcessTriple() function. In r143502, we renamed getHostTriple() to getDefaultTargetTriple() as part of work to allow the user to supply a different default target triple at configure time. This change also affected the JIT. However, it is inappropriate to use the default target triple in the JIT in most circumstances because this will not necessarily match the current architecture used by the process, leading to illegal instruction and other such errors at run time. Introduce the getProcessTriple() function for use in the JIT and its clients, and cause the JIT to use it. On architectures with a single bitness, the host and process triples are identical. On other architectures, the host triple represents the architecture of the host CPU, while the process triple represents the architecture used by the host CPU to interpret machine code within the current process. For example, when executing 32-bit code on a 64-bit Linux machine, the host triple may be 'x86_64-unknown-linux-gnu', while the process triple may be 'i386-unknown-linux-gnu'. This fixes JIT for the 32-on-64-bit (and vice versa) build on non-Apple platforms. Differential Revision: http://llvm-reviews.chandlerc.com/D254 llvm-svn: 172627	2013-01-16 17:27:22 +00:00
Benjamin Kramer	b7050f0a7c	Move test that depends on the x86 target into a target-specific directory. Should fix the arm buildbot (which only builds the arm target). llvm-svn: 172611	2013-01-16 13:25:56 +00:00
Alexey Samsonov	1345d35e40	ASan: wrap mapping scale and offset in a struct and make it a member of ASan passes. Add test for non-default mapping scale and offset. No functionality change llvm-svn: 172610	2013-01-16 13:23:28 +00:00
Benjamin Kramer	1f25d24a8f	Remove triple from this test, it makes it fail when X86 TTI is missing. Without a triple opt falls back to NoTTI which comes closer to LSR's pre-TTI behavior. llvm-svn: 172609	2013-01-16 13:19:59 +00:00
Jack Carter	5619f91bf7	reverting 172579 llvm-svn: 172594	2013-01-16 01:29:10 +00:00
Jack Carter	e0c1e1a47e	Akira, Hope you are feeling better. The Mips RDHWR (Read Hardware Register) instruction was not tested for assembler or disassembler consumption. This patch adds that functionality. Contributor: Vladimir Medic llvm-svn: 172579	2013-01-16 00:07:45 +00:00
Eric Christopher	962c9089d9	Split address information for DWARF5 split dwarf proposal. This involves using the DW_FORM_GNU_addr_index and a separate .debug_addr section which stays in the executable and is fully linked. Sneak in two other small changes: a) Print out the debug_str_offsets.dwo section. b) Change form we're expecting the entries in the debug_str_offsets.dwo section to take from ULEB128 to U32. Add tests for all of this in the fission-cu.ll test. llvm-svn: 172578	2013-01-15 23:56:56 +00:00
Nadav Rotem	7df850924d	Teach InstCombine to optimize extract of a value from a vector add operation with a constant zero. llvm-svn: 172576	2013-01-15 23:43:14 +00:00
Shuxin Yang	e822745202	1. Hoist minus sign as high as possible in an attempt to reveal some optimization opportunities (in the enclosing super-expressions). rule 1: (-0.0 - X) * Y => -0.0 - (X * Y) if expression "-0.0 - X" has only one reference. rule 2: (0.0 - X) * Y => -0.0 - (X * Y) if expression "0.0 - X" has only one reference, and the instruction is marked "noSignedZero". 2. Eliminate negation (the compiler was already able to handle these optimizations if the 0.0s are replaced with -0.0). rule 3: (0.0 - X) * (0.0 - Y) => X * Y. rule 4: (0.0 - X) * C => X * -C if the expr is flagged "noSignedZero". 3. Rule 5: (X*Y) * X => (X*X) * Y if X != Y and the expression is flagged with "UnsafeAlgebra". The purpose of this transformation is two-fold: a) to form a power expression (of X). b) potentially shorten the critical path: after transformation, the latency of the instruction Y is amortized by the expression of X*X, and therefore Y is in a "less critical" position compared to what it was before the transformation. 4. Remove the InstCombine code about simplifying "X * select". The reasons are the following: a) The "select" is somewhat architecture-dependent, therefore the higher level optimizers are not able to precisely predict if the simplification really yields any performance improvement or not. b) The "select" operator is a bit complicated, and tends to obscure optimization opportunities. It is better to keep it as low as possible in the expr tree, and let CodeGen tackle the optimization. llvm-svn: 172551	2013-01-15 21:09:32 +00:00
Daniel Dunbar	c36547d422	[IR] Add verification for module flags with the "require" behavior. llvm-svn: 172549	2013-01-15 20:52:06 +00:00
Evgeniy Stepanov	701d2b861e	[msan] Temporarily remove ICmpEQ tests. They are failing on the bots. llvm-svn: 172540	2013-01-15 17:12:04 +00:00
Evgeniy Stepanov	d14e47b146	[msan] Fix handling of equality comparison of pointer vectors. Also improve test coverage of the handling of relational comparisons. llvm-svn: 172539	2013-01-15 16:44:52 +00:00
Renato Golin	51c25b0818	Pattern-matched variables in post-inc-icmpzero.ll Test was failing for clang-native-arm-cortex-a9 build-bot configuration. The reason for the failure was that the test was using hardcoded names. The attached patch fixes this failure by replacing the hard-coded variable names with pattern-matched variable names. Patch by Manish Verma, ARM llvm-svn: 172534	2013-01-15 15:22:45 +00:00
Daniel Dunbar	25c4b5718b	[IR] Add verifier support for llvm.module.flags. - Also, update the LangRef documentation on module flags to match the implementation. llvm-svn: 172498	2013-01-15 01:22:53 +00:00
Jack Carter	f238510c43	This patch fixes a Mips specific bug where we need to generate a N64 compound relocation R_MIPS_GPREL_32/R_MIPS_64/R_MIPS_NONE. The bug was exposed by the SingleSource test case DuffsDevice.c. Contributor: Jack Carter llvm-svn: 172496	2013-01-15 01:08:02 +00:00
Shuxin Yang	320f52a4b0	This change is to implement following rules under the condition C_A and/or C_R. C_A: reassociation is allowed. C_R: reciprocal of a constant C is appropriate, which means - 1/C is exact, or - reciprocal is allowed and 1/C is neither a special value nor a denormal. rule 1: (X/C1) / C2 => X / (C2*C1) (if C_A) => X * (1/(C2*C1)) (if C_A && C_R). rule 2: X*C1 / C2 => X * (C1/C2) (if C_A). rule 3: (X/Y)/Z => X/(Y*Z) (if C_A && at least one of Y and Z is symbolic value). rule 4: Z/(X/Y) => (Z*Y)/X (similar to rule 3). rule 5: C1/(X*C2) => (C1/C2) / X (if C_A). rule 6: C1/(X/C2) => (C1*C2) / X (if C_A). rule 7: C1/(C2/X) => (C1/C2) * X (if C_A). llvm-svn: 172488	2013-01-14 22:48:41 +00:00
Chad Rosier	5c118fd2ec	[ms-inline asm] Extend support for parsing Intel bracketed memory operands that have an arbitrary ordering of the base register, index register and displacement. rdar://12527141 llvm-svn: 172484	2013-01-14 22:31:35 +00:00
Bill Schmidt	d006c6938b	This patch addresses an incorrect transformation in the DAG combiner. The included test case is derived from one of the GCC compatibility tests. The problem arises after the selection DAG has been converted to type-legalized form. The combiner first sees a 64-bit load that can be converted into a pre-increment form. The original load feeds into a SRL that isolates the upper 32 bits of the loaded doubleword. This looks like an opportunity for DAGCombiner::ReduceLoadWidth() to replace the 64-bit load with a 32-bit load. However, this transformation is not valid, as the replacement load is not a pre-increment load. The pre-increment load produces an extra result, which feeds a subsequent add instruction. The replacement load only has one result value, and this value is propagated to all uses of the pre- increment load, including the add. Because the add is looking for the second result value as its operand, it ends up attempting to add a constant to a token chain, resulting in a crash. So the patch simply disables this transformation for any load with more than two result values. llvm-svn: 172480	2013-01-14 22:04:38 +00:00
Andrew Trick	d4e1b5e291	SCEVExpander fix. RAUW needs to update the InsertedExpressions cache. Note that this bug is only exposed because LTO fails to use TTI. Fixes self-LTO of clang. rdar://13007381. llvm-svn: 172462	2013-01-14 21:00:37 +00:00
Michael Gottesman	c99ee6b336	Added bugzilla PR number to test case. llvm-svn: 172369	2013-01-13 22:17:22 +00:00
Michael Gottesman	f15c0bb495	Fixed an infinite loop in the block escape in analysis in ObjCARC caused by 2x blocks each assigned a value via a phi-node causing each to depend on the other. A test case is provided as well. llvm-svn: 172368	2013-01-13 22:12:06 +00:00
Benjamin Kramer	bcd14a0f26	X86: Add patterns for X86ISD::VSEXT in registers. Those can occur when something between the sextload and the store is on the same chain and blocks isel. Fixes PR14887. llvm-svn: 172353	2013-01-13 11:37:04 +00:00
Nadav Rotem	40e45eeae2	Fix PR14547. Handle induction variables of small sizes smaller than i32 (i8 and i16). llvm-svn: 172348	2013-01-13 07:56:29 +00:00
Benjamin Kramer	5ea0349ef5	When lowering an inreg sext first shift left, then right arithmetically. Shifting right two times will only yield zero. Should fix SingleSource/UnitTests/SignlessTypes/factor. llvm-svn: 172322	2013-01-12 19:06:44 +00:00
Michael Gottesman	556ff61122	Fixed bug in ObjCARC where we were changing a call from objc_autoreleaseRV => objc_autorelease but were not updating the InstructionClass to IC_Autorelease. llvm-svn: 172288	2013-01-12 01:25:19 +00:00
Michael Gottesman	c9656faf1e	Fixed a bug where we were tail calling objc_autorelease causing an object to not be placed into an autorelease pool. The reason that this occurs is that tail calling objc_autorelease eventually tail calls -[NSObject autorelease] which supports fast autorelease. This can cause us to violate the semantic guarantees of __autoreleasing variables that assignment to an __autoreleasing variable always yields an object that is placed into the innermost autorelease pool. The fix included in this patch works by: 1. In the peephole optimization function OptimizeIndividualFunctions, always remove tail call from objc_autorelease. 2. Whenever we convert to/from an objc_autorelease, set/unset the tail call keyword as appropriate. NOTE I also handled the case where objc_autorelease is converted in OptimizeReturns to an autoreleaseRV which still violates the ARC semantics. I will be removing that in a later patch and I wanted to make sure that the tree is in a consistent state vis-a-vis ARC always. Additionally some test cases are provided and all tests that have tail call marked objc_autorelease keywords have been modified so that tail call has been removed. NOTE One test fails due to a separate bug that I am going to commit soon. Thus I marked the check line TMP: instead of CHECK: so make check does not fail. llvm-svn: 172287	2013-01-12 01:25:15 +00:00
Jack Carter	873c724b4a	This patch tackles the problem of parsing Mips register names in the standalone assembler llvm-mc. Registers such as $A1 can represent either a 32 or 64 bit register based on the instruction using it. In addition, based on the abi, $T0 can represent different 32 bit registers. The problem is resolved by the Mips specific AsmParser td definitions changing to work together. Many cases of RegisterClass parameters are now RegisterOperand. Contributor: Vladimir Medic llvm-svn: 172284	2013-01-12 01:03:14 +00:00
Nadav Rotem	dbe5c72d03	PPC: Implement efficient lowering of sign_extend_inreg. llvm-svn: 172269	2013-01-11 22:57:48 +00:00
Preston Gurd	99c6990457	Update patch for the pad short functions pass for Intel Atom (only). Adds a check for -Oz, changes the code to not re-visit BBs, and skips over DBG_VALUE instrs. Patch by Andy Zhang. llvm-svn: 172258	2013-01-11 22:06:56 +00:00
Nadav Rotem	e55aa3c848	ARM Cost Model: Modify the target independent cost model to ask the target if it supports the different CAST types. We didn't do this on X86 because of the different register sizes and types, but on ARM this makes sense. llvm-svn: 172245	2013-01-11 19:54:13 +00:00
Eric Christopher	0cb6fd930e	For inline asm: - recognize string "{memory}" in the MI generation - mark as mayload/maystore when there's a memory clobber constraint. PR14859. Patch by Krzysztof Parzyszek llvm-svn: 172228	2013-01-11 18:12:39 +00:00
Tim Northover	3a51aab390	Simplify writing floating types to assembly. This removes previous special cases for each floating-point type in favour of a shared codepath. llvm-svn: 172189	2013-01-11 10:36:13 +00:00
Nadav Rotem	853fe0acb9	ARM Cost Model: We need to detect the max bitwidth of types in the loop in order to select the max vectorization factor. We don't have a detailed analysis on which values are vectorized and which stay scalars in the vectorized loop so we use another method. We look at reduction variables, loads and stores, which are the only ways to get information in and out of loop iterations. If the data types are extended and truncated then the cost model will catch the cost of the vector zext/sext/trunc operations. llvm-svn: 172178	2013-01-11 07:11:59 +00:00
Michael Gottesman	5284bd013f	Converted test dont-tce-tail-marked-call.ll to use FileCheck. llvm-svn: 172172	2013-01-11 04:16:35 +00:00
Michael Gottesman	93a0d49c7e	This commit is a 4x squash commit consisting of 4x functions converted to use FileCheck instead of grep. Messages: Converted test case trivial_codegen_tailcall.ll to use FileCheck. Converted test return_constant.ll to use FileCheck instead of grep. Converted test reorder_load.ll to use FileCheck instead of grep. Converted test intervening-inst.ll to use FileCheck instead of grep. llvm-svn: 172171	2013-01-11 04:12:53 +00:00
Shuxin Yang	c5c730b0e0	PR14904: Segmentation fault running pass 'Recognize loop idioms' The root cause is mistakenly taking for granted that "dyn_cast<Instruction>(a-Value)" returns a non-NULL instruction. llvm-svn: 172145	2013-01-10 23:32:01 +00:00
Evan Cheng	098d7b76b0	CastInst::castIsValid should return true if the dest type is the same as Value's current type. The casting is trivial even for aggregate type. llvm-svn: 172143	2013-01-10 23:22:53 +00:00
NAKAMURA Takumi	e46e8225f4	llvm/test/CodeGen/X86/ms-inline-asm.ll: Fixup; Globals doesn't have leading underscore in symbol on linux. llvm-svn: 172139	2013-01-10 23:02:48 +00:00
Michael J. Spencer	d857c1c9bf	[llvm-objdump] Emit addresses with the correct number of leading 0's. llvm-svn: 172130	2013-01-10 22:40:50 +00:00
Peter Collingbourne	f7d65c43d0	[msan] Change va_start/va_copy shadow memset alignment to 8. This fixes va_start/va_copy of a va_list field which happens to not be laid out at a 16-byte boundary. Differential Revision: http://llvm-reviews.chandlerc.com/D276 llvm-svn: 172128	2013-01-10 22:36:33 +00:00
Evan Cheng	c8444b159a	PR14896: Handle memcpy from constant string where the memcpy size is larger than the string size. llvm-svn: 172124	2013-01-10 22:13:27 +00:00
Chad Rosier	a4bc9437a2	[ms-inline asm] Add support for calling functions from inline assembly. Part of rdar://12991541 llvm-svn: 172121	2013-01-10 22:10:27 +00:00
Owen Anderson	dbf0ca523d	Teach InstCombine to hoist FABS and FNEG through FPTRUNC instructions. The application of these operations commutes with the truncation, so we should prefer to do them in the smallest size we can, to save register space, use smaller constant pool entries, etc. llvm-svn: 172117	2013-01-10 22:06:52 +00:00
Nadav Rotem	6eae65cfac	LoopVectorizer: Fix a bug in the vectorization of BinaryOperators. The BinaryOperator can be folded to an Undef, and we don't want to set NSW flags to undef vals. PR14878 llvm-svn: 172079	2013-01-10 17:34:39 +00:00
Joey Gouly	5fad3e9ad6	Fix a copy/paste error in the IR Linker, casting an ArrayType instead of a VectorType. llvm-svn: 172054	2013-01-10 10:49:36 +00:00
Joey Gouly	58bf951dec	Fix TryToShrinkGlobalToBoolean in GlobalOpt, so that it does not discard address spaces. llvm-svn: 172051	2013-01-10 10:31:11 +00:00
Manman Ren	207bcbacca	Stack Alignment: throw error if we can't satisfy the minimal alignment requirement when creating stack objects in MachineFrameInfo. Add CreateStackObjectWithMinAlign to throw error when the minimal alignment can't be achieved and to clamp the alignment when the preferred alignment can't be achieved. Same is true for CreateVariableSizedObject. Will not emit error in CreateSpillStackObject or CreateStackObject. As long as callers of CreateStackObject do not assume the object will be aligned at the requested alignment, we should not have miscompile since later optimizations which look at the object's alignment will have the correct information. rdar://12713765 llvm-svn: 172027	2013-01-10 01:10:10 +00:00
Nadav Rotem	b1791a75cd	ARM Cost model: Use the size of vector registers and widest vectorizable instruction to determine the max vectorization factor. llvm-svn: 172010	2013-01-09 22:29:00 +00:00
Evan Cheng	5652a8df32	Fix a DAG combine bug. visitBRCOND() is transforming br(xor(x, y)) to br(x != y). It cached XOR's operands before calling visitXOR() but failed to update the operands when visitXOR changed the XOR node. rdar://12968664 llvm-svn: 171999	2013-01-09 20:56:40 +00:00
Benjamin Kramer	130fcde3e5	LICM: Hoist insertvalue/extractvalue out of loops. Fixes PR14854. llvm-svn: 171984	2013-01-09 18:12:03 +00:00
Adhemerval Zanella	1ae2248e14	PowerPC: EH adjustments This patch adjusts r171506 to make all DWARF encoding pc-relative for PPC64. It also adds the R_PPC64_REL32 relocation handling in MCJIT (since the eh_frame will not generate PIC-relative relocation) and also adds the emission of stubs created by the TTypeEncoding. llvm-svn: 171979	2013-01-09 17:08:15 +00:00
Nadav Rotem	3f5825c6c1	add -march to the test llvm-svn: 171956	2013-01-09 07:04:23 +00:00
Nadav Rotem	977e0be4a0	Efficient lowering of vector sdiv when the divisor is a splatted power of two constant. PR 14848. The lowered sequence is based on the existing sequence the target-independent DAG Combiner creates for the scalar case. Patch by Zvi Rackover. llvm-svn: 171953	2013-01-09 05:14:33 +00:00
Andrew Trick	9f0b95f260	MIsched: add an ILP window property to machine model. This was an experimental option, but needs to be defined per-target. e.g. PPC A2 needs to aggressively hide latency. I converted some in-order scheduling tests to A2. Hal is working on more test cases. llvm-svn: 171946	2013-01-09 03:36:49 +00:00
Nadav Rotem	4c66f87e8e	ARM Cost Model: Add a basic vectorization unrolling test. llvm-svn: 171931	2013-01-09 01:29:07 +00:00
Nadav Rotem	30a65bc39e	Remove the -licm pass from the loop vectorizer test because the loop vectorizer does it now. llvm-svn: 171930	2013-01-09 01:20:59 +00:00
Nadav Rotem	b696c36fcd	Cost Model: Move the 'max unroll factor' variable to the TTI and add initial Cost Model support on ARM. llvm-svn: 171928	2013-01-09 01:15:42 +00:00
Shuxin Yang	f0537ab681	Consider expression "0.0 - X" as the negation of X if - this expression is explicitly marked no-signed-zero, or - no-signed-zero of this expression can be derived from some context. llvm-svn: 171922	2013-01-09 00:13:41 +00:00
Tim Northover	90fb75d859	Specify complete triple for fp128 tests. This avoids FileCheck failing over different comment characters in assembly (notably powerpc64 on Linux vs Darwin) and should fix David's build-bot. llvm-svn: 171886	2013-01-08 19:36:33 +00:00
Jack Carter	c3dd91c4d7	This patch produces the correct addend value for an R_MIPS_GPREL16 relocation. Contributor: Jack Carter llvm-svn: 171882	2013-01-08 19:01:28 +00:00
Jack Carter	9e28cd3fad	This patch produces the correct pointer size value in the 64 bit .eh_frame section. It doesn't however allow exception handling to work yet since it depends on the correct relocation model being set in the ELF header flags. Contributor: Jack Carter llvm-svn: 171881	2013-01-08 18:53:20 +00:00
Preston Gurd	a01daace88	Pad Short Functions for Intel Atom The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. This patch has been updated to address Nadav's review comments - Optimize only at >= O1 and don't do optimization if -Os is set - Stores MachineBasicBlock* instead of BBNum - Uses DenseMap instead of std::map - Fixes placement of braces Patch by Andy Zhang. llvm-svn: 171879	2013-01-08 18:27:24 +00:00
Tim Northover	7bb9992cce	Allow the asm printer to print fp128 values properly. llvm-svn: 171866	2013-01-08 16:56:23 +00:00
Bill Wendling	76c6521ba1	Make sure we don't emit instructions before a landingpad instruction. PR14782 llvm-svn: 171846	2013-01-08 10:51:32 +00:00
Eric Christopher	95de6bd469	Add the C testcase to this file. Suggested by Dave Blaikie. llvm-svn: 171839	2013-01-08 03:03:14 +00:00
Eric Christopher	72a529566c	Remove the llvm-local DW_TAG_vector_type tag and add a test to make sure that vector types do work. llvm-svn: 171833	2013-01-08 01:53:52 +00:00
David Blaikie	5c0b298b91	Mark artificial types as such in the annotated debug output. llvm-svn: 171826	2013-01-08 00:31:02 +00:00
Nadav Rotem	5a197c06f3	LoopVectorizer: Add support for floating point reductions llvm-svn: 171812	2013-01-07 23:13:00 +00:00
Eli Bendersky	3e76eb2dbc	Add some additional tests for the .bundle_lock align_to_end feature that didn't make it into the last commit. Also, update the test-generation script to generate an exhaustive test for align_to_end as well, and include the generated test. llvm-svn: 171811	2013-01-07 23:12:59 +00:00
Nadav Rotem	c60d7d96f5	LoopVectorizer: When we vectorize and widen loops we process many elements at once. This is a good thing, except for small loops. On small loops, the post-loop that handles scalars (and runs slower) can take more time to execute than the rest of the loop. This patch disables widening of loops with a small static trip count. llvm-svn: 171798	2013-01-07 21:54:51 +00:00
Eli Bendersky	802b62871e	Add the align_to_end option to .bundle_lock in the MC implementation of aligned bundling. The document describing this feature and the implementation has also been updated: https://sites.google.com/a/chromium.org/dev/nativeclient/pnacl/aligned-bundling-support-in-llvm llvm-svn: 171797	2013-01-07 21:51:08 +00:00
Shuxin Yang	df0e61e793	This change is to implement the following rules: o. X/C1 * C2 => X * (C2/C1) (if C2/C1 is neither special FP nor denormal) o. X/C1 * C2 => X/(C1/C2) (if C2/C1 is either special FP or denormal, but C1/C2 is a normal FP) Let MDC denote multiplication or division with one & only one operand being a constant o. (MDC ± C1) * C2 => (MDC * C2) ± (C1 * C2) (so long as the constant-folding doesn't yield any denormal or special value) llvm-svn: 171793	2013-01-07 21:39:23 +00:00
Eric Christopher	2cbd5767ad	Add support for separating strings for the split debug info DWARF5 proposal. This leaves the strings in the skeleton die as strp, but in all dwo files they're accessed now via DW_FORM_GNU_str_index. Add support for dumping these sections and modify the fission-cu.ll testcase to have the correct strings and form. Fix a small bug in the fixed form sizes routine that involved out of array accesses for the table and add a FIXME in the extractFast routine to fix this up. llvm-svn: 171779	2013-01-07 19:32:41 +00:00
Bill Schmidt	9b1e3e25dc	This patch addresses bug 14678 by fixing two problems in medium code model code generation. Variables addressed through a GlobalAlias were not being handled, and variables with available_externally linkage were treated incorrectly. The patch contains two new tests to verify the correct code generation for these cases. llvm-svn: 171778	2013-01-07 19:29:18 +00:00
Quentin Colombet	3b2db0bcd3	When code size is the priority (Oz, MinSize attribute), help LLVM turn code like this: if (foo) free(foo) into this: free(foo) Move a call to free from basic block FB into FB's predecessor, P, when the path from P to FB is taken only if the argument of free is not equal to NULL. Some restrictions apply on P and FB to be sure that this code motion is profitable. Namely: 1. FB must have only one predecessor P. 2. FB must contain only the call to free plus an unconditional branch to S. 3. P's successors are FB and S. Because of 1., we will not increase the code size when moving the call to free from FB to P. Because of 2., FB will be empty after the move. Because of 2. and 3., P's branch instruction becomes useless, as does FB (simplifycfg will do the job). llvm-svn: 171762	2013-01-07 18:37:41 +00:00
David Blaikie	8a9b6a3681	Make test/DebugInfo/member-pointers.ll portable by removing the TargetData llvm-svn: 171759	2013-01-07 17:52:49 +00:00
Chandler Carruth	26c59fa870	Switch the SCEV expander and LoopStrengthReduce to use TargetTransformInfo rather than TargetLowering, removing one of the primary instances of the layering violation of Transforms depending directly on Target. This is a really big deal because LSR used to be a "special" pass that could only be tested fully using llc and by looking at the full output of it. It also couldn't run with any other loop passes because it had to be created by the backend. No longer is this true. LSR is now just a normal pass and we should probably lift the creation of LSR out of lib/CodeGen/Passes.cpp and into the PassManagerBuilder. =] I've not done this, or updated all of the tests to use opt and a triple, because I suspect someone more familiar with LSR would do a better job. This change should be essentially without functional impact for normal compilations, and only change behavior of targetless compilations. The conversion required changing all of the LSR code to refer to the TTI interfaces, which fortunately are very similar to TargetLowering's interfaces. However, it also allowed us to always expect to have some implementation around. I've pushed that simplification through the pass, and leveraged it to simplify code somewhat. It required some test updates for one of two things: either we used to skip some checks altogether but now we get the default "no" answer for them, or we used to have no information about the target and now we do have some. I've also started the process of removing AddrMode, as the TTI interface doesn't use it any longer. In some cases this simplifies code, and in others it adds some complexity, but I think it's not a bad tradeoff even there. Subsequent patches will try to clean this up even further and use other (more appropriate) abstractions. Yet again, almost all of the formatting changes brought to you by clang-format. =] llvm-svn: 171735	2013-01-07 14:41:08 +00:00
David Tweed	3f90937535	Fix a mistaken commit that included some debugging code. llvm-svn: 171734	2013-01-07 13:41:55 +00:00
David Tweed	a11edf0ce3	There was a switch fall-through in the parser for textual LLVM that caused bogus comparison operands to default to eq/oeq. Fix that, fix a couple of tests that accidentally passed and test for bogus comparison operators explicitly. llvm-svn: 171733	2013-01-07 13:32:38 +00:00
Silviu Baranga	a055aab506	Make the MergeGlobals pass correctly handle the address space qualifiers of the global variables. We partition the set of globals by their address space, and apply the same transformation as before to merge them. llvm-svn: 171730	2013-01-07 12:31:25 +00:00
Chandler Carruth	7383bfd67e	Switch BBVectorize to directly depend on having a TTI analysis. This could be simplified further, but Hal has a specific feature for ignoring TTI, and so I preserved that. Also, I needed to use it because a number of tests fail when switching from a null TTI to the NoTTI nonce implementation. That seems suspicious to me and so may be something that you need to look into Hal. I worked it by preserving the old behavior for these tests with the flag that ignores all target info. llvm-svn: 171722	2013-01-07 10:22:36 +00:00
David Blaikie	5d3249b554	PR14759: Debug info support for C++ member pointers. This works fine with GDB for member variable pointers, but GDB's support for member function pointers seems to be quite unrelated to DW_TAG_ptr_to_member_type. (see GDB bug 14998 for details) llvm-svn: 171698	2013-01-07 05:51:15 +00:00
Craig Topper	4f1c7256f9	Fix suffix handling for parsing and printing of cvtsi2ss, cvtsi2sd, cvtss2si, cvttss2si, cvtsd2si, and cvttsd2si to match gas behavior. cvtsi2* should parse with an 'l' or 'q' suffix or no suffix at all. No suffix should be treated the same as 'l' suffix. Printing should always print a suffix. Previously we didn't parse or print an 'l' suffix. cvtt2si/cvt2si should parse with an 'l' or 'q' suffix or no suffix at all. No suffix should use the destination register size to choose encoding. Printing should not print a suffix. Original 'l' suffix issue with cvtsi2* pointed out by Michael Kuperstein. llvm-svn: 171668	2013-01-06 20:39:29 +00:00
Evan Cheng	3fb03e23a4	Fix for PR14739. It's not safe to fold a load into a call across a store. Thanks to Nick Lewycky for the initial patch. llvm-svn: 171665	2013-01-06 19:00:15 +00:00
Andrew Trick	f950ce8e38	Fix a crash in LSR replaceCongruentIVs. Indirect branch in the preheader crashes replaceCongruentIVs. Fixes rdar://12910141. llvm-svn: 171653	2013-01-06 05:59:39 +00:00
Michael J. Spencer	c445408710	[Object][ELF] Fix incorrect size of members for the 64 version of Elf_Phdr_Impl. llvm-svn: 171650	2013-01-06 03:57:11 +00:00
Michael J. Spencer	209565db2d	[objdump] Add --private-headers, -p. This currently prints the ELF program headers. llvm-svn: 171649	2013-01-06 03:56:49 +00:00
David Blaikie	e05754576b	Include access modifiers in subprogram metadata IR comment. Based on code review feedback in r171604 from Chandler Carruth & Eric Christopher. llvm-svn: 171636	2013-01-05 21:39:33 +00:00
David Blaikie	800a916f99	Emit DW_TAG_formal_parameter for unnamed parameters. This change essentially reverts r87069 which came without a test case. It causes no regressions in the GDB 7.5 test suite & fixes 25 xfails (commit to the test suite to follow). If anyone can present a test case that demonstrates why this check is necessary I'd be happy to account for it in one way or another. llvm-svn: 171609	2013-01-05 07:43:02 +00:00
Craig Topper	92a70b1e65	Recommit r171461 which was incorrectly reverted. Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks. llvm-svn: 171608	2013-01-05 07:39:25 +00:00
Nadav Rotem	478b6a47ec	Revert revision 171524. Original message: URL: http://llvm.org/viewvc/llvm-project?rev=171524&view=rev Log: The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. Patch by Andy Zhang. llvm-svn: 171603	2013-01-05 05:42:48 +00:00
Nadav Rotem	f19d515316	Fix a typo. Remove the duplicated test. llvm-svn: 171584	2013-01-05 01:17:46 +00:00
Nadav Rotem	e9f5bfd5e9	LoopVectorize: Non commutative operators can be used as reduction variables as long as the reduction chain is used in the LHS. PR14803. llvm-svn: 171583	2013-01-05 01:15:47 +00:00
Nadav Rotem	6d9dafe3ff	Force a fixed unroll count on the target independent tests. This should fix clang-native-arm-cortex-a9. Thanks Renato. llvm-svn: 171582	2013-01-05 00:58:48 +00:00
Andrew Trick	18021a45aa	tabs-to-spaces llvm-svn: 171550	2013-01-04 23:11:35 +00:00
Paul Redmond	874f01e956	Do not vectorize loops with subtraction reductions Since subtraction does not commute the loop vectorizer incorrectly vectorizes reductions such as x = A[i] - x. Disabling for now. llvm-svn: 171537	2013-01-04 22:10:16 +00:00
Eric Christopher	cad9b53c02	Add a name for the anonymous type we're creating for subrange types and a FIXME for what we should be doing. Should solve the immediacy of PR12069 where our debug info is crashing another tool. llvm-svn: 171536	2013-01-04 21:51:53 +00:00
Preston Gurd	e36b685a94	The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. Patch by Andy Zhang. llvm-svn: 171524	2013-01-04 20:54:54 +00:00
Michael J. Spencer	bae14cef80	[Object][ELF] Add a maximum alignment. This is used by createELFObjectFile to create a properly aligned reader. llvm-svn: 171520	2013-01-04 20:36:28 +00:00
Akira Hatanaka	b13b33359b	[mips] MipsTargetLowering::getSetCCResultType should return a vector type if vectors are being compared. llvm-svn: 171517	2013-01-04 20:06:01 +00:00
Manman Ren	fe5a61edbe	Memory Dependence Analysis: fix a miscompile that uses DT to approximate the reachability. We conservatively approximate the reachability analysis by saying it is not reachable if there is a single path starting from "From" and the path does not reach "To". rdar://12801584 llvm-svn: 171512	2013-01-04 19:19:47 +00:00
Adhemerval Zanella	9b0b781395	PowerPC: Fix eh_frame relocation for PIC This patch fixes the PPC eh_frame definitions for the personality and frame unwinding for PIC objects. It makes PIC builds correctly create relative relocations in the '.rela.eh_frame' segments, thus avoiding a text relocation that generates a DT_TEXTREL segment in the link phase. llvm-svn: 171506	2013-01-04 19:08:13 +00:00
Nadav Rotem	e1d5c4b8b9	LoopVectorizer: 1. Add code to estimate register pressure. 2. Add code to select the unroll factor based on register pressure. 3. Add bits to TargetTransformInfo to provide the number of registers. llvm-svn: 171469	2013-01-04 17:48:25 +00:00
Nadav Rotem	c616a5408a	Revert revision: 171467. This transformation is incorrect and makes some tests fail. Original message: Simplified TRUNCATE operation that comes after SETCC. It is possible since SETCC result is 0 or -1. Added a test. llvm-svn: 171468	2013-01-04 17:35:21 +00:00
Elena Demikhovsky	5f2f06d2d9	Simplified TRUNCATE operation that comes after SETCC. It is possible since SETCC result is 0 or -1. Added a test. llvm-svn: 171467	2013-01-03 08:48:33 +00:00
Michael Gottesman	820aac1c78	Revert "Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks." This reverts commit r171461 since it breaks the following tests: Clang :: Analysis/outofbound-notwork.c Clang :: Analysis/string-fail.c Clang :: CXX/basic/basic.lookup/basic.lookup.qual/p6-0x.cpp Clang :: CXX/basic/basic.lookup/basic.lookup.unqual/p15.cpp Clang :: CXX/dcl.dcl/dcl.spec/dcl.fct.spec/p4.cpp Clang :: CXX/dcl.dcl/dcl.spec/dcl.stc/p10.cpp Clang :: CXX/temp/temp.param/p14.cpp Clang :: CXX/temp/temp.res/temp.dep.res/temp.point/p1.cpp Clang :: CodeGen/2009-02-13-zerosize-union-field-ppc.c Clang :: CodeGen/blocks-2.c Clang :: CodeGen/libcalls-d.c Clang :: CodeGen/libcalls-ld.c Clang :: CodeGenCXX/conversion-function.cpp Clang :: CodeGenCXX/debug-info-limit-type.cpp Clang :: CodeGenCXX/inheriting-constructor.cpp Clang :: FixIt/fixit-errors.c Clang :: FixIt/fixit-pmem.cpp Clang :: Modules/namespaces.cpp Clang :: PCH/changed-files.c Clang :: PCH/pr4489.c Clang :: PCH/source-manager-stack.c Clang :: Parser/cxx-ambig-decl-expr-xfail.cpp Clang :: SemaCXX/switch-implicit-fallthrough-cxx98.cpp Clang :: SemaTemplate/instantiate-function-1.mm llvm-svn: 171466	2013-01-03 08:18:30 +00:00
Craig Topper	7c27cc9fd0	Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks. llvm-svn: 171461	2013-01-03 06:40:20 +00:00
Nadav Rotem	d554a517c0	LoopVectorizer: Test the unrolling flag. llvm-svn: 171446	2013-01-03 01:47:31 +00:00
Michael J. Spencer	e0219f78d3	[Object] Temporarily disable these tests. They are failing because archives create unaligned ELF files. The recent Endian change added a __builtin_unreachable() when this happens. I will be committing a fix for this soon. llvm-svn: 171438	2013-01-03 01:24:32 +00:00
Jakob Stoklund Olesen	725d57682b	Fix PR14732 by handling all kinds of IMPLICIT_DEF live ranges. Most IMPLICIT_DEF instructions are removed by the ProcessImplicitDefs pass, and a few are reinserted by PHIElimination when a PHI argument is <undef>. RegisterCoalescer was assuming that all IMPLICIT_DEF live ranges look like those created by PHIElimination, and that their live range never leaves the basic block. The PR14732 test case does tricks with PHI nodes that causes a longer IMPLICIT_DEF live range to appear. This happens very rarely, but RegisterCoalescer should be able to handle it. llvm-svn: 171435	2013-01-03 00:47:51 +00:00
Nadav Rotem	4897392360	Avoid vectorization when the function has the "noimplicitfloat" attribute. llvm-svn: 171429	2013-01-02 23:54:43 +00:00
Eric Christopher	da4b2195fc	Extend the dumping infrastructure to deal with additional sections for debug info. These are some of the dwo sections from the DWARF5 split debug info proposal. Update the fission-cu.ll testcase to show what we should be able to dump more of now. Work in progress: Ultimately the relocations will be gone for the dwo section and the strings will be a different form (as well as the rest of the sections will be included). llvm-svn: 171428	2013-01-02 23:52:13 +00:00
Tom Stellard	567f886eb0	DAGCombiner: Avoid generating illegal vector INT_TO_FP nodes DAGCombiner::reduceBuildVecConvertToConvertBuildVec() was making two mistakes: 1. It was checking the legality of scalar INT_TO_FP nodes and then generating vector nodes. 2. It was passing the result value type to TargetLoweringInfo::getOperationAction() when it should have been passing the value type of the first operand. llvm-svn: 171420	2013-01-02 22:13:01 +00:00
Kevin Enderby	726e0ea6eb	Adds missing aliases for fcom and fcomp instructions without arguments. Patch by Michael M Kuperstein! llvm-svn: 171414	2013-01-02 21:20:15 +00:00
Nadav Rotem	c8d7047fa9	AVX: Fix a bug in WidenMaskArithmetic. llvm-svn: 171397	2013-01-02 17:40:39 +00:00
Dmitri Gribenko	86fb558d9a	Tests: rewrite 'opt ... %s' to 'opt ... < %s' so that opt does not emit a ModuleID This is done to avoid odd test failures, like the one fixed in r171243. While there, FileCheck'ize tests. llvm-svn: 171344	2013-01-01 14:04:36 +00:00
Dmitri Gribenko	d7beca87f5	Tests: rewrite 'opt ... %s' to 'opt ... < %s' so that opt does not emit a ModuleID This is done to avoid odd test failures, like the one fixed in r171243. My previous regex was not good enough to find these. llvm-svn: 171343	2013-01-01 13:57:25 +00:00
Nadav Rotem	b1615b1ac4	Make opt grab the triple from the module and use it to initialize the target machine. llvm-svn: 171341	2013-01-01 08:00:32 +00:00
Nuno Lopes	d896a400f1	recommit r171298 (add support for PHI nodes to ObjectSizeOffsetVisitor). Hopefully with bugs corrected now. llvm-svn: 171325	2012-12-31 20:45:10 +00:00
Benjamin Kramer	af463573cb	Revert "add support for PHI nodes to ObjectSizeOffsetVisitor" This reverts r171298. Breaks clang selfhost. llvm-svn: 171318	2012-12-31 19:51:10 +00:00
Jakub Staszak	c48bbe7170	Add extra CHECK to make sure that 'or' instruction was replaced. Also add an assert to avoid confusion in the code where it is known that C1 <= C2. llvm-svn: 171310	2012-12-31 18:26:42 +00:00
Rafael Espindola	c8288c103d	Fix bits check in ELFObjectFile::isSectionZeroInit(). Fixes PR14723. Patch by Sami Liedes! llvm-svn: 171309	2012-12-31 18:20:51 +00:00
Rafael Espindola	21bd841d27	Dump sections. Extracted from a patch by Sami Liedes. llvm-svn: 171304	2012-12-31 16:29:44 +00:00
Rafael Espindola	144af2cb4d	Print a header above the symbols. Extracted from a patch by Sami Liedes. llvm-svn: 171302	2012-12-31 16:05:21 +00:00
Nuno Lopes	7ab7c02d23	add support for PHI nodes to ObjectSizeOffsetVisitor llvm-svn: 171298	2012-12-31 13:52:36 +00:00
Chris Lattner	f5cca68c2c	Fix LICM's memory promotion optimization to preserve TBAA tags when promoting a store in a loop. This was noticed when working on PR14753, but isn't directly related. llvm-svn: 171281	2012-12-31 08:37:17 +00:00
Chris Lattner	eeefe1bc07	teach instcombine to preserve TBAA tag when merging two stores, part of PR14753 llvm-svn: 171279	2012-12-31 08:10:58 +00:00
Jakub Staszak	ea2b9b9d67	Transform (A == C1 || A == C2) into (A & ~(C1 ^ C2)) == C1 if C1 and C2 differ only with one bit. Fixes PR14708. llvm-svn: 171270	2012-12-31 00:34:55 +00:00
Hal Finkel	6dbdd4307b	Support ppcf128 in SelectionDAG::getConstantFP Fixes pr14751. Patch by Kai; Thanks! llvm-svn: 171261	2012-12-30 19:03:32 +00:00
Nadav Rotem	0b37f14371	LoopVectorizer: Fix a bug in the code that updates the loop exiting block. LCSSA PHIs may have undef values. The vectorizer updates values that are used by outside users such as PHIs. The bug happened because undefs are not loop values. This patch handles these PHIs. PR14725 llvm-svn: 171251	2012-12-30 07:47:00 +00:00
Dmitri Gribenko	56bf2e1830	Tests: rewrite 'opt ... %s' to 'opt ... < %s' so that opt does not emit a ModuleID This is done to avoid odd test failures, like the one fixed in r171243. llvm-svn: 171250	2012-12-30 02:33:22 +00:00
Dmitri Gribenko	10c4b4d249	Add a check to the test Analysis/ScalarEvolution/2010-09-03-RequiredTransitive.ll This test did not test anything at all (except for opt crashing, but that was not the reason why it was added). llvm-svn: 171248	2012-12-30 01:42:34 +00:00
Dmitri Gribenko	b137c9e551	Tests: rewrite 'opt ... %s' to 'opt ... < %s' so that opt does not emit a ModuleID This is done to avoid odd test failures, like the one fixed in r171243. llvm-svn: 171246	2012-12-30 01:28:40 +00:00
NAKAMURA Takumi	5a495a5c96	llvm/test/Transforms/GVN/null-aliases-nothing.ll: Fix a RUN line not to emit ModuleID. Larry Evans reported it fails if source tree contains "load", like "download". llvm-svn: 171243	2012-12-30 00:33:26 +00:00
Chandler Carruth	86ed53089f	Fix a stunning oversight in the inline cost analysis. It was never propagating one of the values it simplified to a constant across a myriad of instructions. Notably, ptrtoint instructions when we had a constant pointer (say, 0) didn't propagate that, blocking a massive number of down-stream optimizations. This was uncovered when investigating why we fail to inline and delete the boilerplate in: void f() { std::vector<int> v; v.push_back(1); } It turns out most of the efforts I've made thus far to improve the analysis weren't making it far purely because of this. After this is fixed, the store-to-load forwarding patch enables LLVM to optimize the above to an empty function. We still can't nuke a second push_back, but for different reasons. There is a very real chance this will cause somewhat noticeable changes in inlining behavior, so please let me know if you see regressions (or improvements!) because of this patch. llvm-svn: 171196	2012-12-28 14:43:42 +00:00
Chandler Carruth	753e21d057	Teach the inline cost analysis about calls that can be simplified and how to propagate constants through insert and extract value instructions. With the recent improvements to instsimplify, this allows inline cost analysis to constant fold through intrinsic functions, including notably the with.overflow intrinsic math routines which often show up inside of STL abstractions. This is yet another piece in the puzzle of breaking down the code for: void f() { std::vector<int> v; v.push_back(1); } But it still isn't enough. There are a pile of bugs in inline cost still blocking this. llvm-svn: 171195	2012-12-28 14:23:32 +00:00
Chandler Carruth	f6182155f6	Teach instsimplify to use the constant folder where appropriate for constant folding calls. Add the initial tests for this which show that now instsimplify can simplify blindingly obvious code patterns expressed with both intrinsics and library calls. llvm-svn: 171194	2012-12-28 14:23:29 +00:00
Nadav Rotem	3da9ac72fa	AVX: Move the ZEXT/ANYEXT DAGCo optimizations to the lowering of these optimizations. The old test cases still cover all of these lowering/optimizations. The single change that we have is that now anyext does not need to zero a register, because it does not use the exact code path as the zero_extend. llvm-svn: 171178	2012-12-28 05:45:24 +00:00
Alexey Samsonov	29dd7f2090	[ASan] Fix lifetime intrinsics handling. Now for each intrinsic we check if it describes one of 'interesting' allocas. Assume that allocas can go through casts and phi-nodes before appearing as llvm.lifetime arguments. llvm-svn: 171153	2012-12-27 08:50:58 +00:00
Nadav Rotem	2a054b4475	On AVX/AVX2 the type v8i1 is legalized to v8i16, which is an XMM sized register. In most cases we actually compare or select YMM-sized registers and mixing the two types creates horrible code. This commit optimizes some of the transition sequences. PR14657. llvm-svn: 171148	2012-12-27 08:15:45 +00:00
Eric Christopher	3bf29fda91	For the dwarf5 split debug info code split out the string section per compile unit/skeleton compile unit. Update tests accordingly. llvm-svn: 171133	2012-12-27 02:14:01 +00:00
Eric Christopher	c8a88ee691	FileCheck-ize. llvm-svn: 171132	2012-12-27 02:13:58 +00:00
Eric Christopher	d6152aabbb	FileCheck-ize. llvm-svn: 171131	2012-12-27 02:13:55 +00:00
Eric Christopher	5a6acfa4c8	Right now all of the relocations are 32-bit dwarf, and the relocation information doesn't return an addend for Rel relocations. Go ahead and use this information to fix relocation handling inside dwarfdump for 32-bit ELF REL. llvm-svn: 171126	2012-12-27 01:07:07 +00:00
Nadav Rotem	5350cd314b	If all of the write objects are identified then we can vectorize the loop even if the read objects are unidentified. PR14719. llvm-svn: 171124	2012-12-26 23:30:53 +00:00
Nadav Rotem	3f7c4f36ba	LoopVectorizer: Optimize the vectorization of consecutive memory access when the iteration step is -1 llvm-svn: 171114	2012-12-26 19:08:17 +00:00
Evgeniy Stepanov	5eb5bf8b46	[msan] Raise alignment of origin stores/loads when possible. Origin alignment is as high as the alignment of the corresponding application location, but never less than 4. llvm-svn: 171110	2012-12-26 11:55:09 +00:00
NAKAMURA Takumi	40aa3285f4	llvm/test/CodeGen/X86: FileCheck-ize two tests in r171083. llvm-svn: 171084	2012-12-26 03:19:30 +00:00
NAKAMURA Takumi	334f685328	llvm/test/CodeGen/X86: Disable avx in two tests corresponding to r171082. llvm-svn: 171083	2012-12-26 03:08:55 +00:00
Hal Finkel	30e95a8ebb	BBVectorize: Use VTTI to compute costs for intrinsics vectorization For the time being this includes only some dummy test cases. Once the generic implementation of the intrinsics cost function does something other than assuming scalarization in all cases, or some target specializes the interface, some real test cases can be added. Also, for consistency, I changed the type of IID from unsigned to Intrinsic::ID in a few other places. llvm-svn: 171079	2012-12-26 01:36:57 +00:00
Hal Finkel	b44f890133	LoopVectorize: Enable vectorization of the fmuladd intrinsic llvm-svn: 171076	2012-12-25 23:21:29 +00:00
Hal Finkel	2a456112ec	BBVectorize: Enable vectorization of the fmuladd intrinsic llvm-svn: 171075	2012-12-25 22:36:08 +00:00
Hal Finkel	2ebe6d08cd	Loosen scheduling restrictions on the PPC dcbt intrinsic As with the prefetch intrinsic to which it maps, simply have dcbt marked as reading from and writing to its arguments instead of having unmodeled side effects. While this might cause unwanted code motion (because aliasing checks don't really capture cache-line sharing), it is more important that prefetches in unrolled loops don't block the scheduler from rearranging the unrolled loop body. llvm-svn: 171073	2012-12-25 18:51:18 +00:00
Hal Finkel	1b5ff08d43	Expand PPC64 atomic load and store Use of store or load with the atomic specifier on 64-bit types would cause instruction-selection failures. As with the 32-bit case, these can use the default expansion in terms of cmp-and-swap. llvm-svn: 171072	2012-12-25 17:22:53 +00:00
Evgeniy Stepanov	f19c086d1e	[msan] Fix handling of vectors of pointers. VectorType::getInteger() can not be used with them, because pointer size depends on the target. llvm-svn: 171070	2012-12-25 16:04:38 +00:00
Evgeniy Stepanov	ec8371283b	[msan] Fix handling of select with vector condition. llvm-svn: 171069	2012-12-25 14:56:21 +00:00
Benjamin Kramer	a9f265ee98	Harden test so it's not affected by changes to compare lowering. This only failed on hosts that don't have SSE41. llvm-svn: 171066	2012-12-25 13:23:23 +00:00
Benjamin Kramer	81b5a8fd2e	X86: Shave off one shuffle from the pcmpeqq sequence for SSE2 by making use of and commutativity. llvm-svn: 171064	2012-12-25 13:09:08 +00:00
Benjamin Kramer	df4af41b9b	X86: Custom lower <2 x i64> eq and ne when SSE41 is not available. pcmpeqd, pshufd, pshufd, pand is cheaper than unpack + cmpq, sbbq, cmpq, sbbq + pack. Small speedup on loop-vectorized viterbi (-march=core2). llvm-svn: 171063	2012-12-25 12:54:19 +00:00
Nick Lewycky	fb43258080	Fix typo "Makre" -> "Make". llvm-svn: 171043	2012-12-24 19:55:47 +00:00
NAKAMURA Takumi	1b18db7ea3	llvm/test/CodeGen/X86/fold-vex.ll: Add explicit triple. llvm-svn: 171029	2012-12-24 11:14:06 +00:00
Nadav Rotem	dc0ad92b64	Some x86 instructions can load/store one of the operands to memory. On SSE, this memory needs to be aligned. When these instructions are encoded in VEX (on AVX) there is no such requirement. This changes the folding tables and removes the alignment restrictions from VEX-encoded instructions. llvm-svn: 171024	2012-12-24 09:40:33 +00:00
Nadav Rotem	5f7c12cfbd	LoopVectorizer: When checking for vectorizable types, also check the StoreInst operands. PR14705. llvm-svn: 171023	2012-12-24 09:14:18 +00:00
Nadav Rotem	bd5d1d832a	LoopVectorizer: Fix an endless loop in the code that looks for reductions. The bug was in the code that detects PHIs in if-then-else block sequence. PR14701. llvm-svn: 171008	2012-12-24 01:22:06 +00:00
Nadav Rotem	cf9999d9d5	CostModel: Change the default target-independent implementation for finding the cost of arithmetic functions. We now assume that the cost of arithmetic operations that are marked as Legal or Promote is low, but ops that are marked as custom are higher. llvm-svn: 171002	2012-12-23 17:31:23 +00:00
Nadav Rotem	aa92ea4f12	We are not ready to estimate the cost of integer expansions based on the number of parts. This test is too noisy. llvm-svn: 170999	2012-12-23 09:11:07 +00:00
Nadav Rotem	2cade68025	Loop Vectorizer: Update the cost model of scatter/gather operations and make them more expensive. llvm-svn: 170995	2012-12-23 07:23:55 +00:00
Benjamin Kramer	76268ac682	X86: Turn mul of <4 x i32> into pmuludq when no SSE4.1 is available. pmuludq is slow, but it turns out that all the unpacking and packing of the scalarized mul is even slower. 10% speedup on loop-vectorized paq8p. llvm-svn: 170985	2012-12-22 16:07:56 +00:00
Benjamin Kramer	b2f0a2bd4b	X86: Emit vector sext as shuffle + sra if vpmovsx is not available. Also loosen the SSSE3 dependency a bit, expanded pshufb + psra is still better than scalarized loads. Fixes PR14590. llvm-svn: 170984	2012-12-22 11:34:28 +00:00
Nadav Rotem	d5aae980cb	In some cases, due to scheduling constraints we copy the EFLAGS. The only way to read the eflags is using push and pop. If we don't adjust the stack then we run over the first frame index. This is not something that we want to do, so we have to make sure that our machine function does not copy the flags. If it does then we have to emit the prolog that adjusts the stack. rdar://12896831 llvm-svn: 170961	2012-12-21 23:48:49 +00:00
Akira Hatanaka	d6b694f036	[mips] Fix encoding of BAL instruction. Also, fix assembler test case which was not catching the error. llvm-svn: 170953	2012-12-21 23:13:59 +00:00
Benjamin Kramer	b4688f84bd	try to unbreak ppc buildbots. llvm-svn: 170913	2012-12-21 18:11:45 +00:00
Benjamin Kramer	82d1c371e2	X86: Match pmin/pmax as a target specific dag combine. This occurs during vectorization. Part of PR14667. llvm-svn: 170908	2012-12-21 17:46:58 +00:00
Tom Stellard	a8b0351720	R600: Expand vec4 INT <-> FP conversions llvm-svn: 170901	2012-12-21 16:33:24 +00:00
Evgeniy Stepanov	4fbc0d08bf	[msan] Remove unreachable blocks before instrumenting a function. llvm-svn: 170883	2012-12-21 11:18:49 +00:00
Nadav Rotem	6d4fdd6d2c	Improve the X86 cost model for loads and stores. llvm-svn: 170830	2012-12-21 01:33:59 +00:00
Reed Kotler	93f778d2bd	Add test case for r170674 llvm-svn: 170823	2012-12-21 00:55:10 +00:00
Nadav Rotem	e7785686a5	Fix a bug in the code that checks if we can vectorize loops while using dynamic memory bound checks. Before the fix we were able to vectorize this loop from the Livermore Loops benchmark: for ( k=1 ; k<n ; k++ ) x[k] = x[k-1] + y[k]; llvm-svn: 170811	2012-12-21 00:07:35 +00:00
Eric Christopher	6e47b725ff	Move these files over to the debug info directory. llvm-svn: 170810	2012-12-21 00:03:42 +00:00
Bob Wilson	7bba4f8957	Revert "Adding support for llvm.arm.neon.vaddl[su].* and" This reverts r170694. The operations can be represented in IR without adding any new intrinsics. llvm-svn: 170765	2012-12-20 21:09:38 +00:00
Nadav Rotem	2ababf68d7	LoopVectorize: Fix a bug in the scalarization of instructions. Before if-conversion we could check if a value is loop invariant if it was declared inside the basic block. Now that loops have multiple blocks this check is incorrect. This fixes External/SPEC/CINT95/099_go/099_go llvm-svn: 170756	2012-12-20 20:24:40 +00:00
Evan Cheng	ddc0cb6dc5	On some ARM cpus, flags setting movs with shifter operand, i.e. lsl, lsr, asr, are more expensive than the non-flag setting variant. Teach thumb2 size reduction pass to avoid generating them unless we are optimizing for size. rdar://12892707 llvm-svn: 170728	2012-12-20 19:59:30 +00:00
Eli Bendersky	4cfb5b9e64	Change Lit error redirection to FileCheck to a more common syntax since it can potentially cause some bots to fail. llvm-svn: 170726	2012-12-20 19:54:02 +00:00
Eli Bendersky	f658e92724	Add a largish auto-generated test for the aligned bundling feature, along with the script generating it. The test should never be modified manually. If anyone needs to change it, please change the script and re-run it. The script is placed into utils/testgen - I couldn't think of a better place, and after some discussion on IRC this looked like a logical location. llvm-svn: 170720	2012-12-20 19:16:57 +00:00
Eli Bendersky	4c4f11eb0d	Tests for the aligned bundling support added in r170718 llvm-svn: 170719	2012-12-20 19:07:30 +00:00
Rafael Espindola	642c7cd56e	Simplify the testcase a bit. I checked that it would still crash llc before the corresponding fix. llvm-svn: 170709	2012-12-20 17:47:27 +00:00
James Molloy	4f6fb953a7	Add a new attribute, 'noduplicate'. If a function contains a noduplicate call, the call cannot be duplicated - Jump threading, loop unrolling, loop unswitching, and loop rotation are inhibited if they would duplicate the call. Similarly inlining of the function is inhibited, if that would duplicate the call (in particular inlining is still allowed when there is only one callsite and the function has internal linkage). llvm-svn: 170704	2012-12-20 16:04:27 +00:00
Renato Golin	6b2ea4a48f	Adding support for llvm.arm.neon.vaddl[su].* and llvm.arm.neon.vsub[su].* intrinsics. Patch by Pete Couperus <pjcoup@gmail.com> llvm-svn: 170694	2012-12-20 13:52:11 +00:00
Reed Kotler	d019dbf75e	fix most of remaining issues with large frames. these patches are tested a lot by test-suite but make check tests are forthcoming once the next few patches that complete this are committed. with the next few patches the pass rate for mips16 is near 100% llvm-svn: 170656	2012-12-20 04:07:42 +00:00
Akira Hatanaka	f423672117	[mips] Use "or $r0, $r1, $zero" instead of "addu $r0, $zero, $r1" to copy physical register $r1 to $r0. GNU disassembler recognizes an "or" instruction as a "move", and this change makes the disassembled code easier to read. Original patch by Reed Kotler. llvm-svn: 170655	2012-12-20 04:06:06 +00:00
Bob Wilson	3365b80290	Do not introduce vector operations in functions marked with noimplicitfloat. <rdar://problem/12879313> llvm-svn: 170630	2012-12-20 01:36:20 +00:00
Eric Christopher	3c5a1914b6	Split out abbreviations for the skeleton info from the rest of the abbreviations. Part of implementing split dwarf. llvm-svn: 170589	2012-12-19 22:02:53 +00:00
Evan Cheng	eae6d2ccea	LLVM sdisel normalizes bit extraction of the form: ((x & 0xff00) >> 8) << 2 to (x >> 6) & 0x3fc This is general goodness since it folds a left shift into the mask. However, the trailing zeros in the mask prevent the ARM backend from using the bit extraction instructions. Worse, the mask materialization may require an additional instruction. This comes up fairly frequently when the result of the bit twiddling is used as a memory address. e.g. = ptr[(x & 0xFF0000) >> 16] We want to generate: ubfx r3, r1, #16, #8 ldr.w r3, [r0, r3, lsl #2] vs. mov.w r9, #1020 and.w r2, r9, r1, lsr #14 ldr r2, [r0, r2] Add a late ARM specific isel optimization to ARMDAGToDAGISel::PreprocessISelDAG(). It folds the left shift into the 'base + offset' address computation, changes the mask to one which doesn't have trailing zeros, and enables the use of ubfx. Note the optimization has to be done late since it's target specific and we don't want to change the DAG normalization. It's also fairly restrictive as shifter operands are not always free. It's only done for lsh 1 / 2. It's known to be free on some cpus and they are most common for address computation. This is a slight win for blowfish, rijndael, etc. rdar://12870177 llvm-svn: 170581	2012-12-19 20:16:09 +00:00
Roman Divacky	e3d323052f	Remove edis - the enhanced disassembler. Fixes PR14654. llvm-svn: 170578	2012-12-19 19:55:47 +00:00
Paul Redmond	5917f4c715	Transform (x&C)>V into (x&C)!=0 where possible When the least bit of C is greater than V, (x&C) must be greater than V if it is not zero, so the comparison can be simplified. Although this was suggested in Target/X86/README.txt, it benefits any architecture with a directly testable form of AND. Patch by Kevin Schoedel llvm-svn: 170576	2012-12-19 19:47:13 +00:00
Benjamin Kramer	c5071466d4	PowerPC: Expand VSELECT nodes. There's probably a better expansion for those nodes than the default for altivec, but this is better than crashing. VSELECTs occur in loop vectorizer output. llvm-svn: 170551	2012-12-19 15:49:14 +00:00
Benjamin Kramer	ae0bb61053	Make TargetLowering::getTypeConversion more resilient against odd illegal MVTs. - An MVT can become an EVT when being split (e.g. v2i8 -> v1i8, the latter doesn't exist) - Return the scalar value when an MVT is scalarized (v1i64 -> i64) Fixes PR14639ff. llvm-svn: 170546	2012-12-19 14:34:28 +00:00
Evgeniy Stepanov	d7571cd4bc	[msan] Heuristically instrument unknown intrinsics. This change adds shadow and origin propagation for unknown intrinsics by examining the arguments and ModRef behaviour. For now, only 3 classes of intrinsics are handled: - those that look like simple SIMD store - those that look like simple SIMD load - those that don't have memory effects and look like arithmetic/logic/whatever operation on simple types. llvm-svn: 170530	2012-12-19 11:22:04 +00:00
Elena Demikhovsky	14a4af0e66	Optimized load + SIGN_EXTEND patterns in the X86 backend. llvm-svn: 170506	2012-12-19 07:50:20 +00:00
Nadav Rotem	33360d8ae9	After reducing the size of an operation in the DAG we zero-extend the reduced bitwidth op back to the original size. If we reduce ANDs then this can cause an endless loop. This patch changes the ZEXT to ANY_EXTEND if the demanded bits are equal or smaller than the size of the reduced operation. llvm-svn: 170505	2012-12-19 07:39:08 +00:00
Craig Topper	63f5921776	Teach SimplifySetCC that comparing AssertZext i1 against a constant 1 can be rewritten as a compare against a constant 0 with the opposite condition. llvm-svn: 170495	2012-12-19 06:12:28 +00:00
Shuxin Yang	37a1efe1c6	rdar://12801297 InstCombine for unsafe floating-point add/sub. llvm-svn: 170471	2012-12-18 23:10:12 +00:00
Jakub Staszak	338863a546	Reverse order of checking SSE level when calculating compare cost, so we check AVX2 before AVX. llvm-svn: 170464	2012-12-18 22:57:56 +00:00
Quentin Colombet	23b404d5ad	Disable ARM partial flag dependency optimization at -Oz To not over constrain the scheduler for ARM in thumb mode, some optimizations for code size reduction, specific to ARM thumb, are blocked when they add a dependency (like write after read dependency). Disables this check when code size is the priority, i.e., code is compiled with -Oz. llvm-svn: 170462	2012-12-18 22:47:16 +00:00
Andrew Trick	ec2564818c	MISched: add dependence to ExitSU to model live-out latency. llvm-svn: 170454	2012-12-18 20:53:01 +00:00
Benjamin Kramer	f0e5d2f032	LoopVectorize: Emit reductions as log2(vectorsize) shuffles + vector ops instead of scalar operations. For example on x86 with SSE4.2 a <8 x i8> add reduction becomes movdqa %xmm0, %xmm1 movhlps %xmm1, %xmm1 ## xmm1 = xmm1[1,1] paddw %xmm0, %xmm1 pshufd $1, %xmm1, %xmm0 ## xmm0 = xmm1[1,0,0,0] paddw %xmm1, %xmm0 phaddw %xmm0, %xmm0 pextrb $0, %xmm0, %edx instead of pextrb $2, %xmm0, %esi pextrb $0, %xmm0, %edx addb %sil, %dl pextrb $4, %xmm0, %esi addb %dl, %sil pextrb $6, %xmm0, %edx addb %sil, %dl pextrb $8, %xmm0, %esi addb %dl, %sil pextrb $10, %xmm0, %edi pextrb $14, %xmm0, %edx addb %sil, %dil pextrb $12, %xmm0, %esi addb %dil, %sil addb %sil, %dl llvm-svn: 170439	2012-12-18 18:40:20 +00:00
Hal Finkel	943f76d1b3	Check multiple register classes for inline asm tied registers A register can be associated with several distinct register classes. For example, on PPC, the floating point registers are each associated with both F4RC (which holds f32) and F8RC (which holds f64). As a result, this code would fail when provided with a floating point register and an f64 operand because it would happen to find the register in the F4RC class first and return that. From the F4RC class, SDAG would extract f32 as the register type and then assert because of the invalid implied conversion between the f64 value and the f32 register. Instead, search all register classes. If a register class containing the requested register has the requested type, then return that register class. Otherwise, as before, return the first register class found that contains the requested register. llvm-svn: 170436	2012-12-18 17:50:58 +00:00
Nadav Rotem	cb23342876	Rename the test so that we can add additional vectors-of-pointers tests into the same file in the future. llvm-svn: 170414	2012-12-18 05:50:54 +00:00
Nadav Rotem	a5024fc3e1	SROA: Replace calls to getScalarSizeInBits to DataLayout's API because getScalarSizeInBits could not handle vectors of pointers. llvm-svn: 170412	2012-12-18 05:23:31 +00:00
NAKAMURA Takumi	ad0c80b8e6	llvm/test/MC/ELF/comp-dir.s: Appease MSYS Bash. llvm-svn: 170410	2012-12-18 05:08:12 +00:00
Eric Christopher	906da23229	Add support for passing -main-file-name all the way through to the assembler. Part of PR14624 llvm-svn: 170390	2012-12-18 00:31:01 +00:00
Chandler Carruth	d75be9b4fb	Add a triple to this test -- it has to be an ELF platform... llvm-svn: 170374	2012-12-17 21:44:50 +00:00
Chandler Carruth	10700aad85	Prepare LLVM to fix PR14625, exposing a hook in MCContext to manage the compilation directory. This defaults to the current working directory, just as it always has, but now an assembler can choose to override it with a custom directory. I've taught llvm-mc about this option and added a test case. llvm-svn: 170371	2012-12-17 21:32:42 +00:00
Chandler Carruth	e3f4119b06	Fix another SROA crasher, PR14601. This was a silly oversight, we weren't pruning allocas which were used by variable-length memory intrinsics from the set that could be widened and promoted as integers. Fix that. llvm-svn: 170353	2012-12-17 18:48:07 +00:00
Tim Northover	5edabc131a	Teach MachO which sections contain code llvm-svn: 170349	2012-12-17 17:59:32 +00:00
Richard Osborne	459e35c261	Add instruction encodings / disassembly support for l2r instructions. llvm-svn: 170345	2012-12-17 16:28:02 +00:00
Chandler Carruth	21eb4e96c2	Teach the rewriting of memcpy calls to support subvector copies. This also cleans up a bit of the memcpy call rewriting by sinking some irrelevant code further down and making the call-emitting code a bit more concrete. Previously, memcpy of a subvector would actually miscompile (!!!) the copy into a single vector element copy. I have no idea how this ever worked. =/ This is the memcpy half of PR14478 which we probably weren't noticing previously because it didn't actually assert. The rewrite relies on the newly refactored insert- and extractVector functions to do the heavy lifting, and those are the same as used for loads and stores which makes the test coverage a bit more meaningful here. llvm-svn: 170338	2012-12-17 14:51:24 +00:00
Richard Osborne	51bf1b269a	Add instruction encodings for PEEK and ENDIN. Previously these were marked with the wrong format. llvm-svn: 170334	2012-12-17 14:23:54 +00:00
Chandler Carruth	cacda256a1	Fix a secondary bug I introduced while fixing the first part of PR14478. The first half of fixing this bug was actually in r170328, but was entirely coincidental. It did however get me to realize the nature of the bug, and adapt the test case to test more interesting behavior. In turn, that uncovered the rest of the bug which I've fixed here. This should fix two new asserts that showed up in the vectorize nightly tester. llvm-svn: 170333	2012-12-17 14:03:01 +00:00
Richard Osborne	041071c558	Add instruction encodings / disassembly support for rus instructions. llvm-svn: 170330	2012-12-17 13:50:04 +00:00
Richard Osborne	e405e58639	Add instruction encodings for ZEXT and SEXT. Previously these were marked with the wrong format. llvm-svn: 170327	2012-12-17 13:20:37 +00:00
Richard Osborne	3a0d5cc314	Add instruction encodings / disassembly support for 2r instructions. llvm-svn: 170323	2012-12-17 12:29:31 +00:00
Richard Osborne	016967e4ff	Add instruction encodings / disassembly support for 0r instructions. llvm-svn: 170322	2012-12-17 12:26:29 +00:00
Craig Topper	f924a58af1	Add rest of BMI/BMI2 instructions to the folding tables as well as popcnt and lzcnt. llvm-svn: 170304	2012-12-17 05:02:29 +00:00
Chandler Carruth	ccca504f3a	Fix the first part of PR14478: memset now works. PR14478 highlights a serious problem in SROA that simply wasn't being exercised due to a lack of vector input code mixed with C-library function calls. Part of SROA was written carefully to handle subvector accesses via memset and memcpy, but the rewriter never grew support for this. Fixing it required refactoring the subvector access code in other parts of SROA so it could be shared, and then fixing the splat formation logic and using subvector insertion (this patch). The PR isn't quite fixed yet, as memcpy is still broken in the same way. I'm starting on that series of patches now. Hopefully this will be enough to bring the bullet benchmark back to life with the bb-vectorizer enabled, but that may require fixing memcpy as well. llvm-svn: 170301	2012-12-17 04:07:37 +00:00
Richard Osborne	c5287b8889	Add tests for disassembly of 1r XCore instructions. llvm-svn: 170295	2012-12-16 18:06:30 +00:00
Reed Kotler	aee4d5d194	This patch is needed to make c++ exceptions work for mips16. Mips16 is really a processor decoding mode (ala thumb 1) and in the same program, mips16 and mips32 functions can exist and can call each other. If a jal type instruction encounters an address with the lower bit set, then the processor switches to mips16 mode (if it is not already in it). If the lower bit is not set, then it switches to mips32 mode. The linker knows which functions are mips16 and which are mips32. When relocation is performed on code labels, this lower order bit is set if the code label is a mips16 code label. In general this works just fine, however when creating exception handling tables and dwarf, there are cases where you don't want this lower order bit added in. This has been traditionally distinguished in gas assembly source by using a different syntax for the label. lab1: ; this will cause the lower order bit to be added lab2=. ; this will not cause the lower order bit to be added In some cases, it does not matter because in dwarf and debug tables the difference of two labels is used and in that case the lower order bits subtract each other out. To fix this, I have added to mcstreamer the notion of a debuglabel. The default is for label and debug label to be the same. So calling EmitLabel and EmitDebugLabel produce the same result. For various reasons, there is only one set of labels that needs to be modified for the mips exceptions to work. These are the "$eh_func_beginXXX" labels. Mips overrides the debug label suffix from ":" to "=." . This initial patch fixes exceptions. More changes most likely will be needed to DwarfCFException to make all of this work for actual debugging. These changes will be to emit debug labels in some places where a simple label is emitted now. Some historical discussion on this from gcc can be found at: http://gcc.gnu.org/ml/gcc-patches/2008-08/msg00623.html http://gcc.gnu.org/ml/gcc-patches/2008-11/msg01273.html llvm-svn: 170279	2012-12-16 04:00:45 +00:00
Benjamin Kramer	b16ccde7a4	X86: Add a couple of target-specific dag combines that turn VSELECTS into psubus if possible. We match the pattern "x >= y ? x-y : 0" into "subus x, y" and two special cases if y is a constant. DAGCombiner canonicalizes those so we first have to undo the canonicalization for those cases. The pattern occurs in gzip when the loop vectorizer is enabled. Part of PR14613. llvm-svn: 170273	2012-12-15 16:47:44 +00:00
Chandler Carruth	c50394fcfa	Add a corollary test for PR14572. We got this code path correct already. llvm-svn: 170271	2012-12-15 09:31:54 +00:00
Chandler Carruth	067edd342f	Relax an overly aggressive assert to fix PR14572. The alloca width is based on the alloc size, not the type size. llvm-svn: 170270	2012-12-15 09:26:06 +00:00
Reed Kotler	5fdeb21249	This code implements most of mips16 hardfloat as it is done by gcc. In this case, essentially it is soft float with different library routines. The next step will be to make this fully interoperational with mips32 floating point and that requires creating stubs for functions with signatures that contain floating point types. I have a more sophisticated design for mips16 hardfloat which I hope to implement at a later time that directly does floating point without the need for function calls. The mips16 encoding has no floating point instructions so one needs to switch to mips32 mode to execute floating point instructions. llvm-svn: 170259	2012-12-15 00:20:05 +00:00
Kevin Enderby	06aa3eb8ce	Make sure the alternate PC+imm syntax of LDR instruction with a small immediate generates the narrow version. Needed when doing round-trip assemble/disassemble testing using the alternate syntax that specifies 'pc' directly. llvm-svn: 170255	2012-12-14 23:04:25 +00:00
Michael Ilseman	e2754dc887	Add back FoldOpIntoPhi optimizations with fix. Included test cases to help catch these errors and to test the presence of the optimization itself llvm-svn: 170248	2012-12-14 22:08:26 +00:00
Nadav Rotem	8487537bdb	TypeLegalizer: Do not generate target specific nodes with illegal types, because we can't type-legalize them. llvm-svn: 170245	2012-12-14 21:20:37 +00:00
Nadav Rotem	aa3e2a907e	Fix a crash in ValueTracking on vectors of pointers. llvm-svn: 170240	2012-12-14 20:43:49 +00:00
Bill Schmidt	a4f898448c	This patch removes some nondeterminism from direct object file output for TLS dynamic models on 64-bit PowerPC ELF. The default sort routine for relocations only sorts on the r_offset field; but with TLS, there can be two relocations with the same r_offset. For PowerPC, this patch sorts secondarily on descending r_type, which matches the behavior expected by the linker. llvm-svn: 170237	2012-12-14 20:28:38 +00:00
Shuxin Yang	f8e9a5a061	rdar://12753946 Implement rule : "x * (select cond 1.0, 0.0) -> select cond x, 0.0" llvm-svn: 170226	2012-12-14 18:46:06 +00:00
Bill Schmidt	9f0b4ec0f5	This patch improves the 64-bit PowerPC InitialExec TLS support by providing for a wider range of GOT entries that can hold thread-relative offsets. This matches the behavior of GCC, which was not documented in the PPC64 TLS ABI. The ABI will be updated with the new code sequence. Former sequence: ld 9,x@got@tprel(2) add 9,9,x@tls New sequence: addis 9,2,x@got@tprel@ha ld 9,x@got@tprel@l(9) add 9,9,x@tls Note that a linker optimization exists to transform the new sequence into the shorter sequence when appropriate, by replacing the addis with a nop and modifying the base register and relocation type of the ld. llvm-svn: 170209	2012-12-14 17:02:38 +00:00
Evgeniy Stepanov	49175b237d	[msan] Origin stores and loads do not need explicit alignment. Origin address is always 4 byte aligned, and the access type is always i32. llvm-svn: 170199	2012-12-14 13:43:11 +00:00
David Blaikie	37fefc3f8d	Debug Info: add support to mark member variables as artificial This is the LLVM portion of r170154. llvm-svn: 170156	2012-12-13 22:43:07 +00:00
NAKAMURA Takumi	38d2b2442f	Revert r170020, "Simplify negated bit test", for now. This assumes (1 << n) is always nonzero. Consider the case where n is greater than the word size. Although I know that is undefined, this transform hides the undefined behavior. This led to unexpected behavior in clang, with some failures. I will investigate fixing the undefined shl in clang. llvm-svn: 170128	2012-12-13 14:28:16 +00:00
Akira Hatanaka	cf9a61b6ee	[mips] Do not copy GOT address to register $gp if the function being called has internal linkage. llvm-svn: 170092	2012-12-13 03:17:29 +00:00
Eli Bendersky	f9be4c8b47	Make this Lit config file a bit slimmer llvm-svn: 170083	2012-12-13 02:03:46 +00:00
Evan Cheng	bf0baa9de7	Fix a bug in DAGCombiner::MatchBSwapHWord. Make sure the node has operands before referencing them. rdar://12868039 llvm-svn: 170078	2012-12-13 01:34:32 +00:00
Quentin Colombet	c0dba2035a	Take into account minimize size attribute in the inliner. Better controls the inlining of functions when the caller function has MinSize attribute. Basically, when the caller function has this attribute, we do not "force" the inlining of callee functions carrying the InlineHint attribute (i.e., functions defined with inline keyword) llvm-svn: 170065	2012-12-13 01:05:25 +00:00
Nadav Rotem	36510f7194	Teach the cost model about the optimization in r169904: Truncation of induction variables costs the same as scalar trunc. llvm-svn: 170051	2012-12-13 00:21:03 +00:00
Jakub Staszak	c6ecd7deba	Fix typo which prevented the test from being checked. llvm-svn: 170025	2012-12-12 21:10:56 +00:00
Jakub Staszak	a3619d31d8	unHECKify test fixed by Jacob in r159003. llvm-svn: 170023	2012-12-12 20:58:42 +00:00
David Majnemer	5226aa94ce	Simplify negated bit test llvm-svn: 170020	2012-12-12 20:48:54 +00:00
Evan Cheng	b7d3d03bf9	Fix a logic bug in inline expansion of memcpy / memset with an overlapping load / store pair. It's not legal to use a wider load than the size of the remaining bytes if it's the first pair of load / store. llvm-svn: 170018	2012-12-12 20:43:23 +00:00
Jakub Staszak	0a74fc8d6c	unHECKify test. It was fixed by Chris in 2009. llvm-svn: 170017	2012-12-12 20:43:00 +00:00
Bill Schmidt	25ffd19502	The ordering of two relocations on the same instruction is apparently not predictable when compiled on at least one non-PowerPC host. Source of nondeterminism not apparent. Restrict the test to build on PowerPC hosts for now while looking into the issue further. llvm-svn: 170016	2012-12-12 20:29:20 +00:00
Jakub Staszak	aee9cca331	Fix typo in test-case. llvm-svn: 170015	2012-12-12 20:29:06 +00:00
Jakub Staszak	67bf76ebbc	Fix typo. llvm-svn: 170006	2012-12-12 19:47:04 +00:00
Nadav Rotem	d0bb22bba3	LoopVectorizer: Use the "optsize" attribute to decide if we are allowed to increase the function size. llvm-svn: 170004	2012-12-12 19:29:45 +00:00
Bill Schmidt	24b8dd6eb7	This patch implements local-dynamic TLS model support for the 64-bit PowerPC target. This is the last of the four models, so we now have full TLS support. This is mostly a straightforward extension of the general dynamic model. I had to use an additional Chain operand to tie ADDIS_DTPREL_HA to the register copy following ADDI_TLSLD_L; otherwise everything above the ADDIS_DTPREL_HA appeared dead and was removed. As before, there are new test cases to test the assembly generation, and the relocations output during integrated assembly. The expected code gen sequence can be read in test/CodeGen/PowerPC/tls-ld.ll. There are a couple of things I think can be done more efficiently in the overall TLS code, so there will likely be a clean-up patch forthcoming; but for now I want to be sure the functionality is in place. Bill llvm-svn: 170003	2012-12-12 19:29:35 +00:00
Alexey Samsonov	3d43b63a6e	Improve debug info generated with enabled AddressSanitizer. When ASan replaces <alloca instruction> with <offset into a common large alloca>, it should also patch llvm.dbg.declare calls and replace debug info descriptors to mark that we've replaced alloca with a value that stores an address of the user variable, not the user variable itself. See PR11818 for more context. llvm-svn: 169984	2012-12-12 14:31:53 +00:00
NAKAMURA Takumi	be230b8fdb	llvm/test/CodeGen/X86/atom-bypass-slow-division.ll: Fix possible typo(s) in CHECK-NOT lines. Found by Alexander Zinenko, thanks! llvm-svn: 169978	2012-12-12 13:34:20 +00:00
NAKAMURA Takumi	cae5321a3b	llvm/test/CodeGen/X86/atom-bypass-slow-division.ll: Rename symbols, s/test_/Test/g, not to mismatch "CHECK(-NOT): test". llvm-svn: 169977	2012-12-12 13:34:14 +00:00
NAKAMURA Takumi	69d1405e48	llvm/test/CodeGen/X86/store_op_load_fold.ll: Fix typo, s/CHECK_NEXT/CHECK-NEXT/ llvm-svn: 169957	2012-12-12 01:41:01 +00:00
NAKAMURA Takumi	01ac65af00	llvm/test/CodeGen/X86/store_op_load_fold.ll: Add explicit triple. llvm-svn: 169956	2012-12-12 01:40:56 +00:00
Manman Ren	82751a105c	DAGCombine: clamp hi bit in APInt::getBitsSet to avoid assertion rdar://12838504 llvm-svn: 169951	2012-12-12 01:13:50 +00:00
Evan Cheng	04e5518783	Avoid using lossy load / stores for memcpy / memset expansion. e.g. f64 load / store on non-SSE2 x86 targets. llvm-svn: 169944	2012-12-12 00:42:09 +00:00
Shuxin Yang	81b3678564	- Fix a problematic way in creating all-the-1 APInt. - Propagate "exact" bit of [l|a]shr instruction. llvm-svn: 169942	2012-12-12 00:29:03 +00:00
Michael Ilseman	bb6f691b01	Added a slew of SimplifyInstruction floating-point optimizations, many of which take advantage of fast-math flags. Test cases included. fsub X, +0 ==> X fsub X, -0 ==> X, when we know X is not -0 fsub +/-0.0, (fsub -0.0, X) ==> X fsub nsz +/-0.0, (fsub +/-0.0, X) ==> X fsub nnan ninf X, X ==> 0.0 fadd nsz X, 0 ==> X fadd [nnan ninf] X, (fsub [nnan ninf] 0, X) ==> 0 where nnan and ninf have to occur at least once somewhere in this expression fmul X, 1.0 ==> X llvm-svn: 169940	2012-12-12 00:27:46 +00:00
Nadav Rotem	f707bf4ca3	PR14574. Fix a bug in the code that calculates the mask of the converted PHIs in if-conversion. llvm-svn: 169916	2012-12-11 21:30:14 +00:00
Tom Stellard	75aadc2813	Add R600 backend A new backend supporting AMD GPUs: Radeon HD2XXX - HD7XXX llvm-svn: 169915	2012-12-11 21:25:42 +00:00
Bill Schmidt	c56f1d34bc	This patch implements the general dynamic TLS model for 64-bit PowerPC. Given a thread-local symbol x with global-dynamic access, the generated code to obtain x's address is: Instruction Relocation Symbol addis ra,r2,x@got@tlsgd@ha R_PPC64_GOT_TLSGD16_HA x addi r3,ra,x@got@tlsgd@l R_PPC64_GOT_TLSGD16_L x bl __tls_get_addr(x@tlsgd) R_PPC64_TLSGD x R_PPC64_REL24 __tls_get_addr nop <use address in r3> The implementation borrows from the medium code model work for introducing special forms of ADDIS and ADDI into the DAG representation. This is made slightly more complicated by having to introduce a call to the external function __tls_get_addr. Using the full call machinery is overkill and, more importantly, makes it difficult to add a special relocation. So I've introduced another opcode GET_TLS_ADDR to represent the function call, and surrounded it with register copies to set up the parameter and return value. Most of the code is pretty straightforward. I ran into one peculiarity when I introduced a new PPC opcode BL8_NOP_ELF_TLSGD, which is just like BL8_NOP_ELF except that it takes another parameter to represent the symbol ("x" above) that requires a relocation on the call. Something in the TblGen machinery causes BL8_NOP_ELF and BL8_NOP_ELF_TLSGD to be treated identically during the emit phase, so this second operand was never visited to generate relocations. This is the reason for the slightly messy workaround in PPCMCCodeEmitter.cpp:getDirectBrEncoding(). Two new tests are included to demonstrate correct external assembly and correct generation of relocations using the integrated assembler. Comments welcome! Thanks, Bill llvm-svn: 169910	2012-12-11 20:30:11 +00:00
Nadav Rotem	e266efb70b	Loop Vectorize: optimize the vectorization of trunc(induction_var). The truncation is now done on scalars. llvm-svn: 169904	2012-12-11 18:58:10 +00:00
NAKAMURA Takumi	e55382ea55	llvm/test/TableGen: Remove XFAIL:vg_leak in dozens of tests, according to llvm-x86_64-linux-vg_leak. llvm-svn: 169862	2012-12-11 13:14:16 +00:00
Hao Liu	14390f031c	revert the test change llvm-svn: 169823	2012-12-11 06:25:18 +00:00
Hao Liu	0eb50fb49d	A newbie try a test commit llvm-svn: 169821	2012-12-11 06:22:54 +00:00
Nadav Rotem	dbb3328194	Fix PR14565. Don't if-convert loops that have switch statements in them. llvm-svn: 169813	2012-12-11 04:55:10 +00:00
Chad Rosier	d4c0c6cb22	Add a triple to this test. llvm-svn: 169803	2012-12-11 00:51:36 +00:00
Chandler Carruth	b27041c50b	Fix a miscompile in the DAG combiner. Previously, we would incorrectly try to reduce the width of this load, and would end up transforming: (truncate (lshr (sextload i48 <ptr> as i64), 32) to i32) to (truncate (zextload i32 <ptr+4> as i64) to i32) We lost the sext attached to the load while building the narrower i32 load, and replaced it with a zext because lshr always zext's the results. Instead, bail out of this combine when there is a conflict between a sextload and a zext narrowing. The rest of the DAG combiner still optimizes the code down to the proper single instruction: movswl 6(...),%eax Which is exactly what we wanted. Previously we read past the end and missed the sign extension: movl 6(...), %eax llvm-svn: 169802	2012-12-11 00:36:57 +00:00
Paul Redmond	c4550d4967	move X86-specific test This test case uses -mcpu=corei7 so it belongs in CodeGen/X86 Reviewed by: Nadav llvm-svn: 169801	2012-12-11 00:36:43 +00:00
Chad Rosier	df42cf39ab	Fall back to the selection dag isel to select tail calls. This shouldn't affect codegen for -O0 compiles as tail call markers are not emitted in unoptimized compiles. Testing with the external/internal nightly test suite reveals no change in compile time performance. Testing with -O1, -O2 and -O3 with fast-isel enabled did not cause any compile-time or execution-time failures. All tests were performed on my x86 machine. I'll monitor our arm testers to ensure no regressions occur there. In an upcoming clang patch I will be marking the objc_autoreleaseReturnValue and objc_retainAutoreleaseReturnValue as tail calls unconditionally. While it's theoretically true that this is just an optimization, it's an optimization that we very much want to happen even at -O0, or else ARC applications become substantially harder to debug. Part of rdar://12553082 llvm-svn: 169796	2012-12-11 00:18:02 +00:00
Eric Christopher	c8a310edc1	Refactor out the abbreviation handling into a separate class that controls each of the abbreviation sets (only a single one at the moment) and computes offsets separately as well for each set of DIEs. No real function change, ordering of abbreviations for the skeleton CU changed but only because we're computing in a separate order. Fix the testcase not to care. llvm-svn: 169793	2012-12-10 23:34:43 +00:00
Evan Cheng	79e2ca90bc	Some enhancements for memcpy / memset inline expansion. 1. Teach it to use overlapping unaligned load / store to copy / set the trailing bytes. e.g. On x86, use two pairs of movups / movaps for 17 - 31 byte copies. 2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g. x86 and ARM. 3. When memcpy from a constant string, do not replace the load with a constant if it's not possible to materialize an integer immediate with a single instruction (required a new target hook: TLI.isIntImmLegal()). 4. Use unaligned load / stores more aggressively if target hooks indicate they are "fast". 5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8. Also increase the threshold to something reasonable (8 for memset, 4 pairs for memcpy). This significantly improves Dhrystone, up to 50% on ARM iOS devices. rdar://12760078 llvm-svn: 169791	2012-12-10 23:21:26 +00:00
Arnold Schwaighofer	edd62b14e5	Optimistically analyse Phi cycles Analyse Phis under the starting assumption that they are NoAlias. Recursively look at their inputs. If they MayAlias/MustAlias there must be an input that makes them so. Addresses bug 14351. llvm-svn: 169788	2012-12-10 23:02:41 +00:00
Eli Bendersky	7352e4ffcb	Add a test for explicitly exercising the mc-relax-all flag. llvm-svn: 169764	2012-12-10 20:36:01 +00:00
Eric Christopher	cdf218d606	Use the somewhat semantic term "split dwarf" it more matches what's going on and makes a lot of the terminology in comments make more sense. llvm-svn: 169758	2012-12-10 19:51:21 +00:00
Nadav Rotem	7b5b55c195	Add support for reverse induction variables. For example: while (i--) sum+=A[i]; llvm-svn: 169752	2012-12-10 19:25:06 +00:00
Hal Finkel	66859ae0f6	Use GetUnderlyingObjects in misched misched used GetUnderlyingObject in order to break false load/store dependencies, and the -enable-aa-sched-mi feature similarly relied on GetUnderlyingObject in order to ensure it is safe to use the aliasing analysis. Unfortunately, GetUnderlyingObject does not recurse through phi nodes, and so (especially due to LSR) all of these mechanisms failed for induction-variable-dependent loads and stores inside loops. This change replaces uses of GetUnderlyingObject with GetUnderlyingObjects (which will recurse through phi and select instructions) in misched. Andy reviewed, tested and simplified this patch; Thanks! llvm-svn: 169744	2012-12-10 18:49:16 +00:00
Craig Topper	d8005db486	Teach DAG combine to handle vector add/sub with vectors of all 0s. llvm-svn: 169727	2012-12-10 08:12:29 +00:00
Chandler Carruth	e45f4658a3	Fix PR14548: SROA was crashing on a mixture of i1 and i8 loads and stores. When SROA was evaluating a mixture of i1 and i8 loads and stores, in just a particular case, it would tickle a latent bug where we compared bits to bytes rather than bits to bits. As a consequence of the latent bug, we would allow integers through which were not byte-size multiples, a situation the later rewriting code was never intended to handle. In release builds this could trigger all manner of oddities, but the reported issue in PR14548 was forming invalid bitcast instructions. The only downside of this fix is that it makes it more clear that SROA in its current form is not capable of handling mixed i1 and i8 loads and stores. Sometimes with the previous code this would work by luck, but usually it would crash, so I'm not terribly worried. I'll watch the LNT numbers just to be sure. llvm-svn: 169719	2012-12-10 00:54:45 +00:00
Paul Redmond	2adb13c100	LoopVectorize: support vectorizing intrinsic calls - added function to VectorTargetTransformInfo to query cost of intrinsics - vectorize trivially vectorizable intrinsic calls such as sin, cos, log, etc. Reviewed by: Nadav llvm-svn: 169711	2012-12-09 20:42:17 +00:00
Benjamin Kramer	054ab17c17	Drop the address space limit for tests in the makefile build. The limit seems to break newer pythons (see PR13598) so just drop it for now. Eventually lit should learn to set limits for its children instead of a global limit in the makefile. If some PPC bots fail after this change: That's a good thing, they actually run clang tests now. llvm-svn: 169695	2012-12-09 10:34:22 +00:00
Shuxin Yang	95de7c37e2	- Re-enable population count loop idiom recognition - fix a bug which caused a segfault. - add two test cases which were causing crashes llvm-svn: 169687	2012-12-09 03:12:46 +00:00
Craig Topper	a183ddb0fe	Teach DAG combine to handle vector logical operations with vectors of all 1s or all 0s. These cases can show up when vectors are split for legalizing. Fix some tests that were dependent on these cases not being combined. llvm-svn: 169684	2012-12-08 22:49:19 +00:00
Chandler Carruth	91e47532fe	Revert the patches adding a popcount loop idiom recognition pass. There are still bugs in this pass, as well as other issues that are being worked on, but the bugs are crashers that occur pretty easily in the wild. Test cases have been sent to the original commit's review thread. This reverts the commits: r169671: Fix a logic error. r169604: Move the popcnt tests to an X86 subdirectory. r168931: Initial commit adding the pass. llvm-svn: 169683	2012-12-08 22:18:29 +00:00
Nadav Rotem	ad0b5fbe8c	When we use the BLEND instruction that uses the MSB as a mask, we can remove the VSRI instruction before it since it does not affect the MSB. Thanks Craig Topper for suggesting this. llvm-svn: 169638	2012-12-07 21:43:11 +00:00
Matthew Curtis	7a93811e8b	In hexagon convertToHardwareLoop, don't deref end() iterator In particular, check if MachineBasicBlock::iterator is end() before using it to call getDebugLoc(); See also this thread on llvm-commits: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121112/155914.html llvm-svn: 169634	2012-12-07 21:03:15 +00:00
Nadav Rotem	481e50efe0	X86: Prefer using VPSHUFD over VPERMIL because it has better throughput. llvm-svn: 169624	2012-12-07 19:01:13 +00:00
Eli Bendersky	2ccd044c22	Add separate statistics for Data and Inst fragments emitted during relaxation. Also fixes a test that was overly-sensitive to the exact order of statistics emitted. llvm-svn: 169619	2012-12-07 17:59:21 +00:00
Tim Northover	5cc3dc86bb	Added Mapping Symbols for ARM ELF Before this patch, when you objdump an LLVM-compiled file, objdump tried to decode data-in-code sections as if they were code. This patch adds the missing Mapping Symbols, as defined by "ELF for the ARM Architecture" (ARM IHI 0044D). Patch based on work by Greg Fitzgerald. llvm-svn: 169609	2012-12-07 16:50:23 +00:00
David Tweed	cfeb8fc49a	The test unconditionally assumes a particular cpu has a backend build in the target. Buildbots for some hosts may choose to build only their own backend in order to maximise testing-turnaround time. Move the test into a prefixed directory so lit's standard "backend specific" suppression can be done. llvm-svn: 169604	2012-12-07 15:57:45 +00:00
Chandler Carruth	80d3e56c73	Add support to ValueTracking for determining that a pointer is non-null by virtue of inbounds GEPs that preclude a null pointer. This is a very common pattern in the code generated by std::vector and other standard library routines which use allocators that test for null pervasively. This is one step closer to teaching Clang+LLVM to be able to produce an empty function for: void f() { std::vector<int> v; v.push_back(1); v.push_back(2); v.push_back(3); v.push_back(4); } Which is related to getting them to completely fold SmallVector push_back sequences into constants when inlining and other optimizations make that a possibility. llvm-svn: 169573	2012-12-07 02:08:58 +00:00
Dmitri Gribenko	1c704355cf	Fix typos in CHECK lines. Patch by Alexander Zinenko. llvm-svn: 169547	2012-12-06 21:24:47 +00:00
Nadav Rotem	ac450eb59e	Fix a bug in the code that merges consecutive stores. Previously we did not check if loads that happen in between stores alias with the first store in the chain, only with the second store onwards. llvm-svn: 169516	2012-12-06 17:34:13 +00:00
Evgeniy Stepanov	4f220d96c5	[msan] Do not store origin for clean values. Instead of unconditionally storing origin with every application store, only do this when the shadow of the stored value is != 0. This change also delays instrumentation of stores until after the walk over function's instructions, because adding new basic blocks confuses InstVisitor. We only keep 1 origin value per 4 bytes of application memory. This change fixes the bug when a store of a single clean byte wiped the origin for the whole 4-byte area. Since stores of uninitialized values are relatively uncommon, this change improves performance of track-origins mode by 5% median and by up to 47% on specs. llvm-svn: 169490	2012-12-06 11:41:03 +00:00
Bill Wendling	28fe9e7a36	Handle non-default array bounds. Some languages, e.g. Ada and Pascal, allow you to specify that the array bounds are different from the default (1 in these cases). If we have a lower bound that's non-default, then we emit the lower bound. We also calculate the correct upper bound in those cases. llvm-svn: 169484	2012-12-06 07:38:10 +00:00
Craig Topper	216bcd522b	Remove intrinsic specific instructions for (V)MOVQUmr with patterns pointing to the normal instructions. llvm-svn: 169482	2012-12-06 07:31:16 +00:00
Evan Cheng	16846051db	Properly fix the test. llvm-svn: 169464	2012-12-06 02:29:29 +00:00
NAKAMURA Takumi	1eccd286fd	llvm/test/CodeGen/ARM/extload-knownzero.ll: Try to unbreak, to add -O0. I guess Chad expects fastisel here. llvm-svn: 169463	2012-12-06 02:22:58 +00:00
Chad Rosier	9f5c68af4c	[arm fast-isel] Make the fast-isel implementation of memcpy respect alignment. rdar://12821569 llvm-svn: 169460	2012-12-06 01:34:31 +00:00
Evan Cheng	5213139f48	Let targets provide hooks that compute known zero and ones for any_extend and extload's. If they are implemented as zero-extend, or implicitly zero-extend, then this can enable more demanded bits optimizations. e.g. define void @foo(i16* %ptr, i32 %a) nounwind { entry: %tmp1 = icmp ult i32 %a, 100 br i1 %tmp1, label %bb1, label %bb2 bb1: %tmp2 = load i16* %ptr, align 2 br label %bb2 bb2: %tmp3 = phi i16 [ 0, %entry ], [ %tmp2, %bb1 ] %cmp = icmp ult i16 %tmp3, 24 br i1 %cmp, label %bb3, label %exit bb3: call void @bar() nounwind br label %exit exit: ret void } This compiles to the following before: push {lr} mov r2, #0 cmp r1, #99 bhi LBB0_2 @ BB#1: @ %bb1 ldrh r2, [r0] LBB0_2: @ %bb2 uxth r0, r2 cmp r0, #23 bhi LBB0_4 @ BB#3: @ %bb3 bl _bar LBB0_4: @ %exit pop {lr} bx lr The uxth is not needed since ldrh implicitly zero-extends the high bits. With this change it's eliminated. rdar://12771555 llvm-svn: 169459	2012-12-06 01:28:01 +00:00
Richard Smith	0ab6c5d9d8	PR10867: Analogue of r169441 for when using external 'sh'. And actually run the test! llvm-svn: 169446	2012-12-05 23:15:33 +00:00
Richard Smith	69c87b0914	PR10867. lit would interpret RUN: a RUN: b || true as "a && (b || true)" in Tcl mode, and as "(a && b) || true" in sh mode. Everyone seems to (quite reasonably) write tests assuming the Tcl behavior, so use that in sh mode too. llvm-svn: 169441	2012-12-05 22:54:26 +00:00
Andrew Trick	fda7a8832d	RegisterPressureTracker: fix findUseBetween to handle DebugValue llvm-svn: 169427	2012-12-05 21:37:50 +00:00
Andrew Trick	7f7cee39ab	RegisterPressureTracker: Track live physical register by unit. This is much simpler to reason about, more efficient, and fixes some corner cases involving implicit super-register defs. Fixed rdar://12797931. llvm-svn: 169425	2012-12-05 21:37:42 +00:00
Nadav Rotem	0a471ea66c	Cost Model: change the default cost of control flow instructions (br / ret / ...) to zero. llvm-svn: 169423	2012-12-05 21:21:26 +00:00
David Sehr	05176cad21	Correct ARM NOP encoding The encoding of NOP in ARMAsmBackend.cpp is missing a trailing zero, which causes the emission of a coprocessor instruction rather than "mov r0, r0" as indicated in the comment. The test also checks for the wrong encoding. http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121203/157919.html llvm-svn: 169420	2012-12-05 21:01:27 +00:00
Justin Holewinski	fb711156ae	[NVPTX] Fix crash with unnamed struct arguments Patch by Eric Holk llvm-svn: 169418	2012-12-05 20:50:28 +00:00
Michael J. Spencer	0c6ec48d0b	Add dump of Win64 EH unwind data. The new command line option -unwind-info dumps the Win64 EH unwind data to the console. This is a nice feature if you need to debug generated EH data (e.g. from LLVM). Includes a test case. Initial patch by João Matos, extensions and rework by Kai Nacke. llvm-svn: 169415	2012-12-05 20:12:35 +00:00
David Sehr	1fa8d37efc	Test commit. llvm-svn: 169410	2012-12-05 19:47:56 +00:00
Jyotsna Verma	90295156d8	Use multiclass to define store instructions with base+immediate offset addressing mode and immediate stored value. llvm-svn: 169408	2012-12-05 19:32:03 +00:00
Kevin Enderby	168ffb36a5	Added an option to the disassembler to print immediates as hex. This is for the lldb team so most of but not all of the values are to be printed as hex with this option. Some small values like the scale in an X86 address were requested to be printed in decimal without the leading 0x. There may be some tweaks needed in places that may still be in decimal that they want in hex. Especially for ARM. I made my best guess. Any tweaks from here should be simple. I also did the best I know now with help from the C++ gurus creating the cleanest formatImm() utility function and containing the changes. But if someone has a better idea to make something cleaner I'm all ears and game for changing the implementation. rdar://8109283 llvm-svn: 169393	2012-12-05 18:13:19 +00:00
Evgeniy Stepanov	8b51bab495	[msan] Instrument bswap intrinsic. llvm-svn: 169383	2012-12-05 14:39:55 +00:00
Evgeniy Stepanov	474cb3b3b5	[msan] Change linkage type of __msan_track_origins. LinkOnceODRLinkage globals may be removed in GlobalOpt if not used in the current module. llvm-svn: 169377	2012-12-05 12:49:41 +00:00
Elena Demikhovsky	cd3c1c4a16	Simplified BLEND pattern matching for shuffles. Generate VPBLENDD for AVX2 and VPBLENDW for v16i16 type on AVX2. llvm-svn: 169366	2012-12-05 09:24:57 +00:00
Shuxin Yang	ada92f5018	fix a typo llvm-svn: 169345	2012-12-05 00:33:16 +00:00
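
Note on b27041c50b (llvm-svn: 169802) above: the miscompile described in that message can be reproduced with plain integer arithmetic. The following is a minimal Python sketch, not LLVM code. It assumes a little-endian target, and the helper names (`sext`, `load_le`, `original_dag`, `bad_narrowing`, `narrow_sext_load`) and the sample memory contents are hypothetical, chosen only for illustration.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def sext(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def load_le(mem, offset, nbytes):
    """Little-endian unsigned load of `nbytes` bytes starting at `offset`."""
    return int.from_bytes(mem[offset:offset + nbytes], "little")


def original_dag(mem, ptr):
    # (truncate (lshr (sextload i48 <ptr> as i64), 32) to i32)
    loaded = sext(load_le(mem, ptr, 6), 48) & MASK64
    return (loaded >> 32) & MASK32


def bad_narrowing(mem, ptr):
    # (truncate (zextload i32 <ptr+4> as i64) to i32)
    # Reads two bytes past the end of the i48 and drops the sign extension.
    return load_le(mem, ptr + 4, 4) & MASK32


def narrow_sext_load(mem, ptr):
    # A 16-bit sign-extending load of the top two bytes of the i48,
    # analogous in spirit to the movswl mentioned in the message.
    return sext(load_le(mem, ptr + 4, 2), 16) & MASK32


# Bytes 0..5 hold a negative i48; bytes 6..7 are unrelated neighbouring data.
MEM = bytes([0x11, 0x22, 0x33, 0x44, 0x55, 0x80, 0xAA, 0xBB])

if __name__ == "__main__":
    print(hex(original_dag(MEM, 0)))
    print(hex(bad_narrowing(MEM, 0)))
    print(hex(narrow_sext_load(MEM, 0)))
```

With this sample memory the narrowed zero-extending load disagrees with the original expression in its upper 16 bits, while the 16-bit sign-extending load agrees with it.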

18675 Commits