Commit Graph

14248 Commits

Author SHA1 Message Date
Nadav Rotem 7c277da364 Add a new optimization pass: Stack Coloring, which merges disjoint static allocations (allocas). Allocas are known to be
disjoint if they are marked by disjoint lifetime markers (@llvm.lifetime.XXX intrinsics).

llvm-svn: 163299
2012-09-06 09:17:37 +00:00
Chad Rosier f24ae7b084 [ms-inline asm] Use the asm dialect from the MI to set the parser dialect.
llvm-svn: 163273
2012-09-05 23:57:37 +00:00
Chad Rosier e53314f7e3 Cleanup a few magic numbers.
llvm-svn: 163263
2012-09-05 22:40:13 +00:00
Roman Divacky ad06cee239 Stop casting away const qualifier needlessly.
llvm-svn: 163258
2012-09-05 22:26:57 +00:00
Chad Rosier cbd2a1983f [ms-inline asm] We only need one bit to represent the AsmDialect in the
MachineInstr.

llvm-svn: 163257
2012-09-05 22:17:43 +00:00
Roman Divacky 9338344acb Constify this properly. Found by gcc48 -Wcast-qual.
llvm-svn: 163256
2012-09-05 22:15:49 +00:00
Roman Divacky 665260222f Constify SDNodeIterator and stop its only non-const user from being cast-stripped
of its constness. Found by gcc48 -Wcast-qual.

llvm-svn: 163254
2012-09-05 22:03:34 +00:00
Chad Rosier 994f4040f5 [ms-inline asm] Propagate the asm dialect into the MachineInstr representation.
llvm-svn: 163243
2012-09-05 21:00:58 +00:00
Roman Divacky 09c8a3dde5 Remove unused typedefs gcc4.8 warns about.
llvm-svn: 163225
2012-09-05 17:55:46 +00:00
Silviu Baranga 3f40d87207 Fixed the DAG combiner to better handle the folding of AND nodes for vector types. The previous code was making the assumption that the length of the bitmask returned by isConstantSplat was equal to the size of the vector type. Now we first make sure that the splat value has at least the length of the vector lane type, then we only use as many fields as we have available in the splat value.
llvm-svn: 163203
2012-09-05 08:57:21 +00:00
Logan Chien 1b170de77a Reorder the comments of EmitExceptionTable.
llvm-svn: 163194
2012-09-05 06:28:26 +00:00
Craig Topper 2db2353b21 Convert vextracti128/vextractf128 intrinsics to extract_subvector at DAG build time. Similar was previously done for vinserti128/vinsertf128. Add patterns for folding these extract_subvectors with stores.
llvm-svn: 163192
2012-09-05 05:48:09 +00:00
Jakob Stoklund Olesen ade363e86c Search the whole instruction for tied operands.
Implicit uses can be dynamically tied to defs. This will soon be used
for predicated instructions on ARM.

llvm-svn: 163177
2012-09-04 22:59:30 +00:00
Jakob Stoklund Olesen d92e2bc2e9 Typo.
llvm-svn: 163154
2012-09-04 18:44:43 +00:00
Jakob Stoklund Olesen 9fceda741d Actually use the MachineOperand field for isRegTiedToDefOperand().
The MachineOperand::TiedTo field was maintained, but not used.

This patch enables it in isRegTiedToDefOperand() and
isRegTiedToUseOperand(), which are the actual functions used by the
register allocator.

llvm-svn: 163153
2012-09-04 18:43:25 +00:00
Jakob Stoklund Olesen c7579cdded Move tie checks into MachineVerifier::visitMachineOperand.
llvm-svn: 163152
2012-09-04 18:38:28 +00:00
Jakob Stoklund Olesen 0a09da83b6 Allow tied uses and defs in different orders.
After much agonizing, use a full 4 bits of precious MachineOperand space
to encode this. This uses existing padding, and doesn't grow
MachineOperand beyond its current 32 bytes.

This allows tied defs among the first 15 operands on a normal
instruction, just like the current MCInstrDesc constraint encoding.
Inline assembly needs to be able to tie more than the first 15 operands,
and gets special treatment.

Tied uses can appear beyond 15 operands, as long as they are tied to a
def that's in range.

llvm-svn: 163151
2012-09-04 18:36:28 +00:00
Preston Gurd cdf540d5d6 Generic Bypass Slow Div
- CodeGenPrepare pass for identifying div/rem ops
- Backend specifies the type mapping using addBypassSlowDivType
- Enabled only for Intel Atom with O2 32-bit -> 8-bit
- Replace IDIV with instructions which test its value and use DIVB if the value
is positive and less than 256.
- In the case when both the quotient and remainder of a divide are used, a DIV
and a REM instruction will be present in the IR. In the non-Atom case
they are both lowered to IDIVs and CSE removes the redundant IDIV instruction,
using the quotient and remainder from the first IDIV. However,
due to this optimization, CSE is not able to eliminate redundant
IDIV instructions because they are located in different basic blocks.
This is overcome by calculating both the quotient (DIV) and remainder (REM)
in each basic block that is inserted by the optimization and reusing the result
values when a subsequent DIV or REM instruction uses the same operands.
- Test cases check for the presence of the optimization when calculating
either the quotient, remainder, or both.

Patch by Tyler Nowicki!

llvm-svn: 163150
2012-09-04 18:22:17 +00:00
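
A minimal C++ sketch of the bypass idea described in the commit above (the function and names are illustrative assumptions, not the actual CodeGenPrepare output): both operands of a 32-bit divide are tested against 256, and when they fit in 8 bits the cheap divide produces quotient and remainder together, mirroring the DIVB path.

  // Illustrative sketch only; not LLVM code.
  #include <cstdint>

  // Returns A/B and stores A%B in *Rem, taking the 8-bit path when both
  // operands fit in 8 bits (the case the commit routes to DIVB on Atom).
  uint32_t bypassDiv(uint32_t A, uint32_t B, uint32_t *Rem) {
    if ((A | B) < 256) {                      // both values are less than 256
      uint8_t Q = uint8_t(A) / uint8_t(B);    // cheap 8-bit divide (DIVB)
      *Rem = uint8_t(A) % uint8_t(B);         // remainder from the same divide
      return Q;
    }
    *Rem = A % B;                             // full-width slow IDIV path
    return A / B;
  }

For example, bypassDiv(40, 7, &R) takes the 8-bit path, while bypassDiv(300, 7, &R) falls back to the full-width divide.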
Benjamin Kramer 8d9890ab69 IRBuilderify the SjLjEHPrepare pass.
No functionality change.

llvm-svn: 163115
2012-09-03 12:27:43 +00:00
Lang Hames 90152701eb When updating live range endpoints, make sure to preserve the early clobber bit.
Fixes PR13719.

llvm-svn: 163107
2012-09-03 06:31:45 +00:00
Nadav Rotem 10f6b8802b Fix a typo.
llvm-svn: 163094
2012-09-02 12:21:50 +00:00
Nadav Rotem 500d691d4a Generate better select code by allowing the target to use scalar select, and not sign-extend.
llvm-svn: 163086
2012-09-02 08:20:07 +00:00
Pete Cooper 2455e9c4a5 Only legalise a VSELECT into bitwise operations if the vector mask bool is all zeros or all ones. A vector bool whose true value is just 1 isn't suitable for masking with.
No test case, unfortunately, as I couldn't find a target which fit all
the conditions needed to hit this code.

llvm-svn: 163075
2012-09-01 22:27:48 +00:00
Pete Cooper 2117ac40c9 Revert "Take account of boolean vector contents when promoting a build vector from i1 to some other type. rdar://problem/12210060"
This reverts commit 5dd9e214fb92847e947f9edab170f9b4e52b908f.

Thanks to Duncan for explaining how this should have been done.

Conflicts:

	test/CodeGen/X86/vec_select.ll

llvm-svn: 163064
2012-09-01 17:37:55 +00:00
Logan Chien 64f361e0e1 Fix typo.
llvm-svn: 163059
2012-09-01 12:11:41 +00:00
Owen Anderson 90e0eaffa8 Teach DAG combine a number of tricks to simplify FMA expressions in fast-math mode.
llvm-svn: 163051
2012-09-01 06:04:27 +00:00
Michael Liao ec385012ae Fix typo
llvm-svn: 163049
2012-09-01 04:09:16 +00:00
Jakob Stoklund Olesen 5c8eda0ebc Add MachineInstr::tieOperands, remove setIsTied().
Manage tied operands entirely internally to MachineInstr. This makes it
possible to change the representation of tied operands, as I will do
shortly.

The constraint that tied uses and defs must be in the same order was too
restrictive.

llvm-svn: 163021
2012-08-31 20:50:53 +00:00
Craig Topper a8227cb76a Use CloneMachineInstr to make a new MI in commuteInstruction to make the code tolerant of instructions with more than two input operands.
llvm-svn: 163000
2012-08-31 16:30:05 +00:00
Jakob Stoklund Olesen 96f87069c4 Don't enforce ordered inline asm operands.
I was too optimistic; inline asm can have tied operands that don't
follow the def order.

Fixes PR13742.

llvm-svn: 162998
2012-08-31 15:34:59 +00:00
Pete Cooper e969340fea Take account of boolean vector contents when promoting a build vector from i1 to some other type. rdar://problem/12210060
llvm-svn: 162960
2012-08-30 23:58:52 +00:00
Owen Anderson cc61f87cf7 Teach the DAG combiner to turn chains of FADDs (x+x+x+x+...) into FMULs by constants. This is only enabled in unsafe FP math mode, since it does not preserve rounding effects for all such constants.
llvm-svn: 162956
2012-08-30 23:35:16 +00:00
Nadav Rotem ea973bda26 Currently, targets that do not support selects with scalar conditions and vector operands scalarize the code. ARM is such a target
because it does not support CMOV of vectors. To implement this efficiently, we broadcast the condition bit and use a sequence of NAND-OR
to select between the two operands. This is the same sequence we use for targets that don't have vector BLENDs (like SSE2).

rdar://12201387

llvm-svn: 162926
2012-08-30 19:17:29 +00:00
Jakob Stoklund Olesen 0eecbbeb5b Don't use MCInstrDesc flags for implicit operands.
When a MachineInstr is constructed, its implicit operands are added
first, then the explicit operands are inserted before the implicits.

MCInstrDesc has operand flags like early clobber and operand ties that
apply to the explicit operands.

Don't look at those flags when the implicit operands are first added in
the explicit operands' positions.

llvm-svn: 162910
2012-08-30 14:39:06 +00:00
Craig Topper 2da13f9ef8 Add FMA to switch statement in VectorLegalizer::LegalizeOp so that it can be expanded when it isn't legal.
llvm-svn: 162894
2012-08-30 07:34:22 +00:00
Craig Topper c8f5d77e75 Add support for FMA to WidenVectorResult.
llvm-svn: 162893
2012-08-30 07:13:41 +00:00
Jakob Stoklund Olesen ffba07b927 Verify the order of tied operands in inline asm.
When there are multiple tied use-def pairs on an inline asm instruction,
the tied uses must appear in the same order as the defs.

It is possible to write an LLVM IR inline asm instruction that breaks
this constraint, but there is no reason for a front end to emit the
operands out of order.

The GNU inline asm syntax specifies tied operands as a single read/write
constraint "+r", so out-of-order operands are not possible.

llvm-svn: 162878
2012-08-29 23:52:52 +00:00
Jakob Stoklund Olesen b2bef482fd Set the isTied flags when building INLINEASM MachineInstrs.
For normal instructions, isTied() is set automatically by addOperand(),
based on MCInstrDesc, but inline asm has tied operands outside the
descriptor.

llvm-svn: 162869
2012-08-29 22:02:00 +00:00
Jakob Stoklund Olesen cea3e77433 Rename hasVolatileMemoryRef() to hasOrderedMemoryRef().
Ordered memory operations are more constrained than volatile loads and
stores because they must be ordered with respect to all other memory
operations.

llvm-svn: 162861
2012-08-29 21:19:21 +00:00
Jakob Stoklund Olesen 813a109fa5 Don't move normal loads across volatile/atomic loads.
It is technically allowed to move a normal load across a volatile load,
but probably not a good idea.

It is not allowed to move a load across an atomic load with
Ordering > Monotonic, and we model those with MOVolatile as well.

I recently removed the mayStore flag from atomic load instructions, so
they don't need a pseudo-opcode. This patch makes up for the difference.

llvm-svn: 162857
2012-08-29 20:48:45 +00:00
Jakob Stoklund Olesen 7a837b9a76 Verify the consistency of inline asm operands.
The operands on an INLINEASM machine instruction are divided into groups
headed by immediate flag operands. Verify this structure.

Extract verifyTiedOperands(), and only call it for non-inlineasm
instructions.

llvm-svn: 162849
2012-08-29 18:11:05 +00:00
Eric Christopher 2a4e616df6 Clean this up slightly; it doesn't really fall through.
llvm-svn: 162848
2012-08-29 17:59:32 +00:00
Jakob Stoklund Olesen dbbff7899d Verify the tied operand flags.
When running with -verify-machineinstrs, check that tied operands come
in matching use/def pairs, and that they are consistent with MCInstrDesc
when it applies.

llvm-svn: 162816
2012-08-29 00:38:03 +00:00
Jakob Stoklund Olesen 2b16664522 Maintain a valid isTied bit as operands are added and removed.
The isTied bit is set automatically when a tied use is added and
MCInstrDesc indicates a tied operand. The tie is broken when one of the
tied operands is removed.

llvm-svn: 162814
2012-08-29 00:37:58 +00:00
Jakob Stoklund Olesen e56c60c5eb Add a MachineOperand::isTied() flag.
While in SSA form, a MachineInstr can have pairs of tied defs and uses.
The tied operands are used to represent read-modify-write operands that
must be assigned the same physical register.

Previously, tied operand pairs were computed from fixed MCInstrDesc
fields, or by using black magic on inline assembly instructions.

The isTied flag makes it possible to add tied operands to any
instruction while getting rid of (some of) the inlineasm magic.

Tied operands on normal instructions are needed to represent predicated
individual instructions in SSA form. An extra <tied,imp-use> operand is
required to represent the output value when the instruction predicate is
false.

Adding a predicate to:

  %vreg0<def> = ADD %vreg1, %vreg2

will look like:

  %vreg0<tied,def> = ADD %vreg1, %vreg2, pred:3, %vreg7<tied,imp-use>

The virtual register %vreg7 is the value given to %vreg0 when the
predicate is false. It will be assigned the same physreg as %vreg0.

This commit adds the isTied flag and sets it based on MCInstrDesc when
building an instruction. The flag is not used for anything yet.

llvm-svn: 162774
2012-08-28 18:34:41 +00:00
Jakob Stoklund Olesen dba99d0dfa Don't allow TargetFlags on MO_Register MachineOperands.
Register operands are manipulated by a lot of target-independent code,
and it is not always possible to preserve target flags. That means it is
not safe to use target flags on register operands.

None of the targets in the tree are using register operand target flags.
External targets should be using immediate operands to annotate
instructions with operand modifiers.

llvm-svn: 162770
2012-08-28 18:05:48 +00:00
Jakob Stoklund Olesen 87cb471e52 Remove extra MayLoad/MayStore flags from atomic_load/store.
These extra flags are not required to properly order the atomic
load/store instructions. SelectionDAGBuilder chains atomics as if they
were volatile, and SelectionDAG::getAtomic() sets the isVolatile bit on
the memory operands of all atomic operations.

The volatile bit is enough to order atomic loads and stores during and
after SelectionDAG.

This means we set mayLoad on atomic_load, mayStore on atomic_store, and
mayLoad+mayStore on the remaining atomic read-modify-write operations.

llvm-svn: 162733
2012-08-28 03:11:32 +00:00
Akira Hatanaka adb14f56c7 Fix bug 13532.
In SelectionDAGLegalize::ExpandLegalINT_TO_FP, expand INT_TO_FP nodes without
using any f64 operations if f64 is not a legal type.

Patch by Stefan Kristiansson. 

llvm-svn: 162728
2012-08-28 02:12:42 +00:00
Richard Smith 228e6d4cf3 Fix integer undefined behavior due to signed left shift overflow in LLVM.
Reviewed offline by chandlerc.

llvm-svn: 162623
2012-08-24 23:29:28 +00:00
Jakob Stoklund Olesen 10cdd09318 Avoid including explicit uses when counting SDNode imp-uses.
It is legal to have a register node as an explicit operand; it shouldn't
be counted as an implicit use.

llvm-svn: 162591
2012-08-24 20:52:42 +00:00
Manman Ren cf10446ffa BranchProb: modify the definition of an edge in BranchProbabilityInfo to handle
the case of multiple edges from one block to another.

A simple example is a switch statement with multiple values to the same
destination. The definition of an edge is modified from a pair of blocks to
a pair of PredBlock and an index into the successors.

Also set the weight correctly when building SelectionDAG from LLVM IR,
especially when converting a Switch.
IntegersSubsetMapping is updated to calculate the weight for each cluster.

llvm-svn: 162572
2012-08-24 18:14:27 +00:00
Eric Christopher bb69a27dbc Use DW_FORM_flag_present to save space in debug information if we're
not in darwin gdb compat mode.

Fixes rdar://10975088

llvm-svn: 162526
2012-08-24 01:14:27 +00:00
Eric Christopher acb7115bde Remove the DW_AT_MIPS_linkage_name attribute when we don't need it in the
output (we're emitting a specification already and the information
isn't changing) and we're not in old gdb compat mode.

Saves 1% on the debug information for a build of llvm.

Fixes rdar://11043421

llvm-svn: 162493
2012-08-23 22:52:55 +00:00
Eric Christopher 20b76a77c3 Turn these two options into a trinary state so that they can be
turned on and off separately from the platform if you're on darwin.

llvm-svn: 162487
2012-08-23 22:36:40 +00:00
Eric Christopher 4977f214d7 Add a flag to DwarfDebug to allow it to communicate whether or not
we're using the darwin old gdb compat mode for emitting dwarf.

llvm-svn: 162486
2012-08-23 22:36:36 +00:00
Eric Christopher a876b8243e Typo.
llvm-svn: 162438
2012-08-23 07:32:06 +00:00
Eric Christopher 3a47c3e3cd Only emit the __debug_inlined section if we're trying to be compatible
with older gdbs on darwin.

rdar://10975874

llvm-svn: 162436
2012-08-23 07:32:02 +00:00
Eric Christopher 7782618271 Emit pubtypes only when going for darwin gdb compatibility.
rdar://10393214

llvm-svn: 162434
2012-08-23 07:10:56 +00:00
Eric Christopher 978fbff11b Add an option for darwin gdb compatibility.
llvm-svn: 162432
2012-08-23 07:10:46 +00:00
Andrew Trick ae53561b0c Simplify the computeOperandLatency API.
The logic for recomputing latency based on a ScheduleDAG edge was
shady. This bypasses the problem by requiring the client to provide
operand indices. This ensures consistent use of the machine model's
API.

llvm-svn: 162420
2012-08-23 00:39:43 +00:00
David Blaikie c8c2920a3f Tidy up a few more uses of MF.getFunction()->getName().
Based on CR feedback from r162301 and Craig Topper's refactoring in r162347,
here are a few other places that could use the same API (& in one instance drop
a Function.h dependency).

llvm-svn: 162367
2012-08-22 17:18:53 +00:00
Benjamin Kramer f29db275b2 Reduce duplicated hash map lookups.
llvm-svn: 162362
2012-08-22 15:37:57 +00:00
Stepan Dyatkovskiy 99120e04be Rejected 162195. As Duncan commented, bitcasting to the proper type is the wrong approach. We need to insert some valid TRUNCATE node here.
llvm-svn: 162354
2012-08-22 09:33:55 +00:00
Craig Topper a538d831e6 Add a getName function to MachineFunction. Use it in places that previously did getFunction()->getName(). Remove includes of Function.h that are no longer needed.
llvm-svn: 162347
2012-08-22 06:07:19 +00:00
Richard Smith 3fb2047f82 Initialize SelectionDAGBuilder's Context in 'init', not in its constructor. The
SelectionDAG's 'init' has not been called when the SelectionDAGBuilder is
constructed (in SelectionDAGISel's constructor), so this was previously always
initialized with 0.

llvm-svn: 162333
2012-08-22 00:42:39 +00:00
David Blaikie 9c7226b456 Remove unnecessary cast that was also unnecessarily casting away constness.
Even looking at the revision history I couldn't quite piece together why this
cast was ever written in the first place, but I assume it was because of some
change in the inheritance: perhaps this function was reimplemented in a
derived type & this caller was meant to get the base version (& it wasn't
virtual)?

llvm-svn: 162301
2012-08-21 18:54:23 +00:00
Chad Rosier d269bd8c24 Add support for the --param ssp-buffer-size= driver option.
PR9673

llvm-svn: 162284
2012-08-21 16:15:24 +00:00
Jakob Stoklund Olesen 6bae2a57d5 Fix a quadratic algorithm in MachineBranchProbabilityInfo.
The getSumForBlock function was quadratic in the number of successors
because getSuccWeight would perform a linear search for an already known
iterator.

This patch was originally committed as r161460, but reverted again
because of assertion failures. Now that duplicate Machine CFG edges have
been eliminated, this works properly.

llvm-svn: 162233
2012-08-20 22:01:38 +00:00
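
For illustration, a self-contained C++ sketch of the quadratic pattern this commit removes (hypothetical names, not the MachineBranchProbabilityInfo API): re-searching the successor list for a weight whose position is already known makes the sum O(n^2), while walking the weight list directly is O(n).

  // Hypothetical sketch of the performance issue; not the actual LLVM code.
  #include <cstddef>
  #include <cstdint>
  #include <vector>

  // Quadratic: for each successor, search the list again to find its weight.
  uint32_t sumWeightsQuadratic(const std::vector<int> &Succs,
                               const std::vector<uint32_t> &Weights) {
    uint32_t Sum = 0;
    for (int S : Succs)
      for (std::size_t I = 0; I != Succs.size(); ++I)
        if (Succs[I] == S) { Sum += Weights[I]; break; }  // linear re-search
    return Sum;
  }

  // Linear: the position is already known, so walk the weights once.
  uint32_t sumWeightsLinear(const std::vector<uint32_t> &Weights) {
    uint32_t Sum = 0;
    for (uint32_t W : Weights)
      Sum += W;
    return Sum;
  }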
Jakob Stoklund Olesen 7d33c5739f Don't add CFG edges for redundant conditional branches.
IR that hasn't been through SimplifyCFG can look like this:

  br i1 %b, label %r, label %r

Make sure we don't create duplicate Machine CFG edges in this case.

Fix the machine code verifier to accept conditional branches with a
single CFG edge.

llvm-svn: 162230
2012-08-20 21:39:52 +00:00
Jakob Stoklund Olesen 1d0262677b Add a verification pass after ExpandISelPseudos.
This pass often has weird CFG hacks and hand-written MI building code
that can go wrong in many ways.

llvm-svn: 162224
2012-08-20 20:52:08 +00:00
Jakob Stoklund Olesen de31b52c3e Add CFG checks to MachineVerifier.
Verify that the predecessor and successor lists are consistent and free
of duplicates.

llvm-svn: 162223
2012-08-20 20:52:06 +00:00
Stepan Dyatkovskiy 6a638ec521 Fixed DAGCombiner bug (found and localized by James Malloy):
The DAGCombiner tries to optimise a BUILD_VECTOR by checking if it
consists purely of get_vector_elts from one or two source vectors. If
so, it either makes a concat_vectors node or a shufflevector node.

However, it doesn't check the element type width of the underlying
vector, so if you have this sequence:

Node0: v4i16 = ...
Node1: i32 = extract_vector_elt Node0
Node2: i32 = extract_vector_elt Node0
Node3: v16i8 = BUILD_VECTOR Node1, Node2, ...

It will attempt to:

Node0:    v4i16 = ...
NewNode1: v16i8 = concat_vectors Node0, ...

Where this is actually invalid because the element width is completely
different. This causes an assertion failure at the DAG legalization stage.

Fix:
If the output element type of the BUILD_VECTOR differs from the input element type,
make the concat_vectors based on the input element type and then bitcast it to the output vector type. So the case described above will be transformed to:
Node0:    v4i16 = ...
NewNode1: v8i16 = concat_vectors Node0, ...
NewNode2: v16i8 = bitcast NewNode1

llvm-svn: 162195
2012-08-20 07:57:06 +00:00
Eli Friedman 79a6b30d8a Make atomic load and store of pointers work. Tighten verification of atomic operations
so other unexpected operations don't slip through.  Based on patch by Logan Chien.
PR11786/PR13186.

llvm-svn: 162146
2012-08-17 23:24:29 +00:00
Bill Wendling bfb9b7598d Implement stack protectors for structures with character arrays in them.
<rdar://problem/10545247>

llvm-svn: 162131
2012-08-17 20:59:56 +00:00
Bill Wendling 34bc34ecae Change the `linker_private_weak_def_auto' linkage to `linkonce_odr_auto_hide' to
make it more consistent with its intended semantics.

The `linker_private_weak_def_auto' linkage type was meant to automatically hide
globals which never had their addresses taken. It has nothing to do with the
`linker_private' linkage type, which outputs the symbols with an `l' (ell) prefix
among other things.

The intended semantic is more like the `linkonce_odr' linkage type.

Change the name of the linkage type to `linkonce_odr_auto_hide', thereby
changing the semantics so that it produces the correct output for the linker.

Note: The old linkage name `linker_private_weak_def_auto' will still parse but
is not a synonym for `linkonce_odr_auto_hide'. This should be removed in 4.0.
<rdar://problem/11754934>

llvm-svn: 162114
2012-08-17 18:33:14 +00:00
Benjamin Kramer ca7ca4f6c6 TargetLowering: Use the large shift amount during legalize types. The legalizer may call us with an overly large type.
llvm-svn: 162101
2012-08-17 15:54:21 +00:00
Jakob Stoklund Olesen 714f595c98 Use standard pattern for iterate+erase.
Increment the MBB iterator at the top of the loop to properly handle the
current (and previous) instructions getting erased.

This fixes PR13625.

llvm-svn: 162099
2012-08-17 14:38:59 +00:00
Jakob Stoklund Olesen 2382d320b3 Add an MCID::Select flag and TII hooks for optimizing selects.
Select instructions pick one of two virtual registers based on a
condition, like x86 cmov. On targets like ARM that support predication,
selects can sometimes be eliminated by predicating the instruction
defining one of the operands.

Teach PeepholeOptimizer to recognize select instructions, and ask the
target to optimize them.

llvm-svn: 162059
2012-08-16 23:11:47 +00:00
Richard Smith 8f3447c032 Fix undefined behavior: don't perform array indexing through a potentially null
pointer.

llvm-svn: 161919
2012-08-15 01:39:31 +00:00
Richard Smith 0ff8f0eaf9 Fix undefined behavior: binding null pointer to reference. No functionality change.
llvm-svn: 161853
2012-08-14 05:31:26 +00:00
Eric Christopher 160522c25a Grammar.
llvm-svn: 161851
2012-08-14 05:13:29 +00:00
Owen Anderson a40319b7f1 Add a roundToIntegral method to APFloat, which can be parameterized over various rounding modes. Use this to implement SelectionDAG constant folding of FFLOOR, FCEIL, and FTRUNC.
llvm-svn: 161807
2012-08-13 23:32:49 +00:00
Jakob Stoklund Olesen 396b595b92 Transfer weights in transferSuccessorsAndUpdatePHIs().
llvm-svn: 161805
2012-08-13 23:13:25 +00:00
Jakob Stoklund Olesen 1dc107a84e Print out MachineBasicBlock successor weights when available.
llvm-svn: 161804
2012-08-13 23:13:23 +00:00
Jakob Stoklund Olesen 702bcc3bcf Remove the TII::scheduleTwoAddrSource() hook.
It never does anything when running 'make check', and it gets in the
way of updating live intervals in 2-addr.

The hook was originally added to help form IT blocks in Thumb2 code
before register allocation, but the pass ordering has changed since
then, and we run if-conversion after register allocation now.

When the MI scheduler is enabled, there will be no less than two
schedulers between 2-addr and Thumb2ITBlockPass, so this hook is
unlikely to help anything.

llvm-svn: 161794
2012-08-13 21:52:57 +00:00
Bill Wendling 49aeb5cc5d Whitespace cleanup.
llvm-svn: 161788
2012-08-13 21:20:43 +00:00
Jakob Stoklund Olesen d0af1d9657 Count triangles and diamonds in early if-conversion.
llvm-svn: 161783
2012-08-13 21:03:27 +00:00
Jakob Stoklund Olesen 62a097d134 Delete dead typedef.
llvm-svn: 161782
2012-08-13 21:03:25 +00:00
Jakob Stoklund Olesen 83a927d84a Handle extra Tail predecessors in if-conversion.
It is still possible to if-convert if the tail block has extra
predecessors, but the tail phis must be rewritten instead of being
removed.

llvm-svn: 161781
2012-08-13 20:49:04 +00:00
Benjamin Kramer 59c8b411e0 MachineCSE: Hoist isConstantPhysReg out of the loop; it checks for overlaps already.
llvm-svn: 161729
2012-08-11 20:42:59 +00:00
Benjamin Kramer ef6494f24d PR13578: Teach MachineCSE that instructions that use a constant register can be CSE'd safely.
This is common e.g. when doing rip-relative addressing on x86_64.

llvm-svn: 161728
2012-08-11 19:05:13 +00:00
Jakob Stoklund Olesen bc55bfde03 Add a proper if-conversion cost model.
Detect when there is not enough available ILP, so if-conversion can't
speculate instructions for free.

Compute the lengthening of the critical path when inserting a select
instruction that depends on the condition as well as both sides of the
if.

Reject conversions that would stretch the critical path by more than
half a mispredict penalty.

llvm-svn: 161713
2012-08-10 22:27:31 +00:00
Jakob Stoklund Olesen a0042acd3b Give MachineTraceMetrics its own debug tag.
llvm-svn: 161712
2012-08-10 22:27:29 +00:00
Jakob Stoklund Olesen 3484420927 Add more trace query functions.
Trace::getResourceLength() computes the number of cycles required to
execute the trace when ignoring data dependencies. The number can be
compared to the critical path to estimate the trace ILP.

Trace::getPHIDepth() computes the data dependency depth of a PHI in a
trace successor that isn't necessarily part of the trace.

llvm-svn: 161711
2012-08-10 22:27:27 +00:00
Jakob Stoklund Olesen 0a99062cf6 Add getTPred() and getFPred() functions.
They identify the PHI predecessors in both diamonds and triangles.

llvm-svn: 161689
2012-08-10 20:19:17 +00:00
Jakob Stoklund Olesen 0954d4199a Include loop-carried dependencies when computing instr heights.
When a trace ends with a back-edge, include PHIs in the loop header in
the height computations. This makes the critical path through a loop
more accurate by including the latencies of the last instructions in the
loop.

llvm-svn: 161688
2012-08-10 20:11:38 +00:00
Jakob Stoklund Olesen 8c28ac9ec9 Update edge weights correctly in replaceSuccessor().
When replacing Old with New, it can happen that New is already a
successor. Add the old and new edge weights instead of creating a
duplicate edge.

llvm-svn: 161653
2012-08-10 03:23:27 +00:00
Jakob Stoklund Olesen d9b66506a3 Reapply r161633-161634 "Partition use lists so defs always come before uses."
No changes to these patches; MRI needed to be notified when changing
uses into defs and vice versa.

llvm-svn: 161644
2012-08-10 00:21:30 +00:00
Jakob Stoklund Olesen ae7b9711b1 Also update MRI use lists when changing a use to a def and vice versa.
This was the cause of the buildbot failures.

llvm-svn: 161643
2012-08-10 00:21:26 +00:00
Jakob Stoklund Olesen acd27c9279 Revert r161633-161634 "Partition use lists so defs always come before uses."
These commits broke a number of buildbots.

llvm-svn: 161640
2012-08-09 23:31:36 +00:00
Jakob Stoklund Olesen df01e00710 Partition use lists so defs always come before uses.
This makes it possible to speed up def_iterator by stopping at the first
use. This makes def_empty() and getUniqueVRegDef() much faster when
there are many uses.

In a +Asserts build, LiveVariables is 100x faster in one case because
getVRegDef() has an assertion that would scan to the end of a
def_iterator chain.

Spill weight calculation is significantly faster (300x in one case)
because isTriviallyReMaterializable() calls MRI->isConstantPhysReg(%RIP)
which calls def_empty(%RIP).

llvm-svn: 161634
2012-08-09 22:49:46 +00:00
Jakob Stoklund Olesen 7d7051ca3c Don't use pointer-pointers for the register use lists.
Use a more conventional doubly linked list where the Prev pointers form
a cycle. This means it is no longer necessary to adjust the Prev
pointers when reallocating the VRegInfo array.

The test changes are required because the register allocation hint is
using the use-list order to break ties.

llvm-svn: 161633
2012-08-09 22:49:42 +00:00
Jakob Stoklund Olesen c4102d4902 Move use list management into MachineRegisterInfo.
Register MachineOperands are kept in linked lists accessible via MRI's
reg_iterator interfaces. The linked list management was handled partly
by MachineOperand methods, partly by MRI methods.

Move all of the list management into MRI, delete
MO::AddRegOperandToRegInfo() and MO::RemoveRegOperandFromRegInfo().

Be more explicit about handling the cases where an MRI pointer isn't
available.

llvm-svn: 161632
2012-08-09 22:49:37 +00:00
Jakob Stoklund Olesen 420798ca4f Fix a future TwoAddressInstructionPass crash.
No test case, the crash only happens when the default use list order is
changed.

llvm-svn: 161627
2012-08-09 22:08:26 +00:00
Nadav Rotem e0f84d31c8 Fix the legalization of ExtLoad on ARM. ExpandUnalignedLoad did not properly
handle the cases where the memory value type was illegal. 
PR 13111. 

llvm-svn: 161565
2012-08-09 01:56:44 +00:00
Jakob Stoklund Olesen f71bc7b267 Don't use getNextOperandForReg() in RAFast.
That particular optimization was probably premature anyway.

llvm-svn: 161541
2012-08-08 23:44:01 +00:00
Jakob Stoklund Olesen bf1ac4bdc3 Deal with irreducible control flow when building traces.
We filter out MachineLoop back-edges during the trace-building PO
traversals, but it is possible to have CFG cycles that aren't natural
loops, and MachineLoopInfo doesn't include such cycles.

Use a standard visited set to detect such CFG cycles, and completely
ignore them when picking traces.

llvm-svn: 161532
2012-08-08 22:12:01 +00:00
Jakob Stoklund Olesen fa8a26f9df Heed -stress-early-ifcvt.
llvm-svn: 161513
2012-08-08 18:24:23 +00:00
Jakob Stoklund Olesen e71b6c6b20 Get the MispredictPenalty from MCSchedModel.
Thanks, Andy!

llvm-svn: 161507
2012-08-08 18:19:58 +00:00
Andrew Trick db9b1b5e66 Minor cleanup of defaultDefLatency API
llvm-svn: 161470
2012-08-08 02:44:11 +00:00
Jakob Stoklund Olesen 0556be983d Revert "Fix a quadratic algorithm in MachineBranchProbabilityInfo."
It caused an assertion failure when compiling consumer-typeset.

llvm-svn: 161463
2012-08-08 01:10:31 +00:00
Manman Ren 1be131ba27 X86: enable CSE between CMP and SUB
We perform the following:
1> Use SUB instead of CMP for i8,i16,i32 and i64 in ISel lowering.
2> Modify MachineCSE to correctly handle implicit defs.
3> Convert SUB back to CMP if possible at peephole.

Removed pattern matching of (a>b) ? (a-b):0 and the like, since they are handled
by peephole now.

rdar://11873276

llvm-svn: 161462
2012-08-08 00:51:41 +00:00
Jakob Stoklund Olesen c0b61ff9c7 Fix a quadratic algorithm in MachineBranchProbabilityInfo.
The getSumForBlock function was quadratic in the number of successors
because getSuccWeight would perform a linear search for an already known
iterator.

llvm-svn: 161460
2012-08-08 00:20:37 +00:00
Jakob Stoklund Olesen fbf45dc2bd Skip tied operand pairs that already have the same register.
llvm-svn: 161454
2012-08-07 22:47:06 +00:00
Jakob Stoklund Olesen 505715d816 Add SelectionDAG::getTargetIndex.
This adds support for TargetIndex operands during isel. The meaning of
these (index, offset, flags) operands is entirely defined by the target.

llvm-svn: 161453
2012-08-07 22:37:05 +00:00
Bill Wendling 61396b81a4 For non-Darwin platforms, we want to generate stack protectors only for
character arrays. This is in line with what GCC does.
<rdar://problem/10529227>

llvm-svn: 161446
2012-08-07 20:59:05 +00:00
Jakob Stoklund Olesen 84689b0d5a Add a new kind of MachineOperand: MO_TargetIndex.
A target index operand looks a lot like a constant pool reference, but
it is completely target-defined. It contains the 8-bit TargetFlags, a
32-bit index, and a 64-bit offset. It is preserved by all code generator
passes.

TargetIndex operands can be used to carry target-specific information in
cases where immediate operands won't suffice.

llvm-svn: 161441
2012-08-07 18:56:39 +00:00
Jakob Stoklund Olesen 296448b293 Fix a couple of typos.
llvm-svn: 161437
2012-08-07 18:32:57 +00:00
Jakob Stoklund Olesen 75d9d5159e Add trace accessor methods, implement primitive if-conversion heuristic.
Compare the critical paths of the two traces through an if-conversion
candidate. If the difference is larger than the branch prediction
penalty, reject the if-conversion; it would never pay.

llvm-svn: 161433
2012-08-07 18:02:19 +00:00
Chandler Carruth 881d0a7966 Add a much more conservative strategy for aligning branch targets.
Previously, MBP essentially aligned every branch target it could. This
bloats code quite a bit, especially non-looping code which has no real
reason to prefer aligned branch targets so heavily.

As Andy said in review, it's still a bit odd to do this without a real
cost model, but this at least has much more plausible heuristics.

Fixes PR13265.

llvm-svn: 161409
2012-08-07 09:45:24 +00:00
Manman Ren cb36b8c2e6 MachineCSE: Update the heuristics for isProfitableToCSE.
If the result of a common subexpression is used at all uses of the candidate
expression, CSE should not increase the live range of the common subexpression.

rdar://11393714 and rdar://11819721

llvm-svn: 161396
2012-08-07 06:16:46 +00:00
Jakob Stoklund Olesen a9d0b850b3 Delete a dead variable.
TwoAddressInstructionPass doesn't remat any more.

llvm-svn: 161285
2012-08-04 00:04:03 +00:00
Jakob Stoklund Olesen a0c72ecf79 TwoAddressInstructionPass refactoring: Extract another method.
llvm-svn: 161284
2012-08-03 23:57:58 +00:00
Bob Wilson 874886cd66 Refactor and check "onlyReadsMemory" before optimizing builtins.
This patch is mostly just refactoring a bunch of copy-and-pasted code, but
it also adds a check that the call instructions are readnone or readonly.
That check was already present for sin, cos, sqrt, log2, and exp2 calls, but
it was missing for the rest of the builtins being handled in this code.

llvm-svn: 161282
2012-08-03 23:29:17 +00:00
Jakob Stoklund Olesen 1162a1548b TwoAddressInstructionPass refactoring: Extract a method.
No functional change intended, except replacing a DenseMap with a
SmallDenseMap which should behave identically.

llvm-svn: 161281
2012-08-03 23:25:45 +00:00
Jakob Stoklund Olesen 24bc514c0c Begin adding support for updating LiveIntervals in TwoAddressInstructionPass.
This is far from complete, and only changes behavior when the
-early-live-intervals flag is passed to llc.

llvm-svn: 161273
2012-08-03 22:58:34 +00:00
Jakob Stoklund Olesen 1c46589290 Add an experimental -early-live-intervals option.
This option runs LiveIntervals before TwoAddressInstructionPass which
will eventually learn to exploit and update the analysis.

Eventually, LiveIntervals will run before PHIElimination, and we can get
rid of LiveVariables.

llvm-svn: 161270
2012-08-03 22:12:54 +00:00
Jakob Stoklund Olesen 918999db95 Delete merged physreg copies in joinReservedPhysReg().
Previously, the identity copy would survive through register allocation
before it was removed by the rewriter.

llvm-svn: 161269
2012-08-03 22:12:51 +00:00
Bob Wilson 871701c606 Try to reduce the compile time impact of r161232.
The previous change caused fast isel to not attempt handling any calls to
builtin functions.  That included things like "printf" and caused some
noticeable regressions in compile time.  I wanted to avoid having fast isel
keep a separate list of functions that had to be kept in sync with what the
code in SelectionDAGBuilder.cpp was handling.  I've resolved that here by
moving the list into TargetLibraryInfo.  This is somewhat redundant in
SelectionDAGBuilder but it will ensure that we keep things consistent.

llvm-svn: 161263
2012-08-03 21:26:24 +00:00
Bob Wilson fa59485b94 Fix memcmp code-gen to honor -fno-builtin.
I noticed that SelectionDAGBuilder::visitCall was missing a check for memcmp
in TargetLibraryInfo, so that it would use custom code for memcmp calls even
with -fno-builtin.  I also had to add a new -disable-simplify-libcalls option
to llc so that I could write a test for this.

llvm-svn: 161262
2012-08-03 21:26:18 +00:00
Jakob Stoklund Olesen daae19f785 Completely eliminate VNInfo flags.
The 'unused' state of a value number can be represented as an invalid
def SlotIndex. This also exposed code that shouldn't have been looking
at unused value VNInfos.

llvm-svn: 161258
2012-08-03 20:59:32 +00:00
Jakob Stoklund Olesen 21809385a6 Fix a couple of loops that were processing unused value numbers.
Unused VNInfos should be left alone. Their def SlotIndex doesn't point
to anything.

llvm-svn: 161257
2012-08-03 20:59:29 +00:00
Matt Beaumont-Gay aaba08d503 Silence unused variable warning in -asserts build
llvm-svn: 161256
2012-08-03 20:54:11 +00:00
Jakob Stoklund Olesen 9f565e19c5 Eliminate the VNInfo::hasPHIKill() flag.
The only real user of the flag was removeCopyByCommutingDef(), and it
has been switched to LiveIntervals::hasPHIKill().

All the code changed by this patch was only concerned with computing and
propagating the flag.

llvm-svn: 161255
2012-08-03 20:19:44 +00:00
Jakob Stoklund Olesen 06d6a5363b Make the hasPHIKills flag a computed property.
The VNInfo::HAS_PHI_KILL is only half supported. We precompute it in
LiveIntervalAnalysis, but it isn't properly updated by live range
splitting and functions like shrinkToUses().

It is only used in one place: RegisterCoalescer::removeCopyByCommutingDef().

This patch changes that function to use a new LiveIntervals::hasPHIKill()
function that computes the flag for a given value number.

llvm-svn: 161254
2012-08-03 20:10:24 +00:00
Jakob Stoklund Olesen 19c4596629 Delete dead function.
llvm-svn: 161242
2012-08-03 15:21:21 +00:00
Jakob Stoklund Olesen 47ac20d4d6 Don't delete dead code in TwoAddressInstructionPass.
This functionality was added before we started running
DeadMachineInstructionElim on all targets. It serves no purpose now.

llvm-svn: 161241
2012-08-03 15:11:57 +00:00
Bob Wilson 3e6fa462f3 Fall back to selection DAG isel for calls to builtin functions.
Fast isel doesn't currently have support for translating builtin function
calls to target instructions.  For embedded environments where the library
functions are not available, this is a matter of correctness and not
just optimization.  Most of this patch is just arranging to make the
TargetLibraryInfo available in fast isel.  <rdar://problem/12008746>

llvm-svn: 161232
2012-08-03 04:06:28 +00:00
Manman Ren ba8122cc25 X86 Peephole: fold loads to the source register operand if possible.
Add more comments and use early returns to reduce nesting in isLoadFoldable.
Also disable folding for V_SET0 to avoid introducing a const pool entry and
a const pool load.

rdar://10554090 and rdar://11873276

llvm-svn: 161207
2012-08-02 19:37:32 +00:00
Jakob Stoklund Olesen 5d30630e22 Compute the critical path length through a trace.
Whenever both instruction depths and instruction heights are known in a
block, it is possible to compute the length of the critical path as
max(depth+height) over the instructions in the block.

The stored live-in lists make it possible to accurately compute the
length of a critical path that bypasses the current (small) block.

llvm-svn: 161197
2012-08-02 18:45:54 +00:00
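
As a small illustration of the formula in the commit above (a sketch with made-up types, not the MachineTraceMetrics interface): once every instruction's depth and height in a block are known, the critical path length is the maximum of depth + height over those instructions.

  // Sketch only: critical path = max(depth + height) over the block.
  #include <algorithm>
  #include <utility>
  #include <vector>

  unsigned criticalPathLength(
      const std::vector<std::pair<unsigned, unsigned>> &DepthHeight) {
    unsigned Len = 0;
    for (const auto &DH : DepthHeight)        // DH = {depth, height} in cycles
      Len = std::max(Len, DH.first + DH.second);
    return Len;
  }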
Jakob Stoklund Olesen 637c467528 Verify regunit intervals along with virtreg intervals.
Don't cause regunit intervals to be computed just to verify them. Only
check the already cached intervals.

llvm-svn: 161183
2012-08-02 16:36:50 +00:00
Jakob Stoklund Olesen 374071dde2 Avoid creating dangling physreg live ranges during DCE.
LiveRangeEdit::eliminateDeadDefs() can delete a dead instruction that
reads unreserved physregs. This would leave the corresponding regunit
live interval dangling because we don't have shrinkToUses() for physical
registers.

Fix this problem by turning the instruction into a KILL instead of
deleting it. This happens in a landing pad in
test/CodeGen/X86/2012-05-19-CoalescerCrash.ll:

  %vreg27<def,dead> = COPY %EDX<kill>; GR32:%vreg27

becomes:

  KILL %EDX<kill>

An upcoming fix to the machine verifier will catch problems like this by
verifying regunit live intervals.

This fixes PR13498. I am not including the test case from the PR since
we already have one exposing the problem once the verifier is fixed.

llvm-svn: 161182
2012-08-02 16:36:47 +00:00
Jakob Stoklund Olesen bde5dc5e46 Add report() functions that take a LiveInterval argument.
llvm-svn: 161178
2012-08-02 14:31:49 +00:00
Manman Ren 5759d01230 X86 Peephole: fold loads to the source register operand if possible.
Machine CSE and other optimizations can remove instructions so folding
is possible at peephole while not possible at ISel.

This patch is a rework of r160919 and was tested on clang self-host on my local
machine.

rdar://10554090 and rdar://11873276

llvm-svn: 161152
2012-08-02 00:56:42 +00:00
Jakob Stoklund Olesen e736b97eff Extract some methods from verifyLiveIntervals.
No functional change.

llvm-svn: 161149
2012-08-02 00:20:20 +00:00
Jakob Stoklund Olesen a766b4746d Also verify RegUnit intervals at uses.
llvm-svn: 161147
2012-08-01 23:52:40 +00:00
Jakob Stoklund Olesen 2db6b65330 Compute instruction heights through a trace.
The height of an instruction is the minimum number of cycles from when the
instruction is issued to the end of the trace. Heights are computed for
all instructions in and below the trace center block.

The method for computing heights is different from the depth
computation. As we visit instructions in the trace bottom-up, heights of
used instructions are pushed upwards. This way, we avoid scanning long
use lists, looking for uses in the current trace.

At each basic block boundary, a list of live-in registers and their
minimum heights is saved in the trace block info. These live-in lists
are used when restarting depth computations on a trace that
converges with an already computed trace. They will also be used to
accurately compute the critical path length.

llvm-svn: 161138
2012-08-01 22:36:00 +00:00
Eric Christopher b1b9451337 Temporarily revert c23b933d5f8be9b51a1d22e717c0311f65f87dcd. It's causing
failures in the debug testsuite and possibly PR13486.

llvm-svn: 161121
2012-08-01 18:19:01 +00:00
Jakob Stoklund Olesen 5e19d35e9a Add DataDep constructors. Explicitly check SSA form.
llvm-svn: 161115
2012-08-01 16:02:59 +00:00
Elena Demikhovsky 3cb3b0045c Added FMA functionality to X86 target.
llvm-svn: 161110
2012-08-01 12:06:00 +00:00
Manman Ren f288d2f120 MachineSink: Sort the successors before trying to find SuccToSinkTo.
Use stable_sort instead of sort. Follow-up to r161062.

rdar://11980766

llvm-svn: 161075
2012-07-31 20:45:38 +00:00
Jakob Stoklund Olesen 059e647c6d Compute instruction depths through the current trace.
Assuming infinite issue width, compute the earliest each instruction in
the trace can issue, when considering the latency of data dependencies.
The issue cycle is recorded as a 'depth' from the beginning of the trace.

This is half the computation required to find the length of the critical
path through the trace. Heights are next.

llvm-svn: 161074
2012-07-31 20:44:38 +00:00
Jakob Stoklund Olesen 1dfb101835 Rename CT -> MTM. MachineTraceMetrics is abbreviated MTM.
llvm-svn: 161072
2012-07-31 20:25:13 +00:00
Manman Ren 8c549b586c MachineSink: Sort the successors before trying to find SuccToSinkTo.
One motivating example is to sink an instruction from a basic block which has
two successors: one outside the loop, the other inside the loop. We should try
to sink the instruction outside the loop.

rdar://11980766

llvm-svn: 161062
2012-07-31 18:10:39 +00:00
Micah Villmow b67d7a3a33 Conform to LLVM coding style.
llvm-svn: 161061
2012-07-31 18:07:43 +00:00
Micah Villmow 6b12f596ef Don't generate ordered or unordered comparison operations if it is not legal to do so.
llvm-svn: 161053
2012-07-31 16:48:03 +00:00
Jakob Stoklund Olesen 0c807dfae2 Clear kill flags in removeCopyByCommutingDef().
We are extending live ranges, so kill flags are not accurate. They
aren't needed until they are recomputed after RA anyway.

<rdar://problem/11950722>

llvm-svn: 161023
2012-07-31 02:47:24 +00:00
Manman Ren 2b6a0dfd4c Reverse the order of the two branches at the end of a basic block if it is profitable.
We branch to the successor with higher edge weight first.
Convert from
     je    LBB4_8  --> to outer loop
     jmp   LBB4_14 --> to inner loop
to
     jne   LBB4_14
     jmp   LBB4_8

PR12750
rdar: 11393714

llvm-svn: 161018
2012-07-31 01:11:07 +00:00
Andrew Trick 79795897b3 Use the latest MachineRegisterInfo APIs. No functionality.
llvm-svn: 161010
2012-07-30 23:48:17 +00:00
Andrew Trick 535a23c38b Inline MachineRegisterInfo::hasOneUse
llvm-svn: 161007
2012-07-30 23:48:12 +00:00
Jakob Stoklund Olesen 68c2cd059e Avoid looking at stale data in verifyAnalysis().
llvm-svn: 161004
2012-07-30 23:15:12 +00:00
Jakob Stoklund Olesen c14cf57ba9 Allow traces to enter nested loops.
This lets traces include the final iteration of a nested loop above the
center block, and the first iteration of a nested loop below the center
block.

We still don't allow traces to contain backedges, and traces are
truncated where they would leave a loop, as seen from the center block.

llvm-svn: 161003
2012-07-30 23:15:10 +00:00
Jakob Stoklund Olesen 984cfe8322 Clarify invalidation strategy in comment.
llvm-svn: 160997
2012-07-30 21:16:22 +00:00
Jakob Stoklund Olesen f308c128ea Assert that all trace candidate blocks have been visited by the PO.
When computing a trace, all the candidates for pred/succ must have been
visited. Filter out back-edges first, though. The PO traversal ignores
them.

Thanks to Andy for spotting this in review.

llvm-svn: 160995
2012-07-30 21:10:27 +00:00
Jakob Stoklund Olesen a12a7d5f74 Hook into PassManager's analysis verification.
By overriding Pass::verifyAnalysis(), the pass contents will be verified
by the pass manager.

llvm-svn: 160994
2012-07-30 20:57:50 +00:00
Pete Cooper 91244268d7 Consider address spaces for hashing and CSEing DAG nodes. Otherwise two loads from different x86 segments but the same address would get CSEd.
llvm-svn: 160987
2012-07-30 20:23:19 +00:00
Jakob Stoklund Olesen 7361846f32 Add MachineInstr::isTransient().
This is a cleaned up version of the isFree() function in
MachineTraceMetrics.cpp.

Transient instructions are very unlikely to produce any code in the
final output. Either because they get eliminated by RegisterCoalescing,
or because they are pseudo-instructions like labels and debug values.

llvm-svn: 160977
2012-07-30 18:34:14 +00:00
Jakob Stoklund Olesen 3df6c46fdd Add MachineTraceMetrics::verify().
This function verifies the consistency of cached data in the
MachineTraceMetrics analysis.

llvm-svn: 160976
2012-07-30 18:34:11 +00:00
Jakob Stoklund Olesen eb488fe165 Verify that the CFG hasn't changed during invalidate().
The MachineTraceMetrics analysis must be invalidated before modifying
the CFG. This will catch some of the violations of that rule.

llvm-svn: 160969
2012-07-30 17:36:49 +00:00
Jakob Stoklund Olesen fee94ca15b Add MachineBasicBlock::isPredecessor().
A->isPredecessor(B) is the same as B->isSuccessor(A), but it can
tolerate a B that is null or dangling. This shouldn't happen normally,
but it is useful for verification code.

llvm-svn: 160968
2012-07-30 17:36:47 +00:00
Manman Ren f87dd7c01b Revert r160920 and r160919 due to dragonegg and clang selfhost failure
llvm-svn: 160927
2012-07-29 02:44:09 +00:00
Manman Ren 0fa3ab88ba X86 Peephole: fold loads to the source register operand if possible.
Machine CSE and other optimizations can remove instructions so folding
is possible at peephole while not possible at ISel.

rdar://10554090 and rdar://11873276

llvm-svn: 160919
2012-07-28 16:48:01 +00:00
Andrew Trick 940534371b Reenable a basic SSA DAG builder optimization.
Jakob fixed ProcessImplicitDefs in r159149.

llvm-svn: 160910
2012-07-28 01:48:15 +00:00
Jakob Stoklund Olesen 0563369755 Add more debug output to MachineTraceMetrics.
llvm-svn: 160905
2012-07-27 23:58:38 +00:00
Jakob Stoklund Olesen 1152202cc2 Keep track of the head and tail of the trace through each block.
This makes it possible to quickly detect blocks that are outside the
trace.

llvm-svn: 160904
2012-07-27 23:58:36 +00:00
Eric Christopher 86ca9f9e11 Add a DW_AT_high_pc for CUs that are a single address range. Update
all tests accordingly.

Fixes PR13351.

Patch by shinichiro hamaji!

llvm-svn: 160899
2012-07-27 22:00:05 +00:00
Jakob Stoklund Olesen 7dfe7abdee Also compute register mask lists under -new-live-intervals.
llvm-svn: 160898
2012-07-27 21:56:39 +00:00
Jakob Stoklund Olesen 97e14e02f1 Eliminate the IS_PHI_DEF flag and VNInfo::setIsPHIDef().
A value number is a PHI def if and only if it begins at a block
boundary. This can be derived from the def slot, a separate flag is not
necessary.

llvm-svn: 160893
2012-07-27 21:11:14 +00:00
Jakob Stoklund Olesen 4021a7bf25 Add a -new-live-intervals experimental option.
This option replaces the existing live interval computation with one
based on LiveRangeCalc.cpp. The new algorithm does not depend on
LiveVariables, and it can be run at any time, before or after leaving
SSA form.

llvm-svn: 160892
2012-07-27 20:58:46 +00:00
Jakob Stoklund Olesen bc65e8f94e Add <imp-def> of super-register when lowering SUBREG_TO_REG.
Patch by Tyler Nowicki!

llvm-svn: 160888
2012-07-27 20:19:49 +00:00
Jakob Stoklund Olesen 35400b1dda Use an otherwise unused variable.
llvm-svn: 160798
2012-07-26 19:42:56 +00:00
Jakob Stoklund Olesen f9029fef2a Start scaffolding for a MachineTraceMetrics analysis pass.
This is still a work in progress.

Out-of-order CPUs usually execute instructions from multiple basic
blocks simultaneously, so it is necessary to look at longer traces when
estimating the performance effects of code transformations.

The MachineTraceMetrics analysis will pick a typical trace through a
given basic block and provide performance metrics for the trace. Metrics
will include:

- Instruction count through the trace.
- Issue count per functional unit.
- Critical path length, and per-instruction 'slack'.

These metrics can be used to determine the performance limiting factor
when executing the trace, and how it will be affected by a code
transformation.

Initially, this will be used by the early if-conversion pass.

llvm-svn: 160796
2012-07-26 18:38:11 +00:00
Dan Gohman 0b3d782933 Add a floor intrinsic.
llvm-svn: 160791
2012-07-26 17:43:27 +00:00
Manman Ren cc1dc6dc11 Disable rematerialization in TwoAddressInstructionPass.
It is redundant; RegisterCoalescer will do the remat if it can't eliminate
the copy. Collected instruction counts before and after this. A few extra
instructions are generated due to spilling but it is normal to see these kinds
of changes with almost any small codegen change, according to Jakob.

This also fixed rdar://11830760 where xor is expected instead of movi0.

llvm-svn: 160749
2012-07-25 18:28:13 +00:00
Jakob Stoklund Olesen cef9a618b1 Preserve 2-addr constraints in ConnectedVNInfoEqClasses.
When a live range splits into multiple connected components, we would
arbitrarily assign <undef> uses to component 0. This is wrong when the
use is tied to a def that gets assigned to a different component:

  %vreg69<def> = ADD8ri %vreg68<undef>, 1

The use and def must get the same virtual register.

Fix this by assigning <undef> uses to the same component as the value
defined by the instruction, if any:

  %vreg69<def> = ADD8ri %vreg69<undef>, 1

This fixes PR13402. The PR has a test case which I am not including
because it is unlikely to keep exposing this behavior in the future.

llvm-svn: 160739
2012-07-25 17:15:15 +00:00
Jakob Stoklund Olesen c6fd3deee6 Verify two-address constraints more carefully.
Include <undef> operands and virtual registers after leaving SSA form.

llvm-svn: 160734
2012-07-25 16:49:11 +00:00
Craig Topper 17300940ae Change llvm_unreachable in SplitVectorOperand to report_fatal_error. Keeps release builds from crashing if code uses an intrinsic with an illegal type.
llvm-svn: 160661
2012-07-24 04:11:21 +00:00
Sylvestre Ledru 35521e2310 Fix a typo (the the => the)
llvm-svn: 160621
2012-07-23 08:51:15 +00:00
Nadav Rotem 9056076cab Fixed DAGCombine optimizations which generate select_cc for targets
that do not support it (X86 does not lower select_cc).

PR: 13428

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160619
2012-07-23 07:59:50 +00:00
Craig Topper 2694c05e86 Tidy up. Fix indentation and remove trailing whitespace.
llvm-svn: 160617
2012-07-23 05:38:07 +00:00
Craig Topper b49546a3b3 Change llvm_unreachable in SplitVectorResult to report_fatal_error. Keeps release builds from crashing if code uses an intrinsic with an illegal type. For instance 256-bit AVX intrinsics without having AVX enabled.
llvm-svn: 160616
2012-07-23 04:34:49 +00:00
Benjamin Kramer 5be8f60126 Remove unused private member variables uncovered by the recent changes to clang's -Wunused-private-field.
llvm-svn: 160583
2012-07-20 22:05:57 +00:00
Jakob Stoklund Olesen e2cfd0d45a Avoid folding loads that are unsafe to move.
LiveRangeEdit::foldAsLoad() can eliminate a register by folding a load
into its only use. Only do that when the load is safe to move, and it
won't extend any live ranges.

This fixes PR13414.

llvm-svn: 160575
2012-07-20 21:29:31 +00:00
Jakob Stoklund Olesen f62c07f147 Split loop exiting edges more aggressively.
PHIElimination splits critical edges when it predicts it can resolve
interference and eliminate copies. It doesn't split the edge if the
interference wouldn't be resolved anyway because the phi-use register is
live in the critical edge anyway.

Teach PHIElimination to split loop exiting edges with interference, even
if it wouldn't resolve the interference. This removes the necessary
copies from the loop, which is still an improvement from injecting the
copies into the loop.

The test case demonstrates the improvement. Before:

LBB0_1:
  cmpb  $0, (%rdx)
  leaq  1(%rdx), %rdx
  movl  %esi, %eax
  je  LBB0_1

After:

LBB0_1:
  cmpb  $0, (%rdx)
  leaq  1(%rdx), %rdx
  je  LBB0_1

  movl  %esi, %eax

llvm-svn: 160571
2012-07-20 20:49:53 +00:00
Pete Cooper dcf94db677 Fix crash in machine verifier when trying to print the def of a register which has no def
llvm-svn: 160531
2012-07-19 23:40:38 +00:00
Benjamin Kramer f364a63c3e Replace some explicit compare loops with std::equal.
No functionality change.

llvm-svn: 160501
2012-07-19 10:46:05 +00:00
Galina Kistanova aaf9735951 Fixed a few warnings.
llvm-svn: 160493
2012-07-19 04:50:12 +00:00
Bill Wendling d163405df8 Remove tabs.
llvm-svn: 160475
2012-07-19 00:04:14 +00:00
Chandler Carruth 985454e0ac Fix a somewhat nasty crasher in PR13378. This crashes inside of
LiveIntervals due to the two-addr pass generating bogus MI code.

The crux of the issue was a loop nesting problem. The intent of the code
which attempts to transform instructions before converting them to
two-addr form is to defer and reprocess any transformed instructions as
the second processing is likely to have more opportunities to coalesce
copies, etc. Unfortunately, there was one section of processing that was
not deferred -- the INSERT_SUBREG rewriting. Due to quirks of how this
rewriting proceeded, not only did it occur early, it removed the bits of
information needed for the deferred processing to correctly generate the
necessary two address form (specifically inserting a copy), but didn't
trigger any immediate assertions and produced what appeared to be
already valid two-address from code. Thus, the assertion only fired much
later in the pipeline.

The fix is to hoist the transformation logic up a layer to where it can
more firmly defer all further processing, and to teach the normal
processing to handle an edge case previously handled as part of the
transformation logic. This edge case (already matched tied register
operands) needs to *not* defer any steps.

As has been brought up repeatedly in the process: wow does this code
need refactoring. I *may* squeeze in some time to at least bring sanity
to this loop... but wow... =]

Thanks to Jakob for helpful hints on the way here, and the review.

llvm-svn: 160443
2012-07-18 18:58:22 +00:00
Nuno Lopes 2151497dca ignore 'invoke @llvm.donothing', but still keep the edge to the continuation BB
llvm-svn: 160411
2012-07-18 00:07:17 +00:00
Evan Cheng e6a3b03ee0 Back out r160101 and instead implement a dag combine to recover from the instcombine transformation.
llvm-svn: 160387
2012-07-17 18:54:11 +00:00
Jakob Stoklund Olesen 0ef031186c Add some trace output to TwoAddressInstructionPass.
llvm-svn: 160380
2012-07-17 17:57:23 +00:00
Benjamin Kramer 7c1598caaa Remove unused variable.
llvm-svn: 160372
2012-07-17 17:00:11 +00:00
Nadav Rotem 277a40bc0a Fix a crash in the legalization of large vectors.
When truncating a result of a vector that is split we need
to use the result of the split vector, and not re-split the dead node.

llvm-svn: 160357
2012-07-17 09:07:37 +00:00
Evan Cheng 780f9b5f92 Implement r160312 as a target-independent dag combine.
llvm-svn: 160354
2012-07-17 08:31:11 +00:00
Evan Cheng 47d7be9578 Make sure the constant bitwidth is <= 64 bits before calling getSExtValue().
llvm-svn: 160350
2012-07-17 07:47:50 +00:00
Evan Cheng f579beca6d This is another case where instcombine's demanded-bits optimization created
large immediates. Add dag combine logic to recover in case the large
immediate doesn't fit in the cmp immediate operand field.

int foo(unsigned long l) {
  return (l>> 47) == 1;
}

we produce

  %shr.mask = and i64 %l, -140737488355328
  %cmp = icmp eq i64 %shr.mask, 140737488355328
  %conv = zext i1 %cmp to i32
  ret i32 %conv

which codegens to

movq    $0xffff800000000000,%rax
andq    %rdi,%rax
movq    $0x0000800000000000,%rcx
cmpq    %rcx,%rax
sete    %al
movzbl    %al,%eax
ret

TargetLowering::SimplifySetCC would transform
(X & -256) == 256 -> (X >> 8) == 1
if the immediate fails the isLegalICmpImmediate() test. For x86,
that means immediates that do not fit in a signed 32-bit immediate.

Based on a patch by Eli Friedman.

PR10328
rdar://9758774
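
A standalone C++ sketch (not part of the patch) of why the two predicates are interchangeable, which is what lets the combine pick whichever form has a legal cmp immediate:

  #include <cassert>
  #include <cstdint>

  // Mask-and-compare form: both immediates need 64 bits on x86.
  static bool viaMask(uint64_t l) {
    return (l & 0xFFFF800000000000ULL) == 0x0000800000000000ULL;
  }

  // Shift-and-compare form: the compare immediate is just 1.
  static bool viaShift(uint64_t l) {
    return (l >> 47) == 1;
  }

  int main() {
    const uint64_t Tests[] = {0, ~0ULL, 0x0000800000000000ULL, 0x0000FFFFFFFFFFFFULL};
    for (uint64_t l : Tests)
      assert(viaMask(l) == viaShift(l));
  }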

llvm-svn: 160346
2012-07-17 06:53:39 +00:00
Nadav Rotem 60f7904db7 Minor cleanup and docs.
llvm-svn: 160311
2012-07-16 18:56:39 +00:00
Nadav Rotem 839a06e9d7 Make ComputeDemandedBits return a deterministic result when computing an AssertZext value.
In the added testcase the constant 55 was behind an AssertZext of type i1, and ComputeDemandedBits
reported that some of the bits were both known to be one and known to be zero.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160305
2012-07-16 18:34:53 +00:00
Nadav Rotem 3050e07108 Fix a bug in the scalarization of BUILD_VECTOR. BUILD_VECTOR elements may be wider than the output element type. Make sure to trunc them if needed.
Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160235
2012-07-15 20:39:08 +00:00
Nadav Rotem a62368c965 Refactor the code that checks that all operands of a node are UNDEFs.
Add a micro-optimization to getNode of CONCAT_VECTORS when both operands are undefs.
Can't find a testcase for this because VECTOR_SHUFFLE already handles undef operands, but Duncan suggested that we add this.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160229
2012-07-15 08:38:23 +00:00
Chandler Carruth db5536f09d Reapply r160194, switching to use LV information for finding local kills.
The notable fix is to look at any dependencies attached to the kill
instruction (or other instructions between MI and the kill) where the
dependencies are specific to the register in question.

The old code implicitly handled this by rejecting the transform if *any*
other uses were found within the block, but after the start point. The
new code directly finds the kill, and has to re-use the existing
dependency scan to check for non-kill uses.

This was caught by self-host, but I found the bug via inspection and use
of absurd assert scaffolding to compute the kills in two ways and
compare them. So I have no useful testcase for this other than
"bootstrap". I'd work harder to reduce a test case if this particular
code were likely to live for a long time.

Thanks to Benjamin Kramer for reviewing the fix itself.

llvm-svn: 160228
2012-07-15 03:29:46 +00:00
Nadav Rotem 018921002e Add a dagcombine optimization to convert concat_vectors of undefs into a single undef.
The unoptimized concat_vectors node prevented the canonicalization of the vector_shuffle node.

llvm-svn: 160221
2012-07-14 21:30:27 +00:00
Jakob Stoklund Olesen 8f324a2cc8 Account for early-clobber reload instructions.
No test case, there are no in-tree targets that require this.

llvm-svn: 160219
2012-07-14 18:45:35 +00:00
Jakob Stoklund Olesen 3d604ab933 Be more verbose when detecting dominance problems.
Catch uses of undefined physregs that haven't been added to basic block
live-in lists. Run the verifier to pinpoint the problem.

Also run the verifier when a virtual register use is not jointly
dominated by defs.

llvm-svn: 160207
2012-07-13 23:39:05 +00:00
Chandler Carruth 9c97cd5672 Revert r160194, which switched to use LV information for finding local
kills.

This is causing miscompiles that I'm working on tracking down.

llvm-svn: 160196
2012-07-13 22:23:32 +00:00
Chandler Carruth 58c470dc68 Use the LiveVariables information to efficiently get local kills. This
removes the largest scaling problem in the test cases from PR13225 when
ASan is switched to insert basic blocks in the natural CFG order.

It may also solve some scaling problems for more normal code with large
numbers of basic blocks and variables.

llvm-svn: 160194
2012-07-13 21:18:38 +00:00
Jim Grosbach 1af8c8060c Provide function name in 'Cannot select' fatal error.
When dumping the DAG for a fatal 'Cannot select' back-end error, also
provide the name of the function the construct is in. Useful when dealing
with large testcases, as the next step is to llvm-extract the function
in question to get a small(er) testcase.

llvm-svn: 160152
2012-07-13 00:29:09 +00:00
Eric Christopher bf57091f8b The end of the prologue should be marked with is_stmt.
Fixes PR13303.

Patch by Paul Robinson!

llvm-svn: 160148
2012-07-12 23:30:25 +00:00
Duncan Sands 671cc2575d The result type of EXTRACT_VECTOR_ELT doesn't have to match the element type of
the input vector; it can be bigger (this is helpful for PowerPC, where <2 x i16>
is a legal vector type but i16 isn't a legal type, IIRC).  However this wasn't
being taken into account by ExpandRes_EXTRACT_VECTOR_ELT, causing PR13220.
Lightly tweaked version of a patch by Michael Liao.

llvm-svn: 160116
2012-07-12 09:01:35 +00:00
Evan Cheng b17122859b InstrEmitter::EmitSubregNode() optimize extract_subreg in this case:
r1025 = s/zext r1024, 4
r1026 = extract_subreg r1025, 4

to a copy:
r1026 = copy r1024

This is correct. However, it uses TII->isCoalescableExtInstr(), which can return
true for instructions that essentially do a sext_in_reg, so this can end up
with an illegal copy where the source and destination register classes do not
match. Add a check to avoid it. Sorry, no test case possible at this time.

rdar://11849816

llvm-svn: 160059
2012-07-11 18:55:07 +00:00
Nadav Rotem 2a148668b6 Rename many of the Tmp1, Tmp2, Tmp3 variables to names such as Chain, Value, Ptr, etc.
No functionality change.

llvm-svn: 160042
2012-07-11 11:02:16 +00:00
Benjamin Kramer 9488100d46 Remove unused variable.
llvm-svn: 160040
2012-07-11 09:39:04 +00:00
Nadav Rotem de6fd282ef Refactor the DAG Legalizer by extracting the legalization of
Load and Store nodes into their own functions.
No functional change.

llvm-svn: 160037
2012-07-11 08:52:09 +00:00
Owen Anderson b8844d6744 Only apply the SETCC+SITOFP -> SELECTCC optimization when the SETCC returns an MVT::i1, i.e. before type legalization.
This is a speculative fix for a problem on Mips reported by Akira Hatanaka.

llvm-svn: 160036
2012-07-11 06:38:55 +00:00
Jakob Stoklund Olesen bc90a4ea82 Require and preserve LoopInfo for early if-conversion.
It will surely be needed by heuristics.

llvm-svn: 160027
2012-07-10 22:39:56 +00:00
Chandler Carruth 2207f76cd4 Teach the LiveInterval::join function to use the fast merge algorithm,
generalizing its implementation sufficiently to support this value
number scenario as well.

This cuts out another significant performance hit in large functions
(over 10k basic blocks, etc), especially those with "natural" CFG
structures.

llvm-svn: 160026
2012-07-10 22:25:21 +00:00
Jakob Stoklund Olesen 02638392c1 Run early if-conversion in domtree post-order.
This ordering allows nested if-conversion without using a work list, and
it makes it possible to update the dominator tree on the fly as well.

Any erased basic blocks will always be dominated by the current
post-order position, so the domtree can be pruned without invalidating
the iterator.

llvm-svn: 160025
2012-07-10 22:18:23 +00:00
Chandler Carruth 77d940011d Fix a bug where I didn't test for an empty range before inspecting the
back of it.

I don't have anything even remotely close to a test case for this. It
only broke two build bots, both of them doing bootstrap builds, one of
them a dragonegg bootstrap. It doesn't break for me when I bootstrap
either. It doesn't reproduce every time or on many machines during the
bootstrap. Many thanks to Duncan Sands who got the exact command (and
stage of the bootstrap) which failed on the dragonegg bootstrap and
managed to get it to trigger under valgrind with debug symbols. The fix
was then found by inspection.

llvm-svn: 159993
2012-07-10 15:41:33 +00:00
Nadav Rotem d908ddc186 Improve the loading of load-anyext vectors by allowing the codegen to load
multiple scalars and insert them into a vector. Next, we shuffle the elements
into the correct places, as before.
Also fix a small dagcombine bug in SimplifyBinOpWithSameOpcodeHands, when the
migration of bitcasts happened too late in the SelectionDAG process.

llvm-svn: 159991
2012-07-10 13:25:08 +00:00
Chandler Carruth e18614dd17 Add an efficient merge operation to LiveInterval and use it to avoid
quadratic behavior when performing pathological merges. Fixes the core
element of PR12652.

There is only one user of addRangeFrom left: join. I'm hoping to
refactor further in a future patch and have join use this merge
operation as well.
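
The rough idea, as a self-contained sketch (simplified ranges, not the LiveInterval data structure itself): walk both sorted range lists once and coalesce on the fly, instead of inserting ranges one at a time into the destination.

  #include <algorithm>
  #include <cassert>
  #include <utility>
  #include <vector>

  using Range = std::pair<int, int>; // [start, end), kept sorted by start

  // Single linear pass over both lists; inserting each range individually
  // (addRangeFrom-style) is what leads to quadratic behavior.
  static std::vector<Range> mergeSorted(const std::vector<Range> &A,
                                        const std::vector<Range> &B) {
    std::vector<Range> Out;
    size_t I = 0, J = 0;
    while (I < A.size() || J < B.size()) {
      bool TakeA = J == B.size() || (I < A.size() && A[I].first < B[J].first);
      Range R = TakeA ? A[I++] : B[J++];
      if (!Out.empty() && Out.back().second >= R.first)
        Out.back().second = std::max(Out.back().second, R.second);
      else
        Out.push_back(R);
    }
    return Out;
  }

  int main() {
    std::vector<Range> A{{0, 4}, {10, 12}}, B{{3, 6}, {20, 25}};
    assert((mergeSorted(A, B) == std::vector<Range>{{0, 6}, {10, 12}, {20, 25}}));
  }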

llvm-svn: 159982
2012-07-10 05:16:17 +00:00
Chandler Carruth ac766b9b42 Teach LiveIntervals how to verify themselves and start using it in some
of the tricky merge routines. This adds a layer of testing that was
necessary when implementing more efficient (and complex) merge logic for
this datastructure.

No functionality changed here.

llvm-svn: 159981
2012-07-10 05:06:03 +00:00
Andrew Trick c50f06487c indentation
llvm-svn: 159958
2012-07-09 20:43:01 +00:00
Owen Anderson d4b841f8f9 Teach the DAG combiner to turn sitofp/uitofp from i1 into a conditional move, since there are only two possible values.
Previously, this would become an integer extension operation, followed by a real integer->float conversion.
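
In source-level terms (a hedged C++ analogy, not the DAG combine itself), an i1 input means only two results are possible, so the conversion collapses to a select between two constants:

  #include <cassert>

  // Old lowering: extend the boolean to an integer, then do a real int->fp conversion.
  static float viaExtend(bool B) { return static_cast<float>(static_cast<unsigned>(B)); }

  // New lowering: conditionally move one of the two possible constants.
  // (For a *signed* i1 the constants would be 0.0 and -1.0 instead.)
  static float viaSelect(bool B) { return B ? 1.0f : 0.0f; }

  int main() {
    assert(viaExtend(false) == viaSelect(false));
    assert(viaExtend(true) == viaSelect(true));
  }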

llvm-svn: 159957
2012-07-09 20:31:12 +00:00
Andrew Trick 87255e340e I'm introducing a new machine model to simultaneously allow simple
subtarget CPU descriptions and support new features of
MachineScheduler.

MachineModel has three categories of data:
1) Basic properties for coarse grained instruction cost model.
2) Scheduler Read/Write resources for simple per-opcode and operand cost model (TBD).
3) Instruction itineraries for detailed per-cycle reservation tables.

These will all live side-by-side. Any subtarget can use any
combination of them. Instruction itineraries will not change in the
near term. In the long run, I expect them to only be relevant for
in-order VLIW machines that have complex constraints and require a
precise scheduling/bundling model. Once itineraries are only actively
used by VLIW-ish targets, they could be replaced by something more
appropriate for those targets.

This tablegen backend rewrite sets things up for introducing
MachineModel type #2: per opcode/operand cost model.

llvm-svn: 159891
2012-07-07 04:00:00 +00:00
Chad Rosier 879c34f45a Whitespace.
llvm-svn: 159839
2012-07-06 17:44:22 +00:00
Chad Rosier 88d53eae56 [fast-isel] Tell fast-isel to do nothing with the new donothing intrinsic.
llvm-svn: 159837
2012-07-06 17:33:39 +00:00
Alexey Samsonov 39602781f6 Fix PR13202 and a regtest.
DwarfDebug class could generate the same (inlined) DIVariable twice:
1) when trying to find abstract debug variable for a concrete inlined instance.
2) when explicitly collecting info for variables that were optimized out.

This change makes sure that this duplication won't happen and makes
Clang pass "gdb.opt/inline-locals" test from gdb testsuite.

Reviewed by Eric Christopher.

llvm-svn: 159811
2012-07-06 08:45:08 +00:00
Jakob Stoklund Olesen 3f1bb93cab Add some comments suggested in code review.
llvm-svn: 159800
2012-07-06 02:31:22 +00:00
Chandler Carruth 1088676476 Optimize extendIntervalEndTo a tiny bit by saving one call through the
vector erase. No functionality changed.

llvm-svn: 159746
2012-07-05 12:40:45 +00:00
Chandler Carruth 264854f9a0 Finish fixing the MachineOperand hashing, providing a nice modern
hash_value overload for MachineOperands. This addresses a FIXME
sufficient for me to remove it, and cleans up the code nicely too.

The important changes to the hashing logic:
- TargetFlags are now included in all of the hashes. These were completely
  missed.
- Register operands have their subregisters and whether they are a def
  included in the hash.
- We now actually hash all of the operand types. Previously, many
  operand types were simply *dropped on the floor*. For example:
  - Floating point immediates
  - Large integer immediates (>64-bit)
  - External globals!
  - Register masks
  - Metadata operands
- It removes the offset from the block-address hash; I'm a bit
  suspicious of this, but isIdenticalTo doesn't consider the offset for
  block addresses.

Any patterns involving these entities could have triggered extreme
slowdowns in MachineCSE or PHIElimination. Let me know if there are PRs
you think might be closed now... I'm looking myself, but I may miss
them.
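
A simplified, hypothetical stand-in (not the actual hash_value overload) showing the underlying requirement: every field that isIdenticalTo() compares has to feed the hash, or distinct operands collide and the CSE hash tables degenerate.

  #include <cstddef>
  #include <cstdio>
  #include <functional>

  // Hypothetical, pared-down register operand.
  struct RegOperand {
    unsigned Reg;
    unsigned SubReg;
    bool IsDef;
    unsigned TargetFlags;
  };

  // Simple hash combiner (illustrative; LLVM uses its own hashing utilities).
  static std::size_t combine(std::size_t Seed, std::size_t V) {
    return Seed ^ (V + 0x9e3779b97f4a7c15ULL + (Seed << 6) + (Seed >> 2));
  }

  static std::size_t hashValue(const RegOperand &MO) {
    std::size_t H = std::hash<unsigned>()(MO.Reg);
    H = combine(H, MO.SubReg);      // previously-missing fields like these
    H = combine(H, MO.IsDef);       // must be folded in so that operands
    H = combine(H, MO.TargetFlags); // distinct to isIdenticalTo() hash apart
    return H;
  }

  int main() {
    RegOperand A{5, 0, true, 0}, B{5, 3, true, 0};
    std::printf("different subregs hash apart: %d\n", hashValue(A) != hashValue(B));
  }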

llvm-svn: 159743
2012-07-05 11:06:22 +00:00
Duncan Sands 71dacd09fe All cases are covered, no need for a default. This deals with the
corresponding clang warning.
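
For illustration only (a generic example, not the code this commit touches): when a switch covers every enumerator, dropping the default silences clang's warning about an unreachable default, and the compiler can then flag any enumerator added later without a case.

  enum Kind { Scalar, Vector };

  static int numElements(Kind K) {
    switch (K) {
    case Scalar: return 1;
    case Vector: return 4;
    }
    // No default: all cases are covered. A new Kind without a matching case
    // now triggers a compiler warning instead of silently hitting a default.
    return 0; // unreachable, but keeps "control reaches end" warnings quiet
  }

  int main() { return numElements(Scalar) - 1; }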

llvm-svn: 159742
2012-07-05 10:14:33 +00:00
Chandler Carruth 1d5d23106e The hash function for MI expressions, used by MachineCSE, is really
broken. This patch fixes the superficial problems which lead to the
intractably slow compile times reported in PR13225.

The specific issue is that we were failing to include the *offset* of
a global variable in the hash code. Oops. This would in turn cause all
MIs which were only distinguishable due to operating on different
offsets of a global variable to produce identical hash functions. In
some of the test cases attached to the PR I saw hash table activity
where there were O(1000) probes-per-lookup *on average*. A very few
entries were responsible for most of these probes.

There is still quite a bit more to do here. The ad-hoc layering of data
in MachineOperands makes them *extremely* brittle to hash correctly.
We're missing quite a few other cases, the only ones I've fixed here are
the specific MO types which were allowed through the assert() in
getOffset().

llvm-svn: 159741
2012-07-05 10:03:57 +00:00
Duncan Sands 0552a2cad2 Use the right kind of booleans: we were emitting 0/1 booleans, instead of 0/-1
booleans.  Patch by James Benton.
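
A small standalone sketch (illustrative only) of why vector compares want all-ones rather than 1 for "true": the lane result can then be used directly as a bit mask, for example in a bitwise select.

  #include <cassert>
  #include <cstdint>

  // One lane of a vector select done with bitwise operations.
  static int32_t bitwiseSelect(int32_t Mask, int32_t A, int32_t B) {
    return (Mask & A) | (~Mask & B);
  }

  int main() {
    const int32_t True = -1; // all bits set: selects A in every bit position
    const int32_t False = 0; // no bits set: selects B in every bit position
    assert(bitwiseSelect(True, 7, 9) == 7);
    assert(bitwiseSelect(False, 7, 9) == 9);
    assert(bitwiseSelect(1, 7, 9) != 7); // a 0/1 boolean only masks the low bit
  }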

llvm-svn: 159739
2012-07-05 09:32:46 +00:00
Nick Lewycky 765c699370 Remove ParentMap. You can just ask the domnode for its parent. No functionality
change.

Move the "Not profitable, avoid CSE!" debug message next to where we fail the
check for profitability and use a different message for avoiding CSE due to
being in different register classes.

llvm-svn: 159729
2012-07-05 06:19:21 +00:00
Jakob Stoklund Olesen c300ef0e50 Allow trailing physreg RegisterSDNode operands on non-variadic instructions.
Also allow trailing register mask operands on both non-variadic
MachineSDNodes and MachineInstrs.

The extra physreg RegisterSDNode operands are added to the MI as
<imp-use> operands. This makes it possible to have non-variadic call
instructions.

Call and return instructions really are non-variadic, the argument
registers should only be used implicitly - they are not part of the
encoding.

llvm-svn: 159727
2012-07-04 23:53:23 +00:00
Jakob Stoklund Olesen adb50a7a09 Print SlotIndexes when available for -print-machineinstrs.
llvm-svn: 159726
2012-07-04 23:53:19 +00:00
Jakob Stoklund Olesen 2d827d628e Allow multiple terminators to read virtual registers.
Find the kill as the last terminator to read SrcReg.

Patch by Philipp Brüschweiler!

llvm-svn: 159722
2012-07-04 19:52:05 +00:00
Jakob Stoklund Olesen 29506f5e6d Make sure -print-machineinstrs applies to the first pass as well.
llvm-svn: 159720
2012-07-04 19:28:27 +00:00
Stepan Dyatkovskiy 7ff588f986 Reverted r156659 due to probable performance regressions; DenseMap should be used here:
IntegersSubsetMapping
  - Replaced the type of the Items field from std::list with std::map. In the near future I'll test it with DenseMap and do the corresponding replacement
    if possible.

llvm-svn: 159703
2012-07-04 05:53:05 +00:00
Eric Christopher ef9d710ea6 Reduce some code duplication.
llvm-svn: 159701
2012-07-04 02:02:18 +00:00
Matt Beaumont-Gay 11d08b2e22 Fix some ascii art in a comment to not have trailing backslashes (inspiration
from IfConversion.cc), and fix some spelling and grammar in the surrounding
prose.

llvm-svn: 159699
2012-07-04 01:09:45 +00:00
Jakob Stoklund Olesen f8a63a1507 Add an experimental early if-conversion pass, off by default.
This pass performs if-conversion on SSA form machine code by
speculatively executing both sides of the branch and using a cmov
instruction to select the result. This can help lower the number of
branch mispredictions on architectures like x86 that don't have
predicable instructions.

The current implementation is very aggressive, and causes regressions on
most tests. It needs good heuristics that have yet to be implemented.
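
In source-level terms (an illustrative C++ sketch, not the pass itself), the transformation trades a possibly mispredicted branch for unconditional evaluation of both sides plus a conditional move:

  #include <cassert>

  // Branchy form: only one side is evaluated, but the branch can mispredict.
  static int branchy(bool C, int A, int B) {
    if (C)
      return A + 1;
    return B - 1;
  }

  // If-converted form: both sides are computed speculatively and a
  // cmov-like select picks the result. This is only profitable when both
  // sides are cheap and side-effect free, which is what the missing
  // heuristics need to decide.
  static int ifConverted(bool C, int A, int B) {
    int T = A + 1;
    int F = B - 1;
    return C ? T : F; // typically lowered to a conditional move on x86
  }

  int main() {
    assert(branchy(true, 3, 5) == ifConverted(true, 3, 5));
    assert(branchy(false, 3, 5) == ifConverted(false, 3, 5));
  }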

llvm-svn: 159694
2012-07-04 00:09:54 +00:00
Stepan Dyatkovskiy 8b0c97e0dd Part of r159527. Split into a series of patches and relanded with PR13256 fixed:
IntegersSubsetMapping
  - Replaced the type of the Items field from std::list with std::map. In the near future I'll test it with DenseMap and do the corresponding replacement
    if possible.

llvm-svn: 159659
2012-07-03 13:46:45 +00:00
Eric Christopher b65acc61a5 Revert "IntRange:" as it appears to be breaking self hosting.
This reverts commit b2833d9dcba88c6f0520cad760619200adc0442c.

llvm-svn: 159618
2012-07-02 23:22:21 +00:00
Chandler Carruth 34263a0c95 All glory to address sanitizer. ;]
It appears to have caught a use-after-free introduced by r159567
and/or friends, which call 'addPass' from many more places. The bug in
'addPass' doesn't appear to be new, and was spotted by inspection when
ASan shone a bright light of a stacktrace at these functions.

Hopefully this will fix the ASan failure -- I have no test case other
than running an ASan-built clang over the test suite.

llvm-svn: 159614
2012-07-02 22:56:41 +00:00
Evan Cheng 39e90029a2 Target option DisableJumpTables is a gross hack. Move it to TargetLowering instead.
llvm-svn: 159611
2012-07-02 22:39:56 +00:00
Andrew Trick 2f26b34806 misched: allow NULL InstrItineraries.
llvm-svn: 159599
2012-07-02 21:55:12 +00:00
Eric Christopher dd8638fb3e Turn an assert into an error to make it a bit more friendly.
Part of rdar://6880388 and rdar://11766377

llvm-svn: 159590
2012-07-02 21:16:43 +00:00
Bob Wilson cac3b90633 Extend TargetPassConfig to allow running only a subset of the normal passes.
This is still a work in progress but I believe it is currently good enough
to fix PR13122 "Need unit test driver for codegen IR passes".  For example,
you can run llc with -stop-after=loop-reduce to have it dump out the IR after
running LSR.  Serializing machine-level IR is not yet supported but we have
some patches in progress for that.

The plan is to serialize the IR to a YAML file, containing separate sections
for the LLVM IR, machine-level IR, and whatever other info is needed.  Chad
suggested that we stash the stop-after pass in the YAML file and use that
instead of the start-after option to figure out where to restart the
compilation.  I think that's a great idea, but since it's not implemented yet
I put the -start-after option into this patch for testing purposes.

llvm-svn: 159570
2012-07-02 19:48:45 +00:00
Bob Wilson a3f9fa710a Move assertion with TargetPassConfig's Initialized flag.
llvm-svn: 159569
2012-07-02 19:48:39 +00:00
Bob Wilson b9b693650a Consistently use AnalysisID types in TargetPassConfig.
This makes it possible to just use a zero value to represent "no pass", so
the phony NoPassID global variable is no longer needed.

llvm-svn: 159568
2012-07-02 19:48:37 +00:00
Bob Wilson bbd38dd9c0 Add all codegen passes to the PassManager via TargetPassConfig.
This is a preliminary step toward having TargetPassConfig be able to
start and stop the compilation at specified passes for unit testing
and debugging.  No functionality change.

llvm-svn: 159567
2012-07-02 19:48:31 +00:00
Manman Ren 72098b2c91 Added assertion in getVRegDef of MachineRegisterInfo to make sure the virtual
register does not have multiple definitions. Modified TwoAddressInstructionPass
to use getUniqueVRegDef instead of getVRegDef.

llvm-svn: 159545
2012-07-02 18:55:36 +00:00
Andrew Trick f161e391f8 Reapply "Make NumMicroOps a variable in the subtarget's instruction itinerary."
Reapplies r159406 with minor cleanup. The regressions appear to have been spurious.

llvm-svn: 159541
2012-07-02 18:10:42 +00:00
Stepan Dyatkovskiy 8b9ecca42d IntRange:
- Changed isSingleNumber method behaviour. Now this flag is calculated on demand.
IntegersSubsetMapping
  - Optimized diff operation.
  - Replaced type of Items field from std::list with std::map.
  - Added new methods:
    bool isOverlapped(self &RHS)
    void add(self& RHS, SuccessorClass *S)
    void detachCase(self& NewMapping, SuccessorClass *Succ)
    void removeCase(SuccessorClass *Succ)
    SuccessorClass *findSuccessor(const IntTy& Val)
    const IntTy* getCaseSingleNumber(SuccessorClass *Succ)
IntegersSubsetTest
  - DiffTest: Added checks for successors.
SimplifyCFG
  Updated SwitchInst usage (now it is case-ranges compatible) for
    - SimplifyEqualityComparisonWithOnlyPredecessor
    - FoldValueComparisonIntoPredecessors

llvm-svn: 159527
2012-07-02 13:02:18 +00:00
Rafael Espindola a77d31d7fd Now that RegistersDefinedFromSameValue handles one instruction being an
implicit_def, the other instruction can be anything, including instructions
that define multiple values. Be careful about that and don't assume what operand
0 is.
Fixes pr13249.

llvm-svn: 159509
2012-07-01 17:08:01 +00:00
Rafael Espindola efab16d43b Handle implicit_defs in the register coalescer. I am still trying to produce
a reduced testcase, but this fixes pr13209.

llvm-svn: 159479
2012-06-30 01:45:55 +00:00
Manman Ren 6fa76dc0e0 Add SrcReg2 to analyzeCompare and optimizeCompareInstr to handle Compare
instructions with two register operands.

llvm-svn: 159465
2012-06-29 21:33:59 +00:00
Jakob Stoklund Olesen 3e3cdecf98 Clear kill flags in InstrEmitter::EmitSubregNode().
When a local virtual register is made global, make sure to clear any
existing kill flags.

llvm-svn: 159461
2012-06-29 21:00:03 +00:00
Jakob Stoklund Olesen da9ea1d6bc Check for extra kill flags on live-out virtual registers.
This would previously get reported as the misleading "Virtual register
def doesn't dominate all uses."

llvm-svn: 159460
2012-06-29 21:00:00 +00:00
Manman Ren c146589aa4 Add getUniqueVRegDef to MachineRegisterInfo.
This comes in handy during peephole optimization.

llvm-svn: 159453
2012-06-29 19:16:05 +00:00
Alexey Samsonov 6e7e6b646b Cleanup in DwarfDebug - fix a typo and remove two unused functions
llvm-svn: 159433
2012-06-29 16:04:14 +00:00
Chandler Carruth aafe0918bc Move llvm/Support/IRBuilder.h -> llvm/IRBuilder.h
This was always part of the VMCore library out of necessity -- it deals
entirely in the IR. The .cpp file in fact was already part of the VMCore
library. This is just a mechanical move.

I've tried to go through and re-apply the coding standard's preferred
header sort, but at 40-ish files, I may have gotten some wrong. Please
let me know if so.

I'll be committing the corresponding updates to Clang and Polly, and
Duncan has DragonEgg.

Thanks to Bill and Eric for giving the green light for this bit of cleanup.

llvm-svn: 159421
2012-06-29 12:38:19 +00:00
Bill Wendling f799efdedc The DIBuilder class is just a wrapper around debug info creation
(a.k.a. MDNodes). The module doesn't belong in Analysis. Move it to the VMCore
instead.

llvm-svn: 159414
2012-06-29 08:32:07 +00:00
Andrew Trick 51a8cf77b8 Revert "Make NumMicroOps a variable in the subtarget's instruction itinerary."
This reverts commit r159406. I noticed a performance regression so I'll back out for now.

llvm-svn: 159411
2012-06-29 07:10:41 +00:00
Andrew Trick 8c9e6728b3 misched: avoid scheduling instructions that can't be dispatched.
llvm-svn: 159408
2012-06-29 03:23:24 +00:00
Andrew Trick ce27bb999d misched: count micro-ops toward the issue limit.
llvm-svn: 159407
2012-06-29 03:23:22 +00:00
Andrew Trick 1f50152b2d Make NumMicroOps a variable in the subtarget's instruction itinerary.
The TargetInstrInfo::getNumMicroOps API does not change, but soon it
will be used by MachineScheduler. Now each subtarget can specify the
number of micro-ops per itinerary class. For ARM, this is currently
always dynamic (-1), because it is used for load/store multiple which
depends on the number of register operands.

Zero is now a valid number of micro-ops. This can be used for
nop pseudo-instructions or instructions that the hardware can squash
during dispatch.

llvm-svn: 159406
2012-06-29 03:23:18 +00:00
Nuno Lopes ec9653b363 add a new @llvm.donothing intrinsic that, well, does nothing, and teach CodeGen to ignore calls to it
llvm-svn: 159383
2012-06-28 22:30:12 +00:00
Jim Grosbach e0c10d8b86 'Promote' vector [su]int_to_fp should widen elements.
Teach vector legalization how to honor Promote for int to float
conversions. The code checking whether to promote the operation knew
to look at the operand, but the actual promotion code didn't. This
fixes that. The operand is promoted up via [zs]ext.

rdar://11762659
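
A scalar C++ analogy (illustrative, not the legalizer code): widening the integer element before the conversion has to respect signedness, i.e. sign extension for the signed conversion and zero extension for the unsigned one.

  #include <cassert>
  #include <cstdint>

  int main() {
    int16_t S = -3;
    uint16_t U = 0xFFFD; // same bit pattern as S

    // sint_to_fp: widen with sign extension, then convert.
    float FS = static_cast<float>(static_cast<int32_t>(S));  // -3.0f
    // uint_to_fp: widen with zero extension, then convert.
    float FU = static_cast<float>(static_cast<uint32_t>(U)); // 65533.0f

    assert(FS == -3.0f);
    assert(FU == 65533.0f);
  }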

llvm-svn: 159378
2012-06-28 21:03:44 +00:00
Bill Wendling e38859dc8e Move lib/Analysis/DebugInfo.cpp to lib/VMCore/DebugInfo.cpp and
include/llvm/Analysis/DebugInfo.h to include/llvm/DebugInfo.h.

The reasoning is because the DebugInfo module is simply an interface to the
debug info MDNodes and has nothing to do with analysis.

llvm-svn: 159312
2012-06-28 00:05:13 +00:00
Jakob Stoklund Olesen 59a0d3243b Allow targets to inject passes before the virtual register rewriter.
Such passes can be used to tweak the register assignments in a
target-dependent way, for example to avoid write-after-write
dependencies.

llvm-svn: 159209
2012-06-26 17:09:29 +00:00
Chandler Carruth 9139f44d23 Update a bunch of stale comments that dated from when this followed the
very first (and worst) placement algorithm. These should now more
accurately reflect the reality of the pass.

llvm-svn: 159185
2012-06-26 05:16:37 +00:00
Andrew Trick fb2ba3e1cb Enable the new LoopInfo algorithm by default.
The primary advantage is that loop optimizations will be applied in a
stable order. This helps debugging and unit test creation. It is also
a better overall implementation without pathologically bad performance
on deep functions.

On large functions (llvm-stress --size=200000 | opt -loops)
Before: 0.1263s
After:  0.0225s

On deep functions (after tweaking llvm-stress, thanks Nadav):
Before: 0.2281s
After:  0.0227s

See r158790 for more comments.

The loop tree is now consistently generated in forward order, but loop
passes are applied in reverse order over the program. If we have a
loop optimization that prefers forward order, that can easily be
achieved by adding a different type of LoopPassManager.

llvm-svn: 159183
2012-06-26 04:11:38 +00:00
Evan Cheng 4c6f917d34 Make sure the type is not extended or untyped before creating a constant of the type. No test case. Found by inspection.
llvm-svn: 159179
2012-06-26 01:19:33 +00:00
Jakob Stoklund Olesen a57fc12ec9 Enforce stricter liveness rules for PHIs.
Verify that all paths from the entry block to a virtual register read
pass through a def. Enable this check even when MRI->isSSA() is false.

Verify that the live range of a virtual register is live out of all
predecessor blocks, even for PHI-values.

This requires that PHIElimination sometimes inserts IMPLICIT_DEF
instructions in predecessor blocks.

llvm-svn: 159150
2012-06-25 18:18:27 +00:00
Jakob Stoklund Olesen eb49566447 Run ProcessImplicitDefs on SSA form where it can be much simpler.
Implicitly defined virtual registers can simply have the <undef> bit set
on all uses, and copies can be turned into implicit defs recursively.

Physical registers are a bit trickier. We handle the common case where a
physreg def is used by a nearby instruction in the same basic block. For
more complicated cases, just leave the IMPLICIT_DEF instruction in.

llvm-svn: 159149
2012-06-25 18:12:18 +00:00
Jakob Stoklund Olesen 70ed924e18 Teach PHIElimination to handle <undef> operands.
When a PHI use is <undef>, don't emit a copy in the predecessor block,
but insert an IMPLICIT_DEF instruction instead. This ensures that
virtual register uses are always jointly dominated by defs, even if some
of them are IMPLICIT_DEF.

llvm-svn: 159121
2012-06-25 03:36:12 +00:00
Jakob Stoklund Olesen 6b556f824d Handle <undef> operands in TwoAddressInstructionPass.
When the source register to a 2-addr instruction is undefined, there is
no need to attempt any transformations - simply replace the source
register with the destination register.

This also comes up when lowering IMPLICIT_DEF instructions - make sure
the <undef> flag is moved to the new partial register def operand:

  %vreg8<def> = INSERT_SUBREG %vreg9<undef>, %vreg0<kill>, sub_16bit
rewrite undef:
  %vreg8<def> = INSERT_SUBREG %vreg8<undef>, %vreg0<kill>, sub_16bit
convert to:
  %vreg8:sub_16bit<def,read-undef> = COPY %vreg0<kill>

llvm-svn: 159120
2012-06-25 03:27:12 +00:00
NAKAMURA Takumi 704de074b8 llvm/lib: [CMake] Add explicit dependency to intrinsics_gen.
llvm-svn: 159112
2012-06-24 13:32:01 +00:00
Pete Cooper fe212e762f DAG legalisation can now handle illegal fma vector types by scalarisation
llvm-svn: 159092
2012-06-24 00:05:44 +00:00
Jakob Stoklund Olesen 502e4c6ac4 Teach LiveVariables to handle <undef> operands.
It's simple: Don't treat <undef> operands as uses, and don't assume a
virtual register has a defining instruction unless a real use has been
seen.

llvm-svn: 159061
2012-06-23 02:23:00 +00:00
Jakob Stoklund Olesen a127fc780a Remove ProcessImplicitDefs.h which was unused.
The ProcessImplicitDefs class can be local to its implementation file.

llvm-svn: 159041
2012-06-22 22:27:36 +00:00
Jakob Stoklund Olesen b033dede17 Also verify the def index for early clobbers.
llvm-svn: 159039
2012-06-22 22:23:58 +00:00
Jakob Stoklund Olesen 4fa84ba8b9 Delete a boring statistic.
llvm-svn: 159030
2012-06-22 20:40:15 +00:00
Jakob Stoklund Olesen c61edda0ab Store live intervals in an IndexedMap.
It is both smaller and faster than DenseMap.
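
As a rough illustration (a hypothetical container, not LLVM's actual IndexedMap API): when the keys are small dense integers such as register numbers, a plain vector indexed by key needs no hashing and no per-entry bucket overhead.

  #include <cassert>
  #include <vector>

  // Minimal vector-backed map keyed by a dense unsigned index.
  template <typename T> class DenseIndexedMap {
    std::vector<T> Storage;

  public:
    T &operator[](unsigned Key) {
      if (Key >= Storage.size())
        Storage.resize(Key + 1); // grow on demand; keys are expected to be dense
      return Storage[Key];
    }
  };

  int main() {
    DenseIndexedMap<int> IntervalsByVReg;
    IntervalsByVReg[0] = 7;
    IntervalsByVReg[3] = 9;
    assert(IntervalsByVReg[0] == 7 && IntervalsByVReg[3] == 9);
  }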

llvm-svn: 159029
2012-06-22 20:37:52 +00:00
Hal Finkel 8db5547252 Revert r158679 - use case is unclear (and it increases the memory footprint).
Original commit message:
    Allow up to 64 functional units per processor itinerary.

    This patch changes the type used to hold the FU bitset from unsigned to uint64_t.
    This will be needed for some upcoming PowerPC itineraries.

llvm-svn: 159027
2012-06-22 20:27:13 +00:00
Jakob Stoklund Olesen 48828bb402 Fix a crash in --debug code.
Don't try to print out the live range of a physreg.

llvm-svn: 159021
2012-06-22 19:51:41 +00:00
Jakob Stoklund Olesen 48a1647c93 Don't depend on live ranges being present.
DBG_VALUE instructions could be referring to non-existing virtual
registers.

llvm-svn: 159020
2012-06-22 18:51:35 +00:00