llvm-project

Commit Graph

Author	SHA1	Message	Date
Vikram TV	0876d2d5cf	Implement a new pass - LiveDebugValues - to compute the set of live DEBUG_VALUEs at each basic block and insert them. Reviewed and accepted at: http://reviews.llvm.org/D11933 llvm-svn: 255096	2015-12-09 05:49:14 +00:00
Reid Kleckner	8de1fe23ed	[CGP] Reimplement r255055 a different way llvm-svn: 255070	2015-12-08 23:00:03 +00:00
Reid Kleckner	e18f92bfe9	Revert "[CGP] Check that we have an insert point before moving llvm.dbg.value around" This reverts commit r255055. Breakage has been reported. llvm-svn: 255063	2015-12-08 22:33:23 +00:00
Reid Kleckner	7c005324d5	[CGP] Check that we have an insert point before moving llvm.dbg.value around llvm-svn: 255055	2015-12-08 21:50:52 +00:00
Justin Bogner	3135ba9b38	AsmPrinter: Use emitGlobalConstantFP to emit elements of constant data It's strange to duplicate the logic for emitting FP values into emitGlobalConstantDataSequential, and it's even stranger that we end up printing the verbose assembly comments differently between the two paths. Just call into emitGlobalConstantFP rather than crudely duplicating its logic. llvm-svn: 254988	2015-12-08 02:37:48 +00:00
Sanjay Patel	dc627ad63f	fix return values to match bool return type; NFC llvm-svn: 254968	2015-12-07 23:34:30 +00:00
Elena Demikhovsky	33e61eceb4	AVX-512: Fixed masked load / store instruction selection for KNL. Patterns were missing for KNL target for <8 x i32>, <8 x float> masked load/store. This intrinsic comes with all legal types: <8 x float> @llvm.masked.load.v8f32(<8 x float>* %addr, i32 align, <8 x i1> %mask, <8 x float> %passThru), but still requires lowering, because VMASKMOVPS, VMASKMOVDQU32 work with 512-bit vectors only. All data operands should be widened to 512-bit vector. The mask operand should be widened to v16i1 with zeroes. Differential Revision: http://reviews.llvm.org/D15265 llvm-svn: 254909	2015-12-07 13:39:24 +00:00
Craig Topper	e5e035a3a8	Replace uint16_t with the MCPhysReg typedef in many places. A lot of physical register arrays already use this typedef. llvm-svn: 254843	2015-12-05 07:13:35 +00:00
Cong Hou	833fe143f5	Normalize successors' probabilities when building MBBs for jump table. llvm-svn: 254837	2015-12-05 05:00:55 +00:00
Matthias Braun	b17e8b1c1d	ScheduleDAGInstrs: Move LiveIntervals field to ScheduleDAGMI Now that ScheduleDAGInstrs doesn't need it anymore we can move the field down the class hierarchy to ScheduleDAGMI. llvm-svn: 254759	2015-12-04 19:54:24 +00:00
Rafael Espindola	71bd70cc30	Revert "[BranchFolding] Merge MMOs during tail merge" This reverts commit r254694. It broke bootstrap. llvm-svn: 254700	2015-12-04 04:15:05 +00:00
Junmo Park	c0731ca183	[BranchFolding] Merge MMOs during tail merge Summary: If we remove the MMOs from Load/Store instructions, they are treated as volatile. This makes other optimization passes unhappy. eg. Load/Store Optimization So, it looks better to merge, not remove. Reviewers: gberry, mcrosier Subscribers: gberry, llvm-commits Differential Revision: http://reviews.llvm.org/D14797 llvm-svn: 254694	2015-12-04 02:29:25 +00:00
Junmo Park	7cc13f2e58	(no commit message) llvm-svn: 254686	2015-12-04 02:06:59 +00:00
Matthias Braun	97d0ffbe06	ScheduleDAGInstrs: Rework schedule graph builder. Recommitting with a change that avoids undefined uses getting put into the VRegUses list. The new algorithm remembers the uses encountered while walking backwards until a matching def is found. Contrary to the previous version this: - Works without LiveIntervals being available - Allows to increase the precision to subregisters/lanemasks (not used for now) The changes in the AMDGPU tests are necessary because the R600 scheduler is not stable with respect to the order of nodes in the ready queues. Differential Revision: http://reviews.llvm.org/D9068 llvm-svn: 254683	2015-12-04 01:51:19 +00:00
Matthias Braun	c07cbc8d3c	raw_ostream: << operator for callables with raw_ostream argument This is a revised version of r254655 which uses a Printable wrapper class to avoid ambiguous overload problems. Differential Revision: http://reviews.llvm.org/D14348 llvm-svn: 254681	2015-12-04 01:31:59 +00:00
Evgeniy Stepanov	2bb9c5ca22	Emit function alias to data as a function symbol. CFI emits jump slots for indirect functions as a byte array constant, and declares function-typed aliases to these constants. This change fixes AsmPrinter to emit these aliases as function symbols and not data symbols. llvm-svn: 254674	2015-12-04 00:45:43 +00:00
JF Bastien	1ac69947b6	CodeGen peephole: fold redundant phys reg copies Code generation often exposes redundant physical register copies through virtual registers such as: %vreg = COPY %PHYSREG ... %PHYSREG = COPY %vreg There are cases where no intervening clobber of %PHYSREG occurs, and the later copy could therefore be removed. In some cases this further allows us to remove the initial copy. This patch contains a motivating example which comes from the x86 build of Chrome, specifically cc::ResourceProvider::UnlockForRead uses libstdc++'s implementation of hash_map. That example has two tests live at the same time, and after machine sinking LLVM has confused itself enough and thinks spilling EFLAGS is a great idea even though it's never restored and the comparison results are both live. Before this patch we have: DEC32m %RIP, 1, %noreg, <ga:@L>, %noreg, %EFLAGS<imp-def> %vreg1<def> = COPY %EFLAGS; GR64:%vreg1 %EFLAGS<def> = COPY %vreg1; GR64:%vreg1 JNE_1 <BB#1>, %EFLAGS<imp-use> Both copies are useless. This patch tries to eliminate the later copy in a generic manner. dec is especially confusing to LLVM when compared with sub. I wrote this patch to treat all physical registers generically, but only remove redundant copies of non-allocatable physical registers because the allocatable ones caused issues (e.g. when calling conventions weren't properly modeled) and should be handled later by the register allocator anyways. The following tests used to fail when the patch also replaced allocatable registers: CodeGen/X86/StackColoring.ll CodeGen/X86/avx512-calling-conv.ll CodeGen/X86/copy-propagation.ll CodeGen/X86/inline-asm-fpstack.ll CodeGen/X86/musttail-varargs.ll CodeGen/X86/pop-stack-cleanup.ll CodeGen/X86/preserve_mostcc64.ll CodeGen/X86/tailcallstack64.ll CodeGen/X86/this-return-64.ll This happens because COPY has other special meaning for e.g. dependency breakage and x87 FP stack. Note that all other backends' tests pass. Reviewers: qcolombet Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15157 llvm-svn: 254665	2015-12-03 23:43:56 +00:00
Justin Bogner	8d6f4e36e8	AsmPrinter: Simplify emitting FP elements in sequential data. NFC Use APFloat APIs here rather than manually type-punning through unions. llvm-svn: 254664	2015-12-03 23:28:35 +00:00
Matthias Braun	149b859c55	Revert "raw_ostream: << operator for callables with raw_stream argument" This commit provoked "error C2593: 'operator <<' is ambiguous" on MSVC. This reverts commit r254655. llvm-svn: 254661	2015-12-03 23:00:28 +00:00
Matthias Braun	e957a9bb1b	raw_ostream: << operator for callables with raw_stream argument This allows easier construction of print helpers. Example: Printable PrintLaneMask(unsigned LaneMask) { return Printable([LaneMask](raw_ostream &OS) { OS << format("%08X", LaneMask); }); } // Usage: OS << PrintLaneMask(Mask); Differential Revision: http://reviews.llvm.org/D14348 llvm-svn: 254655	2015-12-03 22:17:26 +00:00
Chih-Hung Hsieh	ed7d81e5d4	[X86] Part 1 to fix x86-64 fp128 calling convention. Almost all these changes are conditioned and only apply to the new x86-64 f128 type configuration, which will be enabled in a follow up patch. They are required together to make new f128 work. If there is any error, we should fix or revert them as a whole. These changes should have no impact to current configurations. * Relax type legalization checks to accept new f128 type configuration, whose TypeAction is TypeSoftenFloat, not TypeLegal, but also has TLI.isTypeLegal true. * Relax GetSoftenedFloat to return in some cases f128 type SDValue, which is TLI.isTypeLegal but not "softened" to i128 node. * Allow customized FABS, FNEG, FCOPYSIGN on new f128 type configuration, to generate optimized bitwise operators for libm functions. * Enhance related Lower* functions to handle f128 type. * Enhance DAGTypeLegalizer::run, SoftenFloatResult, and related functions to keep new f128 type in register, and convert f128 operators to library calls. * Fix Combiner, Emitter, Legalizer routines that did not handle f128 type. * Add ExpandConstant to handle i128 constants, ExpandNode to handle ISD::Constant node. * Add one more parameter to getCommonSubClass and firstCommonClass, to guarantee that returned common sub class will contain the specified simple value type. This extra parameter is used by EmitCopyFromReg in InstrEmitter.cpp. * Fix infinite loop in getTypeLegalizationCost when f128 is the value type. * Fix printOperand to handle null operand. * Enhance ISD::BITCAST node to handle f128 constant. * Expand new f128 type for BR_CC, SELECT_CC, SELECT, SETCC nodes. * Enhance X86AsmPrinter to emit f128 values in comments. Differential Revision: http://reviews.llvm.org/D15134 llvm-svn: 254653	2015-12-03 22:02:40 +00:00
Andrew Kaylor	9efb2332e2	[WinEH] Avoid infinite loop in BranchFolding for multiple single block funclets Differential Revision: http://reviews.llvm.org/D14996 llvm-svn: 254629	2015-12-03 18:55:28 +00:00
Matthias Braun	2fd672a221	Revert "ScheduleDAGInstrs: Rework schedule graph builder." This works mostly fine but breaks some stage 1 builders when compiling compiler-rt on i386. Revert for further investigation as I can't see an obvious cause/fix. This reverts commit r254577. llvm-svn: 254586	2015-12-03 03:01:10 +00:00
Matthias Braun	d35fe3d984	ScheduleDAGInstrs: Rework schedule graph builder. The new algorithm remembers the uses encountered while walking backwards until a matching def is found. Contrary to the previous version this: - Works without LiveIntervals being available - Allows to increase the precision to subregisters/lanemasks (not used for now) The changes in the AMDGPU tests are necessary because the R600 scheduler is not stable with respect to the order of nodes in the ready queues. Differential Revision: http://reviews.llvm.org/D9068 llvm-svn: 254577	2015-12-03 02:05:27 +00:00
Matthias Braun	b0083608b4	RegisterPressure: Use range based for, fix else style; NFC llvm-svn: 254575	2015-12-03 01:44:45 +00:00
David Majnemer	70497c696a	Move EH-specific helper functions to a more appropriate place No functionality change is intended. llvm-svn: 254562	2015-12-02 23:06:39 +00:00
Reid Kleckner	1f11b4e3a7	Use std::string instead of strdup() and free() in WinCodeViewLineTables llvm-svn: 254557	2015-12-02 22:34:30 +00:00
Kyle Butt	cf6a8bfe51	[CodeGen]: Fix bad interaction with AntiDep breaking and inline asm. AggressiveAntiDepBreaker was renaming registers specified by the user for inline assembly. While this will work for compiler-specified registers, it won't work for user-specified registers, and at the time this runs, I don't currently see a way to distinguish them. llvm-svn: 254532	2015-12-02 18:58:51 +00:00
Fiona Glaser	1075f6323f	Fix accidental off by one change Didn't break any tests, but did unnecessary extra work. llvm-svn: 254529	2015-12-02 18:46:23 +00:00
Fiona Glaser	e25b06fa23	Scheduler / Regalloc: use unique_ptr[] instead of std::vector vector.resize() is significantly slower than memset in many STLs and the cost of initializing these vectors is significant on targets with many registers. Since we don't need the overhead of a vector, use a simple unique_ptr instead. llvm-svn: 254526	2015-12-02 18:32:59 +00:00
Tim Northover	f520eff782	AArch64: use ldxp/stxp pair to implement 128-bit atomic loads. The ARM ARM is clear that 128-bit loads are only guaranteed to have been atomic if there has been a corresponding successful stxp. It's less clear for AArch32, so I'm leaving that alone for now. llvm-svn: 254524	2015-12-02 18:12:57 +00:00
Cong Hou	cb07d7016a	Fix a bug in IfConversion.cpp. The bug is introduced in r254377 which failed some tests on ARM, where a new probability is assigned to a successor but the provided BB may not be a successor. llvm-svn: 254463	2015-12-01 21:50:20 +00:00
Sanjay Patel	0b2a94916d	use range-based for loops; NFCI llvm-svn: 254453	2015-12-01 19:57:43 +00:00
Sanjay Patel	b53791e5a7	don't repeat function/variable names in comments; NFC llvm-svn: 254445	2015-12-01 19:32:35 +00:00
Sanjay Patel	96824deebc	fix typo; NFC llvm-svn: 254442	2015-12-01 19:19:18 +00:00
Elena Demikhovsky	0781d7b2b4	Fixed a failure in cost calculation for vector GEP Cost calculation for vector GEP failed due to an invalid cast of the GEP index operand. The bug is fixed, added a test. http://reviews.llvm.org/D14976 llvm-svn: 254408	2015-12-01 12:08:36 +00:00
Yury Gribov	d7dbb66eb8	Introduce new @llvm.get.dynamic.area.offset.i{32, 64} intrinsics. The @llvm.get.dynamic.area.offset.* intrinsic family is used to get the offset from native stack pointer to the address of the most recent dynamic alloca on the caller's stack. These intrinsics are intended for use in combination with @llvm.stacksave and @llvm.stackrestore to get a pointer to the most recent dynamic alloca. This is useful, for example, for AddressSanitizer's stack unpoisoning routines. Patch by Max Ostapenko. Differential Revision: http://reviews.llvm.org/D14983 llvm-svn: 254404	2015-12-01 11:40:55 +00:00
Cong Hou	4aef7ef881	Allow known and unknown probabilities to coexist in MBB's successor list. Previously an MBB was not allowed to have successors with both known and unknown probabilities. However, this may be too strict as at this stage we could not always guarantee that. It is better to remove this restriction now, and I will work on validating MBB's successors' probabilities first (for example, check if the sum is approximately one). llvm-svn: 254402	2015-12-01 11:05:39 +00:00
Cong Hou	d97c100dc4	Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces. (This is the second attempt to submit this patch. The first caused two assertion failures and was reverted. See https://llvm.org/bugs/show_bug.cgi?id=25687) The patch in http://reviews.llvm.org/D13745 is broken into four parts: 1. New interfaces without functional changes (http://reviews.llvm.org/D13908). 2. Use new interfaces in SelectionDAG, while in other passes treat probabilities as weights (http://reviews.llvm.org/D14361). 3. Use new interfaces in all other passes. 4. Remove old interfaces. This patch is 3+4 above. In this patch, MBB won't provide weight-based interfaces any more, which are totally replaced by probability-based ones. The interface addSuccessor() is redesigned so that the default probability is unknown. We allow unknown probabilities but don't allow using it together with known probabilities in successor list. That is to say, we either have a list of successors with all known probabilities, or all unknown probabilities. In the latter case, we assume each successor has 1/N probability where N is the number of successors. An assertion checks if the user is attempting to add a successor with the disallowed mixed use as stated above. This can help us catch many misuses. All uses of weight-based interfaces are now updated to use probability-based ones. Differential revision: http://reviews.llvm.org/D14973 llvm-svn: 254377	2015-12-01 05:29:22 +00:00
Matthias Braun	50f7f585ed	RegisterPressure: If we do not collect dead defs the list must be empty llvm-svn: 254372	2015-12-01 04:20:06 +00:00
Matthias Braun	ba6b225bf9	RegisterPressure: Remove support for recede()/advance() at MBB boundaries Nobody was checking the return value of recede()/advance() so we can simply replace this code with asserts. llvm-svn: 254371	2015-12-01 04:20:04 +00:00
Matthias Braun	f9f8b92d93	RegisterPressure: Split RegisterOperands analysis code from result object; NFC This is in preparation to expose the RegisterOperands class as RegisterPressure API. llvm-svn: 254368	2015-12-01 04:19:56 +00:00
Hans Wennborg	1dbaf67537	Revert r254348: "Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces." and the follow-up r254356: "Fix a bug in MachineBlockPlacement that may cause assertion failure during BranchProbability construction." Asserts were firing in Chromium builds. See PR25687. llvm-svn: 254366	2015-12-01 03:49:42 +00:00
Cong Hou	1ccca9e673	Fix a bug in MachineBlockPlacement that may cause assertion failure during BranchProbability construction. The root cause is the rounding behavior in BranchProbability construction. We may consider to use truncation instead in the future. llvm-svn: 254356	2015-12-01 00:55:42 +00:00
Evgeniy Stepanov	fd07995363	Extend debug info for function parameters in SDAG. SDAG currently can emit debug location for function parameters when an llvm.dbg.declare points to either a function argument SSA temp, or to an AllocaInst. This change extends this logic by adding a fallback case when neither of the above is true. This is required for SafeStack, which may copy the contents of a byval function argument into something that is not an alloca, and then describe the target as the new location of the said argument. llvm-svn: 254352	2015-12-01 00:34:30 +00:00
Cong Hou	fa1917c673	Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces. The patch in http://reviews.llvm.org/D13745 is broken into four parts: 1. New interfaces without functional changes (http://reviews.llvm.org/D13908). 2. Use new interfaces in SelectionDAG, while in other passes treat probabilities as weights (http://reviews.llvm.org/D14361). 3. Use new interfaces in all other passes. 4. Remove old interfaces. This patch is 3+4 above. In this patch, MBB won't provide weight-based interfaces any more, which are totally replaced by probability-based ones. The interface addSuccessor() is redesigned so that the default probability is unknown. We allow unknown probabilities but don't allow using it together with known probabilities in successor list. That is to say, we either have a list of successors with all known probabilities, or all unknown probabilities. In the latter case, we assume each successor has 1/N probability where N is the number of successors. An assertion checks if the user is attempting to add a successor with the disallowed mixed use as stated above. This can help us catch many misuses. All uses of weight-based interfaces are now updated to use probability-based ones. Differential revision: http://reviews.llvm.org/D14973 llvm-svn: 254348	2015-12-01 00:02:51 +00:00
Paul Robinson	a2550a6da3	Have 'optnone' respect the -fast-isel=false option. This is primarily useful for debugging optnone v. ISel issues. Differential Revision: http://reviews.llvm.org/D14792 llvm-svn: 254335	2015-11-30 21:56:16 +00:00
Craig Topper	6066164454	Use a lambda instead of std::bind and std::mem_fn I introduced in r254242. NFC llvm-svn: 254260	2015-11-29 18:05:22 +00:00
Craig Topper	d0573179dc	[SelectionDAG] Use std::any_of instead of a manually coded loop. NFC llvm-svn: 254242	2015-11-29 04:37:11 +00:00
Jonas Paulsson	f12b925bb1	[Stack realignment] Handling of aligned allocas. This patch implements dynamic realignment of stack objects for targets with a non-realigned stack pointer. Behaviour in FunctionLoweringInfo is changed so that for a target that has StackRealignable set to false, over-aligned static allocas are considered to be variable-sized objects and are handled with DYNAMIC_STACKALLOC nodes. It would be good to group aligned allocas into a single big alloca as an optimization, but this is yet todo. SystemZ benefits from this, due to its stack frame layout. New tests SystemZ/alloca-03.ll for aligned allocas, and SystemZ/alloca-04.ll for "no-realign-stack" attribute on functions. Review and help from Ulrich Weigand and Hal Finkel. llvm-svn: 254227	2015-11-28 11:02:32 +00:00
Artyom Skrobov	314ee04268	Expose isXxxConstant() functions from SelectionDAGNodes.h (NFC) Summary: Many target lowerings copy-paste the code to test SDValues for known constants. This code can instead be shared in SelectionDAG.cpp, and reused in the targets. Reviewers: MatzeB, andreadb, tstellarAMD Subscribers: arsenm, jyknight, llvm-commits Differential Revision: http://reviews.llvm.org/D14945 llvm-svn: 254085	2015-11-25 19:41:11 +00:00
Eric Christopher	4675c439aa	Fix some places where we were assuming that memory type had been legalized to a simple type when lowering a truncating store of a vector type. In this case for an EVT we'll return Expand as we should in all of the cases anyhow. The testcase triggered at the one in VectorLegalizer::LegalizeOp, inspection found the rest. llvm-svn: 254061	2015-11-25 09:11:53 +00:00
Matthias Braun	147110da84	LiveVariables should not clobber MachineOperand::IsDead, ::IsKill on reserved physical registers Patch by Nick Johnson <Nicholas.Paul.Johnson@deshawresearch.com> Differential Revision: http://reviews.llvm.org/D14875 llvm-svn: 254012	2015-11-24 20:06:56 +00:00
Cong Hou	1938f2eb98	Let SelectionDAG start to use probability-based interface to add successors. The patch in http://reviews.llvm.org/D13745 is broken into four parts: 1. New interfaces without functional changes. 2. Use new interfaces in SelectionDAG, while in other passes treat probabilities as weights. 3. Use new interfaces in all other passes. 4. Remove old interfaces. This is the second patch above. In this patch SelectionDAG starts to use probability-based interfaces in MBB to add successors but other MC passes are still using weight-based interfaces. Therefore, we need to maintain correct weight list in MBB even when probability-based interfaces are used. This is done by updating weight list in probability-based interfaces by treating the numerator of probabilities as weights. This change affects many test cases that check successor weight values. I will update those test cases once this patch looks good to you. Differential revision: http://reviews.llvm.org/D14361 llvm-svn: 253965	2015-11-24 08:51:23 +00:00
Davide Italiano	c304a0ddc1	[DIE] Make DIE.h NDEBUG conditional-free. Switch dump()/print() method definitions to LLVM_DUMP_METHOD instead. llvm-svn: 253945	2015-11-24 02:21:43 +00:00
Andrew Kaylor	d0430e8580	[WinEH] Fix problem where CodeGenPrepare incorrectly sinks a bitcast into an EH pad. Differential Revision: http://reviews.llvm.org/D14842 llvm-svn: 253902	2015-11-23 19:16:15 +00:00
Simon Pilgrim	1dfe53e180	Remove duplicate getValueType() calls. NFCI. llvm-svn: 253823	2015-11-22 16:49:38 +00:00
Krzysztof Parzyszek	6753f33388	Avoid dependency between TableGen and CodeGen Duplicate a few common definitions between DFAPacketizer.cpp and DFAPacketizerEmitter.cpp to avoid including files from CodeGen in TableGen. llvm-svn: 253820	2015-11-22 15:20:19 +00:00
Krzysztof Parzyszek	b46557292c	Hexagon V60/HVX DFA scheduler support Extended DFA tablegen to: - added "-debug-only dfa-emitter" support to llvm-tblgen - defined CVI_PIPE* resources for the V60 vector coprocessor - allow specification of multiple required resources - supports ANDs of ORs - e.g. [SLOT2, SLOT3], [CVI_MPY0, CVI_MPY1] means: (SLOT2 OR SLOT3) AND (CVI_MPY0 OR CVI_MPY1) - added support for combo resources - allows specifying ORs of ANDs - e.g. [CVI_XLSHF, CVI_MPY01] means: (CVI_XLANE AND CVI_SHIFT) OR (CVI_MPY0 AND CVI_MPY1) - increased DFA input size from 32-bit to 64-bit - allows for a maximum of 4 AND'ed terms of 16 resources - supported expressions now include: expression => term [AND term] [AND term] [AND term] term => resource [OR resource]* resource => one_resource | combo_resource combo_resource => (one_resource [AND one_resource]*) Author: Dan Palermo <dpalermo@codeaurora.org> kparzysz: Verified AMDGPU codegen to be unchanged on all llc tests, except those dealing with instruction encodings. Reapply the previous patch, this time without circular dependencies. llvm-svn: 253793	2015-11-21 20:00:45 +00:00
Krzysztof Parzyszek	4ca21fc1aa	Revert r253790: it breaks all builds for some reason. llvm-svn: 253791	2015-11-21 17:38:33 +00:00
Krzysztof Parzyszek	220a9bc018	Hexagon V60/HVX DFA scheduler support Extended DFA tablegen to: - added "-debug-only dfa-emitter" support to llvm-tblgen - defined CVI_PIPE* resources for the V60 vector coprocessor - allow specification of multiple required resources - supports ANDs of ORs - e.g. [SLOT2, SLOT3], [CVI_MPY0, CVI_MPY1] means: (SLOT2 OR SLOT3) AND (CVI_MPY0 OR CVI_MPY1) - added support for combo resources - allows specifying ORs of ANDs - e.g. [CVI_XLSHF, CVI_MPY01] means: (CVI_XLANE AND CVI_SHIFT) OR (CVI_MPY0 AND CVI_MPY1) - increased DFA input size from 32-bit to 64-bit - allows for a maximum of 4 AND'ed terms of 16 resources - supported expressions now include: expression => term [AND term] [AND term] [AND term] term => resource [OR resource]* resource => one_resource | combo_resource combo_resource => (one_resource [AND one_resource]*) Author: Dan Palermo <dpalermo@codeaurora.org> kparzysz: Verified AMDGPU codegen to be unchanged on all llc tests, except those dealing with instruction encodings. llvm-svn: 253790	2015-11-21 17:23:52 +00:00
Jonas Paulsson	8f0d2b7f1f	[DAGCombiner] Bugfix for lost chain dependency. When MergeConsecutiveStores() combines two loads and two stores into wider loads and stores, the chain users of both of the original loads must be transferred to the new load, because it may be that a chain user only depends on one of the loads. New test case: test/CodeGen/SystemZ/dag-combine-01.ll Reviewed by James Y Knight. Bugzilla: https://llvm.org/bugs/show_bug.cgi?id=25310#c6 llvm-svn: 253779	2015-11-21 13:25:07 +00:00
Geoff Berry	5256fcada0	[CodeGenPrepare] Create more extloads and fewer ands Summary: Add and instructions immediately after loads that only have their low bits used, assuming that the (and (load x) c) will be matched as a extload and the ands/truncs fed by the extload will be removed by isel. Reviewers: mcrosier, qcolombet, ab Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14584 llvm-svn: 253722	2015-11-20 22:34:39 +00:00
Arnaud A. de Grandmaison	4e89e9f846	[ShrinkWrap] Teach ShrinkWrap to handle targets requiring a register scavenger. The included test only checks for a compiler crash for now. Several people are facing this issue, so we first resolve the crash, and will increase shrinkwrap's coverage later in a follow-up patch. llvm-svn: 253718	2015-11-20 21:54:27 +00:00
Daniel Sanders	b700203c8b	Partially revert r253662: some unrelated work was accidentally committed with it. Sorry. llvm-svn: 253663	2015-11-20 13:16:35 +00:00
Daniel Sanders	be9db3c00a	Revert the revert 253497 and 253539 - These commits aren't the cause of the clang-cmake-mips failures. Sorry for the noise. llvm-svn: 253662	2015-11-20 13:13:53 +00:00
Tobias Edler von Koch	4d45090659	[LTO] Add option to emit assembly from LTOCodeGenerator This adds a new API, LTOCodeGenerator::setFileType, to choose the output file format for LTO CodeGen. A corresponding change to use this new API from llvm-lto and a test case is coming in a separate commit. Differential Revision: http://reviews.llvm.org/D14554 llvm-svn: 253622	2015-11-19 23:59:24 +00:00
Reid Kleckner	cc2f6c35a3	[WinEH] Disable most forms of demotion Now that the register allocator knows about the barriers on funclet entry and exit, testing has shown that this is unnecessary. We still demote PHIs on unsplittable blocks due to the differences between the IR CFG and the Machine CFG. llvm-svn: 253619	2015-11-19 23:23:33 +00:00
Krzysztof Parzyszek	df537b97b1	Expand subregisters in MachineFrameInfo::getPristineRegs http://reviews.llvm.org/D14719 llvm-svn: 253600	2015-11-19 21:18:52 +00:00
Sanjay Patel	4699b8ab6a	[CGP] despeculate expensive cttz/ctlz intrinsics This is another step towards allowing SimplifyCFG to speculate harder, but then have CGP clean things up if the target doesn't like it. Previous patches in this series: http://reviews.llvm.org/D12882 http://reviews.llvm.org/D13297 D13297 should catch most expensive ops, but speculation of cttz/ctlz requires special handling because of weirdness in the intrinsic definition for handling a zero input (that definition can probably be blamed on x86). For example, if we have the usual speculated-by-select expensive op pattern like this: %tobool = icmp eq i64 %A, 0 %0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 true) ; is_zero_undef == true %cond = select i1 %tobool, i64 64, i64 %0 ret i64 %cond There's an instcombine that will turn it into: %0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 false) ; is_zero_undef == false This CGP patch is looking for that case and despeculating it back into: entry: %tobool = icmp eq i64 %A, 0 br i1 %tobool, label %cond.end, label %cond.true cond.true: %0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 true) ; is_zero_undef == true br label %cond.end cond.end: %cond = phi i64 [ %0, %cond.true ], [ 64, %entry ] ret i64 %cond This unfortunately may lead to poorer codegen (see the changes in the existing x86 test), but if we increase speculation in SimplifyCFG (the next step in this patch series), then we should avoid those kinds of cases in the first place. The need for this patch was originally mentioned here: http://reviews.llvm.org/D7506 with follow-up here: http://reviews.llvm.org/D7554 Differential Revision: http://reviews.llvm.org/D14630 llvm-svn: 253573	2015-11-19 16:37:10 +00:00
Hans Wennborg	dcc2500452	X86: More efficient legalization of wide integer compares In particular, this makes the code for 64-bit compares on 32-bit targets much more efficient. Example: define i32 @test_slt(i64 %a, i64 %b) { entry: %cmp = icmp slt i64 %a, %b br i1 %cmp, label %bb1, label %bb2 bb1: ret i32 1 bb2: ret i32 2 } Before this patch: test_slt: movl 4(%esp), %eax movl 8(%esp), %ecx cmpl 12(%esp), %eax setae %al cmpl 16(%esp), %ecx setge %cl je .LBB2_2 movb %cl, %al .LBB2_2: testb %al, %al jne .LBB2_4 movl $1, %eax retl .LBB2_4: movl $2, %eax retl After this patch: test_slt: movl 4(%esp), %eax movl 8(%esp), %ecx cmpl 12(%esp), %eax sbbl 16(%esp), %ecx jge .LBB1_2 movl $1, %eax retl .LBB1_2: movl $2, %eax retl Differential Revision: http://reviews.llvm.org/D14496 llvm-svn: 253572	2015-11-19 16:35:08 +00:00
Pete Cooper	67cf9a723b	Revert "Change memcpy/memset/memmove to have dest and source alignments." This reverts commit r253511. This likely broke the bots in http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202 http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787 llvm-svn: 253543	2015-11-19 05:56:52 +00:00
Pete Cooper	72bc23ef02	Change memcpy/memset/memmove to have dest and source alignments. Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html These intrinsics currently have an explicit alignment argument which is required to be a constant integer. It represents the alignment of the source and dest, and so must be the minimum of those. This change allows source and dest to each have their own alignments by using the alignment attribute on their arguments. The alignment argument itself is removed. There are a few places in the code for which the code needs to be checked by an expert as to whether using only src/dest alignment is safe. For those places, they currently take the minimum of src/dest alignments which matches the current behaviour. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false) will now read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false) For out of tree owners, I was able to strip alignment from calls using sed by replacing: (call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\) with: $1i1 false) and similarly for memmove and memcpy. I then added back in alignment to test cases which needed it. A similar commit will be made to clang which actually has many differences in alignment as now IRBuilder can generate different source/dest alignments on calls. In IRBuilder itself, a new argument was added. Instead of calling: CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false) you now call CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false) There is a temporary class (IntegerAlignment) which takes the source alignment and rejects implicit conversion from bool. This is to prevent isVolatile here from passing its default parameter to the source alignment. Note, changes in future can now be made to codegen. I didn't change anything here, but this change should enable better memcpy code sequences. Reviewed by Hal Finkel. llvm-svn: 253511	2015-11-18 22:17:24 +00:00
Simon Pilgrim	c1a46b729b	[DAGCombiner] Vector constant folding for comparisons This patch adds support for vector constant folding of integer/float comparisons. This requires FoldConstantVectorArithmetic to support scalar constant operands (in this case ISD::CONDCODE). In future we should be able to support other scalar constant types as necessary (and possibly start calling FoldConstantVectorArithmetic for all node creations) Differential Revision: http://reviews.llvm.org/D14683 llvm-svn: 253504	2015-11-18 21:17:19 +00:00
Betul Buyukkurt	6fac1741c9	[PGO] Value profiling support This change introduces an instrumentation intrinsic instruction for value profiling purposes, the lowering of the instrumentation intrinsic and raw reader updates. The raw profile data files for llvm-profdata testing are updated. llvm-svn: 253484	2015-11-18 18:14:55 +00:00
Jonas Paulsson	af722f8287	[SelectionDAGBuilder] Make sure DemoteReg ends up in right reg-class. The virtual register containing the address for returned value on stack should in the DAG be represented with a CopyFromReg node and not a Register node. Otherwise, InstrEmitter will not make sure that it ends up in the right register class for the target instruction. SystemZ needs this, because the reg class for address registers is a subset of the general 64 bit register class. test/SystemZ/CodeGen/args-07.ll and args-04.ll updated to run with -verify-machineinstrs. Reviewed by Hal Finkel. llvm-svn: 253461	2015-11-18 14:59:00 +00:00
Rafael Espindola	449711cb36	Stop producing .data.rel sections. If a section is rw, it is irrelevant if the dynamic linker will write to it or not. It looks like llvm implemented this because gcc was doing it. It looks like gcc implemented this in the hope that it would put all the relocated items close together and speed up the dynamic linker. There are two problems with this: * It doesn't work. Both bfd and gold will map .data.rel to .data and concatenate the input sections in the order they are seen. * If we want a feature like that, it can be implemented directly in the linker since it knows where the dynamic relocations are. llvm-svn: 253436	2015-11-18 06:02:15 +00:00
Cong Hou	136bc65ec8	Remove a redundant assertion in MachineBasicBlock.cpp. NFC. llvm-svn: 253426	2015-11-18 01:55:56 +00:00
Cong Hou	11c1420173	Remove redundant code in MachineBasicBlock.cpp. NFC. llvm-svn: 253425	2015-11-18 01:45:10 +00:00
Cong Hou	41cf1a5dfb	Improving edge probabilities computation when choosing the best successor in machine block placement. When looking for the best successor from the outer loop for a block belonging to an inner loop, the edge probability computation can be improved so that edges in the inner loop are ignored. For example, suppose we are building chains for the non-loop part of the following code, and looking for B1's best successor. Assume the true body is very hot, then B3 should be the best candidate. However, because of the existence of the back edge from B1 to B0, the probability from B1 to B3 can be very small, preventing B3 from being its successor. In this patch, when computing the probability of the edge from B1 to B3, the weight on the back edge B1->B0 is ignored, so that B1->B3 will have 100% probability. if (...) do { B0; ... // some branches B1; } while(...); else B2; B3; Differential revision: http://reviews.llvm.org/D10825 llvm-svn: 253414	2015-11-18 00:52:52 +00:00
David Blaikie	6196aa06c9	Generalize ownership/passing semantics to allow dsymutil to own abbreviations via unique_ptr While still allowing CodeGen/AsmPrinter in llvm to own them using a bump ptr allocator. (might be nice to replace the pointers there with something that at least automatically calls their dtors, if that's necessary/useful, rather than having it done explicitly (I think a typed BumpPtrAllocator already does this, or maybe a unique_ptr with a custom deleter, etc)) llvm-svn: 253409	2015-11-18 00:34:10 +00:00
Reid Kleckner	c20276d0b2	[WinEH] Move WinEHFuncInfo from MachineModuleInfo to MachineFunction Summary: Now that there is a one-to-one mapping from MachineFunction to WinEHFuncInfo, we don't need to use a DenseMap to select the right WinEHFuncInfo for the current funclet. The main challenge here is that X86WinEHStatePass is an IR pass that doesn't have access to the MachineFunction. I gave it its own WinEHFuncInfo object that it uses to calculate state numbers, which it then throws away. As long as nobody creates or removes EH pads between this pass and SDAG construction, we will get the same state numbers. The other thing X86WinEHStatePass does is to mark the EH registration node. Instead of communicating which alloca was the registration through WinEHFuncInfo, I added the llvm.x86.seh.ehregnode intrinsic. This intrinsic generates no code and simply marks the alloca in use. Reviewers: JCTremoulet Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14668 llvm-svn: 253378	2015-11-17 21:10:25 +00:00
Pat Gavlin	c8ea157811	Lower statepoints with multi-def targets. Statepoint lowering currently expects that the target method of a statepoint only defines a single value. This precludes using statepoints with ABIs that return values in multiple registers (e.g. the SysV AMD64 ABI). This change adds support for lowering statepoints with multi-def targets. llvm-svn: 253339	2015-11-17 16:04:21 +00:00
Dan Gohman	7aa4abac24	Use TargetRegisterInfo for printing MachineOperand register comments Several places in AsmPrinter.cpp print comments describing MachineOperand registers using MCRegisterInfo, which uses MCOperand-oriented names. This doesn't work for targets that use virtual registers exclusively, as WebAssembly does, since virtual registers are represented and printed differently. This patch preserves what seems to be the spirit of r229978, avoiding the use of TM.getSubtargetImpl(), while still using MachineOperand-oriented printing for MachineOperands. Differential Revision: http://reviews.llvm.org/D14709 llvm-svn: 253338	2015-11-17 16:01:28 +00:00
Rafael Espindola	65e4902156	Drop prelink support. The way prelink used to work was * The compiler decides if a given section only has relocations that are known to point to the same DSO. If so, it names it .data.rel.ro.local<something>. * The static linker puts all of these together. * The prelinker program assigns addresses to each library and resolves the local relocations. There are many problems with this: * It is incompatible with address space randomization. * The information passed by the compiler is redundant. The linker knows if a given relocation is in the same DSO or not. It could sort by that if so desired. * There are newer ways of speeding up DSO (gnu hash for example). * Even if we want to implement this again in the compiler, the previous implementation is pretty broken. It talks about relocations that are "resolved by the static linker". If they are resolved, there are none left for the prelinker. What one needs to track is if an expression will require only dynamic relocations that point to the same DSO. At this point it looks like the prelinker is an historical curiosity. For example, fedora has retired it because it failed to build for two releases (http://pkgs.fedoraproject.org/cgit/prelink.git/commit/?id=eb43100a8331d91c801ee3dcdb0a0bb9babfdc1f) This patch removes support for it. That is, it stops printing the ".local" sections. llvm-svn: 253280	2015-11-17 00:51:23 +00:00
Matthias Braun	fe9d6f211f	Assume lane masks are always precise Allowing imprecise lane masks in case of more than 32 sub register lanes led to some tricky corner cases, and I need another bugfix for another one. Instead I rather declare lane masks as precise and let tablegen abort if we do not have enough bits. This does not affect any in-tree target, even AMDGPU only needs 16 lanes at the moment. If the 32 lanes turn out to be a problem in the future, then we can easily change the LaneBitmask typedef to uint64_t. Differential Revision: http://reviews.llvm.org/D14557 llvm-svn: 253279	2015-11-17 00:50:55 +00:00
Keno Fischer	b011c63d19	[DIBuilder] Make createReferenceType take size and align Summary: Since we're passing references to dbg.value as pointers, we need to have the frontend properly declare their sizes and alignments (as it already does for regular pointers) in preparation for my upcoming patch to have the verifier check that the sizes agree. Also augment the backend logic that skips actually emitting this information into DWARF such that it also handles reference types. Reviewers: aprantl, dexonsmith, dblaikie Subscribers: dblaikie, llvm-commits Differential Revision: http://reviews.llvm.org/D14275 llvm-svn: 253186	2015-11-16 07:57:32 +00:00
Akira Hatanaka	b11ef0897c	Reduce the size of MCRelaxableFragment. MCRelaxableFragment previously kept a copy of MCSubtargetInfo and MCInst to enable re-encoding the MCInst later during relaxation. A copy of MCSubtargetInfo (instead of a reference or pointer) was needed because the feature bits could be modified by the parser. This commit replaces the MCSubtargetInfo copy in MCRelaxableFragment with a constant reference to MCSubtargetInfo. The copies of MCSubtargetInfo are kept in MCContext, and the target parsers are now responsible for asking MCContext to provide a copy whenever the feature bits of MCSubtargetInfo have to be toggled. With this patch, I saw a 4% reduction in peak memory usage when I compiled verify-uselistorder.lto.bc using llc. rdar://problem/21736951 Differential Revision: http://reviews.llvm.org/D14346 llvm-svn: 253127	2015-11-14 06:35:56 +00:00
Quentin Colombet	2cdcfd23cd	[ShrinkWrapping] Disable the optimization for functions with sanitize like attribute. Even if the target supports shrink-wrapping, the prologue and epilogue must not move because a crash can happen anywhere and sanitizers need to be able to unwind from the PC of the crash. llvm-svn: 253116	2015-11-14 01:55:17 +00:00
Matthias Braun	e6edd48d69	MachineScheduler: Print initial pressure in debug dump llvm-svn: 253097	2015-11-13 22:30:31 +00:00
Matthias Braun	3b099db61d	MachineScheduler: Improve debug output for "only one node in readyset" When there is only 1 node left in the ready queue and it is picked call the reason "ONLY1" instead of "NOCAND". llvm-svn: 253096	2015-11-13 22:30:29 +00:00
James Molloy	bb1dbf530a	[SDAG] Fix expansion of BITREVERSE Richard Trieu noted that UBSan detected an overflowing shift, and the obvious fix caused a crash. What was happening was that the shiftee (1U) was indeed too small for the possible range of shifts it had to handle, but also we were using "VT.getSizeInBits()" to get the maximum type bitwidth, but we wanted "VT.getScalarSizeInBits()" to get the vector lane size instead of the entire vector size. Use an APInt for the shift and VT.getScalarSizeInBits(). llvm-svn: 253023	2015-11-13 10:02:36 +00:00
Sanjoy Das	ac9c5b1901	[ImplicitNulls] Add some clarifying comments; NFC llvm-svn: 253020	2015-11-13 08:14:00 +00:00
Tom Stellard	0967c91e0c	Revert "Remove unnecessary call to getAllocatableRegClass" This reverts commit r252565. This also includes the revert of the commit mentioned below in order to avoid breaking tests in AMDGPU: Revert "AMDGPU: Set isAllocatable = 0 on VS_32/VS_64" This reverts commit r252674. llvm-svn: 252956	2015-11-12 21:43:25 +00:00
Sanjoy Das	e8b81649cf	[ImplicitNulls] Fix wrapping by breaking up a condition, NFC llvm-svn: 252947	2015-11-12 20:51:49 +00:00
Sanjoy Das	edc394f1ed	[ImplicitNull] Extract out a HazardDetector class, NFC This will make later functional changes easier to follow. llvm-svn: 252946	2015-11-12 20:51:44 +00:00
Quentin Colombet	aeb85934b6	[ShrinkWrap] Fix a typo in a comment. llvm-svn: 252918	2015-11-12 18:16:27 +00:00
Quentin Colombet	94dc1e0d34	[ShrinkWrap] Make sure we do not mess up with EH funclet lowering. ShrinkWrapping does not understand exception handling constraints for now, so make sure we do not mess with them by aborting on functions that use EH funclets. llvm-svn: 252917	2015-11-12 18:13:42 +00:00
Andrew Kaylor	fb16a3ac9a	[WinEH] Fix problem with removing an element from a SetVector while iterating. Patch provided by Yaron Keren. (Thanks!) llvm-svn: 252913	2015-11-12 17:36:03 +00:00
James Molloy	90111f79f9	[SDAG] Introduce a new BITREVERSE node along with a corresponding LLVM intrinsic Several backends have instructions to reverse the order of bits in an integer. Conceptually matching such patterns is similar to @llvm.bswap, and it was mentioned in http://reviews.llvm.org/D14234 that it would be best if these patterns were matched in InstCombine instead of reimplemented in every different target. This patch introduces an intrinsic @llvm.bitreverse.i* that operates similarly to @llvm.bswap. For plumbing purposes there is also a new ISD node ISD::BITREVERSE, with simple expansion and promotion support. The intention is that InstCombine's BSWAP detection logic will be extended to support BITREVERSE too, and @llvm.bitreverse intrinsics emitted (if the backend supports lowering it efficiently). llvm-svn: 252878	2015-11-12 12:29:09 +00:00
Matthias Braun	b9610a6bc2	LegalizeDAG: Fix and improve FCOPYSIGN/FABS legalization - Factor out code to query and modify the sign bit of a floating-point value as an integer. This also works if none of the target's integer types is big enough to hold all bits of the floating-point value. - Legalize FABS(x) as FCOPYSIGN(x, 0.0) if FCOPYSIGN is available, otherwise perform bit manipulation on the sign bit. The previous code used "x >u 0 ? x : -x" which is incorrect for x being -0.0! It also takes 34 instructions on ARM Cortex-M4. With this patch we only require 5: vldr d0, LCPI0_0 vmov r2, r3, d0 lsrs r2, r3, #31 bfi r1, r2, #31, #1 bx lr (This could be further improved if the compiler would recognize that r2, r3 is zero). - Only lower FCOPYSIGN(x, y) = sign(x) ? -FABS(x) : FABS(x) if FABS is available otherwise perform bit manipulation on the sign bit. - Perform the sign(x) test by masking out the sign bit and comparing with 0 rather than shifting the sign bit to the highest position and testing for "<s 0". For x86 copysignl (on 80bit values) this gets us: testl $32768, %eax rather than: shlq $48, %rax sets %al testb %al, %al Differential Revision: http://reviews.llvm.org/D11172 llvm-svn: 252839	2015-11-12 01:02:47 +00:00
Reid Kleckner	b9204a584c	[WinEH] Don't forward branches across empty EH pad BBs For really simple SEH catchpads, we tried to forward the invoke unwind edge across the empty block. llvm-svn: 252822	2015-11-11 23:09:31 +00:00
Geoff Berry	2ddfc5e60f	[DAGCombiner] Improve zextload optimization. Summary: Don't fold (zext (and (load x), cst)) -> (and (zextload x), (zext cst)) if (and (load x) cst) will match as a zextload already and has additional users. For example, the following IR: %load = load i32, i32* %ptr, align 8 %load16 = and i32 %load, 65535 %load64 = zext i32 %load16 to i64 store i32 %load16, i32* %dst1, align 4 store i64 %load64, i64* %dst2, align 8 used to produce the following aarch64 code: ldr w8, [x0] and w9, w8, #0xffff and x8, x8, #0xffff str w9, [x1] str x8, [x2] but with this change produces the following aarch64 code: ldrh w8, [x0] str w8, [x1] str x8, [x2] Reviewers: resistor, mcrosier Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14340 llvm-svn: 252789	2015-11-11 19:42:52 +00:00
Matt Arsenault	d8fed1b793	Add target preference for GatherAllAliases max depth llvm-svn: 252775	2015-11-11 18:44:33 +00:00
Dehao Chen	54511353e3	clang-format lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp llvm-svn: 252769	2015-11-11 18:09:47 +00:00
Dehao Chen	72fdf444b7	Emit discriminator for inlined callsites. Summary: Inlined callsites need to be emitted in debug info so that sample profile can be annotated to the correct inlined instance. Reviewers: dnovillo, dblaikie Subscribers: dblaikie, llvm-commits Differential Revision: http://reviews.llvm.org/D14511 llvm-svn: 252768	2015-11-11 18:08:18 +00:00
Matthias Braun	2c98d0f477	MachineInstr: addRegisterDefReadUndef() => setRegisterDefReadUndef() This way we can not only add but also remove read undef flags. llvm-svn: 252678	2015-11-11 00:41:58 +00:00
Matthias Braun	4353b30542	TableGen: Emit LaneMask for register classes without subregisters as ~0u This makes it slightly easier to handle classes with and without subregister uniformly. llvm-svn: 252671	2015-11-10 23:23:05 +00:00
Sanjay Patel	33ec5dbe35	less indent; NFCI llvm-svn: 252643	2015-11-10 20:09:02 +00:00
Matt Arsenault	aa118e299c	LegalizeDAG: Implement promote for scalar_to_vector This allows avoiding the default Expand behavior which introduces stack usage. Bitcast the scalar and replace the missing elements with undef. This is covered by existing tests and used by a future commit which makes 64-bit vectors legal types on AMDGPU. llvm-svn: 252632	2015-11-10 18:48:11 +00:00
Matt Arsenault	a46aa641f2	LegalizeDAG: Implement promote for insert_vector_elt This is covered by existing tests and used by a future commit which makes 64-bit vectors legal types on AMDGPU. llvm-svn: 252631	2015-11-10 18:48:08 +00:00
Matt Arsenault	0b7958a59b	LegalizeDAG: Implement promote for extract_vector_elt This is for AMDGPU to implement v2i64 extract as extract of half of a v4i32. This is covered by existing tests and used by a future commit which makes 64-bit vectors legal types on AMDGPU. llvm-svn: 252630	2015-11-10 18:48:04 +00:00
Sanjay Patel	766589efdc	add 'MustReduceDepth' as an objective/cost-metric for the MachineCombiner This is one of the problems noted in PR25016: https://llvm.org/bugs/show_bug.cgi?id=25016 and: http://lists.llvm.org/pipermail/llvm-dev/2015-October/090998.html The spilling problem is independent and not addressed by this patch. The MachineCombiner was doing reassociations that don't improve or even worsen the critical path. This is caused by inclusion of the "slack" factor when calculating the critical path of the original code sequence. If we don't add that, then we have a more conservative cost comparison of the old code sequence vs. a new sequence. The more liberal calculation must be preserved, however, for the AArch64 MULADD patterns because benchmark regressions were observed without that. The two failing test cases now have identical asm that does what we want: a + b + c + d ---> (a + b) + (c + d) Differential Revision: http://reviews.llvm.org/D13417 llvm-svn: 252616	2015-11-10 16:48:53 +00:00
Andy Ayers	809cbe9ea0	Support for emitting inline stack probes For CoreCLR on Windows, stack probes must be emitted as inline sequences that probe successive stack pages between the current stack limit and the desired new stack pointer location. This implements support for the inline expansion on x64. For in-body alloca probes, expansion is done during instruction lowering. For prolog probes, a stub call is initially emitted during prolog creation, and expanded after epilog generation, to avoid complications that arise when introducing new machine basic blocks during prolog and epilog creation. Added a new test case, modified an existing one to exclude non-x64 coreclr (for now). Add test case Fix tests llvm-svn: 252578	2015-11-10 01:50:49 +00:00
Matt Arsenault	6d87f28afd	Remove unnecessary call to getAllocatableRegClass I'm not sure what the point of this was. I'm not sure why you would ever define an instruction that produces an unallocatable register class. No tests fail with this removed, and it seems like it should be a verifier error to define such an instruction. This was problematic for AMDGPU because it would make bad decisions by arbitrarily changing the register class when unsetting isAllocatable for VS_32/VS_64, which is currently set as a workaround to this problem. AMDGPU uses the VS_32/VS_64 register classes to represent operands which can use either VGPRs or SGPRs. When isAllocatable is unset for these, this would need to pick either the SGPR or VGPR class and insert either a copy we don't want, or an illegal copy we would need to deal with later. A semi-arbitrary register class ordering decision is made in tablegen, which resulted in always picking a VGPR class because it happens to have more registers than the SGPR register class. We really just want to use whatever register class the original register had. llvm-svn: 252565	2015-11-10 00:30:14 +00:00
Matthias Braun	7e624d5f11	MachineVerifier: Streamline live interval related error reporting Simply perform additional report_context() calls after a report() instead of adding more and more overloaded variations of report(). Also improve several instances where information was output in an ad-hoc way probably because no matching report() overload was available. llvm-svn: 252552	2015-11-09 23:59:33 +00:00
Matthias Braun	716b43306b	MachineVerifier: Add missing linebreak MachineInstr::print() with SkipOpers==true does not produce a linebreak, so we have to do that in MachineVerifier::report(). llvm-svn: 252551	2015-11-09 23:59:29 +00:00
Matthias Braun	45718db0a1	MachineVerifier: MI::print has no TargetMachine overload The code was passing a target machine pointer which degraded to a true operand to SkipOpers. llvm-svn: 252550	2015-11-09 23:59:25 +00:00
Matthias Braun	42b4b63056	MachineVerifier: print list of live intervals if available llvm-svn: 252549	2015-11-09 23:59:23 +00:00
Sanjay Patel	533c10c651	add a SelectionDAG method to check if no common bits are set in two nodes; NFCI This was suggested in: http://reviews.llvm.org/D13956 and is a follow-on to: http://reviews.llvm.org/rL252515 http://reviews.llvm.org/rL252519 This lets us remove logically equivalent/duplicated code from DAGCombiner and X86ISelDAGToDAG. A corresponding function for IR instructions already exists in ValueTracking. llvm-svn: 252539	2015-11-09 23:31:38 +00:00
David Majnemer	2652b75700	[WinEH] Don't emit CATCHRET from visitCatchPad Instead, emit a CATCHPAD node which will get selected to a target specific sequence. llvm-svn: 252528	2015-11-09 23:07:48 +00:00
Reid Kleckner	64b003f05d	[WinEH] Tweak funclet prologue/epilogue insertion to pass verifier For some reason we'd never run MachineVerifier on WinEH code, and you explicitly have to ask for it with llc. I added it to a few test cases to get some coverage. Fixes PR25461. llvm-svn: 252512	2015-11-09 21:04:00 +00:00
Andrew Kaylor	fdd48fa1e1	[WinEH] Re-committing r252249 (Clone funclets with multiple parents) with additional fixes for determinism problems Differential Revision: http://reviews.llvm.org/D14454 llvm-svn: 252508	2015-11-09 19:59:02 +00:00
Oliver Stannard	563585789c	[CodeGen] Always promote f16 if not legal We don't currently have any runtime library functions for operations on f16 values (other than conversions to and from f32 and f64), so we should always promote it to f32, even if that is not a legal type. In that case, the f32 values would be softened to f32 library calls. SoftenFloatRes_FP_EXTEND now needs to check the promoted operand's type, as it may be a no-op or require a different library call. getCopyFromParts and getCopyToParts now need to cope with a floating-point value stored in a larger integer part, as is the case for any target that needs to store an f16 value in a 32-bit integer register. Differential Revision: http://reviews.llvm.org/D12856 llvm-svn: 252459	2015-11-09 11:03:18 +00:00
Yaron Keren	9ffee46d45	Erase unused FunctionDIs variables after r252219. llvm-svn: 252401	2015-11-07 10:21:25 +00:00
Joseph Tremoulet	f748c8937e	[WinEH] Update exception pointer registers Summary: The CLR's personality routine passes these in rdx/edx, not rax/eax. Make getExceptionPointerRegister a virtual method parameterized by personality function to allow making this distinction. Similarly make getExceptionSelectorRegister a virtual method parameterized by personality function, for symmetry. Reviewers: pgavlin, majnemer, rnk Subscribers: jyknight, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D14344 llvm-svn: 252383	2015-11-07 01:11:31 +00:00
Tom Stellard	05691a678e	DAGCombiner: Check shouldReduceLoadWidth before combining (and (load), x) -> extload Reviewers: resistor, arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13805 llvm-svn: 252349	2015-11-06 21:58:37 +00:00
Quentin Colombet	9a8efc08d3	[ShrinkWrapping] Teach shrink-wrapping how to analyze RegMask. Previously we were conservatively assuming that RegMask operands clobber callee saved registers. llvm-svn: 252341	2015-11-06 21:00:13 +00:00
Matthias Braun	9198c671e8	MachineScheduler: Add regpressure information to debug dump llvm-svn: 252340	2015-11-06 20:59:02 +00:00
Reid Kleckner	b8fd162fc5	[WinEH] Mark funclet entries and exits as clobbering all registers Summary: In this implementation, LiveIntervalAnalysis invents a few register masks on basic block boundaries that preserve no registers. The nice thing about this is that it prevents the prologue inserter from thinking it needs to spill all XMM CSRs, because it doesn't see any explicit physreg defs in the MI. Reviewers: MatzeB, qcolombet, JosephTremoulet, majnemer Subscribers: MatzeB, llvm-commits Differential Revision: http://reviews.llvm.org/D14407 llvm-svn: 252318	2015-11-06 17:06:38 +00:00
NAKAMURA Takumi	9947cacebf	Revert r252249 (and r252255, r252258), "[WinEH] Clone funclets with multiple parents" It behaved flaky due to iterating pointer key values on std::set and std::map. llvm-svn: 252279	2015-11-06 10:07:33 +00:00
Reid Kleckner	e535c1f856	Range-for some LiveIntervals code under review llvm-svn: 252267	2015-11-06 02:01:02 +00:00
Andrew Kaylor	f477585a2b	Fix build warnings llvm-svn: 252255	2015-11-06 01:08:35 +00:00
Andrew Kaylor	29cd576554	[WinEH] Clone funclets with multiple parents Windows EH funclets need to always return to a single parent funclet. However, it is possible for earlier optimizations to combine funclets (probably based on one funclet having an unreachable terminator) in such a way that this condition is violated. These changes add code to the WinEHPrepare pass to detect situations where a funclet has multiple parents and clone such funclets, fixing up the unwind and catch return edges so that each copy of the funclet returns to the correct parent funclet. Differential Revision: http://reviews.llvm.org/D13274?id=39098 llvm-svn: 252249	2015-11-06 00:20:50 +00:00
Peter Collingbourne	d4bff30370	DI: Reverse direction of subprogram -> function edge. Previously, subprograms contained a metadata reference to the function they described. Because most clients need to get or set a subprogram for a given function rather than the other way around, this created unneeded inefficiency. For example, many passes needed to call the function llvm::makeSubprogramMap() to build a mapping from functions to subprograms, and the IR linker needed to fix up function references in a way that caused quadratic complexity in the IR linking phase of LTO. This change reverses the direction of the edge by storing the subprogram as function-level metadata and removing DISubprogram's function field. Since this is an IR change, a bitcode upgrade has been provided. Fixes PR23367. An upgrade script for textual IR for out-of-tree clients is attached to the PR. Differential Revision: http://reviews.llvm.org/D14265 llvm-svn: 252219	2015-11-05 22:03:56 +00:00
Reid Kleckner	6ddae31045	[WinEH] Fix funclet prologues with stack realignment We already had a test for this for 32-bit SEH catchpads, but those don't actually create funclets. We had a bug that only appeared in funclet prologues, where we would establish EBP and ESI as our FP and BP, and then downstream prologue code would overwrite them. While I was at it, I fixed Win64+funclets+stackrealign. This issue doesn't come up as often there due to the ABI requiring 16 byte stack alignment, but now we can rest easy that AVX and WinEH will work well together =P. llvm-svn: 252210	2015-11-05 21:09:49 +00:00
Sanjay Patel	387e66e79f	replace MachineCombinerPattern namespace and enum with enum class; NFCI Also, remove an enum hack where enum values were used as indexes into an array. We may want to make this a real class to allow pattern-based queries/customization (D13417). llvm-svn: 252196	2015-11-05 19:34:57 +00:00
Eugene Zelenko	ffec81ca00	Fix some Clang-tidy modernize warnings, other minor fixes. Fixed warnings are: modernize-use-override, modernize-use-nullptr and modernize-redundant-void-arg. Differential revision: http://reviews.llvm.org/D14312 llvm-svn: 252087	2015-11-04 22:32:32 +00:00
Cong Hou	23a3bf0147	Add new interfaces to MBB for manipulating successors with probabilities instead of weights. NFC. This is part-1 of the patch that replaces all edge weights in MBB by probabilities, which only adds new interfaces. No functional changes. Differential revision: http://reviews.llvm.org/D13908 llvm-svn: 252083	2015-11-04 21:37:58 +00:00
Igor Laevsky	35fe692025	[StatepointLowering] Remove distinction between call and invoke safepoints There is no point in having invoke safepoints handled differently than the call safepoints. All relevant decisions could be made by looking at whether or not gc.result and gc.relocate lie in the same basic block. This change will allow lowering call safepoints with relocates and results in different basic blocks. See test case for example. Differential Revision: http://reviews.llvm.org/D14158 llvm-svn: 252028	2015-11-04 01:16:10 +00:00
Peter Collingbourne	94d778697a	CodeGen, Target: Move Mach-O-specific symbol name logic to Mach-O lowering. A profile of an LTO link of Chrome revealed that we were spending some ~30-50% of execution time in the function Constant::getRelocationInfo(), which is called from TargetLoweringObjectFile::getKindForGlobal() and in turn from TargetMachine::getNameWithPrefix(). It turns out that we only need the result of getKindForGlobal() when targeting Mach-O, so this change moves the relevant part of the logic to TargetLoweringObjectFileMachO. NFCI. Differential Revision: http://reviews.llvm.org/D14168 llvm-svn: 252014	2015-11-03 23:40:03 +00:00
Simon Pilgrim	191ac7c679	[SelectionDAG] Use existing constant nodes instead of recreating them. NFC. llvm-svn: 251990	2015-11-03 22:21:38 +00:00
Rafael Espindola	43e2e251ea	Delete dead code. llvm-svn: 251960	2015-11-03 18:55:58 +00:00
Igor Laevsky	f637b4a52e	[CodegenPrepare] Do not rematerialize gc.relocates across different basic blocks Differential Revision: http://reviews.llvm.org/D14258 llvm-svn: 251957	2015-11-03 18:37:40 +00:00
Michael Kuperstein	73dc85293f	[X86] Generate .cfi_adjust_cfa_offset correctly when pushing arguments When push instructions are being used to pass function arguments on the stack, and either EH or debugging are enabled, we need to generate .cfi_adjust_cfa_offset directives appropriately. For (synch) EH, it is enough for the CFA offset to be correct at every call site, while for debugging we want to be correct after every push. Darwin does not support this well, so don't use pushes whenever it would be required. Differential Revision: http://reviews.llvm.org/D13767 llvm-svn: 251904	2015-11-03 08:17:25 +00:00
Matthias Braun	6f4ed269b9	RegisterPressure: Improve assert message llvm-svn: 251885	2015-11-03 01:53:36 +00:00
Matthias Braun	11859b5c8f	RegisterPressure: Slightly nicer pressure diff dumping llvm-svn: 251884	2015-11-03 01:53:33 +00:00
Matthias Braun	93563e7032	ScheduleDAGInstrs: Remove IsPostRA flag; NFC ScheduleDAGInstrs doesn't behave differently before or after register allocation. It was only used in a method of MachineSchedulerBase which behaved differently in MachineScheduler/PostMachineScheduler. Change this to let MachineScheduler/PostMachineScheduler just pass in a parameter to that function. The order of the LiveIntervals* and bool RemoveKillFlags parameters have been switched to make out-of-tree code fail instead of unintentionally passing a value intended for the IsPostRA flag to the (previously following and default initialized) RemoveKillFlags. Differential Revision: http://reviews.llvm.org/D14245 llvm-svn: 251883	2015-11-03 01:53:29 +00:00
Sanjay Patel	0ed9aeaa5f	[CGP] widen switch condition and case constants to target's register width (2nd try) This is a redo of r251849 except the tests have been split into arch-specific folders to hopefully make the bots happy. This is a follow-up from the discussion in D12965. The block-at-a-time limitation of SelectionDAG also came up in D13297. Without the InstCombine change from D12965, I don't expect this patch to make any difference in the real world because InstCombine does not shrink cases like this in visitSwitchInst(). But we need to have this CGP safety harness in place before proceeding with any shrinkage in D12965, so we won't generate extra extends for compares. I've opted for IR regression tests in the patch because that seems like a clearer way to test the transform, but PowerPC CodeGen for an i16 widening test is shown below. x86 will need more work to solve: https://llvm.org/bugs/show_bug.cgi?id=22473 Before: BB#0: mr 4, 3 extsh. 3, 4 ble 0, .LBB0_5 BB#1: cmpwi 3, 99 bgt 0, .LBB0_9 BB#2: rlwinm 4, 4, 0, 16, 31 <--- 32-bit mask/extend li 3, 0 cmplwi 4, 1 beqlr 0 BB#3: cmplwi 4, 10 bne 0, .LBB0_12 BB#4: li 3, 1 blr .LBB0_5: rlwinm 3, 4, 0, 16, 31 <--- 32-bit mask/extend cmplwi 3, 65436 beq 0, .LBB0_13 BB#6: cmplwi 3, 65526 beq 0, .LBB0_15 BB#7: cmplwi 3, 65535 bne 0, .LBB0_12 BB#8: li 3, 4 blr .LBB0_9: rlwinm 3, 4, 0, 16, 31 <--- 32-bit mask/extend cmplwi 3, 100 beq 0, .LBB0_14 ... After: BB#0: rlwinm 4, 3, 0, 16, 31 <--- mask/extend to 32-bit and then use that for comparisons cmpwi 4, 999 ble 0, .LBB0_5 BB#1: lis 3, 0 ori 3, 3, 65525 cmpw 4, 3 bgt 0, .LBB0_9 BB#2: cmplwi 4, 1000 beq 0, .LBB0_14 BB#3: cmplwi 4, 65436 bne 0, .LBB0_13 BB#4: li 3, 6 blr .LBB0_5: li 3, 0 cmplwi 4, 1 beqlr 0 BB#6: cmplwi 4, 10 beq 0, .LBB0_12 BB#7: cmplwi 4, 100 bne 0, .LBB0_13 BB#8: li 3, 2 blr .LBB0_9: cmplwi 4, 65526 beq 0, .LBB0_15 BB#10: cmplwi 4, 65535 bne 0, .LBB0_13 ... Differential Revision: http://reviews.llvm.org/D13532 llvm-svn: 251857	2015-11-02 23:22:49 +00:00
Sanjay Patel	dfc825eb36	revert r251849; need to move tests to arch-specific folders llvm-svn: 251851	2015-11-02 23:05:20 +00:00
Sanjay Patel	b90a078de9	[CGP] widen switch condition and case constants to target's register width This is a follow-up from the discussion in D12965. The block-at-a-time limitation of SelectionDAG also came up in D13297. Without the InstCombine change from D12965, I don't expect this patch to make any difference in the real world because InstCombine does not shrink cases like this in visitSwitchInst(). But we need to have this CGP safety harness in place before proceeding with any shrinkage in D12965, so we won't generate extra extends for compares. I've opted for IR regression tests in the patch because that seems like a clearer way to test the transform, but PowerPC CodeGen for an i16 widening test is shown below. x86 will need more work to solve: https://llvm.org/bugs/show_bug.cgi?id=22473 Before: BB#0: mr 4, 3 extsh. 3, 4 ble 0, .LBB0_5 BB#1: cmpwi 3, 99 bgt 0, .LBB0_9 BB#2: rlwinm 4, 4, 0, 16, 31 <--- 32-bit mask/extend li 3, 0 cmplwi 4, 1 beqlr 0 BB#3: cmplwi 4, 10 bne 0, .LBB0_12 BB#4: li 3, 1 blr .LBB0_5: rlwinm 3, 4, 0, 16, 31 <--- 32-bit mask/extend cmplwi 3, 65436 beq 0, .LBB0_13 BB#6: cmplwi 3, 65526 beq 0, .LBB0_15 BB#7: cmplwi 3, 65535 bne 0, .LBB0_12 BB#8: li 3, 4 blr .LBB0_9: rlwinm 3, 4, 0, 16, 31 <--- 32-bit mask/extend cmplwi 3, 100 beq 0, .LBB0_14 ... After: BB#0: rlwinm 4, 3, 0, 16, 31 <--- mask/extend to 32-bit and then use that for comparisons cmpwi 4, 999 ble 0, .LBB0_5 BB#1: lis 3, 0 ori 3, 3, 65525 cmpw 4, 3 bgt 0, .LBB0_9 BB#2: cmplwi 4, 1000 beq 0, .LBB0_14 BB#3: cmplwi 4, 65436 bne 0, .LBB0_13 BB#4: li 3, 6 blr .LBB0_5: li 3, 0 cmplwi 4, 1 beqlr 0 BB#6: cmplwi 4, 10 beq 0, .LBB0_12 BB#7: cmplwi 4, 100 bne 0, .LBB0_13 BB#8: li 3, 2 blr .LBB0_9: cmplwi 4, 65526 beq 0, .LBB0_15 BB#10: cmplwi 4, 65535 bne 0, .LBB0_13 ... Differential Revision: http://reviews.llvm.org/D13532 llvm-svn: 251849	2015-11-02 22:46:24 +00:00
Cong Hou	b90b9e0531	In MachineBlockPlacement, filter cold blocks off the loop chain when profile data is available. In the current BB placement algorithm, a loop chain always contains all loop blocks. This has a drawback that cold blocks in the loop may be inserted on a hot function path, hence increasing branch cost and also reducing icache locality. Consider a simple example shown below: A \| B⇆C \| D When B->C is quite cold, the best BB-layout should be A,B,D,C. But the current implementation produces A,C,B,D. This patch filters those cold blocks off from the loop chain by comparing the ratio: LoopBBFreq / LoopFreq to 20%: if it is less than 20%, we don't include this BB to the loop chain. Here LoopFreq is the frequency of the loop when we reduce the loop into a single node. In general we have more cold blocks when the loop has few iterations. And vice versa. Differential revision: http://reviews.llvm.org/D11662 llvm-svn: 251833	2015-11-02 21:24:00 +00:00
James Y Knight	646c4032e7	Fix two issues in MergeConsecutiveStores: 1) PR25154. This is basically a repeat of PR18102, which was fixed in r200201, and broken again by r234430. The latter changed which of the store nodes was merged into from the first to the last. Thus, we now also need to prefer merging a later store at a given address into the target node, instead of an earlier one. 2) While investigating that, I also realized I'd introduced a bug in r236850. There, I removed a check for alignment -- not realizing that nothing except the alignment check was ensuring that none of the stores were overlapping! This is a really bogus way to ensure there's no aliased stores. A better solution to both of these issues is likely to always use the code added in the 'if (UseAA)' branches which rearrange the chain based on a more principled analysis. I'll look into whether that can be used always, but in the interest of getting things back to working, I think a minimal change makes sense. llvm-svn: 251816	2015-11-02 18:48:08 +00:00
Jonas Paulsson	72640f1c9f	[MachineVerifier] Analyze MachineMemOperands for mem-to-mem moves. Since the verifier will give false reports if it incorrectly thinks MI is loading or storing using an FI, it is necessary to scan memoperands and find out how the FI is used in the instruction. This should be relatively rare. Needed to make CodeGen/SystemZ/spill-01.ll pass, which now runs with this flag. Reviewed by Quentin Colombet. llvm-svn: 251620	2015-10-29 08:28:35 +00:00
Matthias Braun	f2f194455f	Revert "ScheduleDAGInstrs: Remove IsPostRA flag" It broke 3 arm testcases. This reverts commit r251608. llvm-svn: 251615	2015-10-29 05:06:41 +00:00
Matthias Braun	dc7580aa88	MachineScheduler: Fix typo in debug message Maybe I just missed the humor there ;-) llvm-svn: 251609	2015-10-29 03:57:28 +00:00
Matthias Braun	7ffadd0087	ScheduleDAGInstrs: Remove IsPostRA flag This was a layering violation in ScheduleDAGInstrs (and MachineSchedulerBase): neither should know directly whether it is used by the PostMachineScheduler or the MachineScheduler. llvm-svn: 251608	2015-10-29 03:57:24 +00:00
Matthias Braun	b0c437bc76	MachineScheduler: Use ranged for and slightly simplify the code llvm-svn: 251607	2015-10-29 03:57:17 +00:00
Tim Northover	2d4d161519	ARM: support .watchos_version_min and .tvos_version_min. These MachO file directives are used by linkers and other tools to provide compatibility information, much like the existing .ios_version_min and .macosx_version_min. llvm-svn: 251569	2015-10-28 22:36:05 +00:00
Sanjoy Das	1d1929aace	[ValueTracking] Use !range metadata more aggressively in KnownBits Summary: Teach `computeKnownBitsFromRangeMetadata` to use `!range` metadata more aggressively. Reviewers: majnemer, nlewycky, jingyue Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14100 llvm-svn: 251487	2015-10-28 03:20:15 +00:00
Sanjoy Das	4ff3cf6d92	[SelectionDAG] Don't inspect !range metadata for extended loads Summary: Don't call `computeKnownBitsFromRangeMetadata` for extended loads -- this can cause a mismatch between the width of the !range metadata and the width of the APInt's accumulating `KnownZero` (and `KnownOne` in the future). This isn't a problem now, but will be after a future change. Note: this can be made more aggressive in the future. Reviewers: nlewycky Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14107 llvm-svn: 251486	2015-10-28 03:20:10 +00:00
James Y Knight	14eedd189b	Make the SelectionDAG graph printer use SDNode::PersistentId labels. r248010 changed the -debug output to use short ids, but did not similarly modify the graph printer. Change to be consistent, for ease of cross-reference. llvm-svn: 251465	2015-10-27 23:09:03 +00:00
Sanjay Patel	bbd4c79c8f	Use the 'arcp' fast-math-flag when combining repeated FP divisors This is a usage of the IR-level fast-math-flags now that they are propagated to SDNodes. This was originally part of D8900. Removing the global 'enable-unsafe-fp-math' checks will require auto-upgrade and possibly other changes. Differential Revision: http://reviews.llvm.org/D9708 llvm-svn: 251450	2015-10-27 20:27:25 +00:00
Cong Hou	07eeb8001e	Create a new interface addSuccessorWithoutWeight(MBB) in MBB to add successors when optimization is disabled. When optimization is disabled, edge weights that are stored in MBB won't be used, so we don't have to store them. Currently, this is done by adding successors with default weight 0, and if all successors have default weights, the weight list will be empty. But an empty weight list doesn't necessarily mean optimization is disabled (as is stated several times in MachineBasicBlock.cpp): it may also mean all successors just have default weights. We should discourage using default weights when adding successors, because it is very easy for users to forget to update the correct edge weights instead of using default ones (one exception is that the MBB only has one successor). In order to detect such usages, it is better to differentiate using default weights from the case when optimization is disabled. In this patch, a new interface addSuccessorWithoutWeight(MBB) is created for when optimization is disabled. In this case, MBB will try to maintain an empty weight list, but it cannot guarantee this, because many uses of addSuccessor() do not check whether optimization is disabled. But it can guarantee that if optimization is enabled, then the weight list always has the same size as the successor list. Differential revision: http://reviews.llvm.org/D13963 llvm-svn: 251429	2015-10-27 17:59:36 +00:00
Mehdi Amini	891c0973df	Do not use "else" when both branches return (NFC) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 251398	2015-10-27 08:12:08 +00:00
Steve King	fee370be72	Fix llc crash processing S/UREM for -Oz builds caused by rL250825. When taking the remainder of a value divided by a constant, visitREM() attempts to convert the REM to a longer but faster sequence of instructions. This conversion calls combine() on a speculative DIV instruction. Commit rL250825 may cause this combine() to return a DIVREM, corrupting nearby nodes. Flow eventually hits unreachable(). This patch adds a test case and a check to prevent visitREM() from trying to convert the REM instruction in cases where a DIVREM is possible. See http://reviews.llvm.org/D14035 llvm-svn: 251373	2015-10-27 00:14:06 +00:00
Ivan Krasin	465fbe25c4	Fix indents. It's a follow up to r251353. llvm-svn: 251364	2015-10-26 22:35:40 +00:00
Ivan Krasin	298639a5fd	Move imported entities into DwarfCompilationUnit to speed up LTO linking. Summary: In particular, this CL speeds up the official Chrome linking with LTO by 1.8x. See more details in https://crbug.com/542426 Reviewers: dblaikie Subscribers: jevinskie Differential Revision: http://reviews.llvm.org/D13918 llvm-svn: 251353	2015-10-26 21:36:35 +00:00
David Blaikie	7b54b525cd	Remove assert(false) in favor of asserting the if conditional it is contained within. Also adjust the code to avoid 3 redundant map lookups. llvm-svn: 251327	2015-10-26 18:41:13 +00:00
Evgeniy Stepanov	d1aad26589	[safestack] Fast access to the unsafe stack pointer on AArch64/Android. Android libc provides a fixed TLS slot for the unsafe stack pointer, and this change implements direct access to that slot on AArch64 via __builtin_thread_pointer() + offset. This change also moves more code into TargetLowering and its target-specific subclasses to get rid of target-specific codegen in SafeStackPass. This change does not touch the ARM backend because ARM lowers __builtin_thread_pointer as aeabi_read_tp, which is not available on Android. The previous iteration of this change was reverted in r250461. This version leaves the generic, compiler-rt based implementation in SafeStack.cpp instead of moving it to TargetLoweringBase in order to allow testing without a TargetMachine. llvm-svn: 251324	2015-10-26 18:28:25 +00:00
Elena Demikhovsky	092858588a	Scalarizer for masked.gather and masked.scatter intrinsics. When the target does not support these intrinsics they should be converted to a chain of scalar load or store operations. If the mask is not constant, the scalarizer will build a chain of conditional basic blocks. I added isLegalMaskedGather() isLegalMaskedScatter() APIs. Differential Revision: http://reviews.llvm.org/D13722 llvm-svn: 251237	2015-10-25 15:37:55 +00:00
Michael Kuperstein	eaa16005af	[X86] Use correct calling convention for MCU psABI libcalls When using the MCU psABI, compiler-generated library calls should pass some parameters in-register. However, since inreg marking for x86 is currently done by the front end, it will not be applied to backend-generated calls. This is a workaround for PR3997, which describes a similar issue for -mregparm. Differential Revision: http://reviews.llvm.org/D13977 llvm-svn: 251223	2015-10-25 08:14:05 +00:00
Rafael Espindola	84921b9860	Refactor: Simplify boolean conditional return statements in lib/CodeGen. Patch by Richard. llvm-svn: 251213	2015-10-24 23:11:13 +00:00
Simon Pilgrim	3448cbcc51	[DAGCombiner] Tidy up ConstantFP commutation. NFCI Move ConstantFP canonicalization of commutative instructions to start of 2-op node creation (matches integer) - simplifies constant folding code. llvm-svn: 251203	2015-10-24 20:06:18 +00:00
Simon Pilgrim	7430804fe1	[DAGCombiner] Generalize masking of constant rotates. We don't need a mask of a rotation result to be a constant splat - any constant scalar/vector can be usefully folded. Followup to D13851. llvm-svn: 251197	2015-10-24 18:44:52 +00:00
Simon Pilgrim	d5ef318b5b	[X86][XOP] Add support for lowering vector rotations This patch adds support for lowering to the XOP VPROT / VPROTI vector bit rotation instructions. This has required changes to the DAGCombiner rotation pattern matching to support vector types - so far I've only changed it to support splat vectors, but generalising this further is feasible in the future. Differential Revision: http://reviews.llvm.org/D13851 llvm-svn: 251188	2015-10-24 13:17:26 +00:00
Joseph Tremoulet	3d0fbf1d74	[CodeGen] Mark setjmp/catchret MBBs address-taken Summary: This ensures that BranchFolding (and similar) won't remove these blocks. Also allow AsmPrinter::EmitBasicBlockStart to process MBBs which are address-taken but do not have BBs that are address-taken, since otherwise its call to getAddrLabelSymbolTableToEmit would fail an assertion on such blocks. I audited the other callers of getAddrLabelSymbolTableToEmit (and getAddrLabelSymbol); they all have BBs known to be address-taken except for the call through getAddrLabelSymbol from WinException::create32bitRef; that call is actually now unreachable, so I've removed it and updated the signature of create32bitRef. This fixes PR25168. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: pgavlin, llvm-commits Differential Revision: http://reviews.llvm.org/D13774 llvm-svn: 251113	2015-10-23 15:06:05 +00:00
Davide Italiano	fbb958c24b	[CodeGen] Remove usage of NDEBUG in header. Moreover, this seems unused. llvm-svn: 251081	2015-10-23 00:17:40 +00:00
Matthias Braun	61f4d6439c	MachineScheduler: Add a way to disable the 'ReduceLatency' heuristic llvm-svn: 251037	2015-10-22 18:07:31 +00:00
Craig Topper	8fe40e0ed5	Change makeLibCall to take an ArrayRef<SDValue> instead of pointer and size. This removes the need to pass a hardcoded size in many places. NFC llvm-svn: 251032	2015-10-22 17:05:00 +00:00
Zia Ansari	8f509a7044	[X86] - Catch extra combine opportunities for redundant imuls. When we fold "mul ((add x, c1), c2)" -> "add ((mul x, c2), c1*c2)", we bail if (add x, c1) has multiple users which would result in an extra add instruction. In such cases, this patch adds a check to see if we can eliminate a multiply instruction in exchange for the extra add. I also added the capability of doing the existing optimization with non-splatted vectors (splatted also works). Differential Revision: http://reviews.llvm.org/D13740 llvm-svn: 251028	2015-10-22 16:14:45 +00:00
David Majnemer	a8f17871e4	[WinEH] Remove extraneous call to emitEHRegistrationOffsetLabel It's a relic from the earlier implementation, let's remove it. llvm-svn: 250964	2015-10-21 23:20:39 +00:00
Matt Arsenault	29f9663f97	LegalizeDAG: Implement promote for build_vector This will be used in future commits for AMDGPU to promote operations on i64 vectors into operations on 32-bit vector components. This will be used / tested in future AMDGPU commits. llvm-svn: 250945	2015-10-21 21:10:10 +00:00
Elena Demikhovsky	3ad76a1acd	Masked Load/Store optimization for scalar code When we have to convert the masked.load, masked.store to scalar code, we generate a chain of conditional basic blocks. I added optimization for constant mask vector. Differential Revision: http://reviews.llvm.org/D13855 llvm-svn: 250893	2015-10-21 11:50:54 +00:00
Jonas Paulsson	17ad04535f	Let MachineVerifier be aware of mem-to-mem instructions. A mem-to-mem instruction (one that both loads and stores) that stores to an FI cannot pass the verifier, since the verifier thinks it is loading from the FI. For the mem-to-mem instruction, do a looser check in visitMachineOperand() and only check liveness at the reg-slot while analyzing a frame index operand. Needed to make CodeGen/SystemZ/xor-01.ll pass with -verify-machineinstrs, which now runs with this flag. Reviewed by Evan Cheng and Quentin Colombet. llvm-svn: 250885	2015-10-21 07:39:47 +00:00
Krzysztof Parzyszek	fdb7b693a7	Tail duplication can mix incompatible registers in phi nodes Do not tail duplicate blocks where the successor has a phi node, and the corresponding value in that phi node uses a subregister. http://reviews.llvm.org/D13922 llvm-svn: 250877	2015-10-21 02:40:06 +00:00
Artyom Skrobov	c736863a85	Two switch blocks in VectorLegalizer::LegalizeOp already have a default: llvm_unreachable("This action is not supported yet!"); -- so I'm adding one to the third switch block, too. This is a follow-up fix for http://reviews.llvm.org/D13862 llvm-svn: 250830	2015-10-20 15:06:37 +00:00
Artyom Skrobov	7fd67e25aa	Adding support for TargetLoweringBase::LibCall Summary: TargetLoweringBase::Expand is defined as "Try to expand this to other ops, otherwise use a libcall." For ISD::UDIV and ISD::SDIV, the choice between the two possibilities was defined in a rather convoluted way: - if DIVREM is legal, expand to DIVREM - if DIVREM has a custom lowering, expand to DIVREM - if DIVREM libcall is defined and a remainder from the same division is computed elsewhere, expand to a DIVREM libcall - else, expand to a DIV libcall This had the undesirable effect that if both DIV and DIVREM are implemented as libcalls, then ISD::UDIV and ISD::SDIV are expanded to the heavier DIVREM libcall, even when the remainder isn't used. The new code adds a new LegalizeAction, TargetLoweringBase::LibCall, so that backends can directly control whether they prefer an expansion or a conversion to a libcall. This makes the generic lowering code even more generic, allowing its reuse in a wider range of target-specific configurations. The useful effect is that ARM backend will now generate a call to __aeabi_{i,u}div rather than __aeabi_{i,u}divmod in cases where it doesn't need the remainder. There's no functional change outside the ARM backend. Reviewers: t.p.northover, rengolin Subscribers: t.p.northover, llvm-commits, aemerson Differential Revision: http://reviews.llvm.org/D13862 llvm-svn: 250826	2015-10-20 13:14:52 +00:00
Artyom Skrobov	b844fa7fc0	Combining DIV+REM->DIVREM doesn't belong in LegalizeDAG; move it over into DAGCombiner. Summary: In addition to moving the code over, this patch amends the DIV,REM -> DIVREM combining to run on all affected nodes at once: if the nodes are converted to DIVREM one at a time, then the resulting DIVREM may get legalized by the backend into something target-specific that we won't be able to recognize and correlate with the remaining nodes. The motivation is to "prepare terrain" for D13862: when we set DIV and REM to be legalized to libcalls, instead of the DIVREM, we otherwise lose the ability to combine them together. To prevent this, we need to take the DIV,REM -> DIVREM combining out of the lowering stage. Reviewers: RKSimon, eli.friedman, rengolin Subscribers: john.brawn, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D13733 llvm-svn: 250825	2015-10-20 13:06:02 +00:00
Duncan P. N. Exon Smith	a25ad0685a	AsmPrinter: Remove implicit ilist iterator conversion, NFC llvm-svn: 250776	2015-10-20 00:36:08 +00:00
Cong Hou	7745dbc5c4	Enhance loop rotation with existence of profile data in MachineBlockPlacement pass. Currently, in the MachineBlockPlacement pass the loop is rotated so that the best exit is the last BB in the loop chain, to maximize the fall-through from the loop to the outside. With profile data, we can determine the cost in terms of missed fall-through opportunities when rotating a loop chain and select the best rotation. Basically, there are three kinds of cost to consider for each rotation: 1. The possibly missed fall-through edge (if it exists) from a BB outside the loop to the loop header. 2. The possibly missed fall-through edges (if they exist) from the loop exits to BBs outside the loop. 3. The missed fall-through edge (if it exists) from the last BB to the first BB in the loop chain. Therefore, the cost for a given rotation is the sum of costs listed above. We select the best rotation with the smallest cost. This is only for PGO mode when we have more precise edge frequencies. Differential revision: http://reviews.llvm.org/D10717 llvm-svn: 250754	2015-10-19 23:16:40 +00:00
Sanjay Patel	69a50a1e17	[CGP] transform select instructions into branches and sink expensive operands This was originally checked in at r250527, but reverted at r250570 because of PR25222. There were at least 2 problems: 1. The cost check was checking for an instruction with an exact cost of TCC_Expensive; that should have been >=. 2. The cause of the clang stage 1 failures was illegally sinking 'call' instructions; we can't sink instructions that may have side effects / are not safe to execute speculatively. Fixed those conditions in sinkSelectOperand() and added test cases. Original commit message: This is a follow-up to the discussion in D12882. Ideally, we would like SimplifyCFG to be able to form select instructions even when the operands are expensive (as defined by the TTI cost model) because that may expose further optimizations. However, we would then like a later pass like CodeGenPrepare to undo that transformation if the target would likely benefit from not speculatively executing an expensive op (this patch). Once we have this safety mechanism in place, we can adjust SimplifyCFG to restore its select-formation behavior that changed with r248439. Differential Revision: http://reviews.llvm.org/D13297 llvm-svn: 250743	2015-10-19 21:59:12 +00:00
Owen Anderson	faf5187ee0	Restore the original behavior of SelectionDAG::getTargetIndex(). It looks like an extra negation snuck in as part of restoring it. llvm-svn: 250726	2015-10-19 19:27:40 +00:00
Benjamin Kramer	2002aadaad	Put back SelectionDAG::getTargetIndex. While technically this is untested dead code, it has out-of-tree users. This reverts a part of r250434. llvm-svn: 250717	2015-10-19 18:26:16 +00:00
Matthias Braun	e734195ce3	Revert "RegisterPressure: allocatable physreg uses are always kills" This reverts commit r250596. Reverted for now as the commit triggers assert in the AMDGPU target pending investigation. llvm-svn: 250713	2015-10-19 17:44:22 +00:00
Elena Demikhovsky	20662e39f1	Removed parameter "Consecutive" from isLegalMaskedLoad() / isLegalMaskedStore(). Originally I planned to use the same interface for masked gather/scatter and set isConsecutive to "false" in this case. Now I'm implementing masked gather/scatter and see that the interface is inconvenient. I want to add interfaces isLegalMaskedGather() / isLegalMaskedScatter() instead of using the "Consecutive" parameter in the existing interfaces. Differential Revision: http://reviews.llvm.org/D13850 llvm-svn: 250686	2015-10-19 07:43:38 +00:00
Simon Pilgrim	04d52d26f6	Use SDValue bool check. NFCI. llvm-svn: 250653	2015-10-18 12:33:54 +00:00
Simon Pilgrim	c2c154e078	Move one-use variable inside test. NFC. llvm-svn: 250651	2015-10-18 11:47:23 +00:00
Simon Pilgrim	24057b9566	[DAG] Ensure vector constant folding uses correct scalar undef types Minor fix to D13665 found during post-commit review. llvm-svn: 250616	2015-10-17 16:49:43 +00:00
Matthias Braun	65e6d4a3f8	RegisterPressure: Unify the sparse sets in LiveRegsSet; NFC Also do some cleanups and comment improvements. llvm-svn: 250598	2015-10-17 01:03:44 +00:00
Matthias Braun	cdd2792aa6	RegisterPressure: allocatable physreg uses are always kills This property was already used in the code path when no liveness intervals are present. Unfortunately, the code path that uses liveness intervals tried to query a cached live interval for an allocatable physreg; those are usually not computed, so a conservative default was used. This doesn't affect any of the lit testcases. This is a precursor to upcoming changes which should be NFC, but without this patch this tidbit wouldn't be NFC. llvm-svn: 250596	2015-10-17 00:46:57 +00:00
Matthias Braun	5105e05e8f	RegisterPressure: Remove 0 entries from PressureChange This should not change behaviour because as far as I can see all code reading the pressure changes has no effect if the PressureInc is 0. Removing these entries however does avoid unnecessary computation, and results in a more stable debug output. I want the stable debug output to check that some upcoming changes are indeed NFC and identical even at the debug output level. llvm-svn: 250595	2015-10-17 00:35:59 +00:00
Matthias Braun	96e411b90c	RegisterPressure: Hide non-const iterators of PressureDiff It is too easy to accidentally violate the ordering requirements when modifying the PressureDiff entries through iterators. llvm-svn: 250590	2015-10-17 00:08:48 +00:00
Joseph Tremoulet	55b51e9dcc	[WinEH] Fix eh.exceptionpointer intrinsic lowering Summary: Some shared code for handling eh.exceptionpointer and eh.exceptioncode needs to not share the part that truncates to 32 bits, which is intended just for exception codes. Reviewers: rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13747 llvm-svn: 250588	2015-10-17 00:08:08 +00:00
Reid Kleckner	28e490342b	[WinEH] Fix stack alignment in funclets and ParentFrameOffset calculation Our previous value of "16 + 8 + MaxCallFrameSize" for ParentFrameOffset is incorrect when CSRs are involved. We were supposed to have a test case to catch this, but it wasn't very rigorous. The main effect here is that calling _CxxThrowException inside a catchpad doesn't immediately crash on MOVAPS when you have an odd number of CSRs. llvm-svn: 250583	2015-10-16 23:43:27 +00:00
Matthias Braun	fdee8ec2bd	RegisterPressure: Use range based for, cleanup llvm-svn: 250579	2015-10-16 23:25:09 +00:00
Benjamin Kramer	b43d33bf0f	Revert "This is a follow-up to the discussion in D12882." Breaks clang selfhost, see PR25222. This reverts commits r250527 and r250528. llvm-svn: 250570	2015-10-16 23:00:29 +00:00
Joseph Tremoulet	d11a998e81	[WinEH] Fix CatchRetSuccessorColorMap accounting Summary: We now use the block for the catchpad itself, rather than its normal successor, as the funclet entry. Putting the normal successor in the map leads downstream funclet membership computations to erroneous results. Reviewers: majnemer, rnk Subscribers: rnk, llvm-commits Differential Revision: http://reviews.llvm.org/D13798 llvm-svn: 250552	2015-10-16 21:22:54 +00:00
David Majnemer	e696583dba	[WinEH] Remove dead code/includes from WinEHPrepare No functionality change is intended. llvm-svn: 250545	2015-10-16 19:59:52 +00:00
Joseph Tremoulet	53e9cbd95a	[WinEH] Fix endpad coloring/numbering Summary: When a cleanup's cleanupendpad or cleanupret targets a catchendpad, stop trying to propagate the cleanup's parent's color to the catchendpad, since what's needed is the cleanup's grandparent's color and the catchendpad will get that color from the catchpad linkage already. We already had this exclusion for invokes, but were missing it for cleanupendpad/cleanupret. Also add a missing line that tags cleanupendpads' states in the EHPadStateMap, without which lowering invokes that target cleanupendpads which unwind to other handlers (and so don't have the -1 state) will fail. This fixes the reduced IR repro in PR25163. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13797 llvm-svn: 250534	2015-10-16 18:08:16 +00:00
Sanjay Patel	374dd8d88e	This is a follow-up to the discussion in D12882. Ideally, we would like SimplifyCFG to be able to form select instructions even when the operands are expensive (as defined by the TTI cost model) because that may expose further optimizations. However, we would then like a later pass like CodeGenPrepare to undo that transformation if the target would likely benefit from not speculatively executing an expensive op (this patch). Once we have this safety mechanism in place, we can adjust SimplifyCFG to restore its select-formation behavior that changed with r248439. Differential Revision: http://reviews.llvm.org/D13297 llvm-svn: 250527	2015-10-16 16:54:30 +00:00
Evgeniy Stepanov	9addbc9fc1	Revert "[safestack] Fast access to the unsafe stack pointer on AArch64/Android." Breaks the hexagon buildbot. llvm-svn: 250461	2015-10-15 21:26:49 +00:00
Adrian Prantl	96b1551d53	Replace a forward declaration with an #include. When building with modules the forward-declared inner class DebugLocStream::ListBuilder causes clang to fall over. llvm-svn: 250459	2015-10-15 20:58:55 +00:00
Evgeniy Stepanov	142947e9f0	[safestack] Fast access to the unsafe stack pointer on AArch64/Android. Android libc provides a fixed TLS slot for the unsafe stack pointer, and this change implements direct access to that slot on AArch64 via __builtin_thread_pointer() + offset. This change also moves more code into TargetLowering and its target-specific subclasses to get rid of target-specific codegen in SafeStackPass. This change does not touch the ARM backend because ARM lowers __builtin_thread_pointer as aeabi_read_tp, which is not available on Android. llvm-svn: 250456	2015-10-15 20:50:16 +00:00
Benjamin Kramer	bacc7ba7aa	[SelectionDAG] Remove dead code. NFC. Carefully selected parts without deleting graph stuff and dumping methods. llvm-svn: 250434	2015-10-15 17:54:06 +00:00
Benjamin Kramer	7fa42c8a8c	[AsmPrinter] Prune dead code. NFC. I left all (dead) print and dump methods in place. llvm-svn: 250433	2015-10-15 17:16:32 +00:00
Artyom Skrobov	4bca0bb010	A doccomment for CombineTo, and some NFC refactorings Summary: Caching SDLoc(N), instead of recreating it in every single function call, keeps the code denser, and allows long lines to be unwrapped. Reviewers: sunfish, atrick, sdmitrouk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13726 llvm-svn: 250305	2015-10-14 17:18:35 +00:00
Artyom Skrobov	a5b9ad22b3	Merge DAGCombiner::visitSREM and DAGCombiner::visitUREM (NFC) Summary: The two implementations had more code in common than not. Reviewers: sunfish, MatzeB, sdmitrouk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13724 llvm-svn: 250302	2015-10-14 16:54:14 +00:00
Joseph Tremoulet	28c89bbb36	[WinEH] Add CoreCLR EH table emission Summary: Emit the handler and clause locations immediately after the standard xdata. Clauses are emitted in the same order and format used to communicate them to the CLR Execution Engine. Add a lit test to verify correct table generation on a small but interesting example function. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: pgavlin, AndyAyers, llvm-commits Differential Revision: http://reviews.llvm.org/D13451 llvm-svn: 250219	2015-10-13 20:18:27 +00:00
Duncan P. N. Exon Smith	e400a7d412	SelectionDAG: Remove implicit ilist iterator conversions, NFC llvm-svn: 250214	2015-10-13 19:47:46 +00:00
Joseph Tremoulet	1e2f062ec5	[WinEH] Iterate state changes instead of invokes Summary: Add an iterator that can walk across blocks and which visits the state transitions rather than state ranges, with explicit transitions to -1 indicating the presence of top-level calls that may throw and cause the current function to unwind to caller. This will simplify code that needs to identify nested try regions. Refactor SEH and C++EH table generation to use the new InvokeStateChangeIterator, and remove the InvokeLabelIterator they were using. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13623 llvm-svn: 250179	2015-10-13 16:44:30 +00:00
Matt Arsenault	e5d9515fb7	DAGCombiner: Don't stop finding better chain on 2 aliases The comment says this was stopped because it was unlikely to be profitable. This is not true if you want to combine vector loads with multiple components. For a simple case that looks like t0 = load t0 ... t1 = load t0 ... t2 = load t0 ... t3 = load t0 ... t4 = store t0:1, t0:1 t5 = store t4, t1:0 t6 = store t5, t2:0 t7 = store t6, t3:0 We want to get all of these stores onto a chain that is a TokenFactor of these N loads. This mostly solves the AMDGPU merge-stores.ll regressions with -combiner-alias-analysis for merging vector stores of vector loads. llvm-svn: 250138	2015-10-13 00:49:00 +00:00
Matt Arsenault	61dc235f20	DAGCombiner: Combine extract_vector_elt from build_vector This basic combine was surprisingly missing. AMDGPU legalizes many operations in terms of 32-bit vector components, so not doing this results in many extra copies and subregister extracts that need to be cleaned up later. InstCombine already does this for the hasOneUse case. The target hook is to fix a handful of tests which break (e.g. ARM/vmov.ll) which turn from a vector materialize repeated immediate instruction to a constant vector load with more scalar copies from it. llvm-svn: 250129	2015-10-12 23:59:50 +00:00
Cong Hou	bf22f5063a	Assign correct edge weights to unwind destinations when lowering invoke statement. When lowering an invoke statement, all unwind destinations are directly added as successors of the call site block, and the weights of those new edges are not assigned properly. Actually, the default weight 16 is used for those edges. This patch calculates the proper edge weights for those edges when collecting all unwind destinations. Differential revision: http://reviews.llvm.org/D13354 llvm-svn: 250119	2015-10-12 23:02:58 +00:00
Simon Pilgrim	c8832fc233	[SelectionDAG] Add common vector constant folding helper function We have a number of functions that implement constant folding of vectors (unary and binary ops) in near identical manners (and the differences don't appear to be critical). This patch introduces a common implementation (SelectionDAG::FoldConstantVectorArithmetic) and calls this in both the unary and binary op cases. After this initial patch I intend to begin enabling vector constant folding for a wider number of opcodes in SelectionDAG::getNode(). Differential Revision: http://reviews.llvm.org/D13665 llvm-svn: 250118	2015-10-12 23:00:11 +00:00
Matt Arsenault	07a72bad0b	Enable verifier after PeepholeOptimizer No tests fail with this enabled so I assume it was an accident that it isn't enabled now. llvm-svn: 250070	2015-10-12 17:43:56 +00:00
Reid Kleckner	9abb3c06a6	Don't call PrepareEHLandingPad on non EH pads This was a minor bug in r249492. Calling PrepareEHLandingPad on a non-landingpad was a no-op, but it attempted to get the generic pointer register class, which apparently doesn't exist for some targets. llvm-svn: 250068	2015-10-12 17:42:32 +00:00
David Majnemer	99c1d13e52	[WinEH] Remove CatchObjRecoverIdx CatchObjRecoverIdx was used for the old scheme, it is no longer relevant. llvm-svn: 250065	2015-10-12 16:44:22 +00:00
Oliver Stannard	cca893ffac	[Debug] Look through bitcasts to find argument registers On targets where f32 is not legal, we have to look through a BITCAST SDNode to find the register that an argument is stored in when emitting debug info, or we will not be able to emit a DW_AT_location for it. Differential Revision: http://reviews.llvm.org/D13005 llvm-svn: 250056	2015-10-12 15:52:36 +00:00
Simon Pilgrim	d45c88bbb5	[DAGCombiner] Improved FMA combine support for vectors Enabled constant canonicalization for all constants. Improved combining of constant vectors. llvm-svn: 249993	2015-10-11 19:48:12 +00:00
Simon Pilgrim	5eac2607b9	[DAGCombiner] Tidyup FMINNUM/FMAXNUM constant folding Enable constant folding for vector splats as well as scalars. Enable constant canonicalization for all scalar and vector constants. llvm-svn: 249978	2015-10-11 16:02:28 +00:00
David Majnemer	bfa5b98201	[WinEH] Remove more dead code wineh-parent is dead, so is ValueOrMBB. llvm-svn: 249920	2015-10-10 00:04:29 +00:00
Reid Kleckner	14e773500e	[WinEH] Delete the old landingpad implementation of Windows EH The new implementation works at least as well as the old implementation did. Also delete the associated preparation tests. They don't exercise interesting corner cases of the new implementation. All the codegen tests of the EH tables have already been ported. llvm-svn: 249918	2015-10-09 23:34:53 +00:00
Reid Kleckner	eb7cd6c889	[SEH] Update SEH codegen tests to use the new IR Also fix a buglet where SEH tables had ranges that spanned funclets. The remaining tests using the old landingpad IR are preparation tests, and will be deleted along with the old preparation. llvm-svn: 249917	2015-10-09 23:05:54 +00:00
Duncan P. N. Exon Smith	f1ff53ecc2	CodeGen: Remove implicit ilist iterator conversions, NFC Finish removing implicit ilist iterator conversions from LLVMCodeGen. I'm sure there are lots more of these in lib/CodeGen/*/. llvm-svn: 249915	2015-10-09 22:56:24 +00:00
Reid Kleckner	e1c8a7f9c7	[SEH] Fix _except_handler4 table base states We got them right for the old IR, but not with funclets. Port the old test to the new IR and fix the code. llvm-svn: 249906	2015-10-09 21:27:28 +00:00
Duncan P. N. Exon Smith	6e98cd32dc	CodeGen: Avoid more ilist iterator implicit conversions, NFC llvm-svn: 249903	2015-10-09 21:08:19 +00:00
Duncan P. N. Exon Smith	1ff409802d	CodeGen: Use range-based for in PostRAScheduler, NFC llvm-svn: 249901	2015-10-09 21:05:00 +00:00
Reid Kleckner	d880dc7509	[SEH] Remember to emit the last invoke range for SEH This wasn't very observable in execution tests, because usually there is an invoke in the catchpad that unwinds to the catchendpad but never actually throws. llvm-svn: 249898	2015-10-09 20:39:39 +00:00
Chad Rosier	47eba05b47	Revert "Simplify code. NFC." This reverts commit r248610. llvm-svn: 249887	2015-10-09 19:48:48 +00:00
Duncan P. N. Exon Smith	5ec1568c9c	CodeGen: Continue removing ilist iterator implicit conversions llvm-svn: 249884	2015-10-09 19:40:45 +00:00
Duncan P. N. Exon Smith	6ac07fd228	CodeGen: Remove implicit iterator conversions from MBB.cpp Remove implicit ilist iterator conversions from MachineBasicBlock.cpp. I've also added an overload of `splice()` that takes a pointer, since it's a natural API. This is similar to the overloads I added for `remove()` and `erase()` in r249867. llvm-svn: 249883	2015-10-09 19:36:12 +00:00
Duncan P. N. Exon Smith	0ac8eb9171	CodeGen: Avoid ilist iterator implicit conversions in a few more places, NFC llvm-svn: 249880	2015-10-09 19:23:20 +00:00
Duncan P. N. Exon Smith	5ae5939fa1	CodeGen: Remove more ilist iterator implicit conversions, NFC llvm-svn: 249879	2015-10-09 19:13:58 +00:00
Duncan P. N. Exon Smith	6c64aeb065	CodeGen: Use range-based for in IntrinsicLowering::AddPrototypes, NFC This happens to avoid a host of implicit ilist iterator conversions. llvm-svn: 249877	2015-10-09 19:07:41 +00:00
Duncan P. N. Exon Smith	530d040bd9	CodeGen: Use range-based for in GlobalMerge, NFC llvm-svn: 249876	2015-10-09 18:57:47 +00:00
Duncan P. N. Exon Smith	d83547a16e	CodeGen: Remove a few more ilist iterator implicit conversions, NFC llvm-svn: 249875	2015-10-09 18:44:40 +00:00
Duncan P. N. Exon Smith	980f8f2639	CodeGen: Remove implicit conversions from Analysis and BranchFolding Remove a few more implicit ilist iterator conversions, this time from Analysis.cpp and BranchFolding.cpp. I added a few overloads for `remove()` and `erase()`, which quite naturally take pointers as well as iterators as parameters. This will reduce the churn at least in the short term, but I don't really have a problem with these existing for longer. llvm-svn: 249867	2015-10-09 18:23:49 +00:00
Owen Anderson	d95b08a0a7	Refine the definition of convergent to only disallow the addition of new control dependencies. This covers the common case of operations that cannot be sunk. Operations that cannot be hoisted should already be handled properly via the safe-to-speculate rules and mechanisms. llvm-svn: 249865	2015-10-09 18:06:13 +00:00
Sanjay Patel	9fbe22bac6	fix typos; NFC llvm-svn: 249863	2015-10-09 18:01:03 +00:00
Duncan P. N. Exon Smith	8f11e1a713	CodeGen: Start removing implicit conversions to/from list iterators, NFC Start removing implicit conversions to/from list iterators in CodeGen, ala r249782 for IR. A lot more to go after this. llvm-svn: 249851	2015-10-09 16:54:49 +00:00
Reid Kleckner	ae44e871cd	Revert "Revert "Revert r248959, "[WinEH] Emit int3 after noreturn calls on Win64""" This reverts commit r249794. Apparently my checkouts are full of unexpected surprises today. llvm-svn: 249796	2015-10-09 01:13:17 +00:00
Reid Kleckner	b510401785	Revert "Revert r248959, "[WinEH] Emit int3 after noreturn calls on Win64"" This reverts commit r249032. TODO write commit msg llvm-svn: 249794	2015-10-09 01:11:37 +00:00
Joseph Tremoulet	676e5cf07f	[WinEH] Fix cleanup state numbering Summary: - Recurse from cleanupendpads to their cleanuppads, to make sure the cleanuppad is visited if it has a cleanupendpad but no cleanupret. - Check for and avoid double-processing cleanuppads, to allow for them to have multiple cleanuprets (plus cleanupendpads). - Update Cxx state numbering to visit toplevel cleanupendpads and to recurse from cleanupendpads to their preds, to ensure we number any funclets in inlined cleanups. SEH state numbering already did this. Reviewers: rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13374 llvm-svn: 249792	2015-10-09 00:46:08 +00:00
Reid Kleckner	ebef256269	[SEH] Fix llvm.eh.exceptioncode fast register allocation assertion I called the wrong MachineBasicBlock::addLiveIn() overload. llvm-svn: 249786	2015-10-09 00:15:13 +00:00
Michael Kuperstein	2b3c16ca17	Do not assert on first non-prologue instruction being a CFI directive. llvm-svn: 249668	2015-10-08 07:48:49 +00:00
Craig Topper	da5168b7ce	Use range-based for loops. NFC. llvm-svn: 249659	2015-10-08 06:06:42 +00:00
Justin Bogner	468c998031	CodeGen: print and verify after TargetPassConfig::insertPass by default In r224059, we started verifying after addPass, but missed doing so on insertPass. There isn't a good reason for the discrepancy, and skipping the verifier in these cases causes bugs. This also exposes a verifier error that was introduced in r249087, but the verifier doesn't run until after the register coalescer, when the issue happens to have been resolved. I've skipped the verifier after SIFixSGPRLiveRangesID to avoid the failures for now and will follow up with Matt for a proper fix. llvm-svn: 249643	2015-10-08 00:36:22 +00:00
David Majnemer	6af5f82c20	[WinEH] Refer to filter funclets using their symbol-table symbol The relocation for the filter funclet will be against a symbol table entry for a function instead of the section, making it easier to understand what is going on. llvm-svn: 249621	2015-10-07 21:34:00 +00:00
Reid Kleckner	70bf6bb5e6	[WinEH] Undo the effect of r249578 for 32-bit The __CxxFrameHandler3 tables for 32-bit are supposed to hold stack offsets relative to EBP, not ESP. I blindly updated the win-catchpad.ll test case, and immediately noticed that 32-bit catching stopped working. While I'm at it, move the frame index to frame offset WinEH table logic out of PEI. PEI shouldn't have to know about WinEHFuncInfo. I realized we can calculate frame index offsets just fine from the table printer. llvm-svn: 249618	2015-10-07 21:13:15 +00:00
David Majnemer	c289c9ff55	[WinEH] Remove unreachable blocks before preparation We remove unreachable blocks because it is pointless to consider them for coloring. However, we still had stale pointers to these blocks in some data structures after we removed them from the function. Instead, remove the unreachable blocks before attempting to do anything with the function. This fixes PR25099. llvm-svn: 249617	2015-10-07 21:08:25 +00:00
Joseph Tremoulet	39234fc67e	[WinEH] Set NoModuleLevelChanges in clone flags Summary: This is necessary to keep the cloner from making bogus copies of debug metadata attached to the IR it is cloning. Also, avoid running RemapInstruction over all instructions in the common case that no cloning was performed. Reviewers: rnk, andrew.w.kaylor, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13514 llvm-svn: 249591	2015-10-07 19:29:56 +00:00
Reid Kleckner	33bd2d99d8	[WinEH] Fix two minor issues in __CxxFrameHandler3 tables There was an off-by-one bug in ip2state tables which manifested when one call immediately preceded the try-range of the next. The return address of the previous call would appear to be within the try range of the next scope, resulting in extra destructors or catches running. We also computed the wrong offset for catch parameter stack objects. The offset should be from RSP, not from RBP. llvm-svn: 249578	2015-10-07 17:49:32 +00:00
Chad Rosier	169865ffda	[ARM] Promote helper function to SelectionDAG. I'll be using the function in a similar combine for AArch64. The helper was also improved to handle undef values. Part of http://reviews.llvm.org/D13442 llvm-svn: 249572	2015-10-07 17:28:58 +00:00
Joseph Tremoulet	bde46c5642	[WinEH] Update CoreCLR EH for catchpad MBBs Summary: Set the pad MBB as a funclet entry for CoreCLR as well as MSVCCXX, and update state numbering to put the catchpad block rather than its normal successor into the unwind map. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13492 llvm-svn: 249569	2015-10-07 17:16:25 +00:00
Michael Kuperstein	259f1508f0	[X86] Emit .cfi_escape GNU_ARGS_SIZE when adjusting the stack before calls When outgoing function arguments are passed using push instructions, and EH is enabled, we may need to indicate to the stack unwinder that the stack pointer was adjusted before the call. This should fix the exception handling issues in PR24792. Differential Revision: http://reviews.llvm.org/D13132 llvm-svn: 249522	2015-10-07 07:01:31 +00:00
Reid Kleckner	72ba70418f	[SEH] Add llvm.eh.exceptioncode intrinsic This will support the Clang __exception_code intrinsic. llvm-svn: 249492	2015-10-07 00:27:33 +00:00
David Blaikie	c9ad9191a7	DebugInfo: Include the decl_line/decl_file in subprogram definitions if they differ from those in the declaration This is handy for some AutoFDO stuff, and seems like a minor improvement to correctness (otherwise a debug info consumer might think the decl line/file of the def was the same as that of the declaration - though what a consumer might use that for, I'm not sure - maybe "list <func>" would've misbehaved with the old behavior?) and at a minor cost (in my experiment, with fission, without type units, without compression, 0.01% growth in debug info in the executable/objects, 0.02% growth in the .dwo files). llvm-svn: 249487	2015-10-07 00:04:16 +00:00
David Majnemer	7735a6d07a	[WinEH] Create a separate MBB for funclet prologues Our current emission strategy is to emit the funclet prologue in the CatchPad's normal destination. This is problematic because intra-funclet control flow to the normal destination is not erroneous and results in us reevaluating the prologue if said control flow is taken. Instead, use the CatchPad's location for the funclet prologue. This correctly models our desire to have unwind edges evaluate the prologue but edges to the normal destination result in typical control flow. Differential Revision: http://reviews.llvm.org/D13424 llvm-svn: 249483	2015-10-06 23:31:59 +00:00
Hans Wennborg	083ca9bb32	Fix Clang-tidy modernize-use-nullptr warnings in source directories and generated files; other minor cleanups. Patch by Eugene Zelenko! Differential Revision: http://reviews.llvm.org/D13321 llvm-svn: 249482	2015-10-06 23:24:35 +00:00
Joseph Tremoulet	7f8c1165cd	[WinEH] Implement state numbering for CoreCLR Summary: Assign one state number per handler/funclet, tracking parent state, handler type, and catch type token. State numbers are arranged such that ancestors have lower state numbers than their descendants. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: pgavlin, AndyAyers, llvm-commits Differential Revision: http://reviews.llvm.org/D13450 llvm-svn: 249457	2015-10-06 20:30:33 +00:00
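The ordering property described in the commit above (ancestors get lower state numbers than their descendants) falls out of a preorder walk over the handler nesting tree. The following is a minimal sketch under that assumption, not the code from the patch: `HandlerNode`, `StateEntry`, `numberFrom`, and `numberHandlers` are hypothetical names, and the sketch tracks only the parent state, leaving out the handler type and catch type token.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of a handler/funclet: only its nested handlers matter here.
struct HandlerNode {
  std::vector<int> Children; // indices of handlers nested inside this one
};

// Hypothetical table entry: one per assigned state number.
struct StateEntry {
  int ParentState; // -1 for a top-level handler
};

// Assign the next free state to Node, then recurse into its children.
// Visiting a node before its children is what guarantees parent < child.
static void numberFrom(const std::vector<HandlerNode> &Nodes, int Node,
                       int ParentState, std::vector<int> &StateOf,
                       std::vector<StateEntry> &Table) {
  int State = static_cast<int>(Table.size());
  Table.push_back(StateEntry{ParentState});
  StateOf[static_cast<std::size_t>(Node)] = State;
  for (int Child : Nodes[static_cast<std::size_t>(Node)].Children)
    numberFrom(Nodes, Child, State, StateOf, Table);
}

// Number every handler reachable from Roots; StateOf maps node index to state.
std::vector<StateEntry> numberHandlers(const std::vector<HandlerNode> &Nodes,
                                       const std::vector<int> &Roots,
                                       std::vector<int> &StateOf) {
  StateOf.assign(Nodes.size(), -1);
  std::vector<StateEntry> Table;
  for (int Root : Roots)
    numberFrom(Nodes, Root, -1, StateOf, Table);
  return Table;
}
```

With this numbering, every entry's `ParentState` is strictly smaller than its own index in the table, so a consumer can resolve parents in a single forward pass.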
Joseph Tremoulet	2afea5438f	[WinEH] Recognize CoreCLR personality function Summary: - Add CoreCLR to if/else ladders and switches as appropriate. - Rename isMSVCEHPersonality to isFuncletEHPersonality to better reflect what it captures. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: pgavlin, AndyAyers, llvm-commits Differential Revision: http://reviews.llvm.org/D13449 llvm-svn: 249455	2015-10-06 20:28:16 +00:00
Craig Topper	2c4068f409	[TwoAddressInstructionPass] When looking for a 3 addr conversion after commuting, make sure regB has been updated to take into account the commute. llvm-svn: 249378	2015-10-06 05:39:59 +00:00
Benjamin Kramer	808d2a070d	Move helper classes into an anonymous namespace. NFC. llvm-svn: 249356	2015-10-05 21:20:26 +00:00
David Majnemer	e4f9b09b51	[WinEH] Update CATCHRET's operand to match its successor The CATCHRET operand did not match the MachineFunction's CFG. This mismatch happened because FrameLowering created a new MachineBasicBlock and updated the CFG but forgot to update the CATCHRET operand. Let's make sure this doesn't happen again by strengthening the funclet membership analysis: it can now reason about the membership of all basic blocks, not just those inside of funclets. llvm-svn: 249344	2015-10-05 20:09:16 +00:00
David Majnemer	429c8eda22	[SelectionDAGBuilder] Remove dead code We already check for LandingPadInst two lines above. llvm-svn: 249280	2015-10-04 18:44:47 +00:00
David Majnemer	161935520d	[WinEH] Permit branch folding in the face of funclets Track which basic blocks belong to which funclets. Permit branch folding to fire but only if it can prove that doing so will not cause code in one funclet to be reused in another. llvm-svn: 249257	2015-10-04 02:22:52 +00:00
Simon Pilgrim	dde63374c5	[DAGCombiner] Generalize FADD constant combines to work with vectors Updated the FADD combines to work with vectors as well as scalars. Differential Revision: http://reviews.llvm.org/D13416 llvm-svn: 249251	2015-10-03 22:06:06 +00:00
Sanjay Patel	acd4baefca	include equal sign in debug equations; NFC llvm-svn: 249248	2015-10-03 20:45:01 +00:00
Simon Pilgrim	a38d76a087	[DAGCombiner] Merge SIGN_EXTEND_INREG vector constant folding methods. NCI. visitSIGN_EXTEND_INREG calls SelectionDAG::getNode to constant fold scalar constants but handles vector constants itself, despite getNode being capable of dealing with them. This required a minor change to the getNode implementation to actually deal with cases where the scalars of a BUILD_VECTOR were wider integers than the vector type - which was the only extra ability of the visitSIGN_EXTEND_INREG implementation. No codegen intended and all existing tests remain the same. llvm-svn: 249236	2015-10-03 16:26:52 +00:00
Richard Trieu	e0129e474d	Call the correct overload. Call the correct overload so a string literal does not get converted to a bool. Also fix the test case to match the names given. llvm-svn: 249183	2015-10-02 20:52:14 +00:00
Reid Kleckner	fc64fae6e3	[WinEH] Emit __C_specific_handler tables for the new IR We emit denormalized tables, where every range of invokes in the same state gets a complete list of EH action entries. This is significantly simpler than trying to infer the correct nested scoping structure from the MI. Fortunately, for SEH, the nesting structure is really just a size optimization. With this, some basic __try / __except examples work. llvm-svn: 249078	2015-10-01 21:38:24 +00:00
David Majnemer	4600c06434	[WinEH] Stop BranchFolding from merging across funclets BranchFolding would merge two funclets together, this is not OK. Disable this and strengthen the assertion in FuncletLayout. llvm-svn: 249069	2015-10-01 21:04:13 +00:00
David Majnemer	f828a0ccc7	[WinEH] Make FuncletLayout more robust against catchret Catchret transfers control from a catch funclet to an earlier funclet. However, it is not completely clear which funclet the catchret target is part of. Make this clear by stapling the catchret target's funclet membership onto the CATCHRET SDAG node. llvm-svn: 249052	2015-10-01 18:44:59 +00:00
NAKAMURA Takumi	096492a07b	Reformat. llvm-svn: 249033	2015-10-01 17:01:03 +00:00
NAKAMURA Takumi	1ed20db720	Revert r248959, "[WinEH] Emit int3 after noreturn calls on Win64" It broke; LLVM :: CodeGen__Generic__2009-11-16-BadKillsCrash.ll llvm-svn: 249032	2015-10-01 17:00:56 +00:00
Reid Kleckner	6dec87a8a0	[WinEH] Emit int3 after noreturn calls on Win64 The Win64 unwinder disassembles forwards from each PC to try to determine if this PC is in an epilogue. If so, it skips calling the EH personality function for that frame. Typically, this means you cannot catch an exception in the same frame that you threw it, because 'throw' calls a noreturn runtime function. Previously we avoided this problem with the TrapUnreachable TargetOption, but that's a much bigger hammer than we need. All we need is a 1 byte non-epilogue instruction right after the call. Instead, what we got was an unconditional branch to a shared block containing the ud2, potentially 7 bytes instead of 1. So, this reverts r206684, which added TrapUnreachable, and replaces it with something better. The new code pattern matches for invoke/call followed by unreachable and inserts an int3 into the DAG. To be 100% watertight, we would need to insert SEH_Epilogue instructions into all basic blocks ending in a call with no terminators or successors, but in practice this is unlikely to come up. llvm-svn: 248959	2015-09-30 23:09:23 +00:00
Evgeniy Stepanov	f608111d1b	Fix debug info with SafeStack. llvm-svn: 248933	2015-09-30 19:55:43 +00:00
Maksim Panchenko	cce239c45d	HHVM calling conventions. HHVM calling convention, hhvmcc, is used by HHVM JIT for functions in translated cache. We currently support LLVM back end to generate code for X86-64 and may support other architectures in the future. In HHVM calling convention any GP register could be used to pass and return values, with the exception of R12 which is reserved for thread-local area and is callee-saved. Other than R12, we always pass RBX and RBP as args, which are our virtual machine's stack pointer and frame pointer respectively. When we enter translation cache via hhvmcc function, we expect the stack to be aligned at 16 bytes, i.e. skewed by 8 bytes as opposed to standard ABI alignment. This affects stack object alignment and stack adjustments for function calls. One extra calling convention, hhvm_ccc, is used to call C++ helpers from HHVM's translation cache. It is almost identical to standard C calling convention with an exception of first argument which is passed in RBP (before we use RDI, RSI, etc.) Differential Revision: http://reviews.llvm.org/D12681 llvm-svn: 248832	2015-09-29 22:09:16 +00:00
David Majnemer	a80c151286	[WinEH] Teach AsmPrinter about funclets Summary: Funclets have been turned into functions by the time they hit the object file. Make sure that they have decent names for the symbol table and CFI directives explaining how to reason about their prologues. Differential Revision: http://reviews.llvm.org/D13261 llvm-svn: 248824	2015-09-29 20:12:33 +00:00
Cong Hou	166e08542e	Rename some function arguments in MachineBasicBlock.cpp/h by turning the first letter into upper case. NFC. llvm-svn: 248821	2015-09-29 19:46:09 +00:00
Jeroen Ketema	740f9d79ca	Arguments spilled on the stack before a function call may have alignment requirements, for example in the case of vectors. These requirements are exploited by the code generator by using move instructions that have similar alignment requirements, e.g., movaps on x86. Although the code generator properly aligns the arguments with respect to the displacement of the stack pointer it computes, the displacement itself may cause misalignment. For example if we have %3 = load <16 x float>, <16 x float>* %1, align 64 call void @bar(<16 x float> %3, i32 0) the x86 back-end emits: movaps 32(%ecx), %xmm2 movaps (%ecx), %xmm0 movaps 16(%ecx), %xmm1 movaps 48(%ecx), %xmm3 subl $20, %esp <-- if %esp was 16-byte aligned before this instruction, it no longer will be afterwards movaps %xmm3, (%esp) <-- movaps requires 16-byte alignment, while %esp is not aligned as such. movl $0, 16(%esp) calll __bar To solve this, we need to make sure that the computed value with which the stack pointer is changed is a multiple of the maximal alignment seen during its computation. With this change we get proper alignment: subl $32, %esp movaps %xmm3, (%esp) Differential Revision: http://reviews.llvm.org/D12337 llvm-svn: 248786	2015-09-29 10:12:57 +00:00
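The fix described in the commit above amounts to rounding the stack adjustment up to a multiple of the largest alignment seen while computing it. A minimal sketch of that arithmetic follows. It assumes power-of-two alignments, and `roundUpToAlignment` and `alignedStackAdjustment` are hypothetical helper names, not functions from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: round Size up to the next multiple of Align.
// Only valid when Align is a nonzero power of two.
constexpr std::uint64_t roundUpToAlignment(std::uint64_t Size,
                                           std::uint64_t Align) {
  return (Size + Align - 1) & ~(Align - 1);
}

// Hypothetical model of the adjustment: RawSize is the number of bytes the
// outgoing arguments need, MaxAlign is the largest alignment any of them
// requires. The returned value is what would be subtracted from the stack
// pointer so that it keeps its MaxAlign alignment.
std::uint64_t alignedStackAdjustment(std::uint64_t RawSize,
                                     std::uint64_t MaxAlign) {
  assert(MaxAlign != 0 && (MaxAlign & (MaxAlign - 1)) == 0 &&
         "alignment must be a nonzero power of two");
  return roundUpToAlignment(RawSize, MaxAlign);
}
```

For the example in the commit message, 20 bytes of arguments with a 16-byte maximal alignment give an adjustment of 32, which matches the change from `subl $20, %esp` to `subl $32, %esp`.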
Matthias Braun	99ae16217e	RegisterPressure: LiveRegSet tracks register units not physregs There are always more physical registers and register units so the previous behaviour was correct but we can do with less memory. llvm-svn: 248767	2015-09-29 00:20:32 +00:00
Reid Kleckner	c71d6275ca	[WinEH] Fix ip2state table emission with funclets Previously we were hijacking the old LandingPadInfo data structures to communicate our state numbers. Now we don't need that anymore. llvm-svn: 248763	2015-09-28 23:56:30 +00:00
Richard Trieu	e778e87d2a	Fix unused variable warning in non-debug builds. llvm-svn: 248754	2015-09-28 22:54:43 +00:00
Sanjay Patel	4e6527682a	tidy up comments; NFC llvm-svn: 248750	2015-09-28 22:14:51 +00:00
Sanjay Patel	5e5f0e9756	move one-use check under the comment that describes it; NFCI llvm-svn: 248745	2015-09-28 21:44:46 +00:00
Andrew Kaylor	16c4da03d5	Improved the interface of methods commuting operands, improved X86-FMA3 mem-folding&coalescing. Patch by Slava Klochkov (vyacheslav.n.klochkov@intel.com) Differential Revision: http://reviews.llvm.org/D11370 llvm-svn: 248735	2015-09-28 20:33:22 +00:00
Hal Finkel	bd582581b8	[DAGCombine] Fix getStoreMergeAndAliasCandidates's AA-enabled chain walking When AA is being used, non-aliasing stores are canonicalized to use the same chain, and DAGCombiner::getStoreMergeAndAliasCandidates can take advantage of this by looking only at users of a store's chain operand. However, because user iteration is not result-number specific, we need to check that the use is as a chain operand, and not via some other operand. It is certainly possible to have another potentially-aliasing store, which shares the first's base pointer, and uses the first's chain's node via some other operand. Failure to catch this situation caused, at least in the included test case, an assert later because the relative sequence-number ordering caused later replacement to create a cycle in the DAG. llvm-svn: 248698	2015-09-28 08:02:14 +00:00
Craig Topper	862d5d8322	Remove 'const' from some ArrayRefs. ArrayRefs are already immutable. NFC llvm-svn: 248693	2015-09-28 00:15:34 +00:00
Joseph Tremoulet	09af67aba5	[EH] Create removeUnwindEdge utility Summary: Factor the code that rewrites invokes to calls and rewrites WinEH terminators to their "unwind to caller" equivalents into a helper in Utils/Local, and use it in the three places I'm aware of that need to do this. Reviewers: andrew.w.kaylor, majnemer, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13152 llvm-svn: 248677	2015-09-27 01:47:46 +00:00
Matthias Braun	93ab942c24	LivePhysRegs: Fix live-outs of return blocks I realized that the live-out set computed for the return block is missing the callee saved registers (the non-pristine ones to be exact). This only affects the liveness computed for instructions inside the function epilogue which currently none of the LivePhysRegs users in llvm cares about, so this is just a drive-by fix without a testcase. Differential Revision: http://reviews.llvm.org/D13180 llvm-svn: 248636	2015-09-25 23:50:53 +00:00
Matthias Braun	a3b701f828	SelectionDAGDumper: Print simple operands inline. Print simple operands inline instead of their pointer/value number. Simple operands are SDNodes without predecessors like Constant(FP), Register, UNDEF. This unifies the behaviour with dumpr() which was already doing this. Previously: t0: ch = EntryToken t1: i64 = Register %vreg0 t2: i64,ch = CopyFromReg t0, t1 t3: i64 = Constant<1> t4: i64 = add t2, t3 t5: i64 = Constant<2> t6: i64 = add t2, t5 t10: i64 = undef t11: i8,ch = load t0, t2, t10<LD1[%tmp81]> t12: i8,ch = load t0, t4, t10<LD1[%tmp10]> t13: i8,ch = load t0, t6, t10<LD1[%tmp12]> Now: t0: ch = EntryToken t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0 t4: i64 = add t2, Constant:i64<1> t6: i64 = add t2, Constant:i64<2> t11: i8,ch = load<LD1[%tmp81]> t0, t2, undef:i64 t12: i8,ch = load<LD1[%tmp10]> t0, t4, undef:i64 t13: i8,ch = load<LD1[%tmp12]> t0, t6, undef:i64 Differential Revision: http://reviews.llvm.org/D12567 llvm-svn: 248628	2015-09-25 22:27:02 +00:00
Matt Arsenault	3c07e963b8	DAGCombiner: Check if store is volatile first This is the simpler check. NFC. llvm-svn: 248625	2015-09-25 22:06:19 +00:00
Matthias Braun	c804cdb912	TargetRegisterInfo: Introduce PrintLaneMask. This makes it more convenient to print lane masks and leads to more uniform printing. llvm-svn: 248624	2015-09-25 21:51:24 +00:00
Matthias Braun	e6a2485e1a	TargetRegisterInfo: Add typedef unsigned LaneBitmask and use it where appropriate; NFC llvm-svn: 248623	2015-09-25 21:51:14 +00:00
Sanjay Patel	bbbf9a1a34	merge vector stores into wider vector stores and fix AArch64 misaligned access TLI hook (PR21711) This is a redo of D7208 ( r227242 - http://llvm.org/viewvc/llvm-project?view=revision&revision=227242 ). The patch was reverted because an AArch64 target could infinite loop after the change in DAGCombiner to merge vector stores. That happened because AArch64's allowsMisalignedMemoryAccesses() wasn't telling the truth. It reported all unaligned memory accesses as fast, but then split some 128-bit unaligned accesses up in performSTORECombine() because they are slow. This patch attempts to fix the problem in AArch64's allowsMisalignedMemoryAccesses() while preserving existing (perhaps questionable) lowering behavior. The x86 test shows that store merging is working as intended for a target with fast 32-byte unaligned stores. Differential Revision: http://reviews.llvm.org/D12635 llvm-svn: 248622	2015-09-25 21:49:48 +00:00
Matthias Braun	e86bbd8979	PrologueEpilogInserter: Fix missing live-ins when savepoint equals restorepoint The algorithm would not modify the live-in list of blocks below the save block point which is correct unless it happens to be a restore point at the same time. Also fixes the benign issue of live-in registers being added twice in some cases. The testcase is based on a test submitted by Kit Barton. Differential Revision: http://reviews.llvm.org/D13176 llvm-svn: 248620	2015-09-25 21:41:40 +00:00
Matthias Braun	c2d4befb54	MachineBasicBlock: Factor out common code into isReturnBlock() llvm-svn: 248617	2015-09-25 21:25:19 +00:00
Matt Arsenault	10aa807856	PeepholeOptimizer: Remove redundant copies If a virtual register is copied and another copy was already seen, replace with the previous copy. This only handles the simplest cases for now. This pattern shows up from various operand restrictions AMDGPU has which require inserting copies depending on the register class of the operands. llvm-svn: 248611	2015-09-25 20:22:12 +00:00
Chad Rosier	d9f102b464	Simplify code. NFC. llvm-svn: 248610	2015-09-25 20:20:22 +00:00
Matt Arsenault	50f0a42b66	Fix typo llvm-svn: 248549	2015-09-24 22:36:49 +00:00
Mohammad Shahid	13f1dfdf2e	Codegen: Fix llvm.absdiff semantic. Fixes the overflow case of the llvm.absdiff intrinsic and updates the tests and LangRef.rst accordingly. Differential Revision: http://reviews.llvm.org/D11678 llvm-svn: 248483	2015-09-24 10:35:03 +00:00
Matt Arsenault	68d938649e	Introduce target hook for optimizing register copies Allow a target to do something other than search for copies that will avoid cross register bank copies. Implement for SI by only rewriting the most basic copies, so it should look through anything like a subregister extract. I'm not entirely satisfied with this because it seems like eliminating a reg_sequence that isn't fully used should work generically for all targets without them having to override something. However, it seems to be tricky to have a simple implementation of this without rewriting to invalid kinds of subregister copies on some targets. I'm not sure if there is currently a generic way to easily check if a subregister index would be valid for the current use. The current set of TargetRegisterInfo::get*Class functions don't quite behave like I would expect (e.g. getSubClassWithSubReg returns the maximal register class rather than the minimal), so I'm not sure how to make the generic test keep searching if SrcRC:SrcSubReg is a valid replacement for DefRC:DefSubReg. Making the default implementation to check for simple copies breaks a variety of ARM and x86 tests by producing illegal subregister uses. The ARM tests are not actually changed since it should still be using the same sharesSameRegisterFile implementation, this just relaxes them to not check for specific registers. llvm-svn: 248478	2015-09-24 08:36:14 +00:00
Matt Arsenault	c7ec46c3aa	Remove dead declaration llvm-svn: 248471	2015-09-24 07:51:12 +00:00
Matt Arsenault	c721df0478	Use new TokenFactor chain when merging stores If the stores are storing values from loads which partially alias the stores, we could end up placing the merged loads and stores on the same chain which has the potential to break. Each store may have a different chain dependency on only some of the original loads. Create a new TokenFactor to capture all of the required dependencies of the stores rather than assuming all stores can use the same chain. The testcase is a situation where this happens, although it does not have an observable change from this. The DAG nodes just happened to not be reordered before despite this missing chain dependency. This is based on an off-list report for an out of tree target which regressed due to r246307 and I haven't managed to find a case where the nodes do end up reordered with an in tree target. llvm-svn: 248468	2015-09-24 07:22:38 +00:00
Evgeniy Stepanov	a2002b08f7	Android support for SafeStack. Add two new ways of accessing the unsafe stack pointer: * At a fixed offset from the thread TLS base. This is very similar to StackProtector cookies, but we plan to extend it to other backends (ARM in particular) soon. Bionic-side implementation here: https://android-review.googlesource.com/170988. * Via a function call, as a fallback for platforms that provide neither a fixed TLS slot, nor a reasonable TLS implementation (i.e. not emutls). This is a re-commit of a change in r248357 that was reverted in r248358. llvm-svn: 248405	2015-09-23 18:07:56 +00:00
Evgeniy Stepanov	8d0e3011d8	Revert "Android support for SafeStack." test/Transforms/SafeStack/abi.ll breaks when target is not supported; needs refactoring. llvm-svn: 248358	2015-09-23 01:23:22 +00:00
Evgeniy Stepanov	ce2e16f00c	Android support for SafeStack. Add two new ways of accessing the unsafe stack pointer: * At a fixed offset from the thread TLS base. This is very similar to StackProtector cookies, but we plan to extend it to other backends (ARM in particular) soon. Bionic-side implementation here: https://android-review.googlesource.com/170988. * Via a function call, as a fallback for platforms that provide neither a fixed TLS slot, nor a reasonable TLS implementation (i.e. not emutls). llvm-svn: 248357	2015-09-23 01:03:51 +00:00
Cong Hou	9def6efd7e	Fixed an issue on updating profile data when lowering switch statement. Fixed the issue that when there is an edge from the jump table to the default statement, we should check it directly instead of checking if the sibling of the jump table header is a successor of the jump table header, which may not be the default statement but a successor of it. llvm-svn: 248354	2015-09-23 00:20:27 +00:00
Adrian Prantl	77fefeba37	Debug Info: Emit the dwo_name only in skeleton CUs, not in DWOs. llvm-svn: 248340	2015-09-22 23:21:00 +00:00
Matthias Braun	73e4221e6c	LiveIntervalAnalysis: Avoid multiple connected liveness components We may have subregister defs which are unused but not discovered and cleaned up prior to liveness analysis. This creates multiple connected components in the resulting live range which are forbidden in the MachineVerifier because they would unnecessarily constrain the register allocator. Rewrite those dead definitions to define a newly created virtual register. Differential Revision: http://reviews.llvm.org/D13035 llvm-svn: 248335	2015-09-22 22:37:44 +00:00
Matthias Braun	5efe871971	LiveInterval: Distribute subregister liveranges to new intervals in ConnectedVNInfoEqClasses::Distribute() This improves ConnectedVNInfoEqClasses::Distribute() to distribute the segments and value numbers in the subranges instead of conservatively clearing all subregister info. No separate test here, just clearing the subrange instead of properly distributing them would however break my upcoming fix regarding dead super register definitions. Differential Revision: http://reviews.llvm.org/D13075 llvm-svn: 248334	2015-09-22 22:37:42 +00:00
Ahmed Bougacha	07a844d758	[AArch64] Emit clrex in the expanded cmpxchg fail block. In the comparison failure block of a cmpxchg expansion, the initial ldrex/ldxr will not be followed by a matching strex/stxr. On ARM/AArch64, this unnecessarily ties up the execution monitor, which might have a negative performance impact on some uarchs. Instead, release the monitor in the failure block. The clrex instruction was designed for this: use it. Also see ARMARM v8-A B2.10.2: "Exclusive access instructions and Shareable memory locations". Differential Revision: http://reviews.llvm.org/D13033 llvm-svn: 248291	2015-09-22 17:21:44 +00:00
Benjamin Kramer	3c96f0a54e	Make helper function static. NFC. llvm-svn: 248278	2015-09-22 14:34:57 +00:00
NAKAMURA Takumi	0a7d0ad95f	Untabify. llvm-svn: 248264	2015-09-22 11:15:07 +00:00
NAKAMURA Takumi	a9cb538a74	Reformat blank lines. llvm-svn: 248263	2015-09-22 11:14:39 +00:00
NAKAMURA Takumi	84965031a7	Reformat comment lines. llvm-svn: 248262	2015-09-22 11:14:12 +00:00
NAKAMURA Takumi	70ad98aca4	Reformat. llvm-svn: 248261	2015-09-22 11:13:55 +00:00
Matthias Braun	d3dd1354a4	LiveIntervalAnalysis: Factor common code into splitSeparateComponents; NFC llvm-svn: 248241	2015-09-22 03:44:41 +00:00
Sanjay Patel	fc580a60e2	function names should start with a lower case letter; NFC llvm-svn: 248224	2015-09-21 23:03:16 +00:00
Sanjay Patel	4ac6b115e8	don't repeat function/variable names in header comments; NFC llvm-svn: 248222	2015-09-21 22:47:23 +00:00
Simon Pilgrim	4003ed2da3	[DAGCombiner] Improve FMA support for interpolation patterns This patch adds support for combining patterns such as (FMUL(FADD(1.0, x), y)) and (FMUL(FSUB(x, 1.0), y)) to their FMA equivalents. This is useful in particular for linear interpolation cases such as (FADD(FMUL(x, t), FMUL(y, FSUB(1.0, t)))) Differential Revision: http://reviews.llvm.org/D13003 llvm-svn: 248210	2015-09-21 20:32:48 +00:00
Simon Pilgrim	e8e5a17a12	[DAGCombiner] Tidy up FMA combine helpers. NFCI. Based on feedback for D13003. llvm-svn: 248206	2015-09-21 20:15:03 +00:00
Stephen Canon	b12db0e42c	Remove roundingMode argument in APFloat::mod Because mod is always exact, this function should have never taken a rounding mode argument. The actual implementation still has issues, which I'll look at resolving in a subsequent patch. llvm-svn: 248195	2015-09-21 19:29:25 +00:00
Matt Arsenault	8fb9b94f7f	Fix accidentally committed debug printing llvm-svn: 248190	2015-09-21 18:21:10 +00:00
Matthias Braun	b9fe44ddb0	SelectionDAG: Use InsertNode for EntryNode This fixes problems where two nodes have persistent debug id 0 assigned. llvm-svn: 248182	2015-09-21 17:41:05 +00:00
Matt Arsenault	b774834429	DAGCombiner: Replace store of FP constant after attempting store merges If storing multiple FP constants, some subset of the stores would be replaced with integers due to visit order, so MergeConsecutiveStores would only partially merge these. llvm-svn: 248169	2015-09-21 15:59:46 +00:00
Matt Arsenault	a30ddb6524	Factor replacement of stores of FP constants into new function llvm-svn: 248168	2015-09-21 15:59:43 +00:00
Chad Rosier	03a47305ec	[Machine Combiner] Refactor machine reassociation code to be target-independent. No functional change intended. Patch by Haicheng Wu <haicheng@codeaurora.org>! http://reviews.llvm.org/D12887 PR24522 llvm-svn: 248164	2015-09-21 15:09:11 +00:00
Craig Topper	0013be16ff	Use makeArrayRef or None to avoid unnecessarily mentioning the ArrayRef type extra times. NFC llvm-svn: 248140	2015-09-21 05:32:41 +00:00
Maksim Panchenko	0510cd5161	[PrologEpilogInserter] Minor refactoring. Differential Revision: http://reviews.llvm.org/D12924 llvm-svn: 248084	2015-09-19 04:42:15 +00:00
Maksim Panchenko	07b754daf8	Test commit. Fix comment. NFC. llvm-svn: 248082	2015-09-19 04:01:19 +00:00
Cong Hou	d40105d321	Update edge weights properly when merging blocks in if-conversion. In if-conversion, there is a utility function MergeBlocks() that is used to merge blocks. However, when new edges are built in this function the edge weight is either not provided or not updated properly, leading to a modified CFG with incorrect edge weights. This patch corrects this issue. Differential Revision: http://reviews.llvm.org/D12513 llvm-svn: 248030	2015-09-18 20:22:41 +00:00
James Y Knight	e72b0dbf97	Make MachineScheduler debug output less confusing. At least...a little bit. llvm-svn: 248020	2015-09-18 18:52:20 +00:00
Matthias Braun	77771cfd97	SelectionDAGDumper: Leave out the <multiple use> markers They mostly clutter the output while it is still possible to see which node has multiple users without them. Differential Revision: http://reviews.llvm.org/D12569 llvm-svn: 248013	2015-09-18 17:57:33 +00:00
Matthias Braun	bab3fb45e5	SelectionDAGDumper: Avoid unnecessary newlines Before: t0 = EntryToken:ch t0: <multiple use> t0: <multiple use> t1 = CopyFromReg:v4f32,ch t0, Register:v4f32 %vreg0 t25 = IMPLICIT_DEF:v4f32 t26 = HADDPSrr:v4f32 t1, t25 t23 = CopyToReg:ch,glue t0, Register:v4f32 %XMM0, t26 t23: <multiple use> t23: <multiple use> t24 = RETQ:ch Register:v4f32 %XMM0, t23, t23:1 After: t0: <multiple use> t0: <multiple use> t1 = CopyFromReg:v4f32,ch t0, Register:v4f32 %vreg0 t26 = X86ISD::FHADD:v4f32 t1, undef:v4f32 t23 = CopyToReg:ch,glue t0, Register:v4f32 %XMM0, t26 t23: <multiple use> t21 = TargetConstant:i16<0> t23: <multiple use> t24 = X86ISD::RET_FLAG:ch t23, t21, Register:v4f32 %XMM0, t23:1 Differential Revision: http://reviews.llvm.org/D12568 llvm-svn: 248012	2015-09-18 17:57:31 +00:00
Matthias Braun	f89b7c7188	SelectionDAGDumper: Hide [ID=X], [ORD=X] and source locations by default. You can show them with the new -dag-dump-verbose switch. Differential Revision: http://reviews.llvm.org/D12566 llvm-svn: 248011	2015-09-18 17:57:28 +00:00
Matthias Braun	0b7d6c14c9	SelectionDAG: Introduce PersistentID to SDNode for assert builds. This gives us more human readable numbers to identify nodes in debug dumps. Before: 0x7fcbd9700160: ch = EntryToken 0x7fcbd985c7c8: i64 = Register %RAX ... 0x7fcbd9700160: <multiple use> 0x7fcbd985c578: i64,ch = MOV64rm 0x7fcbd985c6a0, 0x7fcbd985cc68, 0x7fcbd985c200, 0x7fcbd985cd90, 0x7fcbd985ceb8, 0x7fcbd9700160<Mem:LD8[@foo]> [ORD=2] 0x7fcbd985c8f0: ch,glue = CopyToReg 0x7fcbd9700160, 0x7fcbd985c7c8, 0x7fcbd985c578 [ORD=3] 0x7fcbd985c7c8: <multiple use> 0x7fcbd985c8f0: <multiple use> 0x7fcbd985c8f0: <multiple use> 0x7fcbd985ca18: ch = RETQ 0x7fcbd985c7c8, 0x7fcbd985c8f0, 0x7fcbd985c8f0:1 [ORD=3] Now: t0: ch = EntryToken t5: i64 = Register %RAX ... t0: <multiple use> t3: i64,ch = MOV64rm t10, t12, t11, t13, t14, t0<Mem:LD8[@foo]> [ORD=2] t6: ch,glue = CopyToReg t0, t5, t3 [ORD=3] t5: <multiple use> t6: <multiple use> t6: <multiple use> t7: ch = RETQ t5, t6, t6:1 [ORD=3] Differential Revision: http://reviews.llvm.org/D12564 llvm-svn: 248010	2015-09-18 17:41:00 +00:00
David Majnemer	9966fe8f85	[WinEH] Moved funclet pads should be in relative order We shifted the MachineBasicBlocks to the end of the MachineFunction in DFS order. This will not ensure that MachineBasicBlocks which fell through to one another will remain contiguous. Instead, implement a stable sort algorithm for iplist. This partially reverts commit r214150. llvm-svn: 247978	2015-09-18 08:18:07 +00:00
Bob Wilson	dd0eadce7d	Whitespace. Indent with spaces instead of a tab. llvm-svn: 247969	2015-09-18 05:36:13 +00:00
Quentin Colombet	b4c6886215	[ShrinkWrap] Refactor the handling of infinite loop in the analysis. - Strengthen the logic to be sure we hoist the restore point out of the current loop. (This fixes a bug with infinite loop, added as part of the patch.) - Walk over the exit blocks of the current loop to converge to the desired restore point in one iteration of the update loop. llvm-svn: 247958	2015-09-17 23:21:34 +00:00
Matthias Braun	3e86de1acb	Revert "(HEAD -> master, origin/master, origin/HEAD) RegisterPressure: Move LiveInRegs/LiveOutRegs from RegisterPressure to PressureTracker" This reverts commit r247943. Accidental commit, code review was not finished yet. llvm-svn: 247945	2015-09-17 21:12:24 +00:00
Matthias Braun	70eff2571f	RegisterPressure: Move LiveInRegs/LiveOutRegs from RegisterPressure to PressureTracker Differential Revision: http://reviews.llvm.org/D12814 llvm-svn: 247943	2015-09-17 21:10:06 +00:00
Matthias Braun	d78ee54a54	MachineScheduler: Provide an option for node hiding cutoff and disable it by default llvm-svn: 247942	2015-09-17 21:09:59 +00:00
David Majnemer	978902309a	[WinEH] Add a funclet layout pass Windows EH funclets need to be contiguous. The FuncletLayout pass will ensure that the funclets are together and begin with a funclet entry MBB. Differential Revision: http://reviews.llvm.org/D12943 llvm-svn: 247937	2015-09-17 20:45:18 +00:00
Piotr Padlewski	ea09288ee7	Added MD_invariant_group to LLVMContext http://reviews.llvm.org/D12926 llvm-svn: 247931	2015-09-17 20:25:07 +00:00
Reid Kleckner	ed17079b52	[WinEH] Add and use hasEHPadSuccessor instead of getLandingPadSuccessor getLandingPadSuccessor assumes that each invoke can have at most one EH pad successor, but WinEH invokes can have more than one. Two out of three callers of getLandingPadSuccessor don't use the returned landingpad, so we can make them use this simple predicate instead. Eventually we'll have to circle back and fix SplitKit.cpp so that register allocation works. Baby steps. llvm-svn: 247904	2015-09-17 17:19:40 +00:00
Zia Ansari	841cce1ae9	Test commit. llvm-svn: 247901	2015-09-17 16:51:27 +00:00
Eric Christopher	c7b155f670	Use the cached TargetInstrInfo instead of looking it up again. llvm-svn: 247865	2015-09-16 23:38:16 +00:00
Eric Christopher	a4e5d3cf8e	constify the Function parameter to the TTI creation callback and propagate to all callers/users/etc. llvm-svn: 247864	2015-09-16 23:38:13 +00:00
Reid Kleckner	813f1b65bc	[WinEH] Rip out the landingpad-based C++ EH state numbering code It never really worked, and the new code is working better every day. llvm-svn: 247860	2015-09-16 22:14:46 +00:00
David Majnemer	67bff0d88b	[WinEHPrepare] Turn terminatepad into a cleanuppad + call + cleanupret The MSVC doesn't really support exception specifications so let's just turn these into cleanuppads. Later, we might use terminatepad to more efficiently encode the "noexcept"-ness of a function body. llvm-svn: 247848	2015-09-16 20:42:16 +00:00
Reid Kleckner	b005d281c3	[WinEH] Pull Adjectives and CatchObj out of the catchpad arg list Clang now passes the adjectives as an argument to catchpad. Getting the CatchObj working is simply a matter of threading another static alloca through codegen, first as an alloca, then as a frame index, and finally as a frame offset. llvm-svn: 247844	2015-09-16 20:16:27 +00:00
David Majnemer	459a64aed7	[WinEHPrepare] Provide a cloning mode which doesn't demote We are experimenting with a new approach to saving and restoring SSA values used across funclets: let the register allocator do the dirty work for us. However, this means that we need to be able to clone commoned blocks without relying on demotion. llvm-svn: 247835	2015-09-16 18:40:37 +00:00
David Majnemer	b3d9b960ea	[WinEHPrepare] Refactor explicit EH preparation Split the preparation machinery into several functions, we will want to selectively enable/disable different parts of it for an alternative mechanism for dealing with cross-funclet uses. llvm-svn: 247834	2015-09-16 18:40:24 +00:00
Sanjay Patel	a260701bbb	propagate fast-math-flags on DAG nodes After D10403, we had FMF in the DAG but disabled by default. Nick reported no crashing errors after some stress testing, so I enabled them at r243687. However, Escha soon notified us of a bug not covered by any in-tree regression tests: if we don't propagate the flags, we may fail to CSE DAG nodes because differing FMF causes them to not match. There is one test case in this patch to prove that point. This patch hopes to fix or leave a 'TODO' for all of the in-tree places where we create nodes that are FMF-capable. I did this by putting an assert in SelectionDAG.getNode() to find any FMF-capable node that was being created without FMF ( D11807 ). I then ran all regression tests and test-suite and confirmed that everything passes. This patch exposes remaining work to get DAG FMF to be fully functional: (1) add the flags to non-binary nodes such as FCMP, FMA and FNEG; (2) add the flags to intrinsics; (3) use the flags as conditions for transforms rather than the current global settings. Differential Revision: http://reviews.llvm.org/D12095 llvm-svn: 247815	2015-09-16 16:31:21 +00:00
Michael Kuperstein	098cd9fba7	[X86] Fix emitEpilogue() to make less assumptions about pops This is the mirror image of r242395. When X86FrameLowering::emitEpilogue() looks for where to insert the %esp addition that deallocates stack space used for local allocations, it assumes that any sequence of pop instructions from function exit backwards consists purely of restoring callee-save registers. This may be false, since from some point backward, the pops may be clean-up of stack space allocated for arguments to a call. Patch by: amjad.aboud@intel.com Differential Revision: http://reviews.llvm.org/D12688 llvm-svn: 247784	2015-09-16 11:18:25 +00:00
Craig Topper	5db36df4d0	Use range-based for loops. NFC llvm-svn: 247772	2015-09-16 03:52:35 +00:00
Craig Topper	77ec077067	Fix a spelling error in the description of a statistic. NFC llvm-svn: 247771	2015-09-16 03:52:32 +00:00
Piotr Padlewski	6c15ec49ed	Introducing llvm.invariant.group.barrier intrinsic For more information on why it was introduced, see: http://lists.llvm.org/pipermail/cfe-dev/2015-July/044227.html invariant.group.barrier: http://reviews.llvm.org/D12310 docs: http://reviews.llvm.org/D11399 CodeGenPrepare: http://reviews.llvm.org/D12875 llvm-svn: 247711	2015-09-15 18:32:14 +00:00
Quentin Colombet	dc29c973e5	[ShrinkWrapping] Fix an infinite loop while looking for restore point. This may happen when the input program itself contains an infinite loop with no exit block. In that case, we would fail to find a block post-dominating the loop such that this block is outside of the loop. This fixes PR24823. Working on reducing the test case. llvm-svn: 247710	2015-09-15 18:19:39 +00:00
Daniel Sanders	50f17235dd	Revert r247692: Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC. Eric has replied and has demanded the patch be reverted. llvm-svn: 247702	2015-09-15 16:17:27 +00:00
Daniel Sanders	153010c52d	Re-commit r247683: Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC. Summary: This is the first patch in the series to migrate Triple's (which are ambiguous) to TargetTuple's (which aren't). For the moment, TargetTuple simply passes all requests to the Triple object it holds. Once it has replaced Triple, it will start to implement the interface in a more suitable way. This change makes some changes to the public C++ API. In particular, InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer() now take TargetTuples instead of Triples. The other public C++ API's have been left as-is for the moment to reduce patch size. This commit also contains a trivial patch to clang to account for the C++ API change. Thanks go to Pavel Labath for fixing LLDB for me. Reviewers: rengolin Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10969 llvm-svn: 247692	2015-09-15 14:08:28 +00:00
Daniel Sanders	c40de48041	Revert r247684 - Replace Triple with a new TargetTuple ... LLDB needs to be updated in the same commit. llvm-svn: 247686	2015-09-15 13:46:21 +00:00
Daniel Sanders	18d4b0dab7	Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC. Summary: This is the first patch in the series to migrate Triple's (which are ambiguous) to TargetTuple's (which aren't). For the moment, TargetTuple simply passes all requests to the Triple object it holds. Once it has replaced Triple, it will start to implement the interface in a more suitable way. This change makes some changes to the public C++ API. In particular, InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer() now take TargetTuples instead of Triples. The other public C++ API's have been left as-is for the moment to reduce patch size. This commit also contains a trivial patch to clang to account for the C++ API change. Reviewers: rengolin Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10969 llvm-svn: 247683	2015-09-15 13:17:40 +00:00
Adrian Prantl	deef90d7f5	DwarfDebug: Emit dwo_id+dwo_name for DICompileUnits that provide a dwoId. For module debugging clang emits prefabricated skeleton compile units that can be recognized by a nonzero dwoId. llvm-svn: 247626	2015-09-14 22:10:22 +00:00
David Blaikie	6614d8d230	[opaque pointer types] Switch a few cases of getElementType over, since I had them lying around anyway llvm-svn: 247610	2015-09-14 20:29:26 +00:00
Matthias Braun	3f3934b010	RegisterPressure: Simplify close{Top|Bottom}() - There are no duplicate registers in LiveRegs list we are copying from and so we do not need to sort the registers. - Simply use SmallVector::append instead of a loop between begin() and end() with push_back(). Differential Revision: http://reviews.llvm.org/D12813 llvm-svn: 247588	2015-09-14 18:24:15 +00:00
David Blaikie	16a2f3e302	Revert "[opaque pointer type] Pass GlobalAlias the actual pointer type rather than decomposing it into pointee type + address space" This was a flawed change - it just caused the getElementType call to be deferred until later, when we really need to remove it. Now that the IR for GlobalAliases has been updated, the root cause is addressed that way instead and this change is no longer needed (and in fact gets in the way - because we want to pass the pointee type directly down further). Follow up patches to push this through GlobalValue, bitcode format, etc, will come along soon. This reverts commit 236160. llvm-svn: 247585	2015-09-14 18:01:59 +00:00
Ahmed Bougacha	49b531a08d	[CodeGen] Fix AtomicExpand invalidation issue caused by r247429. llvm-svn: 247514	2015-09-12 18:51:23 +00:00
Bruce Mitchener	e9ffb45b60	Fix typos. Summary: This fixes a variety of typos in docs, code and headers. Subscribers: jholewinski, sanjoy, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D12626 llvm-svn: 247495	2015-09-12 01:17:08 +00:00
Akira Hatanaka	bc497c93f5	Use function attribute "stackrealign" to decide whether stack realignment should be forced. With this commit, we can now force stack realignment when doing LTO and do so on a per-function basis. Also, add a new cl::opt option "stackrealign" to CommandFlags.h which is used to force stack realignment via llc's command line. Out-of-tree projects currently using -force-align-stack to force stack realignment should make changes to attach the attribute to the functions in the IR. Differential Revision: http://reviews.llvm.org/D11814 llvm-svn: 247450	2015-09-11 18:54:38 +00:00
David Majnemer	0e70598a5b	[X86] Make sure startproc/endproc are paired We used different conditions to determine if we should emit startproc vs endproc. Use the same condition to ensure that they will always be paired. This fixes PR24374. llvm-svn: 247435	2015-09-11 17:34:34 +00:00
Ahmed Bougacha	5246867384	[CodeGen] Refactor TLI/AtomicExpand interface to make LLSC explicit. We used to have this magic "hasLoadLinkedStoreConditional()" callback, which really meant two things: - expand cmpxchg (to ll/sc). - expand atomic loads using ll/sc (rather than cmpxchg). Remove it, and, instead, introduce explicit callbacks: - bool shouldExpandAtomicCmpXchgInIR(inst) - AtomicExpansionKind shouldExpandAtomicLoadInIR(inst) Differential Revision: http://reviews.llvm.org/D12557 llvm-svn: 247429	2015-09-11 17:08:28 +00:00
Ahmed Bougacha	9d677131c4	[CodeGen] Rename AtomicRMWExpansionKind to AtomicExpansionKind. This lets us generalize its usage to the other atomic instructions. llvm-svn: 247428	2015-09-11 17:08:17 +00:00
Cong Hou	c536bd9e73	Pass BranchProbability/BlockMass by value instead of const& as they are small. NFC. llvm-svn: 247357	2015-09-10 23:10:42 +00:00
Reid Kleckner	7bb20bd69e	Fix SEH state numbering algorithm to handle cleanupendpads WinEHPrepare's new coloring algorithm really expects to see cleanupendpads now, so Clang will start emitting them soon. llvm-svn: 247341	2015-09-10 21:46:36 +00:00
Chandler Carruth	2e4ca848f4	Add an explicit 'inline' specifier to these static functions. GCC is warning on them having always_inline attribute for reasons I don't fully understand -- static functions are just as inlinable as inline functions in terms of linkage. llvm-svn: 247334	2015-09-10 20:34:57 +00:00
Adrian Prantl	d209500fd5	Debug Info: Allow a DIModule to appear as the scope of other entities. llvm-svn: 247304	2015-09-10 17:13:58 +00:00
Joseph Tremoulet	f3aff31401	[WinEH] Fix single-block cleanup coloring Summary: The coloring code in WinEHPrepare queues cleanuprets' successors with the correct color (the parent one) when it sees their cleanuppad, and so later when iterating successors knows to skip processing cleanuprets since they've already been queued. This latter check was incorrectly under an 'else' condition and so inadvertently was not kicking in for single-block cleanups. This change sinks the check out of the 'else' to fix the bug. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12751 llvm-svn: 247299	2015-09-10 16:51:25 +00:00
Hans Wennborg	aa15bffa1f	Re-commit r247216: "Fix Clang-tidy misc-use-override warnings, other minor fixes" Except the changes that defined virtual destructors as =default, because that ran into problems with GCC 4.7 and overriding methods that weren't noexcept. llvm-svn: 247298	2015-09-10 16:49:58 +00:00
Alex Lorenz	0153e59935	Fix PR 24724 - The implicit register verifier shouldn't assume certain operand order. The implicit register verifier in the MIR parser should only check if the instruction's default implicit operands are present in the instruction. It should not check the order in which they occur. llvm-svn: 247283	2015-09-10 14:04:34 +00:00
Silviu Baranga	df9ce8408a	[DAGCombine] Truncate BUILD_VECTOR operators if necessary when constant folding vectors Summary: The BUILD_VECTOR node will truncate its operators to match the type. We need to take this into account when constant folding - we need to perform a truncation before constant folding the elements. This is because the upper bits can change the result, depending on the operation type (for example this is the case for min/max). This change also adds a regression test. Reviewers: jmolloy Subscribers: jmolloy, llvm-commits Differential Revision: http://reviews.llvm.org/D12697 llvm-svn: 247265	2015-09-10 10:34:34 +00:00
Hans Wennborg	d2799a963f	Revert r247216: "Fix Clang-tidy misc-use-override warnings, other minor fixes" This caused build breakages, e.g. http://lab.llvm.org:8011/builders/clang-x86_64-ubuntu-gdb-75/builds/24926 llvm-svn: 247226	2015-09-10 00:57:26 +00:00
Reid Kleckner	7878391208	[WinEH] Add codegen support for cleanuppad and cleanupret All of the complexity is in cleanupret, and it mostly follows the same codepaths as catchret, except it doesn't take a return value in RAX. This small example now compiles and executes successfully on win32: extern "C" int printf(const char *, ...) noexcept; struct Dtor { ~Dtor() { printf("~Dtor\n"); } }; void has_cleanup() { Dtor o; throw 42; } int main() { try { has_cleanup(); } catch (int) { printf("caught it\n"); } } Don't try to put the cleanup in the same function as the catch, or Bad Things will happen. llvm-svn: 247219	2015-09-10 00:25:23 +00:00
Hans Wennborg	6fa09455ed	Fix Clang-tidy misc-use-override warnings, other minor fixes Patch by Eugene Zelenko! Differential Revision: http://reviews.llvm.org/D12740 llvm-svn: 247216	2015-09-10 00:12:56 +00:00
Reid Kleckner	94b704c469	[SEH] Emit 32-bit SEH tables for the new EH IR The 32-bit tables don't actually contain PC range data, so emitting them is incredibly simple. The 64-bit tables, on the other hand, use the same table for state numbering as well as label ranges. This makes things more difficult, so it will be implemented later. llvm-svn: 247192	2015-09-09 21:10:03 +00:00
Matthias Braun	d9da162789	Save LaneMask with livein registers With subregister liveness enabled we can detect the case where only parts of a register are live in, this is expressed as a 32bit lanemask. The current code only keeps registers in the live-in list and therefore enumerated all subregisters affected by the lanemask. This turned out to be too conservative as the subregister may also cover additional parts of the lanemask which are not live. Expressing a given lanemask by enumerating a minimum set of subregisters is computationally expensive so the best solution is to simply change the live-in list to store the lanemasks as well. This will reduce memory usage for targets using subregister liveness and slightly increase it for other targets Differential Revision: http://reviews.llvm.org/D12442 llvm-svn: 247171	2015-09-09 18:08:03 +00:00
Matthias Braun	cc58005885	VirtRegMap: Improve addMBBLiveIns() using SlotIndex::MBBIndexIterator; NFC Now that we have an explicit iterator over the idx2MBBMap in SlotIndices we can use the fact that segments and the idx2MBBMap is sorted by SlotIndex position so can advance both simultaneously instead of starting from the beginning for each segment. This complicates the code for the subregister case somewhat but should be more efficient and has the advantage that we get the final lanemask for each block immediately which will be important for a subsequent change. Removes the now unused SlotIndexes::findMBBLiveIns function. Differential Revision: http://reviews.llvm.org/D12443 llvm-svn: 247170	2015-09-09 18:07:54 +00:00

19981 Commits