llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	5d7ece8895	bar llvm-svn: 299619	2017-04-06 04:02:31 +00:00
Craig Topper	faf5a8553c	foo llvm-svn: 299618	2017-04-06 04:02:28 +00:00
Adam Nemet	d5ffdd3605	[DAGCombine] Support FMF contract in fused multiple-and-sub too This is a follow-on to r299096 which added support for fmadd. Subtract does not have the case where with two multiply operands we commute in order to fuse with the multiply with the fewer uses. llvm-svn: 299572	2017-04-05 17:58:48 +00:00
Adam Nemet	99e347fc35	[DAGCombine] Remove commented-out code from r299096 llvm-svn: 299571	2017-04-05 17:58:44 +00:00
Keno Fischer	4ecee77c9a	[ExecutionDepsFix] Don't recurse over the CFG Summary: Use an explicit work queue instead, to avoid accidentally causing stack overflows for input with very large CFGs. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D31681 llvm-svn: 299569	2017-04-05 17:42:56 +00:00
Sanjay Patel	b2f1621bb1	[DAGCombiner] add and use TLI hook to convert and-of-seteq / or-of-setne to bitwise logic+setcc (PR32401) This is a generic combine enabled via target hook to reduce icmp logic as discussed in: https://bugs.llvm.org/show_bug.cgi?id=32401 It's likely that other targets will want to enable this hook for scalar transforms, and there are probably other patterns that can use bitwise logic to reduce comparisons. Note that we are missing an IR canonicalization for these patterns, and we will probably prefer the pair-of-compares form in IR (shorter, more likely to fold). Differential Revision: https://reviews.llvm.org/D31483 llvm-svn: 299542	2017-04-05 14:09:39 +00:00
Jonas Paulsson	38a2da92bc	[DAGCombiner] Don't make a BUILD_VECTOR with operands of illegal type. When DAGCombiner visits a SIGN_EXTEND_INREG of a BUILD_VECTOR with constant operands, a new BUILD_VECTOR node will be created transformed constants. Llvm-stress found a case where the new BUILD_VECTOR had constant operands of an illegal type, because the (legal) element type is in fact not a legal scalar type. This patch changes this so that the new BUILD_VECTOR has the same operand type as the old one. Review: Eli Friedman, Nirav Dave https://bugs.llvm.org//show_bug.cgi?id=32422 llvm-svn: 299540	2017-04-05 13:45:37 +00:00
Matt Arsenault	7b0d947404	Allow targets to opt-in to codegen in SCC order Decouple this setting from EnableIRPA. To support function calls on AMDGPU, it is necessary to report the global register usage throughout the kernel's call graph, so callees need to be handled first. llvm-svn: 299487	2017-04-04 23:44:46 +00:00
Keno Fischer	282c62495f	[ExecutionDepsFix] Don't revisit true dependencies If an instruction has a true dependency, it makes sense for to use that register for any undef read operands in the same instruction (we'll have to wait for that register to become available anyway). This logic was already implemented. However, the code would then still try to revisit that instruction and break the dependency (and always fail, since by definition a true dependency has to be live before the instruction). Avoid revisiting such instructions as a performance optimization. No functional change. Differential Revision: https://reviews.llvm.org/D30173 llvm-svn: 299467	2017-04-04 20:30:47 +00:00
Daniel Sanders	bee5739a7c	[tablegen][globalisel] Add support for nested instruction matching. Summary: Lift the restrictions that prevented the tree walking introduced in the previous change and add support for patterns like: (G_ADD (G_MUL (G_SEXT $src1), (G_SEXT $src2)), $src3) -> SMADDWrrr $dst, $src1, $src2, $src3 Also adds support for G_SEXT and G_ZEXT to support these cases. One particular aspect of this that I should draw attention to is that I've tried to be overly conservative in determining the safety of matches that involve non-adjacent instructions and multiple basic blocks. This is intended to be used as a cheap initial check and we may add a more expensive check in the future. The current rules are: * Reject if any instruction may load/store (we'd need to check for intervening memory operations. * Reject if any instruction has implicit operands. * Reject if any instruction has unmodelled side-effects. See isObviouslySafeToFold(). Reviewers: t.p.northover, javed.absar, qcolombet, aditya_nandakumar, ab, rovka Reviewed By: ab Subscribers: igorb, dberris, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30539 llvm-svn: 299430	2017-04-04 13:25:23 +00:00
Matt Arsenault	c82768290d	DAG: Fix missing legalization for any_extend_vector_inreg operands llvm-svn: 299389	2017-04-03 21:28:13 +00:00
Jun Bum Lim	dee5565869	[CodeGenPrep] move aarch64-type-promotion to CGP Summary: Move the aarch64-type-promotion pass within the existing type promotion framework in CGP. This change also support forking sexts when a new sext is required for promotion. Note that change is based on D27853 and I am submitting this out early to provide a better idea on D27853. Reviewers: jmolloy, mcrosier, javed.absar, qcolombet Reviewed By: qcolombet Subscribers: llvm-commits, aemerson, rengolin, mcrosier Differential Revision: https://reviews.llvm.org/D28680 llvm-svn: 299379	2017-04-03 19:20:07 +00:00
Craig Topper	3882613956	[DAGCombine][InstCombine] Fix inverted if condition in equivalent comments in DAGCombine and InstCombine. NFC llvm-svn: 299378	2017-04-03 19:18:48 +00:00
Zvi Rackover	d76a4d0ac6	Revert "[DAGCombine] A shuffle of a splat is always the splat itself" This reverts commit r299047 which is incorrect because the simplification may result in incorrect propogation of undefs to users of the folded shuffle. Thanks to Andrea Di Biagio for pointing this out. llvm-svn: 299368	2017-04-03 17:41:19 +00:00
Craig Topper	d33ee1b960	[APInt] Move isMask and isShiftedMask out of APIntOps and into the APInt class. Implement them without memory allocation for multiword This moves the isMask and isShiftedMask functions to be class methods. They now use the MathExtras.h function for single word size and leading/trailing zeros/ones or countPopulation for the multiword size. The previous implementation made multiple temorary memory allocations to do the bitwise arithmetic operations to match the MathExtras.h implementation. Differential Revision: https://reviews.llvm.org/D31565 llvm-svn: 299362	2017-04-03 16:34:59 +00:00
Simon Pilgrim	9daf9c047d	[DAGCombiner] Check limits before accessing array element (PR32502) llvm-svn: 299361	2017-04-03 15:27:49 +00:00
Sanjay Patel	665021e7ee	[DAGCombiner] enable vector transforms for any/all {sign} bits set/clear The code already allowed vector types in via "isInteger" (which might want a more specific name), so use splat-friendly constant predicates to match those types. llvm-svn: 299304	2017-04-01 15:05:54 +00:00
Craig Topper	73250168e7	[DAGCombiner] Fix fold (or (shuf A, V_0, MA), (shuf B, V_0, MB)) -> (shuf A, B, Mask) to explicitly ensure that only one of the inputs of each shuffle is a zero vector. This can only happen when we have a mix of zero and undef elements and the two vectors have a different arrangement of zeros/undefs. The shuffle should eventually be constant folded to all zeros. Fixes PR32484. llvm-svn: 299291	2017-04-01 04:26:20 +00:00
Quentin Colombet	fc8f048c13	Revert "Localizer fun" This reverts commit r299283. Didn't intend to commit this :( llvm-svn: 299287	2017-04-01 01:26:21 +00:00
Quentin Colombet	35a47010b1	Revert "Instrument SDISel C++ patterns" This reverts commit r299284. Didn't intend to commit this :( llvm-svn: 299286	2017-04-01 01:26:17 +00:00
Quentin Colombet	7f64318938	[RegBankSelect] Support REG_SEQUENCE for generic mapping REG_SEQUENCE falls into the same category as COPY for operands mapping: - They don't have MCInstrDesc with register constraints - The input variable could use whatever register classes - It is possible to have register class already assigned to the operands In particular, given REG_SEQUENCE are always target specific because of the subreg indices. Those indices must apply to the register class of the definition of the REG_SEQUENCE and therefore, the target must set a register class to that definition. As a result, the generic code can always use that register class to derive a valid mapping for a REG_SEQUENCE. llvm-svn: 299285	2017-04-01 01:26:14 +00:00
Quentin Colombet	b43da15602	Instrument SDISel C++ patterns llvm-svn: 299284	2017-04-01 01:21:32 +00:00
Quentin Colombet	3c40b366c5	Localizer fun WIP llvm-svn: 299283	2017-04-01 01:21:28 +00:00
Sanjay Patel	16d458ea0d	[DAGCombiner] refactor and/or-of-setcc to get rid of duplicated code; NFCI llvm-svn: 299266	2017-03-31 21:30:50 +00:00
Sanjay Patel	34da36e74f	[DAGCombiner] add fold for 'All sign bits set?' (and (setlt X, 0), (setlt Y, 0)) --> (setlt (and X, Y), 0) We have 7 similar folds, but this one got away. The fact that the x86 test with a branch didn't change is probably a separate bug. We may also be missing this and the related folds in instcombine. llvm-svn: 299252	2017-03-31 20:28:06 +00:00
Sanjay Patel	61d3409535	[DAGCombiner] remove redundant code and add comments; NFCI llvm-svn: 299241	2017-03-31 18:18:58 +00:00
Jan Sjodin	f1a30f1800	Refactor code to create getFallThrough method in MachineBasicBlock. Differential Revision: https://reviews.llvm.org/D27264 llvm-svn: 299227	2017-03-31 15:55:37 +00:00
Simon Pilgrim	1cdbfe44b1	[DAGCombiner] Add ComputeNumSignBits vector demanded elements support to ASHR and INSERT_VECTOR_ELT Followup to D31311 llvm-svn: 299221	2017-03-31 14:21:50 +00:00
Simon Pilgrim	3c81c34d8d	[DAGCombiner] Add vector demanded elements support to ComputeNumSignBits Currently ComputeNumSignBits returns the minimum number of sign bits for all elements of vector data, when we may only be interested in one/some of the elements. This patch adds a DemandedElts argument that allows us to specify the elements we actually care about. The original ComputeNumSignBits implementation calls with a DemandedElts demanding all elements to match current behaviour. Scalar types set this to 1. I've only added support for BUILD_VECTOR and EXTRACT_VECTOR_ELT so far, all others will default to demanding all elements but can be updated in due course. Followup to D25691. Differential Revision: https://reviews.llvm.org/D31311 llvm-svn: 299219	2017-03-31 13:54:09 +00:00
Simon Pilgrim	37b536e4b3	[DAGCombiner] Add vector demanded elements support to computeKnownBitsForTargetNode Follow up to D25691, this sets up the plumbing necessary to support vector demanded elements support in known bits calculations in target nodes. Differential Revision: https://reviews.llvm.org/D31249 llvm-svn: 299201	2017-03-31 11:24:16 +00:00
Simon Pilgrim	6bdc755519	Spelling mistakes in comments. NFCI. llvm-svn: 299197	2017-03-31 10:59:37 +00:00
Peter Collingbourne	c66018e247	Move llvm::emitLinkerFlagsForGlobalCOFF() to Mangler. llvm-svn: 299183	2017-03-31 04:46:50 +00:00
Peter Collingbourne	2e0ffe9858	Move llvm::canBeOmittedFromSymbolTable() to Analysis. llvm-svn: 299182	2017-03-31 04:46:31 +00:00
Eric Christopher	b9c56d1235	getPristineRegs is not accurately considering shrink wrapping puts registers not saved in certain blocks. Use explicit getCalleeSavedInfo and isLiveIn instead. This fixes pr32292. Patch by Tim Shen! llvm-svn: 299124	2017-03-30 22:34:20 +00:00
Adam Nemet	edaec6de73	[DAGCombiner] Initial support for the fast-math flag contract Now alternatively to the TargetOption.AllowFPOpFusion global flag, FMUL->FADD can also use the per operation FMF to allow fusion. The idea here is not to port everything to the new scheme (e.g. fused multiply-and-sub will be ported later) but that this work all the way from clang. The transformation is conditionalized on both the FADD and the FMUL having the FMF contract flag. Differential Revision: https://reviews.llvm.org/D31169 llvm-svn: 299096	2017-03-30 18:53:04 +00:00
Ahmed Bougacha	6dd6082472	[CodeGen] Pass SDAG an ORE, and replace FastISel stats with remarks. In the long-term, we want to replace statistics with something finer-grained that lets us gather per-function data. Remarks are that replacement. Create an ORE instance in SelectionDAGISel, and pass it to SelectionDAG. SelectionDAG was used so that we can emit remarks from all SelectionDAG-related code, including TargetLowering and DAGCombiner. This isn't used in the current patch but Adam tells me he's interested for the fp-contract combines. Use the ORE instance to emit FastISel failures as remarks (instead of the mix of dbgs() dumps and statistics that we currently have). Eventually, we want to have an API that tells us whether remarks are enabled (http://llvm.org/PR32352) so that we don't emit expensive remarks (in this case, dumping IR) when it's not needed. For now, use 'isEnabled' as a crude replacement. This does mean that the replacement for '-fast-isel-verbose' is now '-pass-remarks-missed=isel'. Additionally, clang users also need to enable remark diagnostics, using '-Rpass-missed=isel'. This also removes '-fast-isel-verbose2': there are no static statistics that we want to only enable in asserts builds, so we can always use the remarks regardless of the build type. Differential Revision: https://reviews.llvm.org/D31405 llvm-svn: 299093	2017-03-30 17:49:58 +00:00
Sanjay Patel	6d5ba061f8	[DAGCombiner] add helper function for visitORLike; NFCI This combines all of the equivalent clean-ups for foldAndOfSetCCs: https://reviews.llvm.org/rL298938 https://reviews.llvm.org/rL298940 https://reviews.llvm.org/rL298944 https://reviews.llvm.org/rL298949 https://reviews.llvm.org/rL298950 https://reviews.llvm.org/rL299002 https://reviews.llvm.org/rL299013 The sins of code duplication are on full display here: each function is missing a fold that wasn't copied over from its logical sibling. llvm-svn: 299091	2017-03-30 17:32:42 +00:00
Simon Pilgrim	68168d17b9	Spelling mistakes in comments. NFCI. Based on corrections mentioned in patch for clang for PR27635 llvm-svn: 299072	2017-03-30 12:59:53 +00:00
Craig Topper	eafcbe2d10	[APInt] Remove references to integerPartWidth outside of APFloat implentation. Turns out integerPartWidth only explicitly defines the width of the tc functions in the APInt class. Functions that aren't used by APInt implementation itself. Many places in the code base already assume APInt is made up of 64-bit pieces. Explicitly assuming 64-bit here doesn't make that situation much worse. A full audit would need to be done if it ever changes. llvm-svn: 299059	2017-03-30 05:49:03 +00:00
Zvi Rackover	7569436f81	[DAGCombine] A shuffle of a splat is always the splat itself Summary: Add a simplification: shuffle (splat-shuffle), undef, M --> splat-shuffle Fixes pr32449 Patch by Sanjay Patel Reviewers: eli.friedman, RKSimon, spatel Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31426 llvm-svn: 299047	2017-03-30 01:42:57 +00:00
Eric Christopher	9ea300f08d	If the DIUnit has flags passed on it then have DW_AT_producer be a combination of DICompileUnit::Producer and Flags. The darwin behavior is unchanged and will continue to use DW_AT_APPLE_flags. Patch by Zhizhou Yang llvm-svn: 299038	2017-03-29 23:34:27 +00:00
Davide Italiano	cdcdc97879	[DAGCombiner] Remove else after return. NFCI. llvm-svn: 299022	2017-03-29 19:39:46 +00:00
Sanjay Patel	ff211bb5a6	[DAGCombiner] unify type checks and add asserts; NFCI We had a mix of type checks and usage that wasn't very clear. llvm-svn: 299013	2017-03-29 18:08:01 +00:00
Sanjay Patel	087e922328	[DAGCombiner] reduce code duplication by rearranging checks; NFCI llvm-svn: 299002	2017-03-29 15:37:33 +00:00
Sven van Haastregt	04bfa87f12	[MachineVerifier] Drop a spurious const As of r298987 the argument is a value that we std::move, so it shouldn't be const anymore. llvm-svn: 298999	2017-03-29 15:25:06 +00:00
Sven van Haastregt	039a6d9f9f	[MachineVerifier] Avoid reference to nullptr Instantiation of the MachineVerifierPass through PassInfo::getNormalCtor would yield a segfault since the default constructor of the MachineVerifierPass takes a reference to nullptr. Patch by Simone Pellegrini. Differential Revision: https://reviews.llvm.org/D31387 llvm-svn: 298987	2017-03-29 09:08:25 +00:00
Adam Nemet	92a5cf4366	[SDAG] Remove -enable-fmf-dag This is no longer needed as spotted by Sanjay in https://reviews.llvm.org/D31165. llvm-svn: 298963	2017-03-28 23:46:14 +00:00
Adam Nemet	6820f391eb	[SDAG] Add AllowContract to SNodeFlags Properly propagate the FMF from the LLVM IR to this flag. This is toward moving fp-contraction=fast from an LLVM TargetOption to a FastMathFlag in order to fix PR25721. Differential Revision: https://reviews.llvm.org/D31165 llvm-svn: 298961	2017-03-28 23:46:08 +00:00
Sanjay Patel	a41a5c29f0	[DAGCombiner] reduce code duplication with local variables; NFCI llvm-svn: 298954	2017-03-28 22:45:53 +00:00
Sanjay Patel	9747d8070b	[DAG] fix formatting; NFC llvm-svn: 298950	2017-03-28 22:25:25 +00:00
Sanjay Patel	d832eddde5	[DAGCombiner] remove redundant conditions and duplicated code; NFCI llvm-svn: 298949	2017-03-28 22:22:50 +00:00
Sanjay Patel	d2a26db991	[DAGCombiner] rename variables in foldAndOfSetCCs for easier reading; NFCI llvm-svn: 298944	2017-03-28 21:40:41 +00:00
Matt Arsenault	323b021b5e	Fix crashing on TargetCustom PseudoSourceValues Default to something more reasonable if printCustom isn't implemented. llvm-svn: 298941	2017-03-28 20:33:12 +00:00
Sanjay Patel	3230e4be11	[DAGCombiner] clean up foldAndOfSetCCs; NFCI 1. Fix bogus comment. 2. Early exit to reduce indent. 3. Change node pointer param to what it really is: an SDLoc. llvm-svn: 298940	2017-03-28 20:28:16 +00:00
Sanjay Patel	16af53a395	[DAGCombiner] add helper function for and-of-setcc folds; NFC This is just a cut and paste followed by clang-format. Clean up to follow. llvm-svn: 298938	2017-03-28 19:58:46 +00:00
Sanjay Patel	f01a1dad7f	[x86] use VPMOVMSK to replace memcmp libcalls for 32-byte equality Follow-up to: https://reviews.llvm.org/rL298775 llvm-svn: 298933	2017-03-28 17:23:49 +00:00
Nirav Dave	472b5efc8b	[SDAG] Deal with deleted node in PromoteIntShiftOp Deal with case that initial node is deleted during dag-combine leading to an assertional failure in promoteIntShiftOp. Fixes PR32420. Reviewers: spatel, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31403 llvm-svn: 298931	2017-03-28 17:09:49 +00:00
Nirav Dave	5b414ebe63	[SDAG] Avoid deleted SDNodes PromoteIntBinOp Reorder work in PromoteIntBinOp to prevent stale (deleted) nodes from being used. Fixes PR32340 and PR32345. Reviewers: hfinkel, dbabokin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31148 llvm-svn: 298923	2017-03-28 15:41:12 +00:00
Nirav Dave	9b5563c52c	[SDAG] Fix Stale SDNode usage in visitAND Reorder CombineTo Calls to prevent potential use of deleted node. Fixes PR32372. Reviewers: jnspaulsson, RKSimon, uweigand, jonpa Reviewed By: jonpa Subscribers: jonpa, llvm-commits Differential Revision: https://reviews.llvm.org/D31346 llvm-svn: 298920	2017-03-28 14:11:20 +00:00
Nirav Dave	423b24ae76	[SDAG] Minor cleanup of variable usage. NFC. llvm-svn: 298916	2017-03-28 13:39:50 +00:00
Valery Pykhtin	910da13a07	MachineScheduler/ScheduleDAG: Add support for GetSubGraph Patch by Axel Davy (axel.davy@normalesup.org) Differential revision: https://reviews.llvm.org/D30626 llvm-svn: 298896	2017-03-28 05:12:31 +00:00
Junmo Park	c7479ba86a	CodeGen : Check LLVM_ENABLE_DUMP definition for dumpMachineInstrRangeWithSlotIndex. Summary: Add missing check routine for dumpMachineInstrRangeWithSlotIndex including LLVM_DUMP_METHOD. Reviewers: bkramer Differential revision: https://reviews.llvm.org/D30367 llvm-svn: 298895	2017-03-28 04:14:25 +00:00
Javed Absar	3d59437093	Improve machine schedulers for in-order processors This patch enables schedulers to specify instructions that cannot be issued with any other instructions. It also fixes BeginGroup/EndGroup. Reviewed by: Andrew Trick Differential Revision: https://reviews.llvm.org/D30744 llvm-svn: 298885	2017-03-27 20:46:37 +00:00
Adrian Prantl	e8450fdb48	Remove redundant check for nullptr. llvm-svn: 298866	2017-03-27 17:36:31 +00:00
Adrian Prantl	035862b926	Remove unneccessary virtual destructor from DwarfExpression. llvm-svn: 298865	2017-03-27 17:34:04 +00:00
Ahmed Bougacha	2d29998f22	[GlobalISel] Add a 'getConstantVRegVal' helper. Use it to compare immediate operands. llvm-svn: 298855	2017-03-27 16:35:27 +00:00
Sanjay Patel	9ebb68843e	[x86] use PMOVMSK to replace memcmp libcalls for 16-byte equality This is the payoff for D31156 - if a target has efficient comparison instructions for vector-sized equality, we can replace memcmp calls with inline code that is both smaller and faster. Differential Revision: https://reviews.llvm.org/D31290 llvm-svn: 298775	2017-03-25 16:05:33 +00:00
Simon Pilgrim	dbc94db3f3	Apply clang-format as commented in D31311. NFCI. llvm-svn: 298751	2017-03-24 23:47:41 +00:00
Reid Kleckner	6b78e16368	[codeview] Don't assert when the user violates the ODR If we have an array of a user-defined aggregates for which there was an ODR violation, then the array size will not necessarily match the number of elements times the size of the element. Fixes PR32383 llvm-svn: 298750	2017-03-24 23:28:42 +00:00
Davide Italiano	6a1209ee87	[MachineScheduler] Add missing machine pass dependency. llvm-svn: 298736	2017-03-24 20:52:56 +00:00
Adrian Prantl	b216e3e935	Refactor code to reduce indentation and improve readability. (NFC) llvm-svn: 298665	2017-03-23 23:35:09 +00:00
Adrian Prantl	f0ffc5233d	Fix a bug when emitting debug info for partially constant global variables. While fixing a malformed testcase, I discovered that the code exercised by it was wrong, too. llvm-svn: 298664	2017-03-23 23:35:00 +00:00
Dehao Chen	b197d5b0a0	Fix trellis layout to avoid mis-identify triangle. Summary: For the following CFG: A->B B->C A->C If there is another edge B->D, then ABC should not be considered as triangle. Reviewers: davidxl, iteratee Reviewed By: iteratee Subscribers: nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D31310 llvm-svn: 298661	2017-03-23 23:28:09 +00:00
Dehao Chen	775341a14c	Use isFunctionHotInCallGraph to set the function section prefix. Summary: The current prefix based function layout algorithm only looks at function's entry count, which is not sufficient. A function should be grouped together if its entry count or any call edge count is hot. Reviewers: davidxl, eraman Reviewed By: eraman Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31225 llvm-svn: 298656	2017-03-23 23:14:11 +00:00
Jessica Paquette	5096a34726	[Outliner] Remove unused lambda capture. Remove an unused lambda capture that made some bots unhappy. llvm-svn: 298651	2017-03-23 22:17:20 +00:00
Jessica Paquette	acffa28c63	[Outliner] Fix compile-time overhead for candidate choice The old candidate collection method in the outliner caused some very large regressions in compile time on large tests. For MultiSource/Benchmarks/7zip it caused a 284.07 s or 1156% increase in compile time. On average, using the SingleSource/MultiSource tests, it caused an average increase of 8 seconds in compile time (something like 1000%). This commit replaces that candidate collection method with a new one which only visits each node in the tree once. This reduces the worst compile time increase (still 7zip) to a 0.542 s overhead (22%) and the average compile time increase on SingleSource and MultiSource to 0.018 s (4%). llvm-svn: 298648	2017-03-23 21:27:38 +00:00
Adrian Prantl	8f3337950e	Zero-Initialize PrevInstBB when entering a new MachineFunction. It is not guaranteed that the memory used for MachineBasicBlocks in the previous MachineFunction hasn't been freed, so holding on to a pointer to the last function's isn't correct. Particularly I have observed the sret.ll testcase failing because the first BasicBlock in the new function happened to be allocated to the exact same memory as the previously saved and (deleted) PrevInstBB. llvm-svn: 298642	2017-03-23 20:23:42 +00:00
Nirav Dave	e9ca32ae52	[SDAG] Fix zeroExtend assertion error Move CombineTo preventing deleted node from being returned in visitZERO_EXTEND. Fixes PR32284. Reviewers: RKSimon, bogner Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31254 llvm-svn: 298604	2017-03-23 15:01:50 +00:00
Adrian Prantl	a27198853d	Rename helper functions in DwarfExpression to be less misleading (NFC) llvm-svn: 298523	2017-03-22 17:19:55 +00:00
Adrian Prantl	5503077cde	Fix PR32298 by adding an early exit to getFrameIndexExprs(). Also add an assertion for the case that there are multiple FI expressions with a DW_OP_LLVM_fragment; which should violate internal constraints in DbgVariable. llvm-svn: 298518	2017-03-22 16:50:16 +00:00
Aditya Nandakumar	bc389badbc	[GlobalISel]: Create VREGs for ConstantInt args This patch changes the behavior of IRTranslating intrinsics where we now create VREG + G_CONSTANT for ConstantInt values. We already do this for FloatingPoint values. This makes it easier for the backends to select code and it won't have to de-duplicate creation+selection of constants. Reviewed by: ab llvm-svn: 298473	2017-03-22 01:16:39 +00:00
Adrian Prantl	0498baa4be	Don't compose DWARF expressions with multiple subregisters. If a register location can only be described by a complex expression (i.e., multiple subregisters) it doesn't safely compose with another complex expression. For example, it is not possible to apply a DW_OP_deref operation to multiple DW_OP_pieces. llvm-svn: 298472	2017-03-22 01:16:01 +00:00
Adrian Prantl	80e188d983	DwarfExpression: Defer emitting DWARF register operations until the rest of the expression is known. This is still an NFC refactoring in preparation of a subsequent bugfix. This reapplies r298388 with a bugfix for non-physical frame registers. llvm-svn: 298471	2017-03-22 01:15:57 +00:00
Ahmed Bougacha	15b3e8a93a	[GlobalISel] Update DBG_VALUEs referencing DCE'd instructions. Quentin points out that r298358 would cause us to emit different code with debug info. That's a big no-no; also erase the instructions that only live thanks to DBG_VALUE users. Adrian explained how this is an existing problem and an OK thing to do: clang has allocas for all variables so shouldn't be affected at -O0, but swift uses a bit of inlineasm to explicitly keep values live for the purpose of debug info quality. I'm not sure there is a better scheme. llvm-svn: 298460	2017-03-21 23:42:54 +00:00
Ahmed Bougacha	e8e1fa3a7c	[GlobalISel] Don't translate br to layout successor. MI can represent fallthrough to layout successor blocks, and our post-isel representation uses that extensively. We might as well use it too, to avoid translating and carrying along unnecessary branches. llvm-svn: 298459	2017-03-21 23:42:50 +00:00
Tim Northover	548feeecd3	GlobalISel: respect BooleanContents when extending i1. The world isn't just x86 & ARM, some targets need to store -1 into the byte when legalizing a bool store. llvm-svn: 298453	2017-03-21 22:22:05 +00:00
Matthias Braun	8445cbd1ca	SplitKit: Fix subreg copy related problems Fix two problems related to r298025: - SplitKit would create duplicate VNIs in some cases leading to crashs when hoisting copies. - VirtRegMap could fail expanding copies at the beginning of a basic block. This fixes http://llvm.org/PR32353 llvm-svn: 298448	2017-03-21 21:58:08 +00:00
Tim Northover	dd4b9d6d7b	GlobalISel: widen booleans by zero-extending to a byte. A bool is represented by a single byte, which the ARM ABI requires to be either 0 or 1. So we cannot use G_ANYEXT when legalizing the type. llvm-svn: 298439	2017-03-21 21:12:04 +00:00
Adrian Prantl	4dc0324ff8	Revert 298388 and 298389 because they broke some AMDGPU tests. llvm-svn: 298401	2017-03-21 17:14:30 +00:00
Reid Kleckner	b518054b87	Rename AttributeSet to AttributeList Summary: This class is a list of AttributeSetNodes corresponding the function prototype of a call or function declaration. This class used to be called ParamAttrListPtr, then AttrListPtr, then AttributeSet. It is typically accessed by parameter and return value index, so "AttributeList" seems like a more intuitive name. Rename AttributeSetImpl to AttributeListImpl to follow suit. It's useful to rename this class so that we can rename AttributeSetNode to AttributeSet later. AttributeSet is the set of attributes that apply to a single function, argument, or return value. Reviewers: sanjoy, javed.absar, chandlerc, pete Reviewed By: pete Subscribers: pete, jholewinski, arsenm, dschuff, mehdi_amini, jfb, nhaehnle, sbc100, void, llvm-commits Differential Revision: https://reviews.llvm.org/D31102 llvm-svn: 298393	2017-03-21 16:57:19 +00:00
Adrian Prantl	1db6486fb1	Don't compose DWARF expressions with multiple subregisters. If a register location can only be described by a complex expression (i.e., multiple subregisters) it doesn't safely compose with another complex expression. For example, it is not possible to apply a DW_OP_deref operation to multiple DW_OP_pieces. llvm-svn: 298389	2017-03-21 16:37:39 +00:00
Adrian Prantl	dabee8e57c	DwarfExpression: Defer emitting DWARF register operations until the rest of the expression is known. This is still an NFC refactoring in preparation of a subsequent bugfix. llvm-svn: 298388	2017-03-21 16:37:35 +00:00
Matt Arsenault	dce313c3cf	DAG: Fold bitcast/extract_vector_elt of undef to undef Fixes not eliminating store when intrinsic is lowered to undef. llvm-svn: 298385	2017-03-21 16:20:16 +00:00
Volkan Keles	47debaeef0	[GlobalISel] Move isTriviallyDead to Utils. NFC. Make it accessible by the targets to avoid code duplication. llvm-svn: 298358	2017-03-21 10:47:35 +00:00
Jonas Paulsson	54c7680e1f	[DAGTypeLegalizer] Handle widening truncate to vector of i1. Previously, PromoteIntRes_TRUNCATE() did not handle the case where the operand needs widening, which resulted in llvm_unreachable(). This patch adds the needed handling, along with a test case. Review: Eli Friedman, Simon Pilgrim. https://reviews.llvm.org/D31077 llvm-svn: 298357	2017-03-21 10:24:14 +00:00
Volkan Keles	75bdc7690e	[GlobalISel] Translate shufflevector Reviewers: qcolombet, aditya_nandakumar, t.p.northover, javed.absar, ab, dsanders Reviewed By: javed.absar Subscribers: dberris, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30962 llvm-svn: 298347	2017-03-21 08:44:13 +00:00
Zachary Turner	82a0c97b32	Add a function to MD5 a file's contents. In doing so, clean up the MD5 interface a little. Most existing users only care about the lower 8 bytes of an MD5, but for some users that care about the upper and lower, there wasn't a good interface. Furthermore, consumers of the MD5 checksum were required to handle endianness details on their own, so it seems reasonable to abstract this into a nicer interface that just gives you the right value. Differential Revision: https://reviews.llvm.org/D31105 llvm-svn: 298322	2017-03-20 23:33:18 +00:00
Adrian Prantl	956484b7b5	Replace uses of DwarfExpression::addMachineReg* with addMachineRegExpression and mark the methods as protected. Besides reducing the surface area of DwarfExpression, this is in preparation for an upcoming bugfix in the DwarfExpression implementation, for which it will be necessary to defer emitting register operations until the rest of the expression is known. NFC llvm-svn: 298309	2017-03-20 21:35:09 +00:00
Adrian Prantl	52884b7be8	Make implementation details in DwarfExpression protected. (NFC) llvm-svn: 298308	2017-03-20 21:34:19 +00:00
Reid Kleckner	8819c73878	[WinEH] Adjust decision to emit SEH moves for leaf functions Move the check for "MF->hasWinCFI()" up into the calculation of the shouldEmitMoves boolean, rather than putting it in the early returning if. This ensures that endFunction doesn't try to emit .seh_* directives for leaf functions. llvm-svn: 298276	2017-03-20 17:45:59 +00:00
Tim Northover	89268b183f	GlobalISel: allow quad-precision values to be dumped. Otherwise the fallback path fails with an assertion on AAPCS AArch64 targets, when "long double" is encountered. llvm-svn: 298273	2017-03-20 16:52:08 +00:00
Diana Picus	d79253a9f7	[GlobalISel] Use the correct calling conv for calls This commit adds a parameter that lets us pass in the calling convention of the call to CallLowering::lowerCall. This allows us to handle situations where the calling convetion of the callee is different from that of the caller. Differential Revision: https://reviews.llvm.org/D31039 llvm-svn: 298254	2017-03-20 14:40:18 +00:00
Simon Pilgrim	8424df7dea	Fix constant folding of fp2int to large integers We make the assumption in most of our constant folding code that a fp2int will target an integer of 128-bits or less, calling the APFloat::convertToInteger with only uint64_t[2] of raw bits for the result. Fuzz testing (PR24662) showed that we don't handle other cases at all, resulting in stack overflows and all sorts of crashes. This patch uses the APSInt version of APFloat::convertToInteger instead to better handle such cases. Differential Revision: https://reviews.llvm.org/D31074 llvm-svn: 298226	2017-03-19 16:50:25 +00:00
Ahmed Bougacha	931904d777	[GlobalISel] Don't select trivially dead instructions. Folding instructions when selecting can cause them to become dead. Don't select these dead instructions (if they don't have other side effects, and don't define physical registers). Preserve existing tests by adding COPYs. In some tests, the G_CONSTANT vregs never get constrained to a class: the only use of the vreg was folded into another instruction, so the G_CONSTANT, now dead, never gets selected. llvm-svn: 298224	2017-03-19 16:13:00 +00:00
Ahmed Bougacha	7f2d17331c	[GlobalISel] Move method definition to the proper file. NFC. llvm-svn: 298221	2017-03-19 16:12:48 +00:00
Oren Ben Simhon	0ef61ec32a	[MIR] Support Customed Register Mask and CSRs The MIR printer dumps a string that describe the register mask of a function. A static predefined list of register masks matches a static list of strings. However when the register mask is not from the static predefined list, there is no descriptor string and the printer fails. This patch adds support to custom register mask printing and dumping. Also the list of callee saved registers (describing the registers that must be preserved for the caller) might be dynamic. As such this data needs to be dumped and parsed back to the Machine Register Info. Differential Revision: https://reviews.llvm.org/D30971 llvm-svn: 298207	2017-03-19 08:14:18 +00:00
Matthias Braun	e6ff30b696	ExecutionDepsFix: Let targets specialize the pass; NFC Let targets specialize the pass with the register class so we can get a parameterless default constructor and can put the pass into the pass registry to enable testing with -run-pass=. llvm-svn: 298184	2017-03-18 05:08:58 +00:00
Matthias Braun	e9f8209e87	ExecutionDepsFix: Normalize names; NFC Normalize ExeDepsFix, execution-fix, ExecutionDependencyFix and ExecutionDepsFix to the last one. llvm-svn: 298183	2017-03-18 05:05:40 +00:00
Matthias Braun	6e67052912	CodeGen.cpp: Sort alphabetically; NFC llvm-svn: 298182	2017-03-18 05:05:32 +00:00
Nirav Dave	ac6081cb67	Make library calls sensitive to regparm module flag (Fixes PR3997). Reviewers: mkuper, rnk Subscribers: mehdi_amini, jyknight, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D27050 llvm-svn: 298179	2017-03-18 00:44:07 +00:00
Nirav Dave	6de2c77944	Capitalize ArgListEntry fields. NFC. llvm-svn: 298178	2017-03-18 00:43:57 +00:00
Evgeniy Stepanov	51c962f72e	Add !associated metadata. This is an ELF-specific thing that adds SHF_LINK_ORDER to the global's section pointing to the metadata argument's section. The effect of that is a reverse dependency between sections for the linker GC. !associated does not change the behavior of global-dce. The global may also need to be added to llvm.compiler.used. Since SHF_LINK_ORDER is per-section, !associated effectively enables fdata-sections for the affected globals, the same as comdats do. Differential Revision: https://reviews.llvm.org/D29104 llvm-svn: 298157	2017-03-17 22:17:24 +00:00
Eli Friedman	46ddab3810	[SelectionDAG] Remove redundant stores more aggressively. Handle TokenFactors more aggressively in SDValue::reachesChainWithoutSideEffects. This isn't really a very effective change anymore because of other changes to chain handling, but it's a cheap check, and the expanded comments are still useful. It might be possible to loosen the hasOneUse() requirement with a deeper analysis, but a naive implementation of that check would be expensive. Differential Revision: https://reviews.llvm.org/D29845 llvm-svn: 298156	2017-03-17 22:15:50 +00:00
Jun Bum Lim	4230101def	[CodeGenPrep]Restructure promoting Ext to form ExtLoad Summary: Instead of just looking for a load which is mergable with Ext to form ExtLoad, trying to promote Exts as long as the cost is acceptable. This change is not a NFC as it continue promoting Exts even after finding a load during promotions; the change in arm64-codegen-prepare-extload.ll described in 2.b might show the case. This change was motivated from D26524. Based on this change, I will move the transformation performed in aarch64-type-promotion into CGP. Reviewers: jmolloy, qcolombet, mcrosier, javed.absar Reviewed By: qcolombet Subscribers: rengolin, llvm-commits, aemerson Differential Revision: https://reviews.llvm.org/D27853 llvm-svn: 298114	2017-03-17 19:05:21 +00:00
Simon Pilgrim	5a68d401c7	[SelectionDAG] Add SelectionDAG.computeKnownBits test support for ISD::ABS llvm-svn: 298108	2017-03-17 17:45:36 +00:00
Matthias Braun	f0b68d3fbc	SplitKit: Correctly implement partial subregister copies - This fixes a bug where subregister incompatible with the vregs register class where used. - Implement the case where multiple copies are necessary to cover a given lanemask. Differential Revision: https://reviews.llvm.org/D30438 llvm-svn: 298025	2017-03-17 00:41:39 +00:00
Matthias Braun	fa289ec7f0	VirtRegMap: Correctly deal with bundles when deleting identity copies. This fixes two problems when VirtRegMap encounters bundles: - When substituting a vreg subregister def with an actual register the internal read flag must be cleared. - Removing an identity COPY from a bundle needs to use removeFromBundle() and a newly introduced function to update SlotIndexes. No testcase here, because none of the in-tree targets trigger this, however an upcoming commit of mine will need this and the testcase there will trigger this. Differential Revision: https://reviews.llvm.org/D30925 llvm-svn: 298024	2017-03-17 00:41:33 +00:00
Eric Christopher	53da761570	Remove LessPreciseFPMADOption from TargetOptions along with all of the associated command line options and functions - it's currently unused in all of llvm and clang other than being set and reset. llvm-svn: 298023	2017-03-17 00:38:03 +00:00
Reid Kleckner	45707d4d5a	Remove getArgumentList() in favor of arg_begin(), args(), etc Users often call getArgumentList().size(), which is a linear way to get the number of function arguments. arg_size(), on the other hand, is constant time. In general, the fact that arguments are stored in an iplist is an implementation detail, so I've removed it from the Function interface and moved all other users to the argument container APIs (arg_begin(), arg_end(), args(), arg_size()). Reviewed By: chandlerc Differential Revision: https://reviews.llvm.org/D31052 llvm-svn: 298010	2017-03-16 22:59:15 +00:00
Simon Pilgrim	fbfb19b1d7	Remove redundant conditions (PR31753). NFCI. llvm-svn: 297976	2017-03-16 19:52:00 +00:00
Adrian Prantl	dc855221af	Attempt to fix bot failure on Windows. Looks like this expression was accidentally using 32-bit arithmetic. llvm-svn: 297969	2017-03-16 18:06:04 +00:00
Adrian Prantl	3621309e8d	Rearrange fields. NFC. llvm-svn: 297967	2017-03-16 17:42:47 +00:00
Adrian Prantl	a63b8e8227	Rename methods in DwarfExpression to adhere to the LLVM coding guidelines. NFC. llvm-svn: 297966	2017-03-16 17:42:45 +00:00
Adrian Prantl	981f03e6a2	PR32288: More efficient encoding for DWARF expr subregister access. Citing http://bugs.llvm.org/show_bug.cgi?id=32288 The DWARF generated by LLVM includes this location: 0x55 0x93 0x04 DW_OP_reg5 DW_OP_piece(4) When GCC's DWARF is simply 0x55 (DW_OP_reg5) without the DW_OP_piece. I believe it's reasonable to assume the DWARF consumer knows which part of a register logically holds the value (low bytes, high bytes, how many bytes, etc) for a primitive value like an integer. This patch gets rid of the redundant DW_OP_piece when a subregister is at offset 0. It also adds previously missing subregister masking when a subregister is followed by another operation. (This reapplies r297960 with two additional testcase updates). rdar://problem/31069390 https://reviews.llvm.org/D31010 llvm-svn: 297965	2017-03-16 17:14:56 +00:00
Adrian Prantl	c5b3351750	Revert "PR32288: More efficient encoding for DWARF expr subregister access." This reverts commit 2bf453116889a576956892ea9683db4fcd96e30e while investigating buildbot breakage. llvm-svn: 297962	2017-03-16 16:38:22 +00:00
Adrian Prantl	8508e87998	PR32288: More efficient encoding for DWARF expr subregister access. Citing http://bugs.llvm.org/show_bug.cgi?id=32288 The DWARF generated by LLVM includes this location: 0x55 0x93 0x04 DW_OP_reg5 DW_OP_piece(4) When GCC's DWARF is simply 0x55 (DW_OP_reg5) without the DW_OP_piece. I believe it's reasonable to assume the DWARF consumer knows which part of a register logically holds the value (low bytes, high bytes, how many bytes, etc) for a primitive value like an integer. This patch gets rid of the redundant DW_OP_piece when a subregister is at offset 0. It also adds previously missing subregister masking when a subregister is followed by another operation. rdar://problem/31069390 https://reviews.llvm.org/D31010 llvm-svn: 297960	2017-03-16 16:34:14 +00:00
Oren Ben Simhon	da59ffae91	Fixing typos. llvm-svn: 297932	2017-03-16 08:15:52 +00:00
Jonas Paulsson	84319bfc40	[SelectionDAG] Optimize VSELECT->SETCC of incompatible or illegal types. Don't scalarize VSELECT->SETCC when operands/results needs to be widened, or when the type of the SETCC operands are different from those of the VSELECT. (VSELECT SETCC) and (VSELECT (AND/OR/XOR (SETCC,SETCC))) are handled. The previous splitting of VSELECT->SETCC in DAGCombiner::visitVSELECT() is no longer needed and has been removed. Updated tests: test/CodeGen/ARM/vuzp.ll test/CodeGen/NVPTX/f16x2-instructions.ll test/CodeGen/X86/2011-10-19-widen_vselect.ll test/CodeGen/X86/2011-10-21-widen-cmp.ll test/CodeGen/X86/psubus.ll test/CodeGen/X86/vselect-pcmp.ll Review: Eli Friedman, Simon Pilgrim https://reviews.llvm.org/D29489 llvm-svn: 297930	2017-03-16 07:17:12 +00:00
Kyle Butt	08655997eb	CodeGen: BlockPlacement: Reduce TriangleChainCount to 2 This produces a 1% speedup on an important internal Google benchmark (protocol buffers), with no other regressions in google or in the llvm test-suite. Only 5 targets in the entire llvm test-suite are affected, and on those 5 targets the size increase is 0.027% llvm-svn: 297925	2017-03-16 01:32:29 +00:00
Craig Topper	6c66bbca4a	[StackColoring] Remove unused header file for post-order traversal. Update comment that indicated we were using it when we really use a depth-first search. NFC llvm-svn: 297904	2017-03-15 22:40:26 +00:00
Matt Arsenault	02d915be90	CodeGenPrepare: Sink addressing modes for atomics llvm-svn: 297903	2017-03-15 22:35:20 +00:00
Eric Christopher	17ce8a2f5e	Fix up grammar in a comment. llvm-svn: 297898	2017-03-15 21:50:46 +00:00
Zvi Rackover	48cdde0e59	[DAGCombine] Bail out if can't create a vector with at least two elements Summary: Fixes pr32278 Reviewers: igorb, craig.topper, RKSimon, spatel, hfinkel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30978 llvm-svn: 297878	2017-03-15 19:48:36 +00:00
Ahmed Bougacha	2fb8030748	[GlobalISel] Avoid translating synthetic constants to new G_CONSTANTS. Currently, we create a G_CONSTANT for every "synthetic" integer constant operand (for instance, for the G_GEP offset). Instead, share the G_CONSTANTs we might have created by going through the ValueToVReg machinery. When we're emitting synthetic constants, we do need to get Constants from the context. One could argue that we shouldn't modify the context at all (for instance, this means that we're going to use a tad more memory if the constant wasn't used elsewhere), but constants are mostly harmless. We currently do this for extractvalue and all. For constant fcmp, this does mean we'll emit an extra COPY, which is not necessarily more optimal than an extra materialized constant. But that preserves the current intended design of uniqued G_CONSTANTs, and the rematerialization problem exists elsewhere and should be resolved with a single coherent solution. llvm-svn: 297875	2017-03-15 19:21:11 +00:00
Tim Northover	0d98b03b9f	ARM: avoid clobbering register in v6 jump-table expansion. If we got unlucky with register allocation and actual constpool placement, we could end up producing a tTBB_JT with an index that's already been clobbered. Technically, we might be able to fix this situation up with a MOV, but I think the constant islands pass is complex enough without having to deal with more weird edge-cases. llvm-svn: 297871	2017-03-15 18:38:13 +00:00
Ahmed Bougacha	07f247b6c2	[GlobalISel] Insert translated switch icmp blocks after switch parent. Now that we preserve the IR layout, we would end up with all the newly synthesized switch comparison blocks at the end of the function. Instead, use a hopefully more reasonable layout, with the comparison blocks immediately following the switch comparison blocks. llvm-svn: 297869	2017-03-15 18:22:37 +00:00
Ahmed Bougacha	a61c214f51	[GlobalISel] Preserve IR block layout. It makes the output function layout more predictable; the layout has an effect on performance, we don't want it to be at the mercy of the translator's visitation order and such. The predictable output is also easier to digest. getOrCreateBB isn't appropriately named anymore, as it never needs to create anything. Rename it and extract the MBB creation logic out of it. A couple tests were sensitive to the order. Update them. llvm-svn: 297868	2017-03-15 18:22:33 +00:00
Craig Topper	bcb6093610	[CodeGen] Use APInt::setLowBits/setHighBits/setBitsFrom in more places This patch replaces ORs with getHighBits/getLowBits etc. with setLowBits/setHighBits/setBitsFrom. In a few of the places we weren't ORing, but the KnownZero/KnownOne vectors were already initialized to zero. We exploit this in most places already there were just some that were inconsistent. Differential Revision: https://reviews.llvm.org/D30965 llvm-svn: 297860	2017-03-15 16:53:53 +00:00
Ahmed Bougacha	2b7f1377aa	[GlobalISel] Remove dead member. NFC. llvm-svn: 297855	2017-03-15 16:29:32 +00:00
Peter Collingbourne	d44a01aae6	CodeGen: Use the source filename as the argument to .file, rather than the module ID. Using the module ID here is wrong for a couple of reasons: 1) The module ID is not persisted, so we can end up with different object file contents given the same input file (for example if the same file is accessed via different paths). 2) With ThinLTO the module ID field may contain the path to a bitcode file, which is incorrect, as the .file argument is supposed to contain the path to a source file. Differential Revision: https://reviews.llvm.org/D30584 llvm-svn: 297853	2017-03-15 16:24:52 +00:00
Simon Pilgrim	018eedd9a5	[SelectionDAG] Support BUILD_VECTOR implicit truncation in SelectionDAG::ComputeNumSignBits (PR32273) llvm-svn: 297852	2017-03-15 16:22:24 +00:00
Nuno Lopes	ae455c562d	fix gcc -Wmisleading-indentation [NFC] llvm-svn: 297816	2017-03-15 09:33:33 +00:00
Taewook Oh	1b192336d8	NFC: Reformats comments according to the coding guildelines. llvm-svn: 297808	2017-03-15 06:29:23 +00:00
Taewook Oh	fb1833efeb	[BranchFolding] Merge debug locations from common tail instead of removing Summary: D25742 improved the precision of debug locations for PGO by removing debug locations from common tail when tail-merging. However, if identical insturctions that are merged into a common tail have the same debug locations, there's no need to remove them. This patch creates a merged debug location of identical instructions across SameTails and assign it to the instruction in the common tail, so that the debug locations are maintained if they are same across identical instructions. Reviewers: aprantl, probinson, MatzeB, rob.lougher Reviewed By: aprantl Subscribers: andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D30226 llvm-svn: 297805	2017-03-15 05:44:59 +00:00
Peter Collingbourne	7f6e2c97b8	Ensure that prefix data is preserved with subsections-via-symbols On MachO platforms that use subsections-via-symbols dead code stripping will drop prefix data. Unfortunately there is no great way to convey the relationship between a function and its prefix data to the linker. We are forced to use a bit of a hack: we give the prefix data it’s own symbol, and mark the actual function entry an .alt_entry. Patch by Moritz Angermann! Differential Revision: https://reviews.llvm.org/D30770 llvm-svn: 297804	2017-03-15 04:18:16 +00:00
Volkan Keles	4862c63594	[GlobalISel] IRTranslator: Return the scalar for <1 x Ty> constant vectors Summary: <1 x Ty> is not a legal vector type in LLT, we shouldn’t build G_MERGE_VALUES instruction for them. Reviewers: qcolombet, aditya_nandakumar, dsanders, t.p.northover, ab, javed.absar Reviewed By: qcolombet Subscribers: dberris, rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D30948 llvm-svn: 297792	2017-03-14 23:45:06 +00:00
Daniel Sanders	8a4bae9993	[globalisel][tblgen] Add support for ComplexPatterns Summary: Adds a new kind of MachineOperand: MO_Placeholder. This operand must not appear in the MIR and only exists as a way of creating an 'uninitialized' operand until a matcher function overwrites it. Depends on D30046, D29712 Reviewers: t.p.northover, ab, rovka, aditya_nandakumar, javed.absar, qcolombet Reviewed By: qcolombet Subscribers: dberris, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D30089 llvm-svn: 297782	2017-03-14 21:32:08 +00:00
Simon Pilgrim	cf2da96c82	[SelectionDAG] Add a signed integer absolute ISD node Reduced version of D26357 - based on the discussion on llvm-dev about canonicalization of UMIN/UMAX/SMIN/SMAX as well as ABS I've reduced that patch to just the ABS ISD node (with x86/sse support) to improve basic combines and lowering. ARM/AArch64, Hexagon, PowerPC and NVPTX all have similar instructions allowing us to make this a generic opcode and move away from the hard coded tablegen patterns which makes it tricky to match more complex patterns. At the moment this patch doesn't attempt legalization as we only create an ABS node if its legal/custom. Differential Revision: https://reviews.llvm.org/D29639 llvm-svn: 297780	2017-03-14 21:26:58 +00:00
Sanjay Patel	8dd99dce6c	[DAG] vector div/rem with any zero element in divisor is undef This is the backend counterpart to: https://reviews.llvm.org/rL297390 https://reviews.llvm.org/rL297409 and follow-up to: https://reviews.llvm.org/rL297384 It surprised me that we need to duplicate the check in FoldConstantArithmetic and FoldConstantVectorArithmetic, but one or the other doesn't catch all of the test cases. There is an existing code comment about merging those someday. Differential Revision: https://reviews.llvm.org/D30826 llvm-svn: 297762	2017-03-14 18:06:28 +00:00
Benjamin Kramer	babcbddae0	[CodeGen] Fix -Wreorder warning. llvm-svn: 297729	2017-03-14 10:29:47 +00:00
Sam Parker	916b1ba617	[ARM] Move SMULW[B\|T] isel to DAG Combine Create nodes for smulwb and smulwt and move their selection from DAGToDAG to DAG combine. smlawb and smlawt can then be selected using tablegen. Added some helper functions to detect shift patterns as well as a wrapper around SimplifyDemandBits. Added a couple of extra tests. Differential Revision: https://reviews.llvm.org/D30708 llvm-svn: 297716	2017-03-14 09:13:22 +00:00
Oren Ben Simhon	fe34c5e429	Disable Callee Saved Registers Each Calling convention (CC) defines a static list of registers that should be preserved by a callee function. All other registers should be saved by the caller. Some CCs use additional condition: If the register is used for passing/returning arguments – the caller needs to save it - even if it is part of the Callee Saved Registers (CSR) list. The current LLVM implementation doesn’t support it. It will save a register if it is part of the static CSR list and will not care if the register is passed/returned by the callee. The solution is to dynamically allocate the CSR lists (Only for these CCs). The lists will be updated with actual registers that should be saved by the callee. Since we need the allocated lists to live as long as the function exists, the list should reside inside the Machine Register Info (MRI) which is a property of the Machine Function and managed by it (and has the same life span). The lists should be saved in the MRI and populated upon LowerCall and LowerFormalArguments. The patch will also assist to implement future no_caller_saved_regsiters attribute intended for interrupt handler CC. Differential Revision: https://reviews.llvm.org/D28566 llvm-svn: 297715	2017-03-14 09:09:26 +00:00
Nirav Dave	4fc8401abf	Recommitting Craig Topper's patch now that r296476 has been recommitted. When checking if chain node is foldable, make sure the intermediate nodes have a single use across all results not just the result that was used to reach the chain node. This recovers a test case that was severely broken by r296476, my making sure we don't create ADD/ADC that loads and stores when there is also a flag dependency. llvm-svn: 297698	2017-03-14 01:42:23 +00:00
Nirav Dave	54e22f33d9	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Recommiting with compiler time improvements Recommitting after fixup of 32-bit aliasing sign offset bug in DAGCombiner. * Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly constructs wider loads, but then promotes them to float operations which appear but requires more expensive constant generation). Some minor peephole optimizations to deal with improved SubDAG shapes (listed below) Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests. 5. Remove Chain dependencies of Memory operations on CopyfromReg nodes as these are captured by data dependence 6. Forward loads-store values through tokenfactors containing {CopyToReg,CopyFromReg} Values. 7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll) 8. Store merging for the ARM target is restricted to 32-bit as some in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting: CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle llvm-svn: 297695	2017-03-14 00:34:14 +00:00
Adrian Prantl	19aadf57c8	Revert "Debug Info: Add basic support for external types references." This reverts commit r242302. External type refs of this form were never used by any LLVM frontend so this is effectively dead code. (They were introduced to support clang module debug info, but in the end we came up with a better design that doesn't use this feature at all.) rdar://problem/25897929 Differential Revision: https://reviews.llvm.org/D30917 llvm-svn: 297684	2017-03-13 22:56:14 +00:00
Marcello Maggioni	598d89a3f4	[IPRA] Change algorithm for RegUsageInfoCollector. The previous algorithm for RegUsageInfoCollector had pretty bad performance on architectures with a lot of registers that alias a lot one another, because we potentially iterate for every register over all the aliasing registers. This costs even more if the function is small and doesn't define a lot of registers. This patch changes the algorithm to one that while iterating over all the registers it will iterate over the aliasing registers only if the register itself is defined. This should be faster based on the assumption that only a subset of the whole LLVM registers set is actually defined in the function. Differential Revision: https://reviews.llvm.org/D30880 llvm-svn: 297673	2017-03-13 21:42:53 +00:00
Volkan Keles	38a91a0de6	GlobalISel: Translate ConstantDataVector Reviewers: qcolombet, aditya_nandakumar, dsanders, t.p.northover, javed.absar, ab Reviewed By: qcolombet, dsanders, ab Subscribers: dberris, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30216 llvm-svn: 297670	2017-03-13 21:36:19 +00:00
Jessica Paquette	c984e21394	[Outliner] Add tail call support This commit adds tail call support to the MachineOutliner pass. This allows the outliner to insert jumps rather than calls in areas where tail calling is possible. Outlined tail calls include the return or terminator of the basic block being outlined from. Tail call support allows the outliner to take returns and terminators into consideration while finding candidates to outline. It also allows the outliner to save more instructions. For example, in the X86-64 outliner, a tail called outlined function saves one instruction since no return has to be inserted. llvm-svn: 297653	2017-03-13 18:39:33 +00:00
Simon Pilgrim	fa97699d09	Fix -Wsentinel warning llvm-svn: 297560	2017-03-11 12:56:02 +00:00
Amaury Sechet	d1ec5d54cf	Use setBits in SelectionDAG Summary: As per title. Reviewers: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30836 llvm-svn: 297559	2017-03-11 11:24:03 +00:00
Quentin Colombet	ee8a4f51c4	[IRTranslator] Simplify error handling for translating constants. NFC. We don't need to check whether the fallback path is enabled to return false. Just do that all the time on error cases, the caller knows (or at least should know!) how to handle the failing case. llvm-svn: 297535	2017-03-11 00:28:33 +00:00
Stanislav Mekhanoshin	b546174b0e	Fix subreg value numbers in handleMoveUp The problem can occur in presence of subregs. If we are swapping two instructions defining different subregs of the same register we will get a new liveout from a block. We need to preserve value number for block's liveout for successor block's livein to match. Differential Revision: https://reviews.llvm.org/D30558 llvm-svn: 297534	2017-03-11 00:14:52 +00:00
Simon Pilgrim	455e2f3313	Strip trailing whitespace. llvm-svn: 297529	2017-03-10 22:53:19 +00:00
Simon Pilgrim	83c37c4dac	Fix redundant condition (PR32138) '!A \|\| (A && B)' is equivalent to '!A \|\| B' llvm-svn: 297527	2017-03-10 22:44:47 +00:00
Volkan Keles	225921ac41	[GlobalISel] LegalizerHelper: Lower (G_FSUB X, Y) to (G_FADD X, (G_FNEG Y)) Summary: No test case as none of the in-tree targets with GlobalISel support has this condition. Reviewers: qcolombet, aditya_nandakumar, dsanders, t.p.northover, ab Reviewed By: qcolombet Subscribers: dberris, rovka, kristof.beyls, llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D30786 llvm-svn: 297512	2017-03-10 21:25:09 +00:00
Volkan Keles	970fee4bfe	GlobalISel: Translate ConstantAggregateZero vectors Reviewers: qcolombet, aditya_nandakumar, dsanders, ab, t.p.northover, javed.absar Reviewed By: qcolombet Subscribers: dberris, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30259 llvm-svn: 297509	2017-03-10 21:23:13 +00:00
Volkan Keles	04cb08cc83	[GlobalISel] Translate insertelement and extractelement Reviewers: qcolombet, aditya_nandakumar, dsanders, ab, t.p.northover, javed.absar Reviewed By: qcolombet Subscribers: dberris, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30761 llvm-svn: 297495	2017-03-10 19:08:28 +00:00
Simon Pilgrim	7dedbfa89d	[SelectionDAG] Add support for BUILD_VECTOR to ComputeNumSignBits llvm-svn: 297492	2017-03-10 18:36:46 +00:00
Volkan Keles	685fbda217	[GlobalISel] Make LegalizerInfo accessible in LegalizerHelper Summary: We don’t actually use LegalizerInfo in Legalizer pass, it’s just passed as an argument. In order to check if an instruction is legal or not, we need to get LegalizerInfo by calling `MI.getParent()->getParent()->getSubtarget().getLegalizerInfo()`. Instead, make LegalizerInfo accessible in LegalizerHelper. Reviewers: qcolombet, aditya_nandakumar, dsanders, ab, t.p.northover, kristof.beyls Reviewed By: qcolombet Subscribers: dberris, llvm-commits, rovka Differential Revision: https://reviews.llvm.org/D30838 llvm-svn: 297491	2017-03-10 18:34:57 +00:00
Amaury Sechet	62e0759d56	[SelectionDAG] Make SelectionDAG aware of the known bits in USUBO and SSUBO and SUBC. Summary: Depends on D30379 This improves the state of things for the sub class of operation. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30436 llvm-svn: 297482	2017-03-10 17:26:44 +00:00
Amaury Sechet	69fa16c810	[SelectionDAG] Make SelectionDAG aware of the known bits in UADDO and SADDO. Summary: As per title. This is extracted from D29872 and I threw SADDO in. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30379 llvm-svn: 297479	2017-03-10 17:06:52 +00:00
Simon Pilgrim	b02667c469	[APInt] Add APInt::insertBits() method to insert an APInt into a larger APInt We currently have to insert bits via a temporary variable of the same size as the target with various shift/mask stages, resulting in further temporary variables, all of which require the allocation of memory for large APInts (MaskSizeInBits > 64). This is another of the compile time issues identified in PR32037 (see also D30265). This patch adds the APInt::insertBits() helper method which avoids the temporary memory allocation and masks/inserts the raw bits directly into the target. Differential Revision: https://reviews.llvm.org/D30780 llvm-svn: 297458	2017-03-10 13:44:32 +00:00
Ahmed Bougacha	d22b84b9d0	[GlobalISel] Use ImmutableCallSite instead of templates. NFC. ImmutableCallSite abstracts away CallInst and InvokeInst. Use it! llvm-svn: 297426	2017-03-10 00:25:44 +00:00
Ahmed Bougacha	4ec6d5abed	[GlobalISel] Fallback when failing to translate invoke. We unintentionally stopped falling back in r293670. While there, change an unusual construct. llvm-svn: 297425	2017-03-10 00:25:35 +00:00
Tim Northover	aa995c98f4	GlobalISel: support trivial inlineasm calls. They're used for nefarious purposes by ObjC. llvm-svn: 297422	2017-03-09 23:36:26 +00:00
Eli Friedman	93f47e5ffb	Refactor alias check from MISched into common helper. NFC. Differential Revision: https://reviews.llvm.org/D30598 llvm-svn: 297421	2017-03-09 23:33:36 +00:00
Amaury Sechet	e7d102cf02	[DAGCombiner] Do various combine on uaddo. Summary: This essentially does the same transform as for ADC. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30417 llvm-svn: 297416	2017-03-09 22:47:00 +00:00
Tim Northover	d1e951e5eb	GlobalISel: inform FrameLowering when we emit a function call. Amongst other things (I expect) this is necessary to ensure decent backtraces when an "unreachable" is involved. llvm-svn: 297413	2017-03-09 22:00:39 +00:00
Tim Northover	7a9ea8f628	GlobalISel: put debug info for static allocas in the MachineFunction. The good reason to do this is that static allocas are pretty simple to handle (especially at -O0) and avoiding tracking DBG_VALUEs throughout the pipeline should give some kind of performance benefit. The bad reason is that the debug pipeline is an unholy mess of implicit contracts, where determining whether "DBG_VALUE %reg, imm" actually implies a load or not involves the services of at least 3 soothsayers and the sacrifice of at least one chicken. And it still gets it wrong if the variable is at SP directly. llvm-svn: 297410	2017-03-09 21:12:06 +00:00
Amaury Sechet	10425de063	[DAGCombiner] Do various combine on usubo. Summary: This essentially does the same transform as for SUBC. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30437 llvm-svn: 297404	2017-03-09 19:28:00 +00:00
Sanjay Patel	df21979db7	[DAG] recognize div/rem by 0 as undef before trying constant folding As discussed in the review thread for rL297026, this is actually 2 changes that would independently fix all of the test cases in the patch: 1. Return undef in FoldConstantArithmetic for div/rem by 0. 2. Move basic undef simplifications for div/rem (simplifyDivRem()) before foldBinopIntoSelect() as a matter of efficiency. I will handle the case of vectors with any zero element as a follow-up. That change is the DAG sibling for D30665 + adding a check of vector elements to FoldConstantVectorArithmetic(). I'm deleting the test for PR30693 because it does not test for the actual bug any more (dangers of using bugpoint). Differential Revision: https://reviews.llvm.org/D30741 llvm-svn: 297384	2017-03-09 15:02:25 +00:00
Adam Nemet	5361b82d54	[SSP] In opt remarks, stream Function directly With this, it shows up as an attribute in YAML and non-printable characters are properly removed by GlobalValue::getRealLinkageName. llvm-svn: 297362	2017-03-09 06:10:27 +00:00
Matt Arsenault	9a3fd87523	DAG: Check no signed zeros instead of unsafe math attribute llvm-svn: 297354	2017-03-09 01:36:39 +00:00
Konstantin Zhuravlyov	d5561e0a0b	[DebugInfo] Emit address space with DW_AT_address_class attribute for pointer and reference types Differential Revision: https://reviews.llvm.org/D29670 llvm-svn: 297320	2017-03-08 23:55:44 +00:00
Jessica Paquette	d4cb9c6da0	[Outliner] Fix memory leak in suffix tree. This commit changes the BumpPtrAllocator for suffix tree nodes to a SpecificBumpPtrAllocator. Before, node construction was leaking memory because of the DenseMap in SuffixTreeNodes. Changing this to a SpecificBumpPtrAllocator allows this memory to properly be released. llvm-svn: 297319	2017-03-08 23:55:33 +00:00
Tim Northover	7596bd7a27	GlobalISel: correctly handle trivial fcmp predicates. It makes sense to only do them once in IRTranslator rather than making everyone deal with them. llvm-svn: 297304	2017-03-08 18:49:54 +00:00
Volkan Keles	5698b2ae6e	[GlobalISel] Add default action for G_FNEG Summary: rL297171 introduced G_FNEG for floating-point negation instruction and IRTranslator started to translate `FSUB -0.0, X` to `FNEG X`. This patch adds a default action for G_FNEG to avoid breaking existing targets. Reviewers: qcolombet, ab, kristof.beyls, t.p.northover, aditya_nandakumar, dsanders Reviewed By: qcolombet Subscribers: dberris, rovka, llvm-commits Differential Revision: https://reviews.llvm.org/D30721 llvm-svn: 297301	2017-03-08 18:09:14 +00:00
Eli Friedman	c2c2e21d77	[DAGCombine] Simplify ISD::AND in GetDemandedBits. This helps in cases involving bitfields where an AND is exposed by legalization. Differential Revision: https://reviews.llvm.org/D30472 llvm-svn: 297249	2017-03-08 00:56:35 +00:00
Konstantin Zhuravlyov	f9b41cd3d8	[DebugInfo] Make legal and emit DW_OP_swap and DW_OP_xderef Differential Revision: https://reviews.llvm.org/D29672 llvm-svn: 297247	2017-03-08 00:28:57 +00:00
Daniel Sanders	1351db49b2	Fix additional constructor call missed by r297241. It was added between my build+test and my commit. llvm-svn: 297244	2017-03-07 23:32:10 +00:00
Daniel Sanders	52b4ce727a	Recommit: [globalisel] Change LLT constructor string into an LLT-based object that knows how to generate it. Summary: This will allow future patches to inspect the details of the LLT. The implementation is now split between the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns. Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem. The problem with the previous commit appears to have been that TableGen was including CodeGen/LowLevelType.h instead of Support/LowLevelTypeImpl.h. Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30046 llvm-svn: 297241	2017-03-07 23:20:35 +00:00
Tim Northover	542d1c1463	GlobalISel: use inserts for landingpad instead of sequences. llvm-svn: 297237	2017-03-07 23:04:06 +00:00
Tim Northover	2eb18d3c4b	GlobalISel: fix legalization of G_INSERT We were calculating incorrect extract/insert offsets by trying to be too tricksy with min/max. It's clearer to just split the logic up into "register starts before this segment" vs "after". llvm-svn: 297226	2017-03-07 21:24:33 +00:00
Yaron Keren	75fadfc774	Implement FreeMachineFunction::getPassName(). llvm-svn: 297222	2017-03-07 20:59:08 +00:00
Ahmed Bougacha	55d10423a6	[GlobalISel] Don't translate intrinsics with metadata parameters. Some intrinsics take metadata parameters. These all need custom handling of some form, and cannot possibly be lowered generically to G_INTRINSIC calls with vreg operands. Reject them, instead of hitting an assert later in getOrCreateVReg. llvm-svn: 297209	2017-03-07 20:53:09 +00:00
Ahmed Bougacha	5c7924fca5	[GlobalISel] Avoid invalidating ValToVReg when translating no-op bitcast. When we translate a no-op (same type) bitcast, we try to be clever and only emit a COPY if we already assigned a vreg to the defined value. However, when we didn't, we tried to assign to a reference into the ValToVReg DenseMap, even though the RHS of the assignment (getOrCreateVReg) could potentially grow that DenseMap, invalidating the reference. Avoid that by getting the source vreg first. I audited the rest of the translator; this is the only tricky case. The test is quite unwieldy, as the problem is caused by the DenseMap growing, which happens after the 47th mapped value. llvm-svn: 297208	2017-03-07 20:53:06 +00:00
Ahmed Bougacha	38455ea8a6	[GlobalISel] Relax vector G_SELECT assertion. For vector operands, the `select` instruction supports both vector and non-vector conditions. The MIR builder had an overly restrictive assertion, that only accepted vector conditions for vector selects (in effect implementing ISD::VSELECT). Make it possible to express the full range of G_SELECTs. llvm-svn: 297207	2017-03-07 20:53:03 +00:00
Ahmed Bougacha	adce3ee219	[GlobalISel] Slightly clean up DBG_VALUE FP build code. I messed up my rebases leading to r297200, and ended up with stale (but working) code. Fix it. llvm-svn: 297205	2017-03-07 20:52:57 +00:00
Ahmed Bougacha	c373262d52	[GlobalISel] Ignore %noreg when applying default regbank mapping. When computing the mapping for non-generic instructions, we skipped %noreg operands, because we can't always reason about their banks. Also skip them when applying the mapping. Otherwise, we could end up with mappings that we can't apply. While there, duplicate an assert to distinguish between the two error conditions. llvm-svn: 297201	2017-03-07 20:34:23 +00:00
Ahmed Bougacha	4826bae8b4	[GlobalISel] Emit DBG_VALUE %noreg for non-int/fp constant values. When a dbg_value has a constant operand that isn't representable in MI, there isn't much we can do. Use %noreg (0) for those situations. This matches the SelectionDAG behavior. llvm-svn: 297200	2017-03-07 20:34:20 +00:00
Arnold Schwaighofer	69e74b48f2	SjLjEHPrepare: Fix the pass for swifterror arguments We cannot leave the identity copies 'select true, arg, undef' that this pass inserts for arguments to simplify handling of values on swifterror arguments. swifterror arguments have restrictions on their uses. rdar://30839288 llvm-svn: 297197	2017-03-07 20:29:02 +00:00
Daniel Sanders	8ebec37d26	Revert r297177: Change LLT constructor string into an LLT-based object ... More module problems. This time it only showed up in the stage 2 compile of clang-x86_64-linux-selfhost-modules-2 but not the stage 1 compile. Somehow, this change causes the build to need Attributes.gen before it's been generated. llvm-svn: 297188	2017-03-07 19:21:23 +00:00
Daniel Sanders	8612326a08	[globalisel] Change LLT constructor string into an LLT-based object that knows how to generate it. Summary: This will allow future patches to inspect the details of the LLT. The implementation is now split between the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns. Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem. Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30046 llvm-svn: 297177	2017-03-07 18:32:25 +00:00
Volkan Keles	20d3c4200d	[GlobalISel] Translate floating-point negation Reviewers: qcolombet, javed.absar, aditya_nandakumar, dsanders, t.p.northover, ab Reviewed By: qcolombet Subscribers: dberris, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30671 llvm-svn: 297171	2017-03-07 18:03:28 +00:00
Tim Northover	c2c545b8f7	GlobalISel: restrict G_EXTRACT instruction to just one operand. A bit more painful than G_INSERT because it was more widely used, but this should simplify the handling of extract operations in most locations. llvm-svn: 297100	2017-03-06 23:50:28 +00:00
Sanjay Patel	7f18ec50ba	[DAG] refactor related div/rem folds; NFCI This is known incomplete and not called in the right order relative to other folds, but that's the current behavior. I'm just trying to clean this up before making actual functional changes to make the patch smaller. The logic here should mimic the IR equivalents that are in InstSimplify's simplifyDivRem(). llvm-svn: 297086	2017-03-06 22:32:40 +00:00
Paul Robinson	f96e21ad6d	[DWARFv5] Update definitions to match published spec. Some late additions to DWARF v5 were not in Dwarf.def; also one form was redefined. Add the new cases to relevant switches in different parts of LLVM. Replace DW_FORM_ref_sup with DW_FORM_ref_sup[4,8]. I did not add support for DW_FORM_strx3/addrx3 other that defining the constants. We don't have any infrastructure to support these. Differential Revision: http://reviews.llvm.org/D30664 llvm-svn: 297085	2017-03-06 22:20:03 +00:00
Jessica Paquette	596f483a5e	[Outliner] Fixed Asan bot failure in r296418 Fixed the asan bot failure which led to the last commit of the outliner being reverted. The change is in lib/CodeGen/MachineOutliner.cpp in the SuffixTree's constructor. LeafVector is no longer initialized using reserve but just a standard constructor. llvm-svn: 297081	2017-03-06 21:31:18 +00:00
Krzysztof Parzyszek	5b8fae5edd	[IfConversion] Only renormalize probabilities if branches are analyzable If a block has non-analyzable branches, the listed successors don't need to add up to one. For example, if a block has a conditional tail call, that tail call will not have a corresponding successor in the successor list, but will still be a possible branch. Differential Revision: https://reviews.llvm.org/D30556 llvm-svn: 297054	2017-03-06 19:12:42 +00:00
Tim Northover	95b6d5f2b1	GlobalISel: don't emit degenerate G_INSERT instructions. Before, we were producing G_INSERT instructions that were actually closer to a cast or even a COPY when both input and output sizes are the same. This doesn't really make sense and means that everything interpreting a G_INSERT also has to handle all these kinds of casts. So now we detect these degenerate cases and emit real casts instead. llvm-svn: 297051	2017-03-06 19:04:17 +00:00
Tim Northover	81dafc1c88	GlobalISel: add buildUndef method to MachineIRBuilder. NFC. llvm-svn: 297044	2017-03-06 18:36:40 +00:00
Tim Northover	75e0b91e59	GlobalISel: refactor legalization of G_INSERT. Now that G_INSERT instructions can only insert one register, this code was overly general. In another direction it didn't handle registers that crossed split boundaries properly, which needed to be fixed. llvm-svn: 297042	2017-03-06 18:23:04 +00:00
Sanjay Patel	7f7947bf41	[DAGCombiner] simplify div/rem-by-0 Refactoring of duplicated code and more fixes to follow. This is motivated by the post-commit comments for r296699: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170306/435182.html Ie, we can crash if we're missing obvious simplifications like this that exist in the IR simplifier or if these occur later than expected. The x86 change for non-splat division shows a potential opportunity to improve vector codegen: we assumed that since only one lane had meaningful results, we should do the math in scalar. But that means moving back and forth from vector registers. llvm-svn: 297026	2017-03-06 16:36:42 +00:00
Sanjay Patel	6b029a5380	[DAG] fix formatting; NFC llvm-svn: 297015	2017-03-06 15:27:57 +00:00
Sanjay Patel	5273afd4bb	[DAG] fix typo in comment; NFC llvm-svn: 297011	2017-03-06 15:07:43 +00:00
Dean Michael Berris	7e8eea429f	[XRay] Allow logging the first argument of a function call. Summary: Functions with the "xray-log-args" attribute will have a special XRay sled kind emitted, for compiler-rt to copy any call arguments to your logging handler. For practical and performance reasons, only the first argument is supported, and only up to 64 bits. Reviewers: dberris Reviewed By: dberris Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29702 llvm-svn: 296998	2017-03-06 06:48:56 +00:00
Simon Pilgrim	584d6d9d91	[SelectionDAG] Fix vector splitting for *_EXTEND_VECTOR_INREG instructions Found by fuzz testing after rL296985 landed llvm-svn: 296989	2017-03-05 15:52:18 +00:00
Simon Pilgrim	9f5c251d57	[X86][SSE] Lower 128-bit vectors to SIGN/ZERO_EXTEND_VECTOR_IN_REG ops As described on PR31712, we miss a variety of legalization combines because we lower these to X86ISD::VSEXT/VZEXT despite them having the same functionality. This patch makes 128-bit (SSE41) SIGN/ZERO_EXTEND_VECTOR_IN_REG ops legal, adds the necessary tablegen plumbing and uses a helper 'getExtendInVec' to decide when to use SIGN/ZERO_EXTEND_VECTOR_IN_REG or VSEXT/VZEXT. We're missing a couple of shuffle combines that will be added in a future patch for review. Later patches can then support the AVX2 cases as a mixture of SIGN/ZERO_EXTEND and SIGN/ZERO_EXTEND_VECTOR_IN_REG, and then finally deal with the AVX512 cases. Differential Revision: https://reviews.llvm.org/D30549 llvm-svn: 296985	2017-03-05 09:57:20 +00:00
Craig Topper	6ffc044b2f	[DAGCombine] Use APInt::operator\|(uint64_t) instead of creating a temporary APInt and calling APInt::Or. NFC This is more efficient by itself. But this is prep for a future patch that may remove APInt::Or while making operator\| support rvalue references similar to add/sub. llvm-svn: 296981	2017-03-05 01:08:16 +00:00
Sanjay Patel	066f3208bf	[DAGCombiner] allow transforming (select Cond, C +/- 1, C) to (add(ext Cond), C) select Cond, C +/- 1, C --> add(ext Cond), C -- with a target hook. This is part of the ongoing process to obsolete D24480. The motivation is to canonicalize to select IR in InstCombine whenever possible, so we need to have a way to undo that easily in codegen. PowerPC is an obvious winner for this kind of transform because it has fast and complete bit-twiddling abilities but generally lousy conditional execution perf (although this might have changed in recent implementations). x86 also sees some wins, but the effect is limited because these transforms already mostly exist in its target-specific combineSelectOfTwoConstants(). The fact that we see any x86 changes just shows that that code is a mess of special-case holes. We may be able to remove some of that logic now. My guess is that other targets will want to enable this hook for most cases. The likely follow-ups would be to add value type and/or the constants themselves as parameters for the hook. As the tests in select_const.ll show, we can transform any select-of-constants to math/logic, but the general transform for any 2 constants needs one more instruction (multiply or 'and'). ARM is one target that I think may not want this for most cases. I see infinite loops there because it wants to use selects to enable conditionally executed instructions. Differential Revision: https://reviews.llvm.org/D30537 llvm-svn: 296977	2017-03-04 19:18:09 +00:00
Florian Hahn	6406f98342	[legalize-types] Remove stale entries from SoftenedFloats. Summary: When replacing a SDValue, we should remove the replaced value from SoftenedFloats (and possibly the other maps as well?). When we revisit a Node because it needs analyzing again, we have to remove all result values from SoftenedFloats (and possibly other maps?). This fixes the fp128 test failures with expensive checks for X86. I think we probably should also remove the values from the other maps (PromotedIntegers and so on), let me know what you think. Reviewers: baldrick, bogner, davidxl, ab, arsenm, pirama, chh, RKSimon Reviewed By: chh Subscribers: danalbert, wdng, srhines, hfinkel, sepavloff, llvm-commits Differential Revision: https://reviews.llvm.org/D29265 llvm-svn: 296964	2017-03-04 12:00:35 +00:00
Eli Friedman	49f0220086	[MISched] Remove unused arguments. NFC. llvm-svn: 296934	2017-03-04 00:42:55 +00:00
Matthias Braun	ffe40dd69e	RegAllocGreedy: Follow-up to r296722 We can now end up in situations where we initiate LiveIntervalUnion queries with different SubRanges against the same register unit, so the assert() no longer holds in all cases. Just recalculate now when we know the cache is out of date. llvm-svn: 296928	2017-03-03 23:27:20 +00:00
Tim Northover	3e6a7afd81	GlobalISel: constrain G_INSERT to inserting just one value per instruction. It's much easier to reason about single-value inserts and no-one was actually using the variadic variants before. llvm-svn: 296923	2017-03-03 23:05:47 +00:00
Tim Northover	bf017293af	GlobalISel: add merge/unmerge nodes for legalization. These are simplified variants of the current G_SEQUENCE and G_EXTRACT, which assume the individual parts will be contiguous, homogeneous, and occupy the entirity of the larger register. This makes reasoning about them much easer since you only have to look at the first register being merged and the result to know what the instruction is doing. I intend to gradually replace all uses of the more complicated sequence/extract with these (or single-element insert/extracts), and then remove the older variants. For now we start with legalization. llvm-svn: 296921	2017-03-03 22:46:09 +00:00
Matthias Braun	a04d7ad851	RegisterCoalescer: Simplify subrange splitting code; NFC - Use slightly better variable names / compute in a more direct way. llvm-svn: 296905	2017-03-03 19:05:34 +00:00
Simon Pilgrim	6dfab414db	Use APInt::setBits instead of OR'ing in a separate APInt::getBitsSet call llvm-svn: 296886	2017-03-03 17:03:52 +00:00
Simon Pilgrim	cf12b5e1a6	Use APInt::getOneBitSet instead of APInt::getBitsSet for sign bit mask creation Avoids all the unnecessary extra bitrange creation/shift stages. llvm-svn: 296879	2017-03-03 16:35:57 +00:00
Simon Pilgrim	10754abe7e	Use APInt::getOneBitSet instead of APInt::getBitsSet for sign bit mask creation Avoids all the unnecessary extra bitrange creation/shift stages. llvm-svn: 296871	2017-03-03 14:25:46 +00:00
Simon Pilgrim	b01bb3a1b2	Fix Wdocumentation warning llvm-svn: 296866	2017-03-03 12:09:11 +00:00
Chandler Carruth	ce52b80744	[SDAG] Revert r296476 (and r296486, r296668, r296690). This patch causes compile times for some patterns to explode. I have a (large, unreduced) test case that slows down by more than 20x and several test cases slow down by 2x. I'm sending some of the test cases directly to Nirav and following up with more details in the review log, but this should unblock anyone else hitting this. llvm-svn: 296862	2017-03-03 10:02:25 +00:00
Adrian Prantl	ea8880b81f	LiveDebugValues: Assume calls never clobber SP. A call should never modify the stack pointer, but some backends are not so sure about this and never list SP in the regmask. For the purposes of LiveDebugValues we assume a call never clobbers SP. We already have a similar workaround in DbgValueHistoryCalculator (which we hopefully can retire soon). This fixes the availabilty of local ASANified variables on AArch64. rdar://problem/27757381 llvm-svn: 296847	2017-03-03 01:08:25 +00:00
Kyle Butt	1fa6030767	CodeGen: BlockPlacement: Precompute layout for chains of triangles. For chains of triangles with small join blocks that can be tail duplicated, a simple calculation of probabilities is insufficient. Tail duplication can be profitable in 3 different ways for these cases: 1) The post-dominators marked 50% are actually taken 56% (This shrinks with longer chains) 2) The chains are statically correlated. Branch probabilities have a very U-shaped distribution. [http://nrs.harvard.edu/urn-3:HUL.InstRepos:24015805] If the branches in a chain are likely to be from the same side of the distribution as their predecessor, but are independent at runtime, this transformation is profitable. (Because the cost of being wrong is a small fixed cost, unlike the standard triangle layout where the cost of being wrong scales with the # of triangles.) 3) The chains are dynamically correlated. If the probability that a previous branch was taken positively influences whether the next branch will be taken We believe that 2 and 3 are common enough to justify the small margin in 1. The code pre-scans a function's CFG to identify this pattern and marks the edges so that the standard layout algorithm can use the computed results. llvm-svn: 296845	2017-03-03 01:00:22 +00:00
Taewook Oh	96c6415697	[DAGCombiner] Fix DebugLoc propagation when folding !(x cc y) -> (x !cc y) Summary: Currently, when 't1: i1 = setcc t2, t3, cc' followed by 't4: i1 = xor t1, Constant:i1<-1>' is folded into 't5: i1 = setcc t2, t3 !cc', SDLoc of newly created SDValue 't5' follows SDLoc of 't4', not 't1'. However, as the opcode of newly created SDValue is 'setcc', it make more sense to take DebugLoc from 't1' than 't4'. For the code below ``` extern int bar(); extern int baz(); int foo(int x, int y) { if (x != y) return bar(); else return baz(); } ``` , following is the bitcode representation of 'foo' at the end of llvm-ir level optimization: ``` define i32 @foo(i32 %x, i32 %y) !dbg !4 { entry: tail call void @llvm.dbg.value(metadata i32 %x, i64 0, metadata !9, metadata !11), !dbg !12 tail call void @llvm.dbg.value(metadata i32 %y, i64 0, metadata !10, metadata !11), !dbg !13 %cmp = icmp ne i32 %x, %y, !dbg !14 br i1 %cmp, label %if.then, label %if.else, !dbg !16 if.then: ; preds = %entry %call = tail call i32 (...) @bar() #3, !dbg !17 br label %return, !dbg !18 if.else: ; preds = %entry %call1 = tail call i32 (...) @baz() #3, !dbg !19 br label %return, !dbg !20 return: ; preds = %if.else, %if.then %retval.0 = phi i32 [ %call, %if.then ], [ %call1, %if.else ] ret i32 %retval.0, !dbg !21 } !14 = !DILocation(line: 5, column: 9, scope: !15) !16 = !DILocation(line: 5, column: 7, scope: !4) ``` As you can see, in 'entry' block, 'icmp' instruction and 'br' instruction have different debug locations. However, with current implementation, there's no distinction between debug locations of these two when they are lowered to asm instructions. This is because 'icmp' and 'br' become 'setcc' 'xor' and 'brcond' in SelectionDAG, where SDLoc of 'setcc' follows the debug location of 'icmp' but SDLOC of 'xor' and 'brcond' follows the debug location of 'br' instruction, and SDLoc of 'xor' overwrites SDLoc of 'setcc' when they are folded. This patch addresses this issue. Reviewers: atrick, bogner, andreadb, craig.topper, aprantl Reviewed By: andreadb Subscribers: jlebar, mkuper, jholewinski, andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D29813 llvm-svn: 296825	2017-03-02 21:58:35 +00:00
Sanjay Patel	7884dcb788	[DAG] early exit to improve readability and formatting of visitMemCmpCall(); NFCI llvm-svn: 296824	2017-03-02 21:56:43 +00:00
Kyle Butt	1393761e0c	CodeGen: MachineBlockPlacement: Remove the unused outlining heuristic. Outlining optional branches isn't a good heuristic, and it's never been on by default. Remove it to clean things up. llvm-svn: 296818	2017-03-02 21:44:24 +00:00
Zachary Turner	d9dc2829ea	[Support] Move Stream library from MSF -> Support. After several smaller patches to get most of the core improvements finished up, this patch is a straight move and header fixup of the source. Differential Revision: https://reviews.llvm.org/D30266 llvm-svn: 296810	2017-03-02 20:52:51 +00:00
Sanjay Patel	209b0f9aad	[DAG] improve documentation comments; NFC llvm-svn: 296808	2017-03-02 20:48:08 +00:00
Simon Pilgrim	7b227fecb5	Fix some Wdocumentation warnings llvm-svn: 296783	2017-03-02 18:59:07 +00:00
Sanjay Patel	fffa179837	[DAGCombiner] avoid assertion when folding binops with opaque constants This bug was introduced with: https://reviews.llvm.org/rL296699 There may be a way to loosen the restriction, but for now just bail out on any opaque constant. The tests show that opacity is target-specific. This goes back to cost calculations in ConstantHoisting based on TTI->getIntImmCost(). llvm-svn: 296768	2017-03-02 17:18:56 +00:00
Sanjay Patel	f7aba7ba22	fix typo in comment; NFC llvm-svn: 296760	2017-03-02 16:37:24 +00:00
Serge Pavlov	e2bf69715f	Do not verify MachimeDominatorTree if it is not calculated If dominator tree is not calculated or is invalidated, set corresponding pointer in the pass state to nullptr. Such pointer value will indicate that operations with dominator tree are not allowed. In particular, it allows to skip verification for such pass state. The dominator tree is not calculated if the machine dominator pass was skipped, it occures in the case of entities with linkage available_externally. The change fixes some test fails observed when expensive checks are enabled. Differential Revision: https://reviews.llvm.org/D29280 llvm-svn: 296742	2017-03-02 12:00:10 +00:00
Matthias Braun	dbcf9e2ee4	LiveRegMatrix: Fix some subreg interference checks Surprisingly, one of the three interference checks in LiveRegMatrix was using the main live range instead of the apropriate subregister range resulting in unnecessarily conservative results. llvm-svn: 296722	2017-03-02 00:35:08 +00:00
Paul Robinson	a94f76b18c	Remove spurious use of LLVM_FALLTHROUGH (NFC) llvm-svn: 296713	2017-03-01 23:59:11 +00:00
Amaury Sechet	71f511fd1e	[DAGCombiner] mulhi + 1 never overflow. Summary: This can be used to optimize large multiplications after legalization. Depends on D29565 Reviewers: mkuper, spatel, RKSimon, zvi, bkramer, aaboud, craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29587 llvm-svn: 296711	2017-03-01 23:44:17 +00:00
Ahmed Bougacha	120ae22d70	[GlobalISel] Add a way for targets to enable GISel. Until now, we've had to use -global-isel to enable GISel. But using that on other targets that don't support it will result in an abort, as we can't build a full pipeline. Additionally, we want to experiment with enabling GISel by default for some targets: we can't just enable GISel by default, even among those target that do have some support, because the level of support varies. This first step adds an override for the target to explicitly define its level of support. For AArch64, do that using a new command-line option (I know..): -aarch64-enable-global-isel-at-O=<N> Where N is the opt-level below which GISel should be used. Default that to -1, so that we still don't enable GISel anywhere. We're not there yet! While there, remove a couple LLVM_UNLIKELYs. Building the pipeline is such a cold path that in practice that shouldn't matter at all. llvm-svn: 296710	2017-03-01 23:33:08 +00:00
Sanjay Patel	92938657a0	[DAGCombiner] fold binops with constant into select-of-constants This is part of the ongoing attempt to improve select codegen for all targets and select canonicalization in IR (see D24480 for more background). The transform is a subset of what is done in InstCombine's FoldOpIntoSelect(). I first noticed a regression in the x86 avx512-insert-extract.ll tests with a patch that hopes to convert more selects to basic math ops. This appears to be a general missing DAG transform though, so I added tests for all standard binops in rL296621 (PowerPC was chosen semi-randomly; it has scripted FileCheck support, but so do ARM and x86). The poor output for "sel_constants_shl_constant" is tracked with: https://bugs.llvm.org/show_bug.cgi?id=32105 Differential Revision: https://reviews.llvm.org/D30502 llvm-svn: 296699	2017-03-01 22:51:31 +00:00
Victor Leschuk	d7bfa40ace	[DebugInfo] [DWARFv5] Unique abbrevs for DIEs with different implicit_const values Take DW_FORM_implicit_const attribute value into account when profiling DIEAbbrevData. Currently if we have two similar types with implicit_const attributes and different values we end up with only one abbrev in .debug_abbrev section. For example consider two structures: S1 with implicit_const attribute ATTR and value VAL1 and S2 with implicit_const ATTR and value VAL2. The .debug_abbrev section will contain only 1 related record: [N] DW_TAG_structure_type DW_CHILDREN_yes DW_AT_ATTR DW_FORM_implicit_const VAL1 // .... This is incorrect as struct S2 (with VAL2) will use abbrev record with VAL1. With this patch we will have two different abbreviations here: [N] DW_TAG_structure_type DW_CHILDREN_yes DW_AT_ATTR DW_FORM_implicit_const VAL1 // .... [M] DW_TAG_structure_type DW_CHILDREN_yes DW_AT_ATTR DW_FORM_implicit_const VAL2 // .... llvm-svn: 296691	2017-03-01 22:13:42 +00:00
Benjamin Kramer	0e429606b0	[DAGCombiner] Remove non-ascii character and reflow comment. llvm-svn: 296690	2017-03-01 22:10:43 +00:00
Matthias Braun	173e11439e	LIU:::Query: Query LiveRange instead of LiveInterval; NFC - We only need the information from the base class, not the additional details in the LiveInterval class. - Spread more `const` - Some code cleanup llvm-svn: 296684	2017-03-01 21:48:12 +00:00
Reid Kleckner	f7c0980c10	Elide argument copies during instruction selection Summary: Avoids tons of prologue boilerplate when arguments are passed in memory and left in memory. This can happen in a debug build or in a release build when an argument alloca is escaped. This will dramatically affect the code size of x86 debug builds, because X86 fast isel doesn't handle arguments passed in memory at all. It only handles the x86_64 case of up to 6 basic register parameters. This is implemented by analyzing the entry block before ISel to identify copy elision candidates. A copy elision candidate is an argument that is used to fully initialize an alloca before any other possibly escaping uses of that alloca. If an argument is a copy elision candidate, we set a flag on the InputArg. If the the target generates loads from a fixed stack object that matches the size and alignment requirements of the alloca, the SelectionDAG builder will delete the stack object created for the alloca and replace it with the fixed stack object. The load is left behind to satisfy any remaining uses of the argument value. The store is now dead and is therefore elided. The fixed stack object is also marked as mutable, as it may now be modified by the user, and it would be invalid to rematerialize the initial load from it. Supersedes D28388 Fixes PR26328 Reviewers: chandlerc, MatzeB, qcolombet, inglorion, hans Subscribers: igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D29668 llvm-svn: 296683	2017-03-01 21:42:00 +00:00
Matthias Braun	702f55bb4a	LIU::Query: Remove always false member+getter; NFC llvm-svn: 296675	2017-03-01 21:02:52 +00:00
Nemanja Ivanovic	b223cfabcc	Improve scheduling with branch coalescing This patch adds a MachineSSA pass that coalesces blocks that branch on the same condition. Committing on behalf of Lei Huang. Differential Revision: https://reviews.llvm.org/D28249 llvm-svn: 296670	2017-03-01 20:29:34 +00:00
Nirav Dave	0a4703b5ec	[DAG] Prevent Stale nodes from entering worklist Add check that deleted nodes do not get added to worklist. This can occur when a node's operand is simplified to an existing node. This fixes PR32108. Reviewers: jyknight, hfinkel, chandlerc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30506 llvm-svn: 296668	2017-03-01 20:19:38 +00:00
Paul Robinson	d4f1c487f3	Alphabetize some cases (NFC) llvm-svn: 296655	2017-03-01 19:01:47 +00:00
Paul Robinson	91d74813a6	[DWARF] Default lower bound should respect requested DWARF version. DWARF may define a default lower-bound for arrays in languages defined in a particular DWARF version. But the logic to suppress an unnecessary lower-bound attribute was looking at the hard-coded default DWARF version, not the version that had been requested. Also updated the list with all languages defined in DWARF v5. Differential Revision: http://reviews.llvm.org/D30484 llvm-svn: 296652	2017-03-01 18:32:37 +00:00
Artur Pilipenko	e1b2d31468	[DAGCombiner] Support {a\|s}ext, {a\|z\|s}ext load nodes in load combine Resubmit r295336 after the bug with non-zero offset patterns on BE targets is fixed (r296336). Support {a\|s}ext, {a\|z\|s}ext load nodes as a part of load combine patters. Reviewed By: filcab Differential Revision: https://reviews.llvm.org/D29591 llvm-svn: 296651	2017-03-01 18:12:29 +00:00
Ahmed Bougacha	20b3e9a835	[CodeGen] Remove dead FastISel code after SDAG emitted a tailcall. When SDAGISel (top-down) selects a tail-call, it skips the remainder of the block. If, before that, FastISel (bottom-up) selected some of the (no-op) next few instructions, we can end up with dead instructions following the terminator (selected by SDAGISel). We need to erase them, as we know they aren't necessary (in addition to being incorrect). We already do this when FastISel falls back on the tail-call itself. Also remove the FastISel-emitted code if we fallback on the instructions between the tail-call and the return. llvm-svn: 296552	2017-03-01 00:43:42 +00:00
Ahmed Bougacha	67d1c7c3c2	[GlobalISel] Replace all combined G_EXTRACT uses. Iterating on the use-list we're modifying doesn't work: after the first iteration, the use-list iterator will point to a MachineOperand referencing the new register. This caused us to skip the other uses to replace. Instead, use MRI.replaceRegWith(), which accounts for this behavior. llvm-svn: 296551	2017-03-01 00:43:39 +00:00
Paul Robinson	3443575c03	Add missing module/license header. NFC. llvm-svn: 296550	2017-03-01 00:14:42 +00:00
Paul Robinson	cddd60445e	[DWARFv5] Emit new unit header format. Requesting DWARF v5 will now get you the new compile-unit and type-unit headers. llvm-dwarfdump will also recognize them. Differential Revision: http://reviews.llvm.org/D30206 llvm-svn: 296514	2017-02-28 20:24:55 +00:00
Sanjay Patel	ea61ea9f19	[DAGCombiner] use dyn_cast values in foldSelectOfConstants(); NFC llvm-svn: 296502	2017-02-28 18:41:49 +00:00
Craig Topper	419f145ebb	[DAGISel] When checking if chain node is foldable, make sure the intermediate nodes have a single use across all results not just the result that was used to reach the chain node. This recovers a test case that was severely broken by r296476, my making sure we don't create ADD/ADC that loads and stores when there is also a flag dependency. llvm-svn: 296486	2017-02-28 16:52:05 +00:00
David Bozier	5159968786	[Stack Protection] Add diagnostic information for why stack protection was applied to a function Stack Smash Protection is not completely free, so in hot code, the overhead it causes can cause performance issues. By adding diagnostic information for which functions have SSP and why, a user can quickly determine what they can do to stop SSP being applied to a specific hot function. This change adds a remark that is reported by the stack protection code when an instruction or attribute is encountered that causes SSP to be applied. Patch by: James Henderson Differential Revision: https://reviews.llvm.org/D29023 llvm-svn: 296483	2017-02-28 16:02:37 +00:00
Daniel Sanders	983c9b98e9	Revert r296474 - [globalisel] Change LLT constructor string into an LLT subclass that knows how to generate it. There's a circular dependency that's only revealed when LLVM_ENABLE_MODULES=1. llvm-svn: 296478	2017-02-28 15:00:27 +00:00
Nirav Dave	f830dec3f2	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Recommiting after fixup of 32-bit aliasing sign offset bug in DAGCombiner. * Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly constructs wider loads, but then promotes them to float operations which appear but requires more expensive constant generation). Some minor peephole optimizations to deal with improved SubDAG shapes (listed below) Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests. 5. Remove Chain dependencies of Memory operations on CopyfromReg nodes as these are captured by data dependence 6. Forward loads-store values through tokenfactors containing {CopyToReg,CopyFromReg} Values. 7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll) 8. Store merging for the ARM target is restricted to 32-bit as some in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting: CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle llvm-svn: 296476	2017-02-28 14:24:15 +00:00
Daniel Sanders	a5afdefec6	[globalisel] Change LLT constructor string into an LLT subclass that knows how to generate it. Summary: This will allow future patches to inspect the details of the LLT. The implementation is now split between the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns. Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem. Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30046 llvm-svn: 296474	2017-02-28 14:21:31 +00:00
Sanjoy Das	eef785c1a5	[ImplicitNullCheck] Add alias analysis usage Summary: With this change ImplicitNullCheck optimization uses alias analysis and can use load/store memory access for implicit null check if there are other load/store before but memory accesses do not alias. Patch by Serguei Katkov! Reviewers: sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30331 llvm-svn: 296440	2017-02-28 07:04:49 +00:00
Matthias Braun	81f68ec3a9	Revert "Add MIR-level outlining pass" Revert Machine Outliner for now, as it breaks the asan bot. This reverts commit r296418. llvm-svn: 296426	2017-02-28 02:24:30 +00:00
Matthias Braun	d36410945f	Add MIR-level outlining pass This is a patch for the outliner described in the RFC at: http://lists.llvm.org/pipermail/llvm-dev/2016-August/104170.html The outliner is a code-size reduction pass which works by finding repeated sequences of instructions in a program, and replacing them with calls to functions. This is useful to people working in low-memory environments, where sacrificing performance for space is acceptable. This adds an interprocedural outliner directly before printing assembly. For reference on how this would work, this patch also includes X86 target hooks and an X86 test. The outliner is run like so: clang -mno-red-zone -mllvm -enable-machine-outliner file.c Patch by Jessica Paquette<jpaquette@apple.com>! rdar://29166825 Differential Revision: https://reviews.llvm.org/D26872 llvm-svn: 296418	2017-02-28 00:33:32 +00:00
Michael Kuperstein	13bf8a2684	[CGP] Split some critical edges coming out of indirect branches Splitting critical edges when one of the source edges is an indirectbr is hard in general (because it requires changing the memory the indirectbr reads). But if a block only has a single indirectbr predecessor (which is the common case), we can simulate splitting that edge by splitting the destination block, and retargeting the direct branches. This is motivated by the use of computed gotos in python 2.7: PyEval_EvalFrame() ends up using an indirect branch with ~100 successors, and passing a constant to each of those. Since MachineSink can't break indirect critical edges on demand (and doing this in MIR doesn't look feasible), this causes us to emit about ~100 defs of registers containing constants, which we in the predecessor block, where only one of those constants is used in each successor. So, at each computed goto, we needlessly spill about a 100 constants to stack. The end result is that a clang-compiled python interpreter can be about ~2.5x slower on a simple python reduction loop than a gcc-compiled interpreter. Differential Revision: https://reviews.llvm.org/D29916 llvm-svn: 296416	2017-02-28 00:11:34 +00:00
Zachary Turner	695ed56ba5	[PDB] Make streams carry their own endianness. Before the endianness was specified on each call to read or write of the StreamReader / StreamWriter, but in practice it's extremely rare for streams to have data encoded in multiple different endiannesses, so we should optimize for the 99% use case. This makes the code cleaner and more general, but otherwise has NFC. llvm-svn: 296415	2017-02-28 00:04:07 +00:00
Eugene Zelenko	fa912a7151	[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 296404	2017-02-27 22:45:06 +00:00
Arnold Schwaighofer	b2605f31ed	ISel: We need to notify FastIS of the IMPLICIT_DEF we created in createSwiftErrorEntriesInEntryBlock Otherwise, it will insert instructions before it. rdar://30536186 llvm-svn: 296395	2017-02-27 22:12:06 +00:00
Zachary Turner	120faca41b	[PDB] Partial resubmit of r296215, which improved PDB Stream Library. This was reverted because it was breaking some builds, and because of incorrect error code usage. Since the CL was large and contained many different things, I'm resubmitting it in pieces. This portion is NFC, and consists of: 1) Renaming classes to follow a consistent naming convention. 2) Fixing the const-ness of the interface methods. 3) Adding detailed doxygen comments. 4) Fixing a few instances of passing `const BinaryStream& X`. These are now passed as `BinaryStreamRef X`. llvm-svn: 296394	2017-02-27 22:11:43 +00:00
Matt Arsenault	4a7cc16e89	Revert "DAG: Check if extract_vector_elt is legal or custom" This reverts r295782. This could potentially result in some legalization loops and I avoided the need for this. llvm-svn: 296393	2017-02-27 21:59:07 +00:00
Simon Pilgrim	5c4efcdddf	[X86][SSE] Attempt to extract vector elements through target shuffles DAGCombiner already supports peeking thorough shuffles to improve vector element extraction, but legalization often leaves us in situations where we need to extract vector elements after shuffles have already been lowered. This patch adds support for VECTOR_EXTRACT_ELEMENT/PEXTRW/PEXTRB instructions to attempt to handle target shuffles as well. I've covered some basic scenarios including handling shuffle mask scaling and the implicit zero-extension of PEXTRW/PEXTRB, there is more that could be done here (that I've mentioned in TODOs) but I haven't found many cases where its worth it. Differential Revision: https://reviews.llvm.org/D30176 llvm-svn: 296381	2017-02-27 21:01:57 +00:00
Taewook Oh	a49eb8578a	[TailDuplicator] Maintain DebugLoc for branch instructions Summary: Existing implementation of duplicateSimpleBB function drops DebugLoc metadata of branch instructions during the transformation. This patch addresses this issue by making newly created branch instructions to keep the metadata of replaced branch instructions. Reviewers: qcolombet, craig.topper, aprantl, MatzeB, sanjoy, dblaikie Reviewed By: dblaikie Subscribers: dblaikie, llvm-commits Differential Revision: https://reviews.llvm.org/D30026 llvm-svn: 296371	2017-02-27 19:30:01 +00:00
Artur Pilipenko	f7196c8d9e	[DAGCombine] Fix for a load combine bug with non-zero offset patterns on BE targets This pattern is essentially a i16 load from p+1 address: %p1.i16 = bitcast i8* %p to i16* %p2.i8 = getelementptr i8, i8* %p, i64 2 %v1 = load i16, i16* %p1.i16 %v2.i8 = load i8, i8* %p2.i8 %v2 = zext i8 %v2.i8 to i16 %v1.shl = shl i16 %v1, 8 %res = or i16 %v1.shl, %v2 Current implementation would identify %v1 load as the first byte load and would mistakenly emit a i16 load from %p1.i16 address. This patch adds a check that the first byte is loaded from a non-zero offset of the first load address. This way this address can be used as the base address for the combined value. Otherwise just give up combining. llvm-svn: 296336	2017-02-27 13:04:23 +00:00
Artur Pilipenko	c43b20a43b	[DAGCombine] NFC. MatchLoadCombine extract MemoryByteOffset lambda helper This refactoring will simplify the upcoming change to fix the bug in folding patterns with non-zero offsets on BE targets. llvm-svn: 296332	2017-02-27 11:42:54 +00:00
Artur Pilipenko	f2c26e0bf2	[DAGCombine] NFC. MatchLoadCombine remember the first byte provider, not the load node This refactoring will simplify the upcoming change to fix a bug in folding patterns with non-zero offsets on BE targets. llvm-svn: 296331	2017-02-27 11:40:14 +00:00
Daniel Jasper	3ca4525612	Revert "[CGP] Split some critical edges coming out of indirect branches" This reverts commit r296149 as it leads to crashes when compiling for PPC. llvm-svn: 296295	2017-02-26 11:09:12 +00:00
Nirav Dave	73cd0194cf	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r296252 until 256-bit operations are more efficiently generated in X86. llvm-svn: 296279	2017-02-26 01:27:32 +00:00
Craig Topper	3b8aca2ecf	[ExecutionDepsFix] Don't make copies of LiveReg objects when collecting operands for soft instructions Summary: While collecting operands we make copies of the LiveReg objects which are stored in the LiveRegs array. If the instruction uses the same register multiple times we end up with multiple copies. Later we iterate through the collected list of LiveReg objects and merge DomainValues. In the process of doing this the merge function can change the contents of the original LiveReg object in the LiveRegs array, but not the copies that have been made. So when we get to the second usage of the register we end up seeing a stale copy of the LiveReg object. To fix this I've stopped copying and now just store a pointer to the original LiveReg object. Another option might be to avoid adding the same register to the Regs array twice, but this approach seemed simpler. The included test case exposes this bug due to an AVX-512 masked OR instruction using the same register for the passthru operand and one of the inputs to the OR operation. Fixes PR30284. Reviewers: RKSimon, stoklund, MatzeB, spatel, myatsina Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30242 llvm-svn: 296260	2017-02-25 18:12:25 +00:00
Artyom Skrobov	ac56719231	No need to copy the variable [NFC] llvm-svn: 296259	2017-02-25 17:18:09 +00:00
NAKAMURA Takumi	05a75e40da	Revert r296215, "[PDB] General improvements to Stream library." and followings. r296215, "[PDB] General improvements to Stream library." r296217, "Disable BinaryStreamTest.StreamReaderObject temporarily." r296220, "Re-enable BinaryStreamTest.StreamReaderObject." r296244, "[PDB] Disable some tests that are breaking bots." r296249, "Add static_cast to silence -Wc++11-narrowing." std::errc::no_buffer_space should be used for OS-oriented errors for socket transmission. (Seek discussions around llvm/xray.) I could substitute s/no_buffer_space/others/g, but I revert whole them ATM. Could we define and use LLVM errors there? llvm-svn: 296258	2017-02-25 17:04:23 +00:00
Nirav Dave	beabf456df	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Recommiting after fixup of 32-bit aliasing sign offset bug in DAGCombiner. * Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly constructs wider loads, but then promotes them to float operations which appear but requires more expensive constant generation). Some minor peephole optimizations to deal with improved SubDAG shapes (listed below) Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests. 5. Remove Chain dependencies of Memory operations on CopyfromReg nodes as these are captured by data dependence 6. Forward loads-store values through tokenfactors containing {CopyToReg,CopyFromReg} Values. 7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll) 8. Store merging for the ARM target is restricted to 32-bit as some in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting: CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle llvm-svn: 296252	2017-02-25 11:43:58 +00:00
Junmo Park	061bec802e	Minor code cleanup. NFC. llvm-svn: 296222	2017-02-25 01:50:45 +00:00
Zachary Turner	af299ea5d4	[PDB] General improvements to Stream library. This adds various new functionality and cleanup surrounding the use of the Stream library. Major changes include: * Renaming of all classes for more consistency / meaningfulness * Addition of some new methods for reading multiple values at once. * Full suite of unit tests for reader / writer functionality. * Full set of doxygen comments for all classes. * Streams now store their own endianness. * Fixed some bugs in a few of the classes that were discovered by the unit tests. llvm-svn: 296215	2017-02-25 00:44:30 +00:00
Zachary Turner	d2684b7969	[PDB] Rename Stream related source files. This is part of a larger effort to get the Stream code moved up to Support. I don't want to do it in one large patch, in part because the changes are so big that it will treat everything as file deletions and add, losing history in the process. Aside from that though, it's just a good idea in general to make small changes. So this change only changes the names of the Stream related source files, and applies necessary source fix ups. llvm-svn: 296211	2017-02-25 00:33:34 +00:00
Dan Gohman	82607f56bd	[WebAssembly] Add support for using a wasm global for the stack pointer. This replaces the __stack_pointer variable which was allocated in linear memory. llvm-svn: 296201	2017-02-24 23:46:05 +00:00
Dan Gohman	d934cb8806	[WebAssembly] Basic support for Wasm object file encoding. With the "wasm32-unknown-unknown-wasm" triple, this allows writing out simple wasm object files, and is another step in a larger series toward migrating from ELF to general wasm object support. Note that this code and the binary format itself is still experimental. llvm-svn: 296190	2017-02-24 23:18:00 +00:00
Stanislav Mekhanoshin	42259cf35e	Revert "Correct register pressure calculation in presence of subregs" This reverts commit r296009. It broke one out of tree target and also does not account for all partial lines added or removed when calculating PressureDiff. llvm-svn: 296182	2017-02-24 21:56:16 +00:00
Tim Northover	ef29e7284b	GlobalISel: check for CImm rather than Imm on G_CONSTANTs. All G_CONSTANTS created by the MachineIRBuilder have an operand of type CImm (i.e. a ConstantInt), so that's what the selector needs to look for. llvm-svn: 296176	2017-02-24 21:21:38 +00:00
Eli Friedman	c12a5a7595	[CodeGenPrepare] Make -addr-sink-using-gep work with address spaces. When we construct addressing modes, we use isNoopAddrSpaceCast to ignore addrspacecast instructions. Make sure we insert the correct addrspacecast when we reconstruct the addressing mode. Differential Revision: https://reviews.llvm.org/D30114 llvm-svn: 296167	2017-02-24 20:51:36 +00:00
Michael Kuperstein	46b131e3f8	[CGP] Split some critical edges coming out of indirect branches Splitting critical edges when one of the source edges is an indirectbr is hard in general (because it requires changing the memory the indirectbr reads). But if a block only has a single indirectbr predecessor (which is the common case), we can simulate splitting that edge by splitting the destination block, and retargeting the direct branches. This is motivated by the use of computed gotos in python 2.7: PyEval_EvalFrame() ends up using an indirect branch with ~100 successors, and passing a constant to each of those. Since MachineSink can't break indirect critical edges on demand (and doing this in MIR doesn't look feasible), this causes us to emit about ~100 defs of registers containing constants, which we in the predecessor block, where only one of those constants is used in each successor. So, at each computed goto, we needlessly spill about a 100 constants to stack. The end result is that a clang-compiled python interpreter can be about ~2.5x slower on a simple python reduction loop than a gcc-compiled interpreter. Differential Revision: https://reviews.llvm.org/D29916 llvm-svn: 296149	2017-02-24 18:41:32 +00:00
Sanjay Patel	832b1622d8	[DAGCombiner] add missing folds for scalar select of {-1,0,1} The motivation for filling out these select-of-constants cases goes back to D24480, where we discussed removing an IR fold from add(zext) --> select. And that goes back to: https://reviews.llvm.org/rL75531 https://reviews.llvm.org/rL159230 The idea is that we should always canonicalize patterns like this to a select-of-constants in IR because that's the smallest IR and the best for value tracking. Note that we currently do the opposite in some cases (like the cases in this patch). Ie, the proposed folds in this patch already exist in InstCombine today: https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/InstCombine/InstCombineSelect.cpp#L1151 As this patch shows, most targets generate better machine code for simple ext/add/not ops rather than a select of constants. So the follow-up steps to make this less of a patchwork of special-case folds and missing IR canonicalization: 1. Have DAGCombiner convert any select of constants into ext/add/not ops. 2 Have InstCombine canonicalize in the other direction (create more selects). Differential Revision: https://reviews.llvm.org/D30180 llvm-svn: 296137	2017-02-24 17:17:33 +00:00
Daniel Sanders	066ebbfd46	[globalisel] Decouple src pattern operands from dst pattern operands. Summary: This isn't testable for AArch64 by itself so this patch also adds support for constant immediates in the pattern and physical register uses in the result. The new IntOperandMatcher matches the constant in patterns such as '(set $rd:GPR32, (G_XOR $rs:GPR32, -1))'. It's always safe to fold immediates into an instruction so this is the first rule that will match across multiple BB's. The Renderer hierarchy is responsible for adding operands to the result instruction. Renderers can copy operands (CopyRenderer) or add physical registers (in particular %wzr and %xzr) to the result instruction in any order (OperandMatchers now import the operand names from SelectionDAG to allow renderers to access any operand). This allows us to emit the result instruction for: %1 = G_XOR %0, -1 --> %1 = ORNWrr %wzr, %0 %1 = G_XOR -1, %0 --> %1 = ORNWrr %wzr, %0 although the latter is untested since the matcher/importer has not been taught about commutativity yet. Added BuildMIAction which can build new instructions and mutate them where possible. W.r.t the mutation aspect, MatchActions are now told the name of an instruction they can recycle and BuildMIAction will emit mutation code when the renderers are appropriate. They are appropriate when all operands are rendered using CopyRenderer and the indices are the same as the matcher. This currently assumes that all operands have at least one matcher. Finally, this change also fixes a crash in AArch64InstructionSelector::select() caused by an immediate operand passing isImm() rather than isCImm(). This was uncovered by the other changes and was detected by existing tests. Depends on D29711 Reviewers: t.p.northover, ab, qcolombet, rovka, aditya_nandakumar, javed.absar Reviewed By: rovka Subscribers: aemerson, dberris, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D29712 llvm-svn: 296131	2017-02-24 15:43:30 +00:00
Justin Bogner	259a0cf3ee	Add missing initialization for MachineOptimizationRemarkEmitter This was missed in r293110. llvm-svn: 296096	2017-02-24 07:42:35 +00:00
Craig Topper	c446b1ffae	[ExecutionDepsFix] Use range-based for loop. NFC llvm-svn: 296093	2017-02-24 06:38:24 +00:00
Michael Kuperstein	581c9f4b20	Revert r269060 to pacify bots. llvm-svn: 296064	2017-02-24 01:22:19 +00:00
Michael Kuperstein	12e79d5002	[CGP] Split some critical edges coming out of indirect branches Splitting critical edges when one of the source edges is an indirectbr is hard in general (because it requires changing the memory the indirectbr reads). But if a block only has a single indirectbr predecessor (which is the common case), we can simulate splitting that edge by splitting the destination block, and retargeting the direct branches. This is motivated by the use of computed gotos in python 2.7: PyEval_EvalFrame() ends up using an indirect branch with ~100 successors, and passing a constant to each of those. Since MachineSink can't break indirect critical edges on demand (and doing this in MIR doesn't look feasible), this causes us to emit about ~100 defs of registers containing constants, which we in the predecessor block, where only one of those constants is used in each successor. So, at each computed goto, we needlessly spill about a 100 constants to stack. The end result is that a clang-compiled python interpreter can be about ~2.5x slower on a simple python reduction loop than a gcc-compiled interpreter. Differential Revision: https://reviews.llvm.org/D29916 llvm-svn: 296060	2017-02-24 00:56:21 +00:00
Ahmed Bougacha	7daaf88c70	[GlobalISel] Use the same name for all remarks. While there, switch to the explicit ctor. llvm-svn: 296059	2017-02-24 00:34:47 +00:00
Ahmed Bougacha	7c88a4e12b	[GlobalISel] Use the DISubprogram for translation failure remarks. Justin added support for DISubprogram locs in r295531 and r296052. Use that instead of no-loc for constants and arguments. llvm-svn: 296058	2017-02-24 00:34:44 +00:00
Ahmed Bougacha	8f9e99bcb6	[GlobalISel] Remove now-unnecessary variable. NFC. Since r296047, we're able to return early on failures. Don't track whether we succeeded. llvm-svn: 296057	2017-02-24 00:34:41 +00:00
Justin Bogner	369ba753aa	OptDiag: Summarize the instruction count in asm-printer Add an optimization remark to asm-printer that summarizes the number of instructions emitted per function. llvm-svn: 296053	2017-02-24 00:19:22 +00:00
Ahmed Bougacha	4f8dd0202d	[GlobalISel] Don't translate other blocks when one failed. We were stopping the translation of the parent block when the translation of an instruction failed, but we were still trying to translate the other blocks of the parent function. Don't do that. llvm-svn: 296047	2017-02-23 23:57:36 +00:00
Ahmed Bougacha	eceabddcfd	[GlobalISel] Finalize translated function on scope exit. NFC. This is the compromise between having a per-function IRTranslator and manually managing the per-function state. llvm-svn: 296046	2017-02-23 23:57:28 +00:00
Kyle Butt	ebe6cc4dad	CodeGen: MachineBlockPlacement: Rename member to more general name. NFC. Rename ComputedTrellisEdges to ComputedEdges to allow for other methods of pre-computing edges. Differential Revision: https://reviews.llvm.org/D30308 llvm-svn: 296018	2017-02-23 21:22:24 +00:00
Ahmed Bougacha	ae9dadecf3	[GlobalISel] Emit opt remarks on isel fallbacks. Having more fine-grained information on the specific construct that caused us to fallback is valuable for large-scale data collection. We still have the fallback warning, that's also used for FastISel. We still need to remove the fallback warning, and teach FastISel to also emit remarks (it currently has a combination of the warning, stats, and debug prints: the remarks could unify all three). The abort-on-fallback path could also be better handled using remarks: one could imagine a "-Rpass-error", analoguous to "-Werror", which would promote missed/failed remarks to errors. It's not clear whether that would be useful for other remarks though, so we're not there yet. llvm-svn: 296013	2017-02-23 21:05:42 +00:00
Ahmed Bougacha	272739751d	[CodeGen] Teach opt remarks how to print MI instructions. This will be used with GISel opt remarks. llvm-svn: 296012	2017-02-23 21:05:33 +00:00
Ahmed Bougacha	97119d48db	[CodeGen] Print MI without a newline when skipping debugloc. NFC. This matches the behavior for skip-operands. While there, document it. This is a follow-up to r296007. llvm-svn: 296011	2017-02-23 21:05:29 +00:00
Stanislav Mekhanoshin	ce3ddd2de4	Correct register pressure calculation in presence of subregs If a subreg is used in an instruction it counts as a whole superreg for the purpose of register pressure calculation. This patch corrects improper register pressure calculation by examining operand's lane mask. Differential Revision: https://reviews.llvm.org/D29835 llvm-svn: 296009	2017-02-23 20:19:44 +00:00
Ahmed Bougacha	4319224628	[CodeGen] Add a way to SkipDebugLoc in MachineInstr::print(). NFC. llvm-svn: 296007	2017-02-23 19:17:31 +00:00
Ahmed Bougacha	1b3339447d	[GlobalISel] Simplify Select type cleanup using a ScopeExit. NFC. This lets us use more natural early-returns when selection fails. llvm-svn: 296006	2017-02-23 19:17:24 +00:00
Sanjay Patel	4a4fbe162f	[DAG] add convenience function to get -1 constant; NFCI llvm-svn: 296004	2017-02-23 19:02:33 +00:00
Adam Nemet	b516cf3f3f	[LazyMachineBFI] Reimplement with getAnalysisIfAvailable Since LoopInfo is not available in machine passes as universally as in IR passes, using the same approach for OptimizationRemarkEmitter as we did for IR will run LoopInfo and DominatorTree unnecessarily. (LoopInfo is not used lazily by ORE.) To fix this, I am modifying the approach I took in D29836. LazyMachineBFI now uses its client passes including MachineBFI itself that are available or otherwise compute them on the fly. So for example GreedyRegAlloc, since it's already using MBFI, will reuse that instance. On the other hand, AsmPrinter in Justin's patch will generate DT, LI and finally BFI on the fly. (I am of course wondering now if the simplicity of this approach is even preferable in IR. I will do some experiments.) Testing is provided by an updated version of D29837 which requires Justin's patch to bring ORE to the AsmPrinter. Differential Revision: https://reviews.llvm.org/D30128 llvm-svn: 295996	2017-02-23 17:30:01 +00:00
Simon Pilgrim	858d8e672d	Fix signed/unsigned comparison warning on MSVC llvm-svn: 295962	2017-02-23 12:00:34 +00:00
Eugene Zelenko	db56e5a89a	[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 295893	2017-02-22 22:32:51 +00:00
Bill Seurer	8e48f416ad	[DAGCombiner] revert r295336 r295336 causes a bootstrapped clang to fail for many compilations on powerpc BE. See http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/2315 for example. Reverting as per the developer's request. llvm-svn: 295849	2017-02-22 16:27:33 +00:00
Igor Breger	f7359d893a	[X86][GlobalISel] Initial implementation , select G_ADD gpr, gpr Summary: Initial implementation for X86InstructionSelector. Handle selection COPY and G_ADD/G_SUB gpr, gpr . Reviewers: qcolombet, rovka, zvi, ab Reviewed By: rovka Subscribers: mgorny, dberris, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D29816 llvm-svn: 295824	2017-02-22 12:25:09 +00:00
Dan Gohman	18eafb6c68	[WebAssembly] Add skeleton MC support for the Wasm container format This just adds the basic skeleton for supporting a new object file format. All of the actual encoding will be implemented in followup patches. Differential Revision: https://reviews.llvm.org/D26722 llvm-svn: 295803	2017-02-22 01:23:18 +00:00
Matt Arsenault	f0a4823b91	DAG: Check if extract_vector_elt is legal or custom Avoids test regressions in future AMDGPU commits when more vector types are custom lowered. llvm-svn: 295782	2017-02-21 22:47:27 +00:00
Eugene Zelenko	49e2fc4f5f	[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 295773	2017-02-21 22:07:52 +00:00
Geoff Berry	5d534b6a11	[CodeGenPrepare] Sink and duplicate more 'and' instructions. Summary: Rework the code that was sinking/duplicating (icmp and, 0) sequences into blocks where they were being used by conditional branches to form more tbz instructions on AArch64. The new code is more general in that it just looks for 'and's that have all icmp 0's as users, with a target hook used to select which subset of 'and' instructions to consider. This change also enables 'and' sinking for X86, where it is more widely beneficial than on AArch64. The 'and' sinking/duplicating code is moved into the optimizeInst phase of CodeGenPrepare, where it can take advantage of the fact the OptimizeCmpExpression has already sunk/duplicated any icmps into the blocks where they are used. One minor complication from this change is that optimizeLoadExt needed to be updated to always mark 'and's it has determined should be in the same block as their feeding load in the InsertedInsts set to avoid an infinite loop of hoisting and sinking the same 'and'. This change fixes a regression on X86 in the tsan runtime caused by moving GVNHoist to a later place in the optimization pipeline (see PR31382). Reviewers: t.p.northover, qcolombet, MatzeB Subscribers: aemerson, mcrosier, sebpop, llvm-commits Differential Revision: https://reviews.llvm.org/D28813 llvm-svn: 295746	2017-02-21 18:53:14 +00:00
Matthias Braun	9ab403942b	ScheduleDAG: Cleanup; NFC - Fix doxygen comments (do not repeat documented name, remove definition comment if there is already one at the declaration, add \p, ...) - Add some const modifiers - Use range based for llvm-svn: 295688	2017-02-21 01:27:33 +00:00
Taewook Oh	4cf5c1087c	[BranchFolding] Update debug location along with the update of branch instruction. Summary: Currently, BranchFolder drops DebugLoc for branch instructions in some places. For example, for the test code attached, the branch instruction of 'entry' block has a DILocation of ``` !12 = !DILocation(line: 6, column: 3, scope: !11) ``` , but this information is gone when then block is lowered because BranchFolder misses it. This patch is a fix for this issue. Reviewers: qcolombet, aprantl, craig.topper, MatzeB Reviewed By: aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29902 llvm-svn: 295684	2017-02-21 00:12:38 +00:00
Simon Pilgrim	c0dc9a4913	Strip trailing whitespace. llvm-svn: 295653	2017-02-20 11:56:43 +00:00
Simon Pilgrim	50b958c07a	[SelectionDAG] Add scalarization support for ISD::*_EXTEND_VECTOR_INREG opcodes. Thanks to Mikael Holmén for the initial test case llvm-svn: 295652	2017-02-20 11:55:58 +00:00
Artyom Skrobov	be31754094	Remove redundant call to GluedNodes.back() [NFC] llvm-svn: 295607	2017-02-19 16:56:18 +00:00
Matthias Braun	431305927f	MachineRegionInfo: Fix pass initialization - Adapt MachineBasicBlock::getName() to have the same behavior as the IR BasicBlock (Value::getName()). - Add it to lib/CodeGen/CodeGen.cpp::initializeCodeGen so that it is linked in the CodeGen library. - MachineRegionInfoPass's name conflicts with RegionInfoPass's name ("region"). - MachineRegionInfo should depend on MachineDominatorTree, MachinePostDominatorTree and MachineDominanceFrontier instead of their respective IR versions. - Since there were no tests for this, add a X86 MIR test. Patch by Francis Visoiu Mistrih<fvisoiumistrih@apple.com> llvm-svn: 295518	2017-02-18 00:41:16 +00:00
Eugene Zelenko	be37db1882	[CodeGen] Revert changes in LowLevelType to pre-r295499 to fix broken buildbots. llvm-svn: 295505	2017-02-17 22:23:34 +00:00
Eugene Zelenko	5db84df728	[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 295499	2017-02-17 21:43:25 +00:00
Adrian Prantl	67c2442210	Debug Info: Sort frame index expressions before emitting them. This fixes PR31381, which caused an assertion and/or invalid debug info. This affects debug variables that have multiple fragments in the MMI side (i.e.: in the stack frame) table. rdar://problem/30571676 llvm-svn: 295486	2017-02-17 19:42:32 +00:00
Tim Northover	88634996c7	GlobalISel: verify that generic loads & stores have a mem operand. The mem operand is used by GlobalISel to convey atomic constraints so dropping it is invalid. llvm-svn: 295476	2017-02-17 18:50:15 +00:00
Sanjay Patel	7f2e58972c	[DAGCombiner] split i1 select-of-constants from non-i1 case; NFCI I can't find any tests of the non-i1 code path, so it may be unnecessary at this point. llvm-svn: 295463	2017-02-17 17:13:27 +00:00
Simon Pilgrim	0429c0cf8b	Fix signed/unsigned comparison warning. llvm-svn: 295453	2017-02-17 16:01:16 +00:00
Simon Pilgrim	511d788a95	[DAGCombine] Recognise any_extend_vector_inreg and truncation style shuffle masks During legalization we are often creating shuffles (via a build_vector scalarization stage) that are "any_extend_vector_inreg" style masks, and also other masks that are the equivalent of "truncate_vector_inreg" (if we had such a thing). This patch is an attempt to match these cases to help undo the effects of just leaving shuffle lowering to handle it - which typically means we lose track of the undefined elements of the shuffles resulting in an unnecessary extension+truncation stage for widened illegal types. The 2011-10-21-widen-cmp.ll regression will be fixed by making SIGN_EXTEND_VECTOR_IN_REG legal in SSE instead of lowering them to X86ISD::VSEXT (PR31712). Differential Revision: https://reviews.llvm.org/D29454 llvm-svn: 295451	2017-02-17 15:14:48 +00:00
Sanjay Patel	5573042035	[DAGCombiner] improve readability; NFCI llvm-svn: 295447	2017-02-17 14:21:59 +00:00
Teresa Johnson	76b5b7493c	Handle link of NoDebug CU with a CU that has debug emission enabled Summary: This is an issue both with regular and Thin LTO. When we link together a DICompileUnit that is marked NoDebug (e.g when compiling with -g0 but applying an AutoFDO profile, which requires location tracking in the compiler) and a DICompileUnit with debug emission enabled, we can have failures during dwarf debug generation. Specifically, when we have inlined from the NoDebug compile unit into the debug compile unit, we can fail during construction of the abstract and inlined scope DIEs. This is because the SPMap does not include NoDebug CUs (they are skipped in the debug_compile_units_iterator). This patch fixes the failures by skipping locations from NoDebug CUs when extracting lexical scopes. Reviewers: dblaikie, aprantl Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D29765 llvm-svn: 295384	2017-02-17 00:21:19 +00:00
Benjamin Kramer	3f6260cab4	[MachinePipeliner] Remove redundant destructor. NFC. llvm-svn: 295372	2017-02-16 20:26:51 +00:00
David Blaikie	b2fbb4b276	Refactor DebugHandlerBase a bit to common non-debug-having-function filtering llvm-svn: 295354	2017-02-16 18:48:33 +00:00
Artur Pilipenko	85d758299e	[DAGCombiner] Support {a\|s}ext, {a\|z\|s}ext load nodes in load combine Resubmit -r295314 with PowerPC and AMDGPU tests updated. Support {a\|s}ext, {a\|z\|s}ext load nodes as a part of load combine patters. Reviewed By: filcab Differential Revision: https://reviews.llvm.org/D29591 llvm-svn: 295336	2017-02-16 17:07:27 +00:00
Artur Pilipenko	a1b384c4ce	Rever -r295314 "[DAGCombiner] Support {a\|s}ext, {a\|z\|s}ext load nodes in load combine" This change causes some of AMDGPU and PowerPC tests to fail. llvm-svn: 295316	2017-02-16 13:04:46 +00:00
Artur Pilipenko	daaa0c0f7d	[DAGCombiner] Support {a\|s}ext, {a\|z\|s}ext load nodes in load combine Support {a\|s}ext, {a\|z\|s}ext load nodes as a part of load combine patters. Reviewed By: filcab Differential Revision: https://reviews.llvm.org/D29591 llvm-svn: 295314	2017-02-16 12:53:26 +00:00
Diana Picus	ca6a890d7f	[ARM] GlobalISel: Lower double precision FP args For the hard float calling convention, we just use the D registers. For the soft-fp calling convention, we use the R registers and move values to/from the D registers by means of G_SEQUENCE/G_EXTRACT. While doing so, we make sure to honor the endianness of the target, since the CCAssignFn doesn't do that for us. For pure soft float targets, we still bail out because we don't support the libcalls yet. llvm-svn: 295295	2017-02-16 07:53:07 +00:00
Hans Wennborg	a468601e0e	[X86] Re-enable conditional tail calls and fix PR31257. This reverts r294348, which removed support for conditional tail calls due to the PR above. It fixes the PR by marking live registers as implicitly used and defined by the now predicated tailcall. This is similar to how IfConversion predicates instructions. Differential Revision: https://reviews.llvm.org/D29856 llvm-svn: 295262	2017-02-16 00:04:05 +00:00
Tim Northover	9136617a3f	GlobalISel: legalize va_arg on AArch64. Uses a Custom implementation because the slot sizes being a multiple of the pointer size isn't really universal, even for the architectures that do have a simple "void *" va_list. llvm-svn: 295255	2017-02-15 23:22:50 +00:00
Tim Northover	4a652227dd	GlobalISel: support translating va_arg Since (say) i128 and [16 x i8] map to the same type in generic MIR, we also need to attach the required alignment info. llvm-svn: 295254	2017-02-15 23:22:33 +00:00
Matt Arsenault	900b21c350	Fix typos llvm-svn: 295246	2017-02-15 22:19:06 +00:00
Matt Arsenault	5de8dc9cf5	DAG: Do not scalarize fsub if fneg is legal Tests will be included with future commit. llvm-svn: 295242	2017-02-15 22:02:42 +00:00
Kyle Butt	7fbec9bdf1	Codegen: Make chains from trellis-shaped CFGs Lay out trellis-shaped CFGs optimally. A trellis of the shape below: A B \|\ /\| \| \ / \| \| X \| \| / \ \| \|/ \\| C D would be laid out A; B->C ; D by the current layout algorithm. Now we identify trellises and lay them out either A->C; B->D or A->D; B->C. This scales with an increasing number of predecessors. A trellis is a a group of 2 or more predecessor blocks that all have the same successors. because of this we can tail duplicate to extend existing trellises. As an example consider the following CFG: B D F H / \ / \ / \ / \ A---C---E---G---Ret Where A,C,E,G are all small (Currently 2 instructions). The CFG preserving layout is then A,B,C,D,E,F,G,H,Ret. The current code will copy C into B, E into D and G into F and yield the layout A,C,B(C),E,D(E),F(G),G,H,ret define void @straight_test(i32 %tag) { entry: br label %test1 test1: ; A %tagbit1 = and i32 %tag, 1 %tagbit1eq0 = icmp eq i32 %tagbit1, 0 br i1 %tagbit1eq0, label %test2, label %optional1 optional1: ; B call void @a() br label %test2 test2: ; C %tagbit2 = and i32 %tag, 2 %tagbit2eq0 = icmp eq i32 %tagbit2, 0 br i1 %tagbit2eq0, label %test3, label %optional2 optional2: ; D call void @b() br label %test3 test3: ; E %tagbit3 = and i32 %tag, 4 %tagbit3eq0 = icmp eq i32 %tagbit3, 0 br i1 %tagbit3eq0, label %test4, label %optional3 optional3: ; F call void @c() br label %test4 test4: ; G %tagbit4 = and i32 %tag, 8 %tagbit4eq0 = icmp eq i32 %tagbit4, 0 br i1 %tagbit4eq0, label %exit, label %optional4 optional4: ; H call void @d() br label %exit exit: ret void } here is the layout after D27742: straight_test: # @straight_test ; ... Prologue elided ; BB#0: # %entry ; A (merged with test1) ; ... More prologue elided mr 30, 3 andi. 3, 30, 1 bc 12, 1, .LBB0_2 ; BB#1: # %test2 ; C rlwinm. 3, 30, 0, 30, 30 beq 0, .LBB0_3 b .LBB0_4 .LBB0_2: # %optional1 ; B (copy of C) bl a nop rlwinm. 3, 30, 0, 30, 30 bne 0, .LBB0_4 .LBB0_3: # %test3 ; E rlwinm. 3, 30, 0, 29, 29 beq 0, .LBB0_5 b .LBB0_6 .LBB0_4: # %optional2 ; D (copy of E) bl b nop rlwinm. 3, 30, 0, 29, 29 bne 0, .LBB0_6 .LBB0_5: # %test4 ; G rlwinm. 3, 30, 0, 28, 28 beq 0, .LBB0_8 b .LBB0_7 .LBB0_6: # %optional3 ; F (copy of G) bl c nop rlwinm. 3, 30, 0, 28, 28 beq 0, .LBB0_8 .LBB0_7: # %optional4 ; H bl d nop .LBB0_8: # %exit ; Ret ld 30, 96(1) # 8-byte Folded Reload addi 1, 1, 112 ld 0, 16(1) mtlr 0 blr The tail-duplication has produced some benefit, but it has also produced a trellis which is not laid out optimally. With this patch, we improve the layouts of such trellises, and decrease the cost calculation for tail-duplication accordingly. This patch produces the layout A,C,E,G,B,D,F,H,Ret. This layout does have back edges, which is a negative, but it has a bigger compensating positive, which is that it handles the case where there are long strings of skipped blocks much better than the original layout. Both layouts handle runs of executed blocks equally well. Branch prediction also improves if there is any correlation between subsequent optional blocks. Here is the resulting concrete layout: straight_test: # @straight_test ; BB#0: # %entry ; A (merged with test1) mr 30, 3 andi. 3, 30, 1 bc 12, 1, .LBB0_4 ; BB#1: # %test2 ; C rlwinm. 3, 30, 0, 30, 30 bne 0, .LBB0_5 .LBB0_2: # %test3 ; E rlwinm. 3, 30, 0, 29, 29 bne 0, .LBB0_6 .LBB0_3: # %test4 ; G rlwinm. 3, 30, 0, 28, 28 bne 0, .LBB0_7 b .LBB0_8 .LBB0_4: # %optional1 ; B (Copy of C) bl a nop rlwinm. 3, 30, 0, 30, 30 beq 0, .LBB0_2 .LBB0_5: # %optional2 ; D (Copy of E) bl b nop rlwinm. 3, 30, 0, 29, 29 beq 0, .LBB0_3 .LBB0_6: # %optional3 ; F (Copy of G) bl c nop rlwinm. 3, 30, 0, 28, 28 beq 0, .LBB0_8 .LBB0_7: # %optional4 ; H bl d nop .LBB0_8: # %exit Differential Revision: https://reviews.llvm.org/D28522 llvm-svn: 295223	2017-02-15 19:49:14 +00:00
Xinliang David Li	538d666814	include function name in dot filename Differential Revision: http://reviews.llvm.org/D29975 llvm-svn: 295220	2017-02-15 19:21:04 +00:00
Michael Kuperstein	ba80db39d7	[DAG] Don't try to create an INSERT_SUBVECTOR with an illegal source We currently can't legalize those, but we should really not be creating them in the first place, since legalization would probably look similar to the way we legalize CONCAT_VECTORS - basically replace the INSERT with a BUILD. This fixes PR311956. Differential Revision: https://reviews.llvm.org/D29961 llvm-svn: 295213	2017-02-15 18:37:26 +00:00
Sagar Thakur	ec65792910	[LLVM][XRAY][MIPS] Support xray on mips/mipsel/mips64/mips64el Summary: Adds support for xray instrumentation on mips for both 32-bit and 64-bit. Reviewed by sdardis, dberris Differential: D27697 llvm-svn: 295164	2017-02-15 10:48:11 +00:00
Craig Topper	96ec7a23e3	[SelectionDAGBuilder] Simplify creation of shufflevector DAG nodes where inputs are larger than the mask Summary: The current code loops over all elements to calculate a used range. Then a second short loop looks at the ranges and determines if they can be used in a extract and creates a properly aligned start index for the extract. This range finding is unnecessary, we can just calculate a properly aligned start index for an extract for each input during the first loop. If we don't find the same start index for each indice we can't use an extract. Reviewers: zvi, RKSimon Reviewed By: zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29926 llvm-svn: 295152	2017-02-15 05:57:16 +00:00
Reid Kleckner	a622fc9bdf	[BranchFolding] Tail common all identical unreachable blocks Summary: Blocks ending in unreachable are typically cold because they end the program or throw an exception, so merging them with other identical blocks is usually profitable because it reduces the size of cold code. MachineBlockPlacement generally does not arrange to fall through to such blocks, so commoning these blocks will not introduce additional unconditional branches. Reviewers: hans, iteratee, haicheng Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29153 llvm-svn: 295105	2017-02-14 21:02:24 +00:00
Tim Northover	c2f8956313	GlobalISel: introduce G_PTR_MASK to simplify alloca handling. This instruction clears the low bits of a pointer without requiring (possibly dodgy if pointers aren't ints) conversions to and from an integer. Since (as far as I'm aware) all masks are statically known, the instruction takes an immediate operand rather than a register to specify the mask. llvm-svn: 295103	2017-02-14 20:56:18 +00:00
Eric Christopher	14303d1815	Reformat slightly. llvm-svn: 295096	2017-02-14 19:43:50 +00:00
Wolfgang Pieb	399dcfaa2a	Reapply r294532, reverted in r294787. Store instructions can have more than one memory operand as a result of optimizations that fold different stores into one. When we identify spill instructions to generate DBG_VALUE instructions to record the spilling of a variable, we disregard stores with multiple memory operands for now. We may miss some relevant spills but the handling is a bit more complex, so we'll do it in a different patch. This fixes PR31935. llvm-svn: 295093	2017-02-14 19:08:45 +00:00
Aditya Nandakumar	bb0483bc8e	[Tablegen] Instrumenting table gen DAGGenISelDAG To help assist in debugging ISEL or to prioritize GlobalISel backend work, this patch adds two more tables to <Target>GenISelDAGISel.inc - one which contains the patterns that are used during selection and the other containing include source location of the patterns Enabled through CMake varialbe LLVM_ENABLE_DAGISEL_COV llvm-svn: 295081	2017-02-14 18:32:41 +00:00
Adam Nemet	bbb141c734	Add new pass LazyMachineBlockFrequencyInfo And use it in MachineOptimizationRemarkEmitter. A test will follow on top of Justin's changes to enable MachineORE in AsmPrinter. The approach is similar to the IR-level pass. It's a bit simpler because BPI is immutable at the Machine level so we don't need to make that lazy. Because of this, a new function mapping is introduced (BPIPassTrait::getBPI). This function extracts BPI from the pass. In case of the lazy pass, this is when the calculation of the BFI occurs. For Machine-level, this is the identity function. Differential Revision: https://reviews.llvm.org/D29836 llvm-svn: 295072	2017-02-14 17:21:09 +00:00
Artyom Skrobov	dc66a82dc7	Removing a redundant assignment llvm-svn: 295055	2017-02-14 14:44:01 +00:00
Eugene Zelenko	d96089b248	[MC] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). Same changes in files affected by reduced MC headers dependencies. llvm-svn: 295009	2017-02-14 00:33:36 +00:00
Tim Northover	48dfa1a6ed	GlobalISel: represent atomic loads & stores via the MachineMemOperand. Also make sure the AArch64 backend doesn't try to convert them into normal loads and stores. llvm-svn: 294993	2017-02-13 22:14:16 +00:00
Tim Northover	b73e309071	MIR: parse & print the atomic parts of a MachineMemOperand. We're going to need them very soon for GlobalISel. llvm-svn: 294992	2017-02-13 22:14:08 +00:00
Taewook Oh	4d35f9e10e	Address post-commit comments for https://reviews.llvm.org/D29596 . NFCI. llvm-svn: 294985	2017-02-13 21:12:27 +00:00
Arnold Schwaighofer	8f3df731dc	swiftcc: Don't emit tail calls from callers with swifterror parameters Backends don't support this yet. They would have to move to the swifterror register before the tail call to make sure it is live-in to the call. rdar://30495920 llvm-svn: 294982	2017-02-13 19:58:28 +00:00
Taewook Oh	06a2128cfa	Make MachineBasicBlock::updateTerminator to update DebugLoc as well Summary: Currently MachineBasicBlock::updateTerminator simply drops DebugLoc for newly created branch instructions, which may cause incorrect stepping and/or imprecise sample profile data. Below is an example: ``` 1 extern int bar(int x); 2 3 int foo(int begin, int end) { 4 int i; 5 int ret = 0; 6 for ( 7 i = begin ; 8 i != end ; 9 i++) 10 { 11 ret += bar(i); 12 } 13 return ret; 14 } ``` Below is a bitcode of 'foo' at the end of LLVM-IR level optimizations with -O3: ``` define i32 @foo(i32* readonly %begin, i32* readnone %end) !dbg !4 { entry: %cmp6 = icmp eq i32* %begin, %end, !dbg !9 br i1 %cmp6, label %for.end, label %for.body.preheader, !dbg !12 for.body.preheader: ; preds = %entry br label %for.body, !dbg !13 for.body: ; preds = %for.body.preheader, %for.body %ret.08 = phi i32 [ %add, %for.body ], [ 0, %for.body.preheader ] %i.07 = phi i32* [ %incdec.ptr, %for.body ], [ %begin, %for.body.preheader ] %0 = load i32, i32* %i.07, align 4, !dbg !13, !tbaa !15 %call = tail call i32 @bar(i32 %0), !dbg !19 %add = add nsw i32 %call, %ret.08, !dbg !20 %incdec.ptr = getelementptr inbounds i32, i32* %i.07, i64 1, !dbg !21 %cmp = icmp eq i32* %incdec.ptr, %end, !dbg !9 br i1 %cmp, label %for.end.loopexit, label %for.body, !dbg !12, !llvm.loop !22 for.end.loopexit: ; preds = %for.body br label %for.end, !dbg !24 for.end: ; preds = %for.end.loopexit, %entry %ret.0.lcssa = phi i32 [ 0, %entry ], [ %add, %for.end.loopexit ] ret i32 %ret.0.lcssa, !dbg !24 } ``` where ``` !12 = !DILocation(line: 6, column: 3, scope: !11) ``` . As you can see, the terminator of 'entry' block, which is a loop control branch, has a DebugLoc of line 6, column 3. Howerver, after the execution of 'MachineBlock::updateTerminator' function, which is triggered by MachineSinking pass, the DebugLoc info is dropped as below (see there's no debug-location for JNE_1): ``` bb.0.entry: successors: %bb.4(0x30000000), %bb.1.for.body.preheader(0x50000000) liveins: %rdi, %rsi %6 = COPY %rsi %5 = COPY %rdi %8 = SUB64rr %5, %6, implicit-def %eflags, debug-location !9 JNE_1 %bb.1.for.body.preheader, implicit %eflags ``` This patch addresses this issue and make newly created branch instructions to keep debug-location info. Reviewers: aprantl, MatzeB, craig.topper, qcolombet Reviewed By: qcolombet Subscribers: qcolombet, llvm-commits Differential Revision: https://reviews.llvm.org/D29596 llvm-svn: 294976	2017-02-13 18:15:31 +00:00
Quentin Colombet	fbae5fcb96	[FastISel] Add a diagnostic to warm on fallback. This is consistent with what we do for GlobalISel. That way, it is easy to see whether or not FastISel is able to fully select a function. At some point we may want to switch that to an optimization remark. llvm-svn: 294970	2017-02-13 17:38:59 +00:00
Sanne Wouda	91eadad3bd	[Assembler] Improve diagnostics for inline assembly. Summary: Keep a vector of LocInfos around; one for each call to EmitInlineAsm. Since each call to EmitInlineAsm creates a new buffer in the inline asm SourceMgr, we can use the buffer number to map to the right LocInfo. Reviewers: rengolin, grosbach, rnk, echristo Reviewed By: rnk Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D29769 llvm-svn: 294947	2017-02-13 13:58:00 +00:00
Andrew V. Tischenko	8da96914f9	Compile time decreasing in the case we're dealing with Machine Combiner. Before this patch compile time was about 21s (see below). After this patch we have less than 2s (see bellow). Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz DAGCombiner - trunk time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math real 0m1.685s DAGCombiner + Speed patch time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math real 0m1.655s MachineCombiner w/o Speed patch time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math real 0m21.614s MachineCombiner + Speed patch time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math real 0m1.593s The test spill_fdiv.ll is attached to D29627 D29627 should be closed. llvm-svn: 294936	2017-02-13 09:43:37 +00:00
Craig Topper	3668bde371	[DAGCombiner] Teach DAG combine that inserting an extract_subvector result into the same location of a an undef vector can just use the original input to the extract. llvm-svn: 294932	2017-02-13 04:53:33 +00:00
Craig Topper	aa46204ed9	[DAGCombiner] Remove the half vector width check for the combine of EXTRACT_SUBVECTOR from an INSERT_SUBVECTOR. This gives more parallelism opportunities for AVX-512 when dealing with 128-bit extracts from 512-bit vectors. llvm-svn: 294930	2017-02-12 23:49:49 +00:00
Sanjay Patel	0557a44287	[TargetLowering] fix SETCC SETLT folding with FP types The bug was introduced with: https://reviews.llvm.org/rL294863 ...and manifests as a selection failure in x86, but that's actually another bug. This fix prevents wrong codegen with -0.0, but in the more common case when we have NSZ and NNAN (-ffast-math), we should still be able to fold this setcc/compare. llvm-svn: 294924	2017-02-12 23:07:52 +00:00
Craig Topper	b633adedc7	[DAGCombiner] Make the combine of INSERT_SUBVECTOR into a CONCAT_VECTOR more generic to support larger concats. llvm-svn: 294875	2017-02-11 22:57:09 +00:00
Sanjay Patel	63499b61c9	[TargetLowering] check for sign-bit comparisons in SimplifyDemandedBits I don't know if anything other than x86 vectors is affected by this change, but this may allow us to remove target-specific intrinsics for blendv* (vector selects). The simplification arises from the fact that blendv* instructions only use the sign-bit when deciding which vector element to choose for the destination vector. The mechanism to fold VSELECT into SHRUNKBLEND nodes already exists in x86 lowering; this demanded bits change just enables the transform to fire more often. The original motivation starts with a bug for DSE of masked stores that seems completely unrelated, but I've explained the likely steps in this series here: https://llvm.org/bugs/show_bug.cgi?id=11210 Differential Revision: https://reviews.llvm.org/D29687 llvm-svn: 294863	2017-02-11 18:01:55 +00:00
Nico Weber	ee0b0ec935	Revert r294532, it caused PR31935 llvm-svn: 294787	2017-02-10 21:57:30 +00:00
Tim Shen	918ed871df	[XRay] Implement powerpc64le xray. Summary: powerpc64 big-endian is not supported, but I believe that most logic can be shared, except for xray_powerpc64.cc. Also add a function InvalidateInstructionCache to xray_util.h, which is copied from llvm/Support/Memory.cpp. I'm not sure if I need to add a unittest, and I don't know how. Reviewers: dberris, echristo, iteratee, kbarton, hfinkel Subscribers: mehdi_amini, nemanjai, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D29742 llvm-svn: 294781	2017-02-10 21:03:24 +00:00
Tim Northover	0e01170c79	GlobalISel: drop lifetime intrinsics during translation. We don't use them yet and they just cause problems. llvm-svn: 294770	2017-02-10 19:10:38 +00:00
Simon Pilgrim	bfb1747806	[DAGCombine] Allow vector constant folding of any value type before type legalization The patch comes in 2 parts: 1 - it makes use of the SelectionDAG::NewNodesMustHaveLegalTypes flag to tell when it can safely constant fold illegal types. 2 - it correctly resets SelectionDAG::NewNodesMustHaveLegalTypes at the start of each call to SelectionDAGISel::CodeGenAndEmitDAG so all the pre-legalization stages can make use of it - not just the first basic block that gets handled. Fix for PR30760 Differential Revision: https://reviews.llvm.org/D29568 llvm-svn: 294749	2017-02-10 14:37:25 +00:00
Craig Topper	a9f1121896	[SelectionDAG] Dump the DAG after legalizing vector ops and after the second type legalization Summary: With -debug, we aren't dumping the DAG after legalizing vector ops. In particular, on X86 with AVX1 only, we don't dump the DAG after we split 256-bit integer ops into pairs of 128-bit ADDs since this occurs during vector legalization. I'm only dumping if the legalize vector ops changes something since we don't print anything during legalize vector ops. So this dump shows up right after the first type-legalization dump happens. So if nothing changed this second dump is unnecessary. Having said that though, I think we should probably fix legalize vector ops to log what its doing. Reviewers: RKSimon, eli.friedman, spatel, arsenm, chandlerc Reviewed By: RKSimon Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D29554 llvm-svn: 294711	2017-02-10 05:05:57 +00:00
Eric Fiselier	87c87f4c30	[CMake] Fix pthread handling for out-of-tree builds LLVM defines `PTHREAD_LIB` which is used by AddLLVM.cmake and various projects to correctly link the threading library when needed. Unfortunately `PTHREAD_LIB` is defined by LLVM's `config-ix.cmake` file which isn't installed and therefore can't be used when configuring out-of-tree builds. This causes such builds to fail since `pthread` isn't being correctly linked. This patch attempts to fix that problem by renaming and exporting `LLVM_PTHREAD_LIB` as part of`LLVMConfig.cmake`. I renamed `PTHREAD_LIB` because It seemed likely to cause collisions with downstream users of `LLVMConfig.cmake`. llvm-svn: 294690	2017-02-10 01:59:20 +00:00
Geoff Berry	7e320c2485	[SelectionDAG] Fix bugs in inverted condition splitting code. Summary: Fix two bugs in SelectionDAGBuilder::FindMergedConditions reported by Mikael Holmen. Handle non-canonicalized xor not operation correctly (was assuming operand 0 was always the non-constant operand) and check that the negated condition is also in the same block as the original and/or instruction (as is done for and/or operands already) before proceeding with optimization. Reviewers: bogner, MatzeB, qcolombet Subscribers: mcrosier, uabelho, llvm-commits Differential Revision: https://reviews.llvm.org/D29680 llvm-svn: 294605	2017-02-09 18:28:17 +00:00
David Bozier	93e773e9be	Revert: "[Stack Protection] Add diagnostic information for why stack protection was applied to a function" this reverts revision r294590 as it broke some buildbots. llvm-svn: 294593	2017-02-09 15:40:14 +00:00
David Bozier	6a44b7c2eb	[Stack Protection] Add diagnostic information for why stack protection was applied to a function Stack Smash Protection is not completely free, so in hot code, the overhead it causes can cause performance issues. By adding diagnostic information for which function have SSP and why, a user can quickly determine what they can do to stop SSP being applied to a specific hot function. This change adds an SSP-specific DiagnosticInfo class and uses of it to the Stack Protection code. A subsequent change to clang will cause the remarks to be emitted when enabled. Patch by: James Henderson Differential Revision: https://reviews.llvm.org/D29023 llvm-svn: 294590	2017-02-09 15:08:40 +00:00
Artur Pilipenko	4a64031954	[DAGCombiner] Support non-zero offset in load combine Enable folding patterns which load the value from non-zero offset: i8 a = ... i32 val = a[4] \| (a[5] << 8) \| (a[6] << 16) \| (a[7] << 24) => i32 val = ((i32*)(a+4)) Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D29394 llvm-svn: 294582	2017-02-09 12:06:01 +00:00
Wolfgang Pieb	458b4e7c46	Reapply r294356 ("Keep track of spilled variables in LiveDebugValues"). Was reverted with r294447 due to undefined behavior with negative offsets in DBG_VALUE instructions. llvm-svn: 294532	2017-02-08 23:46:59 +00:00
Tim Northover	e041841811	GlobalISel: legalize G_FPOW to a libcall on AArch64. There's no instruction to implement it. llvm-svn: 294531	2017-02-08 23:23:39 +00:00
Tim Northover	b38b4e2464	GlobalISel: translate @llvm.pow intrinsic to G_FPOW. It'll usually be immediately legalized back to a libcall, but occasionally something can be done with it so we'd just as well enable that flexibility from the start. llvm-svn: 294530	2017-02-08 23:23:32 +00:00
Tim Northover	0a9b27933a	GlobalISel: expand mul-with-overflow into mul-hi on AArch64. AArch64 has specific instructions to multiply two numbers at double the width and produce the high part of the result. These can be used to implement LLVM's mul.with.overflow instructions fairly simply. Helps with C++ operator new[]. llvm-svn: 294519	2017-02-08 21:22:15 +00:00
Simon Dardis	2e8cdbd795	[DebugInfo] Rename EmitDebugValue to EmitDebugThreadLocal (NFC) As pointed out by David Blaikie in the post commit review of r292624, EmitDebugValue should be called EmitDebugThreadLocal. llvm-svn: 294500	2017-02-08 19:03:46 +00:00
Artur Pilipenko	045ab08252	[DAGCombiner] NFC. Mark ByteProvider accessors as const llvm-svn: 294494	2017-02-08 17:59:34 +00:00
Tim Northover	f19d467ff6	GlobalISel: translate @llvm.va_start intrinsic. Because we need to preserve the memory access being performed we need a separate instruction to represent this. llvm-svn: 294492	2017-02-08 17:57:20 +00:00
Sanne Wouda	2933875cc2	[Assembler] Enable nicer diagnostics for inline assembly. Fixed test. Summary: Enables source location in diagnostic messages from the backend. This is after parsing, during finalization. This requires the SourceMgr, the inline assembly string buffer, and DiagInfo to still be alive after EmitInlineAsm returns. This patch creates a single SourceMgr for inline assembly inside the AsmPrinter. MCContext gets a pointer to this SourceMgr. Using one SourceMgr per call to EmitInlineAsm would make it difficult for MCContext to figure out in which SourceMgr the SMLoc is located, while a single SourceMgr can figure it out if it has multiple buffers. The Str argument to EmitInlineAsm is copied into a buffer and owned by the inline asm SourceMgr. This ensures that DiagHandlers won't print garbage. (Clang emits a "note: instantiated into assembly here", which refers to this string.) The AsmParser gets destroyed before finalization, which means that the DiagHandlers the AsmParser installs into the SourceMgr will be stale. Restore the saved DiagHandlers. Since now we're using just one SourceMgr for multiple inline asm strings, we need to tell the AsmParser which buffer it needs to parse currently. Hand a buffer id -- returned from SourceMgr:: AddNewSourceBuffer -- to the AsmParser. Reviewers: rnk, grosbach, compnerd, rengolin, rovka, anemet Reviewed By: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29441 llvm-svn: 294458	2017-02-08 14:48:05 +00:00
Diana Picus	79add417b4	Revert "[Assembler] Enable nicer diagnostics for inline assembly." This reverts commit r294433 because it seems it broke the buildbots. llvm-svn: 294448	2017-02-08 14:02:16 +00:00
NAKAMURA Takumi	a3bc043caa	Revert r294356, "DebugInfo: Track spilled variables in LiveDebugValues" It caused undefined behavior in VarLoc. As far as I investigated, - VarLoc::VarLoc() treats negative offset value as InvalidKind. Consider the case that (int64_t)MI.getOperand(1).getImm() is negative and whether it satisfies ((uint64_t)Offset < (1ULL << 32)). - Comparison operators in VarLoc behave undefined since VarLoc::Loc.Hash is uninitialized in case of InvalidKind. I guess Offset (in VarLoc) could be made aware of signed, but I am not sure. So I have reverted it for now. llvm-svn: 294447	2017-02-08 13:49:28 +00:00
Sanne Wouda	09adc245ea	[Assembler] Enable nicer diagnostics for inline assembly. Summary: Enables source location in diagnostic messages from the backend. This is after parsing, during finalization. This requires the SourceMgr, the inline assembly string buffer, and DiagInfo to still be alive after EmitInlineAsm returns. This patch creates a single SourceMgr for inline assembly inside the AsmPrinter. MCContext gets a pointer to this SourceMgr. Using one SourceMgr per call to EmitInlineAsm would make it difficult for MCContext to figure out in which SourceMgr the SMLoc is located, while a single SourceMgr can figure it out if it has multiple buffers. The Str argument to EmitInlineAsm is copied into a buffer and owned by the inline asm SourceMgr. This ensures that DiagHandlers won't print garbage. (Clang emits a "note: instantiated into assembly here", which refers to this string.) The AsmParser gets destroyed before finalization, which means that the DiagHandlers the AsmParser installs into the SourceMgr will be stale. Restore the saved DiagHandlers. Since now we're using just one SourceMgr for multiple inline asm strings, we need to tell the AsmParser which buffer it needs to parse currently. Hand a buffer id -- returned from SourceMgr:: AddNewSourceBuffer -- to the AsmParser. Reviewers: rnk, grosbach, compnerd, rengolin, rovka, anemet Reviewed By: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29441 llvm-svn: 294433	2017-02-08 10:20:07 +00:00
Matt Arsenault	1672b1b86e	TargetLowering: Remove AddrSpace parameter from GetAddrModeArguments It doesn't make any sense to pass in to what is supposed to be parsing the call, and this can be inferred from the pointer output. llvm-svn: 294412	2017-02-08 07:09:03 +00:00
Amaury Sechet	4b946916ac	[DAGCombiner] Push truncate through adde when the carry isn't used. Summary: As per title. Reviewers: mkuper, spatel, bkramer, RKSimon, zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29528 llvm-svn: 294394	2017-02-08 00:32:36 +00:00
Wolfgang Pieb	02f329370f	DebugInfo: Track spilled variables in LiveDebugValues When variables are spilled to the stack by the register allocator, keep track of their debug locations in LiveDebugValues and insert DBG_VALUE instructions at the appropriate place. Ensure that the locations are propagated down the dominator tree via the existing mechanisms. Reviewer: aprantl Differential Revision: https://reviews.llvm.org/D29500 llvm-svn: 294356	2017-02-07 21:23:15 +00:00
Hans Wennborg	819e3e02a9	[X86] Disable conditional tail calls (PR31257) They are currently modelled incorrectly (as calls, which clobber registers, confusing e.g. Machine Copy Propagation). Reverting until we figure out the proper solution. llvm-svn: 294348	2017-02-07 20:37:45 +00:00
Tim Northover	d0d025ae45	GlobalISel: translate @llvm.va_end intrinsic. Turns out no-one actually cares about this one (at least) in tree so we can just drop it entirely. llvm-svn: 294345	2017-02-07 20:08:59 +00:00
Sanjoy Das	2f63cbcc0c	[ImplicitNullCheck] Extend Implicit Null Check scope by using stores Summary: This change allows usage of store instruction for implicit null check. Memory Aliasing Analisys is not used and change conservatively supposes that any store and load may access the same memory. As a result re-ordering of store-store, store-load and load-store is prohibited. Patch by Serguei Katkov! Reviewers: reames, sanjoy Reviewed By: sanjoy Subscribers: atrick, llvm-commits Differential Revision: https://reviews.llvm.org/D29400 llvm-svn: 294338	2017-02-07 19:19:49 +00:00
Reid Kleckner	0887d44a61	[SDAGISel] Simplify some SDAGISel code, NFC Hoist entry block code for arguments and swift error values out of the basic block instruction selection loop. Lowering arguments once up front seems much more readable than doing it conditionally inside the loop. It also makes it clear that argument lowering can update StaticAllocaMap because no instructions have been selected yet. Also use range-based for loops where possible. llvm-svn: 294329	2017-02-07 18:42:53 +00:00
Sanjay Patel	8c99ca3df0	[TargetLowering] fix formatting and comments for ShrinkDemandedConstant; NFC llvm-svn: 294325	2017-02-07 18:04:26 +00:00
Igor Laevsky	3be81bae63	[CodeGenPrepare] Hoist all getSubtargetImpl calls to the beginning of the pass Differential Revision: https://reviews.llvm.org/D29456 llvm-svn: 294301	2017-02-07 13:27:20 +00:00
Daniel Jasper	84b3cc394d	Revert "[DAGCombiner] (add X, (adde Y, 0, Carry)) -> (adde X, Y, Carry)" This reverts commit r294186. On an internal test, this triggers an out-of-memory error on PPC, presumably because there is another dagcombine that does the exact opposite triggering and endless loop consuming more and more memory. Chandler has started at creating a reduced test case and we'll attach it as soon as possible. llvm-svn: 294288	2017-02-07 08:57:50 +00:00
Matthias Braun	f80403d8ee	RegisterCoalescer: Fix joinReservedPhysReg() joinReservedPhysReg() can only deal with a liverange in a single basic block when copying from a vreg into a physreg. See also rdar://30306405 Differential Revision: https://reviews.llvm.org/D29436 llvm-svn: 294268	2017-02-07 01:59:39 +00:00
Tim Northover	868332d6bf	GlobalISel: legalize narrow G_SELECTS on AArch64. Otherwise there aren't any patterns to select them. llvm-svn: 294261	2017-02-06 23:41:27 +00:00
Tim Northover	0e6afbdd77	GlobalISel: legalize G_INSERT instructions We don't handle all cases yet (see arm64-fallback.ll for an example), but this is enough to cover most common C++ code so it's a good place to start. llvm-svn: 294247	2017-02-06 21:56:47 +00:00
Artur Pilipenko	d3464bf9ad	[DAGCombiner] Support bswap as a part of load combine patterns Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D29397 llvm-svn: 294201	2017-02-06 17:48:08 +00:00
Amaury Sechet	e674f5c758	Add ADDC to SelectionDAG::computeKnownBits and ComputeNumSignBits. Summary: As per title. Reviewers: bkramer, sunfish, lattner, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29521 llvm-svn: 294188	2017-02-06 14:59:06 +00:00
Amaury Sechet	8a3b32941d	[DAGCombiner] Make DAGCombiner smarter about overflow Summary: Leverage it to transform addc into add. Reviewers: mkuper, spatel, RKSimon, zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29524 llvm-svn: 294187	2017-02-06 14:54:49 +00:00
Amaury Sechet	1d466f598e	[DAGCombiner] (add X, (adde Y, 0, Carry)) -> (adde X, Y, Carry) Summary: This is extracted from D29443 . Reviewers: mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29564 llvm-svn: 294186	2017-02-06 14:28:39 +00:00
Simon Pilgrim	bfd4495512	[X86][SSE] Combine shuffle nodes with multiple uses if all the users are being combined. Currently we only combine shuffle nodes if they have a single user to prevent us from causing code bloat by splitting the shuffles into several different combines. We don't take into account that in some cases we will already have combined all the users during recursively calling up the shuffle tree. This patch keeps a list of all the shuffle nodes that have been combined so far and permits combining of further shuffle nodes if all its users are in that list. Differential Revision: https://reviews.llvm.org/D29399 llvm-svn: 294183	2017-02-06 13:44:45 +00:00
Kamil Rytarowski	5d2bd8dd54	Revamp llvm::once_flag to be closer to std::once_flag Summary: Make this interface reusable similarly to std::call_once and std::once_flag interface. This makes porting LLDB to NetBSD easier as there was in the original approach a portable way to specify a non-static once_flag. With this change translating std::once_flag to llvm::once_flag is mechanical. Sponsored by <The NetBSD Foundation> Reviewers: mehdi_amini, labath, joerg Reviewed By: mehdi_amini Subscribers: emaste, clayborg Differential Revision: https://reviews.llvm.org/D29566 llvm-svn: 294143	2017-02-05 21:13:06 +00:00
Geoff Berry	76ca8c2b34	[SelectionDAG] In InstrEmitter, handle EXTRACT_SUBREG of a physical register. Summary: Without this change, the getVR() call would hit an assert since it was being passed a physical register. Update the AArch64/ldst-opt.ll test with a case that triggers this behavior by adding a run with strict-align, which causes an unaligned STR XZR instruction to be split into byte stores, creating an EXTRACT_SUBREG of XZR that triggers the original problem. Reviewers: bogner, qcolombet, MatzeB, atrick Subscribers: aemerson, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D29495 llvm-svn: 294129	2017-02-05 18:28:14 +00:00
Amaury Sechet	143902c29f	[DAGCombiner] Leverage add's commutativity Summary: This avoid the need to duplicate all pattern and actually end up exposing some opportunity to optimize existing pattern that did not exists in both directions on an existing test case. Reviewers: mkuper, spatel, bkramer, RKSimon, zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29541 llvm-svn: 294125	2017-02-05 14:22:20 +00:00
Craig Topper	42b83f8d6e	[DAGCombiner] Canonicalize the order of a chain of INSERT_SUBVECTORs. Based on similar code for INSERT_VECTOR_ELT. llvm-svn: 294110	2017-02-04 23:26:39 +00:00
Craig Topper	04dce84ead	[DAGCombiner] Use DAG.getAnyExtOrTrunc to simplify some code. NFC llvm-svn: 294109	2017-02-04 23:26:37 +00:00
Craig Topper	ceaf9c1633	[DAGCombiner] In visitINSERT_VECTOR_ELT, move check for BUILD_VECTOR being legal below code that just canonicalizes INSERT_VECTOR_ELT without creating BUILD_VECTORS. llvm-svn: 294108	2017-02-04 23:26:34 +00:00
Amaury Sechet	6e2d8e49ec	Formatting in DAGCombiner. NFC llvm-svn: 294091	2017-02-04 13:01:53 +00:00
Matthias Braun	82e7f4d877	MachineCopyPropagation: Respect implicit operands of COPY The code missed to check implicit operands of COPY instructions for defs/uses. Differential Revision: https://reviews.llvm.org/D29522 llvm-svn: 294088	2017-02-04 02:27:20 +00:00
Matthias Braun	776a1d7ecb	MachineCopyPropagation: Do not consider undef operands as clobbers This was originally introduced in r278321 to work around correctness problems in the ExecutionDepsFix pass; Probably also to keep the performance benefits of breaking the false dependencies which of course also affect undef operands. ExecutionDepsFix has been improved here recently (see for example r278321) so we should not need this exception any longer. Differential Revision: https://reviews.llvm.org/D29525 llvm-svn: 294087	2017-02-04 02:27:13 +00:00
Kyle Butt	c7d67eef5a	[CodeGen]: BlockPlacement: Skip extraneous logging. Move a check for blocks that are not candidates for tail duplication up before the logging. Reduces logging noise. No non-logging changes intended. llvm-svn: 294086	2017-02-04 02:26:34 +00:00
Kyle Butt	e9425c4ff8	[CodeGen]: BlockPlacement: Apply const liberally. NFC Anything that needs to be passed to AnalyzeBranch unfortunately can't be const, or more would be const. Added const_iterator to BlockChain to allow BlockChain to be const when we don't expect to change it. llvm-svn: 294085	2017-02-04 02:26:32 +00:00
Eugene Zelenko	502d0bc28e	[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). This is preparation to reduce TargetInstrInfo.h dependencies. llvm-svn: 294084	2017-02-04 02:00:53 +00:00
Craig Topper	aa741abfc5	[TwoAddressInstruction] Fix typo in comment. NFC llvm-svn: 294083	2017-02-04 01:58:10 +00:00
Brendon Cahoon	9809b10586	[RegisterCoalescer] Do not call getInstructionIndex with DBG_VALUE An assert occurs when calling SlotIndexes::getInstructionIndex with a DBG_VALUE instruction because the function expects an instruction with a slot index. However, there is no slot index for a DBG_VALUE instruction. Differential Revision: https://reviews.llvm.org/D29048 llvm-svn: 294070	2017-02-04 00:10:22 +00:00
Ahmed Bougacha	9677cc6fb7	[TLI] Robustize SDAG LibFunc proto checking by merging it into TLI. This re-applies commit r292189, reverted in r292191. SelectionDAGBuilder recognizes libfuncs using some homegrown parameter type-checking. Use TLI instead, removing another heap of redundant code. This isn't strictly NFC, as the SDAG code was too lax. Concretely, this means changes are required to a few tests: - calling a non-variadic function via a variadic prototype isn't OK; it just happens to work on x86_64 (but not on, e.g., aarch64). - mempcpy has a size_t parameter; the SDAG code accepts any integer type, which meant using i32 on x86_64 worked. - a handful of SystemZ tests check the SDAG support for lax prototype checking: Ulrich agrees on removing them. I don't think it's worth supporting any of these (IMO) invalid testcases. Instead, fix them to be more meaningful. llvm-svn: 294028	2017-02-03 19:11:19 +00:00
Tim Northover	c3e3f59d12	GlobalISel: translate dynamic alloca instructions. llvm-svn: 294022	2017-02-03 18:22:45 +00:00
Alexey Bataev	a0d9f2582b	[SelectionDAG] Fix for PR30775: Assertion `NodeToMatch->getOpcode() != ISD::DELETED_NODE && "NodeToMatch was removed partway through selection"' failed. NodeToMatch can be modified during matching, but code does not handle this situation. Differential Revision: https://reviews.llvm.org/D29292 llvm-svn: 294003	2017-02-03 12:28:40 +00:00
David Blaikie	a0e3c75187	DebugInfo: ensure type and namespace names are included in pubnames/pubtypes even when they are only present in type units While looking to add support for placing singular types (types that will only be emitted in one place (such as attached to a strong vtable or explicit template instantiation definition)) not in type units (since type units have overhead) I stumbled across that change causing an increase in pubtypes. Turns out we were missing some types from type units if they were only referenced from other type units and not from the debug_info section. This fixes that, following GCC's line of describing the offset of such entities as the CU die (since there's no compile unit-relative offset that would describe such an entity - they aren't in the CU). Also like GCC, this change prefers to describe the type stub within the CU rather than the "just use the CU offset" fallback where possible. This may give the DWARF consumer some opportunity to find the extra info in the type stub - though I'm not sure GDB does anything with this currently. The size of the pubnames/pubtypes sections now match exactly with or without type units enabled. This nearly triples (+189%) the pubtypes section for a clang self-host and grows pubnames by 0.07% (without compression). For a total of 8% increase in debug info sections of the objects of a Split DWARF build when using type units. llvm-svn: 293971	2017-02-03 00:44:18 +00:00
Bob Haarman	dd4ebc1d3b	[lto] add getLinkerOpts() Summary: Some compilers, including MSVC and Clang, allow linker options to be specified in source files. In the legacy LTO API, there is a getLinkerOpts() method that returns linker options for the bitcode module being processed. This change adds that method to the new API, so that the COFF linker can get the right linker options when using the new LTO API. Reviewers: pcc, ruiu, mehdi_amini, tejohnson Reviewed By: pcc Differential Revision: https://reviews.llvm.org/D29207 llvm-svn: 293950	2017-02-02 23:00:49 +00:00
Reid Kleckner	c35139ec0d	[CodeGen] Remove dead call-or-prologue enum from CCState This enum has been dead since Olivier Stannard re-implemented ARM byval handling in r202985 (2014). llvm-svn: 293943	2017-02-02 21:58:22 +00:00
Xinliang David Li	58fcc9bdce	[PGO] internal option cleanups 1. Added comments for options 2. Added missing option cl::desc field 3. Uniified function filter option for graph viewing. Now PGO count/raw-counts share the same filter option: -view-bfi-func-name=. llvm-svn: 293938	2017-02-02 21:29:17 +00:00
Quentin Colombet	5725f56bb0	[LiveRangeEdit] Don't mess up with LiveInterval when a new vreg is created. In r283838, we added the capability of splitting unspillable register. When doing so we had to make sure the split live-ranges were also unspillable and we did that by marking the related live-ranges in the delegate method that is called when a new vreg is created. However, by accessing the live-range there, we also triggered their lazy computation (LiveIntervalAnalysis::getInterval) which is not what we want in general. Indeed, later code in LiveRangeEdit is going to build the live-ranges this lazy computation may mess up that computation resulting in assertion failures. Namely, the createEmptyIntervalFrom method expect that the live-range is going to be empty, not computed. Thanks to Mikael Holmén <mikael.holmen@ericsson.com> for noticing and reporting the problem. llvm-svn: 293934	2017-02-02 20:44:36 +00:00
Xinliang David Li	1eb4ec6a2e	[PGO] make graph view internal options available for all builds Differential Revision: https://reviews.llvm.org/D29259 llvm-svn: 293921	2017-02-02 19:18:56 +00:00
Nirav Dave	93f9d5ce04	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r293893 which is miscompiling lua on ARM and bootstrapping for x86-windows. llvm-svn: 293915	2017-02-02 18:24:55 +00:00
Amaury Sechet	f3e421d6e9	Use N0 instead of N->getOperand(0) in DagCombiner::visitAdd. NFC llvm-svn: 293903	2017-02-02 16:07:44 +00:00
Nirav Dave	4442667fc5	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Recommiting after fixing X86 inc/dec chain bug. * Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly constructs wider loads, but then promotes them to float operations which appear but requires more expensive constant generation). Some minor peephole optimizations to deal with improved SubDAG shapes (listed below) Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests. 5. Remove Chain dependencies of Memory operations on CopyfromReg nodes as these are captured by data dependence 6. Forward loads-store values through tokenfactors containing {CopyToReg,CopyFromReg} Values. 7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll) 8. Store merging for the ARM target is restricted to 32-bit as some in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting: CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle llvm-svn: 293893	2017-02-02 14:39:42 +00:00
Matthias Braun	9dc3b5ff89	RegisterCoalescer: Cleanup joinReservedPhysReg(); NFC - Factor out a common subexpression - Add some helpful comments - Fix printing of a register in a debug message llvm-svn: 293856	2017-02-02 02:23:27 +00:00
Paul Robinson	5362216c36	Remove an assertion that doesn't hold when mixing -g and -gmlt through LTO. Replace it with a related assertion, ensuring that abstract variables appear only in abstract scopes. Part of PR31437. Differential Revision: http://reviews.llvm.org/D29430 llvm-svn: 293841	2017-02-01 23:51:56 +00:00
Dehao Chen	0944a8c2ec	Change debug-info-for-profiling from a TargetOption to a function attribute. Summary: LTO requires the debug-info-for-profiling to be a function attribute. Reviewers: echristo, mehdi_amini, dblaikie, probinson, aprantl Reviewed By: mehdi_amini, dblaikie, aprantl Subscribers: aprantl, probinson, ahatanak, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D29203 llvm-svn: 293833	2017-02-01 22:45:09 +00:00
Paul Robinson	a380e613f1	Remove an assertion that doesn't hold when mixing -g and -gmlt through LTO. Part of PR31437. Differential Revision: http://reviews.llvm.org/D29310 llvm-svn: 293818	2017-02-01 21:54:50 +00:00
Sanjoy Das	08da2e28ee	[ImplicitNullCheck] Extend canReorder scope Summary: This change allows a re-order of two intructions if their uses are overlapped. Patch by Serguei Katkov! Reviewers: reames, sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29120 llvm-svn: 293775	2017-02-01 16:04:21 +00:00
Florian Hahn	7a5ec55fb3	[legalizetypes] Push fp16 -> fp32 extension node to worklist. Summary: This way, the type legalization machinery will take care of registering the result of this node properly. This patches fixes all failing fp16 test cases with expensive checks. (CodeGen/ARM/fp16-promote.ll, CodeGen/ARM/fp16.ll, CodeGen/X86/cvt16.ll CodeGen/X86/soft-fp.ll) Reviewers: t.p.northover, baldrick, olista01, bogner, jmolloy, davidxl, ab, echristo, hfinkel Reviewed By: hfinkel Subscribers: mehdi_amini, hfinkel, davide, RKSimon, aemerson, llvm-commits Differential Revision: https://reviews.llvm.org/D28195 llvm-svn: 293765	2017-02-01 13:01:33 +00:00
Evandro Menezes	94edf02923	[CodeGen] Move MacroFusion to the target This patch moves the class for scheduling adjacent instructions, MacroFusion, to the target. In AArch64, it also expands the fusion to all instructions pairs in a scheduling block, beyond just among the predecessors of the branch at the end. Differential revision: https://reviews.llvm.org/D28489 llvm-svn: 293737	2017-02-01 02:54:34 +00:00
Sanjoy Das	15e50b510e	[ImplicitNullCheck] NFC isSuitableMemoryOp cleanup Summary: isSuitableMemoryOp method is repsonsible for verification that instruction is a candidate to use in implicit null check. Additionally it checks that base register is not re-defined before. In case base has been re-defined it just returns false and lookup is continued while any suitable instruction will not succeed this check as well. This results in redundant further operations. So when we found that base register has been re-defined we just stop. Patch by Serguei Katkov! Reviewers: reames, sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29119 llvm-svn: 293736	2017-02-01 02:49:25 +00:00
Stanislav Mekhanoshin	70c245e92d	Fix regalloc assignment of overlapping registers SplitEditor::defFromParent() can create a register copy. If register is a tuple of other registers and not all lanes are used a copy will be done on a full tuple regardless. Later register unit for an unused lane will be considered free and another overlapping register tuple can be assigned to a different value even though first register is live at that point. That is because interference only look at liveness info, while full register copy clobbers all lanes, even unused. This patch fixes copy to only cover used lanes. Differential Revision: https://reviews.llvm.org/D29105 llvm-svn: 293728	2017-02-01 01:18:36 +00:00
Kyle Butt	b15c06677c	CodeGen: Allow small copyable blocks to "break" the CFG. When choosing the best successor for a block, ordinarily we would have preferred a block that preserves the CFG unless there is a strong probability the other direction. For small blocks that can be duplicated we now skip that requirement as well, subject to some simple frequency calculations. Differential Revision: https://reviews.llvm.org/D28583 llvm-svn: 293716	2017-01-31 23:48:32 +00:00
Tim Northover	c6bfa481cf	GlobalISel: the translation of an invoke must branch to the good block. Otherwise bad things happen if the basic block order isn't trivial after an invoke. llvm-svn: 293679	2017-01-31 20:12:18 +00:00
Matthias Braun	01fa962226	InterleaveAccessPass: Avoid constructing invalid shuffle masks Fix a bug where we would construct shufflevector instructions addressing invalid elements. Differential Revision: https://reviews.llvm.org/D29313 llvm-svn: 293673	2017-01-31 18:37:53 +00:00
Tim Northover	293f74355b	GlobalISel: merge invoke and call translation paths. Well, sort of. But the lower-level code that invoke used to be using completely botched the handling of varargs functions, which hopefully won't be possible if they're using the same code. llvm-svn: 293670	2017-01-31 18:36:11 +00:00
Nirav Dave	a7c041d147	[X86] Implement -mfentry Summary: Insert calls to __fentry__ at function entry. Reviewers: hfinkel, craig.topper Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D28000 llvm-svn: 293648	2017-01-31 17:00:27 +00:00
Nicolai Haehnle	8813d5d221	[DAGCombine] require UnsafeFPMath for re-association of addition Summary: The affected transforms all implicitly use associativity of addition, for which we usually require unsafe math to be enabled. The "Aggressive" flag is only meant to convey information about the performance of the fused ops relative to a fmul+fadd sequence. Fixes Bug 31626. Reviewers: spatel, hfinkel, mehdi_amini, arsenm, tstellarAMD Subscribers: jholewinski, nemanjai, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D28675 llvm-svn: 293635	2017-01-31 14:35:37 +00:00
Keno Fischer	578cf7aae7	[ExecutionDepsFix] Improve clearance calculation for loops Summary: In revision rL278321, ExecutionDepsFix learned how to pick a better register for undef register reads, e.g. for instructions such as `vcvtsi2sdq`. While this revision improved performance on a good number of our benchmarks, it unfortunately also caused significant regressions (up to 3x) on others. This regression turned out to be caused by loops such as: PH -> A -> B (xmm<Undef> -> xmm<Def>) -> C -> D -> EXIT ^ \| +----------------------------------+ In the previous version of the clearance calculation, we would visit the blocks in order, remembering for each whether there were any incoming backedges from blocks that we hadn't processed yet and if so queuing up the block to be re-processed. However, for loop structures such as the above, this is clearly insufficient, since the block B does not have any unknown backedges, so we do not see the false dependency from the previous interation's Def of xmm registers in B. To fix this, we need to consider all blocks that are part of the loop and reprocess them one the correct clearance values are known. As an optimization, we also want to avoid reprocessing any later blocks that are not part of the loop. In summary, the iteration order is as follows: Before: PH A B C D A' Corrected (Naive): PH A B C D A' B' C' D' Corrected (w/ optimization): PH A B C A' B' C' D To facilitate this optimization we introduce two new counters for each basic block. The first counts how many of it's predecssors have completed primary processing. The second counts how many of its predecessors have completed all processing (we will call such a block done. Now, the criteria to reprocess a block is as follows: - All Predecessors have completed primary processing - For x the number of predecessors that have completed primary processing at the time of primary processing of this block, the number of predecessors that are done has reached x. The intuition behind this criterion is as follows: We need to perform primary processing on all predecessors in order to find out any direct defs in those predecessors. When predecessors are done, we also know that we have information about indirect defs (e.g. in block B though that were inherited through B->C->A->B). However, we can't wait for all predecessors to be done, since that would cause cyclic dependencies. However, it is guaranteed that all those predecessors that are prior to us in reverse postorder will be done before us. Since we iterate of the basic blocks in reverse postorder, the number x above, is precisely the count of the number of predecessors prior to us in reverse postorder. Reviewers: myatsina Differential Revision: https://reviews.llvm.org/D28759 llvm-svn: 293571	2017-01-30 23:37:03 +00:00
Tim Northover	2bf8c9d381	GlobalISel: correctly translate invoke when callee is a register. This should fix the GlobalISel verifier. llvm-svn: 293550	2017-01-30 21:45:21 +00:00
Tim Northover	c944970484	GlobalISel: account for differing exception selector sizes. For some reason the exception selector register must be a pointer (that's assumed by SDag); on the other hand, it gets moved into an IR-level type which might be entirely different (i32 on AArch64). IRTranslator needs to be aware of this. llvm-svn: 293546	2017-01-30 20:52:42 +00:00
Tim Northover	c94d70336b	GlobalISel: tidy up def/use test. NFC. llvm-svn: 293545	2017-01-30 20:52:37 +00:00
Tim Northover	79f43f195c	GlobalISel: translate memset & memmove. llvm-svn: 293541	2017-01-30 19:33:07 +00:00
Tim Northover	480609d0f3	GlobalISel: permit unused vregs without a register-class after ISel. This can happen if earlier combining has removed all uses of some VReg, which is fine and shouldn't flag an error. llvm-svn: 293537	2017-01-30 19:12:50 +00:00
Simon Pilgrim	ffe2535cf6	Use SelectionDAG::getBuildVector helper function where possible. NFCI. llvm-svn: 293532	2017-01-30 18:53:45 +00:00
Justin Bogner	8f520a73b2	SDAG: Update ChainNodesMatched during UpdateChains if a node is replaced Previously, we would hit UB (or the ISD::DELETED_NODE assert) if we happened to replace a node during UpdateChains, because it would be left in the list we were iterating over. This nulls out the pointer when that happens so that we can avoid the issue. Fixes llvm.org/PR31710 llvm-svn: 293522	2017-01-30 18:29:46 +00:00
Simon Pilgrim	0a5ab5c4db	Use SelectionDAG::getBuildVector/getSplatBuildVector helper functions where possible. NFCI. llvm-svn: 293520	2017-01-30 18:20:42 +00:00
Matt Arsenault	32e6bfa20f	DAG: Fold fneg into compare with constant into the constant fcmp (fneg x), c, pred -> fcmp x, -c, (swap pred) InstCombine already does this. llvm-svn: 293512	2017-01-30 17:57:28 +00:00
David Blaikie	a66696f210	unique_ptrify some containers in GlobalISel::RegisterBankInfo To simplify/clarify memory ownership, make leaks (as one was found/fixed recently) harder to write, etc. (also, while I was there - removed a duplicate lookup in a container) llvm-svn: 293506	2017-01-30 17:13:56 +00:00
Matt Arsenault	0c687390fe	DAG: Constant fold fp16_to_fp/fp16_to_fp This fixes emitting conversions of constants on targets without legal f16 that need to use these for legalization. llvm-svn: 293499	2017-01-30 16:57:41 +00:00
Kristof Beyls	65a12c012f	[GlobalISel] Add support for indirectbr Differential Revision: https://reviews.llvm.org/D28079 llvm-svn: 293470	2017-01-30 09:13:18 +00:00
Matthias Braun	a4976c6166	MachineInstr: Remove parameter from dump() The primary use of the dump() functions in LLVM is for use in a debugger. Unfortunately lldb does not seem to handle default arguments so using `p SomeMI.dump()` fails and you have to type the longer `p SomeMI.dump(nullptr)`. Remove the paramter to make the most common use easy. (You can always construct something like `p SomeMI.print(dbgs(),MyTII)` if you need more features). Differential Revision: https://reviews.llvm.org/D29241 llvm-svn: 293440	2017-01-29 18:20:42 +00:00
Craig Topper	135da1faf5	[SelectionDAG] Make SDNode::getConstantOperandVal an inline method. It's operation already exists manually in many places without using the method. llvm-svn: 293421	2017-01-29 06:08:02 +00:00
Craig Topper	4753736abf	[DAGCombiner] Use unsigned for a constant vector index instead of APInt. The type system requires that the number of vector elements should fit in 32-bits so this should be safe. llvm-svn: 293414	2017-01-29 04:38:21 +00:00
Craig Topper	d15730902b	[DAGCombiner] Remove unnecessary check on the size of the type of the index of EXTRACT_SUBVECTOR. The type system already requires that the number of vector elements must fit in 32-bits so an index should as well. Even if the type of the index were larger all we care about is that the constant index can fit in 64-bits so that we can call getZExtValue. llvm-svn: 293413	2017-01-29 04:38:19 +00:00
Craig Topper	24cdbe8fa6	[DAGCombiner] Make sure index of EXTRACT_SUBVECTOR is a constant before trying to use getConstantOperandVal. llvm-svn: 293412	2017-01-29 04:38:16 +00:00
Xinliang David Li	fd3f645f9d	Add support to dump dot graph block layout after MBP Differential Revision: https://reviews.llvm.org/D29141 llvm-svn: 293408	2017-01-29 01:57:02 +00:00
Matthias Braun	194ded551c	Use print() instead of dump() in code llvm-svn: 293371	2017-01-28 06:53:55 +00:00
Quentin Colombet	8cf1163c4f	[RegisterBankInfo] Emit proper type for remapped registers. When the OperandsMapper creates virtual registers, it used to just create plain scalar register with the right size. This may confuse the instruction selector because we lose the information of the instruction using those registers what supposed to do. The MachineVerifier complains about that already. With this patch, the OperandsMapper still creates plain scalar register, but the expectation is for the mapping function to remap the type properly. The default mapping function has been updated to do that. rdar://problem/30231850 llvm-svn: 293362	2017-01-28 02:23:48 +00:00
Matthias Braun	8c209aa877	Cleanup dump() functions. We had various variants of defining dump() functions in LLVM. Normalize them (this should just consistently implement the things discussed in http://lists.llvm.org/pipermail/cfe-dev/2014-January/034323.html For reference: - Public headers should just declare the dump() method but not use LLVM_DUMP_METHOD or #if !defined(NDEBUG) \|\| defined(LLVM_ENABLE_DUMP) - The definition of a dump method should look like this: #if !defined(NDEBUG) \|\| defined(LLVM_ENABLE_DUMP) LLVM_DUMP_METHOD void MyClass::dump() { // print stuff to dbgs()... } #endif llvm-svn: 293359	2017-01-28 02:02:38 +00:00
Quentin Colombet	351099022a	[RegisterCoalescing] Recommit the patch "Remove partial redundent copy". In r292621, the recommit fixes a bug related with live interval update after the partial redundent copy is moved. This recommit solves an additional bug related to the lack of update of subranges. The original patch is to solve the performance problem described in PR27827. Register coalescing sometimes cannot remove a copy because of interference. But if we can find a reverse copy in one of the predecessor block of the copy, the copy is partially redundent and we may remove the copy partially by moving it to the predecessor block without the reverse copy. Differential Revision: https://reviews.llvm.org/D28585 Re-apply r292621 Revert "Revert rL292621. Caused some internal build bot failures in apple." This reverts commit r292984. Original patch: Wei Mi <wmi@google.com> Subrange fix: Mostly Matthias Braun <matze@braunis.de> llvm-svn: 293353	2017-01-28 01:05:27 +00:00
Evgeniy Stepanov	d0852873e5	Fix memory leak in globalisel. #0 0x89cdeb in operator new[](unsigned long) /code/llvm/projects/compiler-rt/lib/asan/asan_new_delete.cc:84:37 #1 0x4ec87c4 in llvm::RegisterBankInfo::ValueMapping const* llvm::RegisterBankInfo::getOperandsMapping<llvm::RegisterBankInfo::ValueMapping const* const>(llvm::RegisterBankInfo::ValueMapping const const, llvm::RegisterBankInfo::ValueMapping const const) const /code/llvm/lib/CodeGen/GlobalISel/RegisterBankInfo.cpp:297:9 #2 0x9327ee in llvm::AArch64RegisterBankInfo::getInstrMapping(llvm::MachineInstr const&) const /code/llvm/lib/Target/AArch64/AArch64RegisterBankInfo.cpp:540:30 #3 0x4eb8d07 in llvm::RegBankSelect::assignInstr(llvm::MachineInstr&) /code/llvm/lib/CodeGen/GlobalISel/RegBankSelect.cpp:546:24 #4 0x4eb9dd2 in llvm::RegBankSelect::runOnMachineFunction(llvm::MachineFunction&) /code/llvm/lib/CodeGen/GlobalISel/RegBankSelect.cpp:624:12 #5 0x3141875 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /code/llvm/lib/CodeGen/MachineFunctionPass.cpp:62:13 #6 0x396128d in llvm::FPPassManager::runOnFunction(llvm::Function&) /code/llvm/lib/IR/LegacyPassManager.cpp:1513:27 #7 0x3961832 in llvm::FPPassManager::runOnModule(llvm::Module&) /code/llvm/lib/IR/LegacyPassManager.cpp:1534:16 #8 0x3962540 in runOnModule /code/llvm/lib/IR/LegacyPassManager.cpp:1590:27 #9 0x3962540 in llvm::legacy::PassManagerImpl::run(llvm::Module&) /code/llvm/lib/IR/LegacyPassManager.cpp:1693 #10 0x8ae368 in compileModule(char*, llvm::LLVMContext&) /code/llvm/tools/llc/llc.cpp:562:8 #11 0x8a7a1b in main /code/llvm/tools/llc/llc.cpp:316:22 llvm-svn: 293351	2017-01-28 00:46:30 +00:00
Tim Northover	12bd22fbee	GlobalISel: don't leak super-entry BB when merging with IR-level one. We have to delete the block manually or it leaks. That triggers failures in -fsanitize=leak bots (unsurprisingly), which should be fixed by this patch. llvm-svn: 293347	2017-01-27 23:54:31 +00:00
Tim Northover	d8b85584f2	GlobalISel: set correct regclass for LOAD_STACK_GUARD. Since it's not actually a generic MI, its register operands need a RegClass, which is conveniently the target's pointer RegClass. llvm-svn: 293335	2017-01-27 21:31:24 +00:00
Tim Northover	c9bc8a5580	GlobalISel: mark incoming landing-pad registers as live. Should fix machine verifier failures. llvm-svn: 293334	2017-01-27 21:31:17 +00:00
Matthias Braun	c91e28af4b	ScheduleDAGInstrs: Do not try to toggle kill flags on debug uses Preparation for upcoming changes. No testcase as none of the public targets bundles early enough and has a post machine scheduler enabled at the same time. The error is also easily catched by asserts. llvm-svn: 293324	2017-01-27 18:53:07 +00:00
Matthias Braun	26e8c350f9	ScheduleDAGInstrs: Cleanup toggleKillFlag(); NFC llvm-svn: 293323	2017-01-27 18:53:05 +00:00
Matthias Braun	bd7d91838e	ScheduleDAGInstrs: Cleanup; NFC Comment, doxygen and a bit of whitespace cleanup. llvm-svn: 293322	2017-01-27 18:53:00 +00:00
Jun Bum Lim	b99a06b7c9	[CodeGenPrep]No negative cost in the ExtLd promotion Summary: This change prevent the signed value of cost from being negative as the value is passed as an unsigned argument. Reviewers: mcrosier, jmolloy, qcolombet, javed.absar Reviewed By: mcrosier, qcolombet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28871 llvm-svn: 293307	2017-01-27 17:16:37 +00:00
Jonas Paulsson	bb0ed3e732	[DAGTypeLegalizer] Handle SIGN/ZERO_EXTEND in WidenVecRes_Convert(). In case of a SIGN/ZERO_EXTEND of an incomplete vector type (using only a partial number of available vector elements), WidenVecRes_Convert() used to resort to scalarization. This patch adds a handling of the (common) case where an input vector can be found of same width as the widened result vector, by converting the node to SIGN/ZERO_EXTEND_VECTOR_INREG. Review: Eli Friedman llvm-svn: 293268	2017-01-27 07:46:26 +00:00
Tim Northover	09aac4ad2a	GlobalISel: support debug intrinsics. The translation scheme is mostly cribbed from FastISel, and it's not entirely convincing semantically. But it does seem to work in the common cases and allow variables to be printed so it can't be all wrong. llvm-svn: 293228	2017-01-26 23:39:14 +00:00
Andrew Kaylor	a0a1164ce4	Add intrinsics for constrained floating point operations This commit introduces a set of experimental intrinsics intended to prevent optimizations that make assumptions about the rounding mode and floating point exception behavior. These intrinsics will later be extended to specify flush-to-zero behavior. More work is also required to model instruction dependencies in machine code and to generate these instructions from clang (when required by pragmas and/or command line options that are not currently supported). Differential Revision: https://reviews.llvm.org/D27028 llvm-svn: 293226	2017-01-26 23:27:59 +00:00
Kyle Butt	c4614b3e76	[IfConversion] Use reverse_iterator to simplify. NFC This simplifies skipping debug instructions and shrinking ranges. llvm-svn: 293202	2017-01-26 20:02:47 +00:00
Nirav Dave	d32a421f75	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r293184 which is failing in LTO builds llvm-svn: 293188	2017-01-26 16:46:13 +00:00
Nirav Dave	de6516c466	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. * Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly constructs wider loads, but then promotes them to float operations which appear but requires more expensive constant generation). Some minor peephole optimizations to deal with improved SubDAG shapes (listed below) Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests. 5. Remove Chain dependencies of Memory operations on CopyfromReg nodes as these are captured by data dependence 6. Forward loads-store values through tokenfactors containing {CopyToReg,CopyFromReg} Values. 7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll) 8. Store merging for the ARM target is restricted to 32-bit as some in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting: CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle llvm-svn: 293184	2017-01-26 16:02:24 +00:00
Craig Topper	001aad7da7	[DAGCombiner] Fold extract_subvector of undef to undef. Fold away inserting undef subvectors. llvm-svn: 293152	2017-01-26 05:38:46 +00:00
Adam Nemet	a964066705	New OptimizationRemarkEmitter pass for MIR This allows MIR passes to emit optimization remarks with the same level of functionality that is available to IR passes. It also hooks up the greedy register allocator to report spills. This allows for interesting use cases like increasing interleaving on a loop until spilling of registers is observed. I still need to experiment whether reporting every spill scales but this demonstrates for now that the functionality works from llc using -pass-remarks*=<pass>. Differential Revision: https://reviews.llvm.org/D29004 llvm-svn: 293110	2017-01-25 23:20:33 +00:00
Tim Northover	470f070b7d	SDag: fix how initial loads are formed when splitting vector ops. Later code expects the vector loads produced to be directly concatenable, which means we shouldn't pad anything except the last load produced with UNDEF. llvm-svn: 293088	2017-01-25 20:58:26 +00:00
Tim Northover	9e35f1e21c	GlobalISel: rework getOrCreateVReg to avoid double lookup. NFC. Thanks to Quentin for suggesting the refactoring. llvm-svn: 293087	2017-01-25 20:58:22 +00:00
Tim Northover	5d27063eb4	DebugInfo: remove unused parameter from function. NFC. I think it's a hold-over from some previous iteration, but it's never set to true in LLVM as it exists now. llvm-svn: 293086	2017-01-25 20:58:07 +00:00
Krzysztof Parzyszek	ee9aa3ffee	Add iterator_range<regclass_iterator> to {Target,MC}RegisterInfo, NFC llvm-svn: 293077	2017-01-25 19:29:04 +00:00
Chad Rosier	4f724dce42	Revert "Do not verify dominator tree if it has no roots" This reverts commit r293033, per Danny's comment. In short, we require domtrees to have roots at all times. llvm-svn: 293075	2017-01-25 17:15:48 +00:00
Artur Pilipenko	bc93452420	Fix buildbot failures introduced by 293036 Fix unused variable, specify types explicitly to make VC compiler happy. llvm-svn: 293039	2017-01-25 09:10:07 +00:00
Artur Pilipenko	41c0005aa3	[DAGCombiner] Match load by bytes idiom and fold it into a single load. Attempt #2 . The previous patch (https://reviews.llvm.org/rL289538) got reverted because of a bug. Chandler also requested some changes to the algorithm. http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20161212/413479.html This is an updated patch. The key difference is that collectBitProviders (renamed to calculateByteProvider) now collects the origin of one byte, not the whole value. It simplifies the implementation and allows to stop the traversal earlier if we know that the result won't be used. From the original commit: Match a pattern where a wide type scalar value is loaded by several narrow loads and combined by shifts and ors. Fold it into a single load or a load and a bswap if the targets supports it. Assuming little endian target: i8 a = ... i32 val = a[0] \| (a[1] << 8) \| (a[2] << 16) \| (a[3] << 24) => i32 val = ((i32)a) i8 a = ... i32 val = (a[0] << 24) \| (a[1] << 16) \| (a[2] << 8) \| a[3] => i32 val = BSWAP(((i32)a)) This optimization was discussed on llvm-dev some time ago in "Load combine pass" thread. We came to the conclusion that we want to do this transformation late in the pipeline because in presence of atomic loads load widening is irreversible transformation and it might hinder other optimizations. Eventually we'd like to support folding patterns like this where the offset has a variable and a constant part: i32 val = a[i] \| (a[i + 1] << 8) \| (a[i + 2] << 16) \| (a[i + 3] << 24) Matching the pattern above is easier at SelectionDAG level since address reassociation has already happened and the fact that the loads are adjacent is clear. Understanding that these loads are adjacent at IR level would have involved looking through geps/zexts/adds while looking at the addresses. The general scheme is to match OR expressions by recursively calculating the origin of individual bytes which constitute the resulting OR value. If all the OR bytes come from memory verify that they are adjacent and match with little or big endian encoding of a wider value. If so and the load of the wider type (and bswap if needed) is allowed by the target generate a load and a bswap if needed. Reviewed By: RKSimon, filcab, chandlerc Differential Revision: https://reviews.llvm.org/D27861 llvm-svn: 293036	2017-01-25 08:53:31 +00:00
Serge Pavlov	43a7759f4b	Do not verify dominator tree if it has no roots If dominator tree has no roots, the pass that calculates it is likely to be skipped. It occures, for instance, in the case of entities with linkage available_externally. Do not run tree verification in such case. Differential Revision: https://reviews.llvm.org/D28767 llvm-svn: 293033	2017-01-25 07:58:10 +00:00
Matt Arsenault	732a531506	DAG: Recognize no-signed-zeros-fp-math attribute clang already emits this with -cl-no-signed-zeros, but codegen doesn't do anything with it. Treat it like the other fast math attributes, and change one place to use it. llvm-svn: 293024	2017-01-25 06:08:42 +00:00
Justin Bogner	4844573eb1	GlobalISel: Fix typo in error message llvm-svn: 293023	2017-01-25 06:02:10 +00:00
Matt Arsenault	8a27aee6ae	DAGCombiner: Allow negating ConstantFP after legalize llvm-svn: 293019	2017-01-25 04:54:34 +00:00
Ahmed Bougacha	eb185e1f64	Try to prevent build breakage by touching a CMakeLists.txt. Looks like our cmake goop for handling .inc->td dependencies doesn't track the .td files. This manifests as cmake complaining about missing files since r293009. Force a rerun to avoid that. llvm-svn: 293012	2017-01-25 02:55:24 +00:00
Justin Bogner	a029531e10	GlobalISel: Use the correct types when translating landingpad instructions There was a bug here where we were using p0 instead of s32 for the selector type in the landingpad. Instead of hardcoding these types we should get the types from the landingpad instruction directly. Note that we replicate an assert from SDAG here to only support two-valued landingpads. llvm-svn: 292995	2017-01-25 00:16:53 +00:00
Wei Mi	f1cf0278e8	Revert rL292621. Caused some internal build bot failures in apple. llvm-svn: 292984	2017-01-24 22:15:06 +00:00
Geoff Berry	92a286ae5a	[SelectionDAG] Handle inverted conditions when splitting into multiple branches. Summary: When conditional branches with complex conditions are split into multiple branches in SelectionDAGBuilder::FindMergedConditions, also handle inverted conditions. These may sometimes appear without having been optimized by InstCombine when CodeGenPrepare decides to sink and duplicate cmp instructions, causing them to have only one use. This problem can be increased by e.g. GVNHoist hiding more cmps from InstCombine by combining equivalent cmps from different blocks. For example codegen X & !(Y \| Z) as: jmp_if_X TmpBB jmp FBB TmpBB: jmp_if_notY Tmp2BB jmp FBB Tmp2BB: jmp_if_notZ TBB jmp FBB Reviewers: bogner, MatzeB, qcolombet Subscribers: llvm-commits, hiraditya, mcrosier, sebpop Differential Revision: https://reviews.llvm.org/D28380 llvm-svn: 292944	2017-01-24 16:36:07 +00:00
Craig Topper	ff272ad4f3	[SelectionDAG] Teach getNode to simplify a couple easy cases of EXTRACT_SUBVECTOR Summary: This teaches getNode to simplify extracting from Undef. This is similar to what is done for EXTRACT_VECTOR_ELT. It also adds support for extracting from CONCAT_VECTOR when we can reuse one of the inputs to the concat. These seem like simple non-target specific optimizations. For X86 we currently handle undef in extractSubvector, but not all EXTRACT_SUBVECTOR creations go through there. Ultimately, my motivation here is to simplify extractSubvector and remove custom lowering for EXTRACT_SUBVECTOR since we don't do anything but handle undef and BUILD_VECTOR optimizations, but those should be DAG combines. Reviewers: RKSimon, delena Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29000 llvm-svn: 292876	2017-01-24 02:36:59 +00:00
Matthias Braun	b901d33461	LiveIntervalAnalysis: Calculate liveness even if a superreg is reserved. A register unit may be allocatable and non-reserved but some of the register(tuples) built with it are reserved. We still need to calculate liveness in this case. Note to out of tree targets: If you start seeing machine verifier errors with this commit, it probably means that you do not properly mark super registers of reserved register as reserved. See for example r292836 or r292870 for example on how to fix that. rdar://29996737 Differential Revision: https://reviews.llvm.org/D28881 llvm-svn: 292871	2017-01-24 01:12:58 +00:00
David L. Jones	d21529fa0d	[Analysis] Add LibFunc_ prefix to enums in TargetLibraryInfo. (NFC) Summary: The LibFunc::Func enum holds enumerators named for libc functions. Unfortunately, there are real situations, including libc implementations, where function names are actually macros (musl uses "#define fopen64 fopen", for example; any other transitively visible macro would have similar effects). Strictly speaking, a conforming C++ Standard Library should provide any such macros as functions instead (via <cstdio>). However, there are some "library" functions which are not part of the standard, and thus not subject to this rule (fopen64, for example). So, in order to be both portable and consistent, the enum should not use the bare function names. The old enum naming used a namespace LibFunc and an enum Func, with bare enumerators. This patch changes LibFunc to be an enum with enumerators prefixed with "LibFFunc_". (Unfortunately, a scoped enum is not sufficient to override macros.) There are additional changes required in clang. Reviewers: rsmith Subscribers: mehdi_amini, mzolotukhin, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D28476 llvm-svn: 292848	2017-01-23 23:16:46 +00:00
Matt Arsenault	4e305c6c1e	DAG: Don't fold vector extract into load if target doesn't want to Fixes turning a 32-bit scalar load into an extending vector load for AMDGPU when dynamically indexing a vector. llvm-svn: 292842	2017-01-23 22:48:53 +00:00
Ahmed Bougacha	b6137063eb	[AArch64][GlobalISel] Legalize narrow scalar fp->int conversions. Since we're now avoiding operations using narrow scalar integer types, we have to legalize the integer side of the FP conversions. This requires teaching the legalizer how to do that. llvm-svn: 292828	2017-01-23 21:10:14 +00:00
Matt Arsenault	7030661364	DAG: Allow legalization of fcanonicalize vector types llvm-svn: 292814	2017-01-23 18:52:26 +00:00
Matthias Braun	28eae8f4e0	LiveRegUnits: Add accumulateBackward() function Re-Commit r292543 with a fix for the situation when the chain end is MBB.end(). This function can be used to accumulate the set of all read and modified register in a sequence of instructions. Use this code in AArch64A57FPLoadBalancing::scavengeRegister() to prove the concept. - The AArch64A57LoadBalancing code is using a backwards analysis now which is irrespective of kill flags. This is the main motivation for this change. Differential Revision: http://reviews.llvm.org/D22082 llvm-svn: 292705	2017-01-21 02:21:04 +00:00
Tim Northover	7f3ad2e07b	GlobalISel: prevent heap use-after-free when looking up VReg. Translating the constant can create more VRegs, which can invalidate the reference into the DenseMap. So we have to look up the value again after all that's happened. llvm-svn: 292675	2017-01-20 23:25:17 +00:00
Petar Jovanovic	dbb39356b4	[mips] Fix debug information for __thread variable This patch fixes debug information for __thread variable on Mips using .dtprelword and .dtpreldword directives. Patch by Aleksandar Beserminji. Differential Revision: http://reviews.llvm.org/D28770 llvm-svn: 292624	2017-01-20 17:53:30 +00:00
Wei Mi	d5f7eebd42	[RegisterCoalescing] Recommit the patch "Remove partial redundent copy". The recommit fixes a bug related with live interval update after the partial redundent copy is moved. The original patch is to solve the performance problem described in PR27827. Register coalescing sometimes cannot remove a copy because of interference. But if we can find a reverse copy in one of the predecessor block of the copy, the copy is partially redundent and we may remove the copy partially by moving it to the predecessor block without the reverse copy. Differential Revision: https://reviews.llvm.org/D28585 llvm-svn: 292621	2017-01-20 17:38:54 +00:00
Matthias Braun	d9217c0b86	Revert "LiveRegUnits: Add accumulateBackward() function" This seems to be breaking some bots. This reverts commit r292543. llvm-svn: 292574	2017-01-20 03:58:42 +00:00
Ahmed Bougacha	d294823930	[AArch64][GlobalISel] Widen scalar int->fp conversions. It's incorrect to ignore the higher bits of the integer source. Teach the legalizer how to widen it. llvm-svn: 292563	2017-01-20 01:37:24 +00:00
Stanislav Mekhanoshin	6ec3e3a728	[AMDGPU] Prevent spills before exec mask is restored Inline spiller can decide to move a spill as early as possible in the basic block. It will skip phis and label, but we also need to make sure it skips instructions in the basic block prologue which restore exec mask. Added isPositionLike callback in TargetInstrInfo to detect instructions which shall be skipped in addition to common phis, labels etc. Differential Revision: https://reviews.llvm.org/D27997 llvm-svn: 292554	2017-01-20 00:44:31 +00:00
Justin Bogner	e094cc4372	GlobalISel: Add a note about how we're being a bit loose with memory operands The logic in r292461 is conservatively correct, but we should revisit this later. Add a TODO so we don't forget. llvm-svn: 292553	2017-01-20 00:30:17 +00:00
Ahmed Bougacha	bf480554df	[MIRParser] Allow generic register specification on operand. This completes r292321 by adding support for generic registers, e.g.: %2:_(s32) = G_ADD %0, %1 llvm-svn: 292550	2017-01-20 00:29:59 +00:00
Justin Bogner	680663931c	GlobalISel: Only set FailedISel on dropped dbg intrinsics when using fallback It's easier to test the non-fallback path if we just drop these intrinsics for now, like we did before we added the fallback path. We'll obviously need to fix this properly, but the fixme for that is already here. llvm-svn: 292547	2017-01-20 00:24:30 +00:00
Justin Bogner	23acaf07de	GlobalISel: Pass the MachineFunction in to reportSelectionError directly Rather than trying to find MF based on the possibly-null MI we've passed in here, just pass it in directly. It's already available at all callers anyway. llvm-svn: 292544	2017-01-20 00:16:19 +00:00
Matthias Braun	3ffeb68869	LiveRegUnits: Add accumulateBackward() function This function can be used to accumulate the set of all read and modified register in a sequence of instructions. Use this code in AArch64A57FPLoadBalancing::scavengeRegister() to prove the concept. - The AArch64A57LoadBalancing code is using a backwards analysis now which is irrespective of kill flags. This is the main motivation for this change. Differential Revision: http://reviews.llvm.org/D22082 llvm-svn: 292543	2017-01-20 00:16:17 +00:00
Matthias Braun	710a4c1f3d	CodeGen: Add/Factor out LiveRegUnits class; NFCI This is a set of register units intended to track register liveness, it is similar in spirit to LivePhysRegs. You can also think of this as the liveness tracking parts of the RegisterScavenger factored out into an own class. This was proposed in http://llvm.org/PR27609 Differential Revision: http://reviews.llvm.org/D21916 llvm-svn: 292542	2017-01-20 00:16:14 +00:00
Tim Northover	3babfef11d	AArch64: fall back to DAG ISel for inline assembly. We can't currently handle "calls" to inlineasm strings so it's better to let the DAG handle it than generate rubbish. llvm-svn: 292540	2017-01-19 23:59:35 +00:00
Simon Pilgrim	fb32eea1b4	[SelectionDAG] Improve knownbits handling of UMIN/UMAX (PR31293) This patch improves the knownbits logic for unsigned integer min/max opcodes. For UMIN we know that the result will have the maximum of the inputs' known leading zero bits in the result, similarly for UMAX the maximum of the inputs' leading one bits. This is particularly useful for simplifying clamping patterns,. e.g. as SSE doesn't have a uitofp instruction we want to use sitofp instead where possible and for that we need to confirm that the top bit is not set. Differential Revision: https://reviews.llvm.org/D28853 llvm-svn: 292528	2017-01-19 22:41:22 +00:00
Mikael Holmen	2074e7497b	[DAG] Don't increase SDNodeOrder for dbg.value/declare. Summary: The SDNodeOrder is saved in the IROrder field in the SDNode, and this field may affects scheduling. Thus, letting dbg.value/declare increase the order numbers may in turn affect scheduling. Because of this change we also need to update the code deciding when dbg values should be output, in ScheduleDAGSDNodes.cpp/ProcessSDDbgValues. Dbg values now have the same order as the SDNode they are connected to, not the following orders. Test cases provided by Florian Hahn. Reviewers: bogner, aprantl, sunfish, atrick Reviewed By: atrick Subscribers: fhahn, probinson, andreadb, llvm-commits, MatzeB Differential Revision: https://reviews.llvm.org/D25318 llvm-svn: 292485	2017-01-19 13:55:55 +00:00
Daniel Sanders	d64d5024a4	Re-commit: [globalisel] Tablegen-erate current Register Bank Information Summary: Adds a RegisterBank tablegen class that can be used to declare the register banks and an associated tablegen pass to generate the necessary code. Changes since first commit attempt: * Added missing guards * Added more missing guards * Found and fixed a use-after-free bug involving Twine locals Reviewers: t.p.northover, ab, rovka, qcolombet Reviewed By: qcolombet Subscribers: aditya_nandakumar, rengolin, kristof.beyls, vkalintiris, mgorny, dberris, llvm-commits, rovka Differential Revision: https://reviews.llvm.org/D27338 llvm-svn: 292478	2017-01-19 11:15:55 +00:00
Justin Bogner	ddb80aee7e	GlobalISel: Implement widening for shifts llvm-svn: 292476	2017-01-19 07:51:17 +00:00
Justin Bogner	d09c3ce6c0	GlobalISel: Implement narrowing for G_LOAD llvm-svn: 292461	2017-01-19 01:05:48 +00:00
Justin Bogner	1f5c505437	GlobalISel: Fix text wrapping in a comment. NFC llvm-svn: 292460	2017-01-19 01:04:46 +00:00
Dehao Chen	1ce8d6ca59	Add -debug-info-for-profiling to emit more debug info for sample pgo profile collection Summary: SamplePGO binaries built with -gmlt to collect profile. The current -gmlt debug info is limited, and we need some additional info: * start line of all subprograms * linkage name of all subprograms * standalone subprograms (functions that has neither inlined nor been inlined) This patch adds these information to the -gmlt binary. The impact on speccpu2006 binary size (size increase comparing with -g0 binary, also includes data for -g binary, which does not change with this patch): -gmlt(orig) -gmlt(patched) -g 433.milc 4.68% 5.40% 19.73% 444.namd 8.45% 8.93% 45.99% 447.dealII 97.43% 115.21% 374.89% 450.soplex 27.75% 31.88% 126.04% 453.povray 21.81% 26.16% 92.03% 470.lbm 0.60% 0.67% 1.96% 482.sphinx3 5.77% 6.47% 26.17% 400.perlbench 17.81% 19.43% 73.08% 401.bzip2 3.73% 3.92% 12.18% 403.gcc 31.75% 34.48% 122.75% 429.mcf 0.78% 0.88% 3.89% 445.gobmk 6.08% 7.92% 42.27% 456.hmmer 10.36% 11.25% 35.23% 458.sjeng 5.08% 5.42% 14.36% 462.libquantum 1.71% 1.96% 6.36% 464.h264ref 15.61% 16.56% 43.92% 471.omnetpp 11.93% 15.84% 60.09% 473.astar 3.11% 3.69% 14.18% 483.xalancbmk 56.29% 81.63% 353.22% geomean 15.60% 18.30% 57.81% Debug info size change for -gmlt binary with this patch: 433.milc 13.46% 444.namd 5.35% 447.dealII 18.21% 450.soplex 14.68% 453.povray 19.65% 470.lbm 6.03% 482.sphinx3 11.21% 400.perlbench 8.91% 401.bzip2 4.41% 403.gcc 8.56% 429.mcf 8.24% 445.gobmk 29.47% 456.hmmer 8.19% 458.sjeng 6.05% 462.libquantum 11.23% 464.h264ref 5.93% 471.omnetpp 31.89% 473.astar 16.20% 483.xalancbmk 44.62% geomean 16.83% Reviewers: davidxl, echristo, dblaikie Reviewed By: echristo, dblaikie Subscribers: aprantl, probinson, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D25434 llvm-svn: 292457	2017-01-19 00:44:11 +00:00
Matthias Braun	9f21a8d787	LiveIntervalAnalysis: Cleanup; NFC - Fix doxygen comments: Do not repeat name, remove duplicated doxygen comment (on declaration + implementation), etc. - Use more range based for llvm-svn: 292455	2017-01-19 00:32:13 +00:00
Krzysztof Parzyszek	de44c9d857	Treat segment [B, E) as not overlapping block with boundaries [A, B) llvm-svn: 292446	2017-01-18 23:12:19 +00:00
Haicheng Wu	8ce2d14356	[CodeGenPrepare] Fix a typo in the comment. NFC. encode => endcode. Differential Revision: https://reviews.llvm.org/D28866 llvm-svn: 292438	2017-01-18 21:12:10 +00:00
Justin Bogner	fde0104649	GlobalISel: Implement narrowing for G_STORE Legalize stores of types that are too wide by breaking them up into sequences of smaller stores. llvm-svn: 292412	2017-01-18 17:29:54 +00:00
Teresa Johnson	2d384ac381	Don't create a comdat group for a dropped def with initializer Non-prevailing weak/linkonce odr symbols will be dropped by ThinLTO to available_externally when possible. If they had an initializer in the global_ctors list, a comdat group was being created. This code already had logic to skip available_externally defs, but now the EliminateAvailableExternally pass will drop these symbols to declarations earlier. Change the check to skip all declarations for linker (which includes available_externally along with declarations). Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28737 llvm-svn: 292408	2017-01-18 16:58:43 +00:00
Florian Hahn	8485cecd3f	[thumb,framelowering] Reset NoVRegs in Thumb1FrameLowering::emitPrologue. Summary: In this function, virtual registers can be introduced (for example through calls to emitThumbRegPlusImmInReg). doScavengeFrameVirtualRegs will replace those virtual registers with concrete registers later on in PrologEpilogInserter, which sets NoVRegs again. This patch fixes the Codegen/Thumb/segmented-stacks.ll test case which failed with expensive checks. https://llvm.org/bugs/show_bug.cgi?id=27484 Reviewers: rnk, bkramer, olista01 Reviewed By: olista01 Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D28829 llvm-svn: 292372	2017-01-18 15:01:22 +00:00
Daniel Sanders	af76f989b5	Re-revert: [globalisel] Tablegen-erate current Register Bank Information More missing guards. My build didn't notice it due to a stale file left over from a Global ISel build. llvm-svn: 292369	2017-01-18 14:26:12 +00:00
Daniel Sanders	517b61cb69	Re-commit: [globalisel] Tablegen-erate current Register Bank Information Summary: Adds a RegisterBank tablegen class that can be used to declare the register banks and an associated tablegen pass to generate the necessary code. Changes since last commit: The new tablegen pass is now correctly guarded by LLVM_BUILD_GLOBAL_ISEL and this should fix the buildbots however it may not be the whole fix. The previous buildbot failures suggest there may be a memory bug lurking that I'm unable to reproduce (including when using asan) or spot in the source. If they re-occur on this commit then I'll need assistance from the bot owners to track it down. Reviewers: t.p.northover, ab, rovka, qcolombet Reviewed By: qcolombet Subscribers: aditya_nandakumar, rengolin, kristof.beyls, vkalintiris, mgorny, dberris, llvm-commits, rovka Differential Revision: https://reviews.llvm.org/D27338 llvm-svn: 292367	2017-01-18 14:17:50 +00:00
Matt Arsenault	f411071d63	DAG: Consider nnan in isKnownNeverNaN llvm-svn: 292328	2017-01-18 02:10:08 +00:00
Wei Mi	ce9d04ce58	Revert rL292292 since it causes a SEGV on sanitizer-x86_64-linux-fuzzer build bot. llvm-svn: 292327	2017-01-18 01:53:53 +00:00
Matthias Braun	de5fea2c30	MIRParser: Allow regclass specification on operand You can now define the register class of a virtual register on the operand itself avoiding the need to use a "registers:" block. Example: "%0:gr64 = COPY %rax" Differential Revision: https://reviews.llvm.org/D22398 llvm-svn: 292321	2017-01-18 00:59:19 +00:00
Wei Mi	8f4178a59e	[RegisterCoalescing] Remove partial redundent copy. The patch is to solve the performance problem described in PR27827. Register coalescing sometimes cannot remove a copy because of interference. But if we can find a reverse copy in one of the predecessor block of the copy, the copy is partially redundent and we may remove the copy partially by moving it to the predecessor block without the reverse copy. Differential Revision: https://reviews.llvm.org/D28585 llvm-svn: 292292	2017-01-17 23:39:07 +00:00
Tim Northover	d943354216	GlobalISel: correctly handle varargs Some platforms (notably iOS) use a different calling convention for unnamed vs named parameters in varargs functions, so we need to keep track of this information when translating calls. Since not many platforms are involved, the guts of the special handling is in the ValueHandler class (with a generic implementation that should work for most targets). llvm-svn: 292283	2017-01-17 22:30:10 +00:00
Tim Northover	b6636fd392	[GlobalISel] track predecessor mapping during switch lowering. Correctly populating Machine PHIs relies on knowing exactly how the IR level CFG was lowered to MachineIR. This needs to be tracked by any translation phases that meddle (currently only SwitchInst handling). This reapplies r291973 which was reverted because of testing failures. Fixes: + Don't return an ArrayRef to a local temporary. + Incorporate Kristof's suggested comment improvements. llvm-svn: 292278	2017-01-17 22:13:50 +00:00
Ahmed Bougacha	9e5a085cf1	Revert "[TLI] Robustize SDAG proto checking by merging it into TLI." This reverts commit r292189, as it causes issues on SystemZ bots. llvm-svn: 292191	2017-01-17 03:31:00 +00:00
Ahmed Bougacha	c018efd680	[TLI] Robustize SDAG proto checking by merging it into TLI. SelectionDAGBuilder recognizes libfuncs using some homegrown parameter type-checking. Use TLI instead, removing another heap of redundant code. This isn't strictly NFC, as the SDAG code was too lax. Concretely, this means changes are required to two tests: - calling a non-variadic function via a variadic prototype isn't OK; it just happens to work on x86_64 (but not on, e.g., aarch64). - mempcpy has a size_t parameter; the SDAG code accepts any integer type, which meant using i32 on x86_64 worked. I don't think it's worth supporting either of these (IMO) broken testcases. Instead, fix them to be more correct. llvm-svn: 292189	2017-01-17 03:10:06 +00:00
Daniel Sanders	a83a1a69c5	Revert r292132: [globalisel] Tablegen-erate current Register Bank Information'... Several buildbots encountered a crash in tablegen when building this commit. Reverting while I investigate the cause. llvm-svn: 292136	2017-01-16 15:34:43 +00:00
Daniel Sanders	ab8194def0	[globalisel] Tablegen-erate current Register Bank Information Summary: Adds a RegisterBank tablegen class that can be used to declare the register banks and an associated tablegen pass to generate the necessary code. Reviewers: t.p.northover, ab, rovka, qcolombet Subscribers: aditya_nandakumar, rengolin, kristof.beyls, vkalintiris, mgorny, dberris, llvm-commits, rovka Differential Revision: https://reviews.llvm.org/D27338 llvm-svn: 292132	2017-01-16 15:20:43 +00:00
Simon Pilgrim	3e91519a1c	[SelectionDAG] Add knownbits support for BITREVERSE llvm-svn: 292130	2017-01-16 14:49:26 +00:00
Simon Pilgrim	db73dbcc7c	[SelectionDAG] Add support for BITREVERSE constant folding We were relying on constant folding of the legalized instructions to do what constant folding we had previously llvm-svn: 292114	2017-01-16 13:39:00 +00:00
Serge Pavlov	ed5eb93384	Reverted: Track validity of pass results Commits r291882 and related r291887. llvm-svn: 292062	2017-01-15 10:23:18 +00:00
Daniel Jasper	bf56ad36cb	Revert "[GlobalISel] track predecessor mapping during switch lowering." This reverts commit r291973. The test fails in a Release build with LLVM_BUILD_GLOBAL_ISEL enabled. AFAICT, llc segfaults. I'll add a few more details to the original commit. llvm-svn: 292061	2017-01-15 09:41:49 +00:00
Justin Bogner	1a314dac4b	GlobalISel: Abort in ResetMachineFunctionPass if fallback isn't enabled When GlobalISel is configured to abort rather than fallback the only thing that resetting the machine function does is make things harder to debug. If we ever get to this point in the abort configuration it indicates that we've already hit a bug, so this changes the behaviour to abort instead. llvm-svn: 291977	2017-01-13 23:46:11 +00:00
Tim Northover	2b57987827	[GlobalISel] track predecessor mapping during switch lowering. Correctly populating Machine PHIs relies on knowing exactly how the IR level CFG was lowered to MachineIR. This needs to be tracked by any translation phases that meddle (currently only SwitchInst handling). llvm-svn: 291973	2017-01-13 23:11:37 +00:00
David Majnemer	e0ebdf4bc7	[CodeGen] Simplify getRecipEstimateForFunc It used two attribute lookups when only one was needed. llvm-svn: 291965	2017-01-13 22:24:25 +00:00
James Y Knight	99a2ce2af2	Check for register clobbers when merging a vreg live range with a reserved physreg in RegisterCoalescer. Previously, we only checked for clobbers when merging into a READ of the physreg, but not when merging from a WRITE to the physreg. Differential Revision: https://reviews.llvm.org/D28527 llvm-svn: 291942	2017-01-13 19:08:36 +00:00
Malcolm Parsons	17d266bc96	Remove unused lambda captures. NFC llvm-svn: 291916	2017-01-13 17:12:16 +00:00
Benjamin Kramer	061f4a5fe6	Apply clang-tidy's performance-unnecessary-value-param to LLVM. With some minor manual fixes for using function_ref instead of std::function. No functional change intended. llvm-svn: 291904	2017-01-13 14:39:03 +00:00
Diana Picus	116bbab4e4	[CodeGen] Rename MachineInstrBuilder::addOperand. NFC Rename from addOperand to just add, to match the other method that has been added to MachineInstrBuilder for adding more than just 1 operand. See https://reviews.llvm.org/D28057 for the whole discussion. Differential Revision: https://reviews.llvm.org/D28556 llvm-svn: 291891	2017-01-13 09:58:52 +00:00
Serge Pavlov	d409411ef1	Track validity of pass results Running tests with expensive checks enabled exhibits some problems with verification of pass results. First, the pass verification may require results of analysis that are not available. For instance, verification of loop info requires results of dominator tree analysis. A pass may be marked as conserving loop info but does not need to be dependent on DominatorTreePass. When a pass manager tries to verify that loop info is valid, it needs dominator tree, but corresponding analysis may be already destroyed as no user of it remained. Another case is a pass that is skipped. For instance, entities with linkage available_externally do not need code generation and such passes are skipped for them. In this case result verification must also be skipped. To solve these problems this change introduces a special flag to the Pass structure to mark passes that have valid results. If this flag is reset, verifications dependent on the pass result are skipped. Differential Revision: https://reviews.llvm.org/D27190 llvm-svn: 291882	2017-01-13 06:09:54 +00:00
Daniel Sanders	b7391dd3b4	[globalisel] Move as much RegisterBank initialization to the constructor as possible Summary: The register bank is now entirely initialized in the constructor. However, we still have the hardcoded number of register classes which will be dealt with in the TableGen patch (D27338) since we do not have access to this information to resolve this at this stage. The number of register classes is known to the TRI and to TableGen but the RegisterBank constructor is too early for the former and too late for the latter. This will be fixed when the data is tablegen-erated. Reviewers: t.p.northover, ab, rovka, qcolombet Subscribers: aditya_nandakumar, kristof.beyls, vkalintiris, llvm-commits, dberris Differential Revision: https://reviews.llvm.org/D27809 llvm-svn: 291770	2017-01-12 16:11:23 +00:00
Daniel Sanders	ae03595bfb	[globalisel] Initialize RegisterBanks with static data. Summary: Refactor the RegisterBank initialization to use static data. This requires GlobalISel implementations to rewrite calls to createRegisterBank() and addRegBankCoverage() into a call to setRegBankData(). Out of tree targets can use diff 4 of D27807 (https://reviews.llvm.org/D27807?id=84117) to have addRegBankCoverage() dump the register classes and other data that needs to be provided to setRegBankData(). This is the method that was used to generate the static data in this patch. Tablegen-eration of this static data will follow after some refactoring. Reviewers: t.p.northover, ab, rovka, qcolombet Subscribers: aditya_nandakumar, kristof.beyls, vkalintiris, llvm-commits, dberris Differential Revision: https://reviews.llvm.org/D27807 Differential Revision: https://reviews.llvm.org/D27808 llvm-svn: 291768	2017-01-12 15:32:10 +00:00
Zachary Turner	629cb7d8cc	[CodeView] Finish decoupling TypeDatabase from TypeDumper. Previously the type dumper itself was passed around to a lot of different places and manipulated in ways that were more appropriate on the type database. For example, the entire TypeDumper was passed into the symbol dumper, when all the symbol dumper wanted to do was lookup the name of a TypeIndex so it could print it. That's what the TypeDatabase is for -- mapping type indices to names. Another example is how if the user runs llvm-pdbdump with the option to dump symbols but not types, we still have to visit all types so that we can print minimal information about the type of a symbol, but just without dumping full symbol records. The way we did this before is by hacking it up so that we run everything through the type dumper with a null printer, so that the output goes to /dev/null. But really, we don't need to dump anything, all we want to do is build the type database. Since TypeDatabaseVisitor now exists independently of TypeDumper, we can do this. We just build a custom visitor callback pipeline that includes a database visitor but not a dumper. All the hackery around printers etc goes away. After this patch, we could probably even delete the entire CVTypeDumper class since really all it is at this point is a thin wrapper that hides the details of how to build a useful visitation pipeline. It's not a priority though, so CVTypeDumper remains for now. After this patch we will be able to easily plug in a different style of type dumper by only implementing the proper visitation methods to dump one-line output and then sticking it on the pipeline. Differential Revision: https://reviews.llvm.org/D28524 llvm-svn: 291724	2017-01-11 23:24:22 +00:00
Kyle Butt	efe56fed12	Revert "CodeGen: Allow small copyable blocks to "break" the CFG." This reverts commit ada6595a526d71df04988eb0a4b4fe84df398ded. This needs a simple probability check because there are some cases where it is not profitable. llvm-svn: 291695	2017-01-11 19:55:19 +00:00
Tim Northover	778d0816ab	GlobalISel: only print debug info with -debug. NFC. Turns out DEBUG(...) has uses even inside NDEBUG checks. llvm-svn: 291685	2017-01-11 17:33:37 +00:00
Craig Topper	7af39837a9	Revert r291645 "[DAGCombiner] Teach DAG combiner to fold (vselect (N0 xor AllOnes), N1, N2) -> (vselect N0, N2, N1). Only do this if the target indicates its vector boolean type is ZeroOrNegativeOneBooleanContent." Some test appears to be hanging on the build bots. llvm-svn: 291650	2017-01-11 04:59:25 +00:00
Craig Topper	577d258569	[DAGCombiner] Teach DAG combiner to fold (vselect (N0 xor AllOnes), N1, N2) -> (vselect N0, N2, N1). Only do this if the target indicates its vector boolean type is ZeroOrNegativeOneBooleanContent. llvm-svn: 291645	2017-01-11 04:02:23 +00:00
Matt Arsenault	e482403e1c	DAGCombiner: Add hasOneUse checks to fadd/fma combine Even with aggressive fusion enabled, this requires duplicating the fmul, or increases an fadd to another fma which is not an improvement. llvm-svn: 291642	2017-01-11 02:02:12 +00:00
Eugene Zelenko	c4ad1ce068	[Target] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 291641	2017-01-11 01:45:03 +00:00
Quentin Colombet	0b63b31b2e	[RegBankSelect] Improve the output of the debug messages. Add more information about mapping cost and chosen solution. llvm-svn: 291629	2017-01-11 00:48:41 +00:00
Zachary Turner	a9054ddd9c	[CodeView/PDB] Rename a bunch of files. We were starting to get some name clashes between llvm-pdbdump and the common CodeView framework, so I took this opportunity to rename a bunch of files to more accurately describe their usage. This also helps in llvm-pdbdump to distinguish between different files and whether they are used for pretty dump mode or raw dump mode. llvm-svn: 291627	2017-01-11 00:35:43 +00:00
Zachary Turner	c640b76db5	[CodeView] Add TypeDatabase class. This creates a centralized class in which to store type records. It stores types as an array of entries, which matches the notion of a type stream being a topologically sorted DAG. Logic to build up such a database was already being used in CVTypeDumper, so CVTypeDumper is now updated to to read from a TypeDatabase which is filled out by an earlier visitor in the pipeline. Differential Revision: https://reviews.llvm.org/D28486 llvm-svn: 291626	2017-01-11 00:35:08 +00:00
Kyle Butt	df27aa8c89	CodeGen: Allow small copyable blocks to "break" the CFG. When choosing the best successor for a block, ordinarily we would have preferred a block that preserves the CFG unless there is a strong probability the other direction. For small blocks that can be duplicated we now skip that requirement as well. Differential revision: https://reviews.llvm.org/D27742 llvm-svn: 291609	2017-01-10 23:04:30 +00:00
Matt Arsenault	def496c04b	Remove unused CONVERT_RNDSAT intrinsics llvm-svn: 291607	2017-01-10 22:38:02 +00:00
Matt Arsenault	0b382a7cb8	DAG: Avoid OOB when legalizing vector indexing If a vector index is out of bounds, the result is supposed to be undefined but is not undefined behavior. Change the legalization for indexing the vector on the stack so that an out of bounds index does not create an out of bounds memory access. llvm-svn: 291604	2017-01-10 22:02:30 +00:00
Victor Leschuk	cbddae74f5	DebugInfo: support for DW_FORM_implicit_const Support for DW_FORM_implicit_const DWARFv5 feature. When this form is used attribute value goes to .debug_abbrev section (as SLEB). As this form would break any debug tool which doesn't support DWARFv5 it is guarded by dwarf version check. Attempt to use this form with dwarf version <= 4 is considered a fatal error. Differential Revision: https://reviews.llvm.org/D28456 llvm-svn: 291599	2017-01-10 21:18:26 +00:00
Simon Dardis	548a53f5ee	[mips] Fix Mips MSA instrinsics The usage of some MIPS MSA instrinsics that took immediates could crash LLVM during lowering. This patch addresses that behaviour. Crucially this patch also makes the use of intrinsics with out of range immediates as producing an internal error. The ld,st instrinsics would trigger an assertion failure for MIPS64 as their lowering would attempt to add an i32 offset to a i64 pointer. Reviewers: vkalintiris, slthakur Differential Revision: https://reviews.llvm.org/D25438 llvm-svn: 291571	2017-01-10 16:40:57 +00:00
Craig Topper	588c734b0f	[DAGCombiner] Merge together duplicate checks for folding fold (select C, 1, X) -> (or C, X) and folding (select C, X, 0) -> (and C, X). Also be consistent about checking that both the condition and the result type are i1. NFC I guess previously we just assumed if the result type was i1, then the condition type must also be i1? llvm-svn: 291548	2017-01-10 07:42:57 +00:00
Craig Topper	d915d6ba57	[DAGCombiner] Remove code for optimizing select (xor Cond, 0), X, Y -> select Cond, X, Y. Just let combine on the xor itself take care of it. llvm-svn: 291534	2017-01-10 04:12:19 +00:00
Evandro Menezes	063259d4d1	[CodeGen] Implement the SUnit::print() method This method seems to have had a troubled life. This patch proposes that it replaces the recently added helper function dumpSUIdentifier. This way, the method can be used in other files using the SUnit class. Differential revision: https://reviews.llvm.org/D28488 llvm-svn: 291520	2017-01-10 01:08:01 +00:00
Matthias Braun	ba7d95d425	PeepholeOptimizer: Do not replace SubregToReg(bitcast like) While we can usually replace bitcast like instructions (MachineInstr::isBitcast()) with a COPY this is not legal if any of the users uses SUBREG_TO_REG to assert the upper bits of the result are zero. Differential Revision: https://reviews.llvm.org/D28474 llvm-svn: 291483	2017-01-09 21:38:17 +00:00
Matthias Braun	a37430844c	MachineInstr: Print name for subreg index in SUBREG_TO_REG SUBREG_TO_REG takes a subregister index as 3rd operand, print the name instead of a number. We already do the same for INSERT_SUBREG and REG_SEQUENCE. llvm-svn: 291481	2017-01-09 21:38:10 +00:00
Sumanth Gundapaneni	0ac1ce70ba	In the below scenario, we must be able to skip the a DBG_VALUE instruction and remove the dead store. %vreg0<def> = L2_loadri_io <fi#15>, 0; mem:LD4[%dataF](align=4) DBG_VALUE %vreg0, %noreg, !"dataF", <!184>; IntRegs:%vreg0 S2_storeri_io <fi#15>, 0, %vreg0; mem:ST4[%dataF] In reality, this kind of stores are eliminated before Stack Slot Coloring pass, possibly in instruction lowering Differential Revision: https://reviews.llvm.org/D26616 llvm-svn: 291455	2017-01-09 17:45:02 +00:00
Bjorn Pettersson	b14afd452d	[SelectionDAG] Fix in legalization of UMAX/SMAX/UMIN/SMIN. Solves PR31486. Summary: Originally i64 = umax t8, Constant:i64<4> was expanded into i32,i32 = umax Constant:i32<0>, Constant:i32<0> i32,i32 = umax t7, Constant:i32<4> Now instead the two produced umax:es return i32 instead of i32, i32. Thanks to Jan Vesely for help with the test case. Patch by mikael.holmen at ericsson.com Reviewers: bogner, jvesely, tstellarAMD, arsenm Subscribers: test, wdng, RKSimon, arsenm, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D28135 llvm-svn: 291441	2017-01-09 12:03:50 +00:00
Daniel Sanders	12360efa76	[globalisel] Stop requiring -debug/-debug-only=registerbankinfo for assertions. Summary: I've noticed that these assertions don't trigger when the condition is false. The problem is that the DEBUG(x) macro only executes x when the pass is emitting debug output via the -debug and -debug-only=registerbankinfo command line arguments. Debug builds should always execute the assertions so use '#ifndef NDEBUG' instead. Also removed an assertion that is only true the first time it's tested. <Target>RegisterBankInfo's constructor will re-use register banks causing them to be valid on subsequent tests. That assertion will fail on the first test too in the near future. Reviewers: t.p.northover, ab, rovka, qcolombet Subscribers: dberris, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D28358 llvm-svn: 291235	2017-01-06 14:29:34 +00:00
David Majnemer	9e04befb09	[SelectionDAG] Rework lowerRangeToAssertZExt Utilize ConstantRange to make it easier to interpret range metadata. llvm-svn: 291211	2017-01-06 02:43:28 +00:00
David Majnemer	eaba06cffa	[SelectionDAG] Correctly transform range metadata to AssertZExt We used the logBase2 of the high instead of the ceilLogBase2 resulting in the wrong result for certain values. For example, it resulted in an i1 AssertZExt when the exclusive portion of the range was 3. llvm-svn: 291196	2017-01-06 00:11:46 +00:00
Joerg Sonnenberger	83963995c6	PR 31534: When emitting both DWARF unwind tables and debug information, do not use .cfi_sections. This requires checking if any non-declaration function in the module needs an unwind table. llvm-svn: 291172	2017-01-05 20:55:28 +00:00
Matthias Braun	1172332203	CodeGen: Assert that liveness is up to date when reading block live-ins. Add an assert that checks whether liveins are up to date before they are used. - Do not print liveins into .mir files anymore in situations where they are out of date anyway. - The assert in the RegisterScavenger is superseded by the new one in livein_begin(). - Skip parts of the liveness updating logic in IfConversion.cpp when liveness isn't tracked anymore (just enough to avoid hitting the new assert()). Differential Revision: https://reviews.llvm.org/D27562 llvm-svn: 291169	2017-01-05 20:01:19 +00:00
Kristof Beyls	a983e7c4a4	[GlobalISel] Add support for address-taken basic blocks To make this work, pointers from the MachineBasicBlock to the LLVM-IR-level basic blocks need to be initialized, as the AsmPrinter uses this link to be able to print out labels for the basic blocks that are address-taken. Most of the changes in this commit are about adapting existing tests to include the basic block name that is now printed out in the MIR format, now that the name becomes available as the link to the LLVM-IR basic block is initialized. The relevant test change for the functionality added in this patch are the added "(address-taken)" strings in test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll. Differential Revision: https://reviews.llvm.org/D28123 llvm-svn: 291105	2017-01-05 13:27:52 +00:00
Kristof Beyls	eced071e88	[GlobalISel] Add support for switch statements This commit does this using a trivial chain of conditional branches. In the future, we probably want to reuse the optimized switch lowering used in SelectionDAG. Differential Revision: https://reviews.llvm.org/D28176 llvm-svn: 291099	2017-01-05 11:28:51 +00:00
Saleem Abdulrasool	6252bd8eac	MC: support passing search paths to the IAS This is needed to support inclusion in inline assembly via the `.include` directive. llvm-svn: 291085	2017-01-05 05:56:39 +00:00

... 10 11 12 13 14 ...

22929 Commits