llvm-project

Commit Graph

Author	SHA1	Message	Date
Mehdi Amini	a28d91d81b	DataLayout is mandatory, update the API to reflect it with references. Summary: Now that the DataLayout is a mandatory part of the module, let's start cleaning the codebase. This patch is a first attempt at doing that. This patch is not exactly NFC as for instance some places were passing a nullptr instead of the DataLayout, possibly just because there was a default value on the DataLayout argument to many functions in the API. Even though it is not purely NFC, there is no change in the validation. I turned as many pointer to DataLayout to references, this helped figuring out all the places where a nullptr could come up. I had initially a local version of this patch broken into over 30 independant, commits but some later commit were cleaning the API and touching part of the code modified in the previous commits, so it seemed cleaner without the intermediate state. Test Plan: Reviewers: echristo Subscribers: llvm-commits From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231740	2015-03-10 02:37:25 +00:00
Simon Pilgrim	8c58c066b7	[DAGCombiner] Add a shuffle mask commutation helper function. NFCI. We have an increasing number of cases where we are creating commuted shuffle masks - all implementing nearly the same code. This patch adds a static helper function - ShuffleVectorSDNode::commuteMask() and replaces a number of cases to use it. Differential Revision: http://reviews.llvm.org/D8139 llvm-svn: 231581	2015-03-07 22:33:11 +00:00
Benjamin Kramer	0b6742aeb5	Accidentaly inverted the condition again. Sorry. llvm-svn: 230973	2015-03-02 16:45:08 +00:00
Benjamin Kramer	f43de1879a	Avoid assertion in MSVC 2013 debug builds. llvm-svn: 230972	2015-03-02 16:42:56 +00:00
Benjamin Kramer	8008e9f624	Simplify code. NFC. llvm-svn: 230948	2015-03-02 11:57:04 +00:00
Sanjay Patel	ab7e86e5be	Canonicalize splats as build_vectors (PR22283) This is a follow-on patch to: http://reviews.llvm.org/D7093 That patch canonicalized constant splats as build_vectors, and this patch removes the constant check so we can canonicalize all splats as build_vectors. This fixes the 2nd test case in PR22283: http://llvm.org/bugs/show_bug.cgi?id=22283 The unfortunate code duplication between SelectionDAG and DAGCombiner is discussed in the earlier patch review. At least this patch is just removing code... This improves an existing x86 AVX test and changes codegen in an ARM test. Differential Revision: http://reviews.llvm.org/D7389 llvm-svn: 229511	2015-02-17 16:54:32 +00:00
Benjamin Kramer	6cd780ff21	Prefer SmallVector::append/insert over push_back loops. Same functionality, but hoists the vector growth out of the loop. llvm-svn: 229500	2015-02-17 15:29:18 +00:00
Chandler Carruth	1b5285dd57	[SDAG] Teach the SelectionDAG to canonicalize vector shuffles of splats directly into blends of the splats. These patterns show up even very late in the vector shuffle lowering where we don't have any chance for DAG combining to kick in, and blending is a tremendously simpler operation to model. By coercing the shuffle into a blend we can much more easily match and lower shuffles of splats. Immediately with this change there are significantly more blends being matched in the x86 vector shuffle lowering. llvm-svn: 229308	2015-02-15 12:18:12 +00:00
Duncan P. N. Exon Smith	70eb9c5ae5	CodeGen: Canonicalize access to function attributes, NFC Canonicalize access to function attributes to use the simpler API. getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind) => getFnAttribute(Kind) getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind) => hasFnAttribute(Kind) Also, add `Function::getFnStackAlignment()`, and canonicalize: getAttributes().getStackAlignment(AttributeSet::FunctionIndex) => getFnStackAlignment() llvm-svn: 229208	2015-02-14 01:44:41 +00:00
Manuel Jacob	6f508c578b	Add nullptr checks for TargetSelectionDAGInfo in SelectionDAG. TSI is not guaranteed be non-null in SelectionDAG. llvm-svn: 227397	2015-01-28 23:50:40 +00:00
Ahmed Bougacha	9a9e1a59ce	[SelectionDAG] Fix assert message copypasta. NFC. llvm-svn: 227119	2015-01-26 19:31:42 +00:00
Andrea Di Biagio	8381475a75	[DAG] Fix wrong canonicalization performed on shuffle nodes. This fixes a regression introduced by r226816. When replacing a splat shuffle node with a constant build_vector, make sure that the new build_vector has a valid number of elements. Thanks to Patrik Hagglund for reporting this problem and providing a small reproducible. llvm-svn: 227002	2015-01-24 11:54:29 +00:00
Mehdi Amini	5059813c2d	DAGCombine: always constant fold FMA when target disable FP exceptions Summary: When trying to constant fold an FMA in the DAG, getNode() fails to fold the FMA if an operand is not finite. In this case this patch allows the constant folding if !TLI->hasFloatingPointExceptions() Reviewers: resistor Reviewed By: resistor Subscribers: hfinkel, llvm-commits Differential Revision: http://reviews.llvm.org/D6912 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 226901	2015-01-23 07:07:20 +00:00
Jan Vesely	6269e3ca2f	SelectionDAG: Add KnownBits and SignBits computation for EXTRACT_ELEMENT v2: use getZExtValue add missing break codestyle v3: add few more comments Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com> llvm-svn: 226880	2015-01-22 23:42:41 +00:00
Michael Kuperstein	25e34d11f3	[DAGCombine] Produce better code for constant splats This solves PR22276. Splats of constants would sometimes produce redundant shuffles, sometimes ridiculously so (see the PR for details). Fold these shuffles into BUILD_VECTORs early on instead. Differential Revision: http://reviews.llvm.org/D7093 Fixed recommit of r226811. llvm-svn: 226816	2015-01-22 13:07:28 +00:00
Michael Kuperstein	ff74032018	Revert r226811, MSVC accepts code sane compilers don't. llvm-svn: 226814	2015-01-22 12:48:07 +00:00
Michael Kuperstein	84fad3e5c9	[DAGCombine] Produce better code for constant splats This solves PR22276. Splats of constants would sometimes produce redundant shuffles, sometimes ridiculously so (see the PR for details). Fold these shuffles into BUILD_VECTORs early on instead. Differential Revision: http://reviews.llvm.org/D7093 llvm-svn: 226811	2015-01-22 12:37:23 +00:00
Elena Demikhovsky	150d9f3187	Fixed a bug in type legalizer for masked load/store intrinsics. The problem occurs when after vectorization we have type <2 x i32>. This type is promoted to <2 x i64> and then requires additional efforts for expanding loads and truncating stores. I added EXPAND / TRUNCATE attributes to the masked load/store SDNodes. The code now contains additional shuffles. I've prepared changes in the cost estimation for masked memory operations, it will be submitted separately. llvm-svn: 226808	2015-01-22 12:07:59 +00:00
Matt Arsenault	bd22342322	Implement new way of expanding extloads. Now that the source and destination types can be specified, allow doing an expansion that doesn't use an EXTLOAD of the result type. Try to do a legal extload to an intermediate type and extend that if possible. This generalizes the special case custom lowering of extloads R600 has been using to work around this problem. This also happens to fix a bug that would incorrectly use more aligned loads than should be used. llvm-svn: 225925	2015-01-14 01:35:17 +00:00
Matthias Braun	f50ab43214	DAGCombiner: simplify by using condition variables; NFC llvm-svn: 225836	2015-01-13 22:17:46 +00:00
Sanjay Patel	7129c10cae	merge consecutive loads that are offset from a base address SelectionDAG::isConsecutiveLoad() was not detecting consecutive loads when the first load was offset from a base address. This patch recognizes that pattern and subtracts the offset before comparing the second load to see if it is consecutive. The codegen change in the new test case improves from: vmovsd 32(%rdi), %xmm0 vmovsd 48(%rdi), %xmm1 vmovhpd 56(%rdi), %xmm1, %xmm1 vmovhpd 40(%rdi), %xmm0, %xmm0 vinsertf128 $1, %xmm1, %ymm0, %ymm0 To: vmovups 32(%rdi), %ymm0 An existing test case is also improved from: vmovsd (%rdi), %xmm0 vmovsd 16(%rdi), %xmm1 vmovsd 24(%rdi), %xmm2 vunpcklpd %xmm2, %xmm0, %xmm0 ## xmm0 = xmm0[0],xmm2[0] vmovhpd 8(%rdi), %xmm1, %xmm3 To: vmovsd (%rdi), %xmm0 vmovsd 16(%rdi), %xmm1 vmovhpd 24(%rdi), %xmm0, %xmm0 vmovhpd 8(%rdi), %xmm1, %xmm1 This patch fixes PR21771 ( http://llvm.org/bugs/show_bug.cgi?id=21771 ). Differential Revision: http://reviews.llvm.org/D6642 llvm-svn: 224379	2014-12-16 21:57:18 +00:00
Owen Anderson	558012a3fc	Fix a few instances found in SelectionDAG where we were not handling F16 at parity with F32 and F64. llvm-svn: 223760	2014-12-09 06:50:39 +00:00
Elena Demikhovsky	f1de34b84d	Masked Load / Store Intrinsics - the CodeGen part. I'm recommiting the codegen part of the patch. The vectorizer part will be send to review again. Masked Vector Load and Store Intrinsics. Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores. Added SDNodes for masked operations and lowering patterns for X86 code generator. Examples: <16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align /, <16 x i1> %mask) declare void @llvm.masked.store.v8f64(i8 %addr, <8 x double> %value, i32 4, <8 x i1> %mask) Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch. http://reviews.llvm.org/D6191 llvm-svn: 223348	2014-12-04 09:40:44 +00:00
Duncan P. N. Exon Smith	9bc81fbe92	Revert "Masked Vector Load and Store Intrinsics." This reverts commit r222632 (and follow-up r222636), which caused a host of LNT failures on an internal bot. I'll respond to the commit on the list with a reproduction of one of the failures. Conflicts: lib/Target/X86/X86TargetTransformInfo.cpp llvm-svn: 222936	2014-11-28 21:29:14 +00:00
Elena Demikhovsky	9e5089a938	Masked Vector Load and Store Intrinsics. Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores. Added SDNodes for masked operations and lowering patterns for X86 code generator. Examples: <16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align /, <16 x i1> %mask) declare void @llvm.masked.store.v8f64(i8 %addr, <8 x double> %value, i32 4, <8 x i1> %mask) Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch. http://reviews.llvm.org/D6191 llvm-svn: 222632	2014-11-23 08:07:43 +00:00
Simon Pilgrim	3ac3b251a9	[X86][SSE] pslldq/psrldq byte shifts/rotation for SSE2 This patch builds on http://reviews.llvm.org/D5598 to perform byte rotation shuffles (lowerVectorShuffleAsByteRotate) on pre-SSSE3 (palignr) targets - pre-SSSE3 is only enabled on i8 and i16 vector targets where it is a more definite performance gain. I've also added a separate byte shift shuffle (lowerVectorShuffleAsByteShift) that makes use of the ability of the SLLDQ/SRLDQ instructions to implicitly shift in zero bytes to avoid the need to create a zero register if we had used palignr. Differential Revision: http://reviews.llvm.org/D5699 llvm-svn: 222340	2014-11-19 10:06:49 +00:00
David Blaikie	70573dcd9f	Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool> This is to be consistent with StringSet and ultimately with the standard library's associative container insert function. This lead to updating SmallSet::insert to return pair<iterator, bool>, and then to update SmallPtrSet::insert to return pair<iterator, bool>, and then to update all the existing users of those functions... llvm-svn: 222334	2014-11-19 07:49:26 +00:00
Frederic Riss	0f7abef2cf	Add an assert and a test that verify r221709's fix. llvm-svn: 221854	2014-11-13 03:20:23 +00:00
Frederic Riss	8ad4f498fb	Totally forget deallocated SDNodes in SDDbgInfo. What would happen before that commit is that the SDDbgValues associated with a deallocated SDNode would be marked Invalidated, but SDDbgInfo would keep a map entry keyed by the SDNode pointer pointing to this list of invalidated SDDbgNodes. As the memory gets reused, the list might get wrongly associated with another new SDNode. As the SDDbgValues are cloned when they are transfered, this can lead to an exponential number of SDDbgValues being produced during DAGCombine like in http://llvm.org/bugs/show_bug.cgi?id=20893 Note that the previous behavior wasn't really buggy as the invalidation made sure that the SDDbgValues won't be used. This commit can be considered a memory optimization and as such is really hard to validate in a unit-test. llvm-svn: 221709	2014-11-11 21:21:08 +00:00
Benjamin Kramer	7000ca3f55	Modernize old-style static asserts. NFC. llvm-svn: 219588	2014-10-12 17:56:40 +00:00
Eric Christopher	1e845f269b	Remove a bunch of getSubtargetImpl calls since we already have a cached TLI instance. llvm-svn: 219342	2014-10-08 21:08:32 +00:00
Eric Christopher	8d07f44560	Have the selection dag grab TargetLowering off of the subtarget inside init rather than have it passed in as an argument. llvm-svn: 219270	2014-10-08 01:57:58 +00:00
Eric Christopher	d9636c1dcf	Have SelectionDAG's subtarget TargetSelectionDAGInfo be set during init rather than construction time. llvm-svn: 219262	2014-10-08 00:32:59 +00:00
Adrian Prantl	87b7eb9d0f	Move the complex address expression out of DIVariable and into an extra argument of the llvm.dbg.declare/llvm.dbg.value intrinsics. Previously, DIVariable was a variable-length field that has an optional reference to a Metadata array consisting of a variable number of complex address expressions. In the case of OpPiece expressions this is wasting a lot of storage in IR, because when an aggregate type is, e.g., SROA'd into all of its n individual members, the IR will contain n copies of the DIVariable, all alike, only differing in the complex address reference at the end. By making the complex address into an extra argument of the dbg.value/dbg.declare intrinsics, all of the pieces can reference the same variable and the complex address expressions can be uniqued across the CU, too. Down the road, this will allow us to move other flags, such as "indirection" out of the DIVariable, too. The new intrinsics look like this: declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr) declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr) This patch adds a new LLVM-local tag to DIExpressions, so we can detect and pretty-print DIExpression metadata nodes. What this patch doesn't do: This patch does not touch the "Indirect" field in DIVariable; but moving that into the expression would be a natural next step. http://reviews.llvm.org/D4919 rdar://problem/17994491 Thanks to dblaikie and dexonsmith for reviewing this patch! Note: I accidentally committed a bogus older version of this patch previously. llvm-svn: 218787	2014-10-01 18:55:02 +00:00
Adrian Prantl	b458dc2eee	Revert r218778 while investigating buldbot breakage. "Move the complex address expression out of DIVariable and into an extra" llvm-svn: 218782	2014-10-01 18:10:54 +00:00
Adrian Prantl	25a7174e7a	Move the complex address expression out of DIVariable and into an extra argument of the llvm.dbg.declare/llvm.dbg.value intrinsics. Previously, DIVariable was a variable-length field that has an optional reference to a Metadata array consisting of a variable number of complex address expressions. In the case of OpPiece expressions this is wasting a lot of storage in IR, because when an aggregate type is, e.g., SROA'd into all of its n individual members, the IR will contain n copies of the DIVariable, all alike, only differing in the complex address reference at the end. By making the complex address into an extra argument of the dbg.value/dbg.declare intrinsics, all of the pieces can reference the same variable and the complex address expressions can be uniqued across the CU, too. Down the road, this will allow us to move other flags, such as "indirection" out of the DIVariable, too. The new intrinsics look like this: declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr) declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr) This patch adds a new LLVM-local tag to DIExpressions, so we can detect and pretty-print DIExpression metadata nodes. What this patch doesn't do: This patch does not touch the "Indirect" field in DIVariable; but moving that into the expression would be a natural next step. http://reviews.llvm.org/D4919 rdar://problem/17994491 Thanks to dblaikie and dexonsmith for reviewing this patch! llvm-svn: 218778	2014-10-01 17:55:39 +00:00
Job Noorman	9b31bd6bb0	Do not assume the value passed to memset is an i32. The code in SelectionDAG::getMemset for some reason assumes the value passed to memset is an i32. This breaks the generated code for targets that only have registers smaller than 32 bits because the value might get split into multiple registers by the calling convention. See the test for the MSP430 target included in the patch for an example. This patch ensures that nothing is assumed about the type of the value. Instead, the type is taken from the selected overload of the llvm.memset intrinsic. llvm-svn: 216716	2014-08-29 08:23:53 +00:00
Craig Topper	e1d1294853	Simplify creation of a bunch of ArrayRefs by using None, makeArrayRef or just letting them be implicitly created. llvm-svn: 216525	2014-08-27 05:25:25 +00:00
Craig Topper	4627679cec	Use range based for loops to avoid needing to re-mention SmallPtrSet size. llvm-svn: 216351	2014-08-24 23:23:06 +00:00
Craig Topper	71b7b68b74	Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size. llvm-svn: 216158	2014-08-21 05:55:13 +00:00
Alexey Samsonov	e229ec5bfc	Fix null reference creation in SelectionDAG constructor. Store TargetSelectionDAGInfo as a pointer instead of a reference: getSelectionDAGInfo() may not be implemented for certain backends (e.g. it's not currently implemented for R600). This bug is reported by UBSan. llvm-svn: 216129	2014-08-20 21:40:15 +00:00
Craig Topper	6230691c91	Revert "Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size." Getting a weird buildbot failure that I need to investigate. llvm-svn: 215870	2014-08-18 00:24:38 +00:00
Craig Topper	5229cfd163	Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size. llvm-svn: 215868	2014-08-17 23:47:00 +00:00
Hal Finkel	46ef7ce283	[PowerPC] Implement PPCTargetLowering::getTgtMemIntrinsic This implements PPCTargetLowering::getTgtMemIntrinsic for Altivec load/store intrinsics. As with the construction of the MachineMemOperands for the intrinsic calls used for unaligned load/store lowering, the only slight complication is that we need to represent a larger memory range than the loaded/stored value-type size (because the address is rounded down to an aligned address, and we need to conservatively represent the entire possible range of the actual access). This required adding an extra size field to TargetLowering::IntrinsicInfo, and this was done in a way that required no modifications to other targets (the size defaults to the store size of the provided memory data type). This fixes test/CodeGen/PowerPC/unal-altivec-wint.ll (so it can be un-XFAILed). llvm-svn: 215512	2014-08-13 01:15:40 +00:00
Pedro Artigas	caa565887d	Added a TLI hook to signal that the target does not have or does not care about floating point exceptions, added use of flag to fold potentially exception raising floating point math in selection DAG. No functionality change, as targets have to explicitly ask for this behavior and none does today. llvm-svn: 215222	2014-08-08 16:46:53 +00:00
Eric Christopher	fc6de428c8	Have MachineFunction cache a pointer to the subtarget to make lookups shorter/easier and have the DAG use that to do the same lookup. This can be used in the future for TargetMachine based caching lookups from the MachineFunction easily. Update the MIPS subtarget switching machinery to update this pointer at the same time it runs. llvm-svn: 214838	2014-08-05 02:39:49 +00:00
Eric Christopher	d913448b38	Remove the TargetMachine forwards for TargetSubtargetInfo based information and update all callers. No functional change. llvm-svn: 214781	2014-08-04 21:25:23 +00:00
Chandler Carruth	356665a36c	[SDAG] MorphNodeTo recursively deletes dead operands of the old fromulation of the node, which isn't really the desired behavior from within the combiner or legalizer, but is necessary within ISel. I've added a hopefully helpful comment and fixed the only two places where this took place. Yet another step toward the combiner and legalizer not needing to use update listeners with virtual calls to manage the worklists behind legalization and combining. llvm-svn: 214574	2014-08-01 22:09:43 +00:00
Hal Finkel	b6d0d6b263	[PowerPC] Generate unaligned vector loads using intrinsics instead of regular loads Altivec vector loads on PowerPC have an interesting property: They always load from an aligned address (by rounding down the address actually provided if necessary). In order to generate an actual unaligned load, you can generate two load instructions, one with the original address, one offset by one vector length, and use a special permutation to extract the bytes desired. When this was originally implemented, I generated these two loads using regular ISD::LOAD nodes, now marked as aligned. Unfortunately, there is a problem with this: The alignment of a load does not contribute to its identity, and SDNodes are uniqued. So, imagine that we have some unaligned load, L1, that is not aligned. The routine will create two loads, L1(aligned) and (L1+16)(aligned). Further imagine that there had already existed a load (L1+16)(unaligned) with the same chain operand as the load L1. When (L1+16)(aligned) is created as part of the lowering of L1, this load is also the (L1+16)(unaligned) node, just now marked as aligned (because the new alignment overwrites the old). But the original users of (L1+16)(unaligned) now get the data intended for the permutation yielding the data for L1, and (L1+16)(unaligned) no longer exists to get its own permutation-based expansion. This was PR19991. A second potential problem has to do with the MMOs on these loads, which can be used by AA during instruction scheduling to break chain-based dependencies. If the new "aligned" loads get the MMO from the original unaligned load, this does not represent the fact that it will load data from below the original address. Normally, this would not matter, but this load might be combined with another load pair for a previous vector, and then the dependency on the otherwise- ignored lower bytes can matter. To fix both problems, instead of generating the necessary loads using regular ISD::LOAD instructions, ppc_altivec_lvx intrinsics are used instead. These are provided with MMOs with a conservative address range. Unfortunately, I no longer have a failing test case (since PR19991 was reported, other changes in CodeGen have forced this bug back into hiding it again). Nevertheless, this should fix the underlying problem. llvm-svn: 214481	2014-08-01 05:20:41 +00:00
Louis Gerbarg	67474e3755	Make sure no loads resulting from load->switch DAGCombine are marked invariant Currently when DAGCombine converts loads feeding a switch into a switch of addresses feeding a load the new load inherits the isInvariant flag of the left side. This is incorrect since invariant loads can be reordered in cases where it is illegal to reoarder normal loads. This patch adds an isInvariant parameter to getExtLoad() and updates all call sites to pass in the data if they have it or false if they don't. It also changes the DAGCombine to use that data to make the right decision when creating the new load. llvm-svn: 214449	2014-07-31 21:45:05 +00:00

1 2 3 4 5 ...

1320 Commits