llvm-project

Commit Graph

Author	SHA1	Message	Date
Keno Fischer	277bfaefaf	Initialize BasicAAWrapperPass in it's constructor Summary: This idiom is used elsewhere in LLVM, but was overlooked here. Reviewers: chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13628 llvm-svn: 251348	2015-10-26 21:22:58 +00:00
Alexey Samsonov	ff8a80b477	Fix build failure on GCC 4.7 (old libstdc++ doesn't have std::map::emplace). llvm-svn: 251347	2015-10-26 21:20:37 +00:00
David Blaikie	efbb29153e	Remove use of std::map<>::emplace which is not supported on some older versions of libstdc++ llvm-svn: 251346	2015-10-26 21:10:36 +00:00
Diego Novillo	e822b63681	Remove unused local variable. NFC. llvm-svn: 251344	2015-10-26 20:50:26 +00:00
Peter Collingbourne	99fac80db2	ARM/ELF: Restore original (pre-r251322) logic for deciding whether to use GOT. Unbreaks linking with gold, which cannot resolve direct relocations referring to global symbols. llvm-svn: 251342	2015-10-26 20:46:44 +00:00
Alexey Samsonov	f3ecfd3af4	[LLVMSymbolize] Use symbol table only if function linkage name was requested. Now it's enough to just specify -functions=short without additionally providing -use-symbol-table=false. llvm-svn: 251339	2015-10-26 20:12:29 +00:00
Alexey Samsonov	1d3f3271ac	Fix build error by fully qualifying llvm::make_unique. llvm-svn: 251338	2015-10-26 20:12:27 +00:00
Rui Ueyama	df94852a60	Optimize StringTableBuilder. This is a patch to improve StringTableBuilder's performance. That class' finalize function is very hot particularly in LLD because the function does tail-merge strings in string tables or SHF_MERGE sections. Generic std::sort-style sorter is not efficient for sorting strings. The function implemented in this patch seems to be more efficient. Here's a benchmark of LLD to link Clang with or without this patch. The numbers are medians of 50 runs. -O0 real 0m0.455s real 0m0.430s (5.5% faster) -O3 real 0m0.487s real 0m0.452s (7.2% faster) Since that is a benchmark of the whole linker, the speedup of StringTableBuilder itself is much more than that. http://reviews.llvm.org/D14053 llvm-svn: 251337	2015-10-26 19:58:29 +00:00
Alexey Samsonov	7a952e53f9	[LLVMSymbolize] Use std::unique_ptr more extensively to clarify ownership. llvm-svn: 251336	2015-10-26 19:41:23 +00:00
Igor Laevsky	1ef06559f4	[RS4GC] Strip noalias attribute after statepoint rewrite We should remove noalias along with dereference and dereference_or_null attributes because statepoint could potentially touch the entire heap including noalias objects. Differential Revision: http://reviews.llvm.org/D14032 llvm-svn: 251333	2015-10-26 19:06:01 +00:00
Diego Novillo	7963ea1996	SamplePGO - Add optimization reports. This adds a couple of optimization remarks to the SamplePGO transformation. When it decides to inline a hot function (to mimic the inline stack and repeat useful inline decisions in the original build). It will also report branch destinations. For instance, given the code fragment: 6 if (i < 1000) 7 sum -= i; 8 else 9 sum += -i * rand(); If the 'else' branch is taken most of the time, building this code with -Rpass=sample-profile will produce: a.cc:9:14: remark: most popular destination for conditional branches at small.cc:6:9 [-Rpass=sample-profile] sum += -i * rand(); ^ llvm-svn: 251330	2015-10-26 18:52:53 +00:00
David Blaikie	7b54b525cd	Remove assert(false) in favor of asserting the if conditional it is contained within. Also adjust the code to avoid 3 redundant map lookups. llvm-svn: 251327	2015-10-26 18:41:13 +00:00
David Blaikie	94c83370b5	Move the canonical header to the top of its matching cpp file as per coding convention This ensures that the header will be verified to be standalone (and avoid mistakes like the one fixed in r251178) llvm-svn: 251326	2015-10-26 18:40:56 +00:00
Mehdi Amini	5d303285b9	Add an (optional) identification block in the bitcode Processing bitcode from a different LLVM version can lead to unexpected behavior. The LLVM project guarantees autoupdating bitcode from a previous minor revision for the same major, but can't make any promise when reading bitcode generated from a either a non-released LLVM, a vendor toolchain, or a "future" LLVM release. This patch aims at being more user-friendly and allows a bitcode produce to emit an optional block at the beginning of the bitcode that will contains an opaque string intended to describe the bitcode producer information. The bitcode reader will dump this information alongside any error it reports. The optional block also includes an "epoch" number, monotonically increasing when incompatible changes are made to the bitcode. The reader will reject bitcode whose epoch is different from the one expected. Differential Revision: http://reviews.llvm.org/D13666 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 251325	2015-10-26 18:37:00 +00:00
Evgeniy Stepanov	d1aad26589	[safestack] Fast access to the unsafe stack pointer on AArch64/Android. Android libc provides a fixed TLS slot for the unsafe stack pointer, and this change implements direct access to that slot on AArch64 via __builtin_thread_pointer() + offset. This change also moves more code into TargetLowering and its target-specific subclasses to get rid of target-specific codegen in SafeStackPass. This change does not touch the ARM backend because ARM lowers builting_thread_pointer as aeabi_read_tp, which is not available on Android. The previous iteration of this change was reverted in r250461. This version leaves the generic, compiler-rt based implementation in SafeStack.cpp instead of moving it to TargetLoweringBase in order to allow testing without a TargetMachine. llvm-svn: 251324	2015-10-26 18:28:25 +00:00
Peter Collingbourne	97aae40880	ARM/ELF: Better codegen for global variable addresses. In PIC mode we were previously computing global variable addresses (or GOT entry addresses) by adding the PC, the PC-relative GOT displacement and the GOT-relative symbol/GOT entry displacement. Because the latter two displacements are fixed, we ended up performing one more addition than necessary. This change causes us to compute addresses using a single PC-relative displacement, resulting in a shorter code sequence. This reduces code size by about 4% in a recent build of Chromium for Android. As a result of this change we no longer need to compute the GOT base address in the ARM backend, which allows us to remove the Global Base Reg pass and SDAG lowering for the GOT. We also now no longer use the GOT when addressing a symbol which is known to be defined in the same linkage unit. Specifically, the symbol must have either hidden visibility or a strong definition in the current module in order to not use the the GOT. This is a change from the previous behaviour where we would use the GOT to address externally visible symbols defined in the same module. I think the only cases where this could matter are cases involving symbol interposition, but we don't really support that well anyway. Differential Revision: http://reviews.llvm.org/D13650 llvm-svn: 251322	2015-10-26 18:23:16 +00:00
Alexey Samsonov	145b0fd2a0	Refactor: Simplify boolean conditional return statements in lib/Transforms/Instrumentation Summary: Use clang-tidy to simplify boolean conditional return statements. Differential Revision: http://reviews.llvm.org/D9996 Patch by Richard (legalize@xmission.com)! llvm-svn: 251318	2015-10-26 18:06:40 +00:00
Cong Hou	fff8ccf579	Check the case that the numerator and denominator are both zeros when getting edge probabilities in BPI and return 100% in this case. This issue is triggered in PGO mode when bootstrapping LLVM. It seems that it is not guaranteed that edge weights are always greater than zero which are read from profile data. llvm-svn: 251317	2015-10-26 18:00:17 +00:00
Alexey Samsonov	57f8837ada	Move parts of llvm-symbolizer tool into LLVMSymbolize library. Summary: See http://lists.llvm.org/pipermail/llvm-dev/2015-October/091624.html Reviewers: echristo Subscribers: llvm-commits, aizatsky Differential Revision: http://reviews.llvm.org/D13998 llvm-svn: 251316	2015-10-26 17:56:12 +00:00
Jonas Paulsson	83553d0cac	[SystemZ] LTGFR use regclass should be GR32, not GR64. Discovered by testing int-cmp-44.ll with -verify-machineinstrs (added to test run). llvm-svn: 251299	2015-10-26 15:03:49 +00:00
Jonas Paulsson	7da3820882	[SystemZ] Also clear kill flag for index reg in splitMove(). Discovered by running fp-move-05.ll with -verify-machineinstrs (added to test case run). llvm-svn: 251298	2015-10-26 15:03:41 +00:00
Jonas Paulsson	9525b2c0c8	[SystemZ] Don't forget the CC def op on LTEBRCompare pseudos Discovered by running fp-cmp-02.ll with -verify-machineinstrs (now added to test run). llvm-svn: 251297	2015-10-26 15:03:32 +00:00
Jonas Paulsson	dab7407258	[SystemZ] Tie operands in SystemZShorteInst if MI becomes 2-address. Discovered by testing fp-add-02.ll with -verify-machineinstrs. Test case updated to always run with -verify-machineinstrs. llvm-svn: 251296	2015-10-26 15:03:07 +00:00
Vasileios Kalintiris	165121f326	[mips] Check for the correct error message in tests for interrupt attributes. Instead of XFAIL-ing the tests with the wrong usage of the "interrupt" attribute, we should check that we emit the correct error messages to the user. llvm-svn: 251295	2015-10-26 14:24:30 +00:00
James Molloy	493e57de01	[ValueTracking] Extend r251146 to catch a fairly common case Even though we may not know the value of the shifter operand, it's possible we know the shifter operand is non-zero. This can allow us to infer more known bits - for example: %1 = load %p !range {1, 5} %2 = shl %q, %1 We don't know %1, but we do know that it is nonzero so %2[0] is known zero, and importantly %2 is known non-zero. Calling isKnownNonZero is nontrivially expensive so use an Optional to run it lazily and cache its result. llvm-svn: 251294	2015-10-26 14:10:46 +00:00
Elena Demikhovsky	7a77149391	Loop Vectorizer - skipping "bitcast" before GEP Vectorization of memory instruction (Load/Store) is possible when the pointer is coming from GEP. The GEP analysis allows to estimate the profit. In some cases we have a "bitcast" between GEP and memory instruction. I added code that skips the "bitcast". http://reviews.llvm.org/D13886 llvm-svn: 251291	2015-10-26 13:42:41 +00:00
Igor Breger	e4ddc3f4cd	AVX512: Enabled VPBROADCASTB lowering for v64i8 vectors. Differential Revision: http://reviews.llvm.org/D13896 llvm-svn: 251287	2015-10-26 13:01:02 +00:00
Vasileios Kalintiris	43dff0c033	[mips] Interrupt attribute support for mips32r2+. Summary: This patch adds support for using the "interrupt" attribute on Mips for interrupt handling functions. At this time only mips32r2+ with the o32 ABI with the static relocation model is supported. Unsupported configurations will be rejected Patch by Simon Dardis (+ clang-format & some trivial changes to follow the LLVM coding standards by me). Reviewers: mpf, dsanders Subscribers: dsanders, vkalintiris, llvm-commits Differential Revision: http://reviews.llvm.org/D10768 llvm-svn: 251286	2015-10-26 12:38:43 +00:00
Igor Breger	684af8156c	AVX-512: Use correct extract vector length. Bug https://llvm.org/bugs/show_bug.cgi?id=25318 Differential Revision: http://reviews.llvm.org/D14062 llvm-svn: 251285	2015-10-26 12:26:34 +00:00
Silviu Baranga	b892e35520	[InstCombine] Teach instcombine not to create extra PHI nodes when folding GEPs Summary: InstCombine tries to transform GEP(PHI(GEP1, GEP2, ..)) into GEP(GEP(PHI(...)) when possible. However, this may leave the old PHI node around. Even if we do end up folding the GEPs, having an extra PHI node might not be beneficial. This change makes the transformation more conservative. We now only do this if the PHI has only one use, and can therefore be removed after the transformation. Reviewers: jmolloy, majnemer Subscribers: mcrosier, mssimpso, llvm-commits Differential Revision: http://reviews.llvm.org/D13887 llvm-svn: 251281	2015-10-26 10:25:05 +00:00
James Molloy	72222f5dca	[ARM] Handle the inline asm constraint type 'o' This means "memory with offset" and requires very little plumbing to get working. This fixes PR25317. llvm-svn: 251280	2015-10-26 10:04:52 +00:00
Benjamin Kramer	8604457f2e	Drop code after unreachable. No functionality change. llvm-svn: 251278	2015-10-26 09:55:45 +00:00
Igor Breger	f8e461f920	AVX512: Add AVX-512 not materializable instructions. Otherwise value can be reused , despite its value could be changed - produces incorrect assembler. https://llvm.org/bugs/show_bug.cgi?id=25270 Differential Revision: http://reviews.llvm.org/D14057 llvm-svn: 251275	2015-10-26 08:37:12 +00:00
Lang Hames	4df7ba7a16	[Orc] Add license header to OrcTargetSupport. llvm-svn: 251274	2015-10-26 06:40:28 +00:00
David Majnemer	0993e0b8a1	[MC] Add support for GNU as-compatible binary operator precedence GNU as and Darwin give the various binary operators different precedence. LLVM's MC supported the Darwin semantics but not the GNU semantics. This fixes PR25311. llvm-svn: 251271	2015-10-26 03:15:34 +00:00
David Majnemer	a375b26144	[MC] Don't crash when .word is given bogus values We didn't validate that the .word directive was given a sane value, leading to crashes when we attempt to write out the object file. Instead, perform some validation and issue a diagnostic pointing at the start of the diagnostic. llvm-svn: 251270	2015-10-26 02:45:50 +00:00
Benjamin Kramer	8ceb323bb4	Convert assert(false) into llvm_unreachable where it makes sense. llvm-svn: 251266	2015-10-25 22:28:27 +00:00
Davide Italiano	f04d89bdb4	[ScalarEvolution] Throw away dead code. llvm-svn: 251256	2015-10-25 20:00:49 +00:00
Davide Italiano	2071f4cc5a	[ScalarEvolution] Get rid of NDEBUG in header (correctly this time). llvm-svn: 251255	2015-10-25 19:55:24 +00:00
Sanjoy Das	15c4c4604f	[LCSSA] Unbreak build, don't reuse L; NFC The build broke in r251248. llvm-svn: 251251	2015-10-25 19:27:17 +00:00
Davide Italiano	0c34243ac1	[ScalarEvolution] Get rid of NDEBUG in header. llvm-svn: 251249	2015-10-25 19:13:36 +00:00
Sanjoy Das	331521c688	[LCSSA] Use range for loops; NFC llvm-svn: 251248	2015-10-25 19:08:32 +00:00
Simon Pilgrim	ec6db262e0	[X86][SSE4A] Fix for EXTRQI shuffle lowering. Incorrect range test - found during fuzz testing. llvm-svn: 251245	2015-10-25 17:40:54 +00:00
Elena Demikhovsky	092858588a	Scalarizer for masked.gather and masked.scatter intrinsics. When the target does not support these intrinsics they should be converted to a chain of scalar load or store operations. If the mask is not constant, the scalarizer will build a chain of conditional basic blocks. I added isLegalMaskedGather() isLegalMaskedScatter() APIs. Differential Revision: http://reviews.llvm.org/D13722 llvm-svn: 251237	2015-10-25 15:37:55 +00:00
Michael Kuperstein	eaa16005af	[X86] Use correct calling convention for MCU psABI libcalls When using the MCU psABI, compiler-generated library calls should pass some parameters in-register. However, since inreg marking for x86 is currently done by the front end, it will not be applied to backend-generated calls. This is a workaround for PR3997, which describes a similar issue for -mregparm. Differential Revision: http://reviews.llvm.org/D13977 llvm-svn: 251223	2015-10-25 08:14:05 +00:00
Michael Kuperstein	fe897623f3	[X86] Add support for elfiamcu triple This adds support for the i?86-*-elfiamcu triple, which indicates the IAMCU psABI is used. Differential Revision: http://reviews.llvm.org/D13977 llvm-svn: 251222	2015-10-25 08:07:37 +00:00
Craig Topper	eda02a905e	Remove two unnecessary conversions from MVT to EVT. NFC llvm-svn: 251219	2015-10-25 03:15:29 +00:00
Craig Topper	7bf52c9d26	Use MVT::SimpleValueType instead of MVT in template parameter. NFC llvm-svn: 251217	2015-10-25 00:27:14 +00:00
Rafael Espindola	84921b9860	Refactor: Simplify boolean conditional return statements in lib/CodeGen. Patch by Richard. llvm-svn: 251213	2015-10-24 23:11:13 +00:00
Simon Pilgrim	53c2bff5fe	[X86][SSE] Use lowerVectorShuffleWithUNPCK instead of custom matches. Most 128-bit and 256-bit shuffles were manually matching UNPCK patterns - use lowerVectorShuffleWithUNPCK to be more thorough. llvm-svn: 251211	2015-10-24 22:45:04 +00:00
Simon Pilgrim	fdfed5143c	[X86][SSE] lowerVectorShuffleWithUNPCK - use equivalent shuffle mask test. Use isShuffleEquivalent to match UNPCK shuffles - better support for build vector inputs. llvm-svn: 251207	2015-10-24 20:48:08 +00:00
Michael Zolotukhin	1eeb2da7d4	Refactor: Simplify boolean conditional return statements in lib/Transforms/Vectorize (NFC). Summary: Use clang-tidy to simplify boolean conditional return statements Differential Revision: http://reviews.llvm.org/D10003 Patch by Richard<legalize@xmission.com> llvm-svn: 251206	2015-10-24 20:16:42 +00:00
Simon Pilgrim	3448cbcc51	[DAGCombiner] Tidy up ConstantFP commutation. NFCI Move ConstantFP canonicalization of commutative instructions to start of 2-op node creation (matches integer) - simplifies constant folding code. llvm-svn: 251203	2015-10-24 20:06:18 +00:00
Benjamin Kramer	5611561e99	Use all_of to simplify control flow. NFC. llvm-svn: 251202	2015-10-24 19:30:37 +00:00
Yaron Keren	57fa135b40	Add libuuid to required system libraries list for mingw. This list is produced by llvm-config --system-libs to be used by external programs using the llvm libraries, such as creduce. In r250501 llvm/Support/Windows/Path.inc started to use the constant FOLDERID_Profile from libuuid. llvm-svn: 251201	2015-10-24 19:27:28 +00:00
Benjamin Kramer	74b6d3b967	Use find_if to simplify control flow. NFC. llvm-svn: 251200	2015-10-24 19:03:15 +00:00
Simon Pilgrim	7430804fe1	[DAGCombiner] Generalize masking of constant rotates. We don't need a mask of a rotation result to be a constant splat - any constant scalar/vector can be usefully folded. Followup to D13851. llvm-svn: 251197	2015-10-24 18:44:52 +00:00
Craig Topper	272d6a57bb	Call the version of ConvertCostTableLookup that takes a statically sized array rather than pointer and size. NFC llvm-svn: 251196	2015-10-24 18:40:22 +00:00
Hans Wennborg	34d40434a7	X86ISelLowering: Support tail calls to/from callee pop functions This enables tail calls with thiscall, stdcall, vectorcall and fastcall functions. Differential Revision: http://reviews.llvm.org/D13999 llvm-svn: 251190	2015-10-24 16:47:10 +00:00
Simon Pilgrim	e379fe0ddb	Fix unused variable warning. NFC. llvm-svn: 251189	2015-10-24 13:41:45 +00:00
Simon Pilgrim	d5ef318b5b	[X86][XOP] Add support for lowering vector rotations This patch adds support for lowering to the XOP VPROT / VPROTI vector bit rotation instructions. This has required changes to the DAGCombiner rotation pattern matching to support vector types - so far I've only changed it to support splat vectors, but generalising this further is feasible in the future. Differential Revision: http://reviews.llvm.org/D13851 llvm-svn: 251188	2015-10-24 13:17:26 +00:00
Benjamin Kramer	7ecf8c22cf	[TblGen] ArrayRefize TGParser. No functional change intended. llvm-svn: 251186	2015-10-24 12:46:45 +00:00
Benjamin Kramer	557b601b08	[BasicAliasAnalysis] Simplify expression, no functional change. (-1) - x + 1 is the same as -x. llvm-svn: 251185	2015-10-24 11:38:01 +00:00
NAKAMURA Takumi	26c3872666	ScalarReplAggregates.cpp: Try to appease clash of anonymous::SROA in modules build. llvm-svn: 251181	2015-10-24 06:42:42 +00:00
Sanjoy Das	a7e13782f1	Extract out getConstantRangeFromMetadata; NFC The loop idiom creating a ConstantRange is repeated twice in the codebase, time to give it a name and a home. The loop is also repeated in `rangeMetadataExcludesValue`, but using `getConstantRangeFromMetadata` there would not be an NFC -- the range returned by `getConstantRangeFromMetadata` may contain a value that none of the subranges did. llvm-svn: 251180	2015-10-24 05:37:35 +00:00
Sanjoy Das	bb5ffc50b7	Fix whitespace issues in two places; NFC llvm-svn: 251179	2015-10-24 05:37:28 +00:00
Kostya Serebryany	9cc3b0ddb6	[libFuzzer] add -merge flag to merge corpora llvm-svn: 251168	2015-10-24 01:16:40 +00:00
Matt Arsenault	2ea0a23f18	AMDGPU: Print modifiers when dumping AMDGPUOperand llvm-svn: 251160	2015-10-24 00:12:56 +00:00
Igor Laevsky	dde0029a25	[RS4GC] Rename stripDereferenceabilityInfo into stripNonValidAttributes. llvm-svn: 251157	2015-10-23 22:42:44 +00:00
Rafael Espindola	21956e4007	Add a RAW mode to StringTableBuilder. In this mode it just tries to tail merge the strings without imposing any other format constrains. It will not, for example, add a null byte between them. Also add support for keeping a tentative size and offset if we decide to not optimize after all. This will be used shortly in lld for merging SHF_STRINGS sections. llvm-svn: 251153	2015-10-23 21:48:05 +00:00
Chen Li	7009cd3554	Revert rL251061 [SimplifyCFG] Extend SimplifyResume to handle phi of trivial landing pad. llvm-svn: 251149	2015-10-23 21:13:01 +00:00
Hal Finkel	f2199b2178	Handle non-constant shifts in computeKnownBits, and use computeKnownBits for constant folding in InstCombine/Simplify First, the motivation: LLVM currently does not realize that: ((2072 >> (L == 0)) >> 7) & 1 == 0 where L is some arbitrary value. Whether you right-shift 2072 by 7 or by 8, the lowest-order bit is always zero. There are obviously several ways to go about fixing this, but the generic solution pursued in this patch is to teach computeKnownBits something about shifts by a non-constant amount. Previously, we would give up completely on these. Instead, in cases where we know something about the low-order bits of the shift-amount operand, we can combine (and together) the associated restrictions for all shift amounts consistent with that knowledge. As a further generalization, I refactored all of the logic for all three kinds of shifts to have this capability. This works well in the above case, for example, because the dynamic shift amount can only be 0 or 1, and thus we can say a lot about the known bits of the result. This brings us to the second part of this change: Even when we know all of the bits of a value via computeKnownBits, nothing used to constant-fold the result. This introduces the necessary code into InstCombine and InstSimplify. I've added it into both because: 1. InstCombine won't automatically pick up the associated logic in InstSimplify (InstCombine uses InstSimplify, but not via the API that passes in the original instruction). 2. Putting the logic in InstCombine allows the resulting simplifications to become part of the iterative worklist 3. Putting the logic in InstSimplify allows the resulting simplifications to be used by everywhere else that calls SimplifyInstruction (inlining, unrolling, and many others). And this requires a small change to our definition of an ephemeral value so that we don't break the rest case from r246696 (where the icmp feeding the @llvm.assume, is also feeding a br). Under the old definition, the icmp would not be considered ephemeral (because it is used by the br), but this causes the assume to remove itself (in addition to simplifying the branch structure), and it seems more-useful to prevent that from happening. llvm-svn: 251146	2015-10-23 20:37:08 +00:00
Tim Northover	d4f55c0b1b	GVN: don't try to replace instruction with itself. After some look-ahead PRE was added for GEPs, an instruction could end up in the table of candidates before it was actually inspected. When this happened the pass might decide it was the best candidate to replace itself. This didn't go well. Should fix PR25291 llvm-svn: 251145	2015-10-23 20:30:02 +00:00
Rafael Espindola	a9b3944c0e	Fix the variable names to match the LLVM style. llvm-svn: 251143	2015-10-23 20:15:35 +00:00
Sanjoy Das	52f7b08b4a	[SCEV] Fix stylistic issue in MatchBinaryAddToConst; NFCI Instead of checking `(FlagsPresent & ExpectedFlags) != 0`, check `(FlagsPresent & ExpectedFlags) == ExpectedFlags`. Right now they're equivalent since `ExpectedFlags` can only be either `FlagNUW` or `FlagNSW`, but if we ever pass in `ExpectedFlags` as `FlagNUW \| FlagNSW` then checking `(FlagsPresent & ExpectedFlags) != 0` would be wrong. llvm-svn: 251142	2015-10-23 20:09:57 +00:00
Sanjoy Das	0a1bee8a80	[Inliner] Don't inline through callsites with operand bundles Summary: This change teaches the LLVM inliner to not inline through callsites with unknown operand bundles. Currently all operand bundles are "unknown" operand bundles but in the near future we will add support for inlining through some select kinds of operand bundles. Reviewers: reames, chandlerc, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14001 llvm-svn: 251141	2015-10-23 20:09:55 +00:00
Reid Kleckner	f02e33ce42	[X86] Clean up the tail call eligibility logic Summary: The logic here isn't straightforward because our support for TargetOptions::GuaranteedTailCallOpt. Also fix a bug where we were allowing tail calls to cdecl functions from fastcall and vectorcall functions. We were special casing thiscall and stdcall callers rather than checking for any convention that requires clearing stack arguments before returning. Reviewers: hans Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14024 llvm-svn: 251137	2015-10-23 19:35:38 +00:00
Lang Hames	3fef117ba5	[RuntimeDyld][COFF] Fix a think-o in the handling of the IMAGE_REL_AMD64_ADDR64 relocation that was introduced in r250733. llvm-svn: 251135	2015-10-23 18:46:43 +00:00
Kostya Serebryany	94660b3c36	[libFuzzer] remove some old code; also make __sanitizer_get_total_unique_caller_callee_pairs weak so that newer libFuzzer works with older asan llvm-svn: 251133	2015-10-23 18:37:58 +00:00
Matt Arsenault	382557ec72	AMDGPU: Fix parsing of 32-bit literals with sign bit set llvm-svn: 251132	2015-10-23 18:07:58 +00:00
Artyom Skrobov	5a6e39454e	[ARM] Renaming +t2dsp feature into +dsp, as discussed on llvm-dev llvm-svn: 251125	2015-10-23 17:19:19 +00:00
Oleg Ranevskyy	6389dd9fa2	[ARM CodeGen] @llvm.debugtrap call may be removed when restoring callee saved registers Summary: When ARMFrameLowering::emitPopInst generates a "pop" instruction to restore the callee saved registers, it checks if the LR register is among them. If so, the function may decide to remove the basic block's terminator and replace it with a "pop" to the PC register instead of LR. This leads to a problem when the block's terminator is preceded by a "llvm.debugtrap" call. The MI iterator points to the trap in such a case, which is also a terminator. If the function decides to restore LR to PC, it erroneously removes the trap. Reviewers: asl, rengolin Subscribers: aemerson, jfb, rengolin, dschuff, llvm-commits Differential Revision: http://reviews.llvm.org/D13672 llvm-svn: 251123	2015-10-23 17:17:59 +00:00
Oleg Ranevskyy	5f78c5c293	Test commit: fix typo in comment. llvm-svn: 251122	2015-10-23 17:10:44 +00:00
Joseph Tremoulet	3d0fbf1d74	[CodeGen] Mark setjmp/catchret MBBs address-taken Summary: This ensures that BranchFolding (and similar) won't remove these blocks. Also allow AsmPrinter::EmitBasicBlockStart to process MBBs which are address-taken but do not have BBs that are address-taken, since otherwise its call to getAddrLabelSymbolTableToEmit would fail an assertion on such blocks. I audited the other callers of getAddrLabelSymbolTableToEmit (and getAddrLabelSymbol); they all have BBs known to be address-taken except for the call through getAddrLabelSymbol from WinException::create32bitRef; that call is actually now unreachable, so I've removed it and updated the signature of create32bitRef. This fixes PR25168. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: pgavlin, llvm-commits Differential Revision: http://reviews.llvm.org/D13774 llvm-svn: 251113	2015-10-23 15:06:05 +00:00
James Molloy	05a896a8d1	[BasicAA] Bugfix for r251016 If the loaded type sizes don't match the element type of the sequential type, all bets are off and the addresses may, indeed, overlap. Surprisingly, this just got caught in one test, on one builder, out of the 30+ builders testing this change. Congratulations go to http://lab.llvm.org:8011/builders/clang-aarch64-lnt/builds/5205. llvm-svn: 251112	2015-10-23 14:17:03 +00:00
James Molloy	5b18b4ce96	Revert "[AArch64]Merge halfword loads into a 32-bit load" This reverts commit r250719. This introduced a codegen fault in SPEC2000.gcc, when compiled for Cortex-A53. llvm-svn: 251108	2015-10-23 10:41:38 +00:00
Sanjoy Das	42801100e1	[SCEV] Get rid of an unnecessary lambda; NFC llvm-svn: 251099	2015-10-23 06:57:21 +00:00
Zlatko Buljan	2cf61020b8	[mips][microMIPS] Implement SHLL.PH, SHLL_S.PH, SHLL.QB, SHLLV.PH, SHLLV_S.PH, SHLLV.QB, SHLLV_S.W, SHLL_S.W, SHRA.QB and SHRA_R.QB instructions Differential Revision: http://reviews.llvm.org/D13929 llvm-svn: 251098	2015-10-23 06:39:29 +00:00
Sanjoy Das	0714e3e245	[SCEV] Fix a latent bug in `getPreStartForExtend` I could not come up a way to test this -- I think this bug is latent today, and will not actually result in a miscompile. In `getPreStartForExtend`, SCEV constructs `PreStart` as a sum of all of `SA`'s operands except `Op`. It also uses `SA`'s no-wrap flags, and this is problematic because removing an element from an add expression can make it signed-wrap. E.g. if `SA` was `(127 + 1 + -1)`, then it could safely be `<nsw>` (since `sext(127) + sext(1) + sext(-1)` == `sext(127 + 1 + -1)`), but `(127 + 1)` (== `PreStart` if `Op` is `-1`) is not `<nsw>`. Transferring `<nuw>` from `SA` to `PreStart` is safe, as far as I can tell. llvm-svn: 251097	2015-10-23 06:33:47 +00:00
Dylan McKay	57cee79f7c	[AVR] Add ELF constants to headers Also adds a 'trivial' ELF file. This was generated by assembling and linking a file with the symbol main which contains a single return instruction. llvm-svn: 251096	2015-10-23 06:05:55 +00:00
Xinliang David Li	8ee08b0f70	Add more intrumentation/runtime helper interfaces (NFC) This patch converts the remaining references to literal strings for names of profile runtime entites (such as profile runtime hook, runtime hook use function, profile init method, register function etc). Also added documentation for all the new interfaces. llvm-svn: 251093	2015-10-23 04:22:58 +00:00
Mehdi Amini	d42ae865b8	SLPVectorizer: AllSameOpcode* starts "true" only for instructions r251085 wasn't as NFC as intended... From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 251087	2015-10-23 01:04:45 +00:00
Mehdi Amini	bf6ee32ca5	SLPVectorizer: refactor reorderInputsAccordingToOpcode (NFC) This is intended to simplify the changes needed to solve PR25247. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 251085	2015-10-23 00:46:17 +00:00
Davide Italiano	fbb958c24b	[CodeGen] Remove usage of NDEBUG in header. Moreover, this seems unused. llvm-svn: 251081	2015-10-23 00:17:40 +00:00
Kostya Serebryany	2e9fca9f88	[libFuzzer] use the indirect caller-callee counter as an independent search heuristic llvm-svn: 251078	2015-10-22 23:55:39 +00:00
Kostya Serebryany	09d2a5f6e1	[libFuzzer] more refactoring the code that checks the coverage. NFC llvm-svn: 251075	2015-10-22 22:56:45 +00:00
Kostya Serebryany	007c9b25f4	[libFuzzer] refactoring the code that checks the coverage. NFC llvm-svn: 251074	2015-10-22 22:50:47 +00:00
Kostya Serebryany	b36025619c	[libFuzzer] remove the deprecated 'tokens' feature llvm-svn: 251069	2015-10-22 21:48:09 +00:00
Justin Bogner	f98df7a0d1	LoopPass: Remove redoLoop, it isn't used. NFC In r251064 I removed a logically unreachable call to `redoLoop`, and now there aren't any callers of this API at all. Remove the needless complexity. llvm-svn: 251067	2015-10-22 21:31:34 +00:00
Justin Bogner	35e46cdd04	LoopPass: Simplify the API for adding a new loop. NFC The insertLoop() API is only used to add new loops, and has confusing ownership semantics. Simplify it by replacing it with addLoop(). llvm-svn: 251064	2015-10-22 21:21:32 +00:00
Chen Li	c6e28782d8	[SimplifyCFG] Extend SimplifyResume to handle phi of trivial landing pad. Summary: Currently SimplifyResume can convert an invoke instruction to a call instruction if its landing pad is trivial. In practice we could have several invoke instructions with trivial landing pads and share a common rethrow block, and in the common rethrow block, all the landing pads join to a phi node. The patch extends SimplifyResume to check the phi of landing pad and their incoming blocks. If any of them is trivial, remove it from the phi node and convert the invoke instruction to a call instruction. Reviewers: hfinkel, reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13718 llvm-svn: 251061	2015-10-22 20:48:38 +00:00
Xinliang David Li	83bc4220ce	Add helper functions and remove hard coded references to instProf related name/name-prefixes This is a clean up patch that defines instr prof section and variable name prefixes in a common header with access helper functions. clang FE change will be done as a follow up once this patch is in. Differential Revision: http://reviews.llvm.org/D13919 llvm-svn: 251058	2015-10-22 20:32:12 +00:00
David Majnemer	e0675fb8fb	[Sink] Don't check BB.empty() As an invariant, BasicBlocks cannot be empty when passed to a transform. This is not the case for MachineBasicBlocks and the Sink pass was ported from the MachineSink pass which would explain the check's existence. llvm-svn: 251057	2015-10-22 20:29:08 +00:00
Alexey Samsonov	f4fb5f500c	[ASan] Enable instrumentation of dynamic allocas by default. llvm-svn: 251056	2015-10-22 20:07:28 +00:00
Sanjoy Das	eeca9f6fd4	[SCEV] Commute zero extends through <nuw> additions llvm-svn: 251052	2015-10-22 19:57:38 +00:00
Sanjoy Das	6e78b17b43	[SCEV] Opportunistically interpret unsigned constraints as signed Summary: An unsigned comparision is equivalent to is corresponding signed version if both the operands being compared are positive. Teach SCEV to use this fact when profitable. Reviewers: atrick, hfinkel, reames, nlewycky Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13687 llvm-svn: 251051	2015-10-22 19:57:34 +00:00
Sanjoy Das	1123148d40	[SCEV] Teach SCEV some axioms about non-wrapping arithmetic Summary: - A s< (A + C)<nsw> if C > 0 - A s<= (A + C)<nsw> if C >= 0 - (A + C)<nsw> s< A if C < 0 - (A + C)<nsw> s<= A if C <= 0 Right now `C` needs to be a constant, but we can later generalize it to be a non-constant if needed. Reviewers: atrick, hfinkel, reames, nlewycky Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D13686 llvm-svn: 251050	2015-10-22 19:57:29 +00:00
Sanjoy Das	a060e602fd	[SCEV] Commute sign extends through nsw additions Summary: Depends on D13613. Reviewers: atrick, hfinkel, reames, nlewycky Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13685 llvm-svn: 251049	2015-10-22 19:57:25 +00:00
Sanjoy Das	8f27415c05	[SCEV] Mark AddExprs as nsw or nuw if legal Summary: This uses `ScalarEvolution::getRange` and not potentially control dependent `nsw` and `nuw` bits on the arithmetic instruction. Reviewers: atrick, hfinkel, nlewycky Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D13613 llvm-svn: 251048	2015-10-22 19:57:19 +00:00
Alexey Samsonov	8daaf8b09b	[ASan] Minor fixes to dynamic allocas handling: * Don't instrument promotable dynamic allocas: We already have a test that checks that promotable dynamic allocas are ignored, as well as static promotable allocas. Make sure this test will still pass if/when we enable dynamic alloca instrumentation by default. * Handle lifetime intrinsics before handling dynamic allocas: lifetime intrinsics may refer to dynamic allocas, so we need to emit instrumentation before these dynamic allocas would be replaced. Differential Revision: http://reviews.llvm.org/D12704 llvm-svn: 251045	2015-10-22 19:51:59 +00:00
Davide Italiano	be8c33e8da	[ExecutionEngine] Garbage collect some dead (and unsafe) code. llvm-svn: 251042	2015-10-22 18:46:27 +00:00
Rafael Espindola	fc063e8fec	Avoid storing a second copy of each string in StringTableBuilder. This was only use in the extremely uncommon case of @@@ symbols on ELF. llvm-svn: 251039	2015-10-22 18:32:06 +00:00
Matthias Braun	d276de6db1	AArch64: Disable the latency heuristic It turned out not to improve any of our benchmarks but occasionally led to increased register pressure and spilling. Only enabling for the Cyclone CPU as the results on the cortex CPUs give mixed results. Differential Revision: http://reviews.llvm.org/D13708 llvm-svn: 251038	2015-10-22 18:07:38 +00:00
Matthias Braun	61f4d6439c	MachineScheduler: Add a way to disable the 'ReduceLatency' heuristic llvm-svn: 251037	2015-10-22 18:07:31 +00:00
Eric Christopher	227d71bba6	Remove the last traces of X86CompilationCallback as it is completely unused. llvm-svn: 251035	2015-10-22 17:55:35 +00:00
Craig Topper	8fe40e0ed5	Change makeLibCall to take an ArrayRef<SDValue> instead of pointer and size. This removes the need to pass a hardcoded size in many places. NFC llvm-svn: 251032	2015-10-22 17:05:00 +00:00
Zachary Turner	c55a5041e3	Fix broken build under MSVC. llvm-svn: 251030	2015-10-22 16:42:31 +00:00
Craig Topper	42526d3372	Use ArrayRef instead of pointer and size. NFC llvm-svn: 251029	2015-10-22 16:35:56 +00:00
Zia Ansari	8f509a7044	[X86] - Catch extra combine opportunities for redundant imuls. When we fold "mul ((add x, c1), c1)" -> "add ((mul x, c2), c1*c2)", we bail if (add x, c1) has multiple users which would result in an extra add instruction. In such cases, this patch adds a check to see if we can eliminate a multiply instruction in exchange for the extra add. I also added the capability of doing the existing optimization with non-splatted vectors (splatted also works). Differential Revision: http://reviews.llvm.org/D13740 llvm-svn: 251028	2015-10-22 16:14:45 +00:00
Bill Schmidt	de1dc9c98f	[PPC] Fix PR24686 by failing assembly for an invalid relocation PR24686 identifies a problem where a relocation expression is invalid when not all of the symbols in the expression can be locally resolved. This causes the compiler to request a PC-relative half16ds relocation, which is nonsensical for PowerPC. This patch recognizes this situation and ensures we fail the assembly cleanly. Test case provided by Anton Blanchard. llvm-svn: 251027	2015-10-22 15:53:44 +00:00
Rafael Espindola	e015f66a73	Avoid hash lookups when finalizing StringTableBuilder. NFC. llvm-svn: 251024	2015-10-22 15:26:35 +00:00
Rafael Espindola	0169a45e04	Use array_pod_sort. NFC. llvm-svn: 251023	2015-10-22 15:15:44 +00:00
Asaf Badouh	7c52245660	[X86][AVX512] extend vcvtph2ps to support xmm/ymm and sae versions Differential Revision: http://reviews.llvm.org/D13945 llvm-svn: 251018	2015-10-22 14:01:16 +00:00
James Molloy	5b2a732fac	[GlobalsAA] Loosen an overly conservative bailout Instead of bailing out when we see loads, analyze them. If we can prove that the loaded-from address must escape, then we can conclude that a load from that address must escape too and therefore cannot alias a non-addr-taken global. When checking if a Value can alias a non-addr-taken global, if the Value is a LoadInst of a non-global, recurse instead of bailing. If we can follow a trail of loads up to some base that is captured, we know by inference that all the loads we followed are also captured. llvm-svn: 251017	2015-10-22 13:44:26 +00:00
James Molloy	5a4d8cd519	[BasicAA] Non-equal indices in a GEP of a SequentialType don't overlap If the final indices of two GEPs can be proven to not be equal, and the GEP is of a SequentialType (not a StructType), then the two GEPs do not alias. llvm-svn: 251016	2015-10-22 13:28:18 +00:00
James Molloy	1d88d6f289	[ValueTracking] Add a new predicate: isKnownNonEqual() isKnownNonEqual(A, B) returns true if it can be determined that A != B. At the moment it only knows two facts, that a non-wrapping add of nonzero to a value cannot be that value: A + B != A [where B != 0, addition is nsw or nuw] and that contradictory known bits imply two values are not equal. This patch also hooks this up to InstSimplify; InstSimplify had a peephole for the first fact but not the second so this teaches InstSimplify a new trick too (alas no measured performance impact!) llvm-svn: 251012	2015-10-22 13:18:42 +00:00
Pawel Bylica	64d08ff034	Use range-based for loop in sys::path::append(). NFC. llvm-svn: 250999	2015-10-22 08:12:15 +00:00
Elena Demikhovsky	5c97dfdc9c	AVX-512: Fixed a bug in select_cc for i1 type Fixed faiure: LLVM ERROR: Cannot select: t33: i1 = select_cc t25, Constant:i32<0>, t45, t42, seteq:ch added a test Differential Revision: http://reviews.llvm.org/D13943 llvm-svn: 250996	2015-10-22 07:10:29 +00:00
Elena Demikhovsky	7ad0d563a5	Partially reverted changes from r250686 Clang runtime failure was reported. Assertion failed: (isExtended() && "Type is not extended!"), function getTypeForEVT I'll need to add a proper handling for PointerType in masked load/store intrinsics. llvm-svn: 250995	2015-10-22 06:20:29 +00:00
Sanjoy Das	6ed053051d	[IR] Add a `makeNoWrapRegion` method to `ConstantRange` Summary: This will be used in a future change to ScalarEvolution. Reviewers: hfinkel, reames, nlewycky Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13612 llvm-svn: 250975	2015-10-22 03:12:57 +00:00
Sanjoy Das	98a341bc0c	[OperandBundles] Make function attributes conservatively correct Summary: This makes attribute accessors on `CallInst` and `InvokeInst` do the (conservatively) right thing. This essentially involves, in some cases, not falling back querying the attributes on the called `llvm::Function` when operand bundles are present. Attributes locally present on the `CallInst` or `InvokeInst` will still override operand bundle semantics. The LangRef has been amended to reflect this. Note: this change does not do anything prevent `-function-attrs` from inferring `CallSite` local attributes after inspecting the called function -- that will be done as a separate change. I've used `-adce` and `-early-cse` to test these changes. There is nothing special about these passes (and they did not require any changes) except that they seemed be the easiest way to write the tests. This change does not add deal with `argmemonly`. That's a later change because alias analysis requires a related fix before `argmemonly` can be tested. Reviewers: reames, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13961 llvm-svn: 250973	2015-10-22 03:12:22 +00:00
JF Bastien	f2364bf129	WebAssembly: fix more syntax br_if shouldn't start with a dot. div and rem went from prefix u/s to suffix. llvm-svn: 250972	2015-10-22 02:32:50 +00:00
Pete Cooper	b70b956c80	Add missing load/store flags to thumb2 instructions. These were the cause of a verifier error when building 7zip with -verify-machineinstrs. Running 'make check' with the verifier triggered the same error on the test here so i've updated the test to run the verifier on one of its runs instead of adding a new one. While looking at this code, there was a stale comment that these instructions were only used for disassembly. This probably used to be the case, but they are now used in the 'ARM load / store optimization pass' too. This reapplies r242300 which was reverted in r242428 due to bot failures. Ultimately those failures were spurious and completely unrelated to this commit. I reverted this at the time because it was thought to be at fault. llvm-svn: 250969	2015-10-22 01:48:57 +00:00
David Majnemer	a8f17871e4	[WinEH] Remove extraneous call to emitEHRegistrationOffsetLabel It's a relic from the earlier implementation, let's remove it. llvm-svn: 250964	2015-10-21 23:20:39 +00:00
Matt Arsenault	391be09ef3	AMDGPU: Fix adding redundant m0 uses BuildMI already adds these since they are defined correctly now. llvm-svn: 250961	2015-10-21 22:37:51 +00:00
Matt Arsenault	e8c0891e42	AMDGPU: Fix verifier error in SIFoldOperands There may be other use operands that also need their kill flags cleared. This happens in a few tests when SIFoldOperands is moved after PeepholeOptimizer. PeepholeOptimizer rewrites cases that look like: %vreg0 = ... %vreg1 = COPY %vreg0 use %vreg1<kill> %vreg2 = COPY %vreg0 use %vreg2<kill> to use the earlier source to %vreg0 = ... use %vreg0 use %vreg0 Currently SIFoldOperands sees the copied registers, so there is only one use. So far I haven't managed to come up with a test that currently has multiple uses of a foldable VGPR -> VGPR copy. llvm-svn: 250960	2015-10-21 22:37:50 +00:00
Matt Arsenault	b6fd98c7d9	AMDGPU: Split DiagnosticInfoUnsupported into its own file llvm-svn: 250959	2015-10-21 22:37:46 +00:00
Matt Arsenault	6005fcbe12	AMDGPU: Simplify VOP3 operand legalization. This was checking for a variety of situations that should never happen. This saves a tiny bit of compile time. We should not be selecting instructions with invalid operands in the first place. Most of the time for registers copys are inserted to the correct operand register class. For VOP3, since all operand types are supported and literal constants never are, we just need to verify the constant bus requirements (all immediates should be legal inline ones). The only possibly tricky case to maybe worry about is if when legalizing operands in moveToVALU with s_add_i32 and similar instructions. If the original s_add_i32 had a literal constant and we need to replace it with v_add_i32_e64 we would have an unsupported literal operand. However, I don't think we should worry about that because SIFoldOperands should handle folding literal constant operands into the SALU instructions based on the uses. At SIFoldOperands time, the legality and profitability of operand types is a bit different. llvm-svn: 250951	2015-10-21 21:51:02 +00:00
Matt Arsenault	e223cebd10	AMDGPU: Fix not checking implicit operands in verifyInstruction When verifying constant bus restrictions, this wasn't catching uses in implicit operands. llvm-svn: 250948	2015-10-21 21:15:01 +00:00
Matt Arsenault	45cbfa5949	Use numeric_limits instead of LLONG_MAX This is a build fix for configurations where LLONG_MAX is not defined in system headers. llvm-svn: 250946	2015-10-21 21:10:12 +00:00
Matt Arsenault	29f9663f97	LegalizeDAG: Implement promote for build_vector This will be used in future commits for AMDGPU to promote operations on i64 vectors into operations on 32-bit vector components. This will be used / tested in future AMDGPU commits. llvm-svn: 250945	2015-10-21 21:10:10 +00:00
Vedant Kumar	029c35186e	[Verifier] Minor comment update, NFC llvm-svn: 250943	2015-10-21 20:33:31 +00:00
Keno Fischer	ddad187ce7	[RuntimeDyld] Ignore ST_FILE symbols when constructing GlobalSymbolTable Summary: ELF's STT_File symbols may overlap with regular globals in other files, so we should ignore them here in order to avoid having bogus entries in the symbol table that confuse us when resolving relocations. Reviewers: lhames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13888 llvm-svn: 250942	2015-10-21 20:22:04 +00:00
Joerg Sonnenberger	7212809abc	Drop assert that a call with struct return goes to a function with sret attribute. Clang incorrectly misses it on __muldc3 and friends and the type system doesn't include it properly either. llvm-svn: 250938	2015-10-21 20:05:01 +00:00
Teresa Johnson	c8a8a5e2ae	Silence Visual C++ warning in function summary parsing code (NFC) llvm-svn: 250929	2015-10-21 19:25:14 +00:00
Sanjay Patel	efab8b0d08	[x86] move recursive add match for LEA to helper function; NFCI llvm-svn: 250926	2015-10-21 18:56:06 +00:00
David Majnemer	dc3b67b4ca	[SimplifyCFG] Don't use-after-free an SSA value SimplifyTerminatorOnSelect didn't consider the possibility that the condition might be related to one of PHI nodes. This fixes PR25267. llvm-svn: 250922	2015-10-21 18:22:24 +00:00
Craig Topper	896c267544	[X86] Add AMD mwaitx, monitorx, and clzero instructions to the assembly parser and disassembler. llvm-svn: 250911	2015-10-21 17:26:45 +00:00
Kevin Enderby	da9dd05011	Backing out commit r250906 as it broke lld. llvm-svn: 250908	2015-10-21 17:13:20 +00:00
Kevin Enderby	e3bf4fd546	This removes the eating of the error in Archive::Child::getSize() when the characters in the size field in the archive header for the member is not a number. To do this we have all of the needed methods return ErrorOr to push them up until we get out of lib. Then the tools and can handle the error in whatever way is appropriate for that tool. So the solution is to plumb all the ErrorOr stuff through everything that touches archives. This include its iterators as one can create an Archive object but the first or any other Child object may fail to be created due to a bad size field in its header. Thanks to Lang Hames on the changes making child_iterator contain an ErrorOr<Child> instead of a Child and the needed changes to ErrorOr.h to add operator overloading for * and -> . We don’t want to use llvm_unreachable() as it calls abort() and is produces a “crash” and using report_fatal_error() to move the error checking will cause the program to stop, neither of which are really correct in library code. There are still some uses of these that should be cleaned up in this library code for other than the size field. Also corrected the code where the size gets us to the “at the end of the archive” which is OK but past the end of the archive will return object_error::parse_failed now. The test cases use archives with text files so one can see the non-digit character, in this case a ‘%’, in the size field. llvm-svn: 250906	2015-10-21 16:59:24 +00:00
Craig Topper	8ea2390c35	[Option] Use an ArrayRef to store the Option Infos in OptTable. NFC llvm-svn: 250901	2015-10-21 16:30:42 +00:00
Daniel Sanders	d6cf3e05ef	[mips][mips16] Re-work the inline assembly stubs to work with IAS. NFC. Summary: Previously, we were inserting an InlineAsm statement for each line of the inline assembly. This works for GAS but it triggers prologue/epilogue emission when IAS is in use. This caused: .set noreorder .cpload $25 to be emitted as: .set push .set reorder .set noreorder .set pop .set push .set reorder .cpload $25 .set pop which led to assembler errors and caused the test to fail. The whitespace-after-comma changes included in this patch are necessary to match the output when IAS is in use. Reviewers: vkalintiris Subscribers: rkotler, llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D13653 llvm-svn: 250895	2015-10-21 12:44:14 +00:00
Chandler Carruth	2be10754a9	[AA] Enhance the new AliasAnalysis infrastructure with an optional "external" AA wrapper pass. This is a generic hook that can be used to thread custom code into the primary AAResultsWrapperPass for the legacy pass manager in order to allow it to merge external AA results into the AA results it is building. It does this by threading in a raw callback and so it is very powerful and should serve almost any use case I have come up with for extending the set of alias analyses used. The only thing not well supported here is using a different order of alias analyses. That form of extension is supportable with the new pass manager, and I can make the callback structure here more elaborate to support it in the legacy pass manager if this is a critical use case that people are already depending on, but the only use cases I have heard of thus far should be reasonably satisfied by this simpler extension mechanism. It is hard to test this using normal facilities (the built-in AAs don't use this for obvious reasons) so I've written a fairly extensive set of custom passes in the alias analysis unit test that should be an excellent test case because it models the out-of-tree users: it adds a totally custom AA to the system. This should also serve as a reasonably good example and guide for out-of-tree users to follow in order to rig up their existing alias analyses. No support in opt for commandline control is provided here however. I'm really unhappy with the kind of contortions that would be required to support that. It would fully re-introduce the analysis group self-recursion kind of patterns. =/ I've heard from out-of-tree users that this will unblock their use cases with extending AAs on top of the new infrastructure and let us retain the new analysis-group-free-world. Differential Revision: http://reviews.llvm.org/D13418 llvm-svn: 250894	2015-10-21 12:15:19 +00:00
Elena Demikhovsky	3ad76a1acd	Masked Load/Store optimization for scalar code When we have to convert the masked.load, masked.store to scalar code, we generate a chain of conditional basic blocks. I added optimization for constant mask vector. Differential Revision: http://reviews.llvm.org/D13855 llvm-svn: 250893	2015-10-21 11:50:54 +00:00
Daniel Sanders	0f596814e9	[mips][msa] Remove copy_u.d and move copy_u.w to MSA64. Summary: The forwards compatibility strategy employed by MIPS is to consider registers to be infinitely sign-extended. Then on ISA's with a wider register, the result of existing instructions are sign-extended to register width and zero-extended counterparts are added. copy_u.w on MSA32 and copy_u.w on MSA64 violate this strategy and we have therefore corrected the MSA specs to fix this. We still keep track of sign/zero-extension during legalization but we now match copy_s.[wd] where required. No change required to clang since __builtin_msa_copy_u_[wd] will map to copy_s.[wd] where appropriate for the target. Reviewers: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13472 llvm-svn: 250887	2015-10-21 09:58:54 +00:00
Jonas Paulsson	17ad04535f	Let MachineVerifier be aware of mem-to-mem instructions. A mem-to-mem instruction (that both loads and stores), which store to an FI, cannot pass the verifier since it thinks it is loading from the FI. For the mem-to-mem instruction, do a looser check in visitMachineOperand() and only check liveness at the reg-slot while analyzing a frame index operand. Needed to make CodeGen/SystemZ/xor-01.ll pass with -verify-machineinstrs, which now runs with this flag. Reviewed by Evan Cheng and Quentin Colombet. llvm-svn: 250885	2015-10-21 07:39:47 +00:00
Mehdi Amini	4215236621	Do not use `dyn_cast<X>` after `isa<X>` (NFC) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 250883	2015-10-21 06:11:01 +00:00
Krzysztof Parzyszek	fdb7b693a7	Tail duplication can mix incompatible registers in phi nodes Do not tail duplicate blocks where the successor has a phi node, and the corresponding value in that phi node uses a subregister. http://reviews.llvm.org/D13922 llvm-svn: 250877	2015-10-21 02:40:06 +00:00
JF Bastien	1a59c6b2c9	WebAssembly: support imports C/C++ code can declare an extern function, which will show up as an import in WebAssembly's output. It's expected that the linker will resolve these, and mark unresolved imports as call_import (I have a patch which does this in wasmate). llvm-svn: 250875	2015-10-21 02:23:09 +00:00
Dehao Chen	100424124b	Tolerate negative offset when matching sample profile. In some cases (as illustrated in the unittest), lineno can be less than the heade_lineno because the function body are included from some other files. In this case, offset will be negative. This patch makes clang still able to match the profile to IR in this situation. http://reviews.llvm.org/D13914 llvm-svn: 250873	2015-10-21 01:22:27 +00:00
Krzysztof Parzyszek	ced9941cd4	[Hexagon] Bit-based instruction simplification Analyze bit patterns of operands and values of instructions to perform various simplifications, dead/redundant code elimination, etc. llvm-svn: 250868	2015-10-20 22:57:13 +00:00
Krzysztof Parzyszek	26b2c9080f	[Hexagon] Fix isNVStorable flag in .td files An upper half and a double word cannot be used as value sources in a new-value store. llvm-svn: 250867	2015-10-20 22:40:57 +00:00
Igor Laevsky	68688df94c	[MemorySanitizer] NFC. Do not use GET_INTRINSIC_MODREF_BEHAVIOR table. It is now possible to infer intrinsic modref behaviour purely from intrinsic attributes. This change will allow to completely remove GET_INTRINSIC_MODREF_BEHAVIOR table. Differential Revision: http://reviews.llvm.org/D13907 llvm-svn: 250860	2015-10-20 21:33:30 +00:00
Krzysztof Parzyszek	79512b88b0	[Hexagon] Capture aggregate variables by reference, not value llvm-svn: 250851	2015-10-20 19:33:46 +00:00
Krzysztof Parzyszek	e4cff4058c	[Hexagon] Do not fall-through if there is no CFG edge llvm-svn: 250850	2015-10-20 19:30:21 +00:00
Krzysztof Parzyszek	bfe8e92fd1	[Hexagon] Use symbolic name for subregister instead of hardcoded number llvm-svn: 250849	2015-10-20 19:26:36 +00:00
Krzysztof Parzyszek	0257905f27	[Hexagon] Change Based->Base in getBasedWithImmOffset llvm-svn: 250848	2015-10-20 19:21:05 +00:00
Krzysztof Parzyszek	05da79d5ac	[Hexagon] Remove the remnants of isConstExtProfitable llvm-svn: 250845	2015-10-20 19:04:53 +00:00
Artyom Skrobov	c736863a85	Two switch blocks in VectorLegalizer::LegalizeOp already have a default: llvm_unreachable("This action is not supported yet!"); -- so I'm adding one to the third switch block, too. This is a follow-up fix for http://reviews.llvm.org/D13862 llvm-svn: 250830	2015-10-20 15:06:37 +00:00
Jonas Paulsson	4b29f6f7f7	[SystemZ] Use LivePhysRegs helper class in SystemZShortenInst.cpp. Don't use home brewed liveness tracking code for phys regs, since this class does the job. Reviewed by Ulrich Weigand. llvm-svn: 250829	2015-10-20 15:05:58 +00:00
Artyom Skrobov	7fd67e25aa	Adding support for TargetLoweringBase::LibCall Summary: TargetLoweringBase::Expand is defined as "Try to expand this to other ops, otherwise use a libcall." For ISD::UDIV and ISD::SDIV, the choice between the two possibilities was defined in a rather convoluted way: - if DIVREM is legal, expand to DIVREM - if DIVREM has a custom lowering, expand to DIVREM - if DIVREM libcall is defined and a remainder from the same division is computed elsewhere, expand to a DIVREM libcall - else, expand to a DIV libcall This had the undesirable effect that if both DIV and DIVREM are implemented as libcalls, then ISD::UDIV and ISD::SDIV are expanded to the heavier DIVREM libcall, even when the remainder isn't used. The new code adds a new LegalizeAction, TargetLoweringBase::LibCall, so that backends can directly control whether they prefer an expansion or a conversion to a libcall. This makes the generic lowering code even more generic, allowing its reuse in a wider range of target-specific configurations. The useful effect is that ARM backend will now generate a call to __aeabi_{i,u}div rather than __aeabi_{i,u}divmod in cases where it doesn't need the remainder. There's no functional change outside the ARM backend. Reviewers: t.p.northover, rengolin Subscribers: t.p.northover, llvm-commits, aemerson Differential Revision: http://reviews.llvm.org/D13862 llvm-svn: 250826	2015-10-20 13:14:52 +00:00
Artyom Skrobov	b844fa7fc0	Combining DIV+REM->DIVREM doesn't belong in LegalizeDAG; move it over into DAGCombiner. Summary: In addition to moving the code over, this patch amends the DIV,REM -> DIVREM combining to run on all affected nodes at once: if the nodes are converted to DIVREM one at a time, then the resulting DIVREM may get legalized by the backend into something target-specific that we won't be able to recognize and correlate with the remaining nodes. The motivation is to "prepare terrain" for D13862: when we set DIV and REM to be legalized to libcalls, instead of the DIVREM, we otherwise lose the ability to combine them together. To prevent this, we need to take the DIV,REM -> DIVREM combining out of the lowering stage. Reviewers: RKSimon, eli.friedman, rengolin Subscribers: john.brawn, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D13733 llvm-svn: 250825	2015-10-20 13:06:02 +00:00
Igor Breger	21296d230a	AVX512: Implemented encoding and intrinsics for VPBROADCASTB/W/D/Q instructions. Differential Revision: http://reviews.llvm.org/D13884 llvm-svn: 250819	2015-10-20 11:56:42 +00:00
Keno Fischer	a010cfa592	Fix missing INITIALIZE_PASS_DEPENDENCY for AddressSanitizer Summary: In r231241, TargetLibraryInfoWrapperPass was added to `getAnalysisUsage` for `AddressSanitizer`, but the corresponding `INITIALIZE_PASS_DEPENDENCY` was not added. Reviewers: dvyukov, chandlerc, kcc Subscribers: kcc, llvm-commits Differential Revision: http://reviews.llvm.org/D13629 llvm-svn: 250813	2015-10-20 10:13:55 +00:00
Matt Arsenault	3add6439d0	AMDGPU: Add MachineInstr overloads for instruction format tests llvm-svn: 250797	2015-10-20 04:35:43 +00:00
Matt Arsenault	8f18917a90	AMDGPU: Stop reserving v[254:255] This wasn't doing anything useful. They weren't explicitly used anywhere, and the RegScavenger ignores reserved registers. This for some reason caused a random scheduling change in the test. Getting the check lines to pass is too frustrating, and there's probably not too much value in checking the vector case's operands N times. llvm-svn: 250794	2015-10-20 03:59:58 +00:00
JF Bastien	c8f89e86d5	WebAssembly: fix call/return syntax. They are now typeless, unlike other operations. llvm-svn: 250793	2015-10-20 01:26:54 +00:00
Duncan P. N. Exon Smith	c4829deae8	MSP430: Remove implicit ilist iterator conversions, NFC llvm-svn: 250792	2015-10-20 01:18:39 +00:00
Duncan P. N. Exon Smith	ac331fbdb6	AsmParser: Remove implicit ilist iterator conversions, NFC llvm-svn: 250791	2015-10-20 01:12:49 +00:00
Duncan P. N. Exon Smith	a2c90e4743	SystemZ: Remove implicit ilist iterator conversion, NFC llvm-svn: 250790	2015-10-20 01:12:46 +00:00
Duncan P. N. Exon Smith	0ce253d3a9	XCore: Remove implicit ilist iterator conversions, NFC llvm-svn: 250788	2015-10-20 01:07:42 +00:00
Duncan P. N. Exon Smith	ac65b4c422	PowerPC: Remove implicit ilist iterator conversions, NFC llvm-svn: 250787	2015-10-20 01:07:37 +00:00
Sanjoy Das	3020b1bc8c	[RS4GC] Remove a redundant linear search, NFCI Since LiveVariables is uniqued (we just created it from a `DenseSet`), `FindIndex(LiveVariables, LiveVariables[i])` is always `i`. llvm-svn: 250786	2015-10-20 01:06:31 +00:00
Sanjoy Das	b1942f14cd	[RS4GC] Clean up `find_index`; NFC - Bring it up to the LLVM Coding Style - Sink it inside `CreateGCRelocates`, which is its only user llvm-svn: 250785	2015-10-20 01:06:28 +00:00
Sanjoy Das	7ad67640e9	[RS4GC] Re-purpose `normalizeForInvokeSafepoint`; NFC. `normalizeForInvokeSafepoint` in RewriteStatepointsForGC.cpp, as it is written today, deals with `gc.relocate` and `gc.result` uses of a statepoint equally well. This change documents this fact and adds a test case. There is no functional change here -- only documentation of existing functionality. llvm-svn: 250784	2015-10-20 01:06:24 +00:00
Sanjoy Das	ff3dba736a	[RS4GC] Minor cleanup to `normalizeForInvokeSafepoint`; NFC llvm-svn: 250783	2015-10-20 01:06:17 +00:00
Duncan P. N. Exon Smith	c3f7988472	Sparc: Remove implicit ilist iterator conversions, NFC llvm-svn: 250781	2015-10-20 00:59:43 +00:00
Duncan P. N. Exon Smith	61149b86c3	NVPTX: Remove implicit ilist iterator conversions, NFC llvm-svn: 250779	2015-10-20 00:54:09 +00:00
Duncan P. N. Exon Smith	a72c6e25ec	Hexagon: Remove implicit ilist iterator conversions, NFC There are two things out of the ordinary in this commit. First, I made a loop obviously "infinite" in HexagonInstrInfo.cpp. After checking if an instruction was at the beginning of a basic block (in which case, `break`), the loop decremented and checked the iterator for `nullptr` as the loop condition. This has never been possible (the prev pointers are always been circular, so even with the weird ilist/iplist implementation, this isn't been possible), so I removed the condition. Second, in HexagonAsmPrinter.cpp there was another case of comparing a `MachineBasicBlock::instr_iterator` against `MachineBasicBlock::end()` (which returns `MachineBasicBlock::iterator`). While not incorrect, it's fragile. I switched this to `::instr_end()`. All that said, no functionality change intended here. llvm-svn: 250778	2015-10-20 00:46:39 +00:00
JF Bastien	3b0177c542	WebAssembly: fix syntax for br_if. llvm-svn: 250777	2015-10-20 00:37:42 +00:00
Duncan P. N. Exon Smith	a25ad0685a	AsmPrinter: Remove implicit ilist iterator conversion, NFC llvm-svn: 250776	2015-10-20 00:36:08 +00:00
Duncan P. N. Exon Smith	7869148c47	Mips: Remove implicit ilist iterator conversions, NFC llvm-svn: 250769	2015-10-20 00:15:20 +00:00
Duncan P. N. Exon Smith	19d951874a	CppBackend: Remove implicit ilist iterator conversions, NFC Mostly just converted to range-based for loops. May have converted a couple of extra loops as a drive-by (not sure). llvm-svn: 250766	2015-10-20 00:06:41 +00:00
Duncan P. N. Exon Smith	d95fa4ccf1	BPF: Remove implicit ilist iterator conversion, NFC llvm-svn: 250765	2015-10-20 00:02:50 +00:00
Duncan P. N. Exon Smith	9f9559e807	ARM: Remove implicit ilist iterator conversions, NFC llvm-svn: 250759	2015-10-19 23:25:57 +00:00
Duncan P. N. Exon Smith	1e59a66c69	ObjCARC: Remove implicit ilist iterator conversions, NFC llvm-svn: 250756	2015-10-19 23:20:14 +00:00
Cong Hou	7745dbc5c4	Enhance loop rotation with existence of profile data in MachineBlockPlacement pass. Currently, in MachineBlockPlacement pass the loop is rotated to let the best exit to be the last BB in the loop chain, to maximize the fall-through from the loop to outside. With profile data, we can determine the cost in terms of missed fall through opportunities when rotating a loop chain and select the best rotation. Basically, there are three kinds of cost to consider for each rotation: 1. The possibly missed fall through edge (if it exists) from BB out of the loop to the loop header. 2. The possibly missed fall through edges (if they exist) from the loop exits to BB out of the loop. 3. The missed fall through edge (if it exists) from the last BB to the first BB in the loop chain. Therefore, the cost for a given rotation is the sum of costs listed above. We select the best rotation with the smallest cost. This is only for PGO mode when we have more precise edge frequencies. Differential revision: http://reviews.llvm.org/D10717 llvm-svn: 250754	2015-10-19 23:16:40 +00:00
Duncan P. N. Exon Smith	9934b26b0f	Linker: Remove implicit ilist iterator conversion, NFC llvm-svn: 250748	2015-10-19 22:23:36 +00:00
David Blaikie	437aafdab2	Fix -Wdeprecated regarding ORC copying ValueMaterializers As usual, this is a polymorphic hierarchy without polymorphic ownership, so simply make the dtor protected non-virtual, protected default copy ctor/assign, and make derived classes final. The derived classes will pick up correct default public copy ops (and dtor) implicitly. (wish I could add -Wdeprecated to the build, but last time I tried it triggered on some system headers I still need to look into/figure out) llvm-svn: 250747	2015-10-19 22:15:55 +00:00
Michael Liao	c65d386b81	[InstCombine] Optimize icmp of inc/dec at RHS Allow LLVM to optimize the sequence like the following: %inc = add nsw i32 %i, 1 %cmp = icmp slt %n, %inc into: %cmp = icmp sle i32 %n, %i The case is not handled previously due to the complexity of compuation of %n. Hence, LLVM cannot swap operands of icmp accordingly. llvm-svn: 250746	2015-10-19 22:08:14 +00:00
Duncan P. N. Exon Smith	6b92a14a28	Vectorize: Remove implicit ilist iterator conversions, NFC Besides the usual, I finally added an overload to `BasicBlock::splitBasicBlock()` that accepts an `Instruction*` instead of `BasicBlock::iterator`. Someone can go back and remove this overload later (after updating the callers I'm going to skip going forward), but the most common call seems to be `BB->splitBasicBlock(BB->getTerminator(), ...)` and I'm not sure it's better to add `->getIterator()` to every one than have the overload. It's pretty hard to get the usage wrong. llvm-svn: 250745	2015-10-19 22:06:09 +00:00
Sanjay Patel	69a50a1e17	[CGP] transform select instructions into branches and sink expensive operands This was originally checked in at r250527, but reverted at r250570 because of PR25222. There were at least 2 problems: 1. The cost check was checking for an instruction with an exact cost of TCC_Expensive; that should have been >=. 2. The cause of the clang stage 1 failures was illegally sinking 'call' instructions; we can't sink instructions that may have side effects / are not safe to execute speculatively. Fixed those conditions in sinkSelectOperand() and added test cases. Original commit message: This is a follow-up to the discussion in D12882. Ideally, we would like SimplifyCFG to be able to form select instructions even when the operands are expensive (as defined by the TTI cost model) because that may expose further optimizations. However, we would then like a later pass like CodeGenPrepare to undo that transformation if the target would likely benefit from not speculatively executing an expensive op (this patch). Once we have this safety mechanism in place, we can adjust SimplifyCFG to restore its select-formation behavior that changed with r248439. Differential Revision: http://reviews.llvm.org/D13297 llvm-svn: 250743	2015-10-19 21:59:12 +00:00
Duncan P. N. Exon Smith	d77de6495e	X86: Remove implicit ilist iterator conversions, NFC llvm-svn: 250741	2015-10-19 21:48:29 +00:00
Lang Hames	f1381cb8d0	[RuntimeDyld][COFF] Fix some endianness issues, re-enable the regression test. llvm-svn: 250733	2015-10-19 20:37:52 +00:00
Owen Anderson	faf5187ee0	Restore the original behavior of SelectionDAG::getTargetIndex(). It looks like an extra negation snuck in as apart of restoring it. llvm-svn: 250726	2015-10-19 19:27:40 +00:00
Krzysztof Parzyszek	055c5fd74e	[Hexagon] Remove unnecessary argument sign extends llvm-svn: 250724	2015-10-19 19:10:48 +00:00
Teresa Johnson	3da931f87a	Pass FunctionInfoIndex by reference to WriteFunctionSummaryToFile (NFC) Implemented suggestion by dblakie in review for r250704. llvm-svn: 250723	2015-10-19 19:06:06 +00:00
Benjamin Kramer	755e502952	Add missing override noticed by Clang's -Winconsistent-missing-override. llvm-svn: 250720	2015-10-19 18:41:23 +00:00
Jun Bum Lim	d3548303ec	[AArch64]Merge halfword loads into a 32-bit load Convert two halfword loads into a single 32-bit word load with bitfield extract instructions. For example : ldrh w0, [x2] ldrh w1, [x2, #2] becomes ldr w0, [x2] ubfx w1, w0, #16, #16 and w0, w0, #ffff llvm-svn: 250719	2015-10-19 18:34:53 +00:00
Krzysztof Parzyszek	23920ec95d	[Hexagon] Fix debug information for local objects - Isolate the check for the existence of a stack frame into hasFP. - Implement getFrameIndexReference for DWARF address computation. - Use getFrameIndexReference for offset computation in eliminateFrameIndex. - Preserve debug information for dynamically allocated stack objects. - Prefer FP to access local objects at -O0. - Add experimental code to skip allocframe when not strictly necessary (disabled by default). llvm-svn: 250718	2015-10-19 18:30:27 +00:00
Benjamin Kramer	2002aadaad	Put back SelectionDAG::getTargetIndex. While technically this is untested dead code, it has out-of-tree users. This reverts a part of r250434. llvm-svn: 250717	2015-10-19 18:26:16 +00:00
Krzysztof Parzyszek	db8677067c	[Hexagon] Delay emission of CFI instructions Emit the CFI instructions after all code transformation have been done. This will avoid any interference between CFI instructions and packetization. llvm-svn: 250714	2015-10-19 17:46:01 +00:00
Matthias Braun	e734195ce3	Revert "RegisterPressure: allocatable physreg uses are always kills" This reverts commit r250596. Reverted for now as the commit triggers assert in the AMDGPU target pending investigation. llvm-svn: 250713	2015-10-19 17:44:22 +00:00
Lang Hames	98c2ac13d3	[Orc] Add support for emitting indirect stubs directly into the JIT target's memory, rather than representing the stubs in IR. Update the CompileOnDemand layer to use this functionality. Directly emitting stubs is much cheaper than building them in IR and codegen'ing them (see below). It also plays well with remote JITing - stubs can be emitted directly in the target process, rather than having to send them over the wire. The downsides are: (1) Care must be taken when resolving symbols, as stub symbols are held in a separate symbol table. This is only a problem for layer writers and other people using this API directly. The CompileOnDemand layer hides this detail. (2) Aliases of function stubs can't be symbolic any more (since there's no symbol definition in IR), but must be converted into a constant pointer expression. This means that modules containing aliases of stubs cannot be cached. In practice this is unlikely to be a problem: There's no benefit to caching such a module anyway. On balance I think the extra performance is more than worth the trade-offs: In a simple stress test with 10000 dummy functions requiring stubs and a single executed "hello world" main function, directly emitting stubs reduced user time for JITing / executing by over 90% (1.5s for IR stubs vs 0.1s for direct emission). llvm-svn: 250712	2015-10-19 17:43:51 +00:00
Benjamin Kramer	335332329b	Remove CRLF newlines. NFC. llvm-svn: 250698	2015-10-19 13:05:25 +00:00
Asiri Rathnayake	1040a53be3	Fix mapping of @llvm.arm.ssat/usat intrinsics to ssat/usat instructions The mapping of these two intrinsics in ARMInstrInfo.td had a small omission which lead to their operands not being validated/transformed before being lowered into usat and ssat instructions. This can cause incorrect instructions to be emitted. I've also added tests for the remaining two saturating arithmatic intrinsics @llvm.arm.qadd and @llvm.arm.qsub as they are missing codegen tests. llvm-svn: 250697	2015-10-19 11:44:24 +00:00
James Molloy	17379c4ea1	[GlobalsAA] Fix a really horrible iterator invalidation bug We were keeping a reference to an object in a DenseMap then mutating it. At the end of the function we were attempting to clone that reference into other keys in the DenseMap, but DenseMap may well decide to resize its hashtable which would invalidate the reference! It took an extremely complex testcase to catch this - many thanks to Zhendong Su for catching it in PR25225. This fixes PR25225. llvm-svn: 250692	2015-10-19 08:54:59 +00:00
Elena Demikhovsky	20662e39f1	Removed parameter "Consecutive" from isLegalMaskedLoad() / isLegalMaskedStore(). Originally I planned to use the same interface for masked gather/scatter and set isConsecutive to "false" in this case. Now I'm implementing masked gather/scatter and see that the interface is inconvenient. I want to add interfaces isLegalMaskedGather() / isLegalMaskedScatter() instead of using the "Consecutive" parameter in the existing interfaces. Differential Revision: http://reviews.llvm.org/D13850 llvm-svn: 250686	2015-10-19 07:43:38 +00:00
Zlatko Buljan	5292083584	[mips][microMIPS] Implement ADDQ.PH, ADDQ_S.W, ADDQH.PH, ADDQH.W, ADDSC, ADDU.PH, ADDU_S.QB, ADDWC and ADDUH.QB instructions Differential Revision: http://reviews.llvm.org/D13130 llvm-svn: 250685	2015-10-19 07:16:26 +00:00
Zlatko Buljan	d0a7d6e4ee	[mips][microMIPS] Implement ABSQ.QB, ABSQ_S.PH, ABSQ_S.W, ABSQ_S.QB, INSV, MADD, MADDU, MSUB, MSUBU, MULT and MULTU instructions Differential Revision: http://reviews.llvm.org/D13721 llvm-svn: 250683	2015-10-19 06:34:44 +00:00
Xinliang David Li	aa0592cc70	[PGO] Eliminate prof data register calls on FreeBSD platform This is a follow up patch of r250199 after verifying the start/stop section symbols work as spected on FreeBSD. llvm-svn: 250679	2015-10-19 04:17:10 +00:00
Jakub Staszak	f12821a43c	Preserve CFG in MergedLoadStoreMotion. This fixes PR24426. llvm-svn: 250660	2015-10-18 19:34:10 +00:00
Simon Pilgrim	04d52d26f6	Use SDValue bool check. NFCI. llvm-svn: 250653	2015-10-18 12:33:54 +00:00
Simon Pilgrim	c2c154e078	Move one-use variable inside test. NFC. llvm-svn: 250651	2015-10-18 11:47:23 +00:00
Asaf Badouh	696e8e0bb7	[X86][AVX512DQ] add scalar fpclass Differential Revision: http://reviews.llvm.org/D13769 llvm-svn: 250650	2015-10-18 11:04:38 +00:00
Igor Breger	cbb9550537	AVX512: Lowering i8/i16 vector CTLZ using the dword LZCNT vector instruction Differential Revision: http://reviews.llvm.org/D13632 llvm-svn: 250649	2015-10-18 09:56:39 +00:00
Craig Topper	92cfdd70f8	[Sparc] Use MCPhysReg instead of unsigned to size static arrays of registers. Should reduce the table size. llvm-svn: 250644	2015-10-18 05:29:05 +00:00
Craig Topper	1d37443718	Use array_lengthof. NFC llvm-svn: 250643	2015-10-18 05:15:38 +00:00
Craig Topper	2626094fa1	Make a bunch of static arrays const. llvm-svn: 250642	2015-10-18 05:15:34 +00:00
Lang Hames	a32d71be4c	[RuntimeDyld] Add support for absolute symbols. llvm-svn: 250639	2015-10-18 01:41:37 +00:00
Xinliang David Li	dab183ed40	Minor Instr PGO code restructuring 1. Key constant values (version, magic) and data structures related to raw and indexed profile format are moved into one centralized file: InstrProf.h. 2. Utility function such as MD5Hash computation is also moved to the common header to allow sharing with other components in the future. 3. A header data structure is introduced for Indexed format so that the reader and writer can always be in sync. 4. Added some comments to document different places where multiple definition of the data structure must be kept in sync (reader/writer, runtime, lowering etc). No functional change is intended. Differential Revision: http://reviews.llvm.org/D13758 llvm-svn: 250638	2015-10-18 01:02:29 +00:00
Sanjoy Das	d295f2c7ca	[SCEV] Fix whitespace issues and remove extra braces; NFC llvm-svn: 250636	2015-10-18 00:29:27 +00:00
Sanjoy Das	f07d2a7143	[SCEV] Use std::all_of and std::any_of; NFC llvm-svn: 250635	2015-10-18 00:29:23 +00:00
Sanjoy Das	6391459069	[SCEV] Use auto where it helps remove line breaks; NFC llvm-svn: 250634	2015-10-18 00:29:20 +00:00
Sanjoy Das	d9f6d33a7f	[SCEV] Use range for loops; NFC llvm-svn: 250633	2015-10-18 00:29:16 +00:00
Craig Topper	ec15ea12e7	Use std::find instead of manual loop. llvm-svn: 250624	2015-10-17 21:32:28 +00:00
Craig Topper	a833451173	Use std::is_sorted to replace a custom version. Also replace a comparison predicate struct with a lambda. llvm-svn: 250623	2015-10-17 21:32:26 +00:00
Simon Pilgrim	86c5e85e84	[X86][XOP] Add VPROT instruction opcodes Added X86ISD opcodes for VPROT vector rotate by variable and by immediate. llvm-svn: 250620	2015-10-17 19:04:24 +00:00
Craig Topper	a2d0635098	Remove unnecessary 'const' pointed out by David Blaikie. llvm-svn: 250619	2015-10-17 18:22:46 +00:00
Simon Pilgrim	24057b9566	[DAG] Ensure vector constant folding uses correct scalar undef types Minor fix to D13665 found during post-commit review. llvm-svn: 250616	2015-10-17 16:49:43 +00:00
Craig Topper	9ff9bf4959	Replace a custom table sort check with std::is_sorted. Change a function to take ArrayRef instead of pointer and length. NFC llvm-svn: 250615	2015-10-17 16:37:13 +00:00
Craig Topper	c177d9edb3	Use std::begin/end and std::is_sorted to simplify some code. NFC llvm-svn: 250614	2015-10-17 16:37:11 +00:00
Simon Pilgrim	a18ae9bd70	[CostModel] Fixed AVX integer shift costs Targets with AVX but without AVX2 were incorrectly reporting costs of 256-bit integer shifts. llvm-svn: 250611	2015-10-17 13:23:38 +00:00
Simon Pilgrim	5b65f28fe7	[X86][FastISel] Teach how to select SSE4A nontemporal stores. Add FastISel support for SSE4A scalar float / double non-temporal stores Follow up to D13698 Differential Revision: http://reviews.llvm.org/D13773 llvm-svn: 250610	2015-10-17 13:04:42 +00:00
Simon Pilgrim	216b1bf5ed	[InstCombine] SSE4A constant folding and conversion to shuffles. This patch improves support for combining the SSE4A EXTRQ(I) and INSERTQ(I) intrinsics: 1 - Converts INSERTQ/EXTRQ calls to INSERTQI/EXTRQI if the 'bit index' and 'length' operands are constant 2 - Converts INSERTQI/EXTRQI calls to shufflevector if the bit index/length are both byte aligned (we can already lower shuffles to INSERTQI/EXTRQI if its useful) 3 - Constant folding support 4 - Add zeroinitializer handling Differential Revision: http://reviews.llvm.org/D13348 llvm-svn: 250609	2015-10-17 11:40:05 +00:00
Kostya Serebryany	fed509e73d	[libFuzzer] add -shuffle flag llvm-svn: 250603	2015-10-17 04:38:26 +00:00
Colin LeMahieu	7c9587136d	[Hexagon] Adding skeleton of HVX extension instructions. llvm-svn: 250600	2015-10-17 01:33:04 +00:00
Matthias Braun	65e6d4a3f8	RegisterPressure: Unify the sparse sets in LiveRegsSet; NFC Also do some cleanups comment improvements. llvm-svn: 250598	2015-10-17 01:03:44 +00:00
Matthias Braun	cdd2792aa6	RegisterPressure: allocatable physreg uses are always kills This property was already used in the code path when no liveness intervals are present. Unfortunately the code path that uses liveness intervals tried to query a cached live interval for an allocatable physreg, those are usually not computed so a conservative default was used. This doesn't affect any of the lit testcases. This is a foreclosure to upcoming changes which should be NFC but without this patch this tidbit wouldn't be NFC. llvm-svn: 250596	2015-10-17 00:46:57 +00:00
Matthias Braun	5105e05e8f	RegisterPressure: Remove 0 entries from PressureChange This should not change behaviour because as far as I can see all code reading the pressure changes has no effect if the PressureInc is 0. Removing these entries however does avoid unnecessary computation, and results in a more stable debug output. I want the stable debug output to check that some upcoming changes are indeed NFC and identical even at the debug output level. llvm-svn: 250595	2015-10-17 00:35:59 +00:00
JF Bastien	3428ed4f53	WebAssembly: don't omit dead vregs from locals Summary: This is a temporary hack until we get around to remapping the vreg numbers to local numbers. Dead vregs cause bad numbering and make consumers sad. We could also just look at debug info an use named locals instead, but vregs have to work properly anyways so there! Reviewers: binji, sunfish Subscribers: jfb, llvm-commits, dschuff Differential Revision: http://reviews.llvm.org/D13839 llvm-svn: 250594	2015-10-17 00:25:38 +00:00
JF Bastien	4f43e80ece	WebAssembly: fix the syntax for comparisons Summary: It has also slightly changed. Reviewers: binji Subscribers: jfb, dschuff, llvm-commits, sunfish Differential Revision: http://reviews.llvm.org/D13837 llvm-svn: 250591	2015-10-17 00:12:29 +00:00
Matthias Braun	96e411b90c	RegisterPressure: Hide non-const iterators of PressureDiff It is too easy to accidentally violate the ordering requirements when modifying the PressureDiff entries through iterators. llvm-svn: 250590	2015-10-17 00:08:48 +00:00
Joseph Tremoulet	55b51e9dcc	[WinEH] Fix eh.exceptionpointer intrinsic lowering Summary: Some shared code for handling eh.exceptionpointer and eh.exceptioncode needs to not share the part that truncates to 32 bits, which is intended just for exception codes. Reviewers: rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13747 llvm-svn: 250588	2015-10-17 00:08:08 +00:00
Reid Kleckner	28e490342b	[WinEH] Fix stack alignment in funclets and ParentFrameOffset calculation Our previous value of "16 + 8 + MaxCallFrameSize" for ParentFrameOffset is incorrect when CSRs are involved. We were supposed to have a test case to catch this, but it wasn't very rigorous. The main effect here is that calling _CxxThrowException inside a catchpad doesn't immediately crash on MOVAPS when you have an odd number of CSRs. llvm-svn: 250583	2015-10-16 23:43:27 +00:00
Matthias Braun	fdee8ec2bd	RegisterPressure: Use range based for, cleanup llvm-svn: 250579	2015-10-16 23:25:09 +00:00
Kostya Serebryany	d6edce97fb	[libFuzzer] print a stack trace on timeout llvm-svn: 250571	2015-10-16 23:04:31 +00:00
Benjamin Kramer	b43d33bf0f	Revert "This is a follow-up to the discussion in D12882." Breaks clang selfhost, see PR25222. This reverts commits r250527 and r250528. llvm-svn: 250570	2015-10-16 23:00:29 +00:00
Kostya Serebryany	a9da9b48ef	[libFuzzer] reduce the size of artifacts printed on the screen llvm-svn: 250565	2015-10-16 22:47:20 +00:00
Kostya Serebryany	b91c62b1f3	[libFuzzer] When -test_single_input crashes the test it is not necessary to write crash-file because input is already known to the user. Patch by Mike Aizatsky llvm-svn: 250564	2015-10-16 22:41:47 +00:00
Sanjay Patel	bbd524496c	[x86] promote 'add nsw' to a wider type to allow more combines The motivation for this patch starts with PR20134: https://llvm.org/bugs/show_bug.cgi?id=20134 void foo(int *a, int i) { a[i] = a[i+1] + a[i+2]; } It seems better to produce this (14 bytes): movslq %esi, %rsi movl 0x4(%rdi,%rsi,4), %eax addl 0x8(%rdi,%rsi,4), %eax movl %eax, (%rdi,%rsi,4) Rather than this (22 bytes): leal 0x1(%rsi), %eax cltq leal 0x2(%rsi), %ecx movslq %ecx, %rcx movl (%rdi,%rcx,4), %ecx addl (%rdi,%rax,4), %ecx movslq %esi, %rax movl %ecx, (%rdi,%rax,4) The most basic problem (the first test case in the patch combines constants) should also be fixed in InstCombine, but it gets more complicated after that because we need to consider architecture and micro-architecture. For example, AArch64 may not see any benefit from the more general transform because the ISA solves the sexting in hardware. Some x86 chips may not want to replace 2 ADD insts with 1 LEA, and there's an attribute for that: FeatureSlowLEA. But I suspect that doesn't go far enough or maybe it's not getting used when it should; I'm also not sure if FeatureSlowLEA should also mean "slow complex addressing mode". I see no perf differences on test-suite with this change running on AMD Jaguar, but I see small code size improvements when building clang and the LLVM tools with the patched compiler. A more general solution to the sext(add nsw(x, C)) problem that works for multiple targets is available in CodeGenPrepare, but it may take quite a bit more work to get that to fire on all of the test cases that this patch takes care of. Differential Revision: http://reviews.llvm.org/D13757 llvm-svn: 250560	2015-10-16 22:14:12 +00:00
Jim Grosbach	0fdd572763	MC: Don't crash after issuing a diagnostic. Crashing is bad, m'kay? Fixing a 4 year old bug of my own creation. Adding the testcase now which I should have added then which would have long since caught this. The problem is that printMessage() will display the diagnostic but not set HadError to true, resulting in the assembler continuing on its way and trying to create relocations for things that may not allow them or otherwise get itself into trouble. Using the Error() helper function here rather than calling printMessage() directly resolves this. rdar://23133240 llvm-svn: 250557	2015-10-16 22:07:59 +00:00
Joseph Tremoulet	d11a998e81	[WinEH] Fix CatchRetSuccessorColorMap accounting Summary: We now use the block for the catchpad itself, rather than its normal successor, as the funclet entry. Putting the normal successor in the map leads downstream funclet membership computations to erroneous results. Reviewers: majnemer, rnk Subscribers: rnk, llvm-commits Differential Revision: http://reviews.llvm.org/D13798 llvm-svn: 250552	2015-10-16 21:22:54 +00:00
Andrew Kaylor	09b39acc03	Fix assertion failure with fp128 to unsigned i64 conversion Patch by Mitch Bodart Differential Revision: http://reviews.llvm.org/D13780 llvm-svn: 250550	2015-10-16 20:39:20 +00:00
Krzysztof Parzyszek	a7c5f0409c	[Hexagon] Split double registers llvm-svn: 250549	2015-10-16 20:38:54 +00:00
David Majnemer	e696583dba	[WinEH] Remove dead code/includes from WinEHPrepare No functionality change is intended. llvm-svn: 250545	2015-10-16 19:59:52 +00:00
Krzysztof Parzyszek	aec39c68ae	[Hexagon] Delete lib/Target/Hexagon/HexagonRemoveSZExtArgs.cpp llvm-svn: 250543	2015-10-16 19:51:53 +00:00
Krzysztof Parzyszek	5b7dd0cdf9	[Hexagon] Merge adjacent stores llvm-svn: 250542	2015-10-16 19:43:56 +00:00
Diego Novillo	b93483dbce	Sample profiles - Re-arrange binary format to emit head samples only on top functions. The number of samples collected at the head of a function only make sense for top-level functions (i.e., those actually called as opposed to being inlined inside another). Head samples essentially count the time spent inside the function's prologue. This clearly doesn't make sense for inlined functions, so we were always emitting 0 in those. llvm-svn: 250539	2015-10-16 18:54:35 +00:00
JF Bastien	6126d2b883	WebAssembly: fix load/store syntax Summary: The syntax has changed a bit recently. Reviewers: binji Subscribers: llvm-commits, jfb, sunfish, dschuff Differential Revision: http://reviews.llvm.org/D13821 llvm-svn: 250535	2015-10-16 18:24:42 +00:00
Joseph Tremoulet	53e9cbd95a	[WinEH] Fix endpad coloring/numbering Summary: When a cleanup's cleanupendpad or cleanupret targets a catchendpad, stop trying to propagate the cleanup's parent's color to the catchendpad, since what's needed is the cleanup's grandparent's color and the catchendpad will get that color from the catchpad linkage already. We already had this exclusion for invokes, but were missing it for cleanupendpad/cleanupret. Also add a missing line that tags cleanupendpads' states in the EHPadStateMap, without with lowering invokes that target cleanupendpads which unwind to other handlers (and so don't have the -1 state) will fail. This fixes the reduced IR repro in PR25163. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13797 llvm-svn: 250534	2015-10-16 18:08:16 +00:00
Sanjay Patel	374dd8d88e	This is a follow-up to the discussion in D12882. Ideally, we would like SimplifyCFG to be able to form select instructions even when the operands are expensive (as defined by the TTI cost model) because that may expose further optimizations. However, we would then like a later pass like CodeGenPrepare to undo that transformation if the target would likely benefit from not speculatively executing an expensive op (this patch). Once we have this safety mechanism in place, we can adjust SimplifyCFG to restore its select-formation behavior that changed with r248439. Differential Revision: http://reviews.llvm.org/D13297 llvm-svn: 250527	2015-10-16 16:54:30 +00:00
JF Bastien	53bd975033	WebAssembly: relooper analysis pass Summary: Make the relooper an analysis pass, to convert CFG to AST. Reviewers: sunfish Subscribers: jfb, dschuff Differential Revision: http://reviews.llvm.org/D12744 llvm-svn: 250524	2015-10-16 16:35:49 +00:00
Charlie Turner	434d4599d4	[AArch64] Implement vector splitting on UADDV. Summary: Fixes PR25056. Reviewers: mcrosier, junbuml, jmolloy Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D13466 llvm-svn: 250520	2015-10-16 15:38:25 +00:00
Hrvoje Varga	3c88fbd367	[mips][microMIPS] Implement LB, LBE, LBU and LBUE instructions Differential Revision: http://reviews.llvm.org/D11633 llvm-svn: 250511	2015-10-16 12:24:58 +00:00
Pawel Bylica	7187e4bba9	Use Windows Vista API to get the user's home directory Summary: This patch replaces usage of deprecated SHGetFolderPathW with SHGetKnownFolderPath. The usage of SHGetKnownFolderPath is wrapped to allow queries for other "known" folders in the near future. Reviewers: aaron.ballman, gbedwell Subscribers: chapuni, llvm-commits Differential Revision: http://reviews.llvm.org/D13753 llvm-svn: 250501	2015-10-16 09:08:59 +00:00
Craig Topper	09b6598572	[X86] Add fxsr feature flag for fxsave/fxrestore instructions. llvm-svn: 250497	2015-10-16 06:03:09 +00:00
Dylan McKay	b1d469c657	Initial migration of AVR backend This patch adds the underlying infrastructure for an AVR backend to be included into LLVM. It is the first of a series of patches aimed at moving the out-of-tree AVR backend into the tree. It consists of adding a new`Triple` target 'avr'. llvm-svn: 250492	2015-10-16 03:10:30 +00:00
Sanjoy Das	58fae7cf6b	[RS4GC] Dont' propagate call attrs related to patchable statepoints The `"statepoint-id"` and `"statepoint-num-patch-bytes"` attributes are used solely to determine properties of the `gc.statepoint` being created. Once the `gc.statepoint` is in place, these should be removed. llvm-svn: 250491	2015-10-16 02:41:23 +00:00
Sanjoy Das	810a59d037	[RS4GC] Bring legalizeCallAttributes up to LLVM coding style; NFC llvm-svn: 250490	2015-10-16 02:41:11 +00:00
Sanjoy Das	25ec1a3e60	[RS4GC] Use "deopt" operand bundles Summary: This is a step towards using operand bundles to carry deopt state till RewriteStatepointsForGC. The change adds a flag to RewriteStatepointsForGC that teaches it to pick up deopt state from a `"deopt"` operand bundle attached to the `call` or `invoke` it is wrapping. The command line flag added, `-rs4gc-use-deopt-bundles`, will only exist for a short while. Once we are able to pipe deopt bundle state through the full optimization pipeline without problems, we will "constant fold" `-rs4gc-use-deopt-bundles` to `true`. Reviewers: swaroop.sridhar, reames Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D13372 llvm-svn: 250489	2015-10-16 02:41:00 +00:00
Sanjoy Das	7360f30852	[IndVars] Rename getExtend; NFC Rename `IndVarSimplify::getExtend` to `IndVarSimplify::createExtendInst` to make it obvious that it creates `llvm::Instruction` s. llvm-svn: 250484	2015-10-16 01:00:50 +00:00
Sanjoy Das	37e87c2023	[IndVars] Have `cloneArithmeticIVUser` guess better Summary: `cloneArithmeticIVUser` currently trips over expression like `add %iv, -1` when `%iv` is being zero extended -- it tries to construct the widened use as `add %iv.zext, zext(-1)` and (correctly) fails to prove equivalence to `zext(add %iv, -1)` (here the SCEV for `%iv` is `{1,+,1}`). This change teaches `IndVars` to try sign extending the non-IV operand if that makes the newly constructed IV use equivalent to the widened narrow IV use. Reviewers: atrick, hfinkel, reames Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D13717 llvm-svn: 250483	2015-10-16 01:00:47 +00:00
Sanjoy Das	472840a3d3	[IndVars] Extract out a few local variables; NFC llvm-svn: 250482	2015-10-16 01:00:44 +00:00
Sanjoy Das	1fd184e5a2	[IndVars] Split `WidenIV::cloneIVUser`; NFC Summary: This NFC splitting is intended to make a later diff easier to follow. It just tail duplicates `cloneIVUser` into `cloneArithmeticIVUser` and `cloneBitwiseIVUser`. Reviewers: atrick, hfinkel, reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13716 llvm-svn: 250481	2015-10-16 01:00:39 +00:00
JF Bastien	1d20a5e9e8	WebAssembly: update syntax Summary: Follow the same syntax as for the spec repo. Both have evolved slightly independently and need to converge again. This, along with wasmate changes, allows me to do the following: echo "int add(int a, int b) { return a + b; }" > add.c ./out/bin/clang -O2 -S --target=wasm32-unknown-unknown add.c -o add.wack ./experimental/prototype-wasmate/wasmate.py add.wack > add.wast ./sexpr-wasm-prototype/out/sexpr-wasm add.wast -o add.wasm ./sexpr-wasm-prototype/third_party/v8-native-prototype/v8/v8/out/Release/d8 -e "print(WASM.instantiateModule(readbuffer('add.wasm'), {print:print}).add(42, 1337));" As you'd expect, the d8 shell prints out the right value. Reviewers: sunfish Subscribers: jfb, llvm-commits, dschuff Differential Revision: http://reviews.llvm.org/D13712 llvm-svn: 250480	2015-10-16 00:53:49 +00:00
Evgeniy Stepanov	9addbc9fc1	Revert "[safestack] Fast access to the unsafe stack pointer on AArch64/Android." Breaks the hexagon buildbot. llvm-svn: 250461	2015-10-15 21:26:49 +00:00
Adrian Prantl	96b1551d53	Replace a forward declaration with an #include. When building with modules the forward-declared inner class DebugLocStream::ListBuilder causes clang to fall over. llvm-svn: 250459	2015-10-15 20:58:55 +00:00
Evgeniy Stepanov	142947e9f0	[safestack] Fast access to the unsafe stack pointer on AArch64/Android. Android libc provides a fixed TLS slot for the unsafe stack pointer, and this change implements direct access to that slot on AArch64 via __builtin_thread_pointer() + offset. This change also moves more code into TargetLowering and its target-specific subclasses to get rid of target-specific codegen in SafeStackPass. This change does not touch the ARM backend because ARM lowers builting_thread_pointer as aeabi_read_tp, which is not available on Android. llvm-svn: 250456	2015-10-15 20:50:16 +00:00
Adrian Prantl	d8596384e7	Add a missing include of cstddef needed for size_t. llvm-svn: 250446	2015-10-15 19:41:54 +00:00
JF Bastien	2cdd5e4710	x86: preserve flags when folding atomic operations D4796 taught LLVM to fold some atomic integer operations into a single instruction. The pattern was unaware that the instructions clobbered flags. I fixed some of this issue in D13680 but had missed INC/DEC. This patch adds the missing EFLAGS definition. llvm-svn: 250438	2015-10-15 18:24:52 +00:00
Benjamin Kramer	bacc7ba7aa	[SelectionDAG] Remove dead code. NFC. Carefully selected parts without deleting graph stuff and dumping methods. llvm-svn: 250434	2015-10-15 17:54:06 +00:00
Benjamin Kramer	7fa42c8a8c	[AsmPrinter] Prune dead code. NFC. I left all (dead) print and dump methods in place. llvm-svn: 250433	2015-10-15 17:16:32 +00:00
Philip Reames	a956cc7f08	Revert 250343 and 250344 Turns out this approach is buggy. In discussion about follow on work, Sanjoy pointed out that we could be subject to circular logic problems. Consider: if (i u< L) leave() if ((i + 1) u< L) leave() print(a[i] + a[i+1]) If we know that L is less than UINT_MAX, we could possible prove (in a control dependent way) that i + 1 does not overflow. This gives us: if (i u< L) leave() if ((i +nuw 1) u< L) leave() print(a[i] + a[i+1]) If we now do the transform this patch proposed, we end up with: if ((i +nuw 1) u< L) leave_appropriately() print(a[i] + a[i+1]) That would be a miscompile when i==-1. The problem here is that the control dependent nuw bits got used to prove something about the first condition. That's obviously invalid. This won't happen today, but since I plan to enhance LVI/CVP with exactly that transform at some point in the not too distant future... llvm-svn: 250430	2015-10-15 16:51:00 +00:00
JF Bastien	5b327712b0	x86 FP atomic codegen: don't drop globals, stack Summary: x86 codegen is clever about generating good code for relaxed floating-point operations, but it was being silly when globals and immediates were involved, forgetting where the global was and loading/storing from/to the wrong place. The same applied to hard-coded address immediates. Don't let it forget about the displacement. This fixes https://llvm.org/bugs/show_bug.cgi?id=25171 A very similar bug when doing floating-points atomics to the stack is also fixed by this patch. This fixes https://llvm.org/bugs/show_bug.cgi?id=25144 Reviewers: pete Subscribers: llvm-commits, majnemer, rsmith Differential Revision: http://reviews.llvm.org/D13749 llvm-svn: 250429	2015-10-15 16:46:29 +00:00
Diego Novillo	38be33302c	Sample Profiles - Adjust integer types. Mostly NFC. This adjusts all integers in the reader/writer to reflect the types stored on profile files. They should all be unsigned 32-bit or 64-bit values. Changed all associated internal types to be uint32_t or uint64_t. The only place that needed some adjustments is in the sample profile transformation. Altough the weight read from the profile are 64-bit values, the internal API for branch weights only accepts 32-bit values. The pass now saturates weights that overflow uint32_t. llvm-svn: 250427	2015-10-15 16:36:21 +00:00
Tim Northover	0515291c52	Prevent assertion with "llc -debug" and anonymous symbols. llvm-svn: 250425	2015-10-15 16:18:27 +00:00
Benjamin Kramer	6db3338cb1	[ScalarOpts] Remove dead code. Does not touch debug dumpers. NFC. llvm-svn: 250417	2015-10-15 15:08:58 +00:00
Manman Ren	72d44b1b09	Recommit r250345, it was reverted in r250366 to investigate a bot failure. Our internal bot is still red after r250366. llvm-svn: 250415	2015-10-15 14:59:40 +00:00
Daniel Sanders	6394ee598e	[mips][ias] Implement ulh macro. Summary: This macro is needed to prevent test/CodeGen/Mips/2008-08-01-AsmInline.ll from failing after the integrated assembler is enabled by default. Reviewers: vkalintiris Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D13654 llvm-svn: 250414	2015-10-15 14:52:58 +00:00
Pawel Bylica	6b129bd464	Require Windows API of version 6.1 (Windows 7). llvm-svn: 250413	2015-10-15 14:50:31 +00:00
Benjamin Kramer	c5275bdec1	[NVPTX] Remove dead code. I left helpers that look useful for debugging alone. NFC. llvm-svn: 250410	2015-10-15 14:45:41 +00:00
Daniel Sanders	8008de5551	[mips][mips16] MIPS16 is not a CPU/Architecture but is an ASE. Summary: The -mcpu=mips16 option caused the Integrated Assembler to crash because it couldn't figure out the architecture revision number to write to the .MIPS.abiflags section. This CPU definition has been removed because, like microMIPS, MIPS16 is an ASE to a base architecture. Reviewers: vkalintiris Subscribers: rkotler, llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D13656 llvm-svn: 250407	2015-10-15 14:34:23 +00:00
Benjamin Kramer	5dfcda73d5	[X86] Rip out orphaned method declarations and other dead code. NFC. llvm-svn: 250406	2015-10-15 14:09:59 +00:00
Aaron Ballman	58f413c518	Silencing a -Wtype-limits warning; an unsigned value will always be >= 0; NFC. llvm-svn: 250404	2015-10-15 13:55:43 +00:00
Igor Breger	d7bae451de	AVX512: Implemented DAG lowering for shuff62x2/shufi62x2 instructions ( shuffle packed values at 128-bit granularity ) Differential Revision: http://reviews.llvm.org/D13648 llvm-svn: 250400	2015-10-15 13:29:07 +00:00
Igor Breger	b4bb190eed	AVX512: Implemented encoding and intrinsics for vpternlogd/q. Differential Revision: http://reviews.llvm.org/D13768 llvm-svn: 250396	2015-10-15 12:33:24 +00:00
Elena Demikhovsky	ecff21b297	AVX-512: Fixed a bug in shuffle lowering 32-bit mode AVX-512 bit shuffle fails on 32 bit since we create a vector of 64-bit constants. I split 8x64-bit const vector to 16x32 on 32-bit mode. Differential Revision: http://reviews.llvm.org/D13644 llvm-svn: 250390	2015-10-15 11:35:33 +00:00
Artyom Skrobov	63471330d2	Don't pretend AMDGPU backend knows how to custom-lower UDIVREM for vector types; it can't Reviewers: arsenm, jvesely, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13734 llvm-svn: 250384	2015-10-15 09:18:47 +00:00
Zlatko Buljan	54b1eb4c73	[mips][microMIPS] Implement DPA.W.PH, DPAQ_S.W.PH, DPAQ_SA.L.W, DPAQX_S.W.PH, DPAQX_SA.W.PH, DPAU.H.QBL, DPAU.H.QBR and DPAX.W.PH instructions Differential Revision: http://reviews.llvm.org/D13376 llvm-svn: 250382	2015-10-15 08:59:45 +00:00
Hrvoje Varga	3a3c4b8a39	[mips][microMIPS] Implement BREAK16, LI16, MOVE16, SDBBP16, SUBU16 and XOR16 instructions Differential Revision: http://reviews.llvm.org/D11292#inline-103143 llvm-svn: 250381	2015-10-15 08:39:07 +00:00
Hrvoje Varga	3ef4dd7bc8	[mips][microMIPS] Implement LLE and SCE instructions Differential Revision: http://reviews.llvm.org/D11630 llvm-svn: 250379	2015-10-15 08:11:50 +00:00
Hrvoje Varga	a766eff5a0	[mips][microMIPS] Implement LWLE, LWRE, SWLE and SWRE instructions Differential Revision: http://reviews.llvm.org/D11631 llvm-svn: 250377	2015-10-15 07:23:06 +00:00
Eric Christopher	bdafb3cd1c	Remove DIFile from createSubroutineType. Patch by Amaury Sechet with a small modification by me. llvm-svn: 250374	2015-10-15 06:56:10 +00:00
Lang Hames	86a4593dd2	[RuntimeDyld] Don't try to get the contents of sections that don't have any (e.g. bss sections). MachO and ELF have been silently letting this pass, but COFFObjectFile contains an assertion to catch this kind of (ab)use of the getSectionContents, and this was causing the JIT to crash on COFF objects with BSS sections. This patch should fix that. llvm-svn: 250371	2015-10-15 06:41:45 +00:00
Akira Hatanaka	8ad7399f8e	[MachO] Stop generating coal sections. Recommit r250342: move coal-sections-powerpc.s to subdirectory for powerpc. Some background on why we don't have to use coal sections anymore: Long ago when C++ was new and "weak" had not been standardized, an attempt was made in cctools to support C++ inlines that can be coalesced by putting them into their own section (TEXT/textcoal_nt instead of TEXT/text). The current macho linker supports the weak-def bit on any symbol to allow it to be coalesced, but the compiler still puts weak-def functions/data into alternate section names, which the linker must map back to the base section name. This patch makes changes that are necessary to prevent the compiler from using the "coal" sections and have it use the non-coal sections instead when the target architecture is not powerpc: TEXT/textcoal_nt instead use TEXT/text TEXT/const_coal instead use TEXT/const DATA/datacoal_nt instead use DATA/data If the target is powerpc, we continue to use the coal sections since anyone targeting powerpc is probably using an old linker that doesn't have support for the weak-def bits. Also, have the assembler issue a warning if it encounters a coal section in the assembly file and inform the users to use the non-coal sections instead. rdar://problem/14265330 Differential Revision: http://reviews.llvm.org/D13188 llvm-svn: 250370	2015-10-15 05:28:38 +00:00
Hrvoje Varga	8c9526400e	Test commit. llvm-svn: 250367	2015-10-15 05:20:51 +00:00
Manman Ren	f5499fd9d5	Temporarily revert r250345 to sort out bot failure. With r250345 and r250343, we start to observe the following failure when bootstrap clang with lto and pgo: PHI node entries do not match predecessors! %.sroa.029.3.i = phi %"class.llvm::SDNode.13298"* [ null, %30953 ], [ null, %31017 ], [ null, %30998 ], [ null, %_ZN4llvm8dyn_castINS_14ConstantSDNodeENS_7SDValueEEENS_10cast_rettyIT_T0_E8ret_typeERS5_.exit.i.1804 ], [ null, %30975 ], [ null, %30991 ], [ null, %_ZNK4llvm3EVT13getScalarTypeEv.exit.i.1812 ], [ %..sroa.029.0.i, %_ZN4llvm11SmallVectorIiLj8EED1Ev.exit.i.1826 ], !dbg !451895 label %30998 label %_ZNK4llvm3EVTeqES0_.exit19.thread.i LLVM ERROR: Broken function found, compilation aborted! I will re-commit this if the bot does not recover. llvm-svn: 250366	2015-10-15 04:58:24 +00:00
Craig Topper	fd2cc7cd8a	Add XSAVE/XSAVEOPT to KNL processor. llvm-svn: 250362	2015-10-15 03:56:54 +00:00
David Majnemer	6e08126f31	[llvm-pdbdump] Provide a mechanism to dump the raw contents of a PDB A PDB can be thought of as a very simple file system. It is occasionally illuminating to see the contents of the underlying files. Differential Revision: http://reviews.llvm.org/D13674 llvm-svn: 250356	2015-10-15 01:27:19 +00:00
Richard Smith	93e9bb0864	Fix -Wmismatched-tags error in modules build by removing unused forward declaration. llvm-svn: 250355	2015-10-15 01:15:26 +00:00
Quentin Colombet	5084e44d71	[ARM] Make sure we do not dereference the end iterator when accessing debug information. Although the problem was always here, it would only be exposed when shrink-wrapping is enable. rdar://problem/23110493 llvm-svn: 250352	2015-10-15 00:41:26 +00:00
Akira Hatanaka	276332b47f	Revert r250349. Test case coal-sections-powerpc.s is still failing on some buildbots. llvm-svn: 250351	2015-10-15 00:11:03 +00:00
Akira Hatanaka	1cea644114	[MachO] Stop generating coal sections. Recommit r250342: add -arch=ppc32 to the RUN lines of powerpc tests. Some background on why we don't have to use coal sections anymore: Long ago when C++ was new and "weak" had not been standardized, an attempt was made in cctools to support C++ inlines that can be coalesced by putting them into their own section (TEXT/textcoal_nt instead of TEXT/text). The current macho linker supports the weak-def bit on any symbol to allow it to be coalesced, but the compiler still puts weak-def functions/data into alternate section names, which the linker must map back to the base section name. This patch makes changes that are necessary to prevent the compiler from using the "coal" sections and have it use the non-coal sections instead when the target architecture is not powerpc: TEXT/textcoal_nt instead use TEXT/text TEXT/const_coal instead use TEXT/const DATA/datacoal_nt instead use DATA/data If the target is powerpc, we continue to use the coal sections since anyone targeting powerpc is probably using an old linker that doesn't have support for the weak-def bits. Also, have the assembler issue a warning if it encounters a coal section in the assembly file and inform the users to use the non-coal sections instead. rdar://problem/14265330 Differential Revision: http://reviews.llvm.org/D13188 llvm-svn: 250349	2015-10-14 23:48:10 +00:00
Akira Hatanaka	d58d347e42	Revert r250342. Investigate why coal-sections-powerpc.s is failing on some buildbots. llvm-svn: 250346	2015-10-14 23:29:10 +00:00
Cong Hou	b74d3b3b86	Update the branch weight metadata in JumpThreading pass. Currently in JumpThreading pass, the branch weight metadata is not updated after CFG modification. Consider the jump threading on PredBB, BB, and SuccBB. After jump threading, the weight on BB->SuccBB should be adjusted as some of it is contributed by the edge PredBB->BB, which doesn't exist anymore. This patch tries to update the edge weight in metadata on BB->SuccBB by scaling it by 1 - Freq(PredBB->BB) / Freq(BB->SuccBB). This is the third attempt to submit this patch, while the first two led to failures in some FDO tests. After investigation, it is the edge weight normalization that caused those failures. In this patch the edge weight normalization is fixed so that there is no zero weight in the output and the sum of all weights can fit in 32-bit integer. Several unit tests are added. Differential revision: http://reviews.llvm.org/D10979 llvm-svn: 250345	2015-10-14 23:14:17 +00:00
Philip Reames	b42db21de8	[SimplifyCFG] Speculatively flatten CFG based on profiling metadata If we have a series of branches which are all unlikely to fail, we can possibly combine them into a single check on the fastpath combined with a bit of dispatch logic on the slowpath. We don't want to do this unconditionally since it requires speculating instructions past a branch, but if the profiling metadata on the branch indicates profitability, this can reduce the number of checks needed along the fast path. The canonical example this is trying to handle is removing the second bounds check implied by the Java code: a[i] + a[i+1]. Note that it can currently only do so for really simple conditions and the values of a[i] can't be used anywhere except in the addition. (i.e. the load has to have been sunk already and not prevent speculation.) I plan on extending this transform over the next few days to handle alternate sequences. Differential Revision: http://reviews.llvm.org/D13070 llvm-svn: 250343	2015-10-14 22:46:19 +00:00
Akira Hatanaka	c078ae3e4f	[MachO] Stop generating coal sections. Some background on why we don't have to use coal sections anymore: Long ago when C++ was new and "weak" had not been standardized, an attempt was made in cctools to support C++ inlines that can be coalesced by putting them into their own section (TEXT/textcoal_nt instead of TEXT/text). The current macho linker supports the weak-def bit on any symbol to allow it to be coalesced, but the compiler still puts weak-def functions/data into alternate section names, which the linker must map back to the base section name. This patch makes changes that are necessary to prevent the compiler from using the "coal" sections and have it use the non-coal sections instead when the target architecture is not powerpc: TEXT/textcoal_nt instead use TEXT/text TEXT/const_coal instead use TEXT/const DATA/datacoal_nt instead use DATA/data If the target is powerpc, we continue to use the coal sections since anyone targeting powerpc is probably using an old linker that doesn't have support for the weak-def bits. Also, have the assembler issue a warning if it encounters a coal section in the assembly file and inform the users to use the non-coal sections instead. rdar://problem/14265330 Differential Revision: http://reviews.llvm.org/D13188 llvm-svn: 250342	2015-10-14 22:45:36 +00:00
Philip Reames	ddcf6b35a2	Tighten known bits for ctpop based on zero input bits This is a cleaned up patch from the one written by John Regehr based on the findings of the Souper superoptimizer. The basic idea here is that input bits that are known zero reduce the maximum count that the intrinsic could return. We know that the number of bits required to represent a particular count is at most log2(N)+1. Differential Revision: http://reviews.llvm.org/D13253 llvm-svn: 250338	2015-10-14 22:42:12 +00:00
Bill Schmidt	048cc97fb1	[PowerPC] Fix invalid lxvdsx optimization (PR25157) PR25157 identifies a bug where a load plus a vector shuffle is incorrectly converted into an LXVDSX instruction. That optimization is only valid if the load is of a doubleword, and in the noted case, it was not. This corrects that problem. Joint patch with Eric Schweitz, who provided the bugpoint-reduced test case. llvm-svn: 250324	2015-10-14 20:45:00 +00:00
Chen Li	567aa7ab30	[LoopUnswitch] Correct misleading comments. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13738 llvm-svn: 250317	2015-10-14 19:47:43 +00:00
Diego Novillo	bb5605ca3a	Sample profiles - Add documentation for binary profile encoding. NFC. This adds documentation for the binary profile encoding and moves the documentation for the text encoding into the header file SampleProfReader.h. llvm-svn: 250309	2015-10-14 18:36:30 +00:00
Artyom Skrobov	4bca0bb010	A doccomment for CombineTo, and some NFC refactorings Summary: Caching SDLoc(N), instead of recreating it in every single function call, keeps the code denser, and allows to unwrap long lines. Reviewers: sunfish, atrick, sdmitrouk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13726 llvm-svn: 250305	2015-10-14 17:18:35 +00:00
Artyom Skrobov	a5b9ad22b3	Merge DAGCombiner::visitSREM and DAGCombiner::visitUREM (NFC) Summary: The two implementations had more code in common than not. Reviewers: sunfish, MatzeB, sdmitrouk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13724 llvm-svn: 250302	2015-10-14 16:54:14 +00:00
Andrea Di Biagio	c47edbef4c	[x86][FastISel] Teach how to select nontemporal stores. This patch teaches x86 fast-isel how to select nontemporal stores. On x86, we can use MOVNTI for nontemporal stores of doublewords/quadwords. Instructions (V)MOVNTPS/PD/DQ can be used for SSE2/AVX aligned nontemporal vector stores. Before this patch, fast-isel always selected 'movd/movq' instead of 'movnti' for doubleword/quadword nontemporal stores. In the case of nontemporal stores of aligned vectors, fast-isel always selected movaps/movapd/movdqa instead of movntps/movntpd/movntdq. With this patch, if we use SSE2/AVX intrinsics for nontemporal stores we now always get the expected (V)MOVNT instructions. The lack of fast-isel support for nontemporal stores was spotted when analyzing the -O0 codegen for nontemporal stores. Differential Revision: http://reviews.llvm.org/D13698 llvm-svn: 250285	2015-10-14 10:03:13 +00:00
Craig Topper	b84b12699f	[X86] Update CPU detection to only enable XSAVE features if the OS has enabled them and the saving of YMM state. This seems to be consistent with gcc behavior. llvm-svn: 250269	2015-10-14 05:37:42 +00:00
Craig Topper	0ee356951a	[X86] Add XSAVE feature flags to their various processors. llvm-svn: 250268	2015-10-14 05:37:38 +00:00
Craig Topper	1129a00abf	Use range-based for loops. NFC llvm-svn: 250266	2015-10-14 04:36:00 +00:00
Manman Ren	2c8e16d507	Revert r250204 and r250240 due to bot failure. We failed to build PGO-ed clang. llvm-svn: 250264	2015-10-14 03:04:03 +00:00
Evgeniy Stepanov	ebd3f44f93	[msan] Fix crash on multiplication by a non-integer constant. Fixes PR25160. llvm-svn: 250260	2015-10-14 00:21:13 +00:00
Kostya Serebryany	5cb86d5a40	[asan] Disabling speculative loads under asan. Patch by Mike Aizatsky llvm-svn: 250259	2015-10-14 00:21:05 +00:00
Richard Smith	e7dc8bf9c2	Rename one of our two llvm::GCOVOptions classes to llvm::GCOV::Options. We used to get away with this because llvm/Support/GCOV.h was an implementation detail of the llvm-gcov tool, but it's now being used by FDO. llvm-svn: 250258	2015-10-14 00:04:19 +00:00
Sanjoy Das	16e7ff171b	[SCEV] Use `SCEV::isAllOnesValue` directly; NFC. Instead of `dyn_cast` ing to `SCEVConstant` and checking the contained `ConstantInteger. llvm-svn: 250251	2015-10-13 23:28:31 +00:00
Diego Novillo	43396fa8db	Sample profile reader - remove dead code. NFC. This removes old remnants from the gcov reader. I missed these when I re-wrote it recently. llvm-svn: 250242	2015-10-13 22:48:48 +00:00
Diego Novillo	760c5a8f45	Sample profiles - Add a name table to the binary encoding. Binary encoded profiles used to encode all function names inline at every reference. This is clearly suboptimal in terms of space. This patch fixes this by adding a name table to the header of the file. llvm-svn: 250241	2015-10-13 22:48:46 +00:00
David Majnemer	eba62796cb	[InlineFunction] Correctly inline TerminatePadInst We forgot to append the terminatepad's arguments which resulted in us treating the old terminatepad as an argument to the new terminatepad causing us to crash immediately. Instead, add the old terminatepad's arguments to the new terminatepad. This fixes PR25155. llvm-svn: 250234	2015-10-13 22:08:17 +00:00
Dan Gohman	ac93f649fa	[WebAssembly] Remove a TODO comment which is no longer needed. NFC. llvm-svn: 250233	2015-10-13 22:06:40 +00:00
Chad Rosier	7f08d80595	Typo. llvm-svn: 250224	2015-10-13 20:59:16 +00:00
Kevin Enderby	1c1add44b6	Tweak to r250117 and change to use ErrorOr and drop isSizeValid for ArchiveMemberHeader, suggestion by Rafael Espíndola. Also The clang-x86-win2008-selfhost bot still does not like the malformed-machos 00000031.a test, so removing it for now. All the other bots are fine with it however. llvm-svn: 250222	2015-10-13 20:48:04 +00:00
Joseph Tremoulet	28c89bbb36	[WinEH] Add CoreCLR EH table emission Summary: Emit the handler and clause locations immediately after the standard xdata. Clauses are emitted in the same order and format used to communiate them to the CLR Execution Engine. Add a lit test to verify correct table generation on a small but interesting example function. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: pgavlin, AndyAyers, llvm-commits Differential Revision: http://reviews.llvm.org/D13451 llvm-svn: 250219	2015-10-13 20:18:27 +00:00
Duncan P. N. Exon Smith	a73371a9b7	AMDGPU: Remove implicit ilist iterator conversions, NFC One of the changes in lib/Target/AMDGPU/AMDGPUMCInstLower.cpp was a new one. Previously, bundle iterators and single-instruction iterators could be compared to each other (comparing on underlying pointers). I changed a comparison from using `MBB->end()` to using `MBB->instr_end()`, since both end iterators should point at the some place anyway. I don't think the implicit conversion between the two iterator types is a good idea since it's fairly easy to accidentally compare to the wrong thing (they aren't always end iterators). Otherwise I would have just added the conversion. Even with that, no there should be functionality change here. llvm-svn: 250218	2015-10-13 20:07:10 +00:00
Duncan P. N. Exon Smith	d3b9df02b3	AArch64: Remove implicit ilist iterator conversions, NFC llvm-svn: 250216	2015-10-13 20:02:15 +00:00
Duncan P. N. Exon Smith	e400a7d412	SelectionDAG: Remove implicit ilist iterator conversions, NFC llvm-svn: 250214	2015-10-13 19:47:46 +00:00
Duncan P. N. Exon Smith	be4d8cba1c	Scalar: Remove remaining ilist iterator implicit conversions Remove remaining `ilist_iterator` implicit conversions from LLVMScalarOpts. This change exposed some scary behaviour in lib/Transforms/Scalar/SCCP.cpp around line 1770. This patch changes a call from `Function::begin()` to `&Function::front()`, since the return was immediately being passed into another function that takes a `Function`. `Function::front()` started to assert, since the function was empty. Note that `Function::end()` does not point at a legal `Function` -- it points at an `ilist_half_node` -- so the other function was getting garbage before. (I added the missing check for `Function::isDeclaration()`.) Otherwise, no functionality change intended. llvm-svn: 250211	2015-10-13 19:26:58 +00:00
Akira Hatanaka	5a4e4f8d8a	[AArch64] Check the size of the vector before accessing its elements. This fixes an assert in AArch64AsmParser::MatchAndEmitInstruction. rdar://problem/23081753 llvm-svn: 250207	2015-10-13 18:55:34 +00:00
Cong Hou	7ab123a5cf	Update the branch weight metadata in JumpThreading pass. Currently in JumpThreading pass, the branch weight metadata is not updated after CFG modification. Consider the jump threading on PredBB, BB, and SuccBB. After jump threading, the weight on BB->SuccBB should be adjusted as some of it is contributed by the edge PredBB->BB, which doesn't exist anymore. This patch tries to update the edge weight in metadata on BB->SuccBB by scaling it by 1 - Freq(PredBB->BB) / Freq(BB->SuccBB). Differential revision: http://reviews.llvm.org/D10979 llvm-svn: 250204	2015-10-13 18:43:10 +00:00
Xinliang David Li	3dd8817d84	[PGO]: Eliminate calls to __llvm_profile_register_function for Linux. On Linux, the profile runtime can use __start_SECTNAME and __stop_SECTNAME symbols defined by the linker to locate the start and end location of a named section (with C name). This eliminates the need for instrumented binary to call __llvm_profile_register_function during start-up time. llvm-svn: 250199	2015-10-13 18:39:48 +00:00
Duncan P. N. Exon Smith	3a9c9e3dcd	Scalar: Remove some implicit ilist iterator conversions, NFC Remove some of the implicit ilist iterator conversions in LLVMScalarOpts. More to go. llvm-svn: 250197	2015-10-13 18:26:00 +00:00
Duncan P. N. Exon Smith	4ead920ce5	ExecutionEngine: Remove implicit ilist iterator conversions, NFC llvm-svn: 250193	2015-10-13 18:11:02 +00:00
Duncan P. N. Exon Smith	1275bffa96	OrcJIT: Remove implicit ilist iterator conversions, NFC llvm-svn: 250192	2015-10-13 18:10:59 +00:00
Duncan P. N. Exon Smith	1732340bfa	IPO: Remove implicit ilist iterator conversions, NFC llvm-svn: 250187	2015-10-13 17:51:03 +00:00
Duncan P. N. Exon Smith	e82c286fba	Instrumentation: Remove ilist iterator implicit conversions, NFC llvm-svn: 250186	2015-10-13 17:39:10 +00:00
Duncan P. N. Exon Smith	c8f02e7540	Interpreter: Remove implicit ilist iterator conversions, NFC llvm-svn: 250185	2015-10-13 17:33:41 +00:00
Duncan P. N. Exon Smith	9f8aaf21ba	InstCombine: Remove ilist iterator implicit conversions, NFC Stop relying on implicit conversions of ilist iterators in LLVMInstCombine. No functionality change intended. llvm-svn: 250183	2015-10-13 16:59:33 +00:00
Duncan P. N. Exon Smith	fb1743a32e	BitcodeReader: Remove ilist iterator implicit conversions, NFC Get LLVMBitReader building without relying on `ilist_iterator` implicit conversions. llvm-svn: 250181	2015-10-13 16:48:55 +00:00
Joseph Tremoulet	1e2f062ec5	[WinEH] Iterate state changes instead of invokes Summary: Add an iterator that can walk across blocks and which visits the state transitions rather than state ranges, with explicit transitions to -1 indicating the presence of top-level calls that may throw and cause the current function to unwind to caller. This will simplify code that needs to identify nested try regions. Refactor SEH and C++EH table generation to use the new InvokeStateChangeIterator, and remove the InvokeLabelIterator they were using. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13623 llvm-svn: 250179	2015-10-13 16:44:30 +00:00
Xinliang David Li	c758387e9b	Fix a couple of comments; NFC llvm-svn: 250177	2015-10-13 16:35:59 +00:00
Sanjay Patel	85030aa1bd	function names should start with a lower case letter; NFC llvm-svn: 250174	2015-10-13 16:23:00 +00:00
Sanjay Patel	b5723d0dbd	don't repeat function/class/variable names in comments; NFC llvm-svn: 250162	2015-10-13 15:12:27 +00:00
Simon Pilgrim	3c2b30f8ba	[InstCombine][SSE4A] Remove broken INSERTQI range combining optimization As discussed in D13348 - the INSERTQI range combining code is wrong in that it confuses the insertion bit index with an extraction bit index. The remaining legal combines are very unlikely (especially once we've converted to shuffles in D13348) so I'm removing the optimization. llvm-svn: 250160	2015-10-13 14:48:54 +00:00
James Molloy	4015925606	[GlobalsAA] Turn GlobalsAA on again by default Now that all the known faults with GlobalsAA have been fixed, flip the big switch on -enable-non-lto-gmr again. Feel free to pester me with any more bugs found, and don't hesitate to flip the switch back off. llvm-svn: 250157	2015-10-13 10:43:57 +00:00
James Molloy	860507f838	[GlobalsAA] Don't assume anything about functions that may be overridden Weak linkage and friends allow a symbol to be overriden outside the code generator's model, so GlobalsAA shouldn't assume that anything it can compute about such a symbol is valid. llvm-svn: 250156	2015-10-13 10:43:33 +00:00
Christof Douma	f0765c4f5b	Test commit llvm-svn: 250154	2015-10-13 09:38:21 +00:00
Sanjoy Das	b873cbe5c9	[IndVars] NFC Cleanup. - Rename methods according to the LLVM Coding Style - Merge adjacent anonymous namespace block - Use `auto` in two places llvm-svn: 250152	2015-10-13 07:17:38 +00:00
Michael Kuperstein	af22dafc8b	Fix line-ending issue. NFC. llvm-svn: 250151	2015-10-13 06:22:30 +00:00
Craig Topper	24b56a62bb	[X86] Mark the AAD and AAM aliases as not valid in 64-bit mode. llvm-svn: 250148	2015-10-13 05:12:07 +00:00
Craig Topper	4f76372afc	[X86] Change all the i8imm operands in XOP instructions to u8imm so the parser will check the size. llvm-svn: 250147	2015-10-13 05:06:25 +00:00
Manman Ren	9f824dab1d	Revert 250089 due to bot failure. It failed when building clang itself with PGO. llvm-svn: 250145	2015-10-13 03:38:02 +00:00
Duncan P. N. Exon Smith	584af871cc	BitcodeWriter: Stop using implicit ilist iterator conversion, NFC Now LLVMBitWriter compiles without implicit ilist iterator conversions. In these cases, the cleanest thing was to switch to range-based for loops. Since there wasn't much noise I converted sub-loops and parent loops as a drive-by. llvm-svn: 250144	2015-10-13 03:26:19 +00:00
Sanjoy Das	1ed6910338	[SCEV] Put some utilites in the ScalarEvolution class In a later commit, `SplitBinaryAdd` will be used outside `IsConstDiff`, so lift that out. And lift out `IsConstDiff` as `computeConstantDifference` to keep things clean and to avoid playing C++ access specifier games. NFC. llvm-svn: 250143	2015-10-13 02:53:27 +00:00
Duncan P. N. Exon Smith	5b4c837c58	TransformUtils: Remove implicit ilist iterator conversions, NFC Continuing the work from last week to remove implicit ilist iterator conversions. First related commit was probably r249767, with some more motivation in r249925. This edition gets LLVMTransformUtils compiling without the implicit conversions. No functional change intended. llvm-svn: 250142	2015-10-13 02:39:05 +00:00
Matt Arsenault	e5d9515fb7	DAGCombiner: Don't stop finding better chain on 2 aliases The comment says this was stopped because it was unlikely to be profitable. This is not true if you want to combine vector loads with multiple components. For a simple case that looks like t0 = load t0 ... t1 = load t0 ... t2 = load t0 ... t3 = load t0 ... t4 = store t0:1, t0:1 t5 = store t4, t1:0 t6 = store t5, t2:0 t7 = store t6, t3:0 We want to get all of these stores onto a chain that is a TokenFactor of these N loads. This mostly solves the AMDGPU merge-stores.ll regressions with -combiner-alias-analysis for merging vector stores of vector loads. llvm-svn: 250138	2015-10-13 00:49:00 +00:00
JF Bastien	986ed68eed	x86: preserve flags when folding atomic operations Summary: D4796 taught LLVM to fold some atomic integer operations into a single instruction. The pattern was unaware that the instructions clobbered flags. This patch adds the missing EFLAGS definition. Floating point operations don't set flags, the subsequent fadd optimization is therefore correct. The same applies for surrounding load/store optimizations. Reviewers: rsmith, rtrieu Subscribers: llvm-commits, reames, morisset Differential Revision: http://reviews.llvm.org/D13680 llvm-svn: 250135	2015-10-13 00:28:47 +00:00
Matt Arsenault	f0d9e47da2	AMDGPU: Refactor isVGPRToSGPRCopy It should now correctly handle physical registers and make it easier to identify the other direction. llvm-svn: 250132	2015-10-13 00:07:54 +00:00
Matt Arsenault	61dc235f20	DAGCombiner: Combine extract_vector_elt from build_vector This basic combine was surprisingly missing. AMDGPU legalizes many operations in terms of 32-bit vector components, so not doing this results in many extra copies and subregister extracts that need to be cleaned up later. InstCombine already does this for the hasOneUse case. The target hook is to fix a handful of tests which break (e.g. ARM/vmov.ll) which turn from a vector materialize repeated immediate instruction to a constant vector load with more scalar copies from it. llvm-svn: 250129	2015-10-12 23:59:50 +00:00
Cong Hou	bf22f5063a	Assign correct edge weights to unwind destinations when lowering invoke statement. When lowering invoke statement, all unwind destinations are directly added as successors of call site block, and the weight of those new edges are not assigned properly. Actually, default weight 16 are used for those edges. This patch calculates the proper edge weights for those edges when collecting all unwind destinations. Differential revision: http://reviews.llvm.org/D13354 llvm-svn: 250119	2015-10-12 23:02:58 +00:00
Simon Pilgrim	c8832fc233	[SelectionDAG] Add common vector constant folding helper function We have a number of functions that implement constant folding of vectors (unary and binary ops) in near identical manners (and the differences don't appear to be critical). This patch introduces a common implementation (SelectionDAG::FoldConstantVectorArithmetic) and calls this in both the unary and binary op cases. After this initial patch I intend to begin enabling vector constant folding for a wider number of opcodes in SelectionDAG::getNode(). Differential Revision: http://reviews.llvm.org/D13665 llvm-svn: 250118	2015-10-12 23:00:11 +00:00
Kevin Enderby	903955451e	Fixed bugs in llvm-obdump while parsing Mach-O files from malformed archives that caused aborts. This was because of the characters of the ‘Size’ field in the archive header did not contain decimal characters. rdar://22983603 llvm-svn: 250117	2015-10-12 22:04:54 +00:00
Cong Hou	3320bcd815	Update the branch weight metadata in JumpThreading pass. In JumpThreading pass, the branch weight metadata is not updated after CFG modification. Consider the jump threading on PredBB, BB, and SuccBB. After jump threading, the weight on BB->SuccBB should be adjusted as some of it is contributed by the edge PredBB->BB, which doesn't exist anymore. This patch tries to update the edge weight in metadata on BB->SuccBB by scaling it by 1 - Freq(PredBB->BB) / Freq(BB->SuccBB). Differential revision: http://reviews.llvm.org/D10979 llvm-svn: 250089	2015-10-12 19:44:08 +00:00
Reid Kleckner	4a5f35c0ae	Make Win64 localescape offsets FP relative instead of SP relative We made them SP relative back in March (r233137) because that's the value the runtime passes to EH functions. With the new cleanuppad IR, funclets adjust their frame argument from SP to FP, so our offsets should now be FP-relative. llvm-svn: 250088	2015-10-12 19:43:34 +00:00
Andrea Di Biagio	b0fe4eb199	[x86] Fix wrong lowering of vsetcc nodes (PR25080). Function LowerVSETCC (in X86ISelLowering.cpp) worked under the wrong assumption that for non-AVX512 targets, the source type and destination type of a type-legalized setcc node were always the same type. This assumption was unfortunately incorrect; the type legalizer is not always able to promote the return type of a setcc to the same type as the first operand of a setcc. In the case of a vsetcc node, the legalizer firstly checks if the first input operand has a legal type. If so, then it promotes the return type of the vsetcc to that same type. Otherwise, the return type is promoted to the 'next legal type', which, for vectors of MVT::i1 is always a 128-bit integer vector type. Example (-mattr=+avx): %0 = trunc <8 x i32> %a to <8 x i23> %1 = icmp eq <8 x i23> %0, zeroinitializer The initial selection dag for the code above is: v8i1 = setcc t5, t7, seteq:ch t5: v8i23 = truncate t2 t2: v8i32,ch = CopyFromReg t0, Register:v8i32 %vreg1 t7: v8i32 = build_vector of all zeroes. The type legalizer would firstly check if 't5' has a legal type. If so, then it would reuse that same type to promote the return type of the setcc node. Unfortunately 't5' is of illegal type v8i23, and therefore it cannot be used to promote the return type of the setcc node. Consequently, the setcc return type is promoted to v8i16. Later on, 't5' is promoted to v8i32 thus leading to the following dag node: v8i16 = setcc t32, t25, seteq:ch where t32 and t25 are now values of type v8i32. Before this patch, function LowerVSETCC would have wrongly expanded the setcc to a single X86ISD::PCMPEQ. Surprisingly, ISel was still able to match an instruction. In our case, ISel would have matched a VPCMPEQWrr: t37: v8i16 = X86ISD::VPCMPEQWrr t36, t25 However, t36 and t25 are both VR256, while the result type is instead of class VR128. This inconsistency ended up causing the insertion of COPY instructions like this: %vreg7<def> = COPY %vreg3; VR128:%vreg7 VR256:%vreg3 Which is an invalid full copy (not a sub register copy). Eventually, the backend would have hit an UNREACHABLE "Cannot emit physreg copy instruction" in the attempt to expand the malformed pseudo COPY instructions. This patch fixes the problem adding the missing logic in LowerVSETCC to handle the corner case of a setcc with 128-bit return type and 256-bit operand type. This problem was originally reported by Dimitry as PR25080. It has been latent for a very long time. I have added the minimal reproducible from that bugzilla as test setcc-lowering.ll. Differential Revision: http://reviews.llvm.org/D13660 llvm-svn: 250085	2015-10-12 19:22:30 +00:00
Cong Hou	61e13de408	Add - and -= operators to BlockFrequency using saturating arithmetic. llvm-svn: 250077	2015-10-12 18:34:00 +00:00
Sanjay Patel	0dc91b3143	combine predicates; NFCI llvm-svn: 250075	2015-10-12 18:15:08 +00:00
Cong Hou	90c6cf8e7d	Turn const/const& into value type for BlockFrequency in functions of this class. Also fix a naming issue. NFC. llvm-svn: 250074	2015-10-12 18:14:15 +00:00
Matt Arsenault	8c0ef8b36d	AMDGPU: Register some more passes so -print-before works llvm-svn: 250071	2015-10-12 17:43:59 +00:00
Matt Arsenault	07a72bad0b	Enable verifier after PeepholeOptimizer No tests fail with this enabled so I assume it was an accident that it isn't enabled now. llvm-svn: 250070	2015-10-12 17:43:56 +00:00
Reid Kleckner	9abb3c06a6	Don't call PrepareEHLandingPad on non EH pads This was a minor bug in r249492. Calling PrepareEHLandingPad on a non-landingpad was a no-op, but it attempted to get the generic pointer register class, which apparently doesn't exist for some targets. llvm-svn: 250068	2015-10-12 17:42:32 +00:00
David Majnemer	99c1d13e52	[WinEH] Remove CatchObjRecoverIdx CatchObjRecoverIdx was used for the old scheme, it is no longer relevant. llvm-svn: 250065	2015-10-12 16:44:22 +00:00
Sanjay Patel	b814ef1ad6	fix typos; NFC llvm-svn: 250059	2015-10-12 16:09:59 +00:00
Zoran Jovanovic	2e386d3d07	[mips][micromips] Initial support for micrmomips DSP instructions and addu.qb implementation Differential Revision: http://reviews.llvm.org/D12798 llvm-svn: 250058	2015-10-12 16:07:25 +00:00
Oliver Stannard	cca893ffac	[Debug] Look through bitcasts to find argument registers On targets where f32 is not legal, we have to look through a BITCAST SDNode to find the register that an argument is stored in when emitting debug info, or we will not be able to emit a DW_AT_location for it. Differential Revision: http://reviews.llvm.org/D13005 llvm-svn: 250056	2015-10-12 15:52:36 +00:00
Vasileios Kalintiris	2a95f82859	[mips][FastISel] Clang-format switch statement. NFC. llvm-svn: 250053	2015-10-12 15:39:41 +00:00
Sanjay Patel	53d1d8b731	fix capitalization; NFC llvm-svn: 250049	2015-10-12 15:24:01 +00:00
Greg Bedwell	7f68a71669	Fix rename() sometimes failing if another process uses openFileForRead() On Windows, fs::rename() could fail is another process was reading the file at the same time using fs::openFileForRead(). In most cases the user wouldn't notice as fs::rename() will continue to retry for 2000ms. Typically this is enough for the read to complete and a retry to succeed, but if the disk is being it too hard then the response time might be longer than the retry time and the rename would fail with a permission error. Add FILE_SHARE_DELETE to the sharing flags for CreateFileW() in fs::openFileForRead() and try ReplaceFileW() prior to MoveFileExW() in fs::rename(). Based on an initial patch by Edd Dawson! Differential Revision: http://reviews.llvm.org/D13647 llvm-svn: 250046	2015-10-12 15:11:47 +00:00
Daniel Sanders	b1ef88c172	[mips][ias] Implement macro expansion when bcc has an immediate where a register belongs. Summary: Fixes PR24915. Reviewers: vkalintiris Subscribers: emaste, seanbruno, llvm-commits Differential Revision: http://reviews.llvm.org/D13533 llvm-svn: 250042	2015-10-12 14:24:05 +00:00
Daniel Sanders	2a5ce1ace0	[mips] Clean up most macro expansions to use the emit*() functions. Reviewers: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13591 llvm-svn: 250040	2015-10-12 14:09:12 +00:00
Daniel Sanders	2fb8564d99	[mips] Handle undef when extracting subregs from FP64 registers. Summary: This removes unnecessary instructions when extracting from an undefined register and also fixes a crash for O32 when passing undef to a double argument in held in integer registers. Reviewers: vkalintiris Subscribers: llvm-commits, zoran.jovanovic, petarj Differential Revision: http://reviews.llvm.org/D13467 llvm-svn: 250039	2015-10-12 13:55:44 +00:00
Oliver Stannard	939724cd02	GlobalOpt does not treat externally_initialized globals correctly GlobalOpt currently merges stores into the initialisers of internal, externally_initialized globals, but should not do so as the value of the global may change between the initialiser and any code in the module being run. llvm-svn: 250035	2015-10-12 13:20:52 +00:00
James Molloy	fa4e994a7a	[ARM] Mark Swift MISched model as incomplete The Swift Machine Scheduler Model is incomplete. There are instructions missing which can trigger the "incomplete machine model" abort. This was observed when a downstream SchedMachineModel was added to the ARM target. Patch by Christof Douma! llvm-svn: 250033	2015-10-12 12:49:59 +00:00
James Molloy	55d633bd60	[LoopVectorize] Shrink integer operations into the smallest type possible C semantics force sub-int-sized values (e.g. i8, i16) to be promoted to int type (e.g. i32) whenever arithmetic is performed on them. For targets with native i8 or i16 operations, usually InstCombine can shrink the arithmetic type down again. However InstCombine refuses to create illegal types, so for targets without i8 or i16 registers, the lengthening and shrinking remains. Most SIMD ISAs (e.g. NEON) however support vectors of i8 or i16 even when their scalar equivalents do not, so during vectorization it is important to remove these lengthens and truncates when deciding the profitability of vectorization. The algorithm this uses starts at truncs and icmps, trawling their use-def chains until they terminate or instructions outside the loop are found (or unsafe instructions like inttoptr casts are found). If the use-def chains starting from different root instructions (truncs/icmps) meet, they are unioned. The demanded bits of each node in the graph are ORed together to form an overall mask of the demanded bits in the entire graph. The minimum bitwidth that graph can be truncated to is the bitwidth minus the number of leading zeroes in the overall mask. The intention is that this algorithm should "first do no harm", so it will never insert extra cast instructions. This is why the use-def graphs are unioned, so that subgraphs with different minimum bitwidths do not need casts inserted between them. This algorithm works hard to reduce compile time impact. DemandedBits are only queried if there are extends of illegal types and if a truncate to an illegal type is seen. In the general case, this results in a simple linear scan of the instructions in the loop. No non-noise compile time impact was seen on a clang bootstrap build. llvm-svn: 250032	2015-10-12 12:34:45 +00:00
Amjad Aboud	1db6d7af46	[X86] Add XSAVE intrinsic family Add intrinsics for the XSAVE instructions (XSAVE/XSAVE64/XRSTOR/XRSTOR64) XSAVEOPT instructions (XSAVEOPT/XSAVEOPT64) XSAVEC instructions (XSAVEC/XSAVEC64) XSAVES instructions (XSAVES/XSAVES64/XRSTORS/XRSTORS64) Differential Revision: http://reviews.llvm.org/D13012 llvm-svn: 250029	2015-10-12 11:47:46 +00:00
Andrea Di Biagio	a0922ed8fe	[x86] PR24562: fix incorrect folding of PSHUFB nodes with a mask where all indices have the most significant bit set. This patch fixes a problem in function 'combineX86ShuffleChain' that causes a chain of shuffles to be wrongly folded away when the combined shuffle mask has only one element. We may end up with a combined shuffle mask of one element as a result of multiple calls to function 'canWidenShuffleElements()'. Function canWidenShuffleElements attempts to simplify a shuffle mask by widening the size of the elements being shuffled. For every pair of shuffle indices, function canWidenShuffleElements checks if indices refer to adjacent elements. If all pairs refer to "adjacent" elements then the shuffle mask is safely widened. As a consequence of widening, we end up with a new shuffle mask which is half the size of the original shuffle mask. The byte shuffle (pshufb) from test pr24562.ll has a mask of all SM_SentinelZero indices. Function canWidenShuffleElements would combine each pair of SM_SentinelZero indices into a single SM_SentinelZero index. So, in a logarithmic number of steps (4 in this case), the pshufb mask is simplified to a mask with only one index which is equal to SM_SentinelZero. Before this patch, function combineX86ShuffleChain wrongly assumed that a mask of size one is always equivalent to an identity mask. So, the entire shuffle chain was just folded away as the combined shuffle mask was treated as a no-op mask. With this patch we know check if the only element of a combined shuffle mask is SM_SentinelZero. In case, we propagate a zero vector. Differential Revision: http://reviews.llvm.org/D13364 llvm-svn: 250027	2015-10-12 11:25:41 +00:00
Zlatko Buljan	d76b666a06	Test commit llvm-svn: 250026	2015-10-12 11:19:40 +00:00
Tobias Grosser	374bce0c22	SCEV: Allow simple AddRec * Parameter products in delinearization This patch also allows the -delinearize pass to delinearize expressions that do not have an outermost SCEVAddRec expression. The SCEV::delinearize infrastructure allowed this since r240952, but the -delinearize pass was not updated yet. llvm-svn: 250018	2015-10-12 08:02:00 +00:00
Craig Topper	8d2e6bc25b	[X86] Use u8imm for the immediate type for all shift and rotate instructions. This way the assembler will perform range checking. Believe this matches gas behavior. llvm-svn: 250016	2015-10-12 06:23:10 +00:00
Craig Topper	d6b661dbf0	[X86] Add support to assembler and MCInst lowering to use the other vmovq %xmmX, %xmmX encoding if it would be a shorter VEX encoding. llvm-svn: 250014	2015-10-12 04:57:59 +00:00
Craig Topper	635e05df0a	[X86] Cleanup formatting a bit. NFC llvm-svn: 250013	2015-10-12 04:27:17 +00:00
Craig Topper	5be914eda1	[X86] Change the immediate for IN/OUT instructions to u8imm so the assembly parser will check the size. llvm-svn: 250012	2015-10-12 04:17:55 +00:00
Craig Topper	95fffba227	[X86] Add some instruction aliases to get the assembly parser table to favor arithmetic instructions with 8-bit immediates over the forms that implicitly use the ax/eax/rax. This allows us to remove the explicit code for working around the existing priority llvm-svn: 250011	2015-10-12 03:39:57 +00:00
Craig Topper	fcc34bdee0	[X86] Fix CMP and TEST with al/ax/eax/rax to not mark EFLAGS as a use or al/ax/eax/rax as a def. Probably doesn't have a functional affect since these aren't used in isel. llvm-svn: 249994	2015-10-11 19:54:02 +00:00
Simon Pilgrim	d45c88bbb5	[DAGCombiner] Improved FMA combine support for vectors Enabled constant canonicalization for all constants. Improved combining of constant vectors. llvm-svn: 249993	2015-10-11 19:48:12 +00:00
Craig Topper	87990ee4ec	[X86] Remove special validation for INT immediate operand from AsmParser. Instead mark its operand type as u8imm which will cause it to fail to match. This is more consistent with other instruction behavior. This also fixes a bug where negative immediates below -128 were not being reported as errors. llvm-svn: 249989	2015-10-11 18:27:24 +00:00
Craig Topper	a71630729d	[X86] Simplify immediate range checking code. llvm-svn: 249979	2015-10-11 16:38:14 +00:00
Simon Pilgrim	5eac2607b9	[DAGCombiner] Tidyup FMINNUM/FMAXNUM constant folding Enable constant folding for vector splats as well as scalars. Enable constant canonicalization for all scalar and vector constants. llvm-svn: 249978	2015-10-11 16:02:28 +00:00
Simon Pilgrim	1d1c56e2df	[InstCombine][X86][XOP] Combine XOP integer vector comparisons to native IR We now have lowering support for XOP PCOM/PCOMU instructions. llvm-svn: 249977	2015-10-11 14:38:34 +00:00
Simon Pilgrim	52d47e5704	[X86][XOP] Added support for the lowering of 128-bit vector integer comparisons to XOP PCOM/PCOMU instructions. The XOP vector integer comparisons can deal with all signed/unsigned comparison cases directly and can be easily commuted as well (D7646). llvm-svn: 249976	2015-10-11 14:15:17 +00:00
Nathan Slingerland	5e896ce2d1	[ProfileData] Test commit for slingn This is a test of the LLVM commit system. In the event of a real commit there would be some useful code changes. llvm-svn: 249972	2015-10-11 13:30:56 +00:00
Craig Topper	55b1f29203	Change isUIntN/isIntN calls with constant N to use the template version. NFC llvm-svn: 249952	2015-10-10 20:17:07 +00:00
Teresa Johnson	1493ad9c24	Fix PR25101 - Handle anonymous functions without VST entries Summary: The change to use the VST function entries for lazy deserialization did not handle the case of anonymous functions without aliases. In that case we must fall back to scanning the function blocks as there is no VST entry. Reviewers: dexonsmith, joker.eph, davidxl Subscribers: tstellarAMD, llvm-commits Differential Revision: http://reviews.llvm.org/D13596 llvm-svn: 249947	2015-10-10 14:18:36 +00:00
Jonas Paulsson	63a2b6862e	[SystemZ] Fixes in the backend I/R. expandPostRAPseudo(): STX -> 2 * STD: The first STD should not have the kill flag set for the address. SystemZElimCompare: BRC -> BRCT conversion: Don't forget to remove the CC<use,kill> operand. Needed to make SystemZ/asm-17.ll pass with -verify-machineinstrs, which now runs with this flag. Reviewed by Ulrich Weigand. llvm-svn: 249945	2015-10-10 07:14:24 +00:00
Sanjoy Das	cc16ccc1ab	[IndVars] Use `auto`; NFC llvm-svn: 249944	2015-10-10 06:33:33 +00:00
Craig Topper	84008481e4	Use range-based for loops. NFC llvm-svn: 249943	2015-10-10 05:38:14 +00:00
Keno Fischer	2cd66e9270	[RuntimeDyld] Fix performance problem in resolveRelocations with many sections Summary: Rather than just iterating over all sections and checking whether we have relocations for them, iterate over the relocation map instead. This showed up heavily in an artificial julia benchmark that does lots of compilation. On that particular benchmark, this patch gives ~15% performance improvements. As far as I can tell the primary reason why the original loop was so expensive is that Relocations[i] actually constructs a relocationList (allocating memory & doing lots of other unnecessary computing) if none is found. Reviewers: lhames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13545 llvm-svn: 249942	2015-10-10 05:37:02 +00:00
Craig Topper	7143d8001a	Use range-based for loops. NFC. llvm-svn: 249941	2015-10-10 05:25:06 +00:00
Craig Topper	7d5b23101c	Use emplace_back instead of a constructor call and push_back. NFC llvm-svn: 249940	2015-10-10 05:25:02 +00:00
Duncan P. N. Exon Smith	5a82c916b0	Analysis: Remove implicit ilist iterator conversions Remove implicit ilist iterator conversions from LLVMAnalysis. I came across something really scary in `llvm::isKnownNotFullPoison()` which relied on `Instruction::getNextNode()` being completely broken (not surprising, but scary nevertheless). This function is documented (and coded to) return `nullptr` when it gets to the sentinel, but with an `ilist_half_node` as a sentinel, the sentinel check looks into some other memory and we don't recognize we've hit the end. Rooting out these scary cases is the reason I'm removing the implicit conversions before doing anything else with `ilist`; I'm not at all surprised that clients rely on badness. I found another scary case -- this time, not relying on badness, just bad (but I guess getting lucky so far) -- in `ObjectSizeOffsetEvaluator::compute_()`. Here, we save out the insertion point, do some things, and then restore it. Previously, we let the iterator auto-convert to `Instruction`, and then set it back using the `Instruction` version: Instruction PrevInsertPoint = Builder.GetInsertPoint(); / Logic that may change insert point */ if (PrevInsertPoint) Builder.SetInsertPoint(PrevInsertPoint); The check for `PrevInsertPoint` doesn't protect correctly against bad accesses. If the insertion point has been set to the end of a basic block (i.e., `SetInsertPoint(SomeBB)`), then `GetInsertPoint()` returns an iterator pointing at the list sentinel. The version of `SetInsertPoint()` that's getting called will then call `PrevInsertPoint->getParent()`, which explodes horribly. The only reason this hasn't blown up is that it's fairly unlikely the builder is adding to the end of the block; usually, we're adding instructions somewhere before the terminator. llvm-svn: 249925	2015-10-10 00:53:03 +00:00
Duncan P. N. Exon Smith	a5f45da27e	MC: Remove implicit ilist iterator conversions, NFC llvm-svn: 249922	2015-10-10 00:13:11 +00:00
David Majnemer	bfa5b98201	[WinEH] Remove more dead code wineh-parent is dead, so is ValueOrMBB. llvm-svn: 249920	2015-10-10 00:04:29 +00:00
Reid Kleckner	14e773500e	[WinEH] Delete the old landingpad implementation of Windows EH The new implementation works at least as well as the old implementation did. Also delete the associated preparation tests. They don't exercise interesting corner cases of the new implementation. All the codegen tests of the EH tables have already been ported. llvm-svn: 249918	2015-10-09 23:34:53 +00:00
Reid Kleckner	eb7cd6c889	[SEH] Update SEH codegen tests to use the new IR Also Fix a buglet where SEH tables had ranges that spanned funclets. The remaining tests using the old landingpad IR are preparation tests, and will be deleted along with the old preparation. llvm-svn: 249917	2015-10-09 23:05:54 +00:00
Duncan P. N. Exon Smith	f1ff53ecc2	CodeGen: Remove implicit ilist iterator conversions, NFC Finish removing implicit ilist iterator conversions from LLVMCodeGen. I'm sure there are lots more of these in lib/CodeGen/*/. llvm-svn: 249915	2015-10-09 22:56:24 +00:00
David Majnemer	35d27b21a1	[WinEH] Insert the catchpad return before CSR restoration x64 catchpads use rax to inform the unwinder where control should go next. However, we must initialize rax before the epilogue sequence so as to not perturb the unwinder. llvm-svn: 249910	2015-10-09 22:18:45 +00:00
James Y Knight	692e037499	Fix assert when emitting llvm.pow.f86. This occurred due to introducing the invalid i64 type after type legalization had already finished, in an attempt to workaround bitcast f64 -> v2i32 not doing constant folding. The right thing is to actually fix bitcast, but that has other complications. So, for now, just get rid of the broken workaround, and check in a test-case showing that it doesn't crash, with TODOs for emitting proper code. llvm-svn: 249908	2015-10-09 21:36:19 +00:00
Reid Kleckner	e1c8a7f9c7	[SEH] Fix _except_handler4 table base states We got them right for the old IR, but not with funclets. Port the old test to the new IR and fix the code. llvm-svn: 249906	2015-10-09 21:27:28 +00:00
Duncan P. N. Exon Smith	6e98cd32dc	CodeGen: Avoid more ilist iterator implicit conversions, NFC llvm-svn: 249903	2015-10-09 21:08:19 +00:00
Duncan P. N. Exon Smith	1ff409802d	CodeGen: Use range-based for in PostRAScheduler, NFC llvm-svn: 249901	2015-10-09 21:05:00 +00:00
Reid Kleckner	d880dc7509	[SEH] Remember to emit the last invoke range for SEH This wasn't very observable in execution tests, because usually there is an invoke in the catchpad that unwinds the the catchendpad but never actually throws. llvm-svn: 249898	2015-10-09 20:39:39 +00:00
Owen Anderson	97ca0f3f2c	Generalize convergent check to handle invokes as well as calls. llvm-svn: 249892	2015-10-09 20:17:46 +00:00
James Y Knight	5b8217bc05	Fix assert in X86 backend. When running combine on an extract_vector_elt, it wants to look through a bitcast to check if the argument to the bitcast was itself an extract_vector_elt with particular operands. However, it called getOperand() on the argument to the bitcast before checking that the opcode was EXTRACT_VECTOR_ELT, assert-failing if there were zero operands for the actual opcode. Fix, and add trivial test. llvm-svn: 249891	2015-10-09 20:10:14 +00:00
Chad Rosier	47eba05b47	Revert "Simplify code. NFC." This reverts commit r248610. llvm-svn: 249887	2015-10-09 19:48:48 +00:00
Duncan P. N. Exon Smith	5ec1568c9c	CodeGen: Continue removing ilist iterator implicit conversions llvm-svn: 249884	2015-10-09 19:40:45 +00:00
Duncan P. N. Exon Smith	6ac07fd228	CodeGen: Remove implicit iterator conversions from MBB.cpp Remove implicit ilist iterator conversions from MachineBasicBlock.cpp. I've also added an overload of `splice()` that takes a pointer, since it's a natural API. This is similar to the overloads I added for `remove()` and `erase()` in r249867. llvm-svn: 249883	2015-10-09 19:36:12 +00:00
Duncan P. N. Exon Smith	0ac8eb9171	CodeGen: Avoid ilist iterator implicit conversions in a few more places, NFC llvm-svn: 249880	2015-10-09 19:23:20 +00:00
Duncan P. N. Exon Smith	5ae5939fa1	CodeGen: Remove more ilist iterator implicit conversions, NFC llvm-svn: 249879	2015-10-09 19:13:58 +00:00
Duncan P. N. Exon Smith	6c64aeb065	CodeGen: Use range-based for in IntrinsicLowering::AddPrototypes, NFC This happens to avoid a host of implicit ilist iterator conversions. llvm-svn: 249877	2015-10-09 19:07:41 +00:00
Duncan P. N. Exon Smith	530d040bd9	CodeGen: Use range-based for in GlobalMerge, NFC llvm-svn: 249876	2015-10-09 18:57:47 +00:00
Duncan P. N. Exon Smith	d83547a16e	CodeGen: Remove a few more ilist iterator implicit conversions, NFC llvm-svn: 249875	2015-10-09 18:44:40 +00:00
Owen Anderson	2c9978b12b	Teach LoopUnswitch not to perform non-trivial unswitching on loops containing convergent operations. Doing so could cause the post-unswitching convergent ops to be control-dependent on the unswitch condition where they were not before. This check could be refined to allow unswitching where the convergent operation was already control-dependent on the unswitch condition. llvm-svn: 249874	2015-10-09 18:40:20 +00:00
Duncan P. N. Exon Smith	980f8f2639	CodeGen: Remove implicit conversions from Analysis and BranchFolding Remove a few more implicit ilist iterator conversions, this time from Analysis.cpp and BranchFolding.cpp. I added a few overloads for `remove()` and `erase()`, which quite naturally take pointers as well as iterators as parameters. This will reduce the churn at least in the short term, but I don't really have a problem with these existing for longer. llvm-svn: 249867	2015-10-09 18:23:49 +00:00
Owen Anderson	d95b08a0a7	Refine the definition of convergent to only disallow the addition of new control dependencies. This covers the common case of operations that cannot be sunk. Operations that cannot be hoisted should already be handled properly via the safe-to-speculate rules and mechanisms. llvm-svn: 249865	2015-10-09 18:06:13 +00:00
Sanjay Patel	9fbe22bac6	fix typos; NFC llvm-svn: 249863	2015-10-09 18:01:03 +00:00
Diego Novillo	a7f1e8ef83	Add inline stack streaming to binary sample profiles. With this patch we can now read and write inline stacks in sample profiles to the binary encoded profiles. In a subsequent patch, I will add a string table to the binary encoding. Right now function names are emitted as strings every time we find them. This is too bloated and will produce large files in applications with lots of inlining. llvm-svn: 249861	2015-10-09 17:54:24 +00:00
Dan Gohman	ee1588ce96	[WebAssembly] Rename floating-point operators to match their spec names. llvm-svn: 249859	2015-10-09 17:50:00 +00:00
Artur Pilipenko	cca800207a	Add verification for align, dereferenceable, dereferenceable_or_null load metadata Reviewed By: reames Differential Revision: http://reviews.llvm.org/D13428 llvm-svn: 249856	2015-10-09 17:41:29 +00:00
Keno Fischer	21a7f23666	Clear SectionSymbols in MCContext::Reset This was just forgotten when SectionSymbols was introduced and could cause corruption if the MCContext was reused after Reset. Reviewers: rafael Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13547 llvm-svn: 249854	2015-10-09 17:24:54 +00:00
Duncan P. N. Exon Smith	769e1a972d	AArch64: Make getNextNode() cleanup in r249764 more clear After r249764, if you didn't see the full context, it looked like `std::next(I)` would get the same result as `++MachineBasicBlock::iterator(I)`. However, `I` is a `MachineInstr*` (not a `MachineBasicBlock::iterator`). Use the `getIterator()` helper I added later (r249782) to make this code more clear. llvm-svn: 249852	2015-10-09 16:54:54 +00:00
Duncan P. N. Exon Smith	8f11e1a713	CodeGen: Start removing implicit conversions to/from list iterators, NFC Start removing implicit conversions to/from list iterators in CodeGen, ala r249782 for IR. A lot more to go after this. llvm-svn: 249851	2015-10-09 16:54:49 +00:00
Dehao Chen	41dc5a6e86	Make HeaderLineno a local variable. http://reviews.llvm.org/D13576 As we are using hierarchical profile, there is no need to keep HeaderLineno a member variable. This is because each level of the inline stack will have its own header lineno. One should use the head lineno of its own inline stack level instead of the actual symbol. llvm-svn: 249848	2015-10-09 16:50:16 +00:00
Artur Pilipenko	ffd132878a	ValueTracking: use getAlignment in isAligned Reviewed By: reames Differential Revision: http://reviews.llvm.org/D13517 llvm-svn: 249841	2015-10-09 15:58:26 +00:00
Jun Bum Lim	0aace13d18	Improve ISel across lane float min/max reduction In vectorized float min/max reduction code, the final "reduce" step is sub-optimal. In AArch64, this change wll combine : svn0 = vector_shuffle t0, undef<2,3,u,u> fmin = fminnum t0,svn0 svn1 = vector_shuffle fmin, undef<1,u,u,u> cc = setcc fmin, svn1, ole n0 = extract_vector_elt cc, #0 n1 = extract_vector_elt fmin, #0 n2 = extract_vector_elt fmin, #1 result = select n0, n1,n2 into : result = llvm.aarch64.neon.fminnmv t0 This change extends r247575. llvm-svn: 249834	2015-10-09 14:11:25 +00:00
Jonas Paulsson	ee3685fd45	[SystemZ] Remove unused code in SystemZElimCompare.cpp The Reference IndirectDef and IndirectUse members were unused and therefore removed. llvm-svn: 249824	2015-10-09 11:27:44 +00:00
Nemanja Ivanovic	d389657399	Vector element extraction without stack operations on Power 8 This patch corresponds to review: http://reviews.llvm.org/D12032 This patch builds onto the patch that provided scalar to vector conversions without stack operations (D11471). Included in this patch: - Vector element extraction for all vector types with constant element number - Vector element extraction for v16i8 and v8i16 with variable element number - Removal of some unnecessary COPY_TO_REGCLASS operations that ended up unnecessarily moving things around between registers Not included in this patch (will be in upcoming patch): - Vector element extraction for v4i32, v4f32, v2i64 and v2f64 with variable element number - Vector element insertion for variable/constant element number Testing is provided for all extractions. The extractions that are not implemented yet are just placeholders. llvm-svn: 249822	2015-10-09 11:12:18 +00:00
Andrea Di Biagio	99493df257	[MemCpyOpt] Fix wrong merging adjacent nontemporal stores into memset calls. Pass MemCpyOpt doesn't check if a store instruction is nontemporal. As a consequence, adjacent nontemporal stores are always merged into a memset call. Example: ;;; define void @foo(<4 x float>* nocapture %p) { entry: store <4 x float> zeroinitializer, <4 x float>* %p, align 16, !nontemporal !0 %p1 = getelementptr inbounds <4 x float>, <4 x float>* %dst, i64 1 store <4 x float> zeroinitializer, <4 x float>* %p1, align 16, !nontemporal !0 ret void } !0 = !{i32 1} ;;; In this example, the two nontemporal stores are combined to a memset of zero which does not preserve the nontemporal hint. Later on the backend (tested on a x86-64 corei7) expands that memset call into a sequence of two normal 16-byte aligned vector stores. opt -memcpyopt example.ll -S -o - \| llc -mcpu=corei7 -o - Before: xorps %xmm0, %xmm0 movaps %xmm0, 16(%rdi) movaps %xmm0, (%rdi) With this patch, we no longer merge nontemporal stores into calls to memset. In this example, llc correctly expands the two stores into two movntps: xorps %xmm0, %xmm0 movntps %xmm0, 16(%rdi) movntps %xmm0, (%rdi) In theory, we could extend the usage of !nontemporal metadata to memcpy/memset calls. However a change like that would only have the effect of forcing the backend to expand !nontemporal memsets back to sequences of store instructions. A memset library call would not have exactly the same semantic of a builtin !nontemporal memset call. So, SelectionDAG will have to conservatively expand it back to a sequence of !nontemporal stores (effectively undoing the merging). Differential Revision: http://reviews.llvm.org/D13519 llvm-svn: 249820	2015-10-09 10:53:41 +00:00
Arnaud A. de Grandmaison	859b2ac07d	[EarlyCSE] Address post commit review for r249523. llvm-svn: 249814	2015-10-09 09:23:01 +00:00
Jonas Paulsson	5b3bab40b2	[SystemZ] Remove superfluous braces in SystemZShortenInst.cpp llvm-svn: 249812	2015-10-09 07:19:20 +00:00
Jonas Paulsson	18d877f79b	[SystemZ] Minor bugfixes. LLCH, LLHH and CLIH had the wrong register classes for the def-operand. Tie operands if changing opcode to an instruction with tied ops. Comment typo fix. These fixes were needed in order to make regression test case SystemZ/asm-18.ll pass with -verify-machineinstrs (not used by default). Reviewed by Ulrich Weigand. llvm-svn: 249811	2015-10-09 07:19:16 +00:00
Jonas Paulsson	0a9049ba82	[SystemZ] Bugfix in SystemZAsmParser.cpp. Let parseRegister() allow RegFP Group if expecting RegV Group, since the %f register prefix yields the FP group even while used with vector instructions. Reviewed by Ulrich Weigand. llvm-svn: 249810	2015-10-09 07:19:12 +00:00
Kostya Serebryany	e95022ac14	[libFuzzer] don't print large artifacts to stderr llvm-svn: 249808	2015-10-09 04:03:14 +00:00
Kostya Serebryany	bd5d1cdbb9	[libFuzzer] add -artifact_prefix flag llvm-svn: 249807	2015-10-09 03:57:59 +00:00
Saleem Abdulrasool	1825fac3c9	ARM: tweak WoA frame lowering Accept r11 when targeting Windows on ARM rather than just low registers. Because we are in a thumb-2 only mode, this may be slightly more expensive in code size, but results in better code for the environment since it spills the frame register, which is generally desired for fast stack walking as per the ABI. llvm-svn: 249804	2015-10-09 03:19:03 +00:00
Sanjoy Das	648956118b	[SCEV] Call `StrengthenNoWrapFlags` after `GroupByComplexity`; NFCI The current implementation of `StrengthenNoWrapFlags` is agnostic to the order of `Ops`, so this commit should not change anything semantic. An upcoming change will make `StrengthenNoWrapFlags` sensitive to the order of `Ops`. llvm-svn: 249802	2015-10-09 02:44:45 +00:00
Reid Kleckner	ae44e871cd	Revert "Revert "Revert r248959, "[WinEH] Emit int3 after noreturn calls on Win64""" This reverts commit r249794. Apparently my checkouts are full of unexpected surprises today. llvm-svn: 249796	2015-10-09 01:13:17 +00:00
Reid Kleckner	b510401785	Revert "Revert r248959, "[WinEH] Emit int3 after noreturn calls on Win64"" This reverts commit r249032. TODO write commit msg llvm-svn: 249794	2015-10-09 01:11:37 +00:00
Joseph Tremoulet	676e5cf07f	[WinEH] Fix cleanup state numbering Summary: - Recurse from cleanupendpads to their cleanuppads, to make sure the cleanuppad is visited if it has a cleanupendpad but no cleanupret. - Check for and avoid double-processing cleanuppads, to allow for them to have multiple cleanuprets (plus cleanupendpads). - Update Cxx state numbering to visit toplevel cleanupendpads and to recurse from cleanupendpads to their preds, to ensure we number any funclets in inlined cleanups. SEH state numbering already did this. Reviewers: rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13374 llvm-svn: 249792	2015-10-09 00:46:08 +00:00
Reid Kleckner	ebef256269	[SEH] Fix llvm.eh.exceptioncode fast register allocation assertion I called the wrong MachineBasicBlock::addLiveIn() overload. llvm-svn: 249786	2015-10-09 00:15:13 +00:00
Reid Kleckner	21427ada3e	Address review comments, remove error case and return 0 instead as required by tests llvm-svn: 249785	2015-10-09 00:15:08 +00:00
Reid Kleckner	e94fef7b3d	[llvm-symbolizer] Make --relative-address work with DWARF contexts Summary: Previously the relative address flag only affected PDB debug info. Now both DIContext implementations always expect to be passed virtual addresses. llvm-symbolizer is now responsible for adding ImageBase to module offsets when --relative-offset is passed. Reviewers: zturner Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12883 llvm-svn: 249784	2015-10-09 00:15:01 +00:00
Duncan P. N. Exon Smith	52888a6738	IR: Remove implicit iterator conversions from lib/IR, NFC Stop converting implicitly between iterators and pointers/references in lib/IR. For convenience, I've added a `getIterator()` accessor to `ilist_node` so that callers don't need to know how to spell the iterator class (i.e., they can use `X.getIterator()` instead of `Function::iterator(X)`). I'll eventually disallow these implicit conversions entirely, but there's a lot of code, so it doesn't make sense to do it all in one patch. One library or so at a time. Why? To root out cases of `getNextNode()` and `getPrevNode()` being used in iterator logic. The design of `ilist` makes that invalid when the current node could be at the back of the list, but it happens to "work" right now because of a bug where those functions never return `nullptr` if you're using a half-node sentinel. Before I can fix the function, I have to remove uses of it that rely on it misbehaving. (Maybe the function should just be deleted anyway? But I don't want deleting it -- potentially a huge project -- to block fixing ilist/iplist.) llvm-svn: 249782	2015-10-08 23:49:46 +00:00
Sanjoy Das	3c520a1272	[RS4GC] Refactoring to make a later change easier, NFCI Summary: These non-semantic changes will help make a later change adding support for deopt operand bundles more streamlined. Reviewers: reames, swaroop.sridhar Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D13491 llvm-svn: 249779	2015-10-08 23:18:38 +00:00
Sanjoy Das	4fd3d400fa	[IRBuilder] Change the `gc.statepoint` creation interface This is to enable me to address review for D13491 -- `Flags` is a bitfield of `StatepointFlags`, not an individual item out of the enum, so it should be represented as an `uint32_t`. llvm-svn: 249778	2015-10-08 23:18:33 +00:00
Sanjoy Das	c21a05a3a4	[PlaceSafeopints] Extract out `callsGCLeafFunction`, NFC Summary: This will be used in a later change to RewriteStatepointsForGC. Reviewers: reames, swaroop.sridhar Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13490 llvm-svn: 249777	2015-10-08 23:18:30 +00:00
Sanjoy Das	1ede5367ba	[RS4GC] Don't copy ADT's unneccessarily, NFCI Summary: Use `const auto &` instead of `auto` in `makeStatepointExplicit`. Reviewers: reames, swaroop.sridhar Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13454 llvm-svn: 249776	2015-10-08 23:18:22 +00:00
Kevin Enderby	46e642f8c5	Fix a bug in llvm-objdump’s printing of Objective-C meta data from malformed Mach-O files that caused a crash because of a section header had a size that extended past the end of the file. rdar://22983603 llvm-svn: 249768	2015-10-08 22:50:55 +00:00
Duncan P. N. Exon Smith	6eeaff169d	Support: Stop relying on iterator auto-conversion, NFC Stop relying on ilist implicit conversions from `value_type&` to `iterator` in YAMLParser.cpp. I eventually want to outlaw this entirely. It encourages `getNextNode()` and `getPrevNode()` in iterator logic, which is extremely fragile (and relies on them never returning `nullptr`). FTR, there's nothing nefarious going on in this case, it was just easy to clean up since the callers really wanted iterators to begin with. llvm-svn: 249767	2015-10-08 22:47:55 +00:00
Duncan P. N. Exon Smith	d389165c14	AArch64: Stop using MachineInstr::getNextNode() Stop using `getNextNode()` to get an insertion point (at least, in this one place). Instead, use iterator logic directly. The `getNextNode()` interface isn't actually supposed to work for creating iterators; it's supposed to return `nullptr` (not a real iterator) if this is the last node. It's currently broken and will "happen" to work, but if we ever fix the function, we'll get some strange failures in places like this. llvm-svn: 249764	2015-10-08 22:43:26 +00:00
Duncan P. N. Exon Smith	ece61624b1	MC: Stop using Fragment::getNextNode() Stop using `getNextNode()` to get an iterator to a fragment (at least, in this one place). Instead, use iterator logic directly. The `getNextNode()` interface isn't actually supposed to work for creating iterators; it's supposed to return `nullptr` (not a real iterator) if this is the last node. It's currently broken and will "happen" to work, but if we ever fix the function, we'll get some strange failures in places like this. llvm-svn: 249763	2015-10-08 22:36:08 +00:00
Duncan P. N. Exon Smith	a3da44882f	PowerPC: Don't use getNextNode() for insertion point Stop using `getNextNode()` to create an insertion point for machine instructions (at least, in this one place). Instead, use an iterator. As a drive-by, clean up dump statements to use iterator logic. The `getNextNode()` interface isn't actually supposed to work for insertion points; it's supposed to return `nullptr` if this is the last node. It's currently broken and will "happen" to work, but if we ever fix the function, we'll get some strange failures. llvm-svn: 249758	2015-10-08 22:20:37 +00:00
Evgeniy Stepanov	d12212bc8c	New MSan mapping layout (llvm part). This is an implementation of https://github.com/google/sanitizers/issues/579 It has a number of advantages over the current mapping: * Works for non-PIE executables. * Does not require ASLR; as a consequence, debugging MSan programs in gdb no longer requires "set disable-randomization off". * Supports linux kernels >=4.1.2. * The code is marginally faster and smaller. This is an ABI break. We never really promised ABI stability, but this patch includes a courtesy escape hatch: a compile-time macro that reverts back to the old mapping layout. llvm-svn: 249753	2015-10-08 21:35:26 +00:00
Evgeniy Stepanov	5fe279e727	Add Triple::isAndroid(). This is a simple refactoring that replaces Triple.getEnvironment() checks for Android with Triple.isAndroid(). llvm-svn: 249750	2015-10-08 21:21:24 +00:00
Eric Christopher	11e5983658	Move the MMX subtarget feature out of the SSE set of features and into its own variable. This is needed so that we can explicitly turn off MMX without turning off SSE and also so that we can diagnose feature set incompatibilities that involve MMX without SSE. Rationale: // sse3 __m128d test_mm_addsub_pd(__m128d A, __m128d B) { return _mm_addsub_pd(A, B); } // mmx void shift(__m64 a, __m64 b, int c) { _mm_slli_pi16(a, c); _mm_slli_pi32(a, c); _mm_slli_si64(a, c); _mm_srli_pi16(a, c); _mm_srli_pi32(a, c); _mm_srli_si64(a, c); _mm_srai_pi16(a, c); _mm_srai_pi32(a, c); } clang -msse3 -mno-mmx file.c -c For this code we should be able to explicitly turn off MMX without affecting the compilation of the SSE3 function and then diagnose and error on compiling the MMX function. This matches the existing gcc behavior and follows the spirit of the SSE/MMX separation in llvm where we can (and do) turn off MMX code generation except in the presence of intrinsics. Updated a couple of tests, but primarily tested with a couple of tests for turning on only mmx and only sse. This is paired with a patch to clang to take advantage of this behavior. llvm-svn: 249731	2015-10-08 20:10:06 +00:00
Diego Novillo	aae1ed8e08	Re-apply r249644: Handle inline stacks in gcov-encoded sample profiles. This fixes memory allocation problems by making the merge operation keep the profile readers around until the merged profile has been emitted. This is needed to prevent the inlined function names to disappear from the function profiles. Since all the names are kept as references, once the reader disappears, the names are also deallocated. Additionally, XFAIL on big-endian architectures. The test case uses a gcov file generated on a little-endian system. llvm-svn: 249724	2015-10-08 19:40:37 +00:00
Alexei Starovoitov	87f83e6926	[bpf] Do not expand UNDEF SDNode during insn selection lowering o Before this patch, BPF backend will expand UNDEF node to i64 constant 0. o For second pass of dag combiner, legalizer will run through each to-be-processed dag node. o If any new SDNode is generated and has an undef operand, dag combiner will put undef node, newly-generated constant-0 node, and any node which uses these nodes in the working list. o During this process, it is possible undef operand is generated again, and this will form an infinite loop for dag combiner pass2. o This patch allows UNDEF to be a legal type. Signed-off-by: Yonghong Song <yhs@plumgrid.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> llvm-svn: 249718	2015-10-08 18:52:40 +00:00
Sanjoy Das	413dbbb1c2	[SCEV] Bring some methods up to coding style; NFC - Start methods with lower case - Reflow a comment - Delete header comment repeated in .cpp file llvm-svn: 249716	2015-10-08 18:46:59 +00:00
Reid Kleckner	b2244cb8f0	[WinEH] Relax assertion in the presence of stack realignment The code is correct as is, but we should test it. llvm-svn: 249715	2015-10-08 18:41:52 +00:00
Sanjoy Das	3bf22b1883	[SCEV] Remove comment repeated in cpp file; NFC llvm-svn: 249713	2015-10-08 18:28:42 +00:00
Sanjoy Das	dd70996a5c	[SCEV] Pick backedge values for phi nodes correctly Summary: `getConstantEvolutionLoopExitValue` and `ComputeExitCountExhaustively` assumed all phi nodes in the loop header have the same order of incoming values. This is not correct, and this commit changes `getConstantEvolutionLoopExitValue` and `ComputeExitCountExhaustively` to lookup the backedge value of a phi node using the loop's latch block. Unfortunately, there is still some code duplication `getConstantEvolutionLoopExitValue` and `ComputeExitCountExhaustively`. At some point in the future we should extract out a helper class / method that can evolve constant evolution phi nodes across iterations. Fixes 25060. Thanks to Mattias Eriksson for the spot-on analysis! Depends on D13457. Reviewers: atrick, hfinkel Subscribers: materi, llvm-commits Differential Revision: http://reviews.llvm.org/D13458 llvm-svn: 249712	2015-10-08 18:28:36 +00:00
Rafael Espindola	483ad20009	Handle Archive::getNumberOfSymbols being called in an archive with no symbols. No change in llvm, but will be tested from lld. llvm-svn: 249709	2015-10-08 18:06:20 +00:00
Ulrich Weigand	f4d14f781f	[SystemZ] Fix another assertion failure in tryBuildVectorShuffle This fixes yet another scenario where tryBuildVectorShuffle would attempt to create a BUILD_VECTOR node with an invalid combination of types. This can happen if the incoming BUILD_VECTOR has elements of a type different from the vector element type, which is allowed in certain cases as long as they are all the same type. When one of these elements is used in the residual vector, and UNDEF elements are added to fill up the residual vector, those UNDEFs then have to use the type of the original element, not the vector element type, or else the resulting BUILD_VECTOR will have an invalid type combination. llvm-svn: 249706	2015-10-08 17:46:59 +00:00
Sanjay Patel	f61a08fbf1	[InstCombine] transform masking off of an FP sign bit into a fabs() intrinsic call (PR24886) This is a partial fix for PR24886: https://llvm.org/bugs/show_bug.cgi?id=24886 Without this IR transform, the backend (x86 at least) was producing inefficient code. This patch is making 2 assumptions: 1. The canonical form of a fabs() operation is, in fact, the LLVM fabs() intrinsic. 2. The high bit of an FP value is always the sign bit; as noted in the bug report, this isn't specified by the LangRef. Differential Revision: http://reviews.llvm.org/D13076 llvm-svn: 249702	2015-10-08 17:09:31 +00:00
Sanjay Patel	9115cf8c9d	[ValueTracking] teach computeKnownBits that a fabs() clears sign bits This was requested in D13076: if we're going to canonicalize to fabs(), ValueTracking should know that fabs() clears sign bits. In this patch (as in D13076), we're not handling vectors yet even though computeKnownBits' fabs() case itself should be vector-ready via the splat in this patch. Fixing this will require follow-on patches to correct other logic that uses 'getScalarType'. Differential Revision: http://reviews.llvm.org/D13222 llvm-svn: 249701	2015-10-08 16:56:55 +00:00
George Rimar	87780300f6	Windows: Fixed sys::findProgramByName to work with files containing dot in their name. Problem was in SearchPathW function that does not attach an extension if file already has one. That does not work for executables like ld.lld2 for example which require to have .exe extension but SearchPath thinks that its "lld2". Solution was to add the extension manually. Differential Revision: http://reviews.llvm.org/D13536 llvm-svn: 249696	2015-10-08 16:03:19 +00:00
Frederic Riss	263b772bda	[X86] Disable X86CallFrameOptimization on Darwin in presence of EH We emit 1 compact unwind encoding per function, and this can’t represent the varying stack pointer that will be generated by X86CallFrameOptimization. Disable the optimization on Darwin. (It might be possible to split the function into multiple ranges and emit 1 compact unwind info per range. The compact unwind emission code isn’t ready for that and this kind of info certainly isn’t tested/used anywhere. It might be worth exploring this path if we want to get the space savings at some point though) llvm-svn: 249694	2015-10-08 15:45:08 +00:00
Teresa Johnson	ca6b64ff04	Fix combined function index abbrev (NFC) Removed an unused abbrev op in the VST_CODE_COMBINED_FNENTRY abbrev. I noticed while writing/testing an array string dumper for llvm-bcanalyze that the combined function's VST entry abbrevs contained an old field that I am not using. Everything was working fine since the bitcode writer and reader were in sync on how the record fields were actually being set up and interpreted. llvm-svn: 249691	2015-10-08 13:52:56 +00:00
Igor Breger	defab3c1ef	AVX512: vpextrb/w/d/q and vpinsrb/w/d/q implementation. This instructions doesn't have intrincis. Added tests for lowering and encoding. Differential Revision: http://reviews.llvm.org/D12317 llvm-svn: 249688	2015-10-08 12:55:01 +00:00
James Molloy	e9d50dc9f7	Compute demanded bits for icmp instructions Instead of bailing out when we see an icmp, we can instead at least say that if the upper bits of both operands are known zero, they are not demanded. This doesn't help with signed comparisons, but it's at least better than bailing out. llvm-svn: 249687	2015-10-08 12:40:06 +00:00
James Molloy	bcd7f0ac98	Treat Mul just like Add and Subtract Like adds and subtracts, muls ripple only to the left so we can use the same logic. While we're here, add a print method to DemandedBits so it can be used with -analyze, which we'll use in the testcase. llvm-svn: 249686	2015-10-08 12:39:59 +00:00
James Molloy	ab9fdb9226	Make demanded bits lazy The algorithm itself is still eager, but it doesn't get run until a query function is called. This greatly reduces the compile-time impact of requiring DemandedBits when at runtime it is not often used. NFCI. llvm-svn: 249685	2015-10-08 12:39:50 +00:00
Michael Kuperstein	04e79329d0	[X86] Fix wrong treatment of multi-lane blends in BUILD_VECTORtoBlendMask() This fixes two separate bugs: 1) The mask for the high lane was not set correctly. That fixes PR24532. 2) The transformation should bail out if it believes it involves more than 2 lanes, as it does not currently do anything sensible in this case. Differential Revision: http://reviews.llvm.org/D13505 llvm-svn: 249669	2015-10-08 08:13:02 +00:00
Michael Kuperstein	2b3c16ca17	Do not assert on first non-prologue instruction being a CFI directive. llvm-svn: 249668	2015-10-08 07:48:49 +00:00
Jonas Paulsson	5d3fbd3733	[SystemZ] SystemZElimCompare pass improved. Compare elimination extended to recognize load-and-test instructions used for comparison and eliminate them the same way as with compare instructions. Test case fp-cmp-05.ll updated to expect optimized results now also for z13. The order of instruction shortening and compare elimination passes have been changed so that opcodes do not have to be handled in both passes. Reviewed by Ulrich Weigand. llvm-svn: 249666	2015-10-08 07:40:23 +00:00
Jonas Paulsson	29d9d8d955	[SystemZ] Bugfix: check CC reg liveness in SystemZShortenInst. The following instruction shortening transformations would introduce a definition of the CC reg, so therefore liveness of CC reg must be checked: WFADB -> ADBR WFSDB -> SDBR Also add the CC reg implicit def operand to the MI in case of change of opcode. Reviewed by Ulrich Weigand. llvm-svn: 249665	2015-10-08 07:40:19 +00:00
Jonas Paulsson	7c5ce10a07	[SystemZ] Use load-and-test for fp compare with 0 if vector support is present. Since the LTxBRCompare instructions can't be used with vector registers, a normal load-and-test instruction (with a modelled def operand) is used instead. Reviewed by Ulrich Weigand. llvm-svn: 249664	2015-10-08 07:40:16 +00:00
Jonas Paulsson	2c96dd64fc	[SystemZ] More minor fixing in SystemZElimCompare.cpp Don't use subreg indices since they are not used after regalloc. Reviewed by Ulrich Weigand. llvm-svn: 249663	2015-10-08 07:40:11 +00:00
Jonas Paulsson	9e1f3bd1bd	[SystemZ] Minor fixes in SystemZElimCompare.cpp Reviewed by Ulrich Weigand. llvm-svn: 249662	2015-10-08 07:39:55 +00:00
Craig Topper	da5168b7ce	Use range-based for loops. NFC. llvm-svn: 249659	2015-10-08 06:06:42 +00:00
Sanjoy Das	10dffcb36b	[SCEV] Check `Pred` first in isKnownPredicateViaSplitting Comparing `Pred` with `ICmpInst::ICMP_ULT` is cheaper that memory access -- do that check before loading / storing `ProvingSplitPredicate`. llvm-svn: 249654	2015-10-08 03:46:00 +00:00
Sanjoy Das	1195dbee66	[SCEV] Use `auto *` instead of `auto`; NFCI (As prescribed by the coding style document) llvm-svn: 249653	2015-10-08 03:45:58 +00:00
Diego Novillo	a082040ded	Revert "Handle inline stacks in gcov-encoded sample profiles." This reverts commit r249644. The buildbots are failing the new test I added. Investigating. llvm-svn: 249648	2015-10-08 01:17:26 +00:00
Kostya Serebryany	3b804877fd	[libFuzzer] fix 32-bit build llvm-svn: 249646	2015-10-08 00:59:25 +00:00
Diego Novillo	b7fca57493	Handle inline stacks in gcov-encoded sample profiles. This patch adds support for reading sample profiles with inline stacks. Inline stacks in a profile are generated when the sampled binary has samples in inlined functions. For instance, if main() calls foo() and foo() calls bar(), and bar() is inlined into foo() and foo() inlined into main(), the profile may look something like: main total:364084 head:0 [ ... ] 2.3: _Z3fool total:243786 1: 60149 1.2: 38568 1.4: 46511 1.7: _Z3bari total:98558 1.1: 52672 1.2: 45886 At line 2, discriminator 3, main() calls foo(). In turn, foo() calls bar() at line 1, discriminator 7. In the textual format, this stacking of inline calls is represented with indentation. With this change, LLVM can now read sample profile files generated by the create_gcov tool from https://github.com/google/autofdo. llvm-svn: 249644	2015-10-08 00:39:11 +00:00
Justin Bogner	468c998031	CodeGen: print and verify after TargetPassConfig::insertPass by default In r224059, we started verifying after addPass, but missed doing so on insertPass. There isn't a good reason for the discrepancy, and skipping the verifier in these cases causes bugs. This also exposes a verifier error that was introduced in r249087, but the verifier doesn't run until after the register coalescer, when the issue happens to have been resolved. I've skipped the verifier after SIFixSGPRLiveRangesID to avoid the failures for now and will follow up with Matt for a proper fix. llvm-svn: 249643	2015-10-08 00:36:22 +00:00
Reid Kleckner	97797419e6	[WinEH] Fix 32-bit funclet epilogues in the presence of dynamic allocas In particular, passing non-trivially copyable objects by value on win32 uses a dynamic alloca (inalloca). We would clobber ESP in the epilogue and end up returning to outer space. llvm-svn: 249637	2015-10-07 23:55:01 +00:00
David Majnemer	6af5f82c20	[WinEH] Refer to filter funclets using their symbol-table symbol The relocation for the filter funclet will be against a symbol table entry for a function instead of the section, making it easier to understand what is going on. llvm-svn: 249621	2015-10-07 21:34:00 +00:00
Sanjoy Das	40bdd041db	[RS4GC] Use AssertingVH for RematerializedValueMapTy, NFCI Reviewers: reames, swaroop.sridhar Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13489 llvm-svn: 249620	2015-10-07 21:32:35 +00:00
Reid Kleckner	70bf6bb5e6	[WinEH] Undo the effect of r249578 for 32-bit The __CxxFrameHandler3 tables for 32-bit are supposed to hold stack offsets relative to EBP, not ESP. I blindly updated the win-catchpad.ll test case, and immediately noticed that 32-bit catching stopped working. While I'm at it, move the frame index to frame offset WinEH table logic out of PEI. PEI shouldn't have to know about WinEHFuncInfo. I realized we can calculate frame index offsets just fine from the table printer. llvm-svn: 249618	2015-10-07 21:13:15 +00:00
David Majnemer	c289c9ff55	[WinEH] Remove unreachable blocks before preparation We remove unreachable blocks because it is pointless to consider them for coloring. However, we still had stale pointers to these blocks in some data structures after we removed them from the function. Instead, remove the unreachable blocks before attempting to do anything with the function. This fixes PR25099. llvm-svn: 249617	2015-10-07 21:08:25 +00:00
Rafael Espindola	284093033f	git-clang-format r249548. Sorry for missing this the first time. llvm-svn: 249610	2015-10-07 20:32:24 +00:00
Vasileios Kalintiris	b876b58d38	[mips][FastISel] Factor out common code from switch statement. NFC llvm-svn: 249603	2015-10-07 20:06:30 +00:00
Duncan P. N. Exon Smith	37bf678a0d	IR: Create SymbolTableList wrapper around iplist, NFC Create `SymbolTableList`, a wrapper around `iplist` for lists that automatically manage a symbol table. This commit reduces a ton of code duplication between the six traits classes that were used previously. As a drive by, reduce the number of template parameters from 2 to 1 by using a SymbolTableListParentType metafunction (I originally had this as a separate commit, but it touched most of the same lines so I squashed them). I'm in the process of trying to remove the UB in `createSentinel()` (see the FIXMEs I added for `ilist_embedded_sentinel_traits` and `ilist_half_embedded_sentinel_traits`). My eventual goal is to separate the list logic into a base class layer that knows nothing about (and isn't templated on) the downcasted nodes -- removing the need to invoke UB -- but for now I'm just trying to get a handle on all the current use cases (and cleaning things up as I see them). Besides these six SymbolTable lists, there are two others that use the addNode/removeNode/transferNodes() hooks: the `MachineInstruction` and `MachineBasicBlock` lists. Ideally there'll be a way to factor these hooks out of the low-level API entirely, but I'm not quite there yet. llvm-svn: 249602	2015-10-07 20:05:10 +00:00
Sanjoy Das	af6980c70a	[IRBuilder] Add gc.statepoint related methods to IRBuilder Summary: This adds some more routines to `IRBuilder` around creating calls and invokes to `gc.statepoint`. These will be used later. Reviewers: reames, swaroop.sridhar Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D13371 llvm-svn: 249596	2015-10-07 19:52:12 +00:00
Vasileios Kalintiris	6ae1b35cda	[mips][FastISel] Use ternary operator to select opcode. NFC llvm-svn: 249594	2015-10-07 19:43:31 +00:00
Joseph Tremoulet	39234fc67e	[WinEH] Set NoModuleLevelChanges in clone flags Summary: This is necessary to keep the cloner from making bogus copies of debug metadata attached to the IR it is cloning. Also, avoid running RemapInstruction over all instructions in the common case that no cloning was performed. Reviewers: rnk, andrew.w.kaylor, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13514 llvm-svn: 249591	2015-10-07 19:29:56 +00:00
Rafael Espindola	4264e2d531	Use SpecificBumpPtrAllocator to simplify the MCSeciton destruction. llvm-svn: 249589	2015-10-07 19:08:19 +00:00
Mehdi Amini	044cb34bdc	Revert "Revert "This patch builds on top of D13378 to handle constant condition."" This reverts commit r249528 and reapply r249431. The fix for the fallout has been commited in r249575. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 249581	2015-10-07 18:14:25 +00:00
Vasileios Kalintiris	daad571ba4	[mips][FastISel] Simple refactoring of MipsFastISel::emitLogicalOP(). NFC. llvm-svn: 249580	2015-10-07 18:14:24 +00:00
Chad Rosier	7c6ac2b8f9	[AArch64] Fold a floating-point divide by power of two into fp conversion. Part of http://reviews.llvm.org/D13442 llvm-svn: 249579	2015-10-07 17:51:37 +00:00
Reid Kleckner	33bd2d99d8	[WinEH] Fix two minor issues in __CxxFrameHandler3 tables There was an off-by-one bug in ip2state tables which manifested when one call immediately preceded the try-range of the next. The return address of the previous call would appear to be within the try range of the next scope, resulting in extra destructors or catches running. We also computed the wrong offset for catch parameter stack objects. The offset should be from RSP, not from RBP. llvm-svn: 249578	2015-10-07 17:49:32 +00:00
Matt Arsenault	fc0ad42516	AMDGPU: Fix missing implicit m0 uses on movrel instructions llvm-svn: 249577	2015-10-07 17:46:32 +00:00
Chad Rosier	fa30c9b436	[AArch64] Fold a floating-point multiply by power of two into fp conversion. Part of http://reviews.llvm.org/D13442 llvm-svn: 249576	2015-10-07 17:39:18 +00:00
Sanjoy Das	0015e5a088	[IndVars] Preserve LCSSA in `eliminateIdentitySCEV` Summary: After r249211, SCEV can see through some LCSSA phis. Add a `replacementPreservesLCSSAForm` check before replacing uses of these phi nodes with a simplified use of the induction variable to avoid breaking LCSSA. Fixes 25047. Depends on D13460. Reviewers: atrick, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13461 llvm-svn: 249575	2015-10-07 17:38:31 +00:00
Sanjoy Das	4493b40002	[SCEV] Use some C++11'ism, NFC Summary: Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13457 llvm-svn: 249574	2015-10-07 17:38:25 +00:00
Chad Rosier	169865ffda	[ARM] Promote helper function to SelectionDAG. I'll be using the function in a similar combine for AArch64. The helper was also improved to handle undef values. Part of http://reviews.llvm.org/D13442 llvm-svn: 249572	2015-10-07 17:28:58 +00:00
Kevin B. Smith	9c7408807f	Test commit access. Fixed comment to have correct input parameter name and period termination. llvm-svn: 249571	2015-10-07 17:24:25 +00:00
Joseph Tremoulet	bde46c5642	[WinEH] Update CoreCLR EH for catchpad MBBs Summary: Set the pad MBB as a funclet entry for CoreCLR as well as MSVCCXX, and update state numbering to put the catchpad block rather than its normal successor into the unwind map. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13492 llvm-svn: 249569	2015-10-07 17:16:25 +00:00
Oliver Stannard	d3d114ba54	[ARM] Use correct half-precision functions in EABI mode The ARM RTABI defines the half- to single-precision float conversion functions with an __aeabi prefix, but libgcc only has them with a __gnu prefix. Therefore we need to emit the __aeabi version when compiling with an eabi or eabihf triple, and the __gnu version with a gnueabi or gnueabihf triple. llvm-svn: 249565	2015-10-07 16:58:49 +00:00
Chad Rosier	17436bf64e	[ARM] Prevent PerformVDIVCombine from combining a vcvt/vdiv with 8 lanes. This would result in a crash since the vcvt used does not support v8i32 types. llvm-svn: 249560	2015-10-07 16:15:40 +00:00
Artur Pilipenko	d94903c9f8	Teach computeKnownBits to use new align attribute/metadata Reviewed By: reames Differential Revision: http://reviews.llvm.org/D13470 llvm-svn: 249557	2015-10-07 16:01:18 +00:00
Jeroen Ketema	aebca09543	[ARM][AArch64] Only lower to interleaved load/store if the target has NEON Without an additional check for NEON, the compiler crashes during legalization of NEON ldN/stN. Differential Revision: http://reviews.llvm.org/D13508 llvm-svn: 249550	2015-10-07 14:53:29 +00:00
Rafael Espindola	30d77777e7	Use non virtual destructors for sections. llvm-svn: 249548	2015-10-07 13:46:06 +00:00
Chad Rosier	db71abf2d4	[ARM] Push more complex check down to reduce compile time. NFC. llvm-svn: 249547	2015-10-07 13:40:44 +00:00
Rafael Espindola	665b0d3a4e	Don't repeat names in comments and don't indent in namespaces. NFC. llvm-svn: 249546	2015-10-07 13:38:49 +00:00
Scott Egerton	9004cc7942	Revert: r249536 - Testing commit access with a trival whitespace change. llvm-svn: 249537	2015-10-07 10:57:06 +00:00
Scott Egerton	be6b54b691	Testing commit access with a trival whitespace change. llvm-svn: 249536	2015-10-07 10:49:49 +00:00
James Molloy	47efaeb36e	Revert "This patch builds on top of D13378 to handle constant condition." This reverts commit r249431. This caused failures in sqlite3: http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/14453 llvm-svn: 249528	2015-10-07 09:03:34 +00:00
Arnaud A. de Grandmaison	a6178a179d	[EarlyCSE] Fix handling of target memory intrinsics for CSE'ing loads. Summary: Some target intrinsics can access multiple elements, using the pointer as a base address (e.g. AArch64 ld4). When trying to CSE such instructions, it must be checked the available value comes from a compatible instruction because the pointer is not enough to discriminate whether the value is correct. Reviewers: ssijaric Subscribers: mcrosier, llvm-commits, aemerson Differential Revision: http://reviews.llvm.org/D13475 llvm-svn: 249523	2015-10-07 07:41:29 +00:00
Michael Kuperstein	259f1508f0	[X86] Emit .cfi_escape GNU_ARGS_SIZE when adjusting the stack before calls When outgoing function arguments are passed using push instructions, and EH is enabled, we may need to indicate to the stack unwinder that the stack pointer was adjusted before the call. This should fix the exception handling issues in PR24792. Differential Revision: http://reviews.llvm.org/D13132 llvm-svn: 249522	2015-10-07 07:01:31 +00:00
Igor Breger	1a6fd1cc0f	AVX512: Change encoding of vpshuflw and vpshufhw instructions. Implement WIG as W0 and not W1, like all other instruction have been implemented. Add encoding tests. Differential Revision: http://reviews.llvm.org/D13471 llvm-svn: 249521	2015-10-07 06:31:18 +00:00
Sanjoy Das	60bf3db17f	[RS4GC] Remove an unnecessary assert & related variables I don't think this assert adds much value, and removing it and related variables avoids an "unused variable" warning in release builds. llvm-svn: 249511	2015-10-07 02:39:27 +00:00
Sanjoy Das	b40bd1a93f	[RS4GC] Cosmetic cleanup, NFC Summary: A series of cosmetic cleanup changes to RewriteStatepointsForGC: - Rename variables to LLVM style - Remove some redundant asserts - Remove an unsued `Pass *` parameter - Remove unnecessary variables - Use C++11 idioms where applicable - Pass CallSite by value, not reference Reviewers: reames, swaroop.sridhar Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D13370 llvm-svn: 249508	2015-10-07 02:39:18 +00:00
Matt Arsenault	10e6a61892	AMDGPU: Add comment for VOP2b operand class Because of the constant bus requirement, it is never legal to use a literal constant for these instructions despite the encoding allowing it. This was already doing the right thing, but note why. llvm-svn: 249500	2015-10-07 01:36:00 +00:00
Matt Arsenault	187276fa94	AMDGPU: Properly register passes llvm-svn: 249495	2015-10-07 00:42:53 +00:00
Matt Arsenault	284192730a	AMDGPU: Use explicit register size indirect pseudos This stops using an unknown reg class operand. Currently build_vector selection has a broken looking check where it tries to use a VGPR reg class and an SGPR one if it sees an SGPR use. With the source operand has an explicit VGPR class, illegal copies will be inserted that SIFixSGPRCopies will take care of normally later, which will allow removing the weird check of build_vector users. Without this, when removed v_movrels_b32 would still be emitted even though all of the values were only stored in SGPRs. llvm-svn: 249494	2015-10-07 00:42:51 +00:00
Matt Arsenault	922b7bf808	AMDGPU: Remove inferRegClassFromUses / inferRegClassFromDefs I'm not sure why this would be necessary, and no tests fail with them removed. Looking at the uses is suspect as well because the use reg classes will likely change when the users are moved as a result of moving this instruction. llvm-svn: 249493	2015-10-07 00:42:31 +00:00
Reid Kleckner	72ba70418f	[SEH] Add llvm.eh.exceptioncode intrinsic This will support the Clang __exception_code intrinsic. llvm-svn: 249492	2015-10-07 00:27:33 +00:00
Hans Wennborg	f1f36517b7	InstCombine: Fold comparisons between unguessable allocas and other pointers This will allow us to optimize code such as: int f(int p) { int x; return p == &x; } as well as: int allocate(void); int f() { int x; int *p = allocate(); return p == &x; } The folding can only be done under certain circumstances. Even though p and &x cannot alias, the comparison must still return true if the pointer representations are equal. If a user successfully generates a p that's a correct guess for &x, comparison should return true even though p is an invalid pointer. This patch argues that if the address of the alloca isn't observable outside the function, the function can act as-if the address is impossible to guess from the outside. The tricky part is keeping the act consistent: if we fold p == &x to false in one place, we must make sure to fold any other comparisons based on those pointers similarly. To ensure that, we only fold when &x is involved exactly once in comparison instructions. Differential Revision: http://reviews.llvm.org/D13358 llvm-svn: 249490	2015-10-07 00:20:07 +00:00
David Blaikie	c9ad9191a7	DebugInfo: Include the decl_line/decl_file in subprogram definitions if they differ from those in the declaration This is handy for some AutoFDO stuff, and seems like a minor improvement to correctness (otherwise a debug info consumer might think the decl line/file of the def was the same as that of the declaration - though what a consumer might use that for, I'm not sure - maybe "list <func>" would've misbehaved with the old behavior?) and at a minor cost (in my experiment, with fission, without type units, without compression, 0.01% growth in debug info in the executable/objects, 0.02% growth in the .dwo files). llvm-svn: 249487	2015-10-07 00:04:16 +00:00
David Majnemer	7735a6d07a	[WinEH] Create a separate MBB for funclet prologues Our current emission strategy is to emit the funclet prologue in the CatchPad's normal destination. This is problematic because intra-funclet control flow to the normal destination is not erroneous and results in us reevaluating the prologue if said control flow is taken. Instead, use the CatchPad's location for the funclet prologue. This correctly models our desire to have unwind edges evaluate the prologue but edges to the normal destination result in typical control flow. Differential Revision: http://reviews.llvm.org/D13424 llvm-svn: 249483	2015-10-06 23:31:59 +00:00
Hans Wennborg	083ca9bb32	Fix Clang-tidy modernize-use-nullptr warnings in source directories and generated files; other minor cleanups. Patch by Eugene Zelenko! Differential Revision: http://reviews.llvm.org/D13321 llvm-svn: 249482	2015-10-06 23:24:35 +00:00
Lang Hames	44780acd91	[Orc] Teach the CompileOnDemand layer to clone aliases. This allows modules containing aliases to be lazily jit'd. Previously these failed with missing symbol errors because the aliases weren't cloned from the original module. llvm-svn: 249481	2015-10-06 22:55:05 +00:00
Duncan P. N. Exon Smith	55a0c43f8e	IR: Use auto for iterators, NFC llvm-svn: 249480	2015-10-06 22:37:47 +00:00
Duncan P. N. Exon Smith	d146e7d93e	IR: Remove unnecessary TraitsClass typedef, NFC No classes are specializing the symbol table traits, so no need to look through a typedef for class API. Make a few more functions private since only SymbolTableListTraits should be using them. llvm-svn: 249476	2015-10-06 22:14:06 +00:00
Sanjoy Das	5c8bead46d	[IndVars] Don't break dominance in `eliminateIdentitySCEV` Summary: After r249211, `getSCEV(X) == getSCEV(Y)` does not guarantee that X and Y are related in the dominator tree, even if X is an operand to Y (I've included a toy example in comments, and a real example as a test case). This commit changes `SimplifyIndVar` to require a `DominatorTree`. I don't think this is a problem because `ScalarEvolution` requires it anyway. Fixes PR25051. Depends on D13459. Reviewers: atrick, hfinkel Subscribers: joker.eph, llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D13460 llvm-svn: 249471	2015-10-06 21:44:49 +00:00
Sanjoy Das	088bb0ea9f	[IndVars] Extract out eliminateIdentitySCEV, NFC Summary: Reflow a comment while at it. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13459 llvm-svn: 249470	2015-10-06 21:44:39 +00:00
Duncan P. N. Exon Smith	c77e92da92	IR: Remove unnecessary specialization of getSymTab(), NFC The only specializations of `getSymTab()` were identical to the default defined in `SymbolTableListTraits::getSymTab()`. Remove the specializations, and stop treating it like a configuration point. Just to be sure no one else accesses this, make it private. llvm-svn: 249469	2015-10-06 21:31:07 +00:00
Tom Stellard	0fbf899c0f	AMDGPU/SI: Remove calling convention assertion from LowerFormalArguments() Summary: We currently ignore the calling convention, so there is no real reason to assert on the calling convention of functions. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13367 llvm-svn: 249468	2015-10-06 21:16:34 +00:00
Chad Rosier	dca46b426f	[ARM] Minor refactoring. NFC. llvm-svn: 249465	2015-10-06 20:58:42 +00:00
Chad Rosier	aed910b7d7	[ARM] Minor refactoring. NFC. llvm-svn: 249464	2015-10-06 20:51:26 +00:00
Chad Rosier	9df4aff86d	[ARM] Minor refactoring. NFC. llvm-svn: 249463	2015-10-06 20:45:45 +00:00
Vedant Kumar	1ab5ea564f	[Function] Clean up {prefix,prologue} data routines (NFC) Factor out some common code used to get+set function prefix/prologue data. This may come in handy if we ever decide to store personality functions in the same way we store prefix/prologue data. Differential Revision: http://reviews.llvm.org/D13120 Reviewed-by: bogner llvm-svn: 249460	2015-10-06 20:31:57 +00:00
Joseph Tremoulet	7f8c1165cd	[WinEH] Implement state numbering for CoreCLR Summary: Assign one state number per handler/funclet, tracking parent state, handler type, and catch type token. State numbers are arranged such that ancestors have lower state numbers than their descendants. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: pgavlin, AndyAyers, llvm-commits Differential Revision: http://reviews.llvm.org/D13450 llvm-svn: 249457	2015-10-06 20:30:33 +00:00
Joseph Tremoulet	2afea5438f	[WinEH] Recognize CoreCLR personality function Summary: - Add CoreCLR to if/else ladders and switches as appropriate. - Rename isMSVCEHPersonality to isFuncletEHPersonality to better reflect what it captures. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: pgavlin, AndyAyers, llvm-commits Differential Revision: http://reviews.llvm.org/D13449 llvm-svn: 249455	2015-10-06 20:28:16 +00:00
Chad Rosier	a087fd21da	[ARM] Minor refactoring to improve readability. NFC. llvm-svn: 249454	2015-10-06 20:23:42 +00:00
Philip Reames	675418ebc0	Extend known bits to understand @llvm.bswap This is a cleaned up patch from the one written by John Regehr based on the findings of the Souper superoptimizer. When writing tests, I was surprised to find that instsimplify apparently doesn't know how to collapse bit test sequences based purely on known bits. This required me to split my tests across both instsimplify and instcombine. Differential Revision: http://reviews.llvm.org/D13250 llvm-svn: 249453	2015-10-06 20:20:45 +00:00
Philip Reames	600a91580f	Fix pr25040 - Handle vectors of i1s in recently added implication code As mentioned in the bug, I'd missed the presence of a getScalarType in the caller of the new implies method. As a result, when we ended up with a implication over two vectors, we'd trip an assert and crash. Differential Revision: http://reviews.llvm.org/D13441 llvm-svn: 249442	2015-10-06 19:00:02 +00:00
Krzysztof Parzyszek	8d2b2cfa29	[Hexagon] Remove ZeroOrMore from option flags llvm-svn: 249438	2015-10-06 18:29:36 +00:00
Mehdi Amini	cf2513b352	This patch builds on top of D13378 to handle constant condition. With this patch, clang -O3 optimizes correctly providing > 1000x speedup on this artificial benchmark): for (a=0; a<n; a++) for (b=0; b<n; b++) for (c=0; c<n; c++) for (d=0; d<n; d++) for (e=0; e<n; e++) for (f=0; f<n; f++) x++; From test-suite/SingleSource/Benchmarks/Shootout/nestedloop.c Reviewers: sanjoyd Differential Revision: http://reviews.llvm.org/D13390 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 249431	2015-10-06 17:19:20 +00:00
Tom Stellard	88e0b25181	AMDGPU/SI: Add 64-bit versions of v_nop and v_clrexcp Summary: The assembly printing of these is still missing the encoding size suffix, but this will be fixed in a later commit. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13436 llvm-svn: 249424	2015-10-06 15:57:53 +00:00
Krzysztof Parzyszek	fb33824efd	[Hexagon] Add an early if-conversion pass llvm-svn: 249423	2015-10-06 15:49:14 +00:00
Daniel Sanders	1b3341724c	[mips][microMIPS] Fix an issue with selecting sqrt instruction in LLVM backend Summary: This fixes 7 tests during fast LLVM test-suite run: * MultiSource/Benchmarks/McCat/18-imp/imp * MultiSource/Applications/oggenc/oggenc * MultiSource/Benchmarks/MallocBench/gs/gs * MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan * MultiSource/Benchmarks/VersaBench/beamformer/beamformer * MultiSource/Benchmarks/MiBench/consumer-lame/consumer-lame * MultiSource/Benchmarks/Bullet/bullet Error message was in the form of: fatal error: error in backend: Cannot select: 0x95c3288: f32 = fsqrt 0x95c0190 [ORD=9] [ID=18] 0x95c0190: f32 = fadd 0x95bef30, 0x95c4d00 [ORD=8] [ID=17] 0x95bef30: f32 = fmul 0x95c4988, 0x95c4988 [ORD=5] [ID=16] ... There was problem with selecting sqrt instruction in LLVM backend. To fix the issue changes are made in TableGen definition for sqrt instruction in MipsInstrFPU.td and new test file sqrt.ll is added to LLVM regression tests. Patch by Zlatko Buljan Reviewers: zoran.jovanovic, hvarga, dsanders Subscribers: llvm-commits, petarj Differential Revision: http://reviews.llvm.org/D13235 llvm-svn: 249416	2015-10-06 15:17:25 +00:00
Daniel Sanders	add9057fa7	Revert r249123 - [mips][microMIPS] Fix an issue with selecting sqrt instruction in LLVM backend The author was not credited and most of the commit message is missing. Will re-commit with this fixed. llvm-svn: 249415	2015-10-06 15:13:16 +00:00
Arnaud A. de Grandmaison	6fd488b156	[EarlyCSE] Constify ParseMemoryInst methods (NFC). llvm-svn: 249400	2015-10-06 13:35:30 +00:00
Filipe Cabecinhas	b70fd8719e	Make sure the CastInst is valid before trying to create it Bug found with afl-fuzz. llvm-svn: 249396	2015-10-06 12:37:54 +00:00
Andrea Di Biagio	40f59e4466	[InstCombine] Teach SimplifyDemandedVectorElts how to handle ConstantVector select masks with ConstantExpr elements (PR24922) If the mask of a select instruction is a ConstantVector, method SimplifyDemandedVectorElts iterates over the mask elements to identify which values are selected from the select inputs. Before this patch, method SimplifyDemandedVectorElts always used method Constant::isNullValue() to check if a value in the mask was zero. Unfortunately that method always returns false when called on a ConstantExpr. This patch fixes the problem in SimplifyDemandedVectorElts by adding an explicit check for ConstantExpr values. Now, if a value in the mask is a ConstantExpr, we avoid calling isNullValue() on it. Fixes PR24922. Differential Revision: http://reviews.llvm.org/D13219 llvm-svn: 249390	2015-10-06 10:34:53 +00:00
Craig Topper	2c4068f409	[TwoAddressInstructionPass] When looking for a 3 addr conversion after commuting, make sure regB has been updated to take into account the commute. llvm-svn: 249378	2015-10-06 05:39:59 +00:00
Alexei Starovoitov	4e01a38da0	[bpf] Avoid extra pointer arithmetic for stack access For the program like below struct key_t { int pid; char name[16]; }; extern void test1(char *); int test() { struct key_t key = {}; test1(key.name); return 0; } For key.name, the llc/bpf may generate the below code: R1 = R10 // R10 is the frame pointer R1 += -24 // framepointer adjustment R1 \|= 4 // R1 is then used as the first parameter of test1 OR operation is not recognized by in-kernel verifier. This patch introduces an intermediate FI_ri instruction and generates the following code that can be properly verified: R1 = R10 R1 += -20 Patch by Yonghong Song <yhs@plumgrid.com> llvm-svn: 249371	2015-10-06 04:00:53 +00:00
Craig Topper	79dd1bf094	[X86] Teach constant hoisting that ANDs with 64-bit immediates in the range 0x80000000-0xffffffff can be handled cheaply and don't need to be hoisted. Most importantly, this keeps constant hoisting from preventing instruction selections ability to turn an AND with 0xffffffff into a move into a 32-bit subregister. llvm-svn: 249370	2015-10-06 02:50:24 +00:00
Craig Topper	d69d495333	[X86] Remove unnecessary AddComplexity directive. The instruction is already wrapped in the equivalent earlier. NFC llvm-svn: 249369	2015-10-06 02:50:21 +00:00
Dan Gohman	e51c058ecc	[WebAssembly] Switch to a more traditional assembly syntax This new syntax is built around putting each instruction on its own line in a "mnemonic op, op, op" like syntax. It also uses conventional data section directives like ".byte" and so on rather than requiring everything to be in hierarchical S-expression format. This is a more natural syntax for a ".s" file format from the perspective of LLVM MC and related tools, while remaining easy to translate into other forms as needed. llvm-svn: 249364	2015-10-06 00:27:55 +00:00
Benjamin Kramer	808d2a070d	Move helper classes into an anonymous namespace. NFC. llvm-svn: 249356	2015-10-05 21:20:26 +00:00
Diego Novillo	91cbed84d9	Remove AutoFDO profile handling for GCC's LIPO. NFC. Given the work we are doing on ThinLTO, we will never need to support module groups and working sets in GCC's implementation of LIPO. These are currently dead code, and will continue to be so. llvm-svn: 249351	2015-10-05 21:08:05 +00:00
David Majnemer	e4f9b09b51	[WinEH] Update CATCHRET's operand to match its successor The CATCHRET operand did not match the MachineFunction's CFG. This mismatch happened because FrameLowering created a new MachineBasicBlock and updated the CFG but forgot to update the CATCHRET operand. Let's make sure this doesn't happen again by strengthing the funclet membership analysis: it can now reason about the membership of all basic blocks, not just those inside of funclets. llvm-svn: 249344	2015-10-05 20:09:16 +00:00
Jakub Staszak	225d3ab801	Simplify code. No functionality change. llvm-svn: 249335	2015-10-05 18:53:30 +00:00
Evgeniy Stepanov	670abcfd78	[msan] Correct a typo in poison stack pattern command line description. Patch by Jon Eyolfson. llvm-svn: 249331	2015-10-05 18:01:17 +00:00
Tom Stellard	d585cd85a3	AMDGPU/SI: Add a helper for creating aliases for the _e32 instructions Summary: We are currently only using these aliases for VOPC instructions, but this helper will make it easier to use them everywhere. These aliases allow for the automatic matching of instructions with forced 32-bit encoding. Eventually, we should be able to remove the custom C++ logic we have for this in the assembler. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13396 llvm-svn: 249330	2015-10-05 17:57:39 +00:00
Arnold Schwaighofer	0591c5d719	MergeFunctions: Clear GlobalNumbers ValueMap Otherwise, the map will observe changes as long as MergeFunctions is alive. This is bad because follow-up passes could replace-all-uses-with on the key of an entry in the map. The value handle callback of ValueMap however asserts that the key type matches. rdar://22971893 llvm-svn: 249327	2015-10-05 17:26:36 +00:00
Scott Douglass	953f908173	[ARM] Modify codegen for memcpy intrinsic to prefer LDM/STM. We were previously codegen'ing memcpy as regular load/store operations and hoping that the register allocator would allocate registers in ascending order so that we could apply an LDM/STM combine after register allocation. According to the commit that first introduced this code (r37179), we planned to teach the register allocator to allocate the registers in ascending order. This never got implemented, and up to now we've been stuck with very poor codegen. A much simpler approach for achieving better codegen is to create MEMCPY pseudo instructions, attach scratch virtual registers to them and then, post register allocation, expand the MEMCPYs into LDM/STM pairs using the scratch registers. The register allocator will have picked arbitrary registers which we sort when expanding the MEMCPY. This approach also avoids the need to repeatedly calculate offsets which ultimately ought to be eliminated pre-RA in order to decrease register pressure. Fixes PR9199 and PR23768. [This is based on Peter Collingbourne's r238473 which was reverted.] Differential Revision: http://reviews.llvm.org/D13239 Change-Id: I727543c2e94136e0f80b8e22d5642d7b9ee5b458 Author: Peter Collingbourne <peter@pcc.me.uk> llvm-svn: 249322	2015-10-05 14:49:54 +00:00
Zoran Jovanovic	5a8dffc618	[mips][microMIPS] Implement JALRC16, JRCADDIUSP and JRC16 instructions Differential Revision: http://reviews.llvm.org/D11219 llvm-svn: 249317	2015-10-05 14:00:09 +00:00
Alexandros Lamprineas	1bab191f25	[MC layer][AArch64] llvm-mc accepts 4-bit immediate values for "msr pan, #imm", while only 1-bit immediate values should be valid. Changed encoding and decoding for msr pstate instructions. Differential Revision: http://reviews.llvm.org/D13011 llvm-svn: 249313	2015-10-05 13:42:31 +00:00
Daniel Sanders	d5a89418c5	[mips] Changed the way symbols are handled in dla and la instructions to allow simple expressions. Summary: An instruction like "(d)la $5, symbol+8" previously would have crashed the assembler as it contains an expression. This is now fixed. A few tests cases have also been changed to reflect these changes, however these should only be syntax changes. Some new test cases have also been added. Patch by Scott Egerton. Reviewers: vkalintiris, dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12760 llvm-svn: 249311	2015-10-05 13:19:29 +00:00
Benjamin Kramer	ae1d59967d	[Support] Add a version of fs::make_absolute with a custom CWD. This will be used soon from clang. llvm-svn: 249309	2015-10-05 13:02:43 +00:00
Rafael Espindola	e3a20f57d9	Fix pr24486. This extends the work done in r233995 so that now getFragment (in addition to getSection) also works for variable symbols. With that the existing logic to decide if a-b can be computed works even if a or b are variables. Given that, the expression evaluation can avoid expanding variables as aggressively and that in turn lets the relocation code see the original variable. In order for this to work with the asm streamer, there is now a dummy fragment per section. It is used to assign a section to a symbol when no other fragment exists. This patch is a joint work by Maxim Ostapenko andy myself. llvm-svn: 249303	2015-10-05 12:07:05 +00:00
David Majnemer	429c8eda22	[SelectionDAGBuilder] Remove dead code We already check for LandingPadInst two lines above. llvm-svn: 249280	2015-10-04 18:44:47 +00:00
Teresa Johnson	19f517a7d7	Remove unused private field introduced by r249270. llvm-svn: 249277	2015-10-04 15:00:55 +00:00
Teresa Johnson	403a787e03	Support for function summary index bitcode sections and files. Summary: The bitcode format is described in this document: https://drive.google.com/file/d/0B036uwnWM6RWdnBLakxmeDdOeXc/view For more info on ThinLTO see: https://sites.google.com/site/llvmthinlto The first customer is ThinLTO, however the data structures are designed and named more generally based on prior feedback. There are a few comments regarding how certain interfaces are used by ThinLTO, and the options added here to gold currently have ThinLTO-specific names as the behavior they provoke is currently ThinLTO-specific. This patch includes support for generating per-module function indexes, the combined index file via the gold plugin, and several tests (more are included with the associated clang patch D11908). Reviewers: dexonsmith, davidxl, joker.eph Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13107 llvm-svn: 249270	2015-10-04 14:33:43 +00:00
Joerg Sonnenberger	726e624c0c	[SPARCv9] Add support for the rdpr/wrpr instructions. llvm-svn: 249262	2015-10-04 09:11:22 +00:00
Igor Breger	78741a1b1e	AVX512: Implemented encoding and intrinsics for VPERMILPS/PD instructions. Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D12690 llvm-svn: 249261	2015-10-04 07:20:41 +00:00
David Majnemer	161935520d	[WinEH] Permit branch folding in the face of funclets Track which basic blocks belong to which funclets. Permit branch folding to fire but only if it can prove that doing so will not cause code in one funclet to be reused in another. llvm-svn: 249257	2015-10-04 02:22:52 +00:00
Jeroen Ketema	321fc30afc	Fix typo in README llvm-svn: 249253	2015-10-04 00:46:16 +00:00
Simon Pilgrim	dde63374c5	[DAGCombiner] Generalize FADD constant combines to work with vectors Updated the FADD combines to work with vectors as well as scalars. Differential Revision: http://reviews.llvm.org/D13416 llvm-svn: 249251	2015-10-03 22:06:06 +00:00
Sanjay Patel	acd4baefca	include equal sign in debug equations; NFC llvm-svn: 249248	2015-10-03 20:45:01 +00:00
Simon Pilgrim	bc707d04a4	[X86] Lower SEXTLOAD using SIGN_EXTEND_VECTOR_INREG. NCI. The custom lowering in LowerExtendedLoad is doing the equivalent shuffle, so make use of existing lowering code to reduce duplication. llvm-svn: 249243	2015-10-03 18:55:43 +00:00
Rafael Espindola	28de224002	Move registerSection out of line and reduce #includes. NFC. llvm-svn: 249241	2015-10-03 18:28:40 +00:00
Simon Pilgrim	a38d76a087	[DAGCombiner] Merge SIGN_EXTEND_INREG vector constant folding methods. NCI. visitSIGN_EXTEND_INREG calls SelectionDAG::getNode to constant fold scalar constants but handles vector constants itself, despite getNode being capable of dealing with them. This required a minor change to the getNode implementation to actually deal with cases where the scalars of a BUILD_VECTOR were wider integers than the vector type - which was the only extra ability of the visitSIGN_EXTEND_INREG implementation. No codegen intended and all existing tests remain the same. llvm-svn: 249236	2015-10-03 16:26:52 +00:00
Kostya Serebryany	c8cd29fb7e	[libFuzzer] trying to fix at-exit hang llvm-svn: 249231	2015-10-03 07:02:05 +00:00
Dan Gohman	dc51b96b7f	[WebAssembly] Implement the remaining conversion operations. This is a temporary assembly syntax that will likely evolve along with broader upcoming syntax changes. llvm-svn: 249225	2015-10-03 02:10:28 +00:00
Rafael Espindola	81413c0ca0	Use early return. NFC. llvm-svn: 249224	2015-10-03 00:57:12 +00:00
Sanjoy Das	1cd930b05f	Try to appease MSVC, NFCI. This time by lifting the lambda's in `createNodeFromSelectLikePHI` to the file scope. Looks like there are differences in capture rules between clang and MSVC? llvm-svn: 249222	2015-10-03 00:34:19 +00:00
Tom Stellard	dc9088a10e	AMDGPU/SI: Remove unused tablegen multiclass Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13395 llvm-svn: 249221	2015-10-03 00:29:50 +00:00
Rafael Espindola	2f834372f4	Disallow assigning symbol a null section. They are constructed without one and they can't go back, so this was effectively dead code. llvm-svn: 249220	2015-10-03 00:18:14 +00:00
Sanjoy Das	21ea9bdc46	Try to appease the MSVC bots, NFCI. llvm-svn: 249219	2015-10-03 00:03:15 +00:00
Dan Gohman	6a050f30de	[WebAssembly] Rename setlocal to set_local to match the spec. llvm-svn: 249218	2015-10-03 00:01:53 +00:00
Sanjoy Das	5b92acea2b	Try to appease the MSVC bots, NFC. llvm-svn: 249216	2015-10-02 23:43:32 +00:00
Kostya Serebryany	20bb5e71b2	[libFuzzer] make LLVMFuzzerTestOneInput (the fuzzer target function) return int instead of void. The actual return value is not yet used (and expected to be 0). This change is API breaking, so the fuzzers will need to be updated. llvm-svn: 249214	2015-10-02 23:34:06 +00:00
Sanjoy Das	55015d210f	[SCEV] Recognize simple br-phi patterns Summary: Teach SCEV to match patterns like ``` br %cond, label %left, label %right left: br label %merge right: br label %merge merge: V = phi [ %x, %left ], [ %y, %right ] ``` as "select %cond, %x, %y". Before this SCEV would match PHI nodes exclusively to add recurrences. This addresses PR25005. Reviewers: joker.eph, joker-eph, atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13378 llvm-svn: 249211	2015-10-02 23:09:44 +00:00
Piotr Padlewski	dc9b2cfc50	inariant.group handling in GVN The most important part required to make clang devirtualization works ( ͡°͜ʖ ͡°). The code is able to find non local dependencies, but unfortunatelly because the caller can only handle local dependencies, I had to add some restrictions to look for dependencies only in the same BB. http://reviews.llvm.org/D12992 llvm-svn: 249196	2015-10-02 22:12:22 +00:00
Kostya Serebryany	65d0a1458f	[libFuzzer] remove experimental flag and functionality llvm-svn: 249194	2015-10-02 22:00:32 +00:00
Dan Gohman	e3e4a5ff52	[WebAssembly] Fix CFG stackification of nested loops. llvm-svn: 249187	2015-10-02 21:11:36 +00:00
Dan Gohman	9cc692b06e	[WebAssembly] Support calls marked as "tail", fastcc, and coldcc. llvm-svn: 249184	2015-10-02 20:54:23 +00:00
Richard Trieu	e0129e474d	Call the correct overload. Call the correct overload so a string literal does not get converted to a bool. Also fix the test case to match the names given. llvm-svn: 249183	2015-10-02 20:52:14 +00:00
Kostya Serebryany	b85db178a0	[libFuzzer] add a flag -max_total_time llvm-svn: 249181	2015-10-02 20:47:55 +00:00
Dan Gohman	baba8c648b	[WebAssembly] Add a resize_memory intrinsic. llvm-svn: 249178	2015-10-02 20:10:26 +00:00
Sanjoy Das	d0671346ae	[SCEV] Refactor out a createNodeForSelect Summary: We will shortly re-use this for select-like br-phi pairs. Reviewers: atrick, joker-eph, joker.eph Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D13377 llvm-svn: 249177	2015-10-02 19:39:59 +00:00
Dan Gohman	72f1692a2c	[WebAssembly] Add a memory_size intrinsic. llvm-svn: 249171	2015-10-02 19:21:15 +00:00
Matt Arsenault	d092a068ba	AMDGPU/SI: Add verifier check for exec reads Make sure we aren't accidentally not setting these in the instruction definitions. llvm-svn: 249170	2015-10-02 18:58:37 +00:00
Sanjoy Das	7d910f2b11	[SCEV] Try to prove predicates by splitting them Summary: This change teaches SCEV that to prove `A u< B` it is sufficient to prove each of these facts individually: - B >= 0 - A s< B - A >= 0 In practice, SCEV sometimes finds it easier to prove these facts individually than to prove `A u< B` as one atomic step. Reviewers: reames, atrick, nlewycky, hfinkel Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D13042 llvm-svn: 249168	2015-10-02 18:50:30 +00:00
Roman Divacky	4b5507a037	Actually switch the arch when we see .arch. PR21695 llvm-svn: 249165	2015-10-02 18:25:25 +00:00
Tim Northover	8d67b8e053	ARM: diagnose invalid local fixups on Thumb1 We previously stopped producing Thumb2 relaxations when they weren't supported, but only diagnosed the case where an actual relocation was produced. We should also tell people if local symbols aren't going to work rather than silently overflowing. llvm-svn: 249164	2015-10-02 18:07:18 +00:00
Tim Northover	956b008db6	ARM: correctly align constant pool value on Thumb1 targets. Since we're using tLDRpci to access it, the constant pool's address must be 0 (mod 4). llvm-svn: 249163	2015-10-02 18:07:13 +00:00
Chad Rosier	1f385618c0	[ARM] Typo. NFC. llvm-svn: 249153	2015-10-02 16:42:59 +00:00
Andrea Di Biagio	77f62652c1	Reapply r249121 : "[FastISel][x86] Teach how to select SSE2/AVX bitcasts between 128/256-bit vector types." This patch teaches FastIsel the following two things: 1) On SSE2, no instructions are needed for bitcasts between 128-bit vector types; 2) On AVX, no instructions are needed for bitcasts between 256-bit vector types. Example: %1 = bitcast <4 x i31> %V to <2 x i64> Before (-fast-isel -fast-isel-abort=1): FastIsel miss: %1 = bitcast <4 x i31> %V to <2 x i64> Now we don't fall back to SelectionDAG and we correctly fold that computation propagating the register associated to %V. Originally reviewed here: http://reviews.llvm.org/D13347 llvm-svn: 249147	2015-10-02 16:08:05 +00:00
Andrea Di Biagio	45874e67a1	Revert: [FastISel][x86] Teach how to select SSE2/AVX bitcasts between 128/256-bit vector types. r249121 caused a Clang test failure (avx2-buitins.c). Revert r249121 while I keep investigating on the reason why that test failed. llvm-svn: 249124	2015-10-02 13:06:19 +00:00
Zoran Jovanovic	9ffdfa5986	[mips][microMIPS] Fix an issue with selecting sqrt instruction in LLVM backend Differential Revision: http://reviews.llvm.org/D13235 llvm-svn: 249123	2015-10-02 13:06:02 +00:00
Andrea Di Biagio	cb33456122	[FastISel][x86] Teach how to select SSE2/AVX bitcasts between 128/256-bit vector types. This patch teaches FastIsel the following two things: 1) On SSE2, no instructions are needed for bitcasts between 128-bit vector types; 2) On AVX, no instructions are needed for bitcasts between 256-bit vector types. Example: %1 = bitcast <4 x i31> %V to <2 x i64> Before (-fast-isel -fast-isel-abort=1): FastIsel miss: %1 = bitcast <4 x i31> %V to <2 x i64> Now we don't fall back to SelectionDAG and we correctly fold that computation propagating the register associated to %V. Differential Revision: http://reviews.llvm.org/D13347 llvm-svn: 249121	2015-10-02 12:45:37 +00:00
Ivan Krasin	95e82d5b48	[LibFuzzer] test_single_input option to run a single test case. -test_single_input flag specifies a file name with test data. Review URL: http://reviews.llvm.org/D13359 Patch by Mike Aizatsky! llvm-svn: 249096	2015-10-01 23:23:06 +00:00
Bruno Cardoso Lopes	b491a2d641	[SimplifyLibCalls] Fix instruction misplacement in string/memory libcall optimization When trying to optimize fortified library functions use the right location to insert new instructions in order to preserve correct def-use order. This fixes an issue where a misplaced instruction definition would happen to be after one of its use after a RAUW, forming invalid IR. This behavior was introduced by r227250. Differential Revision: http://reviews.llvm.org/D13301 rdar://problem/22802369 llvm-svn: 249092	2015-10-01 22:43:53 +00:00
Matt Arsenault	b733f00510	AMDGPU: Fix unused variable warning in release build llvm-svn: 249091	2015-10-01 22:40:35 +00:00
Matt Arsenault	b87fc22915	AMDGPU: Move SIFixSGPRLiveRanges to be a regalloc pass Replace LiveInterval usage with LiveVariables. LiveIntervals computes far more information than is needed for this pass which just needs to find if an SGPR is live out of the defining block. LiveIntervals are not usually available that early, requiring computing them twice which is very expensive. The extra run of LiveIntervals/LiveVariables/SlotIndexes was costing in total about 5% of compile time. Continuing to use LiveIntervals is problematic. It seems there is an option (early-live-intervals) to run the analysis about where it should go to avoid recomputing LiveVariables, but it seems to be completely broken with subreg liveness enabled. There are also problems from trying to recompute LiveIntervals since this seems to undo LiveVariables and clearing kill flags, causing TwoAddressInstructions to make bad decisions. Insert the pass right after live variables and preserve it. The tricky case to worry about might be phis since LiveVariables doesn't count a register as live out if in the successor block it is only used in a phi, but I don't think this is a concern right now because SIFixSGPRCopies replaces SGPR phis. llvm-svn: 249087	2015-10-01 22:10:03 +00:00
Joerg Sonnenberger	c8d50d6347	Fix relocation used for GOT references in non-PIC mode. Fix relocations for "set" pseudo op in PIC mode. Differential Revision: http://reviews.llvm.org/D13173 llvm-svn: 249086	2015-10-01 22:08:20 +00:00
Matt Arsenault	d2c7589f93	AMDGPU: Merge if and switch llvm-svn: 249082	2015-10-01 21:51:59 +00:00
Matt Arsenault	db7f0ef367	AMDGPU: Remove dead code There's no point in checking VReg_1 because all uses of it should already have been removed by SILowerI1Copies. llvm-svn: 249081	2015-10-01 21:51:57 +00:00
Matt Arsenault	d1d499aa56	AMDGPU: Make SIInsertWaits about a factor of 4 faster This was the slowest target custom pass and was spending 80% of the time in getMinimalPhysRegClass which was called for every register operand. Try to use the statically known register class when possible from the instruction's MCOperandInfo. There are a few pseudo instructions which are not well behaved with unknown register classes which still require the expensive physical register class search. There are a few other possibilities for making this even faster, such as not inspecting implicit operands. For now those are checked because it is technically possible to have a scalar load into exec or vcc which can be implicitly used. llvm-svn: 249079	2015-10-01 21:43:15 +00:00
Reid Kleckner	fc64fae6e3	[WinEH] Emit __C_specific_handler tables for the new IR We emit denormalized tables, where every range of invokes in the same state gets a complete list of EH action entries. This is significantly simpler than trying to infer the correct nested scoping structure from the MI. Fortunately, for SEH, the nesting structure is really just a size optimization. With this, some basic __try / __except examples work. llvm-svn: 249078	2015-10-01 21:38:24 +00:00
Tom Stellard	e9f8b24985	AMDGPU/SI: Remove assert from AMDGPUOpenCLImageTypeLowering pass Summary: Instead of asserting when the kernel metadata is different than we expect, we should just skip lowering that function. This fixes assertion failures with OpenCL argument metadata from older LLVM releases. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13356 llvm-svn: 249073	2015-10-01 21:16:05 +00:00
David Majnemer	4600c06434	[WinEH] Stop BranchFolding from merging across funclets BranchFolding would merge two funclets together, this is not OK. Disable this and strengthen the assertion in FuncletLayout. llvm-svn: 249069	2015-10-01 21:04:13 +00:00
David Majnemer	f828a0ccc7	[WinEH] Make FuncletLayout more robust against catchret Catchret transfers control from a catch funclet to an earlier funclet. However, it is not completely clear which funclet the catchret target is part of. Make this clear by stapling the catchret target's funclet membership onto the CATCHRET SDAG node. llvm-svn: 249052	2015-10-01 18:44:59 +00:00
Chad Rosier	f11d040f01	[AArch64] Deprecate a command-line option used for testing. Support for pairing unscaled loads and stores has been enabled since the original ARM64 port. This feature is no longer experimental, AFAICT. llvm-svn: 249049	2015-10-01 18:17:12 +00:00
Jonas Paulsson	12629324a4	[SystemZ] Add some generic (floating point support) load instructions. Add generic instructions for load complement, load negative and load positive for fp32 and fp64, and let isel prefer them. They do not clobber CC, and so give scheduler more freedom. SystemZElimCompare pass will convert them when it can to the CC-setting variants. Regression tests updated to expect the new opcodes in places where the old ones where used. New test case SystemZ/fp-cmp-05.ll checks that SystemZCompareElim.cpp can handle the new opcodes. README.txt updated (bullet removed). Note that fp128 is not yet handled, because it is relatively rare, and is a bit trickier, because of the fact that l.dfr would operate on the sign bit of one of the subregisters of a fp128, but we would not want to copy the other sub-reg in case src and dst regs are not the same. Reviewed by Ulrich Weigand. llvm-svn: 249046	2015-10-01 18:12:28 +00:00
Tom Stellard	e0e582c9aa	AMDGPU: Add MEM_RAT STORE_TYPED. v2: Add test (Matt). Fix capitalization of isEOP (Matt). Move pattern to class parameter (Matt). Make the instruction available to Cayman (Matt). Change name from MEM_RAT WRITE_TYPED to MEM_RAT STORE_TYPED. Patch by: Zoltan Gilian llvm-svn: 249042	2015-10-01 17:51:34 +00:00
Tom Stellard	c0f0fba2c4	AMDGPU: Factor out EOP query. v2: Fix brace placement and capitalization (Matt). Patch by: Zoltan Gilian llvm-svn: 249041	2015-10-01 17:51:29 +00:00
NAKAMURA Takumi	096492a07b	Reformat. llvm-svn: 249033	2015-10-01 17:01:03 +00:00
NAKAMURA Takumi	1ed20db720	Revert r248959, "[WinEH] Emit int3 after noreturn calls on Win64" It broke; LLVM :: CodeGen__Generic__2009-11-16-BadKillsCrash.ll llvm-svn: 249032	2015-10-01 17:00:56 +00:00
Arnaud A. de Grandmaison	849f3bf8c9	[InstCombine] Remove trivially empty lifetime start/end ranges. Summary: Some passes may open up opportunities for optimizations, leaving empty lifetime start/end ranges. For example, with the following code: void foo(char , char ); void bar(int Size, bool flag) { for (int i = 0; i < Size; ++i) { char text[1]; char buff[1]; if (flag) foo(text, buff); // BBFoo } } the loop unswitch pass will create 2 versions of the loop, one with flag==true, and the other one with flag==false, but always leaving the BBFoo basic block, with lifetime ranges covering the scope of the for loop. Simplify CFG will then remove BBFoo in the case where flag==false, but will leave the lifetime markers. This patch teaches InstCombine to remove trivially empty lifetime marker ranges, that is ranges ending right after they were started (ignoring debug info or other lifetime markers in the range). This fixes PR24598: excessive compile time after r234581. Reviewers: reames, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13305 llvm-svn: 249018	2015-10-01 14:54:31 +00:00
Ulrich Weigand	cf1670a095	[SystemZ] Add assembly instructions for obtaining clock values as well as CPU features Provide assembler support for STCK, STCKF, STCKE, and STFLE. Author: joncmu Differential Revision: http://reviews.llvm.org/D13299 llvm-svn: 249015	2015-10-01 14:43:48 +00:00
Chad Rosier	b7c5b91068	[AArch64] Hoist commonly failing check. NFC. llvm-svn: 249011	2015-10-01 13:43:05 +00:00
Chad Rosier	0b15e7c618	[AArch64] Rename variable to improve readability. NFC. llvm-svn: 249008	2015-10-01 13:33:31 +00:00
Chad Rosier	7a83d770ae	[AArch64] Update comment to reflect reality. llvm-svn: 249007	2015-10-01 13:09:44 +00:00
Zoran Jovanovic	2960f3a346	[mips][microMIPS] Implement CACHEE, WRPGPR and WSBH instructions Differential Revision: http://reviews.llvm.org/D10337 llvm-svn: 249004	2015-10-01 12:49:27 +00:00
Scott Douglass	290183d734	[ARM] More care with Thumb1 writeback in ARMLoadStoreOptimizer Differential Revision: http://reviews.llvm.org/D13240 llvm-svn: 249002	2015-10-01 11:56:19 +00:00
Jingyue Wu	df1a1b113b	[NaryReassociate] SeenExprs records WeakVH Summary: The instructions SeenExprs records may be deleted during rewriting. FindClosestMatchingDominator should ignore these deleted instructions. Fixes PR24301. Reviewers: grosser Subscribers: grosser, llvm-commits Differential Revision: http://reviews.llvm.org/D13315 llvm-svn: 248983	2015-10-01 03:51:44 +00:00
Keno Fischer	17433bd102	Fix performance problem in long-running SectionMemoryManagers Summary: Without this patch, the memory manager would call `mprotect` on every memory region it ever allocated whenever it wanted to finalize memory (i.e. not just the ones it just allocated). This caused terrible performance problems for long running memory managers. In one particular compile heavy julia benchmark, we were spending 50% of time in `mprotect` if running under MCJIT. Fix this by splitting allocated memory blocks into those on which memory permissions have been set and those on which they haven't and only running `mprotect` on the latter. Reviewers: lhames Subscribers: reames, llvm-commits Differential Revision: http://reviews.llvm.org/D13156 llvm-svn: 248981	2015-10-01 02:45:07 +00:00
Tom Stellard	1f0e7bbc5b	AMDGPU/SI: Re-order PreloadedValue enum and number entries based on init order Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D12451 llvm-svn: 248978	2015-10-01 02:02:46 +00:00
Dehao Chen	7c41dd6498	Update sample profile propagation algorithm. http://reviews.llvm.org/D13218 llvm-svn: 248968	2015-10-01 00:26:56 +00:00
Ahmed Bougacha	23a0d1a1d6	[X86] Don't custom-lower vNi32 uint_to_fp when unsafe-fp-math. The custom code produces incorrect results if later reassociated. Since r221657, on x86, vNi32 uitofp is lowered using an optimized sequence: movdqa LCPI0_0(%rip), %xmm1 ## xmm1 = [65535, ...] pand %xmm0, %xmm1 por LCPI0_1(%rip), %xmm1 ## [0x4b000000, ...] psrld $16, %xmm0 por LCPI0_2(%rip), %xmm0 ## [0x53000000, ...] addps LCPI0_3(%rip), %xmm0 ## [float -5.497642e+11, ...] addps %xmm1, %xmm0 Since r240361, the machine combiner opportunistically reassociates 2-instruction sequences (with -ffast-math). In the new code sequence, the ADDPS' are eligible. In isolation, for simple examples (without reassociable users), this makes no performance difference (the goal being to enable reassociation of longer chains). In the trivial example (just one uitofp), the reassociation doesn't happen, because (I think) it would require the emission of a separate movaps for a constantpool load (instead of folding it into addps). However, when we have multiple uitofp sequences, and the constantpool loads are CSE'd earlier, the machine combiner can do the reassociation. When the ADDPS' are reassociated, the resulting sequence isn't correct anymore, as we'd be adding large (239) constants with comparatively smaller values (~223). Given that two of the three inputs are powers of 2 larger than 216, and that ulp(239) == 2(39-24) == 215, the reassociated chain will produce 0 for any input in [0, 214[. In my testing, it also produces wrong results for 99.5% of [0, 232[. Avoid this by disabling the new lowering when -ffast-math. It does mean that we'll get slower code than without it, but at least we won't get egregiously incorrect code. One might argue that, considering -ffast-math is all but meaningless, uitofp producing wrong results isn't a compiler bug. But it really is. Fixes PR24512. ...though this is really more of a workaround. Ideally, we'd have some sort of Machine FMF, but that's a problem that's not worth tackling until we do more with machine IR. llvm-svn: 248965	2015-10-01 00:11:07 +00:00
Reid Kleckner	6dec87a8a0	[WinEH] Emit int3 after noreturn calls on Win64 The Win64 unwinder disassembles forwards from each PC to try to determine if this PC is in an epilogue. If so, it skips calling the EH personality function for that frame. Typically, this means you cannot catch an exception in the same frame that you threw it, because 'throw' calls a noreturn runtime function. Previously we avoided this problem with the TrapUnreachable TargetOption, but that's a much bigger hammer than we need. All we need is a 1 byte non-epilogue instruction right after the call. Instead, what we got was an unconditional branch to a shared block containing the ud2, potentially 7 bytes instead of 1. So, this reverts r206684, which added TrapUnreachable, and replaces it with something better. The new code pattern matches for invoke/call followed by unreachable and inserts an int3 into the DAG. To be 100% watertight, we would need to insert SEH_Epilogue instructions into all basic blocks ending in a call with no terminators or successors, but in practice this is unlikely to come up. llvm-svn: 248959	2015-09-30 23:09:23 +00:00
Sanjay Patel	a114a10bbe	[x86] enable machine combiner reassociations for 256-bit vector logical integer insts llvm-svn: 248955	2015-09-30 22:25:55 +00:00
Kostya Serebryany	3287d7a6ed	[libFuzzer] Marking exported symbols as visible. Patch by Mike Aizatsky llvm-svn: 248954	2015-09-30 22:22:37 +00:00
Michael Zolotukhin	fc783e91e0	[SLP] Don't vectorize loads of non-packed types (like i1, i2). Summary: Given an array of i2 elements, 4 consecutive scalar loads will be lowered to i8-sized loads and thus will access 4 consecutive bytes in memory. If we vectorize these loads into a single <4 x i2> load, it'll access only 1 byte in memory. Hence, we should prohibit vectorization in such cases. PS: Initial patch was proposed by Arnold. Reviewers: aschwaighofer, nadav, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13277 llvm-svn: 248943	2015-09-30 21:05:43 +00:00
David Blaikie	757908e545	Fix -Wsign-compare warning llvm-svn: 248942	2015-09-30 20:37:48 +00:00
Evgeniy Stepanov	f608111d1b	Fix debug info with SafeStack. llvm-svn: 248933	2015-09-30 19:55:43 +00:00
Chad Rosier	11c825f7db	[AArch64] Remove an unnecessary restriction on pre-index instructions. Previously, the index was constrained to the size of the memory operation for no apparent reason. This change removes that constraint so that we can form pre-index instructions with any valid offset. llvm-svn: 248931	2015-09-30 19:44:40 +00:00
Fiona Glaser	b0c6d9174e	DeadCodeElimination: rewrite to be faster Same strategy as simplifyInstructionsInBlock. ~1/3 less time on my test suite. This pass doesn't have many in-tree users, but getting rid of an O(N^2) worst case and making it cleaner should at least make it a viable alternative to ADCE, since it's now consistently somewhat faster. llvm-svn: 248927	2015-09-30 17:49:49 +00:00
Hal Finkel	4c45775880	[PowerPC] Disable shrink wrapping Shrink wrapping is causing a self-hosting failure on PPC64/Linux. Disable for now until the problem can be fixed. llvm-svn: 248924	2015-09-30 17:29:03 +00:00
Artyom Skrobov	72ca6b8f3f	[ARM] Support for ARMv6-Z / ARMv6-ZK missing As Richard Barton observed at http://reviews.llvm.org/D12937#inline-107121 TargetParser in LLVM has insufficient support for ARMv6Z and ARMv6ZK. In particular, there were no tests for TrustZone being supported in these architectures. The patch clears a FIXME: left by Saleem Abdulrasool in r201471, and fixes his test case which hadn't really been testing what it was claiming to test. Differential Revision: http://reviews.llvm.org/D13236 llvm-svn: 248921	2015-09-30 17:25:52 +00:00
Erik Eckstein	848c1aa452	SLPVectorizer: limit the scheduling region size per basic block. Usually large blocks are not a problem. But if a large block (> 10k instructions) contains many (potential) chains of vector instructions, and those chains are spread over a wide range of instructions, then scheduling becomes a compile time problem. This change introduces a limit for the accumulate scheduling region size of a block. For real-world functions this limit will never be exceeded (it's about 10x larger than the maximum value seen in the test-suite and external test suite). llvm-svn: 248917	2015-09-30 17:00:44 +00:00
Chad Rosier	4f04e2ec87	[AArch64] Use helper function to improve readability. NFC. llvm-svn: 248914	2015-09-30 16:50:41 +00:00
Andrea Di Biagio	0594e2a1e9	[InstCombine] Teach how to convert SSSE3/AVX2 byte shuffles to builtin shuffles if the shuffle mask is constant. This patch teaches InstCombiner how to convert a SSSE3/AVX2 byte shuffle to a builtin shuffle if the mask is constant. Converting byte shuffle intrinsic calls to builtin shuffles can help finding more opportunities for combining shuffles later on in selection dag. We may end up with byte shuffles with constant masks as the result of inlining. Differential Revision: http://reviews.llvm.org/D13252 llvm-svn: 248913	2015-09-30 16:44:39 +00:00
Artur Pilipenko	029d8531e6	Refactor computeKnownBits alignment handling code Reviewed By: reames, hfinkel Differential Revision: http://reviews.llvm.org/D12958 llvm-svn: 248892	2015-09-30 11:55:45 +00:00
Jeroen Ketema	ab99b59e8c	[ARM][NEON] Use address space in vld([1234]\|[234]lane) and vst([1234]\|[234]lane) instructions This commit changes the interface of the vld[1234], vld[234]lane, and vst[1234], vst[234]lane ARM neon intrinsics and associates an address space with the pointer that these intrinsics take. This changes, e.g., <2 x i32> @llvm.arm.neon.vld1.v2i32(i8, i32) to <2 x i32> @llvm.arm.neon.vld1.v2i32.p0i8(i8, i32) This change ensures that address spaces are fully taken into account in the ARM target during lowering of interleaved loads and stores. Differential Revision: http://reviews.llvm.org/D12985 llvm-svn: 248887	2015-09-30 10:56:37 +00:00
Simon Pilgrim	3d11c994f7	[X86][XOP] Added support for the lowering of 128-bit vector shifts to XOP shift instructions The XOP shifts just have logical/arithmetic versions and the left/right shifts are controlled by whether the value is positive/negative. Because of this I've added new X86ISD nodes instead of trying to force them to use the existing shift nodes. Additionally Excavator cores (bdver4) support XOP and AVX2 - meaning that it should use the AVX2 shifts when it can and fall back to XOP in other cases. Differential Revision: http://reviews.llvm.org/D8690 llvm-svn: 248878	2015-09-30 08:17:50 +00:00
Justin Bogner	75df7187f3	InstrProf: Don't call std::unique twice here llvm-svn: 248872	2015-09-30 02:02:08 +00:00
Dehao Chen	6722688eaa	http://reviews.llvm.org/D13145 Support hierarachical sample profile format. llvm-svn: 248865	2015-09-30 00:42:46 +00:00
Evgeniy Stepanov	d3f544f271	[safestack] Fix a stupid mix-up in the direct-tls code path. llvm-svn: 248863	2015-09-30 00:01:47 +00:00
Marek Olsak	d1a69a2839	AMDGPU/SI: Don't set DATA_FORMAT if ADD_TID_ENABLE is set to prevent setting a huge stride, because DATA_FORMAT has a different meaning if ADD_TID_ENABLE is set. This is a candidate for stable llvm 3.7. Tested-and-Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 248858	2015-09-29 23:37:32 +00:00
Reid Kleckner	a13dfd539b	[WinEH] Setup RBP correctly in Win64 funclet prologues Previously local variable captures just didn't work in 64-bit. Now we can access local variables more or less correctly. llvm-svn: 248857	2015-09-29 23:32:01 +00:00
David Majnemer	91b0ab9172	[WinEH] Ensure that funclets obey the x64 ABI The x64 ABI requires that epilogues do not contain code other than stack adjustments and some limited control flow. However, we'd insert code to initialize the return address after stack adjustments. Instead, insert EAX/RAX with the current value before we create the stack adjustments in the epilogue. llvm-svn: 248839	2015-09-29 22:33:36 +00:00
Justin Bogner	9e9a057a9b	InstrProf: Support for value profiling in the indexed profile format Add support to the indexed instrprof reader and writer for the format that will be used for value profiling. Patch by Betul Buyukkurt, with minor modifications. llvm-svn: 248833	2015-09-29 22:13:58 +00:00
Maksim Panchenko	cce239c45d	HHVM calling conventions. HHVM calling convention, hhvmcc, is used by HHVM JIT for functions in translated cache. We currently support LLVM back end to generate code for X86-64 and may support other architectures in the future. In HHVM calling convention any GP register could be used to pass and return values, with the exception of R12 which is reserved for thread-local area and is callee-saved. Other than R12, we always pass RBX and RBP as args, which are our virtual machine's stack pointer and frame pointer respectively. When we enter translation cache via hhvmcc function, we expect the stack to be aligned at 16 bytes, i.e. skewed by 8 bytes as opposed to standard ABI alignment. This affects stack object alignment and stack adjustments for function calls. One extra calling convention, hhvm_ccc, is used to call C++ helpers from HHVM's translation cache. It is almost identical to standard C calling convention with an exception of first argument which is passed in RBP (before we use RDI, RSI, etc.) Differential Revision: http://reviews.llvm.org/D12681 llvm-svn: 248832	2015-09-29 22:09:16 +00:00
Chad Rosier	4315012769	[AArch64] Add support for pre- and post-index LDPSWs. llvm-svn: 248825	2015-09-29 20:39:55 +00:00
David Majnemer	a80c151286	[WinEH] Teach AsmPrinter about funclets Summary: Funclets have been turned into functions by the time they hit the object file. Make sure that they have decent names for the symbol table and CFI directives explaining how to reason about their prologues. Differential Revision: http://reviews.llvm.org/D13261 llvm-svn: 248824	2015-09-29 20:12:33 +00:00
Cong Hou	166e08542e	Rename some function arguments in MachineBasicBlock.cpp/h by turning the first letter into upper case. NFC. llvm-svn: 248821	2015-09-29 19:46:09 +00:00
Dehao Chen	8e7df83e6a	http://reviews.llvm.org/D13231 Change lookup functions to const functions. llvm-svn: 248818	2015-09-29 18:28:15 +00:00
Chad Rosier	dabe2534ed	[AArch64] Add integer pre- and post-index halfword/byte loads and stores. llvm-svn: 248817	2015-09-29 18:26:15 +00:00
Dehao Chen	028e122ca9	Revert r248810 which breaks tests. llvm-svn: 248814	2015-09-29 18:18:49 +00:00
Dehao Chen	410a25aa7a	http://reviews.llvm.org/D13231 Change lookup functions to const functions. llvm-svn: 248810	2015-09-29 17:59:58 +00:00
Nemanja Ivanovic	2c84b29464	Addition of interfaces the BE to conform to Table A-2 of ELF V2 ABI V1.1 This patch corresponds to review: http://reviews.llvm.org/D13191 Back end portion of the fifth round of additions to altivec.h. llvm-svn: 248809	2015-09-29 17:41:53 +00:00
Chad Rosier	32d4d37e61	[AArch64] Scale offsets by the size of the memory operation. NFC. The immediate in the load/store should be scaled by the size of the memory operation, not the size of the register being loaded/stored. This change gets us one step closer to forming LDPSW instructions. This change also enables pre- and post-indexing for halfword and byte loads and stores. llvm-svn: 248804	2015-09-29 16:07:32 +00:00
Igor Laevsky	cea9ede74e	[ValueTracking] Lower dom-conditions-dom-blocks and dom-conditions-max-uses thresholds On some of our benchmarks this change shows about 50% compile time improvement without any noticeable performance difference. Differential Revision: http://reviews.llvm.org/D13248 llvm-svn: 248801	2015-09-29 14:57:52 +00:00
Chad Rosier	a4d3217e81	[AArch64] Remove some redundant cases. NFC. llvm-svn: 248800	2015-09-29 14:57:10 +00:00
James Molloy	897048bee3	[ValueTracking] Teach isKnownNonZero about monotonically increasing PHIs If a PHI starts at a non-negative constant, monotonically increases (only adds of a constant are supported at the moment) and that add does not wrap, then the PHI is known never to be zero. llvm-svn: 248796	2015-09-29 14:08:45 +00:00
Jeroen Ketema	740f9d79ca	Arguments spilled on the stack before a function call may have alignment requirements, for example in the case of vectors. These requirements are exploited by the code generator by using move instructions that have similar alignment requirements, e.g., movaps on x86. Although the code generator properly aligns the arguments with respect to the displacement of the stack pointer it computes, the displacement itself may cause misalignment. For example if we have %3 = load <16 x float>, <16 x float>* %1, align 64 call void @bar(<16 x float> %3, i32 0) the x86 back-end emits: movaps 32(%ecx), %xmm2 movaps (%ecx), %xmm0 movaps 16(%ecx), %xmm1 movaps 48(%ecx), %xmm3 subl $20, %esp <-- if %esp was 16-byte aligned before this instruction, it no longer will be afterwards movaps %xmm3, (%esp) <-- movaps requires 16-byte alignment, while %esp is not aligned as such. movl $0, 16(%esp) calll __bar To solve this, we need to make sure that the computed value with which the stack pointer is changed is a multiple af the maximal alignment seen during its computation. With this change we get proper alignment: subl $32, %esp movaps %xmm3, (%esp) Differential Revision: http://reviews.llvm.org/D12337 llvm-svn: 248786	2015-09-29 10:12:57 +00:00
Simon Pilgrim	43f5e0848e	[InstCombine] Improve Vector Demanded Bits Through Bitcasts Currently SimplifyDemandedVectorElts can only peek through bitcasts if the vectors have the same number of elements. This patch fixes and enables some existing (disabled) code to support bitcasting to vectors with more/fewer elements. It currently only accepts cases when vectors alias cleanly (i.e. number of elements are an exact multiple of the other vector). This was added to improve the demanded vector elements support for SSE vector shifts which require the __m128i (<2 x i64>) argument type to be bitcast to the vector type for the builtin shift. I've added extra tests for various additional bitcasts. Differential Revision: http://reviews.llvm.org/D12935 llvm-svn: 248784	2015-09-29 08:19:11 +00:00
Chen Li	9f27fc0599	[LoopUnswitch] Add block frequency analysis to recognize hot/cold regions Summary: This patch adds block frequency analysis to LoopUnswitch pass to recognize hot/cold regions. For cold regions the pass only performs trivial unswitches since they do not increase code size, and for hot regions everything works as before. This helps to minimize code growth in cold regions and be more aggressive in hot regions. Currently the default cold regions are blocks with frequencies below 20% of function entry frequency, and it can be adjusted via -loop-unswitch-cold-block-frequency flag. The entire feature is controlled via -loop-unswitch-with-block-frequency flag and it is off by default. Reviewers: broune, silvas, dnovillo, reames Subscribers: davidxl, llvm-commits Differential Revision: http://reviews.llvm.org/D11605 llvm-svn: 248777	2015-09-29 05:03:32 +00:00
NAKAMURA Takumi	0c12a3949e	[CMake] X86AsmParser: Prune redundant LINK_LIBS. It is described in LLVMBuild.txt. llvm-svn: 248771	2015-09-29 01:25:01 +00:00
Evgeniy Stepanov	d8b86f7cdc	Move dbg.declare intrinsics when merging and replacing allocas. Place new and update dbg.declare calls immediately after the corresponding alloca. Current code in replaceDbgDeclareForAlloca puts the new dbg.declare at the end of the basic block. LLVM codegen has problems emitting debug info in a situation when dbg.declare appears after all uses of the variable. This usually kinda works for inlining and ASan (two users of this function) but not for SafeStack (see the pending change in http://reviews.llvm.org/D13178). llvm-svn: 248769	2015-09-29 00:30:19 +00:00
Matthias Braun	99ae16217e	RegisterPressure: LiveRegSet tracks register units not physregs There are always more physical registers and register units so the previous behaviour was correct but we can do with less memory. llvm-svn: 248767	2015-09-29 00:20:32 +00:00
Reid Kleckner	c71d6275ca	[WinEH] Fix ip2state table emission with funclets Previously we were hijacking the old LandingPadInfo data structures to communicate our state numbers. Now we don't need that anymore. llvm-svn: 248763	2015-09-28 23:56:30 +00:00
Richard Trieu	e778e87d2a	Fix unused variable warning in non-debug builds. llvm-svn: 248754	2015-09-28 22:54:43 +00:00
Sanjay Patel	4e6527682a	tidy up comments; NFC llvm-svn: 248750	2015-09-28 22:14:51 +00:00
Sanjay Patel	3a14f1a338	add a FIXME for a CPU model check that should have an attribute instead llvm-svn: 248746	2015-09-28 22:00:24 +00:00
Sanjay Patel	5e5f0e9756	move one-use check under the comment that describes it; NFCI llvm-svn: 248745	2015-09-28 21:44:46 +00:00
Sanjoy Das	4f1c45952c	[SCEV] Don't crash on pointer comparisons `ScalarEvolution::isImpliedCondOperandsViaNoOverflow` tries to cast the operand type of the comparison it is given to an `IntegerType`. This is incorrect because it could actually be simplifying a comparison between two pointers. Switch it to using `getTypeSizeInBits` instead, which does the right thing for both pointers and integers. Fixed PR24956. llvm-svn: 248743	2015-09-28 21:14:32 +00:00
Matt Arsenault	ba6aae785a	AMDGPU: Factor switch into separate function llvm-svn: 248742	2015-09-28 20:54:57 +00:00
Matt Arsenault	73aa8f687a	AMDGPU: Fix splitting x16 SMRD loads When used recursively, this would set the kill flag on the intermediate step from first splitting x16 to x8. llvm-svn: 248741	2015-09-28 20:54:52 +00:00
Matt Arsenault	e5d042cd56	AMDGPU: Fix moving SMRD loads with literal offsets on CI llvm-svn: 248740	2015-09-28 20:54:46 +00:00
Matt Arsenault	dd49c5fc1b	AMDGPU: Fix splitting SMRD with large offset The splitting of > 4 dword SMRD instructions if using an offset in an SGPR instead of an immediate was not setting the destination register, resulting an an instruction missing an operand which would assert later. Test will be included in a following commit which fixes a related issue. llvm-svn: 248739	2015-09-28 20:54:42 +00:00
Andrew Kaylor	16c4da03d5	Improved the interface of methods commuting operands, improved X86-FMA3 mem-folding&coalescing. Patch by Slava Klochkov (vyacheslav.n.klochkov@intel.com) Differential Revision: http://reviews.llvm.org/D11370 llvm-svn: 248735	2015-09-28 20:33:22 +00:00
Sean Silva	ace7818ce6	[GlobalOpt] Sort members of llvm.used deterministically Patch by Jake VanAdrighem! Summary: Fix the way we sort the llvm.used and llvm.compiler.used members. This bug seems to have been introduced in rL183756 through a set of improper casts to GlobalValue*. In subsequent patches this problem was missed and transformed into a getName call on a ConstantExpr. Reviewers: silvas Subscribers: silvas, llvm-commits Differential Revision: http://reviews.llvm.org/D12851 llvm-svn: 248728	2015-09-28 19:02:11 +00:00
Fiona Glaser	f74cc40e34	Improve performance of SimplifyInstructionsInBlock 1. Use a worklist, not a recursive approach, to avoid needless revisitation and being repeatedly forced to jump back to the start of the BB if a handle is invalidated. 2. Only insert operands to the worklist if they become unused after a dead instruction is removed, so we don’t have to visit them again in most cases. 3. Use a SmallSetVector to track the worklist. 4. Instead of pre-initting the SmallSetVector like in DeadCodeEliminationPass, only put things into the worklist if they have to be revisited after the first run-through. This minimizes how much the actual SmallSetVector gets used, which saves a lot of time. llvm-svn: 248727	2015-09-28 18:56:07 +00:00
Daniel Sanders	7727e1098c	[mips][p5600] Added P5600 processor and initial scheduler. Summary: The P5600 is an out-of-order, superscalar implementation of the MIPS32R5 architecture. The scheduler has a few missing details (see the 'Tricky Instructions' section and some quirks of the P5600 are deliberately omitted due to implementation difficulty and low chance of significant benefit (e.g. the predicate on P5600WriteEitherALU). However, testing on SingleSource is showing significant performance benefits on some apps (seven in the 10-30% range) and only one significant regression (12%) when -pre-RA-sched=linearize is given. Without -pre-RA-sched=linearize the results are more variable. Some do even better (up to 55% improvement) but increased numbers of copies are slowing others down (up to 12%). Overall, the scheduler as it currently stands is a 2.4% win with -pre-RA-sched=linearize and a 2.7% win without -pre-RA-sched=linearize. I'm sure we can improve on this further. For completeness, the FPGA this was tested on shows some failures with and without the P5600 scheduler. These appear to be scheduling related since the two test runs have fairly different sets of failing tests even after accounting for other factors (e.g. spurious connection failures) however it's not P5600 specific since we also get some for the generic scheduler. Reviewers: vkalintiris Subscribers: mpf, llvm-commits, atrick, vkalintiris Differential Revision: http://reviews.llvm.org/D12193 llvm-svn: 248725	2015-09-28 18:24:08 +00:00
Artur Pilipenko	b4d009042b	Introduce !align metadata for load instruction Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D12853 llvm-svn: 248721	2015-09-28 17:41:08 +00:00
Philip Reames	13f023c09d	[InstSimplify] Fold simple known implications to true This was split off of http://reviews.llvm.org/D13040 to make it easier to test the correctness of the implication logic. For the moment, this only handles a single easy case which shows up when eliminating and combining range checks. In the (near) future, I plan to extend this for other cases which show up in range checks, but I wanted to make those changes incrementally once the framework was in place. At the moment, the implication logic will be used by three places. One in InstSimplify (this review) and two in SimplifyCFG (http://reviews.llvm.org/D13040 & http://reviews.llvm.org/D13070). Can anyone think of other locations this style of reasoning would make sense? Differential Revision: http://reviews.llvm.org/D13074 llvm-svn: 248719	2015-09-28 17:14:24 +00:00
Weiming Zhao	310770a90f	[LoopReroll] Ignore debug intrinsics Originally, debug intrinsics and annotation intrinsics may prevent the loop to be rerolled, now they are ignored. Differential Revision: http://reviews.llvm.org/D13150 llvm-svn: 248718	2015-09-28 17:03:23 +00:00
Dan Gohman	05a17aa82a	[WebAssembly] Support for direct call and call_indirect. llvm-svn: 248716	2015-09-28 16:22:39 +00:00
Zoran Jovanovic	cdb64566cc	[mips] Handling of immediates bigger than 16 bits Differential Revision: http://reviews.llvm.org/D10539 llvm-svn: 248706	2015-09-28 11:11:34 +00:00
Artyom Skrobov	ad8a0638f7	[ARM] Avoid redundant checks for isThumb1Only() after supportsTailCall() supportsTailCall() has two callers. Both of them double-check isThumb1Only(), and refuse to proceed with tail-calling in that case. Therefore, it makes sense to move this check to ARMSubtarget::initSubtargetFeatures, where SupportsTailCall is initialized; and to eliminate the extra checks at the call sites. Following a review comment, added an "assert(supportsTailCall())" in IsEligibleForTailCall. NFC. llvm-svn: 248703	2015-09-28 09:44:11 +00:00
Hal Finkel	bd582581b8	[DAGCombine] Fix getStoreMergeAndAliasCandidates's AA-enabled chain walking When AA is being used, non-aliasing stores are canonicalized to use the same chain, and DAGCombiner::getStoreMergeAndAliasCandidates can take advantage of this by looking only as users of a store's chain operand. However, user iteration is not result-number specific, we need to check that the use is as a chain operand, and not via some other operand. It is certainly possible to have another potentially-aliasing store, which shares the first's base pointer, and uses the first's chain's node via some other operand. Failure to catch this situation caused, at least in the included test case, an assert later because the relative sequence-number ordering caused later replacement to create a cycle in the DAG. llvm-svn: 248698	2015-09-28 08:02:14 +00:00
Craig Topper	862d5d8322	Remove 'const' from some ArrayRefs. ArrayRefs are already immutable. NFC llvm-svn: 248693	2015-09-28 00:15:34 +00:00
Justin Bogner	d7d1a72f66	AsmWriter: Print the argument names in declarations while debugging When llvm declarations have argument names, it's helpful to actually print those names when debugging. Arguably, it'd be nice to print them all the time, but that would mean the IR we output wouldn't round trip through bitcode, which doesn't store the names. Make the varous print() methods in AsmWriter optionally print "for debug" and set that flag in the dump() methods. The only thing this does differently for now is print the argument names in declarations. llvm-svn: 248692	2015-09-27 22:38:50 +00:00
Yaron Keren	e5a9dc2f5b	Silence clang warning: variable ‘Status’ set but not used. llvm-svn: 248691	2015-09-27 21:31:33 +00:00
Sanjoy Das	f1090b6061	[SCEV] identical instructions don't compute equal values Before this change `HasSameValue` would return true for distinct `alloca` instructions if they happened to be allocating the same type (`alloca` instructions are not specified as reading memory). This change adds an explicit whitelist of instruction types for which "identical" instructions compute the same value. Fixes PR24952. llvm-svn: 248690	2015-09-27 21:09:48 +00:00
Sanjay Patel	9533407566	[InstCombine] fold zexts and constants into a phi (PR24766) This is one step towards solving PR24766: https://llvm.org/bugs/show_bug.cgi?id=24766 We were not producing the same IR for these two C functions because the store to the temp bool causes extra zexts: #include <stdbool.h> bool switchy(char x1, char x2, char condition) { bool conditionMet = false; switch (condition) { case 0: conditionMet = (x1 == x2); break; case 1: conditionMet = (x1 <= x2); break; } return conditionMet; } bool switchy2(char x1, char x2, char condition) { switch (condition) { case 0: return (x1 == x2); case 1: return (x1 <= x2); } return false; } As noted in the code comments, this test case manages to avoid the more general existing phi optimizations where there are only 2 phi inputs or where there are no constant phi args mixed in with the casts ops. It seems like a corner case, but if we don't catch it, then I don't think we can get SimplifyCFG to further optimize towards the canonical form for this function shown in the bug report. Differential Revision: http://reviews.llvm.org/D12866 llvm-svn: 248689	2015-09-27 20:34:31 +00:00
Joseph Tremoulet	09af67aba5	[EH] Create removeUnwindEdge utility Summary: Factor the code that rewrites invokes to calls and rewrites WinEH terminators to their "unwind to caller" equivalents into a helper in Utils/Local, and use it in the three places I'm aware of that need to do this. Reviewers: andrew.w.kaylor, majnemer, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13152 llvm-svn: 248677	2015-09-27 01:47:46 +00:00
Benjamin Kramer	2a63631abd	[BranchProbability] Manually round the floating point output. llvm::format compiles down to snprintf which has no defined rounding for floating point arguments, and MSVC has implemented it differently from what the BSD libcs and glibc do. Try to emulate the glibc rounding behavior to avoid changing tests. While there simplify code a bit and move trivial methods inline. llvm-svn: 248665	2015-09-26 10:09:36 +00:00
Matt Arsenault	1d36b717a5	AMDGPU: Remove hasPostISelHook from most instructions Since this is only needed for VOP3 and a few other special case instructions, stop setting it on everything. llvm-svn: 248657	2015-09-26 05:06:48 +00:00
Matt Arsenault	f32481372c	AMDGPU: Switch over reg class size instead of checking all super classes This gets isSGPRClass out of my profile of SIFixSGPRCopies. llvm-svn: 248656	2015-09-26 04:59:04 +00:00
Matt Arsenault	6e28010215	AMDGPU: Don't handle invalid reg classes in helper functions No tests hit these and it would be better to have checks like this explicit where they are used. llvm-svn: 248655	2015-09-26 04:53:30 +00:00
Saleem Abdulrasool	9174623b2d	AMDGPU: address -Winconsistent-missing-override Add missing override. NFC. llvm-svn: 248652	2015-09-26 04:34:52 +00:00
Matt Arsenault	8e1ddf84fe	AMDGPU: Set CopyCost of register classes These require multiple mov instructions to copy, but the default value is that 1 instruction is needed. I'm not sure if this actually changes anything. llvm-svn: 248651	2015-09-26 04:09:34 +00:00
Chen Li	7452d95656	[Bug 24848] Use range metadata to constant fold comparisons between two values Summary: This is the second part of fixing bug 24848 https://llvm.org/bugs/show_bug.cgi?id=24848. If both operands of a comparison have range metadata, they should be used to constant fold the comparison. Reviewers: sanjoy, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13177 llvm-svn: 248650	2015-09-26 03:26:47 +00:00
Matt Arsenault	e98a074c42	AMDGPU: VOP3b definition cleanups llvm-svn: 248647	2015-09-26 02:25:48 +00:00
Matt Arsenault	86095b8dec	AMDGPU: Fix sched model for VOP2b instructions Trying to use the version with the explicit output operand would complain because of the missing WriteSALU. I'm not sure why it doesn't complain about this with the implicit VCC def. llvm-svn: 248646	2015-09-26 02:25:45 +00:00
Dan Gohman	d0bf981296	[WebAssembly] Rename several functions and types according to the new spec. llvm-svn: 248644	2015-09-26 01:09:44 +00:00
Ahmed Bougacha	e81610fabb	[ARM] Don't generate clrex for pre-v7 targets. Since r248294, we emit clrex, but it doesn't exist on v6. llvm-svn: 248640	2015-09-26 00:14:02 +00:00
Sanjoy Das	b174f9a316	[SCEV] Reapply 'Teach isLoopBackedgeGuardedByCond to exploit trip counts' Summary: If the trip count of a specific backedge is `N`, then we know that backedge is effectively guarded by the condition `{0,+,1} u< N`. This change teaches SCEV to use this condition to prove things in `isLoopBackedgeGuardedByCond`. Depends on D12948 Depends on D12949 The original checkin, r248608 had to be backed out due to an issue with a ObjCXX unit test. That issue is now fixed, so re-landing. Reviewers: atrick, reames, majnemer, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12950 llvm-svn: 248638	2015-09-25 23:53:50 +00:00
Sanjoy Das	96709c4854	[SCEV] Reapply 'Exploit A < B => (A+K) < (B+K) when possible' Summary: This change teaches SCEV's `isImpliedCond` two new identities: A u< B u< -C => (A + C) u< (B + C) A s< B s< INT_MIN - C => (A + C) s< (B + C) While these are useful on their own, they're really intended to support D12950. The original checkin, r248606 had to be backed out due to an issue with a ObjCXX unit test. That issue is now fixed, so re-landing. Reviewers: atrick, reames, majnemer, nlewycky, hfinkel Subscribers: aadg, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D12948 llvm-svn: 248637	2015-09-25 23:53:45 +00:00
Matthias Braun	93ab942c24	LivePhysRegs: Fix live-outs of return blocks I realized that the live-out set computed for the return block is missing the callee saved registers (the non-pristine ones to be exact). This only affects the liveness computed for instructions inside the function epilogue which currently none of the LivePhysRegs users in llvm cares about, so this is just a drive-by fix without a testcase. Differential Revision: http://reviews.llvm.org/D13180 llvm-svn: 248636	2015-09-25 23:50:53 +00:00
Sanjay Patel	e1b09caaaf	[InstCombine] match De Morgan's Law hidden by zext ops (PR22723) This is a fix for PR22723: https://llvm.org/bugs/show_bug.cgi?id=22723 My first attempt at this was to change what I thought was the root problem: xor (zext i1 X to i32), 1 --> zext (xor i1 X, true) to i32 ...but we create the opposite pattern in InstCombiner::visitZExt(), so infinite loop! My next idea was to fix the matchIfNot() implementation in PatternMatch, but that would mean potentially returning a different size for the match than what was input. I think this would require all users of m_Not to check the size of the returned match, so I abandoned that idea. I settled on just fixing the exact case presented in the PR. This patch does allow the 2 functions in PR22723 to compile identically (x86): bool test(bool x, bool y) { return !x \| !y; } bool test(bool x, bool y) { return !x \|\| !y; } ... andb %sil, %dil xorb $1, %dil movb %dil, %al retq Differential Revision: http://reviews.llvm.org/D12705 llvm-svn: 248634	2015-09-25 23:21:38 +00:00
Cong Hou	15ea016346	Use fixed-point representation for BranchProbability. BranchProbability now is represented by its numerator and denominator in uint32_t type. This patch changes this representation into a fixed point that is represented by the numerator in uint32_t type and a constant denominator 1<<31. This is quite similar to the representation of BlockMass in BlockFrequencyInfoImpl.h. There are several pros and cons of this change: Pros: 1. It uses only a half space of the current one. 2. Some operations are much faster like plus, subtraction, comparison, and scaling by an integer. Cons: 1. Constructing a probability using arbitrary numerator and denominator needs additional calculations. 2. It is a little less precise than before as we use a fixed denominator. For example, 1 - 1/3 may not be exactly identical to 1 / 3 (this will lead to many BranchProbability unit test failures). This should not matter when we only use it for branch probability. If we use it like a rational value for some precise calculations we may need another construct like ValueRatio. One important reason for this change is that we propose to store branch probabilities instead of edge weights in MachineBasicBlock. We also want clients to use probability instead of weight when adding successors to a MBB. The current BranchProbability has more space which may be a concern. Differential revision: http://reviews.llvm.org/D12603 llvm-svn: 248633	2015-09-25 23:09:59 +00:00
Matthias Braun	a3b701f828	SelectionDAGDumper: Print simple operands inline. Print simple operands inline instead of their pointer/value number. Simple operands are SDNodes without predecessors like Constant(FP), Register, UNDEF. This unifies the behaviour with dumpr() which was already doing this. Previously: t0: ch = EntryToken t1: i64 = Register %vreg0 t2: i64,ch = CopyFromReg t0, t1 t3: i64 = Constant<1> t4: i64 = add t2, t3 t5: i64 = Constant<2> t6: i64 = add t2, t5 t10: i64 = undef t11: i8,ch = load t0, t2, t10<LD1[%tmp81]> t12: i8,ch = load t0, t4, t10<LD1[%tmp10]> t13: i8,ch = load t0, t6, t10<LD1[%tmp12]> Now: t0: ch = EntryToken t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0 t4: i64 = add t2, Constant:i64<1> t6: i64 = add t2, Constant:i64<2> t11: i8,ch = load<LD1[%tmp81]> t0, t2, undef:i64 t12: i8,ch = load<LD1[%tmp10]> t0, t4, undef:i64 t13: i8,ch = load<LD1[%tmp12]> t0, t6, undef:i64 Differential Revision: http://reviews.llvm.org/D12567 llvm-svn: 248628	2015-09-25 22:27:02 +00:00
Matt Arsenault	e229c0c45e	AMDGPU: Construct new buffer instruction when moving SMRD It's easier to understand creating a full instruction than the current situation where sometimes a new instruction is created and sometimes it is awkwardly mutated in place. llvm-svn: 248627	2015-09-25 22:21:19 +00:00
Matt Arsenault	3c07e963b8	DAGCombiner: Check if store is volatile first This is the simpler check. NFC. llvm-svn: 248625	2015-09-25 22:06:19 +00:00
Matthias Braun	c804cdb912	TargetRegisterInfo: Introduce PrintLaneMask. This makes it more convenient to print lane masks and lead to more uniform printing. llvm-svn: 248624	2015-09-25 21:51:24 +00:00
Matthias Braun	e6a2485e1a	TargetRegisterInfo: Add typedef unsigned LaneBitmask and use it where apropriate; NFC llvm-svn: 248623	2015-09-25 21:51:14 +00:00
Sanjay Patel	bbbf9a1a34	merge vector stores into wider vector stores and fix AArch64 misaligned access TLI hook (PR21711) This is a redo of D7208 ( r227242 - http://llvm.org/viewvc/llvm-project?view=revision&revision=227242 ). The patch was reverted because an AArch64 target could infinite loop after the change in DAGCombiner to merge vector stores. That happened because AArch64's allowsMisalignedMemoryAccesses() wasn't telling the truth. It reported all unaligned memory accesses as fast, but then split some 128-bit unaligned accesses up in performSTORECombine() because they are slow. This patch attempts to fix the problem in AArch's allowsMisalignedMemoryAccesses() while preserving existing (perhaps questionable) lowering behavior. The x86 test shows that store merging is working as intended for a target with fast 32-byte unaligned stores. Differential Revision: http://reviews.llvm.org/D12635 llvm-svn: 248622	2015-09-25 21:49:48 +00:00
Matthias Braun	e86bbd8979	PrologueEpilogInserter: Fix missing live-ins when savepoint equals restorepoint The algorithm would not modify the live-in list of blocks below the save block point which is correct unless it happens to be a restore point at the same time. Also fixes the benign issue of live-in registers being added twice in some cases. The testcase is based on a test submitted by Kit Barton. Differential Revision: http://reviews.llvm.org/D13176 llvm-svn: 248620	2015-09-25 21:41:40 +00:00
Tom Stellard	e135ffd554	AMDGPU/SI: Use .hsatext section instead of .text for HSA Reviewers: arsenm, grosbach, rafael Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D12424 llvm-svn: 248619	2015-09-25 21:41:28 +00:00
Tom Stellard	8e0257625d	MCAsmInfo: Allow targets to specify when the .section directive should be omitted Summary: The default behavior is to omit the .section directive for .text, .data, and sometimes .bss, but some targets may want to omit this directive for other sections too. The AMDGPU backend will uses this to emit a simplified syntax for section switches. For example if the section directive is not omitted (current behavior), section switches to .hsatext will be printed like this: .section .hsatext,#alloc,#execinstr,#write This is actually wrong, because .hsatext has some custom STT_* flags, which MC doesn't know how to print or parse. If the section directive is omitted (made possible by this commit), section switches will be printed like this: .hsatext The motivation for this patch is to make it possible to emit sections with custom STT_* flags without having to teach MC about all the target specific STT_* flags. Reviewers: rafael, grosbach Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12423 llvm-svn: 248618	2015-09-25 21:41:14 +00:00
Matthias Braun	c2d4befb54	MachineBasicBlock: Factor out common code into isReturnBlock() llvm-svn: 248617	2015-09-25 21:25:19 +00:00
Sanjoy Das	4a39b97671	Revert two SCEV changes that caused test failures in clang. r248606: "[SCEV] Exploit A < B => (A+K) < (B+K) when possible" r248608: "[SCEV] Teach isLoopBackedgeGuardedByCond to exploit trip counts." llvm-svn: 248614	2015-09-25 21:16:50 +00:00
Justin Bogner	0638b7ba99	ADCE: Fix typo in file comment. NFC llvm-svn: 248613	2015-09-25 21:03:46 +00:00
Matt Arsenault	10aa807856	PeepholeOptimizer: Remove redundant copies If a virtual register is copied and another copy was already seen, replace with the previous copy. This only handles the simplest cases for now. This pattern shows up from various operand restrictions AMDGPU has which require inserting copies depending on the register class of the operands. llvm-svn: 248611	2015-09-25 20:22:12 +00:00
Chad Rosier	d9f102b464	Simplify code. NFC. llvm-svn: 248610	2015-09-25 20:20:22 +00:00
Sanjay Patel	a67559c106	more space; NFC llvm-svn: 248609	2015-09-25 20:12:43 +00:00
Sanjoy Das	d706fa8a0c	[SCEV] Teach isLoopBackedgeGuardedByCond to exploit trip counts. Summary: If the trip count of a specific backedge is `N`, then we know that backedge is effectively guarded by the condition `{0,+,1} u< N`. This change teaches SCEV to use this condition to prove things in `isLoopBackedgeGuardedByCond`. Depends on D12948 Depends on D12949 Reviewers: atrick, reames, majnemer, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12950 llvm-svn: 248608	2015-09-25 19:59:57 +00:00
Sanjoy Das	df1635d394	[SCEV] Extract helper function from isImpliedCond; NFC Summary: This new helper routine will be used in a subsequent change. Reviewers: hfinkel Subscribers: hfinkel, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D12949 llvm-svn: 248607	2015-09-25 19:59:52 +00:00
Sanjoy Das	fdec9deb13	[SCEV] Exploit A < B => (A+K) < (B+K) when possible Summary: This change teaches SCEV's `isImpliedCond` two new identities: A u< B u< -C => (A + C) u< (B + C) A s< B s< INT_MIN - C => (A + C) s< (B + C) While these are useful on their own, they're really intended to support D12950. Reviewers: atrick, reames, majnemer, nlewycky, hfinkel Subscribers: aadg, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D12948 llvm-svn: 248606	2015-09-25 19:59:49 +00:00
Matt Arsenault	f743b838cb	AMDGPU: Make getNamedOperandIdx declaration readonly This matches how it is defined in the generated implementation. llvm-svn: 248598	2015-09-25 18:09:15 +00:00
Chad Rosier	1bbd7fb38e	[AArch64] Add support for generating pre- and post-index load/store pairs. llvm-svn: 248593	2015-09-25 17:48:17 +00:00
Matt Arsenault	0a10900070	AMDGPU: Disable some passes that are not meaningful Don't run passes related to stack maps, garbage collection, exceptions since these aren't useful for GPUs. There might be a few more to turn off that I'm less sure about (e.g. ShrinkWrapping) or I'm not sure how to disable (SafeStack and StackProtector) llvm-svn: 248591	2015-09-25 17:41:20 +00:00
Matt Arsenault	4bf43d4e68	AMDGPU: Handle i64->v2i32 loads/stores in PreprocessISelDAG This fixes a select error when the i64 source was also bitcasted to v2i32 in the original source. Instead of awkwardly trying to select the modified source value and the store, replace before isel begins. Uses a worklist to avoid possible problems from mutating the DAG, although it seems to work OK without it. llvm-svn: 248589	2015-09-25 17:27:08 +00:00
Matt Arsenault	0cb8517dc6	AMDGPU: Fix recomputing dominator tree unnecessarily SIFixSGPRCopies does not modify the CFG, but this was being recomputed before running SIFoldOperands. llvm-svn: 248587	2015-09-25 17:21:28 +00:00
Matt Arsenault	2d6fdb8495	AMDGPU: Re-justify workaround and fix worked around problem When buffer resource descriptors were built, the upper two components of the descriptor were first composed into a 64-bit register because legalizeOperands assumed all operands had the same register class. Fix that problem, but keep the workaround. I'm not sure anything actually is actually emitting such a REG_SEQUENCE now. If multiple resource descriptors are set up with different base pointers, this is copied with a single s_mov_b64. We probably should fix this better by recognizing a pair of s_mov_b32 later, but for now delete the dead code. llvm-svn: 248585	2015-09-25 17:08:42 +00:00
Matt Arsenault	3ad55ec946	AMDGPU: Don't create REG_SEQUENCE with SGPR dest and VGPR sources This avoids needting to re-legalize the new REG_SEQUENCE. llvm-svn: 248584	2015-09-25 17:08:40 +00:00
Matt Arsenault	6525aa3529	AMDGPU: Fix not adding exec to defs of cmpx instruction pseudos This was only set on the final _si/_vi version, but not on the pseudos most of codegen sees. No test since these instructions aren't used yet. llvm-svn: 248583	2015-09-25 16:58:27 +00:00
Matt Arsenault	5f70436c49	AMDGPU: Improve accuracy of instruction rates for VOPC These were all using the default 32-bit VALU write class, but the i64/f64 compares are half rate. I'm not sure this is really correct, because they are still using the write to VALU write class, even though they really write to the SALU. llvm-svn: 248582	2015-09-25 16:58:25 +00:00
James Molloy	eb46641c28	[GlobalsAA] Teach GlobalsAA about nocapture Arguments to function calls marked "nocapture" can be marked as non-escaping. However, nocapture is defined in terms of the lifetime of the callee, and if the callee can directly or indirectly recurse to the caller, the semantics of nocapture are invalid. Therefore, we eagerly discover which SCC each function belongs to, and later can check if callee and caller of a callsite belong to the same SCC, in which case there could be recursion. This means that we can't be so optimistic in getModRefInfo(ImmutableCallsite) - previously we assumed all call arguments never aliased with an escaping global. Now we need to check, because a global could now be passed as an argument but still not escape. This also solves a related conformance problem: MemCpyOptimizer can turn non-escaping stores of globals into calls to intrinsics like llvm.memcpy/llvm/memset. This confuses GlobalsAA, which knows the global can't escape and so returns NoModRef when queried, when obviously a memcpy/memset call does indeed reference and modify its arguments. This fixes PR24800, PR24801, and PR24802. llvm-svn: 248576	2015-09-25 15:39:29 +00:00
Saleem Abdulrasool	8e99f50768	ARM: make -Asserts,-Werror=unused-variable build happy The value was only used in an assertion. Sink the variable usage into the assertion. llvm-svn: 248562	2015-09-25 05:41:02 +00:00
Saleem Abdulrasool	fe83b50289	ARM: address WoA division limitation We now emit the compiler generated divide by zero check that was needed for the MSVC routines. We construct a psuedo-instruction for the DBZ check as the operation requires splitting up the BB. For the 64-bit operations, we need to custom expand the node as we need to insert the DBZ check and then emit the libcall to the appropriate name. Because this is target specific, it seemed better to reproduce the expansion operation from the target-agnostic type legalization rather than sink this there to avoid the duplication. The division library calls now match MSVC semantically. llvm-svn: 248561	2015-09-25 05:15:46 +00:00
Matt Arsenault	8aa9973696	AMDGPU: Remove unused includes llvm-svn: 248553	2015-09-25 00:28:43 +00:00
Sanjoy Das	b513a9fa4f	[Bitcode][Asm] Teach LLVM to read and write operand bundles. Summary: This also adds the first set of tests for operand bundles. The optimizer has not been audited to ensure that it does the right thing with operand bundles. Depends on D12456. Reviewers: reames, chandlerc, majnemer, dexonsmith, kmod, JosephTremoulet, rnk, bogner Subscribers: maksfb, llvm-commits Differential Revision: http://reviews.llvm.org/D12457 llvm-svn: 248551	2015-09-24 23:34:52 +00:00
Matt Arsenault	50f0a42b66	Fix typo llvm-svn: 248549	2015-09-24 22:36:49 +00:00
Chad Rosier	b02f5a5a1f	[AArch64] Improve the readability of the ld/st optimization pass. NFC. In this context, MI is an add/sub instruction not a loads/store. llvm-svn: 248540	2015-09-24 21:27:49 +00:00
Simon Pilgrim	68d0050c6a	[X86][SSE2] Fix zero/any extension shuffles that don't start from the first element Fix for D12561 - we weren't correctly ensuring that the base element for extension was moved to start on a boundary suitable for UNPCKL/H llvm-svn: 248536	2015-09-24 21:02:17 +00:00
Matt Arsenault	e66621b306	AMDGPU: Add s_dcache_* instructions llvm-svn: 248533	2015-09-24 19:52:27 +00:00
Matt Arsenault	d6adfb401c	AMDGPU: Add cache invalidation instructions. These are necessary for implementing mem_fence for OpenCL 2.0. The VI assembler tests are disabled since it seems to be using the wrong encoding or opcode. llvm-svn: 248532	2015-09-24 19:52:21 +00:00
Chad Rosier	7cd472b719	[AArch64] The paired post-increment store instruction has an output register. The pre- and post-increment version update the base register, but the post- version was defined incorrectly. There is no test case as we don't currently generate these instructions, but I plan on changing that in the near future. llvm-svn: 248528	2015-09-24 19:21:42 +00:00
Sanjoy Das	9303c24650	[IR] Add operand bundles to CallInst and InvokeInst. Summary: This change teaches `CallInst`s and `InvokeInst`s to maintain a set of operand bundles as part of its operands. `CallInst`s and `InvokeInst`s with operand bundles co-allocate some space before their `Use` array to hold meta information about which of its operands are part of an operand bundle. The strings corresponding to the bundle tags are interned into `LLVMContextImpl::BundleTagCache` This change does not include any parsing / bitcode support. That's the next change. Depends on D12455. Reviewers: reames, chandlerc, majnemer, dexonsmith, kmod, JosephTremoulet, rnk, bogner Subscribers: MatzeB, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D12456 llvm-svn: 248527	2015-09-24 19:14:18 +00:00
Artyom Skrobov	cf296444ab	[ARM] Handle +t2dsp feature as an ArchExtKind in ARMTargetParser.def Currently, the availability of DSP instructions (ACLE 6.4.7) is handled in a hand-rolled tricky condition block in tools/clang/lib/Basic/Targets.cpp, with a FIXME: attached. This patch changes the handling of +t2dsp to be in line with other architecture extensions. Following a revert of r248152 and new review comments, this patch also includes renaming FeatureDSPThumb2 -> FeatureDSP, hasThumb2DSP() -> hasDSP(), etc. The spelling of "t2dsp" is preserved, pending a further investigation of its possible external usage. Differential Revision: http://reviews.llvm.org/D12937 llvm-svn: 248519	2015-09-24 17:31:16 +00:00
James Molloy	b6be1ebb7d	[ValueTracking] Teach isKnownNonZero a new trick If the shifter operand is a constant, and all of the bits shifted out are known to be zero, then if X is known non-zero at least one non-zero bit must remain. llvm-svn: 248508	2015-09-24 16:06:32 +00:00
Daniel Sanders	090f6e41c4	[mips] Use PredicateControl for the MSA ASE instructions. NFC. Reviewers: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13092 llvm-svn: 248486	2015-09-24 12:10:23 +00:00
Mohammad Shahid	13f1dfdf2e	Codegen: Fix llvm.absdiff semantic. Fixes the overflow case of llvm.absdiff intrinsic also updats the tests and LangRef.rst accordingly. Differential Revision: http://reviews.llvm.org/D11678 llvm-svn: 248483	2015-09-24 10:35:03 +00:00
Charlie Turner	2720593ab4	[InstCombine] Recognize another bswap idiom. Summary: The byte-swap recognizer can now notice that this ``` uint32_t bswap(uint32_t x) { x = (x & 0x0000FFFF) << 16 \| (x & 0xFFFF0000) >> 16; x = (x & 0x00FF00FF) << 8 \| (x & 0xFF00FF00) >> 8; return x; } ``` is a bswap. Fixes PR23863. Reviewers: nlewycky, hfinkel, hans, jmolloy, rengolin Subscribers: majnemer, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D12637 llvm-svn: 248482	2015-09-24 10:24:58 +00:00
Matt Arsenault	68d938649e	Introduce target hook for optimizing register copies Allow a target to do something other than search for copies that will avoid cross register bank copies. Implement for SI by only rewriting the most basic copies, so it should look through anything like a subregister extract. I'm not entirely satisified with this because it seems like eliminating a reg_sequence that isn't fully used should work generically for all targets without them having to override something. However, it seems to be tricky to have a simple implementation of this without rewriting to invalid kinds of subregister copies on some targets. I'm not sure if there is currently a generic way to easily check if a subregister index would be valid for the current use. The current set of TargetRegisterInfo::get*Class functions don't quite behave like I would expect (e.g. getSubClassWithSubReg returns the maximal register class rather than the minimal), so I'm not sure how to make the generic test keep searching if SrcRC:SrcSubReg is a valid replacement for DefRC:DefSubReg. Making the default implementation to check for simple copies breaks a variety of ARM and x86 tests by producing illegal subregister uses. The ARM tests are not actually changed since it should still be using the same sharesSameRegisterFile implementation, this just relaxes them to not check for specific registers. llvm-svn: 248478	2015-09-24 08:36:14 +00:00
Matt Arsenault	e068f9a263	AMDGPU: Return after instruction is processed. llvm-svn: 248476	2015-09-24 07:51:28 +00:00
Matt Arsenault	708586faa2	AMDGPU: Remove another unnecessary check from commuteInstruction llvm-svn: 248475	2015-09-24 07:51:25 +00:00
Matt Arsenault	fa242960fc	AMDGPU: Add readonly to InstrMapping functions llvm-svn: 248474	2015-09-24 07:51:23 +00:00
Matt Arsenault	cab64f1c75	AMDGPU: Fix printing trailing whitespace for mubuf atomics llvm-svn: 248472	2015-09-24 07:51:17 +00:00
Matt Arsenault	c7ec46c3aa	Remove dead declaration llvm-svn: 248471	2015-09-24 07:51:12 +00:00
Matt Arsenault	c721df0478	Use new TokenFactor chain when merging stores If the stores are storing values from loads which partially alias the stores, we could end up placing the merged loads and stores on the same chain which has the potential to break. Each store may have a different chain dependency on only some of the original loads. Create a new TokenFactor to capture all of the required dependencies of the stores rather than assuming all stores can use the same chain. The testcase is a situation where this happens, although it does not have an observable change from this. The DAG nodes just happened to not be reordered before despite this missing chain dependency. This is based on an off-list report for an out of tree target which regressed due to r246307 and I haven't managed to find a case where the nodes do end up reordered with an in tree target. llvm-svn: 248468	2015-09-24 07:22:38 +00:00
Matt Arsenault	c8e2ce4046	AMDGPU: Reduce number of copies emitted Instead of always inserting a copy in case the super register is itself a subregister, only extract to the super reg class if this is actually the case. This shouldn't really change codegen, but makes looking at the output of SIFixSGPRCopies easier to read. llvm-svn: 248467	2015-09-24 07:16:37 +00:00
Justin Bogner	abdcb3c1b3	Fix a think-o in which functions these should surround llvm-svn: 248465	2015-09-24 05:29:31 +00:00
Justin Bogner	aa57ac5d96	Add some NDEBUG checks I accidentally dropped in r248462 llvm-svn: 248464	2015-09-24 05:20:04 +00:00
Justin Bogner	49655f806f	BasicAA: Move BasicAAResult::alias out-of-line. NFC This makes the header more readable and cleans up some unnecessary header differences between NDEBUG and !NDEBUG. llvm-svn: 248462	2015-09-24 04:59:24 +00:00
Michael Zolotukhin	74621cced7	Add CFG Simplification pass after Loop Unswitching. Loop unswitching produces conditional branches with constant condition, and it's beneficial for later passes to clean this up with simplify-cfg. We do this after the second invocation of loop-unswitch, but not after the first one. Not doing so might cause problem for passes like LoopUnroll, whose estimate of loop body size would be less accurate. Reviewers: hfinkel Differential Revision: http://reviews.llvm.org/D13064 llvm-svn: 248460	2015-09-24 03:50:17 +00:00
Evgeniy Stepanov	8685daf23e	[safestack] Fix compiler crash in the presence of stack restores. A use can be emitted before def in a function with stack restore points but no static allocas. llvm-svn: 248455	2015-09-24 01:23:51 +00:00
Sanjoy Das	b07fc572bd	[IR] Teach `llvm::User` to co-allocate a descriptor. Summary: With this change, subclasses of `llvm::User` will be able to co-allocate a variable number of bytes (called a "descriptor") with the `llvm::User` instance. The co-allocated descriptor can later be accessed using `llvm::User::getDescriptor`. This will be used in later changes to implement operand bundles. This change steals one bit from `NumUserOperands`, but given that it is still 28 bits wide I don't think this will be a practical issue. This change does not allow allocating hung off uses with descriptors. This only for simplicity, not for any fundamental reason; and we can easily add this functionality later if needed. Reviewers: reames, chandlerc, dexonsmith, kmod, majnemer, pete, JosephTremoulet Subscribers: pete, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D12455 llvm-svn: 248453	2015-09-24 01:00:49 +00:00
Michael Zolotukhin	d56ee06d1f	[Unroll] When completely unrolling the loop, replace conditinal branches with unconditional. Nothing is expected to change, except we do less redundant work in clean-up. Reviewers: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12951 llvm-svn: 248444	2015-09-23 23:12:43 +00:00
Wei Mi	3cc9204a52	Put profile variables of COMDAT functions to it's own COMDAT group. In -fprofile-instr-generate compilation, to remove the redundant profile variables for the COMDAT functions, these variables are placed in the same COMDAT group as its associated function. This way when the COMDAT function is not picked by the linker, those profile variables will also not be output in the final binary. This may cause warning when mix link objects built w and wo -fprofile-instr-generate. This patch puts the profile variables for COMDAT functions to its own COMDAT group to avoid the problem. Patch by xur. Differential Revision: http://reviews.llvm.org/D12248 llvm-svn: 248440	2015-09-23 22:40:45 +00:00
Tim Northover	beb5bccf88	ARM: fix folding stack adjustment (again again again...) This time, the issue is that we weren't accounting for the possibility that aligned DPRs could have been stored after the final "push" in a prologue. When that happened we effectively moved a "sub sp, #N" from below the aligned stores to above them, and everything went to pot. To make it worse, I'd actually committed something testing that we produced wrong code, so the test update is tiny. llvm-svn: 248437	2015-09-23 22:21:09 +00:00
Philip Reames	d63df5107e	Remove handling of AddrSpaceCast in stripAndAccumulateInBoundsConstantOffsets Patch by: simoncook Unlike BitCasts, AddrSpaceCasts do not always produce an output the same size as its input, which was previously assumed. This fixes cases where two address spaces do not have the same size pointer, as an assertion failure would occur when trying to prove deferenceability. LoopUnswitch is used in the particular test, but LICM also exhibits the same problem. Differential Revision: http://reviews.llvm.org/D13008 llvm-svn: 248422	2015-09-23 19:48:43 +00:00
Lawrence Hu	cac0b89289	Swap loop invariant GEP with loop variant GEP to allow more LICM. This patch changes the order of GEPs generated by Splitting GEPs pass, specially when one of the GEPs has constant and the base is loop invariant, then we will generate the GEP with constant first when beneficial, to expose more cases for LICM. If originally Splitting GEP generate the following: do.body.i: %idxprom.i = sext i32 %shr.i to i64 %2 = bitcast %typeD* %s to i8* %3 = shl i64 %idxprom.i, 2 %uglygep = getelementptr i8, i8* %2, i64 %3 %uglygep7 = getelementptr i8, i8* %uglygep, i64 1032 ... Now it genereates: do.body.i: %idxprom.i = sext i32 %shr.i to i64 %2 = bitcast %typeD* %s to i8* %3 = shl i64 %idxprom.i, 2 %uglygep = getelementptr i8, i8* %2, i64 1032 %uglygep7 = getelementptr i8, i8* %uglygep, i64 %3 ... For no-loop cases, the original way of generating GEPs seems to expose more CSE cases, so we don't change the logic for no-loop cases, and only limit our change to the specific case we are interested in. llvm-svn: 248420	2015-09-23 19:25:30 +00:00
Akira Hatanaka	f6afd11538	[InstCombine] Preserve metadata when merging loads that are phi arguments. Make sure InstCombiner::FoldPHIArgLoadIntoPHI doesn't drop the following metadata: MD_tbaa MD_alias_scope MD_noalias MD_invariant_load MD_nonnull MD_range rdar://problem/17617709 Differential Revision: http://reviews.llvm.org/D12710 llvm-svn: 248419	2015-09-23 18:40:57 +00:00
Sanjay Patel	1a6534661b	[x86] replace integer 'xor' ops with packed SSE FP 'xor' ops when operating on FP scalars Turn this: movd %xmm0, %eax movd %xmm1, %ecx xorl %eax, %ecx movd %ecx, %xmm0 into this: xorps %xmm1, %xmm0 This is related to, but does not solve: https://llvm.org/bugs/show_bug.cgi?id=22428 This is an extension of: http://reviews.llvm.org/rL248395 llvm-svn: 248415	2015-09-23 18:33:42 +00:00
Sanjay Patel	aba37553c4	[x86] replace integer 'or' ops with packed SSE FP 'or' ops when operating on FP scalars Turn this: movd %xmm0, %eax movd %xmm1, %ecx orl %eax, %ecx movd %ecx, %xmm0 into this: orps %xmm1, %xmm0 This is related to, but does not solve: https://llvm.org/bugs/show_bug.cgi?id=22428 This is an extension of: http://reviews.llvm.org/rL248395 llvm-svn: 248409	2015-09-23 18:19:07 +00:00
Evgeniy Stepanov	a2002b08f7	Android support for SafeStack. Add two new ways of accessing the unsafe stack pointer: * At a fixed offset from the thread TLS base. This is very similar to StackProtector cookies, but we plan to extend it to other backends (ARM in particular) soon. Bionic-side implementation here: https://android-review.googlesource.com/170988. * Via a function call, as a fallback for platforms that provide neither a fixed TLS slot, nor a reasonable TLS implementation (i.e. not emutls). This is a re-commit of a change in r248357 that was reverted in r248358. llvm-svn: 248405	2015-09-23 18:07:56 +00:00
Sanjay Patel	b14ecd34f7	move call to convertIntLogicToFPLogic up; NFCI The BEXTR comments didn't make sense before, we may want to extend the FP logic transform to work on vectors, and this way is more beautiful. llvm-svn: 248404	2015-09-23 18:03:37 +00:00
Chen Li	5cd6deeae3	[Bug 24848] Use range metadata to constant fold comparisons with constant values Summary: This is the first part of fixing bug 24848 https://llvm.org/bugs/show_bug.cgi?id=24848. When range metadata is provided, it should be used to constant fold comparisons with constant values. Reviewers: sanjoy, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12988 llvm-svn: 248402	2015-09-23 17:58:44 +00:00
Sanjay Patel	ade3abd2d9	[x86] move code for converting int logic to FP logic to a helper function; NFCI This is a follow-on to: http://reviews.llvm.org/rL248395 so we can add the call to the or/xor combines too. llvm-svn: 248399	2015-09-23 17:39:41 +00:00
Sanjay Patel	df2495f331	[x86] replace integer 'and' ops with packed SSE FP 'and' ops when operating on FP scalars Turn this: movd %xmm0, %eax movd %xmm1, %ecx andl %eax, %ecx movd %ecx, %xmm0 into this: andps %xmm1, %xmm0 This is related to, but does not solve: https://llvm.org/bugs/show_bug.cgi?id=22428 Differential Revision: http://reviews.llvm.org/D13065 llvm-svn: 248395	2015-09-23 17:00:06 +00:00
Dan Gohman	979840d31f	[WebAssembly] Fix hasAddr64 being used before being initializer. This reverts r248388 and fixes the underlying bug: hasAddr64 was initialized in runOnMachineFunction, but runOnMachineFunction isn't ever called in CodeGen/WebAssembly/global.ll since that testcase has no functions. The fix here is to use AsmPrinter's getPointerSize() as needed to determine the pointer size instead. llvm-svn: 248394	2015-09-23 16:59:10 +00:00
Vedant Kumar	ff08e926ba	[Inline] Use AssumptionCache from the right Function This changes the behavior of AddAligntmentAssumptions to match its comment. I.e, prove the asserted alignment in the context of the caller, not the callee. Thanks to Mehdi Amini for seeing the issue here! Also to Artur Pilipenko who also saw a fix for the issue. rdar://22521387 Differential Revision: http://reviews.llvm.org/D12997 llvm-svn: 248390	2015-09-23 15:49:08 +00:00
Alexander Kornienko	a3eaa204e6	Fix CodeGen/WebAssembly/global.ll test under ASAN. llvm-svn: 248388	2015-09-23 15:41:25 +00:00
David Majnemer	fa36bde2f6	[DeadArgElim] Split the invoke successor edge Invoking a function which returns an aggregate can sometimes be transformed to return a scalar value. However, this means that we need to create an insertvalue instruction(s) to recreate the correct aggregate type. We achieved this by inserting an insertvalue instruction at the invoke's normal successor. However, this is not feasible if the normal successor uses the invoke's return value inside a PHI node. Instead, split the edge between the invoke and the unwind successor and create the insertvalue instruction in the new basic block. The new basic block's successor will be the old invoke successor which leaves us with IR which is well behaved. This fixes PR24906. llvm-svn: 248387	2015-09-23 15:41:09 +00:00
Chad Rosier	2dfd35499e	[AArch64] Refactor pre- and post-index merge fuctions into a single function. NFC. llvm-svn: 248377	2015-09-23 13:51:44 +00:00
Igor Laevsky	029bd93c5d	[DeadStoreElimination] Remove dead zero store to calloc initialized memory This change allows dead store elimination to remove zero and null stores into memory freshly allocated with calloc-like function. Differential Revision: http://reviews.llvm.org/D13021 llvm-svn: 248374	2015-09-23 11:38:44 +00:00
Oliver Stannard	f2ed5c68d2	[ARM] Add option to force fast-isel The ARM backend has some logic that only allows the fast-isel to be enabled for subtargets where it is known to be stable. This adds a backend option to override this and force the fast-isel to be used for any target, to allow it to be tested. This is an ARM-specific option, because no other backend disables the fast-isel on a per-subtarget basis. llvm-svn: 248369	2015-09-23 09:19:54 +00:00
Simon Pilgrim	9cb018b6b6	[X86][SSE] Replace 128-bit SSE41 PMOVSX intrinsics with native IR This patches removes the x86.sse41.pmovsx* intrinsics, provides a suitable upgrade path and updates relevant tests to sign extend a subvector instead. LLVM counterpart to D12835 Differential Revision: http://reviews.llvm.org/D13002 llvm-svn: 248368	2015-09-23 08:48:33 +00:00
Sanjoy Das	2aacc0ecca	[SCEV] Introduce ScalarEvolution::getOne and getZero. Summary: It is fairly common to call SE->getConstant(Ty, 0) or SE->getConstant(Ty, 1); this change makes such uses a little bit briefer. I've refactored the call sites I could find easily to use getZero / getOne. Reviewers: hfinkel, majnemer, reames Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D12947 llvm-svn: 248362	2015-09-23 01:59:04 +00:00
Evgeniy Stepanov	8d0e3011d8	Revert "Android support for SafeStack." test/Transforms/SafeStack/abi.ll breaks when target is not supported; needs refactoring. llvm-svn: 248358	2015-09-23 01:23:22 +00:00
Evgeniy Stepanov	ce2e16f00c	Android support for SafeStack. Add two new ways of accessing the unsafe stack pointer: * At a fixed offset from the thread TLS base. This is very similar to StackProtector cookies, but we plan to extend it to other backends (ARM in particular) soon. Bionic-side implementation here: https://android-review.googlesource.com/170988. * Via a function call, as a fallback for platforms that provide neither a fixed TLS slot, nor a reasonable TLS implementation (i.e. not emutls). llvm-svn: 248357	2015-09-23 01:03:51 +00:00
Cong Hou	9def6efd7e	Fixed an issue on updating profile data when lowering switch statement. Fixed the issue that when there is an edge from the jump table to the default statement, we should check it directly instead of checking if the sibling of the jump table header is a successor of the jump table header, which may not be the default statment but a successor of it. llvm-svn: 248354	2015-09-23 00:20:27 +00:00
Adrian Prantl	77fefeba37	Debug Info: Emit the dwo_name only in skeleton CUs, not in DWOs. llvm-svn: 248340	2015-09-22 23:21:00 +00:00
Matthias Braun	73e4221e6c	LiveIntervalAnalysis: Avoid multiple connected liveness components We may have subregister defs which are unused but not discovered and cleaned up prior to liveness analysis. This creates multiple connected components in the resulting live range which are forbidden in the MachineVerifier because they would unnecesarily constrain the register allocator. Rewrite those dead definitions to define a newly created virtual register. Differential Revision: http://reviews.llvm.org/D13035 llvm-svn: 248335	2015-09-22 22:37:44 +00:00
Matthias Braun	5efe871971	LiveInterval: Distribute subregister liveranges to new intervals in ConnectedVNInfoEqClasses::Distribute() This improves ConnectedVNInfoEqClasses::Distribute() to distribute the segments and value numbers in the subranges instead of conservatively clearing all subregister info. No separate test here, just clearing the subrange instead of properly distributing them would however break my upcoming fix regarding dead super register definitions. Differential Revision: http://reviews.llvm.org/D13075 llvm-svn: 248334	2015-09-22 22:37:42 +00:00
Michael Zolotukhin	deade19630	[Unroll] Do not crash trying to propagate a value to vector load. llvm-svn: 248333	2015-09-22 22:27:12 +00:00
Michael Zolotukhin	8bb31dd08a	[Unroll] Follow-up for r247769: fix a bug in UnrolledInstAnalyzer::visitLoad. Apart from checking that GlobalVariable is a constant, we should check that it's not a weak constant, in which case we can't propagate its value. llvm-svn: 248327	2015-09-22 21:41:29 +00:00
Ahmed Bougacha	81616a72ea	[ARM] Emit clrex in the expanded cmpxchg fail block. ARM counterpart to r248291: In the comparison failure block of a cmpxchg expansion, the initial ldrex/ldxr will not be followed by a matching strex/stxr. On ARM/AArch64, this unnecessarily ties up the execution monitor, which might have a negative performance impact on some uarchs. Instead, release the monitor in the failure block. The clrex instruction was designed for this: use it. Also see ARMARM v8-A B2.10.2: "Exclusive access instructions and Shareable memory locations". Differential Revision: http://reviews.llvm.org/D13033 llvm-svn: 248294	2015-09-22 17:22:58 +00:00
Ahmed Bougacha	07a844d758	[AArch64] Emit clrex in the expanded cmpxchg fail block. In the comparison failure block of a cmpxchg expansion, the initial ldrex/ldxr will not be followed by a matching strex/stxr. On ARM/AArch64, this unnecessarily ties up the execution monitor, which might have a negative performance impact on some uarchs. Instead, release the monitor in the failure block. The clrex instruction was designed for this: use it. Also see ARMARM v8-A B2.10.2: "Exclusive access instructions and Shareable memory locations". Differential Revision: http://reviews.llvm.org/D13033 llvm-svn: 248291	2015-09-22 17:21:44 +00:00
Benjamin Kramer	3c96f0a54e	Make helper function static. NFC. llvm-svn: 248278	2015-09-22 14:34:57 +00:00
Daniel Sanders	86cce70010	[mips][sched] Split IIBranch into specific instruction classes. Summary: Almost no functional change since the InstrItinData's have been duplicated. The one functional change is to remove IIBranch from the MSA branches. The classes will be assigned to the MSA instructions as part of implementing the P5600 scheduler. II_IndirectBranchPseudo and II_ReturnPseudo can probably be removed. I've preserved the itinerary information for the corresponding pseudo instructions to avoid making a functional change to these pseudos in this patch. Reviewers: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12189 llvm-svn: 248273	2015-09-22 13:36:28 +00:00
Daniel Sanders	1af1d275bc	[mips][sched] Temporarily rename IIAlu to IIM16Alu. NFC. Summary: The only instructions left in IIAlu are MIPS16 specific. We're not implementing a MIPS16 scheduler at this time so rename the class to make it obvious that they are MIPS16 instructions. Reviewers: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12188 llvm-svn: 248267	2015-09-22 12:36:28 +00:00
Stephen Canon	8216d88511	Don't raise inexact when lowering ceil, floor, round, trunc. The C standard has historically not specified whether or not these functions should raise the inexact flag. Traditionally on Darwin, these functions did raise inexact, and the llvm lowerings followed that conventions. n1778 (C bindings for IEEE-754 (2008)) clarifies that these functions should not set inexact. This patch brings the lowerings for arm64 and x86 in line with the newly specified behavior. This also lets us fold some logic into TD patterns, which is nice. Differential Revision: http://reviews.llvm.org/D12969 llvm-svn: 248266	2015-09-22 11:43:17 +00:00
NAKAMURA Takumi	10c80e7996	Prune trailing whitespaces. llvm-svn: 248265	2015-09-22 11:19:03 +00:00
NAKAMURA Takumi	0a7d0ad95f	Untabify. llvm-svn: 248264	2015-09-22 11:15:07 +00:00
NAKAMURA Takumi	a9cb538a74	Reformat blank lines. llvm-svn: 248263	2015-09-22 11:14:39 +00:00
NAKAMURA Takumi	84965031a7	Reformat comment lines. llvm-svn: 248262	2015-09-22 11:14:12 +00:00
NAKAMURA Takumi	70ad98aca4	Reformat. llvm-svn: 248261	2015-09-22 11:13:55 +00:00
NAKAMURA Takumi	59a16a76be	ARMInstrInfo.cpp: Reformat. llvm-svn: 248260	2015-09-22 11:10:17 +00:00
NAKAMURA Takumi	bf9cc7f30b	Fix utf8 chars. llvm-svn: 248259	2015-09-22 11:10:08 +00:00
Daniel Sanders	f173dda0e2	[mips][ias] Implement .cpreturn directive. Summary: Based on a patch by David Chisnall. I've modified the original patch as follows: * Moved the expansion to the TargetStreamers so that the directive isn't expanded when emitting assembly. * Fixed an operand order bug. * Changed the move instructions from DADDu to OR to match recent changes to GAS. Reviewers: vkalintiris Subscribers: llvm-commits, emaste, seanbruno, theraven Differential Revision: http://reviews.llvm.org/D13017 llvm-svn: 248258	2015-09-22 10:50:09 +00:00
Daniel Sanders	254f387723	[mips][sched] Added class for WSBH Summary: No functional change since no InstrItinData is provided. Reviewers: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12190 llvm-svn: 248257	2015-09-22 10:01:13 +00:00
Simon Pilgrim	1cad0cd3ce	[X86][SSE] Match zero/any extension shuffles that don't start from the first element This patch generalizes the lowering of shuffles as zero extensions to allow extensions that don't start from the first element. It now recognises extensions starting anywhere in the lower 128-bits or at the start of any higher 128-bit lane. The motivation was to reduce the number of high cost pshufb calls, but it also improves the SSE2 case as well. Differential Revision: http://reviews.llvm.org/D12561 llvm-svn: 248250	2015-09-22 08:16:08 +00:00
Matt Arsenault	f11e7489e1	AMDGPU: Remove unnecessary check If the instruction doesn't have enough operands, it either shouldn't be marked as isCommutable or is malformed. llvm-svn: 248242	2015-09-22 04:17:45 +00:00
Matthias Braun	d3dd1354a4	LiveIntervalAnalysis: Factor common code into splitSeparateComponents; NFC llvm-svn: 248241	2015-09-22 03:44:41 +00:00
Evgeniy Stepanov	3c9c8338d0	Remove unused TargetTransformInfo dependency from SafeStack pass. llvm-svn: 248233	2015-09-22 00:44:32 +00:00
Michael Zolotukhin	9f3aea6e1f	[LoopUnswitch] Require DominatorTree info. Summary: We should either require the DT info to be available, or check if it's available in every place we use DT (and we already miss such check in one place, which causes failures in some cases). As other loop passes preserve DT and it's usually available, it makes sense to just require it here. There is no regression test, because the bug only shows up if pass manager decides to clean DT info right before LoopUnswitch. If loop-unswitch is run separately, DT is available, so bug isn't exposed. Reviewers: chandlerc, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13036 llvm-svn: 248230	2015-09-22 00:22:47 +00:00
Sanjoy Das	5d9a8cbb68	[SCEV] Use SaveAndRestore<T> instead of a hand rolled struct; NFCI. `ClearWalkingBEDominatingCondsOnExit` is exactly `SaveAndRestore<bool>`, so use `SaveAndRestore<bool>` instead. llvm-svn: 248227	2015-09-22 00:10:57 +00:00
Sanjay Patel	fc580a60e2	function names should start with a lower case letter; NFC llvm-svn: 248224	2015-09-21 23:03:16 +00:00
Sanjay Patel	4ac6b115e8	don't repeat function/variable names in header comments; NFC llvm-svn: 248222	2015-09-21 22:47:23 +00:00
Philip Reames	5f99423de9	[LICM] Hoist calls to readonly argmemonly functions even with stores in the loop We know that an argmemonly function can only access memory pointed to by it's pointer arguments. Rather than needing to consider all possible stores as aliasing (as we do for a readonly function), we can only consider the aliasing of the pointer arguments. Note that this change only addresses hoisting. I'm thinking about how to address speculation safety as well, but that will be a different change. FYI, argmemonly disallows accessing memory through non-pointer typed arguments. Differential Revision: http://reviews.llvm.org/D12771 llvm-svn: 248220	2015-09-21 22:27:59 +00:00
Philip Reames	963febd4f8	Fix for pr24866 Turns out that not every basic block is guaranteed to have a node within the DominatorTree. This is really hard to trigger, but the test case from the PR managed to do so. There's active discussion continuing about what documentation and/or invariants needed cleaned up. llvm-svn: 248216	2015-09-21 22:04:10 +00:00
Mehdi Amini	24e20583d1	Fix UB: can't bind a reference to nullptr (NFC) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 248213	2015-09-21 21:29:43 +00:00
David Blaikie	9ebdc69214	auto and range-for-ify some things to make changing container types a bit easier in the (possibly near) future llvm-svn: 248212	2015-09-21 21:07:50 +00:00
Simon Pilgrim	4003ed2da3	[DAGCombiner] Improve FMA support for interpolation patterns This patch adds support for combining patterns such as (FMUL(FADD(1.0, x), y)) and (FMUL(FSUB(x, 1.0), y)) to their FMA equivalents. This is useful in particular for linear interpolation cases such as (FADD(FMUL(x, t), FMUL(y, FSUB(1.0, t)))) Differential Revision: http://reviews.llvm.org/D13003 llvm-svn: 248210	2015-09-21 20:32:48 +00:00
Jeroen Ketema	41681a5329	[ARM] Do not scale vext with a factor The vext pseudo-instruction takes the number of elements that need to be extracted, not the number of bytes. Hence, use the number of elements directly instead of scaling them with a factor. Reviewers: Silviu Baranga, James Molloy (not reflected in the differential revision) Differential Revision: http://reviews.llvm.org/D12974 llvm-svn: 248208	2015-09-21 20:28:04 +00:00
Simon Pilgrim	e8e5a17a12	[DAGCombiner] Tidy up FMA combine helpers. NFCI. Based on feedback for D13003. llvm-svn: 248206	2015-09-21 20:15:03 +00:00
James Molloy	50a4c27f97	[LoopUtils,LV] Propagate fast-math flags on generated FCmp instructions We're currently losing any fast-math flags when synthesizing fcmps for min/max reductions. In LV, make sure we copy over the scalar inst's flags. In LoopUtils, we know we only ever match patterns with hasUnsafeAlgebra, so apply that to any synthesized ops. llvm-svn: 248201	2015-09-21 19:41:19 +00:00
Stephen Canon	b12db0e42c	Remove roundingMode argument in APFloat::mod Because mod is always exact, this function should have never taken a rounding mode argument. The actual implementation still has issues, which I'll look at resolving in a subsequent patch. llvm-svn: 248195	2015-09-21 19:29:25 +00:00
Matt Arsenault	8fb9b94f7f	Fix accidentally committed debug printing llvm-svn: 248190	2015-09-21 18:21:10 +00:00
Marcello Maggioni	ab58c74d98	[DivergenceAnalysis] Separated definition of class into header. The definition of the DivergenceAnalysis pass was in a CPP file and wasn't accessible to users of the analysis to get it through "getAnalysis<>()". This patch extracts the definition into a separate header that can be used by users of the analysis to fetch the results. Patch by Volkan Keles (vkeles@apple.com) llvm-svn: 248186	2015-09-21 17:58:14 +00:00
Matthias Braun	b9fe44ddb0	SelectionDAG: Use InsertNode for EntryNode This fixes problems where two nodes have persistent debug id 0 assigned. llvm-svn: 248182	2015-09-21 17:41:05 +00:00
Chandler Carruth	7542d37688	[FunctionAttrs] Extract a helper function for the core logic used to evaluate whether 'readonly' or 'readnone' apply to a given function. This both reduces indentation and will make it easy to share the logic with a new pass manager implementation. llvm-svn: 248181	2015-09-21 17:39:41 +00:00
Ulrich Weigand	126caeb043	[SystemZ] Fix expansion of ISD::FPOW and ISD::FSINCOS The ISD::FPOW and ISD::FSINCOS opcodes default to Legal, but there is no legal instruction for those on SystemZ. This could cause LLVM internal errors. Fixed by setting the operation action to Expand for those opcodes. Also added test cases for all other LLVM IR intrinsics that should generate a library call. (Those already work correctly since the default operation action is fine.) llvm-svn: 248180	2015-09-21 17:35:45 +00:00
James Molloy	e46da3849a	Revert "[ARM] Handle +t2dsp feature as an ArchExtKind in ARMTargetParser.def" This was committed without the code review (http://reviews.llvm.org/D12937) being approved. This reverts commit r248152. llvm-svn: 248174	2015-09-21 16:35:08 +00:00
Matt Arsenault	85441dd724	AMDGPU: Move copy handling under switch like other instructions llvm-svn: 248172	2015-09-21 16:27:22 +00:00
Sanjay Patel	55dcd40d3e	add ShouldChangeType() variant that takes bitwidths This is more efficient for cases like D12965 where we already have widths. llvm-svn: 248170	2015-09-21 16:09:37 +00:00
Matt Arsenault	b774834429	DAGCombiner: Replace store of FP constant after attemping store merges If storing multiple FP constants, some subset of the stores would be replaced with integers due to visit order, so MergeConsecutiveStores would only partially merge these. llvm-svn: 248169	2015-09-21 15:59:46 +00:00
Matt Arsenault	a30ddb6524	Factor replacement of stores of FP constants into new function llvm-svn: 248168	2015-09-21 15:59:43 +00:00
Sanjay Patel	84dca494b1	don't repeat function names in comments; NFC llvm-svn: 248166	2015-09-21 15:33:26 +00:00
Chad Rosier	03a47305ec	[Machine Combiner] Refactor machine reassociation code to be target-independent. No functional change intended. Patch by Haicheng Wu <haicheng@codeaurora.org>! http://reviews.llvm.org/D12887 PR24522 llvm-svn: 248164	2015-09-21 15:09:11 +00:00
Artyom Skrobov	79b0adaae4	[ARM] Handle +t2dsp feature as an ArchExtKind in ARMTargetParser.def Currently, the availability of DSP instructions (ACLE 6.4.7) is handled in a hand-rolled tricky condition block in tools/clang/lib/Basic/Targets.cpp, with a FIXME: attached. This patch changes the handling of +t2dsp to be in line with other architecture extensions. Following review comments, also updating the description of FeatureDSPThumb2 in ARM.td. Differential Revision: http://reviews.llvm.org/D12937 llvm-svn: 248152	2015-09-21 12:43:10 +00:00
Asaf Badouh	eaf2da14bf	[X86][AVX512] add masked version for RSQRT14 & RCP14 Scalar FP Differential Revision: http://reviews.llvm.org/D12524 llvm-svn: 248147	2015-09-21 10:23:53 +00:00
Daniel Sanders	5d7962880d	[mips] Allow constant expressions in second argument of .cpsetup. Summary: Also tightened up the test and made a trivial fix to prevent double-newline after emitting .cpsetup directives. Reviewers: vkalintiris Subscribers: seanbruno, emaste, llvm-commits Differential Revision: http://reviews.llvm.org/D12956 llvm-svn: 248143	2015-09-21 09:26:55 +00:00
Craig Topper	0013be16ff	Use makeArrayRef or None to avoid unnecessarily mentioning the ArrayRef type extra times. NFC llvm-svn: 248140	2015-09-21 05:32:41 +00:00
Craig Topper	4e9b03d6f9	Don't pass StringRefs around by const reference. Pass by value instead per coding standards. NFC llvm-svn: 248136	2015-09-21 00:18:00 +00:00
Craig Topper	3c76c523e1	Cleanup places that passed SMLoc by const reference to pass it by value instead. NFC llvm-svn: 248135	2015-09-20 23:35:59 +00:00
Sanjoy Das	7cc2cfecd9	[IndVars] Use C++11 style field initialization; NFCI. llvm-svn: 248131	2015-09-20 18:42:53 +00:00
Sanjoy Das	e1e352d5c5	[IndVars] Don't add a level of indentation for namespace {. NFC. Whitespace-only change. llvm-svn: 248130	2015-09-20 18:42:50 +00:00
Igor Breger	b7e1f9d680	AVX512: Implemented encoding and intrinsics for vcmpss/sd. Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D12593 llvm-svn: 248121	2015-09-20 15:15:10 +00:00
Asaf Badouh	2744d21fb8	[X86][AVX512] extend support in Scalar conversion add scalar FP to Int conversion with truncation intrinsics add scalar conversion FP32 from/to FP64 intrinsics add rounding mode and SAE mode encoding for these intrinsics Differential Revision: http://reviews.llvm.org/D12665 llvm-svn: 248117	2015-09-20 14:31:19 +00:00
Igor Breger	4c4cd789c9	AVX512: vsqrtss/sd encoding and intrinsics implementation. Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D12102 llvm-svn: 248116	2015-09-20 09:13:41 +00:00
Asaf Badouh	572bbceecc	[X86][AVX512DQ] Add fpclass instruction Differential Revision: http://reviews.llvm.org/D12931 llvm-svn: 248115	2015-09-20 08:46:07 +00:00
Michael Kuperstein	58e86bc893	[X86] Fix sitofp and uitofp instruction matching failures with long double and avx512 The operation action for i32 and i64 cannot be set to legal, as long double needs custom lowering. Patch by: mitch.l.bodart@intel.com Differential Revision: http://reviews.llvm.org/D12372 llvm-svn: 248114	2015-09-20 08:12:17 +00:00
Igor Breger	1d55f20bee	AVX512: Implemented intrinsics for vshuff32x4, vshuff64x2, vshufi64x2, vshufi32x4 Added tests for intrinsics. Differential Revision: http://reviews.llvm.org/D12525 llvm-svn: 248113	2015-09-20 07:18:53 +00:00
Sanjoy Das	9119bf4c0b	[IndVars] Don't repeat function names in comment; NFC. Only changes comments. llvm-svn: 248112	2015-09-20 06:58:03 +00:00
Igor Breger	0ede3cbb5c	AVX512: Implement instructions encoding, lowering and intrinsics vinserti64x4, vinserti64x2, vinserti32x8, vinserti32x4, vinsertf64x4, vinsertf64x2, vinsertf32x8, vinsertf32x4 Added tests for encoding, lowering and intrinsics. Differential Revision: http://reviews.llvm.org/D11893 llvm-svn: 248111	2015-09-20 06:52:42 +00:00
Saleem Abdulrasool	4966f58ac2	ARM: cleanup formatting clang-format a line which was poorly formatted. NFC. llvm-svn: 248110	2015-09-20 03:19:09 +00:00
Sanjoy Das	428db150d1	[IndVars] Fix a bug in r248045. Because -indvars widens induction variables through arithmetic, `NeverNegative` cannot be a property of the `WidenIV` (a `WidenIV` manages information for all transitive uses of an IV being widened, including uses of `-1 * IV`). Instead it must live on `NarrowIVDefUse` which manages information for a specific def-use edge in the transitive use list of an induction variable. This change also adds a test case that demonstrates the problem with r248045. llvm-svn: 248107	2015-09-20 01:52:18 +00:00
Simon Pilgrim	d0448ee59f	[X86][SSE] Vectorize CTTZ + CTTZ_ZERO_UNDEF Now that we have fast vector CTPOP implementations we can use this to speed up vector CTTZ using the pattern (cttz(x) = ctpop((x & -x) - 1)) Additionally, for AVX512CD that provides lzcnt instructions we can use the pattern (cttz_undef(x) = (width - 1) - ctlz(x & -x)) Differential Revision: http://reviews.llvm.org/D12663 llvm-svn: 248091	2015-09-19 13:22:57 +00:00
Simon Pilgrim	996725eb17	[InstCombine] Use SimplifyDemandedVectorEltsLow helper function. NFCI. Use the SimplifyDemandedVectorEltsLow helper function introduced in D12680. llvm-svn: 248089	2015-09-19 11:41:53 +00:00
Matt Arsenault	1fafdc82d6	AMDGPU: Remove dead code getCFGStructurizerRegClass is not used for SI, so move it into R600 specific stuff. llvm-svn: 248087	2015-09-19 06:41:10 +00:00
Bob Wilson	8823b84fae	NFC: Fix indentation and add braces to clarify nested of else-statement. llvm-svn: 248086	2015-09-19 06:20:59 +00:00
Maksim Panchenko	0510cd5161	[PrologEpilogInserter] Minor refactoring. Differential Revision: http://reviews.llvm.org/D12924 llvm-svn: 248084	2015-09-19 04:42:15 +00:00
Maksim Panchenko	07b754daf8	Test commit. Fix comment. NFC. llvm-svn: 248082	2015-09-19 04:01:19 +00:00
David Majnemer	47ce0b81b0	[InstCombine] FoldICmpCstShrCst failed for ashr when comparing against -1 (icmp eq (ashr C1, %V) -1) may have multiple answers if C1 is not a power of two and has the sign bit set. This fixes PR24873. llvm-svn: 248074	2015-09-19 00:48:31 +00:00
David Majnemer	e5977ebecc	[InstCombine] FoldICmpCstShrCst didn't handle icmps of -1 in the ashr case correctly llvm-svn: 248073	2015-09-19 00:48:26 +00:00
Sanjoy Das	f69d0e3384	[IndVars] Widen more comparisons for non-negative induction vars Summary: If an induction variable is provably non-negative, its sign extension is equal to its zero extension. This means narrow uses like icmp slt iNarrow %indvar, %rhs can be widened into icmp slt iWide zext(%indvar), sext(%rhs) Reviewers: atrick, mcrosier, hfinkel Subscribers: hfinkel, reames, llvm-commits Differential Revision: http://reviews.llvm.org/D12745 llvm-svn: 248045	2015-09-18 21:21:02 +00:00
Cong Hou	d40105d321	Update edge weights properly when merging blocks in if-conversion. In if-conversion, there is a utility function MergeBlocks() that is used to merge blocks. However, when new edges are built in this function the edge weight is either not provided or not updated properly, leading to a modified CFG with incorrect edge weights. This patch corrects this issue. Differential Revision: http://reviews.llvm.org/D12513 llvm-svn: 248030	2015-09-18 20:22:41 +00:00
Eric Christopher	a835956bda	Limit the range of processors supported by ARM fast isel to v6 or later as that's all that is tested right now. Fixes PR24858. llvm-svn: 248027	2015-09-18 20:08:18 +00:00
Larisse Voufo	532bf7153c	Clean up: Refactoring the hardcoded value of 6 for FindAvailableLoadedValue()'s parameter MaxInstsToScan. (Complete version of r247497. See D12886) llvm-svn: 248022	2015-09-18 19:14:35 +00:00
James Y Knight	e72b0dbf97	Make MachineScheduler debug output less confusing. At least...a little bit. llvm-svn: 248020	2015-09-18 18:52:20 +00:00
Cong Hou	f9f9ffb98b	Scaling up values in ARMBaseInstrInfo::isProfitableToIfCvt() before they are scaled by a probability to avoid precision issue. In ARMBaseInstrInfo::isProfitableToIfCvt(), there is a simple cost model in which the number of cycles is scaled by a probability to estimate the cost. However, when the number of cycles is small (which is usually the case), there is a precision issue after the computation. To avoid this issue, this patch scales those cycles by 1024 (chosen to make the multiplication a litter faster) before they are scaled by the probability. Other variables are also scaled up for the final comparison. Differential Revision: http://reviews.llvm.org/D12742 llvm-svn: 248018	2015-09-18 18:19:40 +00:00
Matthias Braun	77771cfd97	SelectionDAGDumper: Leave out the <multiple use> markers They mostly clutter the output while it is still possible to see which node has multiple users without them. Differential Revision: http://reviews.llvm.org/D12569 llvm-svn: 248013	2015-09-18 17:57:33 +00:00
Matthias Braun	bab3fb45e5	SelectionDAGDumper: Avoid unnecessary newlines Before: t0 = EntryToken:ch t0: <multiple use> t0: <multiple use> t1 = CopyFromReg:v4f32,ch t0, Register:v4f32 %vreg0 t25 = IMPLICIT_DEF:v4f32 t26 = HADDPSrr:v4f32 t1, t25 t23 = CopyToReg:ch,glue t0, Register:v4f32 %XMM0, t26 t23: <multiple use> t23: <multiple use> t24 = RETQ:ch Register:v4f32 %XMM0, t23, t23:1 After: t0: <multiple use> t0: <multiple use> t1 = CopyFromReg:v4f32,ch t0, Register:v4f32 %vreg0 t26 = X86ISD::FHADD:v4f32 t1, undef:v4f32 t23 = CopyToReg:ch,glue t0, Register:v4f32 %XMM0, t26 t23: <multiple use> t21 = TargetConstant:i16<0> t23: <multiple use> t24 = X86ISD::RET_FLAG:ch t23, t21, Register:v4f32 %XMM0, t23:1 Differential Revision: http://reviews.llvm.org/D12568 llvm-svn: 248012	2015-09-18 17:57:31 +00:00
Matthias Braun	f89b7c7188	SelectionDAGDumper: Hide [ID=X], [ORD=X] and source locations by default. You can show them with the new -dag-dump-verbose switch. Differential Revision: http://reviews.llvm.org/D12566 llvm-svn: 248011	2015-09-18 17:57:28 +00:00
Matthias Braun	0b7d6c14c9	SelectionDAG: Introduce PersistentID to SDNode for assert builds. This gives us more human readable numbers to identify nodes in debug dumps. Before: 0x7fcbd9700160: ch = EntryToken 0x7fcbd985c7c8: i64 = Register %RAX ... 0x7fcbd9700160: <multiple use> 0x7fcbd985c578: i64,ch = MOV64rm 0x7fcbd985c6a0, 0x7fcbd985cc68, 0x7fcbd985c200, 0x7fcbd985cd90, 0x7fcbd985ceb8, 0x7fcbd9700160<Mem:LD8[@foo]> [ORD=2] 0x7fcbd985c8f0: ch,glue = CopyToReg 0x7fcbd9700160, 0x7fcbd985c7c8, 0x7fcbd985c578 [ORD=3] 0x7fcbd985c7c8: <multiple use> 0x7fcbd985c8f0: <multiple use> 0x7fcbd985c8f0: <multiple use> 0x7fcbd985ca18: ch = RETQ 0x7fcbd985c7c8, 0x7fcbd985c8f0, 0x7fcbd985c8f0:1 [ORD=3] Now: t0: ch = EntryToken t5: i64 = Register %RAX ... t0: <multiple use> t3: i64,ch = MOV64rm t10, t12, t11, t13, t14, t0<Mem:LD8[@foo]> [ORD=2] t6: ch,glue = CopyToReg t0, t5, t3 [ORD=3] t5: <multiple use> t6: <multiple use> t6: <multiple use> t7: ch = RETQ t5, t6, t6:1 [ORD=3] Differential Revision: http://reviews.llvm.org/D12564 llvm-svn: 248010	2015-09-18 17:41:00 +00:00
Geoff Berry	43ec15e57e	[AArch64] Improved bitfield instruction selection. Summary: For bitfield insert OR matching, check both operands for larger pattern first before checking for smaller pattern. Add pattern for unsigned bitfield insert-in-zero done with SHL+AND. Resolves PR21631. Reviewers: jmolloy, t.p.northover Subscribers: aemerson, rengolin, llvm-commits, mcrosier Differential Revision: http://reviews.llvm.org/D12908 llvm-svn: 248006	2015-09-18 17:11:53 +00:00
Rafael Espindola	7dbb577826	Remove temporary file on signal. Without this lld leaves temporary files behind when it crashes. llvm-svn: 247994	2015-09-18 15:17:53 +00:00
Daniel Sanders	df19a5e605	[mips][microMIPS] Fix an invalid read for lwm32 and reserved reglist values. Summary: Some values of 'reglist' are reserved and cause the disassembler to read past the end of the Regs array. Treat lwm32's containing reserved values as invalid instructions. Reviewers: zoran.jovanovic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12959 llvm-svn: 247990	2015-09-18 14:20:54 +00:00
Chad Rosier	d90e2ebdf6	[AArch64] Reorder cases to improve readability. NFC. llvm-svn: 247989	2015-09-18 14:15:19 +00:00
Chad Rosier	84a0afdeff	[AArch64] Remove some redundant cases. NFC. llvm-svn: 247988	2015-09-18 14:13:18 +00:00
Aaron Ballman	2d0f38c5fb	Silencing a -Wsign-compare warning; NFC. llvm-svn: 247986	2015-09-18 13:31:42 +00:00
Igor Laevsky	0fa4819dd8	[LazyValueInfo] Report nonnull range for nonnull pointers Currently LazyValueInfo will report only alloca's as having nonnull range. For loads with !nonnull metadata it will bailout with no additional information. Same is true for calls returning nonnull pointers. This change extends LazyValueInfo to handle additional nonnull instructions. Differential Revision: http://reviews.llvm.org/D12932 llvm-svn: 247985	2015-09-18 13:01:48 +00:00
Artur Pilipenko	84bc62f7a3	Support align attribute for return values Reviewed By: reames Differential Revision: http://reviews.llvm.org/D12844 llvm-svn: 247984	2015-09-18 12:33:31 +00:00
Michael Kruse	020296a968	[Support] Reapply r245289 "Always wait for GraphViz before opening the viewer" The change was accidentally undone by r245290. Original log message: When calling DisplayGraph and a PS viewer is chosen, two programs are executed: The GraphViz generator and the PostScript viewer. Always wait for the generator to finish to ensure that the .ps file is written before opening the viewer for that file. DisplayGraph's wait parameter refers to whether to wait until the user closes the viewer. This happened on Windows and if none of the options to open the .dot file directly applies, also on Linux. Differential Revision: http://reviews.llvm.org/D11876 llvm-svn: 247980	2015-09-18 10:56:30 +00:00
David Majnemer	9966fe8f85	[WinEH] Moved funclet pads should be in relative order We shifted the MachineBasicBlocks to the end of the MachineFunction in DFS order. This will not ensure that MachineBasicBlocks which fell through to one another will remain contiguous. Instead, implement a stable sort algorithm for iplist. This partially reverts commit r214150. llvm-svn: 247978	2015-09-18 08:18:07 +00:00
Bob Wilson	dd0eadce7d	Whitespace. Indent with spaces instead of a tab. llvm-svn: 247969	2015-09-18 05:36:13 +00:00
Quentin Colombet	b4c6886215	[ShrinkWrap] Refactor the handling of infinite loop in the analysis. - Strenghten the logic to be sure we hoist the restore point out of the current loop. (The fixes a bug with infinite loop, added as part of the patch.) - Walk over the exit blocks of the current loop to conver to the desired restore point in one iteration of the update loop. llvm-svn: 247958	2015-09-17 23:21:34 +00:00
David Blaikie	6a51dbdb3c	[opaque pointer types] Add an explicit pointee type to alias records in the IR Since aliases actually use and verify their explicit type already, no further invalid testing is required here. The invalid.test:ALIAS-TYPE-MISMATCH case catches errors due to emitting a non-pointee type in the new format or a non-pointer type in the old format. llvm-svn: 247952	2015-09-17 22:18:59 +00:00
Alexei Starovoitov	bf19a11116	[bpf] expand indirect branches BPF instruction set doesn't have indirect branches. Expand them. Reported by John Fastabend. llvm-svn: 247951	2015-09-17 22:18:08 +00:00
Matthias Braun	3e86de1acb	Revert "(HEAD -> master, origin/master, origin/HEAD) RegisterPressure: Move LiveInRegs/LiveOutRegs from RegisterPressure to PressureTracker" This reverts commit r247943. Accidental commit, code review was not finished yet. llvm-svn: 247945	2015-09-17 21:12:24 +00:00
Matthias Braun	70eff2571f	RegisterPressure: Move LiveInRegs/LiveOutRegs from RegisterPressure to PressureTracker Differential Revision: http://reviews.llvm.org/D12814 llvm-svn: 247943	2015-09-17 21:10:06 +00:00
Matthias Braun	d78ee54a54	MachineScheduler: Provide an option for node hiding cutoff and disable it by default llvm-svn: 247942	2015-09-17 21:09:59 +00:00
Joerg Sonnenberger	1bbfa7f9d7	[SPARC] Add mulscc. llvm-svn: 247940	2015-09-17 20:54:26 +00:00
Sanjay Patel	5dd66c3ca2	fix typo; NFC llvm-svn: 247938	2015-09-17 20:51:50 +00:00
David Majnemer	978902309a	[WinEH] Add a funclet layout pass Windows EH funclets need to be contiguous. The FuncletLayout pass will ensure that the funclets are together and begin with a funclet entry MBB. Differential Revision: http://reviews.llvm.org/D12943 llvm-svn: 247937	2015-09-17 20:45:18 +00:00
Reid Kleckner	5b8a46e771	[WinEH] Make funclet return instrs pseudo instrs This makes catchret look more like a branch, and less like a weird use of BlockAddress. It also lets us get away from llvm.x86.seh.restoreframe, which relies on the old parentfpoffset label arithmetic. llvm-svn: 247936	2015-09-17 20:43:47 +00:00
Piotr Padlewski	a4d43337d4	gvn small fix http://reviews.llvm.org/D12928 llvm-svn: 247935	2015-09-17 20:34:22 +00:00
Simon Pilgrim	61116ddc7b	[InstCombine] Added vector demanded bits support for SSE4A EXTRQ/INSERTQ instructions The SSE4A instructions EXTRQ/INSERTQ only use the lower 64-bits (or less) for many of their input vector operands and all of them have undefined upper 64-bits results. Differential Revision: http://reviews.llvm.org/D12680 llvm-svn: 247934	2015-09-17 20:32:45 +00:00
Piotr Padlewski	ea09288ee7	Added MD_invariant_group to LLVMContext http://reviews.llvm.org/D12926 llvm-svn: 247931	2015-09-17 20:25:07 +00:00
Teresa Johnson	ff642b9b84	Restore "Function bitcode index in Value Symbol Table and lazy reading support" This reverts commit r247898 (which reverted r247894). Patch fixed to address two issues exposed by buildbots: - unused variable warning in NDEBUG mode - std::initializer_list lifetime issue causing test failures Original Summary: Support for including the function bitcode indices in the Value Symbol Table. This requires writing the VST after the function blocks, which in turn requires a new VST forward declaration record encoding the offset of the full VST (which is backpatched to contain the offset after the VST is written). This patch also enables the lazy function reader to use the new function indices out of the VST. This support will be used by ThinLTO as well, which will be in a follow on patch. Backwards compatibility with older bitcode files is maintained. A new test is also included. The bitcode format (used for the lazy reader as well as the upcoming ThinLTO patches) came out of discussions with Duncan and others and is described here: https://drive.google.com/file/d/0B036uwnWM6RWdnBLakxmeDdOeXc/view Reviewers: dexonsmith, davidxl, joker.eph Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12536 llvm-svn: 247927	2015-09-17 20:12:00 +00:00
Sanjoy Das	7a9f8bb995	[SCEV] Use auto instead of full iterator type; NFCI. llvm-svn: 247919	2015-09-17 19:04:09 +00:00
Reid Kleckner	ed17079b52	[WinEH] Add and use hasEHPadSuccessor instead of getLandingPadSuccessor getLandingPadSuccessor assumes that each invoke can have at most one EH pad successor, but WinEH invokes can have more than one. Two out of three callers of getLandingPadSuccessor don't use the returned landingpad, so we can make them use this simple predicate instead. Eventually we'll have to circle back and fix SplitKit.cpp so that register allocation works. Baby steps. llvm-svn: 247904	2015-09-17 17:19:40 +00:00
Zia Ansari	841cce1ae9	Test commit. llvm-svn: 247901	2015-09-17 16:51:27 +00:00
Teresa Johnson	2e98d57ad4	Revert "Function bitcode index in Value Symbol Table and lazy reading support" Temporarily revert to fix some buildbot issues. One is a minor issue with a variable unused in NDEBUG mode. More concerning are some test failures on win7 that I need to dig into. This reverts commit 4e66a74543459832cfd571db42b4543580ae1d1d. llvm-svn: 247898	2015-09-17 16:19:10 +00:00
Daniel Sanders	e2982adc0b	[mips] Add assembler support for the .cprestore directive. Summary: This assembler directive is used in O32 PIC to restore the current function's $gp after executing JAL's. The $gp is first stored on the stack at a user-specified offset. It has the following format: ".cprestore 8" (where 8 is the offset). This fixes llvm.org/PR20967. Patch by Toma Tabacu. Reviewers: seanbruno, tomatabacu Subscribers: brooks, seanbruno, emaste, llvm-commits Differential Revision: http://reviews.llvm.org/D6267 llvm-svn: 247897	2015-09-17 16:08:39 +00:00
Teresa Johnson	b77b1f8a0c	Function bitcode index in Value Symbol Table and lazy reading support Summary: Support for including the function bitcode indices in the Value Symbol Table. This requires writing the VST after the function blocks, which in turn requires a new VST forward declaration record encoding the offset of the full VST (which is backpatched to contain the offset after the VST is written). This patch also enables the lazy function reader to use the new function indices out of the VST. This support will be used by ThinLTO as well, which will be in a follow on patch. Backwards compatibility with older bitcode files is maintained. A new test is also included. The bitcode format (used for the lazy reader as well as the upcoming ThinLTO patches) came out of discussions with Duncan and others and is described here: https://drive.google.com/file/d/0B036uwnWM6RWdnBLakxmeDdOeXc/view Reviewers: dexonsmith, davidxl, joker.eph Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12536 llvm-svn: 247894	2015-09-17 15:52:30 +00:00
Teresa Johnson	c01e4cbccc	Refactor string encoding checks in BitcodeWriter (NFC) llvm-svn: 247891	2015-09-17 14:37:35 +00:00
Chad Rosier	6c1f0933ac	Typos. NFC. llvm-svn: 247884	2015-09-17 13:10:27 +00:00
Zoran Jovanovic	7ba636cb4c	[mips][microMIPS] Implement TEQ, TGE, TGEU, TLT, TLTU and TNE instructions Differential Revision: http://reviews.llvm.org/D9658 llvm-svn: 247880	2015-09-17 10:14:09 +00:00
Elena Demikhovsky	702a6adfaa	AVX-512: shufflevector for i1 vectors <2 x i1> .. <64 x i1> AVX-512 does not provide an instruction that shuffles mask register. So I do the following way: mask-2-simd , shuffle simd , simd-2-mask Differential Revision: http://reviews.llvm.org/D12727 llvm-svn: 247876	2015-09-17 06:53:12 +00:00
Diego Novillo	3376a78781	GCC AutoFDO profile reader - Initial support. This adds enough machinery to support reading simple GCC AutoFDO profiles. It now supports reading flat profiles (no function calls). Subsequent patches will add support for: - Inlined calls (in particular, the inline call stack is not traversed to accumulate samples). - Working sets and modules. These are used mostly for GCC's LIPO optimizations, so they're not needed in LLVM atm. I'm not sure that we will ever need them. For now, I've if0'd around the calls. The patch also adds support in GCOV.h for gcov version V704 (generated by GCC's profile conversion tool). llvm-svn: 247874	2015-09-17 00:17:24 +00:00
Hans Wennborg	9099b5e644	Try to fix WebAssembly build after r247864 llvm-svn: 247870	2015-09-16 23:59:57 +00:00
Naomi Musgrave	f90c1be78c	ScalarEvolution: added tmp to avoid use-after-dtor in for loop. Summary: For loop destroyed current instance before invoking next. Temporary variable added to prevent use-after-dtor when invoke destructor on current instance. Reviewers: eugenis Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D12912 Rename temp var. llvm-svn: 247867	2015-09-16 23:46:40 +00:00
Eric Christopher	eb8b4bc473	Make sure we're negating the assembler predicate - no testcase because it isn't being used on anything via the assembler right now. llvm-svn: 247866	2015-09-16 23:38:18 +00:00
Eric Christopher	c7b155f670	Use the cached TargetInstrInfo instead of looking it up again. llvm-svn: 247865	2015-09-16 23:38:16 +00:00
Eric Christopher	a4e5d3cf8e	constify the Function parameter to the TTI creation callback and propagate to all callers/users/etc. llvm-svn: 247864	2015-09-16 23:38:13 +00:00
Reid Kleckner	813f1b65bc	[WinEH] Rip out the landingpad-based C++ EH state numbering code It never really worked, and the new code is working better every day. llvm-svn: 247860	2015-09-16 22:14:46 +00:00
David Majnemer	67bff0d88b	[WinEHPrepare] Turn terminatepad into a cleanuppad + call + cleanupret The MSVC doesn't really support exception specifications so let's just turn these into cleanuppads. Later, we might use terminatepad to more efficiently encode the "noexcept"-ness of a function body. llvm-svn: 247848	2015-09-16 20:42:16 +00:00
Sanjoy Das	e5f4889ba9	[InstCombine] Optimize icmp slt signum(x), 1 --> icmp slt x, 1 Summary: `signum(x)` is sometimes implemented as `(x >> 63) \| (-x >>> 63)` (for an `i64` `x`). This change adds a matcher for that pattern, and an instcombine rule to optimize `signum(x) s< 1`. Later, we can also consider optimizing: icmp slt signum(x), 0 --> icmp slt x, 0 icmp sle signum(x), 1 --> true etc. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12703 llvm-svn: 247846	2015-09-16 20:41:29 +00:00
Reid Kleckner	b005d281c3	[WinEH] Pull Adjectives and CatchObj out of the catchpad arg list Clang now passes the adjectives as an argument to catchpad. Getting the CatchObj working is simply a matter of threading another static alloca through codegen, first as an alloca, then as a frame index, and finally as a frame offset. llvm-svn: 247844	2015-09-16 20:16:27 +00:00
David Majnemer	459a64aed7	[WinEHPrepare] Provide a cloning mode which doesn't demote We are experimenting with a new approach to saving and restoring SSA values used across funclets: let the register allocator do the dirty work for us. However, this means that we need to be able to clone commoned blocks without relying on demotion. llvm-svn: 247835	2015-09-16 18:40:37 +00:00
David Majnemer	b3d9b960ea	[WinEHPrepare] Refactor explicit EH preparation Split the preparation machinery into several functions, we will want to selectively enable/disable different parts of it for an alternative mechanism for dealing with cross-funclet uses. llvm-svn: 247834	2015-09-16 18:40:24 +00:00
Reid Kleckner	84ebff4a5e	[WinEH] Skip state numbering when no EH pads are present Otherwise we'd try to emit the thunk that passes the LSDA to __CxxFrameHandler3. We don't emit the LSDA if there were no landingpads, so we'd end up with an assembler error when trying to write the COFF object. llvm-svn: 247820	2015-09-16 17:19:44 +00:00
Dan Gohman	950a13cfa3	[WebAssembly] Check in an initial CFG Stackifier pass This pass implements a simple algorithm for conversion from CFG to wasm's structured control flow. It doesn't yet handle multiple-entry loops; that will be added in a future patch. It also adds initial support for switch statements. Differential Revision: http://reviews.llvm.org/D12735 llvm-svn: 247818	2015-09-16 16:51:30 +00:00
Sanjay Patel	a260701bbb	propagate fast-math-flags on DAG nodes After D10403, we had FMF in the DAG but disabled by default. Nick reported no crashing errors after some stress testing, so I enabled them at r243687. However, Escha soon notified us of a bug not covered by any in-tree regression tests: if we don't propagate the flags, we may fail to CSE DAG nodes because differing FMF causes them to not match. There is one test case in this patch to prove that point. This patch hopes to fix or leave a 'TODO' for all of the in-tree places where we create nodes that are FMF-capable. I did this by putting an assert in SelectionDAG.getNode() to find any FMF-capable node that was being created without FMF ( D11807 ). I then ran all regression tests and test-suite and confirmed that everything passes. This patch exposes remaining work to get DAG FMF to be fully functional: (1) add the flags to non-binary nodes such as FCMP, FMA and FNEG; (2) add the flags to intrinsics; (3) use the flags as conditions for transforms rather than the current global settings. Differential Revision: http://reviews.llvm.org/D12095 llvm-svn: 247815	2015-09-16 16:31:21 +00:00
Reid Kleckner	85dfb68e50	Add assembler fatal error for undefined assembler labels in COFF writer llvm-svn: 247814	2015-09-16 16:26:29 +00:00
Sanjay Patel	815adacd22	don't repeat function names in comments; NFC llvm-svn: 247813	2015-09-16 16:21:08 +00:00
Adhemerval Zanella	f0c95bd2ca	[sanitizer] Add MSan support for AArch64 This patch adds support for msan on aarch64-linux for both 39 and 42-bit VMA. The support is enabled by defining the SANITIZER_AARCH64_VMA compiler flag to either 39 or 42 at build time for both clang/llvm and compiler-rt. The default VMA is 39 bits. llvm-svn: 247807	2015-09-16 15:10:27 +00:00
Joerg Sonnenberger	22cd644e1b	[SPARC] Both GNU and Solaris as support eq as condition code for integer ops. llvm-svn: 247804	2015-09-16 14:41:36 +00:00
Joerg Sonnenberger	9763490e4d	[SPARC] Recognize st/stx operations with %fsr argument too. llvm-svn: 247794	2015-09-16 13:30:54 +00:00
David L Kreitzer	da700ce581	Test commit: Fixed a few typos in the comments. llvm-svn: 247793	2015-09-16 13:27:30 +00:00
Chad Rosier	5d485db6b2	[ARM] Register ARMPreAllocLoadStoreOpt pass with LLVM pass manager. llvm-svn: 247791	2015-09-16 13:11:31 +00:00
Michael Kuperstein	d926465342	[X86] Do not generate 64-bit pops of 32-bit GPRs. When trying emit a stack adjustments using pops, frame lowering selects an arbitrary free GPR. It should always select one from an appropriate class... This fixes PR24649. Patch by: amjad.aboud@intel.com Differential Revision: http://reviews.llvm.org/D12609 llvm-svn: 247785	2015-09-16 11:27:20 +00:00
Michael Kuperstein	098cd9fba7	[X86] Fix emitEpilogue() to make less assumptions about pops This is the mirror image of r242395. When X86FrameLowering::emitEpilogue() looks for where to insert the %esp addition that deallocates stack space used for local allocations, it assumes that any sequence of pop instructions from function exit backwards consists purely of restoring callee-save registers. This may be false, since from some point backward, the pops may be clean-up of stack space allocated for arguments to a call. Patch by: amjad.aboud@intel.com Differential Revision: http://reviews.llvm.org/D12688 llvm-svn: 247784	2015-09-16 11:18:25 +00:00
Zoran Jovanovic	6e6a2c9cd7	[mips][microMIPS] Implement PREFX, LHUE, LBE, LBUE, LHE, LWE, SBE, SHE and SWE instructions Differential Revision: http://reviews.llvm.org/D9189 llvm-svn: 247780	2015-09-16 09:14:35 +00:00
Craig Topper	5db36df4d0	Use range-based for loops. NFC llvm-svn: 247772	2015-09-16 03:52:35 +00:00
Craig Topper	77ec077067	Fix a spelling error in the description of a statistic. NFC llvm-svn: 247771	2015-09-16 03:52:32 +00:00
Michael Zolotukhin	fc314be0ec	[Unroll] Fix a bug in UnrolledInstAnalyzer::visitLoad. We only checked that a global is initialized with constants, which is incorrect. We should be checking that GlobalVariable is a constant, not just initialized with it. llvm-svn: 247769	2015-09-16 03:25:09 +00:00
Sanjoy Das	8a5526e8be	[IndVars] Fix PR24783. In `IndVarSimplify::ExpandSCEVIfNeeded`, `SCEVExpander::findExistingExpansion` may return an `llvm::Value` that differs in type from the SCEV it was asked to find an expansion for (but computes the same value). In such cases, we fall back on `expandCodeFor`; and rely on LLVM to CSE the two equivalent expressions (different only by a no-op cast) into a single computation. I tried a few other approaches to fixing PR24783, all of which turned out to be more complex than this current version: 1. Move the `ExpandSCEVIfNeeded` logic into `expandCodeFor`. This got problematic because currently we do not pass in the `Loop *` into `expandCodeFor`. Changing the interface to do this is a more invasive change, and really does not make much semantic sense unless the SCEV being passed in is an add recurrence. There is also the problem of `expandCodeFor` being used in places other than `indvars` -- there may be performance / correctness issues elsewhere if `expandCodeFor` is moved from always generating IR from scratch to cache-like model. 2. Have `findExistingExpansion` only return expression with the correct type. This would make `isHighCostExpansionHelper` and thus `isHighCostExpansion` more conservative than necessary. 3. Insert casts on the value returned by `findExistingExpansion` if needed using `InsertNoopCastOfTo`. This is complicated because `InsertNoopCastOfTo` depends on internal state of its `SCEVExpander` (specifically `Builder.GetInserPoint()`), and this may not be set up when `ExpandSCEVIfNeeded` is called. 4. Manually insert casts on the value returned by `findExistingExpansion` if needed using `InsertNoopCastOfTo` via `CastInst::Create`. This is probably workable, but figuring out the location where the cast instruction needs to be inserted has enough edge cases (arguments, constants, invokes, LCSSA must be preserved) makes me feel what I have right now is simplest solution. llvm-svn: 247749	2015-09-15 23:45:39 +00:00
Sanjoy Das	0ce51a92a8	[IndVars] Rename variable; NFC. llvm-svn: 247748	2015-09-15 23:45:35 +00:00
Duncan P. N. Exon Smith	cff5feff6f	Reapply "LTO: Disable extra verify runs in release builds" This reverts commit r247730, effectively reapplying r247729. This time I have an lld commit ready to follow. llvm-svn: 247735	2015-09-15 23:05:59 +00:00
Alexey Samsonov	c1603b6493	[ASan] Don't instrument globals in .preinit_array/.init_array/.fini_array These sections contain pointers to function that should be invoked during startup/shutdown by __libc_csu_init and __libc_csu_fini. Instrumenting these globals will append redzone to them, which will be filled with zeroes. This will cause null pointer dereference at runtime. Merge ASan regression tests for globals that should be ignored by instrumentation pass. llvm-svn: 247734	2015-09-15 23:05:48 +00:00
Duncan P. N. Exon Smith	7de73e56a4	Revert "LTO: Disable extra verify runs in release builds" This temporarily reverts commit r247729, as it caused lld build failures. I'll recommit once I have an lld patch ready-to-go. llvm-svn: 247730	2015-09-15 22:47:38 +00:00
Duncan P. N. Exon Smith	236787838c	LTO: Disable extra verify runs in release builds The verifier currently runs three times in LTO: (1) after parsing, (2) at the beginning of the optimization pipeline, and (3) at the end of it. The first run is important, since we're not sure where the bitcode comes from and it's nice to validate it, but in release builds the extra runs aren't appropriate. This commit: - Allows these runs to be disabled in LTOCodeGenerator. - Adds command-line options to llvm-lto. - Adds command-line options to libLTO.dylib, and disables the verifier by default in release builds (based on NDEBUG). This shaves about 3.5% off the runtime of ld64 when linking verify-uselistorder with -flto -g. rdar://22509081 llvm-svn: 247729	2015-09-15 22:26:11 +00:00
Larisse Voufo	6b867c7254	Revert "Clean up: Refactoring the hardcoded value of 6 for FindAvailableLoadedValue()'s parameter MaxInstsToScan." for preliminary community discussion (See. D12886) llvm-svn: 247716	2015-09-15 19:14:05 +00:00
Piotr Padlewski	6c15ec49ed	Introducing llvm.invariant.group.barrier intrinsic For more info for what reason it was invented, goto: http://lists.llvm.org/pipermail/cfe-dev/2015-July/044227.html invariant.group.barrier: http://reviews.llvm.org/D12310 docs: http://reviews.llvm.org/D11399 CodeGenPrepare: http://reviews.llvm.org/D12875 llvm-svn: 247711	2015-09-15 18:32:14 +00:00
Quentin Colombet	dc29c973e5	[ShrinkWrapping] Fix an infinite loop while looking for restore point. This may happen when the input program itself contains an infinite loop with no exit block. In that case, we would fail to find a block post-dominating the loop such that this block is outside of the loop. This fixes PR24823. Working on reducing the test case. llvm-svn: 247710	2015-09-15 18:19:39 +00:00
Arch D. Robison	8ed0854f55	Broaden optimization of fcmp ([us]itofp x, constant) by instcombine. The patch extends the optimization to cases where the constant's magnitude is so small or large that the rounding of the conversion is irrelevant. The "so small" case includes negative zero. Differential review: http://reviews.llvm.org/D11210 llvm-svn: 247708	2015-09-15 17:51:59 +00:00
Igor Laevsky	bdc1eafe20	[CorrelatedValuePropagation] Infer nonnull attributes LazuValueInfo can prove that value is nonnull based on the context information. Make use of this ability to infer nonnull attributes for the call arguments. Differential Revision: http://reviews.llvm.org/D12836 llvm-svn: 247707	2015-09-15 17:51:50 +00:00
Marcello Maggioni	454faa84e2	[NaryReassociate] Add support for Mul instructions This patch extends the current pass by handling Mul instructions as well. Patch by: Volkan Keles (vkeles@apple.com) llvm-svn: 247705	2015-09-15 17:22:52 +00:00
Daniel Sanders	50f17235dd	Revert r247692: Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC. Eric has replied and has demanded the patch be reverted. llvm-svn: 247702	2015-09-15 16:17:27 +00:00
Sanjay Patel	e9434e80d1	80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC80-cols; NFC llvm-svn: 247700	2015-09-15 15:26:25 +00:00
Sanjay Patel	f9b776350f	more space; NFC llvm-svn: 247699	2015-09-15 15:24:42 +00:00
Zoran Jovanovic	dc4b8c2761	[mips][microMIPS] Fix an issue with disassembling lwm32 instruction Fixed microMIPS disassembler crash on test case generated by llvm-mc-fuzzer. Differential Revision: http://reviews.llvm.org/D12881 llvm-svn: 247698	2015-09-15 15:21:27 +00:00
Zoran Jovanovic	8eb8c9861d	[mips] Add support for branch-likely pseudo-instructions Differential Revision: http://reviews.llvm.org/D10537 llvm-svn: 247697	2015-09-15 15:06:26 +00:00
Ulrich Weigand	e861e6442c	[SystemZ] Fix assertion failure in tryBuildVectorShuffle Under certain circumstances, tryBuildVectorShuffle would attempt to create a BUILD_VECTOR node with an invalid combination of types. This happened when one of the components of the original BUILD_VECTOR was itself a TRUNCATE node. That TRUNCATE was stripped off during intermediate processing to simplify code, but when adding the node back to the result vector, we still need it to get the type right. llvm-svn: 247694	2015-09-15 14:27:46 +00:00
Daniel Sanders	153010c52d	Re-commit r247683: Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC. Summary: This is the first patch in the series to migrate Triple's (which are ambiguous) to TargetTuple's (which aren't). For the moment, TargetTuple simply passes all requests to the Triple object it holds. Once it has replaced Triple, it will start to implement the interface in a more suitable way. This change makes some changes to the public C++ API. In particular, InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer() now take TargetTuples instead of Triples. The other public C++ API's have been left as-is for the moment to reduce patch size. This commit also contains a trivial patch to clang to account for the C++ API change. Thanks go to Pavel Labath for fixing LLDB for me. Reviewers: rengolin Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10969 llvm-svn: 247692	2015-09-15 14:08:28 +00:00
Daniel Sanders	c40de48041	Revert r247684 - Replace Triple with a new TargetTuple ... LLDB needs to be updated in the same commit. llvm-svn: 247686	2015-09-15 13:46:21 +00:00
Daniel Sanders	18d4b0dab7	Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC. Summary: This is the first patch in the series to migrate Triple's (which are ambiguous) to TargetTuple's (which aren't). For the moment, TargetTuple simply passes all requests to the Triple object it holds. Once it has replaced Triple, it will start to implement the interface in a more suitable way. This change makes some changes to the public C++ API. In particular, InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer() now take TargetTuples instead of Triples. The other public C++ API's have been left as-is for the moment to reduce patch size. This commit also contains a trivial patch to clang to account for the C++ API change. Reviewers: rengolin Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10969 llvm-svn: 247683	2015-09-15 13:17:40 +00:00
Daniel Sanders	c8cd6e95d2	Fix namespace indentation and missing blank lines before 'public:' in *MCAsmInfo.h. NFC. This is to reduce noise in a following commit. Also fixes a couple missing spaces before the reference operator. llvm-svn: 247679	2015-09-15 12:27:06 +00:00
James Molloy	d5b161a221	[GlobalsAA] Disable globals-aa by default Several issues have been found with it - disabling in the meantime. llvm-svn: 247674	2015-09-15 10:44:06 +00:00
Zoran Jovanovic	7beb737b46	[mips][microMIPS] Implement CACHEE and PREFE instructions for microMIPS32r6 Differential Revision: http://reviews.llvm.org/D11632 llvm-svn: 247670	2015-09-15 10:05:10 +00:00
Daniel Sanders	e4e83a7bc1	[mips] Added support for various EVA ASE instructions. Summary: Added support for the following instructions: CACHEE, LBE, LBUE, LHE, LHUE, LWE, LLE, LWLE, LWRE, PREFE, SBE, SHE, SWE, SCE, SWLE, SWRE, TLBINV, TLBINVF This required adding some infrastructure for the EVA ASE. Patch by Scott Egerton. Reviewers: vkalintiris, dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11139 llvm-svn: 247669	2015-09-15 10:02:16 +00:00
Sanjoy Das	f75e15e5ac	[PlaceSafepoints] Make the width of a counted loop settable. Summary: This change lets a `PlaceSafepoints` client change how wide the trip count of a loop has to be for the loop to be considerd "counted", via `CountedLoopTripWidth`. It also removes the boolean `SkipCounted` flag and the `upperTripBound` constant -- we can get the old behavior of `SkipCounted` == `false` by setting `CountedLoopTripWidth` to `13` (2 ^ 13 == 8192). Reviewers: reames Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D12789 llvm-svn: 247656	2015-09-15 01:42:48 +00:00
Dan Gohman	311b488d76	[WebAssembly] Implement int64-to-int32 conversion. llvm-svn: 247649	2015-09-15 00:55:19 +00:00
Adrian Prantl	deef90d7f5	DwarfDebug: Emit dwo_id+dwo_name for DICompileUnits that provide a dwoId. For module debugging clang emits prefabricated skeleton compile units that can be recognized by a nonzero dwoId. llvm-svn: 247626	2015-09-14 22:10:22 +00:00
David Blaikie	798f3079d4	[opaque pointer types] Add an explicit value type to GlobalObject This is needed by all GlobalObjects (GlobalAlias, Function, GlobalVariable), see the GlobalObject::getValueType which is used in many places. If at some point that can be removed, then we can remove this member. llvm-svn: 247621	2015-09-14 21:47:27 +00:00
David Blaikie	6614d8d230	[opaque pointer types] Switch a few cases of getElementType over, since I had them lying around anyway llvm-svn: 247610	2015-09-14 20:29:26 +00:00
Matthias Braun	3f3934b010	RegisterPressure: Simplify close{Top\|Bottom}() - There are no duplicate registers in LiveRegs list we are copying from and so we do not need to sort the registers. - Simply use SmallVector::apend instead of a loop between begin() and end() with push_back(). Differential Revision: http://reviews.llvm.org/D12813 llvm-svn: 247588	2015-09-14 18:24:15 +00:00
Chen Li	0d043b52eb	[InstCombineCalls] Use isKnownNonNullAt() to check nullness of passing arguments at callsite Summary: This patch replaces isKnownNonNull() with isKnownNonNullAt() when checking nullness of passing arguments at callsite. In this way it can handle cases where the argument does not have nonnull attribute but has a dominating null check from the CFG. It also adds assertions in isKnownNonNull() and isKnownNonNullFromDominatingCondition() to make sure the value checked is pointer type (as defined in LLVM document). These assertions might trip failures in things which are not covered under llvm/test, but fixes should be pretty obvious. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12779 llvm-svn: 247587	2015-09-14 18:10:43 +00:00
David Blaikie	16a2f3e302	Revert "[opaque pointer type] Pass GlobalAlias the actual pointer type rather than decomposing it into pointee type + address space" This was a flawed change - it just caused the getElementType call to be deferred until later, when we really need to remove it. Now that the IR for GlobalAliases has been updated, the root cause is addressed that way instead and this change is no longer needed (and in fact gets in the way - because we want to pass the pointee type directly down further). Follow up patches to push this through GlobalValue, bitcode format, etc, will come along soon. This reverts commit 236160. llvm-svn: 247585	2015-09-14 18:01:59 +00:00
Jun Bum Lim	34b9bd0435	Improve ISel using across lane min/max reduction In vectorized integer min/max reduction code, the final "reduce" step is sub-optimal. In AArch64, this change wll combine : %svn0 = vector_shuffle %0, undef<2,3,u,u> %smax0 = smax %0, svn0 %svn3 = vector_shuffle %smax0, undef<1,u,u,u> %sc = setcc %smax0, %svn3, gt %n0 = extract_vector_elt %sc, #0 %n1 = extract_vector_elt %smax0, #0 %n2 = extract_vector_elt $smax0, #1 %result = select %n0, %n1, n2 becomes : %1 = smaxv %0 %result = extract_vector_elt %1, 0 This change extends r246790. llvm-svn: 247575	2015-09-14 16:19:52 +00:00
Daniel Sanders	7d6d898826	[mips] Unified the MipsMemSimm9GPRAsmOperand and MipsMemSimm9AsmOperand operands, NFC. Summary: These operands had the same purpose, however the MipsMemSimm9GPRAsmOperand operand was only for micromips32r6 and the MipsMemSimm9AsmOperand did not have a ParserMatchClass. Patch by Scott Egerton Reviewers: vkalintiris, dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12730 llvm-svn: 247573	2015-09-14 15:57:24 +00:00
JF Bastien	26aca14b15	[MergeFuncs] Fix bug in merging GetElementPointers GetElementPointers must have the first argument's type compared for structural equivalence. Previously the code erroneously compared the pointer's type, but this code was dead because all pointer types (of the same address space) are the same. The pointee must be compared instead (using the type stored in the GEP, not from the pointer type which will be erased anyway). Author: jrkoenig Reviewers: dschuff, nlewycky, jfb Subscribers: nlewycky, llvm-commits Differential revision: http://reviews.llvm.org/D12820 llvm-svn: 247570	2015-09-14 15:37:48 +00:00
John Brawn	056e67865a	[ARM] Extract shifts out of multiply-by-constant Turning (op x (mul y k)) into (op x (lsl (mul y k>>n) n)) is beneficial when we can do the lsl as a shifted operand and the resulting multiply constant is simpler to generate. Do this by doing the transformation when trying to select a shifted operand, as that ensures that it actually turns out better (the alternative would be to do it in PreprocessISelDAG, but we don't know for sure there if extracting the shift would allow a shifted operand to be used). Differential Revision: http://reviews.llvm.org/D12196 llvm-svn: 247569	2015-09-14 15:19:41 +00:00
Simon Atanasyan	75e3b3c826	[mips] Remove redundant nested-name-specifier. NFC llvm-svn: 247547	2015-09-14 11:18:22 +00:00
Simon Atanasyan	4d45c76895	[mips] Save a copy of MipsABIInfo in the MipsTargetStreamer to escape a dangling pointer The MipsTargetELFStreamer can receive ABI info from many sources. For example, from the MipsAsmParser instance. Lifetime of the MipsAsmParser can be shorter than MipsTargetELFStreamer's lifetime. In that case we get a dangling pointer to MipsABIInfo. Differential Revision: http://reviews.llvm.org/D12805 llvm-svn: 247546	2015-09-14 11:18:03 +00:00
NAKAMURA Takumi	98800d9c3a	GlobalsAAResult: Try to fix crash. DeletionCallbackHandle holds GAR in its creation. It assumes; - It is registered as CallbackVH. It should not be moved in its life. - Its parent, GAR, may be moved. To move list<DeletionCallbackHandle> GlobalsAAResult::Handles, GAR must be updated with the destination in GlobalsAAResult(&&). llvm-svn: 247534	2015-09-14 06:16:44 +00:00
Simon Pilgrim	f8f86ab176	[X86][MMX] Added shuffle decodes for MMX/3DNow! shuffles. Added shuffle decodes for MMX PUNPCK + PSHUFW shuffles. Added shuffle decodes for 3DNow! PSWAPD shuffles. llvm-svn: 247526	2015-09-13 11:28:45 +00:00
Chandler Carruth	3824f859f5	[FunctionAttrs] Move the malloc-like test to a static helper function that could be used from a new pass manager. This one makes particular sense as a static helper as it doesn't even need TLI. llvm-svn: 247525	2015-09-13 08:23:27 +00:00
Chandler Carruth	8874b78697	[FunctionAttrs] Factor the logic to test for a known non-null return out of a method and into a re-usable static helper. We can potentially use this function from the implementation of a new pass manager oriented version of the pass. Also add some better documentation of exactly what the semantic model of this routine is (it isn't trivial) and use a more modern naming convention for it. llvm-svn: 247524	2015-09-13 08:17:14 +00:00
Elena Demikhovsky	8671fcbbd6	AVX-512: Fixed a bug in OR/XOR operations for 512-bit FP values on KNL. KNL does not have VXORPS, VORPS for 512-bit values. I use integer VPXOR, VPOR that actually do the same. X86ISD::FXOR/FOR are generated as a result of FSUB combining. Differential Revision: http://reviews.llvm.org/D12753 llvm-svn: 247523	2015-09-13 08:15:15 +00:00
Chandler Carruth	444d005615	[FunctionAttrs] Make the per-function attribute inference a boring static function rather than a method. It just needed access to TargetLibraryInfo, and this way it can be easily reused between the current FunctionAttrs implementation and any port for the new pass manager. llvm-svn: 247522	2015-09-13 08:03:23 +00:00
Chandler Carruth	d02452015c	[FunctionAttrs] Collect utility functions as static helpers rather than methods. They don't need anything from the class anyways. Also, collect the declarations into the private section of the pass. llvm-svn: 247521	2015-09-13 07:50:43 +00:00
Chandler Carruth	a632fb9e86	Clean up doxygen comments in FunctionAttrs, promoting some non-doxygen comments, deleting duplicate comments, moving comments to consistently live on the definition since these are all really internal routines, etc. NFC. llvm-svn: 247520	2015-09-13 06:57:25 +00:00
Chandler Carruth	63559d7cd4	Do some spring cleaning on FunctionAttrs.cpp with clang-format prior to other refactorings and cleanups here. llvm-svn: 247519	2015-09-13 06:47:20 +00:00
Sanjay Patel	8b960d22ad	[x86] enable machine combiner reassociations for 128-bit vector logical integer insts (2nd try) The changes in: test/CodeGen/X86/machine-cp.ll are just due to scheduling differences after some logic instructions were reassociated. llvm-svn: 247516	2015-09-12 19:47:50 +00:00
Ahmed Bougacha	49b531a08d	[CodeGen] Fix AtomicExpand invalidation issue caused by r247429. llvm-svn: 247514	2015-09-12 18:51:23 +00:00
Simon Pilgrim	5253b7b4a7	[X86] Renamed lowerVectorShuffleAsUnpack NFCI. Renamed to lowerVectorShuffleAsPermuteAndUnpack to make it clear that it lowers to more than just a UNPCK instruction. llvm-svn: 247513	2015-09-12 18:26:47 +00:00
Simon Pilgrim	2fcfef542a	[X86] Moved lowerVectorShuffleWithUNPCK earlier to make reuse easier. NFCI. llvm-svn: 247511	2015-09-12 16:03:06 +00:00
Sanjay Patel	99f7370a79	revert r247506; need to verify changes in existing tests llvm-svn: 247507	2015-09-12 15:27:31 +00:00
Sanjay Patel	08755c7dbc	[x86] enable machine combiner reassociations for 128-bit vector logical integer insts llvm-svn: 247506	2015-09-12 14:58:04 +00:00
Simon Pilgrim	48ffca0f47	Fixed unused variable warning. llvm-svn: 247505	2015-09-12 14:00:17 +00:00
Simon Pilgrim	20c607b110	[InstCombine] CVTPH2PS Vector Demanded Elements + Constant Folding Improved InstCombine support for CVTPH2PS (F16C half 2 float conversion): <4 x float> @llvm.x86.vcvtph2ps.128(<8 x i16>) - only uses the bottom 4 i16 elements for the conversion. Added constant folding support. Differential Revision: http://reviews.llvm.org/D12731 llvm-svn: 247504	2015-09-12 13:39:53 +00:00
Chandler Carruth	29a18a4663	[PM] Port SROA to the new pass manager. In some ways this is a very boring port to the new pass manager as there are no interesting analyses or dependencies or other oddities. However, this does introduce the first good example of a transformation pass with non-trivial state porting to the new pass manager. I've tried to carve out patterns here to replicate elsewhere, and would appreciate comments on whether folks like these patterns: - A common need in the new pass manager is to effectively lift the pass class and some of its state into a public header file. Prior to this, LLVM used anonymous namespaces to provide "module private" types and utilities, but that doesn't scale to cases where a public header file is needed and the new pass manager will exacerbate that. The pattern I've adopted here is to use the namespace-cased-name of the core pass (what would be a module if we had them) as a module-private namespace. Then utility and other code can be declared and defined in this namespace. At some point in the future, we could even have (conditionally compiled) code that used modules features when available to do the same basic thing. - I've split the actual pass run method in two in order to expose a private method usable by the old pass manager to wrap the new class with a minimum of duplicated code. I actually looked at a bunch of ways to automate or generate these, but they are all quite terrible IMO. The fundamental need is to extract the set of analyses which need to cross this interface boundary, and that will end up being too unpredictable to effectively encapsulate IMO. This is also a relatively small amount of boiler plate that will live a relatively short time, so I'm not too worried about the fact that it is boiler plate. The rest of the patch is totally boring but results in a massive diff (sorry). It just moves code around and removes or adds qualifiers to reflect the new name and nesting structure. Differential Revision: http://reviews.llvm.org/D12773 llvm-svn: 247501	2015-09-12 09:09:14 +00:00
Larisse Voufo	f57162b6e7	Clean up: Refactoring the hardcoded value of 6 for FindAvailableLoadedValue()'s parameter MaxInstsToScan. llvm-svn: 247497	2015-09-12 01:41:55 +00:00
Bruce Mitchener	e9ffb45b60	Fix typos. Summary: This fixes a variety of typos in docs, code and headers. Subscribers: jholewinski, sanjoy, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D12626 llvm-svn: 247495	2015-09-12 01:17:08 +00:00
Davide Italiano	983366ab12	[MC] Fix style bugs introduced in r247471. Reported by Rafael Espindola. llvm-svn: 247483	2015-09-11 22:04:21 +00:00
Davide Italiano	63cee81c3c	[MC] Don't crash on division by zero. Differential Revision: http://reviews.llvm.org/D12776 llvm-svn: 247471	2015-09-11 20:47:35 +00:00
Yunzhong Gao	46261a74db	Add a non-exiting diagnostic handler for LTO. This is in order to give LTO clients a chance to do some clean-up before terminating the process. llvm-svn: 247461	2015-09-11 20:01:53 +00:00
Sanjay Patel	41c739b3fa	typo; NFC llvm-svn: 247454	2015-09-11 19:29:18 +00:00
Akira Hatanaka	bc497c93f5	Use function attribute "stackrealign" to decide whether stack realignment should be forced. With this commit, we can now force stack realignment when doing LTO and do so on a per-function basis. Also, add a new cl::opt option "stackrealign" to CommandFlags.h which is used to force stack realignment via llc's command line. Out-of-tree projects currently using -force-align-stack to force stack realignment should make changes to attach the attribute to the functions in the IR. Differential Revision: http://reviews.llvm.org/D11814 llvm-svn: 247450	2015-09-11 18:54:38 +00:00
David Majnemer	0e70598a5b	[X86] Make sure startproc/endproc are paired We used different conditions to determine if we should emit startproc vs endproc. Use the same condition to ensure that they will always be paired. This fixes PR24374. llvm-svn: 247435	2015-09-11 17:34:34 +00:00
Reid Kleckner	5dbee7baef	[IR] Print the label operands of a catchpad like an invoke The rest of the EH pads are fine, since they have at most one label and take fewer operands for the personality. Old catchpad vs. new: %5 = catchpad [i8* bitcast (i32 ()* @"\01?filt$0@0@main@@" to i8)] to label %__except.ret.10 unwind label %catchendblock.9 ----- %5 = catchpad [i8 bitcast (i32 ()* @"\01?filt$0@0@main@@" to i8*)] to label %__except.ret.10 unwind label %catchendblock.9 llvm-svn: 247433	2015-09-11 17:27:52 +00:00
Ahmed Bougacha	5246867384	[CodeGen] Refactor TLI/AtomicExpand interface to make LLSC explicit. We used to have this magic "hasLoadLinkedStoreConditional()" callback, which really meant two things: - expand cmpxchg (to ll/sc). - expand atomic loads using ll/sc (rather than cmpxchg). Remove it, and, instead, introduce explicit callbacks: - bool shouldExpandAtomicCmpXchgInIR(inst) - AtomicExpansionKind shouldExpandAtomicLoadInIR(inst) Differential Revision: http://reviews.llvm.org/D12557 llvm-svn: 247429	2015-09-11 17:08:28 +00:00
Ahmed Bougacha	9d677131c4	[CodeGen] Rename AtomicRMWExpansionKind to AtomicExpansionKind. This lets us generalize its usage to the other atomic instructions. llvm-svn: 247428	2015-09-11 17:08:17 +00:00
NAKAMURA Takumi	25b5bd27ea	[PR24785] Appease MSC18 to tweak optimizations. This brings a warning. cl : Command line warning D9035: option 'Og-' has been deprecated and will be removed in a future release We should resolve PR11951 to remove this tweak. llvm-svn: 247427	2015-09-11 17:08:02 +00:00
Yaron Keren	102e3ce5b3	Add #include llvm-config.h to Locale.cpp which depends on LLVM_ON_WIN32. Source code was assuming that llvm-config.h would be included somehow but up to r247253 that added #include "llvm/Support/Compiler.h" to StringRef.h the config file was not actually included. The inclusion of llvm-config.h caused a change of behaviour in tools/clang/test/Frontend/source-col-map.c: previously it would output the original UTF-8 but now it outputs <U+03B1>. llvm-svn: 247409	2015-09-11 13:22:47 +00:00
NAKAMURA Takumi	8061e8645f	PPCFrameLowering::emitEpilogue(): Avoid manipulating MBBI on iterator end. It caused crash in MachineInstr::hasPropertyInBundle() since r247237. llvm-svn: 247395	2015-09-11 08:20:56 +00:00
David Blaikie	2f40830dde	[opaque pointer type] Add textual IR support for explicit type parameter for global aliases update.py: import fileinput import sys import re alias_match_prefix = r"(.(?:=\|:\|^)\s(?:external \|)(?:(?:private\|internal\|linkonce\|linkonce_odr\|weak\|weak_odr\|common\|appending\|extern_weak\|available_externally) )?(?:default \|hidden \|protected )?(?:dllimport \|dllexport )?(?:unnamed_addr \|)(?:thread_local(?:$[a-z]$)? )?alias" plain = re.compile(alias_match_prefix + r" (.?))(\| addrspace$\d+$ )\($\| (?:%\|@\|null\|undef\|blockaddress\|addrspacecast\|\[\[[a-zA-Z]\|\{\{).$)") cast = re.compile(alias_match_prefix + r") ((?:bitcast\|inttoptr\|addrspacecast)\s$. to (.?)(\| addrspace\(\d+$ )\\)\s(?:;.)?$)") gep = re.compile(alias_match_prefix + r") ((?:getelementptr)\s(?:inbounds)?\s$(?P<type>.), (?P=type)(?:\saddrspace\(\d+$\s)?\* .\)\s(?:;.)?$)") def conv(line): m = re.match(cast, line) if m: return m.group(1) + " " + m.group(3) + ", " + m.group(2) m = re.match(gep, line) if m: return m.group(1) + " " + m.group(3) + ", " + m.group(2) m = re.match(plain, line) if m: return m.group(1) + ", " + m.group(2) + m.group(3) + "" + m.group(4) + "\n" return line for line in sys.stdin: sys.stdout.write(conv(line)) apply.sh: for name in "$@" do python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name" rm -f "$name.tmp" done The actual commands: From llvm/src: find test/ -name .ll \| xargs ./apply.sh From llvm/src/tools/clang: find test/ -name .mm -o -name .m -o -name .cpp -o -name .c \| xargs -I '{}' ../../apply.sh "{}" From llvm/src/tools/polly: find test/ -name .ll \| xargs ./apply.sh llvm-svn: 247378	2015-09-11 03:22:04 +00:00
Cong Hou	c416e4182a	Fixed a bug that BranchProbability is not defined in BlockFrequency.cpp. NFC. llvm-svn: 247376	2015-09-11 02:47:30 +00:00
Duncan P. N. Exon Smith	adc5456da0	AsmWriter: Avoid O(N^2) processing of metadata Fix embarrassing bugs I introduced to the `SlotTracker` in or around r235785. I had us iterating through every instruction in a function (and hitting a map in the LLVMContext) for every basic block in the function. While there, completely avoid the call to `SlotTracker::processFunctionMetadata()` from `SlotTracker::processFunction()` if we've speculatively done this already in `SlotTracker::processModule()` by checking `ShouldInitializeAllMetadata` (this wasn't an algorithmic problem, but it's touching the same line of code). Fixes PR24699. llvm-svn: 247372	2015-09-11 01:34:59 +00:00
Mehdi Amini	2bd08527ff	Revert "[InstCombineCalls] Use isKnownNonNullAt() to check nullness of passing arguments at callsite" This reverts commit r247356. Breaks test/Transforms/InstCombine/pr8547.ll with: Wrong types for attribute: byval inalloca nest noalias nocapture nonnull readnone readonly sret dereferenceable(1) dereferenceable_or_null(1) %call = call i32 (i8, ...) @printf(i8 getelementptr inbounds ([10 x i8], [10 x i8]* @.str, i64 0, i64 0), i32 nonnull %conv2) #0 LLVM ERROR: Broken function found, compilation aborted! From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 247371	2015-09-11 01:33:48 +00:00
Kostya Serebryany	dd02f1f8ab	[libFuzzer] perform fewer crossover operations compared to plain mutations llvm-svn: 247364	2015-09-11 00:20:58 +00:00
Reid Kleckner	95ce1df93a	Add .exe check to Execute to fix clang-modernize tests broken in r247358 llvm-svn: 247361	2015-09-10 23:59:45 +00:00
Reid Kleckner	89d4b1a77c	ScanDirForExecutable on Windows fails to find executables with the "exe" extension in name When the driver tries to locate a program by its name, e.g. a linker, it scans the paths provided by the toolchain using the ScanDirForExecutable function. If the lookup fails, the driver uses llvm::sys::findProgramByName. Unlike llvm::sys::findProgramByName, ScanDirForExecutable is not aware of file extensions. If the program has the "exe" extension in its name, which is very common on Windows, ScanDirForExecutable won't find it under the toolchain-provided paths. This patch changes the Windows version of the "`can_execute`" function called by ScanDirForExecutable to respect file extensions, similarly to llvm::sys::findProgramByName. Patch by Oleg Ranevskyy Reviewers: rnk Differential Revision: http://reviews.llvm.org/D12711 llvm-svn: 247358	2015-09-10 23:28:06 +00:00
Cong Hou	c536bd9e73	Pass BranchProbability/BlockMass by value instead of const& as they are small. NFC. llvm-svn: 247357	2015-09-10 23:10:42 +00:00
Chen Li	a29c612ddd	[InstCombineCalls] Use isKnownNonNullAt() to check nullness of passing arguments at callsite Summary: This patch replaces isKnownNonNull() with isKnownNonNullAt() when checking nullness of passing arguments at callsite. In this way it can handle cases where the argument does not have nonnull attribute but has a dominating null check from the CFG. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12779 llvm-svn: 247356	2015-09-10 23:04:49 +00:00
Chen Li	32a51416e5	[InstCombineCalls] Use isKnownNonNullAt() to check nullness of gc.relocate return value Summary: This patch replaces isKnownNonNull() with isKnownNonNullAt() when checking nullness of gc.relocate return value. In this way it can handle cases where the relocated value does not have nonnull attribute but has a dominating null check from the CFG. Reviewers: reames Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D12772 llvm-svn: 247353	2015-09-10 22:35:41 +00:00
Filipe Cabecinhas	48b090a31f	Remove gcc warning when comparing an unsigned var for >= 0 llvm-svn: 247352	2015-09-10 22:34:39 +00:00
Reid Kleckner	da6dcc5d92	[WinEH] Push and pop EBP for 32-bit funclets The Win32 EH runtime caller does not preserve EBP, even though it does preserve the CSRs (EBX, ESI, EDI) for us. The result was that each finally funclet call would leave the frame pointer off by 12 bytes. llvm-svn: 247348	2015-09-10 22:00:02 +00:00
Matt Arsenault	e0b44040aa	AMDGPU: Simplify debug printing llvm-svn: 247345	2015-09-10 21:51:19 +00:00
Matt Arsenault	57116cce19	AMDGPU: Use StringRef value llvm-svn: 247344	2015-09-10 21:51:15 +00:00
James Y Knight	1f3e6af7d0	[SPARC] Switch to the Machine Scheduler. The (mostly-deprecated) SelectionDAG-based ILPListDAGScheduler scheduler was making poor scheduling decisions, causing high register pressure and extraneous register spills. Switching to the newer machine scheduler generates better code -- even without there being a machine model defined for SPARC yet. (Actually committing the test changes too, this time, unlike r247315) llvm-svn: 247343	2015-09-10 21:49:06 +00:00
Reid Kleckner	7bb20bd69e	Fix SEH state numbering algorithm to handle cleanupendpads WinEHPrepare's new coloring algorithm really expects to see cleanupendpads now, so Clang will start emitting them soon. llvm-svn: 247341	2015-09-10 21:46:36 +00:00
Matthew Simpson	29dc0f7075	[LV] Relax Small Size Reduction Type Requirement This patch enables small size reductions in which the source types are smaller than the reduction type (e.g., computing an i16 sum from the values in an i8 array). The previous behavior was to only allow small size reductions if the source types and reduction type were the same. The change accounts for the fact that the existing sign- and zero-extend instructions in these cases should still be included in the cost model. Differential Revision: http://reviews.llvm.org/D12770 llvm-svn: 247337	2015-09-10 21:12:57 +00:00
Lang Hames	21a77ba1f7	[RuntimeDyld] Support non-zero addends for the MachO X86_64 SUBTRACTOR reloc. This functionality was accidentally left out of r247119. llvm-svn: 247336	2015-09-10 21:05:58 +00:00
Lang Hames	79fce4711b	[RuntimeDyld] Fix a bug in debugging output: all sections should be dumped before any relocations have been applied, and again after all relocations have been applied. Previously each section was dumped before and after relocations targetting it were applied, but this only shows the impact of relocations that point to other symbols in the same section. llvm-svn: 247335	2015-09-10 20:44:36 +00:00
Chandler Carruth	2e4ca848f4	Add an explicit 'inline' specifier to these static functions. GCC is warning on them having always_inline attribute for reasons I don't fully understand -- static functions are just as inlinable as inline functions in terms of linkage. llvm-svn: 247334	2015-09-10 20:34:57 +00:00
James Y Knight	221885c7cb	Revert "[SPARC] Switch to the Machine Scheduler." This reverts commit r247315. Accidentally omitted test changes; will resubmit full change shortly. llvm-svn: 247328	2015-09-10 19:42:03 +00:00
David Majnemer	880c2cb097	[IR] Conservatively mark 'catchpad' as accessing memory The exact semantics of 'catchpad' are really in the hands of the personality routine so we shouldn't assume that they have no side effects. llvm-svn: 247322	2015-09-10 18:50:09 +00:00
Kostya Serebryany	65f50868e5	[libFuzzer] refactor the code to allow building libFuzzer on platforms that don't have dfsan and don't support weak functions llvm-svn: 247321	2015-09-10 18:48:38 +00:00
James Y Knight	8a772cfd61	[SPARC] Switch to the Machine Scheduler. The (mostly-deprecated) SelectionDAG-based ILPListDAGScheduler scheduler was making poor scheduling decisions, causing high register pressure and extraneous register spills. Switching to the newer machine scheduler generates better code -- even without there being a machine model defined for SPARC yet. llvm-svn: 247315	2015-09-10 18:20:45 +00:00
Matthew Simpson	ddb4d9741f	[SCEV] Consistently Handle Expressions That Cannot Be Divided This patch addresses the issue of SCEV division asserting on some input expressions (e.g., non-affine expressions) and quietly giving up on others. When giving up, we set the quotient to be equal to zero and the remainder to be equal to the numerator. With this patch, we always quietly give up when we cannot perform the division. This patch also adds a test case for DependenceAnalysis that previously caused an assertion. Differential Revision: http://reviews.llvm.org/D11725 llvm-svn: 247314	2015-09-10 18:12:47 +00:00
JF Bastien	fa946233b4	[MergeFuncs] Fix callsite attributes in thunk generation This change correctly sets the attributes on the callsites generated in thunks. This makes sure things such as sret, sext, etc. are correctly set, so that the call can be a proper tailcall. Also, the transfer of attributes in the replaceDirectCallers function appears to be unnecessary, but until this is confirmed it will remain. Author: jrkoenig Reviewers: dschuff, jfb Subscribers: llvm-commits, nlewycky Differential revision: http://reviews.llvm.org/D12581 llvm-svn: 247313	2015-09-10 18:08:35 +00:00
Philip Reames	053701399d	[SimplifyCFG] Use known bits to eliminate dead switch defaults This is a follow up to http://reviews.llvm.org/D11995 implementing the suggestion by Hans. If we know some of the bits of the value being switched on, we know that the maximum number of unique cases covers the unknown bits. This allows to eliminate switch defaults for large integers (i32) when most bits in the value are known. Note that I had to make the transform contingent on not having any dead cases. This is conservatively correct with the old code, but required for the new code since we might have a dead case which varies one of the known bits. Counting that towards our number of covering cases would be bad. If we do have dead cases, we'll eliminate them first, then revisit the possibly dead default. Differential Revision: http://reviews.llvm.org/D12497 llvm-svn: 247309	2015-09-10 17:44:47 +00:00
Adrian Prantl	d209500fd5	Debug Info: Allow a DIModule to appear as the scope of other entities. llvm-svn: 247304	2015-09-10 17:13:58 +00:00
Kostya Serebryany	a938bcb89a	[libFuzzer] add two more variants of FuzzerDriver for convenience llvm-svn: 247300	2015-09-10 16:57:57 +00:00
Joseph Tremoulet	f3aff31401	[WinEH] Fix single-block cleanup coloring Summary: The coloring code in WinEHPrepare queues cleanuprets' successors with the correct color (the parent one) when it sees their cleanuppad, and so later when iterating successors knows to skip processing cleanuprets since they've already been queued. This latter check was incorrectly under an 'else' condition and so inadvertently was not kicking in for single-block cleanups. This change sinks the check out of the 'else' to fix the bug. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12751 llvm-svn: 247299	2015-09-10 16:51:25 +00:00
Hans Wennborg	aa15bffa1f	Re-commit r247216: "Fix Clang-tidy misc-use-override warnings, other minor fixes" Except the changes that defined virtual destructors as =default, because that ran into problems with GCC 4.7 and overriding methods that weren't noexcept. llvm-svn: 247298	2015-09-10 16:49:58 +00:00
Steven Wu	e3b1f2b765	Fix an undefined behavior introduces in r247234 llvm-svn: 247296	2015-09-10 16:32:28 +00:00
Sanjay Patel	9361d35525	80-cols; NFC llvm-svn: 247295	2015-09-10 16:31:19 +00:00
Sanjay Patel	f4b34b76d4	use range-based for loop; NFCI llvm-svn: 247294	2015-09-10 16:25:38 +00:00
Sanjay Patel	5e7bd91891	use range-based for loop; NFCI llvm-svn: 247293	2015-09-10 16:15:21 +00:00
Sanjay Patel	59661459f1	fix typo; NFC llvm-svn: 247287	2015-09-10 15:14:34 +00:00
Alex Lorenz	0153e59935	Fix PR 24724 - The implicit register verifier shouldn't assume certain operand order. The implicit register verifier in the MIR parser should only check if the instruction's default implicit operands are present in the instruction. It should not check the order in which they occur. llvm-svn: 247283	2015-09-10 14:04:34 +00:00
Igor Breger	7f69a99c54	AVX512: Implemented encoding and intrinsics for vextracti64x4 ,vextracti64x2, vextracti32x8, vextracti32x4, vextractf64x4, vextractf64x2, vextractf32x8, vextractf32x4 Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D11802 llvm-svn: 247276	2015-09-10 12:54:54 +00:00
Jakub Kuderski	58ea4eeb9e	There is a trunc(lshr (zext A), Cst) optimization in InstCombineCasts that removes cast by performing the lshr on smaller types. However, currently there is no trunc(lshr (sext A), Cst) variant. This patch add such optimization by transforming trunc(lshr (sext A), Cst) to ashr A, Cst. Differential Revision: http://reviews.llvm.org/D12520 llvm-svn: 247271	2015-09-10 11:31:20 +00:00
Chandler Carruth	233edd20a7	[ADT] Rewrite the StringRef::find implementation to be simpler, clearer, and tremendously less reliant on the optimizer to fix things. The code is always necessarily looking for the entire length of the string when doing the equality tests in this find implementation, but it previously was needlessly re-checking the size each time among other annoyances. By writing this so simply an ddirectly in terms of memcmp, it also is about 8x faster in a debug build, which in turn makes FileCheck about 2x faster in 'ninja check-llvm'. This saves about 8% of the time for FileCheck-heavy parts of the test suite like the x86 backend tests. llvm-svn: 247269	2015-09-10 11:17:49 +00:00
Silviu Baranga	df9ce8408a	[DAGCombine] Truncate BUILD_VECTOR operators if necessary when constant folding vectors Summary: The BUILD_VECTOR node will truncate its operators to match the type. We need to take this into account when constant folding - we need to perform a truncation before constant folding the elements. This is because the upper bits can change the result, depending on the operation type (for example this is the case for min/max). This change also adds a regression test. Reviewers: jmolloy Subscribers: jmolloy, llvm-commits Differential Revision: http://reviews.llvm.org/D12697 llvm-svn: 247265	2015-09-10 10:34:34 +00:00
James Molloy	d47634d781	Enable GlobalsAA by default This can give significant improvements to alias analysis in some situations, and improves its testing coverage in all situations. llvm-svn: 247264	2015-09-10 10:22:20 +00:00
James Molloy	efbba72cb2	Add GlobalsAA as preserved to a bunch of transforms GlobalsAA must by definition be preserved in function passes, but the passmanager doesn't know that. Make each pass explicitly preserve GlobalsAA. llvm-svn: 247263	2015-09-10 10:22:12 +00:00
James Molloy	8c995a93ce	[ARM] Do not use vtrn for vectorshuffle if the order is reversed The tests in isVTRNMask and isVTRN_v_undef_Mask should also check that the elements of the upper and lower half of the vectorshuffle occur in the correct order when both halves are used. Without this test the code assumes that it is correct to use vector transpose (vtrn) for the masks <1, 1, 0, 0> and <1, 3, 0, 2>, among others, but the transpose actually incorrectly generates shuffles for <0, 0, 1, 1> and <0, 2, 1, 3> in this case. Patch by Jeroen Ketema! llvm-svn: 247254	2015-09-10 08:42:28 +00:00
Chandler Carruth	f054eca167	[ADT] Micro-optimize the Triple constructor by doing a single split and re-using the resulting components rather than repeatedly splitting and re-splitting to compute each component as part of the initializer list. This is more work on PR23676. Sadly, it doesn't help much. It removes the constructor from my profile, but doesn't make a sufficient dent in the total time. But it should play together nicely with subsequent changes. llvm-svn: 247250	2015-09-10 07:51:43 +00:00
Chandler Carruth	4425c91dea	[ADT] Fix a confusing interface spec and some annoying peculiarities with the StringRef::split method when used with a MaxSplit argument other than '-1' (which nobody really does today, but which should actually work). The spec claimed both to split up to MaxSplit times, but also to append <= MaxSplit strings to the vector. One of these doesn't make sense. Given the name "MaxSplit", let's go with it being a max over how many splits occur, which means the max on how many strings get appended is MaxSplit+1. I'm not actually sure the implementation correctly provided this logic either, as it used a really opaque loop structure. The implementation was also playing weird games with nullptr in the data field to try to rely on a totally opaque hidden property of the split method that returns a pair. Nasty IMO. Replace all of this with what is (IMO) simpler code that doesn't use the pair returning split method, and instead just finds each separator and appends directly. I think this is a lot easier to read, and it most definitely matches the spec. Added some tests that exercise the corner cases around StringRef() and StringRef("") that all now pass. I'll start using this in code in the next commit. llvm-svn: 247249	2015-09-10 07:51:37 +00:00
NAKAMURA Takumi	1a296ec6d1	GlobalsAAResult(&&): Move every members. Or, one of MSVC builders failed with unexpected behavior. llvm-svn: 247247	2015-09-10 07:16:42 +00:00
Chandler Carruth	e4405e949f	[ADT] Switch a bunch of places in LLVM that were doing single-character splits to actually use the single character split routine which does less work, and in a debug build is substantially faster. llvm-svn: 247245	2015-09-10 06:12:31 +00:00
Chandler Carruth	477121721b	[ADT] Add a single-character version of the small vector split routine on StringRef. Finding and splitting on a single character is substantially faster than doing it on even a single character StringRef -- we immediately get to a very tuned memchr call this way. Even nicer, we get to this even in a debug build, shaving 18% off the runtime of TripleTest.Normalization, helping PR23676 some more. llvm-svn: 247244	2015-09-10 06:07:03 +00:00
Sanjoy Das	f3132d3b03	[ScalarEvolution] Fix PR24757. Summary: PR24757 was caused by some incorect math in `ScalarEvolution::HowFarToZero` -- the smallest unsigned solution for X in 2^N * A = 2^N * X is not necessarily A. Reviewers: atrick, majnemer, meheff Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D12721 llvm-svn: 247242	2015-09-10 05:27:38 +00:00
Chandler Carruth	87275186d1	[LPM] Simplify this code and fix a compile error for compilers that don't correctly implement the scoping rules of C++11 range based for loops. This kind of aliasing isn't a good idea anyways (and wasn't really intended). llvm-svn: 247241	2015-09-10 04:22:36 +00:00
Chandler Carruth	b1e3a9ae8d	[LPM] Use a map from analysis ID to immutable passes in the legacy pass manager to avoid a slow linear scan of every immutable pass and on every attempt to find an analysis pass. This speeds up 'check-llvm' on an unoptimized build for me by 15%, YMMV. It should also help (a tiny bit) other folks that are really bottlenecked on repeated runs of tiny pass pipelines across small IR files. llvm-svn: 247240	2015-09-10 02:31:42 +00:00
Kit Barton	d3b904d440	Enable the shrink wrapping optimization for PPC64. The changes in this patch are as follows: 1. Modify the emitPrologue and emitEpilogue methods to work properly when the prologue and epilogue blocks are not the first/last blocks in the function 2. Fix a bug in PPCEarlyReturn optimization caused by an empty entry block in the function 3. Override the runShrinkWrap PredicateFtor (defined in TargetMachine) to check whether shrink wrapping should run: Shrink wrapping will run on PPC64 (Little Endian and Big Endian) unless -enable-shrink-wrap=false is specified on command line A new test case, ppc-shrink-wrapping.ll was created based on the existing shrink wrapping tests for x86, arm, and arm64. Phabricator review: http://reviews.llvm.org/D11817 llvm-svn: 247237	2015-09-10 01:55:44 +00:00
Ahmed Bougacha	05541459fa	[AArch64] Match FI+offset in STNP addressing mode. First, we need to teach isFrameOffsetLegal about STNP. It already knew about the STP/LDP variants, but those were probably never exercised, because it's only the load/store optimizer that generates STP/LDP, and the only user of the method is frame lowering, which runs earlier. The STP/LDP cases were wrong: they didn't take into account the fact that they return two results, not one, so the immediate offset will be the 4th operand, not the 3rd. Follow-up to r247234. llvm-svn: 247236	2015-09-10 01:54:43 +00:00
Ahmed Bougacha	c0ac38d584	[AArch64] Match base+offset in STNP addressing mode. Followup to r247231. llvm-svn: 247234	2015-09-10 01:48:29 +00:00
Ahmed Bougacha	b8886b517d	[AArch64] Support selecting STNP. We could go through the load/store optimizer and match STNP where we would have matched a nontemporal-annotated STP, but that's not reliable enough, as an opportunistic optimization. Insetad, we can guarantee emitting STNP, by matching them at ISel. Since there are no single-input nontemporal stores, we have to resort to some high-bits-extracting trickery to generate an STNP from a plain store. Also, we need to support another, LDP/STP-specific addressing mode, base + signed scaled 7-bit immediate offset. For now, only match the base. Let's make it smart separately. Part of PR24086. llvm-svn: 247231	2015-09-10 01:42:28 +00:00
Matt Arsenault	80f766a032	AMDGPU/SI: Fix more cases of losing exec operands llvm-svn: 247230	2015-09-10 01:23:28 +00:00
Matt Arsenault	ad46e0c1ab	AMDGPU/SI: Fix creating v_mov_b32s without exec uses This will be caught by existing tests with a verifier check to be added in a future commit. llvm-svn: 247229	2015-09-10 01:06:06 +00:00
Hans Wennborg	d2799a963f	Revert r247216: "Fix Clang-tidy misc-use-override warnings, other minor fixes" This caused build breakges, e.g. http://lab.llvm.org:8011/builders/clang-x86_64-ubuntu-gdb-75/builds/24926 llvm-svn: 247226	2015-09-10 00:57:26 +00:00
Ahmed Bougacha	37bffd83f0	[CodeGen] Make x86 nontemporal store patfrags generic. NFC. To be used by other targets. llvm-svn: 247225	2015-09-10 00:53:15 +00:00
Philip Reames	953817b65d	[RewriteStatepointsForGC] Minor refactor to use shared implementation [NFC] llvm-svn: 247223	2015-09-10 00:44:10 +00:00
Philip Reames	b4e55f3923	[RewriteStatepointsForGC] Strengthen a confusingly weak assertion [NFC] The assertion was weaker than it should be and gave the impression we're growing the number of base defining values being considered during the fixed point interation. That's not true. The tighter form of the assert is useful documentation. llvm-svn: 247221	2015-09-10 00:32:56 +00:00
Philip Reames	c8ded462c4	[RewriteStatepointsForGC] One last bit of naming [NFCI] llvm-svn: 247220	2015-09-10 00:27:50 +00:00
Reid Kleckner	7878391208	[WinEH] Add codegen support for cleanuppad and cleanupret All of the complexity is in cleanupret, and it mostly follows the same codepaths as catchret, except it doesn't take a return value in RAX. This small example now compiles and executes successfully on win32: extern "C" int printf(const char *, ...) noexcept; struct Dtor { ~Dtor() { printf("~Dtor\n"); } }; void has_cleanup() { Dtor o; throw 42; } int main() { try { has_cleanup(); } catch (int) { printf("caught it\n"); } } Don't try to put the cleanup in the same function as the catch, or Bad Things will happen. llvm-svn: 247219	2015-09-10 00:25:23 +00:00
Philip Reames	34d7a7493d	[RewriteStatepointsForGC] Further style/naming fixup [NFCI] llvm-svn: 247217	2015-09-10 00:22:49 +00:00
Hans Wennborg	6fa09455ed	Fix Clang-tidy misc-use-override warnings, other minor fixes Patch by Eugene Zelenko! Differential Revision: http://reviews.llvm.org/D12740 llvm-svn: 247216	2015-09-10 00:12:56 +00:00
Philip Reames	7540e3a45d	[RewriteStatepointsForGC] More naming cleanup [NFCI] llvm-svn: 247213	2015-09-10 00:01:53 +00:00
Philip Reames	ece70b8042	[RewriteStatepointsForGC] Code cleanup [NFC] Factor out common code related to naming values, fix a small style issue. More to follow in separate changes. llvm-svn: 247211	2015-09-09 23:57:18 +00:00
Philip Reames	6628713f4f	[RewriteStatepointsForGC] Extend base pointer inference to handle insertelement This change is simply enhancing the existing inference algorithm to handle insertelement instructions by conservatively inserting a new instruction to propagate the vector of associated base pointers. In the process, I'm ripping out the peephole optimizations which mostly helped cover the fact this hadn't been done. Note that most of the newly inserted nodes will be nearly immediately removed by the post insertion optimization pass introduced in 246718. Arguably, we should be trying harder to avoid the malloc traffic here, but I'd rather get the code correct, then worry about compile time. Unlike previous extensions of the algorithm to handle more case, I discovered the existing code was causing miscompiles in some cases. In particular, we had an implicit assumption that the peephole covered all insert element instructions, so if we had a value directly based on a insert element the peephole didn't cover, we proceeded as if it were a base anyways. Not good. I believe we had the same issue with shufflevector which is why I adjusted the predicate for them as well. Differential Revision: http://reviews.llvm.org/D12583 llvm-svn: 247210	2015-09-09 23:40:12 +00:00
Philip Reames	15d5563cea	[RewriteStatepointsForGC] Make base pointer inference deterministic Previously, the base pointer algorithm wasn't deterministic. The core fixed point was (of course), but we were inserting new nodes and optimizing them in an order which was unspecified and variable. We'd somewhat hacked around this for testing by sorting by value name, but that doesn't solve the general determinism problem. Instead, we can use the order of traversal over the def/use graph to give us a single consistent ordering. Today, this is a DFS order, but the exact order doesn't mater provided it's deterministic for a given input. (Q: It is safe to rely on a deterministic order of operands right?) Note that this only fixes the determinism within a single inference step. The inference step is currently invoked many times in a non-deterministic order. That's a future change in the sequence. :) Differential Revision: http://reviews.llvm.org/D12640 llvm-svn: 247208	2015-09-09 23:26:08 +00:00
Peter Collingbourne	1cbc91eccf	LowerBitSets: Fix non-determinism bug. Visit disjoint sets in a deterministic order based on the maximum BitSetNM index, otherwise the order in which we visit them will depend on pointer comparisons. This was being exposed by MSan. llvm-svn: 247201	2015-09-09 22:30:32 +00:00
Reid Kleckner	94b704c469	[SEH] Emit 32-bit SEH tables for the new EH IR The 32-bit tables don't actually contain PC range data, so emitting them is incredibly simple. The 64-bit tables, on the other hand, use the same table for state numbering as well as label ranges. This makes things more difficult, so it will be implemented later. llvm-svn: 247192	2015-09-09 21:10:03 +00:00
Piotr Padlewski	0dde00d239	ScalarEvolution assume hanging bugfix http://reviews.llvm.org/D12719 llvm-svn: 247184	2015-09-09 20:47:30 +00:00
David Majnemer	d34dbf07bd	Revert trunc(lshr (sext A), Cst) to ashr A, Cst This reverts commit r246997, it introduced a regression (PR24763). llvm-svn: 247180	2015-09-09 20:20:08 +00:00
Renato Golin	db7ea86bf4	Revert "AVX512: Implemented encoding and intrinsics for vextracti64x4 ,vextracti64x2, vextracti32x8, vextracti32x4, vextractf64x4, vextractf64x2, vextractf32x8, vextractf32x4 Added tests for intrinsics and encoding." This reverts commit r247149, as it was breaking numerous buildbots of varied architectures. llvm-svn: 247177	2015-09-09 19:44:40 +00:00
Matthias Braun	d9da162789	Save LaneMask with livein registers With subregister liveness enabled we can detect the case where only parts of a register are live in, this is expressed as a 32bit lanemask. The current code only keeps registers in the live-in list and therefore enumerated all subregisters affected by the lanemask. This turned out to be too conservative as the subregister may also cover additional parts of the lanemask which are not live. Expressing a given lanemask by enumerating a minimum set of subregisters is computationally expensive so the best solution is to simply change the live-in list to store the lanemasks as well. This will reduce memory usage for targets using subregister liveness and slightly increase it for other targets Differential Revision: http://reviews.llvm.org/D12442 llvm-svn: 247171	2015-09-09 18:08:03 +00:00
Matthias Braun	cc58005885	VirtRegMap: Improve addMBBLiveIns() using SlotIndex::MBBIndexIterator; NFC Now that we have an explicit iterator over the idx2MBBMap in SlotIndices we can use the fact that segments and the idx2MBBMap is sorted by SlotIndex position so can advance both simultaneously instead of starting from the beginning for each segment. This complicates the code for the subregister case somewhat but should be more efficient and has the advantage that we get the final lanemask for each block immediately which will be important for a subsequent change. Removes the now unused SlotIndexes::findMBBLiveIns function. Differential Revision: http://reviews.llvm.org/D12443 llvm-svn: 247170	2015-09-09 18:07:54 +00:00
Chandler Carruth	7b560d40bd	[PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible with the new pass manager, and no longer relying on analysis groups. This builds essentially a ground-up new AA infrastructure stack for LLVM. The core ideas are the same that are used throughout the new pass manager: type erased polymorphism and direct composition. The design is as follows: - FunctionAAResults is a type-erasing alias analysis results aggregation interface to walk a single query across a range of results from different alias analyses. Currently this is function-specific as we always assume that aliasing queries are within a function. - AAResultBase is a CRTP utility providing stub implementations of various parts of the alias analysis result concept, notably in several cases in terms of other more general parts of the interface. This can be used to implement only a narrow part of the interface rather than the entire interface. This isn't really ideal, this logic should be hoisted into FunctionAAResults as currently it will cause a significant amount of redundant work, but it faithfully models the behavior of the prior infrastructure. - All the alias analysis passes are ported to be wrapper passes for the legacy PM and new-style analysis passes for the new PM with a shared result object. In some cases (most notably CFL), this is an extremely naive approach that we should revisit when we can specialize for the new pass manager. - BasicAA has been restructured to reflect that it is much more fundamentally a function analysis because it uses dominator trees and loop info that need to be constructed for each function. All of the references to getting alias analysis results have been updated to use the new aggregation interface. All the preservation and other pass management code has been updated accordingly. The way the FunctionAAResultsWrapperPass works is to detect the available alias analyses when run, and add them to the results object. This means that we should be able to continue to respect when various passes are added to the pipeline, for example adding CFL or adding TBAA passes should just cause their results to be available and to get folded into this. The exception to this rule is BasicAA which really needs to be a function pass due to using dominator trees and loop info. As a consequence, the FunctionAAResultsWrapperPass directly depends on BasicAA and always includes it in the aggregation. This has significant implications for preserving analyses. Generally, most passes shouldn't bother preserving FunctionAAResultsWrapperPass because rebuilding the results just updates the set of known AA passes. The exception to this rule are LoopPass instances which need to preserve all the function analyses that the loop pass manager will end up needing. This means preserving both BasicAAWrapperPass and the aggregating FunctionAAResultsWrapperPass. Now, when preserving an alias analysis, you do so by directly preserving that analysis. This is only necessary for non-immutable-pass-provided alias analyses though, and there are only three of interest: BasicAA, GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is preserved when needed because it (like DominatorTree and LoopInfo) is marked as a CFG-only pass. I've expanded GlobalsAA into the preserved set everywhere we previously were preserving all of AliasAnalysis, and I've added SCEVAA in the intersection of that with where we preserve SCEV itself. One significant challenge to all of this is that the CGSCC passes were actually using the alias analysis implementations by taking advantage of a pretty amazing set of loop holes in the old pass manager's analysis management code which allowed analysis groups to slide through in many cases. Moving away from analysis groups makes this problem much more obvious. To fix it, I've leveraged the flexibility the design of the new PM components provides to just directly construct the relevant alias analyses for the relevant functions in the IPO passes that need them. This is a bit hacky, but should go away with the new pass manager, and is already in many ways cleaner than the prior state. Another significant challenge is that various facilities of the old alias analysis infrastructure just don't fit any more. The most significant of these is the alias analysis 'counter' pass. That pass relied on the ability to snoop on AA queries at different points in the analysis group chain. Instead, I'm planning to build printing functionality directly into the aggregation layer. I've not included that in this patch merely to keep it smaller. Note that all of this needs a nearly complete rewrite of the AA documentation. I'm planning to do that, but I'd like to make sure the new design settles, and to flesh out a bit more of what it looks like in the new pass manager first. Differential Revision: http://reviews.llvm.org/D12080 llvm-svn: 247167	2015-09-09 17:55:00 +00:00
Matthias Braun	80595460d8	MachineVerifier: Check that SlotIndex MBBIndexList is sorted. This introduces a check that the MBBIndexList is sorted as proposed in http://reviews.llvm.org/D12443 but split up into a separate commit. llvm-svn: 247166	2015-09-09 17:49:46 +00:00
Matt Arsenault	ef67d76869	AMDGPU: Extract full 64-bit subregister and use subregs Instead of extracting both 32-bit components from the 128-bit register. This produces fewer copies and is easier for the copy peephole optimizer to understand and see the actual uses as extracts from a reg_sequence. This avoids needing to handle subregister composing in the PeepholeOptimizer's ValueTracker for this case. llvm-svn: 247162	2015-09-09 17:03:29 +00:00
Matt Arsenault	b5541fb098	AMDGPU: Remove unused multiclass argument llvm-svn: 247161	2015-09-09 17:03:18 +00:00
Dan Gohman	f71abef701	[WebAssembly] Implement calls with void return types. llvm-svn: 247158	2015-09-09 16:13:47 +00:00
Tom Stellard	9a197676b1	AMDGPU/SI: Fold operands through REG_SEQUENCE instructions Summary: This helps mostly when we use add instructions for address calculations that contain immediates. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D12256 llvm-svn: 247157	2015-09-09 15:43:26 +00:00
Silviu Baranga	a3e27edb5d	[CostModel][AArch64] Remove amortization factor for some of the vector select instructions Summary: We are not scalarizing the wide selects in codegen for i16 and i32 and therefore we can remove the amortization factor. We still have issues with i64 vectors in codegen though. Reviewers: mcrosier Subscribers: mcrosier, aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D12724 llvm-svn: 247156	2015-09-09 15:35:02 +00:00
Sanjay Patel	6eccf487c9	don't repeat function names in comments; NFC llvm-svn: 247154	2015-09-09 15:24:36 +00:00
Dan Gohman	1ce7ba5fe0	[WebAssembly] Tidy up some unneeded newline characters. llvm-svn: 247152	2015-09-09 15:13:36 +00:00
Sanjay Patel	e283441836	function names start with a lower case letter; NFC llvm-svn: 247150	2015-09-09 14:54:29 +00:00
Igor Breger	ac29a82921	AVX512: Implemented encoding and intrinsics for vextracti64x4 ,vextracti64x2, vextracti32x8, vextracti32x4, vextractf64x4, vextractf64x2, vextractf32x8, vextractf32x4 Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D11802 llvm-svn: 247149	2015-09-09 14:35:09 +00:00
Sanjay Patel	2fbab9d893	don't repeat function names in comments; NFC llvm-svn: 247148	2015-09-09 14:34:26 +00:00
Zoran Jovanovic	6b28f09d67	[mips][microMIPS] Implement ADDU16, AND16, ANDI16, NOT16, OR16, SLL16 and SRL16 instructions Differential Revision: http://reviews.llvm.org/D11178 llvm-svn: 247146	2015-09-09 13:55:45 +00:00
Alex Lorenz	b9a68dbcae	Fix PR 24633 - Handle undef values when parsing standalone constants. llvm-svn: 247145	2015-09-09 13:44:33 +00:00
James Molloy	520838977b	Rename ExitCount to BackedgeTakenCount, because that's what it is. We called a variable ExitCount, stored the backedge count in it, then redefined it to be the exit count again. llvm-svn: 247140	2015-09-09 12:51:10 +00:00
James Molloy	89eccee4db	Delay predication of stores until near the end of vector code generation Predicating stores requires creating extra blocks. It's much cleaner if we do this in one pass instead of mutating the CFG while writing vector instructions. Besides which we can make use of helper functions to update domtree for us, reducing the work we need to do. llvm-svn: 247139	2015-09-09 12:51:06 +00:00
Daniel Sanders	2038747fce	Fix vector splitting for extract_vector_elt and vector elements of <8-bits. Summary: One of the vector splitting paths for extract_vector_elt tries to lower: define i1 @via_stack_bug(i8 signext %idx) { %1 = extractelement <2 x i1> <i1 false, i1 true>, i8 %idx ret i1 %1 } to: define i1 @via_stack_bug(i8 signext %idx) { %base = alloca <2 x i1> store <2 x i1> <i1 false, i1 true>, <2 x i1>* %base %2 = getelementptr <2 x i1>, <2 x i1>* %base, i32 %idx %3 = load i1, i1* %2 ret i1 %3 } However, the elements of <2 x i1> are not byte-addressible. The result of this is that the getelementptr expands to '%base + %idx * (1 / 8)' which simplifies to '%base + %idx * 0', and then simply '%base' causing all values of %idx to extract element zero. This commit fixes this by promoting the vector elements of <8-bits to i8 before splitting the vector. This fixes a number of test failures in pocl. Reviewers: pekka.jaaskelainen Subscribers: pekka.jaaskelainen, llvm-commits Differential Revision: http://reviews.llvm.org/D12591 llvm-svn: 247128	2015-09-09 09:53:20 +00:00
Chandler Carruth	1688a772fc	Fix a typo I spotted when hacking on SROA. Somewhat alarming that nothing broke. llvm-svn: 247127	2015-09-09 09:46:16 +00:00
Zoran Jovanovic	d9790793d6	[mips][microMIPS] Implement CACHEE and PREFE instructions Differential Revision: http://reviews.llvm.org/D11628 llvm-svn: 247125	2015-09-09 09:10:46 +00:00
Matt Arsenault	d768737454	AMDGPU: Fix not encoding src2 of VOP3b instructions Broken by r247074. Should include an assembler test, but the assembler is currently broken for VOP3b apparently. llvm-svn: 247123	2015-09-09 08:39:49 +00:00
Sanjoy Das	da0d79e0a0	[IRCE] Add INITIALIZE_PASS_DEPENDENCY invocations. IRCE was just using INITIALIZE_PASS(), which is incorrect. llvm-svn: 247122	2015-09-09 03:47:18 +00:00
Lang Hames	856e4767ff	[RuntimeDyld] Add support for MachO x86_64 SUBTRACTOR relocation. llvm-svn: 247119	2015-09-09 03:14:29 +00:00
Dan Gohman	e590b33bf8	[WebAssembly] Fix lowering of calls with more than one argument. llvm-svn: 247118	2015-09-09 01:52:45 +00:00
Matt Arsenault	acd68b58ae	SelectionDAG: Support Expand of f16 extloads Currently this hits an assert that extload should always be supported, which assumes integer extloads. This moves a hack out of SI's argument lowering and is covered by existing tests. llvm-svn: 247113	2015-09-09 01:12:27 +00:00
Dan Gohman	4f52e00ecb	[WebAssembly] Implement WebAssemblyInstrInfo::copyPhysReg llvm-svn: 247110	2015-09-09 00:52:47 +00:00
Matt Arsenault	3099156261	Fix typos / grammar llvm-svn: 247109	2015-09-09 00:38:33 +00:00
Reid Kleckner	51189f0a1d	[WinEH] Avoid creating MBBs for LLVM BBs that cannot contain code Typically these are catchpads, which hold data used to decide whether to catch the exception or continue unwinding. We also shouldn't create MBBs for catchendpads, cleanupendpads, or terminatepads, since no real code can live in them. This fixes a problem where MI passes (like the register allocator) would try to put code into catchpad blocks, which are not executed by the runtime. In the new world, blocks ending in invokes now have many possible successors. llvm-svn: 247102	2015-09-08 23:28:38 +00:00
Peter Collingbourne	8d24ae9441	Re-apply r247080 with order of evaluation fix. llvm-svn: 247095	2015-09-08 22:49:35 +00:00
Reid Kleckner	df1295173f	[WinEH] Emit prologues and epilogues for funclets Summary: 32-bit funclets have short prologues that allocate enough stack for the largest call in the whole function. The runtime saves CSRs for the funclet. It doesn't restore CSRs after we finally transfer control back to the parent funciton via a CATCHRET, but that's a separate issue. 32-bit funclets also have to adjust the incoming EBP value, which is what llvm.x86.seh.recoverframe does in the old model. 64-bit funclets need to spill CSRs as normal. For simplicity, this just spills the same set of CSRs as the parent function, rather than trying to compute different CSR sets for the parent function and each funclet. 64-bit funclets also allocate enough stack space for the largest outgoing call frame, like 32-bit. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12546 llvm-svn: 247092	2015-09-08 22:44:41 +00:00
Peter Collingbourne	07f3af2e82	Revert r247080, "LowerBitSets: Extend pass to support functions as bitset members." as it causes test failures on a number of bots. llvm-svn: 247088	2015-09-08 22:33:23 +00:00
Eric Christopher	71f6e2f568	Fix the PPC CTR Loop pass to look for calls to the intrinsics that read CTR and count them as reading the CTR. llvm-svn: 247083	2015-09-08 22:14:58 +00:00
Peter Collingbourne	c634ed0b1a	LowerBitSets: Extend pass to support functions as bitset members. This change extends the bitset lowering pass to support bitsets that may contain either functions or global variables. A function bitset is lowered to a jump table that is laid out before one of the functions in the bitset. Also add support for non-string bitset identifier names. This allows for distinct metadata nodes to stand in for names with internal linkage, as done in D11857. Differential Revision: http://reviews.llvm.org/D11856 llvm-svn: 247080	2015-09-08 21:57:45 +00:00
Ivan Krasin	a610cb5ba0	[libFuzzer]Add a test for defeating a hash sum. Summary: Add a test for a data followed by 4-byte hash value. I use a slightly modified Jenkins hash function, as described in https://en.wikipedia.org/wiki/Jenkins_hash_function The modification is to ensure that hash(zeros) != 0. Reviewers: kcc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12648 llvm-svn: 247076	2015-09-08 21:22:52 +00:00
Matt Arsenault	86d336e91b	AMDGPU/SI: Fix input vcc operand for VOP2b instructions Adds vcc to output string input for e32. Allows option of using e64 encoding with assembler. Also fixes these instructions not implicitly reading exec. llvm-svn: 247074	2015-09-08 21:15:00 +00:00
Artem Belevich	0127d80986	[NVPTX] Added run NVVMReflect pass to NVPTX back-end. The pass is needed to remove __nvvm_reflect calls when we link in libdevice bitcode that comes with CUDA. Differential Revision: http://reviews.llvm.org/D11663 llvm-svn: 247072	2015-09-08 21:04:55 +00:00
Derek Schuff	eef533f422	x32. Fixes a bug in how struct va_list is initialized in x32 Summary: This patch modifies X86TargetLowering::LowerVASTART so that struct va_list is initialized with 32 bit pointers in x32. It also includes tests that call @llvm.va_start() for x32. Patch by João Porto Subscribers: llvm-commits, hjl.tools Differential Revision: http://reviews.llvm.org/D12346 llvm-svn: 247069	2015-09-08 20:51:31 +00:00
Kostya Serebryany	4b82de2e47	[libFuzzer] remove a piece of stale code llvm-svn: 247067	2015-09-08 20:40:10 +00:00
Kostya Serebryany	9cdea94f66	[libFuzzer] be more robust when dealing with files on disk (e.g. don't crash if a file was there but disappeared) llvm-svn: 247066	2015-09-08 20:36:33 +00:00
Dan Gohman	e32c57443f	[WebAssembly] Support running without a register allocator in the default CodeGen passes This allows backends which don't use a traditional register allocator, but do need PHI lowering and other passes, to use the default TargetPassConfig::addFastRegAlloc and TargetPassConfig::addOptimizedRegAlloc implementations. Differential Revision: http://reviews.llvm.org/D12691 llvm-svn: 247065	2015-09-08 20:36:33 +00:00
Sanjay Patel	b54e62fe17	refactor matches for De Morgan's Laws; NFCI llvm-svn: 247061	2015-09-08 20:14:13 +00:00
Matt Arsenault	8ac35cd031	AMDGPU: Mark s_barrier as a high latency instruction These were marked as WriteSALU, which is low latency. I'm guessing at the value to use, but it should probably be considered the highest latency instruction. I'm not sure this has any actual effect since hasSideEffects probably is preventing any moving of these. llvm-svn: 247060	2015-09-08 19:54:32 +00:00
Matt Arsenault	8fb810a1d2	AMDGPU: Fix s_barrier flags This should be convergent. This is not a barrier in the isBarrier sense, nor hasCtrlDep. llvm-svn: 247059	2015-09-08 19:54:25 +00:00
Derek Schuff	ee4e947e23	x32. Fixes a bug in i8mem_NOREX declaration. The old implementation assumed LP64 which is broken for x32. Specifically, the MOVE8rm_NOREX and MOVE8mr_NOREX, when selected, would cause a 'Cannot emit physreg copy instruction' error message to be reported. This patch also enable the h-register*ll tests for x32. Differential Revision: http://reviews.llvm.org/D12336 Patch by João Porto llvm-svn: 247058	2015-09-08 19:47:15 +00:00
Matt Arsenault	966a94f861	AMDGPU: Handle sub of constant for DS offset folding sub C, x - > add (sub 0, x), C for DS offsets. This is mostly to fix regressions that show up when SeparateConstOffsetFromGEP is enabled. llvm-svn: 247054	2015-09-08 19:34:22 +00:00
Diego Novillo	f9aa39b0cf	Fix PR 24723 - Handle 0-mass backedges in irreducible loops This corner case happens when we have an irreducible SCC that is deeply nested. As we work down the tree, the backedge masses start getting smaller and smaller until we reach one that is down to 0. Since we distribute the incoming mass using the backedge masses as weight, the distributor does not allow zero weights. So, we simply ignore them (which will just use the weights of the non-zero nodes). llvm-svn: 247050	2015-09-08 19:22:17 +00:00
Davide Italiano	cb2da7166a	[MC/ELF] Accept zero for .align directive .align directive refuses alignment 0 -- a comment in the code hints this is done for GNU as compatibility, but it seems GNU as accepts .align 0 (and silently rounds up alignment to 1). Differential Revision: http://reviews.llvm.org/D12682 llvm-svn: 247048	2015-09-08 18:59:47 +00:00
David Blaikie	12dd3c4ebb	Fix CPP Backend for GEP API changes for opaque pointer types Based on a patch by Jerome Witmann. llvm-svn: 247047	2015-09-08 18:42:29 +00:00
Sanjay Patel	1854927556	remove function names from comments; NFC llvm-svn: 247043	2015-09-08 18:24:36 +00:00
Andrew Kaylor	e2ea93c6c0	Fix for bz24500: Avoid non-deterministic code generation triggered by the x86 call frame optimization Patch by Dave Kreitzer Differential Revision: http://reviews.llvm.org/D12620 llvm-svn: 247042	2015-09-08 18:18:46 +00:00
Kostya Serebryany	b06fae5ede	[libFuzzer] better documentatio for -save_minimized_corpus=1 llvm-svn: 247033	2015-09-08 17:43:51 +00:00
Kostya Serebryany	468ed78434	[libFuzzer] remove -iterations as redundant (there is also -num_runs) llvm-svn: 247030	2015-09-08 17:30:35 +00:00
JF Bastien	749ed88aa5	WebAssembly: NFC rename shr/sar Renamed from: https://github.com/WebAssembly/design/pull/332 llvm-svn: 247028	2015-09-08 17:21:21 +00:00
Kostya Serebryany	25425ad920	[libFuzzer] add one more mutator: Mutate_ChangeASCIIInteger llvm-svn: 247027	2015-09-08 17:19:31 +00:00
Jun Bum Lim	4d3c5986f2	Remove white space (test commit) llvm-svn: 247021	2015-09-08 16:11:22 +00:00
Zoran Jovanovic	2da1437d62	[mips][microMIPS] Implement LLE, LUI, LW and LWE instructions Differential Revision: http://reviews.llvm.org/D1179 llvm-svn: 247017	2015-09-08 15:02:50 +00:00
Igor Breger	a54a1a84dd	AVX512: kunpck encoding implementation Added tests for encoding. Differential Revision: http://reviews.llvm.org/D12061 llvm-svn: 247010	2015-09-08 13:10:00 +00:00
Dan Gohman	25d2a0dda4	[WebAssembly] Enable SSA lowering and other pre-regalloc passes llvm-svn: 247008	2015-09-08 12:39:25 +00:00
Elena Demikhovsky	ddf715ef77	Removed an old comment, NFC llvm-svn: 247006	2015-09-08 12:22:22 +00:00
Zoran Jovanovic	9eaa30d2bf	[mips][microMIPS] Implement SB, SBE, SCE, SH and SHE instructions Differential Revision: http://reviews.llvm.org/D11801 llvm-svn: 246999	2015-09-08 10:18:38 +00:00
Jakub Kuderski	7cd4810021	There is a trunc(lshr (zext A), Cst) optimization in InstCombineCasts that removes cast by performing the lshr on smaller types. However, currently there is no trunc(lshr (sext A), Cst) variant. This patch add such optimization by transforming trunc(lshr (sext A), Cst) to ashr A, Cst. Differential Revision: http://reviews.llvm.org/D12520 llvm-svn: 246997	2015-09-08 10:03:17 +00:00
Daniel Sanders	808dfb8ba7	[mips] Reserve address spaces 1-255 for software use. Summary: And define them to have noop casts with address spaces 0-255. Reviewers: pekka.jaaskelainen Subscribers: pekka.jaaskelainen, llvm-commits Differential Revision: http://reviews.llvm.org/D12678 llvm-svn: 246990	2015-09-08 09:07:03 +00:00
Zoran Jovanovic	68be5f21a9	[mips][microMIPS] Add microMIPS32r6 and microMIPS64r6 tests for existing 16-bit LBU16, LHU16, LW16, LWGP and LWSP instructions Differential Revision: http://reviews.llvm.org/D10956 llvm-svn: 246987	2015-09-08 08:25:34 +00:00
Elena Demikhovsky	dec0f0885f	compilation issue, NFC llvm-svn: 246983	2015-09-08 07:34:06 +00:00
Elena Demikhovsky	d240d778b3	fixed compilation issue, NFC. llvm-svn: 246982	2015-09-08 07:10:08 +00:00
Elena Demikhovsky	e88038f235	AVX-512: Lowering for 512-bit vector shuffles. Vector types: <8 x 64>, <16 x 32>, <32 x 16> float and integer. Differential Revision: http://reviews.llvm.org/D10683 llvm-svn: 246981	2015-09-08 06:38:21 +00:00
Zoran Jovanovic	7b85682541	[mips][microMIPS] Implement ABS.fmt, CEIL.L.fmt, CEIL.W.fmt, FLOOR.L.fmt, FLOOR.W.fmt, TRUNC.L.fmt, TRUNC.W.fmt, RSQRT.fmt and SQRT.fmt instructions Differential Revision: http://reviews.llvm.org/D11674 llvm-svn: 246968	2015-09-07 13:01:04 +00:00
Zoran Jovanovic	ada7091812	[mips][microMIPS] Implement BC16, BEQZC16 and BNEZC16 instructions Differential Revision: http://reviews.llvm.org/D11181 llvm-svn: 246963	2015-09-07 11:56:37 +00:00
John Brawn	d8b405abf7	[ARM] Get rid of SelectT2ShifterOperandReg, NFC SelectT2ShifterOperandReg has identical behaviour to SelectImmShifterOperand, so get rid of it and use SelectImmShifterOperand instead. Differential Revision: http://reviews.llvm.org/D12195 llvm-svn: 246962	2015-09-07 11:45:18 +00:00
Zoran Jovanovic	14f308e44f	[mips][microMIPS] Implement CVT.D.fmt, CVT.L.fmt, CVT.S.fmt, CVT.W.fmt, MAX.fmt, MIN.fmt, MAXA.fmt, MINA.fmt and CMP.condn.fmt instructions Differential Revision: http://reviews.llvm.org/D12141 llvm-svn: 246960	2015-09-07 10:31:31 +00:00
NAKAMURA Takumi	0d72539d5a	Prune utf8 chars in comments. llvm-svn: 246953	2015-09-07 00:26:54 +00:00
David Majnemer	135ca40a7d	[InstCombine] Don't divide by zero when evaluating a potential transform Trivial multiplication by zero may survive the worklist. We tried to reassociate the multiplication with a division instruction, causing us to divide by zero; bail out instead. This fixes PR24726. llvm-svn: 246939	2015-09-06 06:49:59 +00:00
Hal Finkel	10aac5fd0e	[SelectionDAG] Swap commutative binops before constant-based folding In searching for a fix for the underlying code-quality bug highlighted by r246937 (that SDAG simplification can lead to us generating an ISD::OR node with a constant zero LHS), I ran across this: We generically canonicalize commutative binary-operation nodes in SDAG getNode so that, if only one operand is a constant, it will be on the RHS. However, we were doing this only after a bunch of constant-based simplification checks that all assume this canonical form (that any constant will be on the RHS). Moving the operand-swapping canonicalization prior to these checks seems like the right thing to do (and, as it turns out, causes SDAG to completely fold away the computation in test/CodeGen/ARM/2012-11-14-subs_carry.ll, just like InstCombine would do). llvm-svn: 246938	2015-09-06 05:42:13 +00:00
Hal Finkel	ccf9259c00	[PowerPC] Don't commute trivial rlwimi instructions To commute a trivial rlwimi instructions (meaning one with a full mask and zero shift), we'd need to ability to form an all-zero mask (instead of an all-one mask) using rlwimi. We can't represent this, however, and we'll miscompile code if we try. The code quality problem that this highlights (that SDAG simplification can lead to us generating an ISD::OR node with a constant zero LHS) will be fixed as a follow-up. Fixes PR24719. llvm-svn: 246937	2015-09-06 04:17:30 +00:00
David Majnemer	daa24b9789	[InstCombine] Don't assume m_Mul gives back an Instruction This fixes PR24713. llvm-svn: 246933	2015-09-05 20:44:56 +00:00
Alexandros Lamprineas	ea33e5e88e	Added arch extensions and default target features in TargetParser. Differential: http://reviews.llvm.org/D11590 llvm-svn: 246930	2015-09-05 17:05:33 +00:00
Zoran Jovanovic	89ca2b982e	[mips][microMIPS] Implement ADD.fmt, SUB.fmt, MOV.fmt, MUL.fmt, DIV.fmt, MADDF.fmt, MSUBF.fmt and NEG.fmt instructions Differential Revision: http://reviews.llvm.org/D11978 llvm-svn: 246919	2015-09-05 09:25:30 +00:00
Craig Topper	02a55d701d	Fix build warning. llvm-svn: 246908	2015-09-05 04:49:44 +00:00
NAKAMURA Takumi	2f9e8c0570	WinCOFFObjectWriter.cpp: Roll back TimeDateStamp along ENABLE_TIMESTAMPS. We want a deterministic output. GNU AS leaves it zero. FIXME: It may be optional by its user, like llc and clang. llvm-svn: 246905	2015-09-05 01:17:49 +00:00
Andrew Kaylor	2a9a6d8c38	Fix build warning llvm-svn: 246903	2015-09-05 01:00:51 +00:00
Hal Finkel	b1518d6c24	[PowerPC] Fix and(or(x, c1), c2) -> rlwimi generation PPCISelDAGToDAG has a transformation that generates a rlwimi instruction from an input pattern that looks like this: and(or(x, c1), c2) but the associated logic does not work if there are bits that are 1 in c1 but 0 in c2 (these are normally canonicalized away, but that can't happen if the 'or' has other users. Make sure we abort the transformation if such bits are discovered. Fixes PR24704. llvm-svn: 246900	2015-09-05 00:02:59 +00:00
Andrew Kaylor	a212aba680	Fix build warning llvm-svn: 246899	2015-09-04 23:58:32 +00:00
Andrew Kaylor	50e4e86c26	[WinEH] Teach SimplfyCFG to eliminate empty cleanup pads. Differential Revision: http://reviews.llvm.org/D12434 llvm-svn: 246896	2015-09-04 23:39:40 +00:00
Kostya Serebryany	e641dd6479	[libFuzzer] more accurate logic for traces, 80-char fix llvm-svn: 246888	2015-09-04 22:32:25 +00:00
Yaron Keren	771e31964d	Remove two unused includes and C++11 rangify for loops. llvm-svn: 246865	2015-09-04 20:24:24 +00:00
Chad Rosier	a67b2d0117	Typo. NFC. llvm-svn: 246851	2015-09-04 12:34:55 +00:00
David Majnemer	5ca46f0df1	[MC] Replace comparison with isUInt<32>. Casting to unsigned long can cause the time to get truncated to 32-bits, making it appear to be a valid timestamp. Just use isUInt<32> instead. llvm-svn: 246840	2015-09-04 07:22:36 +00:00
NAKAMURA Takumi	c95358b1ea	WinCOFFObjectWriter.cpp: Appease a warning in checking std::time_t. [-Wsign-compare] llvm-svn: 246839	2015-09-04 05:19:37 +00:00
Kostya Serebryany	b2e9897644	[libFuzzer] when a single mutation fails try a few more times with other mutations before returning un-mutated data llvm-svn: 246828	2015-09-04 00:40:29 +00:00
Kostya Serebryany	7d21166218	[libFuzzer] actually make the dictionaries work (+docs) llvm-svn: 246825	2015-09-04 00:12:11 +00:00
Hal Finkel	4a7be23976	[PowerPC] Enable interleaved-access vectorization This adds a basic cost model for interleaved-access vectorization (and a better default for shuffles), and enables interleaved-access vectorization by default. The relevant difference from the default cost model for interleaved-access vectorization, is that on PPC, the shuffles that end up being used are much cheaper than modeling the process with insert/extract pairs (which are quite expensive, especially on older cores). llvm-svn: 246824	2015-09-04 00:10:41 +00:00
Hal Finkel	75afa2b6b6	[PowerPC] Always use aggressive interleaving on the A2 On the A2, with an eye toward QPX unaligned-load merging, we should always use aggressive interleaving. It is generally superior to only using concatenation unrolling. llvm-svn: 246819	2015-09-03 23:23:00 +00:00
Hal Finkel	e6702ca0e2	[PowerPC] Try harder to find a base+offset when looking for consecutive accesses When forming permutation-based unaligned vector loads, we need to know whether it is valid to read ahead of the requested address by a full vector length. Doing so is more efficient (and allows for more CSE with later loads), but could trigger a page fault if invalid. To determine validity, we look for other loads in the same block that access the relevant address range. The relevant point here is that we need to do this as part of the process of forming permutation-based vector loads, and this happens quite early in the SDAG pipeline - specifically before many of the address calculations are fully canonicalized. As a result, we need to try harder to recognize base+offset address computations, because they still might appear as chain of adds (base+offset+offset, for example). To account for this, we'll look through chains of adds, accumulating the constant offsets. llvm-svn: 246813	2015-09-03 22:37:44 +00:00
Sanjoy Das	88d0fdeb00	[IR] Have AttrBuilder::clear clear `TargetDepAttrs`. Test case attached -- currently the parser smears the "foo bar" to all of the formal arguments. llvm-svn: 246812	2015-09-03 22:27:42 +00:00

... 23 24 25 26 27 ...

85123 Commits