llvm-project

Commit Graph

Author	SHA1	Message	Date
Teresa Johnson	8c1bc986b5	[ThinLTO] Indirect call promotion fixes for promoted local functions Summary: Fix a couple issues limiting the application of indirect call promotion in ThinLTO mode: - Invoke indirect call promotion before globalopt, since it may eliminate imported functions which appear unreferenced. - Invoke indirect call promotion with InLTO=true so that the PGOFuncName metadata is used to get the name for locals which would have been renamed during promotion. Reviewers: davidxl, mehdi_amini Subscribers: Prazek, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D24004 llvm-svn: 280024	2016-08-29 22:46:56 +00:00
Hal Finkel	3d70a9dbb7	[PowerPC] Fix i8/i16 atomics for little-Endian targets without partword atomics For little-Endian PowerPC, we generally target only P8 and later by default. However, generic (older) 64-bit configurations are still an option, and in that case, partword atomics are not available (e.g. stbcx.). To lower i8/i16 atomics without true i8/i16 atomic operations, we emulate using i32 atomics in combination with a bunch of shifting and masking, etc. The amount by which to shift in little-Endian mode is different from the amount in big-Endian mode (it is inverted -- meaning we can leave off the xor when computing the amount). Fixes PR22923. llvm-svn: 280022	2016-08-29 22:25:36 +00:00
Chad Rosier	6e1eaac62b	[SLP] Return a boolean value for these static helpers. NFC. Differential Revision: https://reviews.llvm.org/D24008 llvm-svn: 280020	2016-08-29 22:09:51 +00:00
Jan Vesely	77ed6af416	AMDGPU/R600: Remove MergeVectorStores from legalization This is handled by DAGCombiner in a more generic way Differential Revision: https://reviews.llvm.org/D23970 llvm-svn: 280019	2016-08-29 22:05:06 +00:00
Tim Northover	f8bab1ce0c	GlobalISel: use multi-dimensional arrays for legalize actions. Instead of putting all possible requests into a single table, we can perform the extremely dense lookup based on opcode and type-index in constant time using multi-dimensional array-like things. This roughly halves the time spent doing legalization, which was dominated by queries against the Actions table. llvm-svn: 280011	2016-08-29 21:00:00 +00:00
Easwaran Raman	7060af9d22	Fix a thinko in r278189. llvm-svn: 280008	2016-08-29 20:45:51 +00:00
Saleem Abdulrasool	43e5fe3fac	AMDGPU: fix mismatch tags, NFC llvm-svn: 280006	2016-08-29 20:42:07 +00:00
Saleem Abdulrasool	ef72107a49	ExecutionEngine: fix a bug in the movt/movw relocator According to the arm arm specifications, 4 bytes are needed for a shift instead of 8, this was causing the movt instruction to write to a different register sometimes. Patch by Walter Erquinigo! llvm-svn: 280005	2016-08-29 20:42:03 +00:00
Matthew Simpson	df19502b16	[LV] Move insertelement sequence after scalar definitions After r279649 when getting a vector value from VectorLoopValueMap, we create an insertelement sequence on-demand if the value has been scalarized instead of vectorized. We previously inserted this insertelement sequence before the value's first vector user. However, this insert location is problematic if that user is the phi node of a first-order recurrence. With this patch, we move the insertelement sequence after the last scalar instruction we created when scalarizing the value. Thus, the value's vector definition in the new loop will immediately follow its scalar definitions. This should fix PR30183. Reference: https://llvm.org/bugs/show_bug.cgi?id=30183 llvm-svn: 280001	2016-08-29 20:14:04 +00:00
Krzysztof Parzyszek	354832e585	Propagate TBAA info in SelectionDAG::getIndexedLoad Patch by Pranav Bhandarkar. llvm-svn: 279998	2016-08-29 19:50:15 +00:00
Douglas Katzman	47ca88ace2	[Myriad]: add missing 'mcpu' values Should have been done with r276646. llvm-svn: 279996	2016-08-29 19:42:57 +00:00
Tom Stellard	0d23ebe888	AMDGPU/SI: Implement a custom MachineSchedStrategy Summary: GCNSchedStrategy re-uses most of GenericScheduler, it's just uses a different method to compute the excess and critical register pressure limits. It's not enabled by default, to enable it you need to pass -misched=gcn to llc. Shader DB stats: 32464 shaders in 17874 tests Totals: SGPRS: 1542846 -> 1643125 (6.50 %) VGPRS: 1005595 -> 904653 (-10.04 %) Spilled SGPRs: 29929 -> 27745 (-7.30 %) Spilled VGPRs: 334 -> 352 (5.39 %) Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread Code Size: 36688188 -> 37034900 (0.95 %) bytes LDS: 1913 -> 1913 (0.00 %) blocks Max Waves: 254101 -> 265125 (4.34 %) Wait states: 0 -> 0 (0.00 %) Totals from affected shaders: SGPRS: 1338220 -> 1438499 (7.49 %) VGPRS: 886221 -> 785279 (-11.39 %) Spilled SGPRs: 29869 -> 27685 (-7.31 %) Spilled VGPRs: 334 -> 352 (5.39 %) Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread Code Size: 34315716 -> 34662428 (1.01 %) bytes LDS: 1551 -> 1551 (0.00 %) blocks Max Waves: 188127 -> 199151 (5.86 %) Wait states: 0 -> 0 (0.00 %) Reviewers: arsenm, mareko, nhaehnle, MatzeB, atrick Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: https://reviews.llvm.org/D23688 llvm-svn: 279995	2016-08-29 19:42:52 +00:00
Vitaly Buka	3c4f6bf654	[asan] Enable new stack poisoning with store instruction by default Reviewers: eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23968 llvm-svn: 279993	2016-08-29 19:28:34 +00:00
Tim Northover	ac5148ef41	GlobalISel: switch to SmallVector for pending legalizations. std::queue was doing far to many heap allocations to be healthy. llvm-svn: 279992	2016-08-29 19:27:20 +00:00
Tom Stellard	c2ff0eb697	AMDGPU/SI: Improve SILoadStoreOptimizer and run it before the scheduler Summary: The SILoadStoreOptimizer can now look ahead more then one instruction when looking for instructions to merge, which greatly improves the number of loads/stores that we are able to merge. Moving the pass before scheduling avoids increasing register pressure after the scheduler, so that the scheduler's register pressure estimates will be more accurate. It also gives more consistent results, since it is no longer affected by minor scheduling changes. Reviewers: arsenm Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: https://reviews.llvm.org/D23814 llvm-svn: 279991	2016-08-29 19:15:22 +00:00
Tim Northover	c10c33444e	ASan: remove variable only used in assertions build llvm-svn: 279990	2016-08-29 19:12:20 +00:00
Tim Northover	edb3c8ccb8	GlobalISel: legalize frem to a libcall on AArch64. llvm-svn: 279988	2016-08-29 19:07:16 +00:00
Tim Northover	fe5f89ba14	GlobalISel: rework CallLowering so that it can be used for libcalls too. There should be no functional change here, I'm just making the implementation of "frem" (to libcall) legalization easier for a followup. llvm-svn: 279987	2016-08-29 19:07:08 +00:00
Matt Arsenault	b90fc9b3b4	AMDGPU/R600: Fix fixups used for constant arrays Fixes bug 29289 llvm-svn: 279986	2016-08-29 19:01:48 +00:00
Kyle Butt	092c4dd5b6	IfConversion: Fix branch predication bug. This bug shows up with diamonds that share unpredicable, unanalyzable branches. There's an included test case from Hexagon. What was happening was that we were attempting to predicate the branch instruction despite the fact that it was checked to be the same. Now for unanalyzable branches we skip over the branch instructions when predicating the block. Differential Revision: https://reviews.llvm.org/D23939 llvm-svn: 279985	2016-08-29 18:27:12 +00:00
Vitaly Buka	793913c7eb	Use store operation to poison allocas for lifetime analysis. Summary: Calling __asan_poison_stack_memory and __asan_unpoison_stack_memory for small variables is too expensive. Code is disabled by default and can be enabled by -asan-experimental-poisoning. PR27453 Reviewers: eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23947 llvm-svn: 279984	2016-08-29 18:17:21 +00:00
Vitaly Buka	db331d8be7	[asan] Separate calculation of ShadowBytes from calculating ASanStackFrameLayout Summary: No functional changes, just refactoring to make D23947 simpler. Reviewers: eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23954 llvm-svn: 279982	2016-08-29 17:41:29 +00:00
David Majnemer	e8fd5f9ffd	[SimplifyCFG] Hoisting invalidates metadata We forgot to remove optimization metadata when performing hosting during FoldTwoEntryPHINode. This fixes PR29163. llvm-svn: 279980	2016-08-29 17:14:08 +00:00
Evandro Menezes	a8a25ca905	[AArch64] Adjust the scheduling model for Exynos M1. Further refine the model for loads. llvm-svn: 279976	2016-08-29 16:04:37 +00:00
Anna Thomas	2bc129c5fd	[StatepointsForGC] Rematerialize in the presence of PHIs Summary: While walking the use chain for identifying rematerializable values in RS4GC, add the case where the current value and base value are the same PHI nodes. This will aid rematerialization of geps and casts instead of relocating. Reviewers: sanjoy, reames, igor Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23920 llvm-svn: 279975	2016-08-29 15:41:59 +00:00
Teresa Johnson	6e711c33b4	[LTO] Remove extraneous output Remove some debugging output to stderr that snuck in with r279576. llvm-svn: 279974	2016-08-29 15:33:01 +00:00
Sanjay Patel	25475bcc0c	[Constant] remove fdiv and frem from canTrap() Assuming the default FP env, we should not treat fdiv and frem any differently in terms of trapping behavior than any other FP op. Ie, FP ops do not trap with the default FP env. This matches how we treat the fdiv/frem in IR with isSafeToSpeculativelyExecute() and in the backend after: https://reviews.llvm.org/rL279970 llvm-svn: 279973	2016-08-29 15:27:17 +00:00
Gor Nishanov	dce9b02677	[Coroutines] Part 9: Add cleanup subfunction. Summary: [Coroutines] Part 9: Add cleanup subfunction. This patch completes coroutine heap allocation elision. Now, the heap elision example from docs\Coroutines.rst compiles and produces expected result (see test/Transform/Coroutines/ex3.ll) Intrinsic Changes: * coro.free gets a token parameter tying it to coro.id to allow reliably discovering all coro.frees associated with a particular coroutine. * coro.id gets an extra parameter that points back to a coroutine function. This allows to check whether a coro.id describes the enclosing function or it belongs to a different function that was later inlined. CoroSplit now creates three subfunctions: # f$resume - resume logic # f$destroy - cleanup logic, followed by a deallocation code # f$cleanup - just the cleanup code CoroElide pass during devirtualization replaces coro.destroy with either f$destroy or f$cleanup depending whether heap elision is performed or not. Other fixes, improvements: * Fixed buglet in Shape::buildFrame that was not creating coro.save properly if coroutine has more than one suspend point. * Switched to using variable width suspend index field (no longer limited to 32 bit index field can be as little as i1 or as large as i<whatever-size_t-is>) Reviewers: majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23844 llvm-svn: 279971	2016-08-29 14:34:12 +00:00
Sanjay Patel	b57d0a2fda	[TargetLowering] remove fdiv and frem from canOpTrap() (PR29114) Assuming the default FP env, we should not treat fdiv and frem any differently in terms of trapping behavior than any other FP op. Ie, FP ops do not trap with the default FP env. This matches how we treat these ops in IR with isSafeToSpeculativelyExecute(). There's a similar bug in Constant::canTrap(). This bug manifests in PR29114: https://llvm.org/bugs/show_bug.cgi?id=29114 ...as a sequence of scalar divisions instead of a vector division on x86 for a <3 x float> type. Differential Revision: https://reviews.llvm.org/D23974 llvm-svn: 279970	2016-08-29 13:32:41 +00:00
Krzysztof Parzyszek	0a955d6dcb	Do not use MRI::getMaxLaneMaskForVReg as a mask covering whole register MRI::getMaxLaneMaskForVReg does not always cover the whole register. For example, on X86 the upper 16 bits of EAX cannot be accessed via any subregister. Consequently, there is no lane mask that only covers that part of EAX. The getMaxLaneMaskForVReg will return the union of the lane masks for all subregisters, and in case of EAX, that union will not cover the upper 16 bits. This fixes https://llvm.org/bugs/show_bug.cgi?id=29132 llvm-svn: 279969	2016-08-29 13:15:35 +00:00
Tom Stellard	5d3f71f721	AMDGPU/SI: Improve register allocation hints for sopk instructions Summary: For shrinking SOPK instructions, we were creating a hint to tell the register allocator to use the register allocated for src0 for the dst operand as well. However, this seems to not work sometimes depending on the order virtual registers are assigned physical registers. To fix this, I've added a second allocation hint which does the reverse, asks that the register allocated for dst is used for src0. Reviewers: arsenm Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D23862 llvm-svn: 279968	2016-08-29 13:06:10 +00:00
Rafael Espindola	412a529551	Use the correct ctor/dtor section for dynamic-no-pic. llvm-svn: 279967	2016-08-29 12:47:22 +00:00
Rafael Espindola	46fa231c52	Move code only used by codegen out of MC. NFC. MC itself never needs to know about these sections. llvm-svn: 279965	2016-08-29 12:33:42 +00:00
Haojian Wu	eab33cecf3	Fix -Wunused-but-set-variable warning. Summary: A follow-up fix on r279958. Reviewers: bkramer Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D23989 llvm-svn: 279964	2016-08-29 12:26:33 +00:00
Tom Stellard	662f330852	AMDGPU/SI: Query AA, if available, in areMemAccessesTriviallyDisjoint() Summary: The SILoadStoreOptimizer will need to use AliasAnalysis here in order to move it before scheduling. Reviewers: arsenm Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D23813 llvm-svn: 279963	2016-08-29 12:05:32 +00:00
Igor Breger	24281b4740	Fixed a bug in type legalizer for masked gather. The problem occurs when the Node doesn't updated in place , UpdateNodeOperation() return the node that already exist. In this case assert fail in PromoteIntegerOperand() , N have 2 results ( val + chain). Differential Revision: http://reviews.llvm.org/D23756 llvm-svn: 279961	2016-08-29 09:12:31 +00:00
Igor Breger	1a388871b9	[AVX512] In some cases KORTEST instruction may be used instead of ZEXT + TEST sequence. Differential Revision: http://reviews.llvm.org/D23490 llvm-svn: 279960	2016-08-29 08:52:52 +00:00
Haojian Wu	407f275894	[InstructionSelect] NumBlocks isn't defined in DEBUG build. Summary: A follow-up fixing on http://llvm.org/viewvc/llvm-project?view=revision&revision=279905. Reviewers: bkramer Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D23985 llvm-svn: 279959	2016-08-29 08:48:15 +00:00
Craig Topper	713085e60a	[X86] Don't lower FABS/FNEG masking directly to a ConstantPool load. Just create a ConstantFPSDNode and let that be lowered. This allows broadcast loads to used when available. llvm-svn: 279958	2016-08-29 04:49:31 +00:00
Craig Topper	f0e822ff31	[AVX-512] Always use v8i64 when converting 512-bit FAND/FOR/FXOR/FANDN to integer operations when DQI isn't supported. This is consistent with the recent changes to promote logical operations to i64 vectors. llvm-svn: 279957	2016-08-29 04:49:27 +00:00
Craig Topper	850feaf3b7	[AVX-512] Add support for selecting 512-bit VPABSB/VPABSW when BWI is available. llvm-svn: 279951	2016-08-28 22:20:51 +00:00
Craig Topper	056c9062f3	[AVX-512] Add patterns for selecting 128/256-bit EVEX VPABS instructions. llvm-svn: 279950	2016-08-28 22:20:48 +00:00
Sanjay Patel	5c5311f4e5	[InstCombine] use m_APInt to allow icmp (and X, Y), C folds for splat constant vectors llvm-svn: 279937	2016-08-28 18:18:00 +00:00
Simon Pilgrim	5369cd9e9c	[X86][AVX512] Only combine EVEX targets shuffles to shuffles of the same number of vector elements Over eager combing prevents the correct folding of writemasks. At the moment this occurs for ALL EVEX shuffles, in the future we need to check that the user of the root shuffle is a VSELECT that can fold to a writemask. llvm-svn: 279934	2016-08-28 17:27:14 +00:00
Hal Finkel	5728200f33	[PowerPC] Implement lowering for atomicrmw min/max/umin/umax Implement lowering for atomicrmw min/max/umin/umax. Fixes PR28818. llvm-svn: 279933	2016-08-28 16:17:58 +00:00
Elena Demikhovsky	3622fbfc68	[Loop Vectorizer] Fixed memory confilict checks. Fixed a bug in run-time checks for possible memory conflicts inside loop. The bug is in Low <-> High boundaries calculation. The High boundary should be calculated as "last memory access pointer + element size". Differential revision: https://reviews.llvm.org/D23176 llvm-svn: 279930	2016-08-28 08:53:53 +00:00
Craig Topper	abe80cc04d	[AVX-512] Promote AND/OR/XOR to v2i64/v4i64/v8i64 even when we have AVX512F/AVX512VL. Previously we weren't creating masked logical operations if bitcasts appeared between the logic operation and the select. The IR optimizers can move bitcasts across logic operations and create these cases. To minimize the number of cases we need to handle, this change promotes all logic ops to an i64 vector type just like when only SSE or AVX is available. Unfortunately, this also has the consequence of making it difficult to select unmasked VPANDD/VPORD/VPXORD in all the cases it was previously used. This is the cause of most of the test change. This shouldn't result in any functional change though. llvm-svn: 279929	2016-08-28 06:06:28 +00:00
Craig Topper	8877a026e4	[X86] Rename PABSB/D/W instructions to be consistent with SSE/AVX instructions instead of ending 128/256. NFC llvm-svn: 279927	2016-08-28 06:06:21 +00:00
Jan Vesely	38814fa2fd	AMDGPU/R600: Enable Load combine Fix and improve tests Differential Revision: https://reviews.llvm.org/D23899 llvm-svn: 279925	2016-08-27 19:09:43 +00:00
Craig Topper	6943aa306e	[X86] Rename predicate function that detects if requires one of the REX.B, REX.X or REX.R bits. It's old name conflicted with a function in X8II namespace that doesnt' quite do the same thing. NFC llvm-svn: 279924	2016-08-27 17:13:43 +00:00
Craig Topper	45793a1f7a	[X86] Keep looping over operands looking for byte registers even if we already found a register that requires a REX prefix. Otherwise we don't error if a high byte register is used after SPL/BPL/DIL/SIL. llvm-svn: 279923	2016-08-27 17:13:41 +00:00
Craig Topper	6acca80e17	[X86] Include XMM/YMM/ZMM16-23 in X86II::isX86_64ExtendedReg. This feels more consistent with its name and simplifies assembler code. llvm-svn: 279922	2016-08-27 17:13:37 +00:00
Craig Topper	06c60c067f	[X86] Don't allow DR8-DR15 to be assembled in 32-bit mode. Add missing test for CR8-CR15. llvm-svn: 279921	2016-08-27 17:13:34 +00:00
Craig Topper	ed71f04abb	[X86] Remove stale comment about FixupBWInsts pass being off by default. NFC llvm-svn: 279915	2016-08-27 05:26:54 +00:00
Craig Topper	225da2cb84	[AVX-512] Allow EVEX encoding unordered/ordered/equal/notequal VCMPPS/PD/SS/SD to be commuted just like the SSE and AVX counterparts. llvm-svn: 279914	2016-08-27 05:22:15 +00:00
Craig Topper	144fdef66b	[X86] Enable FR32/FR64 cmpeq/cmpne/cmpunord/cmpord to be commuted. llvm-svn: 279913	2016-08-27 05:22:12 +00:00
Craig Topper	4891c724aa	[AVX-512] Add load folding for EVEX vcmpps/pd/ss/sd. llvm-svn: 279912	2016-08-27 05:22:08 +00:00
Teresa Johnson	e2e621a36c	[LTO] Don't create a new common unless merged has different size Summary: This addresses a regression in common handling from the new LTO API in r278338. Only create a new common if the size is different. The type comparison against an array type fails when the size is different but not an array. GlobalMerge does not handle the array types as well and we lose some global merging opportunities. Reviewers: mehdi_amini Subscribers: junbuml, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23955 llvm-svn: 279911	2016-08-27 04:41:22 +00:00
Matt Arsenault	a15ea4e217	AMDGPU: Mark sched model complete Fixes bug 26800 llvm-svn: 279910	2016-08-27 03:39:27 +00:00
Matt Arsenault	71ed8a67e8	AMDGPU: Remove unneeded implicit exec uses/defs SI_BREAK, SI_IF_BREAK, and SI_ELSE_BREAK do not def exec. SI_IF_BREAK and SI_ELSE_BREAK do not read it either. llvm-svn: 279909	2016-08-27 03:00:51 +00:00
Sebastian Pop	4660199a33	GVN-hoist: invalidate MD cache (PR29144) Without invalidating the entries in the MD cache we would try to access instructions that were removed in previous iterations of hoisting. Differential Revision: https://reviews.llvm.org/D23927 llvm-svn: 279907	2016-08-27 02:48:41 +00:00
Quentin Colombet	acb857b831	[RegBankSelect] Do not abort when the target wants to fall back. llvm-svn: 279906	2016-08-27 02:38:27 +00:00
Quentin Colombet	948abf0a0f	[InstructionSelect] Do not abort when the target wants to fall back. llvm-svn: 279905	2016-08-27 02:38:24 +00:00
Quentin Colombet	5e60bcdeaf	[MachineLegalize] Do not abort when the target wants to fall back. llvm-svn: 279904	2016-08-27 02:38:21 +00:00
Matt Arsenault	2712d4a3d8	AMDGPU: Select mulhi 24-bit instructions llvm-svn: 279902	2016-08-27 01:32:27 +00:00
Matt Arsenault	22e417956d	AMDGPU: Move cndmask pseudo to be isel pseudo There's only one use of this for the convenience of a pattern. I think v_mov_b64_pseudo should also be moved, but SIFoldOperands does currently make use of it. llvm-svn: 279901	2016-08-27 01:00:37 +00:00
Matt Arsenault	e949744474	AMDGPU: Fix sched type for branches llvm-svn: 279900	2016-08-27 00:51:02 +00:00
Matt Arsenault	f98a596954	AMDGPU: Remove register operand from si_mask_branch It isn't used for anything, and is also misleading since it could be spilled at the end of the block, so it can't be relied on. There ends up being a verifier error about using an undefined register since the spill kills the register. llvm-svn: 279899	2016-08-27 00:42:21 +00:00
Matt Arsenault	00e102baf4	AMDGPU: Improve error reporting for maximum branch distance Unfortunately this seems to only help the assembler diagnostic. llvm-svn: 279895	2016-08-27 00:21:22 +00:00
Quentin Colombet	374796d678	[GlobalISel] Add a fallback path to SDISel. When global-isel fails on a MachineFunction MF, MF will be cleaned up and given to SDISel. Thanks to this fallback, we can already perform correctness test even if we support only a small portion of the functions in a test. llvm-svn: 279891	2016-08-27 00:18:31 +00:00
Quentin Colombet	a94caa5673	[AArch64][CallLowering] Do not assert for not implemented part. When doing the ABI lowering, report a failure to the caller instead of asserting. This gives a chance for the caller to recover. llvm-svn: 279890	2016-08-27 00:18:28 +00:00
Quentin Colombet	6049524d37	[GlobalISel] Teach the core pipeline not to run if ISel failed. llvm-svn: 279889	2016-08-27 00:18:24 +00:00
Quentin Colombet	3bb32cc79c	[IRTranslator] Do not abort when the target wants to fall back. Every pass in the GlobalISel pipeline will need to do something similar. llvm-svn: 279886	2016-08-26 23:49:05 +00:00
Quentin Colombet	e076d3094c	[MFProperties] Introduce a FailedISel property. This is used to communicate that the instruction selection pipeline failed at some point. Another way to achieve that would be to have some kind of conditional scheduling in the PassManager, such that we only schedule a pass based on the success/failure of another one. The property approach has the advantage of being lightweight and solve the problem at stake. llvm-svn: 279885	2016-08-26 23:49:01 +00:00
Teresa Johnson	26a462877b	[ThinLTO] Move loading of cache entry to client Summary: Have the cache pass back the path to the cache entry when it is ready to be loaded, instead of a buffer. For gold-plugin we can simply pass this file back to gold directly, which avoids expensive writing of a separate tmp file. Ensure the cache entry is not deleted on cleanup by adjusting the setting of the IsTemporary flags. Moved the loading of the buffer into llvm-lto2 to maintain current behavior. Reviewers: mehdi_amini Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23946 llvm-svn: 279883	2016-08-26 23:29:14 +00:00
Quentin Colombet	0de43b225f	[TargetPassConfig] Add a target hook to know what GlobalISel should do on error. By default, this hook tells GlobalISel to abort (report a fatal error) when it encounters an error. The alternative will be to fall back on SDISel. This fall back will be removed when the bring-up of GlobalISel is over. llvm-svn: 279879	2016-08-26 22:32:59 +00:00
Quentin Colombet	1d0cb6f107	[IRTranslator][NFC] Use DEBUG_TYPE instead of repeating the name. llvm-svn: 279878	2016-08-26 22:32:57 +00:00
Quentin Colombet	e063e1f68a	[SelectionDAG] Do not run the ISel process on already selected code. Right now, this cannot happen, but with the fall back path of GlobalISel it will show up eventually. llvm-svn: 279877	2016-08-26 22:32:55 +00:00
Quentin Colombet	380cd3eb23	[MachineFunction] Introduce a reset method. This method allows to reset the state of a MachineFunction as if it was just created. This will be used during the bring-up of GlobalISel to provide a way to fallback on SelectionDAG. That way, we can start doing correctness testing even if we are not able to select all functions via the global instruction selector. llvm-svn: 279876	2016-08-26 22:32:53 +00:00
Quentin Colombet	e609a9a80a	[MFProperties] Introduce a reset method with no argument. This method allows to reset all the properties in one go. llvm-svn: 279874	2016-08-26 22:09:11 +00:00
Quentin Colombet	c437aa9c26	[MFProperties][NFC] Rename clear into reset to match BitVector naming. The name clear is used to reset all the bit in bitvectors and using it to reset just properties was confusing. llvm-svn: 279873	2016-08-26 22:09:08 +00:00
Tom Stellard	e175d8aba5	AMDGPU/SI: Canonicalize offset order for merged DS instructions Summary: If the scheduler clusters the loads, then the offsets will be sorted, but it is possible for the scheduler to scheduler loads together without out explicitly clustering them, which would give us non-sorted offsets. Also, we will want to do this if we move the load/store optimizer before the scheduler. Reviewers: arsenm Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D23776 llvm-svn: 279870	2016-08-26 21:36:47 +00:00
Tom Stellard	4b5cd87ed3	XXX llvm-svn: 279868	2016-08-26 21:16:40 +00:00
Tom Stellard	7c463c9168	AMDGPU/SI: Use a better method for determining the largest pressure sets Summary: There are a few different sgpr pressure sets, but we only care about the one which covers all of the sgprs. We were using hard-coded register pressure set names to determine the reg set id for the biggest sgpr set. However, we were using the wrong name, and this method is pretty fragile, since the reg pressure set names may change. The new method just looks for the pressure set that contains the most reg units and sets that set as our SGPR pressure set. We've also adopted the same technique for determining our VGPR pressure set. Reviewers: arsenm Subscribers: MatzeB, arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D23687 llvm-svn: 279867	2016-08-26 21:16:37 +00:00
Adam Nemet	cef3314156	[Inliner] Report when inlining fails because callee's def is unavailable Summary: This is obviously an interesting case because it may motivate code restructuring or LTO. Reporting this requires instantiation of ORE in the loop where the call sites are first gathered. I've checked compile-time overhead with -Rpass-with-hotness and the worst slow-down was 6% in mcf and quickly tailing off. As before without -Rpass-with-hotness there is no overhead. Because this could be a pretty noisy diagnostics, it is currently qualified as 'verbose'. As of this patch, 'verbose' diagnostics are only emitted with -Rpass-with-hotness, i.e. when the output is expected to be filtered. Reviewers: eraman, chandlerc, davidxl, hfinkel Subscribers: tejohnson, Prazek, davide, llvm-commits Differential Revision: https://reviews.llvm.org/D23415 llvm-svn: 279860	2016-08-26 20:21:05 +00:00
Rafael Espindola	7775c3310c	Make writeToResolutionFile a static helper. llvm-svn: 279859	2016-08-26 20:19:35 +00:00
Kyle Butt	723aa1327c	TailDuplication: Record blocks that received the duplicated block. NFC. This will allow tail duplication during layout to handle the cfg changes more cleanly. llvm-svn: 279858	2016-08-26 20:12:40 +00:00
Kevin Enderby	0e52c92e22	Next set of additional error checks for invalid Mach-O files for bad LC_SYMTAB’s. This contains the missing checks for LC_SYMTAB load command fields. llvm-svn: 279854	2016-08-26 19:34:07 +00:00
Manman Ren	66b54e9f32	Swift Calling Convetion: add support for AArch64. It will just be the same as the regular calling convention. rdar://28029509 llvm-svn: 279853	2016-08-26 19:28:17 +00:00
Tim Northover	85cf564c51	AArch64: avoid assertion on illegal types in performFDivCombine. In the code to detect fixed-point conversions and make use of AArch64's special instructions, we weren't prepared for weird types. The fptosi direction got fixed recently, but not the similar sitofp code. llvm-svn: 279852	2016-08-26 18:52:31 +00:00
Sanjay Patel	14e0e18d76	[InstCombine] add helper function for icmp (and (sh X, Y), C2), C1 ; NFC Like other recent changes near here, the goal is to allow vector types for all of these folds. Splitting things up makes it easier to incrementally enhance the code and easier to read. llvm-svn: 279851	2016-08-26 18:28:46 +00:00
Chad Rosier	58f505ba24	[AArch64] Avoid materializing constant values when generating csel instructions. Differential Revision: https://reviews.llvm.org/D23677 llvm-svn: 279849	2016-08-26 18:05:50 +00:00
Davide Italiano	f4e1661bd8	[AsmParser] Placate a -Wmisleading-indentantion warning (GCC7). llvm-svn: 279848	2016-08-26 18:05:03 +00:00
Reid Kleckner	a5b1eef846	[MC] Move .cv_loc management logic out of MCContext MCContext already has many tasks, and separating CodeView out from it is probably a good idea. The .cv_loc tracking was modelled on the DWARF tracking which lived directly in MCContext. Removes the inclusion of MCCodeView.h from MCContext.h, so now there are only 10 build actions while I hack on CodeView support instead of 265. llvm-svn: 279847	2016-08-26 17:58:37 +00:00
Tim Northover	bc1701c7fb	GlobalISel: mark G_FPEXT legal from float to double. llvm-svn: 279845	2016-08-26 17:46:22 +00:00
Tim Northover	30bd36e3fc	GlobalISel: mark G_FCMP legal on float & double. llvm-svn: 279844	2016-08-26 17:46:19 +00:00
Tim Northover	051b8ad3d9	GlobalISel: simplify G_ICMP legalization regime. It's unclear how the old %res(32) = G_ICMP { s32, s32 } intpred(eq), %0, %1 is actually different from an s1 verison %res(1) = G_ICMP { s1, s32 } intpred(eq), %0, %1 so we'll remove it for now. llvm-svn: 279843	2016-08-26 17:46:17 +00:00
Tim Northover	cecee56abb	GlobalISel: legalize sdiv and srem operations. llvm-svn: 279842	2016-08-26 17:46:13 +00:00
Tim Northover	7a753d9bec	GlobalISel: legalize under-width divisions. llvm-svn: 279841	2016-08-26 17:46:06 +00:00
Tim Northover	1d18a99a53	GlobalISel: mark selects legal llvm-svn: 279840	2016-08-26 17:46:03 +00:00
Tim Northover	5d0eaa4e79	GlobalISel: mark float/int conversions legal llvm-svn: 279839	2016-08-26 17:45:58 +00:00
Sanjay Patel	da9c56299b	[InstCombine] clean up foldICmpAndConstConst(); NFC 1. Early exit to reduce indent 2. Fix comments and variable names to match 3. Reformat comments / clang-format code llvm-svn: 279837	2016-08-26 17:15:22 +00:00
Krzysztof Parzyszek	fb18d1e381	Missed a semicolon in r279835 llvm-svn: 279836	2016-08-26 16:50:57 +00:00
Krzysztof Parzyszek	eb34b71f0a	Add some more detailed debugging information in RegisterCoalescer llvm-svn: 279835	2016-08-26 16:46:14 +00:00
Sanjay Patel	d3c7bb28be	[InstCombine] add helper function for folding of icmp (and X, C2), C; NFC llvm-svn: 279834	2016-08-26 16:42:33 +00:00
Bob Haarman	3db176410a	limit the number of instructions per block examined by dead store elimination Summary: Dead store elimination gets very expensive when large numbers of instructions need to be analyzed. This patch limits the number of instructions analyzed per store to the value of the memdep-block-scan-limit parameter (which defaults to 100). This resulted in no observed difference in performance of the generated code, and no change in the statistics for the dead store elimination pass, but improved compilation time on some files by more than an order of magnitude. Reviewers: dexonsmith, bruno, george.burgess.iv, dberlin, reames, davidxl Subscribers: davide, chandlerc, dberlin, davidxl, eraman, tejohnson, mbodart, llvm-commits Differential Revision: https://reviews.llvm.org/D15537 llvm-svn: 279833	2016-08-26 16:34:27 +00:00
Sanjay Patel	311e0fabb1	[InstCombine] rename variables in foldICmpAndConstant(); NFC llvm-svn: 279831	2016-08-26 16:14:06 +00:00
Bob Haarman	244ed8b574	test commit llvm-svn: 279830	2016-08-26 16:00:04 +00:00
Adam Nemet	4f155b6e91	[LoopUnroll] Use OptimizationRemarkEmitter directly not via the analysis pass We can't mark ORE (a function pass) preserved as required by the loop passes because that is how we ensure that the required passes like LazyBFI are all available any time ORE is used. See the new comments in the patch. Instead we use it directly just like the inliner does in D22694. As expected there is some additional overhead after removing the caching provided by analysis passes. The worst case, I measured was LNT/CINT2006_ref/401.bzip2 which regresses by 12%. As before, this only affects -Rpass-with-hotness and not default compilation. llvm-svn: 279829	2016-08-26 15:58:34 +00:00
Sanjay Patel	f7ba0891ce	[InstCombine] rename variables in foldICmpDivConstant(); NFC Removing the redundant 'CmpRHSV' local variable exposes a bug in the caller foldICmpShrConstant() - it was sending in the div constant instead of the cmp constant. But I have not been able to expose this in a regression test yet - the affected folds all appear to be handled before we ever reach this code. I'll keep trying to find a case as I make changes to allow vector folds in both functions. llvm-svn: 279828	2016-08-26 15:53:01 +00:00
Davide Italiano	f8014f82ed	[lib/LTO] Add an assertion to catch invalid opt levels. llvm-svn: 279823	2016-08-26 15:22:59 +00:00
Chad Rosier	39c1dbb845	[AArch64] Avoid materializing constant 1 by using csinc, rather than csel. This is similar to what was done in r261675, but for CSINC rather than CSINV. Differential Revision: https://reviews.llvm.org/D23892 llvm-svn: 279822	2016-08-26 14:01:55 +00:00
Pablo Barrio	b8ec630583	Handle empty functions with debug info in load/store opt pass Summary: In fuctions that contained debug info but were empty otherwise, the ARM load/store optimizer could abort. This was because function MergeReturnIntoLDM handled the special case where a Machine Basic BLock is empty by calling MBB.empty(). However, this returns false in presence of debug info, although the function should be considered empty in the eyes of the load/store optimizer. This has been fixed by handling the case where searching through the block finds only debug instructions. Reviewers: rengolin, dexonsmith, llvm-commits, jmolloy Subscribers: t.p.northover, aemerson, rengolin, samparker Differential Revision: https://reviews.llvm.org/D23847 llvm-svn: 279820	2016-08-26 13:00:39 +00:00
Simon Pilgrim	091c4c781c	[X86][SSE4A] The EXTRQ/INSERTQ bit extraction/insertion ops should be in the integer domain llvm-svn: 279811	2016-08-26 09:55:41 +00:00
Eugene Leviant	ea877d40b4	Implement getRandomBytes() function This function allows getting arbitrary sized block of random bytes. Primary motivation is support for --build-id=uuid in lld. Differential revision: https://reviews.llvm.org/D23671 llvm-svn: 279807	2016-08-26 08:14:54 +00:00
Craig Topper	8f27f51192	[X86][SSE] Add CMPSS/CMPSD intrinsic scalar load folding support. llvm-svn: 279806	2016-08-26 07:08:00 +00:00
Matt Arsenault	f403df38eb	Replace subregister uses when processing tied operands This was for some reason skipping operands that are subregisters instead of keeping the same subregister index. v_movreld_b32 expects src0 to be the subregister of the tied super register use/def. e.g. v_movreld_b32 v0, v9, <imp-def, tied3> v[0:3], <imp-use, tied2> v[0:3] was being replaced with v[4:7] = copy v[0:3] v_movreld_b32 v0, v9, <imp-def, tied3> v[4:7], <imp-use, tied2> v[4:7], which really writes to v[0:3] llvm-svn: 279804	2016-08-26 06:31:32 +00:00
Kostya Serebryany	3e5991e540	[libFuzzer] simplify a test to make it pass on the bot llvm-svn: 279796	2016-08-26 00:18:16 +00:00
Kostya Serebryany	1426f59a76	[libFuzzer] make sure we have symbols on fuzzer tests llvm-svn: 279792	2016-08-25 23:30:02 +00:00
Michael Kuperstein	2ee911e985	Revert r274613 because it breaks the test suite with AVX512 This reverts most of r274613 (AKA r274626) and its follow-ups (r276347, r277289), due to miscompiles in the test suite. The FastISel change was left in, because it apparently fixes an unrelated issue. (Recommit of r279782 which was broken due to a bad merge.) This fixes 4 out of the 5 test failures in PR29112. llvm-svn: 279788	2016-08-25 22:48:11 +00:00
Kostya Serebryany	0f0fa4faf2	[libFizzer] rename -print_new_cov_pcs=1 into -print_pcs=1 and make it more useful: print PCs only after the initial corpus has been read and symbolize them llvm-svn: 279787	2016-08-25 22:35:08 +00:00
Michael Kuperstein	6e271f4ce8	Revert r279782 due to debug buildbot breakage. llvm-svn: 279785	2016-08-25 22:14:45 +00:00
Michael Kuperstein	a6ccc8d365	Revert r274613 because it breaks the test suite with AVX512 This reverts most of r274613 and its follow-ups (r276347, r277289), due to miscompiles in the test suite. The FastISel change was left in, because it apparently fixes an unrelated issue. This fixes 4 out of the 5 test failures in PR29112. llvm-svn: 279782	2016-08-25 21:55:41 +00:00
Tim Shen	3ad8b43cc2	[MemCpy] Add comments for r279769 Differential Revision: https://reviews.llvm.org/D23846 llvm-svn: 279778	2016-08-25 21:03:46 +00:00
Tim Northover	3495647d0d	ARM: by default don't set the Thumb bit on MachO relocated values. Its existence is largely historical, apparently we tried to make ARM object files look maybe-almost-possibly runnable by putting our best guess at the actual value into relocated locations. Of course, the real linker then comes along and can completely change things. But it should only be there for word-sized and movw/movt relocations. It can't be encoded in branch relocations, and I've seen it mess up validity calculations twice in the last couple of weeks so the default is clearly problematic. llvm-svn: 279773	2016-08-25 20:41:30 +00:00
Tim Shen	a3dbead2d6	[MemCpy] Check for alias in performMemCpyToMemSetOptzn, instead of the identity of two operands Summary: This fixes pr29105. The reason is that lifetime marks creates new aliasing pointers the original ones, but before this patch aliases were not checked in performMemCpyToMemSetOptzn. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23846 llvm-svn: 279769	2016-08-25 19:27:26 +00:00
Michael Kuperstein	260daed147	Reuse an SDLoc throughout a function. NFC. llvm-svn: 279767	2016-08-25 18:50:56 +00:00
Tim Northover	6c43b850b7	GlobalISel: add missing type to G_UADDE instructions llvm-svn: 279762	2016-08-25 17:37:44 +00:00
Tim Northover	d8a6d7ce91	GlobalISel: mark overflow bit of overflow ops legal. It's expected this will map to NZCV register class and be properly selectable. llvm-svn: 279761	2016-08-25 17:37:41 +00:00
Tim Northover	fe880a8801	GlobalISel: mark simple ops legal even on types < 32-bit. The 32-bit variants of these operations don't depend on the bits not being operated on, so they also naturally model operations narrower than the actual register width. llvm-svn: 279760	2016-08-25 17:37:39 +00:00
Tim Northover	7a1ec0141a	GlobalISel: mark pointer constants as legal on AArch64. llvm-svn: 279759	2016-08-25 17:37:35 +00:00
Tim Northover	438c77ca1a	GlobalISel: perform multi-step legalization llvm-svn: 279758	2016-08-25 17:37:32 +00:00
Tim Northover	2c4a838e24	GlobalISel: mark small extends as legal on AArch64 llvm-svn: 279757	2016-08-25 17:37:25 +00:00
Michael Kuperstein	40887c5566	[X86] 512-bit VPAVG requires AVX512BW Fix VPAVG detection to require AVX512BW, not AVX512F for 512-bit widths, and change associated asserts to assert in the right direction... This fixes PR29111. llvm-svn: 279755	2016-08-25 17:17:46 +00:00
Simon Pilgrim	5aa9c203ac	[X86][SSE] INSERTPS is only combined on v4f32 types. NFCI. llvm-svn: 279751	2016-08-25 17:02:00 +00:00
Ron Lieberman	a3c739b977	[Hexagon] Remove extraneous debug output from HexagonCopyToCombine.cpp BB# ... llvm-svn: 279750	2016-08-25 16:46:09 +00:00
Wei Mi	59ca96636d	[UNROLL] Postpone ScalarEvolution::forgetLoop after TripCountSC is expanded when unroll runtime iteration loop. In llvm::UnrollRuntimeLoopRemainder, if the loop to be unrolled is the inner loop inside a loop nest, the scalar evolution needs to be dropped for its parent loop which is done by ScalarEvolution::forgetLoop. However, we can postpone forgetLoop to the end of UnrollRuntimeLoopRemainder so TripCountSC expansion can still reuse existing value. Differential Revision: https://reviews.llvm.org/D23572 llvm-svn: 279748	2016-08-25 16:17:18 +00:00
Simon Pilgrim	6fe4a9ed1e	Fix line endings llvm-svn: 279745	2016-08-25 15:45:27 +00:00
Ron Lieberman	c93d123b86	[Hexagon] vector store print tracing. Add vector store print tracing option for hexagon vector instructions. https://reviews.llvm.org/D23870 llvm-svn: 279739	2016-08-25 13:35:48 +00:00
Simon Pilgrim	0ad9f3e93b	[X86][AVX] Provide SubVectorBroadcast fallback if load fold fails (PR29133) Fix for PR29133, matching the approach that was taken for AVX1 scalar broadcasts. llvm-svn: 279735	2016-08-25 12:45:16 +00:00
Sebastian Pop	5f0d0e60d1	GVN-hoist: fix hoistingFromAllPaths for loops (PR29034) It is invalid to hoist stores or loads if they are not executed on all paths from the hoisting point to the exit of the function. In the testcase, there are paths in the loop that do not execute the stores or the loads, and so hoisting them within the loop is unsafe. The problem is that the current implementation of hoistingFromAllPaths is incomplete: it walks all blocks dominated by the hoisting point, and does not return false when the loop contains a path on which the hoisted ld/st is not executed. Differential Revision: https://reviews.llvm.org/D23843 llvm-svn: 279732	2016-08-25 11:55:47 +00:00
Craig Topper	5ef7a0f45a	[X86] Simplify getOperandBias as a bit. NFC There's no reason for it to return a signed type. Just return the operand bias in each if instead of starting from 0 and adding in the 'if'. llvm-svn: 279720	2016-08-25 04:16:10 +00:00
Craig Topper	969e56a2cc	[X86] Fix indentation per coding standards. NFC llvm-svn: 279719	2016-08-25 04:16:08 +00:00
George Burgess IV	b42e0e7fa3	Make buildbots happy. "warning: extra ‘;’ [-Wpedantic]" llvm-svn: 279703	2016-08-25 02:15:54 +00:00
Kyle Butt	c7f1eac514	TailDuplication: Don't pass MMI separately from MF. NFC MMI must match the function passed, and MF has a handle on MMI. Use that instead of accepting it as separate argument. No Functional Change. llvm-svn: 279701	2016-08-25 01:37:07 +00:00
Kyle Butt	3ed4273d33	TailDuplication: Save MF and reduce number of parameters. NFC Save the function in the class, and then don't pass it around. This reduces the number of parameters and makes calls to member functions simpler. No Functional Change. llvm-svn: 279700	2016-08-25 01:37:03 +00:00
George Burgess IV	ff7205ca85	Update a comment. r279696, which changed `LLVM_CONSTEXPR AliasAttr` to `const AliasAttr`, made this comment make less sense. llvm-svn: 279699	2016-08-25 01:29:55 +00:00
Matthias Braun	1eb473680a	MachineFunctionProperties/MIRParser: Rename AllVRegsAllocated->NoVRegs, compute it Rename AllVRegsAllocated to NoVRegs. This avoids the connotation of running after register and simply describes that no vregs are used in a machine function. With that we can simply compute the property and do not need to dump/parse it in .mir files. Differential Revision: http://reviews.llvm.org/D23850 llvm-svn: 279698	2016-08-25 01:27:13 +00:00
Kostya Serebryany	f67357c671	[libFuzzer] simplify the code, NFC llvm-svn: 279697	2016-08-25 01:25:03 +00:00
George Burgess IV	381fc0ee3c	Make some LLVM_CONSTEXPR variables const. NFC. This patch changes LLVM_CONSTEXPR variable declarations to const variable declarations, since LLVM_CONSTEXPR expands to nothing if the current compiler doesn't support constexpr. In all of the changed cases, it looks like the code intended the variable to be const instead of sometimes-constexpr sometimes-not. llvm-svn: 279696	2016-08-25 01:05:08 +00:00
Eugene Zelenko	1804a77b2a	Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes. Differential revision: https://reviews.llvm.org/D23861 llvm-svn: 279695	2016-08-25 00:45:04 +00:00
Xinliang David Li	cad3a995a4	[Profile] Propagate branch metadata properly in instcombine Differential Revision: http://reviews.llvm.org/D23590 llvm-svn: 279693	2016-08-25 00:26:32 +00:00
Kostya Serebryany	41bcb830af	[libFuzzer] make a test more deterministic llvm-svn: 279686	2016-08-24 23:10:17 +00:00
Sanjay Patel	1655414903	[InstCombine] move foldICmpDivConstConst() contents to foldICmpDivConstant(); NFCI There was no logic in foldICmpDivConstant, so no need for a separate function. The code is directly copy/pasted, so further cleanups to follow. llvm-svn: 279685	2016-08-24 23:03:36 +00:00
Evgeny Stupachenko	d7f9c3564a	The patch improves ValueTracking on left shift with nsw flag. Summary: The patch fixes PR28946. Reviewers: majnemer, sanjoy Differential Revision: http://reviews.llvm.org/D23296 From: Li Huang llvm-svn: 279684	2016-08-24 23:01:33 +00:00
Heejin Ahn	b6cd5121b7	[WebAssembly] Change a comment line Test for commit access. llvm-svn: 279683	2016-08-24 22:53:00 +00:00
Krzysztof Parzyszek	6dff336ad1	[Hexagon] Check for block end when skipping debug instructions llvm-svn: 279681	2016-08-24 22:36:35 +00:00
Matthias Braun	a319e2cae0	MIRParser/MIRPrinter: Compute HasInlineAsm instead of printing/parsing it llvm-svn: 279680	2016-08-24 22:34:06 +00:00
Krzysztof Parzyszek	951fb36120	[Hexagon] Change insertion of expand-condsets pass to avoid memory leaks llvm-svn: 279678	2016-08-24 22:27:36 +00:00
Sanjay Patel	d398d4a39e	[InstCombine] use m_APInt to allow icmp eq/ne (shr X, C2), C folds for splat constant vectors llvm-svn: 279677	2016-08-24 22:22:06 +00:00
Matthias Braun	f1b20c5225	MachineRegisterInfo/MIR: Initialize tracksSubRegLiveness early, do not print/parser it tracksSubRegLiveness only depends on the Subtarget and a cl::opt, there is not need to change it or save/parse it in a .mir file. Make the field const and move the initialization LiveIntervalAnalysis to the MachineRegisterInfo constructor. Also cleanup some code and fix some instances which better use MachineRegisterInfo::subRegLivenessEnabled() instead of TargetSubtargetInfo::enableSubRegLiveness(). llvm-svn: 279676	2016-08-24 22:17:45 +00:00
Kyle Butt	a8c7371d16	CodeGen: If Convert blocks that would form a diamond when tail-merged. The following function currently relies on tail-merging for if conversion to succeed. The common tail of cond_true and cond_false is extracted, and this then forms a diamond pattern that can be successfully if converted. If this block does not get extracted, either because tail-merging is disabled or the threshold is higher, we should still recognize this pattern and if-convert it. Fixed a regression in the original commit. Need to un-reverse branches after reversing them, or other conversions go awry. define i32 @t2(i32 %a, i32 %b) nounwind { entry: %tmp1434 = icmp eq i32 %a, %b ; <i1> [#uses=1] br i1 %tmp1434, label %bb17, label %bb.outer bb.outer: ; preds = %cond_false, %entry %b_addr.021.0.ph = phi i32 [ %b, %entry ], [ %tmp10, %cond_false ] %a_addr.026.0.ph = phi i32 [ %a, %entry ], [ %a_addr.026.0, %cond_false ] br label %bb bb: ; preds = %cond_true, %bb.outer %indvar = phi i32 [ 0, %bb.outer ], [ %indvar.next, %cond_true ] %tmp. = sub i32 0, %b_addr.021.0.ph %tmp.40 = mul i32 %indvar, %tmp. %a_addr.026.0 = add i32 %tmp.40, %a_addr.026.0.ph %tmp3 = icmp sgt i32 %a_addr.026.0, %b_addr.021.0.ph br i1 %tmp3, label %cond_true, label %cond_false cond_true: ; preds = %bb %tmp7 = sub i32 %a_addr.026.0, %b_addr.021.0.ph %tmp1437 = icmp eq i32 %tmp7, %b_addr.021.0.ph %indvar.next = add i32 %indvar, 1 br i1 %tmp1437, label %bb17, label %bb cond_false: ; preds = %bb %tmp10 = sub i32 %b_addr.021.0.ph, %a_addr.026.0 %tmp14 = icmp eq i32 %a_addr.026.0, %tmp10 br i1 %tmp14, label %bb17, label %bb.outer bb17: ; preds = %cond_false, %cond_true, %entry %a_addr.026.1 = phi i32 [ %a, %entry ], [ %tmp7, %cond_true ], [ %a_addr.026.0, %cond_false ] ret i32 %a_addr.026.1 } Without tail-merging or diamond-tail if conversion: LBB1_1: @ %bb @ =>This Inner Loop Header: Depth=1 cmp r0, r1 ble LBB1_3 @ BB#2: @ %cond_true @ in Loop: Header=BB1_1 Depth=1 subs r0, r0, r1 cmp r1, r0 it ne cmpne r0, r1 bgt LBB1_4 LBB1_3: @ %cond_false @ in Loop: Header=BB1_1 Depth=1 subs r1, r1, r0 cmp r1, r0 bne LBB1_1 LBB1_4: @ %bb17 bx lr With diamond-tail if conversion, but without tail-merging: @ BB#0: @ %entry cmp r0, r1 it eq bxeq lr LBB1_1: @ %bb @ =>This Inner Loop Header: Depth=1 cmp r0, r1 ite le suble r1, r1, r0 subgt r0, r0, r1 cmp r1, r0 bne LBB1_1 @ BB#2: @ %bb17 bx lr llvm-svn: 279671	2016-08-24 21:34:27 +00:00
Kyle Butt	6262ca3448	IfConversion: Rescan diamonds. The cost of predicating a diamond is only the instructions that are not shared between the two branches. Additionally If a predicate clobbering instruction occurs in the shared portion of the branches (e.g. a cond move), it may still be possible to if convert the sub-cfg. This change handles these two facts by rescanning the non-shared portion of a diamond sub-cfg to recalculate both the predication cost and whether both blocks are pred-clobbering. Fixed 2 bugs before recommitting. Branch instructions must be compared and found identical before diamond conversion. Also, predicate-clobbering instructions in the shared prefix disqualifies a potential diamond conversion. Includes tests for both. llvm-svn: 279670	2016-08-24 21:34:24 +00:00
Tim Northover	9c3633f516	ARM: don't diagnose cbz/cbnz to Thumb functions. A branch-distance to a Thumb function shouldn't be forced to be odd for CBZ/CBNZ instructions because (assuming it's within range), it's going to be a valid, even offset. llvm-svn: 279665	2016-08-24 21:21:29 +00:00
Changpeng Fang	75f0968b39	AMDGCN/SI: Implement readlane/readfirstlane intrinsics Summary: This patch implements readlane/readfirstlane intrinsics. TODO: need to define a new register class to consider the case that the source could be a vector register or M0. Reviewed by: arsenm and tstellarAMD Differential Revision: http://reviews.llvm.org/D22489 llvm-svn: 279660	2016-08-24 20:35:23 +00:00
Rafael Espindola	70c6a3976b	Use isTargetMachO instead of isTargetDarwin. llvm-svn: 279655	2016-08-24 19:02:29 +00:00
Simon Pilgrim	e14653e17d	[X86][SSE] Add MINSD/MAXSD/MINSS/MAXSS intrinsic scalar load folding support These are no different in load behaviour to the existing ADD/SUB/MUL/DIV scalar ops but were missing from isNonFoldablePartialRegisterLoad llvm-svn: 279652	2016-08-24 18:40:53 +00:00
David Blaikie	a01f295322	DebugInfo: Add flag to CU to disable emission of inline debug info into the skeleton CU In cases where .dwo/.dwp files are guaranteed to be available, skipping the extra online (in the .o file) inline info can save a substantial amount of space - see the original r221306 for more details there. llvm-svn: 279650	2016-08-24 18:29:49 +00:00
Matthew Simpson	abd2be1e2e	[LV] Unify vector and scalar maps This patch unifies the data structures we use for mapping instructions from the original loop to their corresponding instructions in the new loop. Previously, we maintained two distinct maps for this purpose: WidenMap and ScalarIVMap. WidenMap maintained the vector values each instruction from the old loop was represented with, and ScalarIVMap maintained the scalar values each scalarized induction variable was represented with. With this patch, all values created for the new loop are maintained in VectorLoopValueMap. The change allows for several simplifications. Previously, when an instruction was scalarized, we had to insert the scalar values into vectors in order to maintain the mapping in WidenMap. Then, if a user of the scalarized value was also scalar, we had to extract the scalar values from the temporary vector we created. We now aovid these unnecessary scalar-to-vector-to-scalar conversions. If a scalarized value is used by a scalar instruction, the scalar value is used directly. However, if the scalarized value is needed by a vector instruction, we generate the needed insertelement instructions on-demand. A common idiom in several locations in the code (including the scalarization code), is to first get the vector values an instruction from the original loop maps to, and then extract a particular scalar value. This patch adds getScalarValue for this purpose along side getVectorValue as an interface into VectorLoopValueMap. These functions work together to return the requested values if they're available or to produce them if they're not. The mapping has also be made less permissive. Entries can be added to VectorLoopValue map with the new initVector and initScalar functions. getVectorValue has been modified to return a constant reference to the mapped entries. There's no real functional change with this patch; however, in some cases we will generate slightly different code. For example, instead of an insertelement sequence following the definition of an instruction, it will now precede the first use of that instruction. This can be seen in the test case changes. Differential Revision: https://reviews.llvm.org/D23169 llvm-svn: 279649	2016-08-24 18:23:17 +00:00
Evandro Menezes	5395187fe5	[AArch64] Adjust the feature set for Exynos M1. Enable zero cycle zeroing. llvm-svn: 279648	2016-08-24 18:17:30 +00:00
Sanjoy Das	ff855b6020	[SCCP] Don't delete side-effecting instructions I'm not sure if the `!isa<CallInst>(Inst) && !isa<TerminatorInst>(Inst))` bit is correct either, but this fixes the case we know is broken. llvm-svn: 279647	2016-08-24 18:10:21 +00:00
Simon Pilgrim	941bd6bbae	[X86][SSE] Add support for combining VZEXT_MOVL target shuffles Includes adding more general support for the pattern: VZEXT_MOVL(VZEXT_LOAD(ptr)) -> VZEXT_LOAD(ptr) This has unearthed a couple of latent poor codegen issues (MINSS/MAXSS scalar load folding and MOVDDUP/BROADCAST load folding patterns), which will be fixed shortly. Its also reduced a couple of tests so that they no longer reach the instruction threshold necessary to be combined to PSHUFB (see PR26183). llvm-svn: 279646	2016-08-24 18:07:53 +00:00
Krzysztof Parzyszek	b5ec48755d	[Hexagon] Enable subregister liveness tracking llvm-svn: 279642	2016-08-24 17:17:39 +00:00
Krzysztof Parzyszek	cbd559f507	[Hexagon] Remove the utilization of IMPLICIT_DEFs from expand-condsets This is no longer necessary, because since r279625 the subregister liveness properly accounts for read-undefs. llvm-svn: 279637	2016-08-24 16:36:37 +00:00
Nico Weber	0c28557a59	fix typo 'varaible' in assert llvm-svn: 279636	2016-08-24 16:34:54 +00:00
Wei Ding	1041a646a9	AMDGPU : Add V_SAD_U32 instruction pattern. Differential Revision: http://reviews.llvm.org/D23069 llvm-svn: 279629	2016-08-24 14:59:47 +00:00
Sanjay Patel	8e297749c1	[InstCombine] add assert and explanatory comment for fold removed in r279568; NFC I deleted a fold from InstCombine at: https://reviews.llvm.org/rL279568 because it (like any InstCombine to a constant?) should always happen in InstSimplify, however, it's not obvious what the assumptions are in the remaining code. Add a comment and assert to make it clearer. Differential Revision: https://reviews.llvm.org/D23819 llvm-svn: 279626	2016-08-24 13:55:55 +00:00
Krzysztof Parzyszek	a7ed090bba	Create subranges for new intervals resulting from live interval splitting The register allocator can split a live interval of a register into a set of smaller intervals. After the allocation of registers is complete, the rewriter will modify the IR to replace virtual registers with the corres- ponding physical registers. At this stage, if a register corresponding to a subregister of a virtual register is used, the rewriter will check if that subregister is undefined, and if so, it will add the <undef> flag to the machine operand. The function verifying liveness of the subregis- ter would assume that it is undefined, unless any of the subranges of the live interval proves otherwise. The problem is that the live intervals created during splitting do not have any subranges, even if the original parent interval did. This could result in the <undef> flag placed on a register that is actually defined. Differential Revision: http://reviews.llvm.org/D21189 llvm-svn: 279625	2016-08-24 13:37:55 +00:00
Simon Dardis	f114820912	[mips] Preparatory work for a generic scheduler Extend instruction definitions from nearly all ISAs to include appropriate instruction itineraries. Change MIPS16s gp prologue generation to use real instructions instead of using a pseudo instruction. Reviewers: dsanders, vkalintiris Differential Review: https://reviews.llvm.org/D23548 llvm-svn: 279623	2016-08-24 13:00:47 +00:00
Simon Pilgrim	7a50c8c2ba	[X86][AVX2] Ensure on 32-bit targets that we broadcast f64 types not i64 (PR29101) llvm-svn: 279622	2016-08-24 12:42:31 +00:00
Gil Rapaport	550148b2f6	[Loop Vectorizer] Support predication of div/rem div/rem instructions in basic blocks that require predication currently prevent vectorization. This patch extends the existing mechanism for predicating stores to handle other instructions and leverages it to predicate divs and rems. Differential Revision: https://reviews.llvm.org/D22918 llvm-svn: 279620	2016-08-24 11:37:57 +00:00
Simon Pilgrim	6392b8d4ce	[X86][SSE] Add support for 32-bit element vectors to X86ISD::VZEXT_LOAD Consecutive load matching (EltsFromConsecutiveLoads) currently uses VZEXT_LOAD (load scalar into lowest element and zero uppers) for vXi64 / vXf64 vectors only. For vXi32 / vXf32 vectors it instead creates a scalar load, SCALAR_TO_VECTOR and finally VZEXT_MOVL (zero upper vector elements), relying on tablegen patterns to match this into an equivalent of VZEXT_LOAD. This patch adds the VZEXT_LOAD patterns for vXi32 / vXf32 vectors directly and updates EltsFromConsecutiveLoads to use this. This has proven necessary to allow us to easily make VZEXT_MOVL a full member of the target shuffle set - without this change the call to combineShuffle (which is the main caller of EltsFromConsecutiveLoads) tended to recursively recreate VZEXT_MOVL nodes...... Differential Revision: https://reviews.llvm.org/D23673 llvm-svn: 279619	2016-08-24 10:46:40 +00:00
Chandler Carruth	8882346842	[PM] Introduce basic update capabilities to the new PM's CGSCC pass manager, including both plumbing and logic to handle function pass updates. There are three fundamentally tied changes here: 1) Plumbing some mechanism for updating the CGSCC pass manager as the CG changes while passes are running. 2) Changing the CGSCC pass manager infrastructure to have support for the underlying graph to mutate mid-pass run. 3) Actually updating the CG after function passes run. I can separate them if necessary, but I think its really useful to have them together as the needs of #3 drove #2, and that in turn drove #1. The plumbing technique is to extend the "run" method signature with extra arguments. We provide the call graph that intrinsically is available as it is the basis of the pass manager's IR units, and an output parameter that records the results of updating the call graph during an SCC passes's run. Note that "...UpdateResult" isn't a great name here... suggestions very welcome. I tried a pretty frustrating number of different data structures and such for the innards of the update result. Every other one failed for one reason or another. Sometimes I just couldn't keep the layers of complexity right in my head. The thing that really worked was to just directly provide access to the underlying structures used to walk the call graph so that their updates could be informed by the particular nature of the change to the graph. The technique for how to make the pass management infrastructure cope with mutating graphs was also something that took a really, really large number of iterations to get to a place where I was happy. Here are some of the considerations that drove the design: - We operate at three levels within the infrastructure: RefSCC, SCC, and Node. In each case, we are working bottom up and so we want to continue to iterate on the "lowest" node as the graph changes. Look at how we iterate over nodes in an SCC running function passes as those function passes mutate the CG. We continue to iterate on the "lowest" SCC, which is the one that continues to contain the function just processed. - The call graph structure re-uses SCCs (and RefSCCs) during mutation events for the highest entry in the resulting new subgraph, not the lowest. This means that it is necessary to continually update the current SCC or RefSCC as it shifts. This is really surprising and subtle, and took a long time for me to work out. I actually tried changing the call graph to provide the opposite behavior, and it breaks EVERYTHING. The graph update algorithms are really deeply tied to this particualr pattern. - When SCCs or RefSCCs are split apart and refined and we continually re-pin our processing to the bottom one in the subgraph, we need to enqueue the newly formed SCCs and RefSCCs for subsequent processing. Queuing them presents a few challenges: 1) SCCs and RefSCCs use wildly different iteration strategies at a high level. We end up needing to converge them on worklist approaches that can be extended in order to be able to handle the mutations. 2) The order of the enqueuing need to remain bottom-up post-order so that we don't get surprising order of visitation for things like the inliner. 3) We need the worklists to have set semantics so we don't duplicate things endlessly. We don't need a persistent set though because we always keep processing the bottom node!!!! This is super, super surprising to me and took a long time to convince myself this is correct, but I'm pretty sure it is... Once we sink down to the bottom node, we can't re-split out the same node in any way, and the postorder of the current queue is fixed and unchanging. 4) We need to make sure that the "current" SCC or RefSCC actually gets enqueued here such that we re-visit it because we continue processing a new, bottom SCC/RefSCC. - We also need the ability to skip SCCs and RefSCCs that get merged into a larger component. We even need the ability to skip nodes from an SCC that are no longer part of that SCC. This led to the design you see in the patch which uses SetVector-based worklists. The RefSCC worklist is always empty until an update occurs and is just used to handle those RefSCCs created by updates as the others don't even exist yet and are formed on-demand during the bottom-up walk. The SCC worklist is pre-populated from the RefSCC, and we push new SCCs onto it and blacklist existing SCCs on it to get the desired processing. We then directly update these when updating the call graph as I was never able to find a satisfactory abstraction around the update strategy. Finally, we need to compute the updates for function passes. This is mostly used as an initial customer of all the update mechanisms to drive their design to at least cover some real set of use cases. There are a bunch of interesting things that came out of doing this: - It is really nice to do this a function at a time because that function is likely hot in the cache. This means we want even the function pass adaptor to support online updates to the call graph! - To update the call graph after arbitrary function pass mutations is quite hard. We have to build a fairly comprehensive set of data structures and then process them. Fortunately, some of this code is related to the code for building the cal graph in the first place. Unfortunately, very little of it makes any sense to share because the nature of what we're doing is so very different. I've factored out the one part that made sense at least. - We need to transfer these updates into the various structures for the CGSCC pass manager. Once those were more sanely worked out, this became relatively easier. But some of those needs necessitated changes to the LazyCallGraph interface to make it significantly easier to extract the changed SCCs from an update operation. - We also need to update the CGSCC analysis manager as the shape of the graph changes. When an SCC is merged away we need to clear analyses associated with it from the analysis manager which we didn't have support for in the analysis manager infrsatructure. New SCCs are easy! But then we have the case that the original SCC has its shape changed but remains in the call graph. There we need to invalidate the analyses associated with it. - We also need to invalidate analyses after we finish processing an SCC. But the analyses we need to invalidate here are only those for the newly updated SCC!!! Because we only continue processing the bottom SCC, if we split SCCs apart the original one gets invalidated once when its shape changes and is not processed farther so its analyses will be correct. It is the bottom SCC which continues being processed and needs to have the "normal" invalidation done based on the preserved analyses set. All of this is mostly background and context for the changes here. Many thanks to all the reviewers who helped here. Especially Sanjoy who caught several interesting bugs in the graph algorithms, David, Sean, and others who all helped with feedback. Differential Revision: http://reviews.llvm.org/D21464 llvm-svn: 279618	2016-08-24 09:37:14 +00:00
Gor Nishanov	4570e26e68	[Coroutines] Fix unused var warning in release build llvm-svn: 279610	2016-08-24 05:20:30 +00:00
Gor Nishanov	241b041fba	[Coroutines] Part 8: Coroutine Frame Building algorithm Summary: This patch adds coroutine frame building algorithm. Now, simple coroutines such as ex0.ll and ex1.ll (first examples from docs\Coroutines.rst can be compiled). Documentation and overview is here: http://llvm.org/docs/Coroutines.html. Upstreaming sequence (rough plan) 1.Add documentation. (https://reviews.llvm.org/D22603) 2.Add coroutine intrinsics. (https://reviews.llvm.org/D22659) ... 7. Split coroutine into subfunctions. (https://reviews.llvm.org/D23461) 8. Coroutine Frame Building algorithm <= we are here 9. Add f.cleanup subfunction. 10+. The rest of the logic Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D23586 llvm-svn: 279609	2016-08-24 04:44:35 +00:00
Chandler Carruth	eb232dc90d	Preserve a pointer to the newly allocated signal stack as well. That too is flagged by LSan at least among leak detectors. llvm-svn: 279605	2016-08-24 03:42:51 +00:00
Matthias Braun	3a133159cc	TargetSchedule: Do not consider subregister definitions as reads. We should not consider subregister definitions as reads for schedule model purposes (they are just modeled as reads of the overal vreg for liveness calculation purposes, the CPU instructions are not actually reading). Unfortunately I cannot submit a test for this as it requires a target which uses ReadAdvance annotation in the scheduling model and has subregister liveness enabled at the same time, which is only the case on an out of tree target. llvm-svn: 279604	2016-08-24 02:32:29 +00:00
Matthias Braun	733fe3676c	CodeGen: Remove MachineFunctionAnalysis => Enable (Machine)ModulePasses Re-apply this patch, hopefully I will get away without any warnings in the constructor now. This patch removes the MachineFunctionAnalysis. Instead we keep a map from IR Function to MachineFunction in the MachineModuleInfo. This allows the insertion of ModulePasses into the codegen pipeline without breaking it because the MachineFunctionAnalysis gets dropped before a module pass. Peak memory should stay unchanged without a ModulePass in the codegen pipeline: Previously the MachineFunction was freed at the end of a codegen function pipeline because the MachineFunctionAnalysis was dropped; With this patch the MachineFunction is freed after the AsmPrinter has finished. Differential Revision: http://reviews.llvm.org/D23736 llvm-svn: 279602	2016-08-24 01:52:46 +00:00
Kostya Serebryany	bceadcf1cd	[libFuzzer] use __attribute__((target("popcnt"))) only on x86_64 llvm-svn: 279601	2016-08-24 01:38:42 +00:00
Matthias Braun	79f85b3b8f	MIRParser/MIRPrinter: Compute isSSA instead of printing/parsing it. Specifying isSSA is an extra line at best and results in invalid MI at worst. Compute the value instead. Differential Revision: http://reviews.llvm.org/D22722 llvm-svn: 279600	2016-08-24 01:32:41 +00:00
Richard Smith	b31163136c	Increase the size of the sigaltstack used by LLVM signal handlers. 8KB is not sufficient in some cases; increase to 64KB, which should be enough for anyone :) Patch by github.com/bryant! llvm-svn: 279599	2016-08-24 00:54:49 +00:00
Matthias Braun	c3b2e80b9d	MachineModuleInfo: Avoid dummy constructor, use INITIALIZE_TM_PASS Change this pass constructor to just accept a const TargetMachine * and use INITIALIZE_TM_PASS, that way we can get rid of the dummy constructor. The pass will still fail when calling the default constructor leading to TM == nullptr, this is no different than before but is more in line what other codegen passes are doing and avoids the dummy constructor. llvm-svn: 279598	2016-08-24 00:42:05 +00:00
David Callahan	012d1c0766	[ADCE] Add control dependence computation Summary: This is part of a serious of patches to evolve ADCE.cpp to support removing of unnecessary control flow. This patch adds the ability to compute control dependences using the iterated dominance frontier. We extend the liveness propagation to alternate between data and control dependences until convergences. Modify the pass manager intergation to compute the post-dominator tree needed for iterator dominance frontier. We still force all terminators live for now until we add code to handlinge removing control flow in a later patch. No changes to effective behavior with this patch Previous patches: D23225 [ADCE] Modify data structures to support removing control flow D23065 [ADCE] Refactor anticipating new functionality (NFC) D23102 [ADCE] Refactoring for new functionality (NFC) Reviewers: nadav, majnemer, mehdi_amini Subscribers: twoh, freik, llvm-commits Differential Revision: https://reviews.llvm.org/D23559 llvm-svn: 279594	2016-08-24 00:10:06 +00:00
Philip Reames	d06a1b4cdc	[stackmaps] Remove an unneeded member variable [NFC] llvm-svn: 279590	2016-08-23 23:58:08 +00:00
Kostya Serebryany	ac524cfcce	[libFuzzer] collect 64 states for value profile, not 65 llvm-svn: 279588	2016-08-23 23:37:37 +00:00
Philip Reames	e83c4b30ca	[stackmaps] More extraction of common code [NFCI] General cleanup before starting to work on the part I want to actually change. llvm-svn: 279586	2016-08-23 23:33:29 +00:00
Michael Zolotukhin	bd63d436c1	[LoopUnroll] By default disable unrolling when optimizing for size. Summary: In clang commit r268509 we started to invoke loop-unroll pass from the driver even under -Os. However, we happen to not initialize optsize thresholds properly, which si fixed with this change. r268509 led to some big compile time regressions, because we started to unroll some loops that we didn't unroll before. With this change I hope to recover most of the regressions. We still are slightly slower than before, because we do some checks here and there in loop-unrolling before we bail out, but at least the slowdown is not that huge now. Reviewers: hfinkel, chandlerc Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D23388 llvm-svn: 279585	2016-08-23 23:13:15 +00:00
Richard Smith	84c4cc47f5	Don't use "return {...}" to initialize a std::tuple. This has only been valid since 2015 (n4387), though it's allowed by a library DR so new implementations accept it in their C++11 modes... This should unbreak the build with libstdc++ 4.9. llvm-svn: 279583	2016-08-23 22:21:58 +00:00
Richard Smith	418237bed8	#ifdef out validation code when asserts are disabled to remove unused variable warnings. llvm-svn: 279582	2016-08-23 22:14:15 +00:00
Richard Smith	eae6138936	Remove unused data member to unbreak -Werror builds. llvm-svn: 279581	2016-08-23 22:10:46 +00:00
Richard Smith	8c3fbdc6c4	Revert r279564. It introduces undefined behavior (binding a reference to a dereferenced null pointer) in MachineModuleInfo::MachineModuleInfo that causes -Werror builds (including several buildbots) to fail. llvm-svn: 279580	2016-08-23 22:08:27 +00:00
Sanjay Patel	d64e988701	[InstCombine] use local variables for repeated values; NFCI llvm-svn: 279578	2016-08-23 22:05:55 +00:00
Petr Hosek	731bb9cf1e	[MC] Support .dc directives in assembler parser While these directives are mostly aliases for the existing integer and float value directives, some of them like .dc.a have no direct equivalents and are sometimes being used for convenience. Differential Revision: https://reviews.llvm.org/D23810 llvm-svn: 279577	2016-08-23 21:34:53 +00:00
Mehdi Amini	adc0e26bef	[ThinLTO] Add caching to the new LTO API Add the ability to plug a cache on the LTO API. I tried to write such that a linker implementation can control the cache backend. This is intrusive and I'm not totally happy with it, but I can't figure out a better design right now. Differential Revision: https://reviews.llvm.org/D23599 llvm-svn: 279576	2016-08-23 21:30:12 +00:00
Sanjay Patel	dcac0dfca9	[InstCombine] move foldICmpShrConstConst() contents to foldICmpShrConst(); NFCI There will only be 3 lines of code in foldICmpShrConst() when the cleanup is done, so it doesn't make much sense to have a separate function for a single fold. llvm-svn: 279575	2016-08-23 21:25:13 +00:00
Philip Reames	570dd009c3	[stackmaps] Extract out magic constants [NFCI] This is a first step towards clarifying the exact MI semantics of stackmap's "live values". llvm-svn: 279574	2016-08-23 21:21:43 +00:00
Matthias Braun	90799ce8b2	MachineFunction: Introduce NoPHIs property I want to compute the SSA property of .mir files automatically in upcoming patches. The problem with this is that some inputs will be reported as static single assignment with some passes claiming not to support SSA form. In reality though those passes do not support PHI instructions => Track the presence of PHI instructions separate from the SSA property. Differential Revision: https://reviews.llvm.org/D22719 llvm-svn: 279573	2016-08-23 21:19:49 +00:00
Sanjay Patel	6ef22da9ec	[InstCombine] remove icmp shr folds that are already handled by InstSimplify AFAICT, these already worked in all cases for scalar types, and I enhanced the code to work for vector types in: https://reviews.llvm.org/rL279543 llvm-svn: 279568	2016-08-23 21:01:35 +00:00
Tim Northover	bdf67c9a00	GlobalISel: make truncate/extend casts uniform They really should have both types represented, but early variants were created before MachineInstrs could have multiple types so they're rather ambiguous. llvm-svn: 279567	2016-08-23 21:01:33 +00:00
Tim Northover	6cd4b23a0f	GlobalISel: legalize integer comparisons on AArch64. Next step is doing both legalizations at the same time! Marvel at GlobalISel's cunning. llvm-svn: 279566	2016-08-23 21:01:26 +00:00
Tim Northover	b3a0be4d38	GlobalISel: legalize conditional branches on AArch64. llvm-svn: 279565	2016-08-23 21:01:20 +00:00
Matthias Braun	4c1f1f120c	CodeGen: Remove MachineFunctionAnalysis => Enable (Machine)ModulePasses Re-apply this commit with the deletion of a MachineFunction delegated to a separate pass to avoid use after free when doing this directly in AsmPrinter. This patch removes the MachineFunctionAnalysis. Instead we keep a map from IR Function to MachineFunction in the MachineModuleInfo. This allows the insertion of ModulePasses into the codegen pipeline without breaking it because the MachineFunctionAnalysis gets dropped before a module pass. Peak memory should stay unchanged without a ModulePass in the codegen pipeline: Previously the MachineFunction was freed at the end of a codegen function pipeline because the MachineFunctionAnalysis was dropped; With this patch the MachineFunction is freed after the AsmPrinter has finished. Differential Revision: http://reviews.llvm.org/D23736 llvm-svn: 279564	2016-08-23 20:58:29 +00:00
David Majnemer	54690dcdb0	[ValueTracking] Use a function_ref to avoid multiple instantiations No functional change intended, this should just be a code size improvement. llvm-svn: 279563	2016-08-23 20:52:00 +00:00
Matthew Simpson	df2ab917ad	[SLP] Avoid signed integer overflow The test case included with r279125 exposed an existing signed integer overflow. Since getTreeCost can return INT_MAX, we can't sum this cost together with other costs, such as getReductionCost. This patch removes the possibility of assigning a cost of INT_MAX. Since we were previously using INT_MAX as an indicator for "should not vectorize", we now explicitly check this condition with "isTreeTinyAndNotFullyVectorizable" before computing a cost. This patch adds a run-line to the test case used for r279125 that ensures we don't vectorize. Previously, this line would vectorize the test case by chance due to undefined behavior in the cost calculation. Differential Revision: https://reviews.llvm.org/D23723 llvm-svn: 279562	2016-08-23 20:48:50 +00:00
Zachary Turner	f6884a1aac	Remove unused translation unit. llvm-svn: 279561	2016-08-23 20:08:02 +00:00
Tim Northover	a01bece1dc	GlobalISel: extend legalizer interface to handle multiple types. Instructions like G_ICMP have multiple types that may need to be legalized (the boolean output and nearly arbitrary inputs in this case). So the legalizer must be capable of deciding what to do for each of them separately. llvm-svn: 279554	2016-08-23 19:30:42 +00:00
Tim Northover	456a3c03ac	GlobalISel: mark pointer casts legal on AArch64. llvm-svn: 279553	2016-08-23 19:30:38 +00:00
Mehdi Amini	e7494530b2	Stop always creating and running an LTO compilation if there is not a single LTO object Summary: I assume there was a use case, so maybe this strawman patch will help clarifying if it is legit. In any case the current situation is not legit: a ThinLTO compilation should not trigger an unexpected full LTO compilation. Right now, adding a --save-temps option triggers this and makes the number of output differs. Reviewers: tejohnson Subscribers: pcc, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23600 llvm-svn: 279550	2016-08-23 18:39:12 +00:00
Tim Northover	3c73e367c0	GlobalISel: legalize 1-bit load/store and mark 8/16 bit variants legal on AArch64. llvm-svn: 279548	2016-08-23 18:20:09 +00:00
Sanjay Patel	6946e2ade3	[InstSimplify] allow icmp with constant folds for splat vectors, part 2 Completes the m_APInt changes for simplifyICmpWithConstant(). Other commits in this series: https://reviews.llvm.org/rL279492 https://reviews.llvm.org/rL279530 https://reviews.llvm.org/rL279534 https://reviews.llvm.org/rL279538 llvm-svn: 279543	2016-08-23 18:00:51 +00:00
Xinliang David Li	812288a5b4	Possible fix of test failures on win bots llvm-svn: 279542	2016-08-23 18:00:41 +00:00
Sanjay Patel	200e3cbfb0	[InstSimplify] allow icmp with constant folds for splat vectors, part 1 llvm-svn: 279538	2016-08-23 17:30:56 +00:00
Justin Lebar	1972e222ea	[SelectionDAG] Use a union of bitfield structs for SDNode::SubclassData. Summary: This greatly simplifies our handling of SDNode::SubclassData. NFC, hopefully. :) See discussion in D23035 for discussion about the design API of these bitfields. Reviewers: chandlerc Subscribers: llvm-commits, rnk Differential Revision: https://reviews.llvm.org/D23036 llvm-svn: 279537	2016-08-23 17:18:11 +00:00
Justin Lebar	0a33a7aefa	[CodeGen] Convert a loop to a for-each loop. NFC llvm-svn: 279536	2016-08-23 17:18:07 +00:00
Eugene Zelenko	33d7b762d0	Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes. Differential revision: https://reviews.llvm.org/D23789 llvm-svn: 279535	2016-08-23 17:14:32 +00:00
Mehdi Amini	9ec5a61358	[ThinLTO] Make sure the Context used for the ThinLTO backend has all the appropriate options An important performance setting on the LLVMContext for LTO is enableDebugTypeODRUniquing(), this adds an automatic merging of debug information in the context based on type ids. Also, the lto::Config includes a diagnostic handler that needs to be set on the Context, as well as the setDiscardValueNames() setting. llvm-svn: 279532	2016-08-23 16:53:34 +00:00
Pete Cooper	036b94dad3	Fix some more asserts after r279466. That commit added a new version of Intrinsic::getName which should only be called when the intrinsic has no overloaded types. There are several debugging paths, such as SDNode::dump which are printing the name of the intrinsic but don't have the overloaded types. These paths should be ok to just print the name instead of crashing. The fix here is ultimately to just add a 'None' second argument as that calls the overload capable getName, which is less efficient, but this is a debugging path anyway, and not perf critical. Thanks to Björn Pettersson for pointing out that there were more crashes. llvm-svn: 279528	2016-08-23 16:23:45 +00:00
Krzysztof Parzyszek	38e2ccc8d0	[Hexagon] Packetize return value setup with the return instruction Commit r279241 unintentionally reverted that ability. llvm-svn: 279526	2016-08-23 16:01:01 +00:00
Xinliang David Li	dc49140b44	[Profile] refactor meta data copying/swapping code Differential Revision: http://reviews.llvm.org/D23619 llvm-svn: 279523	2016-08-23 15:39:03 +00:00
Jacques Pienaar	8ed1cd291d	[lanai] Use const instead of constexpr The windows build bot did not like constexpr. llvm-svn: 279517	2016-08-23 14:36:53 +00:00
Elliot Colp	a4092104cb	Fix SystemZ hang caused by r279105 The change in r279105 causes an infinite loop in some cases, as it sets the upper bits of an AND mask constant, which DAGCombiner::SimplifyDemandedBits then unsets. This patch reverts that part of the behaviour, instead relying on .td peepholes to perform the transformation to NILL. I reapplied my original fix for the problem addressed by r279105 (unsetting the upper bits, which prevents a compiler abort for a different reason). Differential Revision: https://reviews.llvm.org/D23781 llvm-svn: 279515	2016-08-23 14:03:02 +00:00
Davide Italiano	fc4430ea45	[LTOCodeGenerator] Reduce code duplication. NFCI. llvm-svn: 279514	2016-08-23 12:32:57 +00:00
NAKAMURA Takumi	708591d231	LLVMLanaDesc: Update libdesp. llvm-svn: 279510	2016-08-23 10:47:40 +00:00
NAKAMURA Takumi	031a6330c4	Change the target's name, s/LanaiMCTargetDesc/LanaiDesc/g. "AllTargetsDescs" in llvm-mc/CMakeLists.txt expects not ${target}MCTargetDesc, but ${target}Desc. llvm-svn: 279509	2016-08-23 10:43:01 +00:00
Oliver Stannard	9aa6f010a4	[ARM] Generate consistent frame records for Thumb2 There is not an official documented ABI for frame pointers in Thumb2, but we should try to emit something which is useful. We use r7 as the frame pointer for Thumb code, which currently means that if a function needs to save a high register (r8-r11), it will get pushed to the stack between the frame pointer (r7) and link register (r14). This means that while a stack unwinder can follow the chain of frame pointers up the stack, it cannot know the offset to lr, so does not know which functions correspond to the stack frames. To fix this, we need to push the callee-saved registers in two batches, with the first push saving the low registers, fp and lr, and the second push saving the high registers. This is already implemented, but previously only used for iOS. This patch turns it on for all Thumb2 targets when frame pointers are required by the ABI, and the frame pointer is r7 (Windows uses r11, so this isn't a problem there). If frame pointer elimination is enabled we still emit a single push/pop even if we need a frame pointer for other reasons, to avoid increasing code size. We must also ensure that lr is pushed to the stack when using a frame pointer, so that we end up with a complete frame record. Situations that could cause this were rare, because we already push lr in most situations so that we can return using the pop instruction. Differential Revision: https://reviews.llvm.org/D23516 llvm-svn: 279506	2016-08-23 09:19:22 +00:00
Daniel Berlin	ea02eee18f	GVNHoist: Use the pass version of MemorySSA and preserve it. Summary: GVNHoist: Use the pass version of MemorySSA and preserve it. Reviewers: sebpop, george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23782 llvm-svn: 279504	2016-08-23 05:42:41 +00:00
Matthias Braun	7f66202d38	Revert "(HEAD -> master, origin/master, origin/HEAD) CodeGen: Remove MachineFunctionAnalysis => Enable (Machine)ModulePasses" Reverting while tracking down a use after free. This reverts commit r279502. llvm-svn: 279503	2016-08-23 05:17:11 +00:00
Matthias Braun	fd936841eb	CodeGen: Remove MachineFunctionAnalysis => Enable (Machine)ModulePasses This patch removes the MachineFunctionAnalysis. Instead we keep a map from IR Function to MachineFunction in the MachineModuleInfo. This allows the insertion of ModulePasses into the codegen pipeline without breaking it because the MachineFunctionAnalysis gets dropped before a module pass. Peak memory should stay unchanged without a ModulePass in the codegen pipeline: Previously the MachineFunction was freed at the end of a codegen function pipeline because the MachineFunctionAnalysis was dropped; With this patch the MachineFunction is freed after the AsmPrinter has finished. Differential Revision: http://reviews.llvm.org/D23736 llvm-svn: 279502	2016-08-23 03:20:09 +00:00
Matt Arsenault	567631bdd4	BranchRelaxation: Fix handling of blocks with multiple conditional branches Looping over all terminators exposed AArch64 tests hitting an assert from analyzeBranch failing. I believe these cases were miscompiled before. e.g. fcmp s0, s1 b.ne LBB0_1 b.vc LBB0_2 b LBB0_2 LBB0_1: ; Large block LBB0_2: ; ... Both of the individual conditional branches need to be expanded, since neither can reach the final block. Split the original block into ones which analyzeBranch will be able to understand. llvm-svn: 279499	2016-08-23 01:30:30 +00:00
Jacques Pienaar	0e2171904e	[lanai] Exit early in Mem Alu combiner if sentinel reach. LanaiMemAluCombiner could try to query the debug value of a list sentinel. Add check to exit early instead. llvm-svn: 279497	2016-08-23 01:04:41 +00:00
George Burgess IV	7f414b90ab	[MemorySSA] Remove unused field. NFC. Given that we're not currently using blocker info, and whether or not we will end up using it it is unclear, don't waste 8 (or 4) bytes of memory per path node. llvm-svn: 279493	2016-08-22 23:40:01 +00:00
Sanjay Patel	67bde28627	[InstSimplify] add helper function for SimplifyICmpInst(); NFCI And add a FIXME because the helper excludes folds for vectors. It's not clear yet how many of these are actually testable (and therefore necessary?) because later analysis uses computeKnownBits and other methods to catch many of these cases. llvm-svn: 279492	2016-08-22 23:12:02 +00:00
Pete Cooper	1523925daa	Fix crash from assert in r279466. The assert in r279466 checks that we call the correct version of Intrinsic::getName. The version which accepts only an ID should not be used for intrinsics with overloaded types. The global-isel code was calling the wrong version. The test CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll will ensure that we call the correct version from now on. llvm-svn: 279487	2016-08-22 22:27:05 +00:00
Sanjay Patel	c9196c4488	[InstCombine] change param type from Instruction to BinaryOperator for icmp helpers; NFCI This saves some casting in the helper functions and eases some further refactoring. llvm-svn: 279478	2016-08-22 21:24:29 +00:00
Tim Shen	f2187ed321	[GraphTraits] Replace all NodeType usage with NodeRef This should finish the GraphTraits migration. Differential Revision: http://reviews.llvm.org/D23730 llvm-svn: 279475	2016-08-22 21:09:30 +00:00
Duncan P. N. Exon Smith	b29ec1e040	ADT: Remove ilist_sentinel_traits, NFC Remove all the dead code around ilist_sentinel_traits. This is a follow-up to gutting them as part of r279314 (originally r278974), staged to prevent broken builds in sub-projects. Uses were removed from clang in r279457 and lld in r279458. llvm-svn: 279473	2016-08-22 20:51:00 +00:00
Sanjay Patel	a392049419	[InstCombine] use m_APInt to allow icmp (shr exact X, Y), 0 folds for splat constant vectors llvm-svn: 279472	2016-08-22 20:45:06 +00:00
Pete Cooper	067ee5b549	Add ADT headers to the cmake headers directory for LLVMSupport. NFC. Xcode and MSVC list the headers and source files for each library. LLVMSupport lists included the source files for ADT but not the headers. This add the ADT headers so that they are browsable by the UI. llvm-svn: 279470	2016-08-22 20:38:53 +00:00
Pete Cooper	a5f8c722c4	Add comments and an assert to follow-up on r279113. NFC. Philip commented on r279113 to ask for better comments as to when to use the different versions of getName. Its also possible to assert in the simple case that we aren't an overloaded intrinsic as those have to use the more capable version of getName. Thanks for the comments Philip. llvm-svn: 279466	2016-08-22 20:18:28 +00:00
Matt Arsenault	78fc9daf8d	AMDGPU: Split SILowerControlFlow into two pieces Do most of the lowering in a pre-RA pass. Keep the skip jump insertion late, plus a few other things that require more work to move out. One concern I have is now there may be COPY instructions which do not have the necessary implicit exec uses if they will be lowered to v_mov_b32. This has a positive effect on SGPR usage in shader-db. llvm-svn: 279464	2016-08-22 19:33:16 +00:00
Daniel Berlin	3d512a2dc2	MSSA: Factor out phi node placement llvm-svn: 279462	2016-08-22 19:14:30 +00:00
Daniel Berlin	868381bff6	MSSA: Only rename accesses whose defining access is nullptr llvm-svn: 279461	2016-08-22 19:14:16 +00:00
James Molloy	5bf2114265	[SimplifyCFG] Rewrite SinkThenElseCodeToEnd [Recommitting now an unrelated assertion in SROA is sorted out] The new version has several advantages: 1) IMSHO it's more readable and neater 2) It handles loads and stores properly 3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch. With this change we can now finally sink load-modify-store idioms such as: if (a) return b += 3; else return b += 4; => %z = load i32, i32* %y %.sink = select i1 %a, i32 5, i32 7 %b = add i32 %z, %.sink store i32 %b, i32* %y ret i32 %b When this works for switches it'll be even more powerful. Round 4. This time we should handle all instructions correctly, and not replace any operands that need to be constant with variables. This was really hard to determine safely, so the helper function should be put into the Instruction API. I'll do that as a followup. llvm-svn: 279460	2016-08-22 19:07:15 +00:00
James Molloy	0fee97f8ba	[SROA] Remove incorrect assertion Confirmed with aprantl, this assertion is incorrect - code can get here (for example 80-bit FP types) and if it does it's benign. This is exposed by a completely unrelated patch of mine, so stop the compiler falling over. Original differential: http://reviews.llvm.org/D16187 aprantl's advice to remove assertion: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160815/382129.html llvm-svn: 279454	2016-08-22 18:49:42 +00:00
Tim Shen	a5cc25e50f	[SSP] Do not set __guard_local to hidden for OpenBSD SSP __guard_local is defined as long on OpenBSD. If the source file contains a definition of __guard_local, it mismatches with the int8 pointer type used in LLVM. In that case, Module::getOrInsertGlobal() returns a cast operation instead of a GlobalVariable. Trying to set the visibility on the cast operation leads to random segfaults (seen when compiling the OpenBSD kernel, which also runs with stack protection). In the kernel, the hidden attribute does not matter. For userspace code, __guard_local is defined as hidden in the startup code. If a program re-defines __guard_local, the definition from the startup code will either win or the linker complains about multiple definitions (depending on whether the re-defined __guard_local is placed in the common segment or not). It also matches what gcc on OpenBSD does. Thanks Stefan Kempf <sisnkemp@gmail.com> for the patch! Differential Revision: http://reviews.llvm.org/D23674 llvm-svn: 279449	2016-08-22 18:26:27 +00:00
Jun Bum Lim	ec8b8cc595	[InstCombine] Allow sinking from unique predecessor with multiple edges Summary: We can allow sinking if the single user block has only one unique predecessor, regardless of the number of edges. Note that a switch statement with multiple cases can have the same destination. Reviewers: mcrosier, majnemer, spatel, reames Subscribers: reames, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D23722 llvm-svn: 279448	2016-08-22 18:21:56 +00:00
James Molloy	475f4a763f	Revert "[SimplifyCFG] Rewrite SinkThenElseCodeToEnd" This reverts commit r279443. It caused buildbot failures. llvm-svn: 279447	2016-08-22 18:13:12 +00:00
James Molloy	353052698a	[SimplifyCFG] Rewrite SinkThenElseCodeToEnd The new version has several advantages: 1) IMSHO it's more readable and neater 2) It handles loads and stores properly 3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch. With this change we can now finally sink load-modify-store idioms such as: if (a) return b += 3; else return b += 4; => %z = load i32, i32* %y %.sink = select i1 %a, i32 5, i32 7 %b = add i32 %z, %.sink store i32 %b, i32* %y ret i32 %b When this works for switches it'll be even more powerful. Round 4. This time we should handle all instructions correctly, and not replace any operands that need to be constant with variables. This was really hard to determine safely, so the helper function should be put into the Instruction API. I'll do that as a followup. llvm-svn: 279443	2016-08-22 17:40:23 +00:00
Simon Pilgrim	c8ad5c069c	[X86][AVX] Don't use SubVectorBroadcast if there are additional users of the chain (PR29088) We could improve on this by making X86SubVBroadcast a full memory intrinsic similar to X86vzload llvm-svn: 279441	2016-08-22 16:47:55 +00:00
Simon Atanasyan	eb9ed61021	[mips][ias] Support .dtprel[d]word and .tprel[d]word directives Assembler directives .dtprelword, .dtpreldword, .tprelword, and .tpreldword generates relocations R_MIPS_TLS_DTPREL32, R_MIPS_TLS_DTPREL64, R_MIPS_TLS_TPREL32, and R_MIPS_TLS_TPREL64 respectively. The main motivation for this patch is to be able to write test cases for checking correctness of the LLD linker's behaviour. Differential Revision: https://reviews.llvm.org/D23669 llvm-svn: 279439	2016-08-22 16:18:42 +00:00
Mehdi Amini	f8c2f08cb3	[LTO] Constify the Module Hook function (NFC) It use to be non-const for the sole purpose of custom handling of commons symbol. This is moved now in the regular LTO handling now and such we can constify the callback. llvm-svn: 279438	2016-08-22 16:17:40 +00:00
Krzysztof Parzyszek	673b347e5a	Reset isUndef when removing subreg from a def operand llvm-svn: 279437	2016-08-22 14:50:12 +00:00
Simon Pilgrim	13fa33012b	[X86] Only accept SM_SentinelUndef (-1) as an undefined shuffle mask in range As discussed on D23027 we should be trying to be more strict on what is an undefined mask value. llvm-svn: 279435	2016-08-22 13:18:56 +00:00
Artur Pilipenko	bc76ecada0	Revert -r278267 [ValueTracking] An improvement to IR ValueTracking on Non-negative Integers This change cause performance regression on MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt from LNT and some other bechmarks. See https://reviews.llvm.org/D18777 for details. llvm-svn: 279433	2016-08-22 13:14:07 +00:00
Artur Pilipenko	b78ad9d41f	Revert -r278269 [IndVarSimplify] Eliminate zext of a signed IV when the IV is known to be non-negative This change needs to be reverted in order to revert -r278267 which cause performance regression on MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt from LNT and some other bechmarks. See comments on https://reviews.llvm.org/D18777 for details. llvm-svn: 279432	2016-08-22 13:12:07 +00:00
Simon Pilgrim	2279e59573	[X86][SSE] Avoid specifying unused arguments in SHUFPD lowering As discussed on PR26491, we are missing the opportunity to make use of the smaller MOVHLPS instruction because we set both arguments of a SHUFPD when using it to lower a single input shuffle. This patch sets the lowered argument to UNDEF if that shuffle element is undefined. This in turn makes it easier for target shuffle combining to decode UNDEF shuffle elements, allowing combines to MOVHLPS to occur. A fix to match against MOVHPD stores was necessary as well. This builds on the improved MOVLHPS/MOVHLPS lowering and memory folding support added in D16956 Adding similar support for SHUFPS will have to wait until have better support for target combining of binary shuffles. Differential Revision: https://reviews.llvm.org/D23027 llvm-svn: 279430	2016-08-22 12:56:54 +00:00
Hrvoje Varga	f0ed16eae5	[mips][microMIPS] Implement BLTZC, BLEZC, BGEZC and BGTZC instructions, fix disassembly and add operand checking to existing B<cond>C implementations Differential Revision: https://reviews.llvm.org/D22667 llvm-svn: 279429	2016-08-22 12:17:59 +00:00
Davide Italiano	80d379f228	[MC] Remove guard(s). NFCI. All the methods are already marked with LLVM_DUMP_METHOD. llvm-svn: 279428	2016-08-22 11:55:22 +00:00
Craig Topper	5f8419da34	[X86] Create a new instruction format to handle 4VOp3 encoding. This saves one bit in TSFlags and simplifies MRMSrcMem/MRMSrcReg format handling. llvm-svn: 279424	2016-08-22 07:38:50 +00:00
Craig Topper	9b20fece81	[X86] Create a new instruction format to handle MemOp4 encoding. This saves one bit in TSFlags and simplifies MRMSrcMem/MRMSrcReg format handling. llvm-svn: 279423	2016-08-22 07:38:45 +00:00
Craig Topper	61b62e56b7	[X86] Space out the encodings of X86 instruction formats. I plan to add some new encodings in future commits and this will reduce the size of those commits. NFC This tries to keep all the ModRM memory and register forms in their own regions of the encodings. Hoping to make it simple on some of the switch statements that operate on these encodings. llvm-svn: 279422	2016-08-22 07:38:41 +00:00
Mehdi Amini	dc4c8cf9ac	[LTO] Handles commons in monolithic LTO The gold-plugin was doing this internally, now the API is handling commons correctly based on the given resolution. Differential Revision: https://reviews.llvm.org/D23739 llvm-svn: 279417	2016-08-22 06:25:46 +00:00
Mehdi Amini	d310b47c23	[LTO] Add a "CodeGenOnly" option. Allows the client to skip the optimizer. Summary: Slowly getting on par with libLTO Reviewers: tejohnson Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23615 llvm-svn: 279416	2016-08-22 06:25:41 +00:00
Vitaly Buka	0672a27bb5	[asan] Use 1 byte aligned stores to poison shadow memory Summary: r279379 introduced crash on arm 32bit bot. I suspect this is alignment issue. Reviewers: eugenis Subscribers: llvm-commits, aemerson Differential Revision: https://reviews.llvm.org/D23762 llvm-svn: 279413	2016-08-22 04:16:14 +00:00
Craig Topper	ca0eda3e6a	[X86] Merge hasVEX_i8ImmReg into the ImmFormat type which had extra unused encodings. This saves one bit in TSFlags. NFC llvm-svn: 279412	2016-08-22 01:37:19 +00:00
Craig Topper	522541231a	[X86] Remove ignoreVEX_L from TSFlags. Only the disassembler needs it and the disassembler doesn't use TSFlags. NFC llvm-svn: 279411	2016-08-22 01:37:16 +00:00
NAKAMURA Takumi	9d0b53129c	Reformat. llvm-svn: 279409	2016-08-22 00:58:47 +00:00
NAKAMURA Takumi	59a20649c6	Untabify. llvm-svn: 279408	2016-08-22 00:58:04 +00:00
Sanjay Patel	643d21a62c	[InstCombine] use m_APInt to allow icmp (shl X, Y), C folds for splat constant vectors, part 4 This concludes the fixes for icmp+shl in this series: https://reviews.llvm.org/rL279339 https://reviews.llvm.org/rL279398 https://reviews.llvm.org/rL279399 llvm-svn: 279401	2016-08-21 17:10:07 +00:00
Sanjay Patel	7ffcde7422	[InstCombine] use m_APInt to allow icmp (shl X, Y), C folds for splat constant vectors, part 3 This is a partial enablement (move the ConstantInt guard down). llvm-svn: 279399	2016-08-21 16:35:34 +00:00
Sanjay Patel	7e09f13fed	[InstCombine] use m_APInt to allow icmp (shl X, Y), C folds for splat constant vectors, part 2 This is a partial enablement (move the ConstantInt guard down). llvm-svn: 279398	2016-08-21 16:28:22 +00:00
Simon Pilgrim	67e7e22462	[X86][AVX] Dropped combineShuffle256 - this can now be performed by EltsFromConsecutiveLoads llvm-svn: 279397	2016-08-21 15:39:45 +00:00
Sanjay Patel	792636603f	[InstCombine] use APInt instead of ConstantInt in isSignBitCheck(); NFCI The callers still have ConstantInt guards, so there is no functional change intended from this change. But relaxing the callers will allow more folds for vector types. llvm-svn: 279396	2016-08-21 15:07:45 +00:00
Guy Blank	9ae797a798	[AVX512][FastISel] Do not use K registers in TEST instructions In some cases, FastIsel was emitting TEST instruction with K reg input, which is illegal. Changed to using KORTEST when dealing with K regs. Differential Revision: https://reviews.llvm.org/D23163 llvm-svn: 279393	2016-08-21 08:02:27 +00:00
Duncan P. N. Exon Smith	8f44c98d04	ARM: Avoid dereferencing end() in ARMFrameLowering::emitEpilogue This fixes the crash from PR29072, where the MachineBasicBlock::iterator wasn't being properly checked against MachineBasicBlock::end() before iterating. This was another bug exposed by the new ilist::iterator::operator*() assertion from r279314. This testcase is poor quality. bugpoint couldn't reduce any further, and I haven't had time to dig into what's going on so I can't invent a better one. I didn't even get good CHECK lines in: this is just a crasher. I'm committing anyway since this is a real crash with an obvious fix, but I'll leave PR29072 open and ask an ARM maintainer to help improve the testcase. llvm-svn: 279391	2016-08-21 00:08:10 +00:00
Vitaly Buka	1f9e135023	[asan] Minimize code size by using __asan_set_shadow_* for large blocks Summary: We can insert function call instead of multiple store operation. Current default is blocks larger than 64 bytes. Changes are hidden behind -asan-experimental-poisoning flag. PR27453 Differential Revision: https://reviews.llvm.org/D23711 llvm-svn: 279383	2016-08-20 20:23:50 +00:00
Simon Pilgrim	02b13d4d3c	Use SDValue::getOpcode() helper instead of via SDValue::getNode() llvm-svn: 279381	2016-08-20 20:04:18 +00:00
Vitaly Buka	3455b9b8bc	[asan] Initialize __asan_set_shadow_* callbacks Summary: Callbacks are not being used yet. PR27453 Reviewers: kcc, eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23634 llvm-svn: 279380	2016-08-20 18:34:39 +00:00
Vitaly Buka	186280daa5	[asan] Optimize store size in FunctionStackPoisoner::poisonRedZones Summary: Reduce store size to avoid leading and trailing zeros. Reviewers: kcc, eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23648 llvm-svn: 279379	2016-08-20 18:34:36 +00:00
Vitaly Buka	5b4f12176c	[asan] Cleanup instrumentation of dynamic allocas Summary: Extract instrumenting dynamic allocas into separate method. Rename asan-instrument-allocas -> asan-instrument-dynamic-allocas Differential Revision: https://reviews.llvm.org/D23707 llvm-svn: 279376	2016-08-20 17:22:27 +00:00
Vitaly Buka	f9fd63ad39	[asan] Add support of lifetime poisoning into ComputeASanStackFrameLayout Summary: We are going to combine poisoning of red zones and scope poisoning. PR27453 Reviewers: kcc, eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23623 llvm-svn: 279373	2016-08-20 16:48:24 +00:00
Matthew Simpson	235e479984	Reapply "[SLP] Initialize VectorizedValue when gathering" The test case included in r279125 exposed existing undefined behavior in the SLP vectorizer that it did not introduce. This patch reapplies the original patch, but modifies the test case to avoid hitting the undefined behavior. This allows us to close PR28330 while keeping the UBSan bot happy. The undefined behavior the original test uncovered will be addressed in a follow-on patch. Reference: https://llvm.org/bugs/show_bug.cgi?id=28330 llvm-svn: 279370	2016-08-20 14:49:02 +00:00
Matthew Simpson	2429656aa9	[SLP] Add command line option for minimum tree size (NFC) llvm-svn: 279369	2016-08-20 14:10:06 +00:00
Vitaly Buka	cc7db13bf0	Revert "[SLP] Initialize VectorizedValue when gathering" to fix ubsan bot. This reverts commit r279125. https://reviews.llvm.org/D23410 llvm-svn: 279363	2016-08-20 07:09:39 +00:00
Chandler Carruth	8abdf75d6b	[PM] Introduce an abstraction for all the analyses over a particular IR unit for use in the PreservedAnalyses set. This doesn't have any important functional change yet but it cleans things up and makes the analysis substantially more efficient by avoiding querying through the type erasure for every analysis. I also think it makes it much easier to reason about how analyses are preserved when walking across pass managers and across IR unit abstractions. Thanks to Sean and Mehdi both for the comments and suggestions. Differential Revision: https://reviews.llvm.org/D23691 llvm-svn: 279360	2016-08-20 04:57:28 +00:00
Matthias Braun	367d853042	MachineFunction: Add llvm_unreachable for missing properties Most compilers should give you a warning anyway though. llvm-svn: 279346	2016-08-19 23:03:28 +00:00
Krzysztof Parzyszek	d95d100c28	Reset "undef" flag when coalescing subregister into whole register llvm-svn: 279344	2016-08-19 22:57:23 +00:00
Tim Northover	a11be04769	GlobalISel: support legalization of G_FCONSTANTs llvm-svn: 279341	2016-08-19 22:40:08 +00:00
Tim Northover	ea904f9424	GlobalISel: teach legalizer how to handle integer constants. llvm-svn: 279340	2016-08-19 22:40:00 +00:00
Sanjay Patel	fa7de606c4	[InstCombine] use m_APInt to allow icmp (shl X, Y), C folds for splat constant vectors, part 1 This is a partial enablement (move the ConstantInt guard down) because there are many different folds here and one of the later ones will require reworking 'isSignBitCheck'. llvm-svn: 279339	2016-08-19 22:33:26 +00:00
Matthias Braun	a7d6fc9618	MachineFunction: Cleanup/simplify MachineFunctionProperties::print() - Always compile print() regardless of LLVM_ENABLE_DUMP. (We usually only gard dump() functions with that). - Only show the set properties to reduce output clutter. - Remove the unused variant that even shows the unset properties. - Fix comments llvm-svn: 279338	2016-08-19 22:31:45 +00:00
Matthias Braun	a3b983aa5e	MachineFunction: Make LastProperty an alias of the last property This avoids unnecessary cases in switch statements covering all properties. llvm-svn: 279337	2016-08-19 22:31:42 +00:00
Daniel Berlin	11da66fc10	Partially revert 279331, as we modify this instruction in the loop llvm-svn: 279335	2016-08-19 22:18:38 +00:00
Vitaly Buka	e149b392a8	Revert "[asan] Add support of lifetime poisoning into ComputeASanStackFrameLayout" This reverts commit r279020. Speculative revert in hope to fix asan test on arm. llvm-svn: 279332	2016-08-19 22:12:58 +00:00
Daniel Berlin	a36f46363f	Convert some depth first traversals to depth_first llvm-svn: 279331	2016-08-19 22:06:23 +00:00
Tim Shen	b5e0f5ac95	[GraphTraits] Make nodes_iterator dereference to NodeType/NodeRef Currently nodes_iterator may dereference to a NodeType or a NodeType&. Make them all dereference to NodeType*, which is NodeRef later. Differential Revision: https://reviews.llvm.org/D23704 Differential Revision: https://reviews.llvm.org/D23705 llvm-svn: 279326	2016-08-19 21:20:13 +00:00
Krzysztof Parzyszek	e4582d4a2e	[Packetizer] Add debugging code to stop packetization after N instructions llvm-svn: 279325	2016-08-19 21:12:52 +00:00
Krzysztof Parzyszek	29a6a2eb8f	[Hexagon] Avoid register dependencies on indirect branches in packetizer Do not packetize the instruction setting the branch address with the indirect branch itself. llvm-svn: 279324	2016-08-19 21:07:35 +00:00
Kostya Serebryany	a533e514b8	[libFuzzer] fix the non-debug build warnings llvm-svn: 279321	2016-08-19 20:57:09 +00:00
Tim Northover	d5c23bcfc9	GlobalISel: translate floating-point comparisons llvm-svn: 279319	2016-08-19 20:48:16 +00:00
Justin Lebar	d13880a750	[NVPTX] Switch nvptx-use-infer-addrspace to true. Summary: This switches us to use a different, more powerful algorithm for address space inference. I've tested this locally and it seems to work great. Once we're more confident in it, we can remove the old pass altogether. Reviewers: jingyue Subscribers: llvm-commits, tra, jholewinski Differential Revision: https://reviews.llvm.org/D23694 llvm-svn: 279317	2016-08-19 20:46:45 +00:00
Reid Kleckner	98a48afa5d	Revert "[SimplifyCFG] Rewrite SinkThenElseCodeToEnd" This reverts commit r279229. It breaks intrinsic function calls in diamonds. llvm-svn: 279313	2016-08-19 20:22:39 +00:00
Tim Northover	b16734fbaa	GlobalISel: translate floating-point constants llvm-svn: 279311	2016-08-19 20:09:15 +00:00
Tim Northover	5a28c3642f	GlobalISel: support translating select instructions. llvm-svn: 279309	2016-08-19 20:09:07 +00:00
Tim Northover	b604622bba	GlobalISel: fix insert/extract to work on ConstantExprs too. No tests yet unfortunately (ConstantFolding reduces all supported constants to ConstantInts before we get to translation). Soon. llvm-svn: 279308	2016-08-19 20:09:03 +00:00
Tim Northover	bbbfb1cfb8	GlobalISel: translate insertvalue instructions. This adds a G_INSERT instruction, which technically makes G_SEQUENCE redundant (it's equivalent to a G_INSERT into an IMPLICIT_DEF). We'll leave G_SEQUENCE for now though: it's likely to be far more common as it's a fundamental part of legalization, so avoiding the mess and bloat of the extra IMPLICIT_DEFs is probably worthwhile. llvm-svn: 279306	2016-08-19 20:08:55 +00:00
Tom Stellard	68726a5359	MachineScheduler: Add constructor functions for the DAGMutations Summary: This way they can be re-used by target-specific schedulers. Reviewers: atrick, MatzeB, kparzysz Subscribers: kparzysz, llvm-commits, MatzeB Differential Revision: https://reviews.llvm.org/D23678 llvm-svn: 279305	2016-08-19 19:59:18 +00:00
Krzysztof Parzyszek	fb4c4178a2	[Hexagon] Fix subesthetic indentation llvm-svn: 279303	2016-08-19 19:29:15 +00:00
Krzysztof Parzyszek	505eb498bd	[Hexagon] Allow i1 values for 'r' constraint in inline-asm llvm-svn: 279302	2016-08-19 19:17:28 +00:00
Sanjay Patel	7a104615c5	[InstCombine] remove an icmp fold that is already handled by InstSimplify Specifically, this is done near the end of "SimplifyICmpInst" using computeKnownBits() as the broader solution. There are even vector tests (yay!) for this in test/Transforms/InstSimplify/compare.ll. I considered putting an assert here instead of just deleting, but then we could assert every possible fold in InstSimplify in InstCombine, so...less is more? llvm-svn: 279300	2016-08-19 19:03:07 +00:00
Krzysztof Parzyszek	8849a51370	[Hexagon] Do not cache alloca instructions during isel They can be deleted or replicated, so the cache may become outdated. They only need to be visited once during frame lowering, so just scan the function instead. llvm-svn: 279297	2016-08-19 18:46:13 +00:00
Chandler Carruth	9b35e6d746	[PM] Re-instate r279227 and r279228 with a fix to the way the templating was done to hopefully appease MSVC. As an upside, this also implements the suggestion Sanjoy made in code review, so two for one! =] I'll be watching the bots to see if there are still issues. llvm-svn: 279295	2016-08-19 18:36:06 +00:00
Tim Northover	26b76f2c59	GlobalISel: improve representation of G_SEQUENCE and G_EXTRACT First, make sure all types involved are represented, rather than being implicit from the register width. Second, canonicalize all types to scalar. These operations just act in bits and don't care about vectors. Also standardize spelling of Indices in the MachineIRBuilder (NFC here). llvm-svn: 279294	2016-08-19 18:32:14 +00:00
Kyle Butt	5b10483618	Revert "IfConversion: Rescan diamonds." This reverts commit bfd62a4b4465dd21811bf615c3b04c30ddb09f7b. llvm-svn: 279289	2016-08-19 18:17:06 +00:00
Kyle Butt	ce0196de3f	Revert "CodeGen: If Convert blocks that would form a diamond when tail-merged." This reverts commit 0fda93481c4231c06b838ef476c0c404c51ff875. llvm-svn: 279288	2016-08-19 18:17:04 +00:00
Tim Northover	2fa5fa391f	GlobalISel: allow extractvalue to extract an aggregate. llvm-svn: 279287	2016-08-19 18:09:41 +00:00
Krzysztof Parzyszek	3d9946eb23	[Hexagon] Fixes for new-value jump formation - Recognize C2_cmpgtui, S2_tstbit_i, and S4_ntstbit_i. - Avoid creating new-value instructions with both source operands equal. llvm-svn: 279286	2016-08-19 17:54:49 +00:00
Tim Northover	6f80b08c64	GlobalISel: support translation of extractvalue instructions. llvm-svn: 279285	2016-08-19 17:47:05 +00:00
Sanjay Patel	e38e79c3e6	[InstCombine] use local variables to reduce code in foldICmpShlConstant; NFC llvm-svn: 279282	2016-08-19 17:34:05 +00:00
Krzysztof Parzyszek	5a7bef9c14	[Hexagon] Fix a few omissions in HexagonInstrInfo llvm-svn: 279280	2016-08-19 17:20:57 +00:00
Sanjay Patel	38b7506f75	[InstCombine] rename variables in foldICmpShlConstant(); NFC llvm-svn: 279279	2016-08-19 17:20:37 +00:00
Tim Northover	91c8173093	GlobalISel: support overflow arithmetic intrinsics. Unsigned addition and subtraction can reuse the instructions created to legalize large width operations (i.e. both produce and consume a carry flag). Signed operations and multiplies get a dedicated op-with-overflow instruction. Once this is produced the two values are combined into a struct register (which will almost always be merged with a corresponding G_EXTRACT as part of legalization). llvm-svn: 279278	2016-08-19 17:17:06 +00:00
Vitaly Buka	170dede75d	Revert "[asan] Optimize store size in FunctionStackPoisoner::poisonRedZones" This reverts commit r279178. Speculative revert in hope to fix asan crash on arm. llvm-svn: 279277	2016-08-19 17:15:38 +00:00
Vitaly Buka	c8f4d69c82	Revert "[asan] Fix size of shadow incorrectly calculated in r279178" This reverts commit r279222. Speculative revert in hope to fix asan crash on arm. llvm-svn: 279276	2016-08-19 17:15:33 +00:00
Lang Hames	6e9f0309e9	[RuntimeDyld] Revert r279182 and 279201 -- they broke some ARM bots. llvm-svn: 279275	2016-08-19 17:06:39 +00:00
Michael Kuperstein	41898f0396	[AliasSetTracker] Degrade AliasSetTracker when may-alias sets get too large. Repeated inserts into AliasSetTracker have quadratic behavior - inserting a pointer into AST is linear, since it requires walking over all "may" alias sets and running an alias check vs. every pointer in the set. We can avoid this by tracking the total number of pointers in "may" sets, and when that number exceeds a threshold, declare the tracker "saturated". This lumps all pointers into a single "may" set that aliases every other pointer. (This is a stop-gap solution until we migrate to MemorySSA) This fixes PR28832. Differential Revision: https://reviews.llvm.org/D23432 llvm-svn: 279274	2016-08-19 17:05:22 +00:00
Simon Pilgrim	d7a3782ae4	[X86][SSE] Generalised combining to VZEXT_MOVL to any vector size This doesn't change tests codegen as we already combined to blend+zero which is what we lower VZEXT_MOVL to on SSE41+ targets, but it does put us in a better position when we improve shuffling for optsize. llvm-svn: 279273	2016-08-19 17:02:00 +00:00
Krzysztof Parzyszek	639545b4d8	[Hexagon] Enforce LLSC packetization rules Ensure that load locked and store conditional instructions are only packetized with ALU32 instructions. Patch by Ben Craig. llvm-svn: 279272	2016-08-19 16:57:05 +00:00
Reid Kleckner	a871d3872a	Fix regression in InstCombine introduced by r278944 The intended transform is: // Simplify icmp eq (or (ptrtoint P), (ptrtoint Q)), 0 // -> and (icmp eq P, null), (icmp eq Q, null). P and Q are both pointer types, but may have different types. We need two calls to getNullValue() to make the icmps. llvm-svn: 279271	2016-08-19 16:53:18 +00:00
Krzysztof Parzyszek	b7640d4df0	[Hexagon] Minor updates to register definitions llvm-svn: 279269	2016-08-19 16:40:19 +00:00
David Majnemer	5554edabef	[CloneFunction] Don't remove unrelated nodes from the CGSSC CGSCC use a WeakVH to track call sites. RAUW a call within a function can result in that WeakVH getting confused about whether or not the call site is still around. llvm-svn: 279268	2016-08-19 16:37:40 +00:00
Krzysztof Parzyszek	9335bf0ec5	[Hexagon] Fix incorrect generation of S4_subi_asl_ri Patch by Jyotsna Verma. llvm-svn: 279267	2016-08-19 16:35:05 +00:00
Sanjay Patel	a867afe094	[InstCombine] use m_APInt to allow icmp (shl 1, Y), C folds for splat constant vectors llvm-svn: 279266	2016-08-19 16:12:16 +00:00
Krzysztof Parzyszek	dddb097a1f	[Hexagon] Add missing pattern for C4_cmplte llvm-svn: 279265	2016-08-19 16:11:33 +00:00
Sanjay Patel	57b12d3876	[InstCombine] use m_APInt to allow icmp X, C folds for splat constant vectors Of course, we really need to refactor and fix all of the cmp predicates, but this one is interesting because without it, we later perform an information-losing transform of icmp (shl 1, Y), C, and we can't recover the better fold. llvm-svn: 279263	2016-08-19 15:40:44 +00:00
Mehdi Amini	9989f80ae8	[LTO] Remove dead-code: collectUsedGlobalVariables has been moved to Thin and LTO specifc path (NFC) llvm-svn: 279261	2016-08-19 15:35:44 +00:00
Krzysztof Parzyszek	0b8672269c	[Hexagon] Make p0 an explicit operand in VA1_clr* subinstructions, NFC llvm-svn: 279255	2016-08-19 15:17:19 +00:00
Krzysztof Parzyszek	6ce82951c3	[Hexagon] Add explicit default constructor for HexagonSelectionDAGInfo llvm-svn: 279254	2016-08-19 15:13:54 +00:00
Krzysztof Parzyszek	0ba9754584	[Hexagon] Allow tail-call optimization when mixing C and fast calling conv Patch by Arnold Schwaighofer. llvm-svn: 279251	2016-08-19 15:02:18 +00:00
Krzysztof Parzyszek	66dd6797e8	[Hexagon] Check for empty live interval Patch by Brendon Cahoon. llvm-svn: 279249	2016-08-19 14:29:43 +00:00
Krzysztof Parzyszek	db019ae801	[Hexagon] Consider zext/sext of a load to i32 to be free llvm-svn: 279248	2016-08-19 14:22:07 +00:00
Anton Korobeynikov	b38195c1a8	Revert r279242 - it's failing the tests llvm-svn: 279247	2016-08-19 14:18:34 +00:00
Krzysztof Parzyszek	a243adfd27	[Hexagon] Handle J2_jumptpt and J2_jumpfpt instructions llvm-svn: 279246	2016-08-19 14:14:09 +00:00
Krzysztof Parzyszek	067debe0a0	[Hexagon] Fix indentation, NFC llvm-svn: 279245	2016-08-19 14:12:51 +00:00
Krzysztof Parzyszek	9273ecc176	[Hexagon] Remove unnecessary llvm::, NFC llvm-svn: 279244	2016-08-19 14:10:57 +00:00
Krzysztof Parzyszek	75e74ee699	[Hexagon] Rename the HEXAGON_MC namespace to Hexagon_MC, NFC llvm-svn: 279243	2016-08-19 14:09:47 +00:00
Anton Korobeynikov	2aae31a945	Fix PR27500: on MSP430 the branch destination offset is measured in words, not bytes. In addition, the branch instructions will have proper BB destinations, not offsets, like before. Patch by Vadzim Dambrouski! Differential Revision: https://reviews.llvm.org/D20162 llvm-svn: 279242	2016-08-19 14:07:10 +00:00
Krzysztof Parzyszek	6421b934ec	[Hexagon] Mark PS_jumpret as pseudo-instruction, expand it into J2_jumpr llvm-svn: 279241	2016-08-19 14:04:45 +00:00
Krzysztof Parzyszek	bd8ef4b8ce	[Hexagon] Improvements to handling and generation of FP instructions Improved handling of fma, floating point min/max, additional load/store instructions for floating point types. Patch by Jyotsna Verma. llvm-svn: 279239	2016-08-19 13:34:31 +00:00
Benjamin Kramer	96fcf5df03	[LoopVectorize] Don't copy std::vector in for-range loop. llvm-svn: 279233	2016-08-19 12:44:24 +00:00
Chandler Carruth	b8824a5d3f	[PM] Revert r279227 and r279228 until I can find someone to help me solve completely opaque MSVC build errors. It complains about lots of stuff with this change without givin nearly enough information to even try to fix. llvm-svn: 279231	2016-08-19 10:51:55 +00:00
Simon Pilgrim	f1b8fdc074	[X86][SSE] Add support for matching commuted insertps patterns INSERTPS doesn't fit well with our shuffle mask canonicalization, so we need to attempt both the original mask and the commuted mask to more likely get a match llvm-svn: 279230	2016-08-19 10:31:53 +00:00
James Molloy	11a1936b70	[SimplifyCFG] Rewrite SinkThenElseCodeToEnd The new version has several advantages: 1) IMSHO it's more readable and neater 2) It handles loads and stores properly 3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch. With this change we can now finally sink load-modify-store idioms such as: if (a) return b += 3; else return b += 4; => %z = load i32, i32* %y %.sink = select i1 %a, i32 5, i32 7 %b = add i32 %z, %.sink store i32 %b, i32* %y ret i32 %b When this works for switches it'll be even more powerful. llvm-svn: 279229	2016-08-19 10:10:27 +00:00
Chandler Carruth	db1759ace1	[PM] Make the the new pass manager support fully generic extra arguments to run methods, both for transform passes and analysis passes. This also allows the analysis manager to use a different set of extra arguments from the pass manager where useful. Consider passes over analysis produced units of IR like SCCs of the call graph or loops. Passes of this nature will often want to refer to the analysis result that was used to compute their IR units (the call graph or LoopInfo). And for transformations, they may want to communicate special update information to the outer pass manager. With this change, it becomes possible to have a run method for a loop pass that looks more like: PreservedAnalyses run(Loop &L, AnalysisManager<Loop, LoopInfo> &AM, LoopInfo &LI, LoopUpdateRecord &UR); And to query the analysis manager like: AM.getResult<MyLoopAnalysis>(L, LI); This makes accessing the known-available analyses convenient and clear, and it makes passing customized data structures around easy. My initial use case is going to be in updating the pass manager layers when the analysis units of IR change. But there are more use cases here such as having a layer that lets inner passes signal whether certain additional passes should be run because of particular simplifications made. Two desires for this have come up in the past: triggering additional optimization after successfully unrolling loops, and triggering additional inlining after collapsing indirect calls to direct calls. Despite adding this layer of generic extensibility, the only change to existing, simple usage are for places where we forward declare the AnalysisManager template. We really shouldn't be doing this because of the fragility exposed here, but currently it makes coping with the legacy PM code easier. Differential Revision: http://reviews.llvm.org/D21462 llvm-svn: 279227	2016-08-19 09:45:16 +00:00
James Molloy	7ee640f9b6	[CodeGen] Fix a trivial type conversion bug dating back to pre-2008 The heuristic above this code is incredibly suspect, but disregarding that it mutates the cast opcode so we need to check the mutated opcode later to see if we need to emit an AssertSext or AssertZext node. Fixes PR29041. llvm-svn: 279223	2016-08-19 08:38:50 +00:00
Vitaly Buka	b81960a6c8	[asan] Fix size of shadow incorrectly calculated in r279178 Summary: r279178 generates 8 times more stores than necessary. Reviewers: eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23708 llvm-svn: 279222	2016-08-19 08:33:53 +00:00
Chandler Carruth	b7be5b6479	[PM] Rework the new PM support for building the ModuleSummaryIndex to directly produce the index as the value type result. This requires making the index movable which is straightforward. It greatly simplifies things by allowing us to completely avoid the builder API and the layers of abstraction inherent there. Instead both pass managers can directly construct these when run by value. They still won't be constructed truly eagerly thanks to the optional in the legacy PM. The code that directly builds the index can also just share a direct function. A notable change here is that the result type of the analysis for the new PM is no longer a reference type. This was really problematic when making changes to how we handle result types to make our interface requirements much more strict and precise. But I think this is an overall improvement. Differential Revision: https://reviews.llvm.org/D23701 llvm-svn: 279216	2016-08-19 07:49:19 +00:00
Xinliang David Li	63248ab888	[Profile] Fix edge count read bug Use uint64_t to avoid value truncation before scaling. llvm-svn: 279213	2016-08-19 06:31:45 +00:00
Mehdi Amini	18b91112af	[LTO] Move callback member from base class to the derived where it is used (NFC) llvm-svn: 279212	2016-08-19 06:10:03 +00:00
Mehdi Amini	cc1fe9b9d6	Constify some path in the bitcode writer (NFC) llvm-svn: 279211	2016-08-19 06:06:18 +00:00
Mehdi Amini	026ddbb4d6	[LTO] Add a move to inialize member in ctor initialization list (NFC) llvm-svn: 279210	2016-08-19 05:56:37 +00:00
Xinliang David Li	2c9336823c	[Profile] Simple code refactoring for reuse /NFC llvm-svn: 279209	2016-08-19 05:31:33 +00:00
Dean Michael Berris	1dd1ca9727	[XRay] Synthesize a reference to the xray_instr_map Without the synthesized reference to a symbol in the xray_instr_map, linker section garbage collection will helpfully remove the whole xray_instr_map section from the final executable (or archive). This will cause the runtime to not be able to identify the sleds and hot-patch the calls/jumps into the runtime trampolines. This change adds a reference from the text section at the end of the function to keep around the associated xray_instr_map section as well. We also make sure that we catch this reference in the test. Reviewers: chandlerc, echristo, majnemer, mehdi_amini Subscribers: mehdi_amini, llvm-commits, dberris Differential Revision: https://reviews.llvm.org/D23398 llvm-svn: 279204	2016-08-19 04:44:30 +00:00
Matthias Braun	fdc4c6b426	Revert "RegScavenging: Add scavengeRegisterBackwards()" The ppc64 multistage bot fails on this. This reverts commit r279124. Also Revert "CodeGen: Add/Factor out LiveRegUnits class; NFCI" because it depends on the previous change This reverts commit r279171. llvm-svn: 279199	2016-08-19 03:03:24 +00:00
Lang Hames	b65f16c8e5	[RuntimeDyld] Add support for ELF R_ARM_REL32 and R_ARM_GOT_PREL. Patch by William Dillon. Thanks William! This patch adds support for the R_ARM_REL32 and R_ARM_GOT_PREL ELF ARM relocations to RuntimeDyld, which should allow JITing of code that produces these relocations. No test case: Unfortunately RuntimeDyldELF's GOT building mechanism (which uses a separate section for GOT entries) isn't compatible with RuntimeDyldChecker. The correct fix for this is to fix RuntimeDyldELF's GOT support (it's fundamentally broken at the moment: separate sections aren't guaranteed to be in range of a GOT entry load), but that's a non-trivial job. llvm-svn: 279182	2016-08-19 01:15:39 +00:00
Vitaly Buka	aa654292bd	[asan] Optimize store size in FunctionStackPoisoner::poisonRedZones Summary: Reduce store size to avoid leading and trailing zeros. Reviewers: kcc, eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23648 llvm-svn: 279178	2016-08-18 23:51:15 +00:00
Andrew Kaylor	81901d658f	Include X86CallFrameOptimization in the opt-bisect process. Differential Revision: https://reviews.llvm.org/D23683 llvm-svn: 279175	2016-08-18 22:49:51 +00:00
Saleem Abdulrasool	dab786fb78	AArch64: remove extraneous padding The structs BarrierOp, PrefetchOp, PSBHintOp are in AArch64AsmParser.cpp (inside anonymous namespace). This diff changes the order of fields and removes the excessive padding (8 bytes). Patch by Alexander Shaposhnikov! llvm-svn: 279173	2016-08-18 22:35:06 +00:00
Matthias Braun	91f95f0201	CodeGen: Add/Factor out LiveRegUnits class; NFCI This is a set of register units intended to track register liveness, it is similar in spirit to LivePhysRegs. You can also think of this as the liveness tracking parts of the RegisterScavenger factored out into an own class. This was proposed in http://llvm.org/PR27609 Differential Revision: http://reviews.llvm.org/D21916 llvm-svn: 279171	2016-08-18 22:11:28 +00:00
Kyle Butt	780b517d6b	CodeGen: If Convert blocks that would form a diamond when tail-merged. The following function currently relies on tail-merging for if conversion to succeed. The common tail of cond_true and cond_false is extracted, and this then forms a diamond pattern that can be successfully if converted. If this block does not get extracted, either because tail-merging is disabled or the threshold is higher, we should still recognize this pattern and if-convert it. Fixed a regression in the original commit. Need to un-reverse branches after reversing them, or other conversions go awry. Regression on self-hosting bots with no obvious explanation. Tidied up range handling to be more obviously correct, but there was no smoking gun. define i32 @t2(i32 %a, i32 %b) nounwind { entry: %tmp1434 = icmp eq i32 %a, %b ; <i1> [#uses=1] br i1 %tmp1434, label %bb17, label %bb.outer bb.outer: ; preds = %cond_false, %entry %b_addr.021.0.ph = phi i32 [ %b, %entry ], [ %tmp10, %cond_false ] %a_addr.026.0.ph = phi i32 [ %a, %entry ], [ %a_addr.026.0, %cond_false ] br label %bb bb: ; preds = %cond_true, %bb.outer %indvar = phi i32 [ 0, %bb.outer ], [ %indvar.next, %cond_true ] %tmp. = sub i32 0, %b_addr.021.0.ph %tmp.40 = mul i32 %indvar, %tmp. %a_addr.026.0 = add i32 %tmp.40, %a_addr.026.0.ph %tmp3 = icmp sgt i32 %a_addr.026.0, %b_addr.021.0.ph br i1 %tmp3, label %cond_true, label %cond_false cond_true: ; preds = %bb %tmp7 = sub i32 %a_addr.026.0, %b_addr.021.0.ph %tmp1437 = icmp eq i32 %tmp7, %b_addr.021.0.ph %indvar.next = add i32 %indvar, 1 br i1 %tmp1437, label %bb17, label %bb cond_false: ; preds = %bb %tmp10 = sub i32 %b_addr.021.0.ph, %a_addr.026.0 %tmp14 = icmp eq i32 %a_addr.026.0, %tmp10 br i1 %tmp14, label %bb17, label %bb.outer bb17: ; preds = %cond_false, %cond_true, %entry %a_addr.026.1 = phi i32 [ %a, %entry ], [ %tmp7, %cond_true ], [ %a_addr.026.0, %cond_false ] ret i32 %a_addr.026.1 } Without tail-merging or diamond-tail if conversion: LBB1_1: @ %bb @ =>This Inner Loop Header: Depth=1 cmp r0, r1 ble LBB1_3 @ BB#2: @ %cond_true @ in Loop: Header=BB1_1 Depth=1 subs r0, r0, r1 cmp r1, r0 it ne cmpne r0, r1 bgt LBB1_4 LBB1_3: @ %cond_false @ in Loop: Header=BB1_1 Depth=1 subs r1, r1, r0 cmp r1, r0 bne LBB1_1 LBB1_4: @ %bb17 bx lr With diamond-tail if conversion, but without tail-merging: @ BB#0: @ %entry cmp r0, r1 it eq bxeq lr LBB1_1: @ %bb @ =>This Inner Loop Header: Depth=1 cmp r0, r1 ite le suble r1, r1, r0 subgt r0, r0, r1 cmp r1, r0 bne LBB1_1 @ BB#2: @ %bb17 bx lr llvm-svn: 279168	2016-08-18 22:09:27 +00:00
Kyle Butt	491afad8f6	IfConversion: Rescan diamonds. The cost of predicating a diamond is only the instructions that are not shared between the two branches. Additionally If a predicate clobbering instruction occurs in the shared portion of the branches (e.g. a cond move), it may still be possible to if convert the sub-cfg. This change handles these two facts by rescanning the non-shared portion of a diamond sub-cfg to recalculate both the predication cost and whether both blocks are pred-clobbering. llvm-svn: 279167	2016-08-18 22:09:25 +00:00
Kyle Butt	d76755ec95	IfConversion: Handle inclusive ranges more carefully. This may affect calculations for thresholds, but is not a significant change in behavior. The problem was that an inclusive range must have an additonal flag to showr that it is empty, because otherwise begin == end implies that the range has one element, and it may not be possible to move past on either side. llvm-svn: 279166	2016-08-18 22:09:23 +00:00
Zhan Jun Liau	cf2f4b3251	[SystemZ] Use valid base/index regs for inline asm Summary: Inline asm memory constraints can have the base or index register be assigned to %r0 right now. Make sure that we assign only ADDR64 registers to the base and index. Reviewers: uweigand Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23367 llvm-svn: 279157	2016-08-18 21:44:15 +00:00
Sanjay Patel	98cd99dfc6	[InstCombine] add helper function for folds of icmp (shl 1, Y), C; NFCI Clean up the existing code by: 1. Renaming variables 2. Adding local variables 3. Making it vector-safe This is still guarded by a ConstantInt check, so no functional change is intended. But this should be ready to go: if we move the ConstantInt check down, all of these folds should do the right thing for vector types. llvm-svn: 279150	2016-08-18 21:28:30 +00:00
Kostya Serebryany	32661f9d66	[libFuzzer] add more __attribute__((visibility("default"))) llvm-svn: 279143	2016-08-18 20:52:52 +00:00
Amaury Sechet	763c59dc9a	Make cltz and cttz zero undef when the operand cannot be zero in InstCombine Summary: Also add popcount(n) == bitsize(n) -> n == -1 transformation. Reviewers: majnemer, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23134 llvm-svn: 279141	2016-08-18 20:43:50 +00:00
Sanjay Patel	40e8ca46ad	[InstCombine] use m_APInt to allow icmp (trunc X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 https://reviews.llvm.org/rL278945 https://reviews.llvm.org/rL279066 https://reviews.llvm.org/rL279077 https://reviews.llvm.org/rL279101 llvm-svn: 279133	2016-08-18 20:28:54 +00:00
Sanjay Patel	5f4ce4e23d	[InstCombine] clean up foldICmpTruncConstant(); NFCI 1. Fix variable names 2. Add local variables to reduce code llvm-svn: 279132	2016-08-18 20:25:16 +00:00
Michael Kuperstein	2bc3d4d46c	[SelectionDAG] Rename fextend -> fpextend, fround -> fpround, frnd -> fround The names of the tablegen defs now match the names of the ISD nodes. This makes the world a slightly saner place, as previously "fround" matched ISD::FP_ROUND and not ISD::FROUND. Differential Revision: https://reviews.llvm.org/D23597 llvm-svn: 279129	2016-08-18 20:08:15 +00:00
Wei Ding	52bb661dec	AMDGPU : Fix QSAD and MQSAD instructions' incorrect data type. Differential Revision: http://reviews.llvm.org/D23689 llvm-svn: 279126	2016-08-18 19:51:14 +00:00
Matthew Simpson	11db6b6b8c	[SLP] Initialize VectorizedValue when gathering We abort building vectorizable trees in some cases (e.g., if the maximum recursion depth is reached, if the region size is too large, etc.). If this happens for a reduction, we can be left with a root entry that needs to be gathered. For these cases, we need make sure we actually set VectorizedValue to the resulting vector. This patch ensures we properly set VectorizedValue, and it also ensures the insertelement sequence generated for the gathers is inserted at the correct location. Reference: https://llvm.org/bugs/show_bug.cgi?id=28330 Differential Revison: https://reviews.llvm.org/D23410 llvm-svn: 279125	2016-08-18 19:50:32 +00:00
Matthias Braun	075d0c23d5	RegScavenging: Add scavengeRegisterBackwards() Re-apply r276044 with off-by-1 instruction fix for the reload placement. This is a variant of scavengeRegister() that works for enterBasicBlockEnd()/backward(). The benefit of the backward mode is that it is not affected by incomplete kill flags. This patch also changes PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register scavenger in backwards mode. Differential Revision: http://reviews.llvm.org/D21885 llvm-svn: 279124	2016-08-18 19:47:59 +00:00
Kyle Butt	64e428147f	Branch Folding: Accept explicit threshold for tail merge size. This is prep work for allowing the threshold to be different during layout, and to enforce a single threshold between merging and duplicating during layout. No observable change intended. llvm-svn: 279117	2016-08-18 18:57:29 +00:00
Pete Cooper	a8db71e840	Add a version of Intrinsic::getName which is more efficient when there are no overloads. When running 'opt -O2 verify-uselistorder-nodbg.lto.bc', there are 33m allocations. 8.2m come from std::string allocations in Intrinsic::getName(). Turns out this method only returns a std::string because it needs to handle overloads, but that is not the common case. This adds an overload of getName which just returns a StringRef when there are no overloads and so saves on the allocations. llvm-svn: 279113	2016-08-18 18:30:54 +00:00
Valery Pykhtin	609c2f8137	[AMDGPU] add s_incperflevel/s_decperflevel intrinsics. Differential revision: https://reviews.llvm.org/D23666 llvm-svn: 279106	2016-08-18 18:06:20 +00:00
Elliot Colp	687691aeac	Fix SystemZ compilation abort caused by negative AND mask Normally, when an AND with a constant is lowered to NILL, the constant value is truncated to 16 bits. However, since r274066, ANDs whose results are used in a shift are caught by a different pattern that does not truncate. The instruction printer expects a 16-bit unsigned immediate operand for NILL, so this results in an abort. This patch adds code to manually truncate the constant in this situation. The rest of the bits are then set, so we will detect a case for NILL "naturally" rather than using peephole optimizations. Differential Revision: http://reviews.llvm.org/D21854 llvm-svn: 279105	2016-08-18 18:04:26 +00:00
Duncan P. N. Exon Smith	84c2da47f9	AArch64: Don't call getIterator() on iterators Remove an unnecessary round-trip: iterator => operator->() => getIterator() In some cases, the iterator is end(), so the dereference of operator-> is invalid (UB). The testcase only crashes with r278974 (currently reverted to investigate this), which adds an assertion for invalid dereferences of ilist nodes. Fixes PR29035. llvm-svn: 279104	2016-08-18 17:58:09 +00:00
Eugene Zelenko	61a72d8850	[LLVM] Fix some Clang-tidy modernize-use-using and Include What You Use warnings Differential revision: https://reviews.llvm.org/D23675 llvm-svn: 279102	2016-08-18 17:56:27 +00:00
Sanjay Patel	fa5ca2bf46	[InstCombine] use m_APInt to allow icmp (udiv X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 https://reviews.llvm.org/rL278945 https://reviews.llvm.org/rL279066 https://reviews.llvm.org/rL279077 llvm-svn: 279101	2016-08-18 17:55:59 +00:00
Dan Gohman	c9623db884	[WebAssembly] Disable the store-results optimization. The WebAssemly spec removing the return value from store instructions, so remove the associated optimization from LLVM. This patch leaves the store instruction operands in place for now, so stores now always write to "$drop"; these will be removed in a seperate patch. llvm-svn: 279100	2016-08-18 17:51:27 +00:00
Chandler Carruth	e2f36bcb84	[Assumptions] Make collecting ephemeral values not quadratic in the number of assume intrinsics. The classical way to have a cache-friendly vector style container when we need queue semantics for BFS instead of stack semantics for DFS is to use an ever-growing vector and an index. Erasing from the front requires O(size) work, and unless we expect the worklist to grow very large, its probably cheaper to just grow and race down the list. But that makes it more bad that we're putting the assume intrinsics in this at all. We end up looking at the (by definition empty) use list to see if they're ephemeral (when we've already put them in that set), etc. Instead, directly populate the worklist with the operands when we mark the assume intrinsics as ephemeral. Also, test the visited set before putting things into the worklist so we don't accumulate the same value in the list 100s of times. It would be nice to use a set-vector for this but I think its useful to test the set earlier to avoid repeatedly querying whether the same instruction is safe to speculate. Hopefully with these changes the number of values pushed onto the worklist is smaller, and we avoid quadratic work by letting it grow as necessary. Differential Revision: https://reviews.llvm.org/D23396 llvm-svn: 279099	2016-08-18 17:51:24 +00:00
Vedant Kumar	c948d182e1	Fix -Wpessimizing-move error, NFC llvm-svn: 279095	2016-08-18 17:39:53 +00:00
Sanjay Patel	12a4105647	[InstCombine] clean up foldICmpUDivConstant; NFC 1. Better variable names 2. Remove unnecessary check of ConstantInt llvm-svn: 279094	2016-08-18 17:37:26 +00:00
Zachary Turner	ac5763eca4	Resubmit "Write the TPI stream from a PDB to Yaml." The original patch was breaking some buildbots due to an incorrect ordering of function definitions which caused some compilers to recognize a definition but others to not. llvm-svn: 279089	2016-08-18 16:49:29 +00:00
Artur Pilipenko	615b820af6	CVP. Turn marking adds as no wrap (introduced by r278107) off by default It causes a regression on our internal benchmark. Introduce cvp-dont-process flag and set it off by default while investigating the regression. llvm-svn: 279082	2016-08-18 16:08:35 +00:00
Ahmed Bougacha	33e19fe1c4	[AArch64][GlobalISel] Select floating-point binary ops. There is no FREM instruction, but the others are straightforward. llvm-svn: 279081	2016-08-18 16:05:11 +00:00
Davide Italiano	d1279df752	[IRCE] Switch over to LLVM_DUMP_METHOD. NFCI. llvm-svn: 279079	2016-08-18 15:55:49 +00:00
Sanjay Patel	6347807f87	[InstCombine] use m_APInt to allow icmp (mul X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 https://reviews.llvm.org/rL278945 https://reviews.llvm.org/rL279066 llvm-svn: 279077	2016-08-18 15:44:44 +00:00
Derek Schuff	ccdceda128	[WebAssembly] Refactor WebAssemblyLowerEmscriptenException pass for setjmp/longjmp This patch changes the code structure of WebAssemblyLowerEmscriptenException pass to support both exception handling and setjmp/longjmp. It also changes the name of the pass and the source file. 1. Change the file/pass name to WebAssemblyLowerEmscriptenExceptions -> WebAssemblyLowerEmscriptenEHSjLj to make it clear that it supports both EH and SjLj 2. List function / global variable names at the top so they can be changed easily 3. Some cosmetic changes Patch by Heejin Ahn Differential Revision: https://reviews.llvm.org/D23588 llvm-svn: 279075	2016-08-18 15:27:25 +00:00
Ahmed Bougacha	1d0560b14d	[AArch64][GlobalISel] Select G_SDIV/G_UDIV. There is no REM instruction; that will require an expansion. It's not obvious that should be done in select, rather than as a (custom?) legalization. llvm-svn: 279074	2016-08-18 15:17:13 +00:00
Sanjay Patel	5b112845da	[InstCombine] use APInt in isSignTest instead of ConstantInt; NFC This will enable vector splat folding, but NFC until the callers have their ConstantInt restrictions removed. llvm-svn: 279072	2016-08-18 14:59:14 +00:00
Sanjay Patel	7d37b221a2	fix typo; NFC llvm-svn: 279068	2016-08-18 14:17:34 +00:00
Krzysztof Parzyszek	b1b0372337	[Hexagon] Create vcombine in HexagonCopyToCombine llvm-svn: 279067	2016-08-18 14:12:34 +00:00
Sanjay Patel	4c5e60d95c	[InstCombine] use m_APInt to allow icmp (xor X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 https://reviews.llvm.org/rL278945 llvm-svn: 279066	2016-08-18 14:10:48 +00:00
Simon Dardis	ea3431598e	[mips] Correct tail call encoding for MIPSR6 r277708 enabled tails calls for MIPS but used the 'jr' instruction when the jump target was held in a register. For MIPSR6, 'jalr $zero, $reg' should have been used. Additionally, add missing patterns for external and global symbols for tail calls. Reviewers: dsanders, vkalintiris Differential Review: https://reviews.llvm.org/D23301 llvm-svn: 279064	2016-08-18 13:22:43 +00:00
Alex Bradbury	3447ca3f08	(Trivial) TargetPassConfig: assert when TargetMachine has no MCAsmInfo Summary: This is a pretty trivial, but I thought it was worth just checking that nobody feels it's completely the wrong thing to be doing. The motivation is that when starting a new backend, you often start with a minimal stub, pretty much just FooTargetMachine and FooTargetInfo. Once that's built, you might naturally try `llc -march=foo myinput.ll` and it seems more developer-friendly if this ends up asserting due to the lack of MCAsmInfo with an informative message rather than just segfaulting. Reviewers: MatzeB, chandlerc Subscribers: bogner, llvm-commits Differential Revision: https://reviews.llvm.org/D23443 llvm-svn: 279061	2016-08-18 13:08:58 +00:00
Simon Pilgrim	916485c765	Remove trailing whitespace llvm-svn: 279054	2016-08-18 11:22:22 +00:00
Lang Hames	75601bf71e	Revert r279016 -- it breaks win32-elf JIT tests. llvm-svn: 279029	2016-08-18 01:33:28 +00:00
Kostya Serebryany	524c3f32e7	[sanitizer-coverage/libFuzzer] instrument comparisons with __sanitizer_cov_trace_cmp[1248] instead of __sanitizer_cov_trace_cmp, don't pass the comparison type to save a bit performance. Use these new callbacks in libFuzzer llvm-svn: 279027	2016-08-18 01:25:28 +00:00
Matthias Braun	6442fc1f6e	TailDuplicator: Fix crash after r278974 Some inputs would after r278974 without this fix (see http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_build/2733/console for an example) llvm-svn: 279022	2016-08-18 00:59:32 +00:00
Mehdi Amini	8ac7b32207	[LTO] Promote before performing weak resolution Summary: This was reversed compared to ThinLTOCodeGenerator for some reason, and lead to an increased code-size on my tests. I figured that the weak resolution may internalize a linkonce function, which will be promoted immediately (and renamed), before being internalized again. Reviewers: tejohnson Subscribers: pcc, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23632 llvm-svn: 279021	2016-08-18 00:59:24 +00:00
Vitaly Buka	d5ec14989d	[asan] Add support of lifetime poisoning into ComputeASanStackFrameLayout Summary: We are going to combine poisoning of red zones and scope poisoning. PR27453 Reviewers: kcc, eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23623 llvm-svn: 279020	2016-08-18 00:56:58 +00:00
Lang Hames	1d39cb16ec	[RuntimeDyld] Strip leading '_' from symbols on 32-bit windows in RTDyldMemoryManager::getSymbolAddressInProcess() This should allow JIT'd code for win32 to find in-process symbols. See http://llvm.org/PR28699 . Patch by James Holderness. Thanks James! llvm-svn: 279016	2016-08-18 00:22:34 +00:00
Mehdi Amini	eccffada33	[LTO] Change addSaveTemps API: do not add dot to the supplied prefix path Summary: It does not play well with directories (end up with a bunch of hidden files). Also, do not strip the 0 suffix for the first task, especially since 0 can be used by ThinLTO as well now. Reviewers: tejohnson Subscribers: mehdi_amini, pcc, llvm-commits Differential Revision: https://reviews.llvm.org/D23612 llvm-svn: 279014	2016-08-18 00:12:33 +00:00
Dominic Chen	a8a638292c	[WebAssembly] Handle debug information and virtual registers without crashing (reland r278967) Summary: Currently, enabling debug information when compiling for WebAssembly crashes the backend. This commit fixes these by skipping debug values in backend passes. Reviewers: jfb, aprantl, dschuff, echristo Subscribers: llvm-commits, dschuff, jfb, MatzeB, dexonsmith, yurydelendik, mehdi_amini Differential Revision: https://reviews.llvm.org/D23635 llvm-svn: 279011	2016-08-17 23:42:27 +00:00
Kostya Serebryany	5a5d5548f0	[libFuzzer] force proper popcnt instruction llvm-svn: 279002	2016-08-17 23:09:57 +00:00
Hans Wennborg	3879035e66	SCEV: Don't assert about non-SCEV-able value in isSCEVExprNeverPoison() (PR28932) Differential Revision: https://reviews.llvm.org/D23594 llvm-svn: 278999	2016-08-17 22:50:18 +00:00
Haicheng Wu	e787763275	[LoopUnroll] Move a simple check earlier. NFC. Move the check of CallInst earlier to skip expensive recursive operations. Differential Revision: https://reviews.llvm.org/D23611 llvm-svn: 278998	2016-08-17 22:42:58 +00:00
Tim Shen	5c0c063ad5	[LV] Move LoopBodyTraits to a better place, and add comment for simplifying LoopBlocksTraversal. NFC. Summary: I later (after r278573) found that LoopIterator.h has some overlapping with LoopBodyTraits. It's good to use LoopBodyTraits because a *Traits struct is algorithm independent. Reviewers: anemet, nadav, mkuper Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D23529 llvm-svn: 278996	2016-08-17 22:20:07 +00:00
Kostya Serebryany	e72774dd69	[libFuzzer] given 0 and 255 more preference when inserting repeated bytes llvm-svn: 278986	2016-08-17 21:50:54 +00:00
Chris Bieneman	432ba9d89a	[macho2yaml] Don't write empty linkedit data Since I stopped writing empty export tries it causes LinkEdit to potentially be completely empty which results in invalid yaml being generated. To prevent this we skip linkedit data if it is empty. llvm-svn: 278985	2016-08-17 21:46:04 +00:00
Kostya Serebryany	0c537b124c	[libFuzzer] one more mutation: ChangeBinaryInteger; also fix the breakage from r278970 llvm-svn: 278982	2016-08-17 21:30:30 +00:00
Kyle Butt	db3391ebe0	Tail Duplication: Accept explicit threshold for duplicating. This will allow tail duplication and tail merging during layout to have a shared threshold to make sure that they don't overlap. No observable change intended. llvm-svn: 278981	2016-08-17 21:07:35 +00:00
Kyle Butt	feafec588f	TailDuplicator: Use optForSize instead of hasFnAttribute. This will cause minsize functions to have the same threshold as optsize functions, but otherwise should have no effects. llvm-svn: 278980	2016-08-17 21:07:33 +00:00
Kostya Serebryany	a9a548049a	[libFuzzer] when printing the reproducer input, also print the base input and the mutation sequence llvm-svn: 278975	2016-08-17 20:45:23 +00:00
Duncan P. N. Exon Smith	afdd8e541b	Revert "[WebAssembly] Handle debug information and virtual registers without crashing" This reverts commit r278967, since the new test is failing when you don't build the WebAssembly target (most people, since it's off-by-default). llvm-svn: 278973	2016-08-17 20:41:50 +00:00
Justin Bogner	cd1d5aaf2e	Replace a few more "fall through" comments with LLVM_FALLTHROUGH Follow up to r278902. I had missed "fall through", with a space. llvm-svn: 278970	2016-08-17 20:30:52 +00:00
Tim Northover	de3aea0412	GlobalISel: support irtranslation of icmp instructions. llvm-svn: 278969	2016-08-17 20:25:25 +00:00
Dominic Chen	4326167a37	[WebAssembly] Handle debug information and virtual registers without crashing Summary: Currently, enabling debug information when compiling for WebAssembly crashes the backend. This commit fixes these by skipping debug values in backend passes. Reviewers: jfb, aprantl, dschuff, echristo Subscribers: mehdi_amini, yurydelendik, dexonsmith, MatzeB, jfb, dschuff, llvm-commits Differential Revision: https://reviews.llvm.org/D21808 llvm-svn: 278967	2016-08-17 20:11:03 +00:00
Tim Shen	eb3958fafd	[GraphWriter] Change GraphWriter to use NodeRef in GraphTraits Summary: This is part of the "NodeType* -> NodeRef" migration. Notice that since GraphWriter prints object address as identity, I added a static_assert on NodeRef to be a pointer type. Reviewers: dblaikie Subscribers: llvm-commits, MatzeB Differential Revision: https://reviews.llvm.org/D23580 llvm-svn: 278966	2016-08-17 20:07:29 +00:00
Matt Arsenault	d42d58cf21	AMDGPU: Remove dead option llvm-svn: 278965	2016-08-17 20:07:16 +00:00
Tim Shen	8b58bdfe6f	[GenericDomTree] Change GenericDomTree to use NodeRef in GraphTraits. NFC. Summary: Looking at the implementation, GenericDomTree has more specific requirements on NodeRef, e.g. NodeRefObject->getParent() should compile, and NodeRef should be a pointer. We can remove the pointer requirement, but it seems to have little gain, given the limited use cases. Also changed GraphTraits<Inverse<Inverse<T>> to be more accurate. Reviewers: dblaikie, chandlerc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23593 llvm-svn: 278961	2016-08-17 20:01:58 +00:00
Sanjay Patel	daffec91ef	[InstCombine] more clean up of foldICmpXorConstant(); NFCI Use m_APInt for the xor constant, but this is all still guarded by the initial ConstantInt check, so no vector types should make it in here. llvm-svn: 278957	2016-08-17 19:45:18 +00:00
Sanjay Patel	6d5f448746	[InstCombine] clean up foldICmpXorConstant(); NFCI 1. Change variable names 2. Use local variables to reduce code 3. Early exit to reduce indent llvm-svn: 278955	2016-08-17 19:23:42 +00:00
Marina Yatsina	53ce3f9d02	Fix for PR29010 This is a fix for https://llvm.org/bugs/show_bug.cgi?id=29010 Root cause of the bug is that the register class of the machine instruction operand does not fully reflect if this registers that can be allocated. Both for i386 and x86_64 the operand's register class is VR128RegClass and thus contains xmm0-xmm15, though in i386 we can only use xmm0-xmm8. In order to get the actual allocable registers of the class we need to use RegisterClassInfo. Differential Revision: https://reviews.llvm.org/D23613 llvm-svn: 278954	2016-08-17 19:07:40 +00:00
Kostya Serebryany	a7398ba024	[libFuzzer] more mutations llvm-svn: 278950	2016-08-17 18:10:42 +00:00
Sanjay Patel	63e14a07e8	[InstCombine] use m_APInt to allow icmp (or X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 https://reviews.llvm.org/rL278935 llvm-svn: 278945	2016-08-17 16:38:57 +00:00
Sanjay Patel	943e92efde	[InstCombine] clean up foldICmpOrConstant(); NFCI 1. Change variable names 2. Use local variables to reduce code 3. Use ? instead of if/else 4. Use the APInt variable instead of 'RHS' so the removal of the FIXME code will be direct llvm-svn: 278944	2016-08-17 16:30:43 +00:00
Adrian Prantl	c19dee734f	Support the DW_AT_noreturn DWARF flag. This is used to mark functions with the C++11 [[ noreturn ]] or C11 _Noreturn attributes. Patch by Victor Leschuk! https://reviews.llvm.org/D23167 llvm-svn: 278940	2016-08-17 16:02:43 +00:00
Chad Rosier	ea7e4647db	Revert "Reassociate: Reprocess RedoInsts after each inst". This reverts commit r258830, which introduced a bug described in PR28367. PR28367 llvm-svn: 278938	2016-08-17 15:54:39 +00:00
Sanjay Patel	4f7eb2aa95	[InstCombine] use m_APInt to allow icmp (add X, Y), C folds for splat constant vectors This is a sibling of: https://reviews.llvm.org/rL278859 llvm-svn: 278935	2016-08-17 15:24:30 +00:00
Simon Dardis	ac96ec7906	[mips] Add l.[sd] and s.[sd] instruction aliases Reviewers: dsanders, vkalintiris Differential Review: https://reviews.llvm.org/D23121 llvm-svn: 278930	2016-08-17 14:45:09 +00:00
Chad Rosier	a6822f64f3	Revert "[Reassociate] Avoid iterator invalidation when negating value." This reverts commit r278928 due to lit test failures. llvm-svn: 278929	2016-08-17 14:31:34 +00:00
Chad Rosier	cf3e8121a6	[Reassociate] Avoid iterator invalidation when negating value. Differential Revision: https://reviews.llvm.org/D23464 PR28367 llvm-svn: 278928	2016-08-17 14:16:45 +00:00
Jonas Paulsson	7a79422536	[LoopStrenghtReduce] Refactoring and addition of a new target cost function. Refactored so that a LSRUse owns its fixups, as oppsed to letting the LSRInstance own them. This makes it easier to rate formulas for LSRUses, since the fixups are available directly. The Offsets vector has been removed since it was no longer necessary. New target hook isFoldableMemAccessOffset(), which is used during formula rating. For SystemZ, this is useful to express that loads and stores with float or vector types with a big/negative offset should be avoided in loops. Without this, LSR will generate a lot of negative offsets that would require extra instructions for loading the address. Updated tests: test/CodeGen/SystemZ/loop-01.ll Reviewed by: Quentin Colombet and Ulrich Weigand. https://reviews.llvm.org/D19152 llvm-svn: 278927	2016-08-17 13:24:19 +00:00
Marina Yatsina	4b22642e6f	Fixing bug committed in rev. 278321 In theory the indices of RC (and thus the index used for LiveRegs) may differ from the indices of OpRC. Fixed the code to extract the correct RC index. OpRC contains the first X consecutive elements of RC, and thus their indices are currently de facto the same, therefore a test cannot be added at this point. Differential Revision: https://reviews.llvm.org/D23491 llvm-svn: 278923	2016-08-17 11:40:21 +00:00
Ayman Musa	71b43c5c1d	Fix bug in DAGBuilder for getelementptr with expanded vector. Replacing the usage of MVT with EVT in case the vector type is expanded. Differential Revision: https://reviews.llvm.org/D23306 llvm-svn: 278913	2016-08-17 07:52:15 +00:00
Ayman Musa	c96f421ad4	First commit (test commit) - Adding empty line. llvm-svn: 278910	2016-08-17 07:37:34 +00:00
Mehdi Amini	970800e0c8	[LTO] Introduce an Output class to wrap the output stream creation (NFC) Summary: While NFC for now, this will allow more flexibility on the client side to hold state necessary to back up the stream. Also when adding caching, this class will grow in complexity. Note I blindly modified the gold-plugin as I can't compile it. Reviewers: tejohnson Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D23542 llvm-svn: 278907	2016-08-17 06:23:09 +00:00
Justin Bogner	14f383e9c4	Fix a use of LLVM_FALLTHROUGH that wasn't even in a switch. I was over-aggressive in my conversions from comments to the fallthrough attribute. llvm-svn: 278903	2016-08-17 05:25:38 +00:00
Justin Bogner	b03fd12cef	Replace "fallthrough" comments with LLVM_FALLTHROUGH This is a mechanical change of comments in switches like fallthrough, fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead. llvm-svn: 278902	2016-08-17 05:10:15 +00:00
Chuang-Yu Cheng	f7ba716bcb	[ppc64] Don't apply sibling call optimization if callee has any byval arg This is a quick work around, because in some cases, e.g. caller's stack size > callee's stack size, we are still able to apply sibling call optimization even callee has any byval arg. This patch fix: https://llvm.org/bugs/show_bug.cgi?id=28328 Reviewers: hfinkel kbarton nemanjai amehsan Subscribers: hans, tjablin https://reviews.llvm.org/D23441 llvm-svn: 278900	2016-08-17 03:17:44 +00:00
Chandler Carruth	67fc52f067	[PM] Port the always inliner to the new pass manager in a much more minimal and boring form than the old pass manager's version. This pass does the very minimal amount of work necessary to inline functions declared as always-inline. It doesn't support a wide array of things that the legacy pass manager did support, but is alse ... about 20 lines of code. So it has that going for it. Notably things this doesn't support: - Array alloca merging - To support the above, bottom-up inlining with careful history tracking and call graph updates - DCE of the functions that become dead after this inlining. - Inlining through call instructions with the always_inline attribute. Instead, it focuses on inlining functions with that attribute. The first I've omitted because I'm hoping to just turn it off for the primary pass manager. If that doesn't pan out, I can add it here but it will be reasonably expensive to do so. The second should really be handled by running global-dce after the inliner. I don't want to re-implement the non-trivial logic necessary to do comdat-correct DCE of functions. This means the -O0 pipeline will have to be at least 'always-inline,global-dce', but that seems reasonable to me. If others are seriously worried about this I'd like to hear about it and understand why. Again, this is all solveable by factoring that logic into a utility and calling it here, but I'd like to wait to do that until there is a clear reason why the existing pass-based factoring won't work. The final point is a serious one. I can fairly easily add support for this, but it seems both costly and a confusing construct for the use case of the always inliner running at -O0. This attribute can of course still impact the normal inliner easily (although I find that a questionable re-use of the same attribute). I've started a discussion to sort out what semantics we want here and based on that can figure out if it makes sense ta have this complexity at O0 or not. One other advantage of this design is that it should be quite a bit faster due to checking for whether the function is a viable candidate for inlining exactly once per function instead of doing it for each call site. Anyways, hopefully a reasonable starting point for this pass. Differential Revision: https://reviews.llvm.org/D23299 llvm-svn: 278896	2016-08-17 02:56:20 +00:00
Matthias Braun	08f4704ec8	IfConversion: Use references instead of pointers where possible; NFC Also put some commonly used subexpressions into variables. llvm-svn: 278895	2016-08-17 02:52:01 +00:00
Matthias Braun	b1e0558df4	IfConversion: Use range based for; NFC Also avoid some pointless use of auto! Because that's friendlier to readers and avoids several types accidentally resolving to unnecessary references here (MachineInstr *&, unsigned &). llvm-svn: 278894	2016-08-17 02:51:59 +00:00
Matthias Braun	2c931798d6	IfConversion: Improve doxygen comments llvm-svn: 278893	2016-08-17 02:51:57 +00:00
Chandler Carruth	f702d8ecb6	[Inliner] Add a flag to disable manual alloca merging in the Inliner. This is off for now while testing can take place to make sure that in fact we do sufficient stack coloring to fully obviate the manual alloca array merging. Some context on why we should be using stack coloring rather than merging allocas in this way: LLVM relies very heavily on analyzing pointers as coming from different allocas in order to make aliasing decisions. These are some of the most powerful aliasing signals available in LLVM. So merging allocas is an extremely destructive operation on the LLVM IR -- it takes away highly valuable and hard to reconstruct information. As a consequence, inlined functions which happen to have array allocas that this pattern matches will fail to be properly interleaved unless SROA manages to hoist everything to an SSA register. Instead, the inliner will have added an unnecessary dependence that one inlined function execute after the other because they will have been rewritten to refer to the same memory. All that said, folks will reasonably want some time to experiment here and make sure there are no significant regressions. A flag should give us an easy knob to test. For more context, see the thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-July/103277.html http://lists.llvm.org/pipermail/llvm-dev/2016-August/103285.html Differential Revision: https://reviews.llvm.org/D23052 llvm-svn: 278892	2016-08-17 02:40:23 +00:00
Zijiao Ma	53d55f45a1	Some places that could using TargetParser in LLVM. NFC. llvm-svn: 278888	2016-08-17 02:08:28 +00:00
Duncan P. N. Exon Smith	362d120488	Scalar: Avoid dereferencing end() in IndVarSimplify IndVarSimplify::sinkUnusedInvariants calls BasicBlock::getFirstInsertionPt on the ExitBlock and moves instructions before it. This can return end(), so it's not safe to dereference. Add an iterator-based overload to Instruction::moveBefore to avoid the UB. llvm-svn: 278886	2016-08-17 01:54:41 +00:00
Duncan P. N. Exon Smith	9e3edad932	IPO: Swap \|\| operands to avoid dereferencing end() IsOperandBundleUse conveniently indicates whether std::next(F->arg_begin(),UseIndex) will get to (or past) end(). Check it first to avoid dereferencing end(). llvm-svn: 278884	2016-08-17 01:23:58 +00:00
Duncan P. N. Exon Smith	3bcaa81204	Scalar: Avoid dereferencing end() in InductiveRangeCheckElimination BasicBlock::Create isn't designed to take iterators (which might be end()), but pointers (which might be nullptr). Fix the UB that was converting end() to a BasicBlock* by calling BasicBlock::getNextNode() in the first place. llvm-svn: 278883	2016-08-17 01:16:17 +00:00
Duncan P. N. Exon Smith	6331dc171c	ObjCARC: Don't increment or dereference end() when scanning args When there's only one argument and it doesn't match one of the known functions, return ARCInstKind::CallOrUser rather than falling through to the two argument case. The old behaviour both incremented past and dereferenced end(). llvm-svn: 278881	2016-08-17 01:02:18 +00:00
Duncan P. N. Exon Smith	ec083b59ed	ARM: Avoid dereferencing end() in ARMFrameLowering::emitPrologue llvm::tryFoldSPUpdateIntoPushPop assumes its arguments are valid MachineInstrs. Update ARMFrameLowering::emitPrologue to respect that; when LastPush==end(), it can't possibly be a push instruction anyway. llvm-svn: 278880	2016-08-17 00:53:04 +00:00
Duncan P. N. Exon Smith	00ec93da26	CodeGen: Avoid dereferencing end() in OptimizePHIs::OptimizeBB llvm-svn: 278879	2016-08-17 00:43:59 +00:00
Duncan P. N. Exon Smith	e04fe1a394	Hexagon: Avoid dereferencing end() in HexagonInstrInfo::InsertBranch llvm-svn: 278878	2016-08-17 00:34:00 +00:00
Duncan P. N. Exon Smith	db53d99d02	AMDGPU: Avoid looking for the DebugLoc in end() The end() iterator isn't a safe thing to dereference. Pass the DebugLoc into EmitFetchClause and EmitALUClause to avoid it. llvm-svn: 278873	2016-08-17 00:06:43 +00:00
Duncan P. N. Exon Smith	0a12729f99	SimplifyCFG: Avoid dereferencing end() When comparing a User* to a BasicBlock::iterator in passingValueIsAlwaysUndefined, don't dereference the iterator in case it is end(). llvm-svn: 278872	2016-08-16 23:57:56 +00:00
Justin Bogner	39eec466a2	Revert "Write the TPI stream from a PDB to Yaml." This is hitting a "use of undeclared identifier 'skipPadding' error locally and on some bots. This reverts r278869. llvm-svn: 278871	2016-08-16 23:37:10 +00:00
Duncan P. N. Exon Smith	dcbce9c391	CodeGen: Avoid dereferencing end() when unconstifying iterators Rather than doing a funny dance that relies on dereferencing end() not crashing, add some API to MachineInstrBundleIterator to get a non-const version of the iterator. llvm-svn: 278870	2016-08-16 23:34:07 +00:00
Zachary Turner	8321ba5437	Write the TPI stream from a PDB to Yaml. Reviewed By: ruiu, rnk Differential Revision: https://reviews.llvm.org/D23226 llvm-svn: 278869	2016-08-16 23:28:54 +00:00
Kyle Butt	07d61425e3	Codegen: Don't tail-duplicate blocks with un-analyzable fallthrough. If AnalyzeBranch can't analyze a block and it is possible to fallthrough, then duplicating the block doesn't make sense, as only one block can be the layout predecessor for the un-analyzable fallthrough. Submitted wit a test case, but NOTE: the test case doesn't currently fail. However, the test case fails with D20505 and would have saved me some time debugging. llvm-svn: 278866	2016-08-16 22:56:14 +00:00
Sanjay Patel	60ea1b43d6	[InstCombine] clean up foldICmpAddConstant(); NFCI 1. Fix variable names 2. Add local variables to reduce code 3. Fix code comments 4. Add early exit to reduce indentation 5. Remove 'else' after if -> return 6. Hoist common predicate llvm-svn: 278864	2016-08-16 22:34:42 +00:00
Konstantin Zhuravlyov	e0b87181cf	[AMDGPU] Remove duplicate initialization of SIDebuggerInsertNops pass Differential Revision: https://reviews.llvm.org/D23556 llvm-svn: 278863	2016-08-16 22:30:11 +00:00
David Majnemer	744a8753db	Preserve the assumption cache more often We were clearing it out in LoopUnswitch and InlineFunction instead of attempting to preserve it. llvm-svn: 278860	2016-08-16 22:07:32 +00:00
Sanjay Patel	e47df1ac62	[InstCombine] use m_APInt to allow icmp (sub X, Y), C folds for splat constant vectors llvm-svn: 278859	2016-08-16 21:53:19 +00:00
Duncan P. N. Exon Smith	41cf73ce16	CodeGen: Don't dereference end() in MachineBasicBlock::CorrectExtraCFGEdges The current MachineBasicBlock might be the last block, so FallThru may be past the end(). Use getNextNode(), which will convert to nullptr, rather than &*++, which is invalid if we reach the end(). llvm-svn: 278858	2016-08-16 21:46:03 +00:00
Sanjay Patel	904cd39b05	[x86] Allow merging multiple instances of an immediate within a basic block for code size savings, for 64-bit constants. This patch handles 64-bit constants which can be encoded as 32-bit immediates. It extends the functionality added by https://reviews.llvm.org/D11363 for 32-bit constants to 64-bit constants. Patch by Sunita Marathe! Differential Revision: https://reviews.llvm.org/D23391 llvm-svn: 278857	2016-08-16 21:35:16 +00:00
Kostya Serebryany	3044390af1	[libFuzzer] minor speed improvement llvm-svn: 278856	2016-08-16 21:28:05 +00:00
Sanjay Patel	b9aa67bfcf	[InstCombine] fix variable names to match formula comments; NFC llvm-svn: 278855	2016-08-16 21:26:10 +00:00
David Majnemer	110522bc0f	[LoopUnroll] Don't clear out the AssumptionCache on each loop Clearing out the AssumptionCache can cause us to rescan the entire function for assumes. If there are many loops, then we are scanning over the entire function many times. Instead of clearing out the AssumptionCache, register all cloned assumes. llvm-svn: 278854	2016-08-16 21:09:46 +00:00
Reid Kleckner	b99b709068	Revert "Enhance SCEV to compute the trip count for some loops with unknown stride." This reverts commit r278731. It caused http://crbug.com/638314 llvm-svn: 278853	2016-08-16 21:02:04 +00:00
Matt Arsenault	b8037a1bd3	TailDuplicator: Use range loops llvm-svn: 278847	2016-08-16 20:38:05 +00:00
Evandro Menezes	5a5b8dcd32	[AArch64] Adjust the scheduling model for Exynos M1. Refine the model for the FP division unit. llvm-svn: 278846	2016-08-16 20:35:01 +00:00
Evandro Menezes	d03aff2e11	[AArch64] Adjust the scheduling model for Exynos M1. Refine the model for the integer division unit. llvm-svn: 278845	2016-08-16 20:34:58 +00:00
Matt Arsenault	7f19298bfa	AMDGPU: Remove excessive padding from ImmOp and RegOp. The structs ImmOp and RegOp are in AArch64AsmParser.cpp (inside anonymous namespace). This diff changes the order of fields and removes the excessive padding (8 bytes). Patch by Alexander Shaposhnikov llvm-svn: 278844	2016-08-16 20:28:06 +00:00
Sjoerd Meijer	15c81b05ea	[MBP] do not reorder and move up loop latch block Do not reorder and move up a loop latch block before a loop header when optimising for size because this will generate an extra unconditional branch. Differential Revision: https://reviews.llvm.org/D22521 llvm-svn: 278840	2016-08-16 19:50:33 +00:00
Kostya Serebryany	d46a59fac4	[libFuzzer] new experimental feature: value profiling. Profiles values that affect control flow and treats new values as new coverage. llvm-svn: 278839	2016-08-16 19:33:51 +00:00
Benjamin Kramer	0464ae83e7	Remove excessive padding from LineNoCacheTy The struct LineNoCacheTy is in SourceMgr.cpp inside anonymous namespace. This diff changes the order of fields and removes the excessive padding (8 bytes). Patch by Alexander Shaposhnikov! Differential revision: https://reviews.llvm.org/D23546 llvm-svn: 278838	2016-08-16 19:20:10 +00:00
David Majnemer	00940fb854	Make MDNode::intersect faster than O(n * m) It is pretty easy to get it down to O(nlogn + mlogm). This implementation has the added benefit of automatically deduplicating entries between the two sets. llvm-svn: 278837	2016-08-16 18:48:37 +00:00
David Majnemer	fa0f1e660b	Don't passively concatenate MDNodes I have audited all the callers of concatenate and none require duplicate entries to service concatenation. These duplicates serve no purpose but to needlessly embiggen the IR. N.B. Layering getMostGenericAliasScope on top of concatenate makes it O(nlogn + mlogm) instead of O(n*m). llvm-svn: 278836	2016-08-16 18:48:34 +00:00

... 8 9 10 11 12 ...

94831 Commits