llvm-project

Commit Graph

Author	SHA1	Message	Date
Daniel Sanders	e0b529f619	[mips][fastisel] Handle 0-4 arguments without SelectionDAG. Summary: Implements fastLowerArguments() to avoid the need to fall back on SelectionDAG for 0-4 argument functions that don't do tricky things like passing double in a pair of i32's. This allows us to move all except one test to -fast-isel-abort=3. The remaining one has function prototypes of the form 'i32 (i32, double, double)' which requires floats to be passed in GPR's. Reviewers: sdardis Subscribers: dsanders, llvm-commits, sdardis Differential Revision: https://reviews.llvm.org/D22680 llvm-svn: 276982	2016-07-28 14:55:28 +00:00
Nicolai Haehnle	3b572002a2	AMDGPU: add execfix flag to SI_ELSE Summary: SI_ELSE is lowered into two parts: s_or_saveexec_b64 dst, src (at the start of the basic block) s_xor_b64 exec, exec, dst (at the end of the basic block) The idea is that dst contains the exec mask of the preceding IF block. It can happen that SIWholeQuadMode decides to switch from WQM to Exact mode inside the basic block that contains SI_ELSE, in which case it introduces an instruction s_and_b64 exec, exec, s[...] which masks out bits that can correspond to both the IF and the ELSE paths. So the resulting sequence must be: s_or_savexec_b64 dst, src s_and_b64 exec, exec, s[...] <-- added by SIWholeQuadMode s_and_b64 dst, dst, exec <-- added by SILowerControlFlow s_xor_b64 exec, exec, dst Whether to add the additional s_and_b64 dst, dst, exec is currently determined via the ExecModified tracking. With this change, it is instead determined by an additional flag on SI_ELSE which is set by SIWholeQuadMode. Finally: It also occured to me that an alternative approach for the long run is for SILowerControlFlow to unconditionally emit s_or_saveexec_b64 dst, src ... s_and_b64 dst, dst, exec s_xor_b64 exec, exec, dst and have a pass that detects and cleans up the "redundant AND with exec" pattern where possible. This could be useful anyway, because we also add instructions s_and_b64 vcc, exec, vcc before s_cbranch_scc (in moveToALU), and those are often redundant. I have some pending changes to how KILL is lowered that could also benefit from such a cleanup pass. In any case, this current patch could help in the short term with the whole ExecModified business. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D22846 llvm-svn: 276972	2016-07-28 11:39:24 +00:00
David Majnemer	19d024b2fd	[ConstantFolding] Don't bail on folding if ConstantFoldConstantExpression fails When folding an expression, we run ConstantFoldConstantExpression on each operand of that expression. However, ConstantFoldConstantExpression can fail and retur nullptr. Previously, we would bail on further refining the expression. Instead, use the original operand and see if we can refine a later operand. llvm-svn: 276959	2016-07-28 06:39:48 +00:00
David Majnemer	67f684e18e	[CodeView] Don't crash on functions without subprograms A function may have instructions annotated with debug info without having a subprogram. This fixes PR28747. llvm-svn: 276956	2016-07-28 05:03:22 +00:00
David Majnemer	0be7155350	[InstCombine] Handle failures from ConstantFoldConstantExpression ConstantFoldConstantExpression returns null when folding fails. This fixes PR28745. llvm-svn: 276952	2016-07-28 02:29:06 +00:00
Wei Mi	315bb33f27	Fix the assertion error in collectLoopUniforms caused by empty Worklist before expanding. Contributed-by: David Callahan Differential Revision: https://reviews.llvm.org/D22886 llvm-svn: 276943	2016-07-27 23:53:58 +00:00
George Burgess IV	dbd35c44d4	[CFLAA] Add getModRefBehavior to CFLAnders. This patch lets CFLAnders respond to mod-ref queries. It also includes a small bugfix to CFLSteens. Patch by Jia Chen. Differential Revision: https://reviews.llvm.org/D22823 llvm-svn: 276939	2016-07-27 23:07:07 +00:00
Vedant Kumar	fc07e8b428	[llvm-cov] Add a debug mode for source range highlighting (in html) llvm-cov's `-dump' option now emits information which helps debug source range highlighting in html mode. llvm-svn: 276924	2016-07-27 21:57:15 +00:00
Justin Lebar	23a9686011	[LSV] Don't assume that bitcast ops are Instructions. Summary: When we ask the builder to create a bitcast on a constant, we get back a constant, not an instruction. Reviewers: asbirlea Subscribers: jholewinski, mzolotukhin, llvm-commits, arsenm Differential Revision: https://reviews.llvm.org/D22878 llvm-svn: 276922	2016-07-27 21:45:48 +00:00
Krzysztof Parzyszek	06a2b6b1ee	[Hexagon] Find speculative loop preheader in hardware loop generation Before adding a new preheader block, check if there is a candidate block where the loop setup could be placed speculatively. This will be off by default. llvm-svn: 276919	2016-07-27 21:20:54 +00:00
Krzysztof Parzyszek	5241b8efcf	[Hexagon] Do not optimize volatile stack spill slots llvm-svn: 276916	2016-07-27 20:50:42 +00:00
Andrew Kaylor	9155354ff2	Revert EH-specific checks in BranchFolding that were causing blow ups in compile time. Differential Revision: https://reviews.llvm.org/D22839 llvm-svn: 276898	2016-07-27 17:55:33 +00:00
Tim Northover	8d2f52e035	GlobalISel: support zero-sized allocas All allocas must be at least 1 byte at the MachineIR level so we allocate just one byte. llvm-svn: 276897	2016-07-27 17:47:54 +00:00
Nirav Dave	06a99a46e2	[MC][X86] Fix Intel Operand assembly parsing for .set ids Fix intel syntax special case identifier operands that refer to a constant (e.g. .set <ID> n) to be interpreted as immediate not memory in parsing. Reviewers: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D22585 llvm-svn: 276895	2016-07-27 17:39:41 +00:00
Simon Pilgrim	7efafbc583	[X86][SSE] Updated test so that both are applying the post-multiply This is to ensure that there are no diffs other than due to buildvector/legalization llvm-svn: 276882	2016-07-27 15:30:20 +00:00
Renato Golin	5267d53779	[ARM] Check that the thumb COFF segment flag gets set on thumb windows Patch by Martin Storsjö. llvm-svn: 276877	2016-07-27 14:37:18 +00:00
Ahmed Bougacha	6756a2c953	[GlobalISel] Introduce an instruction selector. And implement it for AArch64, supporting x/w ADD/OR. Differential Revision: https://reviews.llvm.org/D22373 llvm-svn: 276875	2016-07-27 14:31:55 +00:00
Daniel Sanders	c5537427c2	[mips][ias] Check '$rs = $rd' constraints when both registers are in AsmText. Summary: This is one possible solution to the problem of ignoring constraints that Simon raised in D21473 but it's a bit of a hack. The integrated assembler currently ignores violations of the tied register constraints when the operands involved in a tie are both present in the AsmText. For example, 'dati $rs, $rt, $imm' with the '$rs = $rt' will silently replace $rt with $rs. So 'dati $2, $3, 1' is processed as if the user provided 'dati $2, $2, 1' without any diagnostic being emitted. This is difficult to solve properly because there are multiple parts of the matcher that are silently forcing these constraints to be met. Tied operands are rendered to instructions by cloning previously rendered operands but this is unnecessary because the matcher was already instructed to render the operand it would have cloned. This is also unnecessary because earlier code has already replaced the MCParsedOperand with the one it was tied to (so the parsed input is matched as if it were 'dati <RegIdx 2>, <RegIdx 2>, <Imm 1>'). As a result, it looks like fixing this properly amounts to a rewrite of the tied operand handling which affects all targets. This patch however, merely inserts a checking hook just before the substitution of MCParsedOperands and the Mips target overrides it. It's not possible to accurately check the registers are the same this early (because numeric registers haven't been bound to a register class yet) so it cheats a bit and checks that the tokens that produced the operand are lexically identical. This works because tied registers need to have the same register class but it does have a flaw. It will reject 'dati $4, $a0, 1' for violating the constraint even though $a0 ends up as the same register as $4. Reviewers: sdardis Subscribers: dsanders, llvm-commits, sdardis Differential Revision: https://reviews.llvm.org/D21994 llvm-svn: 276867	2016-07-27 13:49:44 +00:00
Teresa Johnson	d92012d51c	[test/gold] Add gold test subdirectory tests needing v1.12 (or higher) Summary: As discussed in the review for D22677, added a subdirectory to enable tests that require at least version 1.12 of gold. Add an initial test requiring this version. Reviewers: davidxl, mehdi_amini Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D22827 llvm-svn: 276860	2016-07-27 12:59:51 +00:00
Renato Golin	80e58869f8	[ARM] Set a non-conflicting comment character for assembly in MSVC mode Currently, for ARMCOFFMCAsmInfoMicrosoft, no comment character is set, thus the idefault, '#', is used. The hash character doesn't work as comment character in ARM assembly, since '#' is used for immediate values. The comment character is set to ';', which is the comment character used by MS armasm.exe. (The microsoft armasm.exe uses a different directive syntax than what LLVM currently supports though, similar to ARM's armasm.) This allows inline assembly with immediate constants to be built (and brings the assembly output from clang -S closer to being possible to assemble). A test is added that verifies that ';' is correctly interpreted as comments in this mode, and verifies that assembling code that includes literal constants with a '#' works. Patch by Martin Storsjö. llvm-svn: 276859	2016-07-27 12:31:58 +00:00
Renato Golin	b9edb5c9b0	[ARM] Adds test for immediate encoding The encoding of expressions as immediates wasn't correct, and was reported in PR23000. However, we have done some refactoring on how immediates are handled and now it seems the problem is fixed. This is a test just to make sure it won't regress again. llvm-svn: 276858	2016-07-27 12:15:26 +00:00
Simon Pilgrim	10bf0ff879	[DAGCombiner] Use APInt directly to detect out of range shift constants Using getZExtValue() will assert if the value doesn't fit into uint64_t - SHL was already doing this, I've just updated ASHR/LSHR to match As mentioned on D22726 llvm-svn: 276855	2016-07-27 10:30:55 +00:00
Davide Italiano	7c9fc738b1	[MC] Add command-line option to choose the max nest level in asm macros. Submitted by: t83wCSLq Differential Revision: https://reviews.llvm.org/D22313 llvm-svn: 276842	2016-07-27 05:51:56 +00:00
Sebastian Pop	55c3007b88	GVN-hoist: improve code generation for recursive GEPs When loading or storing in a field of a struct like "a.b.c", GVN is able to detect the equivalent expressions, and GVN-hoist would fail in the code generation. This is because the GEPs are not hoisted as scalar operations to avoid moving the GEPs too far from their ld/st instruction when the ld/st is not movable. So we end up having to generate code for the GEP of a ld/st when we move the ld/st. In the case of a GEP referring to another GEP as in "a.b.c" we need to code generate all the GEPs necessary to make all the operands available at the new location for the ld/st. With this patch we recursively walk through the GEP operands checking whether all operands are available, and in the case of a GEP operand, it recursively makes all its operands available. Code generation happens from the inner GEPs out until reaching the GEP that appears as an operand of the ld/st. Differential Revision: https://reviews.llvm.org/D22599 llvm-svn: 276841	2016-07-27 05:48:12 +00:00
Vedant Kumar	90be9db7d9	[llvm-cov] Escape '\' in strings when emitting JSON Test that Windows path separators are escaped properly. Add a round-trip test to verify the JSON produced by the exporter. llvm-svn: 276832	2016-07-27 04:08:32 +00:00
David Majnemer	bc36b15253	[ConstantFolding] Correctly handle failures in ConstantFoldConstantExpressionImpl Failures in ConstantFoldConstantExpressionImpl were ignored causing crashes down the line. This fixes PR28725. llvm-svn: 276827	2016-07-27 02:39:16 +00:00
Andrew Kaylor	f990fa5f7b	Reverting r276771 due to MSan failures. llvm-svn: 276824	2016-07-27 01:19:24 +00:00
Matt Arsenault	e3862cdc93	AMDGPU: Use rcp for fdiv 1, x with fpmath metadata Using rcp should be OK for safe math usually, so this should not be replacing the original fdiv. llvm-svn: 276823	2016-07-26 23:25:44 +00:00
Hans Wennborg	685e8ff953	Revert r276136 "Use ValueOffsetPair to enhance value reuse during SCEV expansion." It causes Clang tests to fail after Windows self-host (PR28705). (Also reverts follow-up r276139.) llvm-svn: 276822	2016-07-26 23:25:13 +00:00
Matt Arsenault	918e81c3d4	AMDGPU: Add more tests for LDS size with occupancy llvm-svn: 276821	2016-07-26 23:15:59 +00:00
Vedant Kumar	7101d73c71	Retry: [llvm-cov] Add support for exporting coverage data to JSON This enables users to export coverage information as portable JSON for use by analysis tools and storage in document based databases. The export sub-command is invoked just like the others: llvm-cov export -instr-profile path/to/foo.profdata path/to/foo.binary The resulting JSON contains a list of files and functions. Every file object contains a list of segments, expansions, and a summary of the file's region, function, and line coverage. Every function object contains the function's name and regions. There is also a total summary for the entire object file. Changes since the initial commit (r276813): - Fixed the regexes in the tests to handle Windows filepaths. Patch by Eddie Hurtig! Differential Revision: https://reviews.llvm.org/D22651 llvm-svn: 276818	2016-07-26 22:50:58 +00:00
Vedant Kumar	e85353b849	Revert "[llvm-cov] Add support for exporting coverage data to JSON" This reverts commit r276813. The Windows bots are complaining about some of the filename regexes in the tests: http://bb.pgr.jp/builders/ninja-clang-i686-msc19-R/builds/5299 llvm-svn: 276816	2016-07-26 21:55:39 +00:00
Matthias Braun	333e468d15	MIRParser: Use dot instead of colon to mark subregisters Change the syntax to use `%0.sub8` to denote a subregister. This seems like a more natural fit to denote subregisters; I also plan to introduce a new ":classname" syntax in upcoming patches to denote the register class of a vreg. Note that this commit disallows plain identifiers to start with a '.' character. This shouldn't affect anything as external names/IR references are all prefixed with '$'/'%', plain identifiers are only used for instruction names, register mask names and subreg indexes. Differential Revision: https://reviews.llvm.org/D22390 llvm-svn: 276815	2016-07-26 21:49:34 +00:00
Vedant Kumar	d5b7436c1f	[llvm-cov] Add support for exporting coverage data to JSON This enables users to export coverage information as portable JSON for use by analysis tools and storage in document based databases. The export sub-command is invoked just like the others: llvm-cov export -instr-profile path/to/foo.profdata path/to/foo.binary The resulting JSON contains a list of files and functions. Every file object contains a list of segments, expansions, and a summary of the file's region, function, and line coverage. Every function object contains the function's name and regions. There is also a total summary for the entire object file. Patch by Eddie Hurtig! Differential Revision: https://reviews.llvm.org/D22651 llvm-svn: 276813	2016-07-26 21:35:43 +00:00
Krzysztof Parzyszek	2a480599bb	[Hexagon] Post-increment loads/stores enhancements - Generate vector post-increment stores more aggressively. - Predicate post-increment and vector stores in early if-conversion. llvm-svn: 276800	2016-07-26 20:30:30 +00:00
Tim Northover	ad2b717f2c	GlobalISel: add generic load and store instructions. Pretty straightforward, the only oddity is the MachineMemOperand (which it's surprisingly difficult to share code for). llvm-svn: 276799	2016-07-26 20:23:26 +00:00
Krzysztof Parzyszek	57c3ddddec	[Hexagon] Gracefully handle reg class mismatch in HexagonLoopReschedule llvm-svn: 276793	2016-07-26 19:17:13 +00:00
Krzysztof Parzyszek	6eba5b8c37	[Hexagon] Rerun bit tracker on new instructions in RIE Consider this case: vreg1 = A2_zxth vreg0 (1) ... vreg2 = A2_zxth vreg1 (2) Redundant instruction elimination could delete the instruction (1) because the user (2) only cares about the low 16 bits. Then it could delete (2) because the input is already zero-extended. The problem is that the properties allowing each individual instruction to be deleted depend on the existence of the other instruction, so either one can be deleted, but not both. The existing check for this situation in RIE was insufficient. The fix is to update all dependent cells when an instruction is removed (replaced via COPY) in RIE. llvm-svn: 276792	2016-07-26 19:08:45 +00:00
Krzysztof Parzyszek	1adca30c39	[Hexagon] Bitwise operations for insert/extract word not simplified Change the bit simplifier to generate REG_SEQUENCE instructions in addition to COPY, which will handle cases of word insert/extract. llvm-svn: 276787	2016-07-26 18:30:11 +00:00
Justin Lebar	16da82f4d2	Fix NVPTX/call-with-alloca-buffer.ll after r276777. r276777 makes InstSimplify stronger, letting it see through some unnecessary addrspace casts. llvm-svn: 276786	2016-07-26 18:28:33 +00:00
Matthias Braun	ee0679207b	MIRParser: Use shorter cfi identifiers In an instruction like: CFI_INSTRUCTION .cfi_def_cfa ... we can drop the '.cfi_' prefix since that should be obvious by the context: CFI_INSTRUCTION def_cfa ... While being a terser and cleaner syntax this also prepares to dropping support for identifiers starting with a dot character so we can use it for expressions. Differential Revision: http://reviews.llvm.org/D22388 llvm-svn: 276785	2016-07-26 18:20:00 +00:00
Davide Italiano	f17d48e58a	[MC] Don't crash when trying to emit a relocation against .bss. Turn that into an error instead. llvm-svn: 276783	2016-07-26 18:16:33 +00:00
David Majnemer	6774d612d4	[InstSimplify] Cast folding can be made more generic Use isEliminableCastPair to determine if a pair of casts are foldable. llvm-svn: 276777	2016-07-26 17:58:05 +00:00
Tim Northover	ab395cb071	GlobalISel: add correct operand type to G_FRAME_INDEX instrs. Frame indices should use "addFrameIndex", not "addImm". llvm-svn: 276775	2016-07-26 17:42:40 +00:00
Krzysztof Parzyszek	29c567a3f0	[Hexagon] Add support for proper handling of H and L constraints H -> High part of reg pair. L -> Low part of reg pair. Patch by Sundeep Kushwaha. llvm-svn: 276773	2016-07-26 17:31:02 +00:00
Tim Northover	26e40bdb9b	GlobalISel: omit braces on MachineInstr types when there's only one. Tidies up the representation a bit in the common case. llvm-svn: 276772	2016-07-26 17:28:01 +00:00
Andrew Kaylor	3104a6bad0	Re-committing r275284: add support to inline __builtin_mempcpy Patch by Sunita Marathe Differential Revision: http://reviews.llvm.org/D21920 llvm-svn: 276771	2016-07-26 17:23:13 +00:00
Matt Arsenault	07f65718bb	AMDGPU: Add missing tests for xnack option for HSA llvm-svn: 276765	2016-07-26 16:45:50 +00:00
Matt Arsenault	32fc527c65	AMDGPU: Add fp legacy instruction intrinsics This could use some additional optimization work to use mad/mac legacy. llvm-svn: 276764	2016-07-26 16:45:45 +00:00
Oliver Stannard	1c6e591457	[ARM] Improve error messages for .arch_extension directive - More informative message when extension name is not an identifier token. - Stop parsing directive if extension is unknown (avoid duplicate error messages). - Report unsupported extensions with a source location, rather than report_fatal_error. Differential Revision: https://reviews.llvm.org/D22806 llvm-svn: 276748	2016-07-26 14:24:43 +00:00
Oliver Stannard	2171828a49	[ARM] Implement -mimplicit-it assembler option This option, compatible with gas's -mimplicit-it, controls the generation/checking of implicit IT blocks in ARM/Thumb assembly. This option allows two behaviours that were not possible before: - When in ARM mode, emit a warning when assembling a conditional instruction that is not in an IT block. This is enabled with -mimplicit-it=never and -mimplicit-it=thumb. - When in Thumb mode, automatically generate IT instructions when an instruction with a condition code appears outside of an IT block. This is enabled with -mimplicit-it=thumb and -mimplicit-it=always. The default option is -mimplicit-it=arm, which matches the existing behaviour (allow conditional ARM instructions outside IT blocks without warning, and error if a conditional Thumb instruction is outside an IT block). The general strategy for generating IT blocks in Thumb mode is to keep a small list of instructions which should be in the IT block, and only emit them when we encounter something in the input which means we cannot continue the block. This could be caused by: - A non-predicable instruction - An instruction with a condition not compatible with the IT block - The IT block already contains 4 instructions - A branch-like instruction (including ALU instructions with the PC as the destination), which cannot appear in the middle of an IT block - A label (branching into an IT block is not legal) - A change of section, architecture, ISA, etc - The end of the assembly file. Some of these, such as change of section and end of file, are parsed outside of the ARM asm parser, so I've added a new virtual function to AsmParser to ensure any previously-parsed instructions have been emitted. The ARM implementation of this flushes the currently pending IT block. We now have to try instruction matching up to 3 times, because we cannot know if the current IT block is valid before matching, and instruction matching changes depending on the IT block state (due to the 16-bit ALU instructions, which set the flags iff not in an IT block). In the common case of not having an open implicit IT block and the instruction being matched not needing one, we still only have to run the matcher once. I've removed the ITState.FirstCond variable, because it does not store any information that isn't already represented by CurPosition. I've also updated the comment on CurPosition to accurately describe it's meaning (which this patch doesn't change). Differential Revision: https://reviews.llvm.org/D22760 llvm-svn: 276747	2016-07-26 14:19:47 +00:00
Simon Pilgrim	0280959c0d	[X86][SSE] Added extra memory folding tests for cvtsd2ss intrinsic SSE only fold partial reg update instructions when optsize is enabled llvm-svn: 276743	2016-07-26 12:44:50 +00:00
Simon Pilgrim	019e102426	[X86][SSE] Fixed issue with memory folding of (v)cvtsd2ss intrinsics Fixed typo in the intrinsic definitions of (v)cvtsd2ss with memory folding. This was only unearthed when rL276102 started using the intrinsic again..... llvm-svn: 276740	2016-07-26 10:41:28 +00:00
Simon Dardis	68a204ddc1	[mips] MIPS64R6 compact branch support MIPS64R6 compact branch support. As the MIPS LLVM backend uses distinct MachineInstrs for certain 32 and 64 bit instructions (e.g. BEQ & BEQ64) that map to the same instruction, extend compact branch support for the corresponding 64bit branches. Reviewers: dsanders Differential Revision: https://reviews.llvm.org/D20164 llvm-svn: 276739	2016-07-26 10:25:07 +00:00
Simon Dardis	273fc26b79	[mips] sgtu, s[rl]l, sra, dnegu, neg instruction aliases Add the instruction alias sgtu (register form only), two operand forms of s[rl]l and sra, and missing single/two operand forms of dnegu/neg. Reviewers: dsanders Differential Revision: https://reviews.llvm.org/D22752 llvm-svn: 276736	2016-07-26 09:13:46 +00:00
David Majnemer	a90a621d1e	Reapply: [InstSimplify] Add support for bitcasts" This reverts commit r276700 and reapplies r276698. The relevant clang tests have been updated. llvm-svn: 276727	2016-07-26 05:52:29 +00:00
Sebastian Pop	91d4a30159	GVN-hoist: use a DFS numbering of instructions (PR28670) Instead of DFS numbering basic blocks we now DFS number instructions that avoids the costly operation of which instruction comes first in a basic block. Patch mostly written by Daniel Berlin. Differential Revision: https://reviews.llvm.org/D22777 llvm-svn: 276714	2016-07-26 00:15:10 +00:00
Evgeniy Stepanov	906f6fb565	[safestack] Fix stack guard live range. Stack guard slot is live throughout the function. llvm-svn: 276712	2016-07-26 00:05:14 +00:00
Adam Nemet	39e039c606	[lit] Don't match tool names within new PM's <> markers For example, stop expanding 'opt' in -passes='require<opt-remark-emit>'. llvm-svn: 276707	2016-07-25 23:09:10 +00:00
Renato Golin	32b165f561	[ARM] Saturation instructions are DSP-only The saturation instructions appeared in v6T2, with DSP extensions, but they were being accepted / generated on any, with the new introduction of the saturation detection in the back-end. This commit restricts the usage to DSP-enable only cores. Fixes PR28607. llvm-svn: 276701	2016-07-25 22:25:25 +00:00
David Majnemer	6e06b577cc	Revert "[InstSimplify] Add support for bitcasts" This reverts commit r276698. Clang has tests which rely on the optimizer :( llvm-svn: 276700	2016-07-25 22:24:59 +00:00
David Majnemer	62611fd3f7	[InstSimplify] Add support for bitcasts BitCasts of BitCasts can be folded away as can BitCasts which don't change the type of the operand. llvm-svn: 276698	2016-07-25 22:04:58 +00:00
Simon Pilgrim	fa863fb2f1	[X86] Regenerate v2i256 shift legalization tests llvm-svn: 276692	2016-07-25 21:14:22 +00:00
Simon Pilgrim	2d7735e428	[X86] Regenerate i64 shift legalization tests llvm-svn: 276691	2016-07-25 21:11:45 +00:00
Tim Northover	7c9eba90ff	GlobalISel: add generic casts to IRTranslator This adds LLVM's 3 main cast instructions (inttoptr, ptrtoint, bitcast) to the IRTranslator. The first two are direct translations (with 2 MachineInstr types each). Since LLT discards information, a bitcast might become trivial and we emit a COPY in those cases instead. llvm-svn: 276690	2016-07-25 21:01:29 +00:00
Tim Northover	e2e0067352	GlobalISel[AArch64]: support pointer types in argument lowering. They're basically i64 for AArch64, but we'll leave them intact for stranger targets. Also add some tests for the (very few) other cases we can handle right now. llvm-svn: 276689	2016-07-25 21:01:17 +00:00
Michael Kuperstein	39feb6290c	[PM] Port SymbolRewriter to the new PM Differential Revision: https://reviews.llvm.org/D22703 llvm-svn: 276687	2016-07-25 20:52:00 +00:00
Kevin Enderby	95b0842e64	Next step along the way to getting good error messages for bad archives. I consulted with Lang Hames on this work, and the goal was to add a bit of "where" in the archive the error occurred along with what the error was. So this step changes ArchiveMemberHeader into a class with a pointer to the archive header and the parent archive. Which allows the methods in the ArchiveMemberHeader to determine which member the header is for to include that information in the error message. For this first step the "where" is just the offset to the member in the archive. The next step will be a new method on ArchiveMemberHeader to get the full name, if possible, to be use in the error message. Which will now be possible as ArchiveMemberHeader contains a pointer to the Archive with its string table and its size, etc. so the full name can be determined from the header if it is valid. Also this change adds the missing checks the archive header is actually contained in the buffer and is not truncated, as well as if the terminating characters are correct in the header. And changes one error message in Archive::Child::getNext() where the name or offset to member is now added. llvm-svn: 276686	2016-07-25 20:36:36 +00:00
Jan Vesely	b64c8925e9	AMDGPU: Remove read_workdim intrinsic Differential revision: https://reviews.llvm.org/D22732 llvm-svn: 276682	2016-07-25 20:17:02 +00:00
Matt Arsenault	7cddfed7e8	Scalarizer: Support scalarizing intrinsics llvm-svn: 276681	2016-07-25 20:02:54 +00:00
Matt Arsenault	e047b2598e	AMDGPU: Fix missing verify-machineinstrs in control flow test llvm-svn: 276679	2016-07-25 19:39:06 +00:00
Evgeniy Stepanov	8d78bd5041	Fix invalid iterator use in safestack coloring. llvm-svn: 276676	2016-07-25 19:25:40 +00:00
Rong Xu	705f7775bb	[PGO] Fix profile mismatch in COMDAT function with pre-inliner Pre-instrumentation inline (pre-inliner) greatly improves the IR instrumentation code performance, among other benefits. One issue of the pre-inliner is it can introduce CFG-mismatch for COMDAT functions. This is due to the fact that the same COMDAT function may have different early inline decisions across different modules -- that means different copies of COMDAT functions will have different CFG checksum. In this patch, we propose a partially renaming the COMDAT group and its member function/variable so we have different profile counter for each version. We will post-fix the COMDAT function and the group name with its FunctionHash. Differential Revision: http://reviews.llvm.org/D22600 llvm-svn: 276673	2016-07-25 18:45:37 +00:00
Simon Pilgrim	ce8d82775c	[X86][SSE] Added 2048-bit vector comparison tests Upper limit of what can be held in a <32 x i8> result llvm-svn: 276666	2016-07-25 17:56:01 +00:00
Elena Demikhovsky	64e5f929d0	AVX-512: Fixed [US]INT_TO_FP selection for i1 vectors. It failed with assertion before this patch. Differential Revision: https://reviews.llvm.org/D22735 llvm-svn: 276648	2016-07-25 16:51:00 +00:00
Wei Mi	97de034e18	Remove useless pass from the pipeline in test/Analysis/Dominators/2007-01-14-BreakCritEdges.ll. llvm-svn: 276644	2016-07-25 16:27:34 +00:00
Krzysztof Parzyszek	080bebd212	[Hexagon] Add target feature to generate long calls llvm-svn: 276638	2016-07-25 14:42:11 +00:00
Sam Parker	d5ca0a65b5	[ARM] Improve longMAC codegen test Added thumb targets and dataflow checks to the longMAC test. Differential Revision: https://reviews.llvm.org/D22684 llvm-svn: 276629	2016-07-25 10:11:00 +00:00
Simon Dardis	618975206e	[mips] Optimize materialization of i64 constants Avoid MipsAnalyzeImmediate usage if the constant fits in an 32-bit integer. This allows us to generate the same instructions for the materialization of the same constants regardless the width of their type. Patch by: Vasileios Kalintiris Contributions by: Simon Dardis Reviewers: Daniel Sanders Differential Review: https://reviews.llvm.org/D21689 llvm-svn: 276628	2016-07-25 09:57:28 +00:00
Sam Parker	68c71cd1e4	[ARM] Enable ISel of SMMLS for ARM and Thumb2 Use ISelDAGToDAG to recognise the SMMLS instruction pattern. Differential Revision: https://reviews.llvm.org/D22562 llvm-svn: 276624	2016-07-25 09:20:20 +00:00
Craig Topper	ce415ff9c5	[AVX512] Add load folding support for the unmasked forms of the FMA instructions. llvm-svn: 276615	2016-07-25 07:20:35 +00:00
Craig Topper	318e40b6f7	[AVX512] Add some additional patterns so that we can fold broadcast loads in the first argument of an FMADD/FMSUB/FNMADD/FNMSUB/FMADDSUB/FMSUBADD node. Also add patterns to support all combinations of the broadcast input and the preserved input for masked versions. llvm-svn: 276614	2016-07-25 07:20:31 +00:00
Craig Topper	6bcbf5338c	[AVX512] Cleanup FMA operand order in patterns to match the VEX versions and to really be 213, 231, and 132. llvm-svn: 276613	2016-07-25 07:20:28 +00:00
Sean Silva	fe5abd5e0c	Fix : Partial Inliner requires AssumptionCacheTracker The public InlineFunction utility assumes that the passed in InlineFunctionInfo has a valid AssumptionCacheTracker. Patch by River Riddle! Differential Revision: https://reviews.llvm.org/D22706 llvm-svn: 276609	2016-07-25 05:00:00 +00:00
David Majnemer	68623a0e9f	[GVNHoist] Merge metadata on hoisted instructions less conservatively We can combine metadata from multiple instructions intelligently for certain metadata nodes. llvm-svn: 276602	2016-07-25 02:21:25 +00:00
David Majnemer	4728569d0a	[GVNHoist] Properly merge alignments when hoisting If we two loads of two different alignments, we must use the minimum of the two alignments when hoisting. Same deal for stores. For allocas, use the maximum of the two allocas. llvm-svn: 276601	2016-07-25 02:21:23 +00:00
Simon Pilgrim	a6878bdc0f	[X86][SSE] Added PR27854 tests llvm-svn: 276571	2016-07-24 16:39:50 +00:00
Simon Pilgrim	b7d75fee74	[X86] Add shift double tests for PR14593 llvm-svn: 276570	2016-07-24 16:10:21 +00:00
Simon Pilgrim	381a0ade5a	[X86] Add 'FeatureSlowSHLD' to cpu 'bdver4' As with all AMD CPUs, excavator has poor SHLD/SHRD performance. Also added bdver3 to the test as it was missing. llvm-svn: 276569	2016-07-24 16:00:53 +00:00
Simon Pilgrim	b2df2dc298	[X86] Add SHRD shift combine tests llvm-svn: 276568	2016-07-24 15:47:44 +00:00
Simon Pilgrim	0ededcb344	[X86] Regenerate shift by parts tests llvm-svn: 276567	2016-07-24 15:38:51 +00:00
Simon Pilgrim	114205076d	[X86][SSE] Regenerate shifts tests llvm-svn: 276566	2016-07-24 15:25:36 +00:00
Simon Pilgrim	30a7cc2e1f	[X86][SSE] Regenerate SSE copysign tests llvm-svn: 276565	2016-07-24 15:17:50 +00:00
Simon Pilgrim	7f1c1db73c	[X86][AVX512VL] Added AVX512VL half2float vector conversions tests to demonstrate PR23941 llvm-svn: 276563	2016-07-24 13:01:51 +00:00
Craig Topper	2dca3b287b	[X86] Make the FMA3 instruction names consistent between VEX and EVEX encoded versions. This places the 132/213/231 form number in front of the SS/SD/PS/PD. Move the Y for 256-bit versions to be after the PS/PD. Change the AVX512 scalar forms to include a Z in the their name. This new format should be consistent with the general naming of instructions. llvm-svn: 276559	2016-07-24 08:26:38 +00:00
Elena Demikhovsky	376a18bd92	[Loop Vectorizer] Handling loops FP induction variables. Allowed loop vectorization with secondary FP IVs. Like this: float *A; float x = init; for (int i=0; i < N; ++i) { A[i] = x; x -= fp_inc; } The auto-vectorization is possible when the induction binary operator is "fast" or the function has "unsafe" attribute. Differential Revision: https://reviews.llvm.org/D21330 llvm-svn: 276554	2016-07-24 07:24:54 +00:00
Simon Pilgrim	9e99dd9e8b	[X86][SSE] Added float widened broadcast tests llvm-svn: 276535	2016-07-23 21:24:02 +00:00
Simon Pilgrim	8d18969716	[X86][SSE] Added more widened broadcast tests Added more vXi16 and vXi8 tests llvm-svn: 276534	2016-07-23 21:15:31 +00:00
Simon Pilgrim	b9e47a8cd0	[X86][SSE] Added tests where we should be trying to widen a load+splat into a broadcast llvm-svn: 276527	2016-07-23 16:19:17 +00:00
Simon Pilgrim	8aa6f34455	[X86][SSE] Regenerated uitofp <2 x i32> -> <2 x float> conversion tests Demonstrate difference in codegen discussed on PR14760 llvm-svn: 276526	2016-07-23 15:55:42 +00:00
Sanjay Patel	1271bf9178	[InstCombine] allow icmp (bit-manipulation-intrinsic(), C) folds for vectors llvm-svn: 276523	2016-07-23 13:06:49 +00:00
Craig Topper	b6519db90d	[AVX512] Implement commuting support for EVEX encoded FMA3 instructions. llvm-svn: 276521	2016-07-23 07:16:56 +00:00
Xinliang David Li	9239245401	[Profile] Use explicit flag to enable IR PGO Patch by Jake VanAdrighem Differential Revision: http://reviews.llvm.org/D22607 llvm-svn: 276516	2016-07-23 04:28:52 +00:00
Sanjoy Das	a7d9ec8751	[SCEV] Make isImpliedCondOperandsViaRanges smarter This change lets us prove things like "{X,+,10} s< 5000" implies "{X+7,+,10} does not sign overflow" It does this by replacing replacing getConstantDifference by computeConstantDifference (which is smarter) in isImpliedCondOperandsViaRanges. llvm-svn: 276505	2016-07-23 00:54:36 +00:00
Sanjay Patel	8d8594acb9	auto-generate checks llvm-svn: 276501	2016-07-23 00:09:54 +00:00
Tom Stellard	b8253c88b6	Revert "[AMDGPU] Emit read-only data to .rodata for hsa" This reverts commit r276298. Data stored in .rodata can have a negative offset from .text, but we don't support negative values in relocations yet. This caused a regression in one of the amp conformance tests: 5_Data_Cont/5_2_a_v/5_2_3_m/Assignment/Test.02.01 llvm-svn: 276498	2016-07-22 23:46:40 +00:00
Adam Nemet	9e6e63fba2	[LoopDataPrefetch] Include hotness of region in opt remark llvm-svn: 276488	2016-07-22 22:53:17 +00:00
Sanjay Patel	e063ddb347	add tests for icmp vector folds llvm-svn: 276482	2016-07-22 22:19:52 +00:00
Tim Northover	98a56eb7f4	GlobalISel: allow multiple types on MachineInstrs. llvm-svn: 276481	2016-07-22 22:13:36 +00:00
Vitaly Buka	e3a032a740	Unpoison stack before resume instruction Summary: Clang inserts cleanup code before resume similar way as before return instruction. This makes asan poison local variables causing false use-after-scope reports. __asan_handle_no_return does not help here as it was executed before llvm.lifetime.end inserted into resume block. To avoid false report we need to unpoison stack for resume same way as for return. PR27453 Reviewers: kcc, eugenis Differential Revision: https://reviews.llvm.org/D22661 llvm-svn: 276480	2016-07-22 22:04:38 +00:00
Alina Sbirlea	ba21ffebff	Add flag to PassManagerBuilder to disable GVN Hoist Pass. Summary: Adding a flag to diable GVN Hoisting by default. Note: The GVN Hoist Pass causes some Halide tests to hang. Halide will disable the pass while investigating. Reviewers: llvm-commits, chandlerc, spop, dberlin Subscribers: mehdi_amini Differential Revision: https://reviews.llvm.org/D22639 llvm-svn: 276479	2016-07-22 22:02:19 +00:00
Michael Kuperstein	38e7298093	[SLPVectorizer] Vectorize reverse-order loads in horizontal reductions When vectorizing a tree rooted at a store bundle, we currently try to sort the stores before building the tree, so that the stores can be vectorized. For other trees, the order of the root bundle - which determines the order of all other bundles - is arbitrary. That is bad, since if a leaf bundle of consecutive loads happens to appear in the wrong order, we will not vectorize it. This is partially mitigated when the root is a binary operator, by trying to build a "reversed" tree when that's considered profitable. This patch extends the workaround we have for binops to trees rooted in a horizontal reduction. This fixes PR28474. Differential Revision: https://reviews.llvm.org/D22554 llvm-svn: 276477	2016-07-22 21:28:48 +00:00
Sanjay Patel	cbc4377af1	add tests for icmp vector folds llvm-svn: 276476	2016-07-22 21:28:20 +00:00
Sanjay Patel	97e61dcc2d	add tests for icmp vector folds llvm-svn: 276475	2016-07-22 21:13:08 +00:00
Sanjay Patel	b73d7aed71	add tests for icmp vector folds llvm-svn: 276472	2016-07-22 21:02:33 +00:00
Vedant Kumar	127d0502a0	[llvm-cov] Don't copy stylesheets into index files Just link in the stylesheet from the toplevel dir of the report. llvm-svn: 276468	2016-07-22 20:49:23 +00:00
Sanjay Patel	859278005d	update to use FileCheck and auto-generate checks llvm-svn: 276466	2016-07-22 20:39:07 +00:00
Sanjay Patel	296a776a5b	add tests for icmp vector folds llvm-svn: 276464	2016-07-22 20:11:08 +00:00
Tim Northover	33b07d6725	GlobalISel: implement legalization pass, with just one transformation. This adds the actual MachineLegalizeHelper to do the work and a trivial pass wrapper that legalizes all instructions in a MachineFunction. Currently the only transformation supported is splitting up a vector G_ADD into one acting on smaller vectors. llvm-svn: 276461	2016-07-22 20:03:43 +00:00
Teresa Johnson	f432c9cefa	[ThinLTO/gold] Remove thin archive part of new test due to bot failures I am getting a bot failure from the thin archive part of this test: From http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/40468/steps/test_llvm/logs/LLVM%20%3A%3A%20tools__gold__X86__thinlto_emit_linked_objects.ll: Command Output (stderr): -- /home/bb/cmake-llvm-x86_64-linux/build/./bin/llvm-ar: creating /home/bb/cmake-llvm-x86_64-linux/build/test/tools/gold/X86/Output/thinlto_emit_linked_objects.ll.tmp2.a /usr/bin/ld.gold: internal error in add_writer, at ../../gold/token.h:124 -- This appears to be an issue with an older version of gold. The test case passes for me locally when I use the gold v1.12 I was testing with, but when I tried the gold installed on my system which is v1.11 I get the same error. Remove the thin archive version of the test, since there isn't a way to predicate it on gold version. llvm-svn: 276453	2016-07-22 18:32:30 +00:00
Jun Bum Lim	6a7dc5c430	Recommit - [DSE]Enhance shorthening MemIntrinsic based on OverlapIntervals Recommiting r275571 after fixing crash reported in PR28270. Now we erase elements of IOL in deleteDeadInstruction(). Original Summary: This change use the overlap interval map built from partial overwrite tracking to perform shortening MemIntrinsics. Add test cases which was missing opportunities before. llvm-svn: 276452	2016-07-22 18:27:24 +00:00
Sanjay Patel	beaea95a0d	add tests for vector bit manipulation intrinsics llvm-svn: 276451	2016-07-22 18:22:25 +00:00
Teresa Johnson	1e2708c9e0	[ThinLTO/gold] Support for getting list of included objects from gold Summary: In the distributed backend case, the ThinLink step and the final native object link are separate processes. This can be problematic when archive libraries are involved in the link (e.g. via --start-lib/--end-lib pairs). The linker only includes objects from libraries when there is a strong reference to them, and depending on the intervening ThinLTO backend processes' importing/inlining, the strong references may appear different in the two link steps. See D22356 and D22467 for two scenarios where this causes issues. To ensure that the final link includes the same objects, this patch adds support for an "=filename" form of the thinlto-index-only plugin option, in which case objects gold included in the link are emitted to the given filename. This should be used as input to the final link (e.g. via the @filename option to gold), instead of listing all the objects within --start-lib/--end-lib pairs again. Note that the support for the gold callback that identifies included objects was added in gold version 1.12. Reviewers: davidxl, mehdi_amini Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D22677 llvm-svn: 276450	2016-07-22 18:20:22 +00:00
Wei Mi	e04d0eff29	[PM] Port BreakCriticalEdges to the new PM. Differential Revision: https://reviews.llvm.org/D22688 llvm-svn: 276449	2016-07-22 18:04:25 +00:00
Anna Thomas	0be4a0e6a4	Invariant start/end intrinsics overloaded for address space Summary: The llvm.invariant.start and llvm.invariant.end intrinsics currently support specifying invariant memory objects only in the default address space. With this change, these intrinsics are overloaded for any adddress space for memory objects and we can use these llvm invariant intrinsics in non-default address spaces. Example: llvm.invariant.start.p1i8(i64 4, i8 addrspace(1)* %ptr) This overloaded intrinsic is needed for representing final or invariant memory in managed languages. Reviewers: apilipenko, reames Subscribers: llvm-commits llvm-svn: 276447	2016-07-22 17:49:40 +00:00
Matt Arsenault	e2fe67b951	AMDGPU: Remove redundant test llvm-svn: 276439	2016-07-22 17:01:36 +00:00
Matt Arsenault	3c07c813c0	AMDGPU: Fix groupstaticsize for large LDS The size can exceed s_movk_i32's limit, and we don't want to use it this early since it inhibits optimizations. This should probably be merged to the release branch. llvm-svn: 276438	2016-07-22 17:01:33 +00:00
Matt Arsenault	8d718dcfda	AMDGPU: Add HSA dispatch id intrinsic llvm-svn: 276437	2016-07-22 17:01:30 +00:00
Matt Arsenault	7fb961f3e6	AMDGPU: Fix i1 fp_to_int R600's i1 fp_to_uint selected but was incorrect according to what instcombine constant folds to. llvm-svn: 276435	2016-07-22 17:01:21 +00:00
Tim Northover	bd5054602e	GlobalISel: implement alloca instruction llvm-svn: 276433	2016-07-22 16:59:52 +00:00
Simon Pilgrim	820f87a72d	[SelectionDAG] Optimization of BITREVERSE legalization for power-of-2 integer scalar/vector types An extension of D19978, this patch replaces the default BITREVERSE evaluation of individual bit masks+shifts with block mask+shifts when we have integer elements of power-of-2 bits in size. After calling BSWAP to reverse the order of the constituent bytes (which typically follows a similar approach), every neighbouring 4-bits, 2-bits and finally 1-bit pairs are masked off and swapped over with shifts. In doing so we can significantly reduce the number of operations required. Differential Revision: https://reviews.llvm.org/D21578 llvm-svn: 276432	2016-07-22 16:46:25 +00:00
Krzysztof Parzyszek	d3d0a4bda3	[Hexagon] Use loop data prefetch on Hexagon llvm-svn: 276422	2016-07-22 14:22:43 +00:00
Simon Pilgrim	ea0d4f9962	[X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128 (reapplied) As reported on PR26235, we don't currently make use of the VBROADCASTF128/VBROADCASTI128 instructions (or the AVX512 equivalents) to load+splat a 128-bit vector to both lanes of a 256-bit vector. This patch enables lowering from subvector insertion/concatenation patterns and auto-upgrades the llvm.x86.avx.vbroadcastf128.pd.256 / llvm.x86.avx.vbroadcastf128.ps.256 intrinsics to match. We could possibly investigate using VBROADCASTF128/VBROADCASTI128 to load repeated constants as well (similar to how we already do for scalar broadcasts). Reapplied with fix for PR28657 - removed intrinsic definitions (clang companion patch to be be submitted shortly). Differential Revision: https://reviews.llvm.org/D22460 llvm-svn: 276416	2016-07-22 13:58:44 +00:00
Ahmed Bougacha	29333c9de6	[FastISel] Ignore @llvm.assume. llvm-svn: 276410	2016-07-22 12:54:53 +00:00
Ying Yi	e59ee43cf1	[llvm-cov] - Add the coverage of lines in the summary report. The llvm-cov ‘report' command displays a summary of the coverage of a binary file. The summary report currently only includes covered regions and covered functions. This patch adds the coverage of lines in the summary report. Differential Revision: https://reviews.llvm.org/D22569 llvm-svn: 276409	2016-07-22 12:46:13 +00:00
Benjamin Kramer	a81f4728f3	[llvm-profdata] Bring back reading profile data from STDIN. This feature was lost in r276197. llvm-svn: 276407	2016-07-22 12:39:55 +00:00
Benjamin Kramer	5ba0e20315	Revert "[X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128" It caused PR28657. This reverts commit r276281. llvm-svn: 276405	2016-07-22 11:03:10 +00:00
Ying Yi	60a3da3f4c	[llvm-cov] - Improve llvm-cov error message Summary: When giving the following command: % llvm-cov report -instr-profile=default.profraw llvm-cov will give the following error message: >llvm-cov report: Not enough positional command line arguments specified! >Must specify at least 1 positional arguments: See: orbis-llvm-cov report -help This patch changes the error message from '1 positional arguments' to '1 positional argument'. Differential Revision: https://reviews.llvm.org/D22621 llvm-svn: 276404	2016-07-22 10:52:21 +00:00
Hrvoje Varga	2db00ce4b6	[mips][microMIPS] Implement SLT, SLTI, SLTIU, SLTU microMIPS32r6 instructions Differential Revision: https://reviews.llvm.org/D19906 llvm-svn: 276397	2016-07-22 07:18:33 +00:00
Craig Topper	52e2e8381b	[AVX512] Add ExeDomain to vector extend and truncate instructions. llvm-svn: 276394	2016-07-22 05:46:44 +00:00
Craig Topper	f4151bea72	[AVX512] Add initial support for the Execution Domain fixing pass to change some EVEX instructions. llvm-svn: 276393	2016-07-22 05:00:52 +00:00
Craig Topper	0b90756b0a	[AVX512] Add load folding for some AVX512VL logic and arithmetic instructions. llvm-svn: 276391	2016-07-22 05:00:39 +00:00
Craig Topper	ab13b33ded	[AVX512] Update X86InstrInfo::foldMemoryOperandCustom to handle the EVEX encoded instructions too. llvm-svn: 276390	2016-07-22 05:00:35 +00:00
David Majnemer	522a91181a	Don't remove side effecting instructions due to ConstantFoldInstruction Just because we can constant fold the result of an instruction does not imply that we can delete the instruction. It may have side effects. This fixes PR28655. llvm-svn: 276389	2016-07-22 04:54:44 +00:00
Vitaly Buka	53054a7024	Fix detection of stack-use-after scope for char arrays. Summary: Clang inserts GetElementPtrInst so findAllocaForValue was not able to find allocas. PR27453 Reviewers: kcc, eugenis Differential Revision: https://reviews.llvm.org/D22657 llvm-svn: 276374	2016-07-22 00:56:17 +00:00
Sanjoy Das	aae623f4c2	[IRCE] Don't misuse CHECK-LABEL; NFC llvm-svn: 276373	2016-07-22 00:41:02 +00:00
Sanjoy Das	bb969791b4	[IRCE] Add an option to skip profitability checks If `-irce-skip-profitability-checks` is passed in, IRCE will kick in in all cases where it is legal for it to kick in. This flag is intended to help diagnose and analyse performance issues. llvm-svn: 276372	2016-07-22 00:40:56 +00:00
Vedant Kumar	fa522ca3b3	[llvm-cov] Strengthen a test case Check that stylesheets work when we're not using -output-dir. llvm-svn: 276363	2016-07-21 23:31:26 +00:00
Vedant Kumar	c076c49076	[llvm-cov] Use relative paths to the stylesheet (for html reports) This makes it easy to swap out the default stylesheet for a custom one. It also shaves ~6.62 MB out of the report directory for a full coverage build of llvm+clang. While we're at it, prune the CSS and add tests for it. llvm-svn: 276359	2016-07-21 23:26:15 +00:00
Sebastian Pop	31fd506623	GVH-hoist: only clone GEPs (PR28606) Do not clone stored values unless they are GEPs that are special cased to avoid hoisting them without hoisting their associated ld/st. Differential revision: https://reviews.llvm.org/D22652 llvm-svn: 276358	2016-07-21 23:22:10 +00:00
Wei Mi	1cf58f8996	[PM] Port NaryReassociate to the new PM Differential Revision: https://reviews.llvm.org/D22648 llvm-svn: 276349	2016-07-21 22:28:52 +00:00
Quentin Colombet	ecd81a3d1b	[MIRTesting] Abort when failing to parse a function. When we failed to parse a function in the mir parser, we should abort the whole compilation instead of continuing in a weird state. Indeed, this was creating strange machine function passes failures that were hard to understand, until we notice that the function actually did not get parsed correctly! llvm-svn: 276348	2016-07-21 22:25:57 +00:00
Michael Kuperstein	c523333bbf	[X86] Do not use AND8ri8 in AVX512 pattern This variant is (as documented in the TD) for disassembler use only, and should not be used in patterns - it is longer, and is broken on 64-bit. llvm-svn: 276347	2016-07-21 22:24:08 +00:00
Sanjay Patel	e9fc79bb13	[InstSimplify] don't crash handling a pointer or aggregate type llvm-svn: 276345	2016-07-21 21:56:00 +00:00
Akira Hatanaka	b8d2873d93	[AArch64][Inline-Asm] Return the 32-bit floating point register class when constraint "w" is used on a 32-bit operand. This enables compiling the following code, which used to error out in the backend: void foo1(int a) { asm volatile ("sqxtn h0, %s0\n" : : "w"(a):); } Fixes PR28633. llvm-svn: 276344	2016-07-21 21:39:05 +00:00
Sanjay Patel	a3bfb4e313	[InstSimplify] recognize trunc + icmp sgt/slt variants of select simplifications (PR28466) rL245171 exposed a hole in InstSimplify that manifested in a strange way in PR28466: https://llvm.org/bugs/show_bug.cgi?id=28466 It's possible to use trunc + icmp sgt/slt in place of an and + icmp eq/ne, so we need to recognize that pattern to eliminate selects that are choosing between some value and some bitmasked version of that value. Note that there is significant room for improvement (refactoring) and enhancement (more patterns, possibly in InstCombine rather than here). Differential Revision: https://reviews.llvm.org/D22537 llvm-svn: 276341	2016-07-21 21:26:45 +00:00
Adam Nemet	84a6425d61	[OptDiag,LDist] Convert remaining opt remarks to use the new API llvm-svn: 276340	2016-07-21 21:21:34 +00:00
Matthew Simpson	102729cf1b	[LV] Move vector int induction update to end of latch This patch moves the update instruction for vectorized integer induction phi nodes to the end of the latch block. This ensures consistent placement of all induction updates across all the kinds of int inductions we create (scalar, splat vector, or vector phi). Differential Revision: https://reviews.llvm.org/D22416 llvm-svn: 276339	2016-07-21 21:20:15 +00:00
Sanjay Patel	9eec550a2b	add vector tests and a simpler version of the negative tests llvm-svn: 276328	2016-07-21 20:11:08 +00:00
Anna Thomas	c858faa244	Revert "Invariant start/end intrinsics overloaded for address space" This reverts commit r276316. llvm-svn: 276320	2016-07-21 19:06:28 +00:00
Anna Thomas	29b24dfe44	Invariant start/end intrinsics overloaded for address space Summary: The llvm.invariant.start and llvm.invariant.end intrinsics currently support specifying invariant memory objects only in the default address space. With this change, these intrinsics are overloaded for any adddress space for memory objects and we can use these llvm invariant intrinsics in non-default address spaces. Example: llvm.invariant.start.p1i8(i64 4, i8 addrspace(1)* %ptr) This overloaded intrinsic is needed for representing final or invariant memory in managed languages. Reviewers: tstellarAMD, reames, apilipenko Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D22519 llvm-svn: 276316	2016-07-21 18:41:44 +00:00
Quentin Colombet	2b59eab79f	[IRTranslator] Add G_SUB opcode. This commit adds a generic SUB opcode to global-isel. llvm-svn: 276308	2016-07-21 17:26:50 +00:00
Konstantin Zhuravlyov	3c0d8d22fe	[AMDGPU] Emit read-only data to .rodata for hsa Differential Revision: https://reviews.llvm.org/D22538 llvm-svn: 276298	2016-07-21 15:59:23 +00:00
Quentin Colombet	7bcc921dd8	[IRTranslator] Add G_AND opcode. This commit adds a generic AND opcode to global-isel. llvm-svn: 276297	2016-07-21 15:50:42 +00:00
Konstantin Zhuravlyov	155626238b	AMDGPU/SI: Add support for R_AMDGPU_ABS32 Differential Revision: https://reviews.llvm.org/D21646 llvm-svn: 276294	2016-07-21 15:29:19 +00:00
Geoff Berry	4ff2e36d32	[AArch64] Load/store opt: Don't count transient instructions towards search limits. Summary: This change also changes findMatchingInsn and findMatchingUpdateInsnForward to take DBG_VALUE opcodes into account when tracking register defs and uses, which could potentially inhibit these optimizations in the presence of debug information. Reviewers: mcrosier Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D22582 llvm-svn: 276293	2016-07-21 15:20:25 +00:00
Simon Pilgrim	88e0940d3b	[X86][SSE] Allow folding of store/zext with PEXTRW of 0'th element Under normal circumstances we prefer the higher performance MOVD to extract the 0'th element of a v8i16 vector instead of PEXTRW. But as detailed on PR27265, this prevents the SSE41 implementation of PEXTRW from folding the store of the 0'th element. Additionally it prevents us from making use of the fact that the (SSE2) reg-reg version of PEXTRW implicitly zero-extends the i16 element to the i32/i64 destination register. This patch only preferentially lowers to MOVD if we will not be zero-extending the extracted i16, nor prevent a store from being folded (on SSSE41). Fix for PR27265. Differential Revision: https://reviews.llvm.org/D22509 llvm-svn: 276289	2016-07-21 14:54:17 +00:00
Simon Pilgrim	4caefdf834	Fixed line endings llvm-svn: 276287	2016-07-21 14:36:41 +00:00
Simon Pilgrim	c8e20b1150	[X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128 As reported on PR26235, we don't currently make use of the VBROADCASTF128/VBROADCASTI128 instructions (or the AVX512 equivalents) to load+splat a 128-bit vector to both lanes of a 256-bit vector. This patch enables lowering from subvector insertion/concatenation patterns and auto-upgrades the llvm.x86.avx.vbroadcastf128.pd.256 / llvm.x86.avx.vbroadcastf128.ps.256 intrinsics to match. We could possibly investigate using VBROADCASTF128/VBROADCASTI128 to load repeated constants as well (similar to how we already do for scalar broadcasts). Differential Revision: https://reviews.llvm.org/D22460 llvm-svn: 276281	2016-07-21 14:10:54 +00:00
Marina Yatsina	c1fa163392	ExecutionDepsFix - Fix bug in clearance calculation The clearance calculation did not take into account registers defined as outputs or clobbers in inline assembly machine instructions because these register defs are implicit. Differential Revision: http://reviews.llvm.org/D22580 llvm-svn: 276266	2016-07-21 12:37:07 +00:00
Matt Arsenault	f0ba86a4d5	AMDGPU: Fix phis from blocks split due to register indexing llvm-svn: 276257	2016-07-21 09:40:57 +00:00
David Majnemer	825e4ab9e3	[GVNHoist] Preserve optimization hints which agree If we have optimization hints with agree with each other along different paths, preserve them. llvm-svn: 276248	2016-07-21 07:16:26 +00:00
David Majnemer	4808f26422	[GVNHoist] Don't wrongly preserve TBAA We hoisted loads/stores without taking into account which can cause miscompiles. llvm-svn: 276240	2016-07-21 05:59:53 +00:00
Matthias Braun	d9fdad72ae	IPRA: Fix RegMask calculation for alias registers This patch fixes a very subtle bug in regmask calculation. Thanks to zan jyu Wong <zyfwong@gmail.com> for bringing this to notice. For example if CL is only clobbered than CH should not be marked clobbered but CX, RCX and ECX should be mark clobbered. Previously for each modified register all of its aliases are marked clobbered by markRegClobbred() in RegUsageInfoCollector.cpp but that is wrong because when CL is clobbered then MRI::isPhysRegModified() will return true for CL, CX, ECX, RCX which is correct behavior but then for CX, EXC, RCX we mark CH also clobbered as CH is aliased to CX,ECX,RCX so markRegClobbred() is not required because isPhysRegModified already take cares of proper aliasing register. A very simple test case has been added to verify this change. Please find relevant bug report here : http://llvm.org/PR28567 Patch by Vivek Pandya <vivekvpandya@gmail.com> Differential Revision: https://reviews.llvm.org/D22400 llvm-svn: 276235	2016-07-21 03:50:39 +00:00
Adam Nemet	7cfd5971ab	[OptDiag,LV] Add hotness attribute to applied-optimization remarks Test coverage is provided by modifying the function in the FP-math testcase that we are allowed to vectorize. llvm-svn: 276223	2016-07-21 01:07:13 +00:00
Sanjay Patel	0753c06d9c	[InstCombine] LogicOpc (zext X), C --> zext (LogicOpc X, C) (PR28476) The benefits of this change include: 1. Remove DeMorgan-matching code that was added specifically to work-around the missing transform in http://reviews.llvm.org/rL248634. 2. Makes the DeMorgan transform work for vectors too. 3. Fix PR28476: https://llvm.org/bugs/show_bug.cgi?id=28476 Extending this transform to other casts and other associative operators may be useful too. See https://reviews.llvm.org/D22421 for a prerequisite for doing that though. Differential Revision: https://reviews.llvm.org/D22271 llvm-svn: 276221	2016-07-21 00:24:18 +00:00
Adam Nemet	0e0e2d5d26	[OptDiag,LV] Add hotness attribute to the derived analysis remarks This includes FPCompute and Aliasing. Testcase is based on no_fpmath.ll. llvm-svn: 276211	2016-07-20 23:50:32 +00:00
Sanjay Patel	5f3c70307d	[InstSimplify][InstCombine] don't crash when folding vector selects of icmp Differential Revision: https://reviews.llvm.org/D22602 llvm-svn: 276209	2016-07-20 23:40:01 +00:00
Xinliang David Li	fb64ebe313	Fix test failure on Win llvm-svn: 276202	2016-07-20 22:53:39 +00:00
Xinliang David Li	9a1bfcfa16	Reapply r276185 Fix the test case that should not depend on dir iteration order. llvm-svn: 276197	2016-07-20 22:24:52 +00:00
Justin Lebar	cd564c6b46	[NVPTX] Enable the load-store vectorizer on nvptx. Reviewers: tra Subscribers: jholewinski, arsenm, asbirlea Differential Revision: https://reviews.llvm.org/D22592 llvm-svn: 276196	2016-07-20 22:11:36 +00:00
Xinliang David Li	ce3f385eeb	Revert r276185 -- build bot failure llvm-svn: 276194	2016-07-20 21:50:38 +00:00
Adam Nemet	5b3a5cf6b0	[OptDiag,LV] Add hotness attribute to analysis remarks The earlier change added hotness attribute to missed-optimization remarks. This follows up with the analysis remarks (the ones explaining the reason for the missed optimization). llvm-svn: 276192	2016-07-20 21:44:26 +00:00
Artem Belevich	7e9c9a6582	[NVPTX] Renamed NVPTXLowerKernelArgs -> NVPTXLowerArgs. NFC. After r276153 the pass applies to both kernels and regular functions. Differential Revision: https://reviews.llvm.org/D22583 llvm-svn: 276189	2016-07-20 21:44:07 +00:00
Xinliang David Li	d0b867e3e5	[Profile] support directory reading in profile merging Differential Revision: http://reviews.llvm.org/D22560 llvm-svn: 276185	2016-07-20 21:31:29 +00:00
Ahmed Bougacha	a0cdd79070	[AArch64][FastISel] Select -O0 legal cmpxchg. At -O0, cmpxchg survives AtomicExpand: it's mostly straightforward to select it in fast-isel, and let the pseudo be expanded later. extractvalues on the result are the tricky part: the generic logic only works for legal types (and it would be painful to make it support illegal types), so we can only support i32/i64 cmpxchg. llvm-svn: 276183	2016-07-20 21:12:32 +00:00
Ahmed Bougacha	b0674d1143	[AArch64][FastISel] Select atomic stores into STLR. llvm-svn: 276182	2016-07-20 21:12:27 +00:00
David Majnemer	bd21012c6c	[GVNHoist] Don't hoist PHI nodes We hoisted PHIs without respecting their special insertion point in the block, leading to verfier errors. This fixes PR28626. llvm-svn: 276181	2016-07-20 21:05:01 +00:00
Davide Italiano	15ff2d6d0c	[SCCP] Zap multiple return values. We can replace the return values with undef if we replaced all the call uses with a constant/undef. Differential Revision: https://reviews.llvm.org/D22336 llvm-svn: 276174	2016-07-20 20:17:13 +00:00
Justin Lebar	a272c12b73	[LSV] Don't move stores across may-load instrs, and loosen restrictions on moving loads. Summary: Previously we wouldn't move loads/stores across instructions that had side-effects, where that was defined as may-write or may-throw. But this is not sufficiently restrictive: Stores can't safely be moved across instructions that may load. This patch also adds a DEBUG check that all instructions in our chain are either loads or stores. Reviewers: asbirlea Subscribers: llvm-commits, jholewinski, arsenm, mzolotukhin Differential Revision: https://reviews.llvm.org/D22547 llvm-svn: 276171	2016-07-20 20:07:37 +00:00
Justin Lebar	62b03e344e	[LSV] Vectorize up to side-effecting instructions. Summary: Previously if we had a chain that contained a side-effecting instruction, we wouldn't vectorize it at all. Now we'll vectorize everything that comes before the side-effecting instruction. Reviewers: asbirlea Subscribers: arsenm, jholewinski, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D22536 llvm-svn: 276170	2016-07-20 20:07:34 +00:00
Rui Ueyama	d8388aaecb	[pdbdump] Use the "flow" style to print out a sequence of uint32_t. Summary: Lists can be written either with "-" or "[]" in YAML. Differential Revision: https://reviews.llvm.org/D22579 llvm-svn: 276168	2016-07-20 19:41:47 +00:00
Tim Northover	62ae568bbb	GlobalISel: implement low-level type with just size & vector lanes. This should be all the low-level instruction selection needs to determine how to implement an operation, with the remaining context taken from the opcode (e.g. G_ADD vs G_FADD) or other flags not based on type (e.g. fast-math). llvm-svn: 276158	2016-07-20 19:09:30 +00:00
Artem Belevich	74158b5061	[NVPTX] deal with all aggregate return types. Fixes a crash in llvm_unreachable when a function has array return type. Differential Revision: https://reviews.llvm.org/D22524 llvm-svn: 276154	2016-07-20 18:39:52 +00:00
Artem Belevich	b2e76a5e7a	[NVPTX] Improve lowering of byval args of device functions. Avoid unnecessary spills of byval arguments of device functions to local space on SASS level and subsequent pointer conversion to generic address space that follows. Instead, make a local copy in IR, provide a way to access arguments directly, and let LLVM optimize the copy away when possible. Differential Review: https://reviews.llvm.org/D21421 llvm-svn: 276153	2016-07-20 18:39:47 +00:00
Sanjay Patel	c0812702f8	minimize tests and auto-generate checks llvm-svn: 276147	2016-07-20 17:58:20 +00:00
Wei Mi	481232e991	Fix test/Analysis/ScalarEvolution/scev-expander-existing-value-offset.ll for rL276136. The content in this testcase was accidentally duplicated. Fix the error. llvm-svn: 276139	2016-07-20 16:54:58 +00:00
Wei Mi	db80c0c77f	Use ValueOffsetPair to enhance value reuse during SCEV expansion. In D12090, the ExprValueMap was added to reuse existing value during SCEV expansion. However, const folding and sext/zext distribution can make the reuse still difficult. A simplified case is: suppose we know S1 expands to V1 in ExprValueMap, and S1 = S2 + C_a S3 = S2 + C_b where C_a and C_b are different SCEVConstants. Then we'd like to expand S3 as V1 - C_a + C_b instead of expanding S2 literally. It is helpful when S2 is a complex SCEV expr and S2 has no entry in ExprValueMap, which is usually caused by the fact that S3 is generated from S1 after const folding. In order to do that, we represent ExprValueMap as a mapping from SCEV to ValueOffsetPair. We will save both S1->{V1, 0} and S2->{V1, C_a} into the ExprValueMap when we create SCEV for V1. When S3 is expanded, it will first expand S2 to V1 - C_a because of S2->{V1, C_a} in the map, then expand S3 to V1 - C_a + C_b. Differential Revision: https://reviews.llvm.org/D21313 llvm-svn: 276136	2016-07-20 16:40:33 +00:00
Matt Arsenault	f14db7a933	AMDGPU: Add missing test coverage for control flow breaks None of the current lit tests hit si_break handling. llvm-svn: 276129	2016-07-20 15:20:35 +00:00
Yaxun Liu	4b1d9f7f18	AMDGPU: Fix bug causing crash due to invalid opencl version metadata. Differential Revision: https://reviews.llvm.org/D22526 llvm-svn: 276119	2016-07-20 14:38:06 +00:00
Benjamin Kramer	b4d64cf27d	Revert "[InstCombine] Enable cast-folding in logic(cast(icmp), cast(icmp))" Makes InstCombine infloop when compiling v8. This reverts commit r275989 and r276105. llvm-svn: 276106	2016-07-20 11:40:16 +00:00
Tobias Grosser	8c6201b49f	[InstCombine] Provide more test cases for cast-folding [NFC] Summary: In r275989 we enabled the folding of `logic(cast(icmp), cast(icmp))` to `cast(logic(icmp, icmp))`. Here we add more test cases to assure this folding works for all logical operations `and`/`or`/`xor`. Reviewers: grosser Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D22561 Contributed-by: Matthias Reisinger llvm-svn: 276105	2016-07-20 11:24:27 +00:00
Simon Pilgrim	1b4f511aaa	[X86][SSE] Add cost model values for CTPOP of vectors This patch adds costs for the vectorized implementations of CTPOP, the default values were seriously underestimating the cost of these and was encouraging vectorization on targets where serialized use of POPCNT would be much better. Differential Revision: https://reviews.llvm.org/D22456 llvm-svn: 276104	2016-07-20 10:41:28 +00:00
Diana Picus	f345d40ae2	[ARM] Skip inline asm memory operands in DAGToDAGISel Retry r275776 (no changes, we suspect the issue was with another commit). The current logic for handling inline asm operands in DAGToDAGISel interprets the operands by looking for constants, which should represent the flags describing the kind of operand we're dealing with (immediate, memory, register def etc). The operands representing actual data are skipped only if they are non-const, with the exception of immediate operands which are skipped explicitly when a flag describing an immediate is found. The oversight is that memory operands may be const too (e.g. for device drivers reading a fixed address), so we should explicitly skip the operand following a flag describing a memory operand. If we don't, we risk interpreting that constant as a flag, which is definitely not intended. Fixes PR26038 Differential Revision: https://reviews.llvm.org/D22103 llvm-svn: 276101	2016-07-20 09:48:24 +00:00
David Majnemer	a75736087d	Forgot to add a test for r276008. llvm-svn: 276082	2016-07-20 04:13:05 +00:00
David Majnemer	5d26127752	Revert "Disable this-return argument forwarding on ARM/AArch64" Inference of the 'returned' attribute was fixed in r276008, lets try turning the backend support back on. This reverts commit r275677. llvm-svn: 276081	2016-07-20 04:13:01 +00:00
Adam Nemet	67c8929a2c	[LV] Add hotness attribute to missed-optimization remarks The new OptimizationRemarkEmitter analysis pass is hooked up to both new and old PM passes. llvm-svn: 276080	2016-07-20 04:03:43 +00:00
Michael Zolotukhin	6bc56d552a	Revert "Revert r275883 and r275891. They seem to cause PR28608." This reverts commit r276064, and thus reapplies r275891 and r275883 with a fix for PR28608. llvm-svn: 276077	2016-07-20 01:55:27 +00:00
Justin Lebar	6114b37838	[LSV] Don't assume that loads/stores appear in address order in the BB. Summary: getVectorizablePrefix previously didn't work properly in the face of aliasing loads/stores. It unwittingly assumed that the loads/stores appeared in the BB in address order. If they didn't, it would do the wrong thing. Reviewers: asbirlea, tstellarAMD Subscribers: arsenm, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D22535 llvm-svn: 276072	2016-07-20 00:55:12 +00:00
Matthias Braun	5b9722d6c7	Revert "RegScavenging: Add scavengeRegisterBackwards()" Reverting this commit for now as it seems to be causing failures on test-suite tests on the clang-ppc64le-linux-lnt bot. This reverts commit r276044. llvm-svn: 276068	2016-07-20 00:21:32 +00:00
Sean Silva	554efb28d2	Revert r275883 and r275891. They seem to cause PR28608. Revert "[LoopSimplify] Update LCSSA after separating nested loops." This reverts commit r275891. Revert "[LCSSA] Post-process PHI-nodes created by SSAUpdate when constructing LCSSA form." This reverts commit r275883. llvm-svn: 276064	2016-07-19 23:54:29 +00:00
Sean Silva	e3c18a5ae8	[PM] Port LoopUnroll. We just set PreserveLCSSA to always true since we don't have an analogous method `mustPreserveAnalysisID(LCSSA)`. Also port LoopInfo verifier pass to test LoopUnrollPass. llvm-svn: 276063	2016-07-19 23:54:23 +00:00
Justin Lebar	8778c62629	[LSV] Insert stores at the right point. Summary: Previously, the insertion point for stores was the last instruction in Chain before calling getVectorizablePrefixEndIdx. Thus if getVectorizablePrefixEndIdx didn't return Chain.size(), we still would insert at the last instruction in Chain. This patch changes our internal API a bit in an attempt to make it less prone to this sort of error. As a result, we end up recalculating the Chain's boundary instructions, but I think worrying about the speed hit of this is a premature optimization right now. Reviewers: asbirlea, tstellarAMD Subscribers: mzolotukhin, arsenm, llvm-commits Differential Revision: https://reviews.llvm.org/D22534 llvm-svn: 276056	2016-07-19 23:19:20 +00:00
Justin Lebar	d9446d3770	[LSV] Add detail to correct-order.ll test. Summary: This helps keep us honest -- there were a number of ways we could screw up and still have passed this test. Reviewers: asbirlea Subscribers: llvm-commits, arsenm Differential Revision: https://reviews.llvm.org/D22531 llvm-svn: 276053	2016-07-19 23:18:59 +00:00
Matt Arsenault	a1fe17c9ad	AMDGPU: Change fdiv lowering based on !fpmath metadata If 2.5 ulp is acceptable, denormals are not required, and isn't a reciprocal which will already be handled, replace with a faster fdiv. Simplify the lowering tests by using per function subtarget features. llvm-svn: 276051	2016-07-19 23:16:53 +00:00
Paul Robinson	2d23c029f7	Make GVN Hoisting obey optnone/bisect. Differential Revision: http://reviews.llvm.org/D22545 llvm-svn: 276048	2016-07-19 22:57:14 +00:00
Matthias Braun	84fd4bee6c	RegScavenging: Add scavengeRegisterBackwards() This is a variant of scavengeRegister() that works for enterBasicBlockEnd()/backward(). The benefit of the backward mode is that it is not affected by incomplete kill flags. This patch also changes PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register scavenger in backwards mode. Differential Revision: http://reviews.llvm.org/D21885 llvm-svn: 276044	2016-07-19 22:37:09 +00:00
Sanjay Patel	d4ea94eb94	regenerate checks llvm-svn: 276042	2016-07-19 22:32:15 +00:00
Evandro Menezes	238fa76574	[AArch64] Properly validate the reciprocal estimation. Add check for legal data types when expanding into a Newton series. Differential Revision: https://reviews.llvm.org/D22267 llvm-svn: 276041	2016-07-19 22:31:11 +00:00
Sanjay Patel	2d477e59e8	[InstCombine] fold add(zext(xor X, C), C) --> sext X when C is INT_MIN in the source type The pattern may look more obviously like a sext if written as: define i32 @g(i16 %x) { %zext = zext i16 %x to i32 %xor = xor i32 %zext, 32768 %add = add i32 %xor, -32768 ret i32 %add } We already have that fold in visitAdd(). Differential Revision: https://reviews.llvm.org/D22477 llvm-svn: 276035	2016-07-19 22:09:34 +00:00
George Burgess IV	8b85321bae	[CFLAA] Make a test tell the truth. NFC. Dishonesty noted by Jia Chen. llvm-svn: 276028	2016-07-19 20:56:41 +00:00
George Burgess IV	3b059841ff	[CFLAA] Add some interproc. analysis to CFLAnders. This patch adds function summary support to CFLAnders. It also comes with a lot of tests! Woohoo! Patch by Jia Chen. Differential Revision: https://reviews.llvm.org/D22450 llvm-svn: 276026	2016-07-19 20:47:15 +00:00
Kevin Enderby	6524bd8c00	Next step along the way to getting good error messages for bad archives. This step builds on Lang Hames work to change Archive::child_iterator for better interoperation with Error/Expected. Building on that it is now possible to return an error message when the size field of an archive contains non-decimal characters. llvm-svn: 276025	2016-07-19 20:47:07 +00:00
Sanjay Patel	47c04f9543	add even more missing tests for simplifySelectBitTest() llvm-svn: 276024	2016-07-19 20:47:00 +00:00
Vedant Kumar	57faf2d208	[tsan] Don't instrument __llvm_gcov_global_state_pred or __llvm_gcda* r274801 did not go far enough to allow gcov+tsan to cooperate. With this commit it's possible to run the following code without false positives: std::thread T1(fib), T2(fib); T1.join(); T2.join(); llvm-svn: 276015	2016-07-19 20:16:08 +00:00
Tim Northover	554fbd05e8	ARM: move feature for Thumb2 pkhbt/pkhtb onto architectures. There's not much functional change, but it really is an architectural feature (on v6T2, v7A, v7R and v7EM) rather than something each CPU implements individually. The main functional change is the default behaviour you get when specifying only "-triple". llvm-svn: 276013	2016-07-19 19:49:13 +00:00
Ahmed Bougacha	5a59b24bdd	[GlobalISel] Mark newly-created gvregs as having a bank. Also verify that we never try to set the size of a vreg associated to a register class. Report an error when we encounter that in MIR. Fix a testcase that hit that error and had a size for no reason. llvm-svn: 276012	2016-07-19 19:48:36 +00:00
David Majnemer	5246e0b2c2	[FunctionAttrs] Correct the safety analysis for inference of 'returned' We skipped over ReturnInsts which didn't return an argument which would lead us to incorrectly conclude that an argument returned by another ReturnInst was 'returned'. This reverts commit r275756. This fixes PR28610. llvm-svn: 276008	2016-07-19 18:50:26 +00:00
David Majnemer	07ea344222	Add a testcase for r275581 llvm-svn: 276002	2016-07-19 17:52:41 +00:00
Sanjay Patel	8b76ebe5b8	add tests related to PR28466 llvm-svn: 275995	2016-07-19 17:07:35 +00:00
Simon Pilgrim	5366d0e0bc	[X86][AVX512] Added AVX512 subvector broadcast tests llvm-svn: 275994	2016-07-19 17:04:28 +00:00
Simon Pilgrim	f2d02cb0f6	[X86][AVX] Fixed typo in test names llvm-svn: 275992	2016-07-19 16:52:05 +00:00
Sanjay Patel	d2ff6d727f	add missing test for simplifySelectBitTest() llvm-svn: 275990	2016-07-19 16:49:55 +00:00
Tobias Grosser	1c38262279	[InstCombine] Enable cast-folding in logic(cast(icmp), cast(icmp)) Summary: Currently, InstCombine is already able to fold expressions of the form `logic(cast(A), cast(B))` to the simpler form `cast(logic(A, B))`, where logic designates one of `and`/`or`/`xor`. This transformation is implemented in `foldCastedBitwiseLogic()` in InstCombineAndOrXor.cpp. However, this optimization will not be performed if both `A` and `B` are `icmp` instructions. The decision to preclude casts of `icmp` instructions originates in r48715 in combination with r261707, and can be best understood by the title of the former one: > Transform (zext (or (icmp), (icmp))) to (or (zext (cimp), (zext icmp))) if at least one of the (zext icmp) can be transformed to eliminate an icmp. Apparently, it introduced a transformation that is a reverse of the transformation that is done in `foldCastedBitwiseLogic()`. Its purpose is to expose pairs of `zext icmp` that would subsequently be optimized by `transformZExtICmp()` in InstCombineCasts.cpp. Therefore, in order to avoid an endless loop of switching back and forth between these two transformations, the one in `foldCastedBitwiseLogic()` has been restricted to exclude `icmp` instructions which is mirrored in the responsible check: `if ((!isa<ICmpInst>(Cast0Src) \|\| !isa<ICmpInst>(Cast1Src)) && ...` This check seems to sort out more cases than necessary because: - the reverse transformation is obviously done for `or` instructions only - and also not every `zext icmp` pair is necessarily the result of this reverse transformation Therefore we now remove this check and replace it by a more finegrained one in `shouldOptimizeCast()` that now rejects only those `logic(zext(icmp), zext(icmp))` that would be able to be optimized by `transformZExtICmp()`, which also avoids the mentioned endless loop. That means we are now able to also simplify expressions of the form `logic(cast(icmp), cast(icmp))` to `cast(logic(icmp, icmp))` (`cast` being an arbitrary `CastInst`). As an example, consider the following IR snippet ``` %1 = icmp sgt i64 %a, %b %2 = zext i1 %1 to i8 %3 = icmp slt i64 %a, %c %4 = zext i1 %3 to i8 %5 = and i8 %2, %4 ``` which would now be transformed to ``` %1 = icmp sgt i64 %a, %b %2 = icmp slt i64 %a, %c %3 = and i1 %1, %2 %4 = zext i1 %3 to i8 ``` This issue became apparent when experimenting with the programming language Julia, which makes use of LLVM. Currently, Julia lowers its `Bool` datatype to LLVM's `i8` (also see https://github.com/JuliaLang/julia/pull/17225). In fact, the above IR example is the lowered form of the Julia snippet `(a > b) & (a < c)`. Like shown above, this may introduce `zext` operations, casting between `i1` and `i8`, which could for example hinder ScalarEvolution and Polly on certain code. Reviewers: grosser, vtjnash, majnemer Subscribers: majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D22511 Contributed-by: Matthias Reisinger llvm-svn: 275989	2016-07-19 16:39:17 +00:00
Simon Pilgrim	0ea8d275cc	[X86][SSE] Reimplement SSE fp2si conversion intrinsics instead of using generic IR D20859 and D20860 attempted to replace the SSE (V)CVTTPS2DQ and VCVTTPD2DQ truncating conversions with generic IR instead. It turns out that the behaviour of these intrinsics is different enough from generic IR that this will cause problems, INF/NAN/out of range values are guaranteed to result in a 0x80000000 value - which plays havoc with constant folding which converts them to either zero or UNDEF. This is also an issue with the scalar implementations (which were already generic IR and what I was trying to match). This patch changes both scalar and packed versions back to using x86-specific builtins. It also deals with the other scalar conversion cases that are runtime rounding mode dependent and can have similar issues with constant folding. A companion clang patch is at D22105 Differential Revision: https://reviews.llvm.org/D22106 llvm-svn: 275981	2016-07-19 15:07:43 +00:00
Sam Parker	6ca4bbb00d	[ARM] Refactor Thumb2 Mul and Mla instr descs Recommitting after r274347 was reverted. This patch introduces some classes to refactor the 3 and 4 register Thumb2 multiplication instruction descriptions, plus improved tests for some of those instructions. Differential Revision: https://reviews.llvm.org/D21929 llvm-svn: 275979	2016-07-19 14:44:05 +00:00
Peter Smith	cbcecca538	Add support for tlsldm assembler operator to ARM target The standard local dynamic model for TLS on ARM systems needs two relocations: - R_ARM_TLS_LDM32 (module idx) - R_ARM_TLS_LDO32 (offset of object from origin of module TLS block) In GNU style assembler we use symbol(tlsldm) and symbol(tlsldo) to produce these relocations. llvm-mc for ARM supports symbol(tlsldo) but does not support symbol(tlsldm). This patch wires up the existing symbol(tlsldm) to R_ARM_TLS_LDM32. TLS for ARM is defined in Addenda to, and Errata in, the ABI for the ARM Architecture Differential Revision: https://reviews.llvm.org/D22461 llvm-svn: 275977	2016-07-19 14:15:33 +00:00
Simon Pilgrim	b87a21f1c3	[AARCH64] Fix linu triple typo As promised in D22191 llvm-svn: 275976	2016-07-19 14:12:45 +00:00
Simon Pilgrim	fc4d4b251d	[AARCH64] Enable AARCH64 lit tests on windows dev machines As discussed on PR27654, this patch fixes the triples of a lot of aarch64 tests and enables lit tests on windows This will hopefully help stop cases where windows developers break the aarch64 target Differential Revision: https://reviews.llvm.org/D22191 llvm-svn: 275973	2016-07-19 13:35:11 +00:00
Daniel Sanders	3878412875	[mips][ias] R_MIPS_GOT_(PAGE\|OFST) do not need symbols Reviewers: sdardis Subscribers: dsanders, llvm-commits, sdardis Differential Revision: https://reviews.llvm.org/D22458 llvm-svn: 275968	2016-07-19 10:58:06 +00:00
Daniel Sanders	6a73883c48	[mips] Correct label prefixes for N32 and N64. Summary: N32 and N64 follow the standard ELF conventions (.L) whereas O32 uses its own ($). This fixes the majority of object differences between -fintegrated-as and -fno-integrated-as. Reviewers: sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: https://reviews.llvm.org/D22412 llvm-svn: 275967	2016-07-19 10:49:03 +00:00
Elena Demikhovsky	2c0780b8e5	AVX-512: Fixed BT instruction selection. The following condition expression ( a >> n) & 1 is converted to "bt a, n" instruction. It works on all intel targets. But on AVX-512 it was broken because the expression is modified to (truncate (a >>n) to i1). I added the new sequence (truncate (a >>n) to i1) to the BT pattern. Differential Revision: https://reviews.llvm.org/D22354 llvm-svn: 275950	2016-07-19 07:14:21 +00:00
Craig Topper	d6ca1dc45e	[AVX512] Give priority to EVEX encoded PSHUFB over the VEX versions. llvm-svn: 275942	2016-07-19 02:00:38 +00:00
George Burgess IV	5f30897b7b	[MemorySSA] Update to the new shiny walker. This patch updates MemorySSA's use-optimizing walker to be more accurate and, in some cases, faster. Essentially, this changed our core walking algorithm from a cache-as-you-go DFS to an iteratively expanded DFS, with all of the caching happening at the end. Said expansion happens when we hit a Phi, P; we'll try to do the smallest amount of work possible to see if optimizing above that Phi is legal in the first place. If so, we'll expand the search to see if we can optimize to the next phi, etc. An iteratively expanded DFS lets us potentially quit earlier (because we don't assume that we can optimize above all phis) than our old walker. Additionally, because we don't cache as we go, we can now optimize above loops. As an added bonus, this patch adds a ton of verification (if EXPENSIVE_CHECKS are enabled), so finding bugs is easier. Differential Revision: https://reviews.llvm.org/D21777 llvm-svn: 275940	2016-07-19 01:29:15 +00:00
Vedant Kumar	e3a0bf5048	Retry: [llvm-profdata] Speed up merging by using a thread pool Add a "-j" option to llvm-profdata to control the number of threads used. Auto-detect NumThreads when it isn't specified, and avoid spawning threads when they wouldn't be beneficial. I tested this patch using a raw profile produced by clang (147MB). Here is the time taken to merge 4 copies together on my laptop: No thread pool: 112.87s user 5.92s system 97% cpu 2:01.08 total With 2 threads: 134.99s user 26.54s system 164% cpu 1:33.31 total Changes since the initial commit: - When handling odd-length inputs, call ThreadPool::wait() before merging the last profile. Should fix a race/off-by-one (see r275937). Differential Revision: https://reviews.llvm.org/D22438 llvm-svn: 275938	2016-07-19 01:17:20 +00:00
Vedant Kumar	21ab20e005	Revert "[llvm-profdata] Speed up merging by using a thread pool" This reverts commit r275921. It broke the ppc64be bot: http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/3537 I'm not sure why it broke, but based on the output, it looks like an off-by-one (one profile left un-merged). llvm-svn: 275937	2016-07-19 00:57:09 +00:00
Wei Mi	79997a24d7	Recommit the patch "Use uniforms set to populate VecValuesToIgnore". For instructions in uniform set, they will not have vector versions so add them to VecValuesToIgnore. For induction vars, those only used in uniform instructions or consecutive ptrs instructions have already been added to VecValuesToIgnore above. For those induction vars which are only used in uniform instructions or non-consecutive/non-gather scatter ptr instructions, the related phi and update will also be added into VecValuesToIgnore set. The change will make the vector RegUsages estimation less conservative. Differential Revision: https://reviews.llvm.org/D20474 The recommit fixed the testcase global_alias.ll. llvm-svn: 275936	2016-07-19 00:50:43 +00:00
Matt Arsenault	cb540bc03c	AMDGPU: Expand register indexing pseudos in custom inserter This is to help moveSILowerControlFlow to before regalloc. There are a couple of tradeoffs with this. The complete CFG is visible to more passes, the loop body avoids an extra copy of m0, vcc isn't required, and immediate offsets can be shrunk into s_movk_i32. The disadvantage is the register allocator doesn't understand that the single lane's vector is dead within the loop body, so an extra register is used to outlive the loop block when expanding the VGPR -> m0 loop. This also now results in worse waitcnt insertion before the loop instead of after for pending operations at the point of the indexing, but that should be fixed by future improvements to cross block waitcnt insertion. v_movreld_b32's operands are now modeled more correctly since vdst is not a true output. This is kind of a hack to treat vdst as a use operand. Extra checking is required in the verifier since I can't seem to get tablegen to emit an implicit operand for a virtual register. llvm-svn: 275934	2016-07-19 00:35:03 +00:00
Sanjoy Das	ab73c9d88e	[LoopReroll] Reroll loops with unordered atomic memory accesses Reviewers: hfinkel, jfb, reames Subscribers: mcrosier, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D22385 llvm-svn: 275932	2016-07-19 00:23:54 +00:00
Matt Arsenault	50b76399ed	AMDGPU: Fix test name and broken CHECK-LABEL llvm-svn: 275928	2016-07-18 23:09:51 +00:00
Vedant Kumar	0bd9907581	[llvm-profdata] Speed up merging by using a thread pool Add a "-j" option to llvm-profdata to control the number of threads used. Auto-detect NumThreads when it isn't specified, and avoid spawning threads when they wouldn't be beneficial. I tested this patch using a raw profile produced by clang (147MB). Here is the time taken to merge 4 copies together on my laptop: No thread pool: 112.87s user 5.92s system 97% cpu 2:01.08 total With 2 threads: 134.99s user 26.54s system 164% cpu 1:33.31 total Differential Revision: https://reviews.llvm.org/D22438 llvm-svn: 275921	2016-07-18 22:02:39 +00:00
Artem Belevich	9f97dcb018	[NVPTX] Make sure we adjust alignment at all call sites .. including calls from kernel functions that were ignored by mistake before. llvm-svn: 275920	2016-07-18 21:58:48 +00:00
Dehao Chen	6132ee8502	[PM] Convert Loop Strength Reduce pass to new PM Summary: Convert Loop String Reduce pass to new PM Reviewers: davidxl, silvas Subscribers: junbuml, sanjoy, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D22468 llvm-svn: 275919	2016-07-18 21:41:50 +00:00
Teresa Johnson	2124157102	[PM] Port FunctionImport Pass to new PM Summary: Port FunctionImport Pass to new PM. Reviewers: mehdi_amini, davide Subscribers: davidxl, llvm-commits Differential Revision: https://reviews.llvm.org/D22475 llvm-svn: 275916	2016-07-18 21:22:24 +00:00
Wei Mi	f9afff71a2	Revert rL275912. llvm-svn: 275915	2016-07-18 21:14:43 +00:00
Wei Mi	1fd25726af	Use uniforms set to populate VecValuesToIgnore. For instructions in uniform set, they will not have vector versions so add them to VecValuesToIgnore. For induction vars, those only used in uniform instructions or consecutive ptrs instructions have already been added to VecValuesToIgnore above. For those induction vars which are only used in uniform instructions or non-consecutive/non-gather scatter ptr instructions, the related phi and update will also be added into VecValuesToIgnore set. The change will make the vector RegUsages estimation less conservative. Differential Revision: https://reviews.llvm.org/D20474 llvm-svn: 275912	2016-07-18 20:59:53 +00:00
Sanjay Patel	dbf44f5016	add tests for missed sext transform llvm-svn: 275908	2016-07-18 20:37:51 +00:00
Sanjay Patel	8a2bf3099f	auto-generate checks llvm-svn: 275899	2016-07-18 20:06:51 +00:00
Artem Belevich	052b1ed2fd	[NVPTX] Force minimum alignment of 4 for byval arguments of device-side functions. Taking address of a byval variable in PTX is legal, but currently runs into miscompilation by ptxas on sm_50+ (NVIDIA issue 1789042). Work around the issue by enforcing minimum alignment on byval arguments of device functions. The change is a no-op on SASS level for sm_3x where ptxas already aligns local copy by at least 4. Differential Revision: https://reviews.llvm.org/D22428 llvm-svn: 275893	2016-07-18 19:54:56 +00:00
Michael Zolotukhin	ea5b72825b	[LoopSimplify] Update LCSSA after separating nested loops. Summary: Usually LCSSA survives this transformation, but in some cases (see attached test) it doesn't: values from the original loop after separating might be used from the outer loop. Before the transformation it was the same loop, so LCSSA phis were not required. This fixes PR28272. Reviewers: sanjoy, hfinkel, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21665 llvm-svn: 275891	2016-07-18 19:44:19 +00:00
Vitaly Buka	c93e10fcbb	Revert "[ARM] Skip inline asm memory operands in DAGToDAGISel" Breaks asan, see https://reviews.llvm.org/D22103 This reverts commit r275776. llvm-svn: 275890	2016-07-18 19:44:01 +00:00
Vitaly Buka	fa474e3eb9	Revert "[ARM] Update test to use CHECK-LABEL. NFCI." Breaks asan, see https://reviews.llvm.org/D22103 This reverts commit r275777. llvm-svn: 275889	2016-07-18 19:43:58 +00:00
Michael Zolotukhin	7a3040dc83	[LCSSA] Post-process PHI-nodes created by SSAUpdate when constructing LCSSA form. Summary: SSAUpdate might insert PHI-nodes inside loops, which can break LCSSA form unless we fix it up. This fixes PR28424. Reviewers: sanjoy, chandlerc, hfinkel Subscribers: uabelho, llvm-commits Differential Revision: http://reviews.llvm.org/D21997 llvm-svn: 275883	2016-07-18 19:05:08 +00:00
Simon Pilgrim	069c732f82	[X86][SSE] Regenerate extraction from promotion test Added tests for SSE2 as well as SSE41 llvm-svn: 275878	2016-07-18 18:53:15 +00:00
Simon Pilgrim	a68b8df3a7	[X86][SSE] Regenerate extraction+store memop tests Added tests for SSE2 as well as SSE41+AVX llvm-svn: 275876	2016-07-18 18:44:01 +00:00
Simon Pilgrim	b21b47ba61	[X86][SSE] Regenerate truncate+extension memop tests Added tests for SSE2 as well as SSE41 llvm-svn: 275875	2016-07-18 18:42:33 +00:00
Simon Pilgrim	600baaed89	Regenerate test llvm-svn: 275872	2016-07-18 18:38:51 +00:00
Matt Arsenault	c96e1deffa	AMDGPU: Add intrinsic for s_flbit_i32/v_ffbh_i32 llvm-svn: 275871	2016-07-18 18:35:05 +00:00
Matt Arsenault	4c519d3518	AMDGPU/R600: Replace barrier intrinsics llvm-svn: 275870	2016-07-18 18:34:59 +00:00
Matt Arsenault	efb24540b1	AMDGPU: Remove dead check in AMDGPUPromoteAlloca This is currently only called with GEP users. A direct alloca would only happen with current typed pointers for arrays which are a perverse case. Also fix crashes on 0 x and 1 x arrays. llvm-svn: 275869	2016-07-18 18:34:53 +00:00
Tim Northover	918f05063c	CodeGenPrep: use correct function to determine Global's alignment. Elsewhere (particularly computeKnownBits) we assume that a global will be aligned to the value returned by Value::getPointerAlignment. This is used to boost the alignment on memcpy/memset, so any target-specific request can only increase that value. llvm-svn: 275866	2016-07-18 18:28:52 +00:00
Vedant Kumar	2e0893629a	[llvm-cov] Place anchors around line numbers in html reports Based on a suggestion by Harlan Haskins! llvm-svn: 275840	2016-07-18 17:53:16 +00:00
Simon Pilgrim	c941f6b329	[X86][AVX] Add target shuffle decode support for VBROADCAST Currently we only decode broadcasts from a vector of the same size. llvm-svn: 275823	2016-07-18 17:32:59 +00:00
Krzysztof Parzyszek	5948ea78b9	[Hexagon] Handle returning small structures by value This is compliant with the official ABI, but allows experimentation with calling conventions. llvm-svn: 275822	2016-07-18 17:30:41 +00:00
Chih-Hung Hsieh	4d9f2c154d	[X86] Accept SELECT op code for x86-64 fp128 type DAGTypeLegalizer::CanSkipSoftenFloatOperand should allow SELECT op code for x86_64 fp128 type for MME targets, so SoftenFloatOperand does not abort on SELECT op code. Differential Revision: http://reviews.llvm.org/D21758 llvm-svn: 275818	2016-07-18 17:20:09 +00:00
Adam Nemet	d6ba0bf831	[LoopDist] This test does not require ASSERTS Only its counterpart, diagnostics-with-hotness-lazy-BFI.ll, which invokes opt with -debug-only=. llvm-svn: 275812	2016-07-18 16:37:32 +00:00
Adam Nemet	b2593f78ca	[LoopDist] Port to new PM Summary: The direct motivation for the port is to ensure that the OptRemarkEmitter tests work with the new PM. This remains a function pass because we not only create multiple loops but could also version the original loop. In the test I need to invoke opt with -passes='require<aa>,loop-distribute'. LoopDistribute does not directly depend on AA however LAA does. LAA uses getCachedResult so I think we need manually pull in 'aa'. Reviewers: davidxl, silvas Subscribers: sanjoy, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D22437 llvm-svn: 275811	2016-07-18 16:29:27 +00:00
Simon Pilgrim	4ac7420618	[X86][AVX2] Added tests that demonstrate duplicate broadcasts We don't yet decode broadcasts as a target shuffle llvm-svn: 275808	2016-07-18 16:17:34 +00:00
Krzysztof Parzyszek	786333ffcc	[Hexagon] Enable .cur formation in MISched for Hexagon V60 Schedule a load and its use in the same packet in MISched. Previously, isResourceAvailable was returning false for dependences in the same packet, which prevented MISched from packetizing a load and its use in the same packet for v60. Patch by Ikhlas Ajbar. llvm-svn: 275804	2016-07-18 16:05:27 +00:00
Alexander Kornienko	63dd36faa5	Revert "r275571 [DSE]Enhance shorthening MemIntrinsic based on OverlapIntervals" Causes https://llvm.org/bugs/show_bug.cgi?id=28588 llvm-svn: 275801	2016-07-18 15:51:31 +00:00
Nemanja Ivanovic	d3c284f645	[PowerPC] Remove redundant direct moves when extracting integers and converting to FP This patch corresponds to review: https://reviews.llvm.org/D21354 We use direct moves for extracting integer elements from vectors. We also use direct moves when converting integers to FP. When these operations are chained, we get a direct move out of a VSR followed by a direct move back into a VSR. These are redundant - all we need to do is line up the element and convert. llvm-svn: 275796	2016-07-18 15:30:00 +00:00
Nirav Dave	a645433c5f	[MC] Cleanup Error Handling in AsmParser Add parseToken and compatriot functions to stitch error checks in straight linear code. As part of this fix some erronous handling of directives where the EndOfStatement token either was not checked or Lexed on termination. Reviewers: rnk, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22312 llvm-svn: 275795	2016-07-18 15:24:03 +00:00
Krzysztof Parzyszek	393b37937b	[Hexagon] Use timing class info as tie-breaker in machine scheduler Patch by Sirish Pande. llvm-svn: 275794	2016-07-18 15:17:10 +00:00
Krzysztof Parzyszek	3467e9d0a9	[Hexagon] HexagonMachineScheduler should account for resources The machine scheduler needs to account for available resources more accurately in order to avoid scheduling an instruction that forces a new packet to be created. This occurs in two ways: First, an instruction without an available resource may have a large priority due to other metrics and be scheduled when there are other instructions with available resources. Second, an instruction with a non-zero latency may become available prematurely. In both these cases, we attempt change the priority in order to allow a better instruction to be scheduled. Patch by Brendon Cahoon. llvm-svn: 275793	2016-07-18 14:52:13 +00:00
Krzysztof Parzyszek	748d3efec6	[Hexagon] Fix zero latency instructions with multiple predecessors An instruction may have multiple predecessors that are candidates for using .cur. However, only one of them can use .cur in the packet. When this case occurs, we need to make sure that only one of the dependences gets a 0 latency value. Patch by Brendon Cahoon. llvm-svn: 275790	2016-07-18 14:23:10 +00:00
Simon Pilgrim	1b2ab113fb	[SLPVectorizer][X86] Added sqrt vectorization tests llvm-svn: 275788	2016-07-18 13:20:54 +00:00
Simon Dardis	d32a2d30cb	[inlineasm] Propagate operand constraints to the backend When SelectionDAGISel transforms a node representing an inline asm block, memory constraint information is not preserved. This can cause constraints to be broken when a memory offset is of the form: offset + frame index when the frame is resolved. By propagating the constraints all the way to the backend, targets can enforce memory operands of inline assembly to conform to their constraints. For MIPSR6, some instructions had their offsets reduced to 9 bits from 16 bits such as ll/sc. This becomes problematic when using inline assembly to perform atomic operations, as an offset can generated that is too big to encode in the instruction. Reviewers: dsanders, vkalintris Differential Review: https://reviews.llvm.org/D21615 llvm-svn: 275786	2016-07-18 13:17:31 +00:00
Nicolai Haehnle	bef1ceb815	AMDGPU: Disable AMDGPUPromoteAlloca pass for shader calling conventions. Summary: The work item intrinsics are not available for the shader calling conventions. And even if we did hook them up most shader stages haves some extra restrictions on the amount of available LDS. Reviewers: tstellarAMD, arsenm Subscribers: nhaehnle, arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D20728 llvm-svn: 275779	2016-07-18 09:02:47 +00:00
Diana Picus	6731f13458	[ARM] Update test to use CHECK-LABEL. NFCI. llvm-svn: 275777	2016-07-18 07:48:42 +00:00
Diana Picus	73ed44d328	[ARM] Skip inline asm memory operands in DAGToDAGISel The current logic for handling inline asm operands in DAGToDAGISel interprets the operands by looking for constants, which should represent the flags describing the kind of operand we're dealing with (immediate, memory, register def etc). The operands representing actual data are skipped only if they are non-const, with the exception of immediate operands which are skipped explicitly when a flag describing an immediate is found. The oversight is that memory operands may be const too (e.g. for device drivers reading a fixed address), so we should explicitly skip the operand following a flag describing a memory operand. If we don't, we risk interpreting that constant as a flag, which is definitely not intended. Fixes PR26038 Differential Revision: https://reviews.llvm.org/D22103 llvm-svn: 275776	2016-07-18 07:35:14 +00:00
Craig Topper	a3c55f5915	[AVX512] Add EVEX versions of scalar ADD/SUB/MUL/DIV to load folding tables. llvm-svn: 275775	2016-07-18 06:49:32 +00:00
Craig Topper	83613bb436	[X86] Fix test checks to include leading 'v' on avx mnemonic names. llvm-svn: 275774	2016-07-18 06:49:29 +00:00
Diana Picus	774d157a5d	[ARM] Honour ABI for rem under -O0 for EABI, GNUEABI, Android and Musl At higher optimization levels, we generate the libcall for DIVREM_Ix, which is fine: aeabi_{u\|i}divmod. At -O0 we generate the one for REM_Ix, which is the default {u}mod{q\|h\|s\|d}i3. This commit makes sure that we don't generate REM_Ix calls for ABIs that don't support them (i.e. where we need to use DIVREM_Ix instead). This is achieved by bailing out of FastISel, which can't handle non-double multi-reg returns, and letting the legalization infrastructure expand the REM_Ix calls. It also updates the divmod-eabi.ll test to run under -O0 as well, and adds some Windows checks to it to make sure we don't break things for it. Fixes PR27068 Differential Revision: https://reviews.llvm.org/D21926 llvm-svn: 275773	2016-07-18 06:48:25 +00:00
Craig Topper	1af6cc00dc	[X86] Add VPADD instructions to X86InstrInfo::isAssociativeAndCommutative. llvm-svn: 275769	2016-07-18 06:14:54 +00:00
Craig Topper	ba9b93d7f2	[X86] Add floating point packed logical ops to X86InstrInfo::isAssociativeAndCommutative. llvm-svn: 275768	2016-07-18 06:14:50 +00:00
Craig Topper	3a99de4067	[X86] Add AVX512 instructions to X86InstrInfo::isAssociativeAndCommutative. llvm-svn: 275767	2016-07-18 06:14:47 +00:00
Craig Topper	f7a06c29bc	[X86] Add AVX512 load opcodes and a couple AVX load opcodes to X86InstrInfo::areLoadsFromSameBasePtr. llvm-svn: 275765	2016-07-18 06:14:43 +00:00
Craig Topper	650a15e2b3	[X86] Add more opcodes to isFrameLoadOpcode/isFrameStoreOpcode. Mainly AVX-512 related. llvm-svn: 275764	2016-07-18 06:14:39 +00:00
Craig Topper	5c913e84df	[AVX512] Use VMOVAPSZ128rr/VMOVAPS256rr for VR128X/VR256X physreg moves when VLX is supported. Ideally we would use VEX encoded moves instead of EVEX if the high 16 registers aren't referenced, but this a good first step. llvm-svn: 275763	2016-07-18 06:14:34 +00:00
David Majnemer	04c7c225a1	[GVNHoist] Change the key for VNtoInsns to a pair While debugging GVNHoist, I found it confusing that the entries in a VNtoInsns were not always value numbers. They _usually_ were except for StoreInst in which case they were a hash of two different value numbers. This leads to two observations: - It is more difficult to debug things when the semantic contents of VNtoInsns changes over time. - Using a single value number is not much cheaper, the value of VNtoInsns is a SmallVector. - It is not immediately clear what the algorithm would do if there were hash collisions in the StoreInst case. Using a DenseMap of std::pair sidesteps all of this. N.B. The changes in the test were due their sensitivity to the iteration order of VNtoInsns which has changed. llvm-svn: 275761	2016-07-18 06:11:37 +00:00
Vedant Kumar	733f795947	[llvm-cov] Attempt to fix a test failure on Windows Don't make the test/tools/llvm-cov/demangle.test depend on the order in which symbols are seen, or on the exact formatting llvm-cov emits after a symbol is printed. This is an attempt to fix a Windows bot failure: http://lab.llvm.org:8011/builders/clang-x86-win2008-selfhost/builds/9141 I don't know what the root cause of the failure is, or why the showTemplateInstantiations test doesn't fail in the same way on the Windows bots. However, this measure can't hurt, and it'll at least get me on the blamelists again. llvm-svn: 275758	2016-07-18 04:49:42 +00:00
NAKAMURA Takumi	966bde50c3	Revert r275678, "Revert "Revert r275027 - Let FuncAttrs infer the 'returned' argument attribute"" This reverts also r275029, "Update Clang tests after adding inference for the returned argument attribute" It broke LTO build. Seems miscompilation. llvm-svn: 275756	2016-07-18 03:23:25 +00:00
Davide Italiano	4edd54794b	[GVN] Move other PRE tests to a subdirectory. llvm-svn: 275742	2016-07-17 23:55:20 +00:00
Davide Italiano	ed8e0881c1	[GVN] Move the PRE/LOADPRE test in a subdirectory. llvm-svn: 275741	2016-07-17 23:48:18 +00:00
Davide Italiano	6a69f829bd	[GVN] Use FileCheck instead of grep for tests. llvm-svn: 275739	2016-07-17 23:21:26 +00:00
Simon Pilgrim	47638635cc	[X86] Add CTPOP/CTLZ/CTTZ scalar cost tests llvm-svn: 275725	2016-07-17 18:29:19 +00:00
Simon Pilgrim	5aa90c55b6	[X86][AVX] Added VBROADCASTF128/VBROADCASTI128 tests llvm-svn: 275713	2016-07-17 17:44:18 +00:00
Simon Pilgrim	d1e941ae85	[X86] Regenerated ctlz/cttz scalar tests for 32/64-bit targets with/without LZCNT/TZCNT support llvm-svn: 275710	2016-07-17 16:15:51 +00:00
Simon Pilgrim	0bf66c9d62	[X86] Regenerated popcnt scalar tests for 32/64-bit targets with/without POPCNT support llvm-svn: 275709	2016-07-17 16:04:19 +00:00
Teresa Johnson	cd21a646f6	[ThinLTO] Perform profile-guided indirect call promotion Summary: To enable profile-guided indirect call promotion in ThinLTO mode, we simply add call graph edges for each profitable target from the profile to the summaries, then the summary-guided importing will consider the callee for importing as usual. Also we need to enable the indirect call promotion pass creation in the PassManagerBuilder when PerformThinLTO=true (we are in the ThinLTO backend), so that the newly imported functions are considered for promotion in the backends. The IC promotion profiles refer to callees by GUID, which required adding GUIDs to the per-module VST in bitcode (and assigning them valueIds similar to how they are assigned valueIds in the combined index). Reviewers: mehdi_amini, xur Subscribers: mehdi_amini, davidxl, llvm-commits Differential Revision: http://reviews.llvm.org/D21932 llvm-svn: 275707	2016-07-17 14:47:01 +00:00
Elena Demikhovsky	eaa356501d	X86: Updated a test file. NFC. This test shows subotimal code generated for AVX-512 vs PENTIUM4. The issue will be fixed in an upcomming commit. llvm-svn: 275702	2016-07-17 07:03:13 +00:00
Dehao Chen	1a44452b11	[PM] Convert IVUsers analysis to new pass manager. Summary: Convert IVUsers analysis to new pass manager. Reviewers: davidxl, silvas Subscribers: junbuml, sanjoy, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D22434 llvm-svn: 275698	2016-07-16 22:51:33 +00:00
Sanjay Patel	79acd2a96b	[InstCombine] allow X + signbit --> X ^ signbit for vector splats llvm-svn: 275691	2016-07-16 18:29:26 +00:00
Sanjay Patel	040bd16e56	add vector test to show missing transform llvm-svn: 275690	2016-07-16 18:24:18 +00:00
Sanjay Patel	eb50476f77	update tests to use FileCheck, consolidate tests, fix comments llvm-svn: 275688	2016-07-16 18:08:22 +00:00
Sanjay Patel	e540a31f59	update test to use FileCheck llvm-svn: 275687	2016-07-16 16:31:58 +00:00
Sanjay Patel	972a53fb42	auto-generate checks llvm-svn: 275686	2016-07-16 16:27:58 +00:00
Sanjay Patel	309f98ef3a	auto-ggenerate checks llvm-svn: 275685	2016-07-16 16:24:06 +00:00
Sanjay Patel	f9d2b20daf	[InstCombine] reassociate logic ops with constants separated by a zext This is a partial implementation of a general fold for associative+commutative operators: (op (cast (op X, C2)), C1) --> (cast (op X, op (C1, C2))) (op (cast (op X, C2)), C1) --> (op (cast X), op (C1, C2)) There are 7 associative operators and 13 cast types, so this could potentially go a lot further. Differential Revision: https://reviews.llvm.org/D22421 llvm-svn: 275684	2016-07-16 15:20:19 +00:00
Hal Finkel	660096b260	Revert "Revert r275027 - Let FuncAttrs infer the 'returned' argument attribute" This reverts commit r275042; the initial commit triggered self-hosting failures on ARM/AArch64. James Molloy identified the problematic backend code, which has been disabled in r275677. Trying again... Original commit message: Let FuncAttrs infer the 'returned' argument attribute A function can have one argument with the 'returned' attribute, indicating that the associated argument is always the return value of the function. Add FuncAttrs inference logic. llvm-svn: 275678	2016-07-16 07:21:28 +00:00
Hal Finkel	04b5330ccd	Disable this-return argument forwarding on ARM/AArch64 r275042 reverted function-attribute inference for the 'returned' attribute because the feature triggered self-hosting failures on ARM and AArch64. James Molloy determined that the this-return argument forwarding feature, which directly ties the returned input argument to the returned value, was the cause. It seems likely that this forwarding code contains, or triggers, a subtle bug. Disabling for now until we can track that down. llvm-svn: 275677	2016-07-16 07:07:29 +00:00
Yaxun Liu	a711cc7951	Re-commit [AMDGPU] Add metadata for runtime Attempting to fix lit test failure on ppc. llvm-svn: 275676	2016-07-16 05:09:21 +00:00
Matthias Braun	538859cca3	llc: Add support for -run-pass none This does not schedule any passes besides the ones necessary to construct and print the machine function. This is useful to test .mir file reading and printing. Differential Revision: http://reviews.llvm.org/D22432 llvm-svn: 275664	2016-07-16 02:24:59 +00:00
Matthias Braun	c92a5fc9f6	ARM/MIR: Move test from MIR to CodeGen/ARM directory test/CodeGen/MIR/ARM/ARMLoadStoreDBG.mir is an actual test for the ARM load store optimization pass and not a test of the mir parser/printer. It belongs to test/CodeGen/ARM; This also updates the test to use the new -run-pass llc syntax. llvm-svn: 275662	2016-07-16 02:24:13 +00:00
Matthias Braun	5d00b3213e	MIParser: reject subregister indexes on physregs llvm-svn: 275658	2016-07-16 01:36:18 +00:00
Vedant Kumar	424f51bb04	[llvm-cov] Optionally use a symbol demangler when preparing reports Add an option to specify a symbol demangler (as well as options to the demangler). This can be used to make reports more human-readable. This option is especially useful in -output-dir mode, since it isn't as easy to manually pipe reports into a demangler in this mode. llvm-svn: 275640	2016-07-15 22:44:57 +00:00
Matt Arsenault	73d2f8954a	AMDGPU: Fix verifier error from partially undef copy In this situation: %VGPR2<def> = BUFFER_LOAD_DWORD_OFFSET %SGPR8_SGPR9_SGPR10_SGPR11, %VGPR7<def,tied3> = V_MAC_F32_e32 %VGPR0<undef>, %VGPR1<kill>, %VGPR7<kill,tied0>, %EXEC<imp-use> %VGPR3_VGPR4_VGPR5_VGPR6<def> = COPY %VGPR0_VGPR1_VGPR2_VGPR3 %VGPR4<def> = COPY %VGPR2 The copy for VGPR1 -> VGPR4 was an error from reading undefined VGPR1, but VGPR4 is defined immediately after this copy. llvm-svn: 275635	2016-07-15 22:32:02 +00:00
Michael Kuperstein	be2e3f5ce5	ExpandPostRAPseudos should transfer implicit uses, not only implicit defs Previously, we would expand: %BL<def> = COPY %DL<kill>, %EBX<imp-use,kill>, %EBX<imp-def> Into: %BL<def> = MOV8rr %DL<kill>, %EBX<imp-def> Dropping the imp-use on the floor. That confused CriticalAntiDepBreaker, which (correctly) assumes that if an instruction defs but doesn't use a register, that register is dead immediately before the instruction - while in this case, the high lanes of EBX can be very much alive. This fixes PR28560. Differential Revision: https://reviews.llvm.org/D22425 llvm-svn: 275634	2016-07-15 22:31:14 +00:00
Zachary Turner	b927e02e1b	[pdb] Teach MsfBuilder and other classes about the Free Page Map. Block 1 and 2 of an MSF file are bit vectors that represent the list of blocks allocated and free in the file. We had been using these blocks to write stream data and other data, so we mark them as the free page map now. We don't yet serialize these pages to the disk, but at least we make a note of what it is, and avoid writing random data to them. Doing this also necessitated cleaning up some of the tests to be more general and hardcode fewer values, which is nice. llvm-svn: 275629	2016-07-15 22:17:19 +00:00
Zachary Turner	5e534c7fb3	[pdb] Round trip the NameMap data structure to YAML. llvm-svn: 275628	2016-07-15 22:17:08 +00:00
Zachary Turner	faa554b2fd	[pdb] Use MsfBuilder to handle the writing PDBs. Previously we would read a PDB, then write some of it back out, but write the directory, super block, and other pertinent metadata back out unchanged. This generates incorrect PDBs since the amount of data written was not always the same as the amount of data read. This patch changes things to use the newly introduced `MsfBuilder` class to write out a correct and accurate set of Msf metadata for the data actually written, which opens up the door for adding and removing type records, symbol records, and other types of data to an existing PDB. llvm-svn: 275627	2016-07-15 22:16:56 +00:00
Matt Arsenault	93be6e8c0a	StructurizeCFG: Fix inverting constantexpr conditions llvm-svn: 275626	2016-07-15 22:13:16 +00:00
Matt Arsenault	a65e6b8335	AMDGPU: Remove brev intrinsic llvm-svn: 275620	2016-07-15 21:27:13 +00:00
Matt Arsenault	82e5e1e564	AMDGPU: Fix TargetPrefix for remaining r600 intrinsics llvm-svn: 275619	2016-07-15 21:27:08 +00:00
Matt Arsenault	11d3e21f2b	AMDGPU: Remove AMDGPU.ldexp llvm-svn: 275618	2016-07-15 21:26:56 +00:00
Matt Arsenault	09b2c4aee8	AMDGPU: Remove legacy rsq.clamped intrinsic Mesa still has a use of llvm.AMDGPU.rsq.f64 remaining. Also fix mismatch with non-IEEE rsq selecting to IEEE rsq. llvm-svn: 275617	2016-07-15 21:26:52 +00:00
Saleem Abdulrasool	467269a40e	CodeGen: avoid emitting unnecessary CFI Remove unnecessary clutter in assembly output. When using SjLj EH, the CFI is not actually used for anything. Do not emit the CFI needlessly. The minor test adjustments are interesting. The prologue test was just overzealous matcching. The interesting case is the LSDA change. It was originally added to ensure that various compilations did not mangle the name (it explicitly checked the name!). However, subsequent cleanups made it more reliant on the CFI to find the name. Parse the generated code flow to generically find the label still. llvm-svn: 275614	2016-07-15 21:10:29 +00:00
Nico Weber	8d66df15f4	Teach fast isel about the win64 calling convention. This mostly just works. Vectorcall rets are still not supported. The win64_eh test change is because fast isel doesn't use rsi for temporary computations, so it doesn't need to be pushed. The test case I'm changing was originally added to test pushes, but by now there are other test cases in that file exercising that code path. https://reviews.llvm.org/D22422 llvm-svn: 275607	2016-07-15 20:18:37 +00:00
George Burgess IV	22682e293b	[CFLAA] Add attributes handling for CFLAnders. This patch adds proper handling of stratified attributes into our anders-style CFLAA implementation. It also comes bundled with more CFLAnders tests. :) Patch by Jia Chen. Differential Revision: https://reviews.llvm.org/D22325 llvm-svn: 275604	2016-07-15 20:02:49 +00:00
George Burgess IV	6d30aa03a0	[CFLAA] Add an initial CFLAnders implementation. This adds an incomplete anders-style implementation for CFLAA. It's incomplete in that it's missing interprocedural analysis, attrs handling, etc. and that it needs more tests. More tests and features will be added in future commits. Patch by Jia Chen. Differential Revision: https://reviews.llvm.org/D22291 llvm-svn: 275602	2016-07-15 19:53:25 +00:00
Vitaly Buka	7f64844481	Revert "[AMDGPU] Add metadata for runtime" This reverts commit r275566. llvm-svn: 275599	2016-07-15 19:14:57 +00:00
Jingyue Wu	2b353a9522	[ReassociateGEP] Update tests to allow missing "inbounds" on certain GEPs. With r275532 fixing miscompilation of GVN, "inbounds" on certain GEPs in these tests cannot be preserved any more. Left a TODO in the tests for future reference. llvm-svn: 275596	2016-07-15 18:47:17 +00:00
Sanjay Patel	27fefb2fcf	add tests for associative ops blocked by a cast These are more generalized versions of the cases added in r275302 and r275297. llvm-svn: 275594	2016-07-15 18:39:02 +00:00
Rong Xu	96a19d35ae	[PGO] IRPGO pre-cleanup pass changes This patch adds a selected set of cleanup passes including a pre-inline pass before LLVM IR PGO instrumentation. The inline is only intended to apply those obvious/trivial ones before instrumentation so that much less instrumentation is needed to get better profiling information. This will drastically improve the instrumented code performance for large C++ applications. Another benefit is the context sensitive counts that can potentially improve the PGO optimization. Differential Revision: http://reviews.llvm.org/D21405 llvm-svn: 275588	2016-07-15 18:10:49 +00:00
Adam Nemet	aad816083e	[OptRemark,LDist] RFC: Add hotness attribute Summary: This is the first set of changes implementing the RFC from http://thread.gmane.org/gmane.comp.compilers.llvm.devel/98334 This is a cross-sectional patch; rather than implementing the hotness attribute for all optimization remarks and all passes in a patch set, it implements it for the 'missed-optimization' remark for Loop Distribution. My goal is to shake out the design issues before scaling it up to other types and passes. Hotness is computed as an integer as the multiplication of the block frequency with the function entry count. It's only printed in opt currently since clang prints the diagnostic fields directly. E.g.: remark: /tmp/t.c:3:3: loop not distributed: use -Rpass-analysis=loop-distribute for more info (hotness: 300) A new API added is similar to emitOptimizationRemarkMissed. The difference is that it additionally takes a code region that the diagnostic corresponds to. From this, hotness is computed using BFI. The new API is exposed via an analysis pass so that it can be made dependent on LazyBFI. (Thanks to Hal for the analysis pass idea.) This feature can all be enabled by setDiagnosticHotnessRequested in the LLVM context. If this is off, LazyBFI is not calculated (D22141) so there should be no overhead. A new command-line option is added to turn this on in opt. My plan is to switch all user of emitOptimizationRemark* to use this module instead. Reviewers: hfinkel Subscribers: rcox2, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D21771 llvm-svn: 275583	2016-07-15 17:23:20 +00:00
Jun Bum Lim	a5737d8eac	[DSE]Enhance shorthening MemIntrinsic based on OverlapIntervals Summary: This change use the overlap interval map built from partial overwrite tracking to perform shortening MemIntrinsics. Add test cases which was missing opportunities before. Reviewers: hfinkel, eeckstein, mcrosier Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D21909 llvm-svn: 275571	2016-07-15 16:14:34 +00:00
Krzysztof Parzyszek	bba0bf7d37	[Hexagon] Improve patterns with stack-based addressing - Treat bitwise OR with a frame index as an ADD wherever possible, fold it into addressing mode. - Extend patterns for memops to allow memops with frame indexes as address operands. llvm-svn: 275569	2016-07-15 15:35:52 +00:00
Nico Weber	f7f2b81602	In dag-optnone.ll, use varargs instead of win64 to fast SDIsel. The test used to rely on targeting win64 to disable fast isel, but I'd like to teach fast isel about win64 rets. Change the test to use varargs to disable fast isel. llvm-svn: 275568	2016-07-15 15:30:18 +00:00
Yaxun Liu	b3d17690eb	[AMDGPU] Add metadata for runtime Added emitting metadata to elf for runtime. Runtime requires certain information (metadata) about kernels to be able to execute and query them. Such information is emitted to an elf section as a key-value pair stream. Differential Revision: https://reviews.llvm.org/D21849 llvm-svn: 275566	2016-07-15 14:58:21 +00:00
Sebastian Pop	4177480aad	code hoisting pass based on GVN This pass hoists duplicated computations in the program. The primary goal of gvn-hoist is to reduce the size of functions before inline heuristics to reduce the total cost of function inlining. Pass written by Sebastian Pop, Aditya Kumar, Xiaoyu Hu, and Brian Rzycki. Important algorithmic contributions by Daniel Berlin under the form of reviews. Differential Revision: http://reviews.llvm.org/D19338 llvm-svn: 275561	2016-07-15 13:45:20 +00:00
Simon Pilgrim	efd841e294	[X86][AVX] Added shuffle tests for UNPCK+PERMUTE lowerVectorShuffleAsPermuteAndUnpack could solve this if it worked with 256-bit vectors llvm-svn: 275554	2016-07-15 11:51:46 +00:00
Simon Pilgrim	cf9c31550c	[X86][AVX2] Added a memory version of test_mm256_broadcastsi128_si256 This should lower to vbroadcasti128 llvm-svn: 275552	2016-07-15 11:40:27 +00:00
Simon Pilgrim	2683ad54ad	[X86][AVX2] Improve lowerShuffleAsRepeatedMaskAndLanePermute permutation of 64-bit sub-lanes As discussed on PR28136, lowerShuffleAsRepeatedMaskAndLanePermute was attempting to match repeated masks at the 128-bit level and then permute the resultant lanes at the 128-bit (AVX1) or 64-bit (AVX2) sub-lane level. This change allows us to create the repeated masks at the sub-lane level (and then concat them together to create a 128-bit repeated mask) and then select which sub-lane to permute. This has no effect on the AVX1 codegen. Fixes PR28136. llvm-svn: 275543	2016-07-15 09:49:12 +00:00
James Molloy	b3326df56a	[Thumb-1] Select post-increment load and store where possible Thumb-1 doesn't have post-inc or pre-inc load or store instructions. However the LDM/STM instructions with writeback can function as post-inc load/store: ldm r0!, {r1} @ load from r0 into r1 and increment r0 by 4 Obviously, this only works if the post increment is 4. llvm-svn: 275540	2016-07-15 08:03:56 +00:00
James Molloy	a454a11d60	[ARM] Prefer indirect calls in minsize mode ... When we emit several calls to the same function in the same basic block. An indirect call uses a "BLX r0" instruction which has a 16-bit encoding. If many calls are made to the same target, this can enable significant code size reductions. llvm-svn: 275537	2016-07-15 07:55:21 +00:00
David Majnemer	959a6623b5	XFAIL two SeparateConstOffsetFromGEP tests They appear to have relied on bugs hidden in copyIRFlags/andIRFlags. This has been filed as PR28564. llvm-svn: 275533	2016-07-15 05:37:22 +00:00
David Majnemer	92f84ccf0f	[IR] andIRFlags and copyIRFlags needs to handle GEP We didn't consider the inbounds flag on GEPs leading to downstream users introducing UB. This fixes PR28562. llvm-svn: 275532	2016-07-15 05:02:31 +00:00
Vedant Kumar	71d515b0b3	[llvm-cov] Relax a test for Windows Attempt to address this bot failure: http://bb.pgr.jp/builders/ninja-clang-i686-msc19-R/builds/4967 llvm-svn: 275522	2016-07-15 02:11:37 +00:00
Vedant Kumar	b95dc4608d	[llvm-cov] Improve error messages While we're at it, extend an existing test to make sure that error messages look reasonable. llvm-svn: 275520	2016-07-15 01:53:39 +00:00
Matt Arsenault	b91805ea2b	AMDGPU: Fix not expanding control flow after some kill blocks Also stop trying to insert skip blocks at end_cf. This was inserting them at the end of the block which doesn't make sense. The skip should be inserted at the beginning of the block right after the end cf. Just remove this for now since no tests seem to stress this and I think this can be handled more generally later. Fixes bug 28550 llvm-svn: 275510	2016-07-15 00:58:15 +00:00
Matt Arsenault	fa5a86a403	AMDGPU: Fix trying to skip from a block with no successors Found while reducing bug 28550 llvm-svn: 275509	2016-07-15 00:58:13 +00:00
Matt Arsenault	83ab049af2	AMDGPU: Fix splitting kill blocks with defs before kill llvm-svn: 275508	2016-07-15 00:58:09 +00:00
Reid Kleckner	c29b4f07f9	[codeview] Shrink inlined call site line info tables For a fully inlined call chain like a -> b -> c -> d, we were emitting line info for 'd' 3 separate times: once for d's actual InlineSite line table, and twice for 'b' and 'c'. This is particularly inefficient when all these functions are in different headers, because now we need to encode the file change. Windbg was coping with our suboptimal output, so this should not be noticeable from the debugger. llvm-svn: 275502	2016-07-14 23:47:15 +00:00
Tim Northover	fbefee3bff	llvm-objdump: extend __mh_execute_header handling to other special syms We don't need to print any of the special __mh_*_header symbols when disassembling. Since they point at the beginning of the segment (not where the actual code is) they're pretty misleading. Should also fix lld bots. llvm-svn: 275498	2016-07-14 23:13:03 +00:00
Simon Pilgrim	420b266d0a	[X86][AVX2] Allow VPERMPD/VPERMQ shuffles to call combineShuffle (reapplied) This improves the situation discussed in D19228 where we were forcing VPERMPD/VPERMQ where VPERM2F128/VPERM2I128 would have been better. This was incorrectly reverted in rL275421 during triage of PR28552. llvm-svn: 275497	2016-07-14 23:05:09 +00:00
Adam Nemet	74730d9ab0	[LoopDist] Fix typo in diagnostic llvm-svn: 275495	2016-07-14 22:33:46 +00:00
Tim Northover	f203ab5be3	llvm-objdump: handle stubbed and malformed dylibs better We were quite happy to read past the end of the valid section data when disassembling. Instead we entirely skip stub dylibs, and tell the user what's happened if their section only has partial data. llvm-svn: 275487	2016-07-14 22:13:32 +00:00
Ekaterina Romanova	7aea5906c0	[GVN] Fold constant expression in GVN. Fix for PR 28418. opt never finishes compiling a test when -gvn option is passed. The problem is caused by the fact that GVN fails to fold a constant expression. Differential Revision: https://reviews.llvm.org/D22185 llvm-svn: 275483	2016-07-14 22:02:25 +00:00
Teresa Johnson	35e0204eec	[ThinLTO/gold] Perform index-based weak/linkonce resolution Summary: Invoke the weak/linkonce symbol resolution support (already used by libLTO) that operates via the summary index. This ensures prevailing linkonce are kept, by making them weak, and marks preempted copies as available_externally when possible. With this change, the older support for keeping the prevailing linkonce (by changing their symbol resolution) is removed. Reviewers: mehdi_amini Subscribers: llvm-commits, mehdi_amini Differential Revision: http://reviews.llvm.org/D22302 llvm-svn: 275474	2016-07-14 21:13:24 +00:00
Matthew Simpson	65ca32b83c	[LV] Allow interleaved accesses in loops with predicated blocks This patch allows the formation of interleaved access groups in loops containing predicated blocks. However, the predicated accesses are prevented from forming groups. Differential Revision: https://reviews.llvm.org/D19694 llvm-svn: 275471	2016-07-14 20:59:47 +00:00
Krzysztof Parzyszek	ecea07c50e	[Hexagon] Packetize function call arguments with tail call instructions On Hexagon is it legal to packetize the instructions setting up call arguments with the call instruction itself. This was already done, except for tail calls. Make sure tail calls are handled as well. llvm-svn: 275458	2016-07-14 19:30:55 +00:00
Sanjoy Das	13623ad009	[JumpThreading] PRE unordered loads Summary: Extend JumpThreading's PRE to unordered atomic loads. Reviewers: hfinkel, reames Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D22326 llvm-svn: 275456	2016-07-14 19:21:15 +00:00
Jun Bum Lim	c837af306e	[PM] Port Dead Loop Deletion Pass to the new PM Summary: Port Dead Loop Deletion Pass to the new pass manager. Reviewers: silvas, davide Subscribers: llvm-commits, sanjoy, mcrosier Differential Revision: https://reviews.llvm.org/D21483 llvm-svn: 275453	2016-07-14 18:28:29 +00:00
Kostya Serebryany	dd5c7f9313	[sanitizer-coverage] make sure that calls to __sanitizer_cov_trace_pc are not merged (otherwise different calls get the same PC and confuse fuzzers) llvm-svn: 275449	2016-07-14 17:59:01 +00:00
Nirav Dave	a6c7595d0f	[X86][MC] Fix bracket expression parsing in intel-style assembly. Only perform struct field check on Identifier tokens. Fixes PR28547. Reviewers: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D22361 llvm-svn: 275445	2016-07-14 17:37:05 +00:00
Saleem Abdulrasool	0233cc55de	X86: handle external tail calls in Windows JIT If there was a tail call, we would incorrectly handle the relocation. It would end up indexing into the array with an incorrect section id. The symbol was external to the module, so the Section ID was UNDEFINED (-1). We would then index the SmallVector with this ID, triggering an assertion. Use the Value rather than the section load address in this case. llvm-svn: 275442	2016-07-14 17:27:06 +00:00
Sanjay Patel	2996a342f3	auto-generate checks Note: I removed the checks after each jump because that's noise, but we apparently need branches rather than returning i1 to see the bt codegen in some cases. llvm-svn: 275439	2016-07-14 17:07:55 +00:00
Tim Northover	6003fb58df	ARM: fix vmov.i64 immediate validity check Typo meant we were only checking the low byte (repeatedly). llvm-svn: 275437	2016-07-14 17:04:34 +00:00
Tom Stellard	1b5cf6217e	GlobalsAA: Functions with the argmemonly attribute won't read arbitrary globals Summary: In preparation for changing GlobalsAA to stop assuming that intrinsics can't read arbitrary globals, we need to make sure GlobalsAA is querying function attributes rather than relying on this assumption. This patch was inspired by: http://reviews.llvm.org/D20206 Reviewers: jmolloy, hfinkel Subscribers: eli.friedman, llvm-commits Differential Revision: https://reviews.llvm.org/D21318 llvm-svn: 275433	2016-07-14 15:50:27 +00:00
Nico Weber	5bb284226b	Don't optimize movs to pushes in -O0 builds. https://reviews.llvm.org/D22362 llvm-svn: 275431	2016-07-14 15:40:22 +00:00
Ahmed Bougacha	85dc93c56b	[X86] Decode MPX BND registers. We were able to assemble, but not disassemble. Note that fixupRMValue was truncating EA_REG_BND0-3 because we hit the uint8_t max. The control registers were already squarely above it, but I don't think they ever go in .r/m, only in .reg. I also did notice an extra REX.W in our encoding, but I think that's fine. llvm-svn: 275427	2016-07-14 14:53:21 +00:00
Sam Kolton	7a2a323feb	[AMDGPU] Assembler: fix row_bcast parsing Summary: This change fix bug 28538 Reviewers: tstellarAMD, vpykhtin Subscribers: arsenm, kzhuravl Differential Revision: https://reviews.llvm.org/D22355 llvm-svn: 275422	2016-07-14 14:50:35 +00:00
Nico Weber	3afaf16abc	Revert r275411, it cause PR28552. llvm-svn: 275421	2016-07-14 14:49:35 +00:00
Nico Weber	755cd760cd	Revert r275401, it caused PR28551. llvm-svn: 275420	2016-07-14 14:41:25 +00:00
Matthew Simpson	3c3b4a257b	[LV] Avoid unnecessary IV scalar-to-vector-to-scalar conversions This patch prevents increases in the number of instructions, pre-instcombine, due to induction variable scalarization. An increase in instructions can lead to an increase in the compile-time required to simplify the induction variables. We now maintain a new map for scalarized induction variables to prevent us from converting between the scalar and vector forms. This patch should resolve compile-time regressions seen after r274627. llvm-svn: 275419	2016-07-14 14:36:06 +00:00
Nico Weber	ecdf45b1e6	Teach fast isel calls and rets about stdcall. stdcall is callee-pop like thiscall, so the thiscall changes already did most of the work for this. This change only opts stdcall in and adds tests. llvm-svn: 275414	2016-07-14 13:54:26 +00:00
Simon Pilgrim	bed37ccd54	[X86][AVX] Added an additional vperm2f128 memory folding test llvm-svn: 275413	2016-07-14 13:40:53 +00:00
Simon Pilgrim	3ecb6bdd5f	[X86][AVX2] Allow VPERMPD/VPERMQ shuffles to call combineShuffle This improves the situation discussed in D19228 where we were forcing VPERMPD/VPERMQ where VPERM2F128/VPERM2I128 would have been better. llvm-svn: 275411	2016-07-14 13:28:43 +00:00
Daniel Sanders	46fe6550ac	[mips] SelectionDAGISel subclasses now follow the optimization level. Summary: It was recently discovered that, for Mips's SelectionDAGISel subclasses, all optimization levels caused SelectionDAGISel to behave like -O2. This change adds the necessary plumbing to initialize the optimization level. Reviewers: andrew.w.kaylor Subscribers: andrew.w.kaylor, sdardis, dean, llvm-commits, vradosavljevic, petarj, qcolombet, probinson, dsanders Differential Revision: https://reviews.llvm.org/D14900 llvm-svn: 275410	2016-07-14 13:25:22 +00:00
Simon Pilgrim	053d32906f	[X86][AVX] Add support for narrowing 128-bit+ shuffle mask elements to 64-bits to allow combining Primarily this is to allow blend with zero instead of having to use vperm2f128, but we can use this in the future to deal with AVX512 cases where we need to keep the original element size to correctly fold masked operations. llvm-svn: 275406	2016-07-14 12:58:04 +00:00
Sjoerd Meijer	716abbb2f5	This converts a signed remainder instruction to unsigned remainder, which enables the code size optimisation to fold a rem and div into a single aeabi_uidivmod call. This was not happening before because sdiv was converted but srem not, and instructions with different signedness are not combined. Differential Revision: http://reviews.llvm.org/D22214 llvm-svn: 275403	2016-07-14 12:23:48 +00:00
Simon Pilgrim	700e4a1ab8	[X86][AVX] Add 128-bit wide shuffle tests that should combine to blend-with-zero llvm-svn: 275402	2016-07-14 12:21:40 +00:00
Sebastian Pop	63847d04e7	code hoisting pass based on GVN This pass hoists duplicated computations in the program. The primary goal of gvn-hoist is to reduce the size of functions before inline heuristics to reduce the total cost of function inlining. Pass written by Sebastian Pop, Aditya Kumar, Xiaoyu Hu, and Brian Rzycki. Important algorithmic contributions by Daniel Berlin under the form of reviews. Differential Revision: http://reviews.llvm.org/D19338 llvm-svn: 275401	2016-07-14 12:18:53 +00:00
Simon Pilgrim	a76a8e50e5	[X86][AVX] Add VBROADCASTF128/VBROADCASTI128 shuffle comments support llvm-svn: 275400	2016-07-14 12:07:43 +00:00
Simon Pilgrim	9e812169cc	[X86][AVX] Regenerate broadcast upgrade tests llvm-svn: 275398	2016-07-14 11:05:43 +00:00
Sjoerd Meijer	38c2cd0c14	This implements a more optimal algorithm for selecting a base constant in constant hoisting. It not only takes into account the number of uses and the cost of expressions in which constants appear, but now also the resulting integer range of the offsets. Thus, the algorithm maximizes the number of uses within an integer range that will enable more efficient code generation. On ARM, for example, this will enable code size optimisations because less negative offsets will be created. Negative offsets/immediates are not supported by Thumb1 thus preventing more compact instruction encoding. Differential Revision: http://reviews.llvm.org/D21183 llvm-svn: 275382	2016-07-14 07:44:20 +00:00
David Majnemer	666aa945a5	[InstCombine] Masked loads with undef masks can fold to normal loads We were able to fold masked loads with an all-ones mask to a normal load. However, we couldn't turn a masked load with a mask with mixed ones and undefs into a normal load. llvm-svn: 275380	2016-07-14 06:58:42 +00:00
David Majnemer	17a95aaa7b	Simplify llvm.masked.load w/ undef masks We can always pick the passthru value if the mask is undef: we are permitted to treat the mask as-if it were filled with zeros. llvm-svn: 275379	2016-07-14 06:58:37 +00:00
Eli Friedman	17e8ea18e9	[X86] Fix stupid typo in isel lowering. Apparently someone miscounted the number of zeros in the immediate. Fixes https://llvm.org/bugs/show_bug.cgi?id=28544 . llvm-svn: 275376	2016-07-14 05:48:25 +00:00
Matt Arsenault	ca7f5701f8	AMDGPU/R600: Delete/rename intrinsics no longer used by mesa Use the replacement pass to update the tests, and delete old names. llvm-svn: 275375	2016-07-14 05:47:17 +00:00
Matt Arsenault	897eee4187	AMDGPU: Remove unused intrinsics llvm-svn: 275371	2016-07-14 05:23:19 +00:00
Matt Arsenault	aa94c1e7ee	AMDGPU: Fix test not actually testing anything It wasn't actually running the pass, and since it is missing the llvm prefix, the eh intrinsic was not really an IntrinsicInst. Also add missing test for lifetime markers. llvm-svn: 275370	2016-07-14 05:23:15 +00:00
Dean Michael Berris	52735fc435	XRay: Add entry and exit sleds Summary: In this patch we implement the following parts of XRay: - Supporting a function attribute named 'function-instrument' which currently only supports 'xray-always'. We should be able to use this attribute for other instrumentation approaches. - Supporting a function attribute named 'xray-instruction-threshold' used to determine whether a function is instrumented with a minimum number of instructions (IR instruction counts). - X86-specific nop sleds as described in the white paper. - A machine function pass that adds the different instrumentation marker instructions at a very late stage. - A way of identifying which return opcode is considered "normal" for each architecture. There are some caveats here: 1) We don't handle PATCHABLE_RET in platforms other than x86_64 yet -- this means if IR used PATCHABLE_RET directly instead of a normal ret, instruction lowering for that platform might do the wrong thing. We think this should be handled at instruction selection time to by default be unpacked for platforms where XRay is not availble yet. 2) The generated section for X86 is different from what is described from the white paper for the sole reason that LLVM allows us to do this neatly. We're taking the opportunity to deviate from the white paper from this perspective to allow us to get richer information from the runtime library. Reviewers: sanjoy, eugenis, kcc, pcc, echristo, rnk Subscribers: niravd, majnemer, atrick, rnk, emaste, bmakam, mcrosier, mehdi_amini, llvm-commits Differential Revision: http://reviews.llvm.org/D19904 llvm-svn: 275367	2016-07-14 04:06:33 +00:00
Davide Italiano	7dac027ed7	[IPSCCP] Constant fold struct argument/instructions when all the lattice values are constant. This now should also work with the interprocedural variant of the pass. Slightly easier now that the yak is shaved. Differential Revision: http://reviews.llvm.org/D22329 llvm-svn: 275363	2016-07-14 02:51:41 +00:00
Nico Weber	af7e8465e1	Teach fast isel about thiscall (and callee-pop) calls. http://reviews.llvm.org/D22315 llvm-svn: 275360	2016-07-14 01:52:51 +00:00
Mehdi Amini	8484f92f7f	[Scalarizer] PR28108: Skip over nullptr rather than crashing on it. Summary: In Scalarizer::gather we see if we already have a scattered form of Op, and in that case use the new form. In the particular case of PR28108, the found ValueVector SV has size 2, where the first Value is nullptr, and the second is indeed a proper Value. The nullptr then caused an assert to blow when we tried to do cast<Instruction>(SV[I]). With this patch we check SV[I] before doing the cast, and if it's nullptr we just skip over it. I don't know the Scalarizer well enough to know if this is the best fix or if something should be done else where to prevent the nullptr from being in the ValueVector at all, but at least this avoids the crash and looking at the test case output it looks reasonable. Reviewers: hfinkel, frasercrmck, wala, mehdi_amini Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21518 llvm-svn: 275359	2016-07-14 01:31:25 +00:00
Mehdi Amini	9e332a7719	Add missing test for r275347 "[IPRA] Set callee saved registers to none for local function when IPRA is enabled." llvm-svn: 275358	2016-07-14 01:31:20 +00:00
Adrian Prantl	0418ef2691	Synchronize LLVM and clang's ObjCDeclSpec::ObjCPropertyAttributeKind. This adds Clang-specific DWARF constants for nullability and ObjC class properties that are already generated by clang. This patch adds dwarfdump support and a more comprehensive testcase. <rdar://problem/27335745> llvm-svn: 275354	2016-07-14 00:41:18 +00:00
David Majnemer	7f781aba97	[ConstantFolding] Fold masked loads We can constant fold a masked load if the operands are appropriately constant. Differential Revision: http://reviews.llvm.org/D22324 llvm-svn: 275352	2016-07-14 00:29:50 +00:00
David Majnemer	f89660aba7	[ConstantFolding] Extend FoldReinterpretLoadFromConstPtr to handle negative offsets Treat loads which clip before the start of a global initializer the same way we treat clipping beyond the end of the initializer: use zeros. llvm-svn: 275345	2016-07-13 23:33:07 +00:00
Michael Kuperstein	be837fa40f	[DAG] Correctly chain masked loads If a masked loads is not added to the chain, it should not reset the chain's root. This fixes the remaining part of PR28515. llvm-svn: 275340	2016-07-13 23:23:40 +00:00
Quentin Colombet	68a84587c5	[MIR] Fix one GlobalISel test case that I missed in r275314. llvm-svn: 275333	2016-07-13 22:35:33 +00:00
Nico Weber	b888555bcc	Add a triple to fix test on bots after 275320. llvm-svn: 275327	2016-07-13 22:19:40 +00:00
Nico Weber	eb9488b151	Fix a TODO in X86CallFrameOptimization to not rely on a codegen artifact. This happens to make X86CallFrameOptimization in -O0 / FastISel builds as well, but it's not clear if the pass should run in that setup. http://reviews.llvm.org/D22314 llvm-svn: 275320	2016-07-13 21:38:27 +00:00
Alina Sbirlea	640a61cd8b	Extended LoadStoreVectorizer to vectorize subchains. Summary: LSV used to abort vectorizing a chain for interleaved load/store accesses that alias. Allow a valid prefix of the chain to be vectorized, mark just the prefix and retry vectorizing the remaining chain. Reviewers: llvm-commits, jlebar, arsenm Subscribers: mzolotukhin Differential Revision: http://reviews.llvm.org/D22119 llvm-svn: 275317	2016-07-13 21:20:01 +00:00
Quentin Colombet	545e558b82	[MIR] Print on the given output instead of stderr. Currently the MIR framework prints all its outputs (errors and actual representation) on stderr. This patch fixes that by printing the regular output in the output specified with -o. Differential Revision: http://reviews.llvm.org/D22251 llvm-svn: 275314	2016-07-13 20:36:03 +00:00
Matt Arsenault	f071102647	AMDGPU: Remove last AMDIL intrinsics llvm-svn: 275309	2016-07-13 19:42:06 +00:00
Andrew Kaylor	346dd7f1bd	Reverting r275284 due to platform-specific test failures llvm-svn: 275304	2016-07-13 19:09:16 +00:00
Sanjay Patel	eff2aa70fc	add more tests for zexty xor sandwiches ...mmm sandwiches llvm-svn: 275302	2016-07-13 18:58:55 +00:00
Simon Pilgrim	5d664af3c3	[X86][SSE] Regenerate truncated shift test Check SSE2 and AVX2 implementations llvm-svn: 275300	2016-07-13 18:50:10 +00:00
Simon Pilgrim	631643e7d9	Regenerate test llvm-svn: 275299	2016-07-13 18:46:37 +00:00
Sanjay Patel	904a88025a	add test for zexty xor sandwich llvm-svn: 275297	2016-07-13 18:40:38 +00:00
Krzysztof Parzyszek	cb4dd7656b	Move mempcpy_call.ll to X86 subdirectory llvm-svn: 275294	2016-07-13 18:28:45 +00:00
Sanjay Patel	c00e48a3db	[InstCombine] extend vector select matching for non-splat constants In D21740, we discussed trying to make this a more general matcher. However, I didn't see a clean way to handle the regular m_Not cases and these non-splat vector patterns, so I've opted for the direct approach here. If there are other potential uses of areInverseVectorBitmasks(), we could move that helper function to a higher level. There is an open question as to which is of these forms should be considered the canonical IR: %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i32> %a, <4 x i32> %b %shuf = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 0, i32 5, i32 6, i32 3> Differential Revision: http://reviews.llvm.org/D22114 llvm-svn: 275289	2016-07-13 18:07:02 +00:00
Andrew Kaylor	12cccdd731	Fix for Bug 26903, adds support to inline __builtin_mempcpy Patch by Sunita Marathe Differential Revision: http://reviews.llvm.org/D21920 llvm-svn: 275284	2016-07-13 17:25:11 +00:00
Matthias Braun	512424f28a	PatchableFunction: Skip pseudos that do not create code This fixes http://llvm.org/PR28524 llvm-svn: 275278	2016-07-13 16:37:29 +00:00
Teresa Johnson	b907d06151	[ThinLTO/gold] Enable symbol resolution in distributed backend case While testing a follow-on change to enable index-based symbol resolution and internalization in the distributed backends, I realized that a test case change I made in r275247 was only required because we were not analyzing symbols in the claimed files in thinlto-index-only mode. In the fixed test case there should be no internalization because we are linking in -shared mode, so f() is in fact exported, which is detected properly when we analyze symbols in thinlto-index-only mode. Note that this is not (yet) a correctness issue (because we are not yet performing the index-based linkage optimizations in the distributed backends - that's coming in a follow-on patch). llvm-svn: 275277	2016-07-13 16:35:56 +00:00
Sanjay Patel	610a2f6525	[x86][SSE/AVX] optimize pcmp results better (PR28484) We know that pcmp produces all-ones/all-zeros bitmasks, so we can use that behavior to avoid unnecessary constant loading. One could argue that load+and is actually a better solution for some CPUs (Intel big cores) because shifts don't have the same throughput potential as load+and on those cores, but that should be handled as a CPU-specific later transformation if it ever comes up. Removing the load is the more general x86 optimization. Note that the uneven usage of vpbroadcast in the test cases is filed as PR28505: https://llvm.org/bugs/show_bug.cgi?id=28505 Differential Revision: http://reviews.llvm.org/D22225 llvm-svn: 275276	2016-07-13 16:04:07 +00:00
Simon Pilgrim	a99368fa35	[X86][AVX512] Add support for VPERMILPD/VPERMILPS variable shuffle mask comments llvm-svn: 275272	2016-07-13 15:45:36 +00:00
Simon Pilgrim	48d8340760	[X86][AVX] Add support for target shuffle combining to VPERMILPS variable shuffle mask Added AVX512F VPERMILPS shuffle decoding support llvm-svn: 275270	2016-07-13 15:10:43 +00:00
Tom Stellard	418beb7671	AMDGPU/SI: Add support for R_AMDGPU_GOTPCREL Reviewers: rafael, ruiu, tony-tye, arsenm, kzhuravl Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21484 llvm-svn: 275268	2016-07-13 14:23:33 +00:00
Nirav Dave	8ea792db60	[MC] Fix lexing ordering in assembly label parsing to preserve same line comment placement. llvm-svn: 275265	2016-07-13 14:03:12 +00:00
Matt Arsenault	0056868c4a	AMDGPU: Fold out no-op kill intrinsics llvm-svn: 275253	2016-07-13 06:04:22 +00:00
David Majnemer	1b3db33e3d	[ConstantFolding] Don't treat negative GEP offsets as positive GEP offsets are signed, don't treat them as huge positive numbers. llvm-svn: 275251	2016-07-13 05:16:16 +00:00
Adam Nemet	c2f791d8a7	[BFI] Add new LazyBFI analysis pass Summary: This is necessary for D21771. In order to add the hotness attribute to optimization remarks we need BFI to be available in all passes that emit optimization remarks. However we don't want to pay for computing BFI unless the hotness attribute is requested. This is achieved by making BFI lazy at the very high-level through a new analysis pass -- BFI is not calculated unless requested. I am adding a test to check the laziness under D21771 where the first user of the analysis is added. Reviewers: hfinkel, dexonsmith, davidxl Subscribers: davidxl, dexonsmith, llvm-commits Differential Revision: http://reviews.llvm.org/D22141 llvm-svn: 275250	2016-07-13 05:01:48 +00:00
Teresa Johnson	27694571b1	[ThinLTO/gold] ThinLTO internalization fixes Internalization was missing cases where we originally had a local symbol that was promoted eagerly but not actually exported. This is because we were only internalizing the set of global (non-local) symbols that were PREVAILAING_DEF_IRONLY. Instead, collect the set of global symbols that are referenced outside of a single IR file, and skip internalization for those. llvm-svn: 275247	2016-07-13 03:42:41 +00:00
David Majnemer	a7b6c973e5	[ConstantFold] Don't incorrectly infer inbounds on array GEP The many levels of nesting inside the responsible code made it easy for bugs to sneak in. Flattening the logic makes it easier to see what's going on. llvm-svn: 275244	2016-07-13 03:24:41 +00:00
Keno Fischer	1efc3b70c5	Fix ScalarEvolutionExpander step scaling bug The expandAddRecExprLiterally function incorrectly transforms `[Start + Step * X]` into `Step * [Start + X]` instead of the correct transform of `[Step * X] + Start`. This caused https://github.com/JuliaLang/julia/issues/14704#issuecomment-174126219 due to what appeared to be sufficiently complicated loop interactions. Patch by Jameson Nash (jameson@juliacomputing.com). Reviewers: sanjoy Differential Revision: http://reviews.llvm.org/D16505 llvm-svn: 275239	2016-07-13 01:28:12 +00:00
Dehao Chen	9cba1f4e7e	New pass manager for LICM. Summary: Port LICM to the new pass manager. Reviewers: davidxl, silvas Subscribers: krasin, vitalybuka, silvas, davide, sanjoy, llvm-commits, mehdi_amini Differential Revision: http://reviews.llvm.org/D21772 llvm-svn: 275222	2016-07-12 22:37:48 +00:00
Tim Northover	72eebfa4b0	GlobalISel: freeze reserved regs after IRTranslator. We can freeze the registers after the MachineFrameInfo has been configured (by telling it about calls, inline asm, ...). This doesn't happen at all yet, but will be part of IR translation. Fixes -verify-machineinstrs assertion. llvm-svn: 275221	2016-07-12 22:23:42 +00:00
Matt Arsenault	786724a22e	AMDGPU: Follow up to r275203 I meant to squash this into it. llvm-svn: 275220	2016-07-12 21:41:32 +00:00
Nemanja Ivanovic	f0407e3902	The test case I added is PowerPC specific but I accidentally had it in the wrong directory. Moved it to CodeGen/PowerPC. Sorry about the noise. llvm-svn: 275218	2016-07-12 21:24:08 +00:00
Michael Kuperstein	a99c46cc73	[LV] Remove wrong assumption about LCSSA The LCSSA pass itself will not generate several redundant PHI nodes in a single exit block. However, such redundant PHI nodes don't violate LCSSA form, and may be introduced by passes that preserve LCSSA, and/or preserved by the LCSSA pass itself. So, assuming a single PHI node per exit block is not safe. llvm-svn: 275217	2016-07-12 21:24:06 +00:00
Nemanja Ivanovic	b43bb6141e	[Power9] Add codegen for VSX word insert/extract instructions This patch corresponds to review: http://reviews.llvm.org/D20239 It adds exploitation of XXINSERTW and XXEXTRACTUW instructions that are useful in some cases for inserting and extracting vector elements of v4[if]32 vectors. llvm-svn: 275215	2016-07-12 21:00:10 +00:00
Piotr Padlewski	fa0cdb371b	Review fixes to lit documentation Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22245 llvm-svn: 275214	2016-07-12 20:59:17 +00:00
Simon Pilgrim	6fa71da4a4	[X86][AVX] Add support for target shuffle combining to VPERM2F128/VPERM2I128 llvm-svn: 275212	2016-07-12 20:27:32 +00:00
Davide Italiano	0080269342	[SCCP] Constant fold structs if all the lattice value are constant. Differential Revision: http://reviews.llvm.org/D22269 llvm-svn: 275208	2016-07-12 19:54:19 +00:00
Matthias Braun	96ec47db74	X86FixupBWInsts: No need for forward liveness analysis. With r274952 and r275201 in place there are no cases left where a forward liveness analysis yields different results than a backward one. So we can remove the forward stepping logic. Differential Revision: http://reviews.llvm.org/D22083 llvm-svn: 275204	2016-07-12 19:04:30 +00:00
Matt Arsenault	657f871a4e	AMDGPU: Fix verifier error with kill intrinsic Don't create a terminator in the middle of the block. We should probably get rid of this intrinsic. llvm-svn: 275203	2016-07-12 19:01:23 +00:00
Dehao Chen	b9f8e29290	[PM] Port LoopIdiomRecognize Pass to new PM Summary: Port LoopIdiomRecognize Pass to new PM Reviewers: davidxl Subscribers: davide, sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D22250 llvm-svn: 275202	2016-07-12 18:45:51 +00:00
Wei Ding	5b2636a152	AMDGPU: Add LLVM IR Intrinsic for v_lerp_u8 Differential Revision: http://reviews.llvm.org/D22239 llvm-svn: 275197	2016-07-12 18:02:14 +00:00
Xinliang David Li	9eb472ba4b	[PGO] Don't include full file path in static function profile counter names Patch by Jake VanAdrighem Differential Revision: http://reviews.llvm.org/D22028 llvm-svn: 275193	2016-07-12 17:14:51 +00:00
Sanjay Patel	4a6a751dce	add tests for missing DeMorgan's Law folds llvm-svn: 275192	2016-07-12 17:05:04 +00:00
Sanjay Patel	3900191ecc	auto-generate checks llvm-svn: 275188	2016-07-12 16:21:55 +00:00
Sanjay Patel	93dffe629a	auto-generate checks llvm-svn: 275187	2016-07-12 16:17:30 +00:00
Sanjay Patel	6d1f227e6b	auto-generate checks llvm-svn: 275186	2016-07-12 16:13:04 +00:00
Haicheng Wu	711ca868fc	[AArch64] Set FMOVS0 and FMOVD0 as isAsCheapAsAMove when needed. If a subtarget has both ZCZeroing and CustomCheapAsMoveHandling features (now only Kryo has both), set FMOVS0 and FMOVD0 isAsCheapAsAMove. Differential Revision: http://reviews.llvm.org/D22256 llvm-svn: 275178	2016-07-12 15:31:41 +00:00
Nemanja Ivanovic	eebbcb6d57	[PowerPC] Cannonicalize applicable vector shift immediates as swaps This patch corresponds to review: http://reviews.llvm.org/D21358 Vector shifts that have the same semantics as a vector swap are cannonicalized as such to provide additional opportunities for swap removal optimization to remove unnecessary swaps. llvm-svn: 275168	2016-07-12 12:16:27 +00:00
Amjad Aboud	acee568545	[codeview] Improved array type support. Added support for: 1. Multi dimension array. 2. Array of structure type, which previously was declared incompletely. 3. Dynamic size array. 4. Array where element type is a typedef, volatile or constant (this should resolve PR28311). Differential Revision: http://reviews.llvm.org/D21526 llvm-svn: 275167	2016-07-12 12:06:34 +00:00
Nicolai Haehnle	7968c34586	AMDGPU: Unify MOVRELSOffset and MOVRELDOffset Summary: Previously, constant index insertelements would be turned into SI_INDIRECT_DST, which is bound to prevent some optimization opportunities. Worse, it mislead the heuristic that decides whether immediates should be lowered to S_MOV_B32 or V_MOV_B32 in a way that resulted in unnecessary v_readfirstlanes. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D22217 llvm-svn: 275160	2016-07-12 08:12:16 +00:00
Vitaly Buka	204dc533c5	Revert "New pass manager for LICM." Summary: This reverts commit r275118. Subscribers: sanjoy, mehdi_amini Differential Revision: http://reviews.llvm.org/D22259 llvm-svn: 275156	2016-07-12 06:25:32 +00:00
Craig Topper	a6e6febe2c	[AVX512] Remove masked logic op intrinsics and autoupgrade them to native IR. llvm-svn: 275155	2016-07-12 05:27:53 +00:00
Ivan Krasin	5474645dc8	Print remarks from WholeProgramDevirt pass for each call site. Summary: It's useful to have some visibility about which call sites are devirtualized, especially for debug purposes. Another use case is a regression test on the application side (like, Chromium). Reviewers: pcc Differential Revision: http://reviews.llvm.org/D22252 llvm-svn: 275145	2016-07-12 02:38:37 +00:00
NAKAMURA Takumi	e92e2124f6	llvm/test/CodeGen/AMDGPU/selected-stack-object.ll REQUIRES +Asserts, since it expects assertion failure. llvm-svn: 275144	2016-07-12 02:18:09 +00:00
Haicheng Wu	1e39574e9f	[Kryo] Enable ZCZeroing feature This feature uses immediate #0 to zero a register. Differential Revision: http://reviews.llvm.org/D19985 llvm-svn: 275143	2016-07-12 02:04:01 +00:00
Nico Weber	c7bf646a99	Teach FastISel about thiscall (and, hence, about callee-pop). http://reviews.llvm.org/D22115 llvm-svn: 275135	2016-07-12 01:30:35 +00:00
Matt Arsenault	45f8216cee	AMDGPU: Remove superfluous string attributes from tests Also fix v_mac.ll not testing right thing for fneg llvm-svn: 275129	2016-07-11 23:35:48 +00:00
Mehdi Amini	e75aa6f674	Add a libLTO API to query a memory buffer and check if it contains ObjC categories The linker supports a feature to force load an object from a static archive if it defines an Objective-C category. This API supports this feature by looking at every section in the module to find if a category is defined in the module. llvm-svn: 275125	2016-07-11 23:10:18 +00:00
Dehao Chen	7ef5820fa3	New pass manager for LICM. Summary: Port LICM to the new pass manager. Reviewers: davidxl, silvas Subscribers: silvas, davide, sanjoy, llvm-commits, mehdi_amini Differential Revision: http://reviews.llvm.org/D21772 llvm-svn: 275118	2016-07-11 22:45:24 +00:00
Alina Sbirlea	cbc6ac2afd	Correct ordering of loads/stores. Summary: Aiming to correct the ordering of loads/stores. This patch changes the insert point for loads to the position of the first load. It updates the ordering method for loads to insert before, rather than after. Before this patch the following sequence: "load a[1], store a[1], store a[0], load a[2]" Would incorrectly vectorize to "store a[0,1], load a[1,2]". The correctness check was assuming the insertion point for loads is at the position of the first load, when in practice it was at the last load. An alternative fix would have been to invert the correctness check. The current fix changes insert position but also requires reordering of instructions before the vectorized load. Updated testcases to reflect the changes. Reviewers: tstellarAMD, llvm-commits, jlebar, arsenm Subscribers: mzolotukhin Differential Revision: http://reviews.llvm.org/D22071 llvm-svn: 275117	2016-07-11 22:34:29 +00:00
Tim Northover	3e0361710a	ARM: validate immediate branch targets in AsmParser. Immediate branch targets aren't commonly used, but if they are we should make sure they can actually be encoded. This means they must be divisible by 2 when targeting Thumb mode, and by 4 when targeting ARM mode. Also do a little naming cleanup while I was changing everything around anyway. llvm-svn: 275116	2016-07-11 22:29:37 +00:00
Nicolai Haehnle	c06bfa1daa	AMDGPU: Treat texture gather instructions more like other MIMG instructions Summary: Setting MIMG to 0 has a bunch of unexpected side effects, including that isVMEM returns false which leads to incorrect treatment in the hazard recognizer. The reason I noticed it is that it also leads to incorrect treatment in VGPR-to-SGPR copies, which is one cause of the referenced bug. The only reason why MIMG was set to 0 is to signal the special handling of dmasks, but that can be checked differently. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96877 Reviewers: arsenm, tstellarAMD Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D22210 llvm-svn: 275113	2016-07-11 21:59:43 +00:00
Zachary Turner	dbeaea7b35	Refactor the PDB writing to use a builder approach llvm-svn: 275110	2016-07-11 21:45:26 +00:00
Zachary Turner	f6b9382467	[pdb] Add a pdb2yaml option to not dump file headers. This will be useful once we start adding the ability to dump type records and symbol records, since it will allow us to generate mergeable information instead of information that specifies an entire file. llvm-svn: 275109	2016-07-11 21:45:09 +00:00
Nicolai Haehnle	f52c3cf272	AMDGPU: fix local stack slot allocation bugs Summary: The main bug fix here is using the 32-bit encoding of V_ADD_I32 in materializeFrameBaseRegister and resolveFrameIndex, so that arbitrary immediates work. The second part is that we may now require the SegmentWaveByteOffset even when there are initially no stack objects and VGPR spilling isn't enabled, for stack slots that are allocated later. This means that some bits become effectively dead and can be cleaned up. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96602 Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21551 llvm-svn: 275108	2016-07-11 21:44:40 +00:00
Michael Kuperstein	f0c59330e9	[X86] Make some cast costs more precise Make some AVX and AVX512 cast costs more precise. Based on part of a patch by Elena Demikhovsky (D15604). Differential Revision: http://reviews.llvm.org/D22064 llvm-svn: 275106	2016-07-11 21:39:44 +00:00
Quentin Colombet	fb82c7bc94	[X86] Fix tailcall return address clobber bug. This bug (llvm.org/PR28124) was introduced by r237977, which refactored the tail call sequence to be generated in two passes instead of one. Unfortunately, the stack adjustment produced by the first pass was not recognized by X86FrameLowering::mergeSPUpdates() in all cases, causing code such as the following, which clobbers the return address, to be generated: popl %edi popl %edi pushl %eax jmp tailcallee # TAILCALL To fix the problem, the entire stack adjustment is performed in X86ExpandPseudo::ExpandMI() for tail calls. Patch by Magnus Lång <margnus1@gmail.com> Differential Revision: http://reviews.llvm.org/D21325 llvm-svn: 275103	2016-07-11 21:03:03 +00:00
Alina Sbirlea	327955e057	Add TLI.allowsMisalignedMemoryAccesses to LoadStoreVectorizer Summary: Extend TTI to access TLI.allowsMisalignedMemoryAccesses(). Check condition when vectorizing load and store chains. Add additional parameters: AddressSpace, Alignment, Fast. Reviewers: llvm-commits, jlebar Subscribers: arsenm, mzolotukhin Differential Revision: http://reviews.llvm.org/D21935 llvm-svn: 275100	2016-07-11 20:46:17 +00:00
Michael Kuperstein	cfbac5f361	[X86] Disable FixupSetCC for CodeGenOpt::None It is an optimization pass, and should not run at -O0. Especially since Fast RA will not do the required register coalescing anyway, so it's a loss even from the optimization standpoint. This also works around (but doesn't quite fix) PR28489. llvm-svn: 275099	2016-07-11 20:40:44 +00:00
Chad Rosier	4f0dad1674	[IPRA] Properly compute register usage at call sites. Differential Revision: http://reviews.llvm.org/D21395 Patch by Vivek Pandya. PR28144 llvm-svn: 275087	2016-07-11 18:45:49 +00:00
Zhan Jun Liau	def708a0f9	[SystemZ] Recognize Load On Condition Immediate (LOCHI/LOGHI) opportunities Summary: Add support for the z13 instructions LOCHI and LOCGHI which conditionally load immediate values. Add target instruction info hooks so that if conversion will allow predication of LHI/LGHI. Author: RolandF Reviewers: uweigand Subscribers: zhanjunl Commiting on behalf of Roland. Differential Revision: http://reviews.llvm.org/D22117 llvm-svn: 275086	2016-07-11 18:45:03 +00:00
Jingyue Wu	641cfee976	[SLSR] Call getPointerSizeInBits with the correct address space. llvm-svn: 275083	2016-07-11 18:13:28 +00:00
Davide Italiano	e8ae0b5eb4	[PM/IPO] Port LowerTypeTests to the new PassManager. There's a little bit of churn in this patch because the initialization mechanism is now shared between the old and the new PM. Other than that, it's just a pretty mechanical translation. llvm-svn: 275082	2016-07-11 18:10:06 +00:00
Jacques Pienaar	c3a162c451	[lanai] Add more tests for assembly of conditional ALU ops llvm-svn: 275081	2016-07-11 17:58:16 +00:00
Dehao Chen	9232f98279	Implement callsite-hotness based inline cost for Sample-based PGO Summary: For sample-based PGO, using BFI to calculate callsite count is sometime not accurate. This is because with sampling based approach, if a callsite resides in a hot loop deeply nested in a bunch of cold branches, the callsite's BFI frequency would be inaccurately calculated due to lack of samples in the cold branch. E.g. if (A1 && A2 && A3 && ..... && A10) { for (i=0; i < 100000000; i++) { callsite(); } } Assume that A1 to A100 are all 100% taken, and callsite has 1000 samples and thus is considerred hot. Because the loop's trip count is huge, it's normal that all branches outside the loop has no sample at all. As a result, we can only use static branch probability to derive the the frequency of the loop header. Assuming that static heuristic thinks each branch is 50% taken, then the count calculated from BFI will be 1/(2^10) of the actual value. In order to get more accurate callsite count, we directly annotate the weight on the call instruction, and directly use it when checking callsite hotness. Note that this mechanism can also be shared by instrumentation based callsite hotness analysis. The side benefit is that it breaks the dependency from Inliner to BFI as call count is embedded in the IR. Reviewers: davidxl, eraman, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22118 llvm-svn: 275073	2016-07-11 16:48:54 +00:00
Dehao Chen	29d2641f52	Tune the weight propagation algorithm for sample profile. Summary: Handle the case when there is only one incoming/outgoing edge for a visited basic block: use the block weight to adjust edge weight even when the edge has been visited before. This can help reduce inaccuracies introduced by incorrect basic block profile, as shown in the updated unittest. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22180 llvm-svn: 275072	2016-07-11 16:40:17 +00:00
Sanjay Patel	8f1d408c74	[x86] make some of the tests 256-bit for testing diversity llvm-svn: 275070	2016-07-11 15:08:37 +00:00
Nirav Dave	8603062ee4	Fix branch relaxation in 16-bit mode. Thread through MCSubtargetInfo to relaxInstruction function allowing relaxation to generate jumps with 16-bit sized immediates in 16-bit mode. This fixes PR22097. Reviewers: dwmw2, tstellarAMD, craig.topper, jyknight Subscribers: jfb, arsenm, jyknight, llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D20830 llvm-svn: 275068	2016-07-11 14:23:53 +00:00
Sanjay Patel	b428951990	[x86] specify triple to avoid bot failures llvm-svn: 275067	2016-07-11 14:17:54 +00:00
Nicolai Haehnle	889a20cf40	[Sink] Don't move calls to readonly functions across stores Summary: Reviewers: hfinkel, majnemer, tstellarAMD, sunfish Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17279 llvm-svn: 275066	2016-07-11 14:11:51 +00:00
Sanjay Patel	0d38830aca	[x86] update checks llvm-svn: 275064	2016-07-11 14:07:31 +00:00
Nirav Dave	53a72f4d3c	Provide support for preserving assembly comments Preserve assembly comments from input in output assembly and flags to toggle property. This is on by default for inline assembly and off in llvm-mc. Parsed comments are emitted immediately before an EOL which generally places them on the expected line. Reviewers: rtrieu, dwmw2, rnk, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20020 llvm-svn: 275058	2016-07-11 12:42:14 +00:00
Artem Tamazov	53c9de08d2	[AMDGPU][llvm-mc] Quickfix for r272748 to enable labels in branch instructions. Fixes issue mentioned at: https://github.com/RadeonOpenCompute/LLVM-AMDGPU-Assembler-Extra/issues/13. Lit tests added. Differential Revision: http://reviews.llvm.org/D22133 llvm-svn: 275054	2016-07-11 12:07:18 +00:00
Zlatko Buljan	cba9f80ba8	[mips][microMIPS] Implement LDC1, SDC1, LDC2, SDC2, LWC1, SWC1, LWC2 and SWC2 instructions and add CodeGen support Differential Revision: http://reviews.llvm.org/D18824 llvm-svn: 275050	2016-07-11 07:41:56 +00:00
Elena Demikhovsky	d84f337953	AVX-512: DAG lowering for scalar MIN/MAX commutable ops DAG lowering was missing for the scalar FMINC, FMAXC nodes. The nodes are generated only in the "unsafe-fp-math" mode. Added tests. llvm-svn: 275048	2016-07-11 06:08:06 +00:00
Craig Topper	7ee070e7bc	[AVX512] Add support for 512-bit ANDN now that all ones build vectors survive long enough to allow the matching. llvm-svn: 275046	2016-07-11 05:36:53 +00:00
Craig Topper	516e14cd8e	[AVX512] Use vpternlog with an immediate of 0xff to create 512-bit all one vectors. llvm-svn: 275045	2016-07-11 05:36:48 +00:00
Hal Finkel	02012bcfee	Revert r275027 - Let FuncAttrs infer the 'returned' argument attribute Reverting r275027 and r275033. These seem to cause miscompiles on the AArch64 buildbot. llvm-svn: 275042	2016-07-11 04:51:23 +00:00
Hal Finkel	2cac58f604	Pointer-comparison folding should look through returned-argument functions For functions which are known to return a specific argument, pointer-comparison folding can look through the function calls as part of its analysis. Differential Revision: http://reviews.llvm.org/D9387 llvm-svn: 275039	2016-07-11 03:37:59 +00:00
Hal Finkel	bf3957a553	Teach isDereferenceablePointer to look through returned-argument functions For functions which are known to return their argument, isDereferenceableAndAlignedPointer can examine the argument value. Differential Revision: http://reviews.llvm.org/D9384 llvm-svn: 275038	2016-07-11 03:08:49 +00:00
Hal Finkel	e186debb8b	Teach SCEV to look through returned-argument functions When building SCEVs, if a function is known to return its argument, then we can build the SCEV using the corresponding argument value. Differential Revision: http://reviews.llvm.org/D9381 llvm-svn: 275037	2016-07-11 02:48:23 +00:00
Hal Finkel	6fd5e1f02b	Teach computeKnownBits to look through returned-argument functions If a function is known to return one of its arguments, we can use that in order to compute known bits of the return value. Differential Revision: http://reviews.llvm.org/D9397 llvm-svn: 275036	2016-07-11 02:25:14 +00:00
Hal Finkel	5c12d8fe8f	BasicAA should look through functions with returned arguments Motivated by the work on the llvm.noalias intrinsic, teach BasicAA to look through returned-argument functions when answering queries. This is essential so that we don't loose all other AA information when supplementing with llvm.noalias. Differential Revision: http://reviews.llvm.org/D9383 llvm-svn: 275035	2016-07-11 01:32:20 +00:00
Hal Finkel	d66a7b05db	Let FuncAttrs infer the 'returned' argument attribute A function can have one argument with the 'returned' attribute, indicating that the associated argument is always the return value of the function. Add FuncAttrs inference logic. Differential Revision: http://reviews.llvm.org/D22202 llvm-svn: 275027	2016-07-10 22:02:55 +00:00
Jan Vesely	2fa28c330c	AMDGPU/R600: Add implicitarg.ptr intrinsic Differential Revision: http://reviews.llvm.org/D21622 llvm-svn: 275024	2016-07-10 21:20:29 +00:00
Simon Pilgrim	2191faa433	[X86][SSE] Add support for target shuffle combining to PSHUFLW/PSHUFHW llvm-svn: 275022	2016-07-10 21:02:47 +00:00
Sanjay Patel	ccd08fc8c4	[x86, SSE, AVX] add tests for icmp+zext (PR28484) Note the inconsistent vpbroadcast generation for AVX2; another bug. llvm-svn: 275020	2016-07-10 20:45:14 +00:00
Simon Pilgrim	51c786bd91	[X86][SSE] Added tests for combining shuffles to PSHUFLW/PSHUFHW llvm-svn: 275019	2016-07-10 20:19:56 +00:00
Marcin Koscielnicki	cf7cc724a7	[SystemZ] Utilize Test Data Class instructions. This adds a new SystemZ-specific intrinsic, llvm.s390.tdc.f(32\|64\|128), which maps straight to the test data class instructions. A new IR pass is added to recognize instructions that can be converted to TDC and perform the necessary replacements. Differential Revision: http://reviews.llvm.org/D21949 llvm-svn: 275016	2016-07-10 14:41:22 +00:00
Craig Topper	0b0954570a	[AVX512] Add support for lowering to 512-bit SHUFPS. llvm-svn: 275011	2016-07-10 05:55:53 +00:00
Sean Silva	db90d4d9c1	[PM] Port LoopVectorize to the new PM. llvm-svn: 275000	2016-07-09 22:56:50 +00:00
Simon Pilgrim	606126e848	[X86][SSE] Add support for target shuffle combining to INSERTPS llvm-svn: 274990	2016-07-09 21:47:55 +00:00
Simon Pilgrim	890b415902	[X86][SSE] Regenerate vector shift tests llvm-svn: 274987	2016-07-09 20:55:20 +00:00
David Majnemer	28c3646f82	[COFF, Dwarf] Don't emit DW_AT_location for dllimported entities There exists no relocation which can describe the address of a dllimported variable: do not try to describe their location. llvm-svn: 274986	2016-07-09 20:47:48 +00:00
Jingyue Wu	debce55ac3	[SLSR] Fix crash on handling 128-bit integers. ConstantInt::getSExtValue may fail on >64-bit integers. Add checks to call getSExtValue only on narrow integers. As a minor aside, simplify slsr-gep.ll to remove unnecessary load instructions. llvm-svn: 274982	2016-07-09 19:13:18 +00:00
Jacques Pienaar	b32a912f72	[lanai] Treat .t as optional in assembly parser for RR operands and add predicate operand to ShiftRR llvm-svn: 274980	2016-07-09 18:26:04 +00:00
Matt Arsenault	c1e6a45f2e	AMDGPU: Merge / reorganize tests llvm-svn: 274972	2016-07-09 08:02:28 +00:00
Matt Arsenault	b2cb5f8105	AMDGPU: Simplify tests with per function subtargets llvm-svn: 274971	2016-07-09 07:55:03 +00:00
Matt Arsenault	dfec5ce032	AMDGPU: Fix fdiv lowering when f32 denormals supported Also fix test not actually using function labels. llvm-svn: 274969	2016-07-09 07:48:11 +00:00
Craig Topper	70610cf7b6	[X86] Remove and autoupgrade 512-bit non-temporal store intrinsics. llvm-svn: 274966	2016-07-09 04:38:27 +00:00
Davide Italiano	92b933a55c	[PM] Port CrossDSOCFI to the new pass manager. llvm-svn: 274962	2016-07-09 03:25:35 +00:00
Davide Italiano	cd96cfd8df	[PM] Port LoopSimplify to the new pass manager. While here move simplifyLoop() function to the new header, as suggested by Chandler in the review. Differential Revision: http://reviews.llvm.org/D21404 llvm-svn: 274959	2016-07-09 03:03:01 +00:00
Matt Arsenault	1322b6f8bb	AMDGPU: Improve offset folding for register indexing llvm-svn: 274954	2016-07-09 01:13:56 +00:00
Matthias Braun	152e7c8b12	VirtRegMap: Replace some identity copies with KILL instructions. An identity COPY like this: %AL = COPY %AL, %EAX<imp-def> has no semantic effect, but encodes liveness information: Further users of %EAX only depend on this instruction even though it does not define the full register. Replace the COPY with a KILL instruction in those cases to maintain this liveness information. (This reverts a small part of r238588 but this time adds a comment explaining why a KILL instruction is useful). llvm-svn: 274952	2016-07-09 00:19:07 +00:00
Piotr Padlewski	7a298c1df0	Added REQUIRES to TestingGuide documentation Reviewers: alexfh, wolfgangp, rengolin Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22172 llvm-svn: 274949	2016-07-08 23:47:29 +00:00
Piotr Padlewski	3b77612839	Add 'thinlto_src_module' md with asserts or -enable-import-metadata Summary: This way the metadata will be only generated when asserts enabled, or when -enable-import-metadata specified FIXED missing colon on requires. Reviewers: tejohnson, eraman, mehdi_amini Subscribers: mehdi_amini, llvm-commits Differential Revision: http://reviews.llvm.org/D22167 llvm-svn: 274947	2016-07-08 23:01:49 +00:00
Piotr Padlewski	d4b792346c	Revert "Add 'thinlto_src_module' md with asserts or -enable-import-metadata" Reverting because of 17463 http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/17463 This reverts commit d20cb431bba2ba43b4c65a8556cff445bfefbb7c. llvm-svn: 274946	2016-07-08 22:55:48 +00:00
Jacques Pienaar	9e70127b0a	[lanai] Update test to use peephole-opt and not peephole-opts llvm-svn: 274945	2016-07-08 22:28:29 +00:00
Anna Thomas	9ad45adfd7	Revert "InstCombine rule to fold truncs whose value is available" This reverts commit r274853. Caused failure in ppcBE build llvm-svn: 274943	2016-07-08 22:15:08 +00:00
David Majnemer	230bbfbeec	[MC, COFF] Permit a variable to be redefined Our assertions in WinCOFFStreamer had unexpected side effects resulting in symbols getting unexpectedly marked as used. This fixes PR28462. llvm-svn: 274941	2016-07-08 21:54:16 +00:00
Piotr Padlewski	d6efefa2b8	Add 'thinlto_src_module' md with asserts or -enable-import-metadata Summary: This way the metadata will be only generated when asserts enabled, or when -enable-import-metadata specified Reviewers: tejohnson, eraman, mehdi_amini Subscribers: mehdi_amini, llvm-commits Differential Revision: http://reviews.llvm.org/D22167 llvm-svn: 274938	2016-07-08 21:25:39 +00:00
Matt Arsenault	3fb8f9eabf	Reapply r274829 with fix for FP vectors llvm-svn: 274937	2016-07-08 21:25:33 +00:00
Adam Nemet	f836067cc0	[LAA] Port test to the new PM This is a follow-on to r274452. The LAA with the new PM is a loop pass so we go from inner to outer loops. Also using a CHECK-NOT didn't make much sense because we print something in either case; whether an invariant is 'found' or 'not found'. llvm-svn: 274935	2016-07-08 21:24:06 +00:00
Sanjay Patel	664514f7fe	[InstCombine] don't form select from bitcasted logic ops if bitcasts have >1 use This isn't a sure thing (are 2 extra bitcasts less expensive than a logic op?), but we'll try to err on the conservative side by going with the case that has less IR instructions. Note: This question came up in http://reviews.llvm.org/D22114 , but this part is independent of that patch proposal, so I'm making this small change ahead of that one. See also: http://reviews.llvm.org/rL274926 llvm-svn: 274932	2016-07-08 21:17:51 +00:00
Sanjay Patel	5246482c7a	add another multi-use test for logic->select transform llvm-svn: 274929	2016-07-08 21:08:16 +00:00
Sanjay Patel	f4a08ede03	[InstCombine] don't form select from logic ops if it's unlikely that we'll eliminate any ops llvm-svn: 274926	2016-07-08 20:53:29 +00:00
Sanjay Patel	297a0e67b6	adjust test so it won't completely optimize away llvm-svn: 274925	2016-07-08 20:35:53 +00:00
Sanjay Patel	0733e6b61c	add tests for multi-use folding to select llvm-svn: 274922	2016-07-08 20:22:27 +00:00
Dehao Chen	429f5c735f	Remove inline hints computation from SampleProfile.cpp Summary: As we will move to use uniformed hotness check in inliner, we do not need inline hints in SampleProfile pass any more. Reviewers: dnovillo, davidxl Subscribers: eraman, llvm-commits Differential Revision: http://reviews.llvm.org/D19287 llvm-svn: 274918	2016-07-08 20:12:44 +00:00
Nico Weber	28410c6846	Revert r274829, it caused PR28472. llvm-svn: 274916	2016-07-08 19:52:19 +00:00
Simon Pilgrim	0a0e0d4e8e	[X86] Regenerated bitreverse tests to demonstrate what is going on. llvm-svn: 274915	2016-07-08 19:51:08 +00:00
Simon Pilgrim	aaaeedb8cb	[X86] Added bitreverse tests for non-legal types Requested on D21578 llvm-svn: 274914	2016-07-08 19:48:33 +00:00
Simon Pilgrim	950419f948	[X86][AVX2] Add support for target shuffle combining to VPERMPD/VPERMQ llvm-svn: 274908	2016-07-08 19:23:29 +00:00
Davide Italiano	d555bde59f	[SCCP] Fold constants as we build them whne visiting cast instructions. This should be slightly more efficient and could avoid spurious overdefined markings, as Eli pointed out. Differential Revision: http://reviews.llvm.org/D22122 llvm-svn: 274905	2016-07-08 19:13:40 +00:00
Sanjay Patel	1b6b824548	[InstCombine] check for one-use before turning simple logic op into a select llvm-svn: 274891	2016-07-08 17:26:47 +00:00
Simon Pilgrim	4ca42e232d	[SLPVectorizer][X86] Added fma vectorization tests llvm-svn: 274889	2016-07-08 17:19:13 +00:00
Sanjay Patel	910ce0d511	add test to show multi-use output llvm-svn: 274887	2016-07-08 17:12:27 +00:00
Simon Pilgrim	b600ba3b79	[X86][AVX] Added combine test that should simplify to insertps llvm-svn: 274884	2016-07-08 17:01:42 +00:00
Sanjay Patel	cbfca9e8ef	[InstCombine] allow or(sext(A), B) --> A ? -1 : B transform for vectors llvm-svn: 274883	2016-07-08 17:01:15 +00:00
Zhan Jun Liau	7d4d436c74	[SystemZ] Add support for the .word directive. Summary: Branch off the work to add support for the .word directive, using addAliasForDirective. Reviewers: koriakin Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D22142 llvm-svn: 274878	2016-07-08 16:50:02 +00:00
Sanjay Patel	647174c8a4	add vector tests to show missing transform llvm-svn: 274876	2016-07-08 16:39:53 +00:00
Matt Arsenault	44540a3db2	PeepholeOptimizer: Make pass name match DEBUG_TYPE llvm-svn: 274874	2016-07-08 16:29:11 +00:00
Zhan Jun Liau	3b4c3f4d51	[SystemZ] Add support for missing instructions Summary: Add support to allow clang integrated assembler to recognize some missing instructions, for openssl. Instructions are: LM, LMH, LMY, STM, STMH, STMY, ICM, ICMH, ICMY, SLA, SLAK, TML, TMH, EX, EXRL. Reviewers: uweigand Subscribers: koriakin, llvm-commits Differential Revision: http://reviews.llvm.org/D22050 llvm-svn: 274869	2016-07-08 16:18:40 +00:00
Sanjay Patel	46df968326	minimize tests The cmp and load aren't required. llvm-svn: 274864	2016-07-08 16:11:48 +00:00
Sanjay Patel	e1acad9b61	regenerate checks llvm-svn: 274860	2016-07-08 16:06:38 +00:00
Chris Dewhurst	3202f065b8	[Sparc] Leon errata fix passes. Errata fixes for various errata in different versions of the Leon variants of the Sparc 32 bit processor. The nature of the errata are listed in the comments preceding the errata fix passes. Relevant unit tests are implemented for each of these. Note: Running clang-format has changed a few other lines too, unrelated to the implemented errata fixes. These have been left in as this keeps the code formatting consistent. Differential Revision: http://reviews.llvm.org/D21960 llvm-svn: 274856	2016-07-08 15:33:56 +00:00
Sjoerd Meijer	1ee119f897	Do not expand SDIV when compiling for minimum code size Differential Revision: http://reviews.llvm.org/D22139 llvm-svn: 274855	2016-07-08 15:32:01 +00:00
Anna Thomas	3124f6273a	InstCombine rule to fold truncs whose value is available We can fold truncs whose operand feeds from a load, if the trunc value is available through a prior load/store. This change is from: http://reviews.llvm.org/D21246, which folded the trunc but missed the bitcast or ptrtoint/inttoptr required in the RAUW call, when the load type didnt match the prior load/store type. Differential Revision: http://reviews.llvm.org/D21791 llvm-svn: 274853	2016-07-08 15:18:56 +00:00
Valery Pykhtin	68853ab2c5	[AMDGPU] fix ds_swizzle_b32 opcode for VI (bz 28371) Differential Revision: http://reviews.llvm.org/D22049 llvm-svn: 274852	2016-07-08 15:12:46 +00:00
Sjoerd Meijer	46c4c3d31c	Addressing post-commit comments regarding not expanding UDIV; we don't expand only when compiling for minimum code size. llvm-svn: 274847	2016-07-08 14:17:09 +00:00
Simon Pilgrim	4f1877fb57	[X86][SSE] Improve constant folding tests for CVTSD/CVTSS/CVTTSD/CVTTSS As discussed on D22106, improve the testing for constant folding sse scalar conversion intrinsics to ensure we are correctly handling special/out of range cases llvm-svn: 274846	2016-07-08 13:28:34 +00:00
Sjoerd Meijer	a625af3feb	Code size optimisation: don't expand a div to a mul and and a shift sequence. As a result, the urem instruction will not be expanded to a sequence of umull, lsrs, muls and sub instructions, but just a call to __aeabi_uidivmod. Differential Revision: http://reviews.llvm.org/D22131 llvm-svn: 274843	2016-07-08 12:54:43 +00:00
Simon Pilgrim	828c731880	[X86][SSE] Accept any shuffle mask that is all zeroes Until we have a better way to extract constants through bitcasted build vectors (and how to handle undefs of partial lanes etc.) at least accept build vectors that are all zeroes. llvm-svn: 274833	2016-07-08 10:39:12 +00:00
Matt Arsenault	c3a6fe6ecd	Bug 28444: Fix assertion when extract_vector_elt has mismatched type For some reason extract_vector_elt is sometimes allowed to have a different result type than the vector element type. llvm-svn: 274829	2016-07-08 07:05:00 +00:00
Craig Topper	f7bf6de0af	[AVX512] Remove and autoupgrade a duplicate set of 512-bit masked shift intrinsics. I'm not sure if clang ever used these builtin names or not. llvm-svn: 274827	2016-07-08 06:14:47 +00:00
Wei Mi	90d195a5fd	[PM] Port UnreachableBlockElim to the new Pass Manager Differential Revision: http://reviews.llvm.org/D22124 llvm-svn: 274824	2016-07-08 03:32:49 +00:00
Saleem Abdulrasool	eb059b0e0a	ARM: support high registers in __builtin_longjmp on WoA Windows on ARM uses a pure thumb-2 environment. This means that it can select a high register when doing a __builtin_longjmp. We would use a tLDRi which would truncate the register to a low register. Use a t2LDRi12 to get the full register file access. Tweak the code to just load into PC, as that is an interworking branch on all supported cores anyways. llvm-svn: 274815	2016-07-08 00:48:22 +00:00
Andrew Kaylor	3387074ae9	Temporarily remove a test case to unblock PPC bots. llvm-svn: 274813	2016-07-08 00:35:39 +00:00
Andrew Kaylor	8b8805c94c	Temporarily remove one test run line to unblock PPC bots. llvm-svn: 274812	2016-07-08 00:32:58 +00:00
Jacques Pienaar	6d3eecc843	[lanai] Use peephole optimizer to generate more conditional ALU operations. Summary: * Similiar to the ARM backend yse the peephole optimizer to generate more conditional ALU operations; * Add predicated type with default always true to RR instructions in LanaiInstrInfo.td; * Move LanaiSetflagAluCombiner into optimizeCompare; * The ASM parser can currently only handle explicitly specified CC, so specify ".t" (true) where needed in the ASM test; * Remove unused MachineOperand flags; Reviewers: eliben Subscribers: aemerson Differential Revision: http://reviews.llvm.org/D22072 llvm-svn: 274807	2016-07-07 23:36:04 +00:00
Michael Kuperstein	3e3652aef2	Recommit r274692 - [X86] Transform setcc + movzbl into xorl + setcc xorl + setcc is generally the preferred sequence due to the partial register stall setcc + movzbl suffers from. As a bonus, it also encodes one byte smaller. This fixes PR28146. The original commit tried inserting an 8bit-subreg into a GR32 (not GR32_ABCD) which was not appreciated by fast regalloc on 32-bit. llvm-svn: 274802	2016-07-07 22:50:23 +00:00
Vedant Kumar	0fdffd3709	[tsan] Try harder to not instrument gcov counters GCOVProfiler::emitProfileArcs() can create many variables with names starting with "__llvm_gcov_ctr", so llvm appends a numeric suffix to most of them. Teach tsan about this. llvm-svn: 274801	2016-07-07 22:45:28 +00:00
Kevin Enderby	1851a827a0	Add checks to the MachOObjectFile() constructor to make sure load commands sizes are the correct multiple. llvm-svn: 274798	2016-07-07 22:11:42 +00:00
Davide Italiano	16284df8ec	[PM] Port InstSimplify to the new pass manager. llvm-svn: 274796	2016-07-07 21:14:36 +00:00
Anna Thomas	6a78c78a03	[DSE] Remove dead stores in end blocks containing fence We can remove dead stores in the presence of fence instructions. Fence does not change an otherwise thread local store to visible. reviewers: reames, dexonsmith, jfb Differential Revision: http://reviews.llvm.org/D22001 llvm-svn: 274795	2016-07-07 20:51:42 +00:00
Chad Rosier	112d0e996b	[AArch64] Change the preferred alignment for char and short to word alignment. The commit reinstates r273279, which was informally approved. Original Review: http://reviews.llvm.org/D21414 This reverts commit ca632c91aaa7cafc50942f890c49f727a046ace1. llvm-svn: 274790	2016-07-07 20:02:18 +00:00
Andrew Kaylor	65fa0704aa	Include SelectionDAGISel in the opt-bisect process Differential Revision: http://reviews.llvm.org/D21143 llvm-svn: 274786	2016-07-07 18:55:02 +00:00
Peter Collingbourne	73589f321b	ThinLTO: Do not take into account whether a definition has multiple copies when promoting. We currently do not touch a symbol's linkage in the case where a definition has a single copy. However, this code is effectively unnecessary: either the definition is not exported, in which case the internalize phase sets its linkage to internal, or it is exported, in which case we need to promote linkage to weak. Those two cases are already handled by existing code. I believe that the only real functional change here is in the case where we have a single definition which does not prevail (e.g. because the definition in a native object file prevails). In that case we now lower linkage to available_externally following the existing code path for that case. As a result we can remove the isExported function parameter from the thinLTOResolveWeakForLinkerInIndex function. Differential Revision: http://reviews.llvm.org/D21883 llvm-svn: 274784	2016-07-07 18:31:51 +00:00
Tim Northover	1d106c5fc2	tests: accept different TargetOpcode values. These tests don't actually care about the internal opcode number, but have to be updated whenever we add a new one for GlobalISel. That's bad. llvm-svn: 274774	2016-07-07 17:51:42 +00:00
Michael Kuperstein	edb38a94f8	Revert r274692 to check whether this is what breaks windows selfhost. llvm-svn: 274771	2016-07-07 16:55:35 +00:00
Justin Bogner	a466cc33fa	NVPTX: Remove the legacy ptx intrinsics - Rename the ptx.read.* intrinsics to nvvm.read.ptx.sreg.* - some but not all of these registers were already accessible via the nvvm name. - Rename ptx.bar.sync nvvm.bar.sync, to match nvvm.bar0. There's a fair amount of code motion here, but it's all very mechanical. llvm-svn: 274769	2016-07-07 16:40:17 +00:00
Chad Rosier	3972953efd	Revert "[AArch64] Change the preferred alignment for char and short to word alignment" This reverts commit r273279 as the change was not properly approved. llvm-svn: 274768	2016-07-07 16:37:29 +00:00
Valery Pykhtin	af8b1bddbd	[AMDGPU] fix ds_write_src2 encoding (bz26027) Differential revision: http://reviews.llvm.org/D22041 llvm-svn: 274756	2016-07-07 14:23:38 +00:00
Rafael Espindola	b34cba97b7	Don't crash trying to relax 32 loads on COFF. Fixes pr28452. llvm-svn: 274754	2016-07-07 14:00:07 +00:00
Sjoerd Meijer	17c08dc701	Code size optimisation: don't rewrite fputs to fwrite when optimising for size because fwrite requires more arguments and thus extra MOVs are required. llvm-svn: 274753	2016-07-07 13:56:23 +00:00
David Majnemer	7afb46d3c8	[LoopAccessAnalysis] Fix an integer overflow We were inappropriately using 32-bit types to account for quantities that can be far larger. Fixed in PR28443. llvm-svn: 274737	2016-07-07 06:24:36 +00:00
Craig Topper	d5d2a35013	[AVX512] Zero extend the result of vpcmpeq/vpcmpgt and similar intrinsics in the autoupgrade code. This currently results in worse codegen but is needed for correctness. llvm-svn: 274736	2016-07-07 06:11:07 +00:00
Elena Demikhovsky	fc1e969dfc	Fixed a bug in vectorizing GEP before gather/scatter intrinsic. Vectorizing GEP was incorrect and broke SSA in some cases. The patch fixes PR27997 https://llvm.org/bugs/show_bug.cgi?id=27997. Differential revision: http://reviews.llvm.org/D22035 llvm-svn: 274735	2016-07-07 06:06:46 +00:00
David Majnemer	a54fe1acdc	[CodeView] Implement support for thread-local variables llvm-svn: 274734	2016-07-07 05:14:21 +00:00
Qin Zhao	c35b2cba6f	[esan:cfrag] Add option -esan-aux-field-info Summary: Adds option -esan-aux-field-info to control generating binary with auxiliary struct field information. Extracts code for creating auxiliary information from createCacheFragInfoGV into createCacheFragAuxGV. Adds test struct_field_small.ll for -esan-aux-field-info test. Reviewers: aizatsky Subscribers: llvm-commits, bruening, eugenis, kcc, zhaoqin, vitalybuka Differential Revision: http://reviews.llvm.org/D22019 llvm-svn: 274726	2016-07-07 03:20:16 +00:00
Peter Collingbourne	730c82e6b8	ThinLTO: Remove check for multiple modules before applying weak resolutions. This check is not only unnecessary, it can produce the wrong result. If we are linking a single module and it has an exported linkonce symbol, we need to promote to weak in order to avoid PR19901-style problems. Differential Revision: http://reviews.llvm.org/D21917 llvm-svn: 274722	2016-07-07 01:51:11 +00:00
Sean Silva	284b0324e2	[PM] Avoid getResult on a higher level in LoopAccessAnalysis Note that require<domtree> and require<loops> aren't needed because they come in implicitly via the loop pass manager. llvm-svn: 274712	2016-07-07 01:01:53 +00:00
Sean Silva	59fe82f4ce	[PM] Port TailCallElim llvm-svn: 274708	2016-07-06 23:48:41 +00:00
Sean Silva	b025d375a1	[PM] Port CorrelatedValuePropagation llvm-svn: 274705	2016-07-06 23:26:29 +00:00
Peter Collingbourne	d1d2614ee1	ThinLTO: Add test cases for promote+internalize. This tests the effect of both promotion and internalization on a module, and helps show that D21883 is NFC wrt promotion+internalization. Differential Revision: http://reviews.llvm.org/D21915 llvm-svn: 274699	2016-07-06 22:53:02 +00:00
Sanjay Patel	65a51c25c1	[InstCombine] enhance (select X, C1, C2 --> ext X) to handle vectors By replacing dyn_cast of ConstantInt with m_Zero/m_One/m_AllOnes, we allow these transforms for splat vectors. Differential Revision: http://reviews.llvm.org/D21899 llvm-svn: 274696	2016-07-06 22:23:01 +00:00
Manman Ren	524ca27b90	Add testing coverage for r274582. llvm-svn: 274693	2016-07-06 22:01:28 +00:00
Michael Kuperstein	1ef6c59b1d	[X86] Transform setcc + movzbl into xorl + setcc xorl + setcc is generally the preferred sequence due to the partial register stall setcc + movzbl suffers from. As a bonus, it also encodes one byte smaller. This fixes PR28146. Differential Revision: http://reviews.llvm.org/D21774 llvm-svn: 274692	2016-07-06 21:56:18 +00:00
Vedant Kumar	4c01092a25	[llvm-cov] Add support for creating html reports Based on a patch by Harlan Haskins! Differential Revision: http://reviews.llvm.org/D18278 llvm-svn: 274688	2016-07-06 21:44:05 +00:00
Matthias Braun	ad0032a649	AArch64: Change modeling of zero cycle zeroing. On CPUs with the zero cycle zeroing feature enabled "movi v.2d" should be used to zero a vector register. This was previously done at instruction selection time, however the register coalescer sometimes widened multiple vregs to the Q width because of that leading to extra spills. This patch leaves the decision on how to zero a register to the AsmPrinter phase where it doesn't affect register allocation anymore. This patch also sets isAsCheapAsAMove=1 on FMOVS0, FMOVD0. This fixes http://llvm.org/PR27454, rdar://25866262 Differential Revision: http://reviews.llvm.org/D21826 llvm-svn: 274686	2016-07-06 21:39:33 +00:00
Chad Rosier	232e29ebea	[MemorySSA] Reinstate the legacy printer and verifier. Differential Revision: http://reviews.llvm.org/D22058 llvm-svn: 274679	2016-07-06 21:20:47 +00:00
Rafael Espindola	a29971faeb	Add initial support for R_386_GOT32X. This adds it only for movl mov@GOT(%reg), %reg. llvm-svn: 274678	2016-07-06 21:19:11 +00:00
David Majnemer	7abd269aa9	[CodeView] Emit an appropriate symbol kind for globals We emitted debug info for globals/functions as if they all had external linkage. Instead, emit local symbol records when appropriate. llvm-svn: 274676	2016-07-06 21:07:47 +00:00
David Majnemer	e1e7372e93	[CodeView] Unions are always sealed It is impossible to inherit from a union. We are missing a way to represent this in IR for classes/structs... llvm-svn: 274675	2016-07-06 21:07:42 +00:00
Justin Lebar	6f9d01bbd5	[NVPTX] Add sm_60, sm_61, sm_62 targets to LLVM. Reviewers: tra Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D22068 llvm-svn: 274674	2016-07-06 21:06:10 +00:00
Haicheng Wu	a95cd1267f	[LIR] Fix mis-compilation with unwinding. To fix PR27859, bail out if there is an instruction may throw. Differential Revision: http://reviews.llvm.org/D20638 llvm-svn: 274673	2016-07-06 21:05:40 +00:00
Piotr Padlewski	6deaa6afae	Add 'thinlto_src_module' metadata to imported function Added metadata to be able to make statistics on how many functions that have been imported have been removed. Also module name might be helpfull when debugging. Reviewers: tejohnson, eraman Subscribers: mehdi_amini, llvm-commits Differential Revision: http://reviews.llvm.org/D21943 llvm-svn: 274668	2016-07-06 20:26:25 +00:00
Derek Bruening	d712a3c10e	[esan\|wset] Fix incorrect memory size assert Summary: Fixes an incorrect assert that fails on 128-bit-sized loads or stores. Augments the wset tests to include this case. Reviewers: aizatsky Subscribers: vitalybuka, zhaoqin, kcc, eugenis, llvm-commits Differential Revision: http://reviews.llvm.org/D22062 llvm-svn: 274666	2016-07-06 20:13:53 +00:00
Justin Bogner	a463537a36	NVPTX: Replace uses of cuda.syncthreads with nvvm.barrier0 Everywhere where cuda.syncthreads or __syncthreads is used, use the properly namespaced nvvm.barrier0 instead. llvm-svn: 274664	2016-07-06 20:02:45 +00:00
Adrian McCarthy	820ca5404c	Retry: "Emit CodeView type records for nested classes." Now with a corrected test to account for a recently supported properties bit in the debug info of a struct. Original review: http://reviews.llvm.org/D21939 This reverts commit 970c3fd497a28d25dd69526eb52594a696c37968. llvm-svn: 274661	2016-07-06 19:49:51 +00:00
Chad Rosier	dcfce2d0ec	[DSE] Avoid iterator invalidation bugs. The dse_with_dbg_value.ll test committed with r273141 is removed because this we no longer performs any type of back tracking, which is what was causing the codegen differences with and without debug information. Differential Revision: http://reviews.llvm.org/D21613 llvm-svn: 274660	2016-07-06 19:48:52 +00:00
Sanjay Patel	04b3496d9b	[x86] fix cost of SINT_TO_FP for i32 --> float (PR21356, PR28434) This is "cvtdq2ps" which does not appear to be particularly slow on any CPU according to Agner's tables. Choosing "5" as a cost here as suggested in: https://llvm.org/bugs/show_bug.cgi?id=21356 ...but it seems very conservative given that the instruction is fully pipelined, and I think these costs are supposed to model throughput. Note that related costs are also most likely too high, but this fixes PR21356 and partly fixes PR28434. llvm-svn: 274658	2016-07-06 19:15:54 +00:00
Sean Silva	f50d4b6cdc	Work around PR28400 a bit harder. We were still crashing in the "no change" case because LVI was not getting invalidated. See the thread "Should analyses be able to hold AssertingVH to IR? (related to PR28400)" for more discussion. llvm-svn: 274656	2016-07-06 19:05:41 +00:00
Elliot Colp	bc2cfc2291	[SystemZ] Remove AND mask of bottom 6 bits when result is used for shift/rotate On SystemZ, shift and rotate instructions only use the bottom 6 bits of the shift/rotate amount. Therefore, if the amount is ANDed with an immediate mask that has all of the bottom 6 bits set, we can remove the AND operation entirely. Differential Revision: http://reviews.llvm.org/D21854 llvm-svn: 274650	2016-07-06 18:13:11 +00:00
Zachary Turner	8848a7a6b2	[pdb] Round trip the PDB stream between YAML and binary PDB. This gets writing of the PDB stream working. llvm-svn: 274647	2016-07-06 18:05:57 +00:00
Kit Barton	f9d0a40573	Ensure all uses of permute instructions feed vector stores There is a problem in VSXSwapRemoval where it is incorrectly removing permute instructions. In this case, the permute is feeding both a vector store and also a non-store instruction. In this case, the permute cannot be removed. The fix is to simply look at all the uses of the vector register defined by the permute and ensure that all the uses are vector store instructions. This problem was reported in PR 27735 (https://llvm.org/bugs/show_bug.cgi?id=27735). Test case based on the original problem reported. Phabricator Review: http://reviews.llvm.org/D21802 llvm-svn: 274645	2016-07-06 18:03:52 +00:00
Tim Shen	1c3c0afc53	[DAGCombiner] Fix visitSTORE to continue processing current SDNode, if findBetterNeighborChains doesn't actually CombineTo it. Summary: findBetterNeighborChains may or may not find a better chain for each node it finds, which include the node ("St") that visitSTORE is currently processing. If no better chain is found for St, visitSTORE should continue instead of return SDValue(St, 0), as if it's CombinedTo'ed. This fixes bug 28130. There might be other ways to make the test pass (see D21409). I think both of the patches are fixing actual bugs revealed by the same testcase. Reviewers: echristo, wschmidt, hfinkel, kbarton, amehsan, arsenm, nemanjai, bogner Subscribers: mehdi_amini, nemanjai, llvm-commits Differential Revision: http://reviews.llvm.org/D21692 llvm-svn: 274644	2016-07-06 17:44:03 +00:00
Michael Kuperstein	aa71bdd3af	[TTI] The cost model should not assume vector casts get completely scalarized The cost model should not assume vector casts get completely scalarized, since on targets that have vector support, the common case is a partial split up to the legal vector size. So, when a vector cast gets split, the resulting casts end up legal and cheap. Instead of pessimistically assuming scalarization, base TTI can use the costs the concrete TTI provides for the split vector, plus a fudge factor to account for the cost of the split itself. This fudge factor is currently 1 by default, except on AMDGPU where inserts and extracts are considered free. Differential Revision: http://reviews.llvm.org/D21251 llvm-svn: 274642	2016-07-06 17:30:56 +00:00
Adrian McCarthy	7649d8388a	Revert "Emit CodeView type records for nested classes." This reverts commit 256b29322c827a2d94da56468c936596f5509032. llvm-svn: 274632	2016-07-06 15:14:10 +00:00
Simon Pilgrim	118da63a9d	[X86][SSE] Added test cases for missed opportunities to combine pshufb to pslldq/psrldq llvm-svn: 274631	2016-07-06 15:09:48 +00:00
Adrian McCarthy	024a7b6358	Emit CodeView type records for nested classes. Differential Revision: http://reviews.llvm.org/D21939 llvm-svn: 274629	2016-07-06 14:47:32 +00:00
Matthew Simpson	433cb1dfe3	[LV] Don't widen trivial induction variables We currently always vectorize induction variables. However, if an induction variable is only used for counting loop iterations or computing addresses with getelementptr instructions, we don't need to do this. Vectorizing these trivial induction variables can create vector code that is difficult to simplify later on. This is especially true when the unroll factor is greater than one, and we create vector arithmetic when computing step vectors. With this patch, we check if an induction variable is only used for counting iterations or computing addresses, and if so, scalarize the arithmetic when computing step vectors instead. This allows for greater simplification. This patch addresses the suboptimal pointer arithmetic sequence seen in PR27881. Reference: https://llvm.org/bugs/show_bug.cgi?id=27881 Differential Revision: http://reviews.llvm.org/D21620 llvm-svn: 274627	2016-07-06 14:26:59 +00:00
Elena Demikhovsky	ad0a56f3da	Re-commit of 274613. The prev commit failed on compilation. A minor change in one pattern in lib/Target/X86/X86InstrAVX512.td fixes the failure. llvm-svn: 274626	2016-07-06 14:15:43 +00:00
Sam Kolton	3c21a69077	[AMDGPU] Assembler: regression tests for bug 28413. NFC llvm-svn: 274623	2016-07-06 12:52:20 +00:00
Diana Picus	b772e409ba	[ARM] Do not test for CPUs, use SubtargetFeatures. Also remove 2 flags. This is a follow-up for r273544. The end goal is to get rid of the isSwift / isCortexXY / isWhatever methods. This commit also removes two command-line flags that weren't used in any of the tests: widen-vmovs and swift-partial-update-clearance. The former may be easily replaced with the mattr mechanism, but the latter may not (as it is a subtarget property, and not a proper feature). Differential Revision: http://reviews.llvm.org/D21797 llvm-svn: 274620	2016-07-06 11:22:11 +00:00
Elena Demikhovsky	02ced295aa	Reverted 274613 due to compilation failue. llvm-svn: 274615	2016-07-06 09:11:49 +00:00
Elena Demikhovsky	5a4f2476fd	AVX-512: Optimization for patterns with i1 scalar type The patch removes redundant kmov instructions (not all, we still have a lot of work here) and redundant "and" instructions after "setcc". I use "AssertZero" marker between X86ISD::SETCC node and "truncate" to eliminate extra "and $1" instruction. I also changed zext, aext and trunc patterns in the .td file. It allows to remove extra "kmov" instruictions. This patch fixes https://llvm.org/bugs/show_bug.cgi?id=28173. Fast ISEL mode is not supported correctly for AVX-512. ICMP/FCMP scalar instruction should return result in k-reg. It will be fixed in one of the next patches. I redirected handling of "cmp" to the DAG builder mode. (The code looks worse in one specific test case, but without this fix the new patch fails). Differential revision: http://reviews.llvm.org/D21956 llvm-svn: 274613	2016-07-06 09:01:20 +00:00
Nicolai Haehnle	e40530ea7b	AMDGPU: Fix return of non-void-returning shaders Summary: Since "AMDGPU: Fix verifier errors in SILowerControlFlow", the logic that ensures that a non-void-returning shader falls off the end of the last basic block was effectively disabled, since SI_RETURN is now used. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96731 Reviewers: arsenm, tstellarAMD Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21975 llvm-svn: 274612	2016-07-06 08:35:17 +00:00
Elena Demikhovsky	971fbfda1e	Vector GEP test: renamed + some comments Differential revision: http://reviews.llvm.org/D21957 llvm-svn: 274611	2016-07-06 08:11:23 +00:00
Daniel Berlin	fc7e651bfd	Fix handling of forward unreachable but reverse-reachable blocks in MemorySSA construction llvm-svn: 274606	2016-07-06 05:32:05 +00:00
George Burgess IV	bfa401e5ad	[CFLAA] Split into Anders+Steens analysis. StratifiedSets (as implemented) is very fast, but its accuracy is also limited. If we take a more aggressive andersens-like approach, we can be way more accurate, but we'll also end up being slower. So, we've decided to split CFLAA into CFLSteensAA and CFLAndersAA. Long-term, we want to end up in a place where CFLSteens is queried first; if it can provide an answer, great (since queries are basically map lookups). Otherwise, we'll fall back to CFLAnders, BasicAA, etc. This patch splits everything out so we can try to do something like that when we get a reasonable CFLAnders implementation. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21910 llvm-svn: 274589	2016-07-06 00:26:41 +00:00
Ryan Govostes	e51401bdab	[asan] Add a hidden option for Mach-O global metadata liveness tracking llvm-svn: 274578	2016-07-05 21:53:08 +00:00
Tim Northover	e6ae6767d9	AArch64: TableGenerate system instruction operands. The way the named arguments for various system instructions are handled at the moment has a few problems: - Large-scale duplication between AArch64BaseInfo.h and AArch64BaseInfo.cpp - That weird Mapping class that I have no idea what I was on when I thought it was a good idea. - Searches are performed linearly through the entire list. - We print absolutely all registers in upper-case, even though some are canonically mixed case (SPSel for example). - The ARM ARM specifies sysregs in terms of 5 fields, but those are relegated to comments in our implementation, with a slightly opaque hex value indicating the canonical encoding LLVM will use. This adds a new TableGen backend to produce efficiently searchable tables, and switches AArch64 over to using that infrastructure. llvm-svn: 274576	2016-07-05 21:23:04 +00:00
Balaram Makam	d4acd7ed10	Revert r259387: "AArch64: Implement missed conditional compare sequences." This reverts commit r259387 because it inserts illegal code after legalization in some backends where i64 OR type is illegal for example. llvm-svn: 274573	2016-07-05 20:24:05 +00:00
Simon Pilgrim	bec6543d17	[X86][AVX2] Add support for target shuffle combining to BROADCAST Only support broadcast from vector register so far - memory folding support will have to wait. llvm-svn: 274572	2016-07-05 20:11:29 +00:00
Simon Pilgrim	48adedffb7	[X86][AVX512] Fixed decoding of permd/permpd variable mask shuffles + enabled them for target shuffle combining Corrected element mask masking to extract the bottom index bits (now matches the perm2 implementation but for unary inputs). llvm-svn: 274571	2016-07-05 18:31:17 +00:00
Saleem Abdulrasool	4d950ef892	ARM: fix `-mlong-calls` for WoA Not all code-paths set the relocation model to static for Windows. This currently breaks on Windows ARM with `-mlong-calls` when built with clang. Loosen the assertion to what it was previously. We would ideally ensure that all the configuration sets Windows to static relocation model. llvm-svn: 274570	2016-07-05 18:30:52 +00:00
Matt Arsenault	2d79389508	DAGCombiner: Fold away vector extract of insert with the same index This only really matters when the index is non-constant since the constant case already gets taken care of by other combines. llvm-svn: 274569	2016-07-05 18:25:02 +00:00
Tim Northover	01dff9d18a	AArch64: use correct SDValue # when looking for bitfield placement. The other use really does only care about the SDNode (it checks the opcode against a whitelist), but bitFieldPlacement can be misled if the node produces multiple results. Patch by Ismail Badawi. llvm-svn: 274567	2016-07-05 18:02:57 +00:00
Matt Arsenault	ffc8275f2b	AMDGPU: Fix folding SGPRs into madak/madmk src0 Because of the special immediate operand, the constant bus is already used so SGPRs are never useful. r263212 changed the name of the immediate operand, which broke the verifier check for the restriction. llvm-svn: 274564	2016-07-05 17:09:01 +00:00
Sam Kolton	a9cd6aa895	[AMDGPU] Assembler: Fix parsing error with floating-point literals passed to integer instructions Differential Revision: http://reviews.llvm.org/D21972 llvm-svn: 274551	2016-07-05 14:01:11 +00:00
Simon Pilgrim	4e96fbf3c1	[X86][AVX512] Autoupgrade the BROADCAST intrinsics llvm-svn: 274550	2016-07-05 13:58:47 +00:00
Simon Pilgrim	1e91654b38	[X86][AVX512BW] Added BROADCAST intrinsics fast-isel generic IR tests llvm-svn: 274545	2016-07-05 13:16:05 +00:00
James Molloy	ae5ff990ae	[Thumb] Reapply r272251 with a fix for PR28348 (mk 2) The important thing I was missing was ensuring newly added constants were kept in topological order. Repositioning the node is correct if the constant is newly added (so it has no topological ordering) but wrong if it already existed - positioning it next in the worklist would break the topological ordering. Original commit message: [Thumb] Select a BIC instead of AND if the immediate can be encoded more optimally negated If an immediate is only used in an AND node, it is possible that the immediate can be more optimally materialized when negated. If this is the case, we can negate the immediate and use a BIC instead; int i(int a) { return a & 0xfffffeec; } Used to produce: ldr r1, [CONSTPOOL] ands r0, r1 CONSTPOOL: 0xfffffeec And now produces: movs r1, #255 adds r1, #20 ; Less costly immediate generation bics r0, r1 llvm-svn: 274543	2016-07-05 12:37:13 +00:00
Simon Pilgrim	20ede63a33	[X86][AVX512] Added BROADCAST intrinsics fast-isel generic IR tests llvm-svn: 274537	2016-07-05 10:15:14 +00:00
Nemanja Ivanovic	44513e545f	[PowerPC] - Legalize vector types by widening instead of integer promotion This patch corresponds to review: http://reviews.llvm.org/D20443 It changes the legalization strategy for illegal vector types from integer promotion to widening. This only applies for vectors with elements of width that is a multiple of a byte since we have hardware support for vectors with 1, 2, 3, 8 and 16 byte elements. Integer promotion for vectors is quite expensive on PPC due to the sequence of breaking apart the vector, extending the elements and reconstituting the vector. Two of these operations are expensive. This patch causes between minor and major improvements in performance on most benchmarks. There are very few benchmarks whose performance regresses. These regressions can be handled in a subsequent patch with a DAG combine (similar to how this patch handles int -> fp conversions of illegal vector types). llvm-svn: 274535	2016-07-05 09:22:29 +00:00
Simon Pilgrim	dea33cc2f3	[X86][AVX512] Added VSHUFPD intrinsics fast-isel generic IR tests llvm-svn: 274534	2016-07-05 09:10:07 +00:00
Simon Pilgrim	8a01915bd2	[X86][AVX512VL] Added VSHUFPD/VSHUFPS intrinsics fast-isel generic IR tests llvm-svn: 274533	2016-07-05 09:09:41 +00:00
Saleem Abdulrasool	4d3626ed31	test: relax the match on the timestamp llvm-svn: 274529	2016-07-05 01:14:53 +00:00
Saleem Abdulrasool	aecbdf70bf	Object: support empty UID/GID fields Normal archives do not have empty UID/GID fields. However, the Microsoft Import library format is a customized archive (it just uses an alternate symbol index format). When the import library is constructed by lib.exe, the UID and GID fields are left empty. Do not abort on such an input. llvm-svn: 274528	2016-07-05 00:23:05 +00:00
Simon Pilgrim	3ad040909a	[X86][AVX512] Add support for lowering shuffles to VSHUFPD llvm-svn: 274520	2016-07-04 20:41:24 +00:00
James Molloy	c3b4ed4a70	Revert "[Thumb] Reapply r272251 with a fix for PR28348" This reverts commit r274510 - it made green dragon unhappy. llvm-svn: 274512	2016-07-04 17:14:24 +00:00
James Molloy	9f019835ef	[Thumb] Reapply r272251 with a fix for PR28348 We were using DAG->getConstant instead of DAG->getTargetConstant. This meant that we could inadvertently increase the use count of a constant if stars aligned, which it did in this testcase. Increasing the use count of the constant could cause ISel to fall over (because DAGToDAG lowering assumed the constant had only one use!) Original commit message: [Thumb] Select a BIC instead of AND if the immediate can be encoded more optimally negated If an immediate is only used in an AND node, it is possible that the immediate can be more optimally materialized when negated. If this is the case, we can negate the immediate and use a BIC instead; int i(int a) { return a & 0xfffffeec; } Used to produce: ldr r1, [CONSTPOOL] ands r0, r1 CONSTPOOL: 0xfffffeec And now produces: movs r1, #255 adds r1, #20 ; Less costly immediate generation bics r0, r1 llvm-svn: 274510	2016-07-04 16:35:41 +00:00
Simon Pilgrim	02d435d2f4	[X86][AVX512] Autoupgrade the VPERMPD/VPERMQ intrinsics llvm-svn: 274506	2016-07-04 14:19:05 +00:00
Simon Pilgrim	8b82fce537	[X86][AVX512] Added VPERMPD/VPERMQ intrinsics fast-isel generic IR tests llvm-svn: 274503	2016-07-04 13:43:10 +00:00
Simon Pilgrim	9fca300cbe	[X86][AVX512] Autoupgrade the VPERMILPD/VPERMILPS intrinsics llvm-svn: 274498	2016-07-04 12:40:54 +00:00
Simon Pilgrim	c8cf2ddb6d	[X86][AVX512] Added VPERMILPD/VPERMILPS intrinsics fast-isel generic IR tests Added PSHUFD tests as well llvm-svn: 274493	2016-07-04 11:07:50 +00:00
Nicolai Haehnle	84c9f9919a	Add writeonly IR attribute Summary: This complements the earlier addition of IntrWriteMem and IntrWriteArgMem LLVM intrinsic properties, see D18291. Also start using the attribute for memset, memcpy, and memmove intrinsics, and remove their special-casing in BasicAliasAnalysis. Reviewers: reames, joker.eph Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D18714 llvm-svn: 274485	2016-07-04 08:01:29 +00:00
Craig Topper	d83f818a3e	[CodeGen] Make the code that detects a if a shuffle is really a concatenation of the inputs more general purpose. We can now handle concatenation of each source multiple times. The previous code just checked for each source to appear once in either order. This also now handles an entire source vector sized piece having undef indices correctly. We now concat with UNDEF instead of using one of the sources. This is responsible for the test case change. llvm-svn: 274483	2016-07-04 06:19:35 +00:00
Simon Pilgrim	7f096de0b8	[X86][AVX512] Add support for 512-bit shuffle lowering to VPERMPD/VPERMQ llvm-svn: 274473	2016-07-03 19:50:06 +00:00
Craig Topper	d1eca0f32c	[CodeGen] Teach OR combine of shuffles involving zero vectors to better handle undef indices. Undef indices can now be treated as zeros. Or if its undef ORed with zero, we will keep the undef. llvm-svn: 274472	2016-07-03 19:37:12 +00:00
Craig Topper	8e826d5abe	[X86] Add tests to show that the DAG combine for OR of shuffles with zero vectors doesn't handle undefs as well as it could. Fix coming in another commit. llvm-svn: 274471	2016-07-03 19:37:10 +00:00
Haicheng Wu	b71b2f622a	[MBB] add a missing corner case in UpdateTerminator() After the block placement, if a block ends with a conditional branch, but the next block is not its successor. The conditional branch should be changed to unconditional branch. This patch fixes PR28307, PR28297, PR28402. Differential Revision: http://reviews.llvm.org/D21811 llvm-svn: 274470	2016-07-03 19:14:17 +00:00
Simon Pilgrim	68ea80649b	[X86][AVX512] Add support for VPERMPD/VPERMQ masked shuffle comments llvm-svn: 274469	2016-07-03 18:40:24 +00:00
Simon Pilgrim	a0d73835b2	[X86][AVX512] Add support for 512-bit shuffle decoding of VPERMPD/VPERMQ llvm-svn: 274468	2016-07-03 18:27:37 +00:00
Simon Pilgrim	dbd6db0dc7	[X86][AVX512] Add support for VPALIGNR/PSHUFD/PSHUFHW/PSHUFLW masked shuffle comments llvm-svn: 274466	2016-07-03 15:00:51 +00:00
Sanjay Patel	cbaac41856	[InstCombine] enable vector select of bools -> logic folds llvm-svn: 274465	2016-07-03 14:34:39 +00:00
Simon Pilgrim	598bdb6bfe	[X86][AVX512] Add support for UNPCK masked shuffle comments llvm-svn: 274464	2016-07-03 14:26:21 +00:00
Simon Pilgrim	1f59076196	[X86][AVX512] Add support for VPERM/VSHUF masked shuffle comments llvm-svn: 274462	2016-07-03 13:55:41 +00:00
Simon Pilgrim	68f438a036	[X86][AVX512] Add support for PMOVZX masked shuffle comments llvm-svn: 274461	2016-07-03 13:33:28 +00:00
Sanjay Patel	42396ae0ea	add vector bool select tests and regenerate checks for scalar bool select tests llvm-svn: 274460	2016-07-03 13:26:02 +00:00
Simon Pilgrim	7c2fbdc101	[X86][AVX512] Add support for masked shuffle comments This patch adds support for including the avx512 mask register information in the mask/maskz versions of shuffle instruction comments. This initial version just adds support for MOVDDUP/MOVSHDUP/MOVSLDUP to reduce the mass of test regenerations, other shuffle instructions can be added in due course. Differential Revision: http://reviews.llvm.org/D21953 llvm-svn: 274459	2016-07-03 13:08:29 +00:00
Simon Pilgrim	129b720c18	[X86][AVX512] Add support for lowering shuffles to VPERMILPS llvm-svn: 274458	2016-07-03 12:47:21 +00:00
Sean Silva	45835e731d	Remove dead TLI arg of isKnownNonNull and propagate deadness. NFC. This actually uncovered a surprisingly large chain of ultimately unused TLI args. From what I can gather, this argument is a remnant of when isKnownNonNull would look at the TLI directly. The current approach seems to be that InferFunctionAttrs runs early in the pipeline and uses TLI to annotate the TLI-dependent non-null information as return attributes. This also removes the dependence of functionattrs on TLI altogether. llvm-svn: 274455	2016-07-02 23:47:27 +00:00
Xinliang David Li	8a021317a2	[PM] Port LoopAccessInfo analysis to new PM It is implemented as a LoopAnalysis pass as discussed and agreed upon. llvm-svn: 274452	2016-07-02 21:18:40 +00:00
Simon Pilgrim	99e8a1aa0b	[X86][AVX512] Add support for lowering shuffles to VPERMILPD llvm-svn: 274450	2016-07-02 20:20:12 +00:00
Simon Pilgrim	72052f6de9	[X86][AVX512VL] Add fast-isel MOVDDUP/MOVSLDUP/MOVSHDUP shuffle tests llvm-svn: 274448	2016-07-02 19:22:46 +00:00
Simon Pilgrim	cde7c54baa	[X86][AVX512] Add support for 512-bit PSHUFB lowering llvm-svn: 274444	2016-07-02 18:14:31 +00:00
Simon Pilgrim	77dda7c2e0	[X86][AVX512] Converted the MOVDDUP/MOVSLDUP/MOVSHDUP masked intrinsics to generic IR llvm-svn: 274443	2016-07-02 17:16:41 +00:00
Simon Pilgrim	19adee9d84	[X86][AVX512] Autoupgrade the MOVDDUP/MOVSLDUP/MOVSHDUP intrinsics llvm-svn: 274439	2016-07-02 14:42:35 +00:00
Simon Pilgrim	f040d8c061	[X86][AVX512] Add support for lowering shuffles to MOVDDUP/MOVSLDUP/MOVSHDUP llvm-svn: 274436	2016-07-02 12:45:03 +00:00
Simon Pilgrim	5e95390957	[X86][AVX512] Add test cases that should lower to MOVSLDUP/MOVSHDUP llvm-svn: 274435	2016-07-02 12:20:35 +00:00
Simon Pilgrim	a6f262a1f9	[X86][AVX512] Add fast-isel shuffle tests Its not worth trying to write out tests for all the avx512f builtins yet, just adding tests for lowering of generic IR as we transition to it (shuffles mainly right now). llvm-svn: 274434	2016-07-02 12:13:29 +00:00
Qin Zhao	b463c23c10	[esan\|cfrag] Add counters for struct array accesses Summary: Adds one counter to the struct counter array for counting struct array accesses. Adds instrumentation to insert counter update for struct array accesses. Reviewers: aizatsky Subscribers: llvm-commits, bruening, eugenis, kcc, zhaoqin, vitalybuka Differential Revision: http://reviews.llvm.org/D21594 llvm-svn: 274420	2016-07-02 03:25:37 +00:00
Michael Kuperstein	071d8306b0	[PM] Port ConstantHoisting to the new Pass Manager Differential Revision: http://reviews.llvm.org/D21945 llvm-svn: 274411	2016-07-02 00:16:47 +00:00
Reid Kleckner	e092dad72c	[codeview] Set the Nested and Scoped ClassOptions based on the scope chain These are set on both the declaration record and the definition record. llvm-svn: 274410	2016-07-02 00:11:07 +00:00
Matt Arsenault	accddacb70	TII: Fix inlineasm size counting comments as insts The main problem was counting comments on their own line as instructions. llvm-svn: 274405	2016-07-01 23:26:50 +00:00
David Majnemer	08bd744c2c	[CodeView] Include the offset of nested members Given something like: struct S { int a; struct { int b; }; }; We would fail to give 'b' offset 4. Instead, we would give it the offset it has inside of it's struct. llvm-svn: 274400	2016-07-01 23:12:48 +00:00
David Majnemer	6bdc24e7b6	[CodeView] Pretty print anonymous scopes A namespace without a name should be written out as `anonymous namespace' while a tag type without a name should be written out as <unnamed-tag>. llvm-svn: 274399	2016-07-01 23:12:45 +00:00
Matt Arsenault	7f681ac7a9	AMDGPU: Add feature for unaligned access llvm-svn: 274398	2016-07-01 23:03:44 +00:00
Matt Arsenault	8af47a09e5	AMDGPU: Expand unaligned accesses early Due to visit order problems, in the case of an unaligned copy the legalized DAG fails to eliminate extra instructions introduced by the expansion of both unaligned parts. llvm-svn: 274397	2016-07-01 22:55:55 +00:00
Evgeniy Stepanov	b736335dc3	[msan] Fix __msan_maybe_ for non-standard type sizes. Fix incorrect calculation of the type size for __msan_maybe_warning_N call that resulted in an invalid (narrowing) zext instruction and "Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed." Only happens in very large functions (with more than 3500 MSan checks) operating on integer types that are not power-of-two. llvm-svn: 274395	2016-07-01 22:49:59 +00:00
Matt Arsenault	327bb5ad82	AMDGPU: Improve load/store of illegal types. There was a combine before to handle the simple copy case. Split this into handling loads and stores separately. We might want to change how this handles some of the vector extloads, since this can result in large code size increases. llvm-svn: 274394	2016-07-01 22:47:50 +00:00
Reid Kleckner	ad56ea3129	[codeview] Don't record UDTs for anonymous structs MSVC makes up names for these anonymous structs, but we don't (yet). Eventually Clang should use getTypedefNameForAnonDecl() to put some name in the debug info, and we can update the test case when that happens. llvm-svn: 274391	2016-07-01 22:24:51 +00:00
Alina Sbirlea	8d8aa5dd6c	Address two correctness issues in LoadStoreVectorizer Summary: GetBoundryInstruction returns the last instruction as the instruction which follows or end(). Otherwise the last instruction in the boundry set is not being tested by isVectorizable(). Partially solve reordering of instructions. More extensive solution to follow. Reviewers: tstellarAMD, llvm-commits, jlebar Subscribers: escha, arsenm, mzolotukhin Differential Revision: http://reviews.llvm.org/D21934 llvm-svn: 274389	2016-07-01 21:44:12 +00:00
Dehao Chen	7b2c997736	Specify mtriple for the frame-order.ll test. Summary: original test may have different bahavior on different bot, specifically it broke llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21931 llvm-svn: 274368	2016-07-01 17:35:13 +00:00
Dehao Chen	ad2b4e1334	Do not count debug instructions when counting number of uses to reorder frame objects. Summary: The code generation should be independent of the debug info. Reviewers: zansari, davidxl, mkuper, majnemer Subscribers: majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D21911 llvm-svn: 274357	2016-07-01 15:40:25 +00:00
Nikolay Haustov	beb24f5b20	Resubmit r268719 - AMDGPU/SI: Add amdgpu_kernel calling convention. Part 2. This was reverted in r268740 because of problems with corresponding Clang change. Clang change was updated and resubmitted in r274220. Check calling convention in AMDGPUMachineFunction::isKernel This will be used for AMDGPU_HSA_KERNEL symbol type in output ELF. Also, in the future unused non-kernels may be optimized. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D19917 llvm-svn: 274341	2016-07-01 10:00:58 +00:00
Sam Kolton	5196b88f07	[AMDGPU] Assembler: support SDWA for VOPC instructions Summary: dst_sel and dst_unused disabled for VOPC as they have no effect on result Reviewers: artem.tamazov, tstellarAMD, vpykhtin Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D21376 llvm-svn: 274340	2016-07-01 09:59:21 +00:00
Duncan P. N. Exon Smith	e60719b3fa	Revert "add tests for bugs fixed by the GVN hoist pass" This reverts commit r274327 since the tests fail. E.g.: http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/17240 It looks like this commit is building on r274305, but that commit caused a miscompile and was reverted in r274320. llvm-svn: 274332	2016-07-01 04:55:13 +00:00
Sebastian Pop	196ba4f844	add tests for bugs fixed by the GVN hoist pass https://llvm.org/bugs/show_bug.cgi?id=20242 https://llvm.org/bugs/show_bug.cgi?id=22005 llvm-svn: 274327	2016-07-01 03:03:19 +00:00
Reid Kleckner	b5af11dfa3	[codeview] Add DISubprogram::ThisAdjustment Summary: This represents the adjustment applied to the implicit 'this' parameter in the prologue of a virtual method in the MS C++ ABI. The adjustment is always zero unless multiple inheritance is involved. This increases the size of DISubprogram by 8 bytes, unfortunately. The adjustment really is a signed 32-bit integer. If this size increase is too much, we could probably win it back by splitting out a subclass with info specific to virtual methods (virtuality, vindex, thisadjustment, containingType). Reviewers: aprantl, dexonsmith Subscribers: aaboud, amccarth, llvm-commits Differential Revision: http://reviews.llvm.org/D21614 llvm-svn: 274325	2016-07-01 02:41:21 +00:00
Matt Arsenault	0101ecade0	LoadStoreVectorizer: Don't increase alignment with no align set If no alignment was set on the load/stores, it would vectorize to the new type even though this increases the default alignment. llvm-svn: 274323	2016-07-01 02:09:38 +00:00
Matt Arsenault	370e8226c7	LoadStoreVectorizer: Check TTI for vec reg bit width llvm-svn: 274322	2016-07-01 02:07:22 +00:00
Matt Arsenault	42ad17059a	LoadStoreVectorizer: Fix assert when merging pointer ops This needs to use inttoptr/ptrtoint if combining an int and pointer load. If a pointer is used always do an integer load. llvm-svn: 274321	2016-07-01 01:55:52 +00:00
Duncan P. N. Exon Smith	9d1f156418	Revert "code hoisting pass based on GVN" This reverts commit r274305, since it breaks self-hosting: http://lab.llvm.org:8080/green/job/clang-stage1-configure-RA_build/22349/ http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/17232 Note that the blamelist on lab.llvm.org:8011 is incorrect. The previous build was r274299, but somehow r274305 wasn't included in the blamelist: http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules llvm-svn: 274320	2016-07-01 01:51:40 +00:00
Matt Arsenault	241f34cde8	LoadStoreVectorizer: Use AA metadata This was not passing the full instruction with metadata to the alias query. llvm-svn: 274318	2016-07-01 01:47:46 +00:00
Matt Arsenault	d7e8898bdd	LoadStoreVectorizer: if one element of a vector is integer, default to integer. Fixes issues on some architectures where we use arithmetic ops to build vectors, which can cause bad things to happen for loads/stores of mixed types. Patch by Fiona Glaser llvm-svn: 274307	2016-07-01 00:37:01 +00:00
Matt Arsenault	8a4ab5e19f	LoadStoreVectorizer: Fix crashes on sub-byte types llvm-svn: 274306	2016-07-01 00:36:54 +00:00
Sebastian Pop	5c5798c57c	code hoisting pass based on GVN This pass hoists duplicated computations in the program. The primary goal of gvn-hoist is to reduce the size of functions before inline heuristics to reduce the total cost of function inlining. Pass written by Sebastian Pop, Aditya Kumar, Xiaoyu Hu, and Brian Rzycki. Important algorithmic contributions by Daniel Berlin under the form of reviews. Differential Revision: http://reviews.llvm.org/D19338 llvm-svn: 274305	2016-07-01 00:24:31 +00:00
Matt Arsenault	079d0f19a2	LoadStoreVectorizer: Check skipFunction first. Also add test I forgot to add to r274296. llvm-svn: 274299	2016-06-30 23:50:18 +00:00
Matt Arsenault	2cbe52b990	LoadStoreVectorizer: Skip optnone functions llvm-svn: 274296	2016-06-30 23:30:29 +00:00
Matt Arsenault	08debb0244	Add LoadStoreVectorizer pass This was contributed by Apple, and I've been working on minimal cleanups and generalizing it. llvm-svn: 274293	2016-06-30 23:11:38 +00:00
Matt Arsenault	727e279ac4	SLPVectorizer: Move propagateMetadata to VectorUtils This will be re-used by the LoadStoreVectorizer. Fix handling of range metadata and testcase by Justin Lebar. llvm-svn: 274281	2016-06-30 21:17:59 +00:00
Yunzhong Gao	b386955adc	Add an artificial line-0 debug location when the compiler emits a call to __stack_chk_fail(). This avoids a compiler crash. Differential Revision: http://reviews.llvm.org/D21818 llvm-svn: 274263	2016-06-30 18:49:04 +00:00
Wei Mi	95685faeee	Refine the set of UniformAfterVectorization instructions. Except the seed uniform instructions (conditional branch and consecutive ptr instructions), dependencies to be added into uniform set should only be used by existing uniform instructions or intructions outside of current loop. Differential Revision: http://reviews.llvm.org/D21755 llvm-svn: 274262	2016-06-30 18:42:56 +00:00
Etienne Bergeron	078d8f69b6	revert http://reviews.llvm.org/D21101 llvm-svn: 274251	2016-06-30 17:52:24 +00:00
Zachary Turner	ab58ae8730	[pdb] Re-add code to write PDB files. Somehow all the functionality to write PDB files got removed, probably accidentally when uploading the patch perhaps the wrong one got uploaded. This re-adds all the code, as well as the corresponding test. llvm-svn: 274248	2016-06-30 17:43:00 +00:00
Zachary Turner	a30bd1a1bc	Update llvm-pdbdump to use subcommands. llvm-svn: 274247	2016-06-30 17:42:48 +00:00
Etienne Bergeron	47cf4eabe6	[exceptions] Upgrade exception handlers when stack protector is used Summary: MSVC provide exception handlers with enhanced information to deal with security buffer feature (/GS). To be more secure, the security cookies (GS and SEH) are validated when unwinding the stack. The following code: ``` void f() {} void foo() { __try { f(); } __except(1) { f(); } } ``` Reviewers: majnemer, rnk Subscribers: thakis, llvm-commits, chrisha Differential Revision: http://reviews.llvm.org/D21101 llvm-svn: 274239	2016-06-30 15:36:59 +00:00
Jun Bum Lim	596a3bd9ec	[DSE] Fix bug in partial overwrite tracking Summary: Found cases where DSE incorrectly add partially-overwritten intervals. Please see the test case for details. Reviewers: mcrosier, eeckstein, hfinkel Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D21859 llvm-svn: 274237	2016-06-30 15:32:20 +00:00
Sanjay Patel	7c6eab5777	[InstCombine] shrink switch conditions better (PR24766) https://llvm.org/bugs/show_bug.cgi?id=24766#c2 This removes a hack that was added for the benefit of x86 codegen. It prevented shrinking the switch condition even to smaller legal (DataLayout) types. We have a safety mechanism in CGP after: http://reviews.llvm.org/rL251857 ...so we're free to use the optimal (smallest) IR type now. Differential Revision: http://reviews.llvm.org/D12965 llvm-svn: 274233	2016-06-30 14:51:21 +00:00
Sanjay Patel	7ad98babfa	[InstCombine] extend matchSelectFromAndOr() to work with i1 scalar types If the incoming types are i1, then we don't have to pattern match any sext ops. Differential Revision: http://reviews.llvm.org/D21740 llvm-svn: 274228	2016-06-30 14:18:18 +00:00
Jonas Paulsson	25e193da4c	[SystemZ] Let z13 also support FeatureMiscellaneousExtensions. This processor feature had been left out by mistake from the z13 ProcessorModel. This time with updated test case. Thanks, Hans. Reviewed by Ulrich Weigand. llvm-svn: 274216	2016-06-30 07:13:56 +00:00
David Majnemer	9319cbc045	[CodeView] Implement support for bitfields in LLVM CodeView need to know the offset of the storage allocation for a bitfield. Encode this via the "extraData" field in DIDerivedType and introduced a new flag, DIFlagBitField, to indicate whether or not a member is a bitfield. This fixes PR28162. Differential Revision: http://reviews.llvm.org/D21782 llvm-svn: 274200	2016-06-30 03:00:20 +00:00
Sanjoy Das	0da2d14766	[SCEV] Compute max be count from shift operator only if all else fails In particular, check to see if we can compute a precise trip count by exhaustively simulating the loop first. llvm-svn: 274199	2016-06-30 02:47:28 +00:00
George Burgess IV	d86e38e1db	[CFLAA] Add support for ModRef queries. This patch makes CFLAA answer some ModRef queries. Because we don't distinguish between reading/writing when making StratifiedSets, we're unable to offer any of the readonly-related answers. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21858 llvm-svn: 274197	2016-06-30 02:11:26 +00:00
Sanjay Patel	348111f4b9	add vector tests to show missing transform llvm-svn: 274192	2016-06-30 00:09:13 +00:00
Sanjay Patel	c3701e8b92	regenerate checks llvm-svn: 274188	2016-06-29 23:58:39 +00:00
Vedant Kumar	d6d192cd12	[llvm-cov] Use relative paths to file reports in -output-dir mode This makes it possible to e.g copy a report to another filesystem. llvm-svn: 274173	2016-06-29 21:55:46 +00:00
Artem Belevich	4d5d7be8cc	Revert r273313 "[NVPTX] Improve lowering of byval args of device functions." The change causes llvm crash in some unoptimized builds. llvm-svn: 274163	2016-06-29 20:51:15 +00:00
Evgeniy Stepanov	a5da256f92	StackColoring for SafeStack. This is a fix for PR27842. An IR-level implementation of stack coloring tailored to work with SafeStack. It is a bit weaker than the MI implementation in that it does not the "lifetime start at first access" logic. This can be improved in the future. This patch also replaces the naive implementation of stack frame layout with a greedy algorithm that can split existing stack slots and even fit small objects inside the alignment padding of other objects. llvm-svn: 274162	2016-06-29 20:37:43 +00:00
Tim Shen	aec68b263d	[InstCombine] Simplify and correct folding fcmps with the same children Summary: Take advantage of FCmpInst::Predicate's bit pattern and handle (fcmp , x, y) \| (fcmp , x, y) and (fcmp , x, y) & (fcmp , x, y) more consistently. Also fold more FCmpInst::FCMP_FALSE and FCmpInst::FCMP_TRUE to constants. Currently InstCombine wrongly folds (fcmp ogt, x, y) \| (fcmp ord, x, y) to (fcmp ogt, x, y); this patch also fixes that. Reviewers: spatel Subscribers: llvm-commits, iteratee, echristo Differential Revision: http://reviews.llvm.org/D21775 llvm-svn: 274156	2016-06-29 20:10:17 +00:00
Tim Shen	860a67eb4c	[InstCombine, NFC] Change the generated variable names by creating new instructions This removes some noise for D21775's test changes. llvm-svn: 274155	2016-06-29 20:10:13 +00:00
Nirav Dave	8e10380b73	Permit memory operands in ins/outs instructions [x86] (PR15455) While (ins\|outs)[bwld] instructions do not take %dx as a memory operand, various unofficial references do and objdump disassembles to this format. Extend special treatment of similar (in\|out)[bwld] operations. Reviewers: craig.topper, rnk, ab Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18837 llvm-svn: 274152	2016-06-29 19:54:27 +00:00
Rafael Espindola	2211f015cc	Don't verify inputs to the Linker if ODR merging. This fixes pr28072. The point, as Duncan pointed out, is that the file is already partially linked by just reading it. Long term I think the solution is to make metadata owned by the module and then the linker will lazily read it and be in charge of all the linking. Running a verifier in each input will defeat the lazy loading, but will be legal. Right now we are at the unfortunate position that to support odr merging we cannot verify the inputs, which mildly annoying (see test update). llvm-svn: 274148	2016-06-29 18:31:48 +00:00
Tim Shen	4561e784f4	[InstCombine] Add full tests for FoldAndOfFCmps and FoldOrOfFCmps Summary: This adds tests for covering all cases that FoldAndOfFCmps and FoldOrOfFCmps handle. Reviewers: spatel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21844 llvm-svn: 274144	2016-06-29 17:55:11 +00:00
Vedant Kumar	c1561cb2fa	[llvm-cov] Change some FileCheck prefixes to make tests reusable (NFC) I'm planning on extending these two tests with checks that validate html coverage reports. Make it easier to extend them by not using a prefix called "CHECK". llvm-svn: 274143	2016-06-29 17:47:08 +00:00
Nico Weber	0a480b2c05	Add a regression test for PR28348. llvm-svn: 274142	2016-06-29 17:34:31 +00:00
Nico Weber	12fdf60b75	Revert r272251, it caused PR28348. llvm-svn: 274141	2016-06-29 17:33:41 +00:00
Ahmed Bougacha	15a2f6d58c	[X86] Lower blended PACKUSes using appropriate types. When lowering two blended PACKUS, we used to disregard the types of the PACKUS inputs, indiscriminately generating a v16i8 PACKUS. This leads to non-selectable things like: (v16i8 (PACKUS (v4i32 v0), (v4i32 v1))) Instead, check that the PACKUSes have the same type, and use that as the final result type. llvm-svn: 274138	2016-06-29 16:56:09 +00:00
Vedant Kumar	84cfb884e2	[llvm-cov] Disable PGO name compression in a test file Some bots do not configure llvm with zlib enabled. Should fix: http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/15571 llvm-svn: 274137	2016-06-29 16:34:57 +00:00
Vedant Kumar	2ca5eaa85a	Fix a typo; NFC llvm-svn: 274136	2016-06-29 16:23:34 +00:00
Vedant Kumar	4a54abeacd	[llvm-cov] Do not allow ".." to escape the coverage sub-directory In -output-dir mode, file reports are placed into a "coverage" directory. If filenames in the coverage mapping contain "..", they might escape out of this directory. Fix the problem by removing ".." from source filenames (expand the path component). llvm-svn: 274135	2016-06-29 16:22:12 +00:00
Rafael Espindola	c4cabb8054	Update tests to use at least darwin9. llvm-svn: 274129	2016-06-29 14:51:10 +00:00
Simon Pilgrim	f9c5908ffd	[X86][SSE2] Added _mm_loadu_si64 test to match llvm\tools\clang\test\CodeGen\sse2-builtins.c llvm-svn: 274127	2016-06-29 14:05:33 +00:00
Simon Pilgrim	851019175b	[X86] Regenerated popcnt combine tests llvm-svn: 274124	2016-06-29 13:54:03 +00:00
Elena Demikhovsky	5e21c94f25	Reverted patch 273864 llvm-svn: 274115	2016-06-29 10:01:06 +00:00
Marcin Koscielnicki	518cbc7cc3	[SystemZ] Add floating-point test data class instructions. These are not used by CodeGen yet - ISD combiners creating the new node will come in subsequent patches. llvm-svn: 274108	2016-06-29 07:29:07 +00:00
Craig Topper	df7454f94b	Revert "[ValueTracking] Teach computeKnownBits for PHI nodes to compute sign bit for a recurrence with a NSW addition." This is breaking an optimizaton remark test in clang. I've identified a couple fixes for that, but want to understand it better before I commit to anything. llvm-svn: 274102	2016-06-29 04:57:00 +00:00
Craig Topper	2cc199baff	[ValueTracking] Teach computeKnownBits for PHI nodes to compute sign bit for a recurrence with a NSW addition. If a operation for a recurrence is an addition with no signed wrap and both input sign bits are 0, then the result sign bit must also be 0. Similar for the negative case. I found this deficiency while playing around with a loop in the x86 backend that contained a signed division that could be optimized into an unsigned division if we could prove both inputs were positive. One of them being the loop induction variable. With this patch we can perform the conversion for this case. One of the test cases here is a contrived variation of the loop I was looking at. Differential revision: http://reviews.llvm.org/D21493 llvm-svn: 274098	2016-06-29 03:46:47 +00:00
Craig Topper	3a011de10c	[DAGCombine] Teach DAG combine to handle ORs of shuffles involving zero vectors where the zero vector is the first operand to the shuffle instead of the second. llvm-svn: 274097	2016-06-29 03:29:12 +00:00
Craig Topper	1e7e36e7e6	[DAGCombine] Add test cases to show that DAG combining an OR of two shuffles with zero vectors doesn't work if the zero vector is the first operand of the shuffle. Fix coming in a follow up patch. llvm-svn: 274096	2016-06-29 03:29:09 +00:00
Eric Christopher	0c58837b1f	Revert "[InstCombine] Avoid combining the bitcast of a var that is used as both address and result of load instructions" Revert "[InstCombine] Combine A->B->A BitCast" as this appears to cause PR27996 and as discussed in http://reviews.llvm.org/D20847 This reverts commits r270135 and r263734. llvm-svn: 274094	2016-06-29 03:05:58 +00:00
Vedant Kumar	8d74cb27e8	[llvm-cov] Minor cleanups to prepare for the html format patch - Add renderView{Header,Footer}, renderLineSuffix, and hasSubViews to support creating tables with nested views. - Move the 'Format' cl::opt to make it easier to extend. - Just create one function view file, instead of overwriting the same file for every new function. Add a regression test for this. llvm-svn: 274086	2016-06-29 00:38:21 +00:00
Kevin Enderby	42398051d8	Finish cleaning up most of the error handling in libObject’s MachOUniversalBinary and its clients to use the new llvm::Error model for error handling. Changed getAsArchive() from ErrorOr<...> to Expected<...> so now all interfaces there use the new llvm::Error model for return values. In the two places it had if (!Parent) this is actually a program error so changed from returning errorCodeToError(object_error::parse_failed) to calling report_fatal_error() with a message. In getObjectForArch() added error messages to its two llvm::Error return values instead of returning errorCodeToError(object_error::arch_not_found) with no error message. For the llvm-obdump, llvm-nm and llvm-size clients since the only binary files in Mach-O Universal Binaries that are supported are Mach-O files or archives with Mach-O objects, updated their logic to generate an error when a slice contains something like an ELF binary instead of ignoring it. And added a test case for that. The last error stuff to be cleaned up for libObject’s MachOUniversalBinary is the use of errorOrToExpected(Archive::create(ObjBuffer)) which needs Archive::create() to be changed from ErrorOr<...> to Expected<...> first, which I’ll work on next. llvm-svn: 274079	2016-06-28 23:16:13 +00:00
Weiming Zhao	5410edddb1	[ARM] Fix 28282: cost computation for constant hoisting Summary: This fixes bug: https://llvm.org/bugs/show_bug.cgi?id=28282 Currently the cost model of constant hoisting checks the bit width of the data type of the constants. However, the actual immediate value is small enough and not need to be hoisted. This patch checks for the actual bit width needed for the constant. Reviewers: t.p.northover, rengolin Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D21668 llvm-svn: 274073	2016-06-28 22:30:45 +00:00
Dehao Chen	8cd84aaa6f	Relax the clearance calculating for breaking partial register dependency. Summary: LLVM assumes that large clearance will hide the partial register spill penalty. But in our experiment, 16 clearance is too small. As the inserted XOR is normally fairly cheap, we should have a higher clearance threshold to aggressively insert XORs that is necessary to break partial register dependency. Reviewers: wmi, davidxl, stoklund, zansari, myatsina, RKSimon, DavidKreitzer, mkuper, joerg, spatel Subscribers: davidxl, llvm-commits Differential Revision: http://reviews.llvm.org/D21560 llvm-svn: 274068	2016-06-28 21:19:34 +00:00
Chris Bieneman	92b2e8a295	[YAML] Fix YAML tags appearing before the start of sequence elements Our existing yaml::Output code writes tags immediately when mapTag is called, without any state handling. This results in tags on sequence elements being written before the element itself. For example, we see this: SomeArray: !elem_type - key1: 1 key2: 2 !elem_type2 - key3: 3 key4: 4 We should instead see: SomeArray: - !elem_type key1: 1 key2: 2 - !elem_type2 key3: 3 key4: 4 Our reader handles reading properly, so this bug only impacts writing yaml sequences with tagged elements. As a test for this I've modified the Mach-O yaml encoding to allways apply the !mach-o tag when encoding MachOYAML::Object entries. This results in the !mach-o tag appearing as expected in dumped fat files. llvm-svn: 274067	2016-06-28 21:10:26 +00:00
Zhan Jun Liau	347db3e18e	[SystemZ] Use NILL instruction instead of NILF where possible Summary: SystemZ shift instructions only use the last 6 bits of the shift amount. When the result of an AND operation is used as a shift amount, this means that we can use the NILL instruction (which operates on the last 16 bits) rather than NILF (which operates on the last 32 bits) for a 16-bit savings in instruction size. Reviewers: uweigand Subscribers: llvm-commits Author: colpell Committing on behalf of Elliot. Differential Revision: http://reviews.llvm.org/D21686 llvm-svn: 274066	2016-06-28 21:03:19 +00:00
Matthias Braun	0b9a07883d	X86FrameLowering: Check subregs when deciding prolog kill flags llvm-svn: 274057	2016-06-28 20:31:56 +00:00
Sanjay Patel	3a0f2606ec	minimize regression tests and update checks llvm-svn: 274047	2016-06-28 18:40:08 +00:00
Sanjay Patel	8ce43c098b	minimize regression tests and update checks llvm-svn: 274046	2016-06-28 18:33:10 +00:00
Artur Pilipenko	7ad95ec22d	Support arbitrary addrspace pointers in masked load/store intrinsics This is a resubmittion of 263158 change after fixing the existing problem with intrinsics mangling (see LTO and intrinsics mangling llvm-dev thread for details). This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace. The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics. Reviewed By: reames Differential Revision: http://reviews.llvm.org/D17270 llvm-svn: 274043	2016-06-28 18:27:25 +00:00
Jacques Pienaar	f43266b868	[lanai] Update ELF number to correspond to the assigned number. Change EM_LANAI to correspond to machine number assigned by Xinuos. llvm-svn: 274042	2016-06-28 18:22:22 +00:00
Michael Kuperstein	a118acb82f	[X86] Update a test with more explicit checks. NFC. llvm-svn: 274040	2016-06-28 17:42:13 +00:00
Vedant Kumar	9cbad2c2b8	[llvm-cov] Create an index of reports in -output-dir mode This index lists the reports available in the 'coverage' sub-directory. This will help navigate coverage output from large projects. This commit factors the file creation code out of SourceCoverageView and into CoveragePrinter. llvm-svn: 274029	2016-06-28 16:12:24 +00:00
Vedant Kumar	64d8a029e9	[llvm-cov] Minor cleanups (NFC) - Test the '-o' alias for -output-dir. - Use a helper method in a conditional. - Add a period. llvm-svn: 274028	2016-06-28 16:12:20 +00:00
David Majnemer	1c7d532cde	[X86] Make WRPKRU/RDPKRU pass -verify-machineinstrs The original implementation attempted to zero registers using XOR %foo, %foo. This is problematic because it constitutes a read-modify-write of a register which might not be defined. Instead, use MOV32r0 to avoid these problems; expandPostRAPseudo does the right thing here. llvm-svn: 274024	2016-06-28 16:04:46 +00:00
Marcin Koscielnicki	234e5a809b	[SystemZ] Save/restore r6 and r7 if function contains landing pad. This fixes PR27102. Differential Revision: http://reviews.llvm.org/D18541 llvm-svn: 274017	2016-06-28 14:13:11 +00:00
Simon Pilgrim	5f71c909f0	[X86][AVX] Peek through bitcasts to find the source of broadcasts (reapplied) AVX1 can only broadcast vectors as floats/doubles, so for 256-bit vectors we insert bitcasts if we are shuffling v8i32/v4i64 types. Unfortunately the presence of these bitcasts prevents the current broadcast lowering code from peeking through cases where we have concatenated / extracted vectors to create the 256-bit vectors. This patch allows us to peek through bitcasts as long as the number of elements doesn't change (i.e. element bitwidth is the same) so the broadcast index is not affected. Note this bitcast peek is different from the stage later on which doesn't care about the type and is just trying to find a load node. As we're being more aggressive with bitcasts, we also need to ensure that the broadcast type is correctly bitcasted Differential Revision: http://reviews.llvm.org/D21660 llvm-svn: 274013	2016-06-28 13:24:05 +00:00
Arnaud A. de Grandmaison	eee4711fbe	[gold] Really fix test to run on non x86 platforms. Address post-commit comment from H.J. Lu. llvm-svn: 274000	2016-06-28 08:12:09 +00:00
Simon Pilgrim	c15d217831	[X86][SSE] Added support for combining target shuffles to (V)PSHUFD/VPERMILPD/VPERMILPS immediate permutes This patch allows target shuffles to be combined to single input immediate permute instructions - (V)PSHUFD/VPERMILPD/VPERMILPS - allowing more general pattern matching than what we current do and improves the likelihood of memory folding compared to existing patterns which tend to reuse the input in multiple arguments. Further permute instructions (V)PSHUFLW/(V)PSHUFHW/(V)PERMQ/(V)PERMPD may be added in the future but its proven tricky to create tests cases for them so far. (V)PSHUFLW/(V)PSHUFHW is already handled quite well in combineTargetShuffle so it may be that removing some of that code may allow us to perform more of the combining in one place without duplication. Differential Revision: http://reviews.llvm.org/D21148 llvm-svn: 273999	2016-06-28 08:08:15 +00:00
Elena Demikhovsky	a727f3cfde	[X86 Target Lowering] Merged ICMP test. llvm-svn: 273995	2016-06-28 06:25:38 +00:00
Adam Nemet	bd861acf29	[LLE] Don't hoist conditionally executed loads If the load is conditional we can't hoist its 0-iteration instance to the preheader because that would make it unconditional. Thus we would access a memory location that the original loop did not access. llvm-svn: 273991	2016-06-28 04:02:47 +00:00
Vedant Kumar	7937ef3796	Reapply "[llvm-cov] Add an -output-dir option for the show sub-command"" Passing -output-dir path/to/dir to llvm-cov show creates path/to/dir if it doesn't already exist, and prints reports into that directory. In function view mode, all views are written into path/to/dir/functions.$EXTENSION. In file view mode, all views are written into path/to/dir/coverage/$PATH.$EXTENSION. Changes since the initial commit: - Avoid accidentally closing stdout twice. llvm-svn: 273985	2016-06-28 02:09:39 +00:00
Nick Lewycky	9980075133	NFC. Fix popular typo in comment 'deferencing' --> 'dereferencing'. Bonus changes, * placement in X86ISelLowering and 'exerce' -> 'exercise' in test. llvm-svn: 273984	2016-06-28 01:45:05 +00:00
Vedant Kumar	a48d9fe86a	Revert "[llvm-cov] Add an -output-dir option for the show sub-command" This reverts commit r273971. test/profile/instrprof-visibility.cpp is failing because of an uncaught error in SafelyCloseFileDescriptor. llvm-svn: 273978	2016-06-28 01:14:04 +00:00
Matt Arsenault	b4d9503171	AMDGPU: Fix out of bounds indirect indexing errors This was producing acceses to registers beyond the super register's limits, resulting in verifier failures. llvm-svn: 273977	2016-06-28 01:09:00 +00:00
Vedant Kumar	02507c435c	[llvm-cov] Add an -output-dir option for the show sub-command Passing -output-dir path/to/dir to llvm-cov show creates path/to/dir if it doesn't already exist, and prints reports into that directory. In function view mode, all views are written into path/to/dir/functions.$EXTENSION. In file view mode, all views are written into path/to/dir/coverage/$PATH.$EXTENSION. llvm-svn: 273971	2016-06-28 00:18:57 +00:00
Vedant Kumar	dcbf4d68b2	[llvm-cov] Use -check-prefixes in a test (NFC) llvm-svn: 273970	2016-06-28 00:18:53 +00:00
Vedant Kumar	635c83c1b4	[llvm-cov] Add a format option for the 'show' sub-command (mostly NFC) llvm-svn: 273968	2016-06-28 00:15:54 +00:00
Chandler Carruth	dca834089a	[PM] Improve the debugging and logging facilities of the CGSCC bits of the new pass manager. This adds operator<< overloads for the various bits of the LazyCallGraph, dump methods for use from the debugger, and debug logging using them to the CGSCC pass manager. Having this was essential for debugging the call graph update patch, and I've extracted what I could from that patch here to minimize the delta. llvm-svn: 273961	2016-06-27 23:26:08 +00:00
Easwaran Raman	22eb80a114	Fix size computation of array allocation in inline cost analysis Differential revision: http://reviews.llvm.org/D21690 llvm-svn: 273952	2016-06-27 22:31:53 +00:00
Sanjay Patel	59ed2ffca3	[InstCombine] shrink type of sdiv if dividend is sexted and constant divisor is small enough (PR28153) This should fix PR28153: https://llvm.org/bugs/show_bug.cgi?id=28153 Differential Revision: http://reviews.llvm.org/D21769 llvm-svn: 273951	2016-06-27 22:27:11 +00:00
Kevin Enderby	1051909df1	Change all but the last ErrorOr<...> use for MachOUniversalBinary to Expected<...> to allow a good error message to be produced. I added the one test case that the object file tools could produce an error message. The other two errors can’t be triggered if the input file is passed through sys::fs::identify_magic(). But the malformedError("bad magic number") does get triggered by the logic in llvm-dsymutil when dealing with a normal Mach-O file. The other "File too small ..." error would take a logic error currently to produce and is not tested for. llvm-svn: 273946	2016-06-27 21:39:39 +00:00
Matt Arsenault	59c0ffa22a	AMDGPU: Implement per-function subtargets llvm-svn: 273940	2016-06-27 20:48:03 +00:00
Matt Arsenault	03d8584590	AMDGPU: Move subtarget feature checks into passes llvm-svn: 273937	2016-06-27 20:32:13 +00:00
Sanjay Patel	5cdf699daa	add tests for PR28153 llvm-svn: 273936	2016-06-27 20:28:59 +00:00
Justin Holewinski	cb29fb4a98	Only emit extension for zeroext/signext arguments if type is < 32 bits Reviewers: jingyue, jlebar Subscribers: jholewinski Differential Revision: http://reviews.llvm.org/D21756 llvm-svn: 273922	2016-06-27 20:22:22 +00:00
Rafael Espindola	8121becac3	Teach shouldAssumeDSOLocal about tls. Fixes a fixme about handling other visibilities. llvm-svn: 273921	2016-06-27 20:19:14 +00:00
Elena Demikhovsky	6f2ec8104a	Fixed crash of SLP Vectorizer on KNL The bug is connected to vector GEPs. https://llvm.org/bugs/show_bug.cgi?id=28313 llvm-svn: 273919	2016-06-27 20:07:00 +00:00
Chris Bieneman	e5cc1fd498	[yaml2obj] Missed updating a few test cases in r273915 This should fix the broken bots. llvm-svn: 273918	2016-06-27 20:02:49 +00:00
Matt Arsenault	21a4625a16	AMDGPU: Fix verifier errors with undef vector indices Also fix pointlessly adding exec to liveins. llvm-svn: 273916	2016-06-27 19:57:44 +00:00
Chris Bieneman	8ff0c11357	[yaml2obj] Remove --format option in favor of YAML tags Summary: Our YAML library's handling of tags isn't perfect, but it is good enough to get rid of the need for the --format argument to yaml2obj. This patch does exactly that. Instead of requiring --format, it infers the format based on the tags found in the object file. The supported tags are: !ELF !COFF !mach-o !fat-mach-o I have a corresponding patch that is quite large that fixes up all the in-tree test cases. Reviewers: rafael, Bigcheese, compnerd, silvas Subscribers: compnerd, llvm-commits Differential Revision: http://reviews.llvm.org/D21711 llvm-svn: 273915	2016-06-27 19:53:53 +00:00
Matt Arsenault	82f41518ed	Verifier: Reject non-float !fpmath Code already assumes this is float. getFPAccuracy() crashes on any other type. llvm-svn: 273912	2016-06-27 19:43:15 +00:00
Matt Arsenault	f0f721a682	DAGCombiner: Don't narrow volatile vector loads + extract llvm-svn: 273909	2016-06-27 19:31:04 +00:00
Elena Demikhovsky	ad3929cc64	X86 Lowering - Fixed a crash in ICMP scalar instruction Fixed a bug in EmitTest() function in combining shl + icmp. https://llvm.org/bugs/show_bug.cgi?id=28119 llvm-svn: 273899	2016-06-27 18:07:16 +00:00
Sanjay Patel	c6ada53be5	[InstCombine] use m_APInt for div --> ashr fold The APInt matcher works with splat vectors, so we get this fold for vectors too. llvm-svn: 273897	2016-06-27 17:25:57 +00:00
Artur Pilipenko	72f76b8805	Revert -r273892 "Support arbitrary addrspace pointers in masked load/store intrinsics" since some of the clang tests don't expect to see the updated signatures. llvm-svn: 273895	2016-06-27 16:54:33 +00:00
Easwaran Raman	1832bf6aee	[PM] Port PartialInlining to the new PM Differential revision: http://reviews.llvm.org/D21699 llvm-svn: 273894	2016-06-27 16:50:18 +00:00
Artur Pilipenko	a36aa41519	Support arbitrary addrspace pointers in masked load/store intrinsics This is a resubmittion of 263158 change after fixing the existing problem with intrinsics mangling (see LTO and intrinsics mangling llvm-dev thread for details). This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace. The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics. Reviewed By: reames Differential Revision: http://reviews.llvm.org/D17270 llvm-svn: 273892	2016-06-27 16:29:26 +00:00
Simon Pilgrim	476e8ceed3	[X86][SSE] Added extra broadcast tests to cover PR28327 llvm-svn: 273891	2016-06-27 16:15:37 +00:00
Zhan Jun Liau	4f130b4410	[SystemZ] Avoid generating 2 XOR instructions for (and (xor x, -1), y) Summary: Created a pattern to match 64-bit mode (and (xor x, -1), y) to a shorter sequence of instructions. Before the change, the canonical form is translated to: xihf %r3, 4294967295 xilf %r3, 4294967295 ngr %r2, %r3 After the change, the canonical form is translated to: ngr %r3, %r2 xgr %r2, %r3 Reviewers: zhanjunl, uweigand Subscribers: llvm-commits Author: assem Committing on behalf of Assem. Differential Revision: http://reviews.llvm.org/D21693 llvm-svn: 273887	2016-06-27 15:55:30 +00:00
Krzysztof Parzyszek	5da24e5495	[Hexagon] Equally-sized vectors are equivalent in ISel (except vNi1) llvm-svn: 273885	2016-06-27 15:08:22 +00:00
Nico Weber	1e058160dd	Revert 273848, it caused PR28329 llvm-svn: 273879	2016-06-27 14:36:46 +00:00
Simon Pilgrim	9c2f378587	Removed duplicate assertions note llvm-svn: 273874	2016-06-27 13:06:18 +00:00
Elena Demikhovsky	f65e865e33	Removed extra test from the prev commit. llvm-svn: 273865	2016-06-27 11:40:49 +00:00
Elena Demikhovsky	4c58b2761a	Fixed consecutive memory access detection in Loop Vectorizer. It did not handle correctly cases without GEP. The following loop wasn't vectorized: for (int i=0; i<len; i++) to++ = from++; I use getPtrStride() to find Stride for memory access and return 0 is the Stride is not 1 or -1. Re-commit rL273257 - revision: http://reviews.llvm.org/D20789 llvm-svn: 273864	2016-06-27 11:19:23 +00:00
Arnaud A. de Grandmaison	efb0b899d3	[gold] Fix test to not assume it runs on x86 hardware. llvm-svn: 273854	2016-06-27 09:13:03 +00:00
Hrvoje Varga	24b975dc66	[mips][micromips] Implement LD, LLD, LWU, SD, DSRL, DSRL32 and DSRLV instructions Differential Revision: http://reviews.llvm.org/D16625 llvm-svn: 273850	2016-06-27 08:23:28 +00:00
Simon Pilgrim	a45da385f8	[X86][AVX] Peek through bitcasts to find the source of broadcasts AVX1 can only broadcast vectors as floats/doubles, so for 256-bit vectors we insert bitcasts if we are shuffling v8i32/v4i64 types. Unfortunately the presence of these bitcasts prevents the current broadcast lowering code from peeking through cases where we have concatenated / extracted vectors to create the 256-bit vectors. This patch allows us to peek through bitcasts as long as the number of elements doesn't change (i.e. element bitwidth is the same) so the broadcast index is not affected. Note this bitcast peek is different from the stage later on which doesn't care about the type and is just trying to find a load node. Differential Revision: http://reviews.llvm.org/D21660 llvm-svn: 273848	2016-06-27 07:44:32 +00:00
Igor Breger	7357849dca	[ConstantFolding] Fix bitcast vector of i1. Differential Revision: http://reviews.llvm.org/D21735 llvm-svn: 273845	2016-06-27 06:42:54 +00:00
Rafael Espindola	1ac1fa818e	Mips: Fix access to private functions. llvm-svn: 273843	2016-06-27 03:19:40 +00:00
Sanjay Patel	1d745384da	add tests for potential select transforms llvm-svn: 273833	2016-06-26 23:44:21 +00:00
Nico Weber	d8db1e172c	Revert r273807 (and r273809, r273810), it caused PR28311 llvm-svn: 273815	2016-06-26 15:10:34 +00:00
Amjad Aboud	ff976c99c7	[codeview] Improved array type support. Added support for: 1. Multi dimension array. 2. Array of structure type, which previously was declared incompletely. 3. Dynamic size array. Differential Revision: http://reviews.llvm.org/D21526 llvm-svn: 273807	2016-06-26 11:44:45 +00:00
Sanjoy Das	a37bb4a65d	[LoopUnswitch] Unswitch on conditions feeding into guards Summary: This is a straightforward extension of what LoopUnswitch does to branches to guards. That is, we unswitch ``` for (;;) { ... guard(loop_invariant_cond); ... } ``` into ``` if (loop_invariant_cond) { for (;;) { ... // There is no need to emit guard(true) ... } } else { for (;;) { ... guard(false); // SimplifyCFG will clean this up by adding an // unreachable after the guard(false) ... } } ``` Reviewers: majnemer Subscribers: mcrosier, llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D21725 llvm-svn: 273801	2016-06-26 05:10:45 +00:00
Jan Vesely	3bc1af2be4	AMDGPU/R600: Fix GlobalValue regressions. Don't cast GV expression to MCSymbolRefExpr. r272705 changed GV to binary expressions by including offset even if the offset it 0 (we haven't hit this sooner since tested workloads don't include static offsets) We don't really care about the type of expression, so set it directly. Fixes: r272705 Consider section relative relocations. Since all const as data is in one boffer section relative is equivalent to abs32. Fixes: r273166 Differential Revision: http://reviews.llvm.org/D21633 llvm-svn: 273785	2016-06-25 18:24:16 +00:00
Sanjay Patel	51ff79fd82	update tests to use FileCheck llvm-svn: 273784	2016-06-25 17:39:10 +00:00
David Majnemer	e14e7bc4b8	Revert "[SimplifyCFG] Stop inserting calls to llvm.trap for UB" This reverts commit r273778, it seems to break UBSan :/ llvm-svn: 273779	2016-06-25 08:19:55 +00:00
David Majnemer	d346a37737	[SimplifyCFG] Stop inserting calls to llvm.trap for UB SimplifyCFG had logic to insert calls to llvm.trap for two very particular IR patterns: stores and invokes of undef/null. While InstCombine canonicalizes certain undefined behavior IR patterns to stores of undef, phase ordering means that this cannot be relied upon in general. There are much better tools than llvm.trap: UBSan and ASan. N.B. I could be argued into reverting this change if a clear argument as to why it is important that we synthesize llvm.trap for stores, I'd be hard pressed to see why it'd be useful for invokes... llvm-svn: 273778	2016-06-25 08:04:19 +00:00
David Majnemer	bb53d23ef8	[InstSimplify] Replace calls to null with undef Calling null is undefined behavior, we can simplify the resulting value to undef. llvm-svn: 273777	2016-06-25 07:37:30 +00:00
David Majnemer	1fea77c6fc	[SimplifyCFG] Replace calls to null/undef with unreachable Calling null is undefined behavior, a call to undef can be trivially treated as a call to null. llvm-svn: 273776	2016-06-25 07:37:27 +00:00
Konstantin Zhuravlyov	f2f3d14774	[AMDGPU] Emit debugger prologue and emit the rest of the debugger fields in the kernel code header Debugger prologue is emitted if -mattr=+amdgpu-debugger-emit-prologue. Debugger prologue writes work group IDs and work item IDs to scratch memory at fixed location in the following format: - offset 0: work group ID x - offset 4: work group ID y - offset 8: work group ID z - offset 16: work item ID x - offset 20: work item ID y - offset 24: work item ID z Set - amd_kernel_code_t::debug_wavefront_private_segment_offset_sgpr to scratch wave offset reg - amd_kernel_code_t::debug_private_segment_buffer_sgpr to scratch rsrc reg - amd_kernel_code_t::is_debug_supported to true if all debugger features are enabled Differential Revision: http://reviews.llvm.org/D20335 llvm-svn: 273769	2016-06-25 03:11:28 +00:00
Saleem Abdulrasool	92d33bd2af	llvm-ar: add some tests for llvm-ar default selection This adds some tests for the smarter llvm-ar selection mode as well as some additional tests as per Rafael's post commit review comments. llvm-svn: 273768	2016-06-25 03:05:56 +00:00
Tom Stellard	b164a9843b	AMDGPU/SI: Make sure not to fold offsets into local address space globals Summary: Offset folding only works if you are emitting relocations, and we don't emit relocations for local address space globals. Reviewers: arsenm, nhaustov Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21647 llvm-svn: 273765	2016-06-25 01:59:16 +00:00
Sanjoy Das	f63768cbfc	[PlaceSafepoints] Don't call undef in test case; NFC llvm-svn: 273764	2016-06-25 01:40:54 +00:00
Sanjoy Das	d850068282	[LoopUnswitch] Avoid exponential behavior Summary: (No semantic change intended). Reviewers: majnemer, bogner, mzolotukhin Subscribers: mcrosier, llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D21707 llvm-svn: 273763	2016-06-25 01:14:19 +00:00
David Majnemer	0f45572761	The absence of noreturn doesn't ensure mayReturn There are two separate issues: - LLVM doesn't consider infinite loops to be side effects: we happily hoist/sink above/below loops whose bounds are unknown. - The absence of the noreturn attribute is insufficient for us to know if a function will definitely return. Relying on noreturn in the middle-end for any property is an accident waiting to happen. llvm-svn: 273762	2016-06-25 00:55:12 +00:00
Peter Collingbourne	0312f614b1	IR: Introduce llvm.type.checked.load intrinsic. This intrinsic safely loads a function pointer from a virtual table pointer using type metadata. This intrinsic is used to implement control flow integrity in conjunction with virtual call optimization. The virtual call optimization pass will optimize away llvm.type.checked.load intrinsics associated with devirtualized calls, thereby removing the type check in cases where it is not needed to enforce the control flow integrity constraint. This patch also introduces the capability to copy type metadata between global variables, and teaches the virtual call optimization pass to do so. Differential Revision: http://reviews.llvm.org/D21121 llvm-svn: 273756	2016-06-25 00:23:04 +00:00
Matthias Braun	6ad3d05b68	MachineScheduler: Fully compare top/bottom candidates In bidirectional scheduling this gives more stable results than just comparing the "reason" fields of the top/bottom node because the reason field may be higher depending on what other nodes are in the queue. Differential Revision: http://reviews.llvm.org/D19401 llvm-svn: 273755	2016-06-25 00:23:00 +00:00
David Majnemer	b8da3a2bb2	Reinstate r273711 r273711 was reverted by r273743. The inliner needs to know about any call sites in the inlined function. These were obscured if we replaced a call to undef with an undef but kept the call around. This fixes PR28298. llvm-svn: 273753	2016-06-25 00:04:10 +00:00
Matthias Braun	1e374a7aa6	AMDGPU: Define a schedule class for COPY. COPY was lacking a scheduling class, define it to avoid regressions in the upcoming change to the bidirectional MachineScheduler. Approved by tstellar on IRC. Differential Revision: http://reviews.llvm.org/D21540 llvm-svn: 273751	2016-06-24 23:52:11 +00:00
Michael Kuperstein	83b753d430	[PM] Port float2int to the new pass manager Differential Revision: http://reviews.llvm.org/D21704 llvm-svn: 273747	2016-06-24 23:32:02 +00:00
Dehao Chen	c66a06ad0e	Hookup ProfileSummary with SampleProfilerLoader Summary: Set ProfileSummary in SampleProfilerLoader. Reviewers: davidxl, eraman Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21702 llvm-svn: 273745	2016-06-24 22:57:06 +00:00
Nico Weber	ae2ef4ccd4	Revert r273711, it caused PR28298. llvm-svn: 273743	2016-06-24 22:52:39 +00:00
Adrian Prantl	29ce701a06	Fix the type signature of DwarfExpression::Add.*Constant to support values >32 bits. This fixes an embarrassing bug when emitting .debug_loc entries for 64-bit+ constants, which were previously silently truncated to 32 bits. <rdar://problem/26843232> llvm-svn: 273736	2016-06-24 21:35:09 +00:00
Krzysztof Parzyszek	709a626015	[Hexagon] Simplify (+fix) instruction selection for indexed loads/stores llvm-svn: 273733	2016-06-24 21:27:17 +00:00
Peter Collingbourne	7efd750607	IR: New representation for CFI and virtual call optimization pass metadata. The bitset metadata currently used in LLVM has a few problems: 1. It has the wrong name. The name "bitset" refers to an implementation detail of one use of the metadata (i.e. its original use case, CFI). This makes it harder to understand, as the name makes no sense in the context of virtual call optimization. 2. It is represented using a global named metadata node, rather than being directly associated with a global. This makes it harder to manipulate the metadata when rebuilding global variables, summarise it as part of ThinLTO and drop unused metadata when associated globals are dropped. For this reason, CFI does not currently work correctly when both CFI and vcall opt are enabled, as vcall opt needs to rebuild vtable globals, and fails to associate metadata with the rebuilt globals. As I understand it, the same problem could also affect ASan, which rebuilds globals with a red zone. This patch solves both of those problems in the following way: 1. Rename the metadata to "type metadata". This new name reflects how the metadata is currently being used (i.e. to represent type information for CFI and vtable opt). The new name is reflected in the name for the associated intrinsic (llvm.type.test) and pass (LowerTypeTests). 2. Attach metadata directly to the globals that it pertains to, rather than using the "llvm.bitsets" global metadata node as we are doing now. This is done using the newly introduced capability to attach metadata to global variables (r271348 and r271358). See also: http://lists.llvm.org/pipermail/llvm-dev/2016-June/100462.html Differential Revision: http://reviews.llvm.org/D21053 llvm-svn: 273729	2016-06-24 21:21:32 +00:00
Rafael Espindola	a895a0cd01	Add support for musl-libc on ARM Linux. Patch by Lei Zhang! llvm-svn: 273726	2016-06-24 21:14:33 +00:00
Chris Bieneman	93e7119380	[obj2yaml] [yaml2obj] Support for MachO Universal binaries This patch adds round-trip support for MachO Universal binaries to obj2yaml and yaml2obj. Universal binaries have a header and list of architecture structures, followed by a the individual object files at specified offsets. llvm-svn: 273719	2016-06-24 20:42:28 +00:00
Michael Kuperstein	82d5da5aac	[PM] Port PreISelIntrinsicLowering to the new PM llvm-svn: 273713	2016-06-24 20:13:42 +00:00
David Majnemer	3b3e954ea2	SimplifyInstruction does not imply DCE We cannot remove an instruction with no uses just because SimplifyInstruction succeeds. It may have side effects. llvm-svn: 273711	2016-06-24 19:34:46 +00:00
Rafael Espindola	88ae09e9be	Use shouldAssumeDSOLocal in isOffsetFoldingLegal. This makes it slightly more powerful for dynamic-no-pic. llvm-svn: 273704	2016-06-24 18:48:36 +00:00
Reid Kleckner	fbd5eef691	Revert "InstCombine rule to fold trunc when value available" This reverts commit r273608. Broke building code with sanitizers, where apparently these kinds of loads, casts, and truncations are common: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/24502 http://crbug.com/623099 llvm-svn: 273703	2016-06-24 18:42:58 +00:00
Sanjay Patel	f8b08f7179	[InstCombine] consolidate commutation variants of matchSelectFromAndOr() in one place; NFCI By putting all the possible commutations together, we simplify the code. Note that this is NFCI, but I'm adding tests that actually exercise each commutation pattern because we don't have this anywhere else. llvm-svn: 273702	2016-06-24 18:26:02 +00:00
Kyle Butt	267164df0a	Codegen: Fix broken assumption in Tail Merge. Tail merge was making the assumption that a layout successor or predecessor was always a cfg successor/predecessor. Remove that assumption. Changes to tests are necessary because the errant cfg edges were preventing optimizations. llvm-svn: 273700	2016-06-24 18:16:36 +00:00
Rafael Espindola	955d3569e7	Use FileCheck. NFC. llvm-svn: 273699	2016-06-24 18:04:39 +00:00
Reid Kleckner	10dd55c548	[codeview] Emit parameter variables in the right order Clang emits them in reverse order to conform to the ABI, which requires left-to-right destruction. As a result, the order doesn't fall out naturally, and we have to sort things out in the backend. Fixes PR28213 llvm-svn: 273696	2016-06-24 17:55:40 +00:00
Peter Collingbourne	4f7c16dd53	Linker: Copy metadata when linking declarations. Differential Revision: http://reviews.llvm.org/D21624 llvm-svn: 273692	2016-06-24 17:42:21 +00:00
Reid Kleckner	9f7f3e1e64	[codeview] Emit base class information from DW_TAG_inheritance nodes There are two remaining issues here: 1. No vbptr information 2. Need to mention indirect virtual bases Getting indirect virtual bases is just a matter of adding an "indirect" flag, emitting them in the frontend, and ignoring them when appropriate for DWARF. All virtual bases use the same artificial vbptr field, so I think the vbptr offset will be best represented by an implicit __vbptr$ClassName member similar to our existing __vptr$ member. llvm-svn: 273688	2016-06-24 16:24:24 +00:00
Matthew Simpson	e794678404	[LV] Preserve order of dependences in interleaved accesses analysis The interleaved access analysis currently assumes that the inserted run-time pointer aliasing checks ensure the absence of dependences that would prevent its instruction reordering. However, this is not the case. Issues can arise from how code generation is performed for interleaved groups. For a load group, all loads in the group are essentially moved to the location of the first load in program order, and for a store group, all stores in the group are moved to the location of the last store. For groups having members involved in a dependence relation with any other instruction in the loop, this reordering can violate the dependence. This patch teaches the interleaved access analysis how to avoid breaking such dependences, and should fix PR27626. An assumption of the original analysis was that the accesses had been collected in "program order". The analysis was then simplified by visiting the accesses bottom-up. However, this ordering was never guaranteed for anything other than single basic block loops. Thus, this patch also enforces the desired ordering. Reference: https://llvm.org/bugs/show_bug.cgi?id=27626 Differential Revision: http://reviews.llvm.org/D19984 llvm-svn: 273687	2016-06-24 15:33:25 +00:00
Artur Pilipenko	6c7a8abf5c	Remangle intrinsics names when types are renamed This is a resubmittion of previously reverted rL273568. This is a fix for the problem mentioned in "LTO and intrinsics mangling" llvm-dev mail thread: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098387.html Reviewers: mehdi_amini, reames Differential Revision: http://reviews.llvm.org/D19373 llvm-svn: 273686	2016-06-24 15:10:29 +00:00
Saleem Abdulrasool	f6b5f0fffd	ExecutionEngine: add preliminary support for COFF ARM This adds rudimentary support for COFF ARM to the dynamic loader for the exeuction engine. This can be used by lldb to JIT code into a COFF ARM environment. This lays the foundation for the loader, though a few of the relocation types are yet unhandled. llvm-svn: 273682	2016-06-24 14:11:44 +00:00
Chad Rosier	fd342808e0	[MachineDominatorTree] Add a MDT verifier. Differential Revision: http://reviews.llvm.org/D21657 llvm-svn: 273678	2016-06-24 13:32:22 +00:00
Daniel Sanders	0d97270ae5	[mips] Use --check-prefixes where appropriate. NFC. llvm-svn: 273669	2016-06-24 12:23:17 +00:00
Matt Arsenault	86de486d31	AMDGPU: Add stub custom CodeGenPrepare pass This will do various things including ones CodeGenPrepare does, but with knowledge of uniform values. llvm-svn: 273657	2016-06-24 07:07:55 +00:00
George Burgess IV	2efbb3f394	Remove hack introduced by r273641. Hopefully the buildbots have had enough time to pick this up by now. llvm-svn: 273656	2016-06-24 06:58:15 +00:00
Matt Arsenault	0534f4aa79	AMDGPU: Un-xfail and add tests Un XFAIL a few tests plus a few more I had lying around in my tree, which seem to all work now but I don't see tests that quite test the same things. llvm-svn: 273655	2016-06-24 06:58:01 +00:00
Matt Arsenault	c581611e11	AMDGPU: Remove disable-irstructurizer subtarget feature The only real reason to use it is for testing, so replace it with a command line option instead of a potentially function dependent feature. llvm-svn: 273653	2016-06-24 06:30:22 +00:00
Vedant Kumar	2c96e88ed4	[llvm-cov] Fix two warnings They were using output streams inconsistently. One also had a grammar bug. I noticed these while trying to pare down D18278. llvm-svn: 273642	2016-06-24 02:33:01 +00:00
George Burgess IV	43e9ba0e5a	Temporary hack to clean a file from buildbots. Some buildbots are complaining about a .s file under test/ that was inadvertently created by a test earlier today, and is still hanging around. I'll undo this change in ~1hr (or whenever the bots are done getting rid of said file). llvm-svn: 273641	2016-06-24 02:19:11 +00:00
Chuang-Yu Cheng	68f7f1cf00	Teaching SimplifyCFG to recognize the Or-Mask trick that InstCombine uses to reduce the number of comparisons. Specifically, InstCombine can turn: (i == 5334 \|\| i == 5335) into: ((i \| 1) == 5335) SimplifyCFG was already able to detect the pattern: (i == 5334 \|\| i == 5335) to: ((i & -2) == 5334) This patch supersedes D21315 and resolves PR27555 (https://llvm.org/bugs/show_bug.cgi?id=27555). Thanks to David and Chandler for the suggestions! Author: Thomas Jablin (tjablin) Reviewers: majnemer chandlerc halfdan cycheng http://reviews.llvm.org/D21397 llvm-svn: 273639	2016-06-24 01:59:00 +00:00
Peter Collingbourne	b19924a425	BitcodeWriter: Remove redundant (and incorrect) check for whether to emit module summary. The function name Module::empty() is slightly misleading in that it only tests for the presence of functions in the module. However we still want to emit the module summary if the module contains only global variables or aliases. The presence of such entities can be determined simply by checking the summary directly, as we are doing below. Differential Revision: http://reviews.llvm.org/D21669 llvm-svn: 273638	2016-06-24 01:58:02 +00:00
George Burgess IV	a3d62be733	[CFLAA] Propagate StratifiedAttrs in interproc. analysis. This patch also has a refactor that kills StratifiedAttr, and leaves us with StratifiedAttrs, because having both was mildly redundant. This patch makes us correctly handle stratified attributes when doing interprocedural analysis. It also adds another attribute, AttrCaller, which acts like AttrUnknown. We can filter out AttrCaller values when during interprocedural analysis, since the caller should have information about what arguments it's passing to its callee. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21645 llvm-svn: 273636	2016-06-24 01:00:03 +00:00
Ahmed Bougacha	f0b46ee0aa	[ARM] Use aapcs_vfp for ___truncdfhf2 on v7k. r215348 overrode the f16 libcalls to be soft-float, but v7k uses the default (hard-float) calling convention. llvm-svn: 273631	2016-06-24 00:08:01 +00:00
Tom Stellard	14416ae6cd	Support/ELF: Add R_AMDGPU_GOTPCREL relocation Summary: We will start generating this in a future patch. Reviewers: arsenm, kzhuravl, rafael, ruiu, tony-tye Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21482 llvm-svn: 273628	2016-06-23 23:11:29 +00:00
Hans Wennborg	4b63a98de3	[codeview] Add classes and unions to the Local/Global UDTs lists Differential Revision: http://reviews.llvm.org/D21655 llvm-svn: 273626	2016-06-23 22:57:25 +00:00
Chris Bieneman	10dcd3bd3b	[yaml2macho] Removing asserts in favor of explicit yaml parse error 32-bit Mach headers don't have reserved fields. When generating the mapping for 32-bit headers leaving off the reserved field will result in parse errors if the field is present in the yaml. Added a CHECK-NOT line to ensure that mach_header.yaml isn't adding a reserved field, and a test to ensure that the parser error gets hit with 32-bit headers. llvm-svn: 273623	2016-06-23 22:36:31 +00:00
Kyle Butt	991df7889b	Codegen: [X86] preservere memory refs for folded umul_lohi Memory references were not being propagated for this folded load. This prevented optimizations like LICM from hoisting the load. Added test to verify that this allows LICM to proceed. llvm-svn: 273617	2016-06-23 21:40:35 +00:00
Kyle Butt	178314ab52	Codegen: LICM Remove check for exactly 1 register def. When considering whether to split an instruction with a memory operand into an explicit load and a register-based instruction, we currently check that the resulting instruction has exactly 1 def. This prevents 2 important LICM optimizations: compares with memory operands, and double indirect calls. All the tests and the test-suite pass without the check. My guess as to original intent is to limit the additional register pressure created by the new instruction, but given that we only split out a single register, it is already limited. The licm-dominance test now checks actual memory loads for hoisting instead of undef, and it tests compares. hoist-invariant-load.ll now checks for 2 hoists, the intended hoist, and a bonus from calling a got-relative function in a loop. llvm-svn: 273616	2016-06-23 21:38:49 +00:00
Rafael Espindola	2d3cce71ee	Uses shouldAssumeDSOLocal. With that SystemZ knows to avoid a GOT for PIE. llvm-svn: 273614	2016-06-23 21:18:59 +00:00
Rafael Espindola	f2898d73a5	Convert test to FileCheck. llvm-svn: 273609	2016-06-23 20:37:49 +00:00
Anna Thomas	31a0b2088f	InstCombine rule to fold trunc when value available Summary: This instcombine rule folds away trunc operations that have value available from a prior load or store. This kind of code can be generated as a result of GVN widening the load or from source code as well. Reviewers: reames, majnemer, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21246 llvm-svn: 273608	2016-06-23 20:22:22 +00:00
George Burgess IV	1f99da54c2	[CFLAA] Use better interprocedural function summaries. Previously, we just unified any arguments that seemed to be related to each other. With this patch, we now respect dereference levels, etc. which should make us substantially more accurate. Proper handling of StratifiedAttrs will be done in a later patch. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21536 llvm-svn: 273596	2016-06-23 18:55:23 +00:00
Hans Wennborg	a21a263101	[codeview] Fix letter casing in FileCheck regexes We print those hex numbers with uppercase letters. llvm-svn: 273594	2016-06-23 18:23:28 +00:00
Michael Kuperstein	0194d30e09	[X86] Extract HiPE prologue constants into metadata X86FrameLowering::adjustForHiPEPrologue() contains a hard-coded offset into an Erlang Runtime System-internal data structure (the PCB). As the layout of this data structure is prone to change, this poses problems for maintaining compatibility. To address this problem, the compiler can produce this information as module-level named metadata. For example (where P_NSP_LIMIT is the offending offset): !hipe.literals = !{ !2, !3, !4 } !2 = !{ !"P_NSP_LIMIT", i32 152 } !3 = !{ !"X86_LEAF_WORDS", i32 24 } !4 = !{ !"AMD64_LEAF_WORDS", i32 24 } Patch by Magnus Lang Differential Revision: http://reviews.llvm.org/D20363 llvm-svn: 273593	2016-06-23 18:17:25 +00:00
Nirav Dave	38bb1c15fd	Prevent generation of temp file in test from r273585. llvm-svn: 273588	2016-06-23 18:06:35 +00:00
Nirav Dave	bfdb483755	Preserve DebugInfo when replacing values in DAGCombiner Recommiting after correcting over-eager Debug Value transfer fixing PR28270. [DAG] Previously debug values would transfer debuginfo for the selected start node for a replacement which allows for debug to be dropped. Push debug value transfer to occur with node/value replacement in SelectionDAG, remove now extraneous transfers of debug values. This refixes PR9817 which was being incompletely checked in the testsuite. Reviewers: jyknight Subscribers: dblaikie, llvm-commits Differential Revision: http://reviews.llvm.org/D21037 llvm-svn: 273585	2016-06-23 17:52:57 +00:00
Pablo Barrio	7a64346533	[ARM] Lower (select_cc k k (select_cc ~k ~k x)) into (SSAT l_k x) Summary: SSAT saturates an integer, making sure that its value lies within an interval [-k, k]. Since the constant is given to SSAT as the number of bytes set to one, k + 1 must be a power of 2, otherwise the optimization is not possible. Also, the select_cc must use < and > respectively so that they define an interval. Reviewers: mcrosier, jmolloy, rengolin Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D21372 llvm-svn: 273581	2016-06-23 16:53:49 +00:00
Artur Pilipenko	80771b9ad9	Upgrade other old memset/memcpy signatures in tests causing buildbot failures with rL273568. llvm-svn: 273580	2016-06-23 16:34:52 +00:00
Hans Wennborg	b510b458b9	[codeview] Emit retained types Differential Revision: http://reviews.llvm.org/D21630 llvm-svn: 273579	2016-06-23 16:33:53 +00:00
Hans Wennborg	a63b50afb8	Revert r273568 "Remangle intrinsics names when types are renamed" It broke 2008-07-15-Bswap.ll and 2009-09-01-PostRAProlog.ll llvm-svn: 273574	2016-06-23 16:13:23 +00:00
Artur Pilipenko	4fec7b7131	Fix an old memset signature in 2009-09-01-PostRAProlog.ll test causing a buildbot failure llvm-svn: 273573	2016-06-23 16:07:10 +00:00
Artur Pilipenko	f0c9f81379	Remangle intrinsics names when types are renamed This is a fix for the problem mentioned in "LTO and intrinsics mangling" llvm-dev mail thread: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098387.html Reviewers: mehdi_amini, reames Differential Revision: http://reviews.llvm.org/D19373 llvm-svn: 273568	2016-06-23 15:25:09 +00:00
Michael Zolotukhin	2d3592d481	[LoopUnrollAnalyzer] Fix a bug in UnrolledInstAnalyzer::visitLoad. When simplifying a load we need to make sure that the type of the simplified value matches the type of the instruction we're processing. In theory, we can handle casts here as we deal with constant data, but since it's not implemented at the moment, we at least need to bail out. This fixes PR28262. llvm-svn: 273562	2016-06-23 14:31:31 +00:00
Valery Pykhtin	a852d695b8	[AMDGPU] Enable absolute expression initializer for amd_kernel_code_t fields. Differential Revision: http://reviews.llvm.org/D21380 llvm-svn: 273561	2016-06-23 14:13:06 +00:00
Simon Pilgrim	595dddb103	[X86][AVX512] Added AVX512F vector sign extend tests Now that Elena has confirmed that PR26474 has been fixed llvm-svn: 273560	2016-06-23 14:01:45 +00:00
Hal Finkel	a1271036c5	Allow DeadStoreElimination to track combinations of partial later wrties DeadStoreElimination can currently remove a small store rendered unnecessary by a later larger one, but could not remove a larger store rendered unnecessary by a series of later smaller ones. This adds that capability. It works by keeping a map, which is used as an effective interval map, for each store later overwritten only partially, and filling in that interval map as more such stores are discovered. No additional walking or aliasing queries are used. In the map forms an interval covering the the entire earlier store, then it is dead and can be removed. The map is used as an interval map by storing a mapping between the ending offset and the beginning offset of each interval. I discovered this problem when investigating a performance issue with code like this on PowerPC: #include <complex> using namespace std; complex<float> bar(complex<float> C); complex<float> foo(complex<float> C) { return bar(C)C; } which produces this: define void @_Z4testSt7complexIfE(%"struct.std::complex" noalias nocapture sret %agg.result, i64 %c.coerce) { entry: %ref.tmp = alloca i64, align 8 %tmpcast = bitcast i64* %ref.tmp to %"struct.std::complex"* %c.sroa.0.0.extract.shift = lshr i64 %c.coerce, 32 %c.sroa.0.0.extract.trunc = trunc i64 %c.sroa.0.0.extract.shift to i32 %0 = bitcast i32 %c.sroa.0.0.extract.trunc to float %c.sroa.2.0.extract.trunc = trunc i64 %c.coerce to i32 %1 = bitcast i32 %c.sroa.2.0.extract.trunc to float call void @_Z3barSt7complexIfE(%"struct.std::complex"* nonnull sret %tmpcast, i64 %c.coerce) %2 = bitcast %"struct.std::complex"* %agg.result to i64* %3 = load i64, i64* %ref.tmp, align 8 store i64 %3, i64* %2, align 4 ; <--- *** THIS SHOULD NOT BE HERE ** %_M_value.realp.i.i = getelementptr inbounds %"struct.std::complex", %"struct.std::complex"* %agg.result, i64 0, i32 0, i32 0 %4 = lshr i64 %3, 32 %5 = trunc i64 %4 to i32 %6 = bitcast i32 %5 to float %_M_value.imagp.i.i = getelementptr inbounds %"struct.std::complex", %"struct.std::complex"* %agg.result, i64 0, i32 0, i32 1 %7 = trunc i64 %3 to i32 %8 = bitcast i32 %7 to float %mul_ad.i.i = fmul fast float %6, %1 %mul_bc.i.i = fmul fast float %8, %0 %mul_i.i.i = fadd fast float %mul_ad.i.i, %mul_bc.i.i %mul_ac.i.i = fmul fast float %6, %0 %mul_bd.i.i = fmul fast float %8, %1 %mul_r.i.i = fsub fast float %mul_ac.i.i, %mul_bd.i.i store float %mul_r.i.i, float* %_M_value.realp.i.i, align 4 store float %mul_i.i.i, float* %_M_value.imagp.i.i, align 4 ret void } the problem here is not just that the i64 store is unnecessary, but also that it blocks further backend optimizations of the other uses of that i64 value in the backend. In the future, we might want to add a special case for handling smaller accesses (e.g. using a bit vector) if the map mechanism turns out to be noticeably inefficient. A sorted vector is also a possible replacement for the map for small numbers of tracked intervals. Differential Revision: http://reviews.llvm.org/D18586 llvm-svn: 273559	2016-06-23 13:46:39 +00:00
Daniel Sanders	de393329b9	[mips] Don't derive the default ABI from the CPU in the backend. Summary: The backend has no reason to behave like a driver and should generally do as it's told (and error out if it can't) instead of trying to figure out what the API user meant. The default ABI is still derived from the arch component as a concession to backwards compatibility. API-users that previously passed an explicit CPU and a triple that was inconsistent with the CPU (e.g. mips-linux-gnu and mips64r2) may get a different ABI to what they got before. However, it's expected that there are no such users on the basis that CodeGen has been asserting that the triple is consistent with the selected ABI for several releases. API-users that were consistent or passed '' or 'generic' as the CPU will see no difference. Reviewers: sdardis, rafael Subscribers: rafael, dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D21466 llvm-svn: 273557	2016-06-23 12:42:53 +00:00
Daniel Sanders	8e17bea7d5	[mips][ias] Integers are not registers. Summary: When parseAnyRegister() encounters a symbol alias, it parses integers and adds a corresponding expression to the operand list. This is clearly wrong since the only operands that parseAnyRegister() should be accepting are registers. It's not clear why this code was added and there are no test cases that cover it. I think it might be leftover from when searchSymbolAlias() was more widely used. Reviewers: sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D21377 llvm-svn: 273555	2016-06-23 10:54:09 +00:00
Diana Picus	e440f99913	[AMDGPU] Remove exit-on-error in test (PR27761) The exit-on-error flag was necessary in order to avoid an assertion when handling DYNAMIC_STACKALLOC nodes in SelectionDAGLegalize. We can avoid the assertion by creating some dummy nodes. This enables us to remove the exit-on-error flag on the first 2 run lines (SI), but on the third run line (R600) we would run into another assertion when trying to reserve indirect registers. This patch also replaces that assertion with an early exit from the function. Fixes PR27761. Differential Revision: http://reviews.llvm.org/D20852 llvm-svn: 273550	2016-06-23 09:19:16 +00:00
Simon Dardis	724e530296	[mips] Fix dext/dins definitions dext and dins, along with their 'm' and 'u' variants are defined in mips64r2, not mips64. Reviewers: dsanders, vkalintiris Differential Review: http://reviews.llvm.org/D21608 llvm-svn: 273549	2016-06-23 09:06:20 +00:00
Craig Topper	597aa42fec	[AVX512] Remove masked unpack intrinsics and autoupgrade to vectorshuffle and selects. llvm-svn: 273543	2016-06-23 07:37:33 +00:00
David Majnemer	d1fbf48566	[SCCP] Don't assume all Constants are ConstantInt This fixes PR28269. llvm-svn: 273521	2016-06-23 00:14:29 +00:00
Peter Collingbourne	6717803485	Revert r273456, "Preserve DebugInfo when replacing values in DAGCombiner" as it caused pr28270. llvm-svn: 273518	2016-06-23 00:06:17 +00:00
Vedant Kumar	cae1cff94a	[llvm-cov] Fix a buggy lit test There is no check prefix for "WHOLE-FILE": this particular line was supposed to use the "ALL" prefix. llvm-svn: 273517	2016-06-22 23:58:03 +00:00
Matt Arsenault	3cb4ddeb4e	AMDGPU: Fix liveness when expanding m0 loop llvm-svn: 273514	2016-06-22 23:40:57 +00:00
Sanjoy Das	e57bf680ec	[ImplicitNullChecks] Hoist trivial depdendencies if possible When trying to convert a loading instruction into a FAULTING_LOAD, we sometimes face code like this: if %R10 is not null: %R9<def> = MOV32ri Immediate %R9<def, tied> = AND32rm %R9, 0x20(%R10) else: goto TRAP In these cases we would like to use the AND32rm instruction as the faulting operation by hoisting the "depedency" def-ing %R9 also above the control flow, transforming the program into: %R9<def> = MOV32ri Immediate %R9<def, tied> = FAULTING_LOAD_OP(AND32rm %R9, 0x20(%R10), FailPath: TRAP) This change teaches ImplicitNullChecks to do the above, when safe. llvm-svn: 273501	2016-06-22 22:16:51 +00:00
Rafael Espindola	928a95d0b0	Use shouldAssumeDSOLocal. With this it handle -fPIE. llvm-svn: 273499	2016-06-22 22:09:17 +00:00
Changpeng Fang	47efe1f6db	AMDGPU/SI: Define an intrinsic to expose ds_swizzle_b32 Reviewers: tstellarAMD, arsenm Differential Revision: http://reviews.llvm.org/D21533 llvm-svn: 273496	2016-06-22 21:33:49 +00:00
Hans Wennborg	9a519a099e	[codeview] Write LF_UDT_SRC_LINE records (PR28251) Differential Revision: http://reviews.llvm.org/D21621 llvm-svn: 273495	2016-06-22 21:22:13 +00:00
Reid Kleckner	ac460619d2	[codeview] Fix the alignment padding that we add to list records Tweak the big-types.ll test case to catch this bug. We just need an enumerator name that doesn't have a length that is a multiple of 4. llvm-svn: 273477	2016-06-22 20:59:17 +00:00
Davide Italiano	ec7e29e941	[IRObjectFile] Propagate .weak attribute correctly for ASM symbols. PR: 28256 Differential Revision: http://reviews.llvm.org/D21616 llvm-svn: 273474	2016-06-22 20:48:15 +00:00
Peter Collingbourne	6d88fde3af	IR: Introduce Module::global_objects(). This is a convenience iterator that allows clients to enumerate the GlobalObjects within a Module. Also start using it in a few places where it is obviously the right thing to use. Differential Revision: http://reviews.llvm.org/D21580 llvm-svn: 273470	2016-06-22 20:29:42 +00:00
Matt Arsenault	9babdf4265	AMDGPU: Fix verifier errors in SILowerControlFlow The main sin this was committing was using terminator instructions in the middle of the block, and then not updating the block successors / predecessors. Split the blocks up to avoid this and introduce new pseudo instructions for branches taken with exec masking. Also use a pseudo instead of emitting s_endpgm and erasing it in the special case of a non-void return. llvm-svn: 273467	2016-06-22 20:15:28 +00:00
Krzysztof Parzyszek	f7f7068109	[Hexagon] Add SDAG preprocessing step to expose shifted addressing modes Transform: (store ch addr (add x (add (shl y c) e))) to: (store ch addr (add x (shl (add y d) c))), where e = (shl d c) for some integer d. The purpose of this is to enable generation of loads/stores with shifted addressing mode, i.e. mem(x+y<<#c). For that, the shift value c must be 0, 1 or 2. llvm-svn: 273466	2016-06-22 20:08:27 +00:00
Sanjay Patel	a06d989552	[ValueTracking] improve ComputeNumSignBits for vector constants This is similar to the computeKnownBits improvement in rL268479. There's probably more we can do for vector logic instructions, but this should let us see non-splat constant masking ops that can become vector selects instead of and/andn/or sequences. Differential Revision: http://reviews.llvm.org/D21610 llvm-svn: 273459	2016-06-22 19:20:59 +00:00
Chad Rosier	8c106bcbe8	[AArch64] Remove an overly aggressive assert. llvm-svn: 273458	2016-06-22 19:18:52 +00:00
Rafael Espindola	8474fdf90d	Start using shouldAssumeDSOLocal on Hexagon. Include a token test showing that access to private is now the same as to internal. llvm-svn: 273457	2016-06-22 19:09:14 +00:00
Nirav Dave	96beb7dee5	Preserve DebugInfo when replacing values in DAGCombiner Recommiting after fixing over-aggressive assertion [DAG] Previously debug values would transfer debuginfo for the selected start node for a replacement which allows for debug to be dropped. Push debug value transfer to occur with node/value replacement in SelectionDAG, remove now extraneous transfers of debug values. This refixes PR9817 which was being incompletely checked in the testsuite. Reviewers: jyknight Subscribers: dblaikie, llvm-commits Differential Revision: http://reviews.llvm.org/D21037 llvm-svn: 273456	2016-06-22 19:03:26 +00:00
Wei Ding	0526e7f8d9	AMDGPU: Add convergent flag to INLINEASM instruction. Differential Revision: http://reviews.llvm.org/D21214 llvm-svn: 273455	2016-06-22 18:51:08 +00:00
Reid Kleckner	156a7239c1	[codeview] Add IntroducingVirtual debug info flag CodeView needs to know if a virtual method was introduced in the current class, and base classes may not have complete type information, so we need to thread this bit through from the frontend. llvm-svn: 273453	2016-06-22 18:31:14 +00:00
Vedant Kumar	f5ac6d49e4	[asan] Do not instrument accesses to profiling globals It's only useful to asan-itize profiling globals while debugging llvm's profiling instrumentation passes. Enabling asan along with instrprof or gcov instrumentation shouldn't incur extra overhead. This patch is in the same spirit as r264805 and r273202, which disabled tsan instrumentation of instrprof/gcov globals. Differential Revision: http://reviews.llvm.org/D21541 llvm-svn: 273444	2016-06-22 17:30:58 +00:00
Reid Kleckner	643dd83661	[codeview] Defer emission of all referenced complete records This is the motivating example: struct B { int b; }; struct A { B b; }; int f(A p) { return p->b->b; } Clang emits complete types for both A and B because they are required to be complete, but our CodeView emission would only emit forward declarations of A and B. This was a consequence of the fact that the A* type must reference the forward declaration of A, which doesn't reference B at all. We can't eagerly emit complete definitions of A and B when we request the forward declaration's type index because of recursive types like linked lists. If we did that, our stack usage could get out of hand, and it would be possible to lower a type while attempting to lower a type, and we would need to double check if our type is already present in the TypeIndexMap after all recursive getTypeIndex calls. Instead, defer complete type emission until after all type lowering has completed. This ensures that all referenced complete types are emitted, and that type lowering is not re-entrant. llvm-svn: 273443	2016-06-22 17:15:28 +00:00
Zhan Jun Liau	0df350589f	[SystemZ] Recognize RISBG opportunities involving a truncate Summary: Recognize RISBG opportunities where the end result is narrower than the original input - where a truncate separates the shift/and operations. The motivating case is some code in postgres which looks like: srlg %r2, %r0, 11 nilh %r2, 255 Reviewers: uweigand Author: RolandF Differential Revision: http://reviews.llvm.org/D21452 llvm-svn: 273433	2016-06-22 16:16:27 +00:00
Krzysztof Parzyszek	f228c95f87	[Hexagon] Handle expansion of cmpxchg llvm-svn: 273432	2016-06-22 16:07:10 +00:00
Artur Pilipenko	1cec4fdddf	Upgrade old memset/memcpy signatures (without isVolatile argument) in tests We no longer have corresponding code in autoupgrade and the vast majority of the tests were fixed long time ago. Fix the remaining few. One of the verifier test cases is marked as XFAIL because it was passing only because the signature was incorrect. llvm-svn: 273428	2016-06-22 15:16:06 +00:00
Sanjay Patel	c6cacd6067	[InstSimplify] add ashr tests including vector types llvm-svn: 273421	2016-06-22 14:18:04 +00:00
Simon Pilgrim	bc35f9f702	[SLPVectorizer][X86] Added ceil/floor/nearbyint/rint/trunc vectorization tests llvm-svn: 273420	2016-06-22 14:07:46 +00:00
Sanjay Patel	21579bb39a	[InstSimplify] regenerate checks llvm-svn: 273419	2016-06-22 14:00:16 +00:00
George Rimar	ff8b539f7b	[llvm-readobj] - Teach llvm-readobj to print dependencies of SHT_GNU_verdef and refactor dumping method. This patch changes single method of llvm-readobj. It teaches SHT_GNU_verdef dumper to print version dependencies, also it removes few fields from output that can be dumped with other keys and slightly refactors code. Testcase was also modified to match the changes. Change is required for testcases of upcoming lld patches. Differential revision: http://reviews.llvm.org/D21552 llvm-svn: 273417	2016-06-22 13:43:38 +00:00
Simon Pilgrim	1536c19642	Regenerated test llvm-svn: 273404	2016-06-22 12:58:15 +00:00
Peter Zotov	ee462ca194	[OCaml] Add functions for accessing metadata nodes. Patch by Xinyu Zhuang. Differential Revision: http://reviews.llvm.org/D19309 llvm-svn: 273370	2016-06-22 03:30:24 +00:00
Reid Kleckner	0c5d874bea	[codeview] Improve names of types in scopes and member function ids We now include namespace scope info in LF_FUNC_ID records and we emit LF_MFUNC_ID records for member functions as we should. Class names are now fully qualified, which is what MSVC does. Add a little bit of scaffolding to handle ThisAdjustment when it arrives in DISubprogram. llvm-svn: 273358	2016-06-22 01:32:56 +00:00
Reid Kleckner	fd3b35ad84	[codeview] Add missing test from r273294 llvm-svn: 273355	2016-06-22 01:17:05 +00:00
Anna Zaks	644d9d3a44	[asan] Do not instrument pointers with address space attributes Do not instrument pointers with address space attributes since we cannot track them anyway. Instrumenting them results in false positives in ASan and a compiler crash in TSan. (The compiler should not crash in any case, but that's a different problem.) llvm-svn: 273339	2016-06-22 00:15:52 +00:00
Peter Collingbourne	21521891a2	IR: Allow metadata attachments on declarations, and fix lazy loaded metadata issue with globals. This change is motivated by an upcoming change to the metadata representation used for CFI. The indirect function call checker needs type information for external function declarations in order to correctly generate jump table entries for such declarations. We currently associate such type information with declarations using a global metadata node, but I plan [1] to move all such metadata to global object attachments. In bitcode, metadata attachments for function declarations appear in the global metadata block. This seems reasonable to me because I expect metadata attachments on declarations to be uncommon. In the long term I'd also expect this to be the case for CFI, because we'd want to use some specialized bitcode format for this metadata that could be read as part of the ThinLTO thin-link phase, which would mean that it would not appear in the global metadata block. To solve the lazy loaded metadata issue I was seeing with D20147, I use the same bitcode representation for metadata attachments for global variables as I do for function declarations. Since there's a use case for metadata attachments in the global metadata block, we might as well use that representation for global variables as well, at least until we have a mechanism for lazy loading global variables. In the assembly format, the metadata attachments appear after the "declare" keyword in order to avoid a parsing ambiguity. [1] http://lists.llvm.org/pipermail/llvm-dev/2016-June/100462.html Differential Revision: http://reviews.llvm.org/D21052 llvm-svn: 273336	2016-06-21 23:42:48 +00:00
Haicheng Wu	a783bac50b	[Kryo] Enable loop prefetcher. Differential Revision: http://reviews.llvm.org/D21535 llvm-svn: 273329	2016-06-21 22:47:56 +00:00
Kevin Enderby	606a338db9	Update llvm-obdump(1) to print FAT_MAGIC_64 for Darwin’s 64-bit universal files with the -macho and -universal-headers flags. Just a follow on to r273207, I missed updating the printing of the fat magic number when the universal file is a 64-bit universal file. rdar://26899493 llvm-svn: 273324	2016-06-21 21:55:01 +00:00
Jan Vesely	fea814d531	AMDGPU: Add implicitarg.ptr intrinsic. Points to the start of implicit arguments (appended after explicit arguments) Differential Revision: http://reviews.llvm.org/D20297 llvm-svn: 273317	2016-06-21 20:46:20 +00:00
Michael Kuperstein	78028b84d2	[X86] Make arithmetic operations cost model test saner. NFC. llvm-svn: 273316	2016-06-21 20:41:40 +00:00
Artem Belevich	d7ebcfb291	[NVPTX] Improve lowering of byval args of device functions. Avoid unnecessary spills of such vars to local space on SASS level and pointer space conversion. Instead, make a local copy with appropriate addrspacecasts and let LLVM optimize them away when possible. This allows loading value of the argument using [symbol+offset] instead of converting argument to general space pointer and using it for indexing (which also implicitly converts param space pointer to local space one on SASS level and triggers copying of argument into local space in the process). This reduces call overhead, uses less registers and reduces overall SASS size by 2-4%. Differential Review: http://reviews.llvm.org/D21421 llvm-svn: 273313	2016-06-21 20:30:26 +00:00
Easwaran Raman	8bceb9d210	Fix PR28219: Use profile summary from reader and not compute it Differentiaal revision: http://reviews.llvm.org/D21546 llvm-svn: 273301	2016-06-21 19:29:49 +00:00
Reid Kleckner	5b335b864b	[codeview] Add support for splitting field list records over 64KB The basic structure is that once a list record goes over 64K, the last subrecord of the list is an LF_INDEX record that refers to the next record. Because the type record graph must be toplogically sorted, this means we have to emit them in reverse order. We build the type record in order of declaration, so this means that if we don't want extra copies, we need to detect when we were about to split a record, and leave space for a continuation subrecord that will point to the eventual split top-level record. Also adds dumping support for these records. Next we should make sure that large method overload lists work properly. llvm-svn: 273294	2016-06-21 18:33:01 +00:00
Silviu Baranga	03b6a4fc88	[AArch64] Fix merge-store.ll regression test after r273271 r273271 changed the RUN line of the regression test to use -march=cyclone instead of -mtriple=aarch64-none-none. This caused a change in the output syntax for the ext instruction, causing the test to fail. Change this test back to using -mtriple=aarch64-none-none. llvm-svn: 273286	2016-06-21 17:15:49 +00:00
Etienne Bergeron	f6be62f2c8	[StackProtector] Fix computation of GSCookieOffset and EHCookieOffset with SEH4 Summary: Fix the computation of the offsets present in the scopetable when using the SEH (__except_handler4). This patch added an intrinsic to track the position of the allocation on the stack of the EHGuard. This position is needed when producing the ScopeTable. ``` struct _EH4_SCOPETABLE { DWORD GSCookieOffset; DWORD GSCookieXOROffset; DWORD EHCookieOffset; DWORD EHCookieXOROffset; _EH4_SCOPETABLE_RECORD ScopeRecord[1]; }; struct _EH4_SCOPETABLE_RECORD { DWORD EnclosingLevel; long (FilterFunc)(); union { void (HandlerAddress)(); void (*FinallyFunc)(); }; }; ``` The code to generate the EHCookie is added in `X86WinEHState.cpp`. Which is adding these instructions when using SEH4. ``` Lfunc_begin0: # BB#0: # %entry pushl %ebp movl %esp, %ebp pushl %ebx pushl %edi pushl %esi subl $28, %esp movl %ebp, %eax <<-- Loading FramePtr movl %esp, -36(%ebp) movl $-2, -16(%ebp) movl $L__ehtable$use_except_handler4_ssp, %ecx xorl ___security_cookie, %ecx movl %ecx, -20(%ebp) xorl ___security_cookie, %eax <<-- XOR FramePtr and Cookie movl %eax, -40(%ebp) <<-- Storing EHGuard leal -28(%ebp), %eax movl $__except_handler4, -24(%ebp) movl %fs:0, %ecx movl %ecx, -28(%ebp) movl %eax, %fs:0 movl $0, -16(%ebp) calll _may_throw_or_crash LBB1_1: # %cont movl -28(%ebp), %eax movl %eax, %fs:0 addl $28, %esp popl %esi popl %edi popl %ebx popl %ebp retl ``` And the corresponding offset is computed: ``` Luse_except_handler4_ssp$parent_frame_offset = -36 .p2align 2 L__ehtable$use_except_handler4_ssp: .long -2 # GSCookieOffset .long 0 # GSCookieXOROffset .long -40 # EHCookieOffset <<---- .long 0 # EHCookieXOROffset .long -2 # ToState .long _catchall_filt # FilterFunction .long LBB1_2 # ExceptionHandler ``` Clang is not yet producing function using SEH4, but it's a work in progress. This patch is a step toward having a valid implementation of SEH4. Unfortunately, it is not yet fully working. The EH registration block is not allocated at the right offset on the stack. Reviewers: rnk, majnemer Subscribers: llvm-commits, chrisha Differential Revision: http://reviews.llvm.org/D21231 llvm-svn: 273281	2016-06-21 15:58:55 +00:00
Evandro Menezes	230083ff9d	[AArch64] Change the preferred alignment for char and short to word alignment Differential Revision: http://reviews.llvm.org/D21414 llvm-svn: 273279	2016-06-21 15:55:18 +00:00
Silviu Baranga	dc43d61a25	[AArch64] Switch regression tests to test features not CPUs Summary: We have switched to using features for all heuristics, but the tests for these are still using -mcpu, which means we are not directly testing the features. This converts at least some of the existing regression tests to use the new features. This still leaves the following features untested: merge-narrow-ld predictable-select-expensive alternate-sextload-cvt-f32-pattern disable-latency-sched-heuristic Reviewers: mcrosier, t.p.northover, rengolin Subscribers: MatzeB, aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D21288 llvm-svn: 273271	2016-06-21 15:16:34 +00:00
Daniel Sanders	bf2c03ee69	[arm+x86] Make GNU variants behave like GNU w.r.t combining sin+cos into sincos. Summary: canCombineSinCosLibcall() would previously combine sin+cos into sincos for GNUX32/GNUEABI/GNUEABIHF regardless of whether UnsafeFPMath were set or not. However, GNU would only combine them for UnsafeFPMath because sincos does not set errno like sin and cos do. It seems likely that this is an oversight. Reviewers: t.p.northover Subscribers: t.p.northover, aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D21431 llvm-svn: 273259	2016-06-21 12:29:03 +00:00
Elena Demikhovsky	a266cf0518	reverted the prev commit due to assertion failure llvm-svn: 273258	2016-06-21 12:10:11 +00:00
Elena Demikhovsky	9823c995bc	Fixed consecutive memory access detection in Loop Vectorizer. It did not handle correctly cases without GEP. The following loop wasn't vectorized: for (int i=0; i<len; i++) to++ = from++; I use getPtrStride() to find Stride for memory access and return 0 is the Stride is not 1 or -1. Differential revision: http://reviews.llvm.org/D20789 llvm-svn: 273257	2016-06-21 11:32:01 +00:00
Craig Topper	283418fbb6	[AVX512] Add patterns for any-extending a mask that use the def of KMOVW/KMOVB without going through an EXTRACT_SUBREG and a MOVZX. llvm-svn: 273253	2016-06-21 07:37:32 +00:00
Craig Topper	9038aa3001	[AVX512] Use update_llc_test_checks.py to regenerate a test in preparation for a future commit. llvm-svn: 273252	2016-06-21 07:37:27 +00:00
James Y Knight	03c1415b8f	Revert "Change RelaxELFRelocations for llc." This reverts commit r273019. From email I sent to list: > I don't think this makes sense. Either the linker you're using supports > this feature, or it doesn't. Having it enabled for llc if your linker > doesn't support it is not fun. > > Further note that this also affects basically all other code using llvm > libraries -- other than Clang, which explicitly sets it back to false by > default, unless you set the ENABLE_X86_RELAX_RELOCATIONS cmake flag to > true. > > If you want to enable the relax mode across all llvm tools in some > circumstances, I think it should be via moving the cmake flag from clang > down into llvm. > > I'm going to revert this commit, since I both think it intrinsically > doesn't make sense to do this, and because it's breaking some of our > tools. llvm-svn: 273245	2016-06-21 05:40:41 +00:00
Craig Topper	0a0fb0fda1	[AVX512] Remove the masked vpcmpeq/vcmpgt intrinsics and autoupgrade them to native icmps. llvm-svn: 273240	2016-06-21 03:53:24 +00:00
George Burgess IV	9fdbfe17a8	[CFLAA] Be more aggressive with interprocedural analysis. This patch makes us perform interprocedural analysis on functions that don't have internal linkage. It also removes a test that should've been deleted in an earlier commit (since other tests now cover everything that the newly-removed test covers). Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21513 llvm-svn: 273229	2016-06-21 01:42:47 +00:00
George Burgess IV	87b2e41416	[CFLAA] Add interprocedural function summaries. This patch adds function summaries, so that we don't need to recompute various properties about function parameters/return values at each callsite of a function. It also adds many interprocedural tests for CFLAA. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21475#inline-182390 llvm-svn: 273219	2016-06-20 23:10:56 +00:00
Simon Pilgrim	356e823b51	[X86][SSE] Add cost model for BSWAP of vectors The BSWAP of vector types is quite efficiently implemented using vector shuffles on SSE/AVX targets, we should reflect the typical cost of this to encourage vectorization. Differential Revision: http://reviews.llvm.org/D21521 llvm-svn: 273217	2016-06-20 23:08:21 +00:00
Simon Pilgrim	225b2e37a0	[X86][X87] Fix issue with sitofp i64 -> fp128 on 32-bit targets Fix for PR27726 - sitofp i64 to fp128 was loading the merged load i64 to a x87 register preventing legalization for conversion to fp128. Added 32-bit tests for fp128 cast/conversions. llvm-svn: 273210	2016-06-20 22:41:17 +00:00
Kevin Enderby	d0a814ccec	Forgot to svn add one of my test files for the change in r273207. llvm-svn: 273208	2016-06-20 22:27:49 +00:00
Kevin Enderby	eb6d110c1d	Add support for Darwin’s 64-bit universal files with 64-bit offsets and sizes for the objects. Darwin added support in its Xcode 8.0 tools (released in the beta) for universal files where offsets and sizes for the objects are 64-bits to allow support for objects contained in universal files to be larger then 4gb. The change is very straight forward. There is a new magic number that differs by one bit, much like the 64-bit Mach-O files. Then there is a new structure that follow the fat_header that has the same layout but with the offset and size fields using 64-bit values instead of 32-bit values. rdar://26899493 llvm-svn: 273207	2016-06-20 22:16:18 +00:00
Vedant Kumar	0222adbcd2	[tsan] Do not instrument accesses to the gcov counters array There is a known intended race here. This is a follow-up to r264805, which disabled tsan instrumentation for updates to instrprof counters. For more background on this please see the discussion in D18164. llvm-svn: 273202	2016-06-20 21:24:26 +00:00
Sanjay Patel	9ad8fb68f7	[InstSimplify] analyze (optionally casted) icmps to eliminate obviously false logic (PR27869) By moving this transform to InstSimplify from InstCombine, we sidestep the problem/question raised by PR27869: https://llvm.org/bugs/show_bug.cgi?id=27869 ...where InstCombine turns an icmp+zext into a shift causing us to miss the fold. Credit to David Majnemer for a draft patch of the changes to InstructionSimplify.cpp. Differential Revision: http://reviews.llvm.org/D21512 llvm-svn: 273200	2016-06-20 20:59:59 +00:00
Dehao Chen	071bb9d7af	Pass AssumptionCacheTracker from SampleProfileLoader to Inliner Summary: Inliner needs ACT when calling InlineFunction. Instead of nullptr, we need to pass it in from SampleProfileLoader Reviewers: davidxl Subscribers: eraman, vsk, danielcdh, llvm-commits Differential Revision: http://reviews.llvm.org/D21205 llvm-svn: 273199	2016-06-20 20:53:40 +00:00
Matt Arsenault	802ebcb4bb	InstCombine: Don't strip convergent from intrinsic callsites Specific instances of intrinsic calls may want to be convergent, such as certain register reads but the intrinsic declaration is not. llvm-svn: 273188	2016-06-20 19:04:44 +00:00
Sanjay Patel	445d7ecf89	[InstCombine] consolidate some icmp+logic tests and improve checks llvm-svn: 273186	2016-06-20 18:40:37 +00:00
Matt Arsenault	2209625387	AMDGPU: Preserve undef flag on vcc when shrinking v_cndmask_b32 The implicit operand is added by the initial instruction construction, so this was adding an additional vcc use. The original one was missing the undef flag the original condition had, so the verifier would complain. llvm-svn: 273182	2016-06-20 18:34:00 +00:00
Matt Arsenault	b6d8c37e1a	AMDGPU: Fold more custom nodes to undef This will help sneak undefs past GVN into the DAG for some tests. Also add missing intrinsic for rsq_legacy, even though the node was already selected to the instruction. Also start passing the debug location to intrinsic errors. llvm-svn: 273181	2016-06-20 18:33:56 +00:00
Sanjay Patel	14dcb042bc	[InstCombine] update to use FileCheck with autogenerated exact checking llvm-svn: 273180	2016-06-20 18:23:40 +00:00
Matt Arsenault	ff98241f37	Generalize DiagnosticInfoStackSize to support other limits Backends may want to report errors on resources other than stack size. llvm-svn: 273177	2016-06-20 18:13:04 +00:00
Sanjay Patel	06918ad79e	[InstCombine] update to use FileCheck with autogenerated exact checking llvm-svn: 273173	2016-06-20 17:56:13 +00:00
Matt Arsenault	a9720c67f1	AMDGPU: Use correct method for determining instruction size llvm-svn: 273172	2016-06-20 17:51:32 +00:00
Sanjay Patel	a038240660	[InstCombine] regenerate checks llvm-svn: 273170	2016-06-20 17:48:48 +00:00
Rafael Espindola	959e9c8d01	Use shouldAssumeDSOLocal. With this ARM fast isel knows that PIE variable are not preemptable. llvm-svn: 273169	2016-06-20 17:45:33 +00:00
Tom Stellard	5350894265	AMDGPU: Add support for R_AMDGPU_REL32 relocations Reviewers: arsenm, kzhuravl, rafael Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21401 llvm-svn: 273168	2016-06-20 17:33:43 +00:00
Tom Stellard	1c89eb7db0	AMDGPU: Emit R_AMDGPU_ABS32_{HI,LO} for scratch buffer relocations Reviewers: arsenm, rafael, kzhuravl Subscribers: rafael, arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21400 llvm-svn: 273166	2016-06-20 16:59:44 +00:00
Sam Parker	d616cf07b2	[ARM] Enable isel of UMAAL TargetLowering and DAGToDAG are used to combine ADDC, ADDE and UMLAL dags into UMAAL. Selection is split into the two phases because it is easier to match the two patterns at those different times. Differential Revision: http://http://reviews.llvm.org/D21461 llvm-svn: 273165	2016-06-20 16:47:09 +00:00
David Majnemer	c5601df9fd	Reapply "[LoopIdiom] Don't remove dead operands manually" This reverts commit r273160, reapplying r273132. RecursivelyDeleteTriviallyDeadInstructions cannot be called on a parentless Instruction. llvm-svn: 273162	2016-06-20 16:03:25 +00:00
Cong Liu	1c28b6d733	Revert "[LoopIdiom] Don't remove dead operands manually" This reverts commit r273132. Breaks multiple test under /llvm/test:Transforms (e.g. llvm/test:Transforms/LoopIdiom/basic.ll.test) under asan. llvm-svn: 273160	2016-06-20 15:22:15 +00:00
Simon Pilgrim	0a81b13f31	[X86][F16C] Added half <-> double conversion tests llvm-svn: 273153	2016-06-20 12:51:55 +00:00
Pankaj Gode	0aab2e398a	[AARCH64] Add support for Broadcom Vulcan Adding core tuning support for new Broadcom Vulcan core (ARMv8.1A). Differential Revision: http://reviews.llvm.org/D21500 llvm-svn: 273148	2016-06-20 11:13:31 +00:00
Patrik Hagglund	7205215591	Fix for PR27940 After a store has been eliminated, when making sure that the instruction iterator points to a valid instruction, dbg intrinsics are now ignored as a new instruction. Patch by Henric Karlsson. Reviewed by Daniel Berlin. Differential Revision: http://reviews.llvm.org/D21076 llvm-svn: 273141	2016-06-20 09:10:10 +00:00
Igor Breger	e59165ca63	[AVX512] [AVX512/AVX][Intrinsics] Fix Variable Bit Shift Right Arithmetic intrinsic lowering. Differential Revision: http://reviews.llvm.org/D20897 llvm-svn: 273138	2016-06-20 07:05:43 +00:00
David Majnemer	a705843f23	[LoopIdiom] Don't remove dead operands manually Removing dead instructions requires remembering which operands have already been removed. RecursivelyDeleteTriviallyDeadInstructions has this logic, don't partially reimplement it in LoopIdiomRecognize. This fixes PR28196. llvm-svn: 273132	2016-06-20 02:33:29 +00:00
Sanjay Patel	a4b052c7d1	[InstSimplify] add tests for PR27689; regenerate checks llvm-svn: 273128	2016-06-19 21:40:12 +00:00
Simon Pilgrim	0887d5b02e	[X86][AVX512] Added 512-bit BITREVERSE tests and enabled AVX512BW lowering support llvm-svn: 273125	2016-06-19 20:59:19 +00:00
Simon Pilgrim	3d881a0230	[X86][SSE] Allow target shuffle combining to match masks with SM_Sentinel values We currently only allow exact matches of shuffle mask patterns during target shuffle combining. This patch relaxes this to permit SM_SentinelUndef in the combined shuffle to always be accepted as well as allowing exact matching of the SM_SentinelZero value. I've adjusted some tests that were requiring exact shuffle masks to now include undef values. Differential Revision: http://reviews.llvm.org/D21495 llvm-svn: 273119	2016-06-19 18:03:52 +00:00
Chris Dewhurst	a294541c05	[SPARC[ Correcting out-of-date unit tests checked in as part of r273108 llvm-svn: 273110	2016-06-19 12:52:39 +00:00
Chris Dewhurst	0c1e0026aa	[SPARC] Fixes for hardware errata on LEON processor. Passes to fix three hardware errata that appear on some LEON processor variants. The instructions FSMULD, FMULS and FDIVS do not work as expected on some LEON processors. This change allows those instructions to be substituted for alternatives instruction sequences that are known to work. These passes only run when selected individually, or as part of a processor defintion. They are not included in general SPARC processor compilations for non-LEON processors or for those LEON processors that do not have these hardware errata. llvm-svn: 273108	2016-06-19 11:03:28 +00:00
David Majnemer	3119599475	[LoadCombine] Combine Loads formed from GEPS with negative indexes Change the underlying offset and comparisons to use int64_t instead of uint64_t. Patch by River Riddle! Differential Revision: http://reviews.llvm.org/D21499 llvm-svn: 273105	2016-06-19 06:14:56 +00:00
Simon Pilgrim	9a09652a3a	[X86][AVX] Added test case for PR28136 llvm-svn: 273098	2016-06-18 22:59:08 +00:00
Simon Pilgrim	cd6d4352bc	[X86][SSSE3] Added examples of target shuffle combining failing to match undefs in shuffle masks llvm-svn: 273097	2016-06-18 21:18:21 +00:00
Simon Pilgrim	ab009e9f41	[X86][XOP] Added fast-isel tests matching tools/clang/test/CodeGen/xop-builtins.c llvm-svn: 273096	2016-06-18 21:07:31 +00:00
Simon Pilgrim	b201678763	[X86][TBM] Added fast-isel tests matching tools/clang/test/CodeGen/tbm-builtins.c llvm-svn: 273087	2016-06-18 17:20:52 +00:00
Vasileios Kalintiris	0cf68df6cc	[mips] Emit a JALR with $rd equal to $zero, instead of a JR in MIPS32R6. Summary: JR is an alias of JALR with $rd=0 in the R6 ISA. Also, this fixes recursive builds in MIPS32R6. Reviewers: dsanders, sdardis Subscribers: jfb, dschuff, dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D21370 llvm-svn: 273085	2016-06-18 15:39:43 +00:00
Amjad Aboud	76c9eb99a7	[codeview] Emit non-virtual method type. Differential Revision: http://reviews.llvm.org/D21011 llvm-svn: 273084	2016-06-18 10:25:07 +00:00
Marcin Koscielnicki	3feda222c6	[sanitizers] Disable target-specific lowering of string functions. CodeGen has hooks that allow targets to emit specialized code instead of calls to memcmp, memchr, strcpy, stpcpy, strcmp, strlen, strnlen. When ASan/MSan/TSan/ESan is in use, this sidesteps its interceptors, resulting in uninstrumented memory accesses. To avoid that, make these sanitizers mark the calls as nobuiltin. Differential Revision: http://reviews.llvm.org/D19781 llvm-svn: 273083	2016-06-18 10:10:37 +00:00
Matt Arsenault	a466a7cf62	Add looping testcase that broke in r272987 llvm-svn: 273081	2016-06-18 05:15:58 +00:00
Matt Arsenault	e935f05a94	AMDGPU: Fix kernel argument alignment impacting stack size Don't use AllocateStack because kernel arguments have nothing to do with the stack. The ensureMaxAlignment call was still changing the stack alignment. llvm-svn: 273080	2016-06-18 05:15:53 +00:00
Sanjoy Das	e8fd9561cb	[SCEV] Fix incorrect trip count computation The way we elide max expressions when computing trip counts is incorrect -- it breaks cases like this: ``` static int wrapping_add(int a, int b) { return (int)((unsigned)a + (unsigned)b); } void test() { volatile int end_buf = 2147483548; // INT_MIN - 100 int end = end_buf; unsigned counter = 0; for (int start = wrapping_add(end, 200); start < end; start++) counter++; print(counter); } ``` Note: the `NoWrap` variable that was being tested has little to do with the values flowing into the max expression; it is a property of the induction variable. test/Transforms/LoopUnroll/nsw-tripcount.ll was added to solely test functionality I'm reverting in this change, so I've deleted the test fully. llvm-svn: 273079	2016-06-18 04:38:31 +00:00
Simon Pilgrim	f4b2af1b9f	[X86][SSE4A] Autoupgrade and remove MOVNTSD/MOVNTSS intrinsics Required better annotation of the instruction defs upon removal of the builtin intrinsic pattern. llvm-svn: 273077	2016-06-18 02:38:26 +00:00
Rafael Espindola	88bb8ce821	Add a test for r273022. llvm-svn: 273073	2016-06-18 00:24:49 +00:00
Matt Arsenault	8fd5978811	Revert "Revert "Revert "InstCombine: Reduce trunc (shl x, K) width.""" This seems to be causing an infinite loop / crash in instcombine on some bots. llvm-svn: 273069	2016-06-17 23:36:38 +00:00
Tom Stellard	f8db61c5f0	Support/ELF: Add AMDGPU relocation definitions to match documentation Reviewers: arsenm, kzhuravl, rafael Subscribers: llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21443 llvm-svn: 273066	2016-06-17 22:38:08 +00:00
Adam Nemet	a9f09c6245	[LAA] Enable symbolic stride speculation for all LAA clients This is a functional change for LLE and LDist. The other clients (LV, LVerLICM) already had this explicitly enabled. The temporary boolean parameter to LAA is removed that allowed turning off speculation of symbolic strides. This makes LAA's caching interface LAA::getInfo only take the loop as the parameter. This makes the interface more friendly to the new Pass Manager. The flag -enable-mem-access-versioning is moved from LV to a LAA which now allows turning off speculation globally. llvm-svn: 273064	2016-06-17 22:35:41 +00:00
Matt Arsenault	0bb294b224	AMDGPU: Temporarily select trap to s_endpgm This should select to s_trap, but that requires additonal work to setup and enable the trap handler. For now emit s_endpgm so bugpoint stops getting stuck on the unsupported call to abort. Emit a warning that this will only terminate the wave and not really trap. llvm-svn: 273062	2016-06-17 22:27:03 +00:00
Kevin Enderby	ae108ffb9a	Add support for Darwin’s static library table of contents with 64-bit offsets to the archive members. Darwin added support in its Xcode 8.0 tools (released in the beta) for static library table of contents with 64-bit offsets to the archive members. The change is very straight forward. The table of contents member is named ___.SYMDEF_64 or "___.SYMDEF_64 SORTED" and same layout is used but with fields using 64 bit values instead of 32 bit values. rdar://26869808 llvm-svn: 273058	2016-06-17 22:16:06 +00:00
Reid Kleckner	6fa1546ad9	[codeview] Emit incomplete member pointer types with the unknown model An incomplete member pointer type will always have a size of zero, so we don't need an extra flag. Credit to David Majnemer for the idea. llvm-svn: 273057	2016-06-17 22:14:39 +00:00
Reid Kleckner	604105bb90	[codeview] Add DIFlags for pointer to member representations Summary: This seems like the least intrusive way to pass this information through. Fixes PR28151 Reviewers: majnemer, aprantl, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21444 llvm-svn: 273053	2016-06-17 21:31:33 +00:00
Matt Arsenault	8885910f8e	AMDGPU: Remove llvm.SI.tid intrinsic Mesa doesn't emit this for llvm >= 3.8 anymore. llvm-svn: 273050	2016-06-17 21:18:41 +00:00
Matt Arsenault	d76efc14b9	Revert "Revert "InstCombine: Reduce trunc (shl x, K) width."" Reapply r272987. Condition should be in terms of the destination type, and the flags should not be copied. llvm-svn: 273045	2016-06-17 20:33:53 +00:00
Marcin Koscielnicki	fd4b6b9e51	[SelectionDAG] Don't treat library calls specially if marked with nobuiltin. To be used by D19781. Differential Revision: http://reviews.llvm.org/D19801 llvm-svn: 273039	2016-06-17 20:24:07 +00:00
Michael Kuperstein	18d6d3d95e	[X86] Add missing AVX512 anyext patterns. Add AVX512 anyext patterns for i16 and i64, modeled on the existing i8 and i32 patterns. llvm-svn: 273038	2016-06-17 20:21:17 +00:00
Davide Italiano	b49aa5c0c4	[PM] Port MergedLoadStoreMotion to the new pass manager, take two. This is indeed a much cleaner approach (thanks to Daniel Berlin for pointing out), and also David/Sean for review. Differential Revision: http://reviews.llvm.org/D21454 llvm-svn: 273032	2016-06-17 19:10:09 +00:00
Tim Northover	28a9e7f4ba	ARM: take account of possible bundle when erasing an instruction. Fortunately this appears to be the only ARM-specific pass that runs while bundles might be in play, so no other cases need modifying. llvm-svn: 273029	2016-06-17 18:40:46 +00:00
Davide Italiano	16bfa13a77	[IRObjectFile] Handle .weak in RecordStreamer. Differential Revision: http://reviews.llvm.org/D21476 llvm-svn: 273027	2016-06-17 18:20:14 +00:00
James Y Knight	148a6469dc	Support expanding partial-word cmpxchg to full-word cmpxchg in AtomicExpandPass. Many CPUs only have the ability to do a 4-byte cmpxchg (or ll/sc), not 1 or 2-byte. For those, you need to mask and shift the 1 or 2 byte values appropriately to use the 4-byte instruction. This change adds support for cmpxchg-based instruction sets (only SPARC, in LLVM). The support can be extended for LL/SC-based PPC and MIPS in the future, supplanting the ISel expansions those architectures currently use. Tests added for the IR transform and SPARCv9. Differential Revision: http://reviews.llvm.org/D21029 llvm-svn: 273025	2016-06-17 18:11:48 +00:00
Rafael Espindola	9f86baebe0	Change RelaxELFRelocations for llc. As a developer tool it makes sense for it to use the new relocations. llvm-svn: 273019	2016-06-17 17:43:41 +00:00
Rafael Espindola	e021e85166	Change the default of -relax-relocations. llvm-mc is a developer tool, as such it make sense for it to use new features by default. This doesn't change the user facing clang, which still defaults to non relaxable relocations. llvm-svn: 273014	2016-06-17 17:04:56 +00:00
Sanjay Patel	216d8cf720	[InstCombine] allow more than one use for vector bitcast folding with selects The motivating example for this transform is similar to D20774 where bitcasts interfere with a single cmp/select sequence, but in this case we have 2 uses of each bitcast to produce min and max ops: define void @minmax_bc_store(<4 x float> %a, <4 x float> %b, <4 x float>* %ptr1, <4 x float>* %ptr2) { %cmp = fcmp olt <4 x float> %a, %b %bc1 = bitcast <4 x float> %a to <4 x i32> %bc2 = bitcast <4 x float> %b to <4 x i32> %sel1 = select <4 x i1> %cmp, <4 x i32> %bc1, <4 x i32> %bc2 %sel2 = select <4 x i1> %cmp, <4 x i32> %bc2, <4 x i32> %bc1 %bc3 = bitcast <4 x float>* %ptr1 to <4 x i32>* store <4 x i32> %sel1, <4 x i32>* %bc3 %bc4 = bitcast <4 x float>* %ptr2 to <4 x i32>* store <4 x i32> %sel2, <4 x i32>* %bc4 ret void } With this patch, we move the selects up to use the input args which allows getting rid of all of the bitcasts: define void @minmax_bc_store(<4 x float> %a, <4 x float> %b, <4 x float>* %ptr1, <4 x float>* %ptr2) { %cmp = fcmp olt <4 x float> %a, %b %sel1.v = select <4 x i1> %cmp, <4 x float> %a, <4 x float> %b %sel2.v = select <4 x i1> %cmp, <4 x float> %b, <4 x float> %a store <4 x float> %sel1.v, <4 x float>* %ptr1, align 16 store <4 x float> %sel2.v, <4 x float>* %ptr2, align 16 ret void } The asm for x86 SSE then improves from: movaps %xmm0, %xmm2 cmpltps %xmm1, %xmm2 movaps %xmm2, %xmm3 andnps %xmm1, %xmm3 movaps %xmm2, %xmm4 andnps %xmm0, %xmm4 andps %xmm2, %xmm0 orps %xmm3, %xmm0 andps %xmm1, %xmm2 orps %xmm4, %xmm2 movaps %xmm0, (%rdi) movaps %xmm2, (%rsi) To: movaps %xmm0, %xmm2 minps %xmm1, %xmm2 maxps %xmm0, %xmm1 movaps %xmm2, (%rdi) movaps %xmm1, (%rsi) The TODO comments show that we're limiting this transform only to vectors and only to bitcasts because we need to improve other transforms or risk creating worse codegen. Differential Revision: http://reviews.llvm.org/D21190 llvm-svn: 273011	2016-06-17 16:46:50 +00:00
David Majnemer	da9548f949	[CodeView] Refactor enumerator emission This addresses Amjad's review comments on D21442. llvm-svn: 273010	2016-06-17 16:13:21 +00:00
Reid Kleckner	ac945e27dd	[codeview] Make function names more consistent with MSVC Names in function id records don't include nested name specifiers or template arguments, but names in the symbol stream include both. For the symbol stream, instead of having Clang put the fully qualified name in the subprogram display name, recreate it from the subprogram scope chain. For the type stream, take the unqualified name and chop of any template arguments. This makes it so that CodeView DI metadata is more similar to DWARF DI metadata. llvm-svn: 273009	2016-06-17 16:11:20 +00:00
Nirav Dave	fd91041ce1	Refactor and cleanup Assembly Parsing / Lexing Recommiting after fixing non-atomic insert to front of SmallVector in MCAsmLexer.h Add explicit Comment Token in Assembly Lexing for future support for outputting explicit comments from inline assembly. As part of this, CPPHash Directives are now explicitly distinguished from Hash line comments in Lexer. Line comments are recorded as EndOfStatement tokens, not Comment tokens to simplify compatibility with current TargetParsers. This slightly complicates comment output. This remove all lexing tasks out of the parser, does minor cleanup to remove extraneous newlines Asm Output, and some improvements white space handling. Reviewers: rtrieu, dwmw2, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20009 llvm-svn: 273007	2016-06-17 16:06:17 +00:00
Simon Pilgrim	6a35e5ab97	[X86][SSE4A] Remove the GCCBuiltins from the movntsd/movntss intrinsic defs so we can emit native IR from clang. Clang-side sibling commit to follow. llvm-svn: 273002	2016-06-17 14:27:38 +00:00
Adam Nemet	e7709d92ba	[LLE] Don't hard-code the name of the preheader in test Turns out a didn't get this right because symbolic stride versioning changes the name. Relax the matching. llvm-svn: 272992	2016-06-17 09:13:15 +00:00
Matt Arsenault	ce56f7bbaa	Revert "InstCombine: Reduce trunc (shl x, K) width." This reverts commit r272987. This might be causing crashes on some bots. llvm-svn: 272990	2016-06-17 06:28:53 +00:00
Qin Zhao	bb4496f8c8	[esan\|cfrag] Add the struct field size array in StructInfo Summary: Adds the struct field size array in struct StructInfo. Updates test struct_field_count_basic.ll. Reviewers: aizatsky Subscribers: vitalybuka, zhaoqin, kcc, eugenis, bruening, llvm-commits Differential Revision: http://reviews.llvm.org/D21341 llvm-svn: 272989	2016-06-17 04:50:20 +00:00
Matt Arsenault	028fd50642	InstCombine: Reduce trunc (shl x, K) width. llvm-svn: 272987	2016-06-17 04:43:22 +00:00
Ranjeet Singh	39d2d097d6	[ARM] Add support for mrrc/mrrc2 intrinsics. Reapplying patch as it was reverted when it was first committed because of an assertion failure when the mrrc2 intrinsic was called in ARM mode. The failure was happening because the instruction was being built in ARMISelDAGToDAG.cpp and the tablegen description for mrrc2 instruction doesn't allow you to use a predicate. The ARM architecture manuals do say that mrrc2 in ARM mode can be predicated with AL in assembly but this has no effect on the encoding of the instruction as the top 4 bits will always be 1111 not 1110 which is the encoding for the condition AL. Differential Revision: http://reviews.llvm.org/D21408 llvm-svn: 272982	2016-06-17 00:52:41 +00:00
Evgeniy Stepanov	45fa0fd758	[safestack] Sink unsafe address computation to each use. This is a fix for PR27844. When replacing uses of unsafe allocas, emit the new location immediately after each use. Without this, the pointer stays live from the function entry to the last use, while it's usually cheaper to recalculate. llvm-svn: 272969	2016-06-16 22:34:04 +00:00
Evgeniy Stepanov	72d961a1da	[safestack] Fixup llvm.dbg.value when rewriting unsafe allocas. When moving unsafe allocas to the unsafe stack, dbg.declare intrinsics are updated to refer to the new location. This change does the same to dbg.value intrinsics. llvm-svn: 272968	2016-06-16 22:34:00 +00:00
David Majnemer	979cb88870	[CodeView] Implement support for enums MSVC handles enums differently from structs and classes: a forward declaration is not emitted unconditionally. MSVC does not emit an S_UDT record for the enum. Differential Revision: http://reviews.llvm.org/D21442 llvm-svn: 272960	2016-06-16 21:32:16 +00:00
Nirav Dave	280ecf6ff0	Revert "Refactor and cleanup Assembly Parsing / Lexing" Reverting for unexpected crashes on various platforms. This reverts commit r272953. llvm-svn: 272957	2016-06-16 21:19:23 +00:00
Sanjoy Das	07c6521aed	[EarlyCSE] Fold invariant loads Redundant invariant loads can be CSE'ed with very little extra effort over what early-cse already tracks, so it looks reasonable to make early-cse handle this case. llvm-svn: 272954	2016-06-16 20:47:57 +00:00
Nirav Dave	c19c3260df	Refactor and cleanup Assembly Parsing / Lexing Add explicit Comment Token in Assembly Lexing for future support for outputting explicit comments from inline assembly. As part of this, CPPHash Directives are now explicitly distinguished from Hash line comments in Lexer. Line comments are recorded as EndOfStatement tokens, not Comment tokens to simplify compatibility with current TargetParsers. This slightly complicates comment output. This remove all lexing tasks out of the parser, does minor cleanup to remove extraneous newlines Asm Output, and some improvements white space handling. Reviewers: rtrieu, dwmw2, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20009 llvm-svn: 272953	2016-06-16 20:34:22 +00:00
Justin Lebar	b0bd07aff7	Fix strip-dead-debug-info test if path contains "bar". This test checks that the string 'bar' (no quotes) doesn't exist in the output after running opt. But opt embeds the absolute path to the filename, and on my machine, the filename contains the string 'jlebar', causing the test to fail. This patch changes the test to look for the string '"bar"' instead. llvm-svn: 272941	2016-06-16 19:39:55 +00:00
Paul Robinson	4ce6a93226	Make check lines not match themselves. Noticed during review of the recent FileCheck change. llvm-svn: 272940	2016-06-16 19:38:48 +00:00
Sanjay Patel	0e9afea3c8	[x86] autoupgrade and remove AVX2 integer min/max intrinsics This will (hopefully very temporarily) break clang. The clang side of this should be the next commit. llvm-svn: 272932	2016-06-16 18:44:20 +00:00
Rafael Espindola	5a07687a8e	dos2unix this test. NFC. llvm-svn: 272928	2016-06-16 18:21:11 +00:00
Davide Italiano	41315f7873	[PM] Revert the port of MergeLoadStoreMotion to the new pass manager. Daniel Berlin expressed some real concerns about the port and proposed and alternative approach. I'll revert this for now while working on a new patch, which I hope to put up for review shortly. Sorry for the churn. llvm-svn: 272925	2016-06-16 17:40:53 +00:00
Sanjay Patel	d09a21682f	remove old FileCheck lines that are no longer used llvm-svn: 272921	2016-06-16 17:04:16 +00:00
Sanjay Patel	f664f3a578	[DAG] Remove redundant FMUL in Newton-Raphson SQRT code When calculating a square root using Newton-Raphson with two constants, a naive implementation is to use five multiplications (four muls to calculate reciprocal square root and another one to calculate the square root itself). However, after some reassociation and CSE the same result can be obtained with only four multiplications. Unfortunately, there's no reliable way to do such a reassociation in the back-end. So, the patch modifies NR code itself so that it directly builds optimal code for SQRT and doesn't rely on any further reassociation. Patch by Nikolai Bozhenov! Differential Revision: http://reviews.llvm.org/D21127 llvm-svn: 272920	2016-06-16 16:58:54 +00:00
Adam Nemet	776346848a	[LLE] New test to check that no versioning for symbolic strides occurs. NFC This is currently only performed in the Vectorizer. I will change this as symbolic stride collection is moved to LAA. This test will track when the actual functional change occurs. llvm-svn: 272918	2016-06-16 16:45:29 +00:00
Igor Laevsky	87f0d0e185	Revert r272891 "[JumpThreading] Prevent dangling pointer problems in BranchProbabilityInfo" It was causing failures in Profile-i386 and Profile-x86_64 tests. llvm-svn: 272912	2016-06-16 16:25:53 +00:00
Reid Kleckner	0166a71386	[PATCH] Fix RuntimeDyldCOFFI386 to handle relocations with a non-zero addend This fixes IMAGE_REL_I386_DIR32, IMAGE_REL_I386_DIR32NB, IMAGE_REL_I386_SECREL, and IMAGE_REL_I386_REL32 relocations. Based on patch by Jon Turney <jon.turney@dronecode.org.uk> llvm-svn: 272911	2016-06-16 16:21:41 +00:00
Rafael Espindola	afade35003	Don't print (PLT) on arm. The R_ARM_PLT32 relocation is deprecated and is not produced by MC. This means that the code being deleted is dead from the .o point of view and was making the .s more confusing. llvm-svn: 272909	2016-06-16 16:09:53 +00:00
Sanjay Patel	51ab757941	[x86] autoupgrade and remove SSE2/SSE41 integer min/max intrinsics Follow-up to: http://reviews.llvm.org/rL272806 http://reviews.llvm.org/rL272807 llvm-svn: 272907	2016-06-16 15:48:30 +00:00
Daniel Sanders	8e3c74210f	Remove redundant -mattr options from llvm-objdump commands. The -mattr options in these four tests have no effect on the output of llvm-objdump. In the case of the two Mips tests, removing the -mattr option left duplicate RUN lines so the duplicates have been removed. llvm-svn: 272906	2016-06-16 15:47:19 +00:00
Igor Laevsky	c9179fd2c2	[JumpThreading] Prevent dangling pointer problems in BranchProbabilityInfo We should update results of the BranchProbabilityInfo after removing block in JumpThreading. Otherwise we will get dangling pointer inside BranchProbabilityInfo cache. Differential Revision: http://reviews.llvm.org/D20957 llvm-svn: 272891	2016-06-16 13:28:25 +00:00
Patrik Hagglund	0acaefaf9d	PR27938: Don't remove valid DebugLoc in Scalarizer Added checks to make sure the Scalarizer::transferMetadata() don't remove valid debug locations from instructions. This is important as the verifier pass require that e.g. inlinable callsites have a valid debug location. https://llvm.org/bugs/show_bug.cgi?id=27938 Patch by Karl-Johan Karlsson Reviewers: dblaikie Differential Revision: http://reviews.llvm.org/D20807 llvm-svn: 272884	2016-06-16 10:48:54 +00:00
Daniel Sanders	de7816b0cd	[mips][mips16] Fix machine verifier errors about incorrect register classes on load/stores. Summary: [ls][bh] and [ls][bh]u cannot use sp-relative addresses and must therefore lower frameindex nodes such that there is a copy to a CPU16Regs register. This is now done consistently using a separate addressing mode that does not permit frameindex nodes. As part of this I've had to remove an optimization that reduced the number of instructions needed to work around the lack of sp-relative addresses on [ls][bh] and [ls][bh]u. This optimization used one of the eight CPU16Regs registers as a copy of the stack pointer and it's implementation was the root cause of many of the register vs register class mismatches. lw/sw can use sp-relative addresses but we ought to ensure that we use the correct version of lw/sw internally for things like IAS. This is not currently the case and this change does not fix this. However, this change does clean it up sufficiently well to fix the machine verifier failures. Also removed irrelevant functions from stchar.ll. Reviewers: sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D21062 llvm-svn: 272882	2016-06-16 10:20:59 +00:00
Daniel Sanders	1d14864bb3	[llvm-objdump] Support detection of feature bits from the object and implement this for Mips. Summary: The Mips implementation only covers the feature bits described by the ELF e_flags so far. Mips stores additional feature bits such as MSA in the .MIPS.abiflags section. Also fixed a small bug this revealed where microMIPS wouldn't add the EF_MIPS_MICROMIPS flag when using -filetype=obj. Reviewers: echristo, rafael Subscribers: rafael, mehdi_amini, dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D21125 llvm-svn: 272880	2016-06-16 09:17:03 +00:00
Hrvoje Varga	f1e0a03d08	[mips][micromips] Implement DCLO, DCLZ, DROTR, DROTR32 and DROTRV instructions Differential Revision: http://reviews.llvm.org/D16917 llvm-svn: 272876	2016-06-16 07:06:25 +00:00
Chuang-Yu Cheng	dbe00d51b4	SimplifyCFG is able to detect the pattern: (i == 5334 \|\| i == 5335) to: ((i & -2) == 5334) This transformation has some incorrect side conditions. Specifically, the transformation is only applied when the right-hand side constant (5334 in the example) is a power of two not equal and not equal to the negated mask. These side conditions were added in r258904 to fix PR26323. The correct side condition is that: ((Constant & Mask) == Constant)[(5334 & -2) == 5334]. It's a little bit hard to see why these transformations are correct and what the side conditions ought to be. Here is a CVC3 program to verify them for 64-bit values: ONE : BITVECTOR(64) = BVZEROEXTEND(0bin1, 63); x : BITVECTOR(64); y : BITVECTOR(64); z : BITVECTOR(64); mask : BITVECTOR(64) = BVSHL(ONE, z); QUERY( (y & ~mask = y) => ((x & ~mask = y) <=> (x = y OR x = (y \| mask))) ); Please note that each pattern must be a dual implication (<--> or iff). One directional implication can create spurious matches. If the implication is only one-way, an unsatisfiable condition on the left side can imply a satisfiable condition on the right side. Dual implication ensures that satisfiable conditions are transformed to other satisfiable conditions and unsatisfiable conditions are transformed to other unsatisfiable conditions. Here is a concrete example of a unsatisfiable condition on the left implying a satisfiable condition on the right: mask = (1 << z) (x & ~mask) == y --> (x == y \|\| x == (y \| mask)) Substituting y = 3, z = 0 yields: (x & -2) == 3 --> (x == 3 \|\| x == 2) The version of this code before r258904 had no side-conditions and incorrectly justified itself in comments through one-directional implication. Thanks to Chandler for the suggestion! Author: Thomas Jablin (tjablin) Reviewers: chandlerc majnemer hfinkel cycheng http://reviews.llvm.org/D21417 llvm-svn: 272873	2016-06-16 04:44:25 +00:00
Eli Friedman	bd254a6f45	[InstCombine] Don't widen metadata on store-to-load forwarding The original check for load CSE or store-to-load forwarding is wrong when the forwarded stored value happened to be a load. Ref https://github.com/JuliaLang/julia/issues/16894 Differential Revision: http://reviews.llvm.org/D21271 Patch by Yichao Yu! llvm-svn: 272868	2016-06-16 02:33:42 +00:00
Tim Northover	daa1c018b0	AArch64: allow MOV (imm) alias to be printed The backend has been around for years, it's pretty ridiculous that we can't even use the preferred form for printing "MOV" aliases. Unfortunately, TableGen can't handle the complex predicates when printing so it's a bunch of nasty C++. Oh well. llvm-svn: 272865	2016-06-16 01:42:25 +00:00
Reid Kleckner	95e8af7395	[codeview] Regenerate test case with unique identifiers Clang now emits these, and these match MSVC. Should allow more powerful merging of type records across TUs. llvm-svn: 272864	2016-06-16 01:33:59 +00:00
Matt Arsenault	191763026c	AMDGPU: Disable scheduling in some slow tests Disabling the pre-RA scheduler on large-work-group-registers causes it to be ~50% slower. llvm-svn: 272860	2016-06-16 00:56:47 +00:00
Xinliang David Li	1eaecefaf9	[PM] Port Add discriminator pass to new PM llvm-svn: 272847	2016-06-15 21:51:30 +00:00
Rui Ueyama	5dbea9db10	[Codeview] Add a class for LF_UDT_MOD_SRC_LINE. Differential Revision: http://reviews.llvm.org/D21406 llvm-svn: 272843	2016-06-15 21:25:29 +00:00
Sanjay Patel	74b40bdb53	[x86, SSE] update packed FP compare tests for direct translation from builtin to IR The clang side of this was r272840: http://reviews.llvm.org/rL272840 A follow-up step would be to auto-upgrade and remove these LLVM intrinsics completely. Differential Revision: http://reviews.llvm.org/D21269 llvm-svn: 272841	2016-06-15 21:22:15 +00:00
Kevin Enderby	d8a6e83dcf	Fix llvm-objdump when disassembling a stripped Mach-O binary with the -macho option. It was printing out nothing in this case. llvm-objdump tries to disassemble sections a symbol at a time. In the case of a fully stripped Mach-O executable the only symbol remaining in the (__TEXT,__text) section is the special linker defined symbol __mh_execute_header . This symbol is special in that while it is N_SECT symbol in the (__TEXT,__text) its address is before the start of the (__TEXT,__text). It’s address is the start of the __TEXT segment which is where the mach header is statically linked. So the code in DisassembleMachO() needs to deal with this case specially. rdar://26778273 llvm-svn: 272837	2016-06-15 21:14:01 +00:00
Sanjay Patel	0b526676ab	[x86] delete unnecessary function declarations Missed this in r272806, r272807. llvm-svn: 272834	2016-06-15 20:51:47 +00:00
Tim Northover	389a1e39ea	AArch64: stop trying to use 32-bit MOVZs when expanding patchpoints. Of course the assembly was right but because the opcode was MOVZWi it was encoded as "movz w16, #65535, lsl #32" which is an unallocated encoding and would go horribly wrong on a CPU. No idea how this bug survived this long. It seems nobody is using that aspect of patchpoints. llvm-svn: 272831	2016-06-15 20:33:36 +00:00
Sanjay Patel	1a4569df54	[x86] add folds for x86 vector compare nodes (PR27924) Ideally, we can get rid of most x86 LLVM intrinsics by transforming them to IR (and some of that happened with http://reviews.llvm.org/rL272807), but it doesn't cost much to have some simple folds in the backend too while we're working on that and as a backstop. This fixes: https://llvm.org/bugs/show_bug.cgi?id=27924 Differential Revision: http://reviews.llvm.org/D21356 llvm-svn: 272828	2016-06-15 20:26:58 +00:00
Matthias Braun	98ea88be42	Statistic: Add machine parseable json output - We lacked a short unique identifier for a statistics, so I renamed the current "Name" field that just contained the DEBUG_TYPE name of the current file to DebugType and added a new "Name" field that contains the C++ identifier of the statistic variable. - Add the -stats-json option which outputs statistics in json format. Differential Revision: http://reviews.llvm.org/D20995 llvm-svn: 272826	2016-06-15 20:19:16 +00:00
Kevin B. Smith	acbda9ef30	[X86]: Updated r272801 to promote 16 bit compares with immediate operand to 32 bits. This is in response to a comment by Eli Friedman. llvm-svn: 272814	2016-06-15 18:18:05 +00:00
David Majnemer	3128b10cdc	[CodeView] Add support for emitting S_UDT for typedefs Emit a S_UDT record for typedefs. We still need to do something for class types. Differential Revision: http://reviews.llvm.org/D21149 llvm-svn: 272813	2016-06-15 18:00:01 +00:00
Sanjay Patel	a6c6f09967	[x86, SSE] remove the GCCBuiltins from the integer min/max intrinsics This allows us to emit native IR in Clang (next commit). Also, update the intrinsic tests to show that codegen already knows how to handle the IR that Clang will soon produce. llvm-svn: 272806	2016-06-15 17:17:27 +00:00
David Majnemer	b62692e2e0	[TargetLibraryInfo] Teach isValidProtoForLibFunc about tan We would fail to validate the type of the tan function which would cause downstream users of isValidProtoForLibFunc to assert. This fixes PR28143. llvm-svn: 272802	2016-06-15 16:47:23 +00:00
Kevin B. Smith	54566a0e9a	[X86]: Quit promoting 8 and 16 bit compares to 32 bit. Differential Revision: http://reviews.llvm.org/D21144 llvm-svn: 272801	2016-06-15 16:37:46 +00:00
Nirav Dave	194cb55f37	Revert "Preserve DebugInfo when replacing values in DAGCombiner" Reverting due to assertion failure in lib/CodeGen/SelectionDAG/InstrEmitter.cpp This reverts commit r272792. llvm-svn: 272799	2016-06-15 16:08:50 +00:00
Kevin B. Smith	c3c82cdbd0	[X86]: Improve Liveness checking for X86FixupBWInsts.cpp Differential Revision: http://reviews.llvm.org/D21085 llvm-svn: 272797	2016-06-15 16:03:06 +00:00
Nirav Dave	a72e308403	Preserve DebugInfo when replacing values in DAGCombiner [DAG] Previously debug values would transfer debuginfo for the selected start node for a replacement which allows for debug to be dropped. Push debug value transfer to occur with node/value replacement in SelectionDAG, remove now extraneous transfers of debug values. This refixes PR9817 which was being incompletely checked in the testsuite. Reviewers: jyknight Subscribers: dblaikie, llvm-commits Differential Revision: http://reviews.llvm.org/D21037 llvm-svn: 272792	2016-06-15 14:50:08 +00:00
Ranjeet Singh	0db7be886e	Reverting r272778 because there's an assertion failure when running the test CodeGen/ARM/intrinsics-coprocessor.ll llvm-svn: 272791	2016-06-15 14:23:29 +00:00
Simon Dardis	7bdf183ac1	[mips] Missing test case Add missing testcase from r272666. llvm-svn: 272784	2016-06-15 13:49:58 +00:00
Ranjeet Singh	351364fe76	[ARM] Add support for mrrc/mrrc2 intrinsics. Differential Revision: http://reviews.llvm.org/D21178 llvm-svn: 272778	2016-06-15 11:32:24 +00:00
Daniel Sanders	df3185d2ea	[mips] Removed invalid test from o32_cc.ll MIPS32R1 cannot implement a 64-bit FPU because this was introduced in MIPS32R2. llvm-svn: 272769	2016-06-15 09:47:27 +00:00
Sean Silva	e0a9e66040	[PM] Port SLPVectorizer to the new PM This uses the "runImpl" approach to share code with the old PM. Porting to the new PM meant abandoning the anonymous namespace enclosing most of SLPVectorizer.cpp which is a bit of a bummer (but not a big deal compared to having to pull the pass class into a header which the new PM requires since it calls the constructor directly). llvm-svn: 272766	2016-06-15 08:43:40 +00:00
Daniel Sanders	d3bb20821d	[mips][msa] Fix register/register-class mismatches in emitINSERT_DF_VIDX(). Reviewers: sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D21068 llvm-svn: 272765	2016-06-15 08:43:23 +00:00
Zlatko Buljan	d2ed9c6c2c	[mips][microMIPS] Add CodeGen support for AND, OR16, OR, XOR*, NOT16 and NOR instructions Differential Revision: http://reviews.llvm.org/D16719 llvm-svn: 272764	2016-06-15 07:46:24 +00:00
Igor Breger	64cfd3a442	[AVX512] Fix BLENDM lowering patterns. Operands should be swapped to match SELECT behavior. Use BLENDM instead of masked move instruction. Differential Revision: http://reviews.llvm.org/D21001 llvm-svn: 272763	2016-06-15 07:30:38 +00:00
Nicolai Haehnle	a609259832	AMDGPU: Fix MUBUF offset bugs affecting llvm.amdgcn.buffer.* intrinsics Summary: This fixes two related bugs. First, the generic optimization passes unfortunately generate negative constant offsets but the hardware treats SOffset as an unsigned value. Second, there is a hardware bug on SI and CI, where address clamping in MUBUF instructions does not work correctly when SOffset is larger than the buffer size. This patch works around this bug by never using SOffset. An alternative workaround would be to do the clamping manually when SOffset is too large, but generating the required code sequence during instruction selection would be rather involved, and in any case the resulting code would probably be worse. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96360 Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21326 llvm-svn: 272761	2016-06-15 07:13:05 +00:00
Sean Silva	a4c2d150d0	[PM] Port AlignmentFromAssumptions to the new PM. This uses the "runImpl" pattern to share code between the old and new PM. llvm-svn: 272757	2016-06-15 06:18:01 +00:00
Sanjoy Das	0272be206a	Don't force SP-relative addressing for statepoints Summary: ... when the offset is not statically known. Prioritize addresses relative to the stack pointer in the stackmap, but fallback gracefully to other modes of addressing if the offset to the stack pointer is not a known constant. Patch by Oscar Blumberg! Reviewers: sanjoy Subscribers: llvm-commits, majnemer, rnk, sanjoy, thanm Differential Revision: http://reviews.llvm.org/D21259 llvm-svn: 272756	2016-06-15 05:35:14 +00:00
Amaury Sechet	a65a237805	Add support for callsite in the new C API for attributes Summary: The second consumer of attributes. Reviewers: Wallbraker, whitequark, echristo, rafael, jyknight Subscribers: mehdi_amini Differential Revision: http://reviews.llvm.org/D21266 llvm-svn: 272754	2016-06-15 05:14:29 +00:00
Tom Stellard	82785e9fe7	AMDGPU/SI: Correctly encode constant expressions Summary: We we have an MCConstantExpr, we can encode it directly into the instruction instead of emitting fixups. Reviewers: artem.tamazov, vpykhtin, SamWot, nhaustov, arsenm Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21236 Change-Id: I88b3edf288d48e65c5d705fc4850d281f8e36948 llvm-svn: 272750	2016-06-15 03:09:39 +00:00
Tom Stellard	89049702ce	AMDGPU/AsmParser: Add support for parsing symbol operands Summary: We can now reference symbols directly in operands, like this: s_mov_b32 s0, global Reviewers: artem.tamazov, vpykhtin, SamWot, nhaustov Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21038 llvm-svn: 272748	2016-06-15 02:54:14 +00:00
Michael Kuperstein	3277a05fcf	Recommit [LV] Enable vectorization of loops where the IV has an external use r272715 broke libcxx because it did not correctly handle cases where the last iteration of one IV is the second-to-last iteration of another. Original commit message: Vectorizing loops with "escaping" IVs has been disabled since r190790, due to PR17179. This re-enables it, with support for external use of both "post-increment" (last iteration) and "pre-increment" (second-to-last iteration) IVs. llvm-svn: 272742	2016-06-15 00:35:26 +00:00
David Majnemer	4a697c312f	[LoopUnroll] Don't crash trying to unroll loop with EH pad exit We do not support splitting cleanuppad or catchswitches. This is problematic for passes which assume that a loop is in loop simplify form (the loop would have a dedicated exit block instead of sharing it). While it isn't great that we don't support this for cleanups, we still cannot make loop-simplify form an assertable precondition because indirectbr will also disable these sorts of CFG cleanups. This fixes PR28132. llvm-svn: 272739	2016-06-15 00:19:56 +00:00
David Majnemer	577be0fed3	[CodeView] Don't emit debuginfo for imported symbols Emitting symbol information requires us to have a definition for the symbol. A symbol reference is insufficient. This fixes PR28123. llvm-svn: 272738	2016-06-15 00:19:52 +00:00
David Majnemer	cbf614a93b	Remove the ScalarReplAggregates pass Nearly all the changes to this pass have been done while maintaining and updating other parts of LLVM. LLVM has had another pass, SROA, which has superseded ScalarReplAggregates for quite some time. Differential Revision: http://reviews.llvm.org/D21316 llvm-svn: 272737	2016-06-15 00:19:09 +00:00
Matt Arsenault	f42c69206d	AMDGPU: Run pointer optimization passes llvm-svn: 272736	2016-06-15 00:11:01 +00:00
Peter Collingbourne	6dbee00d67	Verifier: check that functions have at most a single !prof attachment. llvm-svn: 272734	2016-06-14 23:13:15 +00:00
Xinliang David Li	8052238ac0	Fix a test case to match its intention llvm-svn: 272733	2016-06-14 23:05:46 +00:00
Michael Kuperstein	d4bd3ab5fe	Reverting r272715 since it broke libcxx. llvm-svn: 272730	2016-06-14 22:30:41 +00:00
Dehao Chen	9f2bdfb40f	Set machine block placement hot prob threshold for both static and runtime profile. Summary: With runtime profile, we have more confidence in branch probability, thus during basic block layout, we set a lower hot prob threshold so that blocks can be layouted optimally. Reviewers: djasper, davidxl Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20991 llvm-svn: 272729	2016-06-14 22:27:17 +00:00
Davide Italiano	d737dd2ec6	[PM] Port WholeProgramDevirt to the new pass manager. llvm-svn: 272721	2016-06-14 21:44:19 +00:00
Michael Kuperstein	23b6d6adc9	[LV] Enable vectorization of loops where the IV has an external use Vectorizing loops with "escaping" IVs has been disabled since r190790, due to PR17179. This re-enables it, with support for external use of both "post-increment" (last iteration) and "pre-increment" (second-to-last iteration) IVs. Differential Revision: http://reviews.llvm.org/D21048 llvm-svn: 272715	2016-06-14 21:27:27 +00:00
Sanjay Patel	4c3cb8b6c0	[x86] add current codegen tests for PR27924 llvm-svn: 272714	2016-06-14 21:25:46 +00:00
Evgeniy Stepanov	0be3cf1d35	Add a missing test. This is a test for r272421: Disable MSan-hostile loop unswitching. llvm-svn: 272713	2016-06-14 21:24:13 +00:00
Peter Collingbourne	96efdd6107	IR: Introduce local_unnamed_addr attribute. If a local_unnamed_addr attribute is attached to a global, the address is known to be insignificant within the module. It is distinct from the existing unnamed_addr attribute in that it only describes a local property of the module rather than a global property of the symbol. This attribute is intended to be used by the code generator and LTO to allow the linker to decide whether the global needs to be in the symbol table. It is possible to exclude a global from the symbol table if three things are true: - This attribute is present on every instance of the global (which means that the normal rule that the global must have a unique address can be broken without being observable by the program by performing comparisons against the global's address) - The global has linkonce_odr linkage (which means that each linkage unit must have its own copy of the global if it requires one, and the copy in each linkage unit must be the same) - It is a constant or a function (which means that the program cannot observe that the unique-address rule has been broken by writing to the global) Although this attribute could in principle be computed from the module contents, LTO clients (i.e. linkers) will normally need to be able to compute this property as part of symbol resolution, and it would be inefficient to materialize every module just to compute it. See: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160509/356401.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160516/356738.html for earlier discussion. Part of the fix for PR27553. Differential Revision: http://reviews.llvm.org/D20348 llvm-svn: 272709	2016-06-14 21:01:22 +00:00
Zachary Turner	1dc9fd3c4a	Resubmit "[pdb] Actually write a PDB to disk from YAML."" Reviewed By: ruiu Differential Revision: http://reviews.llvm.org/D21220 llvm-svn: 272708	2016-06-14 20:48:36 +00:00
Sanjoy Das	d7e8206b58	[ValueTracking] Calls to @llvm.assume always return This change teaches llvm::isGuaranteedToTransferExecutionToSuccessor that calls to @llvm.assume always terminate. Most other relevant intrinsics should be covered by the "CS.onlyReadsMemory() \|\| CS.onlyAccessesArgMemory()" bit but we were missing @llvm.assumes because we state that it clobbers memory. Added an LICM test case, but this change is not specific to LICM. llvm-svn: 272703	2016-06-14 20:23:16 +00:00
Wei Mi	b799a625f9	[X86] Reduce the width of multiplification when its operands are extended from i8 or i16 For <N x i32> type mul, pmuludq will be used for targets without SSE41, which often introduces many extra pack and unpack instructions in vectorized loop body because pmuludq generates <N/2 x i64> type value. However when the operands of <N x i32> mul are extended from smaller size values like i8 and i16, the type of mul may be shrunk to use pmullw + pmulhw/pmulhuw instead of pmuludq, which generates better code. For targets with SSE41, pmulld is supported so no shrinking is needed. Differential Revision: http://reviews.llvm.org/D20931 llvm-svn: 272694	2016-06-14 18:53:20 +00:00
Zachary Turner	07c229c9e7	Revert "[pdb] Actually write a PDB to disk from YAML." This reverts commit 879139e1c6577b09df52de56a6bab856a19ed185. This was committed accidentally when I blindly typed git svn dcommit instead of the command to generate a patch. llvm-svn: 272693	2016-06-14 18:51:35 +00:00
Zachary Turner	fe5bc02492	[pdb] Actually write a PDB to disk from YAML. llvm-svn: 272692	2016-06-14 18:49:36 +00:00
George Burgess IV	24eb0daf7c	[CFLAA] Tag arguments as escaped instead of unknown. This patch also includes some refactoring. Prior to this patch, we tagged all CFLAA attributes as unknown. This is suboptimal, since it meant that any Value used as an argument would be considered to alias any other Value that existed. Now that we have the machinery to tag sets below the set for an arbitrary value with attributes, it's okay to be less conservative with arguments. (Specifically, we still tag the set under an argument with unknown). Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21262 llvm-svn: 272690	2016-06-14 18:12:28 +00:00
Nirav Dave	f8d00d5cac	Fix BSS global handling in AsmPrinter Change EmitGlobalVariable to check final assembler section is in BSS before using .lcomm/.comm directive. This prevents globals from being put into .bss erroneously when -data-sections is used. This fixes PR26570. Reviewers: echristo, rafael Subscribers: llvm-commits, mehdi_amini Differential Revision: http://reviews.llvm.org/D21146 llvm-svn: 272674	2016-06-14 15:09:30 +00:00
Artem Tamazov	17091364d1	[AMDGPU][llvm-mc] Predefined symbols to access -mcpu from the assembly source (.option.machine_version...) The feature allows for conditional assembly etc. TODO: make those symbols read-only. Test added. Differential Revision: http://reviews.llvm.org/D21238 llvm-svn: 272673	2016-06-14 15:03:59 +00:00
Daniel Sanders	fd557cb01f	[FileCheck] Add --check-prefixes as a shorthand for multiple --check-prefix options. Summary: This new alias takes a comma separated list of prefixes which allows '--check-prefix=A --check-prefix=B --check-prefix=C' to be written as '--check-prefixes=A,B,C'. Reviewers: probinson Subscribers: probinson, llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D21293 llvm-svn: 272670	2016-06-14 14:28:04 +00:00
Simon Dardis	878c0b1b76	[mips] Optimize stack pointer adjustments. Instead of always using addu to adjust the stack pointer when the size out is of the range of an addiu instruction, use subu so that a smaller constant can be generated. This can give savings of ~3 instructions whenever a function has a a stack frame whose size is out of range of an addiu instruction. This change may break some naive stack unwinders. Partially resolves PR/26291. Thanks to David Chisnall for reporting the issue. Reviewers: dsanders, vkalintiris Differential Review: http://reviews.llvm.org/D21321 llvm-svn: 272666	2016-06-14 13:39:43 +00:00
James Molloy	65b6be1d3a	[Thumb] Fix off-by-one error in r272007 We can only generate immediates up to #510 with a MOV+ADD, not #511, because there's no such instruction as add #256. Found by Oliver Stannard and csmith! llvm-svn: 272665	2016-06-14 13:33:07 +00:00
Simon Dardis	4fbf76f7c3	[mips][atomics] Fix atomic instruction descriptions and uses. PR27458 highlights that the MIPS backend does not have well formed MIR for atomic operations (among other errors). This patch adds expands and corrects the LL/SC descriptions and uses for MIPS(64). Reviewers: dsanders, vkalintiris Differential Review: http://reviews.llvm.org/D19719 llvm-svn: 272655	2016-06-14 11:29:28 +00:00
Daniel Sanders	e858136d91	[mips][ias] Implement one N32 case (of two) for .cpsetup. This patch implements the N32 case where -mno-shared is in effect. The case where -mshared is in effect will be added later since doing that now requires additional changes to how we handle %hi(%neg(%gp_rel(foo))) expressions to emit the three relocations as three relocations (currently only one of the three would be emitted) which then requires further changes to our MCFixup handling. While we could fix both cases together, fixing the -mno-shared case allows us to fix the ELFCLASS bug (where N32 incorrectly uses ELFCLASS64 instead of ELFCLASS32) in a way that allows cpsetup.s to check for a correct output instead of another incorrect output. Reviewers: sdardis Subscribers: dsanders, llvm-commits, sdardis Differential Revision: http://reviews.llvm.org/D21131 llvm-svn: 272652	2016-06-14 10:13:47 +00:00
Simon Pilgrim	cf1165b86e	[X86][SSE4A] Added patterns for nontemporal stores of scalar float/doubles using MOVNTSD/MOVNTSS llvm-svn: 272651	2016-06-14 09:43:38 +00:00
Adam Nemet	73a26957fc	[LoopVer] Update all existing PHIs in the exit block We only used to add the edge from the cloned loop to PHIs that corresponded to values defined by the loop. We need to do this for all PHIs obviously since we need a PHI operand for each incoming edge. This includes things like PHIs with a constant value or with values defined before the original loop (see the testcases). After the patch the PHIs are added to the exit block in two passes. In the first pass we ensure there is a single-operand (LCSSA) PHI for each value defined by the loop. In the second pass we loop through each (single-operand) PHI and add the value for the edge from the cloned loop. If the value is defined in the loop we'll use the cloned instruction from the cloned loop. Fixes PR28037 llvm-svn: 272649	2016-06-14 09:38:54 +00:00
Simon Dardis	e661e528db	[mips] MIPS32/64 itineraries Itineraries for some pre MIPSR6 and EVA instructions. Some pseudo expanded instructions are marked as having no scheduling info. Reviewers: dsanders, vkalintiris Differential Review: http://reviews.llvm.org/D20418 llvm-svn: 272648	2016-06-14 09:35:29 +00:00
Daniel Sanders	435a653437	[mips][dsp] Fix use without def on DSPCtrl registers read by rddsp intrinsic. Reviewers: sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D21063 llvm-svn: 272647	2016-06-14 09:29:46 +00:00
Daniel Sanders	d2a49ec3ab	[mips][msa] copyPhysReg() should not set RegState::Define on result of CTCMSA. Summary: The machine verifier reports 'Explicit operand marked as def' when it is manually specified even though it agrees with the operand info. Reviewers: sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D21065 llvm-svn: 272646	2016-06-14 09:11:33 +00:00
Diana Picus	bae1d89e45	[SelectionDAG] Remove exit-on-error flag from test (PR27765) The exit-on-error flag in the ARM test is necessary in order to avoid an unreachable in the DAGTypeLegalizer, when trying to expand a physical register. We can also avoid this situation by introducing a bitcast early on, where the invalid scalar-to-vector conversion is detected. We also add a test for PowerPC, which goes through a similar code path in the SelectionDAGBuilder. Fixes PR27765. Differential Revision: http://reviews.llvm.org/D21061 llvm-svn: 272644	2016-06-14 07:30:20 +00:00
Igor Breger	484bace21b	re-generate the tests using the update_llc_test_checks.py script llvm-svn: 272643	2016-06-14 07:05:10 +00:00
Davide Italiano	cccf4f01ad	[PM] Port Mem2Reg to the new pass manager. llvm-svn: 272630	2016-06-14 03:22:22 +00:00
Craig Topper	99e30e6a66	[AVX512] Use MOVZX32 instead of MOVZ16 for loading single v8/v4/v2/v1 masks when KMOVB is not available. This has better behavior with respect to partial register stalls since it won't need to preserve the upper 16-bits of the GPR. llvm-svn: 272626	2016-06-14 03:13:00 +00:00
Craig Topper	ddab395397	[AVX512] Add patterns for zero-extending a mask that use the def of KMOVW/KMOVB without going through an EXTRACT_SUBREG and a MOVZX. llvm-svn: 272625	2016-06-14 03:12:54 +00:00
Craig Topper	cbe54a4bd9	[AVX512] Add tests for zero extending masks that show an unnecessary movzx instruction. A followup patch will remove that instruction, but adding the tests first to make the more obvious. llvm-svn: 272624	2016-06-14 03:12:48 +00:00
Sean Silva	6347df0f81	[PM] Port MemCpyOpt to the new PM. The need for all these Lookup* functions is just because of calls to getAnalysis inside methods (i.e. not at the top level) of the runOnFunction method. They should be straightforward to clean up when the old PM is gone. llvm-svn: 272615	2016-06-14 02:44:55 +00:00
Davide Italiano	5669ef1efe	Placate bots fixing a typo in AA-pipeline description. Sorry. llvm-svn: 272608	2016-06-14 01:11:12 +00:00
Sean Silva	46590d556a	Bring back "[PM] Port JumpThreading to the new PM" with a fix This reverts commit r272603 and adds a fix. Big thanks to Davide for pointing me at r216244 which gives some insight into how to fix this VS2013 issue. VS2013 can't synthesize a move constructor. So the fix here is to add one explicitly to the JumpThreadingPass class. llvm-svn: 272607	2016-06-14 00:51:09 +00:00
Davide Italiano	89ab89d6cd	[PM] Port MergedLoadStoreMotion to the new pass manager. llvm-svn: 272606	2016-06-14 00:49:23 +00:00
Sean Silva	7d5a57cbfc	Revert "[PM] Port JumpThreading to the new PM" This reverts commit r272597. Will investigate issue with VS2013 compilation and then recommit. llvm-svn: 272603	2016-06-14 00:26:31 +00:00
Sean Silva	f81328d0b4	[PM] Port JumpThreading to the new PM This follows the approach in r263208 (for GVN) pretty closely: - move the bulk of the body of the function to the new PM class. - expose a runImpl method on the new-PM class that takes the IRUnitT and pointers/references to any analyses and use that to implement the old-PM class. - use a private namespace in the header for stuff that used to be file scope llvm-svn: 272597	2016-06-13 22:52:52 +00:00
Kevin Enderby	d2d2ce9b9f	Update the AArch64ExternalSymbolizer to print literal strings as escaped strings so it is the same as the MCExternalSymbolizer. rdar://17349181 llvm-svn: 272588	2016-06-13 21:08:57 +00:00
Sanjoy Das	98ac278b86	Move previously added test case to the right location In rL272580 I accidentally added a test case to test/CodeGen when test/Transforms/DeadStoreElimination/ is a better place for it. llvm-svn: 272581	2016-06-13 20:12:07 +00:00
Sanjoy Das	d0bdf3e02b	Fix AAResults::callCapturesBefore for operand bundles Summary: AAResults::callCapturesBefore would previously ignore operand bundles. It was possible for a later instruction to miss its memory dependency on a call site that would only access the pointer through a bundle. Patch by Oscar Blumberg! Reviewers: sanjoy Differential Revision: http://reviews.llvm.org/D21286 llvm-svn: 272580	2016-06-13 19:55:04 +00:00
Simon Pilgrim	582b9ce36e	[X86][SSE] Added extract to scalar nontemporal store tests llvm-svn: 272577	2016-06-13 19:08:28 +00:00
David Majnemer	248190ba69	[X86] Remove llvm.x86.bit.scan.{forward,reverse}.32 The need for these intrinsics has been obviated by r272564 which reimplements their functionality using generic IR. llvm-svn: 272566	2016-06-13 17:33:13 +00:00
Rafael Espindola	426ae7c72d	Add triple to input file. Patch by H.J. Lu. llvm-svn: 272563	2016-06-13 17:08:15 +00:00
Marek Olsak	e93f6d6923	AMDGPU/SI: Set INDEX_STRIDE for scratch coalescing Summary: Mesa and other users must set this to enable coalescing: - STRIDE = 0 - SWIZZLE_ENABLE = 1 This makes one particular compute shader 8x faster. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D21136 llvm-svn: 272556	2016-06-13 16:05:57 +00:00
Ulrich Weigand	daae87aa21	[SystemZ] Enable index register memory constraints for inline ASM This enables use of the 'R' and 'T' memory constraints for inline ASM operands on SystemZ, which allow an index register as well as an immediate displacement. This patch includes corresponding documentation and test case updates. As with the last patch of this kind, I moved the 'm' constraint to the most general case, which is now 'T' (base + 20-bit signed displacement + index register). Author: colpell Differential Revision: http://reviews.llvm.org/D21239 llvm-svn: 272547	2016-06-13 14:24:05 +00:00
Ranjeet Singh	933e1aa39f	[ARM] Reverting r272544 because clang patch needs to go in as soon as llvm patch has gone in because tests will start breaking in Clang. llvm-svn: 272546	2016-06-13 10:58:24 +00:00
Ranjeet Singh	8feacb330d	[ARM] Add mrrc/mrrc2 co-processor intrinsics MRRC/MRRC2 instruction writes to two registers. The intrinsic definition returns a single uint64_t to represent the write, this is a compact way of representing a write to two 32 bit registers, the alternative might have been two return a struct of 2 uint32_t's but this isn't as nice. Differential Revision: llvm-svn: 272544	2016-06-13 10:43:50 +00:00
Strahinja Petrovic	f0980e4dc0	This patch fixes handling long double type when it is constant in soft float mode on PowerPC 32 architecture. llvm-svn: 272543	2016-06-13 10:29:29 +00:00
Simon Pilgrim	377bc2ea43	[X86][SSE4A] Renamed tests to correspond with the the instruction with being tested llvm-svn: 272542	2016-06-13 10:14:42 +00:00
Craig Topper	13cf7cac07	[AVX512] Remove maksed pshufd, pshuflw, and phufhw intrinsics and autoupgrade them to selects and shufflevector. llvm-svn: 272527	2016-06-13 02:36:48 +00:00
Sanjay Patel	977530a8c9	[x86, SSE] change patterns for CMPP to float types to allow matching with SSE1 (PR28044) This patch is intended to solve: https://llvm.org/bugs/show_bug.cgi?id=28044 By changing the definition of X86ISD::CMPP to use float types, we allow it to be created and pass legalization for an SSE1-only target where v4i32 is not legal. The motivational trail for this change includes: https://llvm.org/bugs/show_bug.cgi?id=28001 and eventually makes this trigger: http://reviews.llvm.org/D21190 Ie, after this step, we should be free to have Clang generate FP compare IR instead of x86 intrinsics for SSE C packed compare intrinsics. (We can auto-upgrade and remove the LLVM sse.cmp intrinsics as a follow-up step.) Once we're generating vector IR instead of x86 intrinsics, a big pile of generic optimizations can trigger. Differential Revision: http://reviews.llvm.org/D21235 llvm-svn: 272511	2016-06-12 15:03:25 +00:00
Craig Topper	1067986c5b	[X86] Remove sse2 pshufd/pshuflw/pshufhw intrinsics and upgrade them to shufflevector. llvm-svn: 272510	2016-06-12 14:11:32 +00:00
Simon Pilgrim	9d8bed1796	[X86][BMI] Added fast-isel tests for BMI1 intrinsics A lot of the codegen is pretty awful for these as they are mostly implemented as generic bit twiddling ops llvm-svn: 272508	2016-06-12 09:56:05 +00:00
Sean Silva	e3bb457423	[PM] Port DeadArgumentElimination to the new PM The approach taken here follows r267631. deadarghaX0r should be easy to port when the time comes to add new-PM support to bugpoint. llvm-svn: 272507	2016-06-12 09:16:39 +00:00
Sean Silva	f5080194fd	[PM] Port ReversePostOrderFunctionAttrs to the new PM Below are my super rough notes when porting. They can probably serve as a basic guide for porting other passes to the new PM. As I port more passes I'll expand and generalize this and make a proper docs/HowToPortToNewPassManager.rst document. There is also missing documentation for general concepts and API's in the new PM which will require some documentation. Once there is proper documentation in place we can put up a list of passes that have to be ported and game-ify/crowdsource the rest of the porting (at least of the middle end; the backend is still unclear). I will however be taking personal responsibility for ensuring that the LLD/ELF LTO pipeline is ported in a timely fashion. The remaining passes to be ported are (do something like `git grep "<the string in the bullet point below>"` to find the pass): General Scalar: [ ] Simplify the CFG [ ] Jump Threading [ ] MemCpy Optimization [ ] Promote Memory to Register [ ] MergedLoadStoreMotion [ ] Lazy Value Information Analysis General IPO: [ ] Dead Argument Elimination [ ] Deduce function attributes in RPO Loop stuff / vectorization stuff: [ ] Alignment from assumptions [ ] Canonicalize natural loops [ ] Delete dead loops [ ] Loop Access Analysis [ ] Loop Invariant Code Motion [ ] Loop Vectorization [ ] SLP Vectorizer [ ] Unroll loops Devirtualization / CFI: [ ] Cross-DSO CFI [ ] Whole program devirtualization [ ] Lower bitset metadata CGSCC passes: [ ] Function Integration/Inlining [ ] Remove unused exception handling info [ ] Promote 'by reference' arguments to scalars Please let me know if you are interested in working on any of the passes in the above list (e.g. reply to the post-commit thread for this patch). I'll probably be tackling "General Scalar" and "General IPO" first FWIW. Steps as I port "Deduce function attributes in RPO" --------------------------------------------------- (note: if you are doing any work based on these notes, please leave a note in the post-commit review thread for this commit with any improvements / suggestions / incompleteness you ran into!) Note: "Deduce function attributes in RPO" is a module pass. 1. Do preparatory refactoring. Do preparatory factoring. In this case all I had to do was to pull out a static helper (r272503). (TODO: give more advice here e.g. if pass holds state or something) 2. Rename the old pass class. llvm/lib/Transforms/IPO/FunctionAttrs.cpp Rename class ReversePostOrderFunctionAttrs -> ReversePostOrderFunctionAttrsLegacyPass in preparation for adding a class ReversePostOrderFunctionAttrs as the pass in the new PM. (edit: actually wait what? The new class name will be ReversePostOrderFunctionAttrsPass, so it doesn't conflict. So this step is sort of useless churn). llvm/include/llvm/InitializePasses.h llvm/lib/LTO/LTOCodeGenerator.cpp llvm/lib/Transforms/IPO/IPO.cpp llvm/lib/Transforms/IPO/FunctionAttrs.cpp Rename initializeReversePostOrderFunctionAttrsPass -> initializeReversePostOrderFunctionAttrsLegacyPassPass (note that the "PassPass" thing falls out of `s/ReversePostOrderFunctionAttrs/ReversePostOrderFunctionAttrsLegacyPass/`) Note that the INITIALIZE_PASS macro is what creates this identifier name, so renaming the class requires this renaming too. Note that createReversePostOrderFunctionAttrsPass does not need to be renamed since its name is not generated from the class name. 3. Add the new PM pass class. In the new PM all passes need to have their declaration in a header somewhere, so you will often need to add a header. In this case llvm/include/llvm/Transforms/IPO/FunctionAttrs.h is already there because PostOrderFunctionAttrsPass was already ported. The file-level comment from the .cpp file can be used as the file-level comment for the new header. You may want to tweak the wording slightly from "this file implements" to "this file provides" or similar. Add declaration for the new PM pass in this header: class ReversePostOrderFunctionAttrsPass : public PassInfoMixin<ReversePostOrderFunctionAttrsPass> { public: PreservedAnalyses run(Module &M, AnalysisManager<Module> &AM); }; Its name should end with `Pass` for consistency (note that this doesn't collide with the names of most old PM passes). E.g. call it `<name of the old PM pass>Pass`. Also, move the doxygen comment from the old PM pass to the declaration of this class in the header. Also, include the declaration for the new PM class `llvm/Transforms/IPO/FunctionAttrs.h` at the top of the file (in this case, it was already done when the other pass in this file was ported). Now define the `run` method for the new class. The main things here are: a) Use AM.getResult<...>(M) to get results instead of `getAnalysis<...>()` b) If the old PM pass would have returned "false" (i.e. `Changed == false`), then you should return PreservedAnalyses::all(); c) In the old PM getAnalysisUsage method, observe the calls `AU.addPreserved<...>();`. In the case `Changed == true`, for each preserved analysis you should do call `PA.preserve<...>()` on a PreservedAnalyses object and return it. E.g.: PreservedAnalyses PA; PA.preserve<CallGraphAnalysis>(); return PA; Note that calls to skipModule/skipFunction are not supported in the new PM currently, so optnone and optimization bisect support do not work. You can just drop those calls for now. 4. Add the pass to the new PM pass registry to make it available in opt. In llvm/lib/Passes/PassBuilder.cpp add a #include for your header. `#include "llvm/Transforms/IPO/FunctionAttrs.h"` In this case there is already an include (from when PostOrderFunctionAttrsPass was ported). Add your pass to llvm/lib/Passes/PassRegistry.def In this case, I added `MODULE_PASS("rpo-functionattrs", ReversePostOrderFunctionAttrsPass())` The string is from the `INITIALIZE_PASS*` macros used in the old pass manager. Then choose a test that uses the pass and use the new PM `-passes=...` to run it. E.g. in this case there is a test that does: ; RUN: opt < %s -basicaa -functionattrs -rpo-functionattrs -S \| FileCheck %s I have added the line: ; RUN: opt < %s -aa-pipeline=basic-aa -passes='require<targetlibinfo>,cgscc(function-attrs),rpo-functionattrs' -S \| FileCheck %s The `-aa-pipeline=basic-aa` and `require<targetlibinfo>,cgscc(function-attrs)` are what is needed to run functionattrs in the new PM (note that in the new PM "functionattrs" becomes "function-attrs" for some reason). This is just pulled from `readattrs.ll` which contains the change from when functionattrs was ported to the new PM. Adding rpo-functionattrs causes the pass that was just ported to run. llvm-svn: 272505	2016-06-12 07:48:51 +00:00
Amaury Sechet	5db224e1f0	Make sure we have a Add/Remove/Has function for various thing that can have attribute. Summary: This also deprecated the get attribute function familly. Reviewers: Wallbraker, whitequark, joker.eph, echristo, rafael, jyknight Subscribers: axw, joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D19181 llvm-svn: 272504	2016-06-12 06:17:24 +00:00
Eli Friedman	9f8031c2da	[MergedLoadStoreMotion] Use correct helper for load hoist safety. It isn't legal to hoist a load past a call which might not return; even if it doesn't throw, it could, for example, call exit(). Fixes http://llvm.org/PR27953. llvm-svn: 272495	2016-06-12 02:11:20 +00:00
Craig Topper	b7713e413b	[X86] Move tests for llvm.x86.avx.vpermil.* intrinsics to a -upgrade test since they are autoupgraded to shufflevector. llvm-svn: 272494	2016-06-12 01:41:06 +00:00
Eli Friedman	f1da33e4d3	[LICM] Make isGuaranteedToExecute more accurate. Summary: Make isGuaranteedToExecute use the isGuaranteedToTransferExecutionToSuccessor helper, and make that helper a bit more accurate. There's a potential performance impact here from assuming that arbitrary calls might not return. This probably has little impact on loads and stores to a pointer because most things alias analysis can reason about are dereferenceable anyway. The other impacts, like less aggressive hoisting of sdiv by a variable and less aggressive hoisting around volatile memory operations, are unlikely to matter for real code. This also impacts SCEV, which uses the same helper. It's a minor improvement there because we can tell that, for example, memcpy always returns normally. Strictly speaking, it's also introducing a bug, but it's not any worse than everywhere else we assume readonly functions terminate. Fixes http://llvm.org/PR27857. Reviewers: hfinkel, reames, chandlerc, sanjoy Subscribers: broune, llvm-commits Differential Revision: http://reviews.llvm.org/D21167 llvm-svn: 272489	2016-06-11 21:48:25 +00:00
Simon Pilgrim	2b7c02a04f	[X86] Updated test checks script to generalise LCPI symbol refs The script now replace '.LCPI888_8' style asm symbols with the {{\.LCPI.*}} re pattern - this helps stop hardcoded symbols in 32-bit x86 tests changing with every edit of the file Refreshed some tests to demonstrate the new check llvm-svn: 272488	2016-06-11 20:39:21 +00:00
Simon Pilgrim	3fc09f7be6	[CostModel][X86][SSE] Updated costs for vector BITREVERSE ops on SSSE3+ targets To account for the fast PSHUFB implementation now available llvm-svn: 272484	2016-06-11 19:23:02 +00:00
Simon Pilgrim	5b9bade8dd	[X86][SSSE3] Added PSHUFB LUT implementation of BITREVERSE PSHUFB can speed up BITREVERSE of byte vectors by performing LUT on the low/high nibbles separately and ORing the results. Wider integer vector types are already BSWAP'd beforehand so also make use of this approach. llvm-svn: 272477	2016-06-11 15:44:13 +00:00
Craig Topper	46f49fb407	[AVX512] Re-generate v8i64 shuffle test now that we use pshufd for some cases. llvm-svn: 272474	2016-06-11 13:57:08 +00:00
Craig Topper	504fba5c8a	[AVX512] Lower v8i64 and v16i32 to pshufd when possible. llvm-svn: 272473	2016-06-11 13:43:21 +00:00
Simon Pilgrim	6800a45790	[X86][SSE] Added PSLLDQ/PSRLDQ as a target shuffle type Ensure that PALIGNR/PSLLDQ/PSRLDQ are byte vectors so that they can be correctly decoded for target shuffle combining llvm-svn: 272471	2016-06-11 13:38:28 +00:00
Simon Pilgrim	8dd73e3ffa	[X86][AVX2] Added PSLLDQ/PSRLDQ shuffle combining tests llvm-svn: 272469	2016-06-11 13:18:21 +00:00
Craig Topper	40abd1cc61	[AVX512] Add support for lowering v32i16 shuffles with repeated lanes. This allows us to create 512-bit PSHUFLW/PSHUFHW. llvm-svn: 272450	2016-06-11 03:27:42 +00:00
Qin Zhao	bc8fbeacf3	[esan\|cfrag] Handle complex GEP instr in the cfrag tool Summary: Iterates all (except the first and the last) operands within each GEP instruction for instrumentation. Adds test struct_field_gep.ll. Reviewers: aizatsky Subscribers: vitalybuka, zhaoqin, kcc, eugenis, bruening, llvm-commits Differential Revision: http://reviews.llvm.org/D21242 llvm-svn: 272442	2016-06-10 22:28:55 +00:00
Quentin Colombet	f2a1909bb5	[IRTranslator] Support the translation of or. Now or instructions get translated into G_OR. llvm-svn: 272433	2016-06-10 20:50:35 +00:00
Sanjay Patel	b114fd65fc	[x86] enable bitcasted fabs/fneg transforms The vector cases don't change because we already have folds in X86ISelLowering to look through and remove bitcasts. llvm-svn: 272427	2016-06-10 20:33:50 +00:00
Zhan Jun Liau	ab42cbce98	[SystemZ] Support Compare and Traps Support and generate Compare and Traps like CRT, CIT, etc. Support Trap as legal DAG opcodes and generate "j .+2" for them by default. Add support for Conditional Traps and use the If Converter to convert them into the corresponding compare and trap opcodes. Differential Revision: http://reviews.llvm.org/D21155 llvm-svn: 272419	2016-06-10 19:58:10 +00:00
Tom Stellard	f3af841462	AMDGPU/SI: Don't use fixup_si_rodata for scratch rsrc relocations Summary: We need to set the fixup type to FK_Data_4 for the SCRATCH_RSRC_DWORD[01] symbols, since these require absolute relocations, and fixup_si_rodata is for relative relocations. Reviewers: arsenm, kzhuravl Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21153 llvm-svn: 272417	2016-06-10 19:26:38 +00:00
Mehdi Amini	cbd68ecf04	Move CodeGen test from Generic to X86 specific directory llvm-svn: 272416	2016-06-10 19:14:01 +00:00
Mehdi Amini	1d396832d3	Interprocedural Register Allocation (IPRA): add a Transformation Pass Adds a MachineFunctionPass that scans the body to find calls, and update the register mask with the one saved by the RegUsageInfoCollector analysis in PhysicalRegisterUsageInfo. Patch by Vivek Pandya <vivekvpandya@gmail.com> Differential Revision: http://reviews.llvm.org/D21180 llvm-svn: 272414	2016-06-10 18:37:21 +00:00
Sanjay Patel	d558bdadd2	[x86] add test for PR28044 llvm-svn: 272411	2016-06-10 18:05:55 +00:00
Saleem Abdulrasool	6d0d228d2a	test: split test into two files Split up the test cases into two inputs as per post-commit review comments from Renato. NFC. llvm-svn: 272408	2016-06-10 17:33:28 +00:00
Michael Kuperstein	9a0542a792	[X86] Add costs for SSE zext/sext to v4i64 to TTI The costs are somewhat hand-wavy, but should be much closer to the truth than what we get from BasicTTI. Differential Revision: http://reviews.llvm.org/D21156 llvm-svn: 272406	2016-06-10 17:01:05 +00:00
Mehdi Amini	bbacddfe92	Interprocedural Register Allocation (IPRA) Analysis Add an option to enable the analysis of MachineFunction register usage to extract the list of clobbered registers. When enabled, the CodeGen order is changed to be bottom up on the Call Graph. The analysis is split in two parts, RegUsageInfoCollector is the MachineFunction Pass that runs post-RA and collect the list of clobbered registers to produce a register mask. An immutable pass, RegisterUsageInfo, stores the RegMask produced by RegUsageInfoCollector, and keep them available. A future tranformation pass will use this information to update every call-sites after instruction selection. Patch by Vivek Pandya <vivekvpandya@gmail.com> Differential Revision: http://reviews.llvm.org/D20769 llvm-svn: 272403	2016-06-10 16:19:46 +00:00
Sanjay Patel	27f06ae7a5	[x86] fix test attributes and autogenerate checks llvm-svn: 272398	2016-06-10 15:30:52 +00:00
Sanjay Patel	cccccd9ab5	[x86] add missing tests for fcmp ueq/one Somehow, the codegen logic for these sequences has gone completely untested until now (note the 2 compare instructions generated per test). There's also an Intel AVX optimization opportunity exposed in these cases and the existing tests. Intel's (but not AMD's) AVX spec shows that extra FP predicates were added, so a single comparison should always be sufficient, and operand commutation should never be necessary. llvm-svn: 272397	2016-06-10 15:17:54 +00:00
Sanjay Patel	330a359fb3	[x86] regenerate checks llvm-svn: 272396	2016-06-10 14:48:50 +00:00
Matthew Simpson	12b9c5ba98	Reapply "[TTI] Refine default cost for interleaved load groups with gaps" This reapplies commit r272385 with a fix. The build was failing when compiled with gcc, but not with clang. With the fix, we now get the data layout from the current TTI implementation, which will hopefully solve the issue. llvm-svn: 272395	2016-06-10 14:33:30 +00:00
Simon Pilgrim	2fa2690bca	[X86][SSE] Added target shuffle combine tests for byte shift/rotates (PSLLDQ/PSRLDQ/PALIGNR) llvm-svn: 272392	2016-06-10 13:03:22 +00:00
Matthew Simpson	65c7b74de4	Revert "[TTI] Refine default cost for interleaved load groups with gaps" This reverts commit r272385. This commit broke the build. I'm temporarily reverting to investigate. llvm-svn: 272391	2016-06-10 12:41:33 +00:00
Matthew Simpson	b16907f17a	[TTI] Refine default cost for interleaved load groups with gaps This patch refines the default cost for interleaved load groups having gaps. If a load group has gaps, the legalized instructions corresponding to the unused elements will be dead. Thus, we don't need to account for them in the cost model. Instead, we only need to account for the fraction of legalized loads that will actually be used. Differential Revision: http://reviews.llvm.org/D20873 llvm-svn: 272385	2016-06-10 11:27:51 +00:00
Sam Kolton	945231ada5	[AMDGPU] AsmParser: Support for sext() modifier in SDWA. Some code cleaning in AMDGPUOperand. Summary: sext() modifier is supported in SDWA instructions only for integer operands. Spec is unclear should integer operands support abs and neg modifiers with sext - for now they are not supported. Renamed InputModsWithNoDefault to FloatInputMods. Added SextInputMods for operands that support sext() modifier. Added AMDGPUOperand::Modifier struct to handle register and immediate modifiers. Code cleaning in AMDGPUOperand class: organize method in groups (render-, predicate-methods...). Reviewers: vpykhtin, artem.tamazov, tstellarAMD Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D20968 llvm-svn: 272384	2016-06-10 09:57:59 +00:00
Simon Pilgrim	34263ad995	[X86][AVX512] Added VPSLLDQ/VPSRLDQ memory fold tests Memory operand is new for AVX512 (SSE/AVX2 didn't support it). Also dropped the 'mask' from the tests (VPSLLDQ/VPSRLDQ don't support masked operations). Regenerated VPALIGNR test now that the shuffle comments work llvm-svn: 272383	2016-06-10 09:56:20 +00:00
Craig Topper	200d237e57	[AVX512] Add shuffle comment printing for masked VPERMPD/VPERMQ. llvm-svn: 272371	2016-06-10 05:12:40 +00:00
Craig Topper	89c1761474	[AVX512] Fix shuffle comment printing to handle the masked versions of some shuffles. Previously we were printing the mask operands as the register names. llvm-svn: 272367	2016-06-10 04:48:05 +00:00
Qin Zhao	0b96aa7190	[esan\|cfrag] Add the struct field offset array in StructInfo Summary: Adds the struct field offset array in struct StructInfo. Updates test struct_field_count_basic.ll. Reviewers: aizatsky Subscribers: llvm-commits, bruening, eugenis, kcc, zhaoqin, vitalybuka Differential Revision: http://reviews.llvm.org/D21192 llvm-svn: 272362	2016-06-10 02:10:06 +00:00
Quentin Colombet	3198649199	[LiveRangeEdit] Add a test case for r272314. The test case is not great espicially because it is still cumbersome to run the regalloc pass with run-pass. (We miss a bunch of initiliazier to be properly implemented.) Related to llvm.org/PR27983 llvm-svn: 272360	2016-06-10 01:57:48 +00:00
Quentin Colombet	129458a7ed	[llc] Add support for several run-pass options. Previously we could run only one machine pass with the run-pass option. With that patch, we can now specify several passes with several run-pass options (or just one option with a list of comma separated passes) and llc will build the related pipeline. This is great to test the interaction of two passes that are not necessarily next to each other in the pipeline, or play with pass ordering. Now, we should be at parity with opt for the flexibility of running passes. Note: I also moved the run pass option from CommandFlags.h to llc.cpp because, really, this is needed only there! llvm-svn: 272356	2016-06-10 00:52:10 +00:00
Qin Zhao	d677d88867	[esan\|cfrag] Disable load/store instrumentation for cfrag Summary: Adds ClInstrumentFastpath option to control fastpath instrumentation. Avoids the load/store instrumentation for the cache fragmentation tool. Renames cache_frag_basic.ll to working_set_slow.ll for slowpath instrumentation test. Adds the __esan_init check in struct_field_count_basic.ll. Reviewers: aizatsky Subscribers: llvm-commits, bruening, eugenis, kcc, zhaoqin, vitalybuka Differential Revision: http://reviews.llvm.org/D21079 llvm-svn: 272355	2016-06-10 00:48:53 +00:00
Matt Arsenault	58ddad5bd6	AMDGPU: v_cndmask_b32 does not def vcc Fixes verifier errors after SIShrinkInstructions. llvm-svn: 272351	2016-06-10 00:18:41 +00:00
Tom Stellard	26a2ab7477	AMDGPU/SI: Make sure to emit TargetConstant nodes when matching ds_permute Summary: This fixes a bug with ds_permute instructions where if it was passed a constant address, then the offset operand would get assigned a register operand instead of an immediate. Reviewers: scchan, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19994 llvm-svn: 272349	2016-06-10 00:01:04 +00:00
Matt Arsenault	7757c59e48	AMDGPU: Fix flat atomics The flat atomics could already be selected, but only when using flat instructions for global memory. Add patterns for flat addresses. llvm-svn: 272345	2016-06-09 23:42:54 +00:00
Matt Arsenault	887018179a	AMDGPU: Fix i64 global cmpxchg This was using extract_subreg sub0 to extract the low register of the result instead of sub0_sub1, producing an invalid copy. There doesn't seem to be a way to use the compound subreg indices in tablegen since those are generated, so manually select it. llvm-svn: 272344	2016-06-09 23:42:48 +00:00
Matt Arsenault	25363d37fc	AMDGPU: Fix missing and broken check lines in atomic tests llvm-svn: 272343	2016-06-09 23:42:44 +00:00
Vitaly Buka	b451f1bdf6	Make sure that not interesting allocas are not instrumented. Summary: We failed to unpoison uninteresting allocas on return as unpoisoning is part of main instrumentation which skips such allocas. Added check -asan-instrument-allocas for dynamic allocas. If instrumentation of dynamic allocas is disabled it will not will not be unpoisoned. PR27453 Reviewers: kcc, eugenis Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21207 llvm-svn: 272341	2016-06-09 23:31:59 +00:00
Eric Christopher	1dbb23e162	Add aliases for mfvrsave/mtvrsave. Update a test as we're now going to emit it for easier reading of generated assembly as well. llvm-svn: 272339	2016-06-09 23:27:48 +00:00
George Burgess IV	652ec4f595	[CFLAA] Handle global/arg attrs more sanely. Prior to this patch, we used argument/global stratified attributes in order to note that a value could have come from either dereferencing a global/arg, or from the assignment from a global/arg. Now, AttrUnknown is placed on sets when we see a dereference, instead of the global/arg attributes. This allows us to be more aggressive in the future when we see global/arg attributes without AttrUnknown. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21110 llvm-svn: 272335	2016-06-09 23:15:04 +00:00
Vitaly Buka	79b75d3d11	Unpoison stack memory in use-after-return + use-after-scope mode Summary: We still want to unpoison full stack even in use-after-return as it can be disabled at runtime. PR27453 Reviewers: eugenis, kcc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21202 llvm-svn: 272334	2016-06-09 23:05:35 +00:00
Easwaran Raman	71069cf67d	Use ProfileSummaryInfo in inline cost analysis. Instead of directly using MaxFunctionCount and function entry count to determine callee hotness, use the isHotFunction/isColdFunction methods provided by ProfileSummaryInfo. Differential revision: http://reviews.llvm.org/D21045 llvm-svn: 272321	2016-06-09 22:23:21 +00:00
Simon Pilgrim	643734c565	[X86][AVX512] Added avx512 VPSLLDQ/VPSRLDQ instruction comments llvm-svn: 272319	2016-06-09 22:03:15 +00:00
Simon Pilgrim	f718682eb9	[X86][AVX512] Dropped avx512 VPSLLDQ/VPSRLDQ intrinsics Auto-upgrade to generic shuffles like sse/avx2 implementations now that we can lower to VPSLLDQ/VPSRLDQ llvm-svn: 272308	2016-06-09 21:09:03 +00:00
Simon Pilgrim	47c76e201a	[X86][AVX512] Fixed issue with v16i32 shuffles lowering to VPALIGNR llvm-svn: 272307	2016-06-09 20:53:12 +00:00
Simon Pilgrim	0ab9d3026a	[X86][AVX512] Added support for lowering 512-bit vector shuffles to bit/byte shifts 512-bit VPSLLDQ/VPSRLDQ can only be used for avx512bw targets so lowerVectorShuffleAsShift had to be adjusted to include the subtarget llvm-svn: 272300	2016-06-09 20:13:58 +00:00
Justin Lebar	ed2c282d4b	[NVPTX] Add intrinsics for shfl instructions. Summary: Currently clang emits these instructions via inline (volatile) asm in the CUDA headers. Switching to intrinsics will let the optimizer reason across calls to these intrinsics. Reviewers: tra Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D21160 llvm-svn: 272298	2016-06-09 20:04:08 +00:00
Easwaran Raman	e12c487b8c	[PM] Port LCSSA to the new PM. Differential Revision: http://reviews.llvm.org/D21090 llvm-svn: 272294	2016-06-09 19:44:46 +00:00
Wei Ding	ed0f97fad2	AMDGPU/SI: Fix 32-bit fdiv lowering We were using the fast fdiv lowering for all division, implementation of IEEE754 fdiv is added. http://reviews.llvm.org/D20557 llvm-svn: 272292	2016-06-09 19:17:15 +00:00
Michael Kuperstein	c5edcdeb0e	[LV] Use vector phis for some secondary induction variables Previously, we materialized secondary vector IVs from the primary scalar IV, by offseting the primary to match the correct start value, and then broadcasting it - inside the loop body. Instead, we can use a real vector IV, like we do for the primary. This enables using vector IVs for secondary integer IVs whose type matches the type of the primary. Differential Revision: http://reviews.llvm.org/D20932 llvm-svn: 272283	2016-06-09 18:03:15 +00:00
Davide Italiano	1a7e32cc48	Also fix a typo. Need more coffee today. llvm-svn: 272278	2016-06-09 17:06:01 +00:00
Davide Italiano	f326b30a15	Improve r272262, check that __stack_chk_guard is used. Thanks to Rafael for the suggestion. llvm-svn: 272277	2016-06-09 17:04:38 +00:00
Jan Vesely	2da0cba5fb	SelectionDAG: Implement expansion of {S,U}MIN/MAX in integer legalization Fixes {u,}long_{min,max,clamp} opencl piglit regressions on EG. Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D17898 llvm-svn: 272272	2016-06-09 16:04:00 +00:00
Haicheng Wu	5b458cc1f6	Reapply "[MBP] Reduce code size by running tail merging in MBP."" This reapplies commit r271930, r271915, r271923. They hit a bug in Thumb which is fixed in r272258 now. The original message: The code layout that TailMerging (inside BranchFolding) works on is not the final layout optimized based on the branch probability. Generally, after BlockPlacement, many new merging opportunities emerge. This patch calls Tail Merging after MBP and calls MBP again if Tail Merging merges anything. llvm-svn: 272267	2016-06-09 15:24:29 +00:00
Ulrich Weigand	79564611d9	[SystemZ] Enable long displacement constraints for inline ASM operands This enables use of the 'S' constraint for inline ASM operands on SystemZ, which allows for a memory reference with a signed 20-bit immediate displacement. This patch includes corresponding documentation and test case updates. I've changed the 'T' constraint to match the new behavior for 'S', as 'T' also uses a long displacement (though index constraints are still not implemented). I also changed 'm' to match the behavior for 'S' as this will allow for a wider range of displacements for 'm', though correct me if that's not the right decision. Author: colpell Differential Revision: http://reviews.llvm.org/D21097 llvm-svn: 272266	2016-06-09 15:19:16 +00:00
Davide Italiano	24f1f62dca	Move stackguard test to X86/ directory as it's not generic. llvm-svn: 272264	2016-06-09 15:16:58 +00:00
Davide Italiano	bd4243c519	[CodeGen] Change getSDagStackGuard to get an internal sym. Fixes a crash in the backend during an LTO build of rtld(1) in FreeBSD. llvm-svn: 272262	2016-06-09 14:23:38 +00:00
Hrvoje Varga	c962c4936e	[mips][microMIPS] Implement BOVC, BNVC, EXT, INS and JALRC instructions Differential Revision: http://reviews.llvm.org/D11798 llvm-svn: 272259	2016-06-09 12:57:23 +00:00
Igor Breger	f635367e2b	[AVX512] Remove masked_move/blendm intrinsic from back-end. This is complement patch to D21060. Differential Revision: http://reviews.llvm.org/D21174 llvm-svn: 272257	2016-06-09 11:46:55 +00:00
Zlatko Buljan	cd242c1655	[mips][microMIPS] Add CodeGen support for SEL., SELEQZ, SELNEZ, SELEQZ., SELNEZ.* and CMP.condn.fmt instructions Differential Revision: http://reviews.llvm.org/D20862 llvm-svn: 272256	2016-06-09 11:15:53 +00:00
Sam Kolton	c9bdcb75c4	[AMDGPU] Disassembler: Support for sdwa instructions Reviewers: vpykhtin, tstellarAMD Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D21129 llvm-svn: 272255	2016-06-09 11:04:45 +00:00
Diana Picus	db2aff0ab4	[llc] Remove exit-on-error flag from MIR tests (PR27770) This is made possible by removing an assert in llc that assumed MIRParser::parseLLVMModule would exit on error. MIRParser's documentation states that it returns null if a parsing error occurs, so there's no reason to assert. We can instead just fall through to where the check for a module is performed and exit if it is null. This commit is part of the clean-up after r269655. Fixes PR27770 Differential Revision: http://reviews.llvm.org/D20371 llvm-svn: 272254	2016-06-09 10:31:05 +00:00
Craig Topper	6f7288dc44	[AVX512] Fix shuffle decode printing for several instructions with write masks. There are still more bugs here with UNPCK and PALIGN for sure. But these were the easiest ones to fix. llvm-svn: 272252	2016-06-09 07:49:08 +00:00
James Molloy	feb9f4243b	[Thumb] Select a BIC instead of AND if the immediate can be encoded more optimally negated If an immediate is only used in an AND node, it is possible that the immediate can be more optimally materialized when negated. If this is the case, we can negate the immediate and use a BIC instead; int i(int a) { return a & 0xfffffeec; } Used to produce: ldr r1, [CONSTPOOL] ands r0, r1 CONSTPOOL: 0xfffffeec And now produces: movs r1, #255 adds r1, #20 ; Less costly immediate generation bics r0, r1 llvm-svn: 272251	2016-06-09 07:39:08 +00:00
Craig Topper	8537c11ff3	[X86] Fix a test I failed to re-generate in r272249. llvm-svn: 272250	2016-06-09 07:10:34 +00:00
Craig Topper	7a2993093e	[X86] Bring consistent naming to the SSE/AVX and AVX512 PALIGNR instructions. Then add shuffle decode printing for the EVEX forms which is made easier by having the naming structure more similar to other instructions. llvm-svn: 272249	2016-06-09 07:06:38 +00:00
Saleem Abdulrasool	d3568e3ba3	test: fix typo llvm-svn: 272242	2016-06-09 03:14:32 +00:00
Saleem Abdulrasool	6c19ffc8bc	AArch64: support the `.arch` directive in the IAS Add support to the AArch64 IAS for the `.arch` directive. This allows the assembly input to use architectural functionality in part of a file. This is used in existing code like BoringSSL. Resolves PR26016! llvm-svn: 272241	2016-06-09 02:56:40 +00:00
Teresa Johnson	7ab1f69272	[ThinLTO/gold] Enable summary-based internalization Summary: Enable existing summary-based importing support in the gold-plugin. Reviewers: mehdi_amini Subscribers: llvm-commits, mehdi_amini Differential Revision: http://reviews.llvm.org/D21080 llvm-svn: 272239	2016-06-09 01:14:13 +00:00
Sanjoy Das	c7f69b921f	Be wary of abnormal exits from loop when exploiting UB We can safely rely on a NoWrap add recurrence causing UB down the road only if we know the loop does not have a exit expressed in a way that is opaque to ScalarEvolution (e.g. by a function call that conditionally calls exit(0)). I believe with this change PR28012 is fixed. Note: I had to change some llvm-lit tests in LoopReroll, since it looks like they were depending on this incorrect behavior. llvm-svn: 272237	2016-06-09 01:13:59 +00:00
Reid Kleckner	6d1d27542f	[codeview] Skip DIGlobalVariables with no variable They have probably been discarded during optimization. llvm-svn: 272231	2016-06-09 00:29:00 +00:00
Quentin Colombet	2c6469687d	[MIR] Check that generic virtual registers get a size. Without that check it was possible to write test cases where the size was not specified and we ended up with weird asserts down the road, because the default value (1) would not make sense. llvm-svn: 272226	2016-06-08 23:27:46 +00:00
Michael Zolotukhin	8e7e76729d	[LoopSimplify] Preserve LCSSA when merging exit blocks. Summary: This fixes PR26682. Also add LCSSA as a preserved pass to LoopSimplify, that looks correct to me and allows to write a test for the issue. Reviewers: chandlerc, bogner, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21112 llvm-svn: 272224	2016-06-08 23:13:21 +00:00
Michael Zolotukhin	987ab631fa	[SLPVectorizer] Handle GEP with differing constant index types Summary: This fixes PR27617. Bug description: The SLPVectorizer asserts on encountering GEPs with different index types, such as i8 and i64. The patch includes a simple relaxation of the assert to allow constants being of different types, along with a regression test that will provoke the unrelaxed assert. Reviewers: nadav, mzolotukhin Subscribers: JesperAntonsson, llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D20685 Patch by Jesper Antonsson! llvm-svn: 272206	2016-06-08 21:55:16 +00:00
Dehao Chen	769219b11a	Revive http://reviews.llvm.org/D12778 to handle forward-hot-prob and backward-hot-prob consistently. Summary: Consider the following diamond CFG: A / \ B C \/ D Suppose A->B and A->C have probabilities 81% and 19%. In block-placement, A->B is called a hot edge and the final placement should be ABDC. However, the current implementation outputs ABCD. This is because when choosing the next block of B, it checks if Freq(C->D) > Freq(B->D) * 20%, which is true (if Freq(A) = 100, then Freq(B->D) = 81, Freq(C->D) = 19, and 19 > 8120%=16.2). Actually, we should use 25% instead of 20% as the probability here, so that we have 19 < 8125%=20.25, and the desired ABDC layout will be generated. Reviewers: djasper, davidxl Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20989 llvm-svn: 272203	2016-06-08 21:30:12 +00:00
Reid Kleckner	de3d8b500f	[DebugInfo] Add calling convention support for DWARF and CodeView Summary: Now DISubroutineType has a 'cc' field which should be a DW_CC_ enum. If it is present and non-zero, the backend will emit it as a DW_AT_calling_convention attribute. On the CodeView side, we translate it to the appropriate enum for the LF_PROCEDURE record. I added a new LLVM vendor specific enum to the list of DWARF calling conventions. DWARF does not appear to attempt to standardize these, so I assume it's OK to do this until we coordinate with GCC on how to emit vectorcall convention functions. Reviewers: dexonsmith, majnemer, aaboud, amccarth Subscribers: mehdi_amini, llvm-commits Differential Revision: http://reviews.llvm.org/D21114 llvm-svn: 272197	2016-06-08 20:34:29 +00:00
Evgeny Stupachenko	3e2f389a7e	The patch set unroll disable pragma when unroll with user specified count has been applied. Summary: Previously SetLoopAlreadyUnrolled() set the disable pragma only if there was some loop metadata. Now it set the pragma in all cases. This helps to prevent multiple unroll when -unroll-count=N is given. Reviewers: mzolotukhin Differential Revision: http://reviews.llvm.org/D20765 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 272195	2016-06-08 20:21:24 +00:00
Tim Shen	7aa0ad65ce	[MemCpyOpt] Do not exchange llvm.lifetime.start and llvm.memcpy Reviewers: iteratee Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21087 llvm-svn: 272192	2016-06-08 19:42:32 +00:00
Adrian McCarthy	f3c3c13206	Generate codeview for array type metadata. Differential Revision: http://reviews.llvm.org/D21107 llvm-svn: 272187	2016-06-08 18:22:59 +00:00
Reid Kleckner	ee641c20ca	[codeview] Avoid emitting an empty file checksum table Again, the Microsoft linker does not like empty substreams. We still emit an empty string table if CodeView is enabled, but that doesn't cause problems because it always contains at least one null byte. llvm-svn: 272183	2016-06-08 17:50:29 +00:00
Sanjoy Das	8598412e24	[SCEV] Track no-abnormal-exits instead of no-throw calls Absence of may-unwind calls is not enough to guarantee that a UB-generating use of an add-rec poison in the loop latch will actually cause UB. We also need to guard against calls that terminate the thread or infinite loop themselves. This partially addresses PR28012. llvm-svn: 272181	2016-06-08 17:48:42 +00:00
Sanjoy Das	9a65cd214d	Teach isGuarantdToTransferExecToSuccessor about debug info intrinsics Calls to `@llvm.dbg.*` can be assumed to terminate. llvm-svn: 272180	2016-06-08 17:48:36 +00:00
Sanjoy Das	a19edc4d15	Fix a bug in SCEV's poison value propagation The worklist algorithm introduced in rL271151 didn't check to see if the direct users of the post-inc add recurrence propagates poison. This change fixes the problem and makes the code structure more obvious. Note for release managers: correctness wise, this bug wasn't a regression introduced by rL271151 -- the behavior of SCEV around post-inc add recurrences was strictly improved (in terms of correctness) in rL271151. llvm-svn: 272179	2016-06-08 17:48:31 +00:00
Quentin Colombet	d1cd30b218	[AArch64][RegisterBankInfo] G_OR are fine on either GPR or FPR. Teach AArch64RegisterBankInfo that G_OR can be mapped on either GPR or FPR for 64-bit or 32-bit values. Add test cases demonstrating how this information is used to coalesce a computation on a single register bank. llvm-svn: 272170	2016-06-08 16:53:32 +00:00
Quentin Colombet	ea4d848be3	[Target] Introduce a generic opcode for bitwise OR: G_OR. This G_OR is used in GlobalISel to represent bitwise OR. llvm-svn: 272160	2016-06-08 16:12:19 +00:00
Oliver Stannard	b3378e2f3c	[ARM] MSR instructions implicitly set CPSR The MSR instructions can write to the CPSR, but we did not model this fact, so we could emit them in the middle of IT blocks, changing the condition flags for later instructions in the block. The tests use two calls to llvm.write_register.i32 because it is valid to use these instructions at the end of an IT block, which if conversion does do in some cases. With two calls, the first clobbers the flags, so a branch has to be used to make the second one conditional. Differential Revision: http://reviews.llvm.org/D21139 llvm-svn: 272154	2016-06-08 15:26:34 +00:00
Matthias Braun	3ef7df9cdf	MIR: Fix parsing of stack object references in MachineMemOperands The MachineMemOperand parser lacked the code to handle %stack.X references (%fixed-stack.X was working). llvm-svn: 272082	2016-06-08 00:47:07 +00:00
Rui Ueyama	f14a74c102	[pdbdump] Print out # of hash buckets. In the reference code, the field name is `cHashBuckets`. llvm-svn: 272075	2016-06-07 23:53:43 +00:00
Rui Ueyama	d833917f98	[pdbdump] Print out TPI hash key size. llvm-svn: 272073	2016-06-07 23:44:27 +00:00
Vedant Kumar	cef4360ac4	Retry^4 "[llvm-profdata] Add option to ingest filepaths from a file" Changes since the initial commit: - Use echo instead of printf. This should side-step the character escaping issues on Windows. Differential Revision: http://reviews.llvm.org/D20980 llvm-svn: 272068	2016-06-07 22:47:31 +00:00
Easwaran Raman	f894b5e89c	Use FileCheck instead of grepping for patterns. NFC. llvm-svn: 272065	2016-06-07 21:46:14 +00:00
Nicolai Haehnle	c00e03b8f5	AMDGPU: Add amdgpu-ps-wqm-outputs function attributes Summary: The presence of this attribute indicates that VGPR outputs should be computed in whole quad mode. This will be used by Mesa for prolog pixel shaders, so that derivatives can be taken of shader inputs computed by the prolog, fixing a bug. The generated code could certainly be improved: if a prolog pixel shader is used (which isn't common in modern OpenGL - they're used for gl_Color, polygon stipples, and forcing per-sample interpolation), Mesa will use this attribute unconditionally, because it has to be conservative. So WQM may be used in the prolog when it isn't really needed, and furthermore a silly back-and-forth switch is likely to happen at the boundary between prolog and main shader parts. Fixing this is a bit involved: we'd first have to add a mechanism by which LLVM writes the WQM-related input requirements to the main shader part binary, and then Mesa specializes the prolog part accordingly. At that point, we may as well just compile a monolithic shader... Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95130 Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D20839 llvm-svn: 272063	2016-06-07 21:37:17 +00:00
Simon Pilgrim	536434e80f	[X86][SSE4A] Regenerated SSE4A intrinsics tests There are no VEX encoded versions of SSE4A instructions, make sure that AVX targets give the same output llvm-svn: 272060	2016-06-07 21:15:45 +00:00
Eric Christopher	538d09d0dd	Revert "Differential Revision: http://reviews.llvm.org/D20557 " Author: Wei Ding <wei.ding2@amd.com> Date: Tue Jun 7 19:04:44 2016 +0000 Differential Revision: http://reviews.llvm.org/D20557 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272044 91177308-0d34-0410-b5e6-96231b3b80d8 as it was breaking the bots. This reverts commit r272044. llvm-svn: 272056	2016-06-07 20:27:12 +00:00
Etienne Bergeron	22bfa83208	[stack-protection] Add support for MSVC buffer security check Summary: This patch is adding support for the MSVC buffer security check implementation The buffer security check is turned on with the '/GS' compiler switch. * https://msdn.microsoft.com/en-us/library/8dbf701c.aspx * To be added to clang here: http://reviews.llvm.org/D20347 Some overview of buffer security check feature and implementation: * https://msdn.microsoft.com/en-us/library/aa290051(VS.71).aspx * http://www.ksyash.com/2011/01/buffer-overflow-protection-3/ * http://blog.osom.info/2012/02/understanding-vs-c-compilers-buffer.html For the following example: ``` int example(int offset, int index) { char buffer[10]; memset(buffer, 0xCC, index); return buffer[index]; } ``` The MSVC compiler is adding these instructions to perform stack integrity check: ``` push ebp mov ebp,esp sub esp,50h [1] mov eax,dword ptr [__security_cookie (01068024h)] [2] xor eax,ebp [3] mov dword ptr [ebp-4],eax push ebx push esi push edi mov eax,dword ptr [index] push eax push 0CCh lea ecx,[buffer] push ecx call _memset (010610B9h) add esp,0Ch mov eax,dword ptr [index] movsx eax,byte ptr buffer[eax] pop edi pop esi pop ebx [4] mov ecx,dword ptr [ebp-4] [5] xor ecx,ebp [6] call @__security_check_cookie@4 (01061276h) mov esp,ebp pop ebp ret ``` The instrumentation above is: * [1] is loading the global security canary, * [3] is storing the local computed ([2]) canary to the guard slot, * [4] is loading the guard slot and ([5]) re-compute the global canary, * [6] is validating the resulting canary with the '__security_check_cookie' and performs error handling. Overview of the current stack-protection implementation: * lib/CodeGen/StackProtector.cpp * There is a default stack-protection implementation applied on intermediate representation. * The target can overload 'getIRStackGuard' method if it has a standard location for the stack protector cookie. * An intrinsic 'Intrinsic::stackprotector' is added to the prologue. It will be expanded by the instruction selection pass (DAG or Fast). * Basic Blocks are added to every instrumented function to receive the code for handling stack guard validation and errors handling. * Guard manipulation and comparison are added directly to the intermediate representation. * lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp * lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp * There is an implementation that adds instrumentation during instruction selection (for better handling of sibbling calls). * see long comment above 'class StackProtectorDescriptor' declaration. * The target needs to override 'getSDagStackGuard' to activate SDAG stack protection generation. (note: getIRStackGuard MUST be nullptr). * 'getSDagStackGuard' returns the appropriate stack guard (security cookie) * The code is generated by 'SelectionDAGBuilder.cpp' and 'SelectionDAGISel.cpp'. * include/llvm/Target/TargetLowering.h * Contains function to retrieve the default Guard 'Value'; should be overriden by each target to select which implementation is used and provide Guard 'Value'. * lib/Target/X86/X86ISelLowering.cpp * Contains the x86 specialisation; Guard 'Value' used by the SelectionDAG algorithm. Function-based Instrumentation: * The MSVC doesn't inline the stack guard comparison in every function. Instead, a call to '__security_check_cookie' is added to the epilogue before every return instructions. * To support function-based instrumentation, this patch is * adding a function to get the function-based check (llvm 'Value', see include/llvm/Target/TargetLowering.h), * If provided, the stack protection instrumentation won't be inlined and a call to that function will be added to the prologue. * modifying (SelectionDAGISel.cpp) do avoid producing basic blocks used for inline instrumentation, * generating the function-based instrumentation during the ISEL pass (SelectionDAGBuilder.cpp), * if FastISEL (not SelectionDAG), using the fallback which rely on the same function-based implemented over intermediate representation (StackProtector.cpp). Modifications * adding support for MSVC (lib/Target/X86/X86ISelLowering.cpp) * adding support function-based instrumentation (lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp, .h) Results * IR generated instrumentation: ``` clang-cl /GS test.cc /Od /c -mllvm -print-isel-input ``` ``` * Final LLVM Code input to ISel * ; Function Attrs: nounwind sspstrong define i32 @"\01?example@@YAHHH@Z"(i32 %offset, i32 %index) #0 { entry: %StackGuardSlot = alloca i8* <<<-- Allocated guard slot %0 = call i8* @llvm.stackguard() <<<-- Loading Stack Guard value call void @llvm.stackprotector(i8* %0, i8** %StackGuardSlot) <<<-- Prologue intrinsic call (store to Guard slot) %index.addr = alloca i32, align 4 %offset.addr = alloca i32, align 4 %buffer = alloca [10 x i8], align 1 store i32 %index, i32* %index.addr, align 4 store i32 %offset, i32* %offset.addr, align 4 %arraydecay = getelementptr inbounds [10 x i8], [10 x i8]* %buffer, i32 0, i32 0 %1 = load i32, i32* %index.addr, align 4 call void @llvm.memset.p0i8.i32(i8* %arraydecay, i8 -52, i32 %1, i32 1, i1 false) %2 = load i32, i32* %index.addr, align 4 %arrayidx = getelementptr inbounds [10 x i8], [10 x i8]* %buffer, i32 0, i32 %2 %3 = load i8, i8* %arrayidx, align 1 %conv = sext i8 %3 to i32 %4 = load volatile i8, i8* %StackGuardSlot <<<-- Loading Guard slot call void @__security_check_cookie(i8* %4) <<<-- Epilogue function-based check ret i32 %conv } ``` * SelectionDAG generated instrumentation: ``` clang-cl /GS test.cc /O1 /c /FA ``` ``` "?example@@YAHHH@Z": # @"\01?example@@YAHHH@Z" # BB#0: # %entry pushl %esi subl $16, %esp movl ___security_cookie, %eax <<<-- Loading Stack Guard value movl 28(%esp), %esi movl %eax, 12(%esp) <<<-- Store to Guard slot leal 2(%esp), %eax pushl %esi pushl $204 pushl %eax calll _memset addl $12, %esp movsbl 2(%esp,%esi), %esi movl 12(%esp), %ecx <<<-- Loading Guard slot calll @__security_check_cookie@4 <<<-- Epilogue function-based check movl %esi, %eax addl $16, %esp popl %esi retl ``` Reviewers: kcc, pcc, eugenis, rnk Subscribers: majnemer, llvm-commits, hans, thakis, rnk Differential Revision: http://reviews.llvm.org/D20346 llvm-svn: 272053	2016-06-07 20:15:35 +00:00
Wei Ding	a70216f1b3	Differential Revision: http://reviews.llvm.org/D20557 llvm-svn: 272044	2016-06-07 19:04:44 +00:00
George Burgess IV	a1f9a2daeb	[CFLAA] Add AttrEscaped, remove bit twiddling functions. This patch does a few things: - Unifies AttrAll and AttrUnknown (since they were used for more or less the same purpose anyway). - Introduces AttrEscaped, an attribute that notes that a value escapes our analysis for a given set, but not that an unknown value flows into said set. - Removes functions that take bit indices, since we also had functions that took bitsets, and the use of both (with similar names) was unclear and bug-prone. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21000 llvm-svn: 272040	2016-06-07 18:35:37 +00:00
Geoff Berry	486f49cc63	Reapply [AArch64] Fix isLegalAddImmediate() to return true for valid negative values. Originally reviewed here: http://reviews.llvm.org/D17463 llvm-svn: 272023	2016-06-07 16:48:43 +00:00
Andrey Turetskiy	94c2179550	Quick fix for the test from rL272014 "[LAA] Improve non-wrapping pointer detection by handling loop-invariant case" (s couple of buildbots failed). Patch by Roman Shirokiy. llvm-svn: 272019	2016-06-07 15:52:35 +00:00
Haicheng Wu	4fa9f3ae45	Revert "[MBP] Reduce code size by running tail merging in MBP." This reverts commit r271930, r271915, r271923. They break a thumb selfhosting bot. llvm-svn: 272017	2016-06-07 15:17:21 +00:00
Simon Pilgrim	15c6ab5fac	[X86][AVX512] Added 512-bit integer vector non-temporal load tests llvm-svn: 272016	2016-06-07 15:12:47 +00:00
Oliver Stannard	8de5f24d10	[ARM] Accept conditional versions of BXNS and BLXNS These instructions end in "S" but are not flag-setting, so they need including in the list of special cases in the assembly parser. Differential Revision: http://reviews.llvm.org/D21077 llvm-svn: 272015	2016-06-07 14:58:48 +00:00

... 23 24 25 26 27 ...

39519 Commits