llvm-project

Commit Graph

Author	SHA1	Message	Date
Richard Sandiford	32379b8141	[SystemZ] Optimize (sext (ashr (shl ...), ...)) ...into (ashr (shl (anyext X), ...), ...), which requires one fewer instruction. The (anyext X) can sometimes be simplified too. I didn't do this in DAGCombiner because widening shifts isn't a win on all targets. llvm-svn: 199114	2014-01-13 15:17:53 +00:00
Tim Northover	7d074a5ad6	ARM: add test for r199108. Oops. rdar://problem/15800156 llvm-svn: 199109	2014-01-13 14:20:25 +00:00
David Woodhouse	4e033b0e92	[x86] Fix retq/retl handling in 64-bit mode This finishes the job started in r198756, and creates separate opcodes for 64-bit vs. 32-bit versions of the rest of the RET instructions too. LRETL/LRETQ are interesting... I can't see any justification for their existence in the SDM. There should be no 'LRETL' in 64-bit mode, and no need for a REX.W prefix for LRETQ. But this is what GAS does, and my Sandybridge CPU and an Opteron 6376 concur when tested as follows: asm __volatile__("pushq $0x1234\nmovq $0x33,%rax\nsalq $32,%rax\norq $1f,%rax\npushq %rax\nlretl $8\n1:"); asm __volatile__("pushq $1234\npushq $0x33\npushq $1f\nlretq $8\n1:"); asm __volatile__("pushq $0x33\npushq $1f\nlretq\n1:"); asm __volatile__("pushq $0x1234\npushq $0x33\npushq $1f\nlretq $8\n1:"); cf. PR8592 and commit r118903, which added LRETQ. I only added LRETIQ to match it. I don't quite understand how the Intel syntax parsing for ret instructions is working, despite r154468 allegedly fixing it. Aren't the explicitly sized 'retw', 'retd' and 'retq' supposed to work? I have at least made the 'lretq' work with (and indeed require) the 'q'. llvm-svn: 199106	2014-01-13 14:05:59 +00:00
Elena Demikhovsky	b19c9dc1a1	AVX-512: Embedded Rounding Control - encoding and printing Changed intrinsics for vrcp14/vrcp28 vrsqrt14/vrsqrt28 - aligned with GCC. llvm-svn: 199102	2014-01-13 12:55:03 +00:00
Chandler Carruth	b7bdfd65ac	[PM] Wire up support for writing bitcode with new PM. This moves the old pass creation functionality to its own header and updates the callers of that routine. Then it adds a new PM supporting bitcode writer to the header file, and wires that up in the opt tool. A test is added that round-trips code into bitcode and back out using the new pass manager. llvm-svn: 199078	2014-01-13 07:38:24 +00:00
NAKAMURA Takumi	eccd28d519	llvm/test/ExecutionEngine/MCJIT/load-object-a.ll: Put together rm(1) and mkdir(1) at the top. llvm-svn: 199077	2014-01-13 05:55:10 +00:00
Chandler Carruth	b353c3f7f2	[PM] Wire up support for printing assembly output from the opt command. This lets us round-trip IR in the expected manner with the opt tool. llvm-svn: 199075	2014-01-13 05:16:45 +00:00
Kevin Qin	cfef55d6d4	[AArch64 NEON] Add missing patterns for bitcast from or to v1f64 llvm-svn: 199070	2014-01-13 01:58:38 +00:00
Kevin Qin	21e8f1c4eb	[AArch64 NEON] Add more scenarios to use perm instructions when lowering shuffle_vector This patch covers 2 more scenarios: 1. Two operands of shuffle_vector are the same, like %shuffle.i = shufflevector <8 x i8> %a, <8 x i8> %a, <8 x i32> <i32 0, i32 2, i32 4, i32 6, i32 8, i32 10, i32 12, i32 14> 2. One of the operands is undef, like %shuffle.i = shufflevector <8 x i8> %a, <8 x i8> undef, <8 x i32> <i32 0, i32 2, i32 4, i32 6, i32 8, i32 10, i32 12, i32 14> After this patch, perm instructions will have a chance to be emitted instead of lots of INS. llvm-svn: 199069	2014-01-13 01:56:29 +00:00
Saleem Abdulrasool	a6505ca4c2	correct target directive handling error handling The target specific parser should return `false' if the target AsmParser handles the directive, and `true' if the generic parser should handle the directive. Many of the target specific directive handlers would `return Error' which does not follow these semantics. This change simply changes the target specific routines to conform to the semantics of ParseDirective correctly. Conformance to the semantics improves diagnostics emitted for the invalid directives. X86 is taken as a sample to ensure that multiple diagnostics are not presented for a single error. llvm-svn: 199068	2014-01-13 01:15:39 +00:00
Jakob Stoklund Olesen	1995b9fead	Handle bundled terminators in isBlockOnlyReachableByFallthrough. Targets like SPARC and MIPS have delay slots and normally bundle the delay slot instruction with the corresponding terminator. Teach isBlockOnlyReachableByFallthrough to find any MBB operands on bundled terminators so SPARC doesn't need to specialize this function. llvm-svn: 199061	2014-01-12 19:24:08 +00:00
Nico Rieck	f15341c9de	Make test independent of scheduling llvm-svn: 199055	2014-01-12 15:57:38 +00:00
NAKAMURA Takumi	d7032ac21e	llvm/test/CodeGen/X86/shl_undef.ll: Tweak to satisfy r199050. Use intel syntax, or "shl" might hit "pushl". llvm-svn: 199051	2014-01-12 14:41:41 +00:00
Nico Rieck	b5262d6d8f	Fix non-deterministic SDNodeOrder-dependent codegen Reset SelectionDAGBuilder's SDNodeOrder to ensure deterministic code generation. llvm-svn: 199050	2014-01-12 14:09:17 +00:00
Chandler Carruth	52eef8876e	[PM] Add module and function printing passes for the new pass manager. This implements the legacy passes in terms of the new ones. It adds basic testing using explicit runs of the passes. Next up will be wiring the basic output mechanism of opt up when the new pass manager is engaged unless bitcode writing is requested. llvm-svn: 199049	2014-01-12 12:15:39 +00:00
Chandler Carruth	6546cb6313	[PM] Fix a bunch of bugs I spotted by inspection when working on this code. Copious tests added to cover these cases. llvm-svn: 199039	2014-01-12 10:02:02 +00:00
Chandler Carruth	d833098d17	[PM] Add support for parsing function passes and function pass manager nests to the opt commandline support. This also showcases the implicit-initial-manager support which will be most useful for testing. There are several bugs that I spotted by inspection here that I'll fix with test cases in subsequent commits. llvm-svn: 199038	2014-01-12 09:34:22 +00:00
Saleem Abdulrasool	bdae4b8743	ARM IAS: fix diagnostics of improper qualification An improper qualifier would result in a superfluous error due to the parser not consuming the remainder of the statement. Simply consume the remainder of the statement to avoid the error. llvm-svn: 199035	2014-01-12 05:25:44 +00:00
Venkatraman Govindaraju	cd4d9ac62a	[Sparc] Add support for parsing floating point instructions. llvm-svn: 199033	2014-01-12 04:48:54 +00:00
Saleem Abdulrasool	fb3950ec63	ARM: change implicit immediate forms of {ld,st}r{,b}t to pseudo-instructions The implicit immediate 0 forms are assembly aliases, not distinct instruction encodings. Fix the initial implementation introduced in r198914 to an alias to avoid two separate instruction definitions for the same encoding. An InstAlias is insufficient in this case due to the need to add an additional operand for the implicit zero. By using the AsmPseudoInst, fall back to the C++ code to transform the instruction to the equivalent _POST_IMM form, inserting the additional implicit immediate 0. llvm-svn: 199032	2014-01-12 04:36:01 +00:00
Jakob Stoklund Olesen	e7084a1c5c	The SPARCv9 ABI returns a float in %f0. This is different from the argument passing convention which puts the first float argument in %f1. With this patch, all returned floats are treated as if the 'inreg' flag were set. This means multiple float return values get packed in %f0, %f1, %f2, ... Note that when returning a struct in registers, clang will set the 'inreg' flag on the return value, so that behavior is unchanged. This also happens when returning a float _Complex. llvm-svn: 199028	2014-01-12 04:13:17 +00:00
Joerg Sonnenberger	4bde03023b	Typo llvm-svn: 199027	2014-01-12 03:38:30 +00:00
Joerg Sonnenberger	485f00fe0f	Add missing mul aliases for armv4 support. Add checks that armv4 can assemble the various mul instructions. llvm-svn: 199026	2014-01-12 03:35:18 +00:00
Hans Wennborg	ac114a3ce7	Switch-to-lookup tables: Don't require a result for the default case when the lookup table doesn't have any holes. This means we can build a lookup table for switches like this: switch (x) { case 0: return 1; case 1: return 2; case 2: return 3; case 3: return 4; default: exit(1); } The default case doesn't yield a constant result here, but that doesn't matter, since a default result is only necessary for filling holes in the lookup table, and this table doesn't have any holes. This makes us transform 505 more switches in a clang bootstrap, and shaves 164 KB off the resulting clang binary. llvm-svn: 199025	2014-01-12 00:44:41 +00:00
Venkatraman Govindaraju	a66b314c34	[Sparc] Add missing processor types: v7 and niagara llvm-svn: 199024	2014-01-11 23:56:13 +00:00
Saleem Abdulrasool	2d48edeca3	ARM IAS: support emitting constant values in target expressions A 32-bit immediate value can be formed from a constant expression and loaded into a register. Add support to emit this into an object file. Because this value is a constant, a relocation must not be produced for it. llvm-svn: 199023	2014-01-11 23:03:48 +00:00
Benjamin Kramer	c10563d14e	Fix broken CHECK lines. llvm-svn: 199016	2014-01-11 21:06:00 +00:00
Venkatraman Govindaraju	0653218b2b	[Sparc] Bundle instruction with delay slot and its filler. Now, we can use -verify-machineinstrs with SPARC backend. llvm-svn: 199014	2014-01-11 19:38:03 +00:00
Chandler Carruth	258dbb3b12	[PM] Actually nest pass managers correctly when parsing the pass pipeline string. Add tests that cover this now that we have execution dumping in the pass managers. llvm-svn: 199005	2014-01-11 12:06:47 +00:00
NAKAMURA Takumi	a64d0bccc8	llvm/test/Transforms/SampleProfile/syntax.ll: Eliminate locale-sensitive message check. llvm-svn: 199000	2014-01-11 09:23:52 +00:00
NAKAMURA Takumi	80a474c1c3	llvm/test/CodeGen/X86/anyregcc.ll: Add explicit -mtriple=x86_64-unknown-unknown. XMM(s) are really spilling for targeting Win64. llvm-svn: 198999	2014-01-11 09:23:44 +00:00
Chandler Carruth	66445382ff	[PM] Add (very skeletal) support to opt for running the new pass manager. I cannot emphasize enough that this is a WIP. =] I expect it to change a great deal as things stabilize, but I think it's really important to get some functionality here so that the infrastructure can be tested more traditionally from the commandline. The current design is looking something like this: ./bin/opt -passes='module(pass_a,pass_b,function(pass_c,pass_d))' So rather than custom-parsed flags, there is a single flag with a string argument that is parsed into the pass pipeline structure. This makes it really easy to have nice structural properties that are very explicit. There is one obvious and important shortcut. You can start off the pipeline with a pass, and the minimal context of pass managers will be built around the entire specified pipeline. This makes the common case for tests super easy: ./bin/opt -passes=instcombine,sroa,gvn But this won't introduce any of the complexity of the fully inferred old system -- we only ever do this for the entire argument, and we only look at the first pass. If the other passes don't fit in the pass manager selected it is a hard error. The other interesting aspect here is that I'm not relying on any registration facilities. Such facilities may be unavoidable for supporting plugins, but I have alternative ideas for plugins that I'd like to try first. My plan is essentially to build everything without registration until we hit an absolute requirement. Instead of registration of pass names, there will be a library dedicated to parsing pass names and the pass pipeline strings described above. Currently, this is directly embedded into opt for simplicity as it is very early, but I plan to eventually pull this into a library that opt, bugpoint, and even Clang can depend on. It should end up as a good home for things like the existing PassManagerBuilder as well. 
There are a bunch of FIXMEs in the code for the parts of this that are just stubbed out to make the patch more incremental. A quick list of what's coming up directly after this: - Support for function passes and building the structured nesting. - Support for printing the pass structure, and FileCheck tests of all of this code. - The .def-file based pass name parsing. - IR printing passes and the corresponding tests. Some obvious things that I'm not going to do right now, but am definitely planning on as the pass manager work gets a bit further: - Pull the parsing into a library, including the builders. - Thread the rest of the target stuff into the new pass manager. - Wire support for the new pass manager up to llc. - Plugin support. Some things that I'd like to have, but are significantly lower on my priority list. I'll get to these eventually, but they may also be places where others want to contribute: - Adding nice error reporting for broken pass pipeline descriptions. - Typo-correction for pass names. llvm-svn: 198998	2014-01-11 08:16:35 +00:00
Juergen Ributzka	976d94b834	[anyregcc] Fix callee-save mask for anyregcc Use separate callee-save masks for XMM and YMM registers for anyregcc on X86 and select the proper mask depending on the target cpu we compile for. llvm-svn: 198985	2014-01-11 01:00:27 +00:00
Diego Novillo	9518b63bfc	Extend and simplify the sample profile input file. 1- Use the line_iterator class to read profile files. 2- Allow comments in profile file. Lines starting with '#' are completely ignored while reading the profile. 3- Add parsing support for discriminators and indirect call samples. Our external profiler can emit more profile information that we are currently not handling. This patch does not add new functionality to support this information, but it allows profile files to provide it. I will add actual support later on (for at least one of these features, I need support for DWARF discriminators in Clang). A sample line may contain the following additional information: Discriminator. This is used if the sampled program was compiled with DWARF discriminator support (http://wiki.dwarfstd.org/index.php?title=Path_Discriminators). This is currently only emitted by GCC and we just ignore it. Potential call targets and samples. If present, this line contains a call instruction. This models both direct and indirect calls. Each called target is listed together with the number of samples. For example, 130: 7 foo:3 bar:2 baz:7 The above means that at relative line offset 130 there is a call instruction that calls one of foo(), bar() and baz(). With baz() being the relatively more frequent call target. Differential Revision: http://llvm-reviews.chandlerc.com/D2355 4- Simplify format of profile input file. This implements earlier suggestions to simplify the format of the sample profile file. The symbol table is not necessary and function profiles do not need to know the number of samples in advance. Differential Revision: http://llvm-reviews.chandlerc.com/D2419 llvm-svn: 198973	2014-01-10 23:23:51 +00:00
Diego Novillo	0accb3d2bc	Propagation of profile samples through the CFG. This adds a propagation heuristic to convert instruction samples into branch weights. It implements a similar heuristic to the one implemented by Dehao Chen on GCC. The propagation proceeds in 3 phases: 1- Assignment of block weights. All the basic blocks in the function are initially assigned the same weight as their most frequently executed instruction. 2- Creation of equivalence classes. Since samples may be missing from blocks, we can fill in the gaps by setting the weights of all the blocks in the same equivalence class to the same weight. To compute the concept of equivalence, we use dominance and loop information. Two blocks B1 and B2 are in the same equivalence class if B1 dominates B2, B2 post-dominates B1 and both are in the same loop. 3- Propagation of block weights into edges. This uses a simple propagation heuristic. The following rules are applied to every block B in the CFG: - If B has a single predecessor/successor, then the weight of that edge is the weight of the block. - If all the edges are known except one, and the weight of the block is already known, the weight of the unknown edge will be the weight of the block minus the sum of all the known edges. If the sum of all the known edges is larger than B's weight, we set the unknown edge weight to zero. - If there is a self-referential edge, and the weight of the block is known, the weight for that edge is set to the weight of the block minus the weight of the other incoming edges to that block (if known). Since this propagation is not guaranteed to finalize for every CFG, we only allow it to proceed for a limited number of iterations (controlled by -sample-profile-max-propagate-iterations). It currently uses the same GCC default of 100. Before propagation starts, the pass builds (for each block) a list of unique predecessors and successors. This is necessary to handle identical edges in multiway branches. 
Since we visit all blocks and all edges of the CFG, it is cleaner to build these lists once at the start of the pass. Finally, the patch fixes the computation of relative line locations. The profiler emits lines relative to the function header. To discover it, we traverse the compilation unit looking for the subprogram corresponding to the function. The line number of that subprogram is the line where the function begins. That becomes line zero for all the relative locations. llvm-svn: 198972	2014-01-10 23:23:46 +00:00
Arnold Schwaighofer	c2e9d759f2	LoopVectorizer: Handle strided memory accesses by versioning for (i = 0; i < N; ++i) A[i * Stride1] += B[i * Stride2]; We take loops like this and check that the symbolic strides 'Stride1/2' are one and drop to the scalar loop if they are not. This is currently disabled by default and hidden behind the flag 'enable-mem-access-versioning'. radar://13075509 llvm-svn: 198950	2014-01-10 18:20:32 +00:00
Artyom Skrobov	4e62c0b2b2	Amending test/MC/ARM/thumb2-mclass.s to match its apparent original purpose (to test the ARMv6M/ARMv7M commonality), and creating a new test case for the differences between ARMv6M and ARMv7M llvm-svn: 198946	2014-01-10 16:49:49 +00:00
Artyom Skrobov	4d91d944ae	Must not produce Tag_CPU_arch_profile for pre-ARMv7 cores (e.g. cortex-m0) llvm-svn: 198945	2014-01-10 16:42:55 +00:00
Saleem Abdulrasool	b16c09f241	ARM: fix regression caused by r198914 The disassembler would no longer be able to disambiguate between the two variants (explicit immediate #0 vs implicit, omitted #0) for the ldrt, strt, ldrbt, strbt mnemonics as both versions indicated the disassembler routine. llvm-svn: 198944	2014-01-10 16:22:47 +00:00
Kristof Beyls	58306ad903	Make sure -use-init-array has intended effect on all AArch64 ELF targets, not just linux. llvm-svn: 198937	2014-01-10 13:41:49 +00:00
NAKAMURA Takumi	d38ac74662	llvm/test/ExecutionEngine/MCJIT/load-object-a.ll: Remove "REQUIRES:shell". This doesn't depend on shell's behavior. llvm-svn: 198931	2014-01-10 10:38:52 +00:00
NAKAMURA Takumi	566080cc80	llvm/test/ExecutionEngine/MCJIT/lit.local.cfg: Add "AMD64" in the host_arch list. FIXME: We should not take CMake's ${CMAKE_SYSTEM_PROCESSOR}... llvm-svn: 198930	2014-01-10 10:38:46 +00:00
NAKAMURA Takumi	52f9d3818b	llvm/test/ExecutionEngine/MCJIT/load-object-a.ll: Fix not to use %t.cachedir/%p. %p is like X:\foo\bar. llvm-svn: 198926	2014-01-10 10:38:23 +00:00
Saleem Abdulrasool	435f45653a	ARM IAS: support #:{lower,upper}16: for GNU compatibility The GNU assembler supports prefixing the expression with a '#' to indicate that the value that is being moved is in fact a constant. This improves the compatibility of the integrated assembler's parser for this. llvm-svn: 198916	2014-01-10 04:38:40 +00:00
Saleem Abdulrasool	e6e6d71477	ARM IAS: support GNU extension for ldrd, strd The GNU assembler has an extension that allows for the elision of the paired register (dt2) for the LDRD and STRD mnemonics. Add support for this in the assembly parser. Canonicalise the usage during the instruction parsing from the specified version. llvm-svn: 198915	2014-01-10 04:38:35 +00:00
Saleem Abdulrasool	5bfefb6a8f	ARM IAS: support implicit immediate 0s for {LD,ST}R{B,}T The ARM ARM indicates the mnemonics as follows: ldrbt{<c>}{<q>} <Rt>, [<Rn>], {, #+/-<imm>} ldrt{<c>}{<q>} <Rt>, [<Rn>] {, #+/-<imm>} strbt{<c>}{<q>} <Rt>, [<Rn>] {, #<imm>} strt{<c>}{<q>} <Rt>, [<Rn>] {, #+/-<imm>} This improves the parser to deal with the implicit immediate 0 for the mnemonics as per the specification. Thanks to Joerg Sonnenberger for the tests! llvm-svn: 198914	2014-01-10 04:38:31 +00:00
Venkatraman Govindaraju	ad40dfcb4b	[Sparc] Emit retl/ret instead of jmp instruction. It improves the readability of the assembly generated. llvm-svn: 198910	2014-01-10 02:55:27 +00:00
Venkatraman Govindaraju	0d288d3105	[Sparc] Add support for parsing jmpl instruction and make indirect call and jmp instructions as aliases to jmpl. llvm-svn: 198909	2014-01-10 01:48:17 +00:00
David Blaikie	15ed5ebfc5	Revert "Revert r198851, "Prototype of skeleton type units for fission"" This reverts commit r198865 which reverts r198851. ASan identified a use-of-uninitialized of the DwarfTypeUnit::Ty variable in skeleton type units. llvm-svn: 198908	2014-01-10 01:38:41 +00:00
Kevin Enderby	9bd296ab55	Fix a bug with the ARM thumb2 CBZ and CBNZ instructions that branch to the next instruction. This cannot be encoded but can be turned into a NOP. rdar://15062072 llvm-svn: 198904	2014-01-10 00:43:32 +00:00
NAKAMURA Takumi	c5bf572993	Revert r198851, "Prototype of skeleton type units for fission" It caused undefined behavior. DwarfTypeUnit::Ty might not be initialized properly, I guess. llvm-svn: 198865	2014-01-09 13:08:00 +00:00
Stepan Dyatkovskiy	431993b57b	Fixed an old typo in ScalarEvolution that caused a wrong zext operation on SCEVs. Detailed description is here: http://llvm.org/bugs/show_bug.cgi?id=18000#c16 Special thanks to David Wiberg for participating in the bugfix process. llvm-svn: 198863	2014-01-09 12:26:12 +00:00
Richard Sandiford	3875cb60f3	[SystemZ] Fix RNSBG bug introduced by r197802 The zext handling added in r197802 wasn't right for RNSBG. This patch restricts it to ROSBG, RXSBG and RISBG. (The tests for RISBG were added in r197802 since RISBG was the motivating example.) llvm-svn: 198862	2014-01-09 11:28:53 +00:00
Richard Sandiford	15cfc1c33c	Handle masked rotate amounts At the moment we expect rotates to have the form: (or (shl X, Y), (shr X, Z)) where Y == bitsize(X) - Z or Z == bitsize(X) - Y. This form means that the (or ...) is undefined for Y == 0 or Z == 0. This undefinedness can be avoided by using Y == (C * bitsize(X) - Z) & (bitsize(X) - 1) or Z == (C * bitsize(X) - Y) & (bitsize(X) - 1) for any integer C (including 0, the most natural choice). llvm-svn: 198861	2014-01-09 10:56:42 +00:00
Richard Sandiford	0f264db3c6	Match the InstCombine form of rotates by X+C InstCombine converts (sub 32, (add X, C)) into (sub 32-C, X), so a rotate left of a 32-bit Y by X+C could appear as either: (or (shl Y, (add X, C)), (shr Y, (sub 32, (add X, C)))) without InstCombine or: (or (shl Y, (add X, C)), (shr Y, (sub 32-C, X))) with it. We already matched the first form. This patch handles the second too. llvm-svn: 198860	2014-01-09 10:49:40 +00:00
Lang Hames	1ddecc0777	Add an "-object-cache-dir=<string>" option to LLI. This option specifies the root path to which object files managed by the LLIObjectCache instance should be written. This option defaults to "", in which case objects are cached in the same directory as the bitcode they are derived from. The load-object-a.ll test has been rewritten to use this option to support testing in environments where the test directory is not writable. llvm-svn: 198852	2014-01-09 05:24:05 +00:00
David Blaikie	a588365df6	Prototype of skeleton type units for fission llvm-svn: 198851	2014-01-09 05:08:28 +00:00
Saleem Abdulrasool	5b060a92d6	llvm-readobj: address review comments for ARM EHABI printing Rename bytecode to opcodes to make it more clear. Change an impossible case to llvm_unreachable instead. Avoid allocation of a buffer by modifying the PrintOpcodes iteration. llvm-svn: 198848	2014-01-09 04:31:18 +00:00
David Blaikie	38fe6342f6	DwarfDebug: Refactor out common skeleton construction code to be reused for type unit skeletons. llvm-svn: 198846	2014-01-09 04:28:46 +00:00
Andrew Trick	32e1be7bd0	llvm.experimental.stackmap: fix encoding of large constants. In the stackmap format we advertise the constant field as signed. However, we were determining whether to promote to a 64-bit constant pool based on an unsigned comparison. This fix allows -1 to be encoded as a small constant. llvm-svn: 198816	2014-01-09 00:22:31 +00:00
David Blaikie	622dce4194	llvm-dwarfdump: reorder dwo sections to immediately proceed their non-dwo equivalents This makes it easier to write a test that's mostly shared between fission and non-fission (using FileCheck's multiple prefix support). llvm-svn: 198806	2014-01-08 23:29:59 +00:00
Hal Finkel	2150e3a743	Conservatively handle multiple MMOs in MIsNeedChainEdge MIsNeedChainEdge, which is used by -enable-aa-sched-mi (AA in misched), had an llvm_unreachable when -enable-aa-sched-mi is enabled and we reach an instruction with multiple MMOs. Instead, return a conservative answer. This allows testing -enable-aa-sched-mi on x86. Also, this moves the check above the isUnsafeMemoryObject checks. isUnsafeMemoryObject is currently correct only for instructions with one MMO (as noted in the comment in isUnsafeMemoryObject): // We purposefully do no check for hasOneMemOperand() here // in hope to trigger an assert downstream in order to // finish implementation. The problem with this is that, had the candidate edge passed the "!MIa->mayStore() && !MIb->mayStore()" check, the hoped-for assert would never happen (which could, in theory, lead to incorrect behavior if one of these secondary MMOs was volatile, for example). llvm-svn: 198795	2014-01-08 21:52:02 +00:00
Ana Pazos	cfd2ca5826	[AArch64][NEON] Added UXTL and UXTL2 instruction aliases llvm-svn: 198791	2014-01-08 21:02:13 +00:00
Roman Divacky	fb4d390766	Force emit a relocation for @gnu_indirect_function symbols so that the indirect resolution works. llvm-svn: 198780	2014-01-08 18:50:32 +00:00
Andrea Di Biagio	23df4e4a2d	Teach the DAGCombiner how to fold 'vselect' dag nodes according to the following two rules: 1) fold (vselect (build_vector AllOnes), A, B) -> A 2) fold (vselect (build_vector AllZeros), A, B) -> B llvm-svn: 198777	2014-01-08 18:33:04 +00:00
Lang Hames	7b6f99ff0d	Add missing test case for r198737. llvm-svn: 198772	2014-01-08 16:31:16 +00:00
David Woodhouse	adfc885997	[x86] Support R_386_PC8, R_386_PC16 and R_X86_64_PC8 llvm-svn: 198763	2014-01-08 12:58:40 +00:00
David Woodhouse	8bceb5d217	[x86] Do not relax PUSHi16 to PUSHi32 (PR18414) They do different things to %esp, so they are not equivalent. Rename PUSHi8 to PUSH32i8 and add the missing PUSH16i8. llvm-svn: 198761	2014-01-08 12:58:32 +00:00
David Woodhouse	6dbda4415a	[x86] Make AsmParser validate registers for memory operands a bit better We can't do a perfect job here. We have to allow (%dx) even in 64-bit mode, for example, because it might be used for an unofficial form of the in/out instructions. We actually want to do a better job of validation later. Perhaps instead of doing it where we are at the moment. But for now, doing what validation we can do in the place that the code already has its validation, is an improvement. llvm-svn: 198760	2014-01-08 12:58:28 +00:00
David Woodhouse	32da3c8f3b	[x86] Fix MOV8ao8 et al for 16-bit mode, fix up disassembler to understand It seems there is no separate instruction class for having AdSize and OpSize bits set, which is required in order to disambiguate between all these instructions. So add that to the disassembler. Hm, perhaps we do need an AdSize16 bit after all? llvm-svn: 198759	2014-01-08 12:58:24 +00:00
David Woodhouse	374243a290	[x86] Use 16-bit addressing where possible in 16-bit mode Where "where possible" means that it's an immediate value and it's below 0x10000. In fact GAS will either truncate or error with larger values, and will insist on using the addr32 prefix to get 32-bit addressing. So perhaps we should do that, in a later patch. llvm-svn: 198758	2014-01-08 12:58:18 +00:00
David Woodhouse	84ed54f91e	[x86] Fix JCXZ,JECXZ_32 for 16-bit mode JCXZ should have the 0x67 prefix only if we're in 32-bit mode, so make that appropriately conditional. And JECXZ needs the prefix instead. llvm-svn: 198757	2014-01-08 12:58:12 +00:00
David Woodhouse	79dd505ce1	[x86] Disambiguate RET[QL] and fix aliases for 16-bit mode I couldn't see how to do this sanely without splitting RETQ from RETL. Eric says: "sad about the inability to roundtrip them now, but...". I have no idea what that means, but perhaps it wants preserving in the commit comment. llvm-svn: 198756	2014-01-08 12:58:07 +00:00
David Woodhouse	c178fbe2a2	[x86] Disambiguate [LS][IG]DT{32,64}m and add 16-bit versions, fix aliases llvm-svn: 198755	2014-01-08 12:57:55 +00:00
David Woodhouse	fd46016e7f	[x86] Add JMP16[rm],CALL16[rm] instructions, and fix up aliases llvm-svn: 198754	2014-01-08 12:57:49 +00:00
David Woodhouse	13574a7517	[x86] Add PUSHA16,POPA16 instructions, and fix aliases for 16-bit mode llvm-svn: 198753	2014-01-08 12:57:45 +00:00
David Woodhouse	956965ca69	[x86] Add OpSize16 to instructions that need it This fixes the bulk of 16-bit output, and the corresponding test case x86-16.s now looks mostly like the x86-32.s test case that it was originally based on. A few irrelevant instructions have been dropped, and there are still some corner cases to be fixed in subsequent patches. llvm-svn: 198752	2014-01-08 12:57:40 +00:00
Elena Demikhovsky	172a27c750	AVX-512: Added more intrinsics for pmin/pmax, pabs, blend, pmuldq. llvm-svn: 198745	2014-01-08 10:54:22 +00:00
Iain Sandoe	618def651b	[patch] Adjust behavior of FDE cross-section relocs for targets that don't support abs-differences. Modern versions of OSX/Darwin's ld (ld64 > 97.17) have an optimisation present that allows the back end to omit relocations (and replace them with an absolute difference) for some FDE text section refs. This patch allows a backend to opt-in to this behaviour by setting "DwarfFDESymbolsUseAbsDiff". At present, this is only enabled for modern x86 OSX ports. test changes by David Fang. llvm-svn: 198744	2014-01-08 10:22:54 +00:00
Kevin Qin	44946439e1	[AArch64 NEON] Fix generating incorrect value type of NEON_VDUPLANE when lowering build_vector if the result value type does not match the operand value type. llvm-svn: 198743	2014-01-08 08:06:14 +00:00
Venkatraman Govindaraju	b7c6965b19	[SparcV9] Rename operands in some sparc64 instructions so that TableGen can encode them correctly. llvm-svn: 198740	2014-01-08 07:47:57 +00:00
Venkatraman Govindaraju	b3b7c38983	[Sparc] Add support for parsing branch instructions and conditional moves. llvm-svn: 198738	2014-01-08 06:14:52 +00:00
Saleem Abdulrasool	7e34cc4dbd	tests: disable ARM unwinding tests if ARM is unavailable Appease the buildbots for targets which do not build the ARM support by moving the ARM specific test into a subdirectory and use the lit configuration to disable them appropriately. Thanks to chapuni and thakis for explaining how to do this! llvm-svn: 198736	2014-01-08 03:44:01 +00:00
Saleem Abdulrasool	d88affb53c	ARM IAS: properly handle expression operands Operands which involved label arithmetic would previously fail to parse. This corrects that by adding the additional case for the shift operand validation. llvm-svn: 198735	2014-01-08 03:28:14 +00:00
Saleem Abdulrasool	be981ebcf0	llvm-readobj: add support for ARM EHABI unwind info This adds some preliminary support for decoding ARM EHABI unwinding information. The major functionality that remains from complete support is bytecode translation. Each Unwind Index Table is printed out as a separate entity along with its section index, name, offset, and entries. Each entry lists the function address, and if possible, the name, of the function to which it corresponds. The encoding model, personality routine or index, and byte code is also listed. llvm-svn: 198734	2014-01-08 03:28:09 +00:00
Hao Liu	26abebbb2c	Fix a bug about generating undef operand when optimising shuffle vector and insert element in instruction combine. llvm-svn: 198730	2014-01-08 03:06:15 +00:00
Roman Divacky	5a1c54999d	In the ELFWriter when writing aliased (.set) symbols don't blindly take type from the new symbol but merge them so that the type is never "downgraded". This is probably quite rare, except for IFUNC symbols which we used to misassemble, losing the IFUNC type. Fixes #18372. llvm-svn: 198706	2014-01-07 20:17:03 +00:00
Rafael Espindola	170a6e7944	Don't assert with private type info variables. With the gnu objc runtime private strings are used. Since we only need to produce a unique label, the fix is to just drop the asserts. llvm-svn: 198701	2014-01-07 19:38:47 +00:00
Benjamin Kramer	8a68ab3710	Emit arange padding with a single directive. llvm-svn: 198700	2014-01-07 19:28:14 +00:00
David Peixotto	a872e0e0a6	Add ARM fconsts/fconstd aliases for vmov.f32/vmov.f64 This commit adds the pre-UAL aliases of fconsts and fconstd for vmov.f32 and vmov.f64. They use an InstAlias rather than a MnemonicAlias to properly support the predicate operand. We need to support encoded 8-bit constants in order to implement the pre-UAL fconsts/fconstd aliases for vmov.f32/vmov.f64, so this commit also fixes parsing of encoded floating point constants used in vmov.f32/vmov.f64 instructions. Now we can support assembly code like this: fconsts s0, #0x70 which is equivalent to vmov.f32 s0, #1.0. Most of the code was already in place to support this feature. Previously the code was trying to accept encoded 8-bit float constants for the vmov.f32/vmov.f64 instructions. It looks like the support for parsing encoded floats was lost in a refactoring in commit r148556 and we did not have any tests in place to catch it. The change in this commit is to keep the parsed value as a 32-bit float instead of a 64-bit double because that is what the isFPImm() function expects to find. There is no loss of precision by using a 32-bit float here because we are still limited to an 8-bit encoded value in the end. Additionally, we explicitly reject encoded 8-bit floats for vmov.f32/64. This is the same as the current behavior, but we now do it explicitly rather than accidentally. llvm-svn: 198697	2014-01-07 18:19:23 +00:00
Hao Liu	7d11d99d20	[AArch64]Add support to spill/fill D tuples such as DPair/DTriple/DQuad. There are no test cases for D tuple as the original test cases are too large. As the spill/fill of the D tuple is similar to the Q tuple, the correctness can be guaranteed. llvm-svn: 198684	2014-01-07 10:50:43 +00:00
Hao Liu	27d88376bc	[AArch64]Add support to copy D tuples such as DPair/DTriple/DQuad and Q tuples such as QPair/QTriple/QQuad. There is no test case for D tuple as the original test cases are too large. As the copy of the D tuple is similar to the Q tuple, the correctness can be guaranteed. llvm-svn: 198682	2014-01-07 10:00:03 +00:00
Venkatraman Govindaraju	559c4ac377	[Sparc] Add support for parsing sparc asm modifiers such as %hi, %lo etc., Also, correct the offsets for FixupsKindInfo. llvm-svn: 198681	2014-01-07 08:00:49 +00:00
Andrew Trick	dfacda3635	Fix for PR18396: Assertion: MO->isDead "Cannot fold physreg def". InlineSpiller::foldMemoryOperand needs to handle undef call operands. llvm-svn: 198679	2014-01-07 07:31:10 +00:00
Andrew Trick	e4a18605e0	Reapply r198654 "indvars: sink truncates outside the loop." This doesn't seem to have actually broken anything. It was paranoia on my part. Trying again now that bots are more stable. This is a follow up of the r198338 commit that added truncates for lcssa phi nodes. Sinking the truncates below the phis cleans up the loop and simplifies subsequent analysis within the indvars pass. llvm-svn: 198678	2014-01-07 06:59:12 +00:00
Kevin Qin	cfa41a2569	[AArch64 NEON] Fixed incorrect immediate used in BIC instruction. llvm-svn: 198675	2014-01-07 05:10:47 +00:00
Saleem Abdulrasool	4cb063cbf0	ARM IAS: allow more depth in contextual diagnostics Switch the context to be SmallVectors. This allows for saving additional context when providing previous emission sites. llvm-svn: 198665	2014-01-07 02:29:00 +00:00
Saleem Abdulrasool	c493d1499a	ARM IAS: refactor unwind context Move the unwinding context for the ARM IAS into a helper class. This is purely a structural refactoring. A follow up change allows for recording additional depth to improve diagnostics. llvm-svn: 198664	2014-01-07 02:28:55 +00:00
Saleem Abdulrasool	87ccd367b6	ARM IAS: improve .eabi_attribute handling Parse tag names as well as expressions. The former is part of the specification, the latter is for improved compatibility with the GNU assembler. Fix attribute value handling to be conformant to the specification. llvm-svn: 198662	2014-01-07 02:28:42 +00:00
Saleem Abdulrasool	69c7caf630	MCParser: introduce Note and use it for ARM AsmParser Introduce a new virtual method Note into the AsmParser. This complements the existing Warning and Error methods. Use the new method to clean up the output of the unwind routines in the ARM AsmParser. llvm-svn: 198661	2014-01-07 02:28:31 +00:00
Andrew Trick	3c0ed08996	Revert "indvars: sink truncates outside the loop." This reverts commit r198654. One of the bots reported a SciMark failure. llvm-svn: 198659	2014-01-07 01:50:58 +00:00
Venkatraman Govindaraju	0458b599f8	[Sparc] Add support for parsing memory operands in sparc AsmParser. llvm-svn: 198658	2014-01-07 01:49:11 +00:00
Andrew Trick	0b8e3b2cb4	indvars: sink truncates outside the loop. This is a follow up of the r198338 commit that added truncates for lcssa phi nodes. Sinking the truncates below the phis cleans up the loop and simplifies subsequent analysis within the indvars pass. llvm-svn: 198654	2014-01-07 01:02:55 +00:00
Jack Carter	0cd3c19f33	[Mips] TargetStreamer Support for .abicalls and .set pic0. This patch adds .abicalls and .set pic0 support which affects the ELF ABI and its flags. In addition the patch uses a common interface for both the MipsTargetStreamer and MipsObjectStreamer that both the integrated and standalone assemblers will use for the output for these directives. llvm-svn: 198646	2014-01-06 23:27:31 +00:00
Andrew Trick	6796ab424c	Reapply r198478 "Fix PR18361: Invalidate LoopDispositions after LoopSimplify hoists things." Now with a fix for PR18384: ValueHandleBase::ValueIsDeleted. We need to invalidate SCEV's loop info when we delete a block, even if no values are hoisted. llvm-svn: 198631	2014-01-06 19:43:14 +00:00
Tim Northover	d6a729bb85	ARM MachO: sort out isTargetDarwin/isTargetIOS/... checks. The ARM backend has been using most of the MachO related subtarget checks almost interchangeably, and since the only target it's had to run on has been IOS (which is all three of MachO, Darwin and IOS) it's worked out OK so far. But we'd like to support embedded targets under the "--none-macho" triple, which means everything starts falling apart and inconsistent behaviours emerge. This patch should pick a reasonably sensible set of behaviours for the new triple (and any others that come along, with luck). Some choices were debatable (notably FP == r7 or r11), but we can revisit those later when deficiencies become apparent. llvm-svn: 198617	2014-01-06 14:28:05 +00:00
Robert Lytton	9523aa41fb	XCore Target: correct callee save register spilling when callsUnwindInit is true. llvm-svn: 198616	2014-01-06 14:21:12 +00:00
Robert Lytton	c8c4aa667b	XCore target: Lower EH_RETURN llvm-svn: 198615	2014-01-06 14:21:07 +00:00
Robert Lytton	5da175214b	XCore target: Lower FRAME_TO_ARGS_OFFSET This requires a knowledge of the stack size which is not known until the frame is complete, hence the need for the XCoreFTAOElim pass which lowers the XCoreISD::FRAME_TO_ARGS_OFFSET instruction into its final form. llvm-svn: 198614	2014-01-06 14:21:00 +00:00
Robert Lytton	dec798751a	XCore target: Lower RETURNADDR Only handles a depth of zero (the same as FRAMEADDR) llvm-svn: 198613	2014-01-06 14:20:53 +00:00
Robert Lytton	cbb588a264	XCore target: Optimise entsp / retsp selection llvm-svn: 198612	2014-01-06 14:20:47 +00:00
Robert Lytton	bc4d976152	XCore target: fix handling of unsized global arrays in large code model llvm-svn: 198609	2014-01-06 14:20:32 +00:00
Tim Northover	7649ebacd6	ARM: keep special non-AEABIness of "-darwin-eabi" triples for now Longer term, we want to move users to "---macho" for embedded work, but for now people are relying on the last thing we told them, which is unfortunately "-*-darwin-eabi". rdar://problem/15703934 llvm-svn: 198602	2014-01-06 12:00:44 +00:00
Elena Demikhovsky	3629b4aa0e	AVX-512: added intrinsic vcvtpd2ps (with rounding mode and without) llvm-svn: 198593	2014-01-06 08:45:54 +00:00
Venkatraman Govindaraju	dfcccc7db0	[Sparc] Add initial implementation of disassembler for sparc llvm-svn: 198591	2014-01-06 08:08:58 +00:00
Craig Topper	7ceb54a2a1	Add OpSize16 bit, for instructions which need 0x66 prefix in 16-bit mode The 0x66 prefix toggles between 16-bit and 32-bit addressing mode. So in 32-bit mode it is used to switch to 16-bit addressing mode for the following instruction, while in 16-bit mode it's the other way round — it's used to switch to 32-bit mode instead. Thus, emit the 0x66 prefix byte for OpSize only in 32-bit (and 64-bit) mode, and introduce a new OpSize16 bit which is used in 16-bit mode instead. This is just the basic infrastructure for that change; a subsequent patch will add the new OpSize16 bit to the 32-bit instructions that need it. Patch from David Woodhouse. llvm-svn: 198586	2014-01-06 06:02:58 +00:00
Craig Topper	3c80d62a6c	[x86] Add basic support for .code16 This is not really expected to work right yet. Mostly because we will still emit the OpSize (0x66) prefix in all the wrong places, along with a number of other corner cases. Those will all be fixed in the subsequent commits. Patch from David Woodhouse. llvm-svn: 198584	2014-01-06 04:55:54 +00:00
Kevin Qin	5cd73c9e0a	[AArch64 NEON] Fix invalid constant used in vselect condition. There is a wrong assumption that the vector element type and the type of each ConstantSDNode in the build_vector were the same. However, when promoting the integer operand of a legally typed build_vector, the operand type and the vector element type do not need to be the same (See method 'DAGTypeLegalizer::PromoteIntOp_BUILD_VECTOR' in LegalizeIntegerTypes.cpp). In the AArch64 backend, the following dag sequence: C0: i1 = Constant<0> C1: i1 = Constant<-1> V: v8i1 = BUILD_VECTOR C1, C1, C0, C0, C0, C0, C0, C0 is type-legalized into: NewC0: i32 = Constant<0> NewC1: i32 = Constant<1> V: v8i8 = BUILD_VECTOR NewC1, NewC1, NewC0, NewC0, NewC0, NewC0, NewC0, NewC0 Forcing a getZeroExtend to VTBits to ensure that the new constant is correct. llvm-svn: 198582	2014-01-06 02:26:10 +00:00
Bill Wendling	f7fa730e61	Remove a failing test to get the buildbots back to green. llvm-svn: 198578	2014-01-06 00:43:09 +00:00
Bill Wendling	e4c9a77755	Try to fix s390x build bot. llvm-svn: 198577	2014-01-06 00:43:04 +00:00
Craig Topper	21ba8fbc18	Fix ModR/M byte output for 16-bit addressing modes (PR18220) Add some tests to validate correct register selection, including a fix to an existing test which was requiring the wrong output. Patch from David Woodhouse. llvm-svn: 198566	2014-01-05 19:40:56 +00:00
Elena Demikhovsky	f404e054a1	AVX-512: changed property name from "neverHasSideEffects=1" to "hasSideEffects=0", added this property to VMOVSS/VMOVSD; Optimized a truncate pattern. llvm-svn: 198562	2014-01-05 14:21:07 +00:00
Simon Atanasyan	728d21600c	[Mips] Add support for DT_MIPS_RLD_MAP and DT_MIPS_PLTGOT dynamic section tags to the llvm-readobj. llvm-svn: 198561	2014-01-05 13:40:27 +00:00
Simon Atanasyan	940cc71bea	[Mips] Rename the test case input file. No functional changes. llvm-svn: 198560	2014-01-05 13:40:17 +00:00
Elena Demikhovsky	52e4a0e109	AVX-512: Added more intrinsics for convert and min/max. Removed vzeroupper from AVX-512 mode - our optimization guide does not recommend inserting vzeroupper at all. llvm-svn: 198557	2014-01-05 10:46:09 +00:00
Bill Wendling	4002a941fa	Attempt to fix buildbots by XFAILing some architectures. llvm-svn: 198537	2014-01-05 03:10:56 +00:00
Venkatraman Govindaraju	d11572818b	Add lit.local.cfg for MC/Sparc llvm-svn: 198536	2014-01-05 03:07:04 +00:00
Venkatraman Govindaraju	5f1cce50e6	[Sparc] Add initial implementation of MC Code emitter for sparc. llvm-svn: 198533	2014-01-05 02:13:48 +00:00
Bill Wendling	df7dd28dc8	Emit an error message if the value passed to __builtin_returnaddress isn't a constant __builtin_returnaddress requires that the value passed into it be a constant. However, at -O0 even a constant expression may not be converted to a constant. Emit an error message instead of crashing. llvm-svn: 198531	2014-01-05 01:47:20 +00:00
Craig Topper	5999d47538	Mark the 64-bit x86 push/pop instructions as In64BitMode. Mark the corresponding 32-bit versions with the same encodings Not64BitMode. Remove hack from tablegen disassembler table emitter. Fix bad test. llvm-svn: 198530	2014-01-05 01:35:51 +00:00
Alp Toker	5e9f3265f8	Revert "Fix PR18361: Invalidate LoopDispositions after LoopSimplify hoists things." This commit was the source of crasher PR18384: While deleting: label %for.cond127 An asserting value handle still pointed to this value! UNREACHABLE executed at llvm/lib/IR/Value.cpp:671! Reverting to get the builders green, feel free to re-land after fixing up. (Renato has a handy isolated repro if you need it.) This reverts commit r198478. llvm-svn: 198503	2014-01-04 17:00:45 +00:00
Venkatraman Govindaraju	96ab3bc5bd	[SparcV9]: Implement RETURNADDR and FRAMEADDR lowering in SPARC64. Fixes PR18356. llvm-svn: 198480	2014-01-04 07:17:21 +00:00
Andrew Trick	aceac9746d	Fix PR18361: Invalidate LoopDispositions after LoopSimplify hoists things. getSCEV for an ashr instruction creates an intermediate zext expression when it truncates its operand. The operand is initially inside the loop, so the narrow zext expression has a non-loop-invariant loop disposition. LoopSimplify then runs on an outer loop, hoists the ashr operand, and properly invalidates the SCEVs that are mapped to value. The SCEV expression for the ashr is now an AddRec with the hoisted value as the now loop-invariant start value. The LoopDisposition of this wide value was properly invalidated during LoopSimplify. However, if we later get the ashr SCEV again, we again try to create the intermediate zext expression. We get the same SCEV that we did earlier, and it is still cached because it was never mapped to a Value. When we try to create a new AddRec we abort because we're using the old non-loop-invariant LoopDisposition. I don't have a solution for this other than to clear LoopDisposition when LoopSimplify hoists things. I think the long-term strategy should be to perform LoopSimplify on all loops before computing SCEV and before running any loop opts on individual loops. It's possible we may want to rerun LoopSimplify on individual loops, but it should rarely do anything, so rarely require invalidating SCEV. llvm-svn: 198478	2014-01-04 05:52:49 +00:00
Ana Pazos	e891c5f264	[AArch64][NEON] Added SXTL and SXTL2 instruction aliases llvm-svn: 198437	2014-01-03 19:20:31 +00:00
David Blaikie	cfb2115e66	Revert "Revert "Debug Info: Type Units: Simplify type hashing using IR-provided unique names."" This reverts commit r198398, thus reapplying r198397. I had accidentally introduced an endianness issue when applying the hash to the type unit. Using support::ulittle64_t in the reinterpret_cast in addDwarfTypeUnitType fixes this issue. Original commit message: Debug Info: Type Units: Simplify type hashing using IR-provided unique names. What's good for LTO metadata size problems ought to be good for non-LTO debug info size too, so let's rely on the same uniqueness in both cases. If it's insufficient for non-LTO for whatever reason (since we now won't be uniquing CU-local types or any C types - but these are likely to not be the most significant contributors to type bloat) we should consider a frontend solution that'll help both LTO and non-LTO alike, rather than using DWARF-level DIE-hashing that only helps non-LTO debug info size. It's also much simpler this way and benefits C++ even more since we can deduplicate lexically separate definitions of the same C++ type since they have the same mangled name. llvm-svn: 198436	2014-01-03 18:59:42 +00:00
David Peixotto	ea9ba446d5	Fix loop rerolling pass failure with non-constant loop lower bound The loop rerolling pass was failing with an assertion failure from a failed cast on loops like this: void foo(int *A, int *B, int m, int n) { for (int i = m; i < n; i+=4) { A[i+0] = B[i+0] * 4; A[i+1] = B[i+1] * 4; A[i+2] = B[i+2] * 4; A[i+3] = B[i+3] * 4; } } The code was casting the SCEV-expanded code for the new induction variable to a phi-node. When the loop had a non-constant lower bound, the SCEV expander would end the code expansion with an add instead of a phi node and the cast would fail. It looks like the cast to a phi node was only needed to get the induction variable value coming from the backedge to compute the end of loop condition. This patch changes the loop reroller to compare the induction variable to the number of times the backedge is taken instead of the iteration count of the loop. In other words, we stop the loop when the current value of the induction variable == IterationCount-1. Previously, the comparison was comparing the induction variable value from the next iteration == IterationCount. This problem only seems to occur on 32-bit targets. For some reason, the loop is not rerolled on 64-bit targets. PR18290 llvm-svn: 198425	2014-01-03 17:20:01 +00:00
Arnold Schwaighofer	833a82ecde	BasicAA: Use reachability instead of dominance for checking value equality in phi cycles This allows the value equality check to work even if we don't have a dominator tree. Also add some more comments. I was worried about compile time impacts and did not implement reachability but used the dominance check in the initial patch. The trade-off was that the dominator tree was required. The llvm utility function isPotentiallyReachable cuts off the recursive search after 32 visits. Testing did not show any compile time regressions, showing my worries unjustified. No compile time or performance regressions at O3 -flto -mavx on test-suite + externals. Addresses review comments from r198290. llvm-svn: 198400	2014-01-03 05:47:03 +00:00
David Blaikie	ab0ba24983	Revert "Debug Info: Type Units: Simplify type hashing using IR-provided unique names." Reverting due to bot failure I won't have time to investigate until tomorrow. This reverts commit r198397. llvm-svn: 198398	2014-01-03 04:49:04 +00:00
David Blaikie	ddb66281cd	Debug Info: Type Units: Simplify type hashing using IR-provided unique names. What's good for LTO metadata size problems ought to be good for non-LTO debug info size too, so let's rely on the same uniqueness in both cases. If it's insufficient for non-LTO for whatever reason (since we now won't be uniquing CU-local types or any C types - but these are likely to not be the most significant contributors to type bloat) we should consider a frontend solution that'll help both LTO and non-LTO alike, rather than using DWARF-level DIE-hashing that only helps non-LTO debug info size. It's also much simpler this way and benefits C++ even more since we can deduplicate lexically separate definitions of the same C++ type since they have the same mangled name. llvm-svn: 198397	2014-01-03 04:20:26 +00:00
David Blaikie	22b29a5f1a	Revert "Reverting r193835 due to weirdness with Go..." The cgo problem was that it wants dwarf2 which doesn't support direct constant encoding of the location. So let's add support for dwarf2 encoding (using a location expression) of data member locations. This reverts commit r198385. llvm-svn: 198389	2014-01-03 01:30:05 +00:00
David Blaikie	2ada116a34	Reverting r193835 due to weirdness with Go... Apologies for the noise - we're seeing some Go failures with cgo interacting with Clang's debug info due to this change. llvm-svn: 198385	2014-01-03 00:48:38 +00:00
Quentin Colombet	1fb3362a6e	[RegAlloc] Make tryInstructionSplit less aggressive. The greedy register allocator tries to split a live-range around each instruction where it is used or defined to relax the constraints on the entire live-range (this is a last chance split before falling back to spill). The goal is to have a big live-range that is unconstrained (i.e., that can use the largest legal register class) and several small local live-ranges that carry the constraints implied by each instruction. E.g., Let csti be the constraints on operation i. V1= op1 V1(cst1) op2 V1(cst2) V1 live-range is constrained on the intersection of cst1 and cst2. tryInstructionSplit relaxes those constraints by aggressively splitting each def/use point: V1= V2 = V1 V3 = V2 op1 V3(cst1) V4 = V2 op2 V4(cst2) Because of how the coalescer infrastructure works, each new variable (V3, V4) that is alive at the same time as V1 (or its copy, here V2) interferes with V1. Thus, we end up with an uncoalescable copy for each split point. To make tryInstructionSplit less aggressive, we check if the split point actually relaxes the constraints on the whole live-range. If it does not, we do not insert it. Indeed, it will not help the global allocation problem: - V1 will have the same constraints. - V1 will have the same interference + possibly the newly added split variable VS. - VS will produce an uncoalesceable copy if alive at the same time as V1. <rdar://problem/15570057> llvm-svn: 198369	2014-01-02 22:47:22 +00:00
Matt Arsenault	ceae33569d	Fix all the verifier tests I added for address spaces. I originally had these using opt -verify, and I never removed the -verify when converting them to use llvm-as instead, so these were failing because of using the -verify argument which llvm-as doesn't have instead of what it's actually supposed to be testing. llvm-svn: 198352	2014-01-02 21:09:05 +00:00
Matt Arsenault	00436ea156	Allow addrspacecast in global aliases llvm-svn: 198349	2014-01-02 20:55:01 +00:00
Hal Finkel	a8c1f46767	[TableGen] Correctly generate implicit anonymous prototype defs in multiclasses Even within a multiclass, we had been generating concrete implicit anonymous defs when parsing values (generally in value lists). This behavior was incorrect, and led to errors when multiclass parameters were used in the parameter list of the implicit anonymous def. If we had some multiclass: multiclass mc<string n> { ... : SomeClass<SomeOtherClass<n> > The capture of the multiclass parameter 'n' would not work correctly, and depending on how the implicit SomeOtherClass was used, either TableGen would ignore something it shouldn't, or would crash. To fix this problem, when inside a multiclass, we generate prototype anonymous defs for implicit anonymous defs (just as we do for explicit anonymous defs). Within the multiclass, the current record prototype is populated with a node that is essentially: !cast<SomeOtherClass>(!strconcat(NAME, anon_value_name)). This is then resolved to the correct concrete anonymous def, in the usual way, when NAME is resolved during multiclass instantiation. llvm-svn: 198348	2014-01-02 20:47:09 +00:00
Matt Arsenault	461c8e0a8c	Delete unread globals through addrspacecast llvm-svn: 198346	2014-01-02 20:01:43 +00:00
Matt Arsenault	da1deabb16	Fix addrspacecast with metadata globals llvm-svn: 198345	2014-01-02 19:53:49 +00:00
Jordan Rose	353bdcde90	[CMake] Add missing set_output_directory after Takumi's change in r198205. Plugins need to go in build/Debug/lib as well (rather than build/lib/Debug). Also, fix the SHLIBDIR path for Xcode, which by default includes Xcode build settings rather than a simple %(build_mode)s parameter. llvm-svn: 198344	2014-01-02 19:47:45 +00:00
Hal Finkel	f2a0b2b340	[TableGen] Use the same anonymous name as the prefix on all multiclass defs TableGen had been generating a different name for an anonymous multiclass's NAME for every def in the multiclass. This had an unfortunate side effect: it was impossible to reference one def within the multiclass from another (in the parameter list, for example). By making sure we only generate an anonymous name once per multiclass (which, as it turns out, requires only changing the name parameter to reference type), we can now concatenate NAME within the multiclass with a def name in order to generate a reference to that def. This does not matter so much, in and of itself, but is necessary for a follow-up commit that will fix variable capturing in implicit anonymous multiclass defs (and that is important). llvm-svn: 198340	2014-01-02 19:35:33 +00:00
Andrew Trick	020dd898fc	indvars: insert truncate at loop boundary to avoid redundant IVs. When widening an IV to remove s/zext, we generally try to eliminate the original narrow IV. However, LCSSA phi nodes outside the loop were still using the original IV. Clean this up more aggressively to avoid redundancy in generated code. llvm-svn: 198338	2014-01-02 19:29:38 +00:00
Adrian Prantl	fd3279f27f	Revert "Debug info: Add enumerators to the __apple_names accelerator table." This reverts r197927 until the discussion on llvm-commits comes to a conclusion. llvm-svn: 198333	2014-01-02 18:48:24 +00:00
Logan Chien	05ae744813	[arm] Add softvfp to supported FPU names. llvm-svn: 198313	2014-01-02 15:50:02 +00:00
Rafael Espindola	d89b16dcb8	Make the ARM ABI selectable via SubtargetFeature. This patch makes it possible to select the ABI with -mattr. It will be used to forward clang's -target-abi option to llvm's CodeGen. llvm-svn: 198304	2014-01-02 13:40:08 +00:00
Arnold Schwaighofer	0d10a9d579	BasicAA: Fix value equality and phi cycles When there are cycles in the value graph we have to be careful interpreting "Value" identity as "value" equivalence. We interpret the value of a phi node as the value of its operands. When we check for value equivalence now we make sure that the "Value" dominates all cycles (phis). %0 = phi [%noaliasval, %addr2] %l = load %ptr %addr1 = gep @a, 0, %l %addr2 = gep @a, 0, (%l + 1) store %ptr ... Before this patch we would return NoAlias for (%0, %addr1) which is wrong because the value of the load is from different iterations of the loop. Tested on x86_64 -mavx at O3 and O3 -flto with no performance or compile time regressions. PR18068 radar://15653794 llvm-svn: 198290	2014-01-02 03:31:36 +00:00
Venkatraman Govindaraju	9a3da52ea2	[Sparc] Handle atomic loads/stores in sparc backend. llvm-svn: 198286	2014-01-01 22:11:54 +00:00
Venkatraman Govindaraju	77011e861b	[SparcV9]: Custom lower UMULO/SMULO so that the arguments are sent to __multi3() in correct order. llvm-svn: 198281	2014-01-01 20:22:45 +00:00
Venkatraman Govindaraju	acf0233a46	[SparcV9]: Use SRL instead of SLL to clear top 32-bits in ctpop:i32. SLL does not clear top 32 bit, only SRL does. llvm-svn: 198280	2014-01-01 19:00:10 +00:00
Craig Topper	9155118602	Remove need for MODIFIER_OPCODE in the disassembler tables. AddRegFrms are really more like OrRegFrm so we don't need a difference since we can just mask bits. llvm-svn: 198278	2014-01-01 15:29:32 +00:00
Elena Demikhovsky	de3f751baf	AVX-512: Added intrinsics for vcvt, vcvtt, vrndscale, vcmp Printing rounding control. Encoding for EVEX_RC (rounding control). llvm-svn: 198277	2014-01-01 15:12:34 +00:00
Craig Topper	3fec8c612e	Add two fp test cases I missed in my previous commit. llvm-svn: 198269	2013-12-31 23:15:19 +00:00
Craig Topper	719560102d	Add more X86 FP stack disassembler test cases. llvm-svn: 198268	2013-12-31 22:51:53 +00:00
Nick Lewycky	2d4ba2ebba	Fold vector selects with undef elements in the condition. Fixes PR18319. Patch by Ilia Filippov! llvm-svn: 198267	2013-12-31 19:30:47 +00:00
Craig Topper	e98c8cb9f0	Revert r198238 and add FP disassembler tests. It didn't work and I didn't realize we had no FP disassembler test cases. llvm-svn: 198265	2013-12-31 17:21:44 +00:00
Saleem Abdulrasool	e3a9dc134d	ARM IAS: account for predicated pre-UAL mnemonics Checking the trailing letter of the mnemonic is insufficient. Be more thorough in the scanning of the instruction to ensure that we correctly work with the predicated mnemonics. llvm-svn: 198235	2013-12-30 18:38:01 +00:00
Eric Christopher	d86672037b	Revert r198208 and reapply: r198196: Use a pointer to keep track of the skeleton unit for each normal unit and construct it up front. r198199: Reapply r198196 with a fix to zero initialize the skeleton pointer. r198202: Fix aranges and split dwarf by ensuring that the symbol and relocation back to the compile unit from the aranges section is to the skeleton unit and not the one in the dwo. with a fix to use integer 0 for DW_AT_low_pc since the relocation to the text section symbol was causing issues with COFF. Accordingly remove addLocalLabelAddress and machinery since we're not currently using it. llvm-svn: 198222	2013-12-30 17:22:27 +00:00
NAKAMURA Takumi	17b7310858	Revert r198199 (and r198202). It broke 3 DebugInfo tests for targeting i686-cygming. r198196: Use a pointer to keep track of the skeleton unit for each normal unit and construct it up front. r198199: Reapply r198196 with a fix to zero initialize the skeleton pointer. r198202: Fix aranges and split dwarf by ensuring that the symbol and relocation back to the compile unit from the aranges section is to the skeleton unit and not the one in the dwo. They could be reproducible with explicit target. llvm/lib/MC/WinCOFFObjectWriter.cpp:224: bool {anonymous}::COFFSymbol::should_keep() const: Assertion `Section->Number != -1 && "Sections with relocations must be real!"' failed. llvm-svn: 198208	2013-12-30 09:26:10 +00:00
Eric Christopher	c2d401e952	Fix aranges and split dwarf by ensuring that the symbol and relocation back to the compile unit from the aranges section is to the skeleton unit and not the one in the dwo. Do this by adding a method to grab a forwarded on local sym and local section by querying the skeleton if one exists and using that. Add a few tests to verify the relocations are back to the correct section. llvm-svn: 198202	2013-12-30 05:25:49 +00:00
Eric Christopher	d039baad05	Reapply r198196 with a fix to zero initialize the skeleton pointer. llvm-svn: 198199	2013-12-30 03:40:32 +00:00
Eric Christopher	be4c91c57c	Temporarily revert "Use a pointer to keep track of the skeleton unit for each normal unit" as it seems to be causing problems in the asan tests. llvm-svn: 198197	2013-12-30 03:12:31 +00:00
Eric Christopher	83fff3fce7	Use a pointer to keep track of the skeleton unit for each normal unit and construct it up front. Add address ranges at the end and a helper routine so that we're not needlessly using an indirection in the case of split dwarf. Update testcases according to the new ordering of attributes on the compile unit. llvm-svn: 198196	2013-12-30 03:02:12 +00:00
Jiangning Liu	a0acf70af1	For AArch64 Neon, simplify scalar dup by lane0 for fp. llvm-svn: 198194	2013-12-30 02:44:35 +00:00
Hao Liu	fe3bfc8c41	[AArch64]Add code to spill/fill Q register tuples such as QPair/QTriple/QQuad. llvm-svn: 198193	2013-12-30 02:38:12 +00:00
Hao Liu	b591f835d6	[AArch64]Can't select shift left 0 of type v1i64 llvm-svn: 198192	2013-12-30 02:12:46 +00:00
Kevin Qin	ede9ce1933	Fix a bug in DAGCombiner about zero-extend after setcc. For the AArch64 backend, if DAGCombiner sees "sext(setcc)", it will combine them together into a single setcc with extended value type. Then if it sees "zext(setcc)", it assumes setcc is Vxi1, and tries to create "(and (vsetcc), (1, 1, ...)". When setcc isn't Vxi1, DAGCombiner will create a wrong node and wrong code gets emitted. llvm-svn: 198190	2013-12-30 02:05:13 +00:00
Hao Liu	74107fe526	[AArch64]Fix the problem that can't select mul of v1i64/v2i64 types. E.g. Can't select such IR: %tmp = mul <2 x i64> %a, %b llvm-svn: 198188	2013-12-30 01:38:41 +00:00
Bill Wendling	8ea7582546	Un-XFAILify some tests which are now passing. llvm-svn: 198184	2013-12-29 23:09:14 +00:00
Saleem Abdulrasool	4da9c6e566	ARM: provide VFP aliases for pre-V6 mnemonics In order to provide compatibility with the GNU assembler, provide aliases for pre-UAL mnemonics for floating point operations. llvm-svn: 198172	2013-12-29 17:58:35 +00:00
Venkatraman Govindaraju	3e3a29a2e9	[SparcV9] Use separate instruction patterns for 64 bit arithmetic instructions instead of reusing 32 bit instruction patterns. This is done to avoid spilling the result of the 64-bit instructions to a 4-byte slot. llvm-svn: 198157	2013-12-29 07:15:09 +00:00
Venkatraman Govindaraju	5ac9c8faec	[SparcV9] For codegen generated library calls that return float, set inreg flag manually in LowerCall(). This makes the Sparc backend generate Sparc64 ABI compliant code. llvm-svn: 198149	2013-12-29 04:27:21 +00:00
Venkatraman Govindaraju	0776cc0acd	[SparcV9]: Implement lowering of long double (fp128) arguments in Sparc64 ABI. Also, pass fp128 arguments to varargs through integer registers if necessary. llvm-svn: 198145	2013-12-29 01:20:36 +00:00
Andrew Trick	3ca67d6404	New machine model for cortex-a9. Schedule for resources and latency. Schedule more conservatively to account for stalls on floating point resources and latency. Use the AGU resource to model latency stalls since it's shared between FP and LD/ST instructions. This might not be completely accurate but should work well in practice. llvm-svn: 198125	2013-12-28 21:57:05 +00:00
NAKAMURA Takumi	cf396cf82c	llvm/test/CodeGen/X86/vselect.ll: Unbreak Windows x64 targets to add -mtriple=x86_64-unknown-unknown. llvm-svn: 198114	2013-12-28 13:04:29 +00:00
Andrea Di Biagio	eaceba0ed0	[X86] Teach the backend how to fold target specific dag node for packed vector shift by immediate count (VSHLI/VSRLI/VSRAI) into a build_vector when the input vector to the shift is a build_vector of all constants or UNDEFs. Target specific nodes for packed shifts by immediate count are in general introduced by function 'getTargetVShiftByConstNode' (in X86ISelLowering.cpp) when lowering shift operations, SSE/AVX immediate shift intrinsics and (only in very few cases) SIGN_EXTEND_INREG dag nodes. This patch adds extra rules for simplifying vector shifts inside function 'getTargetVShiftByConstNode'. Added file test/CodeGen/X86/vec_shift5.ll to verify that packed shifts by immediate are correctly folded into a build_vector when the input vector to the shift dag node is a vector of constants or undefs. llvm-svn: 198113	2013-12-28 11:11:52 +00:00
Saleem Abdulrasool	51cff7199d	AsmParser: cleanup diagnostics for .rep/.rept Avoid double diagnostics for invalid expressions for count. Improve caret location for negative count. llvm-svn: 198099	2013-12-28 06:39:29 +00:00
Saleem Abdulrasool	d743d0ab8c	IAS: support .rep as an alias for .rept The GNU assembler supports .rep as an alias for .rept. This simply creates the alias for it and introduces a test for both .rept and .rep. llvm-svn: 198097	2013-12-28 05:54:33 +00:00
Chandler Carruth	f5689f8304	Disable transforms that introduce calls to exp10*() on Linux due to widespread glibc bugs. The glibc implementation of exp10 has a very serious precision bug in version 2.15 (and older versions). This is still very widely used (the current Ubuntu LTS for example uses it) and so it isn't reasonable to make transforms that produce these functions. This fixes many miscompiles introduced when we started transforming pow(10.0, ...) into exp10, and it may have fixed other latent miscompiles where exp10 provided sufficient precision but exp10f did not. This is all really horrible. The primary bug has been fixed for over a year and glibc 2.18 works correctly for the test cases I have, but it will be 2017 before the LTS using 2.15 is no longer supported by Ubuntu (and thus reasonable for folks to be relying on). =[ We're either going to need to live without these optimizations, or find a way to switch behavior more dynamically than using simply the fact that the OS is "Linux". To make matters worse, there appears to be significant testing and fixing of numerous other bugs in the exp10 family of functions right now in glibc. While those haven't been causing problems I've seen in the wild, it gives me concerns that we may need to wait until an even later release of glibc before we can reliably transform code into exp10. llvm-svn: 198093	2013-12-28 02:40:19 +00:00
Andrea Di Biagio	46dcddb350	Teach DAGCombiner how to fold a SIGN_EXTEND_INREG of a BUILD_VECTOR of ConstantSDNodes (or UNDEFs) into a simple BUILD_VECTOR. For example, given the following sequence of dag nodes: i32 C = Constant<1> v4i32 V = BUILD_VECTOR C, C, C, C v4i32 Result = SIGN_EXTEND_INREG V, ValueType:v4i1 The SIGN_EXTEND_INREG node can be folded into a build_vector since the vector in input is a BUILD_VECTOR of constants. The optimized sequence is: i32 C = Constant<-1> v4i32 Result = BUILD_VECTOR C, C, C, C llvm-svn: 198084	2013-12-27 20:20:28 +00:00
Joerg Sonnenberger	a13f8b4f36	Recognize armv7a and friends as aliases for armv7-a etc. for the purpose of architecture naming. llvm-svn: 198043	2013-12-26 11:50:28 +00:00
Saleem Abdulrasool	a554968dde	ARM IAS: support .even directive The .even directive aligns content to an even-numbered address. This is an ARM specific directive applicable to any section. llvm-svn: 198031	2013-12-26 01:52:28 +00:00
Venkatraman Govindaraju	bf683fd15c	[Sparc] Lower and MachineInstr to MC and print assembly using MCInstPrinter. llvm-svn: 198030	2013-12-26 01:49:59 +00:00
Alexander Potapenko	cb66fe377a	[ASan] Fix the tests broken by r198018 to check for private linkage of ASan-generated globals. llvm-svn: 198020	2013-12-25 17:06:04 +00:00
Simon Atanasyan	fde102cb77	[Mips] Do not take the 'use-soft-float' attribute's value into account when considering whether to generate stubs for mips16 hard-float mode. The patch was reviewed by Reed Kotler. llvm-svn: 198019	2013-12-25 17:00:27 +00:00
Elena Demikhovsky	371e363833	AVX-512: decoder for AVX-512, made by Alexey Bader. llvm-svn: 198013	2013-12-25 11:40:51 +00:00
Zoran Jovanovic	bd28c373c4	Support for microMIPS load effective address. llvm-svn: 198010	2013-12-25 10:14:07 +00:00
Zoran Jovanovic	8876be39c7	Support for microMIPS FPU instructions 2. llvm-svn: 198009	2013-12-25 10:09:27 +00:00
Hao Liu	83799741fb	[AArch64]Fix a problem that the register order of fmls/fmla by element is incorrect. E.g. the codegen result is fmls v1.2s, v0.2s, v2.s[3] which is expected to be fmls v0.2s, v1.2s, v2.s[3] llvm-svn: 198001	2013-12-25 07:12:34 +00:00
Jiangning Liu	dd1afd5338	Add missing pattern matches to support ACLE intrinsics of AArch64 NEON. llvm-svn: 197993	2013-12-25 01:22:51 +00:00
Alexey Samsonov	60e59e29f8	llvm-symbolizer: add --obj flag to specify a single object file that should be symbolized. llvm-svn: 197988	2013-12-24 19:33:22 +00:00
Richard Sandiford	41350a52ca	[SystemZ] Use interlocked-access 1 instructions for CodeGen ...namely LOAD AND ADD, LOAD AND AND, LOAD AND OR and LOAD AND EXCLUSIVE OR. LOAD AND ADD LOGICAL isn't really separately useful for LLVM. I'll look at reusing the CC results in the new year. llvm-svn: 197985	2013-12-24 15:18:04 +00:00
Richard Sandiford	45645a2c1c	[SystemZ] Add MC support for interlocked-access 1 instructions llvm-svn: 197984	2013-12-24 15:14:05 +00:00
Elena Demikhovsky	64c9548d66	AVX-512: fixed some patterns for MVT::i1 llvm-svn: 197981	2013-12-24 14:24:07 +00:00
Hao Liu	ce7a12be8f	[AArch64]Add patterns to match normal shift nodes: shl, sra and srl. llvm-svn: 197969	2013-12-24 09:00:21 +00:00
Kevin Qin	82bd84aadf	[AArch64 NEON] Fix a bug when lowering BUILD_VECTOR. DAG.getVectorShuffle() doesn't always return a vector_shuffle node. If the mask is the exact sequence of its operand (for example, operand_0 is v8i8 and the mask is 0, 1, 2, 3, 4, 5, 6, 7), it will directly return that operand. So a check is added here. llvm-svn: 197967	2013-12-24 08:16:06 +00:00
Kevin Qin	cd5f3153f5	[AArch64 NEON] Fix a pattern match failure with NEON_VDUP. This failure was caused by an improper condition when lowering shuffle_vector to scalar_to_vector. After this patch NEON_VDUP with v1i64 will not be generated. llvm-svn: 197966	2013-12-24 08:11:47 +00:00
Ana Pazos	bc2996b30f	[AArch64] Check fmul node single use in fused multiply patterns Check for single use of fmul node in fused multiply patterns to allow generation of fused multiply add/sub instructions. Otherwise the fmul operation ends up being repeated more than once, which does not help performance on targets with only one MAC unit, such as cortex-a53. llvm-svn: 197929	2013-12-24 00:47:29 +00:00
Ana Pazos	3ca23915cd	[AArch64 NEON] Fixed fused multiply negate add/sub patterns The correct pattern matching should be: - fnmadd is (-Ra) + (-Rn)*Rm which should be matched as: fma (fneg node:$Rn), node:$Rm, (fneg node:$Ra) and as (f32 (fsub (f32 (fneg FPR32:$Ra)), (f32 (fmul FPR32:$Rn, FPR32:$Rm)))) - fnmsub is (-Ra) + Rn*Rm which should be matched as fma node:$Rn, node:$Rm, (fneg node:$Ra) and as (f32 (fsub (f32 (fmul FPR32:$Rn, FPR32:$Rm)), FPR32:$Ra)) llvm-svn: 197928	2013-12-24 00:40:10 +00:00
Adrian Prantl	ad64aeac44	Debug info: Add enumerators to the __apple_names accelerator table. rdar://problem/11516681. llvm-svn: 197927	2013-12-23 23:50:20 +00:00
Andrew Trick	0ba77a0740	Add support to indvars for optimizing sadd.with.overflow. Split sadd.with.overflow into add + sadd.with.overflow to allow analysis and optimization. This should ideally be done after InstCombine, which can perform code motion (eventually indvars should run after all canonical instcombines). We want ISEL to recombine the add and the check, at least on x86. This is currently under an option for reducing live induction variables: -liv-reduce. The next step is reducing liveness of IVs that are live out of the overflow check paths. Once the related optimizations are fully developed, reviewed and tested, I do expect this to become default. llvm-svn: 197926	2013-12-23 23:31:49 +00:00
Adrian Prantl	edb61f02b6	Debug info: On ARM ensure that the data sections come before the (optional) DWARF sections, so compiling with -g does not result in different code being generated. rdar://problem/15623193 llvm-svn: 197922	2013-12-23 22:24:47 +00:00
Saleem Abdulrasool	701875542d	ARM: bkpt has an implicit immediate constant 0 The bkpt mnemonic has an implicit immediate constant of 0 unless otherwise specified. Add an instruction alias for the unvalued breakpoint mnemonic to treat it as a 0. This improves compatibility with GNU AS. Signed-off-by: Saleem Abdulrasool <compnerd@compnerd.org> llvm-svn: 197913	2013-12-23 17:23:58 +00:00
Richard Sandiford	1fb5c13e3a	Fix Scalarizer insertion point when replacing PHIs with insertelements If the Scalarizer scalarized a vector PHI but could not scalarize all uses of it, it would insert a series of insertelements to reconstruct the vector PHI value from the scalar ones. The problem was that it would emit these insertelements immediately after the PHI, even if there were other PHIs after it. llvm-svn: 197909	2013-12-23 14:51:56 +00:00
Richard Sandiford	3548cbb980	Fix Scalarizer handling of vector GEPs with multiple index operands The old code only worked for one index operand. Also handle "inbounds". llvm-svn: 197908	2013-12-23 14:45:00 +00:00
Kostya Serebryany	530e207d8a	[asan] don't unpoison redzones on function exit in use-after-return mode. Summary: Before this change the instrumented code before Ret instructions looked like: <Unpoison Frame Redzones> if (Frame != OriginalFrame) // I.e. Frame is fake <Poison Complete Frame> Now the instrumented code looks like: if (Frame != OriginalFrame) // I.e. Frame is fake <Poison Complete Frame> else <Unpoison Frame Redzones> Reviewers: eugenis Reviewed By: eugenis CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2458 llvm-svn: 197907	2013-12-23 14:15:08 +00:00
Hao Liu	408c8b0866	[AArch64]The compare to zero intrinsics should be implemented by 'icmp/fcmp' and 'sext' not 'zext'. Modify the test cases. llvm-svn: 197897	2013-12-23 02:42:10 +00:00
Elena Demikhovsky	fe24a30e38	AVX512: SETCC returns i1 for AVX-512 and i8 for all others llvm-svn: 197876	2013-12-22 10:13:18 +00:00
Michael Kuperstein	f5fb0eace5	Ensure bitcode encoding of calling conventions stays stable. Patch by Boaz Ouriel. llvm-svn: 197873	2013-12-22 07:51:53 +00:00
Alp Toker	387350353f	FileCheckize r197869 llvm-svn: 197872	2013-12-22 03:43:58 +00:00
Alp Toker	597942f8ae	Relax tab check into a whitespace check to fix the test in r197869 llvm-svn: 197870	2013-12-21 19:11:31 +00:00
Alp Toker	ce91fe5569	TableGen: Generate valid identifiers for anonymous records Backends like OptParserEmitter assume that record names can be used as valid identifiers. The period '.' in generated anonymous names broke that assumption, causing a build-time error and in practice forcing all records to be named. llvm-svn: 197869	2013-12-21 18:51:00 +00:00
Timur Iskhodzhanov	f75e5bbefc	Add the .secidx test I've forgotten to svn add in 197826 llvm-svn: 197828	2013-12-20 19:06:50 +00:00
Roman Divacky	32143e2bda	Implement initial-exec TLS for PPC32. llvm-svn: 197824	2013-12-20 18:08:54 +00:00
Zoran Jovanovic	ce02486d16	Support for microMIPS FPU instructions 1. llvm-svn: 197815	2013-12-20 15:44:08 +00:00
Richard Sandiford	83a0b6abd0	[SystemZ] Optimize comparisons with truncated extended loads If the extension of a loaded value is compared against zero and used in other arithmetic, InstCombine will change the comparison to use the unextended load. It's also possible that the comparison could be against the unextended load from the outset. In DAG form this becomes a truncation of an extending load. We want to strip the truncation if possible so that we can use load-and-test instructions. llvm-svn: 197804	2013-12-20 11:56:02 +00:00
Richard Sandiford	220ee49bce	[SystemZ] Extend RISBG optimization The handling of ANY_EXTEND and ZERO_EXTEND was too strict. In this context we can treat ZERO_EXTEND in much the same way as an AND and then also handle outermost ZERO_EXTENDs. I couldn't find a test that benefited from the ANY_EXTEND change, but it's more obvious to write it this way once SIGN_EXTEND and ZERO_EXTEND are handled differently. llvm-svn: 197802	2013-12-20 11:49:48 +00:00
Justin Bogner	0ba3f211c4	Transforms: Don't create bad weights when eliminating dead cases If we happen to eliminate every case in a switch that has branch weights, we currently try to create metadata for the one remaining branch, triggering an assert. Instead, we need to check that the metadata we're trying to create is sensible. llvm-svn: 197791	2013-12-20 08:21:30 +00:00
Justin Bogner	668eb1f746	test: Make a branchweight test more specific llvm-svn: 197790	2013-12-20 08:21:27 +00:00
Justin Bogner	f71b18e972	test: Prefer CHECK-LABEL to CHECK in branchweight tests llvm-svn: 197789	2013-12-20 08:21:24 +00:00
Saleem Abdulrasool	6e6c239e33	ARM IAS: add support for the .pool directive The .pool directive is an alias for the .ltorg directive used to create a literal pool. Simply treat .pool as if .ltorg was passed. llvm-svn: 197787	2013-12-20 07:21:16 +00:00
Tom Stellard	eddfa69465	R600: Allow ftrunc v2: Add ftrunc->TRUNC pattern instead of replacing int_AMDGPU_trunc v3: move ftrunc pattern next to TRUNC definition, it's available since R600 Patch By: Jan Vesely Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 197783	2013-12-20 05:11:55 +00:00
Eric Christopher	fa98a0c451	Remove extra check line that's failing on windows and not necessary at the moment. llvm-svn: 197782	2013-12-20 04:40:28 +00:00
Eric Christopher	4150a774b2	This test requires object emission. llvm-svn: 197781	2013-12-20 04:34:50 +00:00
Eric Christopher	46e2343554	Add support for a CU to output a set of ranges for the CU. This is useful when you want to have the full list of addresses for a particular CU or when you have multiple modules linked together and can't depend upon the ordering of a single CU for begin/end ranges. llvm-svn: 197776	2013-12-20 04:16:18 +00:00
Adrian Prantl	88dd69760a	move test back into the parent directory and add a REQUIRES: obj emission. llvm-svn: 197759	2013-12-20 00:37:18 +00:00
Rafael Espindola	08d32ee757	Update the ML test to expect the new string format of getStringRepresentation. llvm-svn: 197750	2013-12-19 23:38:09 +00:00
Kevin Enderby	36eba25fee	Un-revert: the buildbot failure in LLVM on lld-x86_64-win7 had me with this commit as the only one on the Blamelist so I quickly reverted this. However it was actually Nick's change who has since fixed that issue. Original commit message: Changed the X86 assembler for intel syntax to work with directional labels. The X86 assembler has separate code to parse the Intel assembly syntax in X86AsmParser::ParseIntelOperand(). This did not parse directional labels. And if something like 1f was used as a branch target it would get an "Unexpected token" error. The fix starts in X86AsmParser::ParseIntelExpression() in the case for AsmToken::Integer, it needs to grab the IntVal from the current token then look for a 'b' or 'f' following the Integer. Then it basically needs to do what is done in AsmParser::parsePrimaryExpr() for directional labels. It saves the MCExpr it creates in the IntelExprStateMachine in the Sym field. When it returns to X86AsmParser::ParseIntelOperand() it looks for a non-zero Sym field in the IntelExprStateMachine and if set it creates a memory operand rather than the immediate operand it would normally create for the Integer. rdar://14961158 llvm-svn: 197744	2013-12-19 23:16:14 +00:00
Kevin Enderby	d6f2a63791	Revert my change to the X86 assembler for intel syntax to work with directional labels. Because it doesn't work for windows :) llvm-svn: 197731	2013-12-19 22:24:09 +00:00
Kevin Enderby	592d3ac226	Changed the X86 assembler for intel syntax to work with directional labels. The X86 assembler has separate code to parse the Intel assembly syntax in X86AsmParser::ParseIntelOperand(). This did not parse directional labels. And if something like 1f was used as a branch target it would get an "Unexpected token" error. The fix starts in X86AsmParser::ParseIntelExpression() in the case for AsmToken::Integer, it needs to grab the IntVal from the current token then look for a 'b' or 'f' following the Integer. Then it basically needs to do what is done in AsmParser::parsePrimaryExpr() for directional labels. It saves the MCExpr it creates in the IntelExprStateMachine in the Sym field. When it returns to X86AsmParser::ParseIntelOperand() it looks for a non-zero Sym field in the IntelExprStateMachine and if set it creates a memory operand rather than the immediate operand it would normally create for the Integer. rdar://14961158 llvm-svn: 197728	2013-12-19 22:02:03 +00:00
Quentin Colombet	90a646e4d1	[X86][fast-isel] Fix select lowering. The condition in selects is supposed to be i1. Make sure we are just reading the least significant bit of the 8-bit-wide value to match this constraint. <rdar://problem/15651765> llvm-svn: 197712	2013-12-19 18:32:04 +00:00
David Peixotto	80c083a678	Implement the .ltorg directive for ARM assembly This directive will write out the assembler-maintained constant pool for the current section. These constant pools are created to support the ldr-pseudo instruction (e.g. ldr r0, =val). The directive can be used by the programmer to place the constant pool in a location that can be reached by a pc-relative offset in the ldr instruction. llvm-svn: 197711	2013-12-19 18:26:07 +00:00
Josh Magee	58fa493955	Unbreak ARM buildbots after r197653 by forcing the target triple on this test. llvm-svn: 197709	2013-12-19 18:14:42 +00:00
David Peixotto	e407d093e8	Implement the ldr-pseudo opcode for ARM assembly The ldr-pseudo opcode is a convenience for loading 32-bit constants. It is converted into a pc-relative load from a constant pool. For example, ldr r0, =0x10001 ldr r1, =bar will generate this output in the final assembly ldr r0, .Ltmp0 ldr r1, .Ltmp1 ... .Ltmp0: .long 0x10001 .Ltmp1: .long bar Sketch of the LDR pseudo implementation: Keep a map from Section => ConstantPool When parsing ldr r0, =val parse val as an MCExpr get ConstantPool for current Section Label = CreateTempSymbol() remember val in ConstantPool at next free slot add operand to ldr that is MCSymbolRef of Label On finishParse() callback Write out all non-empty constant pools for each Entry in ConstantPool Emit Entry.Label Emit Entry.Value Possible improvements to be added in a later patch: 1. Does not convert load of small constants to mov (e.g. ldr r0, =0x1 => mov r0, 0x1) 2. Does not reuse constant pool entries for the same constant The implementation was tested for ARM, Thumb1, and Thumb2 targets on linux and darwin. llvm-svn: 197708	2013-12-19 18:12:36 +00:00
Adrian Prantl	ddad4947b0	Move testcase to the appropriate X86 subdirectory. llvm-svn: 197701	2013-12-19 17:09:05 +00:00
Zoran Jovanovic	8e918c3c4d	Support for microMIPS control instructions. llvm-svn: 197696	2013-12-19 16:25:00 +00:00
Hal Finkel	2345347eb9	Add a disassembler to the PowerPC backend The tests for the disassembler were adapted from the encoder tests, and for the most part, the output from the disassembler matches the encoder-test inputs. There are some places where more-informative mnemonics could be produced (notably for the branch instructions), and those cases are noted in the tests with FIXMEs. Future work includes: - Generating more-informative mnemonics when possible (this may also be done in the printer). - Remove the dependence on positional "numbered" operand-to-variable mapping (for both encoding and decoding). - Internally using 64-bit instruction variants in 64-bit mode (if this turns out to matter). llvm-svn: 197693	2013-12-19 16:13:01 +00:00
Zoran Jovanovic	ff9d5f3284	Support for microMIPS LL and SC instructions. llvm-svn: 197692	2013-12-19 16:12:56 +00:00
Rafael Espindola	357d013e54	Add a triple so that this passes on OS X. I am surprised I am the first one to notice this. llvm-svn: 197689	2013-12-19 16:06:33 +00:00
Zoran Jovanovic	69be811a6e	Support for microMIPS TLS relocations. llvm-svn: 197685	2013-12-19 16:02:32 +00:00
Evgeniy Stepanov	a9164e9e2a	Add an explicit insert point argument to SplitBlockAndInsertIfThen. Currently SplitBlockAndInsertIfThen requires that branch condition is an Instruction itself, which is very inconvenient, because it is sometimes an Operator, or even a Constant. llvm-svn: 197677	2013-12-19 13:29:56 +00:00
Timur Iskhodzhanov	48703be503	Teach the llvm-readobj COFF dumper to dump debug line tables from object files Reviewed at http://llvm-reviews.chandlerc.com/D2425 llvm-svn: 197674	2013-12-19 11:37:14 +00:00
Timur Iskhodzhanov	d4c5c674f0	Remove the COFF files with Z7 debug info from the repo Rationale: going to land D2425 shortly. I'll re-land these COFF files along with D2425 to simplify the SVN history llvm-svn: 197673	2013-12-19 11:30:21 +00:00

22513 Commits