llvm-project

Commit Graph

Author	SHA1	Message	Date
Hal Finkel	f6475bbc4b	[X86TTI] Remove the unrolling branch limits The loop stream detector (LSD) on modern Intel cores, which optimizes the execution of small loops, has limits on the number of taken branches in addition to uop-count limits (modern AMD cores have similar limits). Unfortunately, at the IR level, estimating the number of branches that will be taken is difficult. For one thing, it strongly depends on later passes (block placement, etc.). The original implementation took a conservative approach and limited the maximal BB DFS depth of the loop. However, fairly-extensive benchmarking by several of us has revealed that this is the wrong approach. In fact, there are zero known cases where the branch limit prevents a detrimental unrolling (but plenty of cases where it does prevent beneficial unrolling). While we could improve the current branch counting logic by incorporating branch probabilities, this further complication seems unjustified without a motivating regression. Instead, unless and until a regression appears, the branch counting will be removed. llvm-svn: 208255	2014-05-07 22:25:18 +00:00
Quentin Colombet	246b6fcd28	[X86] Selectively mark the FMA variants inside a family as isCommutable. Given a FMA family (e.g., 213, 231), not all the variants (i.e., register or memory) are commutable. E.g., for the 213 family (with the syntax src1, src2, src3): fmaXXX213 A, B, reg3/mem3 == fmaXXX213 B, A, reg3/mem3 Now consider the 231 family: fmaXXX231 A, B, reg3 == fmaXXX231 A, reg3, B But fmaXXX231 A, B, mem3 != fmaXXX231 A, mem3, B Indeed, mem3 cannot be the second argument of the memory variant of fmaXXX231. Working on a reduced test case! <rdar://problem/16800495> llvm-svn: 208252	2014-05-07 21:43:35 +00:00
Eric Christopher	b8f9768880	Reformat a couple of functions for clarity. llvm-svn: 208248	2014-05-07 21:05:47 +00:00
Jyotsna Verma	f98a1eca6e	[Hexagon] Add New TSFlags to be used in the upcoming patches. llvm-svn: 208239	2014-05-07 19:07:34 +00:00
Chandler Carruth	32908d7a35	[x86] Make the 'x86-64' cpu, what I see as and many use as the generic default architecture for reasonable modern x86 processors, actually be modern. This processor model should essentially be "tuned" for modern x86 chips as much as possible without undue penalties on any specific architecture. Previously we weren't even using the nice scheduling models. There are a few other tweaks needed here, but this change at least I have benchmarked across a decent swatch of chips (intel's clovertown, westmere, and sandybridge; amd's istanbul) and seen no significant regressions. If anyone has suggested ways to test this, just let me know. Somewhat alarmingly, no existing tests failed. llvm-svn: 208230	2014-05-07 17:37:03 +00:00
Chad Rosier	788e5e3d7c	[ARM64][fast-isel] Disable target specific optimizations at -O0. Functionally, this patch disables the dead register elimination pass and the load/store pair optimization pass at -O0. The ILP optimizations don't require the optimization level to be checked because the call to addILPOpts is predicated with the necessary check. The AdvSIMDScalar pass is disabled by default at all optimization levels. This patch leaves that pass disabled by default. Also, move command-line options into ARM64TargetMachine.cpp and add a few additional flags to aid in debugging. This fixes an issue with the -debug-pass=Structure flag where passes were printed, but not actually run (i.e., AdvSIMDScalar pass). llvm-svn: 208223	2014-05-07 16:41:55 +00:00
Daniel Sanders	d240953db2	[mips] Add highly experimental support for MIPS-I, MIPS-II, MIPS-III, and MIPS-V Summary: These processors will only be available for the integrated assembler at first (CodeGen will emit a fatal error saying they are not implemented). The intention is to work through the existing instructions and correctly annotate the ISA they were added in so that we have a sufficiently good base to start MIPS64r6 development. MIPS64r6 removes/re-encodes certain instructions and I believe it is best to define ISA's using set-union's as far as possible rather than using set-subtraction. Reviewers: vmedic Subscribers: emaste, llvm-commits Differential Revision: http://reviews.llvm.org/D3569 llvm-svn: 208221	2014-05-07 16:25:22 +00:00
Rafael Espindola	de3e36be38	Use range loop. llvm-svn: 208218	2014-05-07 14:53:32 +00:00
Daniel Sanders	5b864d0cbb	[mips] Add FGR_32/FGR_64/GPR_64 adjectives and use them instead of FGRPredicates/GPRPredicates Summary: No functional change (confirmed by diffing tablegen-erated files). Depends on D3642 Reviewers: vmedic, dsanders Reviewed By: dsanders Differential Revision: http://reviews.llvm.org/D3645 llvm-svn: 208213	2014-05-07 14:25:43 +00:00
Daniel Sanders	3872b47231	[mips] Add INSN_<name> adverbs and start using them instead of AdditionalPredicates overrides Summary: No functional change Depends on D3641 Reviewers: vmedic Reviewed By: vmedic Differential Revision: http://reviews.llvm.org/D3642 llvm-svn: 208212	2014-05-07 14:11:46 +00:00
Tim Northover	88a51d983e	AArch64/ARM64: optimise vector selects & enable test When performing a scalar comparison that feeds into a vector select, it's actually better to do the comparison on the vector side: the scalar route would be "CMP -> CSEL -> DUP", the vector is "CM -> DUP" since the vector comparisons are all mask based. llvm-svn: 208210	2014-05-07 14:10:27 +00:00
Daniel Sanders	9c1b1bec03	[mips] Add ISA_<name> adverbs and start using them instead of AdditionalPredicates overrides Summary: One small functional change. The recently added PAUSE instruction now has the HasStdEnc predicate which was accidentally removed by a Requires<>. Depends on D3640 Reviewers: vmedic Reviewed By: vmedic Differential Revision: http://reviews.llvm.org/D3641 llvm-svn: 208209	2014-05-07 13:57:22 +00:00
Rafael Espindola	566fcfe69b	Remove the UseCFI option from createAsmStreamer. We were already always passing true, this just removes the option. llvm-svn: 208205	2014-05-07 13:00:43 +00:00
Daniel Sanders	13d7209fa9	[mips] Continue splitting Instruction.Predicates into smaller lists and re-join them with !listconcat Summary: Move IsGP64bit into GPRPredicates, and IsFP64bit/NotFP64bit into FGRPredicates No functional change (confirmed by diffing tablegen-erated files). Depends on D3639 Reviewers: vmedic Reviewed By: vmedic Differential Revision: http://reviews.llvm.org/D3640 llvm-svn: 208201	2014-05-07 12:48:37 +00:00
James Molloy	d3c401a2d0	[ARM64-BE] Fix fast-isel, and add appropriate RUN lines to appropriate tests. llvm-svn: 208200	2014-05-07 12:33:55 +00:00
James Molloy	36132057da	[ARM64-BE] Fix variable-argument saving. llvm-svn: 208199	2014-05-07 12:33:48 +00:00
James Molloy	4049e4fd77	[ARM64-BE] Implement the lane-twiddling logic at AAPCS boundaries for big endian. The AAPCS states that values passed in registers must have a value as though they had been loaded with "LDR". LDR is equivalent to "LD1.64 vX.1D" - that is, loading scalars to vector registers and loading 1-element vectors is equivalent. The logic implemented here is to ensure that at all call boundaries and during formal argument lowering all vectors are treated as their bitwidth-based floating point scalar counterpart, which is always one of f64 or f128 (v2i32 -> f64, v4i32 -> f128 etc). A BITCAST is inserted so that the appropriate REV will be generated during code generation. llvm-svn: 208198	2014-05-07 12:33:41 +00:00
Daniel Sanders	4cd0782bf2	[mips] Move IsFP64bit/NotFP64bit to the front of the AdditionalPredicates list Summary: This makes it easier to prove a more complicated change in the next commit is non-functional. Reviewers: vmedic Reviewed By: vmedic Differential Revision: http://reviews.llvm.org/D3639 llvm-svn: 208197	2014-05-07 12:27:46 +00:00
James Molloy	30e0e11eb4	[ARM64-BE] Implement the crazy bitcast handling for big endian vectors. Because we've canonicalised on using LD1/ST1, every time we do a bitcast between vector types we must do an equivalent lane reversal. Consider a simple memory load followed by a bitconvert then a store. v0 = load v2i32 v1 = BITCAST v2i32 v0 to v4i16 store v4i16 v1 In big endian mode every memory access has an implicit byte swap. LDR and STR do a 64-bit byte swap, whereas LD1/ST1 do a byte swap per lane - that is, they treat the vector as a sequence of elements to be byte-swapped. The two pairs of instructions are fundamentally incompatible. We've decided to use LD1/ST1 only to simplify compiler implementation. LD1/ST1 perform the equivalent of a sequence of LDR/STR + REV. This makes the original code sequence: v0 = load v2i32 v1 = REV v2i32 (implicit) v2 = BITCAST v2i32 v1 to v4i16 v3 = REV v4i16 v2 (implicit) store v4i16 v3 But this is now broken - the value stored is different to the value loaded due to lane reordering. To fix this, on every BITCAST we must perform two other REVs: v0 = load v2i32 v1 = REV v2i32 (implicit) v2 = REV v2i32 v3 = BITCAST v2i32 v2 to v4i16 v4 = REV v4i16 v5 = REV v4i16 v4 (implicit) store v4i16 v5 This means an extra two instructions, but actually in most cases the two REV instructions can be combined into one. For example: (REV64_2s (REV64_4h X)) === (REV32_4h X) There is also no 128-bit REV instruction. This must be synthesized with an EXT instruction. Most bitconverts require some sort of conversion. The only exceptions are: a) Identity conversions - vNfX <-> vNiX b) Single-lane-to-scalar - v1fX <-> fX or v1iX <-> iX Even though there are hundreds of changed lines, I have a fairly high confidence that they are somewhat correct. The changes to add two REV instructions per bitcast were pretty mechanical, and once I'd done that I threw the resulting .td at a script I wrote which combined the two REVs together (and added an EXT instruction, for f128) based on an instruction description I gave it. This was much less prone to error than doing it all manually, plus my brain would not just have melted but would have vapourised. llvm-svn: 208194	2014-05-07 11:28:53 +00:00
James Molloy	3f0da857b4	[ARM64-BE] Predicate VLDR/VSTR for vectors as little-endian only. We must use LD1/ST1 on big-endian. llvm-svn: 208193	2014-05-07 11:28:45 +00:00
James Molloy	ccc7f982c1	[ARM64-BE] Make big endian (scalar) argument passing work correctly. This completes the port of r204814 (cpirker "AArch64_BE function argument passing for ARM ABI") from AArch64 to ARM64, and fixes a bunch of issues found during later development along the way. The biggest of these was that the alignment fixup logic wasn't replicated into all the places it should have been. llvm-svn: 208192	2014-05-07 11:28:36 +00:00
Daniel Sanders	3dc2c016a6	[mips] Split Instruction.Predicates into smaller lists and re-join them with !listconcat Summary: The overall idea is to chop the Predicates list into subsets that are usually overridden independently. This allows subclasses to partially override the predicates of their superclasses without having to re-add all the existing predicates. This patch starts the process by moving HasStdEnc into a new EncodingPredicates list and almost everything else into AdditionalPredicates. It has revealed a couple likely bugs where 'let Predicates' has removed the HasStdEnc predicate. No functional change (confirmed by diffing tablegen-erated files). Depends on D3549, D3506 Reviewers: vmedic Differential Revision: http://reviews.llvm.org/D3550 llvm-svn: 208184	2014-05-07 10:27:09 +00:00
Daniel Sanders	0e2364149c	[mips] Move HasStdEnc to the front of the predicates lists. Summary: This will make it easier to prove that a more complicated change in the following commit is non-functional. No functional change. Depends on D3506 Reviewers: vmedic Reviewed By: vmedic Differential Revision: http://reviews.llvm.org/D3549 llvm-svn: 208179	2014-05-07 09:58:05 +00:00
Evgeniy Stepanov	3819f02819	[asan] Add a flag to control asm instrumentation. With this change, asm instrumentation is disabled by default. llvm-svn: 208167	2014-05-07 07:54:11 +00:00
Joerg Sonnenberger	cf86ce136c	Allow using normal .eh_frame based unwinding on ARM. Use the same encodings as x86. Use this exception model for NetBSD. llvm-svn: 208166	2014-05-07 07:49:34 +00:00
Saleem Abdulrasool	985dcf18a9	ARM: mark additional instructions as MachineFrameSetup Mark up additional instructions which are part of the function prologue as MachineFrameSetup. These instructions are part of the function prologue, emitted by the PEI pass to setup the stack for use in the activating frame. llvm-svn: 208153	2014-05-07 03:03:31 +00:00
Saleem Abdulrasool	acd0338c61	ARM: fix WoA PEI instruction selection The ARM::BLX instruction is an ARM mode instruction. The Windows on ARM target is limited to Thumb instructions. Correctly use the thumb mode tBLXr instruction. This would manifest as an errant write into the object file as the instruction is 4-bytes in length rather than 2. The result would be a corrupted object file that would eventually result in an executable that would crash at runtime. llvm-svn: 208152	2014-05-07 03:03:27 +00:00
Andrew Trick	d0d8cb1d21	Update an embarrassing out-of-date comment. llvm-svn: 208137	2014-05-06 22:18:43 +00:00
Joerg Sonnenberger	818e725158	If a function needs a frame pointer, but r11 (aka fp) has not been used, remove it from the list of unspilled registers. Otherwise the following attempt to keep the stack aligned by picking an extra GPR register to spill will not work as it picks up r11. llvm-svn: 208129	2014-05-06 20:43:01 +00:00
Andrea Di Biagio	c14ccc9184	[X86] Improve the lowering of BITCAST dag nodes from type f64 to type v2i32 (and vice versa). Before this patch, the backend always emitted a store+load sequence to bitconvert from f64 to i64 the input operand of an ISD::BITCAST dag node that performed a bitconvert from type MVT::f64 to type MVT::v2i32. The resulting i64 node was then used to build a v2i32 vector. With this patch, the backend now produces a cheaper SCALAR_TO_VECTOR from MVT::f64 to MVT::v2f64. That SCALAR_TO_VECTOR is then followed by a "free" bitcast to type MVT::v4i32. The elements of the resulting v4i32 are then extracted to build a v2i32 vector (which is illegal and therefore promoted to MVT::v2i64). This is in general cheaper than emitting a stack store+load sequence to bitconvert the operand from type f64 to type i64. llvm-svn: 208107	2014-05-06 17:09:03 +00:00
Renato Golin	c7aea40ec6	Implementing named register intrinsics This patch implements the infrastructure to use named register constructs in programs that need access to specific registers (bare metal, kernels, etc). So far, only the stack pointer is supported as a technology preview, but as it is, the intrinsic can already support all non-allocatable registers from any architecture. llvm-svn: 208104	2014-05-06 16:51:25 +00:00
Tim Northover	618850b6a5	AArch64/ARM64: implement diagnosis of unpredictable loads & stores llvm-svn: 208091	2014-05-06 14:15:14 +00:00
Tim Northover	15641cd4e1	AArch64/ARM64: make NEON vector list parsing a bit more robust It doesn't change the results, but it seems silly not to diagnose obvious problems early on. llvm-svn: 208083	2014-05-06 12:50:51 +00:00
Tim Northover	339ecf14ee	AArch64/ARM64: add more specific diagnostic for floating imm 0.0. llvm-svn: 208082	2014-05-06 12:50:47 +00:00
Tim Northover	05cbe7c80a	AArch64/ARM64: add more specific diagnostic for invalid vector lanes llvm-svn: 208081	2014-05-06 12:50:44 +00:00
Tim Northover	0f54f309bb	AArch64/ARM64: produce more informative diagnostic assembling some immediates No tests here, they'll be added when the entire neon-diagnostics.s test from AArch64 is enabled. llvm-svn: 208079	2014-05-06 11:18:53 +00:00
Christian Pirker	fdce7cea93	ARM: For thumb fixups store halfwords high first and low second llvm-svn: 208076	2014-05-06 10:05:11 +00:00
Kevin Qin	1353c3405d	[ARM64] Enable alignment control option in front-end for ARM64. This is the modification in llvm part. llvm-svn: 208074	2014-05-06 09:48:52 +00:00
Craig Topper	646f64f04a	Use X86 memory operand enums instead of hardcoding. llvm-svn: 208064	2014-05-06 07:04:32 +00:00
Reid Kleckner	4a406d32e9	Fix i128 div/mod on mingw64 The Win64 docs are very clear that anything larger than 8 bytes is passed by reference, and GCC MinGW64 honors that for __modti3 and friends. Patch by Jameson Nash! llvm-svn: 208029	2014-05-06 01:20:42 +00:00
Eric Christopher	eb0bf5af65	Fix typo. llvm-svn: 208006	2014-05-05 21:50:57 +00:00
Tom Stellard	45b3dcd35b	R600: Expand i64 ISD:SUB llvm-svn: 208005	2014-05-05 21:47:15 +00:00
Filipe Cabecinhas	fe59062b75	Revert "Optimize shufflevector that copies an i64/f64 and zeros the rest." This reverts commit 207992. I misread the phab number on the LGTM. llvm-svn: 207993	2014-05-05 19:40:36 +00:00
Filipe Cabecinhas	263d98c19f	Optimize shufflevector that copies an i64/f64 and zeros the rest. Summary: Also ran clang-format on the function. The code added is the last else if block. Reviewers: nadav, craig.topper Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3518 llvm-svn: 207992	2014-05-05 19:36:28 +00:00
Marek Olsak	82d3b11e85	R600/SI: allow 5 more input SGPRs to a shader Our OpenGL driver needs 22 SGPRs (16 user SGPRs + 6 streamout non-user SGPRs). Signed-off-by: Marek Olšák <marek.olsak@amd.com> llvm-svn: 207990	2014-05-05 19:30:54 +00:00
Saleem Abdulrasool	e8a7afef86	CodeGen: correct memset emittance for WoA Windows on ARM does not conform to AEABI. However, memset would be emitted using the AEABI signature, resulting in inverted parameters. Handle this special case appropriately. llvm-svn: 207943	2014-05-04 23:13:21 +00:00
Saleem Abdulrasool	729c7a08fb	MC: support FK_SecRel_4 for Windows on ARM Add handling for FK_SecRel_4 (4-byte section relative relocations). These are used by the generation of DWARF debug information (the abbreviations use section relative relocations). This will also be used in generation of CodeView line tables. llvm-svn: 207941	2014-05-04 23:13:15 +00:00
Elena Demikhovsky	e73333a50f	AVX-512: minor change in rndscale intrinsic llvm-svn: 207937	2014-05-04 13:35:37 +00:00
Saleem Abdulrasool	3c82b499a0	X86: further range-loopify AsmPrinter Use more range loops in the X86AsmPrinter. NFC. llvm-svn: 207928	2014-05-04 01:54:17 +00:00
Saleem Abdulrasool	b942035bae	X86: remove X86COFFMachineModuleInfo Remove dead code. This is vestigial after r98384. llvm-svn: 207927	2014-05-04 01:54:12 +00:00
Saleem Abdulrasool	82b69fa105	X86: repair export compatibility with MinGW/cygwin Both MinGW and cygwin (i686) construct export directives without the global leader prefix. This is mostly due to the fact that they use GNU ld which does not correctly handle the export directive. This apparently has been broken for a while. However, this was recently reported as being broken by mingwandroid and diorcety of the msys2 project. Remove the global leader prefix if targeting MinGW or cygwin, otherwise, retain the global leader prefix. Add an explicit test for cygwin's behaviour of export directives. llvm-svn: 207926	2014-05-04 00:03:48 +00:00
Saleem Abdulrasool	75e68cbd12	X86: refactor export directive generation Create a helper function to generate the export directive. This was previously duplicated inline to handle export directives for variables and functions. This also enables the use of range-based iterators for the generation of the directive rather than the traditional loops. NFC. llvm-svn: 207925	2014-05-04 00:03:41 +00:00
Rafael Espindola	3d082fa507	Fix pr19645. The fix itself is fairly simple: move getAccessVariant to MCValue so that we replace the old weak expression evaluation with the far more general EvaluateAsRelocatable. This then requires that EvaluateAsRelocatable stop when it finds a non trivial reference kind. And that in turn requires the ELF writer to look harder for weak references. Last but not least, this found a case where we were being bug by bug compatible with gas and accepting an invalid input. I reported pr19647 to track it. llvm-svn: 207920	2014-05-03 19:57:04 +00:00
Joey Gouly	b0afd1b929	[ARM64] Correctly select ANDWri in FastISel. http://reviews.llvm.org/D3598 llvm-svn: 207917	2014-05-03 17:27:06 +00:00
Benjamin Kramer	6004573ecf	Add a description for AMD's bdver4 (aka Excavator). This is just bdver3 + AVX2 + BMI2. llvm-svn: 207847	2014-05-02 15:47:07 +00:00
Tom Stellard	10b1502733	R600/SI: Add processor type for Mullins. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Samuel Li <samuel.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> llvm-svn: 207846	2014-05-02 15:41:49 +00:00
Tom Stellard	3dbf1f8df0	R600: Expand vector sin and cos. v2: move code to AMDGPUISelLowering.cpp squash with tests (both EG and SI) Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 207845	2014-05-02 15:41:47 +00:00
Tom Stellard	605e116e8e	R600: Expand TruncStore i64 -> {i16,i8} llvm-svn: 207844	2014-05-02 15:41:46 +00:00
Tom Stellard	eba61071d7	R600/SI: Only create one instruction when spilling/restoring register v3 The register spiller assumes that only one new instruction is created when spilling and restoring registers, so we need to emit pseudo instructions for vector register spills and lower them after register allocation. v2: - Fix calculation of lane index - Extend VGPR liveness to end of program. v3: - Use SIMM16 field of S_NOP to specify multiple NOPs. https://bugs.freedesktop.org/show_bug.cgi?id=75005 llvm-svn: 207843	2014-05-02 15:41:42 +00:00
Tim Northover	d7360900a8	AArch64/ARM64: add patterns for post-indexed ST1 ops. llvm-svn: 207840	2014-05-02 14:54:27 +00:00
Tim Northover	523b5a43fb	ARM64: refactor NEON post-indexed loads & stores (MC). Previously, LLVM had no knowledge that these instructions actually modified their address register: fine if they never end up in CodeGen, but when I'd rather like to write some patterns for them it becomes a disaster. The change is mostly straightforward, I think the most significant design decision was to always put the address write-back first. This allows loads and stores to be accessed more uniformly, for example permitting the continued sharing of the InstAlias definitions. I also discovered that the custom Decode logic is no longer needed, so I removed it. No tests, because there should be no functionality change. llvm-svn: 207839	2014-05-02 14:54:21 +00:00
Tim Northover	d0b07e133b	AArch64/ARM64: support indexed loads/stores on vector types. While post-indexed LD1/ST1 instructions do exist for vector loads, this patch makes use of the more flexible addressing-modes in LDR/STR instructions. llvm-svn: 207838	2014-05-02 14:54:15 +00:00
Pranav Bhandarkar	94cb35cb05	Remove HexagonTargetMachine::addPassesForOptimizations; it is not needed any more. llvm-svn: 207800	2014-05-01 22:10:59 +00:00
Reed Kotler	bab3f23da6	Add basic functionality for assignment of ints. This creates a lot of core infrastructure in which to add, with little effort, quite a bit more to mips fast-isel Test Plan: simplestore.ll Reviewers: dsanders Reviewed By: dsanders Differential Revision: http://reviews.llvm.org/D3527 llvm-svn: 207790	2014-05-01 20:39:21 +00:00
Eli Bendersky	a108a65df2	Add an optimization that does CSE in a group of similar GEPs. This optimization merges the common part of a group of GEPs, so we can compute each pointer address by adding a simple offset to the common part. The optimization is currently only enabled for the NVPTX backend, where it has a large payoff on some benchmarks. Review: http://reviews.llvm.org/D3462 Patch by Jingyue Wu. llvm-svn: 207783	2014-05-01 18:38:36 +00:00
Matt Arsenault	06028dd7be	R600/SI: Fix verifier error with pseudo store instructions. Use i32 instead of specifying SReg_32. When this is the pseudo INDIRECT_BASE_ADDR, this would give a bogus verifier error. llvm-svn: 207770	2014-05-01 16:37:52 +00:00
Bradley Smith	3567cc1b42	[ARM64] Prefer generation of bzero on Darwin only llvm-svn: 207760	2014-05-01 13:11:59 +00:00
Rafael Espindola	4a04294882	Don't force symbols to be globals in .thumb_set. We currently force symbols to be globals in .thumb_set. The intent seems to be that given .thumb_set foo, bar we emit an undefined symbol to bar if it is never defined. The side effect is that we mark bar as global, even if it is defined, which gas does not. Producing an undefined reference to bar is a general difference from MC and gas. For example, given a = b gas will produce an undefined reference to b, MC will not. I would be surprised if any code depends on this, but if it does, we should fix the general difference, not special case .thumb_set. llvm-svn: 207757	2014-05-01 12:45:43 +00:00
Tim Northover	534acbdf73	AArch64/ARM64: print BFM instructions as BFI or BFXIL The canonical form of the BFM instruction is always one of the more explicit extract or insert operations, which makes reading output much easier. llvm-svn: 207752	2014-05-01 12:29:38 +00:00
Richard Barton	3db1d580b3	Correction to assert statement to allow 32-bit unsigned numbers with the top bit set. This fixes an ARM assembler crash - regression test added. llvm-svn: 207747	2014-05-01 11:37:44 +00:00
Bradley Smith	f57d5ca234	[ARM64] Conditionalize CPU specific system registers on subtarget features llvm-svn: 207742	2014-05-01 10:25:36 +00:00
Matheus Almeida	d92a3fa212	[mips] Move expansion of .cpsetup to target streamer. Summary: There are two functional changes: 1) The directive is not expanded for the ASM->ASM code path. 2) If PIC is not set, there's no expansion for the ASM->OBJ code path (same behaviour as GAS). Reviewers: dsanders Reviewed By: dsanders Differential Revision: http://reviews.llvm.org/D3482 llvm-svn: 207741	2014-05-01 10:24:46 +00:00
Daniel Sanders	88fbbcaa30	[mips] Removed two-operand alias for sllv, sr[al]v, rotrv, dsllv, dsr[al]v, and drotrv GAS doesn't actually accept these particular cases. The mnemonic without the trailing 'v' still supports two-operand aliases. llvm-svn: 207740	2014-05-01 10:08:36 +00:00
Saleem Abdulrasool	7158303ad7	ARM: fix memory leak, simplify WoA stack probing This fixes the memory leak introduced with the initial addition of support for WoA stack probing. Now that the pseudo-instruction expansion can handle an external symbol, use that to generate the load which simplifies the logic as well as avoids the memory leak. llvm-svn: 207737	2014-05-01 04:19:59 +00:00
Saleem Abdulrasool	d6c0ba3787	ARM: support expanding external symbols in 32-bit moves This enhances the expansion of the mov32imm pseudo-instruction to support an external symbol reference. This is motivated by a simplification of the stack probe emission for Windows on ARM (and fixing a leak). llvm-svn: 207736	2014-05-01 04:19:56 +00:00
Joerg Sonnenberger	0f90c95ccf	If necessary for indirect encodings, emit stubs. llvm-svn: 207730	2014-05-01 00:25:15 +00:00
Joerg Sonnenberger	3c10817b92	Prepare support of Itanium ABI on ARM as opposed to EHABI by conditionally emitting .fnstart and friends only for EHABI. llvm-svn: 207718	2014-04-30 22:43:13 +00:00
Joerg Sonnenberger	fe54364a9d	Restore condition incorrectly changed in r96289 to the older state. llvm-svn: 207716	2014-04-30 22:40:27 +00:00
Weiming Zhao	7f6daf1799	[ARM64] Prevent bit extraction from being adjusted by a following shift For a pattern like ((x >> C1) & Mask) << C2, the DAG combiner may convert it into (x >> (C1-C2)) & (Mask << C2), which makes pattern matching of ubfx more difficult. For example: Given %shr = lshr i64 %x, 4 %and = and i64 %shr, 15 %arrayidx = getelementptr inbounds [8 x [64 x i64]]* @arr, i64 0, i64 2, i64 %and %0 = load i64* %arrayidx With current shift folding, it takes 3 instrs to compute base address: lsr x8, x0, #1 and x8, x8, #0x78 add x8, x9, x8 If using ubfx, it only needs 2 instrs: ubfx x8, x0, #4, #4 add x8, x9, x8, lsl #3 This fixes bug 19589 llvm-svn: 207702	2014-04-30 21:07:24 +00:00
Michael Zolotukhin	1f4a960ccf	[X86] Never hoist the shift value of a shift instruction. There is no need to check if we want to hoist the immediate value of a shift instruction. Simply return TCC_Free right away. This change is like r206101, but for X86. rdar://problem/16190769 llvm-svn: 207692	2014-04-30 19:17:32 +00:00
Matheus Almeida	e844872830	[mips] Add instruction alias (negu). Summary: negu $reg is equivalent to negu $reg, $reg. Reviewers: dsanders Reviewed By: dsanders Differential Revision: http://reviews.llvm.org/D3510 llvm-svn: 207673	2014-04-30 16:53:49 +00:00
Matheus Almeida	b7be52343d	[mips] Add instruction alias (sltu). Summary: The pattern sltu $r1, $r2, $imm is found in handwritten assembly which is just a shorthand version of sltui $r1, $r2, $imm. Reviewers: dsanders Reviewed By: dsanders Differential Revision: http://reviews.llvm.org/D3508 llvm-svn: 207671	2014-04-30 16:29:56 +00:00
Tim Northover	a8c577e454	ARM64: print fp immediates without using scientific notation. llvm-svn: 207669	2014-04-30 16:13:34 +00:00
Tim Northover	7346f062b6	AArch64/ARM64: implement remaining TLS relocations (purely MC). llvm-svn: 207668	2014-04-30 16:13:26 +00:00
Tim Northover	b8fb7f4193	AArch64/ARM64: add specific diagnostic for MRS/MSR and enable tests. llvm-svn: 207667	2014-04-30 16:13:20 +00:00
Tim Northover	3c9a9401d5	AArch64/ARM64: accept and print floating-point immediate 0 as "#0.0" It's been decided that in the future, the floating-point immediate in instructions like "fcmeq v0.2s, v1.2s, #0.0" will be canonically "0.0", which has been implemented on AArch64 already but not ARM64. This fixes that issue. llvm-svn: 207666	2014-04-30 16:13:07 +00:00
Matheus Almeida	56df6ff2c5	[mips] Add instruction alias (dsll and dsrl). Summary: The pattern dsll/dsrl $rd, $rt, $rs is found in handwritten assembly which is just a shorthand version of dsllv/dsrlv $rd, $rt, $rs. Reviewers: dsanders Reviewed By: dsanders Differential Revision: http://reviews.llvm.org/D3486 llvm-svn: 207664	2014-04-30 16:00:49 +00:00
Tom Stellard	1bd80725b3	R600/SI: Use VALU instructions for copying i1 values We can't use SALU instructions for this since they ignore the EXEC mask and are always executed. This fixes several OpenCV tests. llvm-svn: 207661	2014-04-30 15:31:33 +00:00
Tom Stellard	0c354f25c9	R600/SI: Teach moveToVALU how to handle some SMRD instructions llvm-svn: 207660	2014-04-30 15:31:29 +00:00
Chad Rosier	864e35db0a	[ARM64][fast-isel] Fast-isel doesn't know how to handle f128. llvm-svn: 207659	2014-04-30 15:29:57 +00:00
Matheus Almeida	312ac02491	[mips] Add instruction alias (sll and srl). Summary: The pattern sll/srl $rd, $rt, $rs is found in handwritten assembly which is just a shorthand version of sllv/srlv $rd, $rt, $rs. Reviewers: dsanders Reviewed By: dsanders Differential Revision: http://reviews.llvm.org/D3483 llvm-svn: 207657	2014-04-30 15:23:04 +00:00
Sasa Stankovic	7b061a42b1	[mips] Fix MipsLongBranch pass to work when the offset from the branch to the target cannot be determined accurately. This is the case for NaCl where the sandboxing instructions are added in MC layer, after the MipsLongBranch pass. It is also the case when the code has inline assembly. Instead of calculating offset in the MipsLongBranch pass, use %hi(sym1 - sym2) and %lo(sym1 - sym2) expressions that are resolved during the fixup. This patch also deletes microMIPS test file test/CodeGen/Mips/micromips-long-branch.ll and implements microMIPS CHECKs in a much simpler way in a file test/CodeGen/Mips/longbranch.ll, together with MIPS32 and MIPS64. llvm-svn: 207656	2014-04-30 15:06:25 +00:00
Tom Stellard	e01fdffd9a	R600: Remove unused function AMDGPUSubtarget::getDefaultSize() llvm-svn: 207654	2014-04-30 14:20:53 +00:00
Evgeniy Stepanov	29865f7803	[asan] Disable asm instrumentation on unsupported platforms. Only emit calls to compiler-rt asm routines on platforms where they are present (currently limited to linux i386/x86_64). Patch by Yuri Gorshenin. llvm-svn: 207651	2014-04-30 14:04:31 +00:00
Tim Northover	0ac99404f0	ARM64: print lsr instead of lsrv for variable shifts (etc) The canonical syntax for shifts by a variable amount does not end with 'v', but that syntax should be supported as an alias (presumably for legacy reasons). llvm-svn: 207649	2014-04-30 13:37:07 +00:00
Tim Northover	7030f05b4f	ARM64: use 32-bit operations for uxtb & uxth Testing will be enabled shortly with basic-a64-instructions.s llvm-svn: 207648	2014-04-30 13:37:02 +00:00
Tim Northover	32ac450f09	AArch64/ARM64: allow smaller granule relocations on MOVZ/MOVN Testing will be enabled shortly with basic-a64-instructions.s llvm-svn: 207647	2014-04-30 13:36:59 +00:00
Tim Northover	a307769b15	AArch64/ARM64: copy support for bCC instead of b.CC across. llvm-svn: 207646	2014-04-30 13:36:56 +00:00
Tim Northover	d53a671354	AArch64/ARM64: expunge CPSR from the sources AArch64 does not have a CPSR register in the same way that AArch32 does. Most of its compiler-relevant roles have been taken over by the more specific NZCV register (representing just the flags set by normal instructions). Its system control functions still remain, but are now under the pseudo-register referred to as "PSTATE". They're accessed via various MRS & MSR instructions described in the reference manual. llvm-svn: 207645	2014-04-30 13:14:14 +00:00
Tim Northover	20ad359b77	AArch64/ARM64: use HS instead of CS & LO instead of CC. On instructions using the NZCV register, a couple of conditions have dual representations: HS/CS and LO/CC (meaning unsigned-higher-or-same/carry-set and unsigned-lower/carry-clear). The first of these is more descriptive in most circumstances, so we should print it. llvm-svn: 207644	2014-04-30 13:14:03 +00:00
Daniel Sanders	e296a0fce5	[mips][msa] Fix vector insertions where the index is variable Summary: This isn't supported directly so we rotate the vector by the desired number of elements, insert to element zero, then rotate back. The i64 case generates rather poor code on MIPS32. There is an obvious optimisation to be made in future (do both insert.w's inside a shared rotate/unrotate sequence) but for now it's sufficient to select valid code instead of aborting. Depends on D3536 Reviewers: matheusalmeida Reviewed By: matheusalmeida Differential Revision: http://reviews.llvm.org/D3537 llvm-svn: 207640	2014-04-30 12:09:32 +00:00
Tim Northover	f9941a9dc6	ARM64: accept ELF-relocated load/store insts without a #. E.g. we print "ldr x0, [x0, :lo12:symbol]" so we need to accept that syntax too. llvm-svn: 207639	2014-04-30 12:00:20 +00:00
Tim Northover	36c93db37a	ARM64: remove duplication by templating InstPrinter methods No functional change, so no tests. llvm-svn: 207638	2014-04-30 11:43:36 +00:00
Matheus Almeida	525bc4f708	[mips] Add support for .cpload. Summary: This directive is used for setting up $gp in the beginning of a function. It expands to three instructions if PIC is enabled: lui $gp, %hi(_gp_disp) addiu $gp, $gp, %lo(_gp_disp) addu $gp, $gp, $reg _gp_disp is a special symbol that the linker sets to the distance between the lui instruction and the context pointer (_gp). Reviewers: dsanders Reviewed By: dsanders Differential Revision: http://reviews.llvm.org/D3480 llvm-svn: 207637	2014-04-30 11:28:42 +00:00
Tim Northover	970c4a8d35	ARM64: use hex immediates for movz/movk instructions Since these are mostly used in "lsl #16", "lsl #32", "lsl #48" combinations to piece together an immediate in 16-bit chunks, hex is probably the most appropriate format. llvm-svn: 207635	2014-04-30 11:19:40 +00:00
Tim Northover	4b2f8a990e	ARM64: hexify printing various immediate operands This is mostly aimed at the NEON logical operations and MOVI/MVNI (since they accept weird shifts which are more naturally understandable in hex notation). Also changes BRK/HINT etc, which is probably a neutral change, but easier than the alternative. llvm-svn: 207634	2014-04-30 11:19:28 +00:00
Tim Northover	cfd6e66544	ARM64: print canonical syntax for add/sub (imm) instructions. Since these instructions only accept a 12-bit immediate, possibly shifted left by 12, the canonical syntax used by the architecture reference manual is "#N {, lsl #12 }". We should accept an immediate that has already been shifted, (e.g. Also, print a comment giving the full addend since it can be helpful. llvm-svn: 207633	2014-04-30 11:19:15 +00:00
James Molloy	54f3485dba	[ARM64] Simplify if condition. v2f32 and v4f32 were missed out of these conditions, so this is also a bugfix. llvm-svn: 207628	2014-04-30 10:15:50 +00:00
James Molloy	b5efbcfbe5	[ARM64] Fix stupid copy-pasto in ARM64MCAsmInfo.cpp - aarch64_be -> arm64_be llvm-svn: 207627	2014-04-30 10:15:46 +00:00
Tim Northover	41cec5c3cb	ARM64: make sure FastISel uses a GPR64 source in 64-bit extensions. llvm-svn: 207620	2014-04-30 09:32:01 +00:00
Craig Topper	2d2aa0ca1f	Use makeArrayRef instead of calling ArrayRef<T> constructor directly. I introduced most of these recently. llvm-svn: 207616	2014-04-30 07:17:30 +00:00
Saleem Abdulrasool	25947c318b	ARM: support stack probe emission for Windows on ARM This introduces the stack lowering emission of the stack probe function for Windows on ARM. The stack on Windows on ARM is a dynamically paged stack where any page allocation which crosses a page boundary of the following guard page will cause a page fault. This page fault must be handled by the kernel to ensure that the page is faulted in. If this does not occur and a write accesses any memory beyond that, the page fault will go unserviced, resulting in an abnormal program termination. The watermark for the stack probe appears to be at 4080 bytes (for accommodating the stack guard canaries and stack alignment) when SSP is enabled. Otherwise, the stack probe is emitted on the page size boundary of 4096 bytes. llvm-svn: 207615	2014-04-30 07:05:07 +00:00
Saleem Abdulrasool	0aca1c30c6	ARM: print COFF function header for Windows on ARM Emit the COFF header when printing out the function. This is important as the header contains two important pieces of information: the storage class for the symbol and the symbol type information. This bit of information is required for the linker to correctly identify the type of symbol that it is dealing with. llvm-svn: 207613	2014-04-30 06:14:25 +00:00
Craig Topper	ee7b0f3956	De-virtualize or remove some methods that have no overrides nor override anything. In some cases remove altogether if there are no callers either. llvm-svn: 207610	2014-04-30 05:53:27 +00:00
Saleem Abdulrasool	ef550a6d01	ARM: move llvm_unreachable use When building with -Werror=covered-switch-default (as on the buildbots), the build would fail since all cases are covered by the switch. Move the llvm_unreachable to the end of the function as an annotation. llvm-svn: 207609	2014-04-30 05:12:41 +00:00
Saleem Abdulrasool	f8222631a5	ARM: partially handle 32-bit relocations for WoA IMAGE_REL_ARM_MOV32T relocations require that the movw/movt pair-wise relocation is not split up and reordered. When expanding the mov32imm pseudo-instruction, create a bundle if the machine operand is referencing an address. This helps ensure that the relocatable address load is not reordered by subsequent passes. Unfortunately, this only partially handles the case as the Constant Island Pass occurs after the instructions are unbundled and does not properly handle bundles. That is a more fundamental issue with the pass itself and beyond the scope of this change. llvm-svn: 207608	2014-04-30 04:54:58 +00:00
Reid Kleckner	fb69308568	Implement X86 code generation for musttail Currently, musttail codegen is relying on sibcall optimization, and reporting a fatal error if it fails. Sibcall optimization fails when stack arguments need to be modified, which is insufficient for musttail. The logic for moving arguments in memory safely is already implemented for GuaranteedTailCallOpt. This change merely arranges for musttail calls to use it. No functional change for GuaranteedTailCallOpt. Reviewers: espindola Differential Revision: http://reviews.llvm.org/D3493 llvm-svn: 207598	2014-04-29 23:55:41 +00:00
Benjamin Kramer	d59664f4f7	raw_ostream: Forward declare OpenFlags and include FileSystem.h only where necessary. llvm-svn: 207593	2014-04-29 23:26:49 +00:00
Tom Stellard	93f9f4950c	R600: Remove duplicate setting of SELECT expansion. It's already set in AMDGPUISelLowering for all GPUs Patch By: Jan Vesely Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 207592	2014-04-29 23:12:55 +00:00
Tom Stellard	919bb6b83f	R600/SI: Custom lower SI_IF and SI_ELSE to avoid machine verifier errors SI_IF and SI_ELSE are terminators which also produce a value. For these instructions ISel always inserts a COPY to move their value to another basic block. This COPY ends up between SI_(IF|ELSE) and the S_BRANCH* instruction at the end of the block. This breaks MachineBasicBlock::getFirstTerminator() and also the machine verifier which assumes that terminators are grouped together at the end of blocks. To solve this we coalesce the copy away right after ISel to make sure there are no instructions in between terminators at the end of blocks. llvm-svn: 207591	2014-04-29 23:12:53 +00:00
Tom Stellard	58ac7440e6	R600/SI: Only select SALU instructions in the entry or exit block SALU instructions ignore control flow, so it is not always safe to use them within branches. This is a partial solution to this problem until we can come up with something better. llvm-svn: 207590	2014-04-29 23:12:48 +00:00
Tom Stellard	676f571999	R600: optimize the UDIVREM 64 algorithm This is a squash of several optimization commits: - calculate DIV_Lo and DIV_Hi separately - use BFE_U32 if we are operating on 32bit values - use precomputed constants instead of shifting in UDIVREM - skip the first 32 iterations of udivrem v2: Check whether BFE is supported before using it Patch by: Jan Vesely Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 207589	2014-04-29 23:12:46 +00:00
Tom Stellard	bcd318fc76	R600: Implement iterative algorithm for udivrem Initial implementation, rather slow Patch by: Jan Vesely Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 207588	2014-04-29 23:12:45 +00:00
Tom Stellard	5f3378879f	R600: Change UDIV/UREM to UDIVREM when legalizing types When legalizing ops, with UDIV/UREM set to expand, they automatically expand to UDIVREM (if legal or custom). We need to do this manually for legalize types. v2: SI should be set to Expand because the type is legal, and it is automatically lowered to UDIVREM if UDIVREM is Legal/Custom R600 should set UDIV/UREM to Custom because it needs to lower them during type legalization Patch by: Jan Vesely Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 207587	2014-04-29 23:12:43 +00:00
Tom Stellard	df780303ef	R600: remove unused variable Patch by: Jan Vesely Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 207586	2014-04-29 23:12:38 +00:00
Reed Kotler	67077b3032	Add Simple return instruction to Mips fast-isel Reviewers: dsanders Reviewed by: dsanders Differential Revision: http://reviews.llvm.org/D3430 llvm-svn: 207565	2014-04-29 17:57:50 +00:00
Daniel Sanders	690e4d493e	[mips] Remove two more redundant 'let Predicates = [HasStdEnc]' statements that were missed Summary: The InstSE class already initializes Predicates to [HasStdEnc]. No functional change (confirmed by diffing tablegen-erated files before and after) Differential Revision: http://reviews.llvm.org/D3548 llvm-svn: 207558	2014-04-29 17:04:30 +00:00
Daniel Sanders	5682f63b46	[mips] Remove more redundant 'let Predicates = [HasStdEnc]' statements Summary: The InstSE class already initializes Predicates to [HasStdEnc]. No functional change (confirmed by diffing tablegen-erated files before and after) Differential Revision: http://reviews.llvm.org/D3547 llvm-svn: 207551	2014-04-29 16:37:01 +00:00
Daniel Sanders	f562582d15	[mips] Remove redundant 'let Predicates = [HasStdEnc]' statements Summary: The MipsPat class already initializes Predicates to [HasStdEnc]. No functional change (confirmed by diffing tablegen-erated files before and after) Differential Revision: http://reviews.llvm.org/D3546 llvm-svn: 207548	2014-04-29 16:24:10 +00:00
Joerg Sonnenberger	dd18d5b0f6	Parse and create GOT_PREL relocations. llvm-svn: 207526	2014-04-29 13:42:02 +00:00
Daniel Sanders	b3268e71e2	[mips][msa] Fix element extraction where the index is variable. Summary: This isn't supported directly so we splat the vector element and extract the most convenient copy. Reviewers: matheusalmeida Reviewed By: matheusalmeida Differential Revision: http://reviews.llvm.org/D3530 llvm-svn: 207524	2014-04-29 13:31:37 +00:00
Rafael Espindola	b60c829a2a	Centralize the handling of the thumb bit. This patch centralizes the handling of the thumb bit around MCStreamer::isThumbFunc and makes isThumbFunc handle aliases. This fixes a corner case, but the main advantage is having just one way to check if a MCSymbol is thumb or not. This should still be refactored to be ARM only, but at least now it is just one predicate that has to be refactored instead of 3 (isThumbFunc, ELF_Other_ThumbFunc, and SF_ThumbFunc). llvm-svn: 207522	2014-04-29 12:46:50 +00:00
Tim Northover	9e7782dcf3	X86: emit hidden stubs into a proper non_lazy_symbol_pointer section. rdar://problem/16660411 llvm-svn: 207518	2014-04-29 10:06:10 +00:00
Tim Northover	2372301bcf	ARM: emit hidden stubs into a proper non_lazy_symbol_pointer section. rdar://problem/16660411 llvm-svn: 207517	2014-04-29 10:06:05 +00:00
Benjamin Kramer	e1ab3f062e	AArch64: Mark vector long multiplication as expand. There are no patterns for this. This was already fixed for ARM64 but I forgot to apply it to AArch64 too. llvm-svn: 207515	2014-04-29 09:37:54 +00:00
Elena Demikhovsky	299cf511c4	AVX-512: optimized a shuffle pattern to VINSERTI64x4. Added intrinsics for VPERMT2PS/PD/D/Q instructions. llvm-svn: 207513	2014-04-29 09:09:15 +00:00
Craig Topper	9d74a5a5f1	[C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. llvm-svn: 207511	2014-04-29 07:58:41 +00:00
Craig Topper	e06fc4f0ca	[C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. AArch64 edition llvm-svn: 207510	2014-04-29 07:58:34 +00:00
Craig Topper	f85b7fc197	[C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. ARM64 edition llvm-svn: 207509	2014-04-29 07:58:25 +00:00
Craig Topper	906c2cd2e6	[C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. Hexagon edition llvm-svn: 207508	2014-04-29 07:58:16 +00:00
Craig Topper	6f9e59ea55	[C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. MSP430 edition llvm-svn: 207507	2014-04-29 07:58:09 +00:00
Craig Topper	56c590af3b	[C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. Mips edition llvm-svn: 207506	2014-04-29 07:58:02 +00:00
Craig Topper	2865c986d1	[C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. NVPTX edition llvm-svn: 207505	2014-04-29 07:57:44 +00:00
Craig Topper	0d3fa92514	[C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. PowerPC edition llvm-svn: 207504	2014-04-29 07:57:37 +00:00
Craig Topper	5656db4a8b	[C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. R600 edition llvm-svn: 207503	2014-04-29 07:57:24 +00:00
Craig Topper	b0c941bebd	[C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. Sparc edition llvm-svn: 207502	2014-04-29 07:57:13 +00:00
Craig Topper	60879a3c76	[C++11] Add 'override' keywords and remove 'virtual'. Additionally add 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. XCore edition llvm-svn: 207501	2014-04-29 07:57:00 +00:00
Hao Liu	6db3410071	[ARM64]Fix a bug about incorrect operand order in an EXT instruction, which is introduced by r207485. llvm-svn: 207500	2014-04-29 07:51:19 +00:00
Hao Liu	cf37110920	[ARM64]Fix a bug when lowering shuffle vector to an EXT instruction. E.g. Mask like <-1, -1, 1, ...> will generate incorrect EXT index. llvm-svn: 207485	2014-04-29 01:50:36 +00:00
Eric Christopher	612bb69bf7	None of these targets actually define their own CFI_INSTRUCTION opcode so there's no reason to use the target namespace for it rather than TargetOpcode. llvm-svn: 207475	2014-04-29 00:16:46 +00:00
Eric Christopher	40af450562	80-column fixups. llvm-svn: 207474	2014-04-29 00:16:42 +00:00
Eric Christopher	d17374919b	80-column, tab characters, comment fixups. llvm-svn: 207473	2014-04-29 00:16:40 +00:00
Eric Christopher	4237bf10f3	Fix 80-columns, tab characters, and comments. llvm-svn: 207472	2014-04-29 00:16:33 +00:00
Quentin Colombet	50efe87e5b	[X86] Add more details in the comments of X86TargetLowering::getScalingFactorCost. llvm-svn: 207432	2014-04-28 18:39:57 +00:00
Chad Rosier	0def8e2652	[ARM64] Fix an issue where we were always assuming a copy was coming from a D subregister. llvm-svn: 207423	2014-04-28 16:21:50 +00:00
Tim Northover	6ad1f5c817	ARM: stop passing unused values up the TableGen hierarchy. It's bad enough that I have to look up 5 different levels of TableGen class definitions to work out what bits go where in a simple NEON instruction anyway, without having to keep track of umpteen unused parameters. llvm-svn: 207420	2014-04-28 13:53:00 +00:00
Patrik Hagglund	319983810a	Fix gcc -Wsign-compare warning in X86DisassemblerTables.cpp. X86_MAX_OPERANDS is changed to unsigned. Also, add range-based for loops for affected loops. This in turn needed an ArrayRef instead of a pointer-to-array in InternalInstruction. llvm-svn: 207413	2014-04-28 12:12:27 +00:00
Tim Northover	7b839f833d	ARM64: diagnose use of v16-v31 in certain indexed NEON instructions. Someone couldn't bear to have a completely orthogonal set of floating-point registers, so we've got some instructions that only accept v0-v15 (coming in ARMv9, V128_prime: you're allowed v2, v3, v5, v7, ...). Anyway, we were permitting even the out of range registers during assembly (CodeGen handled it correctly). This adds a diagnostic. llvm-svn: 207412	2014-04-28 11:27:43 +00:00
Hao Liu	9a342778b9	[ARM64]Fix a bug where UQSHL/SQSHL cannot be selected with a constant i64 shift amount. llvm-svn: 207399	2014-04-28 07:34:27 +00:00
Craig Topper	8c0b4d0791	Convert more SelectionDAG functions to use ArrayRef. llvm-svn: 207397	2014-04-28 05:57:50 +00:00
Craig Topper	e73658ddbb	[C++] Use 'nullptr'. llvm-svn: 207394	2014-04-28 04:05:08 +00:00
Rafael Espindola	466d66358d	Add emitThumbSet to the arm target streamer. This fixes the asm printer implementation and lets the parser be unaware of what .thumb_set is. llvm-svn: 207381	2014-04-27 20:23:58 +00:00
Craig Topper	131de82adb	Convert SelectionDAG::MorphNodeTo to use ArrayRef. llvm-svn: 207378	2014-04-27 19:21:16 +00:00
Craig Topper	481fb2879f	Convert SelectionDAG::SelectNodeTo to use ArrayRef. llvm-svn: 207377	2014-04-27 19:21:11 +00:00
Craig Topper	dd5e16dd34	Convert one last signature of getNode to take an ArrayRef of SDUse. llvm-svn: 207376	2014-04-27 19:21:06 +00:00
Craig Topper	64941d9786	Convert SelectionDAG::getMergeValues to use ArrayRef. llvm-svn: 207374	2014-04-27 19:20:57 +00:00
Benjamin Kramer	ce4b3fee72	X86TTI: Adjust sdiv cost now that we can lower it on plain SSE2. Includes a fix for a horrible typo that caused all SDIV costs to be slightly off :) llvm-svn: 207371	2014-04-27 18:47:54 +00:00
Benjamin Kramer	3693e77cb4	X86: If SSE4.1 is missing lower SMUL_LOHI of v4i32 to pmuludq and fix up the high parts. This is more expensive than pmuldq but still cheaper than scalarizing the whole thing. llvm-svn: 207370	2014-04-27 18:47:41 +00:00
Rafael Espindola	4c6f61302e	Avoid using MCSymbolData on the asm streamer. Only the object streamers need to track if a symbol should be marked thumb or not. This ports the ELF case. The COFF case is not ported since it is currently not working for some other reason (I will report a bug). llvm-svn: 207366	2014-04-27 17:10:46 +00:00
Saleem Abdulrasool	0ea5d091c7	ARM: MSVC does not support = default Explicitly "implement" the destructor as MSVC does not support defaulted methods yet. llvm-svn: 207350	2014-04-27 05:28:10 +00:00
Saleem Abdulrasool	84b952b677	Add WoA object file emission support Introduce support for WoA PE/COFF object file emission from LLVM. Add the new target specific PE/COFF Streamer (ARMWinCOFFStreamer) that handles the ARM specific behaviour of PE/COFF object emission. ARM exception information is not yet emitted and is a TODO item. The ARM specific object writer (ARMWinCOFFObjectWriter) handles the ARM specific relocation handling in conjunction with the WinCOFFObjectWriter in the MC layer. The MC layer needs to be updated to deal with the relocation adjustments. Branch relocations are adjusted by 4 bytes (unlike their ELF counterparts). Minor tweaks to switch multiple conditional checks into equivalent switch statements. The ObjectFileInfo is updated to relax the object file setup for Windows COFF. Move the architecture checks into an assertion. Windows COFF is currently only supported on x86, x86_64, and ARM (thumb). Rather than defaulting to ELF, we will refuse to generate an object file. This is better though as you do not get an (arbitrary) object file which is different from the request. llvm-svn: 207345	2014-04-27 03:48:22 +00:00
Saleem Abdulrasool	a8b1f7204b	MC: create X86WinCOFFStreamer for target specific behaviour This introduces a target specific streamer, X86WinCOFFStreamer, which handles the target specific behaviour (e.g. WinEH). This is mostly to ensure that differences between ARM and X86 remain disjoint and do not accidentally cross boundaries. This is the final staging change for enabling object emission for Windows on ARM. llvm-svn: 207344	2014-04-27 03:48:12 +00:00
Saleem Abdulrasool	6d6fee9cbc	ARM: Support SingleParameterDotFile on WoA Currently, the integrated assembler is the only choice for assembling Windows on ARM binaries. IAS supports the .file <filename> directive which emits the file symbol into the resulting object binary. Mark the GNU COFF information to indicate support for this feature. llvm-svn: 207341	2014-04-27 03:47:57 +00:00
Craig Topper	59f626d9d5	Replace std::vector with SmallVector for some small, known size vectors. llvm-svn: 207330	2014-04-26 19:29:47 +00:00
Craig Topper	206fcd450a	Convert getMemIntrinsicNode to take ArrayRef of SDValue instead of pointer and size. llvm-svn: 207329	2014-04-26 19:29:41 +00:00
Craig Topper	48d114bed1	Convert SelectionDAG::getNode methods to use ArrayRef<SDValue>. llvm-svn: 207327	2014-04-26 18:35:24 +00:00
Benjamin Kramer	c2ad8f3ef1	Print X86ISD::PMULDQ nodes properly in debug output. llvm-svn: 207322	2014-04-26 16:26:41 +00:00
Benjamin Kramer	7c3722724b	X86TTI: i16/i32 vector div with a constant (splat) divisor are reasonably cheap now. Turn vectorization back on. llvm-svn: 207320	2014-04-26 14:53:05 +00:00
Benjamin Kramer	6d2dff61f9	X86: Lower SMUL_LOHI of v4i32 to pmuldq when SSE4.1 is available. llvm-svn: 207318	2014-04-26 14:12:19 +00:00
Benjamin Kramer	c9827ab103	X86: Add patterns for MULHU/MULHS of v8i16 and v16i16. This gets us pretty code for divs of i16 vectors. Turn the existing intrinsics into the corresponding nodes. llvm-svn: 207317	2014-04-26 13:01:03 +00:00
Benjamin Kramer	ad0168702a	Rip out X86-specific vector SDIV lowering, make the corresponding DAGCombiner transform work on vectors. llvm-svn: 207316	2014-04-26 13:00:53 +00:00
Benjamin Kramer	4dae598bc8	DAGCombiner: Turn divs of vector splats into vectorized multiplications. Otherwise the legalizer would just scalarize everything. Support for mulhi in the targets isn't that great yet so on most targets we get exactly the same scalarized output. Add a test for x86 vector udiv. I had to disable the mulhi nodes on ARM because there aren't any patterns for it. As far as I know ARM has instructions for getting the high part of a multiply so this should be fixed. llvm-svn: 207315	2014-04-26 12:06:28 +00:00
Benjamin Kramer	29139d5cb5	X86: Custom lower v4i32 UMUL_LOHI into 2 pmuludqs. Test will follow soon. llvm-svn: 207314	2014-04-26 12:06:11 +00:00
Michael Zolotukhin	1a97a7bcbf	Revert r206749 till a final decision about the intrinsics is made. llvm-svn: 207313	2014-04-26 09:56:41 +00:00
Quentin Colombet	ea18933d97	[X86] Implement TargetLowering::getScalingFactorCost hook. Scaling factors are not free on X86 because every "complex" addressing mode breaks the related instruction into 2 allocations instead of 1. <rdar://problem/16730541> llvm-svn: 207301	2014-04-26 01:11:26 +00:00
Filipe Cabecinhas	363b570d2a	Optimization for certain shufflevector by using insertps. Summary: If we're doing a v4f32/v4i32 shuffle on x86 with SSE4.1, we can lower certain shufflevectors to an insertps instruction: When most of the shufflevector result's elements come from one vector (and keep their index), and one element comes from another vector or a memory operand. Added tests for insertps optimizations on shufflevector. Added support and tests for v4i32 vector optimization. Reviewers: nadav Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3475 llvm-svn: 207291	2014-04-25 23:51:17 +00:00
Matt Arsenault	de1c3410c3	R600: Fix function name printing in LowerCall v2: Check both ExternalSymbol and GlobalAddress Patch by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 207282	2014-04-25 22:22:01 +00:00
Reed Kotler	5c7f91e42f	enable fast isel tablegen files for Mips Reviewers: dsanders Reviewed By: dsanders Differential Revision: http://reviews.llvm.org/D3498 llvm-svn: 207256	2014-04-25 18:36:38 +00:00
Duncan P. N. Exon Smith	d2b2facb07	SCC: Change clients to use const, NFC It's fishy to be changing the `std::vector<>` owned by the iterator, and no one actually does it, so I'm going to remove the ability in a subsequent commit. First, update the users. <rdar://problem/14292693> llvm-svn: 207252	2014-04-25 18:24:50 +00:00
Reed Kotler	c041669927	Make sure that DSUB does not duplicate the pattern of DSUBU Test Plan: Run test suite to make sure there is no regression. https://dmz-portal.mips.com/bb/builders/LLVM%20with%2064bit%20and%20delay%20slot%20optimizer%20and%20direct%20object%20emitter/builds/626 Reviewers: dsanders Reviewed By: dsanders Differential Revision: http://reviews.llvm.org/D3497 llvm-svn: 207247	2014-04-25 18:05:00 +00:00
Saleem Abdulrasool	99f0d458c3	ARM: remove @llvm.arm.sevl This intrinsic is no longer needed with the new @llvm.arm.hint(i32) intrinsic which provides a generic, extensible manner for adding hint instructions. This functionality can now be represented as @llvm.arm.hint(i32 5). llvm-svn: 207246	2014-04-25 17:51:25 +00:00
Saleem Abdulrasool	7e7c2f9ca6	ARM: provide a new generic hint intrinsic Introduce the llvm.arm.hint(i32) intrinsic that can be used to inject hints into the instruction stream. This is particularly useful for generating IR from a compiler where the user may inject an intrinsic (e.g. __yield). These are then pattern substituted into the correct instruction which already existed. llvm-svn: 207242	2014-04-25 17:24:24 +00:00
Tilmann Scheller	2c65bbddd8	[ARM64] When compiling for ELF in PIC mode, local symbols shouldn't go through the GOT There's no need for local symbols to go through the GOT, in fact it seems GNU ld is not even emitting GOT entries for local symbols and will error out when trying to resolve a GOT relocation for a local symbol. This bug triggers when bootstrapping clang on AArch64 Linux with -fPIC and the ARM64 backend. The AArch64 backend is not affected. With this commit it's now possible to bootstrap clang on AArch64 Linux with the ARM64 backend (-fPIC, -O3). llvm-svn: 207226	2014-04-25 13:43:18 +00:00
Jiangning Liu	533b560bc6	[ARM64] Handle fp128 for parameter passing on stack llvm-svn: 207222	2014-04-25 12:07:03 +00:00
Tim Northover	eb7354fd3b	ARM64: fix assertion in ISelDAGToDAG Also an unused variable, so double bonus! This should deal with PR19548. llvm-svn: 207221	2014-04-25 10:48:47 +00:00
Bradley Smith	672df15122	[ARM64] Print preferred aliases for SBFM/UBFM in InstPrinter llvm-svn: 207219	2014-04-25 10:25:29 +00:00
Kevin Qin	022d395c9c	[ARM64] Add RUN lines for "-target arm64 -mattr=-fp-armv8" on AArch64 no-fp test. This patch is a supplement of implementing predicate of FP, enabling aarch64 backend no-fp tests on arm64 target for verification. During this, one bug is exposed and fixed by this patch. llvm-svn: 207215	2014-04-25 09:44:20 +00:00
Kevin Qin	0e7b07704e	[ARM64] Support crc predicate on ARM64. According to the specification, CRC is an optional extension of the architecture. llvm-svn: 207214	2014-04-25 09:25:42 +00:00
Saleem Abdulrasool	d4cae62fda	X86: convert object streamer selection to a switch Change the object streamer selection to a switch from a series of if conditions. Rather than defaulting to ELF, require that an ELF format is requested. The Windows/!ELF is maintained as MachO would have been selected first and will still provide a MachO format. Add an assertion that if COFF is requested that the target platform is Windows as only WinCOFF object emission is currently supported. llvm-svn: 207200	2014-04-25 06:29:36 +00:00
Craig Topper	062a2baef0	[C++] Use 'nullptr'. Target edition. llvm-svn: 207197	2014-04-25 05:30:21 +00:00
Benjamin Kramer	76f753e9a9	X86: Don't transform shifts into ands when the sign bit is tested. Should unbreak MultiSource/Benchmarks/mediabench/g721/g721encode/encode. llvm-svn: 207145	2014-04-24 20:51:37 +00:00
Reid Kleckner	5772b77789	Add 'musttail' marker to call instructions This is similar to the 'tail' marker, except that it guarantees that tail call optimization will occur. It also comes with conservative IR verification rules that ensure that tail call optimization is possible. Reviewers: nicholas Differential Revision: http://llvm-reviews.chandlerc.com/D3240 llvm-svn: 207143	2014-04-24 20:14:34 +00:00
Andrea Di Biagio	d1ab866868	[X86] Add support for Read Time Stamp Counter x86 builtin intrinsics. This patch: - Adds two new X86 builtin intrinsics ('int_x86_rdtsc' and 'int_x86_rdtscp') as GCCBuiltin intrinsics; - Teaches the backend how to lower the two new builtins; - Introduces a common function to lower READCYCLECOUNTER dag nodes and the two new rdtsc/rdtscp intrinsics; - Improves (and extends) the existing x86 test 'rdtsc.ll'; now test 'rdtsc.ll' correctly verifies that both READCYCLECOUNTER and the two new intrinsics work fine for both 64bit and 32bit Subtargets. llvm-svn: 207127	2014-04-24 17:18:27 +00:00
Matt Arsenault	1018c897f6	R600/SI: Use address space in allowsUnalignedMemoryAccesses llvm-svn: 207126	2014-04-24 17:08:26 +00:00
David Blaikie	908f4d4bf5	Spread some const around for non-mutating uses of MCSymbolData. I discovered this const-hole while attempting to coalesce the Symbol and SymbolMap data structures. There are some pending issues with that, but I figured this change was easy to flush early. llvm-svn: 207124	2014-04-24 16:59:40 +00:00
Matheus Almeida	583a13cf36	[mips] Remove non-ascii character. llvm-svn: 207123	2014-04-24 16:31:10 +00:00
Tim Northover	6331d4b975	AArch64: print NEON lists with a space. This matches ARM64 behaviour, which I think is clearer. It also puts all the churn from that difference into one easily ignored commit. llvm-svn: 207116	2014-04-24 14:06:20 +00:00
Evgeniy Stepanov	f4a36999ad	[asan] Use MCInstrInfo in inline asm instrumentation. Patch by Yuri Gorshenin. llvm-svn: 207115	2014-04-24 13:29:34 +00:00
Tim Northover	d702d6ac6f	AArch64/ARM64: allow negative addends, at least on ELF. llvm-svn: 207111	2014-04-24 12:56:38 +00:00
Tim Northover	624928134f	ARM64: support relocated "TBZ/TBNZ" instructions. llvm-svn: 207110	2014-04-24 12:56:34 +00:00
Tim Northover	0815a43e7c	AArch64/ARM64: support relocated ADR instruction llvm-svn: 207109	2014-04-24 12:56:30 +00:00
Tim Northover	597ccb200c	AArch64/ARM64: add support for :abs_gN_s: MOVZ modifiers We only need assembly support, so it's fairly easy. llvm-svn: 207108	2014-04-24 12:56:27 +00:00
Tim Northover	49153037d4	ARM64: shut up warning about variable only used in assert. llvm-svn: 207106	2014-04-24 12:22:12 +00:00
Tim Northover	79ec019261	AArch64/ARM64: disentangle the "B.CC" and "LDR lit" operands These can have different relocations in ELF. In particular both: b.eq global ldr x0, global are valid, giving different relocations. The only possible way to distinguish them is via a different fixup, so the operands had to be separated throughout the backend. llvm-svn: 207105	2014-04-24 12:12:10 +00:00
Tim Northover	eb6611e727	AArch64/ARM64: implement BFI optimisation ARM64 was not producing pure BFI instructions for bitfield insertion operations, unlike AArch64. The approach had to be a little different (in ISelDAGToDAG rather than ISelLowering), and the outcomes aren't identical but hopefully this gives it similar power. This should address PR19424. llvm-svn: 207102	2014-04-24 12:11:53 +00:00
Evgeniy Stepanov	b6c47a5bd2	[asan] Fix instrumentation of x86 intel syntax inline assembly. Patch by Yuri Gorshenin. llvm-svn: 207092	2014-04-24 09:56:15 +00:00
Benjamin Kramer	f4575db2fd	X86: Emit test instead of constant shift + compare if the shift result is unused. This allows us to compile return (mask & 0x8 ? a : b); into testb $8, %dil cmovnel %edx, %esi instead of andl $8, %edi shrl $3, %edi cmovnel %edx, %esi which we formed previously because dag combiner canonicalizes setcc of and into shift. llvm-svn: 207088	2014-04-24 08:15:31 +00:00
Stepan Dyatkovskiy	00dcc0f53c	Fix for PR18921, "vmov" part. Added support for bytes replication feature, so it could be GAS compatible. E.g. instructions below: "vmov.i32 d0, 0xffffffff" "vmvn.i32 d0, 0xabababab" "vmov.i32 d0, 0xabababab" "vmov.i16 d0, 0xabab" are incorrect, but we could deal with such cases. For first one we should emit: "vmov.i8 d0, 0xff" For second one ("vmvn"): "vmov.i8 d0, 0x54" For last two instructions it should emit: "vmov.i8 d0, 0xab" P.S.: In ARMAsmParser.cpp I have also fixed few nearby style issues in old code. Just for keeping method bodies in harmony with themselves. llvm-svn: 207080	2014-04-24 06:03:01 +00:00
Quentin Colombet	ef86b4067c	[ARM64] Fix the information we give to the peephole optimizer for comparison. ANDS does not use the same encoding scheme as other xxxS instructions (e.g., ADDS). Take that into account to avoid wrong peephole optimization. <rdar://problem/16693089> llvm-svn: 207020	2014-04-23 20:43:38 +00:00
Quentin Colombet	04f7b74c39	[X86] Fix missing/wrong scheduling model found by code inspection. llvm-svn: 207014	2014-04-23 19:30:26 +00:00
NAKAMURA Takumi	d5696915d4	X86AsmParser.cpp: Fix memory leak at replacing movsd to movsl. llvm-svn: 206991	2014-04-23 14:51:35 +00:00
Evgeniy Stepanov	0a951b775e	Create MCTargetOptions. For now it contains a single flag, SanitizeAddress, which enables AddressSanitizer instrumentation of inline assembly. Patch by Yuri Gorshenin. llvm-svn: 206971	2014-04-23 11:16:03 +00:00
James Molloy	029de8b769	[ARM64] Fix formatting. llvm-svn: 206967	2014-04-23 10:50:32 +00:00
James Molloy	650cb57067	[ARM64] Add a big endian version of the ARM64 target machine, and update all users. This completes the porting of r202024 (cpirker "Add AArch64 big endian Target (aarch64_be)") to ARM64. llvm-svn: 206965	2014-04-23 10:26:40 +00:00
Alexey Volkov	9511327db8	Fixing typos in commit r206957 Differential Revision: http://reviews.llvm.org/D3451 llvm-svn: 206960	2014-04-23 10:20:31 +00:00
Alexey Volkov	0e55a99c0f	[X86] Silvermont new scheduler model This model is not final and work is still in progress. However there are substantial improvements on integer tests mainly because of better RAL with new scheduler. Differential Revision: http://reviews.llvm.org/D3451 llvm-svn: 206957	2014-04-23 08:57:09 +00:00
Elena Demikhovsky	8ac0bf96f0	X86Disassembler - fixed a bug in immediate print llvm-svn: 206953	2014-04-23 07:21:04 +00:00
Kevin Qin	a4ee178762	[ARM64] Enable feature predicates for NEON / FP / CRYPTO. AArch64 has feature predicates for NEON, FP and CRYPTO instructions. This allows the compiler to generate code without using FP, NEON or CRYPTO instructions. llvm-svn: 206949	2014-04-23 06:22:48 +00:00
Kevin Enderby	96918bc406	Fix the assembler to print a better relocatable expression error diagnostic that includes location information. Currently if one has this assembly: .quad (0x1234 + (4 * SOME_VALUE)) where SOME_VALUE is undefined one gets the less than useful error message with no location information: % clang -c x.s clang -cc1as: fatal error: error in backend: expected relocatable expression With this fix one now gets a more useful error message with location information: % clang -c x.s x.s:5:8: error: expected relocatable expression .quad (0x1234 + (4 * SOME_VALUE)) ^ To do this I plumbed the SMLoc through the MCObjectStreamer EmitValue() and EmitValueImpl() interfaces so it could be used when creating the MCFixup. rdar://12391022 llvm-svn: 206906	2014-04-22 17:27:29 +00:00
Matt Arsenault	16353871c3	R600: Emit error instead of unreachable on function call llvm-svn: 206904	2014-04-22 16:42:00 +00:00
Tom Stellard	8d6d449756	R600/SI: Reorganize SIInstructions.td llvm-svn: 206902	2014-04-22 16:33:57 +00:00
Elena Demikhovsky	acc5c9e83e	AVX-512: store and truncstore for i1 values llvm-svn: 206897	2014-04-22 14:13:10 +00:00
Tim Northover	a962398a3f	AArch64/ARM64: make use of ANDS and BICS instructions for comparisons. llvm-svn: 206888	2014-04-22 12:45:42 +00:00
Lang Hames	64f6ebb8a9	[X86] Require HasBMI2 for the new BZHI tablegen patterns. Evidently tablegen doesn't infer this from the HasBMI2 predicate on the BZHI instructions. This should fix the recent bot failures. llvm-svn: 206885	2014-04-22 12:04:53 +00:00
Robert Khasanov	189e7fdcfb	[AVX512] Implemented integer conversions up/down with masking. Added encoding tests. llvm-svn: 206884	2014-04-22 11:36:19 +00:00
Lang Hames	70fa72d340	[X86] Remove Tablegen def of X86bzhi SDNode: It's not needed as of r206879. llvm-svn: 206880	2014-04-22 10:50:46 +00:00
Lang Hames	3067ab2344	[X86] Use tablegen instead of DAG combines to match BZHI instructions, as suggested by Ben Kramer in review of r206738. Thanks again Ben! llvm-svn: 206879	2014-04-22 10:41:56 +00:00
Matheus Almeida	2852af8a00	[mips] Clang-format MipsAsmParser. No functional changes. llvm-svn: 206878	2014-04-22 10:15:54 +00:00
Tim Northover	00b4ee848f	AArch64/ARM64: add patterns for scalar_to_vector/extract pairs llvm-svn: 206876	2014-04-22 10:10:18 +00:00
Tim Northover	978d25f391	ARM: disable emission of __XYZvfp in soft-float environment. The point of these calls is to allow Thumb-1 code to make use of the VFP unit to perform its operations. This is not desirable with -msoft-float, since most of the reasons you'd want that apply equally to the runtime library. rdar://problem/13766161 llvm-svn: 206874	2014-04-22 10:10:09 +00:00
Lang Hames	f6f42cac3f	[X86] Don't use BZHI for short masks (>=32 bits). Thanks to Ben Kramer for the review. llvm-svn: 206869	2014-04-22 07:40:34 +00:00
Matt Arsenault	a3c8cde77b	R600: Change how vector truncating stores are packed. Don't introduce new operations on an illegal sub 32-bit type. Do the operations on a 32-bit value, and then use a truncating store. llvm-svn: 206864	2014-04-22 04:11:14 +00:00
Matt Arsenault	5dbd5db518	R600: Make sign_extend_inreg legal. Don't know why I didn't just do this in the first place. llvm-svn: 206862	2014-04-22 03:49:30 +00:00
Jiangning Liu	87486e0bac	[AArch64] Enable global merge pass. llvm-svn: 206861	2014-04-22 03:33:26 +00:00
Chandler Carruth	84e68b2994	[Modules] Fix potential ODR violations by sinking the DEBUG_TYPE definition below all of the header #include lines, lib/Target/... edition. llvm-svn: 206842	2014-04-22 02:41:26 +00:00
Chandler Carruth	ff55593c40	[cleanup] Fix two headers where we included a standard library header after including the generated code from tablegen. llvm-svn: 206841	2014-04-22 02:28:45 +00:00
Chandler Carruth	b5e481ac91	[cleanup] Fix another place where we were including the tablegen'ed code of a '.inc' file before including actual headers. In this case we had both duplicated a header's include and were including a standard header. llvm-svn: 206840	2014-04-22 02:25:17 +00:00
Chandler Carruth	d174b72a28	[cleanup] Lift using directives, DEBUG_TYPE definitions, and even some system headers above the includes of generated '.inc' files that actually contain code. In a few targets this was already done pretty consistently, but it wasn't done really consistently anywhere. It is strictly cleaner IMO and necessary in a bunch of places where the DEBUG_TYPE is referenced from the generated code. Consistency with the necessary places trumps. Hopefully the build bots are OK with the movement of intrin.h... llvm-svn: 206838	2014-04-22 02:03:14 +00:00
Chandler Carruth	e96dd8975f	[Modules] Make Support/Debug.h modular. This requires it to not change behavior based on other files defining DEBUG_TYPE, which means it cannot define DEBUG_TYPE at all. This is actually better IMO as it forces folks to define relevant DEBUG_TYPEs for their files. However, it requires all files that currently use DEBUG(...) to define a DEBUG_TYPE if they don't already. I've updated all such files in LLVM and will do the same for other upstream projects. This still leaves one important change in how LLVM uses the DEBUG_TYPE macro going forward: we need to only define the macro after header files have been #include-ed. Previously, this wasn't possible because Debug.h required the macro to be pre-defined. This commit removes that. By defining DEBUG_TYPE after the includes two things are fixed: - Header files that need to provide a DEBUG_TYPE for some inline code can do so by defining the macro before their inline code and undef-ing it afterward so the macro does not escape. - We no longer have rampant ODR violations due to including headers with different DEBUG_TYPE definitions. This may be mostly an academic violation today, but with modules these types of violations are easy to check for and potentially very relevant. Where necessary to support headers with DEBUG_TYPE, I have moved the definitions below the includes in this commit. I plan to move the rest of the DEBUG_TYPE macros in LLVM in subsequent commits; this one is big enough. The comments in Debug.h, which were hilariously out of date already, have been updated to reflect the recommended practice going forward. llvm-svn: 206822	2014-04-21 22:55:11 +00:00
Jim Grosbach	9446534025	ARM64: Refactor away a few redundant helpers. The comment claimed that the register class information wasn't available in the assembly parser, but that's not really true. It's just annoying to get to. Replace the helper functions with references to the auto-generated information. llvm-svn: 206802	2014-04-21 22:13:57 +00:00
Jim Grosbach	9515c52294	ARM64: Improve diagnostics for malformed reg+reg addressing mode. Make sure only general purpose registers are valid for offset regs and that 32-bit regs are only valid for sxtw and uxtw extends. llvm-svn: 206799	2014-04-21 21:45:57 +00:00
Jim Grosbach	ac901086e5	Move helper functions earlier in the file. No functional change. llvm-svn: 206798	2014-04-21 21:45:53 +00:00
Jim Grosbach	9d205d42f3	ARM64: Extended addressing mode source reg is 64-bit. The canonical form for the extended addressing mode (e.g., "[x1, w2, uxtw #3]") is for the MCInst to have the second register be the full 64-bit GPR64 register class. The instruction printer cleans up the output for display to show the 32-bit register instead, per the specification. This simplifies 205893 now that the aliasing is handled in the printer in 206495 so that the codegen path and the disassembler path give the same MCInst form. llvm-svn: 206797	2014-04-21 21:45:44 +00:00
Rafael Espindola	6c76d1d7df	Handle _GLOBAL_OFFSET_TABLE_ in 64 bit mode. With this MC is able to handle _GLOBAL_OFFSET_TABLE_ in 64 bit mode, which is needed for medium and large code models. This fixes pr19470. llvm-svn: 206793	2014-04-21 21:15:45 +00:00
Rafael Espindola	83752535ea	clang-format this function. No functionality change, it will just make the next patch easier to read. llvm-svn: 206792	2014-04-21 21:00:58 +00:00
David Blaikie	422b93dcf1	Use unique_ptr to manage objects owned by the ScheduleDAGMI. llvm-svn: 206784	2014-04-21 20:32:32 +00:00
Filipe Cabecinhas	20352216fb	Rename X86insrtps to the proper instruction name. Summary: The INSERTPS pattern fragment was called insrtps (missing 'e'), which would make it harder to grep for the patterns related to this instruction. Renaming it to use the proper instruction name. Reviewers: nadav CC: llvm-commits Differential Revision: http://reviews.llvm.org/D3443 llvm-svn: 206779	2014-04-21 20:07:29 +00:00
Chandler Carruth	a4a2066482	[Modules] Consolidate the DEBUG_TYPE defines in NVPTX to the top of the cpp file rather than in the header and then again in the cpp file. llvm-svn: 206778	2014-04-21 19:53:55 +00:00
Yi Jiang	d069f6393a	ARM64: Combine shifts and uses from different basic block to bit-extract instruction llvm-svn: 206774	2014-04-21 19:34:27 +00:00
NAKAMURA Takumi	62774f3524	Appease autoconf build since X86Disassembler.c disappeared in r206717. It can be reverted a few days later, after X86Disassembler.d is updated not to contain "X86Disassembler.c". llvm-svn: 206758	2014-04-21 14:59:11 +00:00
Michael Zolotukhin	f2ba994bf6	Reapply r206732. This time without optimization of branches. llvm-svn: 206749	2014-04-21 12:01:33 +00:00
Benjamin Kramer	d2da720ead	[C++11] Replace OwningPtr with std::unique_ptr in places where it doesn't break the API. No functionality change. llvm-svn: 206740	2014-04-21 09:34:48 +00:00
Lang Hames	5aa6ee80b6	[X86] ISEL (and X, <constant mask>) to BZHI when BMI2 is available. Generating BZHI in the variable mask case, i.e. (and X, (sub (shl 1, N), 1)), was already supported, but we were missing the constant-mask case. This patch fixes that. <rdar://problem/15480077> llvm-svn: 206738	2014-04-21 08:18:53 +00:00
Chandler Carruth	a2533a7bef	Revert r206732 which is causing llc to crash on most of the build bots. Original commit message: Implement builtins for safe division: safe.sdiv.iN, safe.udiv.iN, safe.srem.iN, safe.urem.iN (iN = i8, i16, i32, or i64). llvm-svn: 206735	2014-04-21 07:11:15 +00:00
Michael Zolotukhin	137a84616c	Implement builtins for safe division: safe.sdiv.iN, safe.udiv.iN, safe.srem.iN, safe.urem.iN (iN = i8, i16, i32, or i64). llvm-svn: 206732	2014-04-21 05:33:09 +00:00
Richard Smith	5d50610306	C++ has a bool type! (And C's had one too, for 15 years...) llvm-svn: 206723	2014-04-20 22:15:37 +00:00
Richard Smith	6a6967eeaf	More C++ification. llvm-svn: 206722	2014-04-20 22:10:16 +00:00
Richard Smith	3c3410f139	Remove some more C junk from these files. llvm-svn: 206721	2014-04-20 21:56:02 +00:00
Richard Smith	ac15f1cda3	Don't provide two different definitions of ModRMDecision, OpcodeDecision, and ContextDecision in different source files (depending on #define magic). llvm-svn: 206720	2014-04-20 21:52:16 +00:00
Richard Smith	82b47d5660	Don't define llvm::X86Disassembler::InstructionSpecifier in different ways in different source files. llvm-svn: 206719	2014-04-20 21:35:26 +00:00
Richard Smith	555134215b	Maybe if I touch this file the buildbots will actually rerun configure like they need to... llvm-svn: 206718	2014-04-20 21:28:33 +00:00
Richard Smith	89ee75d786	What year is it! This file has no reason to be written in C, and has doubly no reason to expose a global symbol 'decodeInstruction' nor to pollute the global scope with a bunch of external linkage entities (some of which conflict with others elsewhere in LLVM). This is just the initial transition to C++; more cleanups to follow. llvm-svn: 206717	2014-04-20 21:07:34 +00:00
Alp Toker	9844434151	Remove some empty statements Cleanup only. llvm-svn: 206710	2014-04-19 23:56:35 +00:00
Yaron Keren	d7ba46b287	Patch by Vadim Chugunov Win64 stack unwinder gets confused when execution flow "falls through" after a call to 'noreturn' function. This fixes the "missing epilogue" problem by emitting a trap instruction for IR 'unreachable' on x86_x64-pc-windows. A secondary use for it would be for anyone wanting to make double-sure that 'noreturn' functions, indeed, do not return. llvm-svn: 206684	2014-04-19 13:47:43 +00:00
Kevin Enderby	b7e51f6af5	Change the ARM assembler to require a :lower16: or :upper16: on non-constant expressions for mov instructions instead of silently truncating by default. For the ARM assembler, we want to avoid misleadingly allowing something like "mov r0, <symbol>" especially when we turn it into a movw and the expression <symbol> does not have a :lower16: or :upper16: as part of the expression. We don't want the behavior of silently truncating, which can be unexpected and lead to bugs that are difficult to find since this is an easy mistake to make. This does change the previous behavior of llvm but actually matches an older gnu assembler that would not allow this but print less useful errors like “invalid constant (0x927c0) after fixup” and “unsupported relocation on symbol foo”. The error for llvm is "immediate expression for mov requires :lower16: or :upper16" with correct location information on the operand as shown in the added test cases. rdar://12342160 llvm-svn: 206669	2014-04-18 23:06:39 +00:00
Chad Rosier	9149acb053	[ARM64] Ports the Cortex-A53 Machine Model description from AArch64. Summary: This port includes the rudimentary latencies that were provided for the Cortex-A53 Machine Model in the AArch64 backend. It also changes the SchedAlias for COPY in the Cyclone model to an explicit WriteRes mapping to avoid conflicts in other subtargets. Differential Revision: http://reviews.llvm.org/D3427 Patch by Dave Estes <cestes@codeaurora.org>! llvm-svn: 206652	2014-04-18 21:22:04 +00:00
Adam Nemet	ee7a3e38c9	[X86] Improve buildFromShuffleMostly for AVX For a 256-bit BUILD_VECTOR consisting mostly of shuffles of 256-bit vectors, both the BUILD_VECTOR and its operands may need to be legalized in multiple steps. Consider: (v8f32 (BUILD_VECTOR (extract_vector_elt (v8f32 %vreg0), Constant<1>), (extract_vector_elt %vreg0, Constant<2>), (extract_vector_elt %vreg0, Constant<3>), (extract_vector_elt %vreg0, Constant<4>), (extract_vector_elt %vreg0, Constant<5>), (extract_vector_elt %vreg0, Constant<6>), (extract_vector_elt %vreg0, Constant<7>), %vreg1)) a. We can't build a 256-bit vector efficiently so, we need to split it into two 128-bit vecs and combine them with VINSERTX128. b. Operands like (extract_vector_elt (v8f32 %vreg0), Constant<7>) need to be split into a VEXTRACTX128 and a further extract_vector_elt from the resulting 128-bit vector. c. The extract_vector_elt from b. is lowered into a shuffle to the first element and a movss. Depending on the order in which we legalize the BUILD_VECTOR and its operands[1], buildFromShuffleMostly may be faced with: (v4f32 (BUILD_VECTOR (extract_vector_elt (vector_shuffle<1,u,u,u> (extract_subvector %vreg0, Constant<4>), undef), Constant<0>), (extract_vector_elt (vector_shuffle<2,u,u,u> (extract_subvector %vreg0, Constant<4>), undef), Constant<0>), (extract_vector_elt (vector_shuffle<3,u,u,u> (extract_subvector %vreg0, Constant<4>), undef), Constant<0>), %vreg1)) In order to figure out the underlying vector and their identity we need to see through the shuffles. [1] Note that the order in which operations and their operands are legalized is only guaranteed in the first iteration of LegalizeDAG. Fixes <rdar://problem/16296956> llvm-svn: 206634	2014-04-18 19:44:16 +00:00
Tim Northover	37d9a9cebf	ARM64: disable generation of .loh directives outside MachO. Part of PR19455. llvm-svn: 206611	2014-04-18 14:54:46 +00:00
Tim Northover	be1d1b6681	ARM64: don't emit .subsections_via_symbols on ELF. Part of PR19455. llvm-svn: 206610	2014-04-18 14:54:41 +00:00
Tim Northover	be3941cc79	ARM64: add extra NEG pattern. llvm-svn: 206609	2014-04-18 14:54:35 +00:00
Tim Northover	e3028832d1	AArch64/ARM64: add non-scalar lowering for more FCVT operations. llvm-svn: 206591	2014-04-18 13:16:42 +00:00
Tim Northover	01f315a556	AArch64/ARM64: improve spotting of EXT instructions from VECTOR_SHUFFLE. We couldn't cope if the first mask element was UNDEF before, which isn't ideal. llvm-svn: 206588	2014-04-18 12:50:58 +00:00
Benjamin Kramer	e6c821ef4c	X86: Pattern match scalar loads + vcvtph2ps into just vcvtph2ps. vcvtph2ps only reads the lower 64 bits of the address passed to the intrinsic. llvm-svn: 206579	2014-04-18 10:45:33 +00:00
Tim Northover	a2c4c71c12	AArch64/ARM64: spot a greater variety of concat_vector operations. Code mostly copied from AArch64, just tidied up a trifle and plumbed into the ARM64 way of doing things. This also enables the AArch64 tests which inspired the previous untested commits. llvm-svn: 206574	2014-04-18 09:31:27 +00:00
Tim Northover	848bb3ced5	ARM64: implement cunning optimisation from AArch64 A vector extract followed by a dup can become a single instruction even if the types don't match. AArch64 handled this in ISelLowering, but a few reasonably simple patterns can take care of it in TableGen, so that's where I've put it. llvm-svn: 206573	2014-04-18 09:31:20 +00:00
Tim Northover	5ec51a8981	ARM64: spot a vector_shuffle that maps to INS and expand. Tests will be coming very shortly when all the optimisations needed to support AArch64's neon-copy.ll file are committed. llvm-svn: 206572	2014-04-18 09:31:15 +00:00
Tim Northover	46d98ea8de	ARM64: nick some AArch64 patterns for extract/insert -> INS. Tests will be committed shortly when all optimisations needed to support AArch64's neon-copy.ll file are supported. llvm-svn: 206571	2014-04-18 09:31:11 +00:00
Tim Northover	8b2fa3dfef	AArch64/ARM64: emit all vector FP comparisons as such. ARM64 was scalarizing some vector comparisons which don't quite map to AArch64's compare and mask instructions. AArch64's approach of sacrificing a little efficiency to emulate them with the limited set available was better, so I ported it across. More "inspired by" than copy/paste since the backend's internal expectations were a bit different, but the tests were invaluable. llvm-svn: 206570	2014-04-18 09:31:07 +00:00
Tim Northover	0a44e66bb8	AArch64/ARM64: port BSL logic from AArch64 & enable test. I enhanced it a little in the process. The decision shouldn't really be based on whether a BUILD_VECTOR is a splat: any set of constants will do the job provided they're related in the correct way. Also, the BUILD_VECTOR could be any operand of the incoming AND nodes, so it's best to check for all 4 possibilities rather than assuming it'll be the RHS. llvm-svn: 206569	2014-04-18 09:31:01 +00:00
Tim Northover	547a4ae6fa	AArch64/ARM64: copy byval implementation from AArch64. It's not actually used to handle C or C++ ABI rules on ARM64, but could well be emitted by other language front-ends, so it's as well to have a sensible implementation. llvm-svn: 206568	2014-04-18 09:30:52 +00:00
Jiangning Liu	ad874fca28	This commit allows vectorized loops to be unrolled by a factor of 2 for AArch64. A new test case is also added for ARM64. Patched by Z.Zheng llvm-svn: 206563	2014-04-18 07:57:54 +00:00
Matt Arsenault	209a7b92b5	R600: Minor cleanups. Fix indentation, better line wrapping, unused includes. llvm-svn: 206562	2014-04-18 07:40:20 +00:00
Jiangning Liu	40d81e10c5	This is one of the optimizations ported from ARM64 to AArch64 to address the performance gap between these two back ends. The test case newly added for AArch64 already exists in ARM64. Patched by Z.Zheng llvm-svn: 206559	2014-04-18 05:58:09 +00:00
Matt Arsenault	78b8670aac	R600/SI: Try to use scalar BFE. Use scalar BFE with constant shift and offset when possible. This is complicated by the fact that the scalar version packs the two operands of the vector version into one. llvm-svn: 206558	2014-04-18 05:19:26 +00:00
Jiangning Liu	e56c30614f	This commit enables unaligned memory accesses of vector types on AArch64 back end. This should boost vectorized code performance. Patched by Z. Zheng llvm-svn: 206557	2014-04-18 03:58:38 +00:00
Matt Arsenault	27cc958dff	R600/SI: Match sign_extend_inreg to s_sext_i32_i8 and s_sext_i32_i16 llvm-svn: 206547	2014-04-18 01:53:18 +00:00
Tom Stellard	1aa6cb4d88	R600/SI: Use SReg_64 instead of VSrc_64 when selecting BUILD_PAIR llvm-svn: 206541	2014-04-18 00:36:21 +00:00
Jim Grosbach	6bfe18a365	[ARM64,C++11] Range'ify another loop. llvm-svn: 206539	2014-04-17 23:41:57 +00:00
Reed Kotler	720c5ca4ea	Start pushing changes for Mips Fast-Isel llvm-svn: 206505	2014-04-17 22:15:34 +00:00
Tom Stellard	aeeea8a864	R600: Add comment clarifying use of sext for result of MUL_U24 llvm-svn: 206501	2014-04-17 21:00:13 +00:00
Tom Stellard	868fd92e54	R600/SI: Stop using i128 as the resource descriptor type Having i128 as a legal type complicates the legalization phase. v4i32 is already a legal type, so we will use that instead. This fixes several piglit tests. llvm-svn: 206500	2014-04-17 21:00:11 +00:00
Tom Stellard	334b29c7f6	R600/SI: Change default register class for i32 to SReg_32 SIFixSGPRCopies is smart enough to handle this now. llvm-svn: 206499	2014-04-17 21:00:09 +00:00
Tom Stellard	4f3b04de21	R600/SI: Teach SIInstrInfo::moveToVALU() how to handle PHI instructions llvm-svn: 206498	2014-04-17 21:00:07 +00:00
Tom Stellard	e1a244502c	R600/SI: Legalize operands after changing dst reg in FixSGPRCopies Otherwise we may not legalize some illegal REG_SEQUENCE instructions. llvm-svn: 206497	2014-04-17 21:00:01 +00:00
Louis Gerbarg	153e695ee2	Improve ARM64 vector creation This patch improves the performance of vector creation in cases where several of the lanes in the vector are a constant floating point value. It also includes new patterns to fold together some of the instructions when the value is 0.0f. Test cases included. rdar://16349427 llvm-svn: 206496	2014-04-17 20:51:50 +00:00
Jim Grosbach	0fba6d98fc	ARM64: [su]xtw use W regs as inputs, not X regs. Update the SXT[BHW]/UXTW instruction aliases and the shifted reg addressing mode handling. PR19455 and rdar://16650642 llvm-svn: 206495	2014-04-17 20:47:31 +00:00
Tim Northover	11a6082e33	ARM64: switch to IR-based atomic operations. Goodbye code! (Game: spot the bug fixed by the change). llvm-svn: 206490	2014-04-17 20:00:33 +00:00
Tim Northover	0129f298c4	ARM64: add acquire/release versions of the existing atomic intrinsics. These will be needed to support IR-level lowering of atomic operations. llvm-svn: 206489	2014-04-17 20:00:24 +00:00
Tim Northover	037f26f212	Atomics: promote ARM's IR-based atomics pass to CodeGen. Still only 32-bit ARM using it at this stage, but the promotion allows direct testing via opt and is a reasonably self-contained patch on the way to switching ARM64. At this point, other targets should be able to make use of it without too much difficulty if they want. (See ARM64 commit coming soon for an example). llvm-svn: 206485	2014-04-17 18:22:47 +00:00
Matt Arsenault	a90d22fad5	R600/SI: f64 frint is legal on CI llvm-svn: 206475	2014-04-17 17:06:37 +00:00
Chad Rosier	c4eb4f8827	[AArch64] Implement the getCSRFirstUseCost API, mirroring that in ARM64. llvm-svn: 206473	2014-04-17 16:19:54 +00:00
Craig Topper	0a9bf4c0c5	[X86] Add disassembler support for the 0x0f 0x7f form of movq %mm, %mm. llvm-svn: 206447	2014-04-17 06:33:45 +00:00
Matt Arsenault	51df0c1965	R600/SI: Fix zext from i1 to i64 llvm-svn: 206437	2014-04-17 02:03:08 +00:00
Adam Nemet	287f989dde	[ARM64] Fix "Cannot select" for vector ctpop The commit of r205855: Author: Arnold Schwaighofer <aschwaighofer@apple.com> Date: Wed Apr 9 14:20:47 2014 +0000 SLPVectorizer: Only vectorize intrinsics whose operands are widened equally The vectorizer only knows how to vectorize intrinsics by widening all operands by the same factor. Patch by Tyler Nowicki! exposed a backend bug causing a regression (Cannot select ctpop). The commit msg is a bit confusing because the patch actually changes the behavior for the loop-vectorizer as well. As things got refactored into a helper ctpop got snuck in to the trivially-vectorizable helper which is now used by both vectorizers. In other words, we started seeing vector-ctpops in the backend. This change makes ctpop LegalizeAction::Expand for the types not supported by the byte-only CNT instruction. We may be able to custom-lower these later to a single CNT but this is to fix the compiler crash first. Fixes <rdar://problem/16578951> llvm-svn: 206433	2014-04-17 01:01:37 +00:00
Aaron Ballman	5f1378c2a4	Replacing a non-ASCII character in a comment with an ASCII character. Fixes a C4819 warning in MSVC. llvm-svn: 206403	2014-04-16 17:09:20 +00:00
Matheus Almeida	483d7e9349	[mips] Use TwoOperandAliasConstraint for shift instructions. This enables TableGen to generate an additional two operand matcher for our shift_rotate_imm and shift_rotate_reg class of instructions. The tests were also updated so that they include now encoding information for all affected instructions. llvm-svn: 206398	2014-04-16 16:28:59 +00:00
Matheus Almeida	0051f2dc78	[mips] Add initial support for NaN2008 in the back-end. This is so that EF_MIPS_NAN2008 is set if we are using IEEE 754-2008 NaN encoding (-mnan=2008). This patch also adds support for parsing '.nan legacy' and '.nan 2008' assembly directives. The handling of these directives should match GAS' behaviour i.e., the last directive in use sets the ELF header bit (EF_MIPS_NAN2008). Differential Revision: http://reviews.llvm.org/D3346 llvm-svn: 206396	2014-04-16 15:48:55 +00:00
Tim Northover	ef7b34d403	ARM64: silence sign-comparison warning. llvm-svn: 206393	2014-04-16 15:28:06 +00:00
Tim Northover	3e69958b6b	AArch64/ARM64: produce correct relocation for conditional branches. llvm-svn: 206391	2014-04-16 15:27:52 +00:00
Daniel Sanders	82cd99a126	[mips] Indentation llvm-svn: 206389	2014-04-16 14:38:27 +00:00
Daniel Sanders	16fa1db637	[mips] Fix emission of '.option pic0' for MIPS-IV. Summary: This was a case of incorrect usage of hasMips64() vs isABI_N64() Reviewers: matheusalmeida, dsanders Reviewed By: dsanders Differential Revision: http://reviews.llvm.org/D3398 llvm-svn: 206388	2014-04-16 13:58:57 +00:00
Daniel Sanders	a024fb0e04	[mips] Correct r206370 to account for non-Linux targets using the small data section. This should fix the ninja-x64-msvc-RA-centos6 builder. I suspect the check in MipsSubtarget.cpp is incorrect and is really trying to check for a bare-metal target rather than anything other than Linux. I'll investigate this. llvm-svn: 206385	2014-04-16 12:29:08 +00:00
Tim Northover	3ec1de7767	AArch64/ARM64: port across stub handling for ELF C++ exceptions. The most important part here is that we should actually emit the stubs we refer to in the exception table, but as a side issue this uses more sensible & GCC compatible representations for some of the bits of information. llvm-svn: 206380	2014-04-16 11:52:55 +00:00
Tim Northover	18f68f6d1a	ARM64: use 32-bit moves for constants where possible. If we know that a particular 64-bit constant has all high bits zero, then we can rely on the fact that 32-bit ARM64 instructions automatically zero out the high bits of an x-register. This gives the expansion logic fewer constraints to satisfy and so sometimes allows it to pick better sequences. Came up while porting test/CodeGen/AArch64/movw-consts.ll: this will allow a 32-bit MOVN to be used in @test8 soon. llvm-svn: 206379	2014-04-16 11:52:51 +00:00
Tim Northover	9cfb57dafa	ARM64: use the integrated assembler on ELF. llvm-svn: 206378	2014-04-16 11:52:40 +00:00
Matheus Almeida	dc7e48e084	[mips] Emit '.set nomicromips' before a function's entry label if not in micromips mode. The test (elf_st_other.ll) was renamed as the name and description didn't make sense as the test wasn't checking any symbol table entry. Differential Revision: http://reviews.llvm.org/D3346 llvm-svn: 206377	2014-04-16 11:46:59 +00:00
Aaron Ballman	58ce7f24cd	Fixing a compile error in debug versions of MSVC. It seems that the range-based for loop is confused by the DEBUG macro expansion unless a compound statement is used. llvm-svn: 206376	2014-04-16 11:15:57 +00:00
Daniel Sanders	11c0c067c2	[mips] Correct callee saved list for the N32 ABI and enable test Summary: Depends on D3339 Reviewers: matheusalmeida, vmedic Reviewed By: matheusalmeida Differential Revision: http://reviews.llvm.org/D3340 llvm-svn: 206371	2014-04-16 10:23:37 +00:00
Tim Northover	97c5b6fe4f	ARM64: mark x7 as used when an i128 gets shunted onto the stack. The second half of a split i128 was ending up in x7, which is not a good thing. This is another part of PR19432. llvm-svn: 206366	2014-04-16 09:03:25 +00:00
Craig Topper	abb4ac7f87	Convert SelectionDAG::getVTList to use ArrayRef llvm-svn: 206357	2014-04-16 06:10:51 +00:00
Saleem Abdulrasool	0d3d6c45ef	Target: whitespace llvm-svn: 206353	2014-04-16 04:15:25 +00:00
Matt Arsenault	4e46665a80	R600: Expand sign extension of vectors. Setting vector types to expand will result in scalarization on pre SI hw, as those gpus don't have vector shifts either. Expand also i32 vectors, this helps llvm make the correct decision about scalarizing the vector ops. v2: move setOperation() calls to R600ISelLowering.cpp. cleanup the SI code to make it obvious that this patch is a nop for SI Patch by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 206348	2014-04-16 01:41:30 +00:00
Jim Grosbach	36c6a50512	[ARM64,C++11] Tidy up branch relaxation a bit w/ c++11. No functional change. llvm-svn: 206344	2014-04-16 00:42:46 +00:00
Jim Grosbach	01fc5887ad	ARM64: Nuke some dead code. Missed in previous commit. llvm-svn: 206343	2014-04-16 00:42:43 +00:00
Jim Grosbach	80633094f8	[ARM64,C++11] Clean up the ARM64 LOH collection pass. Range'ify a bunch of loops, mainly. As a result, we have a variety of objects via reference rather than by pointer, so propagate that through the various helper functions where it makes sense. llvm-svn: 206337	2014-04-15 22:57:02 +00:00
Matt Arsenault	e500e32939	R600/SI: Print code size along with used registers llvm-svn: 206336	2014-04-15 22:40:47 +00:00
Matt Arsenault	4d7d38333b	R600/SI: Print more immediates in hex format Print in decimal for inline immediates, and hex otherwise. Use hex always for offsets in addressing offsets. This approximately matches what the shader compiler does. llvm-svn: 206335	2014-04-15 22:32:49 +00:00
Matt Arsenault	fcf86c5417	R600/SI: Cleanup parsing of register names. Try to figure out the class and number of subregisters. llvm-svn: 206334	2014-04-15 22:32:42 +00:00
Matt Arsenault	470acd81a8	R600/SI: Fix loads of i1 llvm-svn: 206330	2014-04-15 22:28:39 +00:00
Andrea Di Biagio	aac2eac4c2	[X86] Improve the lowering of packed shifts by constant build_vector. This patch teaches the backend how to efficiently lower logical and arithmetic packed shifts on both SSE and AVX/AVX2 machines. When possible, instead of scalarizing a vector shift, the backend should try to expand the shift into a sequence of two packed shifts by immediate count followed by a MOVSS/MOVSD. Example (v4i32 (srl A, (build_vector < X, Y, Y, Y>))) Can be rewritten as: (v4i32 (MOVSS (srl A, <Y,Y,Y,Y>), (srl A, <X,X,X,X>))) [with X and Y ConstantInt] The advantage is that the two new shifts from the example would be lowered into X86ISD::VSRLI nodes. This is always cheaper than scalarizing the vector into four scalar shifts plus four pairs of vector insert/extract. llvm-svn: 206316	2014-04-15 19:30:48 +00:00
Quentin Colombet	72dad56c53	[ARM64] Set default CPU to generic instead of cyclone. llvm-svn: 206313	2014-04-15 19:08:46 +00:00
NAKAMURA Takumi	e1f3583b96	MipsAsmParser.cpp: Fix vg_leak in MipsOperand::CreateMem(). Mem.Base is managed by k_Memory itself. llvm-svn: 206293	2014-04-15 14:13:21 +00:00
NAKAMURA Takumi	bd524ef129	MipsAsmParser::ParseRegister(): Be responsible to delete an Operand on a temporary Operands. llvm-svn: 206292	2014-04-15 14:06:27 +00:00
Tim Northover	ebb3123a5f	AArch64/ARM64: add missing pattern for extending load. llvm-svn: 206290	2014-04-15 14:00:19 +00:00
Tim Northover	cbcb7a37f7	AArch64/ARM64: only mangle MOVZ/MOVN during encoding when needed Sometimes we need to emit the bits that would actually be a MOVN when producing a relocated MOVZ instruction (don't ask). But not always, a check which ARM64 got wrong until now. llvm-svn: 206289	2014-04-15 14:00:15 +00:00
Tim Northover	6e27b8ded5	AArch64/ARM64: add support for large code-model jump tables. I've left the MachO CodeGen as it is, there's a reasonable chance it should use the GOT like ConstPools, but I'm not certain. llvm-svn: 206288	2014-04-15 14:00:11 +00:00
Tim Northover	221b583951	AArch64/ARM64: add patterns for various commutations of FNMADD. llvm-svn: 206287	2014-04-15 14:00:06 +00:00
Tim Northover	b37cff1ae2	AArch64/ARM64: add half as a storage type on ARM64. This brings it into line with the AArch64 behaviour and should open the way for certain OpenCL features. llvm-svn: 206286	2014-04-15 14:00:03 +00:00
Tim Northover	80a70a265a	AArch64/ARM64: copy patterns for fixed-point conversions Code is mostly copied directly across, with a slight extension of the ISelDAGToDAG function so that it can cope with the floating-point constants being behind a litpool. llvm-svn: 206285	2014-04-15 13:59:57 +00:00
Tim Northover	f70577b1cd	ARM64: add constraints to various FastISel operations llvm-svn: 206284	2014-04-15 13:59:53 +00:00
Tim Northover	2f553f326a	FastISel: constrain the RegClass of operands when emitting instructions. ARM64 suffered multiple -verify-machineinstr failures (principally over the xsp/xzr issue) because FastISel was completely ignoring which subset of the general-purpose registers each instruction required. More fixes are coming in ARM64 specific FastISel, but this should cover the generic problems. llvm-svn: 206283	2014-04-15 13:59:49 +00:00
Tim Northover	20603726ce	AArch64/ARM64: add dp tests from AArch64 llvm-svn: 206281	2014-04-15 13:59:40 +00:00
NAKAMURA Takumi	6091e1aed5	ARM64AsmParser.cpp: Fix vg_leak in MC/ARM64/fp-encoding.s. llvm-svn: 206279	2014-04-15 13:22:11 +00:00
Stepan Dyatkovskiy	95cdac43af	Optional hash symbol feature support for ARM64 http://reviews.llvm.org/D3328 llvm-svn: 206276	2014-04-15 11:43:09 +00:00
Vladimir Medic	16d671a413	Current definition of subtract with immediate instruction aliases uses CodeGenOnly defined instructions and post matcher expansion methods to emit real instructions add with immediate. However, they can directly alias add with immediate instruction and remove unnecessary definitions and code in MipsAsmParser.cpp. This patch makes no change in functionality, just removes unnecessary definitions and code. llvm-svn: 206272	2014-04-15 10:14:49 +00:00
NAKAMURA Takumi	df72764599	X86JITInfo: [x86] Rework r206240, X86CompilationCallback_SSE() should be called for SSE-enabled code generator, even if LLVM is not built with -msse. llvm-svn: 206261	2014-04-15 08:28:23 +00:00
Nick Lewycky	aad475b324	Break PseudoSourceValue out of the Value hierarchy. It is now the root of its own tree containing FixedStackPseudoSourceValue (which you can use isa/dyn_cast on) and MipsCallEntry (which you can't). Anything that needs to use either a PseudoSourceValue* and Value* is strongly encouraged to use a MachinePointerInfo instead. llvm-svn: 206255	2014-04-15 07:22:52 +00:00
Lang Hames	a1bc0f5662	[MC] Require an MCContext when constructing an MCDisassembler. This patch re-introduces the MCContext member that was removed from MCDisassembler in r206063, and requires that an MCContext be passed in at MCDisassembler construction time. (Previously the MCContext member had been initialized in an ad-hoc fashion after construction). The MCContext member can be used by MCDisassembler sub-classes to construct constant or target-specific MCExprs. This patch updates disassemblers for in-tree targets, and provides the MCRegisterInfo instance that some disassemblers were using through the MCContext (previously those backends were constructing their own MCRegisterInfo instances). llvm-svn: 206241	2014-04-15 04:40:56 +00:00
NAKAMURA Takumi	33ec29ace9	X86JITInfo: [x86] Use X86CompilationCallback_SSE() along; not Subtarget->hasSSE1() but __SSE__, the flag that LLVM libraries are compiled The callback calls internal LLVM JIT libraries. It may be built with -msse (or above). FIXME: JIT may use "host" instead of "generic" by default. llvm-svn: 206240	2014-04-15 04:12:21 +00:00
Jim Grosbach	2c6ff0cbb4	[ARM64,C++11]: Range'ify the dead-register-definition pass. Range-based for loops. No functional change intended. llvm-svn: 206239	2014-04-15 02:14:09 +00:00
Quentin Colombet	f9b61e6afd	[ARM64][MC] Set the default CPU string to generic. llvm-svn: 206228	2014-04-15 00:28:39 +00:00
Jim Grosbach	a344b6c314	X86: Nuke one more CPU autodetect blurb. Missed one in r206094. This brings MC and TargetMachine back into sync. llvm-svn: 206220	2014-04-14 22:23:30 +00:00
David Blaikie	9027abae53	Change argument order and add explanatory comment to r206130 Changes requested in code review by Eric Christopher of r206130. llvm-svn: 206219	2014-04-14 22:23:06 +00:00
Eric Christopher	b45b4814f6	Use FrameSetup on frame instructions for the Mips port. I can't seem to get a testcase to show a difference here, but it's part of the unconditional-br.ll line table weirdness. llvm-svn: 206218	2014-04-14 22:21:22 +00:00
Quentin Colombet	4097c8959c	[ARM64][MC] Set the default CPU to cyclone when initializing the MC layer. This matches what ARM64Subtarget does for now. This is related to <rdar://problem/16573920> llvm-svn: 206211	2014-04-14 21:25:53 +00:00
Louis Gerbarg	cfc05450e5	Fix for codegen bug that could cause illegal cmn instruction generation In rare cases the dead definition elimination pass code can cause illegal cmn instructions when it replaces dead registers on instructions that use unmaterialized frame indexes. This patch disables the dead definition optimization for instructions which include frame index operands. rdar://16438284 llvm-svn: 206208	2014-04-14 21:05:05 +00:00
Louis Gerbarg	6d2e3c638f	Add a flag to disable the ARM64DeadRegisterDefinitionsPass This patch adds a -arm64-dead-def-elimination flag so that it is possible to disable dead definition elimination. Includes test case. llvm-svn: 206207	2014-04-14 21:05:02 +00:00
James Molloy	d60571bad7	[ARM64] Port over missing subtarget features, and CPU definitions from AArch64. llvm-svn: 206198	2014-04-14 17:38:00 +00:00
Daniel Sanders	863c35a358	[mips] Fix fcopysign for MIPS-IV and add the test. Summary: This was another incorrect use of hasMips64() vs isGP64bit(). Depends on D3344 Reviewers: matheusalmeida, vmedic Reviewed By: vmedic Differential Revision: http://reviews.llvm.org/D3347 llvm-svn: 206187	2014-04-14 16:24:12 +00:00
Daniel Sanders	3d84935d28	[mips] Fix more incorrect uses of HasMips64 and isMips64() Summary: - Conditional moves acting on 64-bit GPR's should require MIPS-IV rather than MIPS64 - ISD::MUL, and ISD::MULH[US] should be lowered on all 64-bit ISA's Patch by David Chisnall His work was sponsored by: DARPA, AFRL I've added additional testcases to cover as much of the codegen changes affecting MIPS-IV as I can. Where I've been unable to find an existing MIPS64 testcase that can be re-used for MIPS-IV (mainly tests covering ISD::GlobalAddress and similar), I at least agree that MIPS-IV should behave like MIPS64. Further testcases that are fixed by this patch will follow in my next commit. The testcases from that commit that fail for MIPS-IV without this patch are: LLVM :: CodeGen/Mips/2010-07-20-Switch.ll LLVM :: CodeGen/Mips/cmov.ll LLVM :: CodeGen/Mips/eh-dwarf-cfa.ll LLVM :: CodeGen/Mips/largeimmprinting.ll LLVM :: CodeGen/Mips/longbranch.ll LLVM :: CodeGen/Mips/mips64-f128.ll LLVM :: CodeGen/Mips/mips64directive.ll LLVM :: CodeGen/Mips/mips64ext.ll LLVM :: CodeGen/Mips/mips64fpldst.ll LLVM :: CodeGen/Mips/mips64intldst.ll LLVM :: CodeGen/Mips/mips64load-store-left-right.ll LLVM :: CodeGen/Mips/sint-fp-store_pattern.ll Reviewers: dsanders Reviewed By: dsanders CC: matheusalmeida Differential Revision: http://reviews.llvm.org/D3343 llvm-svn: 206183	2014-04-14 15:44:42 +00:00
Tim Northover	cb9c3cfb58	ARM64: remove buggy REV16 pattern. The 32-bit pattern is still valid: 0123 -> 3210 -> 1032. llvm-svn: 206172	2014-04-14 12:59:52 +00:00
Tim Northover	b6abe806c7	AArch64/ARM64: enable directcond.ll test on ARM64. Code change is because optimizeCompareInstr didn't know how to pull the condition code out of FCSEL instructions. llvm-svn: 206171	2014-04-14 12:51:06 +00:00
Tim Northover	0d7bd4f444	ARM64: add patterns for csXYZ with reversed operands. AArch64 tests for this, and it's obviously a good idea. Have to invert the condition code, of course. llvm-svn: 206170	2014-04-14 12:51:02 +00:00
Tim Northover	2f48303436	ARM64: add support for AArch64's addsub_ext.ll There was one definite issue in ARM64 (the off-by-1 check for whether a shift could be folded in) and one difference that is probably correct: ARM64 didn't fold nodes with multiple uses into the arithmetic operations unless optimising for code size. llvm-svn: 206168	2014-04-14 12:50:50 +00:00
Tim Northover	23b1f08282	ARM64: optimise (cmp x, (sub 0, y)) to (cmn x, y). This transformation is only valid when being used for an EQ or NE comparison since the flags change otherwise. llvm-svn: 206167	2014-04-14 12:50:47 +00:00
Richard Osborne	da16ff47cd	[XCore] Don't create invalid MKMSK instructions inside loadImmediate(). Summary: Previously loadImmediate() would produce MKMSK instructions with invalid immediate values such as mkmsk r0, 9. Fix this by checking the mask size is valid. Reviewers: robertlytton Reviewed By: robertlytton CC: llvm-commits Differential Revision: http://reviews.llvm.org/D3289 llvm-svn: 206163	2014-04-14 12:30:35 +00:00
Hal Finkel	0192cbac66	[PowerPC] [Constant Hoisting] Enable constant hoisting on PPC Implements the various TTI functions to enable constant hoisting on PPC. The only significant test-suite change is this: MultiSource/Benchmarks/VersaBench/bmm/bmm - 20% speedup (which essentially reverses the slowdown from r206120). llvm-svn: 206141	2014-04-13 23:02:40 +00:00
Hal Finkel	d9963c75da	[PowerPC] Fix rlwimi isel when mask is not constant We had been using the known-zero values of the operand of the or to construct the mask for an rlwimi; this is not quite correct, but fine when the mask is constant. When the mask is constant, then the known zeros of the operand must be a superset of the zeros in the mask. However, when the mask is not a constant, then there might be bits in the operand that are not known to be zero that, at runtime, might be zero in the mask. Therefore, we check that any bits not known to be zero are known to be one in the mask. Otherwise, we can't fold the mask with the or and shift. This was revealed as a miscompile of MultiSource/Benchmarks/BitBench/drop3/drop3 when I started experimenting with constant hoisting. llvm-svn: 206136	2014-04-13 17:10:58 +00:00
David Blaikie	269e0fb2e4	Fix instruction debug info location during legalization I found this from a particular GDB test suite case of inlining (something similar is provided as a test case) but came across a few other related cases (other callers of the same functions, and one other instance of the same coding mistake in a separate function). I'm not sure what the best way to test this is (let alone to cover the other cases I discovered), so hopefully this suffices - open to ideas. llvm-svn: 206130	2014-04-13 06:39:55 +00:00
Lang Hames	0563ca1be8	[X86] unique_ptr'ify one of X86GenericDisassembler's members. llvm-svn: 206127	2014-04-13 04:09:16 +00:00
Hal Finkel	34974ed503	[PowerPC] Implement some additional TLI callbacks Add implementations of: bool isLegalICmpImmediate(int64_t Imm) const bool isLegalAddImmediate(int64_t Imm) const bool isTruncateFree(Type Ty1, Type Ty2) const bool isTruncateFree(EVT VT1, EVT VT2) const bool shouldConvertConstantLoadToIntImm(const APInt &Imm, Type *Ty) const Unfortunately, this regresses counter-register-based loop formation because some of the loops now end up in forms where SE cannot compute loop counts. However, nevertheless, the test-suite results favor committing: SingleSource/Benchmarks/BenchmarkGame/puzzle: 26% speedup MultiSource/Benchmarks/FreeBench/analyzer/analyzer: 21% speedup MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan: 20% speedup SingleSource/Benchmarks/Polybench/linear-algebra/kernels/trisolv/trisolv: 19% speedup SingleSource/Benchmarks/Polybench/linear-algebra/kernels/gesummv/gesummv: 15% speedup MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2: 2% speedup MultiSource/Benchmarks/VersaBench/bmm/bmm: 26% slowdown llvm-svn: 206120	2014-04-12 21:52:38 +00:00
Benjamin Kramer	44a53da346	Spell the specialization namespace correctly. Not sure why clang didn't diagnose this (GCC does). llvm-svn: 206117	2014-04-12 18:45:24 +00:00
Benjamin Kramer	30120c0626	Make helper static and place random global into the llvm namespace. llvm-svn: 206116	2014-04-12 18:39:57 +00:00
Benjamin Kramer	502b9e1d7f	Retire llvm::array_endof in favor of non-member std::end. While there make array_lengthof constexpr if we have support for it. llvm-svn: 206112	2014-04-12 16:15:53 +00:00
Juergen Ributzka	cf03068d91	[ARM64] Never hoist the shift value of a shift instruction. There is no need to check if we want to hoist the immediate value of a shift instruction. Simply return TCC_Free right away. llvm-svn: 206101	2014-04-12 02:53:51 +00:00
Juergen Ributzka	6e17aa45a3	[ARM64] Fix the cost model for cheap large constants. Originally the cost model would give up for large constants and just return the maximum cost. This is not what we want for constant hoisting, because some of these constants are large in bitwidth, but are still cheap to materialize. This commit fixes the cost model to either return TCC_Free if the cost cannot be determined, or accurately calculate the cost even for large constants (bitwidth > 128). This fixes <rdar://problem/16591573>. llvm-svn: 206100	2014-04-12 02:36:28 +00:00
Jim Grosbach	48551fbdba	X86: Remove TargetMachine CPU auto-detection. This logic is properly in the realm of whatever is creating the TargetMachine. This makes plain 'llc foo.ll' consistent across heterogenous machines. llvm-svn: 206094	2014-04-12 01:34:29 +00:00
Chad Rosier	4ec124bc3e	[AArch64] Implement the isLegalAddressingMode and getScalingFactorCost APIs. llvm-svn: 206089	2014-04-12 00:14:23 +00:00
Louis Gerbarg	b9a0551862	Add ARM64 CLS patterns This patch adds patterns to generate the cls instruction on ARM64. Includes tests for 64 bit and 32 bit operands. rdar://15611957 llvm-svn: 206079	2014-04-11 22:27:58 +00:00
Matt Arsenault	e1f030ca66	R600: Check if a sextload should be used for parameter loads. Through some oddity where truncate (sextload x) isn't folded into an anyextload for vectors, the sextload remains if the vector isn't immediately scalarized. This keeps the expected zextload instructions in the kernel-args test when small type vectors aren't scalarized. llvm-svn: 206070	2014-04-11 20:59:54 +00:00
Lang Hames	95400e22f9	Remove redundant symbolization support from MCDisassembler interface. MCDisassembler has an MCSymbolizer member that is meant to take care of symbolizing during disassembly, but it also has several methods that enable the disassembler to do symbolization internally (i.e. without an attached symbolizer object). There is no need for this duplication, but ARM64 had been making use of it. This patch moves the ARM64 symbolization logic out of ARM64Disassembler and into an ARM64ExternalSymbolizer class, and removes the duplicated MCSymbolizer functionality from the MCDisassembler interface. Symbolization will now be done exclusively through MCSymbolizers. There should be no impact on disassembly for any platform, but this allows us to tidy up the MCDisassembler interface and simplify the process of (and invariants related to) disassembler setup. llvm-svn: 206063	2014-04-11 20:07:58 +00:00
Matt Arsenault	0cb92e133f	R600/SI: Refactor SOPC classes slightly. Better match what is done for VOPC to eventually prefer selecting these. llvm-svn: 206048	2014-04-11 19:25:18 +00:00
Matt Arsenault	9ec3cf2c8a	Move ExtractVectorElements to SelectionDAG. This seems generally useful, and makes sense to go along with SplitVector. llvm-svn: 206041	2014-04-11 17:47:30 +00:00
David Blaikie	ceec2bdaa5	Implement depth_first and inverse_depth_first range factory functions. Also updated as many loops as I could find using df_begin/idf_begin - strangely I found no uses of idf_begin. Is that just used out of tree? Also a few places couldn't use df_begin because either they used the member functions of the depth first iterators or had specific ordering constraints (I added a comment in the latter case). Based on a patch by Jim Grosbach. (Jim - you just had iterator_range<T> where you needed iterator_range<idf_iterator<T>>) llvm-svn: 206016	2014-04-11 01:50:01 +00:00
Jim Grosbach	f77265bfee	[ARM64,C++11] Range'ify use-lists iterators in address type promotion. llvm-svn: 206013	2014-04-11 01:13:10 +00:00
Jim Grosbach	8838d793b7	[ARM64,C++11]: Range'ify use-list iterators in DAGToDAG. llvm-svn: 206007	2014-04-11 00:27:22 +00:00
Jim Grosbach	d3249d0923	[ARM64,C++11]: More range-based loop simplification. llvm-svn: 206006	2014-04-11 00:27:19 +00:00
Reid Kleckner	9c6582129a	Move the segmented stack switch to a function attribute This removes the -segmented-stacks command line flag in favor of a per-function "split-stack" attribute. Patch by Luqman Aden and Alex Crichton! llvm-svn: 205997	2014-04-10 22:58:43 +00:00
Jim Grosbach	577e921344	[ARM64,C++11]: Range'ify loops in InstrInfo. llvm-svn: 205992	2014-04-10 22:00:18 +00:00
Jim Grosbach	8a0c50e5a9	[ARM64,C++11]: Range'ify loops in the conditional-compare pass. llvm-svn: 205988	2014-04-10 21:49:24 +00:00

28582 Commits