llvm-project

Commit Graph

Author	SHA1	Message	Date
Benjamin Kramer	feacdd39d5	[Hexagon] Make global arrays 'static const'. NFC. llvm-svn: 239475	2015-06-10 14:43:59 +00:00
Daniel Sanders	a73f1fdb19	Replace string GNU Triples with llvm::Triple in MCSubtargetInfo and create*MCSubtargetInfo(). NFC. Summary: This continues the patch series to eliminate StringRef forms of GNU triples from the internals of LLVM that began in r239036. Reviewers: rafael Reviewed By: rafael Subscribers: rafael, ted, jfb, llvm-commits, rengolin, jholewinski Differential Revision: http://reviews.llvm.org/D10311 llvm-svn: 239467	2015-06-10 12:11:26 +00:00
Daniel Sanders	9aa7e38bf8	Replace string GNU Triples with llvm::Triple in create*MCRelocationInfo(). NFC. Summary: This continues the patch series to eliminate StringRef forms of GNU triples from the internals of LLVM that began in r239036. Reviewers: rafael Reviewed By: rafael Subscribers: rafael, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10307 llvm-svn: 239465	2015-06-10 10:54:40 +00:00
Daniel Sanders	418caf5002	Replace string GNU Triples with llvm::Triple in MCAsmBackend subclasses and create*AsmBackend(). NFC. Summary: This continues the patch series to eliminate StringRef forms of GNU triples from the internals of LLVM that began in r239036. Reviewers: echristo, rafael Reviewed By: rafael Subscribers: rafael, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10243 llvm-svn: 239464	2015-06-10 10:35:34 +00:00
Elena Demikhovsky	00c9ad5ec2	AVX-512: Fixed a bug in comparison of i1 vectors. cmp eq should give kxnor instruction cmp neq should give kxor https://llvm.org/bugs/show_bug.cgi?id=23631 llvm-svn: 239460	2015-06-10 06:49:28 +00:00
Craig Topper	8e29d71623	Remove unnecessary conversion from StringRef to std::string and back to StringRef. NFC. llvm-svn: 239455	2015-06-10 02:07:37 +00:00
Reid Kleckner	673de15af9	[WinEH] Call llvm.stackrestore in __except blocks We have to do this manually, the runtime only sets up ebp. Fixes a crash when returning after catching an exception. llvm-svn: 239451	2015-06-10 01:34:54 +00:00
Reid Kleckner	2bc93ca846	[WinEH] Emit .safeseh directives for all 32-bit exception handlers Use a "safeseh" string attribute to do this. You would think we could just accumulate the set of personalities like we do on dwarf, but this fails to account for the LSDA-loading thunks we use for __CxxFrameHandler3. Each of those needs to make it into .sxdata as well. The string attribute seemed like the most straightforward approach. llvm-svn: 239448	2015-06-10 01:02:30 +00:00
Peter Collingbourne	9fe51fdf18	Move dllimport name mangling to IR mangler. This ensures that LTO clients see the correct external symbol name. Differential Revision: http://reviews.llvm.org/D10318 llvm-svn: 239437	2015-06-09 22:09:53 +00:00
Jingyue Wu	75589ffcc2	[NVPTX] fix a crash bug in NVPTXFavorNonGenericAddrSpaces Summary: We used to assume V->RAUW only modifies the operand list of V's user. However, if V and V's user are Constants, RAUW may replace and invalidate V's user entirely. This patch fixes the above issue by letting the caller replace the operand instead of calling RAUW on Constants. Test Plan: @nested_const_expr and @rauw in access-non-generic.ll Reviewers: broune, jholewinski Reviewed By: broune, jholewinski Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D10345 llvm-svn: 239435	2015-06-09 21:50:32 +00:00
Reid Kleckner	f12c030f48	[WinEH] Add 32-bit SEH state table emission prototype This gets all the handler info through to the asm printer and we can look at the .xdata tables now. I've convinced one small catch-all test case to work, but other than that, it would be a stretch to say this is functional. The state numbering algorithm avoids doing any scope reconstruction as we do for C++ to simplify the implementation. llvm-svn: 239433	2015-06-09 21:42:19 +00:00
Chad Rosier	cf90acc104	[AArch64] Remove an overly conservative check when generating store pairs. Store instructions do not modify register values and therefore it's safe to form a store pair even if the source register has been read in between the two store instructions. Previously, the read of w1 (see below) prevented the formation of a stp: str w0, [x2]; ldr w8, [x2, #8]; add w0, w8, w1; str w1, [x2, #4]; ret. We now generate the following code: stp w0, w1, [x2]; ldr w8, [x2, #8]; add w0, w8, w1; ret. All correctness tests with -Ofast on A57 with Spec200x and EEMBC pass. Performance results for SPEC2K were within noise. llvm-svn: 239432	2015-06-09 20:59:41 +00:00
Akira Hatanaka	d9699bc7bd	Remove DisableTailCalls from TargetOptions and the code in resetTargetOptions that was resetting it. Remove the uses of DisableTailCalls in subclasses of TargetLowering and use the value of function attribute "disable-tail-calls" instead. Also, unconditionally add pass TailCallElim to the pipeline and check the function attribute at the start of runOnFunction to disable the pass on a per-function basis. This is part of the work to remove TargetMachine::resetTargetOptions, and since DisableTailCalls was the last non-fast-math option that was being reset in that function, we should be able to remove the function entirely after the work to propagate IR-level fast-math flags to DAG nodes is completed. Out-of-tree users should remove the uses of DisableTailCalls and make changes to attach attribute "disable-tail-calls"="true" or "false" to the functions in the IR. rdar://problem/13752163 Differential Revision: http://reviews.llvm.org/D10099 llvm-svn: 239427	2015-06-09 19:07:19 +00:00
Samuel Antao	cd50135a29	The constant initialization for globals in NVPTX is generated as an array of bytes. The generation of these byte arrays was expecting the host to be little endian, which prevents big endian hosts from being used in the generation of the PTX code. This patch fixes the problem by changing the way the bytes are extracted so that it works for both little and big endian hosts. llvm-svn: 239412	2015-06-09 16:29:34 +00:00
Toma Tabacu	465acfd13c	Recommit "[mips] [IAS] Restore STI.FeatureBits in .set pop." (r239144). Specified the llvm namespace for the 2 calls to make_unique() which caused compilation errors in Visual Studio 2013. llvm-svn: 239405	2015-06-09 13:33:26 +00:00
Elena Demikhovsky	6b62b659cb	X86-MPX: Implemented encoding for MPX instructions. Added encoding tests. llvm-svn: 239403	2015-06-09 13:02:10 +00:00
Aaron Ballman	3182ee92ba	Removing spurious semicolons; NFC. llvm-svn: 239399	2015-06-09 12:03:46 +00:00
Toma Tabacu	7977cfd52a	Revert "[mips] [IAS] Add support for BNE and BEQ with an immediate operand." (r239396). It was breaking buildbots. llvm-svn: 239397	2015-06-09 10:43:49 +00:00
Toma Tabacu	5fa8fb5762	[mips] [IAS] Add support for BNE and BEQ with an immediate operand. Summary: For some branches, GAS accepts an immediate instead of the 2nd register operand. We only implement this for BNE and BEQ for now. Other branch instructions can be added later, if needed. Reviewers: dsanders Reviewed By: dsanders Subscribers: seanbruno, emaste, llvm-commits Differential Revision: http://reviews.llvm.org/D9666 llvm-svn: 239396	2015-06-09 10:34:31 +00:00
Daniel Sanders	329fc9b68a	[nvptx] Only support the 'm' inline assembly memory constraint. NFC. Summary: NVPTX doesn't seem to support any additional constraints. Therefore remove the target hook. No functional change intended. Reviewers: jholewinski Reviewed By: jholewinski Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D8209 llvm-svn: 239395	2015-06-09 10:34:05 +00:00
Matt Arsenault	5881f4e1e4	R600: Switch to using generic min / max nodes. llvm-svn: 239377	2015-06-09 00:52:37 +00:00
Matt Arsenault	8b643559d4	MC: Add target hook to control symbol quoting llvm-svn: 239370	2015-06-09 00:31:39 +00:00
Jingyue Wu	2e4d1dd0ed	[NVPTX] run SROA after NVPTXFavorNonGenericAddrSpaces Summary: This cleans up most allocas NVPTXLowerKernelArgs emits for byval parameters. Test Plan: makes bug21465.ll stronger to verify no redundant local load/store. Reviewers: eliben, jholewinski Reviewed By: eliben, jholewinski Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D10322 llvm-svn: 239368	2015-06-09 00:05:56 +00:00
Reid Kleckner	b7403336ce	[WinEH] Cache declarations of frame intrinsics llvm-svn: 239361	2015-06-08 22:43:32 +00:00
Reid Kleckner	218a9593db	Fix clang-cl self-host -Wc++11-narrowing bug Use unsigned as the underlying storage type of the AMDGPU address space enum. llvm-svn: 239355	2015-06-08 21:57:57 +00:00
Ranjeet Singh	10511a493e	[AArch64] AsmParser should be case insensitive about accepting vector register names. Differential Revision: http://reviews.llvm.org/D10320 llvm-svn: 239353	2015-06-08 21:32:16 +00:00
Keno Fischer	e70b31fc1b	[InstrInfo] Refactor foldOperandImpl to thread through InsertPt. NFC Summary: This was a longstanding FIXME and is a necessary precursor to cases where foldOperandImpl may have to create more than one instruction (e.g. to constrain a register class). This is the split out NFC changes from D6262. Reviewers: pete, ributzka, uweigand, mcrosier Reviewed By: mcrosier Subscribers: mcrosier, ted, llvm-commits Differential Revision: http://reviews.llvm.org/D10174 llvm-svn: 239336	2015-06-08 20:09:58 +00:00
Akira Hatanaka	4a61619ff5	[ARM] Pass a callback to FunctionPass constructors to enable skipping execution on a per-function basis. Previously some of the passes were conditionally added to ARM's pass pipeline based on the target machine's subtarget. This patch makes changes to add those passes unconditionally and execute them conditionally based on the predicate functor passed to the pass constructors. This enables running different sets of passes for different functions in the module. rdar://problem/20542263 Differential Revision: http://reviews.llvm.org/D8717 llvm-svn: 239325	2015-06-08 18:50:43 +00:00
Pete Cooper	4915dd076f	Remove includes of MCMachOSymbolFlags.h after it was deleted llvm-svn: 239318	2015-06-08 17:25:57 +00:00
Matthias Braun	6f8db0e1a7	X86: Reject register operands with obvious type mismatches. While we have some code to transform specification like {ax} into {eax}/{rax} if the operand type isn't 16bit, we should reject cases where there is no sane way to do this, like the i128 type in the example. Related to rdar://21042280 Differential Revision: http://reviews.llvm.org/D10260 llvm-svn: 239309	2015-06-08 16:56:23 +00:00
Colin LeMahieu	6aca6f0be5	[Hexagon] Adding functionality for searching for compound instruction pairs. Compound instructions reduce slot resource requirements freeing those packet slots up for more instructions. llvm-svn: 239307	2015-06-08 16:34:47 +00:00
Javed Absar	e1c7dc3ee2	[ARM] Add support for MMFR4_EL1 in assembler This patch adds support for system register MMFR4_EL1 (memory model feature register) in the assembler. This register provides information about the implemented memory model and memory management support. llvm-svn: 239302	2015-06-08 15:01:11 +00:00
Igor Breger	00d9f8457b	AVX-512: Implemented 256/128bit VALIGND/Q instructions for SKX and KNL Implemented DAG lowering for all these forms. Added tests for DAG lowering and encoding. Differential Revision: http://reviews.llvm.org/D10310 llvm-svn: 239300	2015-06-08 14:03:17 +00:00
Simon Pilgrim	3a7718038d	[X86] Added BitScanForward/BitScanReverse memory folding + tests llvm-svn: 239257	2015-06-07 18:34:25 +00:00
Rafael Espindola	f3d49b30b5	Handle 16 bit PC relative relocations. Fixes pr23771. llvm-svn: 239214	2015-06-06 02:29:56 +00:00
Peter Collingbourne	6679fc1a79	Revert r238473, "Thumb2: Modify codegen for memcpy intrinsic to prefer LDM/STM." as it caused miscompilations and assertion failures (PR23768, http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150601/280380.html). llvm-svn: 239169	2015-06-05 18:01:28 +00:00
Alexei Starovoitov	8cf9a4c472	[bpf] rename triple names bpf_be -> bpfeb llvm-svn: 239162	2015-06-05 16:11:14 +00:00
Colin LeMahieu	be8c453d58	[Hexagon] Reapply r239097 with tests corrected for shuffling and duplexing. llvm-svn: 239161	2015-06-05 16:00:11 +00:00
Benjamin Kramer	113b2a943f	[ARM] Make helper function static. This one had a declaration but it differed from the definition so the declaration was actually dead. llvm-svn: 239157	2015-06-05 14:32:54 +00:00
John Brawn	985c04e8fa	[ARM] Add support for -sp- FPUs and FPU none to TargetParser These are added mainly for the benefit of clang, but this also means that they are now allowed in .fpu directives and we emit the correct .fpu directive when single-precision-only is used. Differential Revision: http://reviews.llvm.org/D10238 llvm-svn: 239151	2015-06-05 13:31:19 +00:00
John Brawn	d03d22922d	[ARM] Add knowledge of FPU subtarget features to TargetParser Add getFPUFeatures to TargetParser, which gets the list of subtarget features that are enabled/disabled for each FPU, and use it when handling the .fpu directive. No functional change in this commit, though clang will start behaving differently once it starts using this. Differential Revision: http://reviews.llvm.org/D10237 llvm-svn: 239150	2015-06-05 13:29:24 +00:00
Toma Tabacu	399a56d771	Revert "[mips] [IAS] Restore STI.FeatureBits in .set pop." (r239144). This is breaking the Windows buildbots. llvm-svn: 239145	2015-06-05 12:19:27 +00:00
Toma Tabacu	89ebf88ff3	[mips] [IAS] Restore STI.FeatureBits in .set pop. Summary: Only restoring AvailableFeatures is not enough and will lead to buggy behaviour. For example, if we have a feature enabled and we ".set pop", the next time we try to ".set" that feature nothing will happen because the "!(STI.getFeatureBits()[Feature])" check will be false, because we didn't restore STI.FeatureBits. In order to fix this, we need to make MipsAssemblerOptions remember the STI.FeatureBits instead of the AvailableFeatures and then regenerate AvailableFeatures each time we ".set pop". This is because, AFAIK, there is no way to convert from AvailableFeatures back to STI.FeatureBits, but the reverse is possible by using ComputeAvailableFeatures(STI.FeatureBits). I also moved the updating of AssemblerOptions inside the "if" statement in setFeatureBits() and clearFeatureBits(), as there is no reason to update if nothing changes. Reviewers: dsanders, mkuper Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9156 llvm-svn: 239144	2015-06-05 11:48:54 +00:00
Jim Grosbach	56ed0bb111	MC: Clean up the naming for MCMachObjectWriter. NFC. s/ExecutePostLayoutBinding/executePostLayoutBinding/ s/ComputeSymbolTable/computeSymbolTable/ s/BindIndirectSymbols/bindIndirectSymbols/ s/RecordTLVPRelocation/recordTLVPRelocation/ s/RecordScatteredRelocation/recordScatteredRelocation/ s/WriteLinkerOptionsLoadCommand/writeLinkerOptionsLoadCommand/ s/WriteLinkeditLoadCommand/writeLinkeditLoadCommand/ s/WriteNlist/writeNlist/ s/WriteDysymtabLoadCommand/writeDysymtabLoadCommand/ s/WriteSymtabLoadCommand/writeSymtabLoadCommand/ s/WriteSection/writeSection/ s/WriteSegmentLoadCommand/writeSegmentLoadCommand/ s/WriteHeader/writeHeader/ llvm-svn: 239119	2015-06-04 23:25:54 +00:00
Charles Davis	da280728b6	[Target/X86] Don't use callee-saved registers in a Win64 tail call on non-Windows. Summary: A small bit that I missed when I updated the X86 backend to account for the Win64 calling convention on non-Windows. Now we don't use dead non-volatile registers when emitting a Win64 indirect tail call on non-Windows. Should fix PR23710. Test Plan: Added test for the correct behavior based on the case I posted to PR23710. Reviewers: rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10258 llvm-svn: 239111	2015-06-04 22:50:05 +00:00
Jim Grosbach	36e60e9127	MC: Clean up naming in MCObjectWriter. NFC. s/WriteObject/writeObject/ s/RecordRelocation/recordRelocation/ s/IsSymbolRefDifferenceFullyResolved/isSymbolRefDifferenceFullyResolved/ s/Write8/write8/ s/WriteLE16/writeLE16/ s/WriteLE32/writeLE32/ s/WriteLE64/writeLE64/ s/WriteBE16/writeBE16/ s/WriteBE32/writeBE32/ s/WriteBE64/writeBE64/ s/Write16/write16/ s/Write32/write32/ s/Write64/write64/ s/WriteZeroes/writeZeroes/ s/WriteBytes/writeBytes/ llvm-svn: 239108	2015-06-04 22:24:41 +00:00
Colin LeMahieu	c40be85adc	Revert r239095 incorrect test tree. llvm-svn: 239102	2015-06-04 21:32:42 +00:00
Jingyue Wu	a2f6027a31	[NVPTX] roll forward r239082 NVPTXISelDAGToDAG translates "addrspacecast to param" to NVPTX::nvvm_ptr_gen_to_param Added an llc test in bug21465. llvm-svn: 239100	2015-06-04 21:28:26 +00:00
Colin LeMahieu	f99fe00afc	[Hexagon] Removing unused variable. llvm-svn: 239097	2015-06-04 21:22:12 +00:00
Colin LeMahieu	fc52c11d80	[Hexagon] Adding functionality for duplexing. Duplexing is a way to compress commonly used pairs of instructions in order to reduce code size. The test case duplex.ll normally would be 8 bytes, assign register to 0 and jump to link register. After duplexing this is only 4 bytes. This also tests the HexagonMCShuffler code path which is used to make sure duplexed instructions still follow slot requirements. llvm-svn: 239095	2015-06-04 21:16:16 +00:00
Jingyue Wu	b8f38668d5	Revert r239082 llc crashed for NVPTX backend llvm-svn: 239094	2015-06-04 21:07:08 +00:00
Ahmed Bougacha	8207641251	[GlobalMerge] Take into account minsize on Global users' parents. Now that we can look at users, we can trivially do this: when we would have otherwise disabled GlobalMerge (currently -O<3), we can just run it for minsize functions, as it's usually a codesize win. Differential Revision: http://reviews.llvm.org/D10054 llvm-svn: 239087	2015-06-04 20:39:23 +00:00
Jim Grosbach	7c76b4cc6e	MC: Remove obsolete MachO UseAggressiveSymbolFolding. Fix the FIXME and remove this old as(1) compat option. It was useful for bringup of the integrated assembler to diff object files, but now it's just causing more relocations than strictly necessary to be generated. rdar://21201804 llvm-svn: 239084	2015-06-04 20:27:42 +00:00
Jingyue Wu	f3a8079b75	[NVPTX] kernel pointer arguments point to the global address space Summary: With this patch, NVPTXLowerKernelArgs converts a kernel pointer argument to a pointer in the global address space. This change, along with NVPTXFavorNonGenericAddrSpaces, allows the NVPTX backend to emit ld.global.* and st.global.* for accessing kernel pointer arguments. Minor changes: 1. refactor: extract function convertToPointerInAddrSpace 2. fix a bug in the test case in bug21465.ll Test Plan: lower-kernel-ptr-arg.ll Reviewers: eliben, meheff, jholewinski Reviewed By: jholewinski Subscribers: wengxt, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D10154 llvm-svn: 239082	2015-06-04 20:19:38 +00:00
Alexei Starovoitov	310deada10	[bpf] add big- and host- endian support Summary: -march=bpf -> host endian -march=bpf_le -> little endian -march=bpf_be -> big endian Test Plan: v1 was tested by IBM s390 guys and appears to be working there. It bit rots too fast here. Reviewers: chandlerc, tstellarAMD Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10177 llvm-svn: 239071	2015-06-04 19:15:05 +00:00
Matt Arsenault	73e06fa262	R600/SI: Reimplement isLegalAddressingMode Now that we sometimes know the address space, this can theoretically do a better job. This needs better test coverage, but this mostly depends on first updating the loop optimizations to provide the address space. llvm-svn: 239053	2015-06-04 16:17:42 +00:00
Matt Arsenault	81c7ae2bf5	R600/SI: Fix some cases for load / store of half Mostly argument loads were producing broken zextloads from an FP type. llvm-svn: 239049	2015-06-04 16:00:27 +00:00
Benjamin Kramer	50e2a29385	Replace custom fixed endian to raw_ostream emission with EndianStream. Less code, clearer and more efficient. No functionality change intended. llvm-svn: 239040	2015-06-04 15:03:02 +00:00
Daniel Sanders	7813ae879e	Replace string GNU Triples with llvm::Triple in MCAsmInfo subclasses and create*AsmInfo(). NFC. Summary: This is the first of several patches to eliminate StringRef forms of GNU triples from the internals of LLVM. After this is complete, GNU triples will be replaced by a more authoritative representation in the form of an LLVM TargetTuple. Reviewers: rengolin Reviewed By: rengolin Subscribers: ted, llvm-commits, rengolin, jholewinski Differential Revision: http://reviews.llvm.org/D10236 llvm-svn: 239036	2015-06-04 13:12:25 +00:00
Elena Demikhovsky	2f1a0dabd0	AVX-512: I brought back vector-shuffle-512-v8.ll test. I re-generated it after all AVX-512 shuffle optimizations. llvm-svn: 239026	2015-06-04 07:49:56 +00:00
Elena Demikhovsky	4078c75bd4	AVX-512: added all SKX forms of VPERMW/D/Q instructions. Added all forms of VPERMPS/PD instructions. Added encoding tests. llvm-svn: 239016	2015-06-04 07:07:13 +00:00
Elena Demikhovsky	214335d703	Removed {}, NFC. llvm-svn: 239014	2015-06-04 07:01:29 +00:00
Rafael Espindola	8c006ee385	Bring back r239006 with a fix. The fix is just that getOther had not been updated for packing the st_other values in fewer bits and could return spurious values: - unsigned Other = (getFlags() & (0x3f << ELF_STO_Shift)) >> ELF_STO_Shift; + unsigned Other = (getFlags() & (0x7 << ELF_STO_Shift)) >> ELF_STO_Shift; Original message: Pack the MCSymbolELF bit fields into MCSymbol's Flags. This reduces MCSymbolELF from 64 bytes to 56 bytes on x86_64. While at it, also make getOther/setOther easier to use by accepting unshifted STO_* values. llvm-svn: 239012	2015-06-04 05:59:23 +00:00
Rafael Espindola	a86ecee52b	Revert "Pack the MCSymbolELF bit fields into MCSymbol's Flags." This reverts commit r239006. I am debugging the powerpc failures. llvm-svn: 239010	2015-06-04 05:00:12 +00:00
Rafael Espindola	d31203ae21	Pack the MCSymbolELF bit fields into MCSymbol's Flags. This reduces MCSymbolELF from 64 bytes to 56 bytes on x86_64. While at it, also make getOther/setOther easier to use by accepting unshifted STO_* values. llvm-svn: 239006	2015-06-04 02:32:20 +00:00
Sanjay Patel	667a7e2a0f	make reciprocal estimate code generation more flexible by adding command-line options (3rd try) The first try (r238051) to land this was reverted due to ExecutionEngine build failure; that was hopefully addressed by r238788. The second try (r238842) to land this was reverted due to BUILD_SHARED_LIBS failure; that was hopefully addressed by r238953. This patch adds a TargetRecip class for processing many recip codegen possibilities. The class is intended to handle both command-line options to llc as well as options passed in from a front-end such as clang with the -mrecip option. The x86 backend is updated to use the new functionality. Only -mcpu=btver2 with -ffast-math should see a functional change from this patch. All other x86 CPUs continue to not use reciprocal estimates by default with -ffast-math. Differential Revision: http://reviews.llvm.org/D8982 llvm-svn: 239001	2015-06-04 01:32:35 +00:00
Tom Stellard	1ba52feb96	R600: Re-enable sub-reg liveness The bug in the R600 backend that this uncovered has been fixed. llvm-svn: 238999	2015-06-04 01:20:04 +00:00
Rafael Espindola	f8794ff29d	Remove MCELFSymbolFlags.h. It is now internal to MCSymbolELF. llvm-svn: 238996	2015-06-04 00:47:43 +00:00
Rafael Espindola	c73aed1cb3	Remove getOrCreateSymbolData. There is no MCSymbolData anymore. llvm-svn: 238952	2015-06-03 19:03:11 +00:00
Colin LeMahieu	1ce7a11c9c	[Hexagon] Test doesn't work on all platforms. At any rate the uninitialized variable issue was fixed. Removing re-registering ASM backend. llvm-svn: 238949	2015-06-03 18:00:45 +00:00
Colin LeMahieu	a675077310	[Hexagon] Reapply 238772 OSABI was not correctly set, added empty_elf test to make sure it is. llvm-svn: 238947	2015-06-03 17:34:16 +00:00
Matthias Braun	125c9f5f7b	ARM: Thumb2 LDRD/STRD supports independent input/output regs The existing code would unnecessarily break LDRD/STRD apart with non-adjacent registers, on thumb2 this is not necessary. Ideally on thumb2 we shouldn't match for ldrd/strd pre-regalloc anymore as there is no reason to set register hints anymore, changing that is something for a future patch however. Differential Revision: http://reviews.llvm.org/D9694 Recommitting after the revert in r238821, the buildbot still failed with the patch removed so there seems to be another reason for the breakage. llvm-svn: 238935	2015-06-03 16:30:24 +00:00
Daniel Sanders	43a79bf694	[arm] Fix r238921. We must handle Constraint_i too. llvm-svn: 238925	2015-06-03 14:17:18 +00:00
Asaf Badouh	402ebb34af	re-apply 238809 AVX-512: Implemented GETEXP instruction for KNL and SKX Added rounding mode modifier for SQRTPS/PD Added tests for encoding and intrinsics. CR: http://reviews.llvm.org/D9991 llvm-svn: 238923	2015-06-03 13:41:48 +00:00
Daniel Sanders	1f58ef71ea	[arm] Distinguish the /U[qytnms]/, 'Uv', 'Q', and 'm' inline assembly memory constraints. Summary: But still handle them the same way since I don't know how they differ on this target. Of these, /U[qytnms]/ do not have backend tests but are accepted by clang. No functional change intended. Reviewers: t.p.northover Reviewed By: t.p.northover Subscribers: t.p.northover, aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D8203 llvm-svn: 238921	2015-06-03 12:33:56 +00:00
Elena Demikhovsky	86224fe468	AVX-512: More code improvements in shuffles, NFC llvm-svn: 238919	2015-06-03 12:05:03 +00:00
Elena Demikhovsky	21de893377	AVX-512: VSHUFPD instruction selection - code improvements llvm-svn: 238918	2015-06-03 11:21:01 +00:00
Elena Demikhovsky	9e38086534	AVX-512: Implemented SHUFF32x4/SHUFF64x2/SHUFI32x4/SHUFI64x2 instructions for SKX and KNL. Added tests for encoding. By Igor Breger (igor.breger@intel.com) llvm-svn: 238917	2015-06-03 10:56:40 +00:00
Elena Demikhovsky	f7e641cc2d	X86: Added MPX feature and bound registers. Intel® Memory Protection Extensions (Intel® MPX) is a new feature in Skylake. It is a part of KNL and SKX sets. It is also a part of Skylake client. I added definition of %bnd0 - %bnd3 registers, each register is a pair of 64-bit integers. llvm-svn: 238916	2015-06-03 10:30:57 +00:00
Simon Pilgrim	452252e6c8	[X86] Removed (unused) FSRL x86 operation This patch removes the old X86ISD::FSRL op - which allowed float vectors to use the byte right shift operations (causing a domain switch....). Since the refactoring of the shuffle lowering code this no longer has any use. Differential Revision: http://reviews.llvm.org/D10169 llvm-svn: 238906	2015-06-03 08:32:36 +00:00
Rafael Espindola	cf8beece97	Revert "make reciprocal estimate code generation more flexible by adding command-line options (2nd try)" This reverts commit r238842. It broke -DBUILD_SHARED_LIBS=ON build. llvm-svn: 238900	2015-06-03 05:32:44 +00:00
Rafael Espindola	9aa3ab30a9	Avoid a call to getOrCreateSymbol when we already have the symbol. llvm-svn: 238890	2015-06-03 00:02:40 +00:00
Rafael Espindola	0ccf9b71f3	Pass a MCSymbolELF to a few ELF only functions. NFC. llvm-svn: 238868	2015-06-02 21:30:13 +00:00
Rafael Espindola	95fb9b93ed	Merge MCELF.h into MCSymbolELF.h. Now that we have a dedicated type for ELF symbols, these helper functions can become member functions of MCSymbolELF. llvm-svn: 238864	2015-06-02 20:38:46 +00:00
Tim Northover	3f3a4d8503	AArch64: fix typo in SMIN far atomics and add tests llvm-svn: 238858	2015-06-02 18:37:20 +00:00
Benjamin Kramer	db220dbf02	Push constness through LoopInfo::isLoopHeader and clean it up a bit. NFC. llvm-svn: 238843	2015-06-02 15:28:27 +00:00
Sanjay Patel	6f031d848e	make reciprocal estimate code generation more flexible by adding command-line options (2nd try) The first try (r238051) to land this was reverted due to bot failures that were hopefully addressed by r238788. This patch adds a TargetRecip class for processing many recip codegen possibilities. The class is intended to handle both command-line options to llc as well as options passed in from a front-end such as clang with the -mrecip option. The x86 backend is updated to use the new functionality. Only -mcpu=btver2 with -ffast-math should see a functional change from this patch. All other x86 CPUs continue to not use reciprocal estimates by default with -ffast-math. Differential Revision: http://reviews.llvm.org/D8982 llvm-svn: 238842	2015-06-02 15:28:15 +00:00
Elena Demikhovsky	8938f5acca	AVX-512: Implemented VRANGESD and VRANGESS instructions for SKX Implemented DAG lowering for all these forms. Added tests for encoding. By Igor Breger (igor.breger@intel.com) llvm-svn: 238834	2015-06-02 14:12:54 +00:00
Elena Demikhovsky	44a129c533	AVX-512: Shorten implementation of lowerV16X32VectorShuffle() using lowerVectorShuffleWithSHUFPS() and other shuffle-helpers routines. Added matching of VALIGN instruction. llvm-svn: 238830	2015-06-02 13:43:18 +00:00
Vasileios Kalintiris	bb698c7d5f	[mips] Add support for dynamic stack realignment. Summary: With this change we are able to realign the stack dynamically, whenever it contains objects with alignment requirements that are larger than the alignment specified from the given ABI. We have to use the $fp register as the frame pointer when we perform dynamic stack realignment. In complex stack frames, with variably-sized objects, we reserve additionally the callee-saved register $s7 as the base pointer in order to reference locals. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8633 llvm-svn: 238829	2015-06-02 13:14:46 +00:00
Renato Golin	3a7bec86bd	Revert "ARM: Thumb2 LDRD/STRD supports independent input/output regs" This reverts commit r238795, as it broke the Thumb2 self-hosting buildbot. Since self-hosting issues with Clang are hard to investigate, I'm taking the liberty to revert now, so we can investigate it offline. llvm-svn: 238821	2015-06-02 11:47:30 +00:00
Vladimir Sukharev	5f6f60d942	[AArch64] Add v8.1a atomic instructions Patch by: Tom Coxon Reviewers: t.p.northover Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8501 llvm-svn: 238818	2015-06-02 10:58:41 +00:00
Toma Tabacu	2969650ecd	[mips] [IAS] Add support for the .set softfloat/hardfloat directives. Summary: These directives are used to set the current value of the SoftFloat feature. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits, mpf Differential Revision: http://reviews.llvm.org/D9074 llvm-svn: 238813	2015-06-02 09:48:04 +00:00
Elena Demikhovsky	3425c932da	AVX-512: Implemented VFIXUPIMMSD and VFIXUPIMMSS instructions for KNL Implemented DAG lowering for all these forms. Added tests for encoding. By Igor Breger (igor.breger@intel.com) llvm-svn: 238811	2015-06-02 08:28:57 +00:00
Asaf Badouh	8d897dd05f	revert 238809 llvm-svn: 238810	2015-06-02 07:45:19 +00:00
Asaf Badouh	17de10f37e	AVX-512: Implemented GETEXP instruction for KNL and SKX Added rounding mode modifier for SQRTPS/PD Added tests for encoding and intrinsics. llvm-svn: 238809	2015-06-02 07:18:14 +00:00
Rafael Espindola	a869576008	Create a MCSymbolELF. This creates a MCSymbolELF class and moves SymbolSize since only ELF needs a size expression. This reduces the size of MCSymbol from 56 to 48 bytes. llvm-svn: 238801	2015-06-02 00:25:12 +00:00
Matthias Braun	e20dc1cd3a	ARM: Thumb2 LDRD/STRD supports independent input/output regs The existing code would unnecessarily break LDRD/STRD apart with non-adjacent registers, on thumb2 this is not necessary. Ideally on thumb2 we shouldn't match for ldrd/strd pre-regalloc anymore as there is no reason to set register hints anymore, changing that is something for a future patch however. Differential Revision: http://reviews.llvm.org/D9694 llvm-svn: 238795	2015-06-01 23:27:08 +00:00
Matthias Braun	72b8f74813	AArch64: Use CMP;CCMP sequences for and/or/setcc trees. Previously CCMP/FCCMP instructions were only used by the AArch64ConditionalCompares pass for control flow. This patch uses them for SELECT like instructions as well by matching patterns in ISelLowering. PR20927, rdar://18326194 Differential Revision: http://reviews.llvm.org/D8232 llvm-svn: 238793	2015-06-01 22:31:17 +00:00
Alexei Starovoitov	dadc97767f	[bpf] fix build breakage due to r238634 Patch by Vijay Subramanian. llvm-svn: 238792	2015-06-01 22:24:36 +00:00
Matt Arsenault	a0269b6d20	R600/SI: Don't hardcode pointer type llvm-svn: 238789	2015-06-01 21:58:24 +00:00
Matthias Braun	ec50fa6f8c	ARMLoadStoreOptimizer: Fix doxygen comments; NFC llvm-svn: 238784	2015-06-01 21:26:23 +00:00
Rafael Espindola	b5815b4738	Revert "[Hexagon] Adding basic ELF relocation generation and testing advanced relaxation codepath." This reverts commit r238748. It broke the msan bot: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/4372/steps/check-llvm%20msan/logs/stdio llvm-svn: 238772	2015-06-01 19:20:47 +00:00
Vasileios Kalintiris	cbbf8e0a39	[mips][FastISel] Implement bswap. Summary: Implement bswap intrinsic for MIPS FastISel. It's very different for mips32 r1/r2. Based on a patch by Reed Kotler. Test Plan: bswap1.ll test-suite Reviewers: dsanders, rkotler Subscribers: llvm-commits, rfuhler Differential Revision: http://reviews.llvm.org/D7219 llvm-svn: 238760	2015-06-01 16:40:45 +00:00
Vasileios Kalintiris	bdb91b31f0	[mips][FastISel] Implement intrinsics memset, memcopy & memmove. Summary: Implement the intrinsics memset, memcopy and memmove in MIPS FastISel. Make some needed infrastructure fixes so that this can work. Based on a patch by Reed Kotler. Test Plan: memtest1.ll The patch passes test-suite for mips32 r1/r2 and at O0/O2 Reviewers: rkotler, dsanders Subscribers: llvm-commits, rfuhler Differential Revision: http://reviews.llvm.org/D7158 llvm-svn: 238759	2015-06-01 16:36:01 +00:00
Vasileios Kalintiris	8fcb3986d0	[mips][FastISel] Implement srem/urem and sdiv/udiv instructions. Summary: Implement the LLVM assembly urem/srem and sdiv/udiv instructions in MIPS FastISel. Based on a patch by Reed Kotler. Test Plan: srem1.ll div1.ll test-suite at O0/O2 for mips32 r1/r2 Reviewers: dsanders, rkotler Subscribers: llvm-commits, rfuhler Differential Revision: http://reviews.llvm.org/D7028 llvm-svn: 238757	2015-06-01 16:17:37 +00:00
Vasileios Kalintiris	127f894b55	[mips][FastISel] Implement the select statement for MIPS FastISel. Summary: Implement the LLVM IR select statement for MIPS FastISel. Based on a patch by Reed Kotler. Test Plan: "Make check" test included now. Passes test-suite at O2/O0 mips32 r1/r2. Reviewers: dsanders, rkotler Subscribers: llvm-commits, rfuhler Differential Revision: http://reviews.llvm.org/D6774 llvm-svn: 238756	2015-06-01 15:56:40 +00:00
Vasileios Kalintiris	7f680e156e	[mips][FastISel] Clobber HI0/LO0 registers in MUL instructions. Summary: The contents of the HI/LO registers are unpredictable after the execution of the MUL instruction. In addition to implicitly defining these registers in the MUL instruction definition, we have to mark those registers as dead too. Without this the fast register allocator is running out of registers when the MUL instruction is followed by another one that tries to allocate the AC0 register. Based on a patch by Reed Kotler. Reviewers: dsanders, rkotler Subscribers: llvm-commits, rfuhler Differential Revision: http://reviews.llvm.org/D9825 llvm-svn: 238755	2015-06-01 15:48:09 +00:00
Rafael Espindola	7f7caf9167	Fix relocation selection for foo-. on mips. This handles only the 32 bit case. llvm-svn: 238751	2015-06-01 15:10:51 +00:00
Rafael Espindola	ccb8d1a114	Simplify code, NFC. llvm-svn: 238750	2015-06-01 14:58:29 +00:00
Colin LeMahieu	a739a4b3c7	[Hexagon] Adding basic ELF relocation generation and testing advanced relaxation codepath. llvm-svn: 238748	2015-06-01 14:51:26 +00:00
Elena Demikhovsky	67afb630e1	AVX-512: Optimized vector shuffle for v16f32 and v16i32 types. llvm-svn: 238743	2015-06-01 13:26:18 +00:00
Luke Cheeseman	85fd06d389	Re-commit of r238201 with fix for building with shared libraries. llvm-svn: 238739	2015-06-01 12:02:47 +00:00
Elena Demikhovsky	3582eb3b39	AVX-512: Implemented VRANGEPD and VRANGEPS instructions for SKX. Implemented DAG lowering for all these forms. Added tests for encoding. By Igor Breger (igor.breger@intel.com) llvm-svn: 238738	2015-06-01 11:05:34 +00:00
Elena Demikhovsky	0c41088ebf	AVX-512: Implemented vector shuffle lowering for v8i64 and v8f64 types. I removed the vector-shuffle-512-v8.ll, it is auto-generated test, not valid any more. llvm-svn: 238735	2015-06-01 09:49:53 +00:00
Elena Demikhovsky	75ede68793	AVX-512: added all forms of VPSHUFD and VPSHUFHW, VPSHUFLW including encodings. llvm-svn: 238729	2015-06-01 07:17:23 +00:00
Elena Demikhovsky	42c96d9c0a	AVX-512: Implemented VFIXUPIMMPD and VFIXUPIMMPS instructions for KNL and SKX Implemented DAG lowering for all these forms. Added tests for encoding. by Igor Breger (igor.breger@intel.com) llvm-svn: 238728	2015-06-01 06:50:49 +00:00
Elena Demikhovsky	dd68d0cb0f	AVX-512: Fixed a bug in compress and expand intrinsics. By Igor Breger (igor.breger@intel.com) llvm-svn: 238724	2015-06-01 06:30:13 +00:00
Matt Arsenault	bd7d80a4a6	Add address space argument to isLegalAddressingMode This is important because of different addressing modes depending on the address space for GPU targets. This only adds the argument, and does not update any of the uses to provide the correct address space. llvm-svn: 238723	2015-06-01 05:31:59 +00:00
Rafael Espindola	5eb02e45e3	Simplify another function that doesn't fail. llvm-svn: 238703	2015-06-01 00:27:26 +00:00
NAKAMURA Takumi	072a58a7fd	ARMConstantIslandPass.cpp: Prune an empty \brief. [-Wdocumentation] llvm-svn: 238697	2015-05-31 23:05:35 +00:00
Colin LeMahieu	a97365b8e0	[Hexagon] Including raw_ostream for debug builds. llvm-svn: 238695	2015-05-31 22:29:33 +00:00
Colin LeMahieu	b819d3c465	[Hexagon] classes are actually structs. llvm-svn: 238694	2015-05-31 22:18:42 +00:00
Colin LeMahieu	b23c47bab3	[Hexagon] Adding MC packet shuffler. llvm-svn: 238692	2015-05-31 21:57:09 +00:00
Tim Northover	a603c4076c	ARM: recommit r237590: allow jump tables to be placed as constant islands. The original version didn't properly account for the base register being modified before the final jump, so caused miscompilations in Chromium and LLVM. I've fixed this and tested with an LLVM self-host (I don't have the means to build & test Chromium). The general idea remains the same: in pathological cases jump tables can be too far away from the instructions referencing them (like other constants) so they need to be movable. Should fix PR23627. llvm-svn: 238680	2015-05-31 19:22:07 +00:00
Colin LeMahieu	b510fb38f5	[Hexagon] Adding override specifier and removing erroneous assertion llvm-svn: 238664	2015-05-30 20:03:07 +00:00
Colin LeMahieu	86f218e7ec	[Hexagon] Adding basic relaxation functionality. llvm-svn: 238660	2015-05-30 18:55:47 +00:00
Simon Pilgrim	f19ef9f741	Stripped trailing whitespace. NFC. llvm-svn: 238654	2015-05-30 13:01:42 +00:00
Renato Golin	5d78c9ce58	Comment change. NFC That comment misleads the current discussions in mentioned bug. Leave the discussions to the bug. Also, adding a future change FIXME. llvm-svn: 238653	2015-05-30 10:44:07 +00:00
Chandler Carruth	cb58910ce8	[x86] Unify the horizontal adding used for popcount lowering taking the best approach of each. For vNi16, we use SHL + ADD + SRL pattern that seems easily the best. For vNi32, we use the PUNPCK + PSADBW + PACKUSWB pattern. In some cases there is a huge improvement with this in IACA's estimated throughput -- over 2x higher throughput!!!! -- but the measurements are too good to be true. In one narrow case, the SHL + ADD + SHL + ADD + SRL pattern looks slightly faster, but I'm not sure I believe any of the measurements at this point. Both are the exact same uops though. Hard to be confident of anything past that. If anyone wants to collect very detailed (Agner-level) timings with the result of this patch, or with the i32 case replaced with SHL + ADD + SHL + ADD + SRL, I'd be very interested. Note that you'll need to test it on both Ivybridge and Haswell, with both SSE3, SSSE3, and AVX selected as I saw unique behavior in each of these buckets with IACA all of which should be checked against measured performance. But this patch is still a useful improvement by dropping duplicate work and getting the much nicer PSADBW lowering for v2i64. I'd still like to rephrase this in terms of generic horizontal sum. It's a bit lame to have a special case of that just for popcount. llvm-svn: 238652	2015-05-30 10:35:03 +00:00
Renato Golin	230d298320	[ARMTargetParser] Move IAS arch ext parser. NFC The plan was to move the whole table into the already existing ArchExtNames but some fields depend on a table-generated file, and we don't yet have this feature in the generic lib/Support side. Until the minimum target-specific table-generated files are available in a generic fashion to these libraries, we'll have to keep it in the ASM parser. llvm-svn: 238651	2015-05-30 10:30:02 +00:00
Chandler Carruth	11e6f8fed1	[x86] Split out the horizontal byte sum lowering component of the LUT lowering into a helper function. NFC. llvm-svn: 238650	2015-05-30 09:46:16 +00:00
Chandler Carruth	9cc2516676	[x86] Replace the long spelling of getting a bitcast with the much shorter one. NFC. In addition to being much shorter to type and requiring fewer arguments, this change saves over 30 lines from this one file, all wasted on total boilerplate... llvm-svn: 238640	2015-05-30 04:23:13 +00:00
Chandler Carruth	060cdca996	[x86] Replace the long spelling of getting a bitcast with the new short spelling. NFC. llvm-svn: 238639	2015-05-30 04:19:57 +00:00
Chandler Carruth	502b23a7a9	[sdag] Add the helper I most want to the DAG -- building a bitcast around a value using its existing SDLoc. Start using this in just one function to save omg lines of code. llvm-svn: 238638	2015-05-30 04:14:10 +00:00
Chandler Carruth	2599da3cfd	[x86] Restore the bitcasts I removed when refactoring this to avoid shifting vectors of bytes as x86 doesn't have direct support for that. This removes a bunch of redundant masking in the generated code for SSE2 and SSE3. In order to avoid the really significant code size growth this would have triggered, I also factored the completely repetitive logic for shifting and masking into two lambdas which in turn makes all of this much easier to read IMO. llvm-svn: 238637	2015-05-30 04:05:11 +00:00
Chandler Carruth	6ba9730a4e	[x86] Implement a faster vector population count based on the PSHUFB in-register LUT technique. Summary: A description of this technique can be found here: http://wm.ite.pl/articles/sse-popcount.html The core of the idea is to use an in-register lookup table and the PSHUFB instruction to compute the population count for the low and high nibbles of each byte, and then to use horizontal sums to aggregate these into vector population counts with wider element types. On x86 there is an instruction that will directly compute the horizontal sum for the low 8 and high 8 bytes, giving vNi64 popcount very easily. Various tricks are used to get vNi32 and vNi16 from the vNi8 that the LUT computes. The base implementation of this, and most of the work, was done by Bruno in a follow up to D6531. See Bruno's detailed post there for lots of timing information about these changes. I have extended Bruno's patch in the following ways: 0) I committed the new tests with baseline sequences so this shows a diff, and regenerated the tests using the update scripts. 1) Bruno had noticed and mentioned in IRC a redundant mask that I removed. 2) I introduced a particular optimization for the i32 vector cases where we use PSHL + PSADBW to compute the low i32 popcounts, and PSHUFD + PSADBW to compute doubled high i32 popcounts. This takes advantage of the fact that to line up the high i32 popcounts we have to shift them anyways, and we can shift them by one fewer bit to effectively divide the count by two. While the PSHUFD based horizontal add is no faster, it doesn't require registers or load traffic the way a mask would, and provides more ILP as it happens on different ports with high throughput. 3) I did some code cleanups throughout to simplify the implementation logic.
4) I refactored it to continue to use the parallel bitmath lowering when SSSE3 is not available to preserve the performance of that version on SSE2 targets where it is still much better than scalarizing as we'll still do a bitmath implementation of popcount even in scalar code there. With #1 and #2 above, I analyzed the result in IACA for sandybridge, ivybridge, and haswell. In every case I measured, the throughput is the same or better using the LUT lowering, even v2i64 and v4i64, and even compared with using the native popcnt instruction! The latency of the LUT lowering is often higher than the latency of the scalarized popcnt instruction sequence, but I think those latency measurements are deeply misleading. Keeping the operation fully in the vector unit and having many chances for increased throughput seems much more likely to win. With this, we can lower every integer vector popcount implementation using the LUT strategy if we have SSSE3 or better (and thus have PSHUFB). I've updated the operation lowering to reflect this. This also fixes an issue where we were scalarizing horribly some AVX lowerings. Finally, there are some remaining cleanups. There is duplication between the two techniques in how they perform the horizontal sum once the byte population count is computed. I'm going to factor and merge those two in a separate follow-up commit. Differential Revision: http://reviews.llvm.org/D10084 llvm-svn: 238636	2015-05-30 03:20:59 +00:00
Chandler Carruth	c2e400de83	[x86] Restructure the parallel bitmath lowering of popcount into a separate routine, generalize it to work for all the integer vector sizes, and do general code cleanups. This dramatically improves lowerings of byte and short element vector popcount, but more importantly it will make the introduction of the LUT-approach much cleaner. The biggest cleanup I've done is to just force the legalizer to do the bitcasting we need. We run these iteratively now and it makes the code much simpler IMO. Other changes were minor, and mostly naming and splitting things up in a way that makes it more clear what is going on. The other significant change is to use a different final horizontal sum approach. This is the same number of instructions as the old method, but shifts left instead of right so that we can clear everything but the final sum with a single shift right. This seems likely better than a mask which will usually have to read the mask from memory. It is certainly fewer u-ops. Also, this will be temporary. This and the LUT approach share the need of horizontal adds to finish the computation, and we have more clever approaches than this one that I'll switch over to. llvm-svn: 238635	2015-05-30 03:20:55 +00:00
Jim Grosbach	13760bd152	MC: Clean up MCExpr naming. NFC. llvm-svn: 238634	2015-05-30 01:25:56 +00:00
Reid Kleckner	e6531a5588	[WinEH] Adjust the 32-bit SEH prologue to better match reality It turns out that _except_handler3 and _except_handler4 really use the same stack allocation layout, at least today. They just make different choices about encoding the LSDA. This is in preparation for lowering the llvm.eh.exceptioninfo(). llvm-svn: 238627	2015-05-29 22:57:46 +00:00
Reid Kleckner	173a72524f	Disable FP elimination in funcs using 32-bit MSVC EH personalities The value in 'ebp' acts as an implicit argument to the outlined handlers, and is recovered with frameaddress(1). llvm-svn: 238619	2015-05-29 21:58:11 +00:00
Rafael Espindola	4d37b2a259	Remove getData. This completes the mechanical part of merging MCSymbol and MCSymbolData. llvm-svn: 238617	2015-05-29 21:45:01 +00:00
Reid Kleckner	5b8ebfbc25	Only add the EH state insertion pass on 32-bit Windows llvm-svn: 238612	2015-05-29 20:43:10 +00:00
Rafael Espindola	beb6060a51	Remove the MCSymbolData typedef. The getData member function is next. llvm-svn: 238611	2015-05-29 20:41:47 +00:00
Rafael Espindola	b5d316bfc3	Rename getOrCreateSymbolData to registerSymbol and return void. Another step in merging MCSymbol and MCSymbolData. llvm-svn: 238607	2015-05-29 20:21:02 +00:00
Rafael Espindola	e3b2acf274	Pass MCSymbols to the helper functions in MCELF.h. llvm-svn: 238596	2015-05-29 18:47:23 +00:00
Rafael Espindola	ece40ca43d	Pass a MCSymbol to needsRelocateWithSymbol. llvm-svn: 238589	2015-05-29 18:26:09 +00:00
Nemanja Ivanovic	376e17364f	Add support for VSX FMA single-precision instructions to the PPC back end This patch corresponds to review: http://reviews.llvm.org/D9941 It adds the various FMA instructions introduced in the version 2.07 of the ISA along with the testing for them. These are operations on single precision scalar values in VSX registers. llvm-svn: 238578	2015-05-29 17:13:25 +00:00
Reid Kleckner	1d3d4adbb9	[WinEH] Emit EH tables for __CxxFrameHandler3 on 32-bit x86 Small (really small!) C++ exception handling examples work on 32-bit x86 now. This change disables the use of .seh_* directives in WinException when CFI is not in use. It also uses absolute symbol references in the tables instead of imagerel32 relocations. Also fixes a cache invalidation bug in MMI personality classification. llvm-svn: 238575	2015-05-29 17:00:57 +00:00
Jingyue Wu	995dde2799	[NVPTXFavorNonGenericAddrSpaces] recursively trace into GEP and BitCast Summary: This patch allows NVPTXFavorNonGenericAddrSpaces to remove addrspacecast from longer chains consisting of GEPs and BitCasts. For example, it can now optimize %0 = addrspacecast [10 x float] addrspace(3)* @a to [10 x float]* %1 = gep [10 x float]* %0, i64 0, i64 %i %2 = bitcast float* %1 to i32* %3 = load i32* %2 ; emits ld.u32 to %0 = gep [10 x float] addrspace(3)* @a, i64 0, i64 %i %1 = bitcast float addrspace(3)* %0 to i32 addrspace(3)* %3 = load i32 addrspace(3)* %1 ; emits ld.shared.f32 Test Plan: @ld_int_from_global_float in access-non-generic.ll Reviewers: broune, eliben, jholewinski, meheff Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D10074 llvm-svn: 238574	2015-05-29 17:00:27 +00:00
Colin LeMahieu	68d967d92e	[Hexagon] Disassembling, printing, and emitting instructions a whole-bundle at a time which is the semantic unit for Hexagon. Fixing tests to use the new format. Disabling tests in the direct object emission path for a followup patch. llvm-svn: 238556	2015-05-29 14:44:13 +00:00
Toma Tabacu	b45fb36f20	[mips] Remove 2 unused variables in MipsTargetStreamer.cpp. NFC. llvm-svn: 238554	2015-05-29 13:52:56 +00:00
Matthias Braun	e41e146c16	CodeGen: Use mop_iterator instead of MIOperands/ConstMIOperands MIOperands/ConstMIOperands are classes iterating over the MachineOperand of a MachineInstr, however MachineInstr::mop_iterator does the same thing. I assume these two iterators exist to have a uniform interface to iterate over the operands of a machine instruction bundle and a single machine instruction. However in practice I find it more confusing to have 2 different iterator classes, so this patch transforms (nearly all) the code to use mop_iterators. The only exception being MIOperands::analyzePhysReg() and MIOperands::analyzeVirtReg() still needing an equivalent, I leave that as an exercise for the next patch. Differential Revision: http://reviews.llvm.org/D9932 This version is slightly modified from the proposed revision in that it introduces MachineInstr::getOperandNo to avoid the extra counting variable in the few loops that previously used MIOperands::getOperandNo. llvm-svn: 238539	2015-05-29 02:56:46 +00:00
Reid Kleckner	fe4d491bd9	[WinEH] Start inserting state number stores for C++ EH This moves all the state numbering code for C++ EH to WinEHPrepare so that we can call it from the X86 state numbering IR pass that runs before isel. Now we just call the same state numbering machinery and insert a bunch of stores. It also populates MachineModuleInfo with information about the current function. llvm-svn: 238514	2015-05-28 22:00:24 +00:00
Rafael Espindola	3a5d3cce80	Remove a trivial forwarding function. NFC. llvm-svn: 238506	2015-05-28 21:36:02 +00:00
Reid Kleckner	bfcad2f181	Remove debug prints from r238487 llvm-svn: 238501	2015-05-28 21:23:53 +00:00
Reid Kleckner	80956a0142	Disable x86 tail call optimizations that jump through GOT For x86 targets, do not do sibling call optimization when materializing the callee's address would require a GOT relocation. We can still do tail calls to internal functions, hidden functions, and protected functions, because they do not require this kind of relocation. It is still possible to get GOT relocations when the user explicitly asks for it with musttail or -tailcallopt, both of which are supposed to guarantee TCO. Based on a patch by Chih-hung Hsieh. Reviewers: srhines, timmurray, danalbert, enh, void, nadav, rnk Subscribers: joerg, davidxl, llvm-commits Differential Revision: http://reviews.llvm.org/D9799 llvm-svn: 238487	2015-05-28 20:44:28 +00:00
Peter Collingbourne	450fbee6b2	Thumb2: Modify codegen for memcpy intrinsic to prefer LDM/STM. We were previously codegen'ing these as regular load/store operations and hoping that the register allocator would allocate registers in ascending order so that we could apply an LDM/STM combine after register allocation. According to the commit that first introduced this code (r37179), we planned to teach the register allocator to allocate the registers in ascending order. This never got implemented, and up to now we've been stuck with very poor codegen. A much simpler approach for achieving better codegen is to create LDM/STM instructions with identical sets of virtual registers, let the register allocator pick arbitrary registers and order register lists when printing an MCInst. This approach also avoids the need to repeatedly calculate offsets which ultimately ought to be eliminated pre-RA in order to decrease register pressure. This is implemented by lowering the memcpy intrinsic to a series of SD-only MCOPY pseudo-instructions which performs a memory copy using a given number of registers. During SD->MI lowering, we lower MCOPY to LDM/STM. This is a little unusual, but it avoids the need to encode register lists in the SD, and we can take advantage of SD use lists to decide whether to use the _UPD variant of the instructions. Fixes PR9199. Differential Revision: http://reviews.llvm.org/D9508 llvm-svn: 238473	2015-05-28 20:02:45 +00:00
Reid Kleckner	e2e57faa7d	[WinEH] Remove debugging dump() call llvm-svn: 238472	2015-05-28 20:02:05 +00:00
Chad Rosier	adc06311ba	Reuse Loc variable. NFC. llvm-svn: 238448	2015-05-28 18:18:21 +00:00
Kai Nacke	3adf9b8d80	[mips] Add new format for dmtc2/dmfc2 for Octeon CPUs. Octeon CPUs use dmtc2 rt,imm16 and dmfcp2 rt,imm16 for the crypto coprocessor. E.g. dmtc2 rt,0x4057 starts calculation of sha-1. I had to introduce a new decoding namespace to avoid a decoding conflict. Reviewed By: dsanders Differential Revision: http://reviews.llvm.org/D10083 llvm-svn: 238439	2015-05-28 16:23:16 +00:00
Petar Jovanovic	9720283e99	[Mips64] Add support for MCJIT for MIPS64r2 and MIPS64r6 Add support for resolving MIPS64r2 and MIPS64r6 relocations in MCJIT. Patch by Vladimir Radosavljevic. Differential Revision: http://reviews.llvm.org/D9667 llvm-svn: 238424	2015-05-28 13:48:41 +00:00
Benjamin Kramer	dba7ee90b5	Don't call utostr in Twine/raw_ostream contexts. Creating temporary std::strings there is unnecessary. llvm-svn: 238412	2015-05-28 11:24:24 +00:00
Jingyue Wu	c2a014697a	[NaryReassociate] Run EarlyCSE after NaryReassociate Summary: This patch made two improvements to NaryReassociate and the NVPTX pipeline 1. Run EarlyCSE/GVN after NaryReassociate to get rid of redundant common expressions. 2. When adding an instruction to SeenExprs, maps both the SCEV before and after reassociation to that instruction. Test Plan: updated @reassociate_gep_nsw in nary-gep.ll Reviewers: meheff, broune Reviewed By: broune Subscribers: dberlin, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D9947 llvm-svn: 238396	2015-05-28 04:56:52 +00:00
Renato Golin	f7c0d5f247	ARMTargetParser: Normalising build attributes Now that most of the methods in Clang and LLVM that were parsing arch/cpu/fpu strings are using ARMTargetParser, it's time to make it a bit more conforming with what the ABI says. This commit adds some clarification on what build attributes are accepted and which are "non-standard". It also makes clear that the "defaultCPU" and "defaultArch" methods were really just build attribute getters. It also diverges from GCC's behaviour to say that armv2/armv3 are really an ARMv4 in the build attributes, when the ABI has a clear state for that: Pre-v4. llvm-svn: 238344	2015-05-27 18:15:37 +00:00
Jan Vesely	6c12e2db06	R600: Rely on TypeLegalizer to use divrem instead of div/rem reviewer: tstellardAMD llvm-svn: 238337	2015-05-27 16:54:10 +00:00
Zoran Jovanovic	85a53a1ed5	[mips][microMIPSr6] Implement SEB and SEH instructions Differential Revision: http://reviews.llvm.org/D9739 llvm-svn: 238333	2015-05-27 15:39:47 +00:00
Jozef Kolek	888830adfe	[mips][microMIPSr6] Implement BEQZALC, BGEZALC, BGTZALC, BLEZALC, BLTZALC and BNEZALC instructions This patch implements microMIPS32r6 BEQZALC, BGEZALC, BGTZALC, BLEZALC, BLTZALC and BNEZALC instructions using mapping. Differential Revision: http://reviews.llvm.org/D10031 llvm-svn: 238325	2015-05-27 14:19:22 +00:00
Elena Demikhovsky	86c7b46680	AVX-512: Fixed a bug in extracting subvector from v64i1 By Igor Breger (igor.breger@intel.com) llvm-svn: 238322	2015-05-27 14:09:33 +00:00
Rafael Espindola	f4a1365387	Use operator<< instead of print in a few more places. llvm-svn: 238315	2015-05-27 13:05:42 +00:00
Daniel Sanders	8ef465f4bb	Revert r238190 and r238197: [mips] Make TTypeEncoding indirect to allow .eh_frame to be read-only. This broke the llvm-mips-linux builder and several of our out-of-tree builders. Initial investigations show that the commit probably isn't the problem but reverting anyway while I investigate. llvm-svn: 238302	2015-05-27 08:44:01 +00:00
Elena Demikhovsky	3948c590e3	AVX-512: Implemented all forms of sign-extend and zero-extend instructions for KNL and SKX Implemented DAG lowering for all these forms. Added tests for DAG lowering and encoding. By Igor Breger (igor.breger@intel.com) llvm-svn: 238301	2015-05-27 08:15:19 +00:00
Quentin Colombet	aa8020752e	[X86] Implement the support for shrink-wrapping. With this patch the x86 backend is now shrink-wrapping capable and this functionality can be tested by using the -enable-shrink-wrap switch. The next step is to add more tests and enable shrink-wrapping by default for x86. Related to <rdar://problem/20821487> llvm-svn: 238293	2015-05-27 06:28:41 +00:00
Matthias Braun	aa9fa35555	ARMLoadStoreOptimizer: Code cleanup; NFC llvm-svn: 238289	2015-05-27 05:12:40 +00:00
Rafael Espindola	2fb8401b2a	Print "lock \t foo" instead of "lock \n foo". This gets gas and llc -filetype=obj to agree on the order of prefixes. For llvm-mc we need to fix the asm parser to know that it makes a difference on which line the "lock" is in. Part of pr23594. llvm-svn: 238232	2015-05-26 18:35:10 +00:00
Jan Vesely	b670d37105	R600: Use SIGN_EXTEND_INREG for SEXT loads Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com> llvm-svn: 238229	2015-05-26 18:07:22 +00:00
Jan Vesely	a2143fa244	R600: Add comments to subword private address load lowering code v2: Use C++ comments and end with periods Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com> llvm-svn: 238228	2015-05-26 18:07:21 +00:00
Diego Novillo	bfecc06656	Revert "Re-commit changes in r237579 with fix for bug breaking windows builds." This reverts commit r238201 to fix linking problems in x86 Linux http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150525/278413.html llvm-svn: 238223	2015-05-26 17:45:38 +00:00
Tom Stellard	245c15fce2	R600/SI: Add assembler support for all CI and VI VOP2 instructions llvm-svn: 238211	2015-05-26 15:55:52 +00:00
Rafael Espindola	bb9a71c1ed	Replace getOrCreateSectionData with registerSection. There is now no SectionData to be created. llvm-svn: 238208	2015-05-26 15:07:25 +00:00
Luke Cheeseman	a5d053d6f4	Re-commit changes in r237579 with fix for bug breaking windows builds. llvm-svn: 238201	2015-05-26 13:40:31 +00:00
Luke Cheeseman	0af4f635f1	Test Commit llvm-svn: 238199	2015-05-26 13:10:35 +00:00
Elena Demikhovsky	887baa0b49	AVX-512: fixed a bug in arithmetic operations lowering for i1 type https://llvm.org/bugs/show_bug.cgi?id=23630 llvm-svn: 238198	2015-05-26 12:37:17 +00:00
Elena Demikhovsky	b2b901c607	AVX-512: fixed a bug in lowering VSELECT for 512-bit vector https://llvm.org/bugs/show_bug.cgi?id=23634 llvm-svn: 238195	2015-05-26 11:32:39 +00:00
Michael Kuperstein	db0712f986	Use std::bitset for SubtargetFeatures. Previously, subtarget features were a bitfield with the underlying type being uint64_t. Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, switching the features to use a bitset. No functional change. The first several times this was committed (e.g. r229831, r233055), it caused several buildbot failures. Apparently the reason for most failures was both clang and gcc's inability to deal with large numbers (> 10K) of bitset constructor calls in tablegen-generated initializers of instruction info tables. This should now be fixed. llvm-svn: 238192	2015-05-26 10:47:10 +00:00
Daniel Sanders	58ee4c9451	[mips] Make TTypeEncoding indirect to allow .eh_frame to be read-only. Summary: Following on from r209907 which made personality encodings indirect, do the same for TType encodings. This fixes the case where a try/catch block needs to generate references to, for example, std::exception in the .gcc_except_table. This commit uses DW_EH_PE_sdata8 for N64 as far as is possible at the moment. However, it is possible to end up with DW_EH_PE_sdata4 when a TargetMachine is not available. There's no risk of issues with inconsistency here since the tables are self describing but it does mean there is a small chance of the PC-relative offset being out of range for particularly large programs. Reviewers: petarj Reviewed By: petarj Subscribers: srhines, joerg, tberghammer, llvm-commits Differential Revision: http://reviews.llvm.org/D9669 llvm-svn: 238190	2015-05-26 10:19:18 +00:00
Rafael Espindola	64acc7fcc7	Remove most uses of MCSectionData from MCAssembler. llvm-svn: 238172	2015-05-26 02:17:21 +00:00
Rafael Espindola	61e724a8c5	Stop using MCSectionData in MCMachObjectWriter.h. llvm-svn: 238165	2015-05-26 01:15:30 +00:00
Rafael Espindola	079027ea90	Stop using MCSectionData in MCExpr.h. llvm-svn: 238163	2015-05-26 00:52:18 +00:00
Rafael Espindola	7549f87672	Return a MCSection from MCFragment::getParent(). Another step in merging MCSectionData and MCSection. llvm-svn: 238162	2015-05-26 00:36:57 +00:00
Rafael Espindola	a554c05d95	Turn MCSectionData into a field of MCSection. This also changes MCAssembler to store a vector of MCSections instead of an iplist of MCSectionData. llvm-svn: 238159	2015-05-25 23:14:17 +00:00
Simon Pilgrim	0be4fa761f	[X86][AVX2] Vectorized i16 shift operators Part of D9474, this patch extends AVX2 v16i16 types to 2 x 8i32 vectors and uses i32 shift variable shifts before packing back to i16. Adds AVX2 tests for v8i16 and v16i16 llvm-svn: 238149	2015-05-25 17:49:13 +00:00
Tom Stellard	50828163a1	R600/SI: Remove some unnecessary patterns from VINTRP multiclass DisableEncoding and Constraints can be set using let statements around the multiclass defs. llvm-svn: 238148	2015-05-25 16:15:56 +00:00
Tom Stellard	ec87f841c6	R600/SI: Fix bug with v_interp_p1_f32 instructions on 16 bank lds chips The src and dst register cannot be the same on chips with 16 lds banks. llvm-svn: 238147	2015-05-25 16:15:54 +00:00
Tom Stellard	c70cf90d09	R600/SI: Use NAME rather than opName as the key to the MCOpcode tables This lets us drop the opName parameter to the VINTRP multiclass and makes it possible to create multiple VINTRP defs with the same asm mnemonic. llvm-svn: 238146	2015-05-25 16:15:50 +00:00
Kit Barton	6646033e6e	This patch adds support for the vector quadword add/sub instructions introduced in POWER8: vadduqm vaddeuqm vaddcuq vaddecuq vsubuqm vsubeuqm vsubcuq vsubecuq In addition to adding the instructions themselves, it also adds support for the v1i128 type for intrinsics (Intrinsics.td, Function.cpp, and IntrinsicEmitter.cpp). http://reviews.llvm.org/D9081 llvm-svn: 238144	2015-05-25 15:49:26 +00:00
Rafael Espindola	6e6820a7e6	Stop forwarding getOrdinal and setOrdinal. llvm-svn: 238139	2015-05-25 14:12:48 +00:00
Michael Kuperstein	f145228676	[X86] When pattern-matching scalar FMA3 intrinsics, don't re-arrange the first and second operands. The semantics of the scalar FMA intrinsics are that the high vector elements are copied from the first source. The existing pattern switches src1 and src2 around, to match the "213" order, which ends up tying the original src2 to the dest. Since the actual scalar fma3 instructions copy the high elements from the dest register, the wrong values are copied. This modifies the pattern to leave src1 and src2 in their original order. Differential Revision: http://reviews.llvm.org/D9908 llvm-svn: 238131	2015-05-25 12:35:25 +00:00
Elena Demikhovsky	1c1391ba24	Added promotion to EXTRACT_SUBVECTOR operand. I encountered this case in one of the KNL tests for i1 vectors. v16i1 = EXTRACT_SUBVECTOR v32i1, x llvm-svn: 238130	2015-05-25 11:33:13 +00:00
NAKAMURA Takumi	5582a6a4a5	Reformat. llvm-svn: 238126	2015-05-25 01:43:34 +00:00
NAKAMURA Takumi	fb3bd7127a	Prune CRLFs. llvm-svn: 238125	2015-05-25 01:43:23 +00:00
Matt Arsenault	65ad1602b0	Add target hook to allow merging stores of nonzero constants On GPU targets, materializing constants is cheap and stores are expensive, so only doing this for zero vectors was silly. Most of the new testcases aren't optimally merged, and are for later improvements. llvm-svn: 238108	2015-05-24 00:51:27 +00:00
Benjamin Kramer	1577f1f484	Bump SmallString to the minimum required amount for raw_ostream to avoid allocation. NFC. llvm-svn: 238104	2015-05-23 17:20:53 +00:00
Benjamin Kramer	33b4691fd0	[Mips] Prefer Twine::utohexstr over utohexstr, saves a string copy. NFC. llvm-svn: 238103	2015-05-23 16:53:07 +00:00
Benjamin Kramer	be48c40475	[AArch64] Clean up the ELF streamer a bit. llvm-svn: 238102	2015-05-23 16:39:10 +00:00
Benjamin Kramer	1d1b9243d5	[AArch64] Move AArch64TargetStreamer out of MCStreamer.h It doesn't belong in the shared MC layer. NFC. llvm-svn: 238101	2015-05-23 16:15:10 +00:00
Hal Finkel	5f2a1379ef	[PowerPC] Fix fast-isel when compare is split from branch When the compare feeding a branch was in a different BB from the branch, we'd try to "regenerate" the compare in the block with the branch, possibly trying to make use of values not available there. Copy a page from AArch64's play book here to fix the problem (at least in terms of correctness). Fixes PR23640. llvm-svn: 238097	2015-05-23 12:18:10 +00:00
Akira Hatanaka	ddf76aa36f	Stop resetting NoFramePointerElim in TargetMachine::resetTargetOptions. This is part of the work to remove TargetMachine::resetTargetOptions. In this patch, instead of updating global variable NoFramePointerElim in resetTargetOptions, its use in DisableFramePointerElim is replaced with a call to TargetFrameLowering::noFramePointerElim. This function determines on a per-function basis if frame pointer elimination should be disabled. There is no change in functionality except that cl:opt option "disable-fp-elim" can now override function attribute "no-frame-pointer-elim". llvm-svn: 238080	2015-05-23 01:14:08 +00:00
Rafael Espindola	445712264d	Revert "make reciprocal estimate code generation more flexible by adding command-line options" This reverts commit r238051. It broke some bots: http://lab.llvm.org:8011/builders/llvm-ppc64-linux1/builds/18190 llvm-svn: 238075	2015-05-23 00:22:44 +00:00
Sanjay Patel	ba2ba80302	make reciprocal estimate code generation more flexible by adding command-line options This patch adds a class for processing many recip codegen possibilities. The TargetRecip class is intended to handle both command-line options to llc as well as options passed in from a front-end such as clang with the -mrecip option. The x86 backend is updated to use the new functionality. Only -mcpu=btver2 with -ffast-math should see a functional change from this patch. All other CPUs continue to not use reciprocal estimates by default with -ffast-math. Differential Revision: http://reviews.llvm.org/D8982 llvm-svn: 238051	2015-05-22 21:10:06 +00:00
Chad Rosier	67336305f5	Use new MachineInstr mayLoadOrStore() API. NFC. llvm-svn: 238044	2015-05-22 20:07:34 +00:00
Alexei Starovoitov	6296f6d7d8	[bpf] emit jmp fixups in little endian The 'off' field of 'struct bpf_insn' is in cpu-endianness, since the rest is emitted as little endian, make sure that 'off' field is little endian as well. llvm-svn: 238038	2015-05-22 18:47:33 +00:00
Quentin Colombet	494eb606cd	Reapply r238011 with a fix for the trap instruction. The problem was that I slipped a change required for shrink-wrapping, namely I used getFirstTerminator instead of the getLastNonDebugInstr that was here before the refactoring, whereas the surrounding code is not yet patched for that. Original message: [X86] Refactor the prologue emission to prepare for shrink-wrapping. - Add a late pass to expand pseudo instructions (tail call and EH returns). Instead of doing it in the prologue emission. - Factor some static methods in X86FrameLowering to ease code sharing. NFC. Related to <rdar://problem/20821487> llvm-svn: 238035	2015-05-22 18:10:47 +00:00
Bill Schmidt	e26236eed9	[PPC64] Add support for clrbhrb, mfbhrbe, rfebb. This patch adds support for the ISA 2.07 additions involving the branch history rolling buffer and event-based branching. These will not be used by typical applications, so built-in support is not required. They will only be available via inline assembly. Assembly/disassembly tests are included in the patch. llvm-svn: 238032	2015-05-22 16:44:10 +00:00
John Brawn	c815a969c7	[ARM] Fix typo in subtarget feature list for 7em triple The list of subtarget features for the 7em triple contains 't2xtpk', which actually disables that subtarget feature. Correct that to '+t2xtpk' and test that the instructions enabled by that feature do actually work. Differential Revision: http://reviews.llvm.org/D9936 llvm-svn: 238022	2015-05-22 14:16:22 +00:00
Tamas Berghammer	466692abdc	Revert "[X86] Fix a variable name for r237977 so that it works with every compilers." Revert "[X86] Refactor the prologue emission to prepare for shrink-wrapping." This reverts commit 6b3b93fc8b68a2c806aa992ee4bd3d7f61898d4b. This reverts commit ab0b15dff8539826283a59c2dd700a18a9680e0f. llvm-svn: 238011	2015-05-22 10:01:56 +00:00
Quentin Colombet	04ac8fcbde	[X86] Fix a variable name for r237977 so that it works with every compilers. llvm-svn: 237980	2015-05-22 00:41:03 +00:00
Quentin Colombet	faf4b57e1d	[X86] Refactor the prologue emission to prepare for shrink-wrapping. - Add a late pass to expand pseudo instructions (tail call and EH returns). Instead of doing it in the prologue emission. - Factor some static methods in X86FrameLowering to ease code sharing. NFC. Related to <rdar://problem/20821487> llvm-svn: 237977	2015-05-22 00:12:31 +00:00
Hal Finkel	82e1fc5fc7	[PPC] Correct iterator bug in PPCTLSDynamicCall Unfortunately, I can't reduce a small test case for this (although compiling mpfr-3.1.2 with -O2 -mcpu=a2 would fairly reliably trigger a crash), but the problem is fairly clear (at least once you know you're looking for one). If the TLS instruction being replaced was at the end of the block, we'd increment the iterator past it (so it would then point to MBB.end()), and then we'd increment it again as part of the for statement, thus overrunning the end of the list. Don't do that. llvm-svn: 237974	2015-05-21 23:45:49 +00:00
Peter Collingbourne	7e814d100b	Revert r237590, "ARM: allow jump tables to be placed as constant islands." Caused a miscompile of the Android port of Chromium, details forthcoming. llvm-svn: 237972	2015-05-21 23:20:55 +00:00
Chad Rosier	a73b359542	Use new MachineInstr mayLoadOrStore() API. llvm-svn: 237965	2015-05-21 21:59:57 +00:00
Chad Rosier	ce8e5abbaf	[AArch64] Enhance the load/store optimizer with target-specific alias analysis. Phabricator: http://reviews.llvm.org/D9863 llvm-svn: 237963	2015-05-21 21:36:46 +00:00
David Blaikie	457343dcaa	[opaque pointer type] Allow gep_type_iterator to work with the pointee type from the GEP instruction The raw non-instruction/constant form of this is still relying on being able to access the pointee type from a pointer type - those will be cleaned up later. For now, just focus on the cases where the pointee type is easily accessible. llvm-svn: 237958	2015-05-21 21:12:43 +00:00
Rafael Espindola	967d6a6914	Stop forwarding (get|set)Alignment from MCSectionData to MCSection. llvm-svn: 237956	2015-05-21 21:02:35 +00:00
Bill Schmidt	e13ac91c5d	[PPC64] Handle vpkudum mask pattern correctly when vpkudum isn't available My recent patch to add support for ISA 2.07 vector pack/unpack instructions didn't properly check for availability of the vpkudum instruction when recognizing it as a special vector shuffle case. This causes us to leave the vector shuffle in place (rather than converting it to a vector permute) so that it can be recognized later as a vpkudum, but that pattern is invalid for processors prior to POWER8. Thus LLVM crashes with an "unable to select" message. We observed this since one of our buildbots is configured to generate code for a POWER7. This patch fixes the problem by checking for availability of the vpkudum instruction during custom lowering of vector shuffles. I've added a test case variant for the vpkudum pattern when the instruction isn't available. llvm-svn: 237952	2015-05-21 20:48:49 +00:00
Hal Finkel	3b3c9c3e44	[PPC/LoopUnrollRuntime] Don't avoid high-cost trip count computation on the PPC/A2 On X86 (and similar OOO cores) unrolling is very limited, and even if the runtime unrolling is otherwise profitable, the expense of a division to compute the trip count could greatly outweigh the benefits. On the A2, we unroll a lot, and the benefits of unrolling are more significant (seeing a 5x or 6x speedup is not uncommon), so we're more able to tolerate the expense, on average, of a division to compute the trip count. llvm-svn: 237947	2015-05-21 20:30:23 +00:00
Nemanja Ivanovic	f02def6cbc	Add support for VSX scalar single-precision arithmetic in the PPC target http://reviews.llvm.org/D9891 Following up on the VSX single precision loads and stores added earlier, this adds support for elementary arithmetic operations on single precision values in VSX registers. These instructions utilize the new VSSRC register class. Instructions added: xsaddsp xsdivsp xsmulsp xsresp xsrsqrtesp xssqrtsp xssubsp llvm-svn: 237937	2015-05-21 19:32:49 +00:00
Rafael Espindola	0709a7bd1a	Move alignment from MCSectionData to MCSection. This starts merging MCSection and MCSectionData. There are a few issues with the current split between MCSection and MCSectionData. * It optimizes the not as important case. We want the production of .o files to be really fast, but the split puts the information used for .o emission in a separate data structure. * The ELF/COFF/MachO hierarchy is not represented in MCSectionData, leading to some ad-hoc ways to represent the various flags. * It makes it harder to remember where each item is. The attached patch starts merging the two by moving the alignment from MCSectionData to MCSection. Most of the patch is actually just dropping 'const', since MCSectionData is mutable, but MCSection was not. llvm-svn: 237936	2015-05-21 19:20:38 +00:00
Elena Demikhovsky	4aed59fc89	AVX-512: Enabled SSE intrinsics on AVX-512. Predicate UseAVX deprecates pattern selection on AVX-512. This predicate is necessary for DAG selection to select EVEX form. But mapping SSE intrinsics to AVX-512 instructions is not ready yet. So I replaced UseAVX with HasAVX for intrinsics patterns. llvm-svn: 237903	2015-05-21 14:01:32 +00:00
Simon Pilgrim	f483abc14e	Fixed unused variable warning in non-assert builds from rL237885 llvm-svn: 237889	2015-05-21 10:22:10 +00:00
Simon Pilgrim	e054199354	[X86][SSE] Improve support for 128-bit vector sign extension This patch improves support for sign extension of the lower lanes of vectors of integers by making use of the SSE41 pmovsx* sign extension instructions where possible, and optimizing the sign extension by shifts on pre-SSE41 targets (avoiding the use of i64 arithmetic shifts which require scalarization). It converts SIGN_EXTEND nodes to SIGN_EXTEND_VECTOR_INREG where necessary, that more closely matches the pmovsx* instruction than the default approach of using SIGN_EXTEND_INREG which splits the operation (into an ANY_EXTEND lowered to a shuffle followed by shifts) making instruction matching difficult during lowering. Necessary support for SIGN_EXTEND_VECTOR_INREG has been added to the DAGCombiner. Differential Revision: http://reviews.llvm.org/D9848 llvm-svn: 237885	2015-05-21 10:05:03 +00:00
Reid Kleckner	2632f0df48	[WinEH] Store pointers to the LSDA in the exception registration object We aren't emitting the LSDA yet, so this will still fail to assemble. llvm-svn: 237852	2015-05-20 23:08:04 +00:00
Hans Wennborg	a8f8df5dd2	Revert r237828 "[X86] Remove unused node after morphing it from shr to and." This caused assertions during DAG combine: PR23601. llvm-svn: 237843	2015-05-20 22:31:55 +00:00
Davide Italiano	141b2891cb	[Target/ARM] Only enable OptimizeBarrierPass at -O1 and above. Ideally this is going to be an LLVM IR pass (shared, among others, with AArch64), but for the time being just enable it if consumers ask us for optimization and not unconditionally. Discussed with Tim Northover on IRC. llvm-svn: 237837	2015-05-20 21:40:38 +00:00
Duncan P. N. Exon Smith	92a699c50e	MC: Remove most remaining uses of MCSymbolData::getSymbol(), NFC Remove most remaining calls to `MCSymbolData::getSymbol()`, instead using the already available `MCSymbol` directly. llvm-svn: 237829	2015-05-20 20:18:16 +00:00
Benjamin Kramer	a74480d1eb	[X86] Remove unused node after morphing it from shr to and. In some cases it won't get cleaned up properly leading to crashes downstream. PR23353. Based on a patch by Davide Italiano. llvm-svn: 237828	2015-05-20 20:10:26 +00:00
Matthias Braun	6091208331	ARM: Fix comment and make it slightly more readable llvm-svn: 237820	2015-05-20 18:40:06 +00:00
Pete Cooper	9e1d335697	Change Function::getIntrinsicID() to return an Intrinsic::ID. NFC. Now that Intrinsic::ID is a typed enum, we can forward declare it and so return it from this method. This updates all users which were either using an unsigned to store it, or had a now unnecessary cast. llvm-svn: 237810	2015-05-20 17:16:39 +00:00
Duncan P. N. Exon Smith	fd27a1dc1b	MC: Update MCAssembler to use MCSymbol, NFC Use `MCSymbol` over `MCSymbolData` where both are needed. llvm-svn: 237803	2015-05-20 16:02:11 +00:00
Duncan P. N. Exon Smith	08b8726de3	MC: Use MCSymbol in MachObjectWriter, NFC Replace uses of `MCSymbolData` with `MCSymbol` where both are needed, so we can remove the backpointer. llvm-svn: 237799	2015-05-20 15:16:14 +00:00
Elena Demikhovsky	f61727d880	AVX-512: fixed algorithm of building vectors of i1 elements fixed extract-insert i1 element, load i1, zextload i1 should be with "and $1, %reg" to prevent loading garbage. added a bunch of new tests. llvm-svn: 237793	2015-05-20 14:32:03 +00:00
Daniel Sanders	69c6008e49	Revert r237789 - [mips] The naming convention for private labels is ABI dependant. It works, but I've noticed that I missed several callers of createMCAsmInfo() and many don't have a TargetMachine to provide. llvm-svn: 237792	2015-05-20 14:18:59 +00:00
Daniel Sanders	b718eca643	[mips] The naming convention for private labels is ABI dependant. Summary: For N32/N64, private labels begin with '.L' but for O32 they begin with '$'. MCAsmInfo now has an initializer function which can be used to provide information from the TargetMachine to control the assembly syntax. Reviewers: vkalintiris Reviewed By: vkalintiris Subscribers: jfb, sandeep, llvm-commits, rafael Differential Revision: http://reviews.llvm.org/D9821 llvm-svn: 237789	2015-05-20 13:16:42 +00:00
Toma Tabacu	81496c1dec	[mips] [IAS] Factor out .set nomacro warning. NFC. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9772 llvm-svn: 237780	2015-05-20 08:54:45 +00:00
David Majnemer	402c5def11	[X86] Implement the local-exec TLS model for Windows targets We know that _tls_index is zero for local-exec TLS variables because they are always defined in the executable. llvm-svn: 237772	2015-05-20 04:45:26 +00:00
Alexei Starovoitov	ccbcf7cfd3	[bpf] fix build llvm-svn: 237751	2015-05-20 00:20:26 +00:00
Duncan P. N. Exon Smith	99d8a8e8ac	MC: Take MCSymbol in MachObjectWriter::getSymbolAddress(), NFC Pass through an `MCSymbol` instead of an `MCSymbolData` so we can get rid of the back pointer. llvm-svn: 237750	2015-05-20 00:02:39 +00:00
Duncan P. N. Exon Smith	2a40483418	MC: Use MCSymbol in MCAsmLayout::getSymbolOffset(), NFC Continue to canonicalize on MCSymbol instead of MCSymbolData when both are needed. llvm-svn: 237749	2015-05-19 23:53:20 +00:00
Matthias Braun	07066cca20	MachineInstr: Remove unused parameter. llvm-svn: 237726	2015-05-19 21:22:20 +00:00
Pete Cooper	f0cd2b49f5	Remove unnecessary cast. NFC llvm-svn: 237722	2015-05-19 20:50:14 +00:00
Zoran Jovanovic	dde61c00c3	[mips][microMIPSr6] Implement NOR, OR, ORI, XOR and XORI instructions Differential Revision: http://reviews.llvm.org/D8800 llvm-svn: 237697	2015-05-19 14:12:55 +00:00
Zoran Jovanovic	299fed6b7d	[mips][microMIPSr6] Implement AND and ANDI instructions Differential Revision: http://reviews.llvm.org/D8772 llvm-svn: 237696	2015-05-19 13:32:31 +00:00
Daniel Sanders	c8cd58fa26	[mips] Correct and improve special-case shuffle instructions. Summary: The documentation writes vectors highest-index first whereas LLVM-IR writes them lowest-index first. As a result, instructions defined in terms of left_half() and right_half() had the halves reversed. In addition to correcting them, they have been improved to allow shuffles that use the same operand twice or in reverse order. For example, ilvev used to accept masks of the form: <0, n, 2, n+2, 4, n+4, ...> but now accepts: <0, 0, 2, 2, 4, 4, ...> <n, n, n+2, n+2, n+4, n+4, ...> <0, n, 2, n+2, 4, n+4, ...> <n, 0, n+2, 2, n+4, 4, ...> One further improvement is that splati.[bhwd] is now the preferred instruction for splat-like operations. The other special shuffles are no longer used for splats. This led to the discovery that <0, 0, ...> would not cause splati.[hwd] to be selected and this has also been fixed. This fixes the enc-3des test from the test-suite on Mips64r6 with MSA. Reviewers: vkalintiris Reviewed By: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9660 llvm-svn: 237689	2015-05-19 12:24:52 +00:00
Zoran Jovanovic	3825261572	[mips][microMIPSr6] Implement DIV, DIVU, MOD and MODU instructions Differential Revision: http://reviews.llvm.org/D8769 llvm-svn: 237685	2015-05-19 11:21:37 +00:00
Michael Kuperstein	e88a021bf4	[X86] ABI change for x86-32: pass 3 vector arguments in-register instead of 4, except on Darwin. This changes the ABI used on 32-bit x86 for passing vector arguments. Historically, clang passes the first 4 vector arguments in-register, and additional vector arguments on the stack, regardless of platform. That is different from the behavior of gcc, icc, and msvc, all of which pass only the first 3 arguments in-register. The 3-register convention is documented, unofficially, in Agner's calling convention guide, and, officially, in the recently released version 1.0 of the i386 psABI. Darwin is kept as is because the OS X ABI Function Call Guide explicitly documents the current (4-register) behavior. This fixes PR21510 Differential revision: http://reviews.llvm.org/D9644 llvm-svn: 237682	2015-05-19 11:06:56 +00:00
Reid Kleckner	7bc6b6210c	Re-land r237175: [X86] Always return the sret parameter in eax/rax ... This reverts commit r237210. Also fix X86/complex-fca.ll to match the code that we used to generate on win32 and now generate everywhere to conform to SysV. llvm-svn: 237639	2015-05-18 23:35:09 +00:00
Jozef Kolek	cc0c0fc926	[mips][microMIPSr6] Implement LSA instruction This patch implements LSA instruction using mapping. Differential Revision: http://reviews.llvm.org/D8919 llvm-svn: 237634	2015-05-18 23:12:10 +00:00
David Blaikie	ff6409d096	Simplify IRBuilder::CreateCall* by using ArrayRef+initializer_list/braced init only llvm-svn: 237624	2015-05-18 22:13:54 +00:00
Tim Northover	d6223a2471	AArch64: work around ld64 bug more aggressively. ld64 currently mishandles internal pointer relocations (i.e. ARM64_RELOC_UNSIGNED referred to by section & offset rather than symbol). The existing __cfstring clause was an early discovery and workaround for this, but the problem is wider and we should avoid such relocations wherever possible for now. This code should be reverted to allowing internal relocations as soon as possible. PR23437. llvm-svn: 237621	2015-05-18 22:07:20 +00:00
Matthias Braun	fa3872e7ad	MachineInstr: Change return value of getOpcode() to unsigned. This was previously returning int. However there are no negative opcode numbers and more importantly this was needlessly different from MCInstrDesc::getOpcode() (which even is the value returned here) and SDValue::getOpcode()/SDNode::getOpcode(). llvm-svn: 237611	2015-05-18 20:27:55 +00:00
Jim Grosbach	6f482000e9	MC: Clean up method names in MCContext. The naming was a mish-mash of old and new style. Update to be consistent with the new. NFC. llvm-svn: 237594	2015-05-18 18:43:14 +00:00
Tim Northover	12c41af07c	ARM: allow jump tables to be placed as constant islands. Previously, they were forced to immediately follow the actual branch instruction. This was usually OK (the LEAs actually accessing them got emitted nearby, and weren't usually separated much afterwards). Unfortunately, a sufficiently nasty phi elimination dumps many instructions right before the basic block terminator, and this can increase the range too much. This patch frees them up to be placed as usual by the constant islands pass, and consequently has to slightly modify the form of TBB/TBH tables to refer to a PC-relative label at the final jump. The other jump table formats were already position-independent. rdar://20813304 llvm-svn: 237590	2015-05-18 17:10:40 +00:00
James Y Knight	c49e78851c	Sparc: support the "set" synthetic instruction. This pseudo-instruction expands into 'sethi' and 'or' instructions, or, just one of them, if the other isn't necessary for a given value. Differential Revision: http://reviews.llvm.org/D9089 llvm-svn: 237585	2015-05-18 16:43:33 +00:00
Oliver Stannard	6cb23465e0	Revert r237579, as it broke windows buildbots llvm-svn: 237583	2015-05-18 16:39:16 +00:00
James Y Knight	f7e7017281	Sparc: Support PSR, TBR, WIM read/write instructions. Differential Revision: http://reviews.llvm.org/D8971 llvm-svn: 237582	2015-05-18 16:38:47 +00:00
James Y Knight	24060be73a	Sparc: Add the "alternate address space" load/store instructions. - Adds support for the asm syntax, which has an immediate integer "ASI" (address space identifier) appearing after an address, before a comma. - Adds the various-width load, store, and swap in alternate address space instructions. (ldsba, ldsha, lduba, lduha, lda, stba, stha, sta, swapa) This does not attempt to hook these instructions up to pointer address spaces in LLVM, although that would probably be a reasonable thing to do in the future. Differential Revision: http://reviews.llvm.org/D8904 llvm-svn: 237581	2015-05-18 16:35:04 +00:00
James Y Knight	807563df22	Add support for the Sparc implementation-defined "ASR" registers. (Note that register "Y" is essentially just ASR0). Also added some test cases for divide and multiply, which had none before. Differential Revision: http://reviews.llvm.org/D8670 llvm-svn: 237580	2015-05-18 16:29:48 +00:00
Oliver Stannard	0c553afe6a	[LLVM - ARM/AArch64] Add ACLE special register intrinsics This patch implements LLVM support for the ACLE special register intrinsics in section 10.1, __arm_{w,r}sr{,p,64}. This patch is intended to lower the read/write_register intrinsics, used to implement the special register intrinsics in the clang patch for special register intrinsics (see http://reviews.llvm.org/D9697), to ARM specific instructions MRC,MCR,MSR etc. to allow reading and writing of coprocessor registers in AArch32 and AArch64. This is done by inspecting the register string passed to the intrinsic and then lowering to the appropriate instruction. Patch by Luke Cheeseman. Differential Revision: http://reviews.llvm.org/D9699 llvm-svn: 237579	2015-05-18 16:23:33 +00:00
Jozef Kolek	cbb227b48d	[mips][microMIPSr6] Implement ALIGN and AUI instructions This patch implements ALIGN and AUI instructions using mapping. Differential Revision: http://reviews.llvm.org/D8782 llvm-svn: 237563	2015-05-18 11:44:30 +00:00
Elena Demikhovsky	b8573cba02	AVX-512: Added intrinsics for ADDSS/D, MULSS/D, SUBSS/D, DIVSS/D instructions. These intrinsics come with a rounding mode. Added intrinsics for MAXSS/D, MINSS/D - with and without sae. By Asaf Badouh (asaf.badouh@intel.com) llvm-svn: 237560	2015-05-18 07:24:19 +00:00
Elena Demikhovsky	7eb367625e	fixed compilation warning/error llvm-svn: 237559	2015-05-18 07:10:25 +00:00
Elena Demikhovsky	08ce53c0ea	AVX-512: Added patterns for scalar-to-vector broadcast llvm-svn: 237558	2015-05-18 07:06:23 +00:00
Elena Demikhovsky	ad9c396838	AVX-512: Added VBROADCASTF64X4, VBROADCASTF64X2, VBROADCASTI32X8, and other instructions from this set Added encoding tests. llvm-svn: 237557	2015-05-18 06:42:57 +00:00
Hal Finkel	8340de142c	[PowerPC] Add extra r2 read deps on @toc@l relocations If some commits are happy, and some commits are sad, this is a sad commit. It is sad because it restricts instruction scheduling to work around a binutils linker bug, and moreover, one that may never be fixed. On 2012-05-21, GCC was updated not to produce code triggering this bug, and now we'll do the same... When resolving an address using the ELF ABI TOC pointer, two relocations are generally required: one for the high part and one for the low part. Only the high part generally explicitly depends on r2 (the TOC pointer). And, so, we might produce code like this: .Ltmp526: addis 3, 2, .LC12@toc@ha .Ltmp1628: std 2, 40(1) ld 5, 0(27) ld 2, 8(27) ld 11, 16(27) ld 3, .LC12@toc@l(3) rldicl 4, 4, 0, 32 mtctr 5 bctrl ld 2, 40(1) And there is nothing wrong with this code, as such, but there is a linker bug in binutils (https://sourceware.org/bugzilla/show_bug.cgi?id=18414) that will misoptimize this code sequence to this: nop std r2,40(r1) ld r5,0(r27) ld r2,8(r27) ld r11,16(r27) ld r3,-32472(r2) clrldi r4,r4,32 mtctr r5 bctrl ld r2,40(r1) because the linker does not know (and does not check) that the value in r2 changed in between the instruction using the .LC12@toc@ha (TOC-relative) relocation and the instruction using the .LC12@toc@l(3) relocation. Because it finds these instructions using the relocations (and not by scanning the instructions), it has been asserted that there is no good way to detect the change of r2 in between. As a result, this bug may never be fixed (i.e. it may become part of the definition of the ABI). GCC was updated to add extra dependencies on r2 to instructions using the @toc@l relocations to avoid this problem, and we'll do the same here. This is done as a separate pass because: 1. These extra r2 dependencies are not really properties of the instructions, but rather due to a linker bug, and maybe one day we'll be able to get rid of them when targeting linkers without this bug (and, thus, keeping the logic centralized here will make that straightforward). 2. There are ISel-level peephole optimizations that propagate the @toc@l relocations to some user instructions, and so the extra dependencies do not apply only to a fixed set of instructions (without undesirable definition replication). The test case was reduced with the help of bugpoint, with minimal cleaning. I'm looking forward to our upcoming MI serialization support, and with that, much better tests can be created. llvm-svn: 237556	2015-05-18 06:25:59 +00:00
Elena Demikhovsky	a8200603d4	AVX-512: fixed extended load to 512-bit register llvm-svn: 237537	2015-05-17 08:08:06 +00:00
Elena Demikhovsky	1d6a495d6d	AVX-512: fixed a bug in mask operations - (i1 1) pattern Filling k-reg with all-ones value was wrong, (i1 1) should switch on only one bit in mask register llvm-svn: 237536	2015-05-17 07:28:51 +00:00
Daniel Sanders	d049669546	[x86] Distinguish the 'o', 'v', 'X', and 'i' inline assembly memory constraints. Summary: But still handle them the same way since I don't know how they differ on this target. Of these, 'o' and 'v' are not tested but were already implemented. I'm not sure why 'i' is required for X86 since it's supposed to be an immediate constraint rather than a memory constraint. A test asserts without it so I've included it for now. No functional change intended. Reviewers: nadav Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8254 llvm-svn: 237517	2015-05-16 12:09:54 +00:00
Duncan P. N. Exon Smith	6e23e5a680	MC: Use MCSymbol in RelAndSymbol, NFC Switch from `MCSymbolData` to `MCSymbol`. llvm-svn: 237502	2015-05-16 01:14:19 +00:00
Bill Schmidt	5ed84cdba8	[PPC64] Add vector pack/unpack support from ISA 2.07 This patch adds support for the following new instructions in the Power ISA 2.07: vpksdss vpksdus vpkudus vpkudum vupkhsw vupklsw These instructions are available through the vec_packs, vec_packsu, vec_unpackh, and vec_unpackl built-in interfaces. These are lane-sensitive instructions, so the built-ins have different implementations for big- and little-endian, and the instructions must be marked as killing the vector swap optimization for now. The first three instructions perform saturating pack operations. The fourth performs a modulo pack operation, which means it can be represented with a vector shuffle, and conversely the appropriate vector shuffles may cause this instruction to be generated. The other instructions are only generated via built-in support for now. Appropriate tests have been added. There is a companion patch to clang for the rest of this support. llvm-svn: 237499	2015-05-16 01:02:12 +00:00
Duncan P. N. Exon Smith	09bfa58edd	MC: Change MCFragment::Atom to an MCSymbol, NFC Change `MCFragment::Atom` from an `MCSymbolData` to an `MCSymbol`, moving in the direction of removing the back-pointer. llvm-svn: 237497	2015-05-16 00:48:58 +00:00
Pete Cooper	81902a3ae4	Remove MCAssembler.h include from MCStreamer.h and fix users of MCStreamer.h llvm-svn: 237483	2015-05-15 22:19:42 +00:00
Pete Cooper	3de83e4098	Remove 3 includes from MCInstrDesc.h and explicitly include them where needed llvm-svn: 237481	2015-05-15 21:58:42 +00:00
David Majnemer	596c8d76fc	[X86] Use a better sentinel offset for the FrameAddr index Other pieces of CodeGen want to negate frame object offsets to account for architectures where the stack grows down. Our object is a pseudo object so its offset doesn't matter. However, we shouldn't choose an offset which results in undefined behavior if you negate it. llvm-svn: 237474	2015-05-15 20:08:27 +00:00
Jim Grosbach	4c98cf77d9	MC: MCCodeGenInfo naming update. NFC. s/InitMCCodeGenInfo/initMCCodeGenInfo/ llvm-svn: 237471	2015-05-15 19:13:31 +00:00
Jim Grosbach	91df21f740	MC: Update MCCodeEmitter naming. NFC. s/EncodeInstruction/encodeInstruction/ llvm-svn: 237469	2015-05-15 19:13:16 +00:00
Jim Grosbach	63661f8d73	MC: Update MCFixup naming. NFC. s/MCFixup::Create/MCFixup::create/ llvm-svn: 237468	2015-05-15 19:13:05 +00:00
James Molloy	cfb0443af6	Mark SMIN/SMAX/UMIN/UMAX nodes as legal and add patterns for them. The new [SU]{MIN,MAX} SDNodes can be lowered directly to instructions for most NEON datatypes - the big exclusion being v2i64. llvm-svn: 237455	2015-05-15 16:15:57 +00:00
Daniel Sanders	734400b46a	[xcore] Only support the 'm' inline assembly memory constraint. NFC. Summary: XCore doesn't seem to have any additional constraints. Therefore remove the target hook. No functional change intended. Reviewers: friedgold Reviewed By: friedgold Subscribers: friedgold, llvm-commits Differential Revision: http://reviews.llvm.org/D8921 llvm-svn: 237442	2015-05-15 12:32:16 +00:00
Toma Tabacu	a3d056fd4c	[mips] [IAS] Fix expansion of negative 32-bit immediates for LI/DLI. Summary: To maintain compatibility with GAS, we need to stop treating negative 32-bit immediates as 64-bit values when expanding LI/DLI. This currently happens because of sign extension. To do this we need to choose the 32-bit value expansion for values which use their upper 33 bits only for sign extension (i.e. no 0's, only 1's). Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8662 llvm-svn: 237428	2015-05-15 09:42:11 +00:00
Akira Hatanaka	ff86773f51	Stop resetting SanitizeAddress in TargetMachine::resetTargetOptions. NFC. Instead of doing that, create a temporary copy of MCTargetOptions and reset its SanitizeAddress field based on the function's attribute every time an InlineAsm instruction is emitted in AsmPrinter::EmitInlineAsm. This is part of the work to remove TargetMachine::resetTargetOptions (the FIXME added to TargetMachine.cpp in r236009 explains why this function has to be removed). Differential Revision: http://reviews.llvm.org/D9570 llvm-svn: 237412	2015-05-15 00:20:44 +00:00
Eric Christopher	149d37abd4	Remove setting FloatABIType from the X86 port, nothing uses it. llvm-svn: 237398	2015-05-14 22:26:54 +00:00
Brendon Cahoon	7c8a3b0ef6	[Hexagon] Generate hardware loop for a vectorized loop The induction variable in the vectorized loop wasn't recognized properly, so a hardware loop wasn't generated. Differential Revision: http://reviews.llvm.org/D9722 llvm-svn: 237388	2015-05-14 20:36:19 +00:00
Brendon Cahoon	485bea74ad	[Hexagon] Remove dead constant assignment in hardware loop pass After converting a loop to a hardware loop, the pass should remove any unnecessary instructions from the old compare-and-branch code. This patch removes a dead constant assignment that was used in the compare instruction. Differential Revision: http://reviews.llvm.org/D9720 llvm-svn: 237373	2015-05-14 17:31:40 +00:00
Douglas Katzman	9d08232e28	Reflow long lines of some LLVMBuild files Differential Revision: http://reviews.llvm.org/D9752 llvm-svn: 237367	2015-05-14 15:38:27 +00:00
Toma Tabacu	e625b5fc02	[mips] [IAS] Enforce .set nomacro. Summary: When used, ".set nomacro" causes warning messages to be reported when we expand pseudo-instructions to multiple machine instructions. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9564 llvm-svn: 237366	2015-05-14 14:51:32 +00:00
Brendon Cahoon	9376e9998e	[Hexagon] Check for underflow/wrap in hardware loop pass If the loop trip count may underflow or wrap, the compiler should not generate a hardware loop since the trip count will be incorrect. llvm-svn: 237365	2015-05-14 14:15:08 +00:00
Toma Tabacu	772155cbc6	[mips] [IAS] Emit .set macro/nomacro. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9563 llvm-svn: 237363	2015-05-14 13:42:10 +00:00
Vasileios Kalintiris	70b744e4a1	[mips] Do not place users of $ra in the delay slot of call instructions. Summary: When we are trying to fill the delay slot of a call instruction, we must avoid filler instructions that use the $ra register. This fixes the test MultiSource/Applications/JM/lencod when we enable the forward delay slot filler. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9670 llvm-svn: 237362	2015-05-14 13:17:56 +00:00
Artyom Skrobov	a70dfe18d3	Re-apply r237247 - [AArch64] Codegen VMAX/VMIN for safe math cases No longer breaks SPEC2000/2006 llvm-svn: 237361	2015-05-14 12:59:46 +00:00
Toma Tabacu	ec1de82213	[mips] [IAS] Warn when LA is used with a 64-bit symbol. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9295 llvm-svn: 237356	2015-05-14 10:53:40 +00:00
Toma Tabacu	b5592eeb00	[mips] [IAS] Give expandLoadAddressSym() more specific arguments. NFC. Summary: If we only pass the necessary operands, we don't have to determine the position of the symbol operand when entering expandLoadAddressSym(). This simplifies the expandLoadAddressSym() code. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9291 llvm-svn: 237355	2015-05-14 10:02:58 +00:00
Vladimir Sukharev	8ccf0a3aa7	[AArch64] Slight naming changes and comments for AArch64NamedImmMapper Reviewers: echristo Subscribers: llvm-commits Follow-up to: http://reviews.llvm.org/D8496#158595 Relates to: http://reviews.llvm.org/rL235089 llvm-svn: 237354	2015-05-14 09:50:14 +00:00
Elena Demikhovsky	d5b3e376d2	AVX-512: Added i1 type handling for calling conventions. i1 type is a legal type on AVX-512 and can be passed as parameter or return value. i1 is promoted to i8 on return and to i32 for call arguments (i8 is also promoted to i32 here). The resulting code is similar to the previous X86 targets, where i1 is always promoted to i8. llvm-svn: 237350	2015-05-14 09:04:45 +00:00
Douglas Katzman	6dc1397298	[X86] Fix PR23271 - RIP-relative decoding bug in disassembler. Differential Revision: http://reviews.llvm.org/D9110 llvm-svn: 237310	2015-05-13 22:44:52 +00:00
Tim Northover	b4c61f889f	ARM: remove possible vestiges of the legacy JIT??? There's no need to manually pass modifier strings around to tell an operand how to print now, that information is encoded in the operand itself since the MC layer came along. llvm-svn: 237295	2015-05-13 20:28:41 +00:00
Tim Northover	4998a47f73	ARM: remove custom jump table UID We were creating and propagating two separate indices for each jump table (from back in the mists of time). However, the generic index used by other backends is sufficient to emit a unique symbol so this was unneeded. llvm-svn: 237294	2015-05-13 20:28:38 +00:00
Tim Northover	688f7bb21a	ARM: refactor optimizeThumb2JumpTables. The previous logic mixed 2 separate questions: + Can we form a TBB/TBH instruction? + Can we remove the jump-table calculation before it? It then performed a bunch of random tests on the instructions earlier in the basic block, which were probably sufficient to answer 2 but only because of the very limited ways in which a t2BR_JT can actually be created. For example there's no reason to expect the LeaInst to define the same base register as the following indexing calculation. In practice this means we might have missed opportunities to form TBB/TBH, in theory you could end up misidentifying a sequence and removing the wrong LEA: %R1 = t2LEApcrelJT ... %R2 = t2LEApcrelJT ... <... using and killing %R2 ...> %R2 = t2ADDr %R1, $Ridx Before we would have looked for an LEA defining %R2 and found the wrong one. We just got lucky that jump table setup was (almost?) always confined to a single basic block and there was only one jump table per block. llvm-svn: 237293	2015-05-13 20:28:32 +00:00
Jim Grosbach	e9119e41ef	MC: Modernize MCOperand API naming. NFC. MCOperand::Create() methods renamed to MCOperand::create(). llvm-svn: 237275	2015-05-13 18:37:00 +00:00
Brendon Cahoon	d11c92a41c	[Hexagon] Generate loop1 instruction for nested loops loop1 is for the outer loop and loop0 is for the inner loop. Differential Revision: http://reviews.llvm.org/D9680 llvm-svn: 237266	2015-05-13 17:56:03 +00:00
Toma Tabacu	df7fd46c4a	[mips] [IAS] Preemptively fix warning introduced by r237255. NFC. Some compilers warn about using the ternary operator with an unsigned variable and enum. I haven't seen this trigger in the llvm.org buildbots yet, but it probably will at some point. Reported by Daniel Sanders. llvm-svn: 237262	2015-05-13 16:02:41 +00:00
Brendon Cahoon	254e656862	[Hexagon] Generate hardware loop when loop has a critical edge The hardware loop pass should try to generate a hardware loop instruction when the original loop has a critical edge. Differential Revision: http://reviews.llvm.org/D9678 llvm-svn: 237258	2015-05-13 14:54:24 +00:00
Jozef Kolek	6fec325d10	[mips][microMIPSr6] Implement CLO and CLZ instructions This patch implements CLO and CLZ instructions using mapping. Differential Revision: http://reviews.llvm.org/D8553 llvm-svn: 237257	2015-05-13 14:18:11 +00:00
Silviu Baranga	780a3b3be7	Revert r237247 - [AArch64] Codegen VMAX/VMIN.. as it is causing failures in SPEC2000/2006 llvm-svn: 237256	2015-05-13 14:03:18 +00:00
Toma Tabacu	d0a7ff2ed7	[mips] [IAS] Unify common functionality of LA and LI. Summary: A side-effect of this is that LA gains proper handling of unsigned and positive signed 16-bit immediates and more accurate error messages. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9290 llvm-svn: 237255	2015-05-13 13:56:16 +00:00
Artyom Skrobov	b526681e08	[AArch64] Codegen VMAX/VMIN for safe math cases llvm-svn: 237247	2015-05-13 12:01:09 +00:00
Michael Kuperstein	c3434b390d	Reverting r237234, "Use std::bitset for SubtargetFeatures" The buildbots are still not satisfied. MIPS and ARM are failing (even though at least MIPS was expected to pass). llvm-svn: 237245	2015-05-13 10:28:46 +00:00
Michael Kuperstein	aba4a34ef2	Use std::bitset for SubtargetFeatures Previously, subtarget features were a bitfield with the underlying type being uint64_t. Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, switching the features to use a bitset. No functional change. The first two times this was committed (r229831, r233055), it caused several buildbot failures. At least some of the ARM and MIPS ones were due to gcc/binutils issues, and should now be fixed. llvm-svn: 237234	2015-05-13 08:27:08 +00:00
Elena Demikhovsky	1b2f2f1b37	AVX-512: fixed a bug in encoding of VPSRAQ instruction, added a bunch of encoding tests. llvm-svn: 237232	2015-05-13 07:35:05 +00:00
Sanjoy Das	a1d39ba940	[Statepoints] Support for "patchable" statepoints. Summary: This change adds two new parameters to the statepoint intrinsic, `i64 id` and `i32 num_patch_bytes`. `id` gets propagated to the ID field in the generated StackMap section. If the `num_patch_bytes` is non-zero then the statepoint is lowered to `num_patch_bytes` bytes of nops instead of a call (the spill and reload code remains unchanged). A non-zero `num_patch_bytes` is useful in situations where a language runtime requires complete control over how a call is lowered. This change brings statepoints one step closer to patchpoints. With some additional work (that is not part of this patch) it should be possible to get rid of `TargetOpcode::STATEPOINT` altogether. PlaceSafepoints generates `statepoint` wrappers with `id` set to `0xABCDEF00` (the old default value for the ID reported in the stackmap) and `num_patch_bytes` set to `0`. This can be made more sophisticated later. Reviewers: reames, pgavlin, swaroop.sridhar, AndyAyers Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9546 llvm-svn: 237214	2015-05-12 23:52:24 +00:00
Chandler Carruth	942fba95e2	Revert r237175: [X86] Always return the sret parameter in eax/rax ... This commit broke an x86 test and the bots have been broken for well over an hour now so I'm just reverting. llvm-svn: 237210	2015-05-12 23:34:27 +00:00
Matthias Braun	b5424d043b	Revert "ARM: Remove Itineraries for swift CPU" Reverting until I figure out the new lit failures. This reverts commit r237179. llvm-svn: 237189	2015-05-12 21:28:39 +00:00
Matthias Braun	befa1380d2	ARM: Remove Itineraries for swift CPU They do more harm than good when used in the MachineScheduler as they tend to take preference to register pressure minimisation which is more important for swift. Differential Revision: http://reviews.llvm.org/D9718 llvm-svn: 237179	2015-05-12 21:07:54 +00:00
Reid Kleckner	b465563b46	[X86] Always return the sret parameter in eax/rax, even on 32-bit Summary: This rule was always in the old SysV i386 ABI docs and the new ones that H.J. Lu has put together, but we never noticed: EAX scratch register; also used to return integer and pointer values from functions; also stores the address of a returned struct or union Fixes PR23491. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9715 llvm-svn: 237175	2015-05-12 20:56:32 +00:00
Douglas Katzman	03dfca04df	Strip trailing whitespace. NFC llvm-svn: 237165	2015-05-12 19:42:31 +00:00
Tom Stellard	a77c3f7010	R600/SI: Fix bug in VGPR spilling AMDGPU::SI_SPILL_V96_RESTORE was missing from a switch statement, which caused the srsrc and soffset register to not be set correctly. This commit replaces the switch statement with a SITargetInfo query to make sure all spill instructions are covered. Differential Revision: http://reviews.llvm.org/D9582 llvm-svn: 237164	2015-05-12 18:59:17 +00:00
Jozef Kolek	38bb81db85	[mips][microMIPSr6] Implement SELEQZ and SELNEZ instructions This patch implements SELEQZ and SELNEZ instructions using mapping. Differential Revision: http://reviews.llvm.org/D8497 llvm-svn: 237158	2015-05-12 17:39:32 +00:00
Petar Jovanovic	e0de8f4efb	[Mips] Return false for isFPCloseToIncomingSP() On Mips, frame pointer points to the same side of the frame as the stack pointer. This function is used to decide where to put register scavenging spill slot. So far, it was put on the wrong side of the frame, and thus it was too far away from $fp when frame was larger than 2^15 bytes. Patch by Vladimir Radosavljevic. http://reviews.llvm.org/D8895 llvm-svn: 237153	2015-05-12 17:14:05 +00:00
Tom Stellard	28d13a4b12	R600/SI: add pass to mark CF live ranges as non-spillable Spilling can insert instructions almost anywhere, and this can mess up control flow lowering in a multitude of ways, due to instruction reordering. Let's sort this out the easy way: never spill registers involved with control flow, i.e. saved EXEC masks. Unfortunately, this does not work at all with optimizations disabled, as the register allocator ignores spill weights. This should be addressed in a future commit. The test was reduced from the "stacks" shader of [1]. Some issues trigger the machine verifier while another one is checked manually. [1] http://madebyevan.com/webgl-path-tracing/ v2: only insert pass with optimizations enabled, merge test runs. Patch by: Grigori Goronzy llvm-svn: 237152	2015-05-12 17:13:02 +00:00
Sanjay Patel	7713e6849d	use 'auto' to improve readability; NFC llvm-svn: 237144	2015-05-12 15:15:55 +00:00
Tom Stellard	c274349207	R600/SI: Update tablegen defs to avoid restoring spilled sgprs to m0 We had code to do this in SIRegisterInfo::eliminateFrameIndex(), but it is easier to just change the definition of SI_SPILL_S32_RESTORE to only allow numbered sgprs. llvm-svn: 237143	2015-05-12 15:00:53 +00:00
Tom Stellard	8f96dfc9ea	R600/SI: Remove M0Reg register class It is no longer used. llvm-svn: 237142	2015-05-12 15:00:52 +00:00
Tom Stellard	381a94a764	R600/SI: Remove explicit m0 operand from DS instructions Instead add m0 as an implicit operand. This helps avoid spills of the m0 register in some cases. llvm-svn: 237141	2015-05-12 15:00:49 +00:00
Tom Stellard	2a9d94757f	R600/SI: Remove explicit m0 operand from v_interp instructions Instead add m0 as an implicit operand. This helps avoid spills of the m0 register in some cases. llvm-svn: 237140	2015-05-12 15:00:46 +00:00
Tom Stellard	fc92e77445	R600/SI: Remove explicit m0 operand from s_sendmsg Instead add m0 as an implicit operand. This allows us to avoid using the M0Reg register class and eliminates a number of unnecessary spills when using s_sendmsg instructions. This impacts one shader in the shader-db: SGPRS: 48 -> 40 (-16.67 %) VGPRS: 112 -> 108 (-3.57 %) Code Size: 40132 -> 38796 (-3.33 %) bytes LDS: 0 -> 0 (0.00 %) blocks Scratch: 2048 -> 0 (-100.00 %) bytes per wave llvm-svn: 237133	2015-05-12 14:18:14 +00:00
Tom Stellard	d33d7f15a2	R600/SI: Replace TRI->getRegClass(Reg) with TRI->getPhysRegClass(Reg) TRI->getRegClass() takes a register class ID, not a register. We were using this incorrectly in a few places. llvm-svn: 237132	2015-05-12 14:18:11 +00:00
Elena Demikhovsky	fae20d3565	AVX-512, X86: Added lowering for shift operations for SKX. The other changes in LowerShift() are not functional, just to make the code more convenient. So the functional changes are for SKX only. llvm-svn: 237129	2015-05-12 13:25:46 +00:00
John Brawn	70605f7d22	[ARM] Use AEABI aligned function variants AEABI defines aligned variants of memcpy etc. that can be faster than the default version due to not having to do alignment checks. When emitting target code for these functions make use of these aligned variants if possible. Also convert memset to memclr if possible. Differential Revision: http://reviews.llvm.org/D8060 llvm-svn: 237127	2015-05-12 13:13:38 +00:00
Andrea Di Biagio	454f7909c6	[X86] Remove useless target specific combine on TRUNCATE dag nodes. Before revision 171146, function 'PerformTruncateCombine' used to perform a premature lowering of TRUNCATE dag nodes. Revision 171146 then moved all the logic implemented by PerformTruncateCombine to a custom lowering hook. However, that revision forgot to delete function PerformTruncateCombine from the code. This patch removes function 'PerformTruncateCombine' since it has no effect on the SelectionDAG. No functional change intended. llvm-svn: 237122	2015-05-12 12:34:22 +00:00
Vasileios Kalintiris	b48c905613	[mips][FastISel] Handle calls with non legal types i8 and i16. Summary: Allow calls with non legal integer types based on i8 and i16 to be processed by mips fast-isel. Based on a patch by Reed Kotler. Test Plan: "Make check" test forthcoming. Test-suite passes at O0/O2 and with mips32 r1/r2 Reviewers: rkotler, dsanders Subscribers: llvm-commits, rfuhler Differential Revision: http://reviews.llvm.org/D6770 llvm-svn: 237121	2015-05-12 12:29:17 +00:00
Vasileios Kalintiris	32cd69a2eb	[mips][FastISel] Allow computation of addresses from constant expressions. Summary: Try to compute addresses when the offset from a memory location is a constant expression. Based on a patch by Reed Kotler. Test Plan: Passes test-suite for -O0/O2 and mips 32 r1/r2 Reviewers: rkotler, dsanders Subscribers: llvm-commits, aemerson, rfuhler Differential Revision: http://reviews.llvm.org/D6767 llvm-svn: 237117	2015-05-12 12:08:31 +00:00
Renato Golin	35de35d03f	Change TargetParser enum names to avoid macro conflicts (llvm) sys/time.h on Solaris (and possibly other systems) defines "SEC" as "1" using a cpp macro. The result is that this fails to compile. Fixes https://llvm.org/PR23482 llvm-svn: 237112	2015-05-12 10:33:58 +00:00
Elena Demikhovsky	c1ac5d7bd5	AVX-512: select operation for i1 vectors like: select i1 %cond, <16 x i1> %a, <16 x i1> %b. I added pseudo-CMOV patterns to resolve the "select". Added tests for KNL and SKX. llvm-svn: 237106	2015-05-12 09:36:52 +00:00
Michael Kuperstein	6f5ff6905c	[X86] DAGCombine should not assume arbitrary vector types are simple The X86-specific DAGCombine for stores should not assume vector types are always simple. This fixes PR23476. Differential Revision: http://reviews.llvm.org/D9659 llvm-svn: 237097	2015-05-12 07:33:07 +00:00
Eric Christopher	824f42f209	Migrate existing backends that care about software floating point to use the information in the module rather than TargetOptions. We've had and clang has used the use-soft-float attribute for some time now so have the backends set a subtarget feature based on a particular function now that subtargets are created based on functions and function attributes. For the one middle end soft float check go ahead and create an overloadable TargetLowering::useSoftFloat function that just checks the TargetSubtargetInfo in all cases. Also remove the command line option that hard codes whether or not soft-float is set by using the attribute for all of the target specific test cases - for the generic just go ahead and add the attribute in the one case that showed up. llvm-svn: 237079	2015-05-12 01:26:05 +00:00
David Blaikie	46c561c19e	Readdress r236990, use of static members on a non-static variable. The TargetRegistry is just a namespace-like class, instantiated in one place to use a range-based for loop. Instead, expose access to the registry via a range-based 'targets()' function. This makes most uses a bit awkward/more verbose - but eventually we should just add a range-based find_if function which will streamline these functions. I'm happy to make them a bit awkward in the interim as encouragement to improve the algorithms in time. llvm-svn: 237059	2015-05-11 22:20:48 +00:00
Pirama Arumuga Nainar	af171e7720	[X86] Updates to X86 backend for f16 promotion Summary: r235215 adds support for f16 to be considered as a load/store type and promote f16 operations to f32. This patch has miscellaneous fixes for the X86 backend so all f16 operations are handled: 1. Set loadextaction for f16 vectors to expand. 2. Handle FP_EXTEND in a switch statement when handling v2f32 3. Do not fold (FP_TO_SINT (load f16)) into FP_TO_INT*_IN_MEM or (store (SINT_TO_FP )) to a FILD. Tests included. Reviewers: ab, srhines, delena Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9092 llvm-svn: 237004	2015-05-11 17:14:39 +00:00
Aaron Ballman	2a3aa1f249	Silencing an MSVC warning: '<<' : result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?); NFC llvm-svn: 236987	2015-05-11 12:45:53 +00:00
Elena Demikhovsky	5b9ee1ba7e	Fixed compilation warning, NFC. llvm-svn: 236972	2015-05-11 06:23:41 +00:00
Elena Demikhovsky	0d7e9364d1	AVX-512: Added SKX instructions and intrinsics: {add/sub/mul/div/} x {ps/pd} x {128/256} 2. max/min with sae By Asaf Badouh (asaf.badouh@intel.com) llvm-svn: 236971	2015-05-11 06:05:05 +00:00
Elena Demikhovsky	f40342d6a2	AVX-512: fixed UINT_TO_FP operation for 512-bit types. llvm-svn: 236955	2015-05-10 14:23:52 +00:00
Elena Demikhovsky	75d1489326	AVX-512: fixed a bug in i1 vectors lowering llvm-svn: 236947	2015-05-10 10:33:32 +00:00
Saleem Abdulrasool	ee33c49ade	SystemZ: silence a GCC warning warning: enumeral and non-enumeral type in conditional expression Cast the 0 to the appropriate type. NFC. Identified by GCC 4.9.2 llvm-svn: 236942	2015-05-10 00:53:41 +00:00
Tom Stellard	f01af29f01	MachineCSE: Add a target query for the LookAheadLimit heuristic This is used to determine whether or not to CSE physical register defs. Differential Revision: http://reviews.llvm.org/D9472 llvm-svn: 236923	2015-05-09 00:56:07 +00:00
Arnold Schwaighofer	dc2711446e	Fix compile error llvm-svn: 236921	2015-05-09 00:10:25 +00:00
Davide Italiano	2c29cd697e	[Target/ARM] Remove unused 'private' from class. Differential Revision: http://reviews.llvm.org/D9611 Reviewed by: rengolin llvm-svn: 236918	2015-05-08 23:58:28 +00:00
Arnold Schwaighofer	f54b73d681	ScheduleDAGInstrs: In functions with tail calls PseudoSourceValues are not non-aliasing distinct objects The code that builds the dependence graph assumes that two PseudoSourceValues don't alias. In a tail calling function two FixedStackObjects might refer to the same location. Worse 'immutable' fixed stack objects like function arguments are not immutable and will be clobbered. Change this so that a load from a FixedStackObject is not invariant in a tail calling function and don't return a PseudoSourceValue for an instruction in tail calling functions when building the dependence graph so that we handle function arguments conservatively. Fix for PR23459. rdar://20740035 llvm-svn: 236916	2015-05-08 23:52:00 +00:00
Renato Golin	f5f373fcf1	TargetParser: FPU/ARCH/EXT parsing refactory - NFC This new class in a global context contains arch-specific knowledge in order to provide LLVM libraries, tools and projects with the ability to understand the architectures. For now, only FPU, ARCH and ARCH extensions on ARM are supported. Current behaviour is to parse from free-text to enum values and back, so that all users can share the same parser and codes. This simplifies a lot both the ASM/Obj streamers in the back-end (where this came from), and the front-end parsers for command line arguments (where this is going to be used next). The previous implementation, using .def/.h includes is deprecated due to its inflexibility to be built without the backend support and for being too cumbersome. As more architectures join this scheme, and as more features of such architectures are added (such as hardware features, type sizes, etc) into a full blown TargetDescription class, having a set of classes is the most sane implementation. The ultimate goal is to refactor both LLVM's and Clang's target description classes into one unique interface, so that we can de-duplicate and standardise the descriptions, as well as make it available for other front-ends, tools, etc. The FPU parsing for command line options in Clang has been converted to use this new library and a number of aliases were added for compatibility: * A bogus neon-vfpv3 alias (neon defaults to vfp3) * armv5/v6 * {fp4/fp5}-{sp/dp}-d16 Next steps: * Port Clang's ARCH/EXT parsing to use this library. * Create a TableGen back-end to generate this information. * Run this TableGen process regardless of which back-ends are built. * Expose more information and rename it to TargetDescription. * Continue re-factoring Clang to use as much of it as possible. llvm-svn: 236900	2015-05-08 21:04:27 +00:00
Brendon Cahoon	bece8edcdd	[Hexagon] Generate more hardware loops Refactored parts of the hardware loop pass to generate more. Also, added more tests. Differential Revision: http://reviews.llvm.org/D9568 llvm-svn: 236896	2015-05-08 20:18:21 +00:00
Pete Cooper	7f7c9f1dad	[X86] Fast-ISel was incorrectly always killing the source of a truncate. A trunc from i32 to i1 on x86_64 generates an instruction such as %vreg19<def> = COPY %vreg9:sub_8bit<kill>; GR8:%vreg19 GR32:%vreg9 However, the copy here should only have the kill flag on the 32-bit path, not the 64-bit one. Otherwise, we are killing the source of the truncate which could be used later in the program. llvm-svn: 236890	2015-05-08 18:29:42 +00:00
Pat Gavlin	cc0431d1c0	Extend the statepoint intrinsic to allow statepoints to be marked as transitions from GC-aware code to code that is not GC-aware. This changes the shape of the statepoint intrinsic from: @llvm.experimental.gc.statepoint(anyptr target, i32 # call args, i32 unused, ...call args, i32 # deopt args, ...deopt args, ...gc args) to: @llvm.experimental.gc.statepoint(anyptr target, i32 # call args, i32 flags, ...call args, i32 # transition args, ...transition args, i32 # deopt args, ...deopt args, ...gc args) This extension offers the backend the opportunity to insert (somewhat) arbitrary code to manage the transition from GC-aware code to code that is not GC-aware and back. In order to support the injection of transition code, this extension wraps the STATEPOINT ISD node generated by the usual lowering with two additional nodes: GC_TRANSITION_START and GC_TRANSITION_END. The transition arguments that were passed to the intrinsic (if any) are lowered and provided as operands to these nodes and may be used by the backend during code generation. Eventually, the lowering of the GC_TRANSITION_{START,END} nodes should be informed by the GC strategy in use for the function containing the intrinsic call; for now, these nodes are instead replaced with no-ops. Differential Revision: http://reviews.llvm.org/D9501 llvm-svn: 236888	2015-05-08 18:07:42 +00:00
Brendon Cahoon	df43e68629	[Hexagon] Update AnalyzeBranch, etc target hooks Improved the AnalyzeBranch, InsertBranch, and RemoveBranch functions in order to handle more of our branch instructions. This requires changes to analyzeCompare and PredicateInstructions. Specifically, we've added support for new value compare jumps, improved handling of endloop, added more compare instructions, and improved support for predicate instructions. Differential Revision: http://reviews.llvm.org/D9559 llvm-svn: 236876	2015-05-08 16:16:29 +00:00
Andrea Di Biagio	84e22b9096	[X86] Teach 'getTargetShuffleMask' how to look through ISD::WrapperRIP when decoding a PSHUFB mask. The function 'getTargetShuffleMask' already knows how to deal with PSHUFB nodes where the mask node is a load from constant pool, and the constant pool node is wrapped by a X86ISD::Wrapper node. This patch extends that logic by teaching it how to also look through X86ISD::WrapperRIP. This helps function combineX86ShufflesRecusively to combine more shuffle sequences containing PSHUFB nodes if we are in RIPRel PIC mode. Before this change, llc (with -relocation-model=pic -march=x86-64) was unable to decode a pshufb where the mask was loaded from a constant pool. For example, the no-op shuffle from test 'x86-fold-pshufb.ll' was not folded into its operand, so instead of generating a single 'movaps' the backend always generated a sub-optimal 'movdqa + pshufb' sequence. Added test x86-fold-pshufb.ll. llvm-svn: 236863	2015-05-08 15:11:07 +00:00
Jozef Kolek	8abad7bacc	[mips][microMIPSr6] Implement ALUIPC and AUIPC instructions This patch implements ALUIPC and AUIPC instructions using mapping. Differential Revision: http://reviews.llvm.org/D8441 llvm-svn: 236858	2015-05-08 14:25:11 +00:00
Jozef Kolek	9ce6e0a926	[mips][microMIPSr6] Implement ADDIUPC and LWPC instructions This patch implements ADDIUPC and LWPC instructions using mapping. Differential Revision: http://reviews.llvm.org/D8415 llvm-svn: 236852	2015-05-08 13:52:04 +00:00
Denis Protivensky	159a49e5d6	Fix gcc warning of different enum and non-enum types in ternary Make '0' literal explicitly unsigned with '0u'. This appeared after r236775. llvm-svn: 236838	2015-05-08 12:21:03 +00:00
Toma Tabacu	8b3345ba7c	[mips] Only use FGR_{32,64} in TableGen descriptions. NFC. Summary: Instead of explicitly adding the IsFP64bit and NotFP64bit predicates through AdditionalRequires. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9566 llvm-svn: 236835	2015-05-08 12:15:04 +00:00
Vasileios Kalintiris	42544d6472	[mips] Emit the .insn directive for empty basic blocks. Summary: In microMIPS, labels need to know whether they are on code or data. This is indicated with STO_MIPS_MICROMIPS and can be inferred by being followed by instructions. For empty basic blocks, we can ensure this by emitting the .insn directive after the label. Also, this fixes some failures in our out-of-tree microMIPS buildbots, for the exception handling regression tests under: SingleSource/Regression/C++/EH Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9530 llvm-svn: 236815	2015-05-08 09:10:15 +00:00
Eric Christopher	54966ebc54	InMips16HardFloat was only being set conditional on whether or not IsSoftFloat was set so remove it from here simplifying the accessor. llvm-svn: 236795	2015-05-07 23:10:23 +00:00
Eric Christopher	e8ae3e3acd	Rename the MIPS routine abiUsesSoftFloat -> useSoftFloat to match some incoming changes and the general scheme used by features (use/has). llvm-svn: 236794	2015-05-07 23:10:21 +00:00
Matthias Braun	f45afee3dc	Fix typo. llvm-svn: 236785	2015-05-07 22:16:10 +00:00
Matthias Braun	d04893fa36	Change getTargetNodeName() to produce compiler warnings for missing cases, fix them llvm-svn: 236775	2015-05-07 21:33:59 +00:00
Pete Cooper	f52123b454	[AArch64] Fix sext/zext folding in address arithmetic. We were accidentally folding a sign/zero extend in to address arithmetic in a different BB when the extend wasn't available there. Cross BB fast-isel isn't safe, so restrict this to only when the extend is in the same BB as the use. llvm-svn: 236764	2015-05-07 19:21:36 +00:00
Nemanja Ivanovic	f3c94b1e3c	Add VSX Scalar loads and stores to the PPC back end This patch corresponds to review: http://reviews.llvm.org/D9440 It adds a new register class to the PPC back end to contain single precision values in VSX registers. Additionally, it adds scalar loads and stores for VSX registers. llvm-svn: 236755	2015-05-07 18:24:05 +00:00
Jozef Kolek	cf98462818	[mips][microMIPSr6] Implement JIALC and JIC instructions This patch implements JIALC and JIC instructions using mapping. Differential Revision: http://reviews.llvm.org/D8389 llvm-svn: 236748	2015-05-07 17:12:23 +00:00
Matt Arsenault	585b566278	R600: Fix comment that mentions AMDIL llvm-svn: 236745	2015-05-07 17:02:32 +00:00
Sanjay Patel	5b373cacf2	Use intrinsic pattern to make a simpler match This is a follow-on to r236740 where I took Andrea's advice in D9504 to remove a redundant pattern...except that I removed the wrong pattern! AFAICT, there is no change in the final code produced because subsequent passes would clean up the extra instructions created by the more complicated pattern. llvm-svn: 236743	2015-05-07 16:51:12 +00:00
Sanjay Patel	a9f6d3505d	[x86] eliminate unnecessary shuffling/moves with unary scalar math ops (PR21507) Finish the job that was abandoned in D6958 following the refactoring in http://reviews.llvm.org/rL230221: 1. Uncomment the intrinsic def for the AVX r_Int instruction. 2. Add missing r_Int entries to the load folding tables; there are already tests that check these in "test/Codegen/X86/fold-load-unops.ll", so I haven't added any more in this patch. 3. Add patterns to solve PR21507 ( https://llvm.org/bugs/show_bug.cgi?id=21507 ). So instead of this: movaps %xmm0, %xmm1 rcpss %xmm1, %xmm1 movss %xmm1, %xmm0 We should now get: rcpss %xmm0, %xmm0 And instead of this: vsqrtss %xmm0, %xmm0, %xmm1 vblendps $1, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm1[0],xmm0[1,2,3] We should now get: vsqrtss %xmm0, %xmm0, %xmm0 Differential Revision: http://reviews.llvm.org/D9504 llvm-svn: 236740	2015-05-07 15:48:53 +00:00
Simon Atanasyan	fee03b1be8	[MIPS] Move MIPS ABI flags structure constants to the separate header http://reviews.llvm.org/D9517 The separate header file allows to reuse the MIPS ABI flags structure constants in other LLVM tools like the llvm-readobj. No functional changes. llvm-svn: 236732	2015-05-07 14:57:04 +00:00
Elena Demikhovsky	29792e9a80	AVX-512: Added all forms of FP compare instructions for KNL and SKX. Added intrinsics for the instructions. CC parameter of the intrinsics was changed from i8 to i32 according to the spec. By Igor Breger (igor.breger@intel.com) llvm-svn: 236714	2015-05-07 11:24:42 +00:00
Toma Tabacu	506cfd0b2b	[mips] Add the SoftFloat MipsSubtarget feature. Summary: This will enable the IAS to reject floating point instructions if soft-float is enabled. Reviewers: dsanders, echristo Reviewed By: dsanders Subscribers: jfb, llvm-commits, mpf Differential Revision: http://reviews.llvm.org/D9053 llvm-svn: 236713	2015-05-07 10:29:52 +00:00
Sanjoy Das	2e0d29fb09	[X86MCInst] Move LowerSTATEPOINT to inside X86AsmPrinter. NFC. llvm-svn: 236676	2015-05-06 23:53:26 +00:00
Sanjoy Das	80876d5db3	[X86MCInst] Clean up LowerSTATEPOINT: variable names. NFC. llvm-svn: 236675	2015-05-06 23:53:24 +00:00
Pete Cooper	d31583ddfb	[x86] Fix register class of folded load index reg. When folding a load in to another instruction, we need to fix the class of the index register Otherwise, it could be something like GR64 not GR64_NOSP and would fail the machine verifier. llvm-svn: 236644	2015-05-06 21:37:19 +00:00
Wei Mi	062c74484d	[X86] Disable loop unrolling in loop vectorization pass when VF is 1. The patch disables unrolling in the loop vectorization pass when VF==1 on the x86 architecture by setting MaxInterleaveFactor to 1. Unrolling in the loop vectorization pass may introduce the cost of an overflow check, a memory boundary check, and extra prologue/epilogue code when the regular unroller unrolls the loop another time. Disabling it when VF==1 removes the unnecessary cost on x86. The same can be done for other platforms after verifying that interleaving/memory bound checking is not perf critical on those platforms. Differential Revision: http://reviews.llvm.org/D9515 llvm-svn: 236613	2015-05-06 17:12:25 +00:00
Pete Cooper	d927c6eaf8	[ARM] Fast-Isel was incorrectly selecting <2 x double> adds. With neon enabled, we reach SelectBinaryFPOp and are able to get registers for a <2 x double> add. However, we shouldn't actually attempt arithmetic on it as ARMIselLowering says "v2f64 is legal so that QR subregs can be extracted as f64 elements, but neither Neon nor VFP support any arithmetic operations on it." This commit disables SelectBinaryFPOp for any vector types. There's already a FIXME to try handle neon. Doing so would require fixing this conditional which isn't safe for vectors 'VT == MVT::f64 \|\| VT == MVT::i64' llvm-svn: 236609	2015-05-06 16:39:17 +00:00
Bill Schmidt	5fe2e25f7c	[PPC64LE] Adjust vector splats during VSX swap optimization The initial code drop for VSX swap optimization permitted the optimization only when all operations in a web of related computation are lane-insensitive. For some lane-sensitive operations, we can still permit the optimization provided that we make adjustments to those operations. This patch adds special handling for vector splats so that their presence doesn't kill the optimization. Vector splats are lane-sensitive since they identify by number a vector element to be used as the source of a splat. When swap optimizations take place, the desired vector element will move to the opposite doubleword of the quadword vector. We thus replace the index I by (I + N/2) % N, where N is the number of elements in the vector. A new test case is added to test that swap optimization succeeds when vector splats are present, and that the proper input element is used as the source of the splat. An ancillary change removes SH_BUILDVEC as one of the kinds of special handling that may be required by VSX swap optimization. From experience with GCC, I had expected to need some modifications for vector build operations, but I did not find that to be the case. llvm-svn: 236606	2015-05-06 15:40:46 +00:00
NAKAMURA Takumi	d7c0be9c42	Revert r236546, "propagate IR-level fast-math-flags to DAG nodes (NFC)" It caused undefined behavior. llvm-svn: 236600	2015-05-06 14:03:12 +00:00
Artyom Skrobov	3f8eae92a4	[ARM] generate VMAXNM/VMINNM for a compare followed by a select, in safe math mode too llvm-svn: 236590	2015-05-06 11:44:10 +00:00
Ahmed Bougacha	e8d0c4ccea	[ARM][FastISel] Use TST #1 instead of CMP #0 for select. Since r234249, i1 are sext instead of zext; because of that, doing "CMP rN, #0; IT EQ/NE" isn't correct anymore. "TST #1" is the conservatively correct alternative - the tradeoff being that it doesn't have a 16-bit encoding -, so use that instead. llvm-svn: 236569	2015-05-06 04:14:02 +00:00
Pete Cooper	d0dae3e577	[X86 fast-isel] Constrain the index reg class to not include SP. The index reg on instructions with complex address modes is a GPR64_NOSP. Constrain it to appease the machine verifier. llvm-svn: 236557	2015-05-05 23:41:53 +00:00
Sanjay Patel	801caff64d	propagate IR-level fast-math-flags to DAG nodes (NFC) This patch adds the minimum plumbing necessary to use IR-level fast-math-flags (FMF) in the backend without actually using them for anything yet. This is a follow-on to: http://reviews.llvm.org/rL235997 ...which split the existing nsw / nuw / exact flags and FMF into their own struct. There are 2 structural changes here: 1. The main diff is that we're preparing to extend the optimization flags to affect more than just binary SDNodes. Eg, IR intrinsics ( https://llvm.org/bugs/show_bug.cgi?id=21290 ) or non-binop nodes that don't even exist in IR such as FMA, FNEG, etc. 2. The other change is that we're actually copying the FP fast-math-flags from the IR instructions to SDNodes. Differential Revision: http://reviews.llvm.org/D8900 llvm-svn: 236546	2015-05-05 21:40:38 +00:00
Sanjay Patel	fbca70d767	use range-based for-loop; NFC llvm-svn: 236544	2015-05-05 21:20:52 +00:00
Peter Collingbourne	85a0e23bc8	Thumb2SizeReduction: Check the correct set of registers for LDMIA. The register set for LDMIA begins at offset 3, not 4. We were previously missing the short encoding of this instruction in the case where the base register was the first register in the register set. Also clean up some dead code: - The isARMLowRegister check is redundant with what VerifyLowRegs does; replace with an assert. - Remove handling of LDMDB instruction, which has no short encoding (and does not appear in ReduceTable). Differential Revision: http://reviews.llvm.org/D9485 llvm-svn: 236535	2015-05-05 20:07:10 +00:00
Ulrich Weigand	c1708b2618	[SystemZ] Add vector intrinsics This adds intrinsics to allow access to all of the z13 vector instructions. Note that instructions whose semantics can be described by standard LLVM IR do not get any intrinsics. For each instruction whose semantics cannot (fully) be described, we define an LLVM IR target-specific intrinsic that directly maps to this instruction. For instructions that also set the condition code, the LLVM IR intrinsic returns the post-instruction CC value as a second result. Instruction selection will attempt to detect code that compares that CC value against constants and use the condition code directly instead. Based on a patch by Richard Sandiford. llvm-svn: 236527	2015-05-05 19:31:09 +00:00
Ulrich Weigand	5211f9ff4d	[SystemZ] Mark v1i128 and v1f128 as unsupported The ABI specifies that <1 x i128> and <1 x fp128> are supposed to be passed in vector registers. We do not yet support those types, and some infrastructure is missing before we can do so. In order to prevent accidentally generating code violating the ABI, this patch adds checks to detect those types and error out if user code attempts to use them. llvm-svn: 236526	2015-05-05 19:30:05 +00:00
Ulrich Weigand	cd2a1b5341	[SystemZ] Handle sub-128 vectors The ABI allows sub-128 vectors to be passed and returned in registers, with the vector occupying the upper part of a register. We therefore want to legalize those types by widening the vector rather than promoting the elements. The patch includes some simple tests for sub-128 vectors and also tests that we can recognize various pack sequences, some of which use sub-128 vectors as temporary results. One of these forms is based on the pack sequences generated by llvmpipe when no intrinsics are used. Signed unpacks are recognized as BUILD_VECTORs whose elements are individually sign-extended. Unsigned unpacks can have the equivalent form with zero extension, but they also occur as shuffles in which some elements are zero. Based on a patch by Richard Sandiford. llvm-svn: 236525	2015-05-05 19:29:21 +00:00
Ulrich Weigand	49506d78e7	[SystemZ] Add CodeGen support for scalar f64 ops in vector registers The z13 vector facility includes some instructions that operate only on the high f64 in a v2f64, effectively extending the FP register set from 16 to 32 registers. It's still better to use the old instructions if the operands happen to fit though, since the older instructions have a shorter encoding. Based on a patch by Richard Sandiford. llvm-svn: 236524	2015-05-05 19:28:34 +00:00
Ulrich Weigand	80b3af7ab3	[SystemZ] Add CodeGen support for v4f32 The architecture doesn't really have any native v4f32 operations except v4f32->v2f64 and v2f64->v4f32 conversions, with only half of the v4f32 elements being used. Even so, using vector registers for <4 x float> and scalarising individual operations is much better than generating completely scalar code, since there's much less register pressure. It's also more efficient to do v4f32 comparisons by extending to 2 v2f64s, comparing those, then packing the result. This particularly helps with llvmpipe. Based on a patch by Richard Sandiford. llvm-svn: 236523	2015-05-05 19:27:45 +00:00
Ulrich Weigand	cd808237b2	[SystemZ] Add CodeGen support for v2f64 This adds ABI and CodeGen support for the v2f64 type, which is natively supported by z13 instructions. Based on a patch by Richard Sandiford. llvm-svn: 236522	2015-05-05 19:26:48 +00:00
Ulrich Weigand	ce4c109585	[SystemZ] Add CodeGen support for integer vector types This is the first of a series of patches to add CodeGen support exploiting the instructions of the z13 vector facility. This patch adds support for the native integer vector types (v16i8, v8i16, v4i32, v2i64). When the vector facility is present, we default to the new vector ABI. This is characterized by two major differences: - Vector types are passed/returned in vector registers (except for unnamed arguments of a variable-argument list function). - Vector types are at most 8-byte aligned. The reason for the choice of 8-byte vector alignment is that the hardware is able to efficiently load vectors at 8-byte alignment, and the ABI only guarantees 8-byte alignment of the stack pointer, so requiring any higher alignment for vectors would require dynamic stack re-alignment code. However, for compatibility with old code that may use vector types, when not using the vector facility, the old alignment rules (vector types are naturally aligned) remain in use. These alignment rules are not only implemented at the C language level (implemented in clang), but also at the LLVM IR level. This is done by selecting a different DataLayout string depending on whether the vector ABI is in effect or not. Based on a patch by Richard Sandiford. llvm-svn: 236521	2015-05-05 19:25:42 +00:00
Ulrich Weigand	a8b04e1cbc	[SystemZ] Add z13 vector facility and MC support This patch adds support for the z13 processor type and its vector facility, and adds MC support for all new instructions provided by that facility. Apart from defining the new instructions, the main changes are: - Adding VR128, VR64 and VR32 register classes. - Making FP64 a subclass of VR64 and FP32 a subclass of VR32. - Adding a D(V,B) addressing mode for scatter/gather operations - Adding 1-, 2-, and 3-bit immediate operands for some 4-bit fields. Until now all immediate operands have been the same width as the underlying field (hence the assert->return change in decode[SU]ImmOperand). In addition, sys::getHostCPUName is extended to detect running natively on a z13 machine. Based on a patch by Richard Sandiford. llvm-svn: 236520	2015-05-05 19:23:40 +00:00
Reid Kleckner	0738a9c02e	Re-land "[WinEH] Add an EH registration and state insertion pass for 32-bit x86" This reverts commit r236360. This change exposed a bug in WinEHPrepare by opting win32 code into EH preparation. We already knew that WinEHPrepare has bugs, and is the status quo for x64, so I don't think that's a reason to hold off on this change. I disabled exceptions in the sanitizer tests in r236505 and an earlier revision. llvm-svn: 236508	2015-05-05 17:44:16 +00:00
Quentin Colombet	61b305edfd	[ShrinkWrap] Add (a simplified version) of shrink-wrapping. This patch introduces a new pass that computes the safe point to insert the prologue and epilogue of the function. The interest is to find safe points that are cheaper than the entry and exit blocks. As an example and to avoid introducing regressions, this patch also implements the required bits to enable the shrink-wrapping pass for AArch64. Context Currently we insert the prologue and epilogue of the method/function in the entry and exit blocks. Although this is correct, we can do a better job when those are not immediately required and insert them at less frequently executed places. The job of the shrink-wrapping pass is to identify such places. Motivating example Let us consider the following function that performs a call in only one branch of an if: define i32 @f(i32 %a, i32 %b) { %tmp = alloca i32, align 4 %tmp2 = icmp slt i32 %a, %b br i1 %tmp2, label %true, label %false true: store i32 %a, i32* %tmp, align 4 %tmp4 = call i32 @doSomething(i32 0, i32* %tmp) br label %false false: %tmp.0 = phi i32 [ %tmp4, %true ], [ %a, %0 ] ret i32 %tmp.0 } On AArch64 this code generates (removing the cfi directives for readability): _f: ; @f ; BB#0: stp x29, x30, [sp, #-16]! mov x29, sp sub sp, sp, #16 ; =16 cmp w0, w1 b.ge LBB0_2 ; BB#1: ; %true stur w0, [x29, #-4] sub x1, x29, #4 ; =4 mov w0, wzr bl _doSomething LBB0_2: ; %false mov sp, x29 ldp x29, x30, [sp], #16 ret With shrink-wrapping we could generate: _f: ; @f ; BB#0: cmp w0, w1 b.ge LBB0_2 ; BB#1: ; %true stp x29, x30, [sp, #-16]! mov x29, sp sub sp, sp, #16 ; =16 stur w0, [x29, #-4] sub x1, x29, #4 ; =4 mov w0, wzr bl _doSomething add sp, x29, #16 ; =16 ldp x29, x30, [sp], #16 LBB0_2: ; %false ret Therefore, we would pay the overhead of setting up/destroying the frame only if we actually do the call.
Proposed Solution This patch introduces a new machine pass that performs the shrink-wrapping analysis (See the comments at the beginning of ShrinkWrap.cpp for more details). It then stores the safe save and restore point into the MachineFrameInfo attached to the MachineFunction. This information is then used by the PrologEpilogInserter (PEI) to place the related code at the right place. This pass runs right before the PEI. Unlike the original paper of Chow from PLDI’88, this implementation of shrink-wrapping does not use expensive data-flow analysis and does not need a hack to properly avoid frequently executed points. Instead, it relies on dominance and loop properties. The pass is off by default and each target can opt-in by setting the EnableShrinkWrap boolean to true in their derived class of TargetPassConfig. This setting can also be overwritten on the command line by using -enable-shrink-wrap. Before you try out the pass for your target, make sure you properly fix your emitProlog/emitEpilog/adjustForXXX method to cope with basic blocks that are not necessarily the entry block. Design Decisions 1. ShrinkWrap is its own pass right now. It could frankly be merged into PEI but for debugging and clarity I thought it was best to have its own file. 2. Right now, we only support one save point and one restore point. At some point we can expand this to several save points and restore points, the impacted components would then be: - The pass itself: New algorithm needed. - MachineFrameInfo: Hold a list or set of Save/Restore points instead of one pointer. - PEI: Should loop over the save points and restore points. Anyhow, at least for this first iteration, I do not believe it is interesting to support the complex cases. We should revisit that when we have motivating examples. Differential Revision: http://reviews.llvm.org/D9210 <rdar://problem/3201744> llvm-svn: 236507	2015-05-05 17:38:16 +00:00
Kit Barton	d4eb73c00e	This patch adds ABI support for v1i128 data type. It adds v1i128 to the appropriate register classes and checks parameter passing and return values. This is related to http://reviews.llvm.org/D9081, which will add instructions that exploit the v1i128 datatype. Phabricator review: http://reviews.llvm.org/D9475 llvm-svn: 236503	2015-05-05 16:10:44 +00:00
Daniel Sanders	eda60d217b	[mips] Generate code for insert/extract operations when using the N64 ABI and MSA. Summary: When using the N64 ABI, element-indices use the i64 type instead of i32. In many cases, we can use iPTR to account for this but additional patterns and pseudo's are also required. This fixes most (but not quite all) failures in the test-suite when using N64 and MSA together. Reviewers: vkalintiris Reviewed By: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9342 llvm-svn: 236494	2015-05-05 10:32:24 +00:00
Daniel Sanders	4160c802d9	[mips][msa] Test basic operations for the N32 ABI too. Summary: This required adding instruction aliases for dneg. N64 will be enabled shortly but requires additional bugfixes. Reviewers: vkalintiris Reviewed By: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9341 llvm-svn: 236489	2015-05-05 08:48:35 +00:00
Reid Kleckner	9dad227b85	[X86] Fix assertion while DAG combining offsets and ExternalSymbols ExternalSymbol nodes do not contain offsets, unlike GlobalValue nodes. llvm-svn: 236471	2015-05-04 23:22:36 +00:00
Pete Cooper	4dddbcfbb1	[ARM] IT block insertion needs to update kill flags When forming an IT block from the first MOV here: %R2<def> = t2MOVr %R0, pred:1, pred:%CPSR, opt:%noreg %R3<def> = tMOVr %R0<kill>, pred:14, pred:%noreg the move in to R3 is moved out of the IT block so that later instructions on the same predicate can be inside this block, and we can share the IT instruction. However, when moving the R3 copy out of the IT block, we need to clear its kill flags for anything in use at this point in time, i.e., R0 here. This appeases the machine verifier which thought that R0 wasn't defined when used. I have a test case, but it's extremely register allocator specific. It would be too fragile to commit a test which depends on the register allocator here. llvm-svn: 236468	2015-05-04 22:44:47 +00:00
Reid Kleckner	b61f06c9c2	Fix -Wmicrosoft warning by making enum unsigned llvm-svn: 236436	2015-05-04 18:21:35 +00:00
Ulrich Weigand	9ac2f9b2d8	[SystemZ] Reclassify f32 subregs of f64 registers At the moment, all subregs defined by the SystemZ target can be modified independently of the wider register. E.g. writing to a GR32 does not change the upper 32 bits of the GR64. Writing to an FP32 does not change the lower 32 bits of the FP64. However, the upcoming support for the vector extension redefines FP64 as one half of a V128. Floating-point operations leave the other half of a V128 in an unpredictable state, so it's no longer the case that writing to an FP32 leaves the bits of the underlying register (the V128) alone. I'd prefer to have separate subreg_ names for this situation, so that it's obvious at a glance whether we're talking about a subreg that leaves the other parts of the register alone. No behavioral change intended. Patch originally by Richard Sandiford. llvm-svn: 236433	2015-05-04 17:41:22 +00:00
Ulrich Weigand	1f698b003c	[SystemZ] Clean up AsmParser isMem() handling We know what MemoryKind an operand has at the time we construct it, so we might as well just record it in an unused part of the structure. This makes it easier to add scatter/gather addresses later. No behavioral change intended. Patch originally by Richard Sandiford. llvm-svn: 236432	2015-05-04 17:40:53 +00:00
Ulrich Weigand	1c6f07d616	[SystemZ] Fix getTargetNodeName It seems SystemZTargetLowering::getTargetNodeName got out of sync with some recent changes to the SystemZISD opcode list. Add back all the missing opcodes (and re-sort to the same order as SystemZISelLowering.h). llvm-svn: 236430	2015-05-04 17:39:40 +00:00
Tom Stellard	b81f4aa952	R600/SI: Code cleanup This is a follow-up to r236004 llvm-svn: 236427	2015-05-04 16:45:08 +00:00
Elena Demikhovsky	60eb9db7bb	AVX-512: added calling convention for i1 vectors in 32-bit mode. Fixed some bugs in extend/truncate for AVX-512 target. Removed VBROADCASTM (masked broadcast) node, since it is not used any more. llvm-svn: 236420	2015-05-04 12:40:50 +00:00
Elena Demikhovsky	52266388f8	AVX-512: added integer "add" and "sub" instructions with saturation for SKX with intrinsics and tests by Asaf Badouh (asaf.badouh@intel.com) llvm-svn: 236418	2015-05-04 12:35:55 +00:00
Elena Demikhovsky	2557a22be7	AVX-512: Added VPACK* instructions forms for KNL and SKX and their intrinsics by Asaf Badouh (asaf.badouh@intel.com) llvm-svn: 236414	2015-05-04 09:14:02 +00:00
Elena Demikhovsky	1b60ed7069	Masked gather and scatter intrinsics - enabled codegen for KNL. llvm-svn: 236394	2015-05-03 07:12:25 +00:00
Simon Pilgrim	d5e20306cc	[SSE2] Minor tidyup of v16i8 SHL lowering. NFC. Removed code that was replicating v8i16 'shift + mask' implementation that is done more nicely by making use of LowerScalarImmediateShift llvm-svn: 236388	2015-05-02 14:42:43 +00:00
Reid Kleckner	83d89fa546	Revert "[WinEH] Add an EH registration and state insertion pass for 32-bit x86" This reverts commit r236359. Things are still broken despite testing. :( llvm-svn: 236360	2015-05-01 22:50:14 +00:00
Reid Kleckner	51476acd77	Re-land "[WinEH] Add an EH registration and state insertion pass for 32-bit x86" This reverts commit r236340. llvm-svn: 236359	2015-05-01 22:40:25 +00:00
Quentin Colombet	0de2346859	[AArch64][FastISel] Variant of the logical instructions that use two input registers cannot write on SP. rdar://problem/20748715 llvm-svn: 236352	2015-05-01 21:34:57 +00:00
Colin LeMahieu	6efd273a61	[Hexagon] Removing variable unused in release. llvm-svn: 236351	2015-05-01 21:30:22 +00:00
Colin LeMahieu	b662565475	[Hexagon] Adding expression MC emission and removing XFAIL from test that hits this code path. llvm-svn: 236348	2015-05-01 21:14:21 +00:00
Quentin Colombet	9df2fa261b	[AArch64][FastISel] Fix the setting of kill flags for MUL -> UMULH sequences. rdar://problem/20748715 llvm-svn: 236346	2015-05-01 20:57:11 +00:00
Reid Kleckner	2747d3d55a	Revert "[WinEH] Add an EH registration and state insertion pass for 32-bit x86" This reverts commit r236339, it breaks the win32 clang-cl self-host. llvm-svn: 236340	2015-05-01 20:14:04 +00:00
Reid Kleckner	4856fc61b4	[WinEH] Add an EH registration and state insertion pass for 32-bit x86 This pass is responsible for constructing the EH registration object that gets linked into fs:00, which is all it does in this change. In the future, it will also insert stores to update the EH state number. I considered keeping this functionality in WinEHPrepare, but it's pretty separable and X86 specific. It has conceptually very little to do with the task of WinEHPrepare, which is currently outlining. WinEHPrepare is also in theory useful on ARM, but this logic is pretty x86 specific. Reviewers: andrew.w.kaylor, majnemer Differential Revision: http://reviews.llvm.org/D9422 llvm-svn: 236339	2015-05-01 20:04:54 +00:00
Pete Cooper	f68d5038e6	[ARM] Transfer the internal flag in thumb2 size reduction. Converting from t2LDRs to tLDRr caused the shift argument to drop the internal flag. This would then throw machine verifier errors. Unfortunately I'm having trouble reducing a test case. I'm going to keep trying, but so far it's a scary combination of machine sinking, an 'and i1', loads feeding loads, and a bunch of code which shouldn't change IT block formation, but does. It's not useful to commit a test in that state as we have no way of knowing if it even hits this code reliably in future. rdar://problem/20752113 llvm-svn: 236333	2015-05-01 18:57:32 +00:00
Peter Collingbourne	d27d3a151f	ARM: Align functions containing Thumb-2 jump tables to 4 bytes. Functions with jump tables need an alignment of 4 because they use the ADR instruction, which aligns the PC to 4 bytes before adding an offset. Differential Revision: http://reviews.llvm.org/D9424 llvm-svn: 236327	2015-05-01 18:05:59 +00:00
James Y Knight	35e04e84fa	[Sparc] Repair fixups in little endian mode. Differential Revision: http://reviews.llvm.org/D9434 llvm-svn: 236324	2015-05-01 17:13:02 +00:00
Toma Tabacu	00e9867988	[mips] [IAS] Fix error messages for using LI with 64-bit immediates. Summary: LI should never accept immediates larger than 32 bits. The additional Is32BitImm boolean also paves the way for unifying the functionality that LA and LI have in common. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9289 llvm-svn: 236313	2015-05-01 12:19:27 +00:00
Toma Tabacu	a2861db834	[mips] [IAS] Slightly improve shift instruction generation in expandLoadImm. Summary: Generate one DSLL32 of 0 instead of two consecutive DSLL of 16. In order to do this I had to change createLShiftOri's template argument from a bool to an unsigned. This also gave me the opportunity to rewrite the mips64-expansions.s test, as it was testing the same cases multiple times and skipping over other cases. It was also somewhat unreadable, as the CHECK lines were grouped in a huge block of text at the beginning of the file. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8974 llvm-svn: 236311	2015-05-01 10:26:47 +00:00
Tom Stellard	aa798340c3	R600/SI: Add VCC as an implicit def of SI_KILL When SI_KILL has a register operand, its lowered form writes to vcc. llvm-svn: 236307	2015-05-01 03:44:09 +00:00
Tom Stellard	0b7feb1cb7	R600/SI: Fix verifier errors from the SIAnnotateControlFlow pass This pass was generating 'Instruction does not dominate all uses!' errors for programs which had loops with a condition variable that depended on the result of a phi instruction from outside of the loop. The pass was inserting new phi nodes outside of the loop which used values defined inside the loop. http://bugs.freedesktop.org/show_bug.cgi?id=90056 llvm-svn: 236306	2015-05-01 03:44:08 +00:00
Pete Cooper	2127b00cd5	[ARM] optimizeSelect should clear kill flags. If we move an instruction from one block down to a MOVC and predicate it, then the original instruction could be moved in to a loop. In this case, it's invalid for any kill flags to remain on there. Fails with -verify-machineinstrs. rdar://problem/20752113 llvm-svn: 236290	2015-04-30 23:57:47 +00:00
Quentin Colombet	329fa890ba	[AArch64] Fix bad register class constraint in fast-isel for TST instruction. rdar://problem/20748715 llvm-svn: 236273	2015-04-30 22:27:20 +00:00
Pete Cooper	5111881cfc	Don't always apply kill flag in thumb2 ABS pseudo expansion. The expansion for t2ABS was always setting the kill flag on the rsb instruction. It should instead only be set on rsb if it was set on the original ABS instruction. rdar://problem/20752113 llvm-svn: 236272	2015-04-30 22:15:59 +00:00
Reid Kleckner	60d5232be2	[X86] Use 4 byte preferred aggregate alignment on Win32 This helps reduce the frequency of stack realignment prologues in 32-bit X86 Windows code. Before this change and the corresponding clang change, we would take the max of the type preferred alignment and the explicit alignment on the alloca. If you don't override aggregate alignment in datalayout, you get a default of 8. This dates back to 2007 / r34356, and changing it seems prohibitively difficult at this point. llvm-svn: 236270	2015-04-30 22:11:59 +00:00
Matt Arsenault	d42e017ee4	Mips: Remove dead declaration llvm-svn: 236250	2015-04-30 19:35:43 +00:00
Quentin Colombet	0a905042cd	[ARM] Do not generate invalid encoding for stack adjust, even if this is just temporary. Because of that: 1. The machine verifier was complaining about such code. 2. The generated code worked just because the thumb reduction size pass fixed the opcode. rdar://problem/20749824 llvm-svn: 236247	2015-04-30 18:52:49 +00:00
Tim Northover	03b99f66d7	AArch64: add BFC alias for the BFI/BFM instructions. Unlike 32-bit ARM, AArch64 can use wzr/xzr to implement this without the need for a separate instruction. rdar://18679590 llvm-svn: 236245	2015-04-30 18:28:58 +00:00
Jan Vesely	808fff585b	Reinstate revisions r234755, r234759, r234760 changes: Don't apply on hexagon and NVPTX since they no longer claim to support UADDO/USUBO Add location to getConstant Drop comment about the ops being turned into expand llvm-svn: 236240	2015-04-30 17:15:56 +00:00
Daniel Jasper	232778a7a0	Silence unused warning in non-assert builds. llvm-svn: 236213	2015-04-30 09:01:21 +00:00
Elena Demikhovsky	e1eda8a9e6	Masked gather and scatter - added DAGCombine visitors and AVX-512 instruction selection patterns. All other patches, including tests will follow. http://reviews.llvm.org/D7665 llvm-svn: 236211	2015-04-30 08:38:48 +00:00
Simon Pilgrim	ecf5875bd5	[SSE] Fix for MUL v16i8 on pre-SSE41 targets (PR23369). Sign extension of i8 to i16 was placing the unpacked bytes in the lower byte instead of the upper byte. llvm-svn: 236209	2015-04-30 08:23:16 +00:00
Pete Cooper	46361a1ea1	Change x86 CMOVE_F to read its source, not write it. This was breaking sqlite with the machine verifier because operand 0 was a def according to tablegen, but didn't have the 'isDef' flag set. Looking at the ISA, it's clear that this operand is a source as writing to st(0) is implicit. So move the operand to the correct place in the td file. rdar://problem/20751584 llvm-svn: 236183	2015-04-29 23:51:33 +00:00
Douglas Katzman	9160e78ac8	[Sparc] Really add sparcel architecture support. Mostly copy-and-paste from Sparc v8 architecture. Differential Revision: http://reviews.llvm.org/D8741 llvm-svn: 236146	2015-04-29 20:30:57 +00:00
Manman Ren	0e20822887	[AArch64] Refactor out codes that depend on specific CS save sequence. No functionality change. llvm-svn: 236143	2015-04-29 20:03:38 +00:00
Tim Northover	5211715360	ARM: mark branch-like instructions with correct flags. There's probably no way to test BXJ, but if the compiler ever did emit it during CodeGen it would have to be a block terminator so "isBranch" is appropriate. BLX is more tricky. Clearly a call, but it affects surprisingly little. rdar://18719544 llvm-svn: 236140	2015-04-29 19:16:38 +00:00
Douglas Katzman	9cb88b73c6	Make Sparc assembler accept parenthesized constant expressions. Differential Revision: http://reviews.llvm.org/D9087 llvm-svn: 236137	2015-04-29 18:48:29 +00:00
Zoran Jovanovic	387ce30685	[mips][microMIPSr6] Implement MUL, MUH, MULU and MUHU instructions Differential Revision: http://reviews.llvm.org/D8894 llvm-svn: 236131	2015-04-29 17:23:22 +00:00

33674 Commits