llvm-project

Commit Graph

Author	SHA1	Message	Date
Clement Courbet	d5f6182bec	use repmovsb when optimizing for minsize llvm-svn: 300960	2017-04-21 09:20:55 +00:00
Clement Courbet	203fc17797	Rename FastString flag. llvm-svn: 300959	2017-04-21 09:20:50 +00:00
Clement Courbet	1ce3b82dea	X86 memcpy: use REPMOVSB instead of REPMOVS{Q,D,W} for inline copies when the subtarget has fast strings. This has two advantages: - Speed is improved. For example, on Haswell throughput improvements increase linearly with size from 256 to 512 bytes, after which they plateau (e.g. 1% for 260 bytes, 25% for 400 bytes, 40% for 508 bytes). - Code is much smaller (no need to handle boundaries). llvm-svn: 300957	2017-04-21 09:20:39 +00:00
Clement Courbet	8177fee513	Delete dead code llvm-svn: 300952	2017-04-21 07:40:59 +00:00
Artyom Skrobov	8d9643009f	[Thumb1] The recently added tADCS and tSBCS pseudo-instructions were missing `Uses = [CPSR]` Summary: Thanks to Oliver Stannard for helping catch this. Reviewers: olista01, efriedma Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D31815 llvm-svn: 300951	2017-04-21 07:35:21 +00:00
Akira Hatanaka	78ccba6a20	Revert r300932 and r300930. It seems that r300930 was creating an infinite loop in dag-combine when compiling the following file: MultiSource/Benchmarks/MiBench/consumer-typeset/z21.c llvm-svn: 300940	2017-04-21 01:31:50 +00:00
Akira Hatanaka	e52caddae8	[AArch64] Use suffix ULL to shift a 64-bit value. llvm-svn: 300932	2017-04-21 00:35:27 +00:00
Akira Hatanaka	19077aaee0	[AArch64] Improve code generation for logical instructions taking immediate operands. This commit adds an AArch64 dag-combine that optimizes code generation for logical instructions taking immediate operands. The optimization uses demanded bits to change a logical instruction's immediate operand so that the immediate can be folded into the immediate field of the instruction. This recommits r300913, which broke bots because I didn't fix a call to ShrinkDemandedConstant in SIISelLowering.cpp after changing the APIs of TargetLoweringOpt and TargetLowering. rdar://problem/18231627 Differential Revision: https://reviews.llvm.org/D5591 llvm-svn: 300930	2017-04-21 00:05:16 +00:00
Matthias Braun	9610a26251	X86RegisterInfo: eliminateFrameIndex: Avoid code duplication; NFC X86RegisterInfo::eliminateFrameIndex() and X86FrameLowering::getFrameIndexReference() both had logic to compute the base register. This consolidates the code. Also use MachineInstr::isReturn instead of manually enumerating tail call instructions (return instructions were not included in the previous list because they never reference frame indexes). Differential Revision: https://reviews.llvm.org/D32206 llvm-svn: 300923	2017-04-20 23:34:50 +00:00
Matthias Braun	63e3e8ce72	X86RegisterInfo: eliminateFrameIndex: Force SP for AfterFPPop; NFC AfterFPPop is used for tailcall/tailjump instructions. We shouldn't ever have frame-pointer/base-pointer relative addressing for those. After all the frame/base pointer should already be restored to their previous values at the return. Make this fact explicit in preparation for an upcoming refactoring. Differential Revision: https://reviews.llvm.org/D32205 llvm-svn: 300922	2017-04-20 23:34:46 +00:00
Akira Hatanaka	7b06cebe73	Revert "[AArch64] Improve code generation for logical instructions taking" This reverts r300913. This broke bots. llvm-svn: 300916	2017-04-20 23:03:30 +00:00
Akira Hatanaka	e327f09832	[AArch64] Improve code generation for logical instructions taking immediate operands. This commit adds an AArch64 dag-combine that optimizes code generation for logical instructions taking immediate operands. The optimization uses demanded bits to change a logical instruction's immediate operand so that the immediate can be folded into the immediate field of the instruction. rdar://problem/18231627 Differential Revision: https://reviews.llvm.org/D5591 llvm-svn: 300913	2017-04-20 22:47:56 +00:00
Tim Northover	100b7f6eae	AArch64: lower "fence singlethread" to a pure compiler barrier. Single-threaded fences aren't required to provide any synchronization with other processing elements so there's no need for a DMB. They should still be a barrier for compiler optimizations though. llvm-svn: 300905	2017-04-20 21:57:45 +00:00
Tim Northover	46e58354da	ARM: lower "fence singlethread" to a pure compiler barrier. Single-threaded fences aren't required to provide any synchronization with other processing elements so there's no need for a DMB. They should still be a barrier for compiler optimizations though. llvm-svn: 300904	2017-04-20 21:56:52 +00:00
Chad Rosier	4279c58ec4	[AArch64] Whitespace/ordering fixes for Falkor machine description. NFC. llvm-svn: 300893	2017-04-20 21:11:17 +00:00
Chad Rosier	a56bdbe62d	[AArch64] Refine Falkor machine description for pre/post-inc and stores. llvm-svn: 300892	2017-04-20 21:11:09 +00:00
Tim Northover	8b1240b0f0	ARM: handle post-indexed NEON ops where the offset isn't the access width. Before, we assumed that any ConstantInt offset was precisely the access width, so we could use the "[rN]!" form. ISelLowering only ever created that kind, but further simplification during combining could lead to unexpected constants and incorrect codegen. Should fix PR32658. llvm-svn: 300878	2017-04-20 19:54:02 +00:00
Chad Rosier	9f25dd56a8	[AArch64] Improve scheduling of logical operations on Falkor. llvm-svn: 300871	2017-04-20 18:50:21 +00:00
Weiming Zhao	962c5a3aec	[Thumb-1] Fix corner cases for compressed jump tables Summary: When synthesized TBB/TBH is expanded, we need to avoid the case where BaseReg is redefined after the load of the branch target. E.g.: %R2 = tLEApcrelJT <jt#1>; %R1 = tLDRr %R1, %R2; %R2 = tLDRspi %SP, 12; tBR_JTr %R1 ==> %R2 = tLEApcrelJT <jt#1>; %R2 = tLDRspi %SP, 12; tTBB_JT %R2, %R1 Reviewers: jmolloy Reviewed By: jmolloy Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D32250 llvm-svn: 300870	2017-04-20 18:37:14 +00:00
Benjamin Kramer	58dadd59d9	Fix use-after-frees on memory allocated in a Recycler. This will become asan errors once the patch lands that poisons the memory after free. The x86 change is a hack, but I don't see how to solve this properly at the moment. llvm-svn: 300867	2017-04-20 18:29:14 +00:00
Sam Clegg	90d99413ac	[WebAssembly] Add known failures for wasm object file backend Subscribers: jfb, dschuff Differential Revision: https://reviews.llvm.org/D32300 llvm-svn: 300859	2017-04-20 17:18:15 +00:00
Craig Topper	bcfd2d1789	[APInt] Rename getSignBit to getSignMask getSignBit is a static function that creates an APInt with only the sign bit set. getSignMask seems like a better name to convey its functionality. In fact several places use it and then store in an APInt named SignMask. Differential Revision: https://reviews.llvm.org/D32108 llvm-svn: 300856	2017-04-20 16:56:25 +00:00
Petar Jovanovic	2b6fe3ffa6	[mips][msa] Mask vectors holding shift amounts Masked vectors which hold shift amounts when creating the following nodes: ISD::SHL, ISD::SRL or ISD::SRA. Instructions that use said nodes, which have had their arguments altered are sll, srl, sra, bneg, bclr and bset. For said instructions, the shift amount or the bit position that is specified in the corresponding vector elements will be interpreted as the shift amount/bit position modulo the size of the element in bits. The problem lies in compiling with -O2 enabled, where the instructions for formats .w and .d are not generated, but are instead optimized away. In this case, having shift amounts that are either negative or greater than the element bit size results in generation of incorrect results when constant folding. We remedy this by masking the operands for the nodes mentioned above before actually creating them, so that the final result is correct before placed into the constant pool. Patch by Stefan Maksimovic. Differential Revision: https://reviews.llvm.org/D31331 llvm-svn: 300839	2017-04-20 13:26:46 +00:00
John Brawn	66719f63d0	[ARM] Fix handling of mapping symbols when changing sections ChangeSection incorrectly registers LastEMSInfo as belonging to the previous section, not the current section. This happens to work when changing sections using .section, as the previous section is set to the current section before the call to ChangeSection, but not when using .popsection. Differential Revision: https://reviews.llvm.org/D32225 llvm-svn: 300831	2017-04-20 10:18:13 +00:00
John Brawn	5ca5daa6b9	[AArch64] Fix handling of zero immediate in fmov instructions Currently fmov #0 with a vector destination is handled incorrectly and results in fmov #-1.9375 being emitted, but should instead give an error. This is due to the way we cope with fmov #0 with a scalar destination being an alias of fmov zr, so fix this by actually doing it through an alias. Differential Revision: https://reviews.llvm.org/D31949 llvm-svn: 300830	2017-04-20 10:13:54 +00:00
John Brawn	dcf037a6f0	[AArch64] Fix handling of integer fp immediates When an integer is used as an fp immediate we're failing to check the return value of getFP64Imm, so invalid values are silently permitted. Fix this by merging together the integer and real handling. llvm-svn: 300828	2017-04-20 10:10:10 +00:00
Diana Picus	7c6dee9f16	[ARM] Rename HW div feature to HW div Thumb. NFCI. The hardware div feature refers only to Thumb, but because of its name it is tempting to use it to check for hardware division in general, which may cause problems in ARM mode. See https://reviews.llvm.org/D32005. This patch adds "Thumb" to its name, to make its scope clear. One notable place where I haven't made the change is in the feature flag (used with -mattr), which is still hwdiv. Changing it would also require changes in a lot of tests, including clang tests, and it doesn't seem like it's worth the effort. Differential Revision: https://reviews.llvm.org/D32160 llvm-svn: 300827	2017-04-20 09:38:25 +00:00
Kannan Narayanan	2fb5960121	Revert earlier change. ds permute operations affect lgkm counter. Differential Revision: https://reviews.llvm.org/D32254 llvm-svn: 300791	2017-04-19 23:39:19 +00:00
Matthias Braun	372ee59766	X86FrameLowering: Fix getFrameIndexReference() for 'fixed' objects Debug information is calculated with getFrameIndexReference() which was missing some logic for the fixed object cases (= parameters on the stack). rdar://24557797 Differential Revision: https://reviews.llvm.org/D32204 llvm-svn: 300781	2017-04-19 23:10:43 +00:00
Matthias Braun	8aaa368d00	ARMFrameLowering: Reserve emergency spill slot for large arguments Re-commit after revert in r300668. Changed getMaxFPOffset() to a more conservative heuristic instead of trying to be clever and missing for some exotic calling conventions. We need to reserve an emergency spill slot in cases with large argument types that could overflow immediate offsets for FP relative address calculations. rdar://31317893 Differential Revision: https://reviews.llvm.org/D31643 llvm-svn: 300761	2017-04-19 21:11:44 +00:00
Matt Arsenault	4a48623e4f	AMDGPU: Custom lower illegal small select types Promote them to i32 vectors to avoid unpacking and re-packing the vectors. llvm-svn: 300754	2017-04-19 20:53:07 +00:00
Eli Friedman	70ad2751d5	[ARM] Remove redundant computeKnownBits helper. Move the BFI logic to computeKnownBitsForTargetNode, and delete the redundant CMOV logic. This is intended as a cleanup, but it's probably possible to construct a case where moving the BFI logic allows more combines. Differential Revision: https://reviews.llvm.org/D31795 llvm-svn: 300752	2017-04-19 20:50:57 +00:00
Aditya Nandakumar	75ad9ccbfa	[GISEL]: Move getConstantVReg to Utils NFCI llvm-svn: 300751	2017-04-19 20:48:50 +00:00
Eli Friedman	f281d490cc	[ARM] Use TableGen patterns to select vtbl. NFC. Differential Revision: https://reviews.llvm.org/D32103 llvm-svn: 300749	2017-04-19 20:39:39 +00:00
Dehao Chen	58601674d2	PR32710: Disable using PMADDWD for unsigned short. Summary: PMADDWD can only handle signed short. Reviewers: mkuper, wmi Reviewed By: mkuper Subscribers: andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D32236 llvm-svn: 300737	2017-04-19 19:50:34 +00:00
Matt Arsenault	021a218dd2	AMDGPU: Don't emit amd_kernel_code_t for callable functions This is inserted directly in the text section. The relocation for the function ends up resolving to the beginning of the amd_kernel_code_t header rather than the actual function entry point. Also skip some of the comments for initialization that only makes sense for kernels. llvm-svn: 300736	2017-04-19 19:38:10 +00:00
Tim Northover	ff168c68dc	ARM: TLS calling convention doesn't preserve r9 or r12 on Darwin. llvm-svn: 300726	2017-04-19 18:07:54 +00:00
Matt Arsenault	6cb7b8a42f	AMDGPU: Don't align callable functions to 256 llvm-svn: 300720	2017-04-19 17:42:39 +00:00
Matt Arsenault	4c1ecded63	AMDGPU: Change DivergenceAnalysis for function arguments Stop assuming all functions are kernels. llvm-svn: 300719	2017-04-19 17:42:34 +00:00
Krzysztof Parzyszek	333b2bf2ed	[Hexagon] Generate proper offset in opt-addr-mode Also, make a few changes to allow using the pass in .mir testcases. Among other things, change the abbreviation from opt-amode to amode-opt, because otherwise lit would expand the "opt" part to the full path to the opt binary. llvm-svn: 300707	2017-04-19 15:15:51 +00:00
Krzysztof Parzyszek	634f57e0bb	[Hexagon] Remove RDefMap, use Liveness:getNearestAliasedRef instead llvm-svn: 300706	2017-04-19 15:14:30 +00:00
Krzysztof Parzyszek	0de74f315d	[RDF] Switch NodeList to SmallVector from std::vector The list has a single element 75+% of the time, reservation of 4 elements is sufficient in 95% of cases. llvm-svn: 300705	2017-04-19 15:12:44 +00:00
Krzysztof Parzyszek	7c69a3b490	[RDF] Use faster version of findBlock llvm-svn: 300704	2017-04-19 15:11:23 +00:00
Krzysztof Parzyszek	6aa3a3f00b	[RDF] Cache register units for reg masks instead of recalculating them llvm-svn: 300702	2017-04-19 15:10:09 +00:00
Krzysztof Parzyszek	5bfaf56ee5	[Hexagon] Cache reached blocks in bit tracker instead of scanning list llvm-svn: 300701	2017-04-19 15:08:31 +00:00
Igor Breger	4fdf1e489c	[GlobalIsel][X86] support G_TRUNC selection. Summary: Add regbank-select and legalizer tests. Legalization of trunc i64 on 32-bit platforms is currently not supported. Reviewers: ab, zvi, rovka Reviewed By: zvi Subscribers: dberris, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D32115 llvm-svn: 300678	2017-04-19 11:34:59 +00:00
Renato Golin	742aed8683	Revert "ARMFrameLowering: Reserve emergency spill slot for large arguments" This reverts commit r300639, as it broke self-hosting on ARM. PR32709. llvm-svn: 300668	2017-04-19 09:02:52 +00:00
Diana Picus	49472ff1cf	[ARM] GlobalISel: Add support for G_MUL Support G_MUL, very similar to G_ADD and G_SUB. The only difference is in the instruction selector, where we have to select either MUL or MULv5 depending on the target. llvm-svn: 300665	2017-04-19 07:29:46 +00:00
Kristof Beyls	0f36e68f62	[GlobalISel] Support vector-of-pointers in LLT This fixes PR32471. As comment 10 on that bug report highlights (https://bugs.llvm.org//show_bug.cgi?id=32471#c10), there are quite a few different defendable design tradeoffs that could be made, including not representing pointers at all in LLT. I decided to go for representing vector-of-pointer as a concept in LLT, while keeping the size of the LLT type 64 bits (this is an increase from 48 bits before). My rationale for keeping pointers explicit is that on some targets probably it's very handy to have the distinction between pointer and non-pointer (e.g. 68K has a different register bank for pointers IIRC). If we keep a scalar pointer, it probably is easiest to also have a vector-of-pointers to keep LLT relatively conceptually clean and orthogonal, while we don't have a very strong reason to break that orthogonality. Once we gain more experience on the use of LLT, we can of course reconsider this direction. Rejecting vector-of-pointer types in the IRTranslator is also an option to avoid the crash reported in PR32471, but that is only a very short-term solution; also needs quite a bit of code tweaks in places, and is probably fragile. Therefore I didn't consider this the best option. llvm-svn: 300664	2017-04-19 07:23:57 +00:00
Serge Pavlov	5943a96d81	ARM: Use methods to access data stored with frame instructions In r300196 several methods were added to TargetInstrInfo to access data stored with call frame setup/destroy instructions. This change replaces calls to getOperand with calls to such special methods in the ARM target. Differential Revision: https://reviews.llvm.org/D32127 llvm-svn: 300655	2017-04-19 03:12:05 +00:00
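
Note on 2b6fe3ffa6 ([mips][msa] Mask vectors holding shift amounts): the message says the affected instructions interpret each element's shift amount modulo the element size in bits, and that constant folding has to mask the amounts to agree with that. The rule can be sketched in Python as below. This is an illustration only, not LLVM code; the names mask_shift_amount and fold_shl are hypothetical, and the element width is assumed to be a power of two.

```python
def mask_shift_amount(amount: int, elt_bits: int) -> int:
    """Reduce a shift amount modulo the element width.

    Assumes elt_bits is a power of two, so a bitwise AND is enough.
    Negative amounts also map into the range [0, elt_bits).
    """
    return amount & (elt_bits - 1)


def fold_shl(values, amounts, elt_bits):
    """Constant-fold an element-wise left shift using masked amounts."""
    lane_mask = (1 << elt_bits) - 1  # keep each result within one lane
    return [
        (v << mask_shift_amount(a, elt_bits)) & lane_mask
        for v, a in zip(values, amounts)
    ]
```

With 32-bit elements, an amount of 33 behaves like 1 and an amount of -1 behaves like 31, which is the behaviour the commit message describes for out-of-range shift amounts.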


42166 Commits