llvm-project

Commit Graph

Author	SHA1	Message	Date
Momchil Velikov	d2cc6fd90b	[ARM] Accept a subset of Thumb GPR register class when emitting an SP-relative load instruction The function `Thumb1InstrInfo::loadRegFromStackSlot` accepts only the `tGPR` register class. The function serves to emit a `tLDRspi` instruction and certainly any subset of the `tGPR` register class is a valid destination of the load. Differential revision: https://reviews.llvm.org/D42535 llvm-svn: 323514	2018-01-26 10:20:58 +00:00
Sjoerd Meijer	011de9c0ca	[ARM] Armv8.2-A FP16 code generation (part 1/3) This is the groundwork for Armv8.2-A FP16 code generation . Clang passes and returns _Float16 values as floats, together with the required bitconverts and truncs etc. to implement correct AAPCS behaviour, see D42318. We will implement half-precision argument passing/returning lowering in the ARM backend soon, but for now this means that this: _Float16 sub(_Float16 a, _Float16 b) { return a + b; } gets lowered to this: define float @sub(float %a.coerce, float %b.coerce) { entry: %0 = bitcast float %a.coerce to i32 %tmp.0.extract.trunc = trunc i32 %0 to i16 %1 = bitcast i16 %tmp.0.extract.trunc to half <SNIP> %add = fadd half %1, %3 <SNIP> } When FullFP16 is not supported, we don't make f16 a legal type, and we get legalization for "free", i.e. nothing changes and everything works as before. And also f16 argument passing/returning is handled. When FullFP16 is supported, we do make f16 a legal type, and have 2 places that we need to patch up: f16 argument passing and returning, which involves minor tweaks to avoid unnecessary code generation for some bitcasts. As a "demonstrator" that this works for the different FP16, FullFP16, softfp modes, etc., I've added match rules to the VSUB instruction description showing that we can codegen this instruction from IR, but more importantly, also to some conversion instructions. These conversions were causing issue before in the FP16 and FullFP16 cases. I've also added match rules to the VLDRH and VSTRH desriptions, so that we can actually compile the entire half-precision sub code example above. This showed that these loads and stores had the wrong addressing mode specified: AddrMode5 instead of AddrMode5FP16, which turned out not be implemented at all, so that has also been added. This is the minimal patch that shows all the different moving parts. In patch 2/3 I will add some efficient lowering of bitcasts, and in 2/3 I will add the remaining Armv8.2-A FP16 instruction descriptions. Thanks to Sam Parker and Oliver Stannard for their help and reviews! Differential Revision: https://reviews.llvm.org/D38315 llvm-svn: 323512	2018-01-26 09:26:40 +00:00
Weiming Zhao	665784f170	[ARM] Expand long shifts for Thumb1 to __aeabi_ calls Summary: For long shifts, the inlined version takes about 20 instructions on Thumb1. To avoid the code bloat, expand to __aeabi_ calls if target is Thumb1. Reviewers: samparker Reviewed By: samparker Subscribers: samparker, aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D42401 llvm-svn: 323354	2018-01-24 18:00:57 +00:00
Martin Storsjo	4ed94a06ac	[ARM] Call __chkstk for dynamic stack allocation in all windows environments This matches what MSVC does for alloca() function calls on ARM. Even if MSVC doesn't support VLAs at the language level, it does support the alloca function. On the clang level, both the _alloca() (when emulating MSVC, which is what the alloca() function expands to) and __builtin_alloca() builtin functions, and VLAs, map to the same LLVM IR "alloca" function - so within LLVM they're not distinguishable from each other. Differential Revision: https://reviews.llvm.org/D42292 llvm-svn: 323308	2018-01-24 06:40:11 +00:00
Martin Storsjo	e8248f2e10	[GlobalMerge] Don't merge dllexport globals Merging such globals loses the dllexport attribute. Add a test to check that normal globals still are merged. Differential Revision: https://reviews.llvm.org/D42127 llvm-svn: 323307	2018-01-24 06:40:04 +00:00
Aditya Nandakumar	f2aa2af24e	[GISel]: Remove redundant copies at the end of ISel https://reviews.llvm.org/D42402 A lot of these copies are useless (copies b/w VRegs having the same regclass) and should be cleaned up. llvm-svn: 323291	2018-01-24 01:35:26 +00:00
Marina Yatsina	6fc2aaae8d	Separate ExecutionDepsFix into 4 parts: 1. ReachingDefsAnalysis - Allows to identify for each instruction what is the “closest” reaching def of a certain register. Used by BreakFalseDeps (for clearance calculation) and ExecutionDomainFix (for arbitrating conflicting domains). 2. ExecutionDomainFix - Changes the variant of the instructions in order to minimize domain crossings. 3. BreakFalseDeps - Breaks false dependencies. 4. LoopTraversal - Creatws a traversal order of the basic blocks that is optimal for loops (introduced in revision L293571). Both ExecutionDomainFix and ReachingDefsAnalysis use this to determine the order they will traverse the basic blocks. This also included the following changes to ExcecutionDepsFix original logic: 1. BreakFalseDeps and ReachingDefsAnalysis logic no longer restricted by a register class. 2. ReachingDefsAnalysis tracks liveness of reg units instead of reg indices into a given reg class. Additional changes in affected files: 1. X86 and ARM targets now inherit from ExecutionDomainFix instead of ExecutionDepsFix. BreakFalseDeps also was added to the passes they activate. 2. Comments and references to ExecutionDepsFix replaced with ExecutionDomainFix and BreakFalseDeps, as appropriate. Additional refactoring changes will follow. This commit is (almost) NFC. The only functional change is that now BreakFalseDeps will break dependency for all register classes. Since no additional instructions were added to the list of instructions that have false dependencies, there is no actual change yet. In a future commit several instructions (and tests) will be added. This is the first of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869 Most of the patches are intended at refactoring the existent code. Additional relevant reviews: https://reviews.llvm.org/D40331 https://reviews.llvm.org/D40332 https://reviews.llvm.org/D40333 https://reviews.llvm.org/D40334 Differential Revision: https://reviews.llvm.org/D40330 Change-Id: Icaeb75e014eff96a8f721377783f9a3e6c679275 llvm-svn: 323087	2018-01-22 10:05:23 +00:00
Saleem Abdulrasool	d3169b478c	test: fix ARM tests harder Remove the missed check update for the removal of the x86 specific vector call on ARM. llvm-svn: 323023	2018-01-20 01:26:46 +00:00
Saleem Abdulrasool	6f831d54f8	test: move ARM test from x86 The ARM backend is not guaranteed to be present on x86, move the test to the ARM tests. llvm-svn: 323021	2018-01-20 01:03:11 +00:00
Joel Galenson	dbc724f764	[ARM] Fix perf regression in compare optimization. Fix a performance regression caused by r322737. While trying to make it easier to replace compares with existing adds and subtracts, I accidentally stopped it from doing so in some cases. This should fix that. I'm also fixing another potential bug in that commit. Differential Revision: https://reviews.llvm.org/D42263 llvm-svn: 322972	2018-01-19 17:46:27 +00:00
Daniel Neilson	1e68724d24	Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1) Summary: This is a resurrection of work first proposed and discussed in Aug 2015: http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html and initially landed (but then backed out) in Nov 2015: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument which is required to be a constant integer. It represents the alignment of the dest (and source), and so must be the minimum of the actual alignment of the two. This change is the first in a series that allows source and dest to each have their own alignments by using the alignment attribute on their arguments. In this change we: 1) Remove the alignment argument. 2) Add alignment attributes to the source & dest arguments. We, temporarily, require that the alignments for source & dest be equal. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false) will now read call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false) Downstream users may have to update their lit tests that check for @llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script may help with updating the majority of your tests, but it does not catch all possible patterns so some manual checking and updating will be required. s~declare void @llvm\.mem(set\|cpy\|move)\.p([^(])$(.), i32, i1$~declare void @llvm.mem\1.p\2(\3, i1)~g s~call void @llvm\.memset\.p([^(])i8$i8([^])\ (.), i8 (.), i8 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i16$i8([^])\ (.), i8 (.), i16 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i32$i8([^])\ (.), i8 (.), i32 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i64$i8([^])\ (.), i8 (.), i64 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i128$i8([^])\ (.), i8 (.), i128 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i8$i8([^])\ (.), i8 (.), i8 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i8(i8\2 align \6 \3, i8 \4, i8 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i16$i8([^])\ (.), i8 (.), i16 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i16(i8\2 align \6 \3, i8 \4, i16 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i32$i8([^])\ (.), i8 (.), i32 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i32(i8\2 align \6 \3, i8 \4, i32 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i64$i8([^])\ (.), i8 (.), i64 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i64(i8\2 align \6 \3, i8 \4, i64 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i128$i8([^])\ (.), i8 (.), i128 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i128(i8\2 align \6 \3, i8 \4, i128 \5, i1 \7)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i8$i8([^])\ (.), i8([^])\ (.), i8 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i8(i8\3 \4, i8\5* \6, i8 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i16$i8([^])\ (.), i8([^])\ (.), i16 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i16(i8\3 \4, i8\5* \6, i16 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i32$i8([^])\ (.), i8([^])\ (.), i32 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i32(i8\3 \4, i8\5* \6, i32 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i64$i8([^])\ (.), i8([^])\ (.), i64 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i64(i8\3 \4, i8\5* \6, i64 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i128$i8([^])\ (.), i8([^])\ (.), i128 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i128(i8\3 \4, i8\5* \6, i128 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i8$i8([^])\ (.), i8([^])\ (.), i8 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i16$i8([^])\ (.), i8([^])\ (.), i16 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i32$i8([^])\ (.), i8([^])\ (.), i32 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i64$i8([^])\ (.), i8([^])\ (.), i64 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i128$i8([^])\ (.), i8([^])\ (.), i128 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g The remaining changes in the series will: Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. Step 3) Update Clang to use the new IRBuilder API. Step 4) Update Polly to use the new IRBuilder API. Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get\|set]Alignment() to use getDestAlignment() and getSourceAlignment() instead. Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reviewers: pete, hfinkel, lhames, reames, bollu Reviewed By: reames Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits Differential Revision: https://reviews.llvm.org/D41675 llvm-svn: 322965	2018-01-19 17:13:12 +00:00
Matthias Braun	8bb5228db9	Move tests to the correct place test/CodeGen/MIR is for testing the MIR parser/printer. Tests for passes and targets belong to test/CodeGen/TARGETNAME. llvm-svn: 322925	2018-01-19 06:08:15 +00:00
Martin Storsjo	d96be854e5	[test] Actually check the common parts in CodeGen/ARM/global-merge-external.ll. NFC. Previously, these parts weren't ever checked. The label patterns need to be extended to match successfully on macho. Differential Revision: https://reviews.llvm.org/D42126 llvm-svn: 322900	2018-01-18 21:21:48 +00:00
Francis Visoiu Mistrih	378b5f3de6	[CodeGen] Print RegClasses on MI in verbose mode r322086 removed the trailing information describing reg classes for each register. This patch adds printing reg classes next to every register when individual operands/instructions/basic blocks are printed. In the case of dumping MIR or printing a full function, by default don't print it. Differential Revision: https://reviews.llvm.org/D42239 llvm-svn: 322867	2018-01-18 17:59:06 +00:00
Reid Kleckner	1aa9061c5f	[CodeGen] Hoist common AsmPrinter code out of X86, ARM, and AArch64 Every known PE COFF target emits /EXPORT: linker flags into a .drective section. The AsmPrinter should handle this. While we're at it, use global_values() and emit each export flag with its own .ascii directive. This should make the .s file output more readable. llvm-svn: 322788	2018-01-17 23:55:23 +00:00
Eli Friedman	c60a23a6af	[LegalizeDAG] Fix ATOMIC_CMP_SWAP_WITH_SUCCESS legalization. The code wasn't zero-extending correctly, so the comparison could spuriously fail. Adds some AArch64 tests to cover this case. Inspired by D41791. Differential Revision: https://reviews.llvm.org/D41798 llvm-svn: 322767	2018-01-17 22:04:36 +00:00
Daniel Sanders	12e6e709e9	[globalisel][tablegen] Honour priority order within nested instructions. It appears that we haven't been prioritizing rules that contain nested instructions properly. InstructionOperandMatcher didn't override isHigherPriorityThan so it never compared the instructions/operands/predicates inside nested instructions. Fixes PR35926. Thanks to Diana Picus for the bug report. llvm-svn: 322754	2018-01-17 20:34:29 +00:00
Joel Galenson	bbcaf4ac5c	[ARM] Optimize {s,u}mul.with.overflow. This extends my previous patches to also optimize overflow-checked multiplies during SelectionDAG. Differential revision: https://reviews.llvm.org/D40922 llvm-svn: 322738	2018-01-17 19:19:05 +00:00
Joel Galenson	fe7fa40869	[ARM] Optimize {s,u}{add,sub}.with.overflow. The ARM backend contains code that tries to optimize compares by replacing them with an existing instruction that sets the flags the same way. This allows it to replace a "cmp" with a "adds", generalizing the code that replaces "cmp" with "sub". It also heuristically disables sinking of instructions that could potentially be used to replace compares (currently only if they're next to each other). Differential revision: https://reviews.llvm.org/D38378 llvm-svn: 322737	2018-01-17 19:19:05 +00:00
Diana Picus	4652e25030	[ARM GlobalISel] Add instselect tests for G_FPEXT and G_FPTRUNC G_FPEXT and G_FPTRUNC are handled by TableGen'erated code, just add tests. llvm-svn: 322665	2018-01-17 15:01:19 +00:00
Diana Picus	c62a16234b	[ARM GlobalISel] Map G_FPEXT and G_FPTRUNC to FPR llvm-svn: 322657	2018-01-17 14:14:14 +00:00
Diana Picus	65ed364fac	[ARM GlobalISel] Legalize G_FPEXT and G_FPTRUNC Mark G_FPEXT and G_FPTRUNC as legal or libcall, depending on hardware support, but only for conversions between float and double. Also add the necessary boilerplate so that the LegalizerHelper can introduce the required libcalls. This also works only for float and double, but isn't too difficult to extend when the need arises. llvm-svn: 322651	2018-01-17 13:34:10 +00:00
Sean Eveson	2ae6037dd1	[MC] Fix -stack-size-section on ARM Change symbol values in the stack_size section from being 8 bytes, to being a target dependent size. Differential Revision: https://reviews.llvm.org/D42108 llvm-svn: 322619	2018-01-17 09:01:29 +00:00
Jonas Devlieghere	6f24c8778c	[DebugInfo] Unify dumping of address ranges Summary: This patch unifies the printing of address ranges as [0x0, 0x1). rdar://34822059 Reviewers: aprantl, dblaikie Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D42056 llvm-svn: 322543	2018-01-16 11:17:57 +00:00
Diana Picus	cf044647c4	[ARM GlobalISel] Add inst selector tests for G_FMA We don't yet match all the patterns involving G_FMA. Add tests for some of the ones that we do match. llvm-svn: 322368	2018-01-12 12:44:36 +00:00
Diana Picus	2dc5405693	[ARM GlobalISel] Map G_FMA to FPR llvm-svn: 322367	2018-01-12 12:06:01 +00:00
Diana Picus	e74243d473	[ARM GlobalISel] Legalize G_FMA For hard float with VFP4, it is legal. Otherwise, we use libcalls. This needs a bit of support in the LegalizerHelper for soft float because we didn't handle G_FMA libcalls yet. The support is trivial, as the only difference between G_FMA and other libcalls that we already handle is that it has 3 input operands rather than just 2. llvm-svn: 322366	2018-01-12 11:30:45 +00:00
Andre Vieira	5627c218e1	[ARM] Add codegen for SMMULR, SMMLAR and SMMLSR This patch teaches the Arm back-end to generate the SMMULR, SMMLAR and SMMLSR instructions from equivalent IR patterns. Differential Revision: https://reviews.llvm.org/D41775 llvm-svn: 322361	2018-01-12 09:24:41 +00:00
Andre Vieira	26b9de9ebb	[ARM] Fix erroneous availability of SMMLS for Armv7-M Differential Revision: https://reviews.llvm.org/D41855 llvm-svn: 322360	2018-01-12 09:21:09 +00:00
Matthias Braun	ea4359e922	PeepholeOptimizer: Fix for vregs without defs The PeepholeOptimizer would fail for vregs without a definition. If this was caused by an undef operand abort to keep the code simple (so we don't need to add logic everywhere to replicate the undef flag). Differential Revision: https://reviews.llvm.org/D40763 llvm-svn: 322319	2018-01-11 22:30:43 +00:00
Matthias Braun	08abcac9dc	PeepholeOptimizer: Do not form PHI with subreg arguments When replacing a PHI the PeepholeOptimizer currently takes the register class of the register at the first operand. This however is not correct if this argument has a subregister index. As there is currently no API to query the register class resulting from applying a subregister index to all registers in a class, we can only abort in these cases and not perform the transformation. This changes findNextSource() to require the end of all copy chains to not use a subregister if there is any PHI in the chain. I had to rewrite the overly complicated inner loop there to have a good place to insert the new check. This fixes https://llvm.org/PR33071 (aka rdar://32262041) Differential Revision: https://reviews.llvm.org/D40758 llvm-svn: 322313	2018-01-11 21:57:03 +00:00
Evgeniy Stepanov	5223b5d9d6	[arm] Implement Target Operand Flag MIR serialization. Reviewers: efriedma, pcc Subscribers: aemerson, javed.absar, kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D39975 llvm-svn: 322312	2018-01-11 21:37:58 +00:00
Diana Picus	e3591f3a17	[ARM GlobalISel] Add inst selector tests for G_FNEG s32 and s64 G_FNEG is already handled by the TableGen'erated code. Just add a few tests to make sure everything works as expected. llvm-svn: 322170	2018-01-10 11:13:36 +00:00
Diana Picus	0ed7513c83	[ARM GlobalISel] Map G_FNEG to the FPR bank llvm-svn: 322169	2018-01-10 11:13:31 +00:00
Diana Picus	f949a0abac	[ARM GlobalISel] Legalize G_FNEG for s32 and s64 For hard float, it is legal. For soft float, we need to lower to 0 - x first, and then we can use the libcall for G_FSUB. This is undoing some of the canonicalization performed by the IRTranslator (which introduces G_FNEG when it sees a 0 - x). Ideally, that canonicalization would be performed by a pre-legalizer pass that would allow targets to opt out of this behaviour rather than dance around it in the legalizer. llvm-svn: 322168	2018-01-10 10:45:34 +00:00
Diana Picus	8f14886630	[ARM GlobalISel] Legalize s32/s64 G_FCONSTANT Legal for hard float. Change to G_CONSTANT for soft float (but preserve the binary representation). llvm-svn: 322164	2018-01-10 10:01:49 +00:00
Diana Picus	734a5e8912	[ARM GlobalISel] Legalize G_CONSTANT for scalars > 32 bits Make G_CONSTANT narrow for any scalars larger than 32 bits. llvm-svn: 322162	2018-01-10 09:32:01 +00:00
Puyan Lotfi	fe6c9cbb24	[MIR] Repurposing '$' sigil used by external symbols. Replacing with '&'. Planning to add support for named vregs. This puts is in a conundrum since physregs are named as well. To rectify this we need to use a sigil other than '%' for physregs in MIR. We've settled on using '$' for physregs but first we must repurpose it from external symbols using it, which is what this commit is all about. We think '&' will have familiar semantics for C/C++ users. llvm-svn: 322146	2018-01-10 00:56:48 +00:00
Francis Visoiu Mistrih	7d9bef8f5c	[CodeGen] Don't print "pred:" and "opt:" in -debug output In -debug output we print "pred:" whenever a MachineOperand is a predicate operand in the instruction descriptor, and "opt:" whenever a MachineOperand is an optional def in the instruction descriptor. Differential Revision: https://reviews.llvm.org/D41870 llvm-svn: 322096	2018-01-09 17:31:07 +00:00
Francis Visoiu Mistrih	2b3bd30637	[CodeGen] Don't print register classes in -debug output Since register classes and banks are already printed with the register definition, don't print it at the end of every instruction anymore. This follows MIR in this regard and is another step to the unification of the two formats. llvm-svn: 322086	2018-01-09 15:39:44 +00:00
Momchil Velikov	ac7c5c1d92	[ARM] Fix PR35379 - incorrect unwind information when compiling with -Oz The patch makes the unwind information not mention registers, which were pushed solely for the purpose of saving stack adjustment instructions. Differential revision: https://reviews.llvm.org/D41300 Fixes https://bugs.llvm.org/show_bug.cgi?id=35379 llvm-svn: 321996	2018-01-08 14:47:19 +00:00
Bjorn Pettersson	5ffb1c0ff0	[DebugInfo] Align comments in debug_loc section Summary: This commit updates the BufferByteStreamer, used by DebugLocStream to buffer bytes/comments to put in the debug_loc section, to make sure that the Buffer and Comments vectors are synced. Previously, when an SLEB128 or ULEB128 was emitted together with a comment, the vectors could be out-of-sync if the LEB encoding added several entries to the Buffer vectors, while we only added a single entry to the Comments vector. The goal with this is to get the comments in the debug_loc section in the .s file correctly aligned. Example (using ARM as target): Instead of .byte 144 @ sub-register DW_OP_regx .byte 128 @ 256 .byte 2 @ DW_OP_piece .byte 147 @ 8 .byte 8 @ sub-register DW_OP_regx .byte 144 @ 257 .byte 129 @ DW_OP_piece .byte 2 @ 8 .byte 147 @ .byte 8 @ we now get .byte 144 @ sub-register DW_OP_regx .byte 128 @ 256 .byte 2 @ .byte 147 @ DW_OP_piece .byte 8 @ 8 .byte 144 @ sub-register DW_OP_regx .byte 129 @ 257 .byte 2 @ .byte 147 @ DW_OP_piece .byte 8 @ 8 Reviewers: JDevlieghere, rnk, aprantl Reviewed By: aprantl Subscribers: davide, Ka-Ka, uabelho, aemerson, javed.absar, kristof.beyls, llvm-commits, JDevlieghere Differential Revision: https://reviews.llvm.org/D41763 llvm-svn: 321907	2018-01-05 22:20:30 +00:00
Sam Parker	1ad085b808	[DAGCombine] Fix for PR37563 While searching for loads to be narrowed, equal sized loads were not added to the list, resulting in anyext loads not being converted to zext loads. https://bugs.llvm.org/show_bug.cgi?id=35763 Differential Revision: https://reviews.llvm.org/D41628 llvm-svn: 321862	2018-01-05 08:47:23 +00:00
Diana Picus	865f7fecb2	[ARM GlobalISel] Select G_PHI Select G_PHI to PHI and manually constrain the result register. This is very similar to how COPY is handled, so extract and reuse some of that code. llvm-svn: 321797	2018-01-04 13:09:25 +00:00
Diana Picus	bcabda43e4	[ARM GlobalISel] Add RegBankSelect tests for G_PHI RegBankSelect already handles G_PHI with some generic code. Add a couple of tests for it. llvm-svn: 321796	2018-01-04 13:09:20 +00:00
Diana Picus	c768bbe2e7	[ARM GlobalISel] Legalize scalar G_PHI Mark G_PHI as Legal for s32 and p0, and also for s64 if we have hard float. Widen any smaller types. llvm-svn: 321795	2018-01-04 13:09:14 +00:00
Diana Picus	37ae9f68a4	[ARM GlobalISel] Fix selection of pointer constants We used to handle G_CONSTANT with pointer type by forcing the type of the result register to s32 and then letting TableGen handle it. Unfortunately, setting the type only works for generic virtual registers, that haven't yet been constrained to a register class (e.g. those used only by a COPY later on). If the result register has already been constrained as a use of a previously selected instruction, then setting the type will assert. It would be nice to be able to teach TableGen to select pointer constants the same as integer constants, but since it's such an edge case (at the moment the only pointer constant that we're generally interested in is 0, and that is mostly used for comparisons and selects, which are also not supported by TableGen) it's probably not worth the effort right now. Instead, handle pointer constants with some trivial handwritten code. llvm-svn: 321793	2018-01-04 10:54:57 +00:00
Diana Picus	28a6d0e639	[ARM GlobalISel] Support G_INTTOPTR and G_PTRTOINT for s32 Mark conversions between pointers and 32-bit scalars as legal, map them to the GPR and select to a simple COPY. llvm-svn: 321356	2017-12-22 13:05:51 +00:00
Diana Picus	68773859c8	[ARM GlobalISel] Support pointer constants Pointer constants are pretty rare, since we usually represent them as integer constants and then cast to pointer. One notable exception is the null pointer constant, which is represented directly as a G_CONSTANT 0 with pointer type. Mark it as legal and make sure it is selected like any other integer constant. llvm-svn: 321354	2017-12-22 11:09:18 +00:00
Sam Parker	cf426fccd4	[DAGCombine] Revert r321259 Improve ReduceLoadWidth for SRL Patch is causing an issue on the PPC64 BE santizer. llvm-svn: 321349	2017-12-22 08:36:25 +00:00
Sam Parker	59efb8cb5b	[DAGCombine] Improve ReduceLoadWidth for SRL If the SRL node is only used by an AND, we may be able to set the ExtVT to the width of the mask, making the AND redundant. To support this, another check has been added in isLegalNarrowLoad which queries whether the load is valid. Differential Revision: https://reviews.llvm.org/D41350 llvm-svn: 321259	2017-12-21 12:55:04 +00:00
Michael Zolotukhin	ad371e0caa	[SimplifyCFG] Avoid quadratic on a predecessors number behavior in instruction sinking. If a block has N predecessors, then the current algorithm will try to sink common code to this block N times (whenever we visit a predecessor). Every attempt to sink the common code includes going through all predecessors, so the complexity of the algorithm becomes O(N^2). With this patch we try to sink common code only when we visit the block itself. With this, the complexity goes down to O(N). As a side effect, the moment the code is sunk is slightly different than before (the order of simplifications has been changed), that's why I had to adjust two tests (note that neither of the tests is supposed to test SimplifyCFG): * test/CodeGen/AArch64/arm64-jumptable.ll - changes in this test mimic the changes that previous implementation of SimplifyCFG would do. * test/CodeGen/ARM/avoid-cpsr-rmw.ll - in this test I disabled common code sinking by a command line flag. llvm-svn: 321236	2017-12-21 01:22:13 +00:00
Joel Galenson	6f4e827e4c	[ARM] Optimize {s,u}{add,sub}.with.overflow. The AArch64 backend contains code to optimize {s,u}{add,sub}.with.overflow during SelectionDAG. This commit ports that code to the ARM backend. Differential revision: https://reviews.llvm.org/D35635 llvm-svn: 321224	2017-12-20 22:25:39 +00:00
Diana Picus	75ce852abe	[ARM GlobalISel] Fix assertion in RegBankSelect We get an assertion in RegBankSelect for code along the lines of my_32_bit_int = my_64_bit_int, which tends to translate into a 64-bit load, followed by a G_TRUNC, followed by a 32-bit store. This appears in a couple of places in the test-suite. At the moment, the legalizer doesn't distinguish between integer and floating point scalars, so a 64-bit load will be marked as legal for targets with VFP, and so will the rest of the sequence, leading to a slightly bizarre G_TRUNC reaching RegBankSelect. Since the current support for 64-bit integers is rather immature, this patch works around the issue by explicitly handling this case in RegBankSelect and InstructionSelect. In the future, we may want to revisit this decision and make sure 64-bit integer loads are narrowed before reaching RegBankSelect. llvm-svn: 321165	2017-12-20 11:27:10 +00:00
Florian Hahn	c3aa6d83fd	[ARM] Lower unsigned saturation to USAT Summary: Implement lower of unsigned saturation on an interval [0, k] where k + 1 is a power of two using USAT instruction in a similar way to how [~k, k] is lowered using SSAT on ARM models that supports it. Patch by Marten Svanfeldt Reviewers: t.p.northover, pbarrio, eastig, SjoerdMeijer, javed.absar, fhahn Reviewed By: fhahn Subscribers: fhahn, aemerson, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D41348 llvm-svn: 321164	2017-12-20 11:13:57 +00:00
Sam Parker	fd967f2f7a	[ARM] Adjust test checks Correct the CHECK-LABELS of a couple of dag combine tests. llvm-svn: 320963	2017-12-18 10:08:03 +00:00
Sam Parker	00804efd72	[DAGCombine] Move AND nodes to multiple load leaves Search from AND nodes to find whether they can be propagated back to loads, so that the AND and load can be combined into a narrow load. We search through OR, XOR and other AND nodes and all bar one of the leaves are required to be loads or constants. The exception node then needs to be masked off meaning that the 'and' isn't removed, but the loads(s) are narrowed still. Differential Revision: https://reviews.llvm.org/D41177 llvm-svn: 320962	2017-12-18 10:04:27 +00:00
Sam Parker	18b0d1e5b9	[ARM] Some DAG combine tests Add some more and and shift load combine tests. llvm-svn: 320822	2017-12-15 15:30:39 +00:00
Roger Ferrer Ibanez	9fcc4727ac	[ARM] Add tests for D34515 This is NFC and a preparatory step for D34515. Differential Revision: https://reviews.llvm.org/D41122 llvm-svn: 320795	2017-12-15 09:24:46 +00:00
Geoff Berry	dcc646e40b	[ARM] Fix isRenamable flag setting on expanded VSTMDIA opcode. Fixes expensive-check ARM buildbot failure. llvm-svn: 320718	2017-12-14 18:06:25 +00:00
Benjamin Kramer	a85822cb1e	Revert "[DAGCombine] Move AND nodes to multiple load leaves" This reverts commit r320679. Causes miscompiles. llvm-svn: 320698	2017-12-14 14:03:07 +00:00
Francis Visoiu Mistrih	5df3bbf3e6	[CodeGen] Print global addresses as @foo in both MIR and debug output Work towards the unification of MIR and debug output by printing `@foo` instead of `<ga:@foo>`. Also print target flags in the MIR format since most of them are used on global address operands. Only debug syntax is affected. llvm-svn: 320682	2017-12-14 10:03:09 +00:00
Francis Visoiu Mistrih	e76c5fcd70	[CodeGen] Print external symbols as $symbol in both MIR and debug output Work towards the unification of MIR and debug output by printing `$symbol` instead of `<es:symbol>`. Only debug syntax is affected. llvm-svn: 320681	2017-12-14 10:02:58 +00:00
Sam Parker	ef12b41ef7	[DAGCombine] Move AND nodes to multiple load leaves Recommitting rL319773, which was reverted due to a recursive issue causing timeouts. This happened because I failed to check whether the discovered loads could be narrowed further. In the case of a tree with one or more narrow loads, that could not be further narrowed, as well as a node that would need masking, an AND could be introduced which could then be visited and recombined again with the same load. This could again create the masking load, with would be combined again... We now check that the load can be narrowed so that this process stops. Original commit message: Search from AND nodes to find whether they can be propagated back to loads, so that the AND and load can be combined into a narrow load. We search through OR, XOR and other AND nodes and all bar one of the leaves are required to be loads or constants. The exception node then needs to be masked off meaning that the 'and' isn't removed, but the loads(s) are narrowed still. Differential Revision: https://reviews.llvm.org/D41177 llvm-svn: 320679	2017-12-14 09:31:01 +00:00
Roger Ferrer Ibanez	e8d4e88bab	[DAG] Promote ADDCARRY / SUBCARRY Add missing case that was not implemented yet. Differential Revision: https://reviews.llvm.org/D38942 llvm-svn: 320567	2017-12-13 10:45:21 +00:00
Francis Visoiu Mistrih	26ae8a6582	[CodeGen] Print constant pool index operands as %const.0 + 8 in both MIR and debug output Work towards the unification of MIR and debug output by printing `%const.0 + 8` instead of `<cp#0+8>` and `%const.0 - 8` instead of `<cp#0-8>`. Only debug syntax is affected. Differential Revision: https://reviews.llvm.org/D41116 llvm-svn: 320564	2017-12-13 10:30:45 +00:00
Amara Emerson	df9b529d42	[GlobalISel] Disable GISel for big endian. This is due to PR26161 needing to be resolved before we can fix big endian bugs like PR35359. The work to split aggregates into smaller LLTs instead of using one large scalar will take some time, so in the mean time we'll fall back to SDAG. Some ARM BE tests xfailed for now as a result. Differential Revision: https://reviews.llvm.org/D40789 llvm-svn: 320388	2017-12-11 16:58:29 +00:00
Diana Picus	291e8d924f	[ARM GlobalISel] Add test for a MOVTi16 pattern. NFC Add test for matching an OR with 0xFFFF0000 to a MOVTi16. llvm-svn: 320362	2017-12-11 13:28:45 +00:00
Roger Ferrer Ibanez	5ea0f2501f	[ARM] Use ADDCARRY / SUBCARRY This is a preparatory step for D34515. This change: - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32 - lowering is done by first converting the boolean value into the carry flag using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two operations does the actual addition. - for subtraction, given that ISD::SUBCARRY second result is actually a borrow, we need to invert the value of the second operand and result before and after using ARMISD::SUBE. We need to invert the carry result of ARMISD::SUBE to preserve the semantics. - given that the generic combiner may lower ISD::ADDCARRY and ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering as well otherwise i64 operations now would require branches. This implies updating the corresponding test for unsigned. - add new combiner to remove the redundant conversions from/to carry flags to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C - fixes PR34045 - fixes PR34564 - fixes PR35103 Differential Revision: https://reviews.llvm.org/D35192 llvm-svn: 320355	2017-12-11 12:13:45 +00:00
Diana Picus	775bb74379	[ARM GlobalISel] Add tests for PKHBT and PKHTB Test (some of) the patterns for selecting PKHBT and PKHTB. The others are just very similar to the ones we're testing and there would be little value in covering them as well. llvm-svn: 320352	2017-12-11 11:44:23 +00:00
Francis Visoiu Mistrih	a8a83d150f	[CodeGen] Use MachineOperand::print in the MIRPrinter for MO_Register. Work towards the unification of MIR and debug output by refactoring the interfaces. For MachineOperand::print, keep a simple version that can be easily called from `dump()`, and a more complex one which will be called from both the MIRPrinter and MachineInstr::print. Add extra checks inside MachineOperand for detached operands (operands with getParent() == nullptr). https://reviews.llvm.org/D40836 * find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+)<def> ([^ ]+)/kill: \1 def \2 \3/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: \1 \2 def \3/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/kill: def ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: def \1 \2 def \3/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/<def>//g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<kill>/killed \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<imp-use,kill>/implicit killed \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<dead>/dead \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<def[ ],[ ]dead>/dead \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<imp-def[ ],[ ]dead>/implicit-def dead \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<imp-def>/implicit-def \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<imp-use>/implicit \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<internal>/internal \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name "*.s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<undef>/undef \1/g' llvm-svn: 320022	2017-12-07 10:40:31 +00:00
Nirav Dave	7d8f3e0c93	[ARM][AArch64][DAG] Reenable post-legalize store merge Reenable post-legalize stores with constant merging computation and corresponding test case. * Properly truncate store merge constants * Disable merging of truncated stores floating points * Ensure merges of constant stores into a single vector are constructed from legal elements. Reviewers: eastig, efriedma Reviewed By: eastig Subscribers: spatel, rengolin, aemerson, javed.absar, kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D40701 llvm-svn: 319899	2017-12-06 15:30:13 +00:00
Vlad Tsyrklevich	0b40f21134	Revert "[DAGCombine] Move AND nodes to multiple load leaves" This reverts commit r319773. It was causing some buildbots to hang, e.g. http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-android/builds/5589 llvm-svn: 319867	2017-12-06 01:16:08 +00:00
Sam Parker	0a436a9d62	[DAGCombine] Move AND nodes to multiple load leaves Search from AND nodes to find whether they can be propagated back to loads, so that the AND and load can be combined into a narrow load. We search through OR, XOR and other AND nodes and all bar one of the leaves are required to be loads or constants. The exception node then needs to be masked off meaning that the 'and' isn't removed, but the loads(s) are narrowed still. Differential Revision: https://reviews.llvm.org/D39604 llvm-svn: 319773	2017-12-05 15:13:47 +00:00
Francis Visoiu Mistrih	25528d6de7	[CodeGen] Unify MBB reference format in both MIR and debug output As part of the unification of the debug format and the MIR format, print MBB references as '%bb.5'. The MIR printer prints the IR name of a MBB only for block definitions. * find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" $ -type f -print0 \| xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber/" << printMBBReference(\1)/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" $ -type f -print0 \| xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber/" << printMBBReference(\1)/g' * find . $ -name ".txt" -o -name ".s" -o -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" $ -type f -print0 \| xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g' * grep -nr 'BB#' and fix Differential Revision: https://reviews.llvm.org/D40422 llvm-svn: 319665	2017-12-04 17:18:51 +00:00
Pablo Barrio	2b4385846c	Fix function pointer tail calls in armv8-M.base Summary: The compiler fails with the following error message: fatal error: error in backend: ran out of registers during register allocation Tail call optimization for Armv8-M.base fails to meet all the required constraints when handling calls to function pointers where the arguments take up r0-r3. This is because the pointer to the function to be called can only be stored in r0-r3, but these are all occupied by arguments. This patch makes sure that tail call optimization does not try to handle this type of calls. Reviewers: chill, MatzeB, olista01, rengolin, efriedma Reviewed By: olista01, efriedma Subscribers: efriedma, aemerson, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D40706 llvm-svn: 319664	2017-12-04 16:55:49 +00:00
Sam Parker	987b2c9966	[ARM] CodeGen test Add another and + load DAG combine test. llvm-svn: 319660	2017-12-04 15:14:59 +00:00
Martin Storsjo	c85cc41801	[ARM] Allow using emulated tls on platforms other than ELF This matches how it is done on X86. This allows using emulated tls on windows; in MinGW environments, native tls isn't supported at the moment. Differential Revision: https://reviews.llvm.org/D40769 llvm-svn: 319643	2017-12-04 09:08:55 +00:00
Nirav Dave	3e76e1e89e	[DAG][ARM] Revert "Reenable post-legalize store merge" due to failures in AArch and ARM code gen. llvm-svn: 319587	2017-12-01 21:55:47 +00:00
Sam Parker	45b5950f38	[ARM] and + load combine tests Add a few more tests cases. llvm-svn: 319548	2017-12-01 15:31:41 +00:00
Nirav Dave	eb2b24fded	[ARM][DAG] Reenable post-legalize store merge Summary: Reenable post-legalize stores with constant merging computation and cofrresponding test case. Reviewers: eastig, efriedma Subscribers: aemerson, javed.absar, kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D40701 llvm-svn: 319547	2017-12-01 14:49:26 +00:00
Sam Parker	412a991b10	[ARM] and + load combine tests Adding autogenerated tests for narrow load combines. Differential Revision: https://reviews.llvm.org/D40709 llvm-svn: 319542	2017-12-01 13:42:39 +00:00
Daniel Sanders	f499b2bf1f	[globalisel][tablegen] Add support for specific immediates in the match pattern This enables a few rules such as ARM's uxtb instruction. llvm-svn: 319457	2017-11-30 18:48:35 +00:00
Francis Visoiu Mistrih	c71cced0aa	[CodeGen] Always use `printReg` to print registers in both MIR and debug output As part of the unification of the debug format and the MIR format, always use `printReg` to print all kinds of registers. Updated the tests using '_' instead of '%noreg' until we decide which one we want to be the default one. Differential Revision: https://reviews.llvm.org/D40421 llvm-svn: 319445	2017-11-30 16:12:24 +00:00
Diana Picus	f003d9ff95	[ARM GlobalISel] Bail out for byval Fallback if we have a byval parameter or argument since we don't support them yet. llvm-svn: 319428	2017-11-30 12:23:44 +00:00
Francis Visoiu Mistrih	93ef145862	[CodeGen] Print "%vreg0" as "%0" in both MIR and debug output As part of the unification of the debug format and the MIR format, avoid printing "vreg" for virtual registers (which is one of the current MIR possibilities). Basically: * find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" $ -type f -print0 \| xargs -0 sed -i '' -E "s/%vreg([0-9]+)/%\1/g" * grep -nr '%vreg' . and fix if needed * find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" $ -type f -print0 \| xargs -0 sed -i '' -E "s/ vreg([0-9]+)/ %\1/g" * grep -nr 'vreg[0-9]\+' . and fix if needed Differential Revision: https://reviews.llvm.org/D40420 llvm-svn: 319427	2017-11-30 12:12:19 +00:00
Nirav Dave	bafaa53c4d	[ARM][DAG] Revert Disable post-legalization store merge for ARM Partially reverting enabling of post-legalization store merge (r319036) for just ARM backend as it is causing incorrect code in some Thumb2 cases. llvm-svn: 319331	2017-11-29 18:06:13 +00:00
Diana Picus	863b5b05f1	[ARM GlobalISel] Fix selecting G_BRCOND When lowering a G_BRCOND, we generate a TSTri of the condition against 1, which sets the flags, and then a Bcc which branches based on the value of the flags. Unfortunately, we were using the wrong condition code to check whether we need to branch (EQ instead of NE), which caused all our branches to do the opposite of what they were intended to do. This patch fixes the issue by using the correct condition code. llvm-svn: 319313	2017-11-29 14:20:06 +00:00
Oliver Stannard	9ea2eaeb50	[ARM] Add support for armv7e-m to the .arch directive This will allow compilation of assembly files targeting armv7e-m without having to specify the Tag_CPU_arch attribute as a workaround. Differential revision: https://reviews.llvm.org/D40370 Patch by Ian Tessier! llvm-svn: 319303	2017-11-29 10:12:15 +00:00
Daniel Sanders	40c5cbfb08	[globalisel][tablegen] Fix PR35375 by sign-extending the table value to match getConstantVRegVal() Summary: From the bug report: > The problem is that it fails when trying to compare -65536 (or 4294901760) to 0xFFFF,0000. This is because the > constant in the instruction is sign extended to 64 bits (0xFFFF,FFFF,FFFF,0000) and then compared to the non > extended 64 bit version expected by TableGen. > > In contrast, the DAGISelEmitter generates special code for AND immediates (OPC_CheckAndImm), which does not > sign extend. This patch doesn't introduce the special case for AND (and OR) immediates since the majority of it is related to handling known bits that have no effect on the result and GlobalISel doesn't detect known-bits at this time. Instead this patch just ensures that the immediate is extended consistently on both sides of the check. Thanks to Diana Picus for the detailed bug report. Reviewers: rovka Reviewed By: rovka Subscribers: kristof.beyls, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D40532 llvm-svn: 319252	2017-11-28 23:18:54 +00:00
Francis Visoiu Mistrih	9d7bb0cb40	[CodeGen] Print register names in lowercase in both MIR and debug output As part of the unification of the debug format and the MIR format, always print registers as lowercase. * Only debug printing is affected. It now follows MIR. Differential Revision: https://reviews.llvm.org/D40417 llvm-svn: 319187	2017-11-28 17:15:09 +00:00
Simon Dardis	3aeb1a5404	[DAGCombine] Disable finding better chains for stores at O0 Unoptimized IR can have linear sequences of stores to an array, where the initial GEP for the first store is formed from the pointer to the array, and the GEP for each store after the first is formed from the previous GEP with some offset in an inductive fashion. The (large) resulting DAG when analyzed by DAGCombine undergoes an excessive number of combines as each store node is examined every time its' offset node is combined with any child of the offset. One of the transformations is findBetterNeighborChains which assists MergeConsecutiveStores. The former relies on repeated chain walking to do its' work, however MergeConsecutiveStores is disabled at O0 which makes the transformation redundant. Any optimization level other than O0 would invoke InstCombine which would resolve the chain of GEPs into flat base + offset GEP for each store which does not exhibit the repeated examination of each store to the array. Disabling this optimization fixes an excessive compile time issue (30~ minutes for the test case provided) at O0. Reviewers: niravd, craig.topper, t.p.northover Differential Revision: https://reviews.llvm.org/D40193 llvm-svn: 319142	2017-11-28 04:07:59 +00:00
Matthias Braun	5d01e708e1	ARM: Fix PR32578 https://llvm.org/PR32578 I simplified and converted the reproducer into a lit test. Patch by Vedant Kumar! llvm-svn: 319130	2017-11-28 01:17:52 +00:00
Nirav Dave	db77e57ea8	[DAG] Do MergeConsecutiveStores again before Instruction Selection Summary: Now that store-merge is only generates type-safe stores, do a second pass just before instruction selection to allow lowered intrinsics to be merged as well. Reviewers: jyknight, hfinkel, RKSimon, efriedma, rnk, jmolloy Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33675 llvm-svn: 319036	2017-11-27 15:28:15 +00:00
Momchil Velikov	bd2c7eb923	[ARM] Fix an off-by-one error when restoring LR for 16-bit Thumb The commit https://reviews.llvm.org/rL318143 computes incorrectly to offset to restore LR from. The number of tPOP operands is 2 (condition) + 2 (implicit def and use of SP) + count of the popped registers. We need to load LR from just past the last register, hence the correct offset should be either getNumOperands() - 4 and getNumExplicitOperands() - 2 (multiplied by 4). Differential revision: https://reviews.llvm.org/D40305 llvm-svn: 319014	2017-11-27 10:13:14 +00:00
Diana Picus	c01f7f131b	[ARM GlobalISel] Support G_FDIV for s32 and s64 TableGen already generates code for selecting a G_FDIV, so we only need to add a test. For the legalizer and reg bank select, we do the same thing as for the other floating point binary operations: either mark as legal if we have a FP unit or lower to a libcall, and map to the floating point registers. llvm-svn: 318915	2017-11-23 13:26:07 +00:00
Diana Picus	9faa09b21e	[ARM GlobalISel] Support G_FMUL for s32 and s64 TableGen already generates code for selecting a G_FMUL, so we only need to add a test for that part. For the legalizer and reg bank select, we do the same thing as the other floating point binary operators: either mark as legal if we have a FP unit or lower to a libcall, and map to the floating point registers. llvm-svn: 318910	2017-11-23 12:44:20 +00:00
Eugene Leviant	6bc35a93e6	[MI scheduler] Fix VADD and VSUB in cortex-a57 model This patch fixes instregex for interger vector add/sub instructions Differential revision: https://reviews.llvm.org/D40254 llvm-svn: 318749	2017-11-21 11:01:28 +00:00
Diana Picus	c79dfb3b31	[ARM GlobalISel] Add comment for r318398. NFC. Mention the purpose of the BICri tests added by r318398, as requested in post-commit review. llvm-svn: 318747	2017-11-21 10:17:02 +00:00
Diana Picus	3ac504035a	[ARM GlobalISel] Add test for RSBri. NFC Add instruction selector test for RSBri, which is derived from AsI1_rbin_irs, and make sure it doesn't get mistaken for SUBri, which is derived from the very similar AsI1_bin_irs pattern. llvm-svn: 318643	2017-11-20 11:05:31 +00:00
Diana Picus	6db48f7d6b	[ARM GlobalISel] Clean up binary operator tests. NFC Remove some of the instruction selector tests for binary operators (and, or, xor). These are all derived from the same kind of TableGen pattern, AsI1_bin_irs, so there's no point in testing all of them. llvm-svn: 318642	2017-11-20 10:35:35 +00:00
Martin Storsjo	b4c907edd7	[ARM] Use dwarf exception handling on MinGW Enabling and using dwarf exceptions seems like an easier path to take, than to make the COFF/ARM backend output EHABI directives. Previously, no EH model was enabled at all on this target. There's no point in setting UseIntegratedAssembler to false since GNU binutils doesn't support Windows on ARM, and since we don't need to support external assembler, we don't need to use register numbers in cfi directives. Differential Revision: https://reviews.llvm.org/D39532 llvm-svn: 318510	2017-11-17 08:04:40 +00:00
Yi Kong	39bcd4ed3e	[ARM] 't' asm constraint should accept i32 't' constraint normally only accepts f32 operands, but for VCVT the operands can be i32. LLVM is overly restrictive and rejects asm like: float foo() { float result; __asm__ __volatile__( "vcvt.f32.s32 %[result], %[arg1]\n" : [result]"=t"(result) : [arg1]"t"(0x01020304) ); return result; } Relax the value type for 't' constraint to either f32 or i32. Differential Revision: https://reviews.llvm.org/D40137 llvm-svn: 318472	2017-11-16 23:38:17 +00:00
Diana Picus	bfdf7b6c39	[ARM GlobalISel] Add tests for BIC. NFC Add instruction selector tests for BICrr and BICri, which are handled by TableGen. llvm-svn: 318398	2017-11-16 13:32:47 +00:00
Diana Picus	4d242b18b2	[ARM GlobalISel] Add tests for REVSH patterns. NFC Add instruction selector tests for some of the REVSH patterns handled by TableGen. llvm-svn: 318393	2017-11-16 12:29:28 +00:00
Sam Parker	43fa5911a1	[DAGCombine] Enable more srl -> load combines Change the calculation for the desired ValueType for non-sign extending loads, as in those cases we don't care about the higher bits. This creates a smaller ExtVT and allows for such combinations as: (srl (zextload i16, [addr]), 8) -> (zextload i8, [addr + 1]) Differential Revision: https://reviews.llvm.org/D40034 llvm-svn: 318390	2017-11-16 11:28:26 +00:00
Aditya Nandakumar	e6201c8724	[GISel]: Rework legalization algorithm for better elimination of artifacts along with DCE Legalization Artifacts are all those insts that are there to make the type system happy. Currently, the target needs to say all combinations of extends and truncs are legal and there's no way of verifying that post legalization, we only have truly legal instructions. This patch changes roughly the legalization algorithm to process all illegal insts at one go, and then process all truncs/extends that were added to satisfy the type constraints separately trying to combine trivial cases until they converge. This has the added benefit that, the target legalizerinfo can only say which truncs and extends are okay and the artifact combiner would combine away other exts and truncs. Updated legalization algorithm to roughly the following pseudo code. WorkList Insts, Artifacts; collect_all_insts_and_artifacts(Insts, Artifacts); do { for (Inst in Insts) legalizeInstrStep(Inst, Insts, Artifacts); for (Artifact in Artifacts) tryCombineArtifact(Artifact, Insts, Artifacts); } while(!Insts.empty()); Also, wrote a simple wrapper equivalent to SetVector, except for erasing, it avoids moving all elements over by one and instead just nulls them out. llvm-svn: 318210	2017-11-14 22:42:19 +00:00
Tim Northover	5cdc4f9c33	ARM: correctly update CFG when splitting BB to fix branch. Because the block-splitting code is multi-purpose, we have to meddle with the branches when using it to fixup a conditional branch destination. We got the code right, but forgot to update the CFG so the verifier complained when expensive checks were on. Probably harmless since constant-islands comes so late, but best to fix it anyway. llvm-svn: 318148	2017-11-14 11:43:54 +00:00
Diana Picus	21a42bcc0b	[ARM GlobalISel] Remove C++ code for G_CONSTANT Get rid of the handwritten instruction selector code for handling G_CONSTANT. This code wasn't checking all the preconditions correctly anyway, so it's better to leave it to TableGen, which can handle at least some cases correctly (e.g. MOVi, MOVi16, folding into binary operations). Also add tests to cover those cases. llvm-svn: 318146	2017-11-14 11:20:32 +00:00
Momchil Velikov	dc86e1444d	[ARM] Fix incorrect conversion of a tail call to an ordinary call When we emit a tail call for Armv8-M, but then discover that the caller needs to save/restore `LR`, we convert the tail call to an ordinary one, since restoring `LR` takes extra instructions, which may negate the benefits of the tail call. If the callee, however, takes stack arguments, this conversion is incorrect, since nothing has been done to pass the stack arguments. Thus the patch reverts https://reviews.llvm.org/rL294000 Also, we improve the instruction sequence for popping `LR` in the case when we couldn't immediately find a scratch low register, but we can use as a temporary one of the callee-saved low registers and restore `LR` before popping other callee-saves. Differential Revision: https://reviews.llvm.org/D39599 llvm-svn: 318143	2017-11-14 10:36:52 +00:00
Evgeniy Stepanov	76d5ac4906	[arm] Fix Unnecessary reloads from GOT. Summary: This fixes PR35221. Use pseudo-instructions to let MachineCSE hoist global address computation. Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D39871 llvm-svn: 318081	2017-11-13 20:45:38 +00:00
Diana Picus	69aa20e3ca	[ARM GlobalISel] Update legalizer test Make one of the legalizer tests a bit more robust by making sure all values we're interested in are used (either in a store or a return) and by using loads instead of constants for obtaining values on fewer than 32 bits. This should make the test less fragile to changes in the legalize combiner, since those loads are legal (as opposed to the constants, which were being widened and thus produced opportunities for the legalize combiner). llvm-svn: 318047	2017-11-13 16:02:42 +00:00
Momchil Velikov	842aa90192	[ARM] Place jump table as the first operand in additions When generating table jump code for switch statements, place the jump table label as the first operand in the various addition instructions in order to enable addressing mode selectors to better match index computation and possibly fold them into the addressing mode of the table entry load instruction. Differential revision: https://reviews.llvm.org/D39752 llvm-svn: 318033	2017-11-13 11:56:48 +00:00
Kristof Beyls	af9814a1fc	[GlobalISel] Enable legalizing non-power-of-2 sized types. This changes the interface of how targets describe how to legalize, see the below description. 1. Interface for targets to describe how to legalize. In GlobalISel, the API in the LegalizerInfo class is the main interface for targets to specify which types are legal for which operations, and what to do to turn illegal type/operation combinations into legal ones. For each operation the type sizes that can be legalized without having to change the size of the type are specified with a call to setAction. This isn't different to how GlobalISel worked before. For example, for a target that supports 32 and 64 bit adds natively: for (auto Ty : {s32, s64}) setAction({G_ADD, 0, s32}, Legal); or for a target that needs a library call for a 32 bit division: setAction({G_SDIV, s32}, Libcall); The main conceptual change to the LegalizerInfo API, is in specifying how to legalize the type sizes for which a change of size is needed. For example, in the above example, how to specify how all types from i1 to i8388607 (apart from s32 and s64 which are legal) need to be legalized and expressed in terms of operations on the available legal sizes (again, i32 and i64 in this case). Before, the implementation only allowed specifying power-of-2-sized types (e.g. setAction({G_ADD, 0, s128}, NarrowScalar). A worse limitation was that if you'd wanted to specify how to legalize all the sized types as allowed by the LLVM-IR LangRef, i1 to i8388607, you'd have to call setAction 8388607-3 times and probably would need a lot of memory to store all of these specifications. Instead, the legalization actions that need to change the size of the type are specified now using a "SizeChangeStrategy". For example: setLegalizeScalarToDifferentSizeStrategy( G_ADD, 0, widenToLargerAndNarrowToLargest); This example indicates that for type sizes for which there is a larger size that can be legalized towards, do it by Widening the size. For example, G_ADD on s17 will be legalized by first doing WidenScalar to make it s32, after which it's legal. The "NarrowToLargest" indicates what to do if there is no larger size that can be legalized towards. E.g. G_ADD on s92 will be legalized by doing NarrowScalar to s64. Another example, taken from the ARM backend is: for (unsigned Op : {G_SDIV, G_UDIV}) { setLegalizeScalarToDifferentSizeStrategy(Op, 0, widenToLargerTypesUnsupportedOtherwise); if (ST.hasDivideInARMMode()) setAction({Op, s32}, Legal); else setAction({Op, s32}, Libcall); } For this example, G_SDIV on s8, on a target without a divide instruction, would be legalized by first doing action (WidenScalar, s32), followed by (Libcall, s32). The same principle is also followed for when the number of vector lanes on vector data types need to be changed, e.g.: setAction({G_ADD, LLT::vector(8, 8)}, LegalizerInfo::Legal); setAction({G_ADD, LLT::vector(16, 8)}, LegalizerInfo::Legal); setAction({G_ADD, LLT::vector(4, 16)}, LegalizerInfo::Legal); setAction({G_ADD, LLT::vector(8, 16)}, LegalizerInfo::Legal); setAction({G_ADD, LLT::vector(2, 32)}, LegalizerInfo::Legal); setAction({G_ADD, LLT::vector(4, 32)}, LegalizerInfo::Legal); setLegalizeVectorElementToDifferentSizeStrategy( G_ADD, 0, widenToLargerTypesUnsupportedOtherwise); As currently implemented here, vector types are legalized by first making the vector element size legal, followed by then making the number of lanes legal. The strategy to follow in the first step is set by a call to setLegalizeVectorElementToDifferentSizeStrategy, see example above. The strategy followed in the second step "moreToWiderTypesAndLessToWidest" (see code for its definition), indicating that vectors are widened to more elements so they map to natively supported vector widths, or when there isn't a legal wider vector, split the vector to map it to the widest vector supported. Therefore, for the above specification, some example legalizations are: * getAction({G_ADD, LLT::vector(3, 3)}) returns {WidenScalar, LLT::vector(3, 8)} * getAction({G_ADD, LLT::vector(3, 8)}) then returns {MoreElements, LLT::vector(8, 8)} * getAction({G_ADD, LLT::vector(20, 8)}) returns {FewerElements, LLT::vector(16, 8)} 2. Key implementation aspects. How to legalize a specific (operation, type index, size) tuple is represented by mapping intervals of integers representing a range of size types to an action to take, e.g.: setScalarAction({G_ADD, LLT:scalar(1)}, {{1, WidenScalar}, // bit sizes [ 1, 31[ {32, Legal}, // bit sizes [32, 33[ {33, WidenScalar}, // bit sizes [33, 64[ {64, Legal}, // bit sizes [64, 65[ {65, NarrowScalar} // bit sizes [65, +inf[ }); Please note that most of the code to do the actual lowering of non-power-of-2 sized types is currently missing, this is just trying to make it possible for targets to specify what is legal, and how non-legal types should be legalized. Probably quite a bit of further work is needed in the actual legalizing and the other passes in GlobalISel to support non-power-of-2 sized types. I hope the documentation in LegalizerInfo.h and the examples provided in the various {Target}LegalizerInfo.cpp and LegalizerInfoTest.cpp explains well enough how this is meant to be used. This drops the need for LLT::{half,double}...Size(). Differential Revision: https://reviews.llvm.org/D30529 llvm-svn: 317560	2017-11-07 10:34:34 +00:00
Diana Picus	d1b618177a	[globalisel][tablegen] Skip src child predicates The GlobalISel TableGen backend didn't check for predicates on the source children. This caused it to generate code for ARM patterns such as SMLABB or similar, but without properly checking for the sext_16_node part of the operands. This in turn meant that we would select SMLABB instead of MLA for simple sequences such as s32 + s32 * s32, which is wrong (we want a MLA on the full operands, not just their bottom 16 bits). This patch forces TableGen to skip patterns with predicates on the src children, so it doesn't generate code for SMLABB and other similar ARM instructions at all anymore. AArch64 and X86 are not affected. Differential Revision: https://reviews.llvm.org/D39554 llvm-svn: 317313	2017-11-03 10:30:19 +00:00
Sam Parker	242052c6b4	[ARM] and, or, xor and add with shl combine The generic dag combiner will fold: (shl (add x, c1), c2) -> (add (shl x, c2), c1 << c2) (shl (or x, c1), c2) -> (or (shl x, c2), c1 << c2) This can create constants which are too large to use as an immediate. Many ALU operations are also able of performing the shl, so we can unfold the transformation to prevent a mov imm instruction from being generated. Other patterns, such as b + ((a << 1) \| 510), can also be simplified in the same manner. Differential Revision: https://reviews.llvm.org/D38084 llvm-svn: 317197	2017-11-02 10:43:10 +00:00
Roger Ferrer Ibanez	9dfbc10522	Revert r313618 "[ARM] Use ADDCARRY / SUBCARRY" That change causes PR35103, so reverting until I figure it out. llvm-svn: 317092	2017-11-01 14:06:57 +00:00
Javed Absar	5cde1ccb29	[GlobalISel\|ARM] : Allow legalizing G_FSUB Adding support for VSUB. Reviewed by: @rovka Differential Revision: https://reviews.llvm.org/D39261 llvm-svn: 316902	2017-10-30 13:51:56 +00:00
Diana Picus	6e41e6ac6c	[ARM GlobalISel] Fixup r316572. NFC Just missed a few spots... llvm-svn: 316897	2017-10-30 11:58:09 +00:00
Diana Picus	c537795433	[ARM GlobalISel] Remove redundant testcases. NFC Remove the G_FADD testcases from arm-legalizer.mir, they are covered by arm-legalizer-fp.mir (I probably forgot to delete them when I created that test). llvm-svn: 316573	2017-10-25 12:22:21 +00:00
Diana Picus	e7b53c4546	[ARM GlobalISel] Update test after r316479. NFC No need to check register classes in the register block anymore, since we can now much more conveniently check them at their def. llvm-svn: 316572	2017-10-25 12:22:16 +00:00
Diana Picus	b35022121d	[ARM GlobalISel] Fix call opcodes We were generating BLX for all the calls, which was incorrect in most cases. Update ARMCallLowering to generate BL for direct calls, and BLX, BX_CALL or BMOVPCRX_CALL for indirect calls. llvm-svn: 316570	2017-10-25 11:42:40 +00:00
Diana Picus	eb58db5321	[ARM GlobalISel] Split test into 3. NFC Separate the test cases that deal with calls from the rest of the IR Translator tests. We split into 2 different files, one for testing parameter and result lowering, and one for testing the various different kinds of calls that can occur (BL, BLX, BX_CALL etc). llvm-svn: 316569	2017-10-25 11:21:15 +00:00
Sam Parker	ccb209bb97	[ARM] Swap cmp operands for automatic shifts Swap the compare operands if the lhs is a shift and the rhs isn't, as in arm and T2 the shift can be performed by the compare for its second operand. Differential Revision: https://reviews.llvm.org/D39004 llvm-svn: 316562	2017-10-25 08:33:06 +00:00
Justin Bogner	6c452834a1	MIR: Print the register class or bank in vreg defs This updates the MIRPrinter to include the regclass when printing virtual register defs, which is already valid syntax for the parser. That is, given 64 bit %0 and %1 in a "gpr" regbank, %1(s64) = COPY %0(s64) would now be written as %1:gpr(s64) = COPY %0(s64) While this change alone introduces a bit of redundancy with the registers block, it allows us to update the tests to be more concise and understandable and brings us closer to being able to remove the registers block completely. Note: We generally only print the class in defs, but there is one exception. If there are uses without any defs whatsoever, we'll print the class on all uses. I'm not completely convinced this comes up in meaningful machine IR, but for now the MIRParser and MachineVerifier both accept that kind of stuff, so we don't want to have a situation where we can print something we can't parse. llvm-svn: 316479	2017-10-24 18:04:54 +00:00
Aditya Nandakumar	921f24cef1	[GISel][ARM]: Fix illegal Generic copies in tests This is in preparation for a verifier check that makes sure copies are of the same size (when generic virtual registers are involved). llvm-svn: 316388	2017-10-23 22:53:08 +00:00
Momchil Velikov	d6a4ab3d49	[ARM] Dynamic stack alignment for 16-bit Thumb This patch implements dynamic stack (re-)alignment for 16-bit Thumb. When targeting processors, which support only the 16-bit Thumb instruction set the compiler ignores the alignment attributes of automatic variables and may silently generate incorrect code. Differential revision: https://reviews.llvm.org/D38143 llvm-svn: 316289	2017-10-22 11:56:35 +00:00
Eugene Leviant	27b226fb65	[ARM] Use post-RA MI scheduler when +use-misched is set Differential revision: https://reviews.llvm.org/D39100 llvm-svn: 316214	2017-10-20 14:29:17 +00:00
Diana Picus	7bf71008aa	[ARM GlobalISel] Fix liveins in test. NFC llvm-svn: 316155	2017-10-19 09:28:19 +00:00
Diana Picus	a993859335	[ARM GlobalISel] Remove redundant tests These test cases don't really add anything that isn't covered by other tests as well, so we can safely remove them. llvm-svn: 316154	2017-10-19 08:50:28 +00:00
Justin Bogner	d45849f703	Canonicalize a large number of mir tests using update_mir_test_checks This converts a large and somewhat arbitrary set of tests to use update_mir_test_checks. I ran the script on all of the tests I expect to need to modify for an upcoming mir syntax change and kept the ones that obviously didn't change the tests in ways that might make it harder to understand. llvm-svn: 316137	2017-10-18 23:18:12 +00:00
Aditya Nandakumar	c3bfc81a1f	[GISel]: Fix generation of illegal COPYs during CallLowering We end up creating COPY's that are either truncating/extending and this should be illegal. https://reviews.llvm.org/D37640 Patch for X86 and ARM by igorb, rovka llvm-svn: 315240	2017-10-09 20:07:43 +00:00
Diana Picus	57285d7a43	[ARM] GlobalISel: Make tests less strict These are intended as integration tests, so they shouldn't be too specific about what they're checking. llvm-svn: 315083	2017-10-06 17:47:27 +00:00
Diana Picus	e393bc72ee	[ARM] GlobalISel: Select shifts Unfortunately TableGen doesn't handle this yet: Unable to deduce gMIR opcode to handle Src (which is a leaf). Just add some temporary hand-written code to generate the proper MOVsr. llvm-svn: 315071	2017-10-06 15:39:16 +00:00
Diana Picus	a81a4b17e5	[ARM] GlobalISel: Map shift operands to GPRs llvm-svn: 315067	2017-10-06 14:52:43 +00:00
Diana Picus	2c95730450	[ARM] GlobalISel: Mark shifts as legal for s32 The new legalize combiner introduces shifts all over the place, so we should support them sooner rather than later. llvm-svn: 315064	2017-10-06 14:30:05 +00:00
Geoff Berry	fabedbad11	Revert "Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"" This reverts commit r314729. Another bug has been encountered in an out-of-tree target reported by Quentin. llvm-svn: 314814	2017-10-03 16:59:13 +00:00
Geoff Berry	bfc5fb4571	Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding" Issues addressed since original review: - Avoid bug in regalloc greedy/machine verifier when forwarding to use in an instruction that re-defines the same virtual register. - Fixed bug when forwarding to use in EarlyClobber instruction slot. - Fixed incorrect forwarding to register definitions that showed up in explicit_uses() iterator (e.g. in INLINEASM). - Moved removal of dead instructions found by LiveIntervals::shrinkToUses() outside of loop iterating over instructions to avoid instructions being deleted while pointed to by iterator. - Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907. - The pass no longer forwards COPYs to physical register uses, since doing so can break code that implicitly relies on the physical register number of the use. - The pass no longer forwards COPYs to undef uses, since doing so can break the machine verifier by creating LiveRanges that don't end on a use (since the undef operand is not considered a use). [MachineCopyPropagation] Extend pass to do COPY source forwarding This change extends MachineCopyPropagation to do COPY source forwarding. This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses. llvm-svn: 314729	2017-10-02 22:01:37 +00:00
Matthias Braun	51687912a4	ARM: Fix cases where CSI Restored bit is not cleared LR is an untypical callee saved register in that it is restored into a different register (PC) and thus does not live-out of the return block. This case requires the `Restored` flag in CalleeSavedInfo to be cleared. This fixes a number of cases where this wasn't handled correctly yet. llvm-svn: 314471	2017-09-28 23:12:06 +00:00
Martin Storsjo	d6218cc385	[ARM] Restore the right frame pointer register in Int_eh_sjlj_longjmp In setupEntryBlockAndCallSites in CodeGen/SjLjEHPrepare.cpp, we fetch and store the actual frame pointer, but on return via the longjmp intrinsic, it always was restored into the r7 variable. On windows, the frame pointer should be restored into r11 instead of r7. On Darwin (where sjlj exception handling is used by default), the frame pointer is always r7, both in arm and thumb mode, and likewise, on windows, the frame pointer always is r11. On linux however, if sjlj exception handling is enabled (which it isn't by default), libcxxabi and the user code can be built in differing modes using different registers as frame pointer. Therefore, when restoring registers on a platform where we don't always use the same register depending on code mode, restore both r7 and r11. Differential Revision: https://reviews.llvm.org/D38253 llvm-svn: 314451	2017-09-28 19:04:30 +00:00
Martin Storsjo	adceba59a2	[ARM] Fix SJLJ exception handling when manually chosen on a platform where it isn't default Differential Revision: https://reviews.llvm.org/D38252 llvm-svn: 314450	2017-09-28 19:04:14 +00:00
Matthias Braun	5c3e8a450e	MIR: Serialize CaleeSavedInfo Restored flag llvm-svn: 314449	2017-09-28 18:52:14 +00:00
George Burgess IV	f8e11b803d	[DAGCombiner] Fix an off-by-one error in vector logic Without this, we could end up trying to get the Nth (0-indexed) element from a subvector of size N. Differential Revision: https://reviews.llvm.org/D37880 llvm-svn: 314380	2017-09-28 06:17:19 +00:00
Eli Friedman	edee9999c4	Revert r312724 ("[ARM] Remove redundant vcvt patterns."). It leads to some improvements, but also a regression for the simple case, so it's not clearly a good idea. test/CodeGen/ARM/vcvt.ll now has test coverage to show the difference. Ultimately, the right solution is probably to custom-lower fp-to-int conversions, to something like ARMISD::VCVT_F32_S32 plus a bitcast. It's hard to do the right thing when the implicit bitcast isn't visible to DAG transforms. llvm-svn: 314169	2017-09-25 22:07:33 +00:00
Eli Friedman	48853741ad	[ARM] Fix tests for vcvt+store to return void. This is what I meant to do in r314161; I didn't realize I'd messed up because the generated assembly is currently identical. llvm-svn: 314163	2017-09-25 21:55:27 +00:00
Eli Friedman	7961112df9	[ARM] Add tests for vcvt followed by store. llvm-svn: 314161	2017-09-25 21:37:52 +00:00
Eli Friedman	7404fad205	[ARM] Regenerate vcvt test checks. llvm-svn: 314160	2017-09-25 21:34:29 +00:00
Craig Topper	5bc10ede53	[SelectionDAG] Teach simplifyDemandedBits to handle shifts by constant splat vectors This teach simplifyDemandedBits to handle constant splat vector shifts. This required changing some uses of getZExtValue to getLimitedValue since we can't rely on legalization using getShiftAmountTy for the shift amount. I believe there may have been a bug in the ((X << C1) >>u ShAmt) handling where we didn't check if the inner shift was too large. I've fixed that here. I had to add new patterns to ARM because the zext/sext the patterns were trying to look for got turned into an any_extend with this patch. Happy to split that out too, but not sure how to test without this change. Differential Revision: https://reviews.llvm.org/D37665 llvm-svn: 314139	2017-09-25 19:26:08 +00:00
Arnold Schwaighofer	ae4de58a5b	ARM: Use the proper swifterror CSR list on platforms other than darwin Noticed by inspection llvm-svn: 314121	2017-09-25 17:19:50 +00:00
Simon Pilgrim	2b1c3bb25d	[ARM] Add missing selection patterns for vnmla For the following function: double fn1(double d0, double d1, double d2) { double a = -d0 - d1 * d2; return a; } on ARM, LLVM generates code along the lines of vneg.f64 d0, d0 vmls.f64 d0, d1, d2 i.e., a negate and a multiply-subtract. The attached patch adds instruction selection patterns to allow it to generate the single instruction vnmla.f64 d0, d1, d2 (multiply-add with negation) instead, like GCC does. Committed on behalf of @gergo- (Gergö Barany) Differential Revision: https://reviews.llvm.org/D35911 llvm-svn: 313972	2017-09-22 09:50:52 +00:00
Roger Ferrer Ibanez	8d0180c955	[ARM] Use ADDCARRY / SUBCARRY This is a preparatory step for D34515. This change: - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32 - lowering is done by first converting the boolean value into the carry flag using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two operations does the actual addition. - for subtraction, given that ISD::SUBCARRY second result is actually a borrow, we need to invert the value of the second operand and result before and after using ARMISD::SUBE. We need to invert the carry result of ARMISD::SUBE to preserve the semantics. - given that the generic combiner may lower ISD::ADDCARRY and ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering as well otherwise i64 operations now would require branches. This implies updating the corresponding test for unsigned. - add new combiner to remove the redundant conversions from/to carry flags to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C - fixes PR34045 - fixes PR34564 Differential Revision: https://reviews.llvm.org/D35192 llvm-svn: 313618	2017-09-19 09:05:39 +00:00
Manoj Gupta	7476f629ed	[LoopVectorizer] Add more testcases for PR33804. Summary: Add test cases when float <-> pointer types conversion is triggered in presence of load instructions. Reviewers: Ayal, srhines, mkuper, rengolin Reviewed By: rengolin Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D37967 llvm-svn: 313544	2017-09-18 17:28:15 +00:00
Chad Rosier	4d6d74e236	Add newline to end of test file. NFC. llvm-svn: 313263	2017-09-14 14:48:59 +00:00
Hans Wennborg	8c1eb106bd	Revert r313009 "[ARM] Use ADDCARRY / SUBCARRY" This was causing PR34045 to fire again. > This is a preparatory step for D34515 and also is being recommitted as its > first version caused PR34045. > > This change: > - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32 > - lowering is done by first converting the boolean value into the carry flag > using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value > using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two > operations does the actual addition. > - for subtraction, given that ISD::SUBCARRY second result is actually a > borrow, we need to invert the value of the second operand and result before > and after using ARMISD::SUBE. We need to invert the carry result of > ARMISD::SUBE to preserve the semantics. > - given that the generic combiner may lower ISD::ADDCARRY and > ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering > as well otherwise i64 operations now would require branches. This implies > updating the corresponding test for unsigned. > - add new combiner to remove the redundant conversions from/to carry flags > to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C > - fixes PR34045 > > Differential Revision: https://reviews.llvm.org/D35192 Also revert follow-up r313010: > [ARM] Fix typo when creating ISD::SUB nodes > > In D35192, I accidentally introduced a typo when creating ISD::SUB nodes, > giving them two values instead of one. > > This fails when the merge_values combiner finds one of these nodes. > > This change fixes PR34564. > > Differential Revision: https://reviews.llvm.org/D37690 llvm-svn: 313044	2017-09-12 16:24:17 +00:00
Roger Ferrer Ibanez	4f92b4162f	[ARM] Use ADDCARRY / SUBCARRY This is a preparatory step for D34515 and also is being recommitted as its first version caused PR34045. This change: - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32 - lowering is done by first converting the boolean value into the carry flag using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two operations does the actual addition. - for subtraction, given that ISD::SUBCARRY second result is actually a borrow, we need to invert the value of the second operand and result before and after using ARMISD::SUBE. We need to invert the carry result of ARMISD::SUBE to preserve the semantics. - given that the generic combiner may lower ISD::ADDCARRY and ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering as well otherwise i64 operations now would require branches. This implies updating the corresponding test for unsigned. - add new combiner to remove the redundant conversions from/to carry flags to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C - fixes PR34045 Differential Revision: https://reviews.llvm.org/D35192 llvm-svn: 313009	2017-09-12 07:40:09 +00:00
Hans Wennborg	075e5a2e2b	Revert r312898 "[ARM] Use ADDCARRY / SUBCARRY" It caused PR34564. > This is a preparatory step for D34515 and also is being recommitted as its > first version caused PR34045. > > This change: > - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32 > - lowering is done by first converting the boolean value into the carry flag > using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value > using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two > operations does the actual addition. > - for subtraction, given that ISD::SUBCARRY second result is actually a > borrow, we need to invert the value of the second operand and result before > and after using ARMISD::SUBE. We need to invert the carry result of > ARMISD::SUBE to preserve the semantics. > - given that the generic combiner may lower ISD::ADDCARRY and > ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering > as well otherwise i64 operations now would require branches. This implies > updating the corresponding test for unsigned. > - add new combiner to remove the redundant conversions from/to carry flags > to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C > - fixes PR34045 > > Differential Revision: https://reviews.llvm.org/D35192 llvm-svn: 312980	2017-09-11 23:52:02 +00:00
Adrian Prantl	16aa4cf7ef	llvm-dwarfdump: Make -brief the default and add a -verbose option instead. Differential Revision: https://reviews.llvm.org/D37717 llvm-svn: 312972	2017-09-11 23:05:20 +00:00
Adrian Prantl	7bc1b28291	llvm-dwarfdump: Replace -debug-dump=sect option with individual options. As discussed on llvm-dev in http://lists.llvm.org/pipermail/llvm-dev/2017-September/117301.html this changes the command line interface of llvm-dwarfdump to match the one used by the dwarfdump utility shipping on macOS. In addition to being shorter to type this format also has the advantage of allowing more than one section to be specified at the same time. In a nutshell, with this change $ llvm-dwarfdump --debug-dump=info $ llvm-dwarfdump --debug-dump=apple-objc becomes $ dwarfdump --debug-info --apple-objc Differential Revision: https://reviews.llvm.org/D37714 llvm-svn: 312970	2017-09-11 22:59:45 +00:00
Roger Ferrer Ibanez	12b20f2307	[ARM] Use ADDCARRY / SUBCARRY This is a preparatory step for D34515 and also is being recommitted as its first version caused PR34045. This change: - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32 - lowering is done by first converting the boolean value into the carry flag using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two operations does the actual addition. - for subtraction, given that ISD::SUBCARRY second result is actually a borrow, we need to invert the value of the second operand and result before and after using ARMISD::SUBE. We need to invert the carry result of ARMISD::SUBE to preserve the semantics. - given that the generic combiner may lower ISD::ADDCARRY and ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering as well otherwise i64 operations now would require branches. This implies updating the corresponding test for unsigned. - add new combiner to remove the redundant conversions from/to carry flags to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C - fixes PR34045 Differential Revision: https://reviews.llvm.org/D35192 llvm-svn: 312898	2017-09-11 07:38:05 +00:00
Wei Mi	5d84d9b35c	Fix a bug for rL312641. rL312641 Allowed llvm.memcpy/memset/memmove to be tail calls when parent function return the intrinsics's first argument. However on arm-none-eabi platform, llvm.memcpy will be expanded to __aeabi_memcpy which doesn't have return value. The fix is to check the libcall name after expansion to match "memcpy/memset/memmove" before allowing those intrinsic to be tail calls. llvm-svn: 312799	2017-09-08 16:44:52 +00:00
Benjamin Kramer	6ef976d5e1	[ARM] Remove redundant vcvt patterns. These don't add any value as they're just compositions of existing patterns. However, they can confuse the cost logic in ISel, leading to duplicated vcvt instructions like in PR33199. llvm-svn: 312724	2017-09-07 14:52:26 +00:00
Saleem Abdulrasool	5fba8ba9cc	ARM: track globals promoted to coalesced const pool entries Globals that are promoted to an ARM constant pool may alias with another existing constant pool entry. We need to keep a reference to all globals that were promoted to each constant pool value so that we can emit a distinct label for each promoted global. These labels are necessary so that debug info can refer to the promoted global without an undefined reference during linking. Patch by Stephen Crane! llvm-svn: 312692	2017-09-07 04:00:13 +00:00
Eli Friedman	c22c699882	[ARM] Make ARMExpandPseudo add implicit uses for predicated instructions Missing these could potentially screw up post-ra scheduling. Issue found by inspection, so I don't have a real testcase. Included test just verifies the expected operands after expansion. Differential Revision: https://reviews.llvm.org/D35156 llvm-svn: 312589	2017-09-05 22:54:06 +00:00
Diana Picus	abb088691b	[ARM] GlobalISel: Support global variables for RWPI In RWPI code, globals that are not read-only are accessed relative to the SB register (R9). This is achieved by explicitly generating an ADD instruction between SB and an offset that we either load from a constant pool or movw + movt into a register. llvm-svn: 312521	2017-09-05 07:57:41 +00:00
Sam McCall	f71bb198ed	Revert "Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"" This crashes on boringSSL on PPC (will send reduced testcase) This reverts commit r312328. llvm-svn: 312490	2017-09-04 15:47:00 +00:00
Dean Michael Berris	ebc1659016	[XRay][CodeGen] Use PIC-friendly code in XRay sleds and remove synthetic references in .text Summary: This is a re-roll of D36615 which uses PLT relocations in the back-end to the call to __xray_CustomEvent() when building in -fPIC and -fxray-instrument mode. Reviewers: pcc, djasper, bkramer Subscribers: sdardis, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D37373 llvm-svn: 312466	2017-09-04 05:34:58 +00:00
Matthias Braun	cebdb17522	LiveIntervalAnalysis: Fix alias regunit reserved definition A register in CodeGen can be marked as reserved: In that case we consider the register always live and do not use (or rather ignore) kill/dead/undef operand flags. LiveIntervalAnalysis however tracks liveness per register unit (not per register). We already needed adjustments for this in r292871 to deal with super/sub registers. However I did not look at aliased register there. Looking at ARM: FPSCR (regunits FPSCR, FPSCR~FPSCR_NZCV) aliases with FPSCR_NZCV (regunits FPSCR_NZCV, FPSCR~FPSCR_NZCV) hence they share a register unit (FPSCR~FPSCR_NZCV) that represents the aliased parts of the registers. This shared register unit was previously considered non-reserved, however given that we uses of the reserved FPSCR potentially violate some rules (like uses without defs) we should make FPSCR~FPSCR_NZCV reserved too and stop tracking liveness for it. This patch: - Defines a register unit as reserved when: At least for one root register, the root register and all its super registers are reserved. - Adjust LiveIntervals::computeRegUnitRange() for new reserved definition. - Add MachineRegisterInfo::isReservedRegUnit() to have a canonical way of testing. - Stop computing LiveRanges for reserved register units in HMEditor even with UpdateFlags enabled. - Skip verification of uses of reserved reg units in the machine verifier (this usually didn't happen because there would be no cached liverange but there is no guarantee for that and I would run into this case before the HMEditor tweak, so may as well fix the verifier too). Note that this should only affect ARMs FPSCR/FPSCR_NZCV registers today; aliased registers are rarely used, the only other cases are hexagons P0-P3/P3_0 and C8/USR pairs which are not mixing reserved/non-reserved registers in an alias. Differential Revision: https://reviews.llvm.org/D37356 llvm-svn: 312348	2017-09-01 18:36:26 +00:00
Manoj Gupta	6b54c7e11b	[LoopVectorizer] Use two step casting for float to pointer types. Summary: LoopVectorizer is creating casts between vec<ptr> and vec<float> types on ARM when compiling OpenCV. Since, tIs is illegal to directly cast a floating point type to a pointer type even if the types have same size causing a crash. Fix the crash using a two-step casting by bitcasting to integer and integer to pointer/float. Fixes PR33804. Reviewers: mkuper, Ayal, dlj, rengolin, srhines Reviewed By: rengolin Subscribers: aemerson, kristof.beyls, mkazantsev, Meinersbur, rengolin, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35498 llvm-svn: 312331	2017-09-01 15:36:00 +00:00
Geoff Berry	65528f2991	Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding" Issues addressed since original review: - Moved removal of dead instructions found by LiveIntervals::shrinkToUses() outside of loop iterating over instructions to avoid instructions being deleted while pointed to by iterator. - Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907. - The pass no longer forwards COPYs to physical register uses, since doing so can break code that implicitly relies on the physical register number of the use. - The pass no longer forwards COPYs to undef uses, since doing so can break the machine verifier by creating LiveRanges that don't end on a use (since the undef operand is not considered a use). [MachineCopyPropagation] Extend pass to do COPY source forwarding This change extends MachineCopyPropagation to do COPY source forwarding. This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses. llvm-svn: 312328	2017-09-01 14:27:20 +00:00
Diana Picus	f959791189	[ARM] GlobalISel: Support ROPI global variables In the ROPI relocation model, read-only variables are accessed relative to the PC. We use the (MOV\|LDRLIT)_ga_pcrel pseudoinstructions for this. llvm-svn: 312323	2017-09-01 11:13:39 +00:00
Diana Picus	b67264b182	[ARM] GlobalISel: More tests. NFC. Test constants as well in the PIC tests. These are also represented as G_GLOBAL_VALUE, and although they are treated just like other globals for PIC, they won't be for ROPI, so it's good to have this coverage. llvm-svn: 312319	2017-09-01 10:18:37 +00:00
Daniel Jasper	c0a976d417	Revert r311525: "[XRay][CodeGen] Use PIC-friendly code in XRay sleds; remove synthetic references in .text" Breaks builds internally. Will forward repo instructions to author. llvm-svn: 312243	2017-08-31 15:17:17 +00:00
Hans Wennborg	24775a0a6c	Revert r312154 "Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"" It caused PR34387: Assertion failed: (RegNo < NumRegs && "Attempting to access record for invalid register number!") > Issues identified by buildbots addressed since original review: > - Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907. > - The pass no longer forwards COPYs to physical register uses, since > doing so can break code that implicitly relies on the physical > register number of the use. > - The pass no longer forwards COPYs to undef uses, since doing so > can break the machine verifier by creating LiveRanges that don't > end on a use (since the undef operand is not considered a use). > > [MachineCopyPropagation] Extend pass to do COPY source forwarding > > This change extends MachineCopyPropagation to do COPY source forwarding. > > This change also extends the MachineCopyPropagation pass to be able to > be run during register allocation, after physical registers have been > assigned, but before the virtual registers have been re-written, which > allows it to remove virtual register COPY LiveIntervals that become dead > through the forwarding of all of their uses. llvm-svn: 312178	2017-08-30 22:11:37 +00:00
Brian Gesiak	3332976478	[ARM] Use Swift error registers on non-Darwin targets Summary: Remove a check for `ARMSubtarget::isTargetDarwin` when determining whether to use Swift error registers, so that Swift errors work properly on non-Darwin ARM32 targets (specifically Android). Before this patch, generated code would save and restores ARM register r8 at the entry and returns of a function that throws. As r8 is used as a virtual return value for the object being thrown, this gets overwritten by the restore, and calling code is unable to catch the error. In turn this caused Swift code that used `do`/`try`/`catch` to work improperly on Android ARM32 targets. Addresses Swift bug report https://bugs.swift.org/browse/SR-5438. Patch by John Holdsworth. Reviewers: manmanren, rjmccall, aschwaighofer Reviewed By: aschwaighofer Subscribers: srhines, aschwaighofer, aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D35835 llvm-svn: 312164	2017-08-30 20:03:54 +00:00
Aditya Nandakumar	c6615f56f5	[GISel]: Add a clean up combiner during legalization. Added a combiner which can clean up truncs/extends that are created in order to make the types work during legalization. Also moved the combineMerges to the LegalizeCombiner. https://reviews.llvm.org/D36880 llvm-svn: 312158	2017-08-30 19:32:59 +00:00
Geoff Berry	feffb0c8af	Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding" Issues identified by buildbots addressed since original review: - Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907. - The pass no longer forwards COPYs to physical register uses, since doing so can break code that implicitly relies on the physical register number of the use. - The pass no longer forwards COPYs to undef uses, since doing so can break the machine verifier by creating LiveRanges that don't end on a use (since the undef operand is not considered a use). [MachineCopyPropagation] Extend pass to do COPY source forwarding This change extends MachineCopyPropagation to do COPY source forwarding. This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses. llvm-svn: 312154	2017-08-30 18:41:07 +00:00
Adrian Prantl	05782218ab	Canonicalize the representation of empty an expression in DIGlobalVariableExpression This change simplifies code that has to deal with DIGlobalVariableExpression and mirrors how we treat DIExpressions in debug info intrinsics. Before this change there were two ways of representing empty expressions on globals, a nullptr and an empty !DIExpression(). If someone needs to upgrade out-of-tree testcases: perl -pi -e 's/(!DIGlobalVariableExpression$var: ![0-9]*)$/\1, expr: !DIExpression())/g' <MYTEST.ll> will catch 95%. llvm-svn: 312144	2017-08-30 18:06:51 +00:00
Balaram Makam	42adadfca0	Re-land MachineInstr: Reason locally about some memory objects before going to AA. Summary: Reverts r311008 to reinstate r310825 with a fix. Refine alias checking for pseudo vs value to be conservative. This fixes the original failure in builtbot unittest SingleSource/UnitTests/2003-07-09-SignedArgs. Reviewers: hfinkel, nemanjai, efriedma Reviewed By: efriedma Subscribers: bjope, mcrosier, nhaehnle, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D36900 llvm-svn: 312126	2017-08-30 14:57:12 +00:00
Reid Kleckner	a058736c9c	[dwarfdump] Pretty print location expressions and location lists Summary: Based on Fred's patch here: https://reviews.llvm.org/D6771 I can't seem to commandeer the old review, so I'm creating a new one. With that change the locations exrpessions are pretty printed inline in the DIE tree. The output looks like this for debug_loc entries: DW_AT_location [DW_FORM_data4] (0x00000000 0x0000000000000001 - 0x000000000000000b: DW_OP_consts +3 0x000000000000000b - 0x0000000000000012: DW_OP_consts +7 0x0000000000000012 - 0x000000000000001b: DW_OP_reg0 RAX, DW_OP_piece 0x4 0x000000000000001b - 0x0000000000000024: DW_OP_breg5 RDI+0) And like this for debug_loc.dwo entries: DW_AT_location [DW_FORM_sec_offset] (0x00000000 Addr idx 2 (w/ length 190): DW_OP_consts +0, DW_OP_stack_value Addr idx 3 (w/ length 23): DW_OP_reg0 RAX, DW_OP_piece 0x4) Simple locations without ranges are printed inline: DW_AT_location [DW_FORM_block1] (DW_OP_reg4 RSI, DW_OP_piece 0x4, DW_OP_bit_piece 0x20 0x0) The debug_loc(.dwo) dumping in changed accordingly to factor the code. Reviewers: dblaikie, aprantl, friss Subscribers: mgorny, javed.absar, hiraditya, llvm-commits, JDevlieghere Differential Revision: https://reviews.llvm.org/D37123 llvm-svn: 312042	2017-08-29 21:41:21 +00:00
Diana Picus	c9f29c62cc	[ARM] GlobalISel: Select globals in PIC mode Support the selection of G_GLOBAL_VALUE in the PIC relocation model. For simplicity we use the same pseudoinstructions for both Darwin and ELF: (MOV\|LDRLIT)_ga_pcrel(_ldr). This is new for ELF, so it requires a small update to the ARM pseudo expansion pass to make sure it adds the correct constant pool modifier and add-current-address in the case of ELF. Differential Revision: https://reviews.llvm.org/D36507 llvm-svn: 311992	2017-08-29 09:47:55 +00:00
Diana Picus	e9a9aaf539	[ARM] GlobalISel: Rename tests. NFC. The checks are complicated enough as it is, there's no use cramming PIC in there as well... llvm-svn: 311989	2017-08-29 09:00:58 +00:00
Joerg Sonnenberger	0f76a35c5e	Fix ARMv4 support ARMv4 doesn't support the "BX" instruction, which has been introduced with ARMv4t. Adjust the call lowering and tail call implementation accordingly. Further changes are necessary to ensure that presence of the v4t feature is correctly set. Most importantly, the "generic" CPU for thumb-* triples should include ARMv4t, since thumb mode without thumb support would naturally be pointless. Add a couple of asserts to ensure thumb instructions are not emitted without CPU support. Differential Revision: https://reviews.llvm.org/D37030 llvm-svn: 311921	2017-08-28 20:20:47 +00:00
Geoff Berry	75c4ae3066	[ARM] Fix bug in ARMLoadStoreOptimizer when kill flags are missing. Summary: ARMLoadStoreOpt::FixInvalidRegPairOp() was only checking if one of the load destination registers to be split overlapped with the base register if the base register was marked as killed. Since kill flags may not always be present, this can lead to incorrect code. This bug was exposed by my MachineCopyPropagation change D30751 breaking the sanitizer-x86_64-linux-android buildbot. Also clean up some dead code and add an assert that a register offset is never encountered by this code, since it does not handle them correctly. Reviewers: MatzeB, qcolombet, t.p.northover Subscribers: aemerson, javed.absar, kristof.beyls, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D37164 llvm-svn: 311907	2017-08-28 19:03:45 +00:00
Florian Hahn	5b92960091	[ARM] Check for assembler instructions in test. Currently this test causes test failures on some machines, due to isel not being registered. Update the test to run all passes and check emitted assembly instructions for now. llvm-svn: 311545	2017-08-23 11:53:24 +00:00
Florian Hahn	214e13d949	[ARM] Add missing patterns for insert_subvector. Summary: In some cases, shufflevector instruction can be transformed involving insert_subvector instructions. The ARM backend was missing some insert_subvector patterns, causing a failure during instruction selection. AArch64 has similar patterns. Reviewers: t.p.northover, olista01, javed.absar, rengolin Reviewed By: javed.absar Subscribers: aemerson, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D36796 llvm-svn: 311543	2017-08-23 10:20:59 +00:00
Dean Michael Berris	0884b73220	[XRay][CodeGen] Use PIC-friendly code in XRay sleds; remove synthetic references in .text Summary: This change achieves two things: - Redefine the Custom Event handling instrumentation points emitted by the compiler to not require dynamic relocation of references to the __xray_CustomEvent trampoline. - Remove the synthetic reference we emit at the end of a function that we used to keep auxiliary sections alive in favour of SHF_LINK_ORDER associated with the section where the function is defined. To achieve the custom event handling change, we've had to introduce the concept of sled versioning -- this will need to be supported by the runtime to allow us to understand how to turn on/off the new version of the custom event handling sleds. That change has to land first before we change the way we write the sleds. To remove the synthetic reference, we rely on a relatively new linker feature that preserves the sections that are associated with each other. This allows us to limit the effects on the .text section of ELF binaries. Because we're still using absolute references that are resolved at runtime for the instrumentation map (and function index) maps, we mark these sections write-able. In the future we can re-define the entries in the map to use relative relocations instead that can be statically determined by the linker. That change will be a bit more invasive so we defer this for later. Depends on D36816. Reviewers: dblaikie, echristo, pcc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36615 llvm-svn: 311525	2017-08-23 04:49:41 +00:00
Matthias Braun	8426d1342d	Add test case for r311511 This also changes the TailDuplicator to be configured explicitely pre/post regalloc rather than relying on the isSSA() flag. This was necessary to have `llc -run-pass` work reliably. llvm-svn: 311520	2017-08-23 03:17:59 +00:00
Renato Golin	c070c73d5e	[ARM] Avoid creating duplicate ANDs in SelectionDAG When expanding a BRCOND into a BR_CC, do not create an AND 1 if one already exists. Review: D36705 Patch by Joel Galenson <jgalenson@google.com> llvm-svn: 311447	2017-08-22 11:02:45 +00:00
Renato Golin	f63d701669	[ARM] Call setBooleanContents(ZeroOrOneBooleanContent) The ARM backend should call setBooleanContents so that it can use known bits to make some optimizations. Review: D35821 Patch by Joel Galenson <jgalenson@google.com> llvm-svn: 311446	2017-08-22 11:02:37 +00:00
Martin Storsjo	91522ffa12	[ARM] Check the right order for halves of VZIP/VUZP if both parts are used This is the exact same fix as in SVN r247254. In that commit, the fix was applied only for isVTRNMask and isVTRN_v_undef_Mask, but the same issue is present for VZIP/VUZP as well. This fixes PR33921. Differential Revision: https://reviews.llvm.org/D36899 llvm-svn: 311258	2017-08-19 19:47:48 +00:00
Tim Northover	14302fcb24	ARM: use an external relocation for calls from MachO ARM mode. The internal (__text-relative) relocation risks the offset not being encodable if the destination is Thumb. llvm-svn: 311187	2017-08-18 19:13:56 +00:00
Geoff Berry	bd47e8a4f7	Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding" round 2 This reverts commit r311135. sanitizer-x86_64-linux-android buildbot is timing out with just this patch applied. llvm-svn: 311142	2017-08-18 01:43:11 +00:00
Geoff Berry	51f52c4fca	Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding" Two issues identified by buildbots were addressed: - The pass no longer forwards COPYs to physical register uses, since doing so can break code that implicitly relies on the physical register number of the use. - The pass no longer forwards COPYs to undef uses, since doing so can break the machine verifier by creating LiveRanges that don't end on a use (since the undef operand is not considered a use). [MachineCopyPropagation] Extend pass to do COPY source forwarding This change extends MachineCopyPropagation to do COPY source forwarding. This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses. Reviewers: qcolombet, javed.absar, MatzeB, jonpa Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny Differential Revision: https://reviews.llvm.org/D30751 llvm-svn: 311135	2017-08-17 23:06:55 +00:00
Geoff Berry	4e38e02e6f	Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding" This reverts commit r311038. Several buildbots are breaking, and at least one appears to be due to the forwarding of physical regs enabled by this change. Reverting while I investigate further. llvm-svn: 311062	2017-08-17 04:04:11 +00:00
Saleem Abdulrasool	dd8c16b58e	ARM: mark CPSR as clobbered for Windows VLAs When lowering a VLA, we emit a __chstk call. However, this call can internally clobber CPSR. We did not mark this register as an ImpDef, which could potentially allow a comparison to be hoisted above the call to `__chkstk`. In such a case, the CPSR could be clobbered, and the check invalidated. When the support was initially added, it seemed that the call would take care of preventing CPSR from being clobbered, but this is not the case. Mark the register as clobbered to fix a possible state corruption. llvm-svn: 311061	2017-08-17 02:42:24 +00:00
Geoff Berry	87f8d25150	[MachineCopyPropagation] Extend pass to do COPY source forwarding This change extends MachineCopyPropagation to do COPY source forwarding. This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses. Reviewers: qcolombet, javed.absar, MatzeB, jonpa Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny Differential Revision: https://reviews.llvm.org/D30751 llvm-svn: 311038	2017-08-16 20:50:01 +00:00
Balaram Makam	c5698befb6	Revert "MachineInstr: Reason locally about some memory objects before going to AA." r310825 caused the clang-ppc64le-linux-lnt bot to go red (http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/5712) because of a test-suite failure of SingleSource/UnitTests/2003-07-09-SignedArgs This reverts commit 0028f6a87224fb595a1c19c544cde9b003035996. llvm-svn: 311008	2017-08-16 14:17:43 +00:00
Quentin Colombet	647b482c1e	[VirtRegRewriter] Properly model the register liveness on undef subreg definition Undef subreg definition means that the content of the super register doesn't matter at this point. While that's true for virtual registers, this may not hold when replacing them with actual physical registers. Indeed, some part of the physical register may be coalesced with the related virtual register and thus, the values for those parts matter and must be live. The fix consists in checking whether or not subregs of the physical register being assigned to an undef subreg definition are live through that def and insert an implicit use if they are. Doing so, will keep them alive until that point like they should be. E.g., let vreg14 being assigned to R0_R1 then %vreg14:gsub_0<def,read-undef> = COPY %R0 ; <-- R1 is still live here %vreg14:gsub_1<def> = COPY %R1 Before this changes, the rewriter would change the code into: %R0<def> = KILL %R0, %R0_R1<imp-def> ; <-- this tells R1 is redefined %R1<def> = KILL %R1, %R0_R1<imp-def>, %R0_R1<imp-use> ; this value of this R1 ; is believed to come ; from the previous ; instruction Because of this invalid liveness, later pass could make wrong choices and in particular clobber live register as it happened with the register scavenger in llvm.org/PR34107 Now we would generate: %R0<def> = KILL %R0, %R0_R1<imp-def>, %R0_R1<imp-use> ; This tells R1 needs to ; reach this point %R1<def> = KILL %R1, %R0_R1<imp-def>, %R0_R1<imp-use> The bug has been here forever, it got exposed recently because the register scavenger got smarter. Fixes llvm.org/PR34107 llvm-svn: 310979	2017-08-16 00:17:05 +00:00
Jakub Kuderski	638c085d07	[Dominators] Include infinite loops in PostDominatorTree Summary: This patch teaches PostDominatorTree about infinite loops. It is built on top of D29705 by @dberlin which includes a very detailed motivation for this change. What's new is that the patch also teaches the incremental updater how to deal with reverse-unreachable regions and how to properly maintain and verify tree roots. Before that, the incremental algorithm sometimes ended up preserving reverse-unreachable regions after updates that wouldn't appear in the tree if it was constructed from scratch on the same CFG. This patch makes the following assumptions: - A sequence of updates should produce the same tree as a recalculating it. - Any sequence of the same updates should lead to the same tree. - Siblings and roots are unordered. The last two properties are essential to efficiently perform batch updates in the future. When it comes to the first one, we can decide later that the consistency between freshly built tree and an updated one doesn't matter match, as there are many correct ways to pick roots in infinite loops, and to relax this assumption. That should enable us to recalculate postdominators less frequently. This patch is pretty conservative when it comes to incremental updates on reverse-unreachable regions and ends up recalculating the whole tree in many cases. It should be possible to improve the performance in many cases, if we decide that it's important enough. That being said, my experiments showed that reverse-unreachable are very rare in the IR emitted by clang when bootstrapping clang. Here are the statistics I collected by analyzing IR between passes and after each removePredecessor call: ``` # functions: 52283 # samples: 337609 # reverse unreachable BBs: 216022 # BBs: 247840796 Percent reverse-unreachable: 0.08716159869015269 % Max(PercRevUnreachable) in a function: 87.58620689655172 % # > 25 % samples: 471 ( 0.1395104988314885 % samples ) ... in 145 ( 0.27733680163724345 % functions ) ``` Most of the reverse-unreachable regions come from invalid IR where it wouldn't be possible to construct a PostDomTree anyway. I would like to commit this patch in the next week in order to be able to complete the work that depends on it before the end of my internship, so please don't wait long to voice your concerns :). Reviewers: dberlin, sanjoy, grosser, brzycki, davide, chandlerc, hfinkel Reviewed By: dberlin Subscribers: nhaehnle, javed.absar, kparzysz, uabelho, jlebar, hiraditya, llvm-commits, dberlin, david2050 Differential Revision: https://reviews.llvm.org/D35851 llvm-svn: 310940	2017-08-15 18:14:57 +00:00
Balaram Makam	d9f53414de	MachineInstr: Reason locally about some memory objects before going to AA. This addresses a FIXME in MachineInstr::mayAlias. llvm-svn: 310825	2017-08-14 09:41:40 +00:00
Mikael Holmen	8b10680922	[IfConversion] Maintain the CFG when predicating/merging blocks in IfConvert* Summary: This fixes PR32721 in IfConvertTriangle and possible similar problems in IfConvertSimple, IfConvertDiamond and IfConvertForkedDiamond. In PR32721 we had a triangle EBB \| \ \| \| \| TBB \| / FBB where FBB didn't have any successors at all since it ended with an unconditional return. Then TBB and FBB were be merged into EBB, but EBB would still keep its successors, and the use of analyzeBranch and CorrectExtraCFGEdges wouldn't help to remove them since the return instruction is not analyzable (at least not on ARM). The edge updating code and branch probability updating code is now pushed into MergeBlocks() which allows us to share the same update logic between more callsites. This lets us remove several dependencies on analyzeBranch and completely eliminate RemoveExtraEdges. One thing that showed up with this patch was that IfConversion sometimes left a successor with 0% probability even if there was no branch or fallthrough to the successor. One such example from the test case ifcvt_bad_zero_prob_succ.mir. The indirect branch tBRIND can only jump to bb.1, but without the patch we got: bb.0: successors: %bb.1(0x80000000) bb.1: successors: %bb.1(0x80000000), %bb.2(0x00000000) tBRIND %r1, 1, %cpsr B %bb.1 bb.2: There is no way to jump from bb.1 to bb2, but still there is a 0% edge from bb.1 to bb.2. With the patch applied we instead get the expected: bb.0: successors: %bb.1(0x80000000) bb.1: successors: %bb.1(0x80000000) tBRIND %r1, 1, %cpsr B %bb.1 Since bb.2 had no predecessor at all, it was removed. Several testcases had to be updated due to this since the removed successor made the "Branch Probability Basic Block Placement" pass sometimes place blocks in a different order. Finally added a couple of new test cases: * PR32721_ifcvt_triangle_unanalyzable.mir: Regression test for the original problem dexcribed in PR 32721. * ifcvt_triangleWoCvtToNextEdge.mir: Regression test for problem that caused a revert of my first attempt to solve PR 32721. * ifcvt_simple_bad_zero_prob_succ.mir: Test case showing the problem where a wrong successor with 0% probability was previously left. * ifcvt_[diamond\|forked_diamond\|simple]_unanalyzable.mir Very simple test cases for the simple and (forked) diamond cases involving unanalyzable branches that can be nice to have as a base if wanting to write more complicated tests. Reviewers: iteratee, MatzeB, grosser, kparzysz Reviewed By: kparzysz Subscribers: kbarton, davide, aemerson, nemanjai, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34099 llvm-svn: 310697	2017-08-11 06:57:08 +00:00
Matthias Braun	a88587ce0c	ARM: Fix CMP_SWAP expansion Clean up after my misguided attempt in r304267 to "fix" CMP_SWAP returning an uninitialized status value. - I was always using tMOVi8 to zero the status register which cannot encode higher register numbers and llvm would silently miscompile) - Nobody was ever looking at that status value outside the expansion. ARMDAGToDAGISel::SelectCMP_SWAP() the only place creating CMP_SWAP instructions was not mapping anything to it. (The cmpxchg status value from llvm IR is lowered to a manual comparison after the CMP_SWAP) So this: - Renames the register from "status" to "temp" it make it obvious that it isn't used outside the expansion. - Remove the zeroing status/temp register. - Keep the live-in list improvements from r304267 Fixes http://llvm.org/PR34056 llvm-svn: 310534	2017-08-09 22:22:05 +00:00
Florian Hahn	d68bc7ae8d	[ARM] Emit error when ARM exec mode is not available. Summary: A similar error message has been removed from the ARMTargetMachineBase constructor in r306939. With this patch, we generate an error message for the example below, compiled with -mcpu=cortex-m0, which does not have ARM execution mode. __attribute__((target("arm"))) int foo(int a, int b) { return a + b % a; } __attribute__((target("thumb"))) int bar(int a, int b) { return a + b % a; } By adding this error message to ARMBaseTargetMachine::getSubtargetImpl, we can deal with functions that set -thumb-mode in target-features. At the moment it seems like Clang does not have access to target-feature specific information, so adding the error message to the frontend will be harder. Reviewers: echristo, richard.barton.arm, t.p.northover, rengolin, efriedma Reviewed By: echristo, efriedma Subscribers: efriedma, aemerson, javed.absar, kristof.beyls Differential Revision: https://reviews.llvm.org/D35627 llvm-svn: 310486	2017-08-09 15:39:10 +00:00
Florian Hahn	fc4b3951e9	[ARM] Remove FeatureNoARM implies ModeThumb. Summary: By removing FeatureNoARM implies ModeThumb, we can detect cases where a function's target-features contain -thumb-mode (enables ARM codegen for the function), but the architecture does not support ARM mode. Previously, the implication caused the FeatureNoARM bit to be cleared for functions with -thumb-mode, making the assertion in ARMSubtarget::ARMSubtarget [1] pointless for such functions. This assertion is the only guard against generating ARM code for architectures without ARM codegen support. Is there a place where we could easily generate error messages for the user? At the moment, we would generate ARM code for Thumb-only architectures. X86 has the same behavior as ARM, as in it only has an assertion and no error message, but I think for ARM an error message would be helpful. What do you think? For the example below, `llc -mtriple=armv7m-eabi test.ll -o -` will generate ARM assembler (or fail with an assertion error with this patch). Note that if we run the resulting assembler through llvm-mc, we get an appropriate error message, but not when codegen is handled through clang. ``` define void @bar() #0 { entry: ret void } attributes #0 = { "target-features"="-thumb-mode" } ``` [1] `c1f7b54cef/lib/Target/ARM/ARMSubtarget.cpp (L147)` Reviewers: t.p.northover, rengolin, peter.smith, aadg, silviu.baranga, richard.barton.arm, echristo Reviewed By: rengolin, echristo Subscribers: efriedma, aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D35569 llvm-svn: 310476	2017-08-09 13:53:28 +00:00
Nico Weber	bfde70b097	Revert r309923, it caused PR34045. llvm-svn: 309950	2017-08-03 15:41:26 +00:00
Diana Picus	930e6ec8f3	[ARM] GlobalISel: Select simple G_GLOBAL_VALUE instructions Add support in the instruction selector for G_GLOBAL_VALUE for ELF and MachO for the static relocation model. We don't handle Windows yet because that's Thumb-only, and we don't handle Thumb in general at the moment. Support for PIC, ROPI, RWPI and TLS will be added in subsequent commits. Differential Revision: https://reviews.llvm.org/D35883 llvm-svn: 309927	2017-08-03 09:14:59 +00:00
Roger Ferrer Ibanez	854980341b	[ARM] Use ADDCARRY / SUBCARRY This patch: - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32 - lowering is done by first converting the boolean value into the carry flag using (_, C) <- (ARMISD::ADDC R, -1) and converted back to an integer value using (R, _) <- (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two operations does the actual addition. - for subtraction, given that ISD::SUBCARRY second result is actually a borrow, we need to invert the value of the second operand and result before and after using ARMISD::SUBE. We need to invert the carry result of ARMISD::SUBE to preserve the semantics. - given that the generic combiner may lower ISD::ADDCARRY and ISD::SUBCARRY into ISD::UADDO and ISD::USUBO we need to update their lowering as well otherwise i64 operations now would require branches. This implies updating the corresponding test for unsigned. - add new combiner to remove the redundant conversions from/to carry flags to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) -> C Differential Revision: https://reviews.llvm.org/D35192 llvm-svn: 309923	2017-08-03 07:45:10 +00:00
Rafael Espindola	79e238afee	Delete Default and JITDefault code models IMHO it is an antipattern to have a enum value that is Default. At any given piece of code it is not clear if we have to handle Default or if has already been mapped to a concrete value. In this case in particular, only the target can do the mapping and it is nice to make sure it is always done. This deletes the two default enum values of CodeModel and uses an explicit Optional<CodeModel> when it is possible that it is unspecified. llvm-svn: 309911	2017-08-03 02:16:21 +00:00
Matthias Braun	7c91336efe	ARM: Do not use llc -march in tests. `llc -march` is problematic because it only switches the target architecture, but leaves the operating system unchanged. This occasionally leads to indeterministic tests because the OS from LLVM_DEFAULT_TARGET_TRIPLE is used. However we can simply always use `llc -mtriple` instead. This changes all the tests to do this to avoid people using -march when they copy and paste parts of tests. See also the discussion in https://reviews.llvm.org/D35287 llvm-svn: 309755	2017-08-01 22:20:49 +00:00
Adrian Prantl	032d2381bf	Remove PrologEpilogInserter's usage of DBG_VALUE's offset field In the last half-dozen commits to LLVM I removed code that became dead after removing the offset parameter from llvm.dbg.value gradually proceeding from IR towards the backend. Before I can move on to DwarfDebug and friends there is one last side-called offset I need to remove: This patch modifies PrologEpilogInserter's use of the DBG_VALUE's offset argument to use a DIExpression instead. Because the PrologEpilogInserter runs at the Machine level I had to play a little trick with a named llvm.dbg.mir node to get the DIExpressions to print in MIR dumps (which print the llvm::Module followed by the MachineFunction dump). I also had to add rudimentary DwarfExpression support to CodeView and as a side-effect also fixed a bug (CodeViewDebug::collectVariableInfo was supposed to give up on variables with complex DIExpressions, but would fail to do so for fragments, which are also modeled as DIExpressions). With this last holdover removed we will have only one canonical way of representing offsets to debug locations which will simplify the code in DwarfDebug (and future versions of CodeViewDebug once it starts handling more complex expressions) and make it easier to reason about. This patch is NFC-ish: All test case changes are for assembler comments and the binary output does not change. rdar://problem/33580047 Differential Revision: https://reviews.llvm.org/D36125 llvm-svn: 309751	2017-08-01 21:45:24 +00:00
Eli Friedman	37f41d1e7c	[ScheduleDAG] Don't schedule node with physical register interference https://reviews.llvm.org/D31536 didn't really solve the problem it was trying to solve; it got rid of the assertion failure, but we were still scheduling the DAG incorrectly (mixing together instructions from different calls), leading to a MachineVerifier failure. In order to schedule the DAG correctly, we have to make sure we don't schedule a node which should be blocked by an interference. Fix ScheduleDAGRRList::PickNodeToScheduleBottomUp so it doesn't pick a node like that. The added call to FindAvailableNode() is the key change here; this makes sure we don't try to schedule a call while we're in the middle of scheduling a different call. I'm not sure this is the right approach; in particular, I'm not sure how to prove we don't end up with an infinite loop of repeatedly backtracking. This also reverts the code change from D31536. It doesn't do anything useful: we should never schedule an ADJCALLSTACKDOWN unless we've already scheduled the corresponding ADJCALLSTACKUP. Differential Revision: https://reviews.llvm.org/D33818 llvm-svn: 309642	2017-08-01 00:28:40 +00:00
Adrian Prantl	abe04759a6	Remove the obsolete offset parameter from @llvm.dbg.value There is no situation where this rarely-used argument cannot be substituted with a DIExpression and removing it allows us to simplify the DWARF backend. Note that this patch does not yet remove any of the newly dead code. rdar://problem/33580047 Differential Revision: https://reviews.llvm.org/D35951 llvm-svn: 309426	2017-07-28 20:21:02 +00:00
Strahinja Petrovic	25e9e1b866	[ARM] Add the option to directly access TLS pointer This patch enables choice for accessing thread local storage pointer (like '-mtp' in gcc). Differential Revision: https://reviews.llvm.org/D34408 llvm-svn: 309381	2017-07-28 12:54:57 +00:00
Peter Smith	5804364f7a	[ARM] Add test to check pcs of ARM ABI runtime floating point helpers The ARM Runtime ABI document (IHI0043) defines the AEABI floating point helper functions in section 4.1.2 The floating-point helper functions. The functions listed in this section must always use the base AAPCS calling convention. This test generates calls to all the helper functions that llvm supports and checks that the base AAPCS calling convention has been used. We test the equivalent of -mfloat-abi=soft, -mfloat-abi=softfp, -mfloat-abi=hardfp with an FPU that supports single and double precision, and one that only supports double precision. Differential Revision: https://reviews.llvm.org/D35904 llvm-svn: 309371	2017-07-28 09:21:00 +00:00
Matthias Braun	c618a466f1	ARMFrameLowering: Only set ExtraCSSpill for actually unused registers. The code assumed that unclobbered/unspilled callee saved registers are unused in the function. This is not true for callee saved registers that are also used to pass parameters such as swiftself. rdar://33401922 llvm-svn: 309350	2017-07-28 01:36:32 +00:00
Diana Picus	a5d6518e93	[ARM] GlobalISel: Map G_GLOBAL_VALUE to GPR A G_GLOBAL_VALUE is basically a pointer, so it should live in the GPR. llvm-svn: 309101	2017-07-26 11:01:13 +00:00
Diana Picus	b1fd784936	[ARM] GlobalISel: Mark G_GLOBAL_VALUE as legal llvm-svn: 309090	2017-07-26 09:25:15 +00:00
Erich Keane	d8f61f8f7e	Remove Bitrig: LLVM Changes Bitrig code has been merged back to OpenBSD, thus the OS has been abandoned. Differential Revision: https://reviews.llvm.org/D35707 llvm-svn: 308799	2017-07-21 22:48:47 +00:00
Javed Absar	2cb0c95031	[ARM] Unify handling of M-Class system registers This patch cleans up and fixes issues in the M-Class system register handling: 1. It defines the system registers and the encoding (SYSm values) in one place: a new ARMSystemRegister.td using SearchableTable, thereby removing the hand-coded values which existed in multiple places. 2. Some system registers e.g. BASEPRI_MAX_NS which do not exist were being allowed! Ref: ARMv6/7/8M architecture reference manual. Reviewed by: @t.p.northover, @olist01, @john.brawn Differential Revision: https://reviews.llvm.org/D35209 llvm-svn: 308456	2017-07-19 12:57:16 +00:00
Nirav Dave	d839749ae8	[DAG] Improve Aliasing of operations to static alloca Re-recommiting after landing DAG extension-crash fix. Recommiting after adding check to avoid miscomputing alias information on addresses of the same base but different subindices. Memory accesses offset from frame indices may alias, e.g., we may merge write from function arguments passed on the stack when they are contiguous. As a result, when checking aliasing, we consider the underlying frame index's offset from the stack pointer. Static allocs are realized as stack objects in SelectionDAG, but its offset is not set until post-DAG causing DAGCombiner's alias check to consider access to static allocas to frequently alias. Modify isAlias to consider access between static allocas and access from other frame objects to be considered aliasing. Many test changes are included here. Most are fixes for tests which indirectly relied on our aliasing ability and needed to be modified to preserve their original intent. The remaining tests have minor improvements due to relaxed ordering. The exception is CodeGen/X86/2011-10-19-widen_vselect.ll which has a minor degradation dispite though the pre-legalized DAG is improved. Reviewers: rnk, mkuper, jonpa, hfinkel, uweigand Reviewed By: rnk Subscribers: sdardis, nemanjai, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33345 llvm-svn: 308350	2017-07-18 20:06:24 +00:00
Diana Picus	da25d5b8b0	[ARM] GlobalISel: Support G_(S\|U)REM for s8 and s16 Widen to s32, and then do whatever Lowering/Custom/Libcall action the subtarget wants. llvm-svn: 308285	2017-07-18 10:07:01 +00:00
Chandler Carruth	a15e080b05	Revert r308025 due to uncovering a crash in SelectionDAG. This is filed with a minimal test case in http://llvm.org/PR33833. Original commit message: Improve Aliasing of operations to static alloca llvm-svn: 308271	2017-07-18 07:53:47 +00:00
Nirav Dave	a8f63af9d1	Improve Aliasing of operations to static alloca Recommiting after adding check to avoid miscomputing alias information on addresses of the same base but different subindices. Memory accesses offset from frame indices may alias, e.g., we may merge write from function arguments passed on the stack when they are contiguous. As a result, when checking aliasing, we consider the underlying frame index's offset from the stack pointer. Static allocs are realized as stack objects in SelectionDAG, but its offset is not set until post-DAG causing DAGCombiner's alias check to consider access to static allocas to frequently alias. Modify isAlias to consider access between static allocas and access from other frame objects to be considered aliasing. Many test changes are included here. Most are fixes for tests which indirectly relied on our aliasing ability and needed to be modified to preserve their original intent. The remaining tests have minor improvements due to relaxed ordering. The exception is CodeGen/X86/2011-10-19-widen_vselect.ll which has a minor degradation dispite though the pre-legalized DAG is improved. Reviewers: rnk, mkuper, jonpa, hfinkel, uweigand Reviewed By: rnk Subscribers: sdardis, nemanjai, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33345 llvm-svn: 308025	2017-07-14 13:56:21 +00:00
Diana Picus	87a7067983	[ARM] GlobalISel: Support G_BRCOND Insert a TSTri to set the flags and a Bcc to branch based on their values. This is a bit inefficient in the (common) cases where the condition for the branch comes from a compare right before the branch, since we set the flags both as part of the compare lowering and as part of the branch lowering. We're going to live with that until we settle on a principled way to handle this kind of situation, which occurs with other patterns as well (combines might be the way forward here). llvm-svn: 308009	2017-07-14 09:46:06 +00:00
Diana Picus	c452175642	[ARM] GlobalISel: Support G_BR This boils down to not crashing in reg bank select due to the lack of register operands on this instruction, and adding some tests. The instruction selection is already covered by the TableGen'erated code. llvm-svn: 307904	2017-07-13 11:09:34 +00:00
Evandro Menezes	14ba3d7730	[CodeGen] Add dependency printer Add SDep printer to make debugging sessions more productive. Differential revision: https://reviews.llvm.org/D35144 llvm-svn: 307799	2017-07-12 15:30:59 +00:00
Diana Picus	21014df5e0	[ARM] GlobalISel: Select s64 G_FCMP Very similar to how we select s32 G_FCMP, the only thing that is different is the exact opcodes that we use. llvm-svn: 307763	2017-07-12 09:01:54 +00:00
Konstantin Zhuravlyov	bb80d3e1d3	Enhance synchscope representation OpenCL 2.0 introduces the notion of memory scopes in atomic operations to global and local memory. These scopes restrict how synchronization is achieved, which can result in improved performance. This change extends existing notion of synchronization scopes in LLVM to support arbitrary scopes expressed as target-specific strings, in addition to the already defined scopes (single thread, system). The LLVM IR and MIR syntax for expressing synchronization scopes has changed to use syncscope("<scope>"), where <scope> can be "singlethread" (this replaces singlethread keyword), or a target-specific name. As before, if the scope is not specified, it defaults to CrossThread/System scope. Implementation details: - Mapping from synchronization scope name/string to synchronization scope id is stored in LLVM context; - CrossThread/System and SingleThread scopes are pre-defined to efficiently check for known scopes without comparing strings; - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in the bitcode. Differential Revision: https://reviews.llvm.org/D21723 llvm-svn: 307722	2017-07-11 22:23:00 +00:00
Diana Picus	1e33c9c166	[ARM] GlobalISel: Tighten G_FCMP selection test. NFC Use CHECK-NEXT for the comparison sequence, to make sure we don't get any unexpected instructions in the middle of our flag manipulation efforts. llvm-svn: 307656	2017-07-11 12:34:33 +00:00
Diana Picus	069da27f49	[ARM] GlobalISel: Add reg mapping for s64 G_FCMP Map the result into GPR and the operands into FPR. llvm-svn: 307653	2017-07-11 11:47:45 +00:00
Diana Picus	84baba20db	[ARM] GlobalISel: Tighten legalizer tests. NFC Make sure that all the legalizer tests where the original instruction needs to be removed check for the removal. We do this by adding CHECK-NOT lines before and after the replacement sequence. This won't catch pathological cases where the instruction remains somewhere in the middle of the instruction sequence that's supposed to replace it, but hopefully that won't occur in practice (since ideally we'd be setting the insert point for the new instruction sequence either before or after the original instruction and not fiddle with it while building the sequence). llvm-svn: 307647	2017-07-11 10:52:08 +00:00
Diana Picus	443135c6eb	[ARM] GlobalISel: Fix oversight in G_FCMP legalization We used to forget to erase the original instruction when replacing a G_FCMP true/false. Fix this bug and make sure the tests check for it. llvm-svn: 307639	2017-07-11 09:43:51 +00:00
Diana Picus	b57bba8316	[ARM] GlobalISel: Legalize s64 G_FCMP Same as the s32 version, for both hard and soft float. llvm-svn: 307633	2017-07-11 08:50:01 +00:00
Matthias Braun	b38736706e	Revert "[DAG] Improve Aliasing of operations to static alloca" Reverting as it breaks tramp3d-v4 in the llvm test-suite. I added some comments to https://reviews.llvm.org/D33345 about it. This reverts commit r307546. llvm-svn: 307589	2017-07-10 20:51:30 +00:00
Nirav Dave	163e1ad9dc	[DAG] Improve Aliasing of operations to static alloca Memory accesses offset from frame indices may alias, e.g., we may merge write from function arguments passed on the stack when they are contiguous. As a result, when checking aliasing, we consider the underlying frame index's offset from the stack pointer. Static allocs are realized as stack objects in SelectionDAG, but its offset is not set until post-DAG causing DAGCombiner's alias check to consider access to static allocas to frequently alias. Modify isAlias to consider access between static allocas and access from other frame objects to be considered aliasing. Many test changes are included here. Most are fixes for tests which indirectly relied on our aliasing ability and needed to be modified to preserve their original intent. The remaining tests have minor improvements due to relaxed ordering. The exception is CodeGen/X86/2011-10-19-widen_vselect.ll which has a minor degradation dispite though the pre-legalized DAG is improved. Reviewers: rnk, mkuper, jonpa, hfinkel, uweigand Reviewed By: rnk Subscribers: sdardis, nemanjai, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33345 llvm-svn: 307546	2017-07-10 15:39:41 +00:00
Andrew V. Tischenko	a2ab3ed0df	NFC: I simply added CHECK-LABEL to prevent false matches in the tests. llvm-svn: 307397	2017-07-07 13:41:33 +00:00
Diana Picus	5b91653840	[ARM] GlobalISel: Select hard G_FCMP for s32 We lower to a sequence consisting of: - MOVi 0 into a register - VCMPS to do the actual comparison and set the VFP flags - FMSTAT to move the flags out of the VFP unit - MOVCCi to either use the "zero register" that we have previously set with the MOVi, or move 1 into the result register, based on the values of the flags As was the case with soft-float, for some predicates (one, ueq) we actually need two comparisons instead of just one. When that happens, we generate two VCMPS-FMSTAT-MOVCCi sequences and chain them by means of using the result of the first MOVCCi as the "zero register" for the second one. This is a bit overkill, since one comparison followed by two non-flag-setting conditional moves should be enough. In any case, the backend manages to CSE one of the comparisons away so it doesn't matter much. Note that unlike SelectionDAG and FastISel, we always use VCMPS, and not VCMPES. This makes the code a lot simpler, and it also seems correct since the LLVM Lang Ref defines simple true/false returns if the operands are QNaN's. For SNaN's, even VCMPS throws an Invalid Operand exception, so they won't be slipping through unnoticed. Implementation-wise, this introduces a template so we can share the same code that we use for handling integer comparisons, since the only differences are in the details (exact opcodes to be used etc). Hopefully this will be easy to extend to s64 G_FCMP. llvm-svn: 307365	2017-07-07 08:39:04 +00:00
Matthias Braun	eeb1516884	RegisterScavenging: Fix PR33687 When scavenging for a use in instruction MI, we will reload after that instruction and hence cannot spill uses/defs of this instruction. This fixes http://llvm.org/PR33687 llvm-svn: 307352	2017-07-07 03:02:18 +00:00
Diana Picus	c3a9c34761	[ARM] GlobalISel: Map s32 G_FCMP in reg bank select Map hard G_FCMP operands to FPR and the result to GPR. llvm-svn: 307245	2017-07-06 09:57:46 +00:00
Diana Picus	d0104eaae8	[ARM] GlobalISel: Legalize G_FCMP for s32 This covers both hard and soft float. Hard float is easy, since it's just Legal. Soft float is more involved, because there are several different ways to handle it based on the predicate: one and ueq need not only one, but two libcalls to get a result. Furthermore, we have large differences between the values returned by the AEABI and GNU functions. AEABI functions return a nice 1 or 0 representing true and respectively false. GNU functions generally return a value that needs to be compared against 0 (e.g. for ogt, the value returned by the libcall is > 0 for true). We could introduce redundant comparisons for AEABI as well, but they don't seem easy to remove afterwards, so we do different processing based on whether or not the result really needs to be compared against something (and just truncate if it doesn't). llvm-svn: 307243	2017-07-06 09:09:33 +00:00
Diana Picus	cd460c89c4	[ARM] GlobalISel: Widen s1, s8, s16 G_CONSTANT Get the legalizer to widen small constants. llvm-svn: 307239	2017-07-06 08:04:16 +00:00
Andrew Zhogin	45d192823e	[DAGCombiner] visitRotate patch to optimize pair of ROTR/ROTL instructions into one with combined shift operand. For two ROTR operations with shifts C1, C2; combined shift operand will be (C1 + C2) % bitsize. Differential revision: https://reviews.llvm.org/D12833 llvm-svn: 307179	2017-07-05 17:55:42 +00:00
Andrew Zhogin	2f8be0552e	[ARM][test] Added test/CodeGen/ARM/ror.ll test. NFC precommit for D12833. llvm-svn: 307103	2017-07-04 19:50:22 +00:00
Eric Christopher	3df231a1f7	Remove the default ARMSubtarget from the ARM TargetMachine. This enables us to ensure better LTO and code generation in the face of module linking. Remove a report_fatal_error from the TargetMachine and replace it with an assert in ARMSubtarget - and remove the test that depended on the error. The assertion will still fire in the case that we were reporting before, but error reporting needs to be in front end tools if possible for options parsing. llvm-svn: 306939	2017-07-01 03:41:53 +00:00
Eric Christopher	015dc2094e	Rewrite ARM execute only support to avoid the use of a command line flag and unqualified ARMSubtarget lookup. Paired with a clang commit to use the new behavior. llvm-svn: 306927	2017-07-01 02:55:22 +00:00
Tim Northover	ff5e7e1295	GlobalISel: add G_IMPLICIT_DEF instruction. It looks like there are two target-independent but not GISel instructions that need legalization, IMPLICIT_DEF and PHI. These are already anomalies since their operands have important LLTs attached, so to make things more uniform it seems like a good idea to add generic variants. Starting with G_IMPLICIT_DEF. llvm-svn: 306875	2017-06-30 20:27:36 +00:00
Tim Northover	2b5f03aa12	ARM: fix big-endian 64-bit cmpxchg. On big-endian machines the high and low parts of the value accessed by ldrexd and strexd are swapped around. To account for this we swap inputs and outputs in ISelLowering. Patch by Bharathi Seshadri. llvm-svn: 306865	2017-06-30 19:51:02 +00:00
Eric Christopher	ee837a59f7	Unified logic for computing target ABI in backend and front end by moving this common code to Support/TargetParser. Modeled Triple::GNU after front end code (aapcs abi) and updated tests that expect apcs abi. Based heavily on a patch by Ana Pazos! llvm-svn: 306768	2017-06-30 00:03:54 +00:00
Nikolai Bozhenov	1925594ea0	[NFC] Use stdin for some tests instead of positional argument. Summary: Otherwise unexpected matches with the path to the tests might happen. Reviewers: rengolin, spatel, efriedma, RKSimon Reviewed By: spatel Subscribers: n.bozhenov, javed.absar, llvm-commits Patch by Andrei Elovikov <andrei.elovikov@intel.com> Differential Revision: https://reviews.llvm.org/D32994 llvm-svn: 306684	2017-06-29 14:51:54 +00:00
Florian Hahn	08fdd040b5	[ARM] Add tGPRwithpc register class and use it for TBB/THH Summary: TBB and THH allow using a Thumb GPR or the PC as destination operand. A few machine verifier failures where due to those instructions not expecting PC as destination operand. Add -verify-machineinstrs to test/CodeGen/ARM/jump-table-tbh.ll to add test coverage even if expensive checks are disabled. Reviewers: MatzeB, t.p.northover, jmolloy Reviewed By: MatzeB Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34610 llvm-svn: 306654	2017-06-29 08:45:31 +00:00

... 3 4 5 6 7 ...

3487 Commits