llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	22d25a08ae	[X86] Merge itineraries for CLC, CMC, and STC. These are very simple flag setting instructions that appear to only be a single uop. They're unlikely to need this separation. llvm-svn: 329414	2018-04-06 16:16:43 +00:00
Mircea Trofin	aa3fea6cb0	[GlobalOpt] Fix support for casts in ctors. Summary: Fixing an issue where initializations of globals where constructors use casts were silently translated to 0-initialization. Reviewers: davidxl, evgeny777 Reviewed By: evgeny777 Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45198 llvm-svn: 329409	2018-04-06 15:54:47 +00:00
Dmitry Preobrazhensky	59399ae4cc	[AMDGPU][MC][VI][GFX9] Added s_atc_probe* instructions See bug 36839: https://bugs.llvm.org/show_bug.cgi?id=36839 Differential Revision: https://reviews.llvm.org/D45249 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 329408	2018-04-06 15:48:39 +00:00
Pete Couperus	b7b6e1da6c	[ARC] Add <.f> suffix for F32_GEN4_{DOP|SOP}. Add disassembler support for instructions which write back STATUS32. https://reviews.llvm.org/D45148 Patch by Yan Luo! (Yan.Luo2@synopsys.com) llvm-svn: 329404	2018-04-06 15:43:11 +00:00
Dmitry Preobrazhensky	4732d876ee	[AMDGPU][MC][GFX9] Added s_dcache_discard* instructions See bug 36838: https://bugs.llvm.org/show_bug.cgi?id=36838 Differential Revision: https://reviews.llvm.org/D45247 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 329397	2018-04-06 15:08:42 +00:00
Chad Rosier	45735b8e40	[LoopUnroll] Make LoopPeeling respect the AllowPeeling preference. The SimpleLoopUnrollPass isn't supposed to perform loop peeling. Differential Revision: https://reviews.llvm.org/D45334 llvm-svn: 329395	2018-04-06 13:57:21 +00:00
Pavel Labath	c9f07b06a1	DWARFVerifier: validate information in name index entries Summary: This patch adds checks to verify that the information in the name index entries is consistent with the debug_info section. Specifically, we check that entries point to valid DIEs, and their names, tags, and compile units match the information in the debug_info sections. These checks are only run if the previous checks did not find any errors in the name index headers. Attempting to proceed with the checks anyway would likely produce a lot of spurious errors and the verification code would need to be very careful to avoid crashing. I also add a couple more checks to the abbreviation-validation code to verify that some attributes are always present (an index without a DW_IDX_die_offset attribute is fairly useless). The entry verification works only on indexes without any type units - I haven't attempted to extend it to type units, as we don't even have a DWARF v5-compatible type unit generator at the moment. Reviewers: JDevlieghere, aprantl, dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45323 llvm-svn: 329392	2018-04-06 13:34:12 +00:00
Simon Pilgrim	09eeb3a8b9	[X86][SandyBridge] Add (V)DPPS memory fold latencies Noticed this during D44654 llvm-svn: 329389	2018-04-06 11:25:21 +00:00
Simon Pilgrim	8a83f16ccd	[X86][SandyBridge] SBWriteResPair +5cy Memory Folds As mentioned on D44647, this patch increases the default memory latency to +5cy, which more closely matches what most custom cases are doing for reg-mem instructions. I've bumped LoadLatency, ReadAfterLd and WriteLoad values to 5cy to be consistent. As Sandy Bridge is currently our default generic model, this affects a lot of scheduling tests... Differential Revision: https://reviews.llvm.org/D44654 llvm-svn: 329388	2018-04-06 11:00:51 +00:00
Hans Wennborg	da8b71f292	Tweak an assert message in the verifier llvm-svn: 329387	2018-04-06 10:20:19 +00:00
Simon Pilgrim	fd1f4fe54e	[X86][SkylakeServer] Merge 2 InstRW entries to the same sched group. NFCI. llvm-svn: 329386	2018-04-06 10:16:36 +00:00
Hans Wennborg	b230c763a4	EntryExitInstrumenter: Handle musttail calls Inserting instrumentation between a musttail call and ret instruction would create invalid IR. Instead, treat musttail calls as function exits. llvm-svn: 329385	2018-04-06 10:14:09 +00:00
Max Kazantsev	832563a782	[NFC] Add missing end of line symbols llvm-svn: 329383	2018-04-06 09:47:06 +00:00
Francis Visoiu Mistrih	537d7eee90	[MIR] Add support for MachineFrameInfo::LocalFrameSize MFI.LocalFrameSize was not serialized. It is usually set from LocalStackSlotAllocation, so if that pass doesn't run it is impossible to deduce it from the stack objects. Until now, this information was lost. llvm-svn: 329382	2018-04-06 08:56:25 +00:00
Pavel Labath	54ca2d688a	[debug_loc] Fix typo in DWARFExpression constructor Summary: The positions of the DwarfVersion and AddressSize arguments were reversed, which broke parsing for dwarf opcodes which contained address-size-dependent operands (such as DW_OP_addr). Amusingly enough, none of the address-size asserts fired, as dwarf version was always 4, which is a valid address size. I ran into this when constructing weird inputs for the DWARF verifier. I add a test case as hand-written dwarf -- I am not sure how to trigger this differently, as having a DW_OP_addr inside a location list is a fairly non-standard thing to do. Fixing this error exposed a bug in the debug_loc.dwo parser, which was always being constructed with an address size of 0. I fix that as well by following the pattern in the non-dwo parser of picking up the address size from the first compile unit (which is technically not correct, but probably good enough in practice). Reviewers: JDevlieghere, aprantl, dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45324 llvm-svn: 329381	2018-04-06 08:49:57 +00:00
Max Kazantsev	2f2fbebdc8	[NFC] Loosen restriction on preheader to fix buildbot llvm-svn: 329379	2018-04-06 07:23:45 +00:00
Hiroshi Inoue	a2eefb6d9a	[PowerPC] allow D-form VSX load/store when accessing FrameIndex without offset VSX D-form load/store instructions of POWER9 require the offset be a multiple of 16 and a helper `isOffsetMultipleOf` is used to check this. So far, the helper handles the FrameIndex + offset case, but not the FrameIndex without offset case. Due to this, we are missing opportunities to exploit D-form instructions when accessing an object or array allocated on stack. For example, x-form store (stxvx) is used for int a[4] = {0}; instead of d-form store (stxv). For larger arrays, a D-form instruction is not used when accessing the first 16 bytes. Using D-form instructions reduces register pressure as well as instructions. Differential Revision: https://reviews.llvm.org/D45079 llvm-svn: 329377	2018-04-06 05:41:16 +00:00
Robert Widmann	f108d57f9b	[LLVM-C] Audit Inline Assembly APIs for Consistency Summary: - Add a missing getter for module-level inline assembly - Add a missing append function for module-level inline assembly - Deprecate LLVMSetModuleInlineAsm and replace it with LLVMSetModuleInlineAsm2 which takes an explicit length parameter - Deprecate LLVMConstInlineAsm and replace it with LLVMGetInlineAsm, a function that allows passing a dialect and is not mis-classified as a constant operation Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45346 llvm-svn: 329369	2018-04-06 02:31:29 +00:00
Manoj Gupta	afb355bdc0	Fix lld-x86_64-darwin13 build failures. Use double braces in std::array initialization to keep Darwin builders happy. llvm-svn: 329363	2018-04-05 23:23:29 +00:00
Sanjay Patel	04683de82f	[InstCombine] FP: Z - (X - Y) --> Z + (Y - X) This restores what was lost with rL73243 but without re-introducing the bug that was present in the old code. Note that we already have these transforms if the ops are marked 'fast' (and I assume that's happening somewhere in the code added with rL170471), but we clearly don't need all of 'fast' for these transforms. llvm-svn: 329362	2018-04-05 23:21:15 +00:00
Manoj Gupta	9d68b9eac5	Attempt to fix Mips breakages. Summary: Replace ArrayRefs by actual std::array objects so that there are no dangling references. Reviewers: rsmith, gkistanova Subscribers: sdardis, arichardson, llvm-commits Differential Revision: https://reviews.llvm.org/D45338 llvm-svn: 329359	2018-04-05 22:47:25 +00:00
Craig Topper	fbe3132f67	[X86] Separate CDQ and CDQE in the scheduler model. According to Agner's data, CDQE is closer to CWDE. llvm-svn: 329354	2018-04-05 21:56:19 +00:00
Mandeep Singh Grang	f3555650bd	[IR] Change std::sort to llvm::sort in response to r327219 r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused by undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort with llvm::sort. Refer to D44363 for a list of all the required patches. llvm-svn: 329353	2018-04-05 21:52:24 +00:00
Craig Topper	4cc3827791	[X86] Add MOVZPQILo2PQIrr to the Sandy Bridge scheduler model llvm-svn: 329351	2018-04-05 21:40:32 +00:00
Sanjay Patel	03e2526728	[InstCombine] nsz: -(X - Y) --> Y - X This restores part of the fold that was removed with rL73243 (PR4374). llvm-svn: 329350	2018-04-05 21:37:17 +00:00
Craig Topper	3b0b96c591	[X86] Add LEAVE instruction to the scheduler models using the same data as LEAVE64. Make LEAVE/LEAVE64 more correct on Sandy Bridge. This is the 32-bit mode version of LEAVE64. It should be at least somewhat similar to LEAVE64. The Sandy Bridge version was missing a load port use. llvm-svn: 329347	2018-04-05 21:16:26 +00:00
Wolfgang Pieb	3fb9e3f398	[DWARF v5][NFC]: Refactor DebugRnglists to prepare for the support of the DW_AT_ranges attribute in conjunction with .debug_rnglists. Reviewers: JDevlieghere Differential Revision: https://reviews.llvm.org/D45307 llvm-svn: 329345	2018-04-05 21:01:49 +00:00
Konstantin Zhuravlyov	c233ae8004	AMDGPU/Metadata: Always report a fixed number of hidden arguments Currently it is 6. If the "feature" was not used, report dummy hidden argument. Otherwise it does not match the kernarg size reported in the kernel header. Differential Revision: https://reviews.llvm.org/D45129 llvm-svn: 329341	2018-04-05 20:46:04 +00:00
Craig Topper	c6bb36a3d0	[X86] Remove some InstRWs for plain store instructions on Sandy Bridge. We were forcing the latency of these instructions to 5 cycles, but every other scheduler model had them as 1 cycle. I'm sure I didn't get everything, but this gets a big portion. llvm-svn: 329339	2018-04-05 20:04:06 +00:00
Lang Hames	7a598477aa	[RuntimeDyld][PowerPC] Use global entry points for calls between sections. Functions in different objects may use different TOCs, so calls between such functions should use the global entry point of the callee which updates the TOC pointer. This should fix a bug that the Numba developers encountered (see https://github.com/numba/numba/issues/2451). Patch by Olexa Bilaniuk. Thanks Olexa! No RuntimeDyld checker test case yet as I am not familiar enough with how RuntimeDyldELF fixes up call-sites, but I do not want to hold up landing this. I will continue to work on it and see if I can rope some powerpc experts in. llvm-svn: 329335	2018-04-05 19:37:05 +00:00
Mandeep Singh Grang	176c3efa0e	[Bitcode] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused by undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort with llvm::sort. Refer to the comments section in D44363 for a list of all the required patches. Reviewers: pcc, mehdi_amini, dexonsmith Reviewed By: dexonsmith Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45132 llvm-svn: 329334	2018-04-05 19:27:04 +00:00
Daniel Neilson	367c2aea4e	[InstCombine] Properly change GEP type when reassociating loop invariant GEP chains Summary: This is a fix to PR37005. Essentially, rL328539 ([InstCombine] reassociate loop invariant GEP chains to enable LICM) contains a bug whereby it will convert: %src = getelementptr inbounds i8, i8* %base, <2 x i64> %val %res = getelementptr inbounds i8, <2 x i8*> %src, i64 %val2 into: %src = getelementptr inbounds i8, i8* %base, i64 %val2 %res = getelementptr inbounds i8, <2 x i8*> %src, <2 x i64> %val by swapping the index operands if the GEPs are in a loop, and %val is loop variant while %val2 is loop invariant. This fix recreates new GEP instructions if the index operand swap would result in the type of %src changing from vector to scalar, or vice versa. Reviewers: sebpop, spatel Reviewed By: sebpop Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45287 llvm-svn: 329331	2018-04-05 18:51:45 +00:00
Craig Topper	9eec2025c5	[X86] Synchronize the SchedRW on some EVEX instructions with their VEX equivalents. Mostly vector load, store, and move instructions. llvm-svn: 329330	2018-04-05 18:38:45 +00:00
Mandeep Singh Grang	9893fe218c	[ARM] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused by undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort with llvm::sort. Refer to the comments section in D44363 for a list of all the required patches. Reviewers: t.p.northover, RKSimon, MatzeB, bkramer Reviewed By: bkramer Subscribers: javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D44855 llvm-svn: 329329	2018-04-05 18:31:50 +00:00
Craig Topper	665f74414d	[X86] Disassembler support for having an ADSIZE prefix affect instructions with 0xf2 and 0xf3 prefixes. Needed to support umonitor from D45253. llvm-svn: 329327	2018-04-05 18:20:14 +00:00
Sanjay Patel	deaf4f354e	[InstCombine] use pattern matchers for fsub --> fadd folds This allows folding for vectors with undef elements. llvm-svn: 329316	2018-04-05 17:06:45 +00:00
Sam Clegg	cfd44a2e69	[WebAssembly] Allow for the creation of user-defined custom sections This patch adds a way for users to create their own custom sections to be added to wasm files. At the LLVM IR layer, they are defined through the "wasm.custom_sections" named metadata. The expected use case for this is bindings generators such as wasm-bindgen. Patch by Dan Gohman Differential Revision: https://reviews.llvm.org/D45297 llvm-svn: 329315	2018-04-05 17:01:39 +00:00
Craig Topper	6ecdb03f16	[X86] Use WriteFShuffle256 for VEXTRACTF128 to be consistent with VEXTRACTI128 which uses WriteShuffle256. llvm-svn: 329310	2018-04-05 16:32:48 +00:00
Andrea Di Biagio	c74ad502ce	[MC][Tablegen] Allow models to describe the retire control unit for llvm-mca. This patch adds the ability to describe properties of the hardware retire control unit. Tablegen class RetireControlUnit has been added for this purpose (see TargetSchedule.td). A RetireControlUnit specifies the size of the reorder buffer, as well as the maximum number of opcodes that can be retired every cycle. A zero (or negative) value for the reorder buffer size means: "the size is unknown". If the size is unknown, then llvm-mca defaults it to the value of field SchedMachineModel::MicroOpBufferSize. A zero or negative number of opcodes retired per cycle means: "there is no restriction on the number of instructions that can be retired every cycle". Models can optionally specify an instance of RetireControlUnit. There can only be up-to one RetireControlUnit definition per scheduling model. Information related to the RCU (RetireControlUnit) is stored in (two new fields of) MCExtraProcessorInfo. llvm-mca loads that information when it initializes the DispatchUnit / RetireControlUnit (see Dispatch.h/Dispatch.cpp). This patch fixes PR36661. Differential Revision: https://reviews.llvm.org/D45259 llvm-svn: 329304	2018-04-05 15:41:41 +00:00
Hiroshi Inoue	bbf98aea83	[PowerPC] fix assertion failure due to missing instruction in P9InstrResources.td This patch adds L(W|H|B)ZXTLS_32 instructions introduced by https://reviews.llvm.org/rL327635 in P9InstrResources.td. llvm-svn: 329299	2018-04-05 15:27:06 +00:00
Philip Pfaffe	131fb978b0	Re-land r329273: [Plugins] Add a slim plugin API to work together with the new PM Fix unittest: Do not link LLVM into the test plugin. Additionally, remove an unrelated change that slipped in in r329273. llvm-svn: 329293	2018-04-05 15:04:13 +00:00
Pavel Labath	510725c2d6	[Testing/Support]: Better matching of Error failure states Summary: The existing Failed() matcher only allowed asserting that the operation failed, but it was not possible to verify any details of the returned error. This patch adds two new matchers, which make this possible: - Failed<InfoT>() verifies that the operation failed with a single error of a given type. - Failed<InfoT>(M) additionally checks that the contained error info object is matched by the nested matcher M. To make these work, I've changed the implementation of the ErrorHolder class. Now, instead of just storing the string representation of the Error, it fetches the ErrorInfo objects and stores them as a list of shared pointers. This way, ErrorHolder remains copyable, while still retaining the full information contained in the Error object. In case the Error object contains two or more errors, the new matchers will fail to match, instead of trying to match all (or any) of the individual ErrorInfo objects. This seemed to be the most sensible behavior for when one wants to match exact error details, but I could be convinced otherwise... Reviewers: zturner, lhames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44925 llvm-svn: 329288	2018-04-05 14:32:10 +00:00
Tim Northover	b30388bf11	ARM: Do not spill CSR to stack on entry to noreturn functions A noreturn nounwind function can be expected to never return in any way, and by never returning it will also never have to restore any callee-saved registers for its caller. This makes it possible to skip spills of those registers during function entry, saving some stack space and time in the process. This is rather useful for embedded targets with limited stack space. Should fix PR9970. Patch by myeisha (pmb). llvm-svn: 329287	2018-04-05 14:26:06 +00:00
Krzysztof Parzyszek	62c4805c1f	[Hexagon] Remove default values from lambda parameters llvm-svn: 329286	2018-04-05 14:25:52 +00:00
Sam Parker	0e7deb8104	[DAGCombine] Revert r329160 Again, broke the big endian stage 2 builders. llvm-svn: 329283	2018-04-05 13:46:17 +00:00
Sanjay Patel	236442e063	[InstCombine] cleanup; NFC llvm-svn: 329282	2018-04-05 13:24:26 +00:00
Simon Pilgrim	1d793b8ac5	[SchedModel] Complete models shouldn't match against itineraries when they don't use them (PR35639) For schedule models that don't use itineraries, checkCompleteness still checks that an instruction has a matching itinerary instead of skipping and going straight to matching the InstRWs. That doesn't seem to match what happens in TargetSchedule.cpp. This patch causes problems for a number of models that had been incorrectly flagged as complete. Differential Revision: https://reviews.llvm.org/D43235 llvm-svn: 329280	2018-04-05 13:11:36 +00:00
Philip Pfaffe	e6b49ef286	Revert "[Plugins] Add a slim plugin API to work together with the new PM" This reverts commit ecf3ba1ab45edb1b0fadce716a7facf50dca4fbb/r329273. llvm-svn: 329276	2018-04-05 12:42:12 +00:00
Philip Pfaffe	e8f3ae9da0	[Plugins] Add a slim plugin API to work together with the new PM Summary: Add a new plugin API. This closes the gap between pass registration and out-of-tree passes for the new PassManager. Unlike with the existing API, interaction with a plugin is always initiated from the tool's perspective. I.e., when a plugin is loaded, it resolves and calls a well-known symbol `llvmGetPassPluginInfo` to obtain details about the plugin. The fundamental motivation is to get rid of as many global constructors as possible. The API exposed by the plugin info is kept intentionally minimal. Reviewers: chandlerc Reviewed By: chandlerc Subscribers: bollu, grosser, lksbhm, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D35258 llvm-svn: 329273	2018-04-05 11:29:37 +00:00
Florian Hahn	6e0043365b	[LoopInterchange] Add stats counter for number of interchanged loops. Reviewers: samparker, karthikthecool, blitz.opensource Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D45209 llvm-svn: 329269	2018-04-05 10:39:23 +00:00
Fedor Sergeev	d29884c7e6	allow custom OptBisect classes set to LLVMContext This patch introduces a way to set custom OptPassGate instances to LLVMContext. A new instance field OptBisector and a new method setOptBisect() are added to the LLVMContext classes. These changes allow setting a custom OptBisect class that can make its own decisions on skipping optional passes. Another important feature of this change is the ability to set different instances of OptPassGate to different LLVMContexts. So the different contexts can be used independently in several compiling threads of one process. One unit test is added. Patch by Yevgeny Rouban. Reviewers: andrew.w.kaylor, fedor.sergeev, vsk, dberlin, Eugene.Zelenko, reames, skatkov Reviewed By: andrew.w.kaylor, fedor.sergeev Differential Revision: https://reviews.llvm.org/D44464 llvm-svn: 329267	2018-04-05 10:29:37 +00:00
Florian Hahn	831a757728	[LoopInterchange] Preserve LoopInfo after interchanging. LoopInterchange relies on LoopInfo being up-to-date, so we should preserve it after interchanging. This patch updates restructureLoops to move the BBs of the interchanged loops to the right place. Reviewers: davide, efriedma, karthikthecool, mcrosier Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D45278 llvm-svn: 329264	2018-04-05 09:48:45 +00:00
Puyan Lotfi	8afc99363b	[MIR-Canon] Fixing warnings in Non-assert builds. llvm-svn: 329258	2018-04-05 06:56:44 +00:00
Craig Topper	15303dda0d	[X86] Revert r329251-329254 It's failing on the bots and I'm not sure why. This reverts: [X86] Synchronize the SchedRW on some EVEX instructions with their VEX equivalents. [X86] Use WriteFShuffle256 for VEXTRACTF128 to be consistent with VEXTRACTI128 which uses WriteShuffle256. [X86] Remove some InstRWs for plain store instructions on Sandy Bridge. [X86] Auto-generate complete checks. NFC llvm-svn: 329256	2018-04-05 05:19:36 +00:00
Craig Topper	25c7110a37	[X86] Synchronize the SchedRW on some EVEX instructions with their VEX equivalents. Mostly vector load, store, and move instructions. llvm-svn: 329254	2018-04-05 04:42:03 +00:00
Craig Topper	4b1fdd4921	[X86] Use WriteFShuffle256 for VEXTRACTF128 to be consistent with VEXTRACTI128 which uses WriteShuffle256. llvm-svn: 329253	2018-04-05 04:42:02 +00:00
Craig Topper	5c36557426	[X86] Auto-generate complete checks. NFC llvm-svn: 329251	2018-04-05 04:41:59 +00:00
Taewook Oh	e0db533feb	[CallSiteSplitting] Do not perform callsite splitting inside landing pad Summary: If the callsite is inside a landing pad, do not perform callsite splitting. Callsite splitting uses utility function llvm::DuplicateInstructionsInSplitBetween, which eventually calls llvm::SplitEdge. llvm::SplitEdge calls llvm::SplitCriticalEdge with an assumption that the function returns nullptr only when the target edge is not a critical edge (and further assumes that if the return value was not nullptr, the predecessor of the original target edge always has a single successor because critical edge splitting was successful). However, this assumption is not true because SplitCriticalEdge returns nullptr if the destination block is a landing pad. This invalid assumption results in an assertion failure. A fundamental solution might be fixing llvm::SplitEdge not to rely on the invalid assumption. However, it'll involve a lot of work because the current API assumes that llvm::SplitEdge never fails. Instead, this patch makes callsite splitting not attempt splitting if the callsite is in a landing pad. The attached test case will crash with an assertion failure without the fix. Reviewers: fhahn, junbuml, dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45130 llvm-svn: 329250	2018-04-05 04:16:23 +00:00
Gerolf Hoflehner	f41aa4fd85	[IR] Upgrade comment token in objc retain release marker Older compiler issued '#' instead of ';' llvm-svn: 329248	2018-04-05 02:44:46 +00:00
Puyan Lotfi	d6f7313c8f	[MIR-Canon] Improving performance by switching to named vregs. No more skipping thousands of vregs. Much faster running time. llvm-svn: 329246	2018-04-05 00:27:15 +00:00
Puyan Lotfi	26c504fe1e	[MIR-Canon] Adding support for multi-def -> user distance reduction. llvm-svn: 329243	2018-04-05 00:08:15 +00:00
Sam Clegg	685c5e838a	[WebAssembly] Only write 32-bits for WebAssembly::OPERAND_OFFSET32 A bug was found where an offset of -1 would generate an encoding of max int64 which is invalid in the binary format. Differential Revision: https://reviews.llvm.org/D45280 llvm-svn: 329238	2018-04-04 22:27:58 +00:00
Peter Collingbourne	f11eb3ebe7	AArch64: Implement support for the shadowcallstack attribute. The implementation of shadow call stack on aarch64 is quite different to the implementation on x86_64. Instead of reserving a segment register for the shadow call stack, we reserve the platform register, x18. Any function that spills lr to sp also spills it to the shadow call stack, a pointer to which is stored in x18. Differential Revision: https://reviews.llvm.org/D45239 llvm-svn: 329236	2018-04-04 21:55:44 +00:00
Vitaly Buka	4296ea72ff	Don't inline @llvm.icall.branch.funnel Summary: @llvm.icall.branch.funnel is musttail with variable number of arguments. After inlining current backend can't separate call targets from call arguments. Reviewers: pcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45116 llvm-svn: 329235	2018-04-04 21:46:27 +00:00
Zhaoshi Zheng	a5531f287a	[MemorySSA] Fix spelling errors in MemorySSA.cpp. NFC llvm-svn: 329230	2018-04-04 21:08:11 +00:00
Evgeniy Stepanov	1f1a7a719d	hwasan: add -hwasan-match-all-tag flag Sometimes instead of storing addresses as is, the kernel stores the address of a page and an offset within that page, and then computes the actual address when it needs to make an access. Because of this the pointer tag gets lost (gets set to 0xff). The solution is to ignore all accesses tagged with 0xff. This patch adds a -hwasan-match-all-tag flag to hwasan, which allows ignoring accesses through pointers with a particular pointer tag value for validity. Patch by Andrey Konovalov. Differential Revision: https://reviews.llvm.org/D44827 llvm-svn: 329228	2018-04-04 20:44:59 +00:00
Jessica Paquette	bccd18b816	[MachineOutliner] Add `useMachineOutliner` target hook The MachineOutliner has a bunch of target hooks that will call llvm_unreachable if the target doesn't implement them. Therefore, if you enable the outliner on such a target, it'll just crash. It'd be much better if it'd just not run the outliner at all in this case. This commit adds a hook to TargetInstrInfo that returns false by default. Targets that implement the hook make it return true. The outliner checks the return value of this hook to decide whether or not to continue. llvm-svn: 329220	2018-04-04 19:13:31 +00:00
Eric Fiselier	96bbec79b4	[Analysis] Support aligned new/delete functions. Summary: Clang's __builtin_operator_new/delete was recently taught about the aligned allocation overloads (r328134). This patch makes LLVM aware of them as well. This allows the compiler to perform certain optimizations including eliding new/delete calls. Reviewers: rsmith, majnemer, dblaikie, vsk, bkramer Reviewed By: bkramer Subscribers: ckennelly, llvm-commits Differential Revision: https://reviews.llvm.org/D44769 llvm-svn: 329218	2018-04-04 19:01:51 +00:00
Eric Fiselier	e03d45fa8e	Revert "[Analysis] Support aligned new/delete functions." This reverts commit bee3bbd9bdd3ab3364b8fb0cdb6326bc1ae740e0. llvm-svn: 329217	2018-04-04 18:23:00 +00:00
Mandeep Singh Grang	93ab79d205	[AArch64] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused by undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort with llvm::sort. Refer to the comments section in D44363 for a list of all the required patches. Reviewers: t.p.northover, jmolloy, RKSimon, rengolin Reviewed By: rengolin Subscribers: dexonsmith, rengolin, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D44853 llvm-svn: 329216	2018-04-04 18:20:28 +00:00
Eric Fiselier	0d5f3b0281	[Analysis] Support aligned new/delete functions. Summary: Clang's __builtin_operator_new/delete was recently taught about the aligned allocation overloads (r328134). This patch makes LLVM aware of them as well. This allows the compiler to perform certain optimizations including eliding new/delete calls. Reviewers: rsmith, majnemer, dblaikie, vsk, bkramer Reviewed By: bkramer Subscribers: ckennelly, llvm-commits Differential Revision: https://reviews.llvm.org/D44769 llvm-svn: 329215	2018-04-04 18:12:01 +00:00
Craig Topper	498875fab0	[X86] Separate BSWAP32r and BSWAP64r scheduling data in SandyBridge/Haswell/Broadwell/Skylake scheduler models. The BSWAP64r version is 2 uops and BSWAP32r is only 1 uop. The regular expressions also looked for a non-existent BSWAP16r. llvm-svn: 329211	2018-04-04 17:54:19 +00:00
Zachary Turner	15b2bdfd8b	[llvm-pdbutil] Add the ability to explain binary files. Using this, you can use llvm-pdbutil to export the contents of a stream to a binary file, then run explain on the binary file so that it treats the offset as an offset into the stream instead of an offset into a file. This makes it easy to compare the contents of the same stream from two different files. llvm-svn: 329207	2018-04-04 17:29:09 +00:00
Lei Huang	09fda63af0	[Power9]Legalize and emit code for quad-precision fma instructions Legalize and emit code for the following quad-precision fma: * xsmaddqp * xsnmaddqp * xsmsubqp * xsnmsubqp Differential Revision: https://reviews.llvm.org/D44843 llvm-svn: 329206	2018-04-04 16:43:50 +00:00
Pavel Labath	0cc0306a75	Fix build breakage from r329201 Some compilers do not like having an enum type and a variable with the same name (AccelTableKind). I rename the variable to TheAccelTableKind. Suggestions for a better name welcome. llvm-svn: 329202	2018-04-04 14:54:08 +00:00
Pavel Labath	6088c23431	Re-commit r329179 after fixing build&test issues - MSVC was not OK with a static_assert referencing a non-static member variable, even though it was just in a sizeof(expression). I move the assert into the emit function, where it is probably more useful. - Tests were failing in builds which did not have the X86 target configured. Since this functionality is not target-specific, I have removed the target specifiers from the .ll files. llvm-svn: 329201	2018-04-04 14:42:14 +00:00
Dmitry Preobrazhensky	523872ea59	[AMDGPU][MC] Enabled instruction TBUFFER_LOAD_FORMAT_XYZ for SI/CI See bug 36958: https://bugs.llvm.org/show_bug.cgi?id=36958 Differential Revision: https://reviews.llvm.org/D45099 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 329197	2018-04-04 13:54:55 +00:00
Nico Weber	55fcd07d25	Revert r329179 (and follow-up unsuccessful fix attempts 329184, 329186); it doesn't build. llvm-svn: 329190	2018-04-04 13:06:22 +00:00
Dmitry Preobrazhensky	a0b8cd038c	[AMDGPU][MC] Added support of 3-element addresses for MIMG instructions See bug 35999: https://bugs.llvm.org/show_bug.cgi?id=35999 Differential Revision: https://reviews.llvm.org/D45084 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 329187	2018-04-04 13:01:17 +00:00
Nico Weber	be6a9b6d7d	Attempt to fix bots more after r329179. llvm-svn: 329186	2018-04-04 12:58:49 +00:00
Nico Weber	7e654e3231	Attempt to fix bots after r329179. llvm-svn: 329184	2018-04-04 12:54:34 +00:00
Nico Weber	1cbd096914	Sort targetgen calls in lib/Target/*/CMakeLists. Makes it easier to see mistakes such as the one fixed in r329178 and makes the different target CMakeLists more consistent. Also remove some stale-looking comments from the Nios2 target cmakefile. No intended behavior change. llvm-svn: 329181	2018-04-04 12:37:44 +00:00
Pavel Labath	69baab103a	[CodeGen] Generate DWARF v5 Accelerator Tables Summary: This patch adds a DwarfAccelTableEmitter class, which generates an accelerator table, as specified in DWARF v5 standard. At the moment it only generates a DIE offset column and (if we are indexing more than one compile unit) a CU column. Indexing type units is not currently supported, as we don't even have the ability to generate DWARF v5-compatible compile units. The implementation is not data-source agnostic like the one generating apple tables. This was not necessary as we currently only have one user of this code, and without a second user it was not obvious to me how to best abstract this. (The difference between these tables and the apple ones is that they need a lot more metadata about the debug info they are indexing). The generation is triggered by the --accel-tables argument, which supersedes the --dwarf-accel-tables arg -- the latter was a simple on-off switch, but now we can choose between two kinds of accelerator tables we can generate. This is tested by parsing the generated tables with llvm-dwarfdump and the DWARFVerifier, and I've also checked that GNU readelf is able to make sense of the tables. Differential Revision: https://reviews.llvm.org/D43286 llvm-svn: 329179	2018-04-04 12:28:20 +00:00
Nico Weber	644d456a5f	Remove duplicate tablegen lines from AVR target. They were added in r285274, in what looks like a merge mishap. AVRGenMCCodeEmitter.inc is the only non-dupe tablegen invocation added in that revision. Also sort the tablegen lines to make this easier to spot in the future. llvm-svn: 329178	2018-04-04 12:27:43 +00:00
Benjamin Kramer	1fc0da4849	Make helpers static. NFC. llvm-svn: 329170	2018-04-04 11:45:11 +00:00
Nicolai Haehnle	2f5a73820c	AMDGPU: Dimension-aware image intrinsics Summary: These new image intrinsics contain the texture type as part of their name and have each component of the address/coordinate as individual parameters. This is a preparatory step for implementing the A16 feature, where coordinates are passed as half-floats or -ints, but the Z compare value and texel offsets are still full dwords, making it difficult or impossible to distinguish between A16 on or off in the old-style intrinsics. Additionally, these intrinsics pass the 'texfailpolicy' and 'cachectrl' as i32 bit fields to reduce operand clutter and allow for future extensibility. v2: - gather4 supports 2darray images - fix a bug with 1D images on SI Change-Id: I099f309e0a394082a5901ea196c3967afb867f04 Reviewers: arsenm, rampitec, b-sumner Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D44939 llvm-svn: 329166	2018-04-04 10:58:54 +00:00
Nicolai Haehnle	eb7311ffb1	StructurizeCFG: Test for branch divergence correctly Fixes cases like the new test @nonuniform. In that test, %cc itself is a uniform value; however, when reading it after the end of the loop in basic block %if, its value is effectively non-uniform, so the branch is non-uniform. This problem was encountered in https://bugs.freedesktop.org/show_bug.cgi?id=103743; however, this change in itself is not sufficient to fix that bug, as there is another issue in the AMDGPU backend. As discovered after committing an earlier version of this change, this exposes a subtle interaction between this pass and DivergenceAnalysis: since we remove and re-create branch instructions, we can no longer rely on DivergenceAnalysis for branches in subregions that were already processed by the pass. Explicitly remove branch instructions from DivergenceAnalysis to avoid dangling pointers as a matter of defensive programming, and change how we detect non-uniform subregions. Change-Id: I32bbffece4a32f686fab54964dae1a5dd72949d4 Differential Revision: https://reviews.llvm.org/D43743 llvm-svn: 329165	2018-04-04 10:58:15 +00:00
Nicolai Haehnle	3ffd383a15	AMDGPU: Fix copying i1 value out of loop with non-uniform exit Summary: When an i1-value is defined inside of a loop and used outside of it, we cannot simply use the SGPR bitmask from the loop's last iteration. There are also useful and correct cases of an i1-value being copied between basic blocks, e.g. when a condition is computed outside of a loop and used inside it. The concept of dominators is not sufficient to capture what is going on, so I propose the notion of "lane-dominators". Fixes a bug encountered in Nier: Automata. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103743 Change-Id: If37b969ddc71d823ab3004aeafb9ea050e45bd9a Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D40547 llvm-svn: 329164	2018-04-04 10:57:58 +00:00
John Brawn	21d9b33d62	[AArch64] Add patterns matching (fabs (fsub x y)) to (fabd x y) Differential Revision: https://reviews.llvm.org/D44573 llvm-svn: 329163	2018-04-04 10:12:53 +00:00
Sam Parker	7ec722d603	[DAGCombine] Improve ReduceLoadWidth for SRL Recommitting rL321259. Previously this caused an issue with PPCBE but I didn't receive a reproducer and didn't have the time to follow up. If the issue appears again, please provide a reproducer so I can fix it. Original commit message: If the SRL node is only used by an AND, we may be able to set the ExtVT to the width of the mask, making the AND redundant. To support this, another check has been added in isLegalNarrowLoad which queries whether the load is valid. Differential Revision: https://reviews.llvm.org/D41350 llvm-svn: 329160	2018-04-04 09:26:56 +00:00
Mikhail Maltsev	68f35bcc85	[ARM] Do not convert some vmov instructions Summary: Patch https://reviews.llvm.org/D44467 implements conversion of invalid vmov instructions into valid ones. It turned out that some valid instructions also get converted, for example vmov.i64 d2, #0xff00ff00ff00ff00 -> vmov.i16 d2, #0xff00 Such behavior is incorrect because according to the ARM ARM section F2.7.7 Modified immediate constants in T32 and A32 Advanced SIMD instructions, "On assembly, the data type must be matched in the table if possible." This patch fixes the isNEONmovReplicate check so that the above instruction is not modified any more. Reviewers: rengolin, olista01 Reviewed By: rengolin Subscribers: javed.absar, kristof.beyls, rogfer01, llvm-commits Differential Revision: https://reviews.llvm.org/D44678 llvm-svn: 329158	2018-04-04 08:54:19 +00:00
Craig Topper	a30db995b3	[X86] Use the same predicate for the load for PMOVSXBQ and PMOVZXBQ. These both use a 16-bit load, but one used loadi16_anyext and the other used extloadi32i16. The only difference between them is that loadi16_anyext checked that the load was at least 2 byte aligned and non-volatile. But the alignment doesn't matter here. Just use extloadi32i16 for both. llvm-svn: 329154	2018-04-04 07:00:24 +00:00
Craig Topper	a3cac956fc	[X86] Use loadi16/loadi32 predicates in multiply patterns llvm-svn: 329153	2018-04-04 07:00:19 +00:00
Craig Topper	88e38e3e3e	[X86] Remove more dead code left over from the handling of i8/i16 UMUL_LOHI/SMUL_LOHI that is no longer needed. NFC llvm-svn: 329152	2018-04-04 07:00:16 +00:00
Max Kazantsev	613af1f7ca	[SCEV] Prove implications for SCEVUnknown Phis This patch teaches SCEV how to prove implications for SCEVUnknown nodes that are Phis. If we need to prove `Pred` for `LHS, RHS`, and `LHS` is a Phi with possible incoming values `L1, L2, ..., LN`, then if we prove `Pred` for `(L1, RHS), (L2, RHS), ..., (LN, RHS)` then we can also prove it for `(LHS, RHS)`. If both `LHS` and `RHS` are Phis from the same block, it is sufficient to prove the predicate for values that come from the same predecessor block. The typical case that it handles is that we sometimes need to prove that `Phi(Len, Len - 1) >= 0` given that `Len > 0`. The new logic was added to `isImpliedViaOperations` and only uses it and non-recursive reasoning to prove the facts we need, so it should not hurt compile time a lot. Differential Revision: https://reviews.llvm.org/D44001 Reviewed By: anna llvm-svn: 329150	2018-04-04 05:46:47 +00:00
Craig Topper	afa22edcf0	[X86] Remove dead code for handling i8/i16 UMUL_LOHI/SMUL_LOHI from X86ISelDAGToDAG.cpp. NFC These are promoted to i16/i32 multiplies by a DAG combine. llvm-svn: 329147	2018-04-04 04:38:55 +00:00
Craig Topper	3064c15dc3	[X86] Remove some code that was only needed when i1 was a legal type. NFC llvm-svn: 329146	2018-04-04 04:38:54 +00:00
Craig Topper	7d3aba6687	[SimplifyCFG] Teach merge conditional stores to handle cases where the PostBB has more than 2 predecessors by inserting a new block for the store. Summary: Currently merge conditional stores can't handle cases where PostBB (the block we need to move the store to) has more than 2 predecessors. This patch removes that restriction by creating a new block with only the 2 predecessors we care about and an unconditional branch to the original block. This provides a place to put the store. Reviewers: efriedma, jmolloy, ABataev Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39760 llvm-svn: 329142	2018-04-04 03:47:17 +00:00
Vlad Tsyrklevich	b324733169	Fix bad #include path in r329139 llvm-svn: 329140	2018-04-04 01:34:42 +00:00
Vlad Tsyrklevich	e3446017ed	Add the ShadowCallStack pass Summary: The ShadowCallStack pass instruments functions marked with the shadowcallstack attribute. The instrumented prolog saves the return address to [gs:offset] where offset is stored and updated in [gs:0]. The instrumented epilog loads/updates the return address from [gs:0] and checks that it matches the return address on the stack before returning. Reviewers: pcc, vitalybuka Reviewed By: pcc Subscribers: cryptoad, eugenis, craig.topper, mgorny, llvm-commits, kcc Differential Revision: https://reviews.llvm.org/D44802 llvm-svn: 329139	2018-04-04 01:21:16 +00:00
Nico Weber	086b1c8118	Minor no-op cmake file style fix. llvm-svn: 329137	2018-04-04 00:50:22 +00:00
Lang Hames	b1e5043cff	Reapply r329133 with fix. llvm-svn: 329136	2018-04-04 00:34:54 +00:00
Lang Hames	4e319acd84	Revert r329133 "[RuntimeDyld][AArch64] Add some error plumbing / generation..." This broke a number of buildbots. Looking into it now... llvm-svn: 329135	2018-04-04 00:12:12 +00:00
Jessica Paquette	5fa2a63785	[MachineOutliner] Test for X86FI->getUsesRedZone() as well as Attribute::NoRedZone This commit is similar to r329120, but uses the existing getUsesRedZone() function in X86MachineFunctionInfo. This teaches the outliner to look at whether or not a function truly uses a redzone instead of just the noredzone attribute on a function. Thus, after this commit, it's possible to outline from x86 without using -mno-red-zone and still get outlining results. This also adds a new test for the new redzone behaviour. llvm-svn: 329134	2018-04-03 23:32:41 +00:00
Lang Hames	b92b10f3ec	[RuntimeDyld][AArch64] Add some error plumbing / generation to catch unhandled relocation types on AArch64. llvm-svn: 329133	2018-04-03 23:19:20 +00:00
Farhana Aleen	e80aeac0f2	[AMDGPU] performMinMaxCombine should not optimize patterns of vectors to min3/max3. Summary: There are no packed instructions for min3 or max3. So, performMinMaxCombine should not optimize vectors of f16 to min3/max3. Author: FarhanaAleen Reviewed By: arsenm Subscribers: llvm-commits, AMDGPU Differential Revision: https://reviews.llvm.org/D45219 llvm-svn: 329131	2018-04-03 23:00:30 +00:00
Evandro Menezes	6b8d8f4010	[AArch64] Adjust the cost model for Exynos M3 Fix typo and simplify matching expression. llvm-svn: 329130	2018-04-03 22:57:17 +00:00
Ikhlas Ajbar	1376d934ed	[Hexagon] peel loops with runtime small trip counts Move the check canPeel() to Hexagon Target before setting PeelCount. Differential Revision: https://reviews.llvm.org/D44880 llvm-svn: 329129	2018-04-03 22:55:09 +00:00
Sanjay Patel	81b3b10a95	[InstCombine] allow more fmul folds with 'reassoc' The tests marked with 'FIXME' require loosening the check in SimplifyAssociativeOrCommutative() to optimize completely; that's still checking isFast() in Instruction::isAssociative(). llvm-svn: 329121	2018-04-03 22:19:19 +00:00
Jessica Paquette	642f6c61a3	[MachineOutliner] Keep track of fns that use a redzone in AArch64FunctionInfo This patch adds a hasRedZone() function to AArch64MachineFunctionInfo. It returns true if the function is known to use a redzone, false if it is known to not use a redzone, and no value otherwise. This removes the requirement to pass -mno-red-zone when outlining for AArch64. https://reviews.llvm.org/D45189 llvm-svn: 329120	2018-04-03 21:56:10 +00:00
Farhana Aleen	936947349a	Revert "MSG" This reverts commit 9a0ce889d1c39c74d69ecad5ce9c875155ae55de. This was committed by mistake. llvm-svn: 329119	2018-04-03 21:51:45 +00:00
Vlad Tsyrklevich	07cf78cdad	Fix bad copy-and-paste in r329108 llvm-svn: 329118	2018-04-03 21:40:27 +00:00
Jessica Paquette	d506bf8e3d	[MachineOutliner][NFC] Make outlined functions have internal linkage The linkage type on outlined functions was private before. This meant that if you set a breakpoint in an outlined function, the debugger wouldn't be able to give a sane name to the outlined function. This commit changes the linkage type to internal and updates any tests that relied on the prefixes on the names of outlined functions. llvm-svn: 329116	2018-04-03 21:36:00 +00:00
Farhana Aleen	3ab409dc86	MSG llvm-svn: 329114	2018-04-03 21:20:39 +00:00
Gor Nishanov	d4712715dd	[coroutines] Respect alloca alignment requirements when building coroutine frame Summary: If an alloca needs to be stored in the coroutine frame and it has a specified alignment that does not match the natural alignment of the alloca type, insert appropriate padding into the coroutine frame to make sure that it gets the requested alignment. For example, for a packed type (whose natural alignment is 1) with an alloca alignment of 8, we may need to insert a padding field with the required number of bytes to make sure it is properly aligned. ``` %PackedStruct = type <{ i64 }> ... %data = alloca %PackedStruct, align 8 ``` If the previous field in the coroutine frame had alignment 2, we would have [6 x i8] inserted before %PackedStruct in the coroutine frame: ``` %f.Frame = type { ..., i16, [6 x i8], %PackedStruct } ``` Reviewers: rnk, lewissbaker, modocache Reviewed By: modocache Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D45221 llvm-svn: 329112	2018-04-03 20:54:20 +00:00
Florian Hahn	9467ccf447	[LoopInterchange] Add remark for calls preventing interchanging. It also updates test/Transforms/LoopInterchange/call-instructions.ll to use accesses where we can prove dependence after D35430. Reviewers: sebpop, karthikthecool, blitz.opensource Reviewed By: sebpop Differential Revision: https://reviews.llvm.org/D45206 llvm-svn: 329111	2018-04-03 20:54:04 +00:00
Vlad Tsyrklevich	d17f61ea3b	Add the ShadowCallStack attribute Summary: Introduce the ShadowCallStack function attribute. It's added to functions compiled with -fsanitize=shadow-call-stack in order to mark functions to be instrumented by a ShadowCallStack pass to be submitted in a separate change. Reviewers: pcc, kcc, kubamracek Reviewed By: pcc, kcc Subscribers: cryptoad, mehdi_amini, javed.absar, llvm-commits, kcc Differential Revision: https://reviews.llvm.org/D44800 llvm-svn: 329108	2018-04-03 20:10:40 +00:00
Aaron Smith	47f18b91bb	[DebugInfoPDB] Add a few missing definitions to PDBTypes.h The missing definitions are from cvconst.h shipped with DIA SDK. Correct the url to MSDN for MemoryTypeEnum and set the underlying type of PDB_StackFrameType and PDB_MemoryType to uint16_t. llvm-svn: 329104	2018-04-03 19:41:27 +00:00
Jun Bum Lim	7ab1b32b5e	[CodeGen]Add NoVRegs property on PostRASink and ShrinkWrap Summary: This change declares that PostRAMachineSinking and ShrinkWrap require the NoVRegs property, so now the MachineFunctionPass can enforce this check. These passes are disabled in NVPTX & WebAssembly. Reviewers: dschuff, jlebar, tra, jgravelle-google, MatzeB, sebpop, thegameg, mcrosier Reviewed By: dschuff, thegameg Subscribers: jholewinski, jfb, sbc100, aheejin, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D45183 llvm-svn: 329095	2018-04-03 18:17:34 +00:00
Alexey Bataev	d5b1f7892f	[SLP] Fixed formatting, NFC. llvm-svn: 329091	2018-04-03 17:48:14 +00:00
Alexey Bataev	f7226ed67d	[DEBUGINFO] Add option that allows to disable emission of flags in .loc directives. Summary: Some targets do not support the extended format of the .loc directive and support only the simple format: .loc <FileID> <Line> <Column>. The patch adds an MCAsmInfo flag and an option that allow emitting the .loc directive without additional flags. Reviewers: echristo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45184 llvm-svn: 329089	2018-04-03 17:28:55 +00:00
Daniel Neilson	901acfab0c	[InstCombine] Fold compare of int constant against a splatted vector of ints Summary: Folding patterns like: %vec = shufflevector <4 x i8> %insvec, <4 x i8> undef, <4 x i32> zeroinitializer %cast = bitcast <4 x i8> %vec to i32 %cond = icmp eq i32 %cast, 0 into: %ext = extractelement <4 x i8> %insvec, i32 0 %cond = icmp eq i32 %ext, 0 Combined with existing rules, this allows us to fold patterns like: %insvec = insertelement <4 x i8> undef, i8 %val, i32 0 %vec = shufflevector <4 x i8> %insvec, <4 x i8> undef, <4 x i32> zeroinitializer %cast = bitcast <4 x i8> %vec to i32 %cond = icmp eq i32 %cast, 0 into: %cond = icmp eq i8 %val, 0 When we construct a splat vector via a shuffle and bitcast the vector into an integer type for comparison against an integer constant, we can simplify the comparison to compare the splatted value against the integer constant. Reviewers: spatel, anna, mkazantsev Reviewed By: spatel Subscribers: efriedma, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D44997 llvm-svn: 329087	2018-04-03 17:26:20 +00:00
Alexey Bataev	428e9d9d87	[SLP] Fix PR36481: vectorize reassociated instructions. Summary: If the load/extractelement/extractvalue instructions are not originally consecutive, the SLP vectorizer is unable to vectorize them. Patch allows reordering of such instructions. Patch does not support reordering of the repeated instruction, this must be handled in the separate patch. Reviewers: RKSimon, spatel, hfinkel, mkuper, Ayal, ashahid Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43776 llvm-svn: 329085	2018-04-03 17:14:47 +00:00
Alexey Bataev	df989c54cf	Recommit "[SLP] Fix issues with debug output in the SLP vectorizer." The primary issue here is that using NDEBUG alone isn't enough to guard debug printing -- instead the DEBUG() macro needs to be used so that the specific pass debug logging check is employed. Without this, every asserts-enabled build was printing out information when it hit this. I also fixed another place where we had multiple statements in a DEBUG macro to use {}s to be a bit cleaner. And I fixed a place that used errs() rather than dbgs(). llvm-svn: 329082	2018-04-03 16:40:33 +00:00
Krzysztof Parzyszek	9fa6ffe290	[Hexagon] Remove -mhvx-double and the corresponding subtarget feature Specifying the HVX vector length should be done via the -mhvx-length option. llvm-svn: 329079	2018-04-03 16:06:36 +00:00
Puyan Lotfi	764b386e20	Adding optional Name parameter to createVirtualRegister and createGenericVirtualRegister. llvm-svn: 329076	2018-04-03 15:53:49 +00:00
Benjamin Kramer	2fc3b18922	Revert "[SLP] Fix PR36481: vectorize reassociated instructions." This reverts commit r328980 and r329046. Makes the vectorizer crash. llvm-svn: 329071	2018-04-03 14:40:33 +00:00
Andrea Di Biagio	823e5f90db	[MC] Fix -Wmissing-field-initializer warning after r329067. This should fix the problem reported by the lld buildbots: - Builder lld-x86_64-darwin13, Build #19782 - Builder lld-perf-testsuite, Build #1419 llvm-svn: 329068	2018-04-03 13:52:26 +00:00
Andrea Di Biagio	9da4d6db33	[MC][Tablegen] Allow the definition of processor register files in the scheduling model for llvm-mca This patch allows the description of register files in processor scheduling models. This addresses PR36662. A new tablegen class named 'RegisterFile' has been added to TargetSchedule.td. Targets can optionally describe register files for their processors using that class. In particular, class RegisterFile allows to specify: - The total number of physical registers. - Which target registers are accessible through the register file. - The cost of allocating a register at register renaming stage. Example (from this patch - see file X86/X86ScheduleBtVer2.td) def FpuPRF : RegisterFile<72, [VR64, VR128, VR256], [1, 1, 2]> Here, FpuPRF describes a register file for MMX/XMM/YMM registers. On Jaguar (btver2), a YMM register definition consumes 2 physical registers, while MMX/XMM register definitions only cost 1 physical register. The syntax allows to specify an empty set of register classes. An empty set of register classes means: this register file models all the registers specified by the Target. For each register class, users can specify an optional register cost. By default, register costs default to 1. A value of 0 for the number of physical registers means: "this register file has an unbounded number of physical registers". This patch is structured in two parts. * Part 1 - MC/Tablegen * A first part adds the tablegen definition of RegisterFile, and teaches the SubtargetEmitter how to emit information related to register files. Information about register files is accessible through an instance of MCExtraProcessorInfo. The idea behind this design is to logically partition the processor description which is only used by external tools (like llvm-mca) from the processor information used by the llvm machine schedulers. I think that this design would make easier for targets to get rid of the extra processor information if they don't want it. 
* Part 2 - llvm-mca related * The second part of this patch is related to changes to llvm-mca. The main differences are: 1) class RegisterFile now needs to take into account the "cost of a register" when allocating physical registers at register renaming stage. 2) Point 1. triggered a minor refactoring which led to the removal of the "maximum 32 register files" restriction. 3) The BackendStatistics view has been updated so that we can print out extra details related to each register file implemented by the processor. The effect of point 3. is also visible in tests register-files-[1..5].s. Differential Revision: https://reviews.llvm.org/D44980 llvm-svn: 329067	2018-04-03 13:36:24 +00:00
Hiroshi Inoue	08a1775f28	[PowerPC] reorder entries in P9InstrResources.td in alphabetical order; NFC Reorder entries added in my previous commit (rL328969) to keep alphabetical order. llvm-svn: 329064	2018-04-03 12:49:42 +00:00
Alexander Potapenko	ac70668cff	MSan: introduce the conservative assembly handling mode. The default assembly handling mode may introduce false positives in the cases when MSan doesn't understand that the assembly call initializes the memory pointed to by one of its arguments. We introduce the conservative mode, which initializes the first \|sizeof(type)\| bytes for every \|type*\| pointer passed into the assembly statement. llvm-svn: 329054	2018-04-03 09:50:06 +00:00
Serguei Katkov	2ace8dc1c3	[SCEV] Fix PR36974. The patch changes the usage of dominate to properlyDominate to satisfy the condition !(a < a) while using std::max. It is actually NFC because a set data structure is used to keep the Loops, so no two identical loops can be in the collection. So in reality there is no difference between usage of dominate and properlyDominate in this particular case. However, this might change, so it is better to fix it. llvm-svn: 329051	2018-04-03 07:29:00 +00:00
Craig Topper	9b6a65b9ef	[X86] Reduce number of OpPrefix bits in TSFlags to 2. NFCI TSFlag doesn't need to disambiguate NoPrfx from PS. So shift the encodings so PS is NoPrfx\|0x4. llvm-svn: 329049	2018-04-03 06:37:04 +00:00
Max Kazantsev	c01e47b43f	[SCEV] Make computeExitLimit more simple and more powerful Current implementation of `computeExitLimit` has a big piece of code the only purpose of which is to prove that after the execution of this block the latch will be executed. What it currently checks is actually a subset of situations where the exiting block dominates latch. This patch replaces all these checks for simple particular cases with domination check over loop's latch which is the only necessary condition of taking the exiting block into consideration. This change allows to calculate exact loop taken count for simple loops like for (int i = 0; i < 100; i++) { if (cond) {...} else {...} if (i > 50) break; . . . } Differential Revision: https://reviews.llvm.org/D44677 Reviewed By: efriedma llvm-svn: 329047	2018-04-03 05:57:19 +00:00
Chandler Carruth	597bfd8448	[SLP] Fix issues with debug output in the SLP vectorizer. The primary issue here is that using NDEBUG alone isn't enough to guard debug printing -- instead the DEBUG() macro needs to be used so that the specific pass debug logging check is employed. Without this, every asserts-enabled build was printing out information when it hit this. I also fixed another place where we had multiple statements in a DEBUG macro to use {}s to be a bit cleaner. And I fixed a place that used `errs()` rather than `dbgs()`. llvm-svn: 329046	2018-04-03 05:27:28 +00:00
Yonghong Song	d3b522f519	bpf: fix incorrect SELECT_CC lowering Commit 37962a331c77 ("bpf: Improve expanding logic in LowerSELECT_CC") intended to improve code quality for certain jmp conditions. The commit, however, has a couple of issues: (1). In code, just swap is not enough, ConditionalCode CC should also be swapped, otherwise incorrect code will be generated. (2). The ConditionalCode swap should be subject to getHasJmpExt(). If getHasJmpExt() is False, certain conditional codes will not be supported and swap may generate incorrect code. The original goal for this patch is to optimize jmp operations which does not have JmpExt turned on. If JmpExt is on, better code could be generated. For example, the test select_ri.ll is introduced to demonstrate the optimization. The same result can be achieved with -mcpu=v2 flag. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 329043	2018-04-03 03:56:37 +00:00
Ikhlas Ajbar	b7322e8ac7	peel loops with runtime small trip counts For Hexagon, peeling loops with small runtime trip count is beneficial for our benchmarks. We set PeelCount in HexagonTargetInfo.cpp and we use PeelCount set by the target for computing the desired peel count. Differential Revision: https://reviews.llvm.org/D44880 llvm-svn: 329042	2018-04-03 03:39:43 +00:00
Haicheng Wu	7f0daaeb86	[SLP] Distinguish "demanded and shrinkable" from "demanded and not shrinkable" values when determining the minimum bitwidth We use two approaches for determining the minimum bitwidth. * Demanded bits * Value tracking If demanded bits doesn't result in a narrower type, we then try value tracking. We need this if we want to root SLP trees with the indices of getelementptr instructions since all the bits of the indices are demanded. But there is a missing piece though. We need to be able to distinguish "demanded and shrinkable" from "demanded and not shrinkable". For example, the bits of %i in %i = sext i32 %e1 to i64 %gep = getelementptr inbounds i64, i64* %p, i64 %i are demanded, but we can shrink %i's type to i32 because it won't change the result of the getelementptr. On the other hand, in %tmp15 = sext i32 %tmp14 to i64 %tmp16 = insertvalue { i64, i64 } undef, i64 %tmp15, 0 it doesn't make sense to shrink %tmp15 and we can skip the value tracking. Ideas are from Matthew Simpson! Differential Revision: https://reviews.llvm.org/D44868 llvm-svn: 329035	2018-04-03 00:05:10 +00:00
Brian Gesiak	64521bed0d	[Coroutines] Avoid assert splitting hidden coros Summary: When attempting to split a coroutine with 'hidden' visibility (for example, a C++ coroutine that is inlined when compiled with the option '-fvisibility-inlines-hidden'), LLVM would hit an assertion in include/llvm/IR/GlobalValue.h:240: "local linkage requires default visibility". The issue is that the visibility is copied from the source of the function split in the `CloneFunctionInto` function, but the linkage is not. To fix, create the new function first with external linkage, then copy the linkage from the original function after `CloneFunctionInto` is called. Since `GlobalValue::setLinkage` in turn calls `maybeSetDsoLocal`, the explicit call to `setDSOLocal` can be removed in CoroSplit.cpp. Test Plan: check-llvm Reviewers: GorNishanov, lewissbaker, EricWF, majnemer, rnk Reviewed By: rnk Subscribers: llvm-commits, eric_niebler Differential Revision: https://reviews.llvm.org/D44185 llvm-svn: 329033	2018-04-02 23:39:40 +00:00
Rafael Espindola	8c58750cc4	Align stubs for external and common global variables to pointer size. This patch fixes PR36885: clang++ generates unaligned stub symbol holding a pointer. Patch by Rahul Chaudhry! llvm-svn: 329030	2018-04-02 23:20:30 +00:00
Reid Kleckner	298ffc609b	[InstCombine] Don't strip function type casts from musttail calls Summary: The cast simplifications that instcombine does here do not make any attempt to obey the verifier rules for musttail calls. Therefore we have to disable them. Reviewers: efriedma, majnemer, pcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45186 llvm-svn: 329027	2018-04-02 22:49:44 +00:00
Reid Kleckner	a9e9918ee4	Treat inlining a notail call as a regular, non-tail call Otherwise, we end up inlining a musttail call into a non-tail position, which breaks verifier invariants. Fixes PR31014 llvm-svn: 329015	2018-04-02 21:23:16 +00:00
Lang Hames	3fdfc04e53	[ORC] Create a new SymbolStringPool by default in ExecutionSession constructor. This makes the common case of constructing an ExecutionSession tidier. llvm-svn: 329013	2018-04-02 20:57:56 +00:00
Sanjay Patel	cbb0450540	[InstCombine] add folds for icmp + sub (PR36969) (A - B) >u A --> A <u B C <u (C - D) --> C <u D https://rise4fun.com/Alive/e7j Name: ugt %sub = sub i8 %x, %y %cmp = icmp ugt i8 %sub, %x => %cmp = icmp ult i8 %x, %y Name: ult %sub = sub i8 %x, %y %cmp = icmp ult i8 %x, %sub => %cmp = icmp ult i8 %x, %y This should fix: https://bugs.llvm.org/show_bug.cgi?id=36969 llvm-svn: 329011	2018-04-02 20:37:40 +00:00
Harlan Haskins	bee4b5894a	Fix header mismatch in DIBuilder Type APIs Some of the headers changed slightly, and the accompanying implementation didn't change. This caused a silent failure. llvm-svn: 329003	2018-04-02 19:11:44 +00:00
Zachary Turner	d11328a1bb	[llvm-pdbutil] Add an export subcommand. This command can dump the binary contents of a stream to a file. This is useful when you want to do side-by-side comparisons of a specific stream from two PDBs to examine the differences between them. You can export both of them to a file, then open them up side by side in a hex editor (for example), so as to eliminate any differences that might arise from the contents being on different blocks in the PDB. In subsequent patches I plan to improve the "explain" subcommand so that you can explain the contents of a binary file that isn't necessarily a full PDB, but one of these dumped streams, by telling the subcommand how to interpret the contents. llvm-svn: 329002	2018-04-02 18:35:21 +00:00
Nico Weber	868112181b	Remove HAVE_LIBPSAPI, HAVE_SHELL32. These used to be set in the old autoconf build, but the cmake build has had a "TODO: actually check for these" comment since it was checked in, and they were set to 1 on mingw unconditionally. It seems safe to say that they always exist under mingw, so just remove them and assume they're set exactly when on mingw (with msvc, we use `pragma comment` instead of linking these via flags). llvm-svn: 328992	2018-04-02 17:32:48 +00:00
Rong Xu	5a8d4c3357	[DeadArgumentElim] Clone function level metadatas Some Function level metadatas, such as function entry count, are not cloned in DeadArgumentElim. This happens a lot in lto/thinlto because of DeadArgumentElim after internalization. This patch clones the metadatas in the original function to the new function. Differential Revision: https://reviews.llvm.org/D44127 llvm-svn: 328991	2018-04-02 17:27:38 +00:00
Nico Weber	f3db8e3c70	Remove HAVE_DIRENT_H. The autoconf manual: "This macro is obsolescent, as all current systems with directory libraries have <dirent.h>. New programs need not use this macro." llvm-svn: 328989	2018-04-02 17:17:29 +00:00
Dmitry Preobrazhensky	b181c7312e	[AMDGPU][MC][GFX9] Added instructions v_cvt_norm_*16_f16, v_sat_pk_u8_i16 See bug 36847: https://bugs.llvm.org/show_bug.cgi?id=36847 Differential Revision: https://reviews.llvm.org/D45097 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 328988	2018-04-02 17:09:20 +00:00
Gor Nishanov	b0316d96ae	[coroutines] Add support for llvm.coro.noop intrinsics Summary: A recent addition to Coroutines TS (https://wg21.link/p0913) adds a pre-defined coroutine noop_coroutine that does nothing. To implement this feature, we implemented an llvm.coro.noop intrinsic that returns a coroutine handle to a coroutine that does nothing when resumed or destroyed. Reviewers: EricWF, modocache, rnk, lewissbaker Reviewed By: modocache Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45114 llvm-svn: 328986	2018-04-02 16:55:12 +00:00
Dmitry Preobrazhensky	6bad04ecf5	[AMDGPU][MC][GFX9] Added s_atomic_* and s_buffer_atomic_* instructions Fixed a bug which caused Tablegen crash. See bug 36837: https://bugs.llvm.org/show_bug.cgi?id=36837 Differential Revision: https://reviews.llvm.org/D45085 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 328983	2018-04-02 16:10:25 +00:00
Krzysztof Parzyszek	0831f57afe	[Hexagon] Clean up some code in HexagonAsmPrinter, NFC llvm-svn: 328981	2018-04-02 15:06:55 +00:00
Alexey Bataev	3decaf4275	[SLP] Fix PR36481: vectorize reassociated instructions. Summary: If the load/extractelement/extractvalue instructions are not originally consecutive, the SLP vectorizer is unable to vectorize them. Patch allows reordering of such instructions. Reviewers: RKSimon, spatel, hfinkel, mkuper, Ayal, ashahid Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43776 llvm-svn: 328980	2018-04-02 14:51:37 +00:00
Nico Weber	f492f58182	Revert r328975, it makes TableGen assert on the bots. llvm-svn: 328978	2018-04-02 14:20:23 +00:00
Nico Weber	9f03e9de77	Remove HAVE_WRITEV that's unused after r255837. llvm-svn: 328977	2018-04-02 14:18:13 +00:00
Dmitry Preobrazhensky	32c450ae6a	[AMDGPU][MC][GFX9] Added s_atomic_* and s_buffer_atomic_* instructions See bug 36837: https://bugs.llvm.org/show_bug.cgi?id=36837 Differential Revision: https://reviews.llvm.org/D45085 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 328975	2018-04-02 13:52:23 +00:00
Nico Weber	2eada78a50	Attempt to heal bots after r328970. llvm-svn: 328974	2018-04-02 13:49:35 +00:00
Lama Saba	927468309f	[X86] Reduce Store Forward Block issues in HW - Recommit after fixing Bug 36346 If a load follows a store and reloads data that the store has written to memory, Intel microarchitectures can in many cases forward the data directly from the store to the load. This "store forwarding" saves cycles by enabling the load to directly obtain the data instead of accessing the data from cache or memory. A "store forward block" occurs in cases where a store cannot be forwarded to the load. The most typical case of store forward block on Intel Core microarchitecture is that a small store cannot be forwarded to a large load. The estimated penalty for a store forward block is ~13 cycles. This pass tries to recognize and handle cases where a "store forward block" is created by the compiler when lowering memcpy calls to a sequence of a load and a store. The pass currently only handles cases where memcpy is lowered to XMM/YMM registers; it tries to break the memcpy into smaller copies. Breaking the memcpy should be possible since there is no atomicity guarantee for loads and stores to XMM/YMM. Differential revision: https://reviews.llvm.org/D41330 Change-Id: Ib48836ccdf6005989f7d4466fa2035b7b04415d9 llvm-svn: 328973	2018-04-02 13:48:28 +00:00
Hiroshi Inoue	6d48493817	[PowerPC] fix assertion failure due to missing instruction in P9InstrResources.td This patch adds L(D\|W\|H\|B)XTLS instructions introduced by https://reviews.llvm.org/rL327635 in P9InstrResources.td. llvm-svn: 328969	2018-04-02 12:18:21 +00:00
Jonas Devlieghere	9e3e7a99e8	[dsymutil] Upstream emitting of papertrail warnings. When running dsymutil as part of your build system, it can be desirable for warnings to be part of the end product, rather than just being emitted to the output stream. This patch upstreams that functionality. Differential revision: https://reviews.llvm.org/D44639 llvm-svn: 328965	2018-04-02 10:40:43 +00:00
Craig Topper	96729cd64b	[X86][Silvermont] Use correct latency and throughput information for divide and square root in the scheduler model. Data taken from Table 16-17 in the Intel Optimization Manual. llvm-svn: 328962	2018-04-02 06:34:16 +00:00
Craig Topper	6a814904da	[X86][SkylakeServer] Correct throughput for 512-bit sqrt and divide. Data taken from the AVX512_SKX_PortAssign spreadsheet at http://instlatx64.atw.hu/ llvm-svn: 328961	2018-04-02 05:54:34 +00:00
Craig Topper	8104f266a4	[X86] Correct the throughput for divide instructions in Sandy Bridge/Haswell/Broadwell/Skylake scheduler models. Fixes most of PR36898. Still need to fix the 512-bit instructions, but Agner's tables don't have those. llvm-svn: 328960	2018-04-02 05:33:28 +00:00
Craig Topper	dc74094398	[X86] Fix the SchedRW for AVX512 shift instructions. It was being inadvertently defaulted to an FADD scheduler class. llvm-svn: 328959	2018-04-02 03:15:02 +00:00
Craig Topper	5fb1dc2d22	[X86] Give the AVX512 VEXTRACT instructions the same SchedRWs as the SSE/AVX versions. llvm-svn: 328958	2018-04-02 02:44:55 +00:00
Craig Topper	caec723a1a	[X86] Add an itinerary to BTR64rr. llvm-svn: 328956	2018-04-02 01:12:34 +00:00
Craig Topper	02daec00a2	[X86] Make sure all the classes declared in the Haswell scheduler model are prefixed with HW. The tablegen files all share a namespace so we shouldn't use generic names in a specific scheduler model. llvm-svn: 328955	2018-04-02 01:12:32 +00:00
Craig Topper	c90d906b16	[X86] Give VINSERTPS the same itinerary as INSERTPS. llvm-svn: 328954	2018-04-02 00:48:11 +00:00
Harlan Haskins	b7881bbfa2	Add C API bindings for DIBuilder 'Type' APIs This patch adds a set of unstable C API bindings to the DIBuilder interface for creating structure, function, and aggregate types. This patch also removes the existing implementations of these functions from the Go bindings and updates the Go API to fit the new C APIs. llvm-svn: 328953	2018-04-02 00:17:40 +00:00
Craig Topper	dc4a6d1ef6	[X86] Cleanup ADCX/ADOX instruction definitions. Give them both the same itineraries. Add hasSideEffects = 0 to ADOX since they don't have patterns. Rename source operands to $src1 and $src2 instead of $src0 and $src. Add ReadAfterLd to the memory form SchedRW. llvm-svn: 328952	2018-04-01 23:58:50 +00:00
Petr Hosek	934e5d5436	[AArch64] Reserve x18 register on Fuchsia This register is reserved as a platform register on Fuchsia. Differential Revision: https://reviews.llvm.org/D45105 llvm-svn: 328950	2018-04-01 23:44:04 +00:00
Craig Topper	8a1787ae22	[DebugCounter] Make -debug-counter cl::Hidden. llvm-svn: 328948	2018-04-01 22:16:52 +00:00
Craig Topper	f5730c38e9	[LegacyPassManager] Make 'print-module-scope' cl::Hidden like the rest of the printing options. llvm-svn: 328947	2018-04-01 21:54:26 +00:00
Craig Topper	9f834810ea	[X86] Give ADC8/16/32/64mi the same scheduling information as ADC8/16/32/64mr and SBB8/16/32/64mi. It doesn't make a lot of sense that it would be different. llvm-svn: 328946	2018-04-01 21:54:24 +00:00
Chandler Carruth	4244625c51	[x86] Correct the operand structure of the ADOX instruction. This also moves to define it in the same way as ADCX which seems to use constraints a bit better. This is pulled out of the review for reducing the use of popf for restoring EFLAGS, but is independent. There are still more problems with our definitions for these instructions that Craig is going to look at but this is at least less broken and he can start from this to improve them more fully. Thanks to Craig for the review here. llvm-svn: 328945	2018-04-01 21:53:18 +00:00
Chandler Carruth	06b343c6ed	[x86] Expose more of the condition conversion routines in the public API for X86's instruction information. I've now got a second patch under review that needs these same APIs. This bit is nicely orthogonal and obvious, so landing it. NFC. llvm-svn: 328944	2018-04-01 21:47:55 +00:00
Nicolai Haehnle	4254d45a79	AMDGPU: Make isIntrinsicSourceOfDivergence table-driven Summary: This is in preparation for the new dimension-aware image intrinsics, which I'd rather not have to list here by hand. Change-Id: Iaa16e3a635a11283918ce0d9e1e618591b0bf6fa Reviewers: arsenm, rampitec, b-sumner Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D44938 llvm-svn: 328939	2018-04-01 17:09:14 +00:00
Nicolai Haehnle	5d0d30304c	AMDGPU: Make getTgtMemIntrinsic table-driven for resource-based intrinsics Summary: Avoids having to list all intrinsics manually. This is in preparation for the new dimension-aware image intrinsics, which I'd rather not have to list here by hand. Change-Id: If7ced04998397ef68c4cb8f7de66b5050fb767e5 Reviewers: arsenm, rampitec, b-sumner Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D44937 llvm-svn: 328938	2018-04-01 17:09:07 +00:00
Mandeep Singh Grang	fe1d28e83d	[DebugInfo] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort to llvm::sort. Refer the comments section in D44363 for a list of all the required patches. Reviewers: echristo, zturner, samsonov Reviewed By: echristo Subscribers: JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D45134 llvm-svn: 328935	2018-04-01 16:18:49 +00:00
Teresa Johnson	974706ebf7	[ThinLTO] Add an import cutoff for debugging/triaging Summary: Adds -import-cutoff=N which will stop importing during the thin link after N imports. Default is -1 (no limit). Reviewers: wmi Subscribers: inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D45127 llvm-svn: 328934	2018-04-01 15:54:40 +00:00
David Green	f80ebc8d21	[LoopRotate] Rotate loops with loop exiting latches If a loop has a loop exiting latch, it can be profitable to rotate the loop if it leads to the simplification of a phi node. Perform rotation in these cases even if loop rotate itself didn't simplify the loop to get there. Differential Revision: https://reviews.llvm.org/D44199 llvm-svn: 328933	2018-04-01 12:48:24 +00:00
Craig Topper	9b8cd5fe55	[X86] Don't check for folding into a store when deciding if we can promote an i16 mul. There's no RMW mul operation. llvm-svn: 328931	2018-04-01 06:29:32 +00:00
Craig Topper	db6caabccc	[X86] Check if the load and store are to the same pointer before preventing i16 RMW shifts and subtracts from being promoted. llvm-svn: 328930	2018-04-01 06:29:28 +00:00
Craig Topper	ae2de57db0	[X86] Allow i16 subtracts to be promoted if the load is on the LHS and it's not being stored. llvm-svn: 328928	2018-04-01 06:29:25 +00:00
Craig Topper	9bc0d881a3	[X86] Remove unneeded temporary variable. NFC This Promote flag was always set to true except in the default case. But in the default case we don't need to set PVT and can just return false. llvm-svn: 328926	2018-04-01 06:29:21 +00:00
Mandeep Singh Grang	97bcade70f	[Analysis] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort to llvm::sort. Refer D44363 for a list of all the required patches. Reviewers: sanjoy, dexonsmith, hfinkel, RKSimon Reviewed By: dexonsmith Subscribers: david2050, llvm-commits Differential Revision: https://reviews.llvm.org/D44944 llvm-svn: 328925	2018-04-01 01:46:51 +00:00
Sanjay Patel	6124cae8f7	[DAGCombine] (float)((int) f) --> ftrunc (PR36617) fptosi / fptoui round towards zero, and that's the same behavior as ISD::FTRUNC, so replace a pair of casts with the equivalent node. We don't have to account for special cases (NaN, INF) because out-of-range casts are undefined. Differential Revision: https://reviews.llvm.org/D44909 llvm-svn: 328921	2018-03-31 17:55:44 +00:00
Simon Pilgrim	3b8ad346f9	[X86][Btver2] Add MMX_PSHUFB to the JWritePSHUFB InstRW entries llvm-svn: 328918	2018-03-31 09:15:54 +00:00
Simon Pilgrim	8c8ebd7945	Fix trailing whitespace. NFCI. llvm-svn: 328917	2018-03-31 09:14:14 +00:00
Puyan Lotfi	57c4f38c35	[MIR-Canon] Adding support for local idempotent instruction hoisting. llvm-svn: 328915	2018-03-31 05:48:51 +00:00
Craig Topper	13a0f83a05	[X86] Add SchedRW for PMULLD Summary: It seems many CPUs don't implement this instruction as well as the other vector multiplies. Often using a multi uop flow. Silvermont in particular has a 7 uop flow with 11 cycle throughput. Sandy Bridge implements it as a single uop with 5 cycle latency and 1 cycle throughput. But Haswell and later use 2 uops with 10 cycle latency and 2 cycle throughput. This patch adds a new X86SchedWritePair we can use to tag this instruction separately. I've provided correct information for Silvermont, Btver2, and Sandy Bridge. I've removed the InstRWs for SandyBridge. I've left Haswell/Broadwell/Skylake InstRWs in place because I wasn't sure how to account for the different load latency between 128 and 256 bits. I also left Znver1 InstRWs in place because the existing values don't match Agner's spreadsheet. I also left a FIXME in the SandyBridge model because it being used for the "generic" model is too optimistic for the 256/512-bit versions since those are multiple uops on all known CPUs. Reviewers: RKSimon, GGanesh, courbet Reviewed By: RKSimon Subscribers: gchatelet, gbedwell, andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D44972 llvm-svn: 328914	2018-03-31 04:54:32 +00:00
Teresa Johnson	db83aceb06	[ThinLTO] Add an option to force summary call edges cold for debugging Summary: Useful to selectively disable importing into specific modules for debugging/triaging/workarounds. Reviewers: eraman Subscribers: inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D45062 llvm-svn: 328909	2018-03-31 00:18:08 +00:00
Fangrui Song	956ee79795	Fix a bunch of typos. NFC llvm-svn: 328907	2018-03-30 22:22:31 +00:00
Ekaterina Romanova	0b01dfbba6	Prevent data races in concurrent ThinLTO processes. Make sure ThinLTO with caching doesn't use non-atomic writes to the cache file (to prevent data races and cache file corruption). 1. Place the temp file in the same place as the caching directory (instead of creating it in the directory pointed to by the TMP/TEMP variable). This will help to prevent using non-atomic rename and falling back to a non-atomic "direct" write to the cache file. 2. If rename failed, do not write to the cache file directly (a direct write to the file is non-atomic and could cause data race conditions). 3. If the cache file doesn't exist (e.g., because 'rename' failed or for some other reason), bypass using the cache altogether. Differential Revision: https://reviews.llvm.org/D45076 llvm-svn: 328904	2018-03-30 21:35:42 +00:00
Jacob Gravelle	40926451d2	[WebAssembly] Register wasm passes with the PassRegistry Summary: This exposes WebAssembly passes for use on the command line (as arguments to -print-before and the like). Reviewers: dschuff, sunfish Subscribers: MatzeB, jfb, sbc100, llvm-commits, aheejin Differential Revision: https://reviews.llvm.org/D45103 llvm-svn: 328901	2018-03-30 20:36:58 +00:00
Krzysztof Parzyszek	74096f7258	[Hexagon] Reduce excessive indentation in .s output llvm-svn: 328898	2018-03-30 19:30:28 +00:00
Krzysztof Parzyszek	0f983d69a4	[Hexagon] Avoid creating invalid offsets in packetizer Two memory instructions with a dependency only on the address register between the two (the first one of them being post-increment) can be packetized together after the offset on the second was updated to the increment value. Make sure that the new offset is valid for the instruction. llvm-svn: 328897	2018-03-30 19:28:37 +00:00
Andrea Di Biagio	dc97172b2f	[X86][BtVer2] Fixed the number of micro opcodes for AVX vector converts and VSQRT instructions. There were still a few AVX instructions with an incorrect number of opcodes. These should be fixed now. llvm-svn: 328892	2018-03-30 18:53:47 +00:00
Peter Collingbourne	d03bf12c1b	DataFlowSanitizer: wrappers of functions with local linkage should have the same linkage as the function being wrapped This patch resolves link errors when the address of a static function is taken, and that function is uninstrumented by DFSan. This change resolves bug 36314. Patch by Sam Kerner! Differential Revision: https://reviews.llvm.org/D44784 llvm-svn: 328890	2018-03-30 18:37:55 +00:00
Puyan Lotfi	399b46c98d	[MIR] Adding support for Named Virtual Registers in MIR. llvm-svn: 328887	2018-03-30 18:15:54 +00:00
Andrea Di Biagio	3eaa26bb64	[X86][BtVer2] Fix the number of uOps for horizontal operations. llvm-svn: 328886	2018-03-30 18:15:30 +00:00
Tim Shen	8f9f026965	[NVPTX] Enable StructuredCFG for NVPTX Summary: Make NVPTX require structured CFG. Added a temporary flag to "roll back" the behavior for easy deployment. Combined with D45008, this fixes several internal Nvidia GPU test failures that we suspect to be ptxas miscompiles (PR27738). Reviewers: jlebar Subscribers: jholewinski, sanjoy, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D45070 llvm-svn: 328885	2018-03-30 17:51:03 +00:00
Tim Shen	1a8c6776a3	[BlockPlacement] Disable block placement tail duplication in structured CFG. Summary: Tail duplication easily breaks the structure of CFG, e.g. duplicating on a region entry. If the structure is intended to be preserved, then we may want to configure tail duplication, or disable it for structured CFG. From our benchmark results disabling it doesn't cause performance regression. Notice that this currently affects AMDGPU backend. In the next patch, I also plan to turn on requiresStructuredCFG for NVPTX. All unit tests still pass. Reviewers: jlebar, arsenm Subscribers: jholewinski, sanjoy, wdng, tpr, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45008 llvm-svn: 328884	2018-03-30 17:51:00 +00:00
Robert Widmann	478fce9ebf	[LLVM-C] Finish exception instruction bindings - Round 2 Summary: Previous revision caused a leak in the echo test that got caught by the ASAN bots because of missing free of the handlers array and was reverted in r328759. Resubmitting the patch with that correction. Add support for cleanupret, catchret, catchpad, cleanuppad and catchswitch and their associated accessors. Test is modified from SimplifyCFG because it contains many diverse usages of these instructions. Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: llvm-commits, vlad.tsyrklevich Differential Revision: https://reviews.llvm.org/D45100 llvm-svn: 328883	2018-03-30 17:49:53 +00:00
Zachary Turner	d5cf5cf637	[llvm-pdbutil] Dig deeper into the PDB and DBI streams when explaining. This will show more detail when using `llvm-pdbutil explain` on an offset in the DBI or PDB streams. Specifically, it will dig into individual header fields and substreams to give a more precise description of what the byte represents. llvm-svn: 328878	2018-03-30 17:16:50 +00:00
Derek Schuff	a2726e9ab6	[WebAssembly] Refactor tablegen for store instructions (NFC) Summary: Add patterns similar to loads. Differential Revision: https://reviews.llvm.org/D45064 llvm-svn: 328876	2018-03-30 17:02:50 +00:00
Krzysztof Parzyszek	fce30c2ba3	Revert "peel loops with runtime small trip counts" This reverts commit r328854, it breaks some Hexagon tests. llvm-svn: 328875	2018-03-30 16:55:44 +00:00
Stanislav Mekhanoshin	74e2974ac6	[AMDGPU] Fixed some instructions latencies Differential Revision: https://reviews.llvm.org/D45073 llvm-svn: 328874	2018-03-30 16:19:13 +00:00
Sanjay Patel	e09b7dcf3d	[SelectionDAG] Removing FABS folding from DAGCombiner The code has bugs dealing with -0.0. Since D44550 introduced FABS pattern folding in InstCombine, this patch removes the now-redundant code that causes https://bugs.llvm.org/show_bug.cgi?id=36600. Patch by Mikhail Dvoretckii! Differential Revision: https://reviews.llvm.org/D44683 llvm-svn: 328872	2018-03-30 15:42:52 +00:00
Krzysztof Parzyszek	4f99836a9e	[Hexagon] Recognize and handle :endloop01 llvm-svn: 328870	2018-03-30 15:29:47 +00:00
Krzysztof Parzyszek	46abcb236b	[Hexagon] Fix printing :mem_noshuf on compiler-generated packets llvm-svn: 328869	2018-03-30 15:09:05 +00:00
Andrea Di Biagio	073a9d74ca	[X86][BtVer2] Add missing ReadAfterLd to RM variants of AVX horizontal adds and most vector logic instructions. Fixed a few InstRW that forgot to specify a ReadAfterLd for the register input operand. llvm-svn: 328867	2018-03-30 14:48:08 +00:00
Krzysztof Parzyszek	3f55ad8fae	[Hexagon] Remove unused scheduling classes llvm-svn: 328866	2018-03-30 14:34:32 +00:00
Krzysztof Parzyszek	1ca23d9837	[Hexagon] Pass pointer to SelectionDAG to dump functions llvm-svn: 328864	2018-03-30 14:29:15 +00:00
Vlad Tsyrklevich	894c028d56	Revert "[LLVM-C] Finish exception instruction bindings" This reverts commit r328759. It was causing LSan failures on sanitizer-x86_64-linux-bootstrap llvm-svn: 328858	2018-03-30 06:21:28 +00:00
Michael Bedy	59e5ef793c	[AMDGPU] Fix the SDWA Peephole phase to handle src for dst:UNUSED_PRESERVE. Summary: The phase attempts to transform operations that extract a portion of a value into an SDWA src operand in cases where that value is used only once. It was not prepared for this use to be the preserved portion of a value for dst:UNUSED_PRESERVE, resulting in a crash or assert. This change either rejects the illegal SDWA attempt, or in the case where dst:WORD_1 and the src_sel would be WORD_0, removes the unneeded extract instruction. Reviewers: arsenm, #amdgpu Reviewed By: arsenm, #amdgpu Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D44364 llvm-svn: 328856	2018-03-30 05:03:36 +00:00
Ikhlas Ajbar	66c8ba5a50	peel loops with runtime small trip counts For Hexagon, peeling loops with small runtime trip count is beneficial for our benchmarks. We set PeelCount in HexagonTargetInfo.cpp and we use PeelCount set by the target for computing the desired peel count. Differential Revision: https://reviews.llvm.org/D44880 llvm-svn: 328854	2018-03-30 03:05:34 +00:00
Eli Friedman	208fe67a78	[MachineCopyPropagation] Handle COPY with overlapping source/dest. MachineCopyPropagation::CopyPropagateBlock has a bunch of special handling for COPY instructions. This handling assumes that COPY instructions do not modify the source of the copy; this is wrong if the COPY destination overlaps the source. To fix the bug, check explicitly for this situation, and fall back to the generic instruction handling. This bug can't happen for most register classes because they don't have this sort of overlap, but there are a few register classes where this is possible. The testcase uses the AArch64 QQQQ register class. Differential Revision: https://reviews.llvm.org/D44911 llvm-svn: 328851	2018-03-30 00:56:03 +00:00
Eugene Zelenko	7fb5d41e44	[IR] Fix some Clang-tidy modernize-use-auto warnings; other minor fixes (NFC). llvm-svn: 328850	2018-03-30 00:47:31 +00:00
Rafael Espindola	4b4d85fd4d	Style update. NFC. Rename 3 functions to start with lowercase letters. Don't repeat the name in the comments. llvm-svn: 328848	2018-03-29 23:32:54 +00:00
David Blaikie	f423062aff	Fix some layering in StripNonLineTableDebugInfo, moving its declaration from IPO.h to Utils.h to match its implementation llvm-svn: 328844	2018-03-29 22:42:08 +00:00
David Blaikie	7883340331	Remove unused header to fix layering. llvm-svn: 328842	2018-03-29 22:35:59 +00:00
David Blaikie	4778bb88ef	Remove unused headers to fix layering llvm-svn: 328840	2018-03-29 22:31:39 +00:00
David Blaikie	c90289b5d3	llvm-c: Split Utils out of Scalar.h To fix layering (so that Scalar.h, a libScalarOpts header, isn't included from Utils - which libScalarOpts depends on). llvm-svn: 328839	2018-03-29 22:31:38 +00:00
David Blaikie	bd0c88078a	Remove some unneeded #includes to fix layering llvm-svn: 328838	2018-03-29 22:31:36 +00:00
Craig Topper	ee3c19fd7f	[X86] Add ReadAfterLds to some 3 src instructions Sometimes the operand comes after the memory operand so we need 5 ReadDefaults first. I suspect we also need to do something for the mask operand for masked avx512 instructions? I'm not sure if the mask should be ReadAfterLd or not since it can mask faults. If it shouldn't be ReadAfterLd then we're probably wrong for zero masking instructions already. Differential Revision: https://reviews.llvm.org/D44726 llvm-svn: 328834	2018-03-29 22:03:05 +00:00
Matt Arsenault	efd1b30436	AMDGPU: Fix build warning in release llvm-svn: 328832	2018-03-29 21:44:44 +00:00
Matt Arsenault	03ae399d50	AMDGPU: Support realigning stack While the stack access instructions don't care about alignment > 4, some transformations on the pointer calculation do make assumptions based on knowing the low bits of a pointer are 0. If a stack object ends up being accessed through its absolute address (relative to the kernel scratch wave offset), the addressing expression may depend on the stack frame being properly aligned. This was breaking in a testcase due to the add->or combine. I think some of the SP/FP handling logic is still backwards, and overly simplistic to support all of the stack features. Code which tries to modify the SP with inline asm for example or variable sized objects will probably require redoing this. llvm-svn: 328831	2018-03-29 21:30:06 +00:00
Evgeniy Stepanov	50635dab26	Add msan custom mapping options. Similarly to https://reviews.llvm.org/D18865 this adds options to provide custom mapping for msan. As discussed in http://lists.llvm.org/pipermail/llvm-dev/2018-February/121339.html Patch by vit9696(at)avp.su. Differential Revision: https://reviews.llvm.org/D44926 llvm-svn: 328830	2018-03-29 21:18:17 +00:00
Craig Topper	3f2dbec652	[X86] Remove ReadAfterLd from BMI and TBM instructions that don't have a register operand in their memory form The memory form of these instructions only read an input from memory. They don't have any register operands. Differential Revision: https://reviews.llvm.org/D44836 llvm-svn: 328828	2018-03-29 21:03:53 +00:00
Craig Topper	89310f56c8	[X86] Correct the placement of ReadAfterLd in BEXTR and BZHI. Add dedicated SchedRW for BEXTR/BZHI. These instructions have the memory operand before the register operand. So we need to put ReadDefault for all the load ops first. Then the ReadAfterLd Differential Revision: https://reviews.llvm.org/D44838 llvm-svn: 328823	2018-03-29 20:41:39 +00:00
Philip Reames	5c14ed89f6	[NFC][LICM] Rearrange checks to have the cheap bail out first llvm-svn: 328822	2018-03-29 20:32:15 +00:00
Matt Arsenault	ffb132e74b	AMDGPU: Increase default stack alignment 8 and 16-byte values are common, so increase the default alignment to avoid realigning the stack in most functions. llvm-svn: 328821	2018-03-29 20:22:04 +00:00
Matt Arsenault	6c041a3cab	AMDGPU: Fix selection error on constant loads with < 4 byte alignment llvm-svn: 328818	2018-03-29 19:59:28 +00:00
Philip Reames	e4b728e82b	Fix an accidental circular dependence llvm-svn: 328816	2018-03-29 19:22:12 +00:00
Mandeep Singh Grang	10d8b85570	[Mips] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort to llvm::sort. Refer the comments section in D44363 for a list of all the required patches. Reviewers: sdardis, RKSimon, dsanders, atanasyan Reviewed By: atanasyan Subscribers: atanasyan, arichardson, llvm-commits Differential Revision: https://reviews.llvm.org/D44869 llvm-svn: 328815	2018-03-29 19:05:26 +00:00
Zachary Turner	3203e27473	[MSF] Default to FPM2, and always mark FPM pages allocated. There are two FPMs in an MSF file, the idea being that for incremental updates you can write to the alternate one and then atomically swap them on commit. LLVM defaulted to using FPM1 on the first commit, but this differs from Microsoft's behavior which is to default to using FPM2 on the first commit. To eliminate some byte-level file differences, this patch changes LLVM's default to also be FPM2. Additionally, LLVM was trying to be "smart" about marking FPM pages allocated. In addition to marking every page belonging to the alternate FPM as unallocated, LLVM also marked pages at the end of the main FPM which were not needed as unallocated. In order to match the behavior of Microsoft-generated PDBs, we now always mark every FPM block as allocated, regardless of whether it is in the main FPM or the alt FPM, and regardless of whether or not it describes blocks which are actually in the file. This has the side benefit of simplifying our code. llvm-svn: 328812	2018-03-29 18:34:15 +00:00
Craig Topper	2fa1436206	[IR][CodeGen] Remove dependency on EVT from IR/Function.cpp. Move EVT to CodeGen layer. Currently EVT is in the IR layer only because of Function.cpp needing a very small piece of the functionality of EVT::getEVTString(). The rest of EVT is used in codegen making CodeGen a better place for it. The previous code converted a Type* to EVT and then called getEVTString. This was only expected to handle the primitive types from Type*. Since there are only a few primitive types, we can just print them as strings directly. Differential Revision: https://reviews.llvm.org/D45017 llvm-svn: 328806	2018-03-29 17:21:10 +00:00
Paul Robinson	b271f31d8d	Reapply "[DWARFv5] Emit file 0 to the line table." DWARF v5 specifies that the root file (also given in the DW_AT_name attribute of the compilation unit DIE) should be emitted explicitly to the line table's list of files. This makes the line table more independent of the .debug_info section. We emit the new syntax only for DWARF v5 and later. Fixes the bug found by asan. Also XFAIL the new test for Darwin, which is stuck on DWARF v2, and fix up other tests so they stop failing on Windows. Last but not least, don't break "clang -g" of an assembler file that has .file directives in it. Differential Revision: https://reviews.llvm.org/D44054 llvm-svn: 328805	2018-03-29 17:16:41 +00:00
Haicheng Wu	c7cc87922e	[JumpThreading] Don't select an edge that we know we can't thread In r312664 (D36404), JumpThreading stopped threading edges into loop headers. Unfortunately, I observed a significant performance regression as a result of this change. Upon further investigation, the problematic pattern looked something like this (after many high level optimizations): while (true) { bool cond = ...; if (!cond) { <body> } if (cond) break; } Now, naturally we want jump threading to essentially eliminate the second if check and hook up the edges appropriately. However, the above-mentioned change prevented it from doing this because it would have to thread an edge into the loop header. Upon further investigation, what is happening is that since both branches are threadable, JumpThreading picks one of them arbitrarily. In my case, because of the way that the IR ended up, it tended to pick the one to the loop header, bailing out immediately after. However, if it had picked the one to the exit block, everything would have worked out fine (because the only remaining branch would then be folded, not threaded, which is acceptable). Thus, to fix this problem, we can simply eliminate loop headers from consideration as possible threading targets earlier, to make sure that if there are multiple eligible branches, we can still thread one of the ones that don't target a loop header. Patch by Keno Fischer! Differential Revision: https://reviews.llvm.org/D42260 llvm-svn: 328798	2018-03-29 16:01:26 +00:00
Pavel Labath	ea0f841c3b	.debug_names: Correctly align the AugmentationStringSize field We should align the value of the field, not the overall section offset. This distinction matters if one of the debug_names contributions has a size that is not a multiple of four. The dwarf producers may choose to emit rounded contributions, but they are not required to do so. In the latter case, without this patch we would corrupt the parsing state, as we would adjust the offset even if subsequent contributions contained correctly rounded augmentation strings. llvm-svn: 328796	2018-03-29 15:12:45 +00:00
Krzysztof Parzyszek	dc7a557e6a	[Hexagon] Add support to handle bit-reverse load intrinsics Patch by Sumanth Gundapaneni. llvm-svn: 328774	2018-03-29 13:52:46 +00:00
Pavel Labath	2d1fc4375f	.debug_names: Parse DW_IDX_die_offset as a reference Before this patch we were parsing the attributes as section offsets, as that is what apple_names is doing. However, this is not correct as DWARF v5 specifies that this attribute should use the Reference form class. This also updates all the testcases (except the ones that deliberately pass a different form) to use the correct form class. llvm-svn: 328773	2018-03-29 13:47:57 +00:00
Simon Pilgrim	71c5f3fffd	[X86][SSE] Don't bother re-adding combined target shuffles to the work list We are re-adding all the bitcasts, constant masks and target shuffles to the work list for no apparent gain. Found while investigating adding SimplifyDemandedVectorElts to target shuffles. Differential Revision: https://reviews.llvm.org/D44942 llvm-svn: 328771	2018-03-29 11:18:41 +00:00
Simon Dardis	32a27fc77a	[Mips] Remove dead code I believe the role of ehDataReg has been replaced by MipsABIInfo::GetEhDataReg, thus removing the dead code. Patch By: Wei-Ren Chen. Reviewers: ehostunreach, sdardis Differential Revision: https://reviews.llvm.org/D44867 llvm-svn: 328767	2018-03-29 09:21:20 +00:00
David Green	b0aa36f9c2	[LoopRotate] Restructuring LoopRotation.cpp to create Loop Rotation Pass with Loop Rotation Utility Interface The existing LoopRotation.cpp is implemented as one of the loop passes instead of being a utility. The user cannot easily perform the loop rotation selectively (or on demand) under different optimization levels. For example, the loop rotation is needed as part of the logic to convert a loop into a loop with bottom test for a transformation. If the loop rotation is simply added as a loop pass before the transformation, the pass is skipped if it is compiled at -O0 or if it is explicitly disabled by the user, causing the compiler to generate incorrect code. Furthermore, as a loop pass it will rotate all loops instead of just the relevant loops. We provide a utility interface for the loop rotation so that the loop rotation can be called on demand. The changeset is as follows: - Create a new file lib/Transforms/Utils/LoopRotationUtils.cpp and move the main implementation of class LoopRotate into this file. - Create a new file llvm/include/Transform/Utils/LoopRotationUtils.h with the interface LoopRotation(...). - Original LoopRotation.cpp is changed to use the utility function LoopRotation in LoopRotationUtils.cpp. This is done in the same way the community did for the mem-to-reg implementation. Patch by Jin Lin! Differential Revision: https://reviews.llvm.org/D44595 llvm-svn: 328766	2018-03-29 08:48:15 +00:00
Benjamin Kramer	6b995a4a7e	[Transforms] Make sure to include the c binding header when defining c binding functions Otherwise the definitions can't see the extern C declarations and get name mangled, making it impossible for users to call them. This breaks the Go bindings. llvm-svn: 328765	2018-03-29 07:56:53 +00:00
Max Kazantsev	18f93894db	[NFC] Fix meaningless assert in SCEV llvm-svn: 328764	2018-03-29 07:54:59 +00:00
Craig Topper	a21758fa2c	[X86] Don't pass getRegisterName from the InstPrinters into EmitAnyX86InstComments. Just always use the function from the ATTPrinter. NFC The IntelPrinter and the ATTPrinter produce the same strings for the same input. We already use the ATTPrinter explicitly in several other places. llvm-svn: 328762	2018-03-29 04:14:04 +00:00
Robert Widmann	6775f52fe0	[LLVM-C] Finish exception instruction bindings Summary: Add support for cleanupret, catchret, catchpad, cleanuppad and catchswitch and their associated accessors. Test is modified from SimplifyCFG because it contains many diverse usages of these instructions. Reviewers: whitequark, deadalnix, echristo Reviewed By: echristo Subscribers: llvm-commits, harlanhaskins Differential Revision: https://reviews.llvm.org/D44496 llvm-svn: 328759	2018-03-29 03:43:15 +00:00
Craig Topper	7456af88f4	[X86] Rename RIi64_NOREX tblgen class to just Ii64. Make RIi64 inherit from it. NFC This feels more consistent with the other classes. We don't need to say _NOREX if we didn't start it with an R in the first place. llvm-svn: 328757	2018-03-29 03:14:57 +00:00
Craig Topper	7441ffff84	[X86] Cleanup inheritance of the X86InstrFormats.td classes. NFC EVEX shouldn't inherit from VEX and EVEX_4V shouldn't inherit from VEX_4V. llvm-svn: 328756	2018-03-29 03:14:56 +00:00
George Burgess IV	af0b06f4fd	[MemorySSA] Turn an assert into a condition Eli pointed out that variadic functions are totally a thing, so this assert is incorrect. No test-case is provided, since the only way this assert fires is if a specific DenseMap falls back to doing `isEqual` checks, and that seems fairly brittle (and requires a pyramid of growing `call void (i8, ...) @varargs(i8 0)`). llvm-svn: 328755	2018-03-29 03:12:03 +00:00
George Burgess IV	3588fd4865	[MemorySSA] Consider callsite args for hashing and equality. We use a `DenseMap<MemoryLocOrCall, MemlocStackInfo>` to keep track of prior work when optimizing uses in MemorySSA. Because we weren't accounting for callsite arguments in either the hash code or equality tests for `MemoryLocOrCall`s, we optimized uses too aggressively in some rare cases. Fix by Daniel Berlin. Should fix PR36883. llvm-svn: 328748	2018-03-29 00:54:39 +00:00
David Blaikie	b3f471a4bd	Remove some unused includes to fix layering. llvm-svn: 328745	2018-03-29 00:29:45 +00:00
David Blaikie	8ad9a97310	Plumb useAA through TargetTransformInfo to remove Transforms->CodeGen header dependency Thanks to echristo for the pointers on direction. llvm-svn: 328737	2018-03-28 22:28:50 +00:00
Craig Topper	aac23d7881	[X86][SkylakeServer] Remove checks for 'k', 'z', '_Int' and 'b' from scheduler regexs. Most of these were optional matches at the end of the strings, but since the strings themselves are prefix matches by default you don't need to check for something optional at the end. I've left the 'b' on memory instructions where it means 'broadcast' because I'm not sure those really have the same load latency and we may need to split them explicitly in the future. llvm-svn: 328730	2018-03-28 20:40:24 +00:00
Jun Bum Lim	f90fe701ef	[PostRAMachineSink] preserve CFG Summary: Mark the CFG as preserved since this pass does not make any changes to the CFG. Reviewers: sebpop, mzolotukhin, mcrosier Reviewed By: mzolotukhin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44845 llvm-svn: 328727	2018-03-28 19:56:26 +00:00
Krzysztof Parzyszek	440ba3ae5c	[Hexagon] Add support for "new" circular buffer intrinsics These instructions have been around for a long time, but we haven't supported intrinsics for them. The "new" versions use the CSx register for the start of the buffer instead of the K field in the Mx register. We need to use pseudo instructions for these instructions until after register allocation. The problem is that these instructions allocate a M0/CS0 or M1/CS1 pair. But, we can't generate code for the CSx set-up until after register allocation when the Mx register has been fixed for the instruction. There is a related clang patch. Patch by Brendon Cahoon. llvm-svn: 328724	2018-03-28 19:38:29 +00:00
David Blaikie	eb8cc04ea2	Oops - moved slightly too many things from Scalar to Utils. Move LoopSimplifyCFG things back llvm-svn: 328720	2018-03-28 18:03:25 +00:00
Jessica Paquette	4aa14dbcc2	[MachineOutliner] Simplify call outlining + require valid callee save info for call outlining This commit simplifies the call outlining logic by removing references to the Function associated with the callee. To do this, it requires that valid callee save info is available to the outliner. llvm-svn: 328719	2018-03-28 17:52:31 +00:00
David Blaikie	a373d18eb7	Transforms: Introduce Transforms/Utils.h rather than spreading the declarations amongst Scalar.h and IPO.h Fixes layering - Transforms/Utils shouldn't depend on including a Scalar or IPO header, because Scalar and IPO depend on Utils. llvm-svn: 328717	2018-03-28 17:44:36 +00:00
Peter Collingbourne	d579c31d68	[llvm-ar] Support multiple dashed options This allows syntax like: $ llvm-ar -c -r -u file.a file.o This is in addition to the other formats that are already supported: $ llvm-ar cru file.a file.o $ llvm-ar -cru file.a file.o Patch by Tom Anderson! Differential Revision: https://reviews.llvm.org/D44452 llvm-svn: 328716	2018-03-28 17:21:14 +00:00
Dmitry Preobrazhensky	622bde8bc7	[AMDGPU][MC] Added ds_add_src2_f32 See bug 36833: https://bugs.llvm.org/show_bug.cgi?id=36833 Differential Revision: https://reviews.llvm.org/D44779 Reviewers: arsenm, artem.tamazov, timcorringham llvm-svn: 328713	2018-03-28 16:21:56 +00:00
Dmitry Preobrazhensky	2456ac696a	[AMDGPU][MC] Added PCK variants of image load/store instructions See bug 36834: https://bugs.llvm.org/show_bug.cgi?id=36834 Differential Revision: https://reviews.llvm.org/D44795 Reviewers: artem.tamazov, arsenm, timcorringham, nhaehnle llvm-svn: 328710	2018-03-28 15:44:16 +00:00
Dmitry Preobrazhensky	a917e88585	[AMDGPU][MC][GFX9] Added buffer_*_format_d16_hi_x See bug 36835: https://bugs.llvm.org/show_bug.cgi?id=36835 Differential Revision: https://reviews.llvm.org/D44825 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 328707	2018-03-28 14:53:13 +00:00
Dmitry Preobrazhensky	dd2b929ffb	[AMDGPU][MC][GFX9] Added s_scratch* instructions See bug 36836: https://bugs.llvm.org/show_bug.cgi?id=36836 Differential Revision: https://reviews.llvm.org/D44832 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 328704	2018-03-28 14:08:03 +00:00
Simon Pilgrim	b1bc6cd96b	[X86][Btver2] Moved JWriteFCmp/JWriteFCmpY classes next to each other. NFCI Renamed JWriteFPAY22 to JWriteFCmpY - we've tended to avoid latency based names llvm-svn: 328701	2018-03-28 13:53:21 +00:00
Alexander Potapenko	202f809437	Revert "Reapply "[DWARFv5] Emit file 0 to the line table."" This reverts commit r328676. Commit r328676 broke the -no-integrated-as flag necessary to build Linux kernel with Clang: $ cat t.c void foo() {} $ clang -no-integrated-as -c t.c -g /tmp/t-dcdec5.s: Assembler messages: /tmp/t-dcdec5.s:8: Error: file number less than one clang-7.0: error: assembler command failed with exit code 1 (use -v to see invocation) llvm-svn: 328699	2018-03-28 12:36:46 +00:00
Andrea Di Biagio	5076b98fb9	[X86][BtVer2] Fix the number of micro opcodes for AES[ENC\|DEC] and other YMM instructions. Similar to r328694. The number of micro opcodes should be 2 for those instructions. This was found when testing AVX code for BtVer2 using llvm-mca. llvm-svn: 328698	2018-03-28 12:12:04 +00:00
Alexander Potapenko	4e7ad0805e	[MSan] Introduce ActualFnStart. NFC This is a step towards the upcoming KMSAN implementation patch. KMSAN is going to prepend a special basic block containing tool-specific calls to each function. Because we still want to instrument the original entry block, we'll need to store it in ActualFnStart. For MSan this will still be F.getEntryBlock(), whereas for KMSAN it'll contain the second BB. llvm-svn: 328697	2018-03-28 11:35:09 +00:00
Tim Renouf	cdac172e2a	Revert "[AMDGPU] For OS type AMDPAL, fixed scratch on compute shader" This reverts commit 0daf86291d3aa04d3cc280cd0ef24abdb0174981. It was causing an assert in test/CodeGen/AMDGPU/amdpal.ll only on a release-with-asserts build. I will resubmit the change when I have fixed that. Change-Id: If270594eba27a7dc4076bdeab3fa8e6bfda3288a llvm-svn: 328695	2018-03-28 11:21:07 +00:00
Andrea Di Biagio	010924e35c	[X86][BtVer2] Fix the number of micro opcodes for a bunch of YMM instructions. The Jaguar backend natively supports 128-bit data types. Operations on YMM registers are split into two COPs (complex operations). Each COP consumes a slot in the dispatch group, and in the reorder buffer. The scheduling model for Jaguar should mark those instructions as `let NumMicroOps = 2`. This was found when testing AVX code for BtVer2 using llvm-mca. llvm-svn: 328694	2018-03-28 10:49:33 +00:00
Alexander Potapenko	e1d5877847	[MSan] Add an isStore argument to getShadowOriginPtr(). NFC This is a step towards the upcoming KMSAN implementation patch. The isStore argument is to be used by getShadowOriginPtrKernel(), it is ignored by getShadowOriginPtrUserspace(). Depending on whether a memory access is a load or a store, KMSAN instruments it with different functions, __msan_metadata_ptr_for_load_X() and __msan_metadata_ptr_for_store_X(). Those functions may return different values for a single address, which is necessary in the case the runtime library decides to ignore particular accesses. llvm-svn: 328692	2018-03-28 10:17:17 +00:00
Christof Douma	a1e77c0e02	[ARM] Support float literals under XO Follow up patch of r328313 to support the UseVMOVSR constraint. Removed some unneeded instructions from the test and removed some stray comments. Differential Revision: https://reviews.llvm.org/D44941 llvm-svn: 328691	2018-03-28 10:02:26 +00:00
Mikael Holmen	6c062b7641	[RegisterCoalescing] Don't move COPY if it would interfere with another value Summary: RegisterCoalescer::removePartialRedundancy tries to hoist B = A from BB0/BB2 to BB1: BB1: ... BB0/BB2: ---- B = A; \| ... \| A = B; \| \|------- \| It does so if a number of conditions are fulfilled. However, it failed to check if B was used by any of the terminators in BB1. Since we must insert B = A before the terminators (since it's not a terminator itself), this means that we could erroneously insert a new definition of B before a use of it. Reviewers: wmi, qcolombet Reviewed By: wmi Subscribers: MatzeB, llvm-commits, sdardis Differential Revision: https://reviews.llvm.org/D44918 llvm-svn: 328689	2018-03-28 06:01:30 +00:00
Lang Hames	a95b0df5ed	[ORC] Fix ORC on platforms without indirection support. Previously this crashed because a nullptr (returned by createLocalIndirectStubsManagerBuilder() on platforms without indirection support) functor was unconditionally invoked. Patch by Andres Freund. Thanks Andres! llvm-svn: 328687	2018-03-28 03:41:45 +00:00
Matt Arsenault	bd49eccca1	AMDGPU: Really implement getFrameRegister Currently this seems to only really be used for debug info. llvm-svn: 328677	2018-03-27 23:26:59 +00:00
Paul Robinson	07480bd177	Reapply "[DWARFv5] Emit file 0 to the line table." DWARF v5 specifies that the root file (also given in the DW_AT_name attribute of the compilation unit DIE) should be emitted explicitly to the line table's list of files. This makes the line table more independent of the .debug_info section. Fixes the bug found by asan. Also XFAIL the new test for Darwin, which is stuck on DWARF v2, and fix up other tests so they stop failing on Windows. Last but not least, don't break "clang -g" of an assembler file that has .file directives in it. Differential Revision: https://reviews.llvm.org/D44054 llvm-svn: 328676	2018-03-27 22:40:34 +00:00
Jessica Paquette	2519ee7081	[MachineOutliner] AArch64: Don't outline ADRPs with un-outlinable operands If an ADRP appears with, say, a CPI operand, we shouldn't outline it. This moves the check for unsafe operands so that it occurs before the special-case for ADRPs. Also add a test for outlining ADRPs. llvm-svn: 328674	2018-03-27 22:23:48 +00:00
Tim Renouf	e4208bfa5b	[AMDGPU] For OS type AMDPAL, fixed scratch on compute shader Summary: For OS type AMDPAL, the scratch descriptor is loaded from offset 0 of the GIT, whose 32 bit pointer is in s0 (s8 for gfx9 merged shaders). This commit fixes that to use offset 0x10 instead of offset 0 for a compute shader, per the PAL ABI spec. Reviewers: kzhuravl, nhaehnle, timcorringham Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits, dstuttard, nhaehnle, arsenm Differential Revision: https://reviews.llvm.org/D44468 Change-Id: I93dffa647758e37f613bb5e0dfca840d82e6d26f llvm-svn: 328673	2018-03-27 21:35:00 +00:00
Paul Robinson	7cb26ad2ef	[DWARF] Suppress split line tables more carefully. If a given split type unit does not have source locations, don't have it refer to the split line table. If no split type unit refers to the split line table, don't emit the line table at all. This will save a little space on rare occasions, but also refactors things a bit to improve which class is responsible for what. Responding to review comments on r326395. Differential Revision: https://reviews.llvm.org/D44220 llvm-svn: 328670	2018-03-27 21:28:59 +00:00
Tim Renouf	4db0960420	[CodeGen] Fixed unreachable with -print-machineinstrs and custom pseudo source value Summary: Rev 327580 "[CodeGen] Use MIR syntax for MachineMemOperand printing" broke -print-machineinstrs for us on AMDGPU, because we have custom pseudo source values, and MIR serialization does not implement that. This commit at least restores the functionality of -print-machineinstrs, even if it does not properly implement the missing MIR serialization functionality. Differential Revision: https://reviews.llvm.org/D44871 Change-Id: I44961c0b90bf6d48c01484ed7a4e466fd300db66 llvm-svn: 328668	2018-03-27 21:14:04 +00:00
Sterling Augustine	33dc01861a	Initialize variable added in r328617. llvm-svn: 328667	2018-03-27 21:11:57 +00:00
Simon Pilgrim	a2f26788a3	[X86] Add WriteFMOVMSK/WriteVecMOVMSK/WriteMMXMOVMSK scheduler classes Currently MOVMSK instructions use the WriteVecLogic class, which is a very poor choice given that MOVMSK involves a SSE->GPR transfer. Differential Revision: https://reviews.llvm.org/D44924 llvm-svn: 328664	2018-03-27 20:38:54 +00:00
Wolfgang Pieb	ab068eaa57	[DWARF][DWARF v5]: Adding support for dumping DW_RLE_offset_pair and DW_RLE_base_address Reviewers: dblakie, aprantl Differential Revision: https://reviews.llvm.org/D44811 llvm-svn: 328662	2018-03-27 20:27:36 +00:00
Graydon Hoare	926cd9b837	[YAML] Escape non-printable multibyte UTF8 in Output::scalarString. The existing YAML Output::scalarString code path includes a partial and incorrect implementation of YAML escaping logic. In particular, the logic put in place in rL321283 escapes non-printable bytes only if they are not part of a multibyte UTF8 sequence; implicitly this means that all multibyte UTF8 sequences -- printable and non -- are passed through verbatim. The simplest solution to this is to direct the Output::scalarString method to use the standalone yaml::escape function, and this _almost_ works, except that the existing code in that function _over_ escapes: any multibyte UTF8 sequence is escaped, even printable ones. While this is permitted for YAML, it is also more aggressive (and hard to read for non-English locales) than necessary, and the entire point of rL321283 was to back off such aggressive over-escaping. So in this change, I have both redirected Output::scalarString to use yaml::escape _and_ modified yaml::escape to optionally restrict its escaping to non-printables. This preserves behaviour of any existing clients while giving them a path to more moderate escaping should they desire. Reviewers: JDevlieghere, thegameg, MatzeB, vladimir.plyashkun Reviewed By: thegameg Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44863 llvm-svn: 328661	2018-03-27 19:52:45 +00:00
Xin Tong	0272cb077f	80-line wrap. NFC llvm-svn: 328660	2018-03-27 19:43:02 +00:00
Matt Arsenault	17f3338015	AMDGPU: Fix not preserving CSR VGPR if used for SGPR spills Before this was not done if the function had no calls in it. This is still a possible issue with any callable function, regardless of calls present. llvm-svn: 328659	2018-03-27 19:42:55 +00:00
Matt Arsenault	95329f8c53	AMDGPU: Set natural stack alignment in DataLayout Only 4 byte alignment is ever useful, so increasing anything beyond this may require realigning the stack. llvm-svn: 328656	2018-03-27 19:26:40 +00:00
Rong Xu	662f38b16f	[PGO] Fix branch probability remarks assert Fixed counter/weight overflow that leads to an assertion. Also fixed the help string for pgo-emit-branch-prob option. Differential Revision: https://reviews.llvm.org/D44809 llvm-svn: 328653	2018-03-27 18:55:56 +00:00
Matt Arsenault	0a0c871f60	AMDGPU: Fix crash when MachinePointerInfo invalid The combine on a select of a load only triggers for addrspace 0, and discards the MachinePointerInfo. The conservative default needs to be used for this. llvm-svn: 328652	2018-03-27 18:39:45 +00:00
Matt Arsenault	e9f3679031	AMDGPU: Fix FP restore from being reordered with stack ops In a function, s5 is used as the frame base SGPR. If a function is calling another function, during the call sequence it is copied to a preserved SGPR and restored. Before it was possible for the scheduler to move stack operations before the restore of s5, since there's nothing to associate a frame index access with the restore. Add an implicit use of s5 to the adjcallstack pseudo which ends the call sequence to prevent this from happening. I'm not 100% satisfied with this solution, but I'm not sure what else would be better. llvm-svn: 328650	2018-03-27 18:38:51 +00:00
Krzysztof Parzyszek	0375cd46ef	[Hexagon] Implement TTI::shouldMaximizeVectorBandwidth llvm-svn: 328648	2018-03-27 18:10:47 +00:00
Stefan Pintilie	659f040351	[Power9] Fix the resource list for the COPY instruction. The COPY instruction was listed as a 4 cycle instruction. It is now listed correctly as a 2 cycle ALU instruction. llvm-svn: 328647	2018-03-27 17:51:53 +00:00
Pirama Arumuga Nainar	ddd7b06842	Remap values in PromotedFloats Summary: When a node is about to be erased from ReplacedValues, we should also remap its corresponding values in PromotedFloats. Patch by Yan Luo (Yan.Luo2@synopsys.com) Reviewers: pirama Reviewed By: pirama Subscribers: lebedev.ri, llvm-commits Differential Revision: https://reviews.llvm.org/D44872 llvm-svn: 328644	2018-03-27 17:42:36 +00:00
Krzysztof Parzyszek	0a15d24134	[Hexagon] Rudimentary support for auto-vectorization for HVX This implements a set of TTI functions that the loop vectorizer uses. The only purpose of this is to enable testing. Auto-vectorization is disabled by default, enabled by -hexagon-autohvx. llvm-svn: 328639	2018-03-27 17:07:52 +00:00
Rafael Auler	d058b882be	[AArch64] Decorate AArch64 instrs with OPERAND_PCREL Summary: This is a canonical way to teach objdump to print the target symbols for branches when disassembling AArch64 code. Reviewers: evandro, t.p.northover, espindola Reviewed By: t.p.northover Differential Revision: https://reviews.llvm.org/D44851 llvm-svn: 328638	2018-03-27 16:58:01 +00:00
Fedor Sergeev	98014e433f	[NFC] OptPassGate extracted from OptBisect Summary: This is an NFC refactoring of the OptBisect class to split it into an optional pass gate interface used by LLVMContext and the Optional Pass Bisector (OptBisect) used for debugging of optional passes. This refactoring is needed for D44464, which introduces setOptPassGate() method to allow implementations other than OptBisect. Patch by Yevgeny Rouban. Reviewers: andrew.w.kaylor, fedor.sergeev, vsk, dberlin, Eugene.Zelenko, reames, skatkov Reviewed By: fedor.sergeev Differential Revision: https://reviews.llvm.org/D44821 llvm-svn: 328637	2018-03-27 16:57:20 +00:00
Krzysztof Parzyszek	52396bb9c5	Use .set instead of = when printing assignment in assembly output On Hexagon "x = y" is a syntax used in most instructions, and is not treated as a directive. Differential Revision: https://reviews.llvm.org/D44256 llvm-svn: 328635	2018-03-27 16:44:41 +00:00
Krzysztof Parzyszek	5d93fdfa89	[LV] Add TTI::shouldMaximizeVectorBandwidth to allow enabling it per target The default implementation returns false and keeps the current behavior. Differential Revision: https://reviews.llvm.org/D44735 llvm-svn: 328632	2018-03-27 16:14:11 +00:00
Simon Pilgrim	5f7ab4fedf	[X86][Btver2] Add MMX_PMOVMSKBrr to MOVMSK scheduler class llvm-svn: 328620	2018-03-27 12:26:12 +00:00
Strahinja Petrovic	06cf6a6490	[PowerPC] Secure PLT support This patch supports secure PLT mode for PowerPC 32 architecture. Differential Revision: https://reviews.llvm.org/D42112 llvm-svn: 328617	2018-03-27 11:23:53 +00:00
Alexander Richardson	e8059b1de4	[MIPS] Add static_assert that all Fixups are handled in getFixupKind Summary: I recently added a new Fixup kind to our fork of LLVM but forgot to add it to the table in MipsAsmBackend.cpp. With this static_assert the error would have been caught instead of zero-initializing the array entries for the new fixups. Reviewers: sdardis, atanasyan Reviewed By: atanasyan Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44895 llvm-svn: 328616	2018-03-27 10:08:12 +00:00
Max Kazantsev	b1ad66ff12	[LoopUnroll][NFC] Remove redundant canPeel check We check `canPeel` twice: when evaluating the number of iterations to be peeled and within the method `peelLoop` that performs peeling. This method is only executed if the calculated peel count is positive. Thus, the check in `peelLoop` can never fail. This patch replaces this check with an assert. Differential Revision: https://reviews.llvm.org/D44919 Reviewed By: fhahn llvm-svn: 328615	2018-03-27 09:40:51 +00:00
Sam Parker	90b7f4f72c	[IRCE] Enable decreasing loops of non-const bound As a follow-up to r328480, this updates the logic for the decreasing safety checks in a similar manner: - CanBeMax is replaced by CannotBeMaxInLoop which queries isLoopEntryGuardedByCond on the maximum value. - SumCanReachMin is replaced by isSafeDecreasingBound which includes some logic from parseLoopStructure and, again, has been updated to use isLoopEntryGuardedByCond on the given bounds. Differential Revision: https://reviews.llvm.org/D44776 llvm-svn: 328613	2018-03-27 08:24:53 +00:00
Max Kazantsev	ee5dd8306f	[NFC] Fix comments in getExact() llvm-svn: 328612	2018-03-27 08:13:55 +00:00
Max Kazantsev	7094c8deb2	[SCEV] Make exact taken count calculation more optimistic Currently, `getExact` fails if it sees two exit counts in different blocks. There is no solid reason to do so, given that we only calculate exact non-taken count for exiting blocks that dominate latch. Using this fact, we can simply take min out of all exits of all blocks to get the exact taken count. This patch makes the calculation more optimistic while enforcing our assumption with asserts. It allows us to calculate exact backedge taken count in trivial loops like for (int i = 0; i < 100; i++) { if (i > 50) break; . . . } Differential Revision: https://reviews.llvm.org/D44676 Reviewed By: fhahn llvm-svn: 328611	2018-03-27 07:30:38 +00:00
Max Kazantsev	a63d333881	[SCEV] Add one more case in computeConstantDifference This patch teaches `computeConstantDifference` to handle the calculation of the constant difference between `(X + C1)` and `(X + C2)`, which is `(C2 - C1)`. Differential Revision: https://reviews.llvm.org/D43759 Reviewed By: anna llvm-svn: 328609	2018-03-27 04:54:00 +00:00
David Blaikie	60e62438d2	Add a build dependency from libMC to libDebugInfoCodeView to match the reality of header dependencies here llvm-svn: 328595	2018-03-26 23:48:52 +00:00
Aaron Smith	f13938382c	[DebugInfoPDB] Print the method name along with the variant value Before this change, using dumpProperties() with PDBSymbolData would look like this: get_locationType: 3 1 After this change: get_locationType: 3 get_value: 1 llvm-svn: 328590	2018-03-26 22:53:38 +00:00
Aaron Smith	1af50bcf89	[DebugInfoPDB] Add methods to get the compiland and line numbers with PDBSymbolData llvm-svn: 328587	2018-03-26 22:17:12 +00:00
Aaron Smith	ed81a9db29	[DebugInfoPDB] Add DIA implementation of findLineNumbersByRVA This method is used to find line numbers for PDBSymbolData that have an invalid virtual address. llvm-svn: 328586	2018-03-26 22:13:22 +00:00
Aaron Smith	53708a5e9e	[DebugInfoPDB] Add DIA implementation of addressForVA and addressForRVA These are used in finding line numbers for PDBSymbolData llvm-svn: 328585	2018-03-26 22:10:02 +00:00
Simon Pilgrim	28e7bcbba6	[X86] Add WriteCRC32 scheduler class Currently CRC32 instructions use the WriteFAdd class. This patch splits them off into their own class; at the moment it is still mostly just a duplicate of WriteFAdd, but it can now be tweaked on a target-by-target basis. Differential Revision: https://reviews.llvm.org/D44647 llvm-svn: 328582	2018-03-26 21:06:14 +00:00
Rafael Espindola	78fdca3cd5	Use local symbols for creating .stack-size. llvm-svn: 328581	2018-03-26 20:40:22 +00:00
Paul Robinson	82e4864730	Use correct format specifier. Review comment on r328235 by James Henderson. llvm-svn: 328578	2018-03-26 19:55:01 +00:00
Eli Friedman	88e2bac94d	[MemorySSA] Fix exponential compile-time updating MemorySSA. MemorySSAUpdater::getPreviousDefRecursive is a recursive algorithm, for each block, it computes the previous definition for each predecessor, then takes those definitions and combines them. But currently it doesn't remember results which it already computed; this means it can visit the same block multiple times, which adds up to exponential time overall. To fix this, this patch adds a cache. If we computed the result for a block already, we don't need to visit it again because we'll come up with the same result. Well, unless we RAUW a MemoryPHI; in that case, the TrackingVH will be updated automatically. This matches the original source paper for this algorithm. The testcase isn't really a test for the bug, but it adds coverage for the case where tryRemoveTrivialPhi erases an existing PHI node. (It's hard to write a good regression test for a performance issue.) Differential Revision: https://reviews.llvm.org/D44715 llvm-svn: 328577	2018-03-26 19:52:54 +00:00
Krzysztof Parzyszek	4a5a80c370	[Hexagon] Assertion failure in HexagonSubtarget.cpp In restoreLatency, replace range-for loop with std::find. Patch by Jyotsna Verma. llvm-svn: 328574	2018-03-26 19:04:58 +00:00
Simon Pilgrim	fcf49df21c	[X86][Btver2] Add (U)COMISD/(U)COMISD scheduler costs Account for the "+i" integer pipe transfer cost (1cy use of JALU0 for GPR PRF write) llvm-svn: 328573	2018-03-26 19:01:06 +00:00
Reid Kleckner	41fb2dba9c	[X86] Fix Windows `i1 zeroext` conventions to use i8 instead of i32 Summary: Re-lands r328386 and r328443, reverting r328482. Incorporates fixes from @mstorsjo in D44876 (thanks!) so that small parameters in i8 and i16 do not end up in the SysV register parameters (EDI, ESI, etc). I added tests for how we receive small parameters, since that is the important part. It's always safe to store more bytes than will be read, but the assumptions you make when loading them are what really matter. I also tested this by self-hosting clang and it passed tests on win64. Reviewers: mstorsjo, hans Subscribers: hiraditya, mstorsjo, llvm-commits Differential Revision: https://reviews.llvm.org/D44900 llvm-svn: 328570	2018-03-26 18:49:48 +00:00
Simon Pilgrim	f33d905293	[X86] Add WriteBitScan/WriteLZCNT/WriteTZCNT/WritePOPCNT scheduler classes (PR36881) Give the bit count instructions their own scheduler classes instead of forcing them into existing classes. These were mostly overridden anyway, but I had to add in costs from Agner for silvermont and znver1 and the Fam16h SoG for btver2 (Jaguar). Differential Revision: https://reviews.llvm.org/D44879 llvm-svn: 328566	2018-03-26 18:19:28 +00:00
David Blaikie	7c4b5d92f1	Remove unused file, ExecutionEngine/MCJIT/ObjectBuffer.h This header also wasn't self contained/modular - but with no users, it didn't seem worth fixing because it'd break so easily again. llvm-svn: 328565	2018-03-26 18:10:31 +00:00
Mandeep Singh Grang	1b9ff45157	[XCore] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort to llvm::sort. Refer to the comments section in D44363 for a list of all the required patches. Reviewers: dblaikie, RKSimon, robertlytton Reviewed By: robertlytton Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44875 llvm-svn: 328564	2018-03-26 18:08:26 +00:00
Sanjay Patel	0e3167cb30	[InstCombine] improve code comment; NFC llvm-svn: 328560	2018-03-26 17:52:02 +00:00
Lei Huang	be0afb0870	[Power9]Legalize and emit code for quad-precision convert from double-precision Legalize and emit code for quad-precision floating point operation xscvdpqp and add option to guard the quad precision operation support. Differential Revision: https://reviews.llvm.org/D44746 llvm-svn: 328558	2018-03-26 17:46:25 +00:00
Stefan Pintilie	26d4f923c4	[PowerPC] Infrastructure work. Implement getting the opcode for a spill in one place. A new function getOpcodeForSpill should now be the only place to get the opcode for a given spilled register. Differential Revision: https://reviews.llvm.org/D43086 llvm-svn: 328556	2018-03-26 17:39:18 +00:00
Zaara Syeda	17e4eeaa8b	Disable [MachineLICM] Add functions to MachineLICM to hoist invariant stores Disable https://reviews.llvm.org/D40196 with setting option hoist-const-stores to false since failing s390 buildbot. llvm-svn: 328555	2018-03-26 17:22:33 +00:00
Krzysztof Parzyszek	3ca233414b	[Pipeliner] Several node-ordering fixes First, we change the heuristic that is used to ignore the recurrent node-sets in the node ordering. In certain cases it's not important to focus on the recurrent node-sets. Instead, the algorithm begins by considering all the instructions in the node ordering step. Second, a minor change to the bottom up traversal, which needs to consider loop carried dependences (modeled as anti dependences). Previously, these instructions were skipped, which caused problems because the instruction ends up having both predecessors and successors in the schedule. Third, consider anti-dependences as a tie breaker when choosing between instructions in the node ordering. We want to make sure that the source of the anti-dependence does not end up with both predecessors and successors in the final node ordering. Patch by Brendon Cahoon. llvm-svn: 328554	2018-03-26 17:07:41 +00:00
Tim Corringham	7116e8963d	[AMDGPU] Improve disassembler error handling Summary: llvm-objdump now disassembles unrecognised opcodes as data, using the .long directive. We treat unrecognised opcodes as being 32 bit values, so move along 4 bytes rather than the single byte which previously resulted in a cascade of bogus disassembly following an unrecognised opcode. While no solution can always disassemble code that contains embedded data correctly this provides a significant improvement. The disassembler will now cope with an arbitrary length section as it no longer truncates it to a multiple of 4 bytes, and will use the .byte directive for trailing bytes. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D44685 llvm-svn: 328553	2018-03-26 17:06:33 +00:00
Simon Pilgrim	86ea53123d	[X86][Btver2] Add CVTSI2SD/CVTSI2SS scheduler costs We still need to account for how Jaguar passes data from GPR -> XMM, which isn't as clean as XMM -> GPR..... llvm-svn: 328551	2018-03-26 17:02:02 +00:00
Krzysztof Parzyszek	8c07d0c42c	[Pipeliner] Check for affine expression in isLoopCarriedOrder The pipeliner must add a loop carried dependence between two memory operations if the base register is not an affine (linear) expression. The current implementation doesn't check how the base register is defined, which allows non-affine expressions, and then the pipeliner does not add a loop carried dependence when one is needed. This patch adds code to isLoopCarriedOrder that checks if the base register of the memory operations is defined by a phi, and the loop definition for the phi is a constant increment value. This is a very simple check for a linear expression. Patch by Brendon Cahoon. llvm-svn: 328550	2018-03-26 16:58:40 +00:00
David Blaikie	535ca36e5e	Remove an unneeded (& mislayered) include from Target/TargetLoweringObjectFile on a CodeGen header llvm-svn: 328549	2018-03-26 16:57:31 +00:00
David Blaikie	a1b2bf4c71	Remove unneeded (& mislayered) include from TargetMachine.cpp on a CodeGen header llvm-svn: 328548	2018-03-26 16:52:10 +00:00
Krzysztof Parzyszek	9f041b1830	[Pipeliner] Add missing loop carried dependences The pipeliner is not adding a dependence edge for a loop carried dependence, and ends up scheduling a load from iteration n prior to an aliased store in iteration n-1. The code that adds the loop carried dependences in the pipeliner doesn't check if the memory objects for loads and stores are "identified" (i.e., distinct) objects. If they are not, then the code that adds the dependences needs to be conservative. The objects can be used to check dependences only when they are distinct objects. The code that checks for loop carried dependences has been updated to classify loads and stores that are not identified as "unknown" values. A store with an "unknown" value can potentially create a loop carried dependence with any pending load. Patch by Brendon Cahoon. llvm-svn: 328547	2018-03-26 16:50:11 +00:00
Krzysztof Parzyszek	16e66f5901	[Pipeliner] Fix renaming in pipeliner when eliminating phis The phi renaming code in the pipeliner uses the wrong value when rewriting phi uses, which results in an undefined value. In this case, the original phi is no longer needed due to the order of instruction in the pipelined loop. The pipeliner was assuming, in this case, that the phi loop definition should be used to rewrite the uses. However, the pipeliner needs to check to make sure that the loop definition has already been scheduled. If not, then the phi initial value needs to be used instead. Patch by Brendon Cahoon. llvm-svn: 328545	2018-03-26 16:41:36 +00:00
Krzysztof Parzyszek	3f72a6b7a1	[Pipeliner] Fix number of phis to generate in the epilog The pipeliner was generating too many phis in the epilog blocks, which caused incorrect code generation when rewriting an instruction that uses the phi. In this case, there are 3 prolog and epilog stages. An existing phi was scheduled at stage 1. When generating the code for the 2nd epilog an extra new phi was generated. To fix this, we need to update the code that calculates the maximum number of phis that can be generated, which is based upon the current prolog stage and the stage of the original phi. In this case, when the prolog stage is 1 and the original phi stage is 1, the maximum number of phis to generate is 2. Patch by Brendon Cahoon. llvm-svn: 328543	2018-03-26 16:37:55 +00:00
Krzysztof Parzyszek	a212204453	[Pipeliner] Use latency to compute RecMII The patch contains several changes needed to pipeline an example that was transformed so that a Phi with a subreg is converted to copies. The pipeliner wasn't working for a couple of reasons. - The RecMII was 3 instead of 2 due to the extra copies. - Copy instructions contained a latency of 1. - The node order algorithm was not choosing the best "bottom" node, which caused an instruction to be scheduled that had a predecessor and successor already scheduled. - Updated the Hexagon Machine Scheduler to check if the node is latency bound when adding the cost for a 0-latency dependence. The RecMII was 3 because the computation looks at the number of nodes in the recurrence. The extra copy is an extra node but it shouldn't increase the latency. The new RecMII computation looks at the latency of the instructions in the recurrence. We changed the latency of the dependence of a copy to 0. The latency computation for the copy also checks the use of the copy (similar to a reg_sequence). The node order algorithm was not choosing the last instruction in the recurrence for a bottom up traversal. This was when the last instruction is a copy. A check was added when choosing the instruction to check for NodeNum if the maxASAP is the same. This means that the scheduler will not end up with another node in the recurrence that has both a predecessor and successor already scheduled. The cost computation in Hexagon Machine Scheduler adds cost when an instruction can be packetized with a zero-latency instruction. We should only do this if the schedule is latency bound. Patch by Brendon Cahoon. llvm-svn: 328542	2018-03-26 16:33:16 +00:00
Simon Pilgrim	8815105cd5	[X86][Btver2] Add CVTSD2SS/CVTSS2SD scheduler costs llvm-svn: 328541	2018-03-26 16:24:13 +00:00
Krzysztof Parzyszek	f13bbf1d58	[Pipeliner] Fix assert caused by pipeliner serialization The pipeliner is asserting because the serialization step that occurs at the end is deleting an instruction. The assert occurs later on because there is a use without a definition. The problem occurs when an instruction defines a value used by a REG_SEQUENCE and that value is used by a COPY instruction. The latencies between these instructions are zero, so they are put into the same packet. The serialization code is unable to handle this correctly, and ends up putting the REG_SEQUENCE before its definition. There is special code in the serialization step that attempts to handle zero-cost instructions (phis, copy, reg_sequence) differently than regular instructions. Unfortunately, this means the order does not come out correct. This patch simplifies the code by changing the separate steps for handling zero-cost and regular instructions. Only phis are handled separately now, since they should occur first. Then, this patch adds checks to make sure the MoveUse is set to the smallest value if there are multiple uses in a cycle. Patch by Brendon Cahoon. llvm-svn: 328540	2018-03-26 16:23:29 +00:00
Sebastian Pop	d870aea03e	[InstCombine] reassociate loop invariant GEP chains to enable LICM This change brings performance of zlib up by 10%. The example below is from a hot loop in longest_match() from zlib. do.body: %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ] %idx.ext = zext i32 %cur_match.addr.0 to i64 %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 %idx.ext1 %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 -1 In this example %idx.ext1 is a loop invariant. It will be moved above the use of loop induction variable %idx.ext such that it can be hoisted out of the loop by LICM. The operands that have dependences carried by the loop will be sunk down in the GEP chain. This patch will produce the following output: do.body: %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ] %idx.ext = zext i32 %cur_match.addr.0 to i64 %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext1 %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 -1 %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 %idx.ext llvm-svn: 328539	2018-03-26 16:19:31 +00:00
Krzysztof Parzyszek	40df8a2b98	[Pipeliner] Enable more base+offset dependence changes in pipeliner The pipeliner changes dependences between base+offset instructions (loads and stores) so that the instructions have more flexibility to be scheduled with respect to each other. This occurs when the pipeliner is able to compute that the instructions will not alias if their order is changed. The previous code enforced the alias property by checking if the base register is the same, and that the offset values are either both positive or negative. This patch improves the alias check by using the API areMemAccessesTriviallyDisjoint instead. This enables more cases, especially if the offset is a negative value. The pipeliner uses the function by creating a new instruction with the offset used in the next iteration. Patch by Brendon Cahoon. llvm-svn: 328538	2018-03-26 16:17:06 +00:00
Krzysztof Parzyszek	55cb4986a4	[Pipeliner] Fix calculation when reusing phis A schedule may require that a phi from the original loop is used in multiple iterations in the scheduled loop. When this occurs, we generate multiple phis in the pipelined loop to save the value across iterations. When we generate the new phis and update the register names in the pipelined loop, the pipeliner attempts to reuse a previously generated phi, when possible. The calculation for the name of the new phi needs to account for the version/iteration of the original phi. Also, in the epilog, the code only needs to check backwards for a previous iteration until reaching the first prolog block. Patch by Brendon Cahoon. llvm-svn: 328537	2018-03-26 16:10:48 +00:00
Simon Pilgrim	aa40148cae	[X86][Btver2] Account for the "+i" integer pipe transfer costs (1cy use of JALU0 for GPR PRF write) llvm-svn: 328536	2018-03-26 16:10:08 +00:00
Krzysztof Parzyszek	8e1363df4e	[Pipeliner] Fix check for order dependences when finalizing instructions The code in orderDepdences that looks at the order dependences between instructions was processing all the successor and predecessor order dependences. However, we really only want to check for an order dependence for instructions scheduled in the same cycle. Also, fixed how the pipeliner handles output dependences. An output dependence is also a potential loop carried dependence. The pipeliner didn't handle this case properly so an invalid schedule could be created that allowed an output dependence to be scheduled in the next iteration at the same cycle. Patch by Brendon Cahoon. llvm-svn: 328516	2018-03-26 16:05:55 +00:00
Krzysztof Parzyszek	3a0a15afe7	[Pipeliner] Fix in the pipeliner phi reuse code When the definition of a phi is used by a phi in the next iteration, the pipeliner was assuming that the definition is processed first. Because of the assumption, an incorrect phi name was used. This patch has a check to see if the phi definition has been processed already. Patch by Brendon Cahoon. llvm-svn: 328510	2018-03-26 15:58:16 +00:00
Krzysztof Parzyszek	b9b75b8cb6	[Pipeliner] Pipeliner should mark physical registers as used The software pipeliner attempts to delete dead instructions after generating the pipelined loop. The code looks for uses of each instruction. Physical registers should be treated differently because the use chains do not exist. The code that checks for dead instructions should assume that definitions of physical registers are used if the operand doesn't contain the dead flag. Patch by Brendon Cahoon. llvm-svn: 328509	2018-03-26 15:53:23 +00:00
Krzysztof Parzyszek	785b6cec11	[Pipeliner] Correctly update memoperands in the epilog The pipeliner needs to be conservative when updating the memoperands of instructions in the epilog. Previously, the pipeliner was changing the offset of the memoperand based upon the scheduling stage. However, that is incorrect when control flow branches around the kernel code. The bug enabled a load and store to the same stack offset to be swapped. This patch fixes the bug by updating the size of the memoperands to be UINT_MAX. This conservative value means that dependences will be created between other loads and stores. Patch by Brendon Cahoon. llvm-svn: 328508	2018-03-26 15:45:55 +00:00
Erik Pilkington	615e753e09	[demangler] Fix a bug in r328464 found by oss-fuzz. llvm-svn: 328507	2018-03-26 15:34:36 +00:00
Krzysztof Parzyszek	56f0fc4716	[Hexagon] Give priority to post-incrementing memory accesses in LSR llvm-svn: 328506	2018-03-26 15:32:03 +00:00
Simon Pilgrim	0b73b29388	[X86][Btver2] Add CVTSD2SI/CVTSS2SI scheduler costs Account for the "+i" integer pipe transfer cost (1cy use of JALU0 for GPR PRF write) This also adds missing vcvttss2si tests llvm-svn: 328505	2018-03-26 15:30:47 +00:00
Sanjay Patel	4fd4fd610c	[InstCombine] distribute fmul over fadd/fsub This replaces a large chunk of code that was looking for compound patterns that include these sub-patterns. Existing tests ensure that all of the previous examples are still folded as expected. We still need to loosen the FMF check. llvm-svn: 328502	2018-03-26 15:03:57 +00:00
Simon Pilgrim	3aa9344605	[X86][Btver2] Fix YMM BLENDPD/BLENDPS + UNPCKPD/UNPCKP instructions costs These should match the YMM MOVDUP/ PERMILPD/PERMILPS + SHUFPD/SHUFPS shuffles instead of using the WriteFShuffle defaults. llvm-svn: 328501	2018-03-26 14:44:24 +00:00
Sanjay Patel	2455fef497	[InstCombine] check uses before creating instructions for fmul distribution As the tests show, we could create extra instructions without any obvious benefit. llvm-svn: 328498	2018-03-26 14:25:43 +00:00
Simon Pilgrim	67df1cf597	[X86][Btver2] Add (V)SQRTPD/(V)SQRTSD costs The xmm sd/pd versions were using the WriteFSQRT default which is modelled on sqrtss/sqrtps llvm-svn: 328497	2018-03-26 14:03:40 +00:00
Nicolai Haehnle	4f850eabb6	AMDGPU: Introduce common SOP_Pseudo and VOP_Pseudo TableGen base classes Differential revision: https://reviews.llvm.org/D44820 Change-Id: I732979e2964006aa15d78a333d8886e6855f319a llvm-svn: 328496	2018-03-26 13:56:53 +00:00
Simon Pilgrim	caa203aed5	[X86][Btver2] Double the AGU and schedule pipe resources for YMM Both the AGUs and schedule pipes are double pumped for 256-bit instructions as well as the functional units which we already model. llvm-svn: 328491	2018-03-26 13:15:20 +00:00
Krzysztof Parzyszek	0b377e0ae9	[LSR] Allow giving priority to post-incrementing addressing modes Implement TTI interface for targets to indicate that the LSR should give priority to post-incrementing addressing modes. Combination of patches by Sebastian Pop and Brendon Cahoon. Differential Revision: https://reviews.llvm.org/D44758 llvm-svn: 328490	2018-03-26 13:10:09 +00:00
Max Kazantsev	a55749312b	[LoopUnroll] Fix dangling pointers in SCEV Current logic of loop SCEV invalidation in Loop Unroller implicitly relies on the fact that exit count of outer loops cannot rely on exiting blocks of inner loops, which is true in current implementation of backedge taken count calculation but is wrong in general. As a result, when we only forget the loop that we have just unrolled, we may still have cached data for its outer loops (in particular, exit counts) which keeps references on blocks of inner loop that could have been changed or even deleted. The attached test demonstrates a situation when after unrolling of innermost loop the outermost loop contains a dangling pointer on non-existent block. The problem shows up when we apply patch https://reviews.llvm.org/D44677 that makes SCEV smarter about exit count calculation. I am not sure if the bug exists without this patch, it appears that now it is accidentally correct just because in practice exact backedge taken count for outer loops with complex control flow inside is never calculated. But when SCEV learns to do so, this problem shows up. This patch replaces existing logic of SCEV loop invalidation with a correct one, which happens to be invalidation of outermost loop (which also leads to invalidation of all loops inside of it). It is the only way to ensure that no outer loop keeps dangling pointers on removed blocks, or just outdated information that has changed after unrolling. Differential Revision: https://reviews.llvm.org/D44818 Reviewed By: samparker llvm-svn: 328483	2018-03-26 11:31:46 +00:00
Hans Wennborg	311b63f13b	Revert r328386 "[X86] Fix Windows `i1 zeroext` conventions to use i8 instead of i32" This broke Chromium (see crbug.com/825748). It looks like mstorsjo's follow-up patch at D44876 fixes this, but let's revert back to green for now until that's ready to land. (Also reverts r328443.) > Both GCC and MSVC only look at the low byte of a boolean when it is > passed. llvm-svn: 328482	2018-03-26 10:07:51 +00:00
Benjamin Kramer	8840f644b4	[DeadArgElim] Strip allocsize attributes when deleting an argument. Since allocsize refers to the argument number it gets invalidated when an argument is removed and the numbers shift. llvm-svn: 328481	2018-03-26 09:44:24 +00:00
Sam Parker	53a423a417	[IRCE] Enable increasing loops of variable bounds CanBeMin is currently used which will report true for any unknown values, but often a check is performed outside the loop which covers this situation: for (int i = 0; i < N; ++i) ... if (N > 0) for (int i = 0; i < N; ++i) ... So I've added 'LoopGuardedAgainstMin' which reports whether N is greater than the minimum value which then allows loop with a variable loop count to be optimised. I've also moved the increasing bound checking into its own function and replaced SumCanReachMax with another isLoopEntryGuardedByCond function. llvm-svn: 328480	2018-03-26 09:29:42 +00:00
Martin Storsjo	439824622a	[ARM] Simplify constructing the ARMArchFeature string. NFC. Differential Revision: https://reviews.llvm.org/D44819 llvm-svn: 328478	2018-03-26 08:41:10 +00:00
Craig Topper	6f28d3c954	[X86] Fix the SchedRW for intrinsic register form of SQRT/RCP/RSQRT. llvm-svn: 328474	2018-03-26 05:05:12 +00:00
Craig Topper	cdfcf8ecda	[X86] Merge the SSE and AVX versions of fp divs and sqrts in the SandyBridge/Haswell/Broadwell/Skylake scheduler models. I've used Agner's data as best I could to get the values to converge on. llvm-svn: 328473	2018-03-26 05:05:10 +00:00
Craig Topper	fbf2d850e3	[X86] Add itinerary to intrinsic version of sqrtss, rcpss, and rsqrtss instructions. llvm-svn: 328472	2018-03-26 04:20:36 +00:00
Craig Topper	c049cb7823	[X86] Correct the itineraries for the dot product instructions. llvm-svn: 328471	2018-03-26 02:17:15 +00:00
Craig Topper	4367874bc5	[X86] Use the same itinerary for VCVTDQ2PD as the SSE version so that the generated scheduler classes will merge. llvm-svn: 328470	2018-03-26 02:17:14 +00:00
Craig Topper	659f85af14	[X86] Swap the itineraries on the memory and register forms of CVTDQ2PD. They were backwards. llvm-svn: 328469	2018-03-26 02:17:13 +00:00
Craig Topper	4bf23eddaf	[X86] Give VMOVSX/ZX the same itinerary as the SSE version so they'll reuse the same generated scheduler class. llvm-svn: 328468	2018-03-26 02:17:12 +00:00
Craig Topper	6e8d99bbea	[X86] Give vpmsadbw the same itinerary as the SSE version so they'll be able to share the same generated scheduler class. llvm-svn: 328466	2018-03-25 23:52:06 +00:00
Craig Topper	15fef89ad9	[X86] Move (v)movss to port 5 only for Skylake. Move (v)movups/d to port 015 for Skylake. This matches Agner's data and is consistent with what the EVEX instructions were doing on SKX. llvm-svn: 328465	2018-03-25 23:40:56 +00:00
Erik Pilkington	8a1cb33ba5	[demangler] Use a back-patching scheme to resolve forward references. Strictly in a conversion operator's type, a <template-param> refers to a <template-arg> that is further ahead in the mangled name. Instead of doing a second parse to resolve these, introduce a ForwardTemplateReference Node and back-patch the referenced <template-arg> when we're in the right context. This is also a correctness fix, previously we would only do a second parse if the <template-param> was out of bounds in the current set of <template-args>. This led to misdemangles (gasp!) when the conversion operator was a member of a templated struct, for instance. llvm-svn: 328464	2018-03-25 22:50:33 +00:00
Erik Pilkington	8c7013d4ca	[demangler] Tweak how parameter pack sizes are determined. Rather than eagerly propagating up parameter pack sizes in Node ctors, find the parameter pack size during printing. This is being done to support back-patching forward referencing <template-param>s. llvm-svn: 328463	2018-03-25 22:49:57 +00:00
Erik Pilkington	c728786b1d	[demangler] Support for clang's enable_if attribute. Fixes PR33569. llvm-svn: 328462	2018-03-25 22:49:16 +00:00
Sanjay Patel	93e64dd9a1	[PatternMatch] allow undef elements when matching vector FP +0.0 This continues the FP constant pattern matching improvements from: https://reviews.llvm.org/rL327627 https://reviews.llvm.org/rL327339 https://reviews.llvm.org/rL327307 Several integer constant matchers also have this ability. I'm separating matching of integer/pointer null from FP positive zero and renaming/commenting to make the functionality clearer. llvm-svn: 328461	2018-03-25 21:16:33 +00:00
Simon Pilgrim	68a8fbc102	[X86] Use WriteResPair for WriteIDiv to cleanup sched defs. NFCI. llvm-svn: 328460	2018-03-25 20:16:53 +00:00
Simon Pilgrim	fecb0b7874	[X86][SkylakeClient] Fix missing comma llvm-svn: 328458	2018-03-25 19:17:17 +00:00
Simon Pilgrim	351e4fa0e2	[ARM] Remove sched model instregex entries that don't match any instructions (D44687) Reviewed by @javed.absar llvm-svn: 328457	2018-03-25 19:07:17 +00:00
Simon Pilgrim	854ac7490d	[X86] Add missing full stop to comment. NFCI. llvm-svn: 328456	2018-03-25 18:49:48 +00:00
Craig Topper	972bdbd415	[X86][SkylakeClient] Fix a set of regular expressions that were checking for optionally starting with 'Y' instead of 'V' These bad regexs were introduced by r328435 llvm-svn: 328454	2018-03-25 17:33:14 +00:00
Simon Pilgrim	562e8b4eae	[X86][MMX] MOVQ2DQ/MOVDQ2Q are better described as WriteVecMove than WriteMove Not that it makes a difference to current cost values, but will when we try to better model GPR-SIMD transfer costs llvm-svn: 328453	2018-03-25 17:28:06 +00:00
Simon Pilgrim	25acc0a79b	[X86][SkylakeServer] Merge multiple instregex. NFCI llvm-svn: 328452	2018-03-25 17:25:37 +00:00
Craig Topper	a985919d3e	[X86] Update cost model for Goldmont. Add fsqrt costs for Silvermont Add fdiv costs for Goldmont using table 16-17 of the Intel Optimization Manual. Also add overrides for FSQRT for Goldmont and Silvermont. Reviewers: RKSimon Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44644 llvm-svn: 328451	2018-03-25 15:58:12 +00:00
Sanjay Patel	841aac04d4	[InstCombine] peek through more icmp of FP cast + bitcast This is an extension of rL328426 as noted in D44367. llvm-svn: 328448	2018-03-25 14:01:42 +00:00
Simon Pilgrim	e3547af7be	[X86] Add the ability to override memory folding latency to schedules and add 1uop for memory folds for Intel models The Intel models need an extra 1uop for memory folded instructions, plus a lot of instructions take a non-default memory latency which should allow us to use the multiclass a lot more to tidy things up. Differential Revision: https://reviews.llvm.org/D44840 llvm-svn: 328446	2018-03-25 10:21:19 +00:00
Craig Topper	e8f4e747bf	[X86] Consistently prefix all defs in X86ScheduleSLM.td with 'SLM'. llvm-svn: 328444	2018-03-25 01:28:43 +00:00
Martin Storsjo	98720156b9	[X86] Update a partially stale comment, since SVN r328386. NFC. llvm-svn: 328443	2018-03-24 23:00:00 +00:00
Simon Pilgrim	31a9633724	[X86][SkylakeClient] Merge xmm/ymm instructions instregex entries to reduce regex matches to reduce compile time llvm-svn: 328435	2018-03-24 20:40:14 +00:00
Simon Pilgrim	c21deec37b	[X86][Broadwell] Merge xmm/ymm instructions instregex entries to reduce regex matches to reduce compile time llvm-svn: 328434	2018-03-24 19:37:28 +00:00
Mandeep Singh Grang	98bc25a0f2	[RISCV] Use init_array instead of ctors for RISCV target, by default Summary: LLVM defaults to the newer .init_array/.fini_array scheme for static constructors rather than the less desirable .ctors/.dtors (the UseCtors flag defaults to false). This wasn't being respected in the RISC-V backend because it fails to call TargetLoweringObjectFileELF::InitializeELF with the appropriate flag for UseInitArray. This patch fixes this by implementing RISCVELFTargetObjectFile and overriding its Initialize method to call InitializeELF(TM.Options.UseInitArray). Reviewers: asb, apazos Reviewed By: asb Subscribers: mgorny, rbar, johnrusso, simoncook, jordy.potman.lists, sabuasal, niosHD, kito-cheng, shiva0217, llvm-commits Differential Revision: https://reviews.llvm.org/D44750 llvm-svn: 328433	2018-03-24 18:37:19 +00:00
Simon Pilgrim	2b5967f510	[X86][Haswell] Merge xmm/ymm instructions instregex entries to reduce regex matches to reduce compile time llvm-svn: 328432	2018-03-24 18:36:01 +00:00
Simon Pilgrim	efcf1d85b3	[X86][SandyBridge] Merge xmm/ymm instructions instregex entries to reduce regex matches to reduce compile time llvm-svn: 328431	2018-03-24 18:12:59 +00:00
Mandeep Singh Grang	db00e2e20f	[Hexagon] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace all std::sort to llvm::sort. Refer to the comments section in D44363 for a list of all the required patches. Reviewers: kparzysz Reviewed By: kparzysz Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44857 llvm-svn: 328430	2018-03-24 17:34:37 +00:00
Mandeep Singh Grang	860adef9e6	[AMDGPU] Change std::sort to llvm::sort in response to r327219 Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Reviewers: tstellar, RKSimon, arsenm Reviewed By: arsenm Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D44856 llvm-svn: 328429	2018-03-24 17:15:04 +00:00
Sanjay Patel	745a9c62c2	[InstCombine] peek through FP casts for sign-bit compares (PR36682) This pattern came up in PR36682: https://bugs.llvm.org/show_bug.cgi?id=36682 https://godbolt.org/g/LhuD9A Equality checks are planned as a follow-up enhancement. Differential Revision: https://reviews.llvm.org/D44367 llvm-svn: 328426	2018-03-24 15:45:02 +00:00
Sanjay Patel	286074e8a1	[InstCombine] fix formatting; NFC llvm-svn: 328425	2018-03-24 15:41:59 +00:00
Craig Topper	097b47a0fc	[X86] Add a new disassembler opcode map for 3DNow. Stop treating 3DNow as an attribute. This reduces the size of llvm-mc by at least 150k since we no longer have to multiply the attribute across 7 tables. llvm-svn: 328416	2018-03-24 07:48:54 +00:00
Craig Topper	e865641aea	[X86] Merge the Has3DNow0F0FOpcode TSFlag into the OpMap encoding. NFC The 3DNow instructions are encoded a little weird, but we can still represent it as an opcode map. llvm-svn: 328410	2018-03-24 06:04:12 +00:00
Craig Topper	2c0a62ab9a	[X86] Add a DAG combine to simplify PMULDQ/PMULUDQ nodes These nodes only use the lower 32 bits of their inputs so we can use SimplifyDemandedBits to simplify them. Differential Revision: https://reviews.llvm.org/D44375 llvm-svn: 328405	2018-03-24 01:52:01 +00:00
Eric Christopher	fe6e6d93d9	Allow FDE references outside the +/-2GB range supported by PC relative offsets for code models other than small/medium. For JIT application, memory layout is less controlled and can result in truncations otherwise. Patch based on one by Olexa Bilaniuk! llvm-svn: 328400	2018-03-24 00:07:38 +00:00
David Blaikie	53f51c1df8	Remove unused header from EntryExitInstrumenter Fixes layering, since Transforms/Utils doesn't depend on CodeGen, so shouldn't include headers from it. llvm-svn: 328399	2018-03-24 00:06:14 +00:00
Craig Topper	bc6d2ec8ce	[X86] Correct the value AdSizeX in X86II enum. NFC Should be NFC since nothing used the enum value. The instruction descriptions are generated from tablegen which had the correct value. llvm-svn: 328398	2018-03-24 00:02:46 +00:00
David Blaikie	36a0f226b1	Fix layering by moving ValueTypes.h from CodeGen to IR ValueTypes.h is implemented in IR already. llvm-svn: 328397	2018-03-23 23:58:31 +00:00
David Blaikie	13e77db2df	Fix layering of MachineValueType.h by moving it from CodeGen to Support This is used by llvm tblgen as well as by LLVM Targets, so the only common place is Support for now. (maybe we need another target for these sorts of things - but for now I'm at least making them correct & we can make them better if/when people have strong feelings) llvm-svn: 328395	2018-03-23 23:58:25 +00:00
David Blaikie	bf121cf44a	Fix layering by moving Support/CodeGenCWrappers.h to Target This includes llvm-c/TargetMachine.h which is logically part of libTarget (since libTarget implements llvm-c/TargetMachine.h's functions). llvm-svn: 328394	2018-03-23 23:58:21 +00:00
David Blaikie	ab7f17f4ec	Fix layering by moving X86DisassemblerDecoderCommon to Support This is used from llvm tblgen and the X86Disassembler - the only common library (apart from TableGen, which probably doesn't make sense to have as a dependency from a release tool (rather than a use-while-building-llvm tool) of LLVM) llvm-svn: 328393	2018-03-23 23:58:20 +00:00
David Blaikie	6054e650ff	Move TargetLoweringObjectFile from CodeGen to Target to fix layering It's implemented in Target & included from other Target headers, so the header should be in Target. llvm-svn: 328392	2018-03-23 23:58:19 +00:00
Philip Reames	6a1f3446b5	[GuardWidening] Group code by class [NFC] llvm-svn: 328387	2018-03-23 23:41:47 +00:00
Reid Kleckner	e27b410661	[X86] Fix Windows `i1 zeroext` conventions to use i8 instead of i32 Both GCC and MSVC only look at the low byte of a boolean when it is passed. llvm-svn: 328386	2018-03-23 23:38:53 +00:00
David Blaikie	4fe1fe1418	Fix Layering, move instrumentation transform headers into Instrumentation subdirectory llvm-svn: 328379	2018-03-23 22:11:06 +00:00
Fedor Sergeev	6660fd0f95	[PM][FunctionAttrs] add NoUnwind attribute inference to PostOrderFunctionAttrs pass Summary: This was motivated by the absence of PruneEH functionality in the new PM. It was decided that a proper way to do PruneEH is to add NoUnwind inference into PostOrderFunctionAttrs and then perform normal SimplifyCFG on top. This change generalizes attribute handling implemented for (a removal of) Convergent attribute, by introducing a generic builder-like class AttributeInferer. It registers all the attribute inference requests, storing per-attribute predicates into a vector, and then goes through an SCC Node, scanning all the instructions for not breaking attribute assumptions. The main idea is that as soon as all the instructions from all the functions of SCC Node conform to attribute assumptions then we are free to infer the attribute as set for all the functions of SCC Node. It handles two distinct cases of attributes: - those that might break due to derefinement of the function code; for these attributes we are allowed to apply inference only if all the functions are "exact definitions". Example - NoUnwind. - those that do not care about derefinement; for these attributes we are allowed to apply inference as soon as we see any function definition. Example - removal of Convergent attribute. Also in this commit: * Converted all the FunctionAttrs tests to use FileCheck and added new-PM invocations to them * FunctionAttrs/convergent.ll test demonstrates a difference in behavior between new and old PM implementations. Marked with FIXME.
* PruneEH tests were converted to new-PM as well, using function-attrs+simplify-cfg combo as intended * some of "other" tests were updated since function-attrs now infers 'nounwind' even for old PM pipeline * -disable-nounwind-inference hidden option added as a possible workaround for a supposedly rare case when nounwind being inferred by default presents a problem Reviewers: chandlerc, jlebar Reviewed By: jlebar Subscribers: eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D44415 llvm-svn: 328377	2018-03-23 21:46:16 +00:00
Sanjay Patel	32381d7c7e	[InstCombine] simplify code for FP intrinsic shrinking; NFCI llvm-svn: 328372	2018-03-23 21:18:12 +00:00
Krzysztof Parzyszek	998df2ca4f	[Hexagon] Make findLoopInstr member of HexagonInstrInfo llvm-svn: 328367	2018-03-23 20:43:02 +00:00
Krzysztof Parzyszek	8038dad7db	[Hexagon] Correct update of instruction offset in HW loop fixup llvm-svn: 328366	2018-03-23 20:41:44 +00:00
Krzysztof Parzyszek	bcf0a96f9e	[Hexagon] Boost profit for word-mask immediates, reduce for others This avoids unnecessary splitting due to uninteresting immediates. llvm-svn: 328364	2018-03-23 20:11:00 +00:00
Zachary Turner	f228276262	[PDB] Resubmit "Support embedding natvis files in PDBs." This was reverted several times due to what ultimately turned out to be incompatibilities in our serialized hash table format. Several changes went in prior to this to fix those issues since they were more fundamental and independent of supporting injected sources, so now that those are fixed this change should hopefully pass. llvm-svn: 328363	2018-03-23 19:57:25 +00:00
Krzysztof Parzyszek	ca93f5e605	[Hexagon] Assume all extendable branches to be of size 8 in relaxation The branch relaxation pass collects sizes of all instructions at the beginning, before any changes have been made. It then performs one pass over all branches to see which ones need to be extended. It does not account for the case when a previously valid branch becomes out-of-range due to relaxing other branches. This approach fixes this problem by assuming from the beginning that all extendable branches have been extended. This may cause unneeded relaxation in some cases, but avoids iteration and recomputing instruction sizes. llvm-svn: 328360	2018-03-23 19:47:13 +00:00
Krzysztof Parzyszek	6f503b96fb	[Hexagon] Incorrectly removing dead flag and adding kill flag The HexagonExpandCondsets pass is incorrectly removing the dead flag on a definition that is really dead, and adding a kill flag to a use that is tied to a definition. This causes an assert later during the machine scheduler when querying the live interval information. Patch by Brendon Cahoon. llvm-svn: 328357	2018-03-23 19:39:37 +00:00
Benjamin Kramer	faa9b438ce	[Hexagon] Silence unused variable warning in Release builds llvm-svn: 328356	2018-03-23 19:39:16 +00:00
Krzysztof Parzyszek	e247526cc9	[Hexagon] Fold offset in base+immediate loads/stores Optimize Ry = add(Rx,#n); memw(Ry+#0) = Rz => memw(Rx,#n) = Rz. Patch by Jyotsna Verma. llvm-svn: 328355	2018-03-23 19:30:34 +00:00
Craig Topper	4529d3abcb	[X86] Add itinerary to RCPSS*_Int and similar instructions. llvm-svn: 328353	2018-03-23 19:15:05 +00:00
Craig Topper	02fb3907f1	[X86] Add itineraries to ADD.*_DB instructions to match their normal counterparts. llvm-svn: 328352	2018-03-23 19:15:03 +00:00
Tony Tye	7a893d4e34	[AMDGPU] Remove use of OpenCL triple environment and replace with function attribute for AMDGPU - Remove use of the opencl and amdopencl environment member of the target triple for the AMDGPU target. - Use function attribute to communicate to the AMDGPU backend to add implicit arguments for OpenCL kernels for the AMDHSA OS. Differential Revision: https://reviews.llvm.org/D43736 llvm-svn: 328349	2018-03-23 18:45:18 +00:00
Zachary Turner	a6fb536e5b	[PDB] Make our PDBs look more like MS PDBs. When investigating bugs in PDB generation, the first step is often to do the same link with link.exe and then compare PDBs. But comparing PDBs is hard because two completely different byte sequences can both be correct, so it hampers the investigation when you also have to spend time figuring out not just which bytes are different, but also if the difference is meaningful. This patch fixes a couple of cases related to string table emission, hash table emission, and the order in which we emit strings that makes more of our bytes the same as the bytes generated by MS PDBs. Differential Revision: https://reviews.llvm.org/D44810 llvm-svn: 328348	2018-03-23 18:43:39 +00:00
Krzysztof Parzyszek	5f7ba9a74c	[Hexagon] Always generate mux out of predicated transfers if possible HexagonGenMux would collapse pairs of predicated transfers if it assumed that the predicated .new forms cannot be created. Turns out that generating mux is preferable in almost all cases. Introduce an option -hexagon-gen-mux-threshold that controls the minimum distance between the instruction defining the predicate and the later of the two transfers. If the distance is closer than the threshold, mux will not be generated. Set the threshold to 0 by default. llvm-svn: 328346	2018-03-23 18:43:09 +00:00
Krzysztof Parzyszek	80f10e4fe5	[Hexagon] Avoid early if-conversion for one sided branches Patch by Anand Kodnani. llvm-svn: 328344	2018-03-23 18:00:18 +00:00
Simon Pilgrim	6c63e6c222	[X86][Btver2] Cleanup TEST instructions to use JFPA (+JFPX on ymms) function unit llvm-svn: 328343	2018-03-23 17:59:22 +00:00
Alex Shlyapnikov	83e7841419	[HWASan] Port HWASan to Linux x86-64 (LLVM) Summary: Porting HWASan to Linux x86-64, first of the three patches, LLVM part. The approach is similar to ARM case, trap signal is used to communicate memory tag check failure. int3 instruction is used to generate a signal, access parameters are stored in nop [eax + offset] instruction immediately following the int3 one. One notable difference is that x86-64 has to untag the pointer before use due to the lack of feature comparable to ARM's TBI (Top Byte Ignore). Reviewers: eugenis Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D44699 llvm-svn: 328342	2018-03-23 17:57:54 +00:00
Ana Pazos	41573804f2	[ARM] Fix "Constant pool entry out of range!" in Thumb1 mode This patch fixes PR36658, "Constant pool entry out of range!" in Thumb1 mode. In ARMConstantIslands::optimizeThumb2JumpTables() in Thumb1 mode, adjustBBOffsetsAfter() is not calculating postOffset correctly by properly accounting for the padding that is required for the constant pool that immediately follows the jump table branch instruction. Reviewers: t.p.northover, eli.friedman Reviewed By: t.p.northover Subscribers: chrib, tstellar, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D44709 llvm-svn: 328341	2018-03-23 17:53:27 +00:00
Krzysztof Parzyszek	570c6440cd	[Hexagon] Two fixes in early if-conversion - Fix checking for vector predicate registers. - Avoid speculating llvm.lifetime.end intrinsic. Patch by Harsha Jagasia and Brendon Cahoon. llvm-svn: 328339	2018-03-23 17:46:09 +00:00
Simon Pilgrim	e5c0a041ff	[X86][Btver2] Cleanup MOVMSK instructions to use JFPA function unit Add missing non-VEX and (V)PMOVMSKB instructions to the pattern llvm-svn: 328338	2018-03-23 17:38:59 +00:00
Andrew Kaylor	a237866faf	Fix a block copying problem in LICM Differential Revision: https://reviews.llvm.org/D44817 llvm-svn: 328336	2018-03-23 17:36:18 +00:00
Fangrui Song	c244a15801	[ADT] Simplify getMemory. NFC llvm-svn: 328334	2018-03-23 17:26:12 +00:00
Krzysztof Parzyszek	c98802de09	[Hexagon] Copy subregisters in HexagonStoreWiden When converting an instruction to the wider version, copy any subregisters if the original operand has a subregister. Patch by Brendon Cahoon. llvm-svn: 328333	2018-03-23 17:22:55 +00:00
Simon Pilgrim	256f149bf0	[X86][Btver2] Vector permutes use a JFPU01 scheduler pipe and JFPX/JVALU function unit llvm-svn: 328331	2018-03-23 16:17:56 +00:00
Simon Pilgrim	ee282b3160	[X86][Btver2] Vector store instructions use a JFPU1 scheduler pipe and JSAGU/JSTC function units llvm-svn: 328328	2018-03-23 15:35:13 +00:00
Zaara Syeda	6535993625	Re-commit: [MachineLICM] Add functions to MachineLICM to hoist invariant stores This patch adds functions to allow MachineLICM to hoist invariant stores. Currently, MachineLICM does not hoist any store instructions, however when storing the same value to a constant spot on the stack, the store instruction should be considered invariant and be hoisted. The function isInvariantStore iterates each operand of the store instruction and checks that each register operand satisfies isCallerPreservedPhysReg. The store may be fed by a copy, which is hoisted by isCopyFeedingInvariantStore. This patch also adds the PowerPC changes needed to consider the stack register as caller preserved. Differential Revision: https://reviews.llvm.org/D40196 llvm-svn: 328326	2018-03-23 15:28:15 +00:00
Simon Pilgrim	1335b9c0ca	[X86][Btver2] Cleanup DPPS/DPPD instructions to use JFPA/JFPM function units llvm-svn: 328324	2018-03-23 15:17:50 +00:00
Sanjay Patel	713ca3d36a	[InstCombine] reduce code duplication; NFC llvm-svn: 328323	2018-03-23 15:07:35 +00:00
Sanjay Patel	6de89ce3f7	[InstCombine] improve variable name; NFC llvm-svn: 328322	2018-03-23 14:48:31 +00:00
John Brawn	e3b44f9de6	[AArch64] Don't reduce the width of loads if it prevents combining a shift Loads and stores can only shift the offset register by the size of the value being loaded, but currently the DAGCombiner will reduce the width of the load if it's followed by a trunc making it impossible to later combine the shift. Solve this by implementing shouldReduceLoadWidth for the AArch64 backend and make it prevent the width reduction if this is what would happen, though do allow it if reducing the load width will let us eliminate a later sign or zero extend. Differential Revision: https://reviews.llvm.org/D44794 llvm-svn: 328321	2018-03-23 14:47:07 +00:00
Simon Pilgrim	5792e10ffb	[X86][Btver2] Fix MicroOps counts for DPPS/YMM memory folded instructions This was due to a misunderstanding: what llvm calls a micro-op (retirement unit) is actually called a macro-op on the AMD/Jaguar target. Folded loads don't affect num macro ops. llvm-svn: 328320	2018-03-23 14:45:03 +00:00
Simon Pilgrim	8619962c73	[X86][Btver2] Cleanup SSE42 PCMPISTR/PCMPESTR string instructions to correctly use JFPU1 scheduler pipe followed by JLAGU/JSAGU/JFPA/JVALU function units Fixes throughput to match Agner/Fam16h-SoG as well. llvm-svn: 328318	2018-03-23 14:27:26 +00:00
Matthew Simpson	6c289a1c74	[SLP] Stop counting cost of gather sequences with multiple uses When building the SLP tree, we look for reuse among the vectorized tree entries. However, each gather sequence is represented by a unique tree entry, even though the sequence may be identical to another one. This means, for example, that a gather sequence with two uses will be counted twice when computing the cost of the tree. We should only count the cost of the definition of a gather sequence rather than its uses. During code generation, the redundant gather sequences are emitted, but we optimize them away with CSE. So it looks like this problem just affects the cost model. Differential Revision: https://reviews.llvm.org/D44742 llvm-svn: 328316	2018-03-23 14:18:27 +00:00
Alexey Bataev	bff360865b	[DEBUGINFO] Add flag for DWARF2 to use sections as references. Summary: Some targets do not support labels inside debug sections, but do support references of the form `section+offset`. Patch adds initial support for this. Reviewers: echristo, probinson, jlebar Subscribers: llvm-commits, JDevlieghere Differential Revision: https://reviews.llvm.org/D43943 llvm-svn: 328314	2018-03-23 13:35:54 +00:00
Christof Douma	4a025cc79d	[ARM] Support float literals under XO When targeting execute-only and fp-armv8, float constants in a compare resulted in instruction selection failures. This is now fixed by using vmov.f32 where possible, otherwise the floating point constant is lowered into an integer constant that is moved into a floating point register. This patch also restores using fpcmp with immediate 0 under fp-armv8. Change-Id: Ie87229706f4ed879a0c0cf66631b6047ed6c6443 llvm-svn: 328313	2018-03-23 13:02:03 +00:00
Florian Hahn	f73c3ece7f	Revert r328307: [IPSCCP] Use constant range information for comparisons of parameters. Reverted for now, due to it causing verifier failures. llvm-svn: 328312	2018-03-23 12:49:39 +00:00
Simon Pilgrim	9ea14bbbb0	[X86][Znver1] Fix instregex entries that don't match any instructions (D44687) Reviewed by @GGanesh and @craig.topper llvm-svn: 328309	2018-03-23 12:08:23 +00:00
Simon Pilgrim	2755893834	[X86][SandyBridge] Fix missing comma that was causing string concatenation of 2 instregex entries Found while updating D44687 llvm-svn: 328308	2018-03-23 11:56:38 +00:00
Florian Hahn	b1feec087e	[IPSCCP] Use constant range information for comparisons of parameters. For comparisons with parameters, we can use the ParamState lattice elements which also provide constant range information. This improves the code for PR33253 further and gets us closer to use ValueLatticeElement for all values. Also, as we are using the range information in the solver directly, we do not need tryToReplaceWithConstantRange afterwards anymore. Reviewers: dberlin, mssimpso, davide, efriedma Reviewed By: mssimpso Differential Revision: https://reviews.llvm.org/D43762 llvm-svn: 328307	2018-03-23 11:56:00 +00:00
Simon Pilgrim	a1e3ea01ef	[X86][Btver2] Vector move/load/store instructions use a JFPU01 scheduler pipe and JFPX/JVALU function unit as well as the AGUs llvm-svn: 328304	2018-03-23 11:27:31 +00:00
Florian Hahn	588e640ea1	[AArch64] Clean-up a few over-eager regexps in models. Patch by Simon Pilgrim <llvm-dev@redking.me.uk> That is a slightly modified version of the AArch64 changes from Simon's D44687 . llvm-svn: 328303	2018-03-23 11:00:42 +00:00
Florian Hahn	52436a587e	[LoopUnroll] Simplify induction variables after peeling too. Loop peeling also has an impact on the induction variables, so we should benefit from induction variable simplification after peeling too. Reviewers: sanjoy, bogner, mzolotukhin, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D43878 llvm-svn: 328301	2018-03-23 10:38:12 +00:00
Martin Storsjo	e1a64fe95c	[ARM] Error out on .arm assembler directives on windows Windows on arm is thumb only. Differential Revision: https://reviews.llvm.org/D43005 llvm-svn: 328298	2018-03-23 09:10:03 +00:00
Martin Storsjo	db75aa96d3	Revert "[DAGCombiner] Fold (zext (and/or/xor (shl/shr (load x), cst), cst))" This reverts commit r328252. This change broke building a number of projects when targeting ARM and AArch64, see PR36873. llvm-svn: 328297	2018-03-23 08:36:47 +00:00
Craig Topper	dfeea84d63	[X86] Give VPCMPEQQ the same itinerary as its SSE counterpart. llvm-svn: 328296	2018-03-23 06:58:55 +00:00
Craig Topper	4787b7f434	[X86] Correct the latencies of SNB integer vector multiplies based on Agner's data. Add missing MMX multiplies. llvm-svn: 328295	2018-03-23 06:41:43 +00:00
Craig Topper	659c66dfc1	[X86] Match vpblendvb/vblendvps/vblendvpd itineraries to the SSE equivalent. Change pblendvb/blendvps/blendvpd to use WriteFVarBlend llvm-svn: 328294	2018-03-23 06:41:41 +00:00
Craig Topper	7580a7997d	[X86] Change VPSADBW itinerary to SSE_INTALU_ITINS_P to match the SSE version. llvm-svn: 328293	2018-03-23 06:41:40 +00:00
Craig Topper	d5ac3ae8d3	[X86] Give VLDDQUrm and LDDQUrm the same itinerary. llvm-svn: 328292	2018-03-23 06:41:39 +00:00
Craig Topper	7f142b8bf1	[X86] Merge VMOVMSKBrr and MOVMSKBrr in the SNB scheduler model. The VMOVMSKBrr was in a separate InstRW with a lower latency, but I assume they should be the same and the higher latency matches Agner's table so I'm going with that. llvm-svn: 328291	2018-03-23 06:41:38 +00:00
Craig Topper	fae4173b47	[X86] Add VEXTRB/W/D/Q to Zen scheduler model. The SSE versions were present, but not the VEX version. llvm-svn: 328290	2018-03-23 06:41:36 +00:00
Craig Topper	6ef55d1887	[X86] Fix the itinerary for vextractps to match extractps. llvm-svn: 328289	2018-03-23 06:41:35 +00:00
Nirav Dave	5b3e8791b4	[DAG] Fix node id invalidation in Instruction Selection. Invalidation should be bit negation. Add missing negation. llvm-svn: 328287	2018-03-23 01:22:39 +00:00
Michael Zolotukhin	fab7a676c2	State that CFG is preserved in 'Falkor HW Prefetch Fix Late Phase'. That removes some redundant recomputations from the passes pipeline. llvm-svn: 328272	2018-03-22 23:44:40 +00:00
David Blaikie	301627f875	Move SampleProfile.h into IPO along with the rest of the IPO pass headers llvm-svn: 328262	2018-03-22 22:42:44 +00:00
Craig Topper	adb173314d	[X86] Correct the VROUND regular expressions in Znver1 scheduler model to account for r328254 llvm-svn: 328260	2018-03-22 22:17:11 +00:00
David Blaikie	376294c23a	Finish moving the IPSCCP pass from Scalar to IPO - moving the registration llvm-svn: 328259	2018-03-22 22:07:53 +00:00
Evgeny Stupachenko	579507a53a	Revert r325687 (workaround for PR36032). Summary: Revert r325687 workaround for PR36032 since a fix was committed in r326154. Reviewers: sbaranga Differential Revision: http://reviews.llvm.org/D44768 From: Evgeny Stupachenko <evstupac@gmail.com> <evgeny.v.stupachenko@intel.com> llvm-svn: 328257	2018-03-22 22:04:39 +00:00
Craig Topper	40d3b32e12	[X86] Rename VROUNDYPS* and VROUNDYPD* instructions to VROUNDPSY* and VROUNDPDY*. Fix itinerary mistake on all memory forms of VROUNDPD This makes the Y position consistent with other instructions. This should have been NFC, but while refactoring the multiclass I noticed that VROUNDPD memory forms were using the register itinerary. llvm-svn: 328254	2018-03-22 21:55:20 +00:00
Guozhi Wei	17ff975eb1	[DAGCombiner] Fold (zext (and/or/xor (shl/shr (load x), cst), cst)) In our real world application, we found the following optimization is missed in DAGCombiner: (zext (and/or/xor (shl/shr (load x), cst), cst)) -> (and/or/xor (shl/shr (zextload x), (zext cst)), (zext cst)) If the user of the original zext is an add, it may enable further lea optimization on x86. This patch adds a new function CombineZExtLogicopShiftLoad to do this optimization. Differential Revision: https://reviews.llvm.org/D44402 llvm-svn: 328252	2018-03-22 21:47:25 +00:00
David Blaikie	3bbf5af0ac	Fix layering between SCCP and IPO SCCP Transforms/Scalar/SCCP.cpp implemented both the Scalar and IPO SCCP, but this meant Transforms/Scalar including Transforms/IPO headers, creating a circular dependency. (IPO depends on Scalar already) - so move the IPO SCCP shims out into IPO and the basic library implementation accessible from Scalar/SCCP.h to be used from the IPO/SCCP.cpp implementation. llvm-svn: 328250	2018-03-22 21:41:29 +00:00
Roman Tereshin	d96de6f6ae	[MIR] Making MIR Printing, opt -dot-cfg, and -debug printing faster Value::printAsOperand has been scanning the entire module just to print a single value as an operand, regardless of whether it was asked to print a type at all, and regardless of whether it really needed to scan the module to print a type. It made some of the users of the method exceptionally slow on large IR-modules (or large MIR-files with large IR-modules embedded). This patch defers scanning a module looking for struct types, mostly numbered struct types, as much as possible, speeding up those users w/o changing any APIs at all. See speedup examples below: Release Build: # 83 seconds -> 5.5 seconds time ./bin/llc -start-before=irtranslator -stop-after=irtranslator \ -global-isel -global-isel-abort=2 -simplify-mir sqlite3.O0.ll -o \ sqlite3.O0.ll.regbankselected.mir # 133 seconds -> 6.2 seconds time ./bin/opt sqlite3.O0.ll -dot-cfg -disable-output Release + Asserts Build: # 95 seconds -> 5.5 seconds time ./bin/llc -start-before=irtranslator -stop-after=irtranslator \ -global-isel -global-isel-abort=2 -simplify-mir sqlite3.O0.ll -o \ sqlite3.O0.ll.regbankselected.mir # 146 seconds -> 6.2 seconds time ./bin/opt sqlite3.O0.ll -dot-cfg -disable-output # 1096 seconds -> 553 seconds time ./bin/llc -debug-only=isel -fast-isel=false -stop-after=isel \ sqlite3.O0.ll -o /dev/null 2> err where sqlite3.O0.ll is non-optimized IR produced from sqlite-amalgamation (http://sqlite.org/download.html), which is the entire SQLite3 implementation in a single C-file. Benchmarked on 4-cores / 8 threads PCI-E SSD iMac running macOS Reviewers: dexonsmith, bkramer, void, chandlerc, aditya_nandakumar, dsanders, qcolombet, Reviewed By: bogner Subscribers: thegameg, llvm-commits Differential Revision: https://reviews.llvm.org/D44132 llvm-svn: 328246	2018-03-22 21:29:07 +00:00
Mircea Trofin	29a21bab08	Revert "Revert "[InstrProf] Support for external functions in text format."" Summary: This reverts commit 364eb09576a7667bc6d3ff80c52a83014ccac976 and separates out the portion that was fixing binary reader error propagation - turns out, there are production cases where that causes a regression. Will re-introduce the error propagation fix separately. The fix to the text reader error propagation is still "in". Reviewers: bkramer Reviewed By: bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44807 llvm-svn: 328244	2018-03-22 21:26:52 +00:00
Craig Topper	58afb4ea58	[X86][SkylakeClient] Fix a bunch of instructions that were incorrectly assigned Port015 instead of Port01. The VEC ADD and VEC MUL units aren't present on port 5 on SkylakeClient. llvm-svn: 328241	2018-03-22 21:10:07 +00:00
Jessica Paquette	df82274f3c	[MachineOutliner][NFC] Refactoring + comments in runOnModule Split up some of the if/else branches in runOnModule. Elaborate on some comments. Replace a call to getOrCreateMachineFunction with getMachineFunction. This makes it clearer what's happening in runOnModule, and ensures that the outliner doesn't create any MachineFunctions which will never be used by the outliner (or anything else, really). llvm-svn: 328240	2018-03-22 21:07:09 +00:00
Jun Bum Lim	2ecb7ba4c6	[CodeGen] Add a new pass for PostRA sink Summary: This pass sinks COPY instructions into a successor block, if the COPY is not used in the current block and the COPY is live-in to a single successor (i.e., doesn't require the COPY to be duplicated). This avoids executing the copy on paths where its result isn't needed. This also exposes additional opportunities for dead copy elimination and shrink wrapping. These copies were either not handled by or are inserted after the MachineSink pass. As an example of the former case, the MachineSink pass cannot sink COPY instructions with allocatable source registers; for AArch64 these types of copy instructions are frequently used to move function parameters (PhyReg) into virtual registers in the entry block. For the machine IR below, this pass will sink %w19 in the entry into its successor (%bb.1) because %w19 is only live-in in %bb.1. ``` %bb.0: %wzr = SUBSWri %w1, 1 %w19 = COPY %w0 Bcc 11, %bb.2 %bb.1: Live Ins: %w19 BL @fun %w0 = ADDWrr %w0, %w19 RET %w0 %bb.2: %w0 = COPY %wzr RET %w0 ``` As we sink %w19 (CSR in AArch64) into %bb.1, the shrink-wrapping pass will be able to see %bb.0 as a candidate. With this change I observed 12% more shrink-wrapping candidates and 13% more dead copies deleted in spec2000/2006/2017 on AArch64. Reviewers: qcolombet, MatzeB, thegameg, mcrosier, gberry, hfinkel, john.brawn, twoh, RKSimon, sebpop, kparzysz Reviewed By: sebpop Subscribers: evandro, sebpop, sfertile, aemerson, mgorny, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D41463 llvm-svn: 328237	2018-03-22 20:06:47 +00:00
Paul Robinson	7947468e69	[DWARF] Replace assert with diagnostic. PR36868. llvm-svn: 328235	2018-03-22 19:37:56 +00:00
David Blaikie	2965a01e98	Move the initialization of the Meta Renamer pass over to IPO along with the rest of it that was moved in r328209 llvm-svn: 328234	2018-03-22 19:36:54 +00:00
Nirav Dave	8c5f47ac40	[DAG, X86] Fix ISel-time node insertion ids As in the SystemZ backend, correctly propagate node ids when inserting new unselected nodes into the DAG during instruction selection for the X86 target. Fixes PR36865. Reviewers: jyknight, craig.topper Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D44797 llvm-svn: 328233	2018-03-22 19:32:07 +00:00
Craig Topper	4a3be6e578	[X86] Correct the scheduling data for some of the 32 and 64 bit multiplies to match, as best as I understand it, how they are implemented. llvm-svn: 328231	2018-03-22 19:22:51 +00:00
Daniel Neilson	710d7b9945	[InstCombineCalls] Update deprecated API usage (NFC) Summary: Just updating a call to MemSetInst::getAlignment() to MemSetInst::getDestAlignment(). The former has been deprecated. llvm-svn: 328227	2018-03-22 18:36:15 +00:00
Simon Pilgrim	bcb86bb927	[X86][Btver2] Conversion, MaskedLoad/MaskedStore and NTStores all are scheduled through the JFPU1 pipe llvm-svn: 328226	2018-03-22 18:29:16 +00:00
Simon Pilgrim	0e031afa95	[X86][Btver2] FCMP (inc FMAX/FMIN) instructions use the JFPA functional pipe The ymm instructions are double pumped as well. llvm-svn: 328222	2018-03-22 17:43:12 +00:00
Zachary Turner	71d36ad9f9	[Codeview/PDB] Rename some methods for clarity. NFC, this just renames some methods to better express what they do, and also adds a few helper methods to add some symmetry to the API in a few places (for example there was a getStringFromId but not a getIdFromString method in the string table). llvm-svn: 328221	2018-03-22 17:37:28 +00:00
Aditya Nandakumar	b3297ef051	[GISel]: Fix incorrect IRTranslation while translating null pointer types https://reviews.llvm.org/D44762 Currently IRTranslator produces %vreg17<def>(p0) = G_CONSTANT 0; instead we should build %vreg16(s64) = G_CONSTANT 0 %vreg17(p0) = G_INTTOPTR %vreg16 reviewed by @aemerson. llvm-svn: 328218	2018-03-22 17:31:38 +00:00
Simon Pilgrim	e5b51f6786	[X86][Btver2] FMUL ymm instructions are double pumped on the JFPM functional pipe llvm-svn: 328217	2018-03-22 17:25:38 +00:00
Craig Topper	7ccb5ebed8	[ARM] Enable the full InstRW overlap check for ARMScheduleR52.td This fixes a few issues with the R52 instregexs to enable the full overlap checking Differential Revision: https://reviews.llvm.org/D44767 llvm-svn: 328216	2018-03-22 17:17:47 +00:00
Matt Morehouse	236cdaf84c	[SimplifyCFG] Create attribute for fuzzing-specific optimizations. Summary: When building with libFuzzer, converting control flow to selects or obscuring the original operands of CMPs reduces the effectiveness of libFuzzer's heuristics. This patch provides an attribute to disable or modify certain optimizations for optimal fuzzing signal. Provides a less aggressive alternative to https://reviews.llvm.org/D44057. Reviewers: vitalybuka, davide, arsenm, hfinkel Reviewed By: vitalybuka Subscribers: junbuml, mehdi_amini, wdng, javed.absar, hiraditya, llvm-commits, kcc Differential Revision: https://reviews.llvm.org/D44232 llvm-svn: 328214	2018-03-22 17:07:51 +00:00
Alexey Bataev	07254641d3	[DWARF] Add EmitDwarfOffset function, NFC. Added EmitDwarfOffset function after discussion with Eric Christopher. llvm-svn: 328212	2018-03-22 16:43:21 +00:00
Anna Thomas	9b1176b0ef	[LoopPredication] Add profitability check based on BPI Summary: LoopPredication is not profitable when the loop is known to always exit through some block other than the latch block. A coarse grained latch check can cause loop predication to predicate the loop, and unconditionally deoptimize. However, without predicating the loop, the guard may never fail within the loop during the dynamic execution because the non-latch loop termination condition exits the loop before the latch condition causes the loop to exit. We teach LP about this using BranchProfileInfo pass. Reviewers: apilipenko, skatkov, mkazantsev, reames Reviewed by: skatkov Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44667 llvm-svn: 328210	2018-03-22 16:03:59 +00:00
David Blaikie	0368417595	Move MetaRenamer from Transforms/Utils to Transforms/IPO since it implements part of IPO.h llvm-svn: 328209	2018-03-22 15:57:47 +00:00
Paul Robinson	938d9a0778	[DWARF] Fix mixing assembler -g with DWARF .file directives. We were effectively overriding an explicit '.file' directive with info for the assembler source. That shouldn't happen. Fixes PR36636, really, even for .s files emitted by Clang. Differential Revision: https://reviews.llvm.org/D44265 llvm-svn: 328208	2018-03-22 15:48:01 +00:00
Benjamin Kramer	de18a2e6ff	Revert "[InstrProf] Support for external functions in text format." This reverts commit r328132. Breaks FDO selfhost. I'm seeing error: /tmp/profraw: Invalid instrumentation profile data (bad magic) llvm-svn: 328207	2018-03-22 15:29:55 +00:00
Florian Hahn	9bc0bc4b9b	[CallSiteSplitting] Preserve DominatorTreeAnalysis. The dominator tree analysis can be preserved easily. Some other kinds of analysis can probably be preserved too. Reviewers: junbuml, dberlin Reviewed By: dberlin Differential Revision: https://reviews.llvm.org/D43173 llvm-svn: 328206	2018-03-22 15:23:33 +00:00
Sanjay Patel	3bf58317f7	[MC] fix documentation comments; NFC llvm-svn: 328205	2018-03-22 15:23:21 +00:00
Simon Pilgrim	53b2c3329a	[X86][SSE42] Use the default PCMPEST/PCMPIST scheduler classes directly. NFCI. Models were completely overriding all SSE42 strins instructions when the default classes could be used for exactly the same coverage. llvm-svn: 328203	2018-03-22 14:56:18 +00:00
Pavel Labath	79cd942c23	DWARFVerifier: verify debug_names abbreviation table Summary: This commit adds checks of the abbreviation table in a DWARF v5 Name Index. The most interesting/useful check is the one which checks that each index attribute is encoded using the correct form class, but it also checks for the more obvious errors like unknown forms/tags/attributes and duplicated attributes. Reviewers: JDevlieghere, aprantl, dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44736 llvm-svn: 328202	2018-03-22 14:50:44 +00:00
Sanjay Patel	94c91b78e7	[InstCombine] add folds for xor-of-icmp signbit tests (PR36682) This is a retry of r328119 which was reverted at r328145 because it could crash by trying to combine icmps with different operand types. This version has a check for that and additional tests. Original commit message: This is part of solving: https://bugs.llvm.org/show_bug.cgi?id=36682 There's also a leftover improvement from the long-ago-closed: https://bugs.llvm.org/show_bug.cgi?id=5438 https://rise4fun.com/Alive/dC1 llvm-svn: 328197	2018-03-22 14:08:16 +00:00
Simon Pilgrim	3b2ff1faa9	[X86][CLMUL] Use the default CLMUL scheduler classes directly. NFCI. Models were completely overriding all CLMUL instructions when the WriteCLMUL default classes could be used for exactly the same coverage. llvm-svn: 328194	2018-03-22 13:37:30 +00:00
Simon Pilgrim	6bdd6b32fd	[X86][CLMUL] Fix/add missing itinerary tags to (V)PCLMULQDQ instructions PCLMULQDQrm was using the rr itinerary. Difference in itineraries between PCLMULQDQ/VPCLMULQDQ variants was causing an unnecessary duplication of scheduler class entries. llvm-svn: 328193	2018-03-22 13:36:06 +00:00
Simon Pilgrim	7684e055b3	[X86] Use the default AES scheduler classes directly. NFCI. Models were completely overriding all AES instructions when the WriteAES default classes could be used for exactly the same coverage. Removes 6 unnecessary scheduler classes from every model. Note: Still looking for a way for tblgen to warn when this is happening - often the override is more complete than the default. llvm-svn: 328192	2018-03-22 13:18:08 +00:00
Florian Hahn	3bb822e7d6	[CloneFunction] Preserve DT in DuplicateInstructionsInSplitBetween. DuplicateInstructionsInSplitBetween can preserve the DT by passing through DT to SplitEdge. Reviewers: sanjoy, junbuml, anna, kuhar Reviewed By: kuhar Differential Revision: https://reviews.llvm.org/D44629 llvm-svn: 328189	2018-03-22 11:38:53 +00:00
Craig Topper	df7855fc8d	[X86] Remove unused SchedWriteRes classes. NFC llvm-svn: 328181	2018-03-22 04:52:08 +00:00
Craig Topper	fc179c6dd5	[X86][Skylake] Merge multiple InstrRW entries that map to the same SchedWriteRes group (NFCI) (PR35955) I've also merged some VEX/non-VEX instregex strings with a (V?) prefix or (Y?) ymm variant - there are still a lot more of these to do. This reduces the size of the optimized llc binary on my computer by 400K. Presumably because we went from 5000+ scheduler classes per CPU to ~2000. llvm-svn: 328179	2018-03-22 04:23:41 +00:00
Aaron Smith	523de05a1f	[DIA] Add IPDBSectionContrib interfaces and DIA implementation To resolve symbol context at a particular address, we need to determine the compiland for the address. We are able to determine the parent compiland of PDBSymbolFunc, PDBSymbolTypeUDT, PDBSymbolTypeEnum symbols indirectly through line information. However no such information is available for PDBSymbolData, i.e. variables. The Section Contribution table from PDBs has information about each compiland's contribution to sections by address. For example, a piece of a contribution looks like, VA RelativeVA Sect No. Offset Length Compiland 14000087B0 000087B0 0001 000077B0 000000BB exe_main.obj So given an address, it's possible to determine its compiland with this information. llvm-svn: 328178	2018-03-22 04:08:15 +00:00
Aaron Smith	58a32a478f	[PDB] Get more DIA table enumerators Rename the original function and make it a static template. llvm-svn: 328177	2018-03-22 03:57:06 +00:00
David Blaikie	2be3922807	Fix a couple of layering violations in Transforms Remove #include of Transforms/Scalar.h from Transform/Utils to fix layering. Transforms depends on Transforms/Utils, not the other way around. So remove the header and the "createStripGCRelocatesPass" function declaration (& definition) that is unused and motivated this dependency. Move Transforms/Utils/Local.h into Analysis because it's used by Analysis/MemoryBuiltins.cpp. llvm-svn: 328165	2018-03-21 22:34:23 +00:00
Mircea Trofin	556f8f6adc	[InstrProf] Encapsulates access to AddrToMD5Map. Summary: This fixes a unittest failure introduced by D44717. D44717 introduced lazy sorting of the internal data structures of the symbol table. The AddrToMD5Map getter was potentially exposing inconsistent (unsorted) state. We could sort in the accessor, however, a client may store the pointer and thus bypass the internal state management of the symbol table. The alternative in this CL blocks direct access to the state, thus ensuring consistent externally-observable state. Reviewers: davidxl, xur, eraman Reviewed By: xur Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44757 llvm-svn: 328163	2018-03-21 22:27:31 +00:00
Zachary Turner	eb62999455	[PDB] Don't ignore bucket 0 when writing the PDB string table. The hash table is a list of buckets, and the value stored in the bucket cannot be 0 since that is reserved. However, the code here was incorrectly skipping over the 0'th bucket entirely. The 0'th bucket is perfectly fine, just none of these buckets can contain the value 0. As a result, whenever there was a string where hash(S) % Size was equal to 0, we would write the value in the next bucket instead. We never caught this in our tests due to another bug, which is that we would iterate the entire list of buckets looking for the value, only using the hash value as a starting point. However, the real algorithm stops when it finds 0 in a bucket since it takes that to mean "the item is not in the hash table". The unit test is updated to carefully construct a set of hash values that will cause one item to hash to 0 mod bucket count, and the reader is also updated to return an error indicating that the item is not found when it encounters a 0 bucket. llvm-svn: 328162	2018-03-21 22:23:59 +00:00
Artem Belevich	30512869ff	[NVPTX] Make tensor shape part of WMMA intrinsic's name. This is needed for the upcoming implementation of the new 8x32x16 and 32x8x16 variants of WMMA instructions introduced in CUDA 9.1. Differential Revision: https://reviews.llvm.org/D44719 llvm-svn: 328158	2018-03-21 21:55:02 +00:00
Reid Kleckner	8562c1a198	[PDB] Remove unused private variable, re-applying r327900 after relanding more natvis changes llvm-svn: 328156	2018-03-21 21:47:26 +00:00
Reid Kleckner	440219d53e	[WebAssembly] Really disable wasm register name matcher The "ShouldEmitMatchRegisterName" bit wasn't taking effect because the WebAssembly target didn't point to the custom WebAssemblyAsmParser record. llvm-svn: 328155	2018-03-21 21:46:47 +00:00
Rafael Espindola	c51dc906ea	Handle abbr_offset with relocations. This is mostly just plumbing to get a DWARFDataExtractor where we compute abbr_offset so we can use getRelocatedValue. This is part of PR36793. llvm-svn: 328154	2018-03-21 21:31:25 +00:00
Reid Kleckner	762331be07	Revert r328119 "[InstCombine] add folds for xor-of-icmp signbit tests (PR36682)" This asserts when compiling safe_numerics_unittest.cpp in Chromium with MSan. llvm-svn: 328145	2018-03-21 20:35:36 +00:00
Sanjay Patel	e235942a1e	[InstSimplify] fp_binop X, NaN --> NaN We propagate the existing NaN value when possible. Differential Revision: https://reviews.llvm.org/D44521 llvm-svn: 328140	2018-03-21 19:31:53 +00:00
Craig Topper	2854dc93e1	[X86] Rewrite getOperandBias in X86BaseInfo.h to be a little more structured and update comments to be more clear about what it does. NFC llvm-svn: 328136	2018-03-21 19:30:28 +00:00
David Blaikie	8820929011	Sink Analysis/ObjectUtil(canBeOmittedFromSymbolTable) into IR so it can legitimately be used by Object/IRSymtab llvm-svn: 328135	2018-03-21 19:23:45 +00:00
Mircea Trofin	71349ff07d	[InstrProf] Support for external functions in text format. Summary: External functions appearing as indirect call targets could not be found in the SymTab, and the value:counter record was represented, in the text format, using an empty string for the name. This would then cause a silent parsing error when reading. This CL: - adds explicit support for such functions - fixes the places where we would not propagate errors when reading - addresses a performance issue due to eager resorting of the SymTab. Reviewers: xur, eraman, davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44717 llvm-svn: 328132	2018-03-21 19:06:06 +00:00
David Blaikie	99e172532c	Reapply Support layering fixes. Compiler.h is used by Demangle (which Support depends on) - so sink it into Demangle to avoid a circular dependency DataTypes.h is used by llvm-c (which Support depends on) - so sink it into llvm-c. DataTypes.h could probably be fixed the other way - making llvm-c depend on Support instead of Support depending on llvm-c - if anyone feels that's the better option, happy to work with them on that. I /think/ this'll address the layering issues that previous attempts to commit this have triggered in the Modules buildbot, but I haven't been able to reproduce that build so can't say for sure. If anyone's having trouble with this - it might be worth taking a look to see if there's a quick fix/something small I missed rather than revert, but no worries. llvm-svn: 328123	2018-03-21 17:31:49 +00:00
Krzysztof Parzyszek	b4bb75d6ad	[Hexagon] Generalize DAG mutation for function calls Add barrier edges to check for any physical register. The previous code worked for the function return registers: r0/d0, v0/w0. Patch by Brendon Cahoon. llvm-svn: 328120	2018-03-21 17:23:32 +00:00
Sanjay Patel	778032f39d	[InstCombine] add folds for xor-of-icmp signbit tests (PR36682) This is part of solving: https://bugs.llvm.org/show_bug.cgi?id=36682 There's also a leftover improvement from the long-ago-closed: https://bugs.llvm.org/show_bug.cgi?id=5438 https://rise4fun.com/Alive/dC1 llvm-svn: 328119	2018-03-21 17:17:13 +00:00
Nicolai Haehnle	87aec1b194	TableGen: Remove redundant loop in ListInit::resolveReferences Summary: Recursive lookups are handled by the Resolver, so the loop was purely a waste of runtime. Change-Id: I2bd23a68b478aea0bbac1a86ca7635adffa28688 Reviewers: arsenm, craig.topper, tra, MartinO Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D44624 llvm-svn: 328118	2018-03-21 17:13:10 +00:00
Nicolai Haehnle	420e28c78c	TableGen: Streamline how defs are instantiated Summary: Instantiating def's and defm's needs to perform the following steps: - for defm's, clone multiclass def prototypes and substitute template args - for def's and defm's, add subclass definitions, substituting template args - clone the record based on foreach loops and substitute loop iteration variables - override record variables based on the global 'let' stack - resolve the record name (this should be simple, but unfortunately it's not due to existing .td files relying on rather silly implementation details) - for def(m)s in multiclasses, add the unresolved record as a multiclass prototype - for top-level def(m)s, resolve all internal variable references and add them to the record keeper and any active defsets This change streamlines how we go through these steps, by having both def's and defm's feed into a single addDef() method that handles foreach, final resolve, and routing the record to the right place. This happens to make foreach inside of multiclasses work, as the new test case demonstrates. Previously, foreach inside multiclasses was not forbidden by the parser, but it was de facto broken. Another side effect is that the order of "instantiated from" notes in error messages is reversed, as the modified test case shows. This is arguably clearer, since the initial error message ends up pointing directly to whatever triggered the error, and subsequent notes will point to increasingly outer layers of multiclasses. This is consistent with how C++ compilers report nested #includes and nested template instantiations. Change-Id: Ica146d0db2bc133dd7ed88054371becf24320447 Reviewers: arsenm, craig.topper, tra, MartinO Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D44478 llvm-svn: 328117	2018-03-21 17:12:53 +00:00
Krzysztof Parzyszek	c715a5d2b8	[Hexagon] Eliminate subregisters from PHI nodes before pipelining The pipeliner needs to remove instructions from the SlotIndexes structure when they are deleted. Otherwise, the SlotIndexes map has stale data, and an assert will occur when adding new instructions. This patch also changes the pipeliner to make the back-edge of a loop carried dependence 1 cycle. The 1 cycle latency is added to the anti-dependence that represents the back-edge. This change eliminates a couple of hacks added to the pipeliner to handle the latency of the back-edge. It is needed to correctly pipeline the test case for the sub-register elimination pass. llvm-svn: 328113	2018-03-21 16:39:11 +00:00
Reid Kleckner	18574836f9	[WebAssembly] Suppress unused function warning for register name matcher llvm-svn: 328112	2018-03-21 16:20:58 +00:00
Simon Pilgrim	ec2f878779	[X86][Haswell] Merge multiple InstrRW entries that map to the same SchedWriteRes group (NFCI) (PR35955) I've also merged some VEX/non-VEX instregex strings with a (V?) prefix or (Y?) ymm variant - there are still a lot more of these to do. llvm-svn: 328111	2018-03-21 16:19:03 +00:00
Simon Pilgrim	96b605d34c	[X86][SandyBridge] Merge more VEX/non-VEX instregex patterns (NFCI) (PR35955) llvm-svn: 328110	2018-03-21 16:05:58 +00:00
Alex Bradbury	65d6ea5e68	[RISCV] Codegen support for RV32F floating point comparison operations This patch also includes extensive tests targeted at select and br+fcmp IR inputs. A sequence of br+fcmp required support for FPR32 registers to be added to RISCVInstrInfo::storeRegToStackSlot and RISCVInstrInfo::loadRegFromStackSlot. llvm-svn: 328104	2018-03-21 15:11:02 +00:00
Daniel Neilson	6f1eb58e92	[MemCpyOpt] Update to new API for memory intrinsic alignment Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the MemCpyOpt pass to cease using: 1) The old getAlignment() API of MemoryIntrinsic in favour of getting source & dest specific alignments through the new API. 2) The old IRBuilder CreateMemCpy/CreateMemMove single-alignment APIs in favour of the new API that allows setting source and destination alignments independently. We also add a few tests to fill gaps in the testing of this pass. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use MemIntrinsicInst::[get\|set]Alignment() to use [get\|set]DestAlignment() and [get\|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774, rL324781, rL324784, rL324955, rL324960, rL325816, rL327398, rL327421 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 328097	2018-03-21 14:14:55 +00:00
Justin Lebar	038cbc5c13	Re-re-land: Teach CorrelatedValuePropagation to reduce the width of udiv/urem instructions. Summary: If the operands of a udiv/urem can be proved to fit within a smaller power-of-two-sized type, reduce the width of the udiv/urem. Backed out for causing performance regressions. Re-landing because we've determined that these regressions were noise. Original Differential Revision: https://reviews.llvm.org/D44102 llvm-svn: 328096	2018-03-21 14:08:21 +00:00
Pavel Labath	646ab4113f	Fix build broken by r328090 - constexpr is needed for out-of-class definition of the Type static member by some compilers - MSVC is confused by the initialization of the static constexpr char[] member when it happens in a template specialization. Explicitly specifying the length of the array seems to be enough to help it figure things out. llvm-svn: 328093	2018-03-21 12:18:03 +00:00
Pavel Labath	9025f9559d	[dwarf] Unify unknown dwarf enum formatting code Summary: We have had at least three pieces of code (in DWARFAbbreviationDeclaration, DWARFAcceleratorTable and DWARFDie) that have hand-rolled support for dumping unknown dwarf enum values. While not terrible, they are a bit distracting and enable small differences to creep in (Unknown_ffff vs. Unknown_0xffff). I ended up needing to add a fourth place (DWARFVerifier), so it seems it would be a good time to centralize. This patch creates an alternative to the XXXString dumping functions in the BinaryFormat library, which formats an unknown value as DW_TYPE_unknown_1234, instead of just an empty string. It is based on the formatv function, as that allows us to avoid materializing the string for unknown values (and because this way I don't have to invent a name for the new functions :P). In this patch I add formatters for dwarf attributes, forms, tags, and index attributes as these are the ones in use currently, but adding other enums is straight-forward. Reviewers: dblaikie, JDevlieghere, aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44570 llvm-svn: 328090	2018-03-21 11:46:37 +00:00
Jonas Devlieghere	2fdda6882f	Revert layering changes This reverts: r328072 "Move Compiler.h from Support to Demangler to fix layering." r328073 "Fix the actual user of DataTypes.h in llvm-c to avoid the circular dependency" Failing bots: http://green.lab.llvm.org/green/job/clang-stage2-coverage-R/ http://green.lab.llvm.org/green/job/clang-stage2-configure-Rlto/ llvm-svn: 328085	2018-03-21 10:35:09 +00:00
Bjorn Pettersson	5c25f88536	[SelectionDAG] Support multiple dangling debug info for one value Summary: When building the selection DAG we sometimes need to postpone the handling of a dbg.value until the value it should refer to is created. This is done by using the DanglingDebugInfoMap. In the past this map has been limited to hold one dangling dbg.value per value. This patch removes that restriction. Reviewers: aprantl, rnk, probinson, vsk Reviewed By: aprantl Subscribers: Ka-Ka, llvm-commits, JDevlieghere Tags: #debug-info Differential Revision: https://reviews.llvm.org/D44610 llvm-svn: 328084	2018-03-21 09:44:34 +00:00
Craig Topper	5a69a0011b	[X86][Broadwell] Merge multiple InstrRW entries that map to the same SchedWriteRes group (NFCI) (PR35955) llvm-svn: 328076	2018-03-21 06:28:42 +00:00
David Blaikie	6336aedd39	Move Compiler.h from Support to Demangler to fix layering. Support depends on Demangle (Support/Unix/Signals.inc), so Demangle including Support/Compiler.h created a circular dependency. Leave a forwarding shim of Compiler.h because it makes more sense for users (a deeper fix might involve splitting Support into lower and upper Support - but that also sounds a bit weird/awkward) than thinking about the dependency on the Demangler. llvm-svn: 328072	2018-03-21 04:07:05 +00:00
Craig Topper	137a4dd84d	[X86] Fix the SchedRW for XOP vpcom register form instructions to not be marked as loads. llvm-svn: 328071	2018-03-21 03:41:33 +00:00
Craig Topper	d25f1acf67	[X86] Change PMULLD to 10 cycles on Skylake per Agner's tables and llvm-exegesis. Also restrict to port 0 and 1 for SkylakeClient. It looks like the scheduler models don't account for client not having a full vector ALU on port 5 like server. Fixes PR36808. llvm-svn: 328061	2018-03-20 23:39:48 +00:00
Philip Reames	37a1a29fcb	[MustExecute] Show the effect of using full loop info variant Most basic possible test for the logic used by LICM. Also contains a speculative build fix for compiles which complain about a definition of a struct K; followed by a declaration as class K; llvm-svn: 328058	2018-03-20 23:00:54 +00:00
Derek Schuff	73a98f5eea	[WebAssembly] Update torture compile test expectations The tests compile after r328049 llvm-svn: 328057	2018-03-20 23:00:13 +00:00
Philip Reames	23aed5ef6f	[MustExecute] Move isGuaranteedToExecute and related routines to Analysis Next step is to actually merge the implementations and get both implementations tested through the new printer. llvm-svn: 328055	2018-03-20 22:45:23 +00:00
Derek Schuff	39b5367cba	[WebAssembly] Strip threadlocal attribute from globals in single thread mode The default thread model for wasm is single, and in this mode thread-local global variables can be lowered identically to non-thread-local variables. Differential Revision: https://reviews.llvm.org/D44703 llvm-svn: 328049	2018-03-20 22:01:32 +00:00
Simon Pilgrim	572bfa562a	[X86] Drop unnecessary InstRW overrides for WriteFMA As noticed on D44687, these already match the WriteFMA def so can be removed. llvm-svn: 328045	2018-03-20 21:15:23 +00:00
Craig Topper	0f110a88be	[ReachingDefAnalysis] Fix what I assume to be a typo ReachingDedDefaultVal->ReachingDefDefaultVal. Unless Ded has some meaning I don't know about. llvm-svn: 328043	2018-03-20 20:53:21 +00:00
Shoaib Meenai	3f689c8632	[ObjCARC] Add funclet token to ARC marker The inline assembly generated for the ARC autorelease elision marker must have a funclet token if it's emitted inside a funclet, otherwise the inline assembly (and all subsequent code in the funclet) will be marked unreachable by WinEHPrepare. Note that this only applies for the non-O0 case, since at O0, clang emits the autorelease elision marker itself rather than deferring to the backend. The fix for clang is handled in a separate change. Differential Revision: https://reviews.llvm.org/D44641 llvm-svn: 328042	2018-03-20 20:45:41 +00:00
Martin Storsjo	07589fc496	[X86] Don't use the MSVC stack protector names on mingw Mingw uses the same stack protector functions as GCC provides on other platforms as well. Patch by Valentin Churavy! Differential Revision: https://reviews.llvm.org/D27296 llvm-svn: 328039	2018-03-20 20:37:51 +00:00
Alexey Bataev	858a7dd6d7	[DEBUGINFO] Add -no-dwarf-debug-ranges option. Summary: Added the -no-dwarf-debug-ranges option to disable emission of .debug_ranges section. Reviewers: probinson, echristo Subscribers: aprantl, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D44384 llvm-svn: 328030	2018-03-20 20:21:38 +00:00
Derek Schuff	e4825975d8	[WebAssembly] Added initial AsmParser implementation. It uses the MC framework and the tablegen matcher to do the heavy lifting. Can handle both explicit and implicit locals (-disable-wasm-explicit-locals). Comes with a small regression test. This is a first basic implementation that can parse most llvm .s output and round-trips most instructions successfully, but in order to keep the commit small, does not address all issues. There are a fair number of mismatches between what MC / assembly matcher think a "CPU" should look like and what WASM provides, some already have workarounds in this commit (e.g. the way it deals with register operands) and some that require further work. Some of that further work may involve changing what the Disassembler outputs (and what s2wasm parses), so are probably best left to followups. Some known things missing: - Many directives are ignored and not emitted. - Vararg calls are parsed but extra args not emitted. - Loop signatures are likely incorrect. - $drop= is not emitted. - Disassembler does not output SIMD types correctly, so assembler can't test them. Patch by Wouter van Oortmerssen Differential Revision: https://reviews.llvm.org/D44329 llvm-svn: 328028	2018-03-20 20:06:35 +00:00
Evandro Menezes	36afbee1d8	[AArch64] Adjust the cost model for Exynos M3 Fix typo in the number of integer dividers. llvm-svn: 328027	2018-03-20 20:00:29 +00:00
Krzysztof Parzyszek	65059ee284	[Hexagon] Add heuristic to exclude critical path cost for scheduling Patch by Brendon Cahoon. llvm-svn: 328022	2018-03-20 19:26:27 +00:00
Krzysztof Parzyszek	9315c0de9b	[Hexagon] Fix fall-through warnings in HexagonMCDuplexInfo.cpp llvm-svn: 328021	2018-03-20 19:23:18 +00:00
Nirav Dave	ce71989188	[MC,X86] Cleanup some X86 parser functions to use MCParser helpers. NFCI. llvm-svn: 328019	2018-03-20 19:12:41 +00:00
Craig Topper	c2dbd677bd	[PowerPC][LegalizeFloatTypes] Move the PPC hacks for (i32 fp_to_sint/fp_to_uint (ppcf128 X)) out of LegalizeFloatTypes and into PPC specific code I'm not entirely sure these hacks are still needed. If you remove the hacks completely, the name of the library call that gets generated doesn't match the grep the test previously had. So the test wasn't really checking anything. If the hack is still needed it belongs in PPC specific code. I believe the FP_TO_SINT code here is the only place in the tree where a FP_ROUND_INREG node is created today. And I don't think it's even being used correctly because the legalization returned a BUILD_PAIR with the same value twice. That doesn't seem right to me. By moving the code entirely to PPC we can avoid creating the FP_ROUND_INREG at all. I replaced the grep in the existing test with full checks generated by hacking update_llc_test_check.py to support ppc32 just long enough to generate it. Differential Revision: https://reviews.llvm.org/D44061 llvm-svn: 328017	2018-03-20 18:49:28 +00:00
Krzysztof Parzyszek	eb0c510ecd	[X86] Add phony registers for high halves of regs with low halves Registers E[A-D]X, E[SD]I, E[BS]P, and EIP have 16-bit subregisters that cover the low halves of these registers. This change adds artificial subregisters for the high halves in order to differentiate (in terms of register units) between the 32- and the low 16-bit registers. This patch contains parts that aim to preserve the calculated register pressure. This is in order to preserve the current codegen (minimize the impact of this patch). The approach of having artificial subregisters could be used to fix PR23423, but the pressure calculation would need to be changed. Differential Revision: https://reviews.llvm.org/D43353 llvm-svn: 328016	2018-03-20 18:46:55 +00:00
Philip Reames	ce998adf0a	[MustExecute] Use the annotation style printer As suggested in the original review (https://reviews.llvm.org/D44524), use an annotation style printer instead. Note: The switch from -analyze to -disable-output in tests was driven by the fact that this seems to be the idiomatic style used in annotation passes. I tried to keep both working, but the old style pass API for printers really doesn't make this easy. It invokes (runOnFunction, print(Module)) repeatedly. I decided the extra state wasn't worth it given the old pass manager is going away soonish anyway. llvm-svn: 328015	2018-03-20 18:43:44 +00:00
Zachary Turner	fced530650	Revert "Resubmit "Support embedding natvis files in PDBs."" This is still failing on a different bot this time due to some issue related to hashing absolute paths. Reverting until I can figure it out. llvm-svn: 328014	2018-03-20 18:37:03 +00:00
Artem Belevich	914d4babec	[NVPTX] Make tensor load/store intrinsics overloaded. This way we can support address-space specific variants without explicitly encoding the space in the name of the intrinsic. Fewer intrinsics to deal with -> less boilerplate. Added a bit of tablegen magic to match/replace an intrinsic with a pointer argument in a particular address space with the space-specific instruction variant. Updated tests to use non-default address spaces. Differential Revision: https://reviews.llvm.org/D43268 llvm-svn: 328006	2018-03-20 17:18:59 +00:00
Philip Reames	89f2241770	Add an analysis printer for must execute reasoning Many of our loop passes make use of so called "must execute" or "guaranteed to execute" facts to prove the legality of code motion. The basic notion is that we know (by assumption) an instruction didn't fault at its original location, so if the location we move it to is strictly post dominated by the original, then we can't have introduced a new fault. At the moment, the testing for this logic is somewhat ad hoc and done mostly through LICM. Since I'm working on that code, I want to improve the testing. This patch is the first step in that direction. It doesn't actually test the variant used by the loop passes - I need to move that to the Analysis library first - but instead exercises an alternate implementation used by SCEV. (I plan on merging both implementations.) Note: I'll be replacing the printing logic within this with an annotation based version in the near future. Anna suggested this in review, and it seems like a strictly better format. Differential Revision: https://reviews.llvm.org/D44524 llvm-svn: 328004	2018-03-20 17:09:21 +00:00
Zachary Turner	132d7a134f	Resubmit "Support embedding natvis files in PDBs." The issue causing this to fail in certain configurations should be fixed. It was due to the fact that DIA apparently expects there to be a null string at ID 1 in the string table. I'm not sure why this is important but it seems to make a difference, so set it. llvm-svn: 328002	2018-03-20 17:06:39 +00:00
Krzysztof Parzyszek	4c6b65f685	[Hexagon] Correct the computation of TopReadyCycle and BotReadyCycle of SU TopReadyCycle and BotReadyCycle were off by one cycle when an SU is either the first instruction or the last instruction in a packet. Patch by Ikhlas Ajbar. llvm-svn: 328000	2018-03-20 17:03:27 +00:00
Michael Zolotukhin	fb3f509e01	[XRay] Lazily compute MachineLoopInfo instead of requiring it. Summary: Currently X-Ray Instrumentation pass has a dependency on MachineLoopInfo (and thus on MachineDominatorTree as well) and we have to compute them even if X-Ray is not used. This patch changes it to a lazy computation to save compile time by avoiding these redundant computations. Reviewers: dberris, kubamracek Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D44666 llvm-svn: 327999	2018-03-20 17:02:29 +00:00
Krzysztof Parzyszek	73be83dec5	[Hexagon] Check weak dependences when only 1 instruction is available Patch by Brendon Cahoon. llvm-svn: 327997	2018-03-20 16:22:06 +00:00
Alexey Bataev	648ed2dedb	[DEBUGINFO] Add flag -no-dwarf-pub-sections to disable pub sections. Summary: Added a flag -no-dwarf-pub-sections, which disables emission of DWARF public sections. Reviewers: probinson, echristo Subscribers: aprantl, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D44385 llvm-svn: 327994	2018-03-20 16:04:40 +00:00
Simon Pilgrim	62690e9d0e	[X86][Haswell][Znver1] Fix typo in fldl instregexs Missing comma was causing 2 instregex entries to be concatenated together by mistake. Found while investigating PR35548 llvm-svn: 327992	2018-03-20 15:44:47 +00:00
Krzysztof Parzyszek	5ffd808a27	[Hexagon] Improve scheduling heuristic for large basic blocks This patch changes the isLatencyBound heuristic to look at the path length based upon the number of packets needed to schedule a basic block. For small basic blocks, the heuristic uses a small threshold for isLatencyBound. For large basic blocks, the heuristic uses a large threshold. The goal is to increase the priority of an instruction in a small basic block that has a large height or depth relative to the code size. For large functions, the height and depth are ignored because it increases the live range of a register and causes more spills. That is, for large functions, it is more important to schedule instructions when available, and attempt to keep the defs and uses closer together. Patch by Brendon Cahoon. llvm-svn: 327987	2018-03-20 14:54:01 +00:00
Geoff Berry	0b64402adb	[AArch64][Falkor] Correct load/store increment scheduling details llvm-svn: 327982	2018-03-20 13:46:35 +00:00
Krzysztof Parzyszek	2c4231d888	[Hexagon] Fix division by zero in machine scheduler llvm-svn: 327980	2018-03-20 13:28:46 +00:00
Alex Bradbury	80c8eb7696	[RISCV] Add codegen for RV32F floating point load/store As part of this, add support for load/store from the constant pool. This is used to materialise f32 constants. llvm-svn: 327979	2018-03-20 13:26:12 +00:00
Alex Bradbury	76c29ee815	[RISCV] Add codegen for RV32F arithmetic and conversion operations Currently, only a soft floating point ABI is supported. llvm-svn: 327976	2018-03-20 12:45:35 +00:00
Krzysztof Parzyszek	dca383123f	[Hexagon] Improve scheduling based on register pressure Patch by Brendon Cahoon. llvm-svn: 327975	2018-03-20 12:28:43 +00:00
Simon Pilgrim	4a83f802cc	[X86][SandyBridge] Merge multiple InstrRW entries that map to the same SchedWriteRes group (NFCI) (PR35955) I've also merged some VEX/non-VEX instregex strings with a (V?) prefix - there are still a lot more of these to do. llvm-svn: 327974	2018-03-20 12:26:55 +00:00
Xin Tong	a713ebea24	[MergeICmps] Break eagerly out of loop llvm-svn: 327972	2018-03-20 12:03:25 +00:00
Xin Tong	bdbd97ed9a	[MergeICmp] Fix a bug in entry block shuffled to middle of the chain Summary: Fix a bug in entry block shuffled to middle of the chain. Reviewers: davide, courbet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44642 llvm-svn: 327971	2018-03-20 11:57:54 +00:00
Igor Laevsky	3ce2d7f270	[llvm-opt-fuzzer] Add irce to the fuzzing options llvm-svn: 327969	2018-03-20 11:32:13 +00:00
Bjorn Pettersson	bf3213e485	[CGP] Avoid segmentation fault when doing PHI node simplifications Summary: Made PHI node simplifications more robust in several ways: - Minor refactoring to let the SimplificationTracker own the sets with new PHI/Select nodes that are introduced. This is maybe not mapping to the original intention with the SimplificationTracker, but IMHO it encapsulates the logic behind those sets a little bit better. - MatchPhiNode can sometimes populate the Matched set with several entries, where it maps one PHI node to different candidates for replacement. The Matched set is changed into a SmallSetVector to make sure we get a deterministic iteration when doing the replacements. - As described above we may get several different replacements for a single PHI node. The loop in MatchPhiSet that is doing the replacements could end up calling eraseFromParent several times for the same PHI node, resulting in segmentation faults. This problem was supposed to be fixed in rL327250, but due to the non-determinism(?) it only appeared to be fixed (I still got crashes sometimes when turning on/off -print-after-all etc to get different iteration order in the DenseSets). With this patch we follow the deterministic ordering in the Matched set when replacing the PHI nodes. If we find a new replacement for an already replaced PHI node we replace the new replacement by the old replacement instead. This is quite similar to what happened in the rL327250 patch, but here we also recursively verify that the old replacement hasn't been replaced already. - It was really hard to track down the fault described above (segmentation fault due to doing eraseFromParent multiple times for the same instruction). The fault was intermittent and small changes in the code, or simply turning on -print-after-all etc could make the problem go away. This was basically due to the iteration over PhiNodesToMatch in MatchPhiSet not being deterministic. Therefore I've changed the data structure for the SimplificationTracker::AllPhiNodes into a SmallSetVector. This gives a deterministic behavior. Reviewers: skatkov, john.brawn Reviewed By: skatkov Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44571 llvm-svn: 327961	2018-03-20 09:06:37 +00:00
Andrei Elovikov	8b8253fdc7	[LV] Let recordVectorLoopValueForInductionCast to check if IV was created from the cast. Summary: It turned out to be error-prone to expect the callers to handle that - better to leave the decision to this routine and make the required data to be explicitly passed to the function. This handles the case that was missed in the r322473 and fixes the assert mentioned in PR36524. Reviewers: dorit, mssimpso, Ayal, dcaballe Reviewed By: dcaballe Subscribers: Ka-Ka, hiraditya, dneilson, hsaito, llvm-commits Differential Revision: https://reviews.llvm.org/D43812 llvm-svn: 327960	2018-03-20 09:04:39 +00:00
Martin Storsjo	802b434156	[X86] Properly implement the calling convention for f80 for mingw/x86_64 In these cases, both parameters and return values are passed as a pointer to a stack allocation. MSVC doesn't use the f80 data type at all, while it is used for long doubles on mingw. Normally, this part of the calling convention is handled within clang, but for intrinsics that are lowered to libcalls, it may need to be handled within llvm as well. Differential Revision: https://reviews.llvm.org/D44592 llvm-svn: 327957	2018-03-20 06:19:38 +00:00
Lang Hames	2c83285716	[ORC] Don't fully qualify explicit destructor call -- it confuses some compilers. This should fix the builder failure at http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/19224 llvm-svn: 327955	2018-03-20 05:56:58 +00:00
Craig Topper	ad7c685791	[X86] Rename MOVSX32_NOREXrr8 to MOVSX32rr8_NOREX so that the scheduler model regular expressions will pick it up with the regular version. Do the same for MOVSX32_NOREXrm8, MOVZX32_NOREXrr8, and MOVZX32_NOREXrm8 llvm-svn: 327948	2018-03-20 05:00:20 +00:00
Craig Topper	4778fa7e8a	[X86] Fix the SchedRW for memory forms of CMP and TEST. They were incorrectly marked as RMW operations. Some of the CMP instructions worked, but the ones that use a similar encoding as RMW form of ADD ended up marked as RMW. TEST used the same tablegen class as some of the CMPs. llvm-svn: 327947	2018-03-20 03:55:17 +00:00
Lang Hames	4cca7d229e	[ORC] Rename SymbolSource to MaterializationUnit, and make the materialization operation all-or-nothing, rather than allowing materialization on a per-symbol basis. This addresses a shortcoming of per-symbol materialization: If a MaterializationUnit (/SymbolSource) wants to materialize more symbols than requested (which is likely: most materializers will want to materialize whole modules) then it needs a way to notify the symbol table about the extra symbols being materialized. This process (checking what has been requested against what is being provided and notifying the symbol table about the difference) has to be repeated at every level of the JIT stack. Making materialization all-or-nothing eliminates this issue, simplifying both materializer implementations and the symbol table (VSO class) API. The cost is that per-symbol materialization (e.g. for individual symbols in a module) now requires multiple MaterializationUnits. llvm-svn: 327946	2018-03-20 03:49:29 +00:00
Craig Topper	3e9462607e	[X86] Add TEST16mi/TEST32mi/TEST64mi32 to the Sandybridge/Haswell/Broadwell/Skylake scheduler models. Move it from a load+store group on SNB to a load only group, the same group as CMP. llvm-svn: 327944	2018-03-20 03:02:03 +00:00
Craig Topper	7c90e29cf8	[X86] Add ROR/ROL/SHR/SAR by 1 instructions to the Sandy Bridge scheduler model. I assume these match the generic immediate version like they do in the other models. llvm-svn: 327943	2018-03-20 03:01:59 +00:00
Quentin Colombet	508f68233d	[ShrinkWrap] Take into account landing pad When scanning the function for CSR uses and defs, also check if the basic blocks are landing pads. Consider that landing pads need the CSRs to be properly set. That way we force the prologue/epilogue to always be pushed out of the problematic "throw" region. The "throw" region is problematic because the jumps are not properly modeled. Fixes PR36513 llvm-svn: 327942	2018-03-20 02:44:40 +00:00
Shiva Chen	cbd498ac10	[RISCV] Preserve stack space for outgoing arguments when the function contains variable size objects E.g. bar (int x) { char p[x]; push outgoing variables for foo. call foo } We need to generate stack adjustment instructions for outgoing arguments by eliminateCallFramePseudoInstr when the function contains variable size objects to prevent outgoing variables from corrupting the variable size object. Default hasReservedCallFrame will return !hasFP(). We don't want to generate extra sp adjustment instructions when hasFP() returns true, so we override hasReservedCallFrame as !hasVarSizedObjects(). Differential Revision: https://reviews.llvm.org/D43752 llvm-svn: 327938	2018-03-20 01:39:17 +00:00
Craig Topper	2330d6cd55	[X86] Fix the SNB scheduler for BLENDVB. PBLENDVBrr0 was with the memory version of VBLENDVB and PBLENDVBrm0 was missing. llvm-svn: 327937	2018-03-20 01:30:21 +00:00
Vitaly Buka	849217abdf	Object: Fix handling of @@@ in .symver directive Summary: name@@@nodename is going to be replaced with name@@nodename if the symbol is defined in the assembled file, or name@nodename if undefined. https://sourceware.org/binutils/docs/as/Symver.html Fixes PR36623 Reviewers: pcc, espindola Subscribers: mehdi_amini, hiraditya Differential Revision: https://reviews.llvm.org/D44274 llvm-svn: 327930	2018-03-20 00:45:03 +00:00
Vitaly Buka	0d03881eb5	Object: Move attribute calculation into RecordStreamer. NFC Summary: Preparation for D44274 Reviewers: pcc, espindola Subscribers: hiraditya Differential Revision: https://reviews.llvm.org/D44276 llvm-svn: 327928	2018-03-20 00:38:33 +00:00
Aaron Smith	6738960588	[SelectionDAG] Transfer DbgValues when integer operations are promoted Summary: DbgValue nodes were not transferred when integer DAG nodes were promoted. For example, if an i32 add node was promoted to an i64 add node by DAGTypeLegalizer::PromoteIntegerResult(), its DbgValue node was not transferred to the new node. The simple fix is to update SetPromotedInteger() to transfer DbgValues. Add AArch64/dbg-value-i8.ll to test this change and fix ARM/debug-info-d16-reg.ll which had the wrong DILocalVariable nodes with arg numbers even though they are not for function parameters. Patch by Se Jong Oh! Reviewers: vsk, JDevlieghere, aprantl Reviewed By: JDevlieghere Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #debug-info Differential Revision: https://reviews.llvm.org/D44546 llvm-svn: 327919	2018-03-19 22:58:50 +00:00
Jessica Paquette	563548d8f3	[MachineOutliner] AArch64: Emit CFI instructions when outlining calls When outlining calls, the outliner needs to update CFI to ensure that, say, exception handling works. This commit adds that functionality and adds a test just for call outlining. Call outlining stuff in machine-outliner.mir should be moved into machine-outliner-calls.mir in a later commit. llvm-svn: 327917	2018-03-19 22:48:40 +00:00
Craig Topper	956fec2a4a	[DAGCombiner] Fix type in comment. NFC llvm-svn: 327916	2018-03-19 22:25:26 +00:00
Craig Topper	ab6076514d	[X86] Simplify the AVX512 code in LowerTruncate a little. We don't need to create an ISD::TRUNCATE node to return, we started with one and can return it. Also remove the call to getExtendInVec, the result is just going to be a getNode of that value passed in. llvm-svn: 327914	2018-03-19 21:58:02 +00:00
Aaron Smith	da61120749	[PDB] Add a method to get the full path of the source file for PDBSymbolCompiland Summary: Redefine PDBSymbolCompiland::getSourceFileName() to return the filename (w/o directory) of the source file that is used to compile the compiland. This is because the result returned previously is ambiguous. It could be the filename, relative path or full path of the source file. Move the implementation of SymbolFilePDB::GetSourceFileNameForPDBCompiland() into a new method PDBSymbolCompiland::getSourceFileFullPath(). Reviewers: zturner, rnk, llvm-commits Reviewed By: zturner Differential Revision: https://reviews.llvm.org/D44458 llvm-svn: 327910	2018-03-19 21:20:04 +00:00
Aaron Smith	06173e8b46	[PDB] Add exclusive methods to derived symbol class Summary: This commit adds two methods to the PDBSymbolFunc class used in parsing symbols. getLineNumbers() is used to determine a Function symbol's declaration and getCompilandId() is used to initialize the SymbolContext field sc.comp_unit. Reviewers: zturner, rnk, llvm-commits Reviewed By: zturner Differential Revision: https://reviews.llvm.org/D44457 llvm-svn: 327909	2018-03-19 21:18:39 +00:00
Zachary Turner	a21558897b	Revert "Support embedding natvis files in PDBs." This is causing a test failure on a certain bot, so I'm removing this temporarily until we can figure out the source of the error. llvm-svn: 327903	2018-03-19 20:41:59 +00:00
Zachary Turner	426885b10c	Remove an unused private variable. llvm-svn: 327900	2018-03-19 20:22:48 +00:00
Craig Topper	3b967466d5	[X86] Replace a couple calls to getExtendInVec with getNode and the appropriate target independent EXTEND_VECTOR_INREG opcode. llvm-svn: 327899	2018-03-19 20:20:22 +00:00

112713 Commits