Commit Graph

116221 Commits

Author SHA1 Message Date
Yury Delendik 132fc5a861 Update DBG_VALUE register operand during LiveInterval operations
Summary:
Handling of DBG_VALUE in ConnectedVNInfoEqClasses::Distribute() was fixed in
PR16110. However DBG_VALUE register operands are not getting updated. This
patch properly resolves the value location.

Reviewers: MatzeB, vsk

Reviewed By: MatzeB

Subscribers: kparzysz, thegameg, vsk, MatzeB, dschuff, sbc100, jgravelle-google, aheejin, sunfish, llvm-commits

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D48994

llvm-svn: 340310
2018-08-21 17:48:28 +00:00
Aditya Nandakumar c0333f7184 Revert "Revert rr340111 "[GISel]: Add Legalization/lowering code for bit counting operations""
This reverts commit d1341152d91398e9a882ba2ee924147ea2f9b589.

This patch originally made use of Nested MachineIRBuilder buildInstr
calls, and since order of argument processing is not well defined, the
instructions were built slightly in a different order (still correct).
I've removed the nested buildInstr calls to have a defined order now.

Patch was tested by Mikael.

llvm-svn: 340309
2018-08-21 17:30:31 +00:00
Simon Pilgrim 50eba6b380 [X86][SSE] Lower vXi8 general shifts to SSE shifts directly. NFCI.
Most of these shifts are extended to vXi16 so we don't gain anything from forcing another round of generic shift lowering - we know these extended cases are legal constant splat shifts.

llvm-svn: 340307
2018-08-21 17:27:03 +00:00
Craig Topper b172b8884a [BypassSlowDivision] Teach bypass slow division not to interfere with div by constant where constants have been constant hoisted, but not moved from their basic block
DAGCombiner doesn't pay attention to whether constants are opaque before doing the div by constant optimization. So BypassSlowDivision shouldn't introduce control flow that would make DAGCombiner unable to see an opaque constant. This can occur when a div and rem of the same constant are used in the same basic block. it will be hoisted, but not leave the block.

Longer term we probably need to look into the X86 immediate cost model used by constant hoisting and maybe not mark div/rem immediates for hoisting at all.

This fixes the case from PR38649.

Differential Revision: https://reviews.llvm.org/D51000

llvm-svn: 340303
2018-08-21 17:15:33 +00:00
Simon Pilgrim 98eb4ae499 [X86][SSE] Lower v8i16 general shifts to SSE shifts directly. NFCI.
We don't gain anything from forcing another round of generic shift lowering - we know these are legal constant splat shifts.

llvm-svn: 340302
2018-08-21 17:05:07 +00:00
Simon Pilgrim dbe4e9e3ff [X86][SSE] Lower directly to SSE shifts in the BLEND(SHIFT, SHIFT) combine. NFCI.
We don't gain anything from forcing another round of generic shift lowering - we know these are legal constant splat shifts.

llvm-svn: 340300
2018-08-21 16:46:48 +00:00
Matt Arsenault 182bab8d1e Try to fix bot build failure
llvm-svn: 340296
2018-08-21 16:24:56 +00:00
Farhana Aleen 3528c80378 [AMDGPU] Support idot2 pattern.
Summary: Transform add (mul ((i32)S0.x, (i32)S1.x),

         add( mul ((i32)S0.y, (i32)S1.y), (i32)S3) => i/udot2((v2i16)S0, (v2i16)S1, (i32)S3)

Author: FarhanaAleen

Reviewed By: arsenm

Subscribers: llvm-commits, AMDGPU

Differential Revision: https://reviews.llvm.org/D50024

llvm-svn: 340295
2018-08-21 16:21:15 +00:00
Matt Arsenault 7dd9d58c66 AMDGPU: Partially move target handling code from clang to TargetParser
A future change in clang necessitates access of this information
from the driver, so move this into a common place.

Try to mimic something resembling the API the other targets are
using here.

One thing I'm uncertain about is how to split amdgcn and r600
handling. Here I've mostly duplicated the functions for each,
while keeping the same enums. I think this is a bit awkward
for the features which don't matter for amdgcn.

It's also a bit messy that this isn't a complete set of
subtarget features. This is just the minimum set needed
for the driver code. For example building the list of
subtarget feature names is still in clang.

llvm-svn: 340291
2018-08-21 16:13:01 +00:00
Simon Pilgrim 5a83a1fd13 [X86][SSE] Add helper function to convert to/between the SSE vector shift opcodes. NFCI.
Also remove some more getOpcode calls from LowerShift when we already have Opc.

llvm-svn: 340290
2018-08-21 15:57:33 +00:00
Daniel Sanders 6a943fb16a [aarch64][mc] Don't lookup symbols when there is no symbol lookup callback
Summary: When run under llvm-mc-disassemble-fuzzer, there is no symbol lookup callback so tryAddingSymbolicOperand() must fail gracefully instead of crashing

Reviewers: aemerson, javed.absar

Reviewed By: aemerson

Subscribers: lhames, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D51005

llvm-svn: 340287
2018-08-21 15:47:25 +00:00
Sanjay Patel f3ae9cc33e [InstSimplify] use isKnownNeverNaN to fold more fcmp ord/uno
Remove duplicate tests from InstCombine that were added with
D50582. I left negative tests there to verify that nothing
in InstCombine tries to go overboard. If isKnownNeverNaN is
improved to handle the FP binops or other cases, we should
have coverage under InstSimplify, so we could remove more
duplicate tests from InstCombine at that time.

llvm-svn: 340279
2018-08-21 14:45:13 +00:00
Anna Thomas b02b0ad8c7 [LV] Vectorize loops where non-phi instructions used outside loop
Summary:
Follow up change to rL339703, where we now vectorize loops with non-phi
instructions used outside the loop. Note that the cyclic dependency
identification occurs when identifying reduction/induction vars.

We also need to identify that we do not allow users where the PSCEV information
within and outside the loop are different. This was the fix added in rL307837
for PR33706.

Reviewers: Ayal, mkuper, fhahn

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D50778

llvm-svn: 340278
2018-08-21 14:40:27 +00:00
Tim Renouf bb5ee41ab4 [AMDGPU] Allow int types for MUBUF vdata
Summary:
Previously the new llvm.amdgcn.raw/struct.buffer.load/store intrinsics
only allowed float types for the data to be loaded or stored, which
sometimes meant the frontend needed to generate a bitcast. In this, the
new intrinsics copied the old buffer intrinsics.

This commit extends the new intrinsics to allow int types as well.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D50315

Change-Id: I8202af2d036455553681dcbb3d7d32ae273f8f85
llvm-svn: 340270
2018-08-21 11:08:12 +00:00
Tim Renouf 4f703f5e11 [AMDGPU] New buffer intrinsics
Summary:
This commit adds new intrinsics
  llvm.amdgcn.raw.buffer.load
  llvm.amdgcn.raw.buffer.load.format
  llvm.amdgcn.raw.buffer.load.format.d16
  llvm.amdgcn.struct.buffer.load
  llvm.amdgcn.struct.buffer.load.format
  llvm.amdgcn.struct.buffer.load.format.d16
  llvm.amdgcn.raw.buffer.store
  llvm.amdgcn.raw.buffer.store.format
  llvm.amdgcn.raw.buffer.store.format.d16
  llvm.amdgcn.struct.buffer.store
  llvm.amdgcn.struct.buffer.store.format
  llvm.amdgcn.struct.buffer.store.format.d16
  llvm.amdgcn.raw.buffer.atomic.*
  llvm.amdgcn.struct.buffer.atomic.*

with the following changes from the llvm.amdgcn.buffer.*
intrinsics:

* there are separate raw and struct versions: raw does not have an
  index arg and sets idxen=0 in the instruction, and struct always sets
  idxen=1 in the instruction even if the index is 0, to allow for the
  fact that gfx9 does bounds checking differently depending on whether
  idxen is set;

* there is a combined cachepolicy arg (glc+slc)

* there are now only two offset args: one for the offset that is
  included in bounds checking and swizzling, to be split between the
  instruction's voffset and immoffset fields, and one for the offset
  that is excluded from bounds checking and swizzling, to go into the
  instruction's soffset field.

The AMDISD::BUFFER_* SD nodes always have an index operand, all three
offset operands, combined cachepolicy operand, and an extra idxen
operand.

The obsolescent llvm.amdgcn.buffer.* intrinsics continue to work.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D50306

Change-Id: If897ea7dc34fcbf4d5496e98cc99a934f62fc205
llvm-svn: 340269
2018-08-21 11:07:10 +00:00
Tim Renouf 35484c9d50 [AMDGPU] New tbuffer intrinsics
Summary:
This commit adds new intrinsics
  llvm.amdgcn.raw.tbuffer.load
  llvm.amdgcn.struct.tbuffer.load
  llvm.amdgcn.raw.tbuffer.store
  llvm.amdgcn.struct.tbuffer.store

with the following changes from the llvm.amdgcn.tbuffer.* intrinsics:

* there are separate raw and struct versions: raw does not have an index
  arg and sets idxen=0 in the instruction, and struct always sets
  idxen=1 in the instruction even if the index is 0, to allow for the
  fact that gfx9 does bounds checking differently depending on whether
  idxen is set;

* there is a combined format arg (dfmt+nfmt)

* there is a combined cachepolicy arg (glc+slc)

* there are now only two offset args: one for the offset that is
  included in bounds checking and swizzling, to be split between the
  instruction's voffset and immoffset fields, and one for the offset
  that is excluded from bounds checking and swizzling, to go into the
  instruction's soffset field.

The AMDISD::TBUFFER_* SD nodes always have an index operand, all three
offset operands, combined format operand, combined cachepolicy operand,
and an extra idxen operand.

The tbuffer pseudo- and real instructions now also have a combined
format operand.

The obsolescent llvm.amdgcn.tbuffer.* and llvm.SI.tbuffer.store
intrinsics continue to work.

V2: Separate raw and struct intrinsics.
V3: Moved extract_glc and extract_slc defs to a more sensible place.
V4: Rebased on D49995.
V5: Only two separate offset args instead of three.
V6: Pseudo- and real instructions have joint format operand.
V7: Restored optionality of dfmt and nfmt in assembler.
V8: Addressed minor review comments.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D49026

Change-Id: If22ad77e349fac3a5d2f72dda53c010377d470d4
llvm-svn: 340268
2018-08-21 11:06:05 +00:00
Bjorn Pettersson d378a39603 Change how finalizeBundle selects debug location for the BUNDLE instruction
Summary:
Previously a BUNDLE instruction inherited the DebugLoc from the
first instruction in the bundle, even if that DebugLoc had no
DILocation. With this commit this is changed into selecting the
first DebugLoc that has a DILocation, by searching among the
bundled instructions.

The idea is to reduce amount of bundles that are lacking
debug locations.

Reviewers: #debug-info, JDevlieghere

Reviewed By: JDevlieghere

Subscribers: JDevlieghere, mattd, llvm-commits

Differential Revision: https://reviews.llvm.org/D50639

llvm-svn: 340267
2018-08-21 10:59:50 +00:00
Sam Parker 597811e7a7 [DAGCombiner] Reduce load widths of shifted masks
During combining, ReduceLoadWdith is used to combine AND nodes that
mask loads into narrow loads. This patch allows the mask to be a
shifted constant. This results in a narrow load which is then left
shifted to compensate for the new offset.

Differential Revision: https://reviews.llvm.org/D50432

llvm-svn: 340261
2018-08-21 10:26:59 +00:00
Simon Pilgrim 72b324de4d [TargetLowering] Add BuildSDiv support for division by one or negone.
This reduces most of the sdiv stages (the MULHS, shifts etc.) to just zero/identity values and use the numerator scale factor to multiply by +1/-1.

llvm-svn: 340260
2018-08-21 10:20:36 +00:00
Petar Jovanovic 3b953c37f8 [MIPS GlobalISel] Select bitwise instructions
Select bitwise instructions for i32.

Patch by Petar Avramovic.

Differential Revision: https://reviews.llvm.org/D50183

llvm-svn: 340258
2018-08-21 08:15:56 +00:00
Max Kazantsev 097ef69182 [LICM] Hoist guards with invariant conditions
This patch teaches LICM to hoist guards from the loop if they are guaranteed to execute and
if there are no side effects that could prevent that.

Differential Revision: https://reviews.llvm.org/D50501
Reviewed By: reames

llvm-svn: 340256
2018-08-21 08:11:31 +00:00
Bjorn Pettersson 880f291577 [RegisterCoalescer] Do not assert when trying to remat dead values
Summary:
RegisterCoalescer::reMaterializeTrivialDef used to assert that
the input register was live in. But as shown by the new
coalesce-dead-lanes.mir test case that seems to be a valid
scenario. We now return false instead of the assert, simply
avoiding to remat the dead def.

Normally a COPY of an undef value is eliminated by
eliminateUndefCopy(). Although we only do that when the
destination isn't a physical register. So the situation
above should be limited to the case when we copy an undef
value to a physical register.

Reviewers: kparzysz, wmi, tpr

Reviewed By: kparzysz

Subscribers: MatzeB, qcolombet, tpr, llvm-commits

Differential Revision: https://reviews.llvm.org/D50842

llvm-svn: 340255
2018-08-21 07:49:05 +00:00
Max Kazantsev bfbd4d1fb6 [NFC] Factor out predecessors collection into a separate method
It may be reused in a different piece of logic.

Differential Revision: https://reviews.llvm.org/D50890
Reviewed By: reames

llvm-svn: 340250
2018-08-21 07:15:06 +00:00
Serguei Katkov 09ab506798 [IR Verifier] Do not allow bitcast of pointer to vector of pointers and vice versa.
LangRef for BitCast requires that
"The bit sizes of value and the destination type, ty2, must be identical".
Currently verifier allows BitCast of pointer to vector of pointers so that
the sizes are different.

This change fixes that.

Reviewers: arsenm
Reviewed By: arsenm
Subscribers: llvm-commits, wdng
Differential Revision: https://reviews.llvm.org/D50886

llvm-svn: 340249
2018-08-21 04:27:07 +00:00
Philip Reames a5a8546ac6 [AST] Mark invariant.starts as being readonly
These intrinsics are modelled as writing for control flow purposes, but they don't actually write to any location. Marking these - as we did for guards - allows LICM to hoist loads out of loops containing invariant.starts.

Differential Revision: https://reviews.llvm.org/D50861

llvm-svn: 340245
2018-08-21 00:55:35 +00:00
Zachary Turner c175310a09 [MS Demangler] Demangle special operator 'dynamic initializer'.
This is encoded as __E and should print something like
"dynamic initializer for 'Foo'(void)"

This also adds support for dynamic atexit destructor, which is
basically identical but encoded as __F with slightly different
description.

llvm-svn: 340239
2018-08-20 23:59:21 +00:00
Zachary Turner 0002dd467d [MS Demangler] Anonymous namespace hashes can be backreferenced.
Previously we were not remembering the key values of anonymous
namespaces, but we need to do this.

llvm-svn: 340238
2018-08-20 23:58:58 +00:00
Zachary Turner 91c98a858c [MS Demangler] Properly demangle anonymous namespaces.
llvm-svn: 340237
2018-08-20 23:58:35 +00:00
Heejin Ahn 487992cc09 [WebAssembly] Revert type of wake count in atomic.wake to i32
Summary:
We decided to revert this from i64 to i32 in Nov 28 CG meeting. Fixes
PR38632.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D51010

llvm-svn: 340234
2018-08-20 23:49:29 +00:00
Reid Kleckner 85a8c12db8 Re-land r334313 "[asan] Instrument comdat globals on COFF targets"
If we can use comdats, then we can make it so that the global metadata
is thrown away if the prevailing definition of the global was
uninstrumented. I have only tested this on COFF targets, but in theory,
there is no reason that we cannot also do this for ELF.

This will allow us to re-enable string merging with ASan on Windows,
reducing the binary size cost of ASan on Windows.

I tested this change with ASan+PGO, and I fixed an issue with the
__llvm_profile_raw_version symbol. With the old version of my patch, we
would attempt to instrument that symbol on ELF because it had a comdat
with external linkage. If we had been using the linker GC-friendly
metadata scheme, everything would have worked, but clang does not enable
it by default.

llvm-svn: 340232
2018-08-20 23:35:45 +00:00
Craig Topper bee74793a3 [InstCombine] Add splat vector constant support to foldICmpAddOpConst.
Differential Revision: https://reviews.llvm.org/D50946

llvm-svn: 340231
2018-08-20 23:04:25 +00:00
Heejin Ahn c2c33c8e64 [WebAssembly] Remove an unused argument from writeSPToMemory (NFC)
Reviewers: dschuff

Subscribers: dschuff, sbc100, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D50933

llvm-svn: 340230
2018-08-20 23:02:15 +00:00
Michael Berg 0b838deddc extend binop folds for selects to include true and false binops flag intersection
Summary: This change address bug 38641

Reviewers: spatel, wristow

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D50996

llvm-svn: 340222
2018-08-20 22:26:58 +00:00
Craig Topper 210ccfe3db [X86] Prevent lowerVectorShuffleByMerging128BitLanes from creating cycles
Due to some splat handling code in getVectorShuffle, its possible for NewV1/NewV2 to have their mask modified from what is requested. This can lead to cycles being created in the DAG.

This patch examines the returned mask and makes sure its different. Long term we may need to look closer at that splat code in getVectorShuffle, or add more splat awareness to getVectorShuffle.

Fixes PR38639

Differential Revision: https://reviews.llvm.org/D50981

llvm-svn: 340214
2018-08-20 21:08:35 +00:00
Craig Topper 7dcb2c4b0a [X86] Teach combineTruncatedArithmetic to handle some cases of ISD::SUB
We can safely avoid interfering with the subus combine if both inputs are freely truncatable. Either both extends, or an extend and a constant vector.

Differential Revision: https://reviews.llvm.org/D50878

llvm-svn: 340212
2018-08-20 20:57:35 +00:00
Craig Topper 4ee28412a5 [LegacyPassManager] Remove analysis P from AnUsageMap before deleting it in schedulePass.
If we deem the analysis pass useless and delete it, we need to make sure we remove it from AnUsageMap. Otherwise we might allocate another pass in the freed memory. This will cause us to reuse the AnalysisUsage from the original pass instead of the new one.

Fixes PR38511

Differential Revision: https://reviews.llvm.org/D50573

llvm-svn: 340210
2018-08-20 20:57:30 +00:00
Krzysztof Parzyszek cc3f630252 Consistently use MemoryLocation::UnknownSize to indicate unknown access size
1. Change the software pipeliner to use unknown size instead of dropping
   memory operands. It used to do it before, but MachineInstr::mayAlias
   did not handle it correctly.
2. Recognize UnknownSize in MachineInstr::mayAlias.
3. Print and parse UnknownSize in MIR.

Differential Revision: https://reviews.llvm.org/D50339

llvm-svn: 340208
2018-08-20 20:37:57 +00:00
David Blaikie a25e206973 Add missing include (<functional> for std::ref)
llvm-svn: 340205
2018-08-20 20:02:29 +00:00
Richard Smith 8a57f2e012 Move Itanium demangler implementation into a header file and add visitation support.
Summary:
This transforms the Itanium demangler into a generic reusable library that can
be used to build, traverse, and transform Itanium mangled name trees.

This is in preparation for adding a canonicalizing demangler, which
cannot live in the Demangle library for layering reasons. In order to
keep the diffs simpler, this patch moves more code to the new header
than is strictly necessary: in particular, all of the printLeft /
printRight implementations can be moved to the implementation file.
(And indeed we could make them non-virtual now if we wished, and remove
the vptr from Node.)

All nodes are now included in the Kind enumeration, rather than omitting
some of the Expr nodes, and the three different floating-point literal
node types now have distinct Kind values.

As a proof of concept for the visitation / matching mechanism, this
patch implements a Node dumping facility on top of it, replacing the
prior mechanism that produced the pretty-printed output rather than a
tree dump. Sample dump output:

FunctionEncoding(
  NameType("int"),
  NameWithTemplateArgs(
    NestedName(
      NameWithTemplateArgs(
        NameType("A"),
        TemplateArgs(
          {NameType("B")})),
      NameType("f")),
    TemplateArgs(
      {NameType("int")})),
  {},
  <null>,
  QualConst, FunctionRefQual::FrefQualLValue)

As a next step, it would make sense to move the LLVM high-level interface to
the demangler (the itaniumDemangler function and ItaniumPartialDemangler class)
into the Support library, and implement them in terms of the Demangle library.
This would allow the libc++abi demangler implementation to be an identical copy
of the llvm Demangle library, and would allow the LLVM implementation to reuse
LLVM components such as llvm::BumpPtrAllocator, but we'll need to decide how to
coordinate that with the MS ABI demangler, so I'm not doing that in this patch.

No functionality change intended other than the behavior of dump().

Reviewers: erik.pilkington, zturner, chandlerc, dlj

Subscribers: aheejin, llvm-commits

Differential Revision: https://reviews.llvm.org/D50930

llvm-svn: 340203
2018-08-20 19:44:01 +00:00
Vitaly Buka 30b5ed3eb7 Revert "AMDGPU: bump AS.MAX_COMMON_ADDRESS to 6 since 32-bit addr space"
As it introduces out of bound access.

This reverts commit r340172 and r340171

llvm-svn: 340202
2018-08-20 19:31:03 +00:00
Cameron McInally 94b9029be9 [FPEnv] Support constrained FREM intrinsic
Differential Revision: https://reviews.llvm.org/D50975

llvm-svn: 340201
2018-08-20 19:28:56 +00:00
Marcello Maggioni 5ca4128b45 [PSV] Update API to be able to use TargetCustom without UB.
getTargetCustom() requires values for "Kind" in the constructor
that are not in the PSVKind enum. Passing a value that is not inside
an enum as an argument to a constructor of the type of the enum is
UB. Changing to the underlying type of the enum would solve the UB

Differential Revision: https://reviews.llvm.org/D50909

llvm-svn: 340200
2018-08-20 19:23:45 +00:00
Zachary Turner 66555a7bed [MS Demangler] Demangle member pointer template parameters.
llvm-svn: 340199
2018-08-20 19:15:35 +00:00
Aditya Nandakumar 2a08285cf3 Revert "Revert r339977: [GISel]: Add Opcodes for a few LLVM Intrinsics"
This reverts commit 7debc334e6421bb5251ef8f18e97166dfc7dd787.

I missed updating legalizer-info-validation.mir as I had assertions
turned off in my build and that specific test requires asserts. Fixed it
now.

llvm-svn: 340197
2018-08-20 18:43:19 +00:00
Simon Pilgrim 6ac905926f [TargetLowering] Disable BuildSDiv division by one or negone.
Fuzz tests have detected an issue, currently working on a fix.

llvm-svn: 340195
2018-08-20 18:23:54 +00:00
Sanjay Patel 3ce999fa41 [ConstantFolding] improve folding of binops with vector undef operand
A non-undef operand may still have undef constant elements, 
so we should always propagate the vector results per-lane.

llvm-svn: 340194
2018-08-20 18:19:02 +00:00
Matt Arsenault 450fcc77a7 ValueTracking: Handle more instructions in isKnownNeverNaN
llvm-svn: 340187
2018-08-20 16:51:00 +00:00
Reid Kleckner 918930adf9 Revert rr340111 "[GISel]: Add Legalization/lowering code for bit counting operations"
It causes LegalizerHelperTest.LowerBitCountingCTTZ1 to fail.

llvm-svn: 340186
2018-08-20 16:50:19 +00:00
Reid Kleckner 531319388d Add cmake option to disable minidumps, default it to off
Since crash dumping landed in r268519, May 2016, I have not once seen
anyone use an uploaded minidump to debug a compiler crash. Therefore,
I'm turning this off by default. The dumps clutter up user and buildbot
temp directories. Each file is only about 56KB, but it adds up.

In the context of clang, the extra line about the minidump confuses
users, when what we really want from them is the pre-processed source
code.

llvm-svn: 340185
2018-08-20 16:49:54 +00:00
Simon Pilgrim 1a00042270 [SelectionDAG] Reuse the Op's VT. NFCI.
llvm-svn: 340173
2018-08-20 13:44:03 +00:00
Samuel Pitoiset 216a2da577 AMDGPU: fix compilation errors since r340171
Some buildbot slaves reports compilation errors, but it
compiled fine on my side, sorry for the breakage.

llvm-svn: 340172
2018-08-20 13:31:41 +00:00
Samuel Pitoiset c95ef77d37 AMDGPU: bump AS.MAX_COMMON_ADDRESS to 6 since 32-bit addr space
32-bit constant address space is declared as 6, so the
maximum number of address spaces is 6, not 5.

Fixes "LLVM ERROR: Pointer address space out of range".

v3: use static_assert()
v2: add a very simple test for 32-bit addr space

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106630
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
llvm-svn: 340171
2018-08-20 13:18:59 +00:00
Haojian Wu 54829bb3ff Fix an undefined behavior when storing an empty StringRef.
Summary: Passing a nullptr to memcpy is UB.

Reviewers: ioeric

Subscribers: llvm-commits, cfe-commits

Differential Revision: https://reviews.llvm.org/D50966

llvm-svn: 340170
2018-08-20 13:12:54 +00:00
Simon Pilgrim 5b78c9d58d [SelectionDAG] Add partial sign-bit support to ComputeNumSignBits for BITCAST nodes
Only adds support to the existing 'large element' scalar/vector to 'small element' vector bitcasts.

Handle the case where the sign bit extends to only part of the small elements.

llvm-svn: 340169
2018-08-20 13:05:48 +00:00
Victor Leschuk cba595da82 [DWARF] Refactor DWARF classes to use unified error reporting. NFC.
DWARF-related classes in lib/DebugInfo/DWARF contained 
duplicating code for creating StringError instances, like:

template <typename... Ts>
static Error createError(char const *Fmt, const Ts &... Vals) {
  std::string Buffer;
  raw_string_ostream Stream(Buffer);
  Stream << format(Fmt, Vals...);
  return make_error<StringError>(Stream.str(), inconvertibleErrorCode());
}

Similar function was placed in Support lib in https://reviews.llvm.org/D49824

This revision makes DWARF classes use this function
instead of their local implementation of it.

Reviewers: aprantl, dblaikie, probinson, wolfgangp, JDevlieghere, jhenderson

Reviewed By: JDevlieghere, jhenderson

Differential Revision: https://reviews.llvm.org/D49964

llvm-svn: 340163
2018-08-20 09:59:08 +00:00
Sander de Smalen 07db432265 [AArch64][SVE] Asm: Add SVE System registers
This patch adds system registers for controlling aspects of SVE:
- ZCR_EL1  (r/w)   visible at EL1 and EL0.
- ZCR_EL2  (r/w)   visible at EL2 and Non-secure EL1 and EL0.
- ZCR_EL3  (r/w)   visible at all exception levels.

and a system register identifying SVE:
- ID_AA64ZFR0_EL1  (r)  SVE Feature identifier.

Reviewers: SjoerdMeijer, samparker, pbarrio, fhahn, javed.absar

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D50885

llvm-svn: 340158
2018-08-20 09:16:59 +00:00
Justin Bogner 6f1740d52f [SimplifyCFG] Replace some uses of bitwise or with logical or
It's clearer to use logical or for boolean values. Thanks to Steven
Zhang for noticing!

llvm-svn: 340153
2018-08-20 06:37:11 +00:00
Craig Topper 24674ca773 [InstCombine] Move some variable declarations into a more appropriate scope. NFC
llvm-svn: 340150
2018-08-20 05:35:12 +00:00
QingShan Zhang f8f9af7ba5 [PowerPC] Add a peephole post RA to transform the inst that fed by add
If the arch is P8, we will select XFLOAD to load the floating point, and then, expand it to vsx and non-vsx X-form instruction post RA. This patch is trying to convert the X-form to D-form if it meets the requirement that one operand of the x-form inst is the special Zero register, and another operand fed by add inst. i.e.
y = add imm, reg
LFDX. 0, y
-->
LFD imm(reg)

Reviewers: Nemanjai
Differential Revision: https://reviews.llvm.org/D49007

llvm-svn: 340149
2018-08-20 02:52:55 +00:00
whitequark c438ac2352 [LLVM-C] Add coroutine passes
Differential Revision: https://reviews.llvm.org/D50950

llvm-svn: 340147
2018-08-19 23:39:57 +00:00
whitequark b56a4d3149 [C-API][DIBuilder] Added DIFlags in LLVMDIBuilderCreateBasicType
Added DIFlags in LLVMDIBuilderCreateBasicType to add optional DWARF
attributes, such as DW_AT_endianity.

Patch by Chirag Patel.

Differential Revision: https://reviews.llvm.org/D50832

llvm-svn: 340146
2018-08-19 23:39:47 +00:00
Simon Pilgrim 5b936ec89e [SelectionDAG] Add basic demanded elements support to ComputeNumSignBits for BITCAST nodes
Only adds support to the existing 'large element' scalar/vector to 'small element' vector bitcasts.

The next step would be to support cases where the large elements aren't all sign bits, and determine the small element equivalent based on the demanded elements.

llvm-svn: 340143
2018-08-19 17:47:50 +00:00
Craig Topper 803912ea57 [X86] Fix an issue in the matching for ADDUS.
We were basically assuming only one operand of the compare could be an ADD node and using that to swap operands. But we can have a normal add followed by a saturing add.

This rewrites the canonicalization to just be based on the condition code.

llvm-svn: 340134
2018-08-19 04:26:31 +00:00
Craig Topper 2b03df9b05 [X86] Use SDValue::operator== instead of DAG.isEqualTo in strictly integer matching.
isEqualTo is more useful for floating point. operator== is sufficient for integer.

llvm-svn: 340130
2018-08-18 19:16:56 +00:00
Craig Topper 3e299d896f [X86] Simplify the PADDUS legality check in combineSelect to match PSUBUS. NFC
While there remove some trailing whitespace.

llvm-svn: 340129
2018-08-18 18:51:04 +00:00
Craig Topper 40c9559b74 [X86] Add support for using 512-bit PSUBUS to combineSelect.
The code already support 128 and 256 and even knows to split 256 for AVX1. So we really just needed to stop looking for specific VTs and subtarget features and just look for legal VTs with i8/i16 elements.

While there, add some curly braces around outer if statement bodies that contain only another if. It makes all the closing curly braces look more regular.

llvm-svn: 340128
2018-08-18 18:51:03 +00:00
Zachary Turner d9e925fca4 [MS Demangler] Resolve backreferences eagerly, not lazily.
A while back I submitted a patch to resolve backreferences
lazily, thinking this that it was not always possible to know
in advance what type you were looking at until you had completed
a full pass over the input, and therefore it would be impossible
to resolve backreferences eagerly.

This was mistaken though, and turned out to be an unrelated
problem.  In fact, the reverse is true.  You *must* resolve
backreferences eagerly.  This is because certain types of nested
mangled symbols do not share a backreference context with their
parent symbol, and as such, if you try to resolve them lazily
their backreference context will have been lost by the time you
finish demangling the entire input.  On the other hand, resolving
them eagerly appears to always work, and enables us to port
many more tests over.

llvm-svn: 340126
2018-08-18 18:49:48 +00:00
Lang Hames 8e296229b5 [RuntimeDyld] Fix a bug in RuntimeDyld::loadObjectImpl that was over-allocating
space for common symbols.

Patch by Dmitry Sidorov. Thanks Dmitry!

Differential revision: https://reviews.llvm.org/D50240

llvm-svn: 340125
2018-08-18 18:38:37 +00:00
Simon Pilgrim 9c1761a6fd [X86] Replace all single match schedule class instregexs with instrs entries
Helps reduce cost of instrw collection

llvm-svn: 340124
2018-08-18 18:04:29 +00:00
Simon Pilgrim ebfd6ebba7 [X86] Merge shift/rotate schedule class instregexs
Helps reduce cost of instrw collection

llvm-svn: 340123
2018-08-18 15:58:19 +00:00
Hsiangkai Wang 68c706ceb7 [DebugInfo] In FastISel, convert llvm.dbg.label to DBG_LABEL MI.
Convert llvm.dbg.label(!label_metadata) to DBG_LABEL !label_metadata.

Differential Revision: https://reviews.llvm.org/D50622

llvm-svn: 340122
2018-08-18 14:55:34 +00:00
Craig Topper cc5dbbf759 [DAGCombiner] Allow divide by constant optimization on opaque constants.
Summary:
I believe this restores the behavior we had before r339147.

Fixes PR38622.

Reviewers: RKSimon, chandlerc, spatel

Reviewed By: chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50936

llvm-svn: 340120
2018-08-18 05:52:42 +00:00
Zachary Turner bc94ae437f Add the extended XMM registers mappings for AVX-512.
After this we should have the entire AVX-512 register set
mapping in place.

llvm-svn: 340118
2018-08-18 03:54:16 +00:00
Lang Hames 76e21c9792 [ORC] Rename 'finalize' to 'emit' to avoid potential confusion.
An emitted symbol has had its contents written and its memory protections
applied, but it is not automatically ready to execute.

Prior to ORC supporting concurrent compilation, the term "finalized" could be
interpreted two different (but effectively equivalent) ways: (1) The finalized
symbol's contents have been written and its memory protections applied, and (2)
the symbol is ready to run. Now that ORC supports concurrent compilation, sense
(1) no longer implies sense (2). We have already introduced a new term, 'ready',
to capture sense (2), so rename sense (1) to 'emitted' to avoid any lingering
confusion.

llvm-svn: 340115
2018-08-18 02:06:18 +00:00
Peter Collingbourne 5b4f8e10b5 MC: Remove dead code from WinCOFFObjectWriter.cpp. NFCI.
Remove code for writing auxiliary symbols of type function definition
and begin function. These types of symbols are associated with
pre-CodeView debug info and we never emit them.

llvm-svn: 340113
2018-08-18 00:54:46 +00:00
Aditya Nandakumar 59b2485ba2 [GISel]: Add Legalization/lowering code for bit counting operations
https://reviews.llvm.org/D48847#inline-448257

Ported legalization expansions for CTLZ/CTTZ from DAG to GISel.

Reviewed by rtereshin.

llvm-svn: 340111
2018-08-18 00:01:54 +00:00
Philip Reames 96bc076c3a [AST] Clarify printing of unknown size locations [NFC]
Printing "unknown" is much more clear than an arbitrary large integer

llvm-svn: 340108
2018-08-17 23:17:31 +00:00
George Burgess IV 04df67e268 [DebugCounters] don't do redundant map lookups; NFC
llvm-svn: 340104
2018-08-17 22:34:04 +00:00
Reid Kleckner aa56bac652 [MC] Improve error message when a codeview register is unknown
This is in MCRegisterInfo, we can print the actual register name easily.

llvm-svn: 340089
2018-08-17 21:35:14 +00:00
Zachary Turner 4746aa7b8f [MS Demangler] Properly print all thunk types.
We were only printing the vtordisp thunk before as the previous
patch was more aimed at getting special operators working, one
of which was a thunk.  This patch gets all thunk types to print
properly, and adds a test for each one.

llvm-svn: 340088
2018-08-17 21:32:07 +00:00
Craig Topper 62dd1b1b4f [X86] Remove detectAddSubSatPattern.
This was added very recently in r339650, but appears to be completely untested and has at least one bug in it.

llvm-svn: 340086
2018-08-17 21:19:28 +00:00
Matt Arsenault 25e51540e1 DAG: Fix isKnownNeverNaN for basic non-sNaN cases
fadd/fsub/fmul need to worry about infinities as well
as fdiv.

llvm-svn: 340085
2018-08-17 21:19:22 +00:00
Lang Hames d5f56c5979 [ORC] Rename VSO to JITDylib.
VSO was a little close to VDSO (an acronym on Linux for Virtual Dynamic Shared
Object) for comfort. It also risks giving the impression that instances of this
class could be shared between ExecutionSessions, which they can not.

JITDylib seems moderately less confusing, while still hinting at how this
class is intended to be used, i.e. as a JIT-compiled stand-in for a dynamic
library (code that would have been a dynamic library if you had wanted to
compile it ahead of time).

llvm-svn: 340084
2018-08-17 21:18:18 +00:00
Zachary Turner 469f076356 [MS Demangler] Demangle all remaining types of operators.
This demangles all remaining special operators including thunks,
RTTI Descriptors, and local static guard variables.

llvm-svn: 340083
2018-08-17 21:18:05 +00:00
Krzysztof Parzyszek 9937e205e8 [Hexagon] Remove unused functions from HexagonInstPrinter, NFC
llvm-svn: 340081
2018-08-17 21:12:37 +00:00
Jun Lim da5864c73c Test commit
I just removed a blank space.

llvm-svn: 340069
2018-08-17 18:40:41 +00:00
Simon Pilgrim 2f48122cc9 [X86][SSE] Lower constant vXi8 ISD::SRL/ISD::SRA using PMULLW
Extending the concept introduced in D49562, this patch lowers constant vXi8 ISD::SRL/ISD::SRA by zero/sign extending to vXi16 and using PMULLW and then truncating the high 8 bits of the result.

Differential Revision: https://reviews.llvm.org/D50781

llvm-svn: 340062
2018-08-17 18:03:11 +00:00
Evandro Menezes 4b39010afb [InstCombine] Refactor the simplification of pow() (NFC)
Refactor all cases dealing with `exp{,2,10}()` into one function in
preparation for D49273.  Otherwise, NFC.

llvm-svn: 340061
2018-08-17 17:59:53 +00:00
Craig Topper 730890dbdb [X86] Use hasOneUse instead of isOnlyUserOf. NFCI
isOnlyUserOf is a little heavier because it allows the node to be used multiple times by the other node. In this case we are looking at a truncate which only has one operand so we know it can only use it once. Thus hasOneUse is better.

llvm-svn: 340059
2018-08-17 17:57:25 +00:00
Alina Sbirlea 0dfe830318 [IDF] Teach Iterated Dominance Frontier to use a snapshot CFG based on a GraphDiff.
Summary:
Create the ability to compute IDF using a CFG View.
For this, we'll need a new DT created using a list of Updates (to be refactored later to a GraphDiff), and the GraphTraits based on the same GraphDiff.

Reviewers: kuhar, george.burgess.iv, mzolotukhin

Subscribers: sanjoy, jlebar, llvm-commits

Differential Revision: https://reviews.llvm.org/D50675

llvm-svn: 340052
2018-08-17 17:39:15 +00:00
Teresa Johnson cb9a82fc7b [ThinLTO] Add option for printing import failure reasons
Summary:
Adds the option for the printing of summary information about functions
considered but rejected for importing during the thin link.

Reviewers: davidxl

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D50881

llvm-svn: 340047
2018-08-17 16:53:47 +00:00
Zachary Turner 3461bfaa9c [MS Demangler] Rework the way operators are demangled.
Previously, some of the code for actually parsing mangled
operator names was more like formatting code in nature,
and was interspersed with the demangling code which builds
the AST.  This means that by the time we got to the printing
code, we had lost all information about what type of operator
we had, and all we were left with was a string that we just
had to print.  However, not all operators are actually even
operators.  it's basically just a catch-all mangling for
"special names", and for some of the other types it helps
to know when we're actually doing the printing what it is.

This patch changes the way things work by introducing an
OperatorInfo structure and corresponding enumeration.  When
we demangle we store the enumeration value and demangled
components separately.  This gives more flexibility during
printing.

In doing so, some demanglings of special names which we didn't
previously support come out of this for free, so we now demangle
those.

A few are more complex and are better left for a followup patch
though.

An exhaustive test of every possible operator code is included,
with the ones that don't yet work commented out.

llvm-svn: 340046
2018-08-17 16:14:05 +00:00
Hsiangkai Wang 2532ac880a [DebugInfo] Generate DWARF debug information for labels. (Fix leak problems)
There are two forms for label debug information in DWARF format.

1. Labels in a non-inlined function:

DW_TAG_label
  DW_AT_name
  DW_AT_decl_file
  DW_AT_decl_line
  DW_AT_low_pc

2. Labels in an inlined function:

DW_TAG_label
  DW_AT_abstract_origin
  DW_AT_low_pc

We will collect label information from DBG_LABEL. Before every DBG_LABEL,
we will generate a temporary symbol to denote the location of the label.
The symbol could be used to get DW_AT_low_pc afterwards. So, we create a
mapping between 'inlined label' and DBG_LABEL MachineInstr in DebugHandlerBase.
The DBG_LABEL in the mapping is used to query the symbol before it.

The AbstractLabels in DwarfCompileUnit is used to process labels in inlined
functions.

We also keep a mapping between scope and labels in DwarfFile to help to
generate correct tree structure of DIEs.

It also generates label debug information under global isel.

Differential Revision: https://reviews.llvm.org/D45556

llvm-svn: 340039
2018-08-17 15:22:04 +00:00
Stefan Pintilie 39869ccf51 [PowerPC] Generate lxsd instead of the ld->mtvsrd sequence for vector loads
This patch addresses:

- Implementation within PPCISelLowering.cpp to check if we should use direct
load into vector instructions (such as lxsd/lfd ) when the scalar_to_vector
function is used; which will allow us to catch as many cases of the
scalar_to_vector uses as possible to translate the ld->mtvsrd sequence into
lxsd.

- Test cases to exhibit the behaviour of emitting lxsd/lfd.

Patch by amyk

Differential revision: https://reviews.llvm.org/D49698

llvm-svn: 340037
2018-08-17 15:15:26 +00:00
Francis Visoiu Mistrih 8bff832534 [X86] Fix liveness information when expanding X86::EH_SjLj_LongJmp64
test/CodeGen/X86/shadow-stack.ll has the following machine verifier
errors:

```
*** Bad machine code: Using a killed virtual register ***
- function:    bar
- basic block: %bb.6 entry (0x7fdc81857818)
- instruction: %3:gr64 = MOV64rm killed %2:gr64, 1, $noreg, 8, $noreg
- operand 1:   killed %2:gr64

*** Bad machine code: Using a killed virtual register ***
- function:    bar
- basic block: %bb.6 entry (0x7fdc81857818)
- instruction: $rsp = MOV64rm killed %2:gr64, 1, $noreg, 16, $noreg
- operand 1:   killed %2:gr64

*** Bad machine code: Virtual register killed in block, but needed live out. ***
- function:    bar
- basic block: %bb.2 entry (0x7fdc818574f8)
Virtual register %2 is used after the block.
```

The fix here is to only copy the machine operand's register without the
kill flags for all the instructions except the very last one of the
sequence.

I had to insert dummy PHIs in the test case to force the NoPHI function
property to be set to false. More on this here: https://llvm.org/PR38439

Differential Revision: https://reviews.llvm.org/D50260

llvm-svn: 340033
2018-08-17 14:46:56 +00:00
Florian Hahn 19f9e32f07 [InstrSimplify,NewGVN] Add option to ignore additional instr info when simplifying.
NewGVN uses InstructionSimplify for simplifications of leaders of
congruence classes. It is not guaranteed that the metadata or other
flags/keywords (like nsw or exact) of the leader is available for all members
in a congruence class, so we cannot use it for simplification.

This patch adds a InstrInfoQuery struct with a boolean field
UseInstrInfo (which defaults to true to keep the current behavior as
default) and a set of helper methods to get metadata/keywords for a
given instruction, if UseInstrInfo is true. The whole thing might need a
better name, to avoid confusion with TargetInstrInfo but I am not sure
what a better name would be.

The current patch threads through InstrInfoQuery to the required
places, which is messier then it would need to be, if
InstructionSimplify and ValueTracking would share the same Query struct.

The reason I added it as a separate struct is that it can be shared
between InstructionSimplify and ValueTracking's query objects. Also,
some places do not need a full query object, just the InstrInfoQuery.

It also updates some interfaces that do not take a Query object, but a
set of optional parameters to take an additional boolean UseInstrInfo.

See https://bugs.llvm.org/show_bug.cgi?id=37540.

Reviewers: dberlin, davide, efriedma, sebpop, hiraditya

Reviewed By: hiraditya

Differential Revision: https://reviews.llvm.org/D47143

llvm-svn: 340031
2018-08-17 14:39:04 +00:00
Krzysztof Parzyszek 39a979c838 [Hexagon] Expand vgather pseudos during packetization
This will allow packetizing the vgather expansion with other instructions.

llvm-svn: 340028
2018-08-17 14:24:24 +00:00
Alex Bradbury 3291f9aa81 [AtomicExpandPass] Widen partword atomicrmw or/xor/and before tryExpandAtomicRMW
This patch performs a widening transformation of bitwise atomicrmw 
{or,xor,and} and applies it prior to tryExpandAtomicRMW. This operates 
similarly to convertCmpXchgToIntegerType. For these operations, the i8/i16 
atomicrmw can be implemented in terms of the 32-bit atomicrmw by appropriately 
manipulating the operands. There is no functional change for the handling of 
partword or/xor, but the transformation for partword 'and' is new.

The advantage of performing this transformation early is that the same 
code-path can be used regardless of the approach used to expand the atomicrmw 
(AtomicExpansionKind). i.e. the same logic is used for 
AtomicExpansionKind::CmpXchg and can also be used by the intrinsic-based 
expansion in D47882.

Differential Revision: https://reviews.llvm.org/D48129

llvm-svn: 340027
2018-08-17 14:03:37 +00:00
Anna Thomas 1962621a7e [LICM] Add a diagnostic analysis for identifying alias information
Summary:
Currently, in LICM, we use the alias set tracker to identify if the
instruction (we're interested in hoisting) aliases with instruction that
modifies that memory location.

This patch adds an LICM alias analysis diagnostic tool that checks the
mod ref info of the instruction we are interested in hoisting/sinking,
with every instruction in the loop.  Because of O(N^2) complexity this
is now only a diagnostic tool to show the limitation we have with the
alias set tracker and is OFF by default.

Test cases show the difference with the diagnostic analysis tool, where
we're able to hoist out loads and readonly + argmemonly calls from the
loop, where the alias set tracker analysis is not able to hoist these
instructions out.

Reviewers: reames, mkazantsev, fedor.sergeev, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50854

llvm-svn: 340026
2018-08-17 13:44:00 +00:00
Roger Ferrer Ibanez 734a04ea33 [RISCV] Remove unused function
This function is not virtual, it is private and it is not called anywhere. No
regression is introduced by removing it.

I think we can safely remove it.

Differential Revision: https://reviews.llvm.org/D50836

llvm-svn: 340024
2018-08-17 13:40:03 +00:00
Sanjay Patel 411b86081e [ConstantFolding] add simplifications for funnel shift intrinsics
This is another step towards being able to canonicalize to the funnel shift 
intrinsics in IR (see D49242 for the initial patch). 
We should not have any loss of simplification power in IR between these and 
the equivalent IR constructs.

Differential Revision: https://reviews.llvm.org/D50848

llvm-svn: 340022
2018-08-17 13:23:44 +00:00
Luke Cheeseman 64dcdec60c [AArch64] - Generate pointer authentication instructions
- Generate pointer authentication instructions
- The functions instrumented depend on function attribtues:
  all (all functions instrumentent)
  non-leaf (only those that spill LR)
  none
- Function epilogues sign the LR before spilling to the stack and authenticate
  the LR once restored
- If the target is v8.3a or greater than can use the combined authenticate and
  return instruction

Differential revision: https://reviews.llvm.org/D49793

llvm-svn: 340018
2018-08-17 12:53:22 +00:00
Nemanja Ivanovic 39751276b0 [PowerPC] Generate Power9 extswsli extend sign and shift immediate instruction
Add a DAG combine for the PowerPC code generator to generate the Power9 extswsli
extend sign and shift immediate instruction.

Patch by RolandF.

Differential revision: https://reviews.llvm.org/D49879

llvm-svn: 340016
2018-08-17 12:35:44 +00:00
Simon Pilgrim 03e57521c0 [DAGCombiner] extractShiftForRotate - fix out of range shift issue
Don't just check for negative shift amounts.

Fixes OSS Fuzz #9935
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=9935

llvm-svn: 340015
2018-08-17 12:25:18 +00:00
Andrea Di Biagio f874607f32 [InstCombine] Remove unused method FAddCombine::createFDiv(). NFC
This commit fixes a (gcc 7.3.0) [-Wunused-function] warning caused by the
presence of unused method FaddCombine::createFDiv().
The last use of that method was removed at r339519.

llvm-svn: 340014
2018-08-17 11:33:48 +00:00
Bernard Ogden b828bb2a15 [ARM/AArch64] Support FP16 +fp16fml instructions
Add +fp16fml feature for new FP16 instructions, which are a
mandatory part of FP16 from v8.4-A and an optional part of FP16
from v8.2-A. It doesn't seem to be possible to model this in
LLVM, but the relationship between the options is handled by
the related clang patch.

In keeping with what I think is the usual practice, the fp16fml
extension is accepted regardless of base architecture version.

Builds on/replaces Sjoerd Meijer's patch to add these instructions at
https://reviews.llvm.org/D49839.

Differential Revision: https://reviews.llvm.org/D50228

llvm-svn: 340013
2018-08-17 11:29:49 +00:00
Simon Pilgrim 5113b48798 [DAGCombine] Improve (sra (sra x, c1), c2) -> (sra x, (add c1, c2)) folding
Add support for cases where only some c1+c2 results exceed the max bitshift, clamping accordingly.

Differential Revision: https://reviews.llvm.org/D35722

llvm-svn: 340010
2018-08-17 10:52:49 +00:00
Daniel Cederman 0c597ca223 [Sparc] Get sret arg size from CallLoweringInfo.getArgs()
Summary:
Looking at the callee argument list, as is done now, might not work if
the function has been typecasted into one that is expected to return
a struct. This change also simplifies the code.

The isFP128ABICall() function can be removed as it is no longer needed.
The test in fp128.ll has been updated to verify this.

Reviewers: jyknight, venkatra

Reviewed By: jyknight

Subscribers: fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D48117

llvm-svn: 340008
2018-08-17 10:40:00 +00:00
Simon Pilgrim 22d580f2ca Fix "control reaches end of non-void function" -Wreturn-type warning. NFCI.
llvm-svn: 340006
2018-08-17 09:47:52 +00:00
Daniel Cederman 7d3e08ff8d [Sparc] Flush register windows for @llvm.returnaddress(1)
Summary: When @llvm.returnaddress is called with a value higher than 0
it needs to read from the call stack to get the return address. This
means that the register windows needs to be flushed to the stack to
guarantee that the data read is valid. For values higher than 1 this
is done indirectly by the call to getFRAMEADDR(), but not for the value 1.

Reviewers: jyknight, venkatra

Reviewed By: jyknight

Subscribers: fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D48636

llvm-svn: 340003
2018-08-17 09:18:31 +00:00
Chen Zheng e2d47dd1bb [MISC]Fix wrong usage of std::equal()
Differential Revision: https://reviews.llvm.org/D49958

llvm-svn: 340000
2018-08-17 07:51:01 +00:00
Sjoerd Meijer 31239a4c6a [ARM][NFC] ARMCodeGenPrepare: some refactoring and algorithm description
Differential Revision: https://reviews.llvm.org/D50846

llvm-svn: 339997
2018-08-17 07:34:01 +00:00
Max Kazantsev 7b78d3920c [MustExecute] Fix algorithmic bug in isGuaranteedToExecute. PR38514
The description of `isGuaranteedToExecute` does not correspond to its implementation.
According to description, it should return `true` if an instruction is executed under the
assumption that its loop is *entered*. However there is a sophisticated alrogithm inside
that tries to prove that the instruction is executed if the loop is *exited*, which is not the
same thing for infinite loops. There is an attempt to protect from dealing with infinite loops
by prohibiting loops without exit blocks, however an infinite loop can have exit blocks.

As result of that, MustExecute can falsely consider some blocks that are never entered as
mustexec, and LICM can hoist dangerous instructions out of them basing on this fact.
This may introduce UB to programs which did not contain it initially.

This patch removes the problematic algorithm and replaced it with a one which tries to
prove what is required in description.

Differential Revision: https://reviews.llvm.org/D50558
Reviewed By: reames

llvm-svn: 339984
2018-08-17 06:19:17 +00:00
Chandler Carruth b898b86f49 Revert r339977: [GISel]: Add Opcodes for a few LLVM Intrinsics
This is breaking ~all the bots.

llvm-svn: 339982
2018-08-17 04:47:16 +00:00
Graydon Hoare eac6e87118 [Support] Add a public API to allow clearing all (static) timer groups.
Summary:
Formerly, all timer groups were automatically cleared when printed out. In
https://reviews.llvm.org/rL324788 this behaviour was changed to not-clearing
timers on printout, to allow printing timers more than once, but as a result
clients (specifically Swift) that relied on the clear-on-print behaviour to
inhibit duplicate timer printing on shutdown were broken.

Rather than revert that change, this change adds a new API that enables
clients that _want_ to clear all timers to do so explicitly.

Reviewers: george.karpenkov, thegameg

Reviewed By: george.karpenkov

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50874

llvm-svn: 339980
2018-08-17 04:13:19 +00:00
Aditya Nandakumar 973a557338 [GISel]: Add Opcodes for a few LLVM Intrinsics
https://reviews.llvm.org/D50401

Add opcodes for llvm.intrinsic.trunc, round, and update the IRTranslator
for the same.

Reviewed by: dsanders.

llvm-svn: 339977
2018-08-17 01:41:56 +00:00
Heejin Ahn a93e726170 [WebAssembly] Modify LateEHPrepare one-line description (NFC)
llvm-svn: 339972
2018-08-17 00:12:04 +00:00
David Blaikie 0e03047e85 DebugInfo: Remove command line (& target-based) disabling of pubnames in favor of metadata
Now that Clang disables NVPTX pubnames via metadata there's no need for
this fallback to target detection in the backend.

llvm-svn: 339970
2018-08-16 23:57:15 +00:00
Heejin Ahn e76fa9ecca [WebAssembly] CFG stackify support for exception handling
Summary:
This adds support for exception handling to CFGStackify pass. This only
adds TRY / END_TRY markers and DOES NOT yet fix unwind mismatches that
can be created by the linearization of the CFG into the structural wasm
format. The mismatch fix will be added by following patches.

In detail, this patch
- Added support for TRY / END_TRY markers to support EH
- Changed many static functions into class member functions as they take
too many arguments now
- Added several more bookeeping data structures
- Refactored routines that decide where to insert markers, because
without refactoring this got too complicated as we added support for new
kinds of markers (TRY/END_TRY).
- Rewrote rethrow instructions' BB arguments to relative depths in EH
pad stack.

Reviewers: dschuff, sunfish

Subscribers: sbc100, jgravelle-google, llvm-commits

Differential Revision: https://reviews.llvm.org/D48273

llvm-svn: 339967
2018-08-16 23:50:59 +00:00
Chandler Carruth 75ca6be1c1 [x86/MIR] Implement support for pre- and post-instruction symbols, as
well as MIR parsing support for `MCSymbol` `MachineOperand`s.

The only real way to test pre- and post-instruction symbol support is to
use them in operands, so I ended up implementing that within the patch
as well. I can split out the operand support if folks really want but it
doesn't really seem worth it.

The functional implementation of pre- and post-instruction symbols is
now *completely trivial*. Two tiny bits of code in the (misnamed)
AsmPrinter. It should be completely target independent as well. We emit
these exactly the same way as we emit basic block labels. Most of the
code here is to give full dumping, MIR printing, and MIR parsing support
so that we can write useful tests.

The MIR parsing of MC symbol operands still isn't 100%, as it forces the
symbols to be non-temporary and non-local symbols with names. However,
those names often can encode most (if not all) of the special semantics
desired, and unnamed symbols seem especially annoying to serialize and
de-serialize. While this isn't perfect or full support, it seems plenty
to write tests that exercise usage of these kinds of operands.

The MIR support for pre-and post-instruction symbols was quite
straightforward. I chose to print them out in an as-if-operand syntax
similar to debug locations as this seemed the cleanest way and let me
use nice introducer tokens rather than inventing more magic punctuation
like we use for memoperands.

However, supporting MIR-based parsing of these symbols caused me to
change the design of the symbol support to allow setting arbitrary
symbols. Without this, I don't see any reasonable way to test things
with MIR.

Differential Revision: https://reviews.llvm.org/D50833

llvm-svn: 339962
2018-08-16 23:11:05 +00:00
Sanjay Patel 8ba631d9c8 [InstCombine] add reflection fold for tan(-x)
This is a follow-up suggested with rL339604.
For tan(), we don't have a corresponding LLVM 
intrinsic -- unlike sin/cos -- so this is the 
only way/place that we can do this fold currently.

llvm-svn: 339958
2018-08-16 22:46:20 +00:00
Vedant Kumar ee6c233ae0 [InstrProf] Use atomic profile counter updates for TSan
Thread sanitizer instrumentation fails to skip all loads and stores to
profile counters. This can happen if profile counter updates are merged:

  %.sink = phi i64* ...
  %pgocount5 = load i64, i64* %.sink
  %27 = add i64 %pgocount5, 1
  %28 = bitcast i64* %.sink to i8*
  call void @__tsan_write8(i8* %28)
  store i64 %27, i64* %.sink

To suppress TSan diagnostics about racy counter updates, make the
counter updates atomic when TSan is enabled. If there's general interest
in this mode it can be surfaced as a clang/swift driver option.

Testing: check-{llvm,clang,profile}

rdar://40477803

Differential Revision: https://reviews.llvm.org/D50867

llvm-svn: 339955
2018-08-16 22:24:47 +00:00
Alina Sbirlea 2ab544bcf5 Update MemorySSA in Local utils removing blocks.
Summary: Extend Local utils to update MemorySSA.

Subscribers: sanjoy, jlebar, Prazek, george.burgess.iv, llvm-commits

Differential Revision: https://reviews.llvm.org/D48790

llvm-svn: 339951
2018-08-16 21:58:44 +00:00
Alina Sbirlea d4b3f19ba6 [DomTree] Add constructor to create a new DT based on current DT/CFG and a set of Updates.
Summary:
Add the posibility of creating a new DT using a set of Updates.
This will essentially create a DT based on a CFG snapshot/view.

Additional refactoring for either this patch or follow-ups:
- create an utility for building BUI.
- replace BUI with a GraphDiff.

Reviewers: kuhar

Subscribers: sanjoy, jlebar, llvm-commits

Differential Revision: https://reviews.llvm.org/D50671

llvm-svn: 339947
2018-08-16 21:54:33 +00:00
Craig Topper 883ff69c93 [DAGCombiner] Don't reassociate operations that have the vector reduction flag set.
When nodes are reassociated the vector-reduction flag gets lost.

The test case is here is what would happen if you had a sum of absolute differences loop that started with a non-zero but contant sum and that loop was unrolled. The vectorizer will generate a constant vector for the initial value. And DAGCombiner reassociate tries to move it down the addition tree erasing the vector-reduction flag. Interestingly this moves constants the opposite direction of the reassociate IR pass.

I've chosen to just punt on the reassociate, but I suppose we could maybe preserve the flag if both nodes have it set.

Differential Revision: https://reviews.llvm.org/D50827

llvm-svn: 339946
2018-08-16 21:54:05 +00:00
Craig Topper bde2b43cb3 [X86] In EFLAGS copy pass, don't emit EXTRACT_SUBREG instructions since we're after peephole
Normally the peephole pass converts EXTRACT_SUBREG to COPY instructions. But we're after peephole so we can't rely on it to clean these up.

To fix this, the eflags pass now emits a COPY with a subreg input.

I also noticed that in 32-bit mode we need to constrain the input to the copy to ensure the subreg is valid. Otherwise we'll fail verify-machineinstrs

Differential Revision: https://reviews.llvm.org/D50656

llvm-svn: 339945
2018-08-16 21:54:02 +00:00
Richard Smith a6c34887f7 Factor Node creation out of the demangler. No functionality change
intended.

llvm-svn: 339944
2018-08-16 21:40:57 +00:00
Reid Kleckner 602c0dafdd [MC] Improve COFF associative section lookup
Handle the case when the symbol is private. Private symbols are not in
the COFF object file symbol table, so they aren't inserted into
SymbolMap. We can't look up the section of the symbol that way. Instead,
get the MCSection from the MCSymbol and map that to the object file
section.

Print a better error message when the symbol has no section, like when
the symbol is undefined.

Fixes PR38607

llvm-svn: 339942
2018-08-16 21:34:41 +00:00
Chandler Carruth c73c0307fe [MI] Change the array of `MachineMemOperand` pointers to be
a generically extensible collection of extra info attached to
a `MachineInstr`.

The primary change here is cleaning up the APIs used for setting and
manipulating the `MachineMemOperand` pointer arrays so chat we can
change how they are allocated.

Then we introduce an extra info object that using the trailing object
pattern to attach some number of MMOs but also other extra info. The
design of this is specifically so that this extra info has a fixed
necessary cost (the header tracking what extra info is included) and
everything else can be tail allocated. This pattern works especially
well with a `BumpPtrAllocator` which we use here.

I've also added the basic scaffolding for putting interesting pointers
into this, namely pre- and post-instruction symbols. These aren't used
anywhere yet, they're just there to ensure I've actually gotten the data
structure types correct. I'll flesh out support for these in
a subsequent patch (MIR dumping, parsing, the works).

Finally, I've included an optimization where we store any single pointer
inline in the `MachineInstr` to avoid the allocation overhead. This is
expected to be the overwhelmingly most common case and so should avoid
any memory usage growth due to slightly less clever / dense allocation
when dealing with >1 MMO. This did require several ergonomic
improvements to the `PointerSumType` to reasonably support the various
usage models.

This also has a side effect of freeing up 8 bits within the
`MachineInstr` which could be repurposed for something else.

The suggested direction here came largely from Hal Finkel. I hope it was
worth it. ;] It does hopefully clear a path for subsequent extensions
w/o nearly as much leg work. Lots of thanks to Reid and Justin for
careful reviews and ideas about how to do all of this.

Differential Revision: https://reviews.llvm.org/D50701

llvm-svn: 339940
2018-08-16 21:30:05 +00:00
David Blaikie 66cf14d06b DebugInfo: Add metadata support for disabling DWARF pub sections
In cases where the debugger load time is a worthwhile tradeoff (or less
costly - such as loading from a DWP instead of a variety of DWOs
(possibly over a high-latency/distributed filesystem)) against object
file size, it can be reasonable to disable pubnames and corresponding
gdb-index creation in the linker.

A backend-flag version of this was implemented for NVPTX in
D44385/r327994 - which was fine for NVPTX which wouldn't mix-and-match
CUs. Now that it's going to be a user-facing option (likely powered by
"-gno-pubnames", the same as GCC) it should be encoded in the
DICompileUnit so it can vary per-CU.

After this, likely the NVPTX support should be migrated to the metadata
& the previous flag implementation should be removed.

Reviewers: aprantl

Differential Revision: https://reviews.llvm.org/D50213

llvm-svn: 339939
2018-08-16 21:29:55 +00:00
Michael Berg ed89d069f4 add a missed case for binary op FMF propagation under select folds
llvm-svn: 339938
2018-08-16 20:59:45 +00:00
Philip Reames 684fa57ef7 [MemLoc] Fix a bug causing any use of invariant.end to crash in LICM
The fix is fairly simple, but is says something unpleasant about the usage and testing of invariant.start/end scopes that this went undetected.  To put this in perspective, *any* invariant.end in a loop flowing through LICM crashed.  I haven't bothered to figure out just how far back this goes, but it's not caused by any of the recent changes.  We're probably talking months if not years.  

llvm-svn: 339936
2018-08-16 20:48:55 +00:00
Philip Reames 0e2f9b9e30 [LICM][NFC] Restructure pointer invalidation API in terms of MemoryLocation
Main value is just simplifying code.  I'll further simply the argument handling case in a bit, but that involved a slightly orthogonal change so I went with the mildy ugly intermediate for this patch.

Note that the isSized check in the old LICM code was not carried across.  It turns out that check was dead.  a) no test exercised it, and b) langref and verifier had been updated to disallow unsized types used in loads.

llvm-svn: 339930
2018-08-16 20:11:15 +00:00
Jacob Gravelle 3d668d3928 [WebAssembly] Remove temporary workaround for function bitcasts
Summary:
EM_ASM no longer is lowered as varargs in C, so this workaround is
obsolete.

Reviewers: dschuff, sunfish

Subscribers: sbc100, aheejin, llvm-commits

Differential Revision: https://reviews.llvm.org/D50859

llvm-svn: 339925
2018-08-16 19:24:31 +00:00
Krzysztof Parzyszek 9af86a5e01 [MachineVerifier] Check if predecessor is jointly dominated by undefs
Each use of a value should be jointly dominated by the union of defs and
undefs. It can happen that it will only be jointly dominated by undefs,
and that is still legal. Make sure that the verifier is aware of that.

llvm-svn: 339924
2018-08-16 19:13:28 +00:00
Eli Friedman 73e8a784e6 [SelectionDAG] Improve the legalisation lowering of UMULO.
There is no way in the universe, that doing a full-width division in
software will be faster than doing overflowing multiplication in
software in the first place, especially given that this same full-width
multiplication needs to be done anyway.

This patch replaces the previous implementation with a direct lowering
into an overflowing multiplication algorithm based on half-width
operations.

Correctness of the algorithm was verified by exhaustively checking the
output of this algorithm for overflowing multiplication of 16 bit
integers against an obviously correct widening multiplication. Baring
any oversights introduced by porting the algorithm to DAG, confidence in
correctness of this algorithm is extremely high.

Following table shows the change in both t = runtime and s = space. The
change is expressed as a multiplier of original, so anything under 1 is
“better” and anything above 1 is worse.

+-------+-----------+-----------+-------------+-------------+
| Arch  | u64*u64 t | u64*u64 s | u128*u128 t | u128*u128 s |
+-------+-----------+-----------+-------------+-------------+
|   X64 |     -     |     -     |    ~0.5     |    ~0.64    |
|  i686 |   ~0.5    |   ~0.6666 |    ~0.05    |    ~0.9     |
| armv7 |     -     |   ~0.75   |      -      |    ~1.4     |
+-------+-----------+-----------+-------------+-------------+

Performance numbers have been collected by running overflowing
multiplication in a loop under `perf` on two x86_64 (one Intel Haswell,
other AMD Ryzen) based machines. Size numbers have been collected by
looking at the size of function containing an overflowing multiply in
a loop.

All in all, it can be seen that both performance and size has improved
except in the case of armv7 where code size has regressed for 128-bit
multiply. u128*u128 overflowing multiply on 32-bit platforms seem to
benefit from this change a lot, taking only 5% of the time compared to
original algorithm to calculate the same thing.

The final benefit of this change is that LLVM is now capable of lowering
the overflowing unsigned multiply for integers of any bit-width as long
as the target is capable of lowering regular multiplication for the same
bit-width. Previously, 128-bit overflowing multiply was the widest
possible.

Patch by Simonas Kazlauskas!

Differential Revision: https://reviews.llvm.org/D50310

llvm-svn: 339922
2018-08-16 18:39:39 +00:00
Krzysztof Parzyszek 17143f6111 [RegisterCoalescer] Shrink to uses if needed after removeCopyByCommutingDef
llvm-svn: 339912
2018-08-16 18:02:59 +00:00
Zachary Turner af738f7277 Fix memory leak in demangling of string literals.
llvm-svn: 339909
2018-08-16 17:48:32 +00:00
Simon Pilgrim 87d0039a45 [TargetLowering] Add support for non-uniform vectors to BuildSDIV
This patch refactors the existing TargetLowering::BuildSDIV base implementation to support non-uniform constant vector denominators.

This is the last patch necessary to close PR36545

Differential Revision: https://reviews.llvm.org/D50765

llvm-svn: 339908
2018-08-16 17:44:33 +00:00
Reid Kleckner bd5d71229d [codeview] Use push_macro to avoid conflicts instead of a prefix
Summary:
This prefix was added in r333421, and it changed our dumper output to
say things like "CVRegEAX" instead of just "EAX". That's a functional
change that I'd rather avoid.

I tested GCC, Clang, and MSVC, and all of them support #pragma
push_macro. They don't issue warnings whem the macro is not defined
either.

I don't have a Mac so I can't test the real termios.h header, but I
looked at the termios.h sources online and looked for other conflicts.
I saw only the CR* macros, so those are the ones we work around.

Reviewers: zturner, JDevlieghere

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D50851

llvm-svn: 339907
2018-08-16 17:34:31 +00:00
Nirav Dave eb189a0ef7 [MC] Cleanup noop default case spelling. NFC.
llvm-svn: 339906
2018-08-16 17:22:31 +00:00
Matt Arsenault 7121bed210 AMDGPU: Custom lower fexp
This will allow the library to just use __builtin_expf directly
without expanding this itself. Note f64 still won't work because
there is no exp instruction for it.

llvm-svn: 339902
2018-08-16 17:07:52 +00:00
Simon Pilgrim ede4905375 [TargetLowering] Refactor BuildSDIV in preparation for D50765. NFCI.
Pull out magic factor calculators into a helper function, use 0/+1/-1 multiplication factor to (optionally) add/sub the numerator.

llvm-svn: 339898
2018-08-16 16:54:06 +00:00
Benjamin Kramer 0ce64c81e7 [MC] Remove unused variable
llvm-svn: 339896
2018-08-16 16:50:23 +00:00
Nirav Dave 7fd992a755 [MC][X86] Enhance X86 Register expression handling to more closely match GCC.
Allow the comparison of x86 registers in the evaluation of assembler
directives. This generalizes and simplifies the extension from r334022
to catch another case found in the Linux kernel.

Reviewers: rnk, void

Reviewed By: rnk

Subscribers: hiraditya, nickdesaulniers, llvm-commits

Differential Revision: https://reviews.llvm.org/D50795

llvm-svn: 339895
2018-08-16 16:31:14 +00:00
Zachary Turner d78fe2f46d Fix -Wmicrosoft-goto warnings.
llvm-svn: 339894
2018-08-16 16:30:27 +00:00
Zachary Turner 2838b59121 Add support for AVX-512 CodeView registers.
When compiling with /arch:AVX512 and optimizations turned on,
we could crash while emitting debug info because we did not
have CodeView register constants for the AVX 512 register
set defined.  This patch defines them.

Differential Revision: https://reviews.llvm.org/D50819

llvm-svn: 339893
2018-08-16 16:17:55 +00:00
Zachary Turner 970fdc3236 [MS Demangler] Demangle string literals.
When demangling string literals, Microsoft's undname
simply prints 'string'.  This patch implements string
literal demangling while doing a bit better than this
by decoding as much of the string as possible and
trying to faithfully reproduce the original string
literal definition.

This is a bit tricky because the different character
types char, char16_t, and char32_t are not uniquely
identified by the mangling, so we have to use a
heuristic to try to guess the character type.  But
it works pretty well, and many tests are added to
illustrate the behavior.

Differential Revision: https://reviews.llvm.org/D50806

llvm-svn: 339892
2018-08-16 16:17:36 +00:00
Zachary Turner 83313f8f54 [MS Demangler] Don't fail on MD5-mangled names.
When we have an MD5 mangled name, we shouldn't choke and say
that it's an invalid name.  Even though it's impossible to demangle,
we should just output the original name.

llvm-svn: 339891
2018-08-16 16:17:17 +00:00
Evandro Menezes c05c7e11bb [InstCombine] Expand the simplification of pow(x, 0.5) to sqrt(x)
Expand the number of cases when `pow(x, 0.5)` is simplified into `sqrt(x)`
by considering the math semantics with more granularity.

Differential revision: https://reviews.llvm.org/D50036

llvm-svn: 339887
2018-08-16 15:58:08 +00:00
Sanjay Patel 039f556f44 [InstCombine] move vector compare before same-shuffled ops
This is a step towards fixing PR37463:
https://bugs.llvm.org/show_bug.cgi?id=37463

llvm-svn: 339875
2018-08-16 12:52:17 +00:00
Sam Parker 0d51197051 [ARM] Ignore GEPs in ARMCodeGenPrepare
While searching through the use-def tree, ignore GetElementPtrInst
instructions because they don't need promoting and neither do their
indices. Otherwise, the wide indices prevent the transformation from
happening.

Differential Revision: https://reviews.llvm.org/D50762

llvm-svn: 339871
2018-08-16 12:24:40 +00:00
Sam Parker 0e2f0bd48e [ARM] Allow zext in ARMCodeGenPrepare
Treat zext instructions as roots, like we do for truncs.

Differential Revision: https://reviews.llvm.org/D50759

llvm-svn: 339868
2018-08-16 11:54:09 +00:00
Alex Bradbury fdc4647ca3 [RISCV][MC] Don't fold symbol differences if requiresDiffExpressionRelocations is true
When emitting the difference between two symbols, the standard behavior is 
that the difference will be resolved to an absolute value if both of the 
symbols are offsets from the same data fragment. This is undesirable on 
architectures such as RISC-V where relaxation in the linker may cause the 
computed difference to become invalid. This caused an issue when compiling to 
object code, where the size of a function in the debug information was already 
calculated even though it could change as a consequence of relaxation in the 
subsequent linking stage.

This patch inhibits the resolution of symbol differences to absolute values 
where the target's AsmBackend has declared that it does not want these to be 
folded.

Differential Revision: https://reviews.llvm.org/D45773
Patch by Edward Jones.

llvm-svn: 339864
2018-08-16 11:26:37 +00:00
Simon Pilgrim 8f46505beb [ADT] Replace APInt::WORD_MAX with APInt::WORDTYPE_MAX
The windows SDK defines WORD_MAX, so any poor soul that wants to use LLVM in a project that depends on the windows SDK gets a build error.

Given that it actually describes the maximal value of WordType, it actually fits even better than WORD_MAX

Patch by: @miscco

Differential Revision: https://reviews.llvm.org/D50777

llvm-svn: 339863
2018-08-16 11:08:23 +00:00
Sam Parker 13567dbbd8 [ARM] Allow signed icmps in ARMCodeGenPrepare
Originally committed in r339755 which was reverted in r339806 due to
an asan issue. The issue was caused by my assumption that operands to
a CallInst mapped to the FunctionType Params. CallInsts are now
handled by iterating over their ArgOperands instead of Operands.
    
Original Message:
  Treat signed icmps as 'sinks', allowing them to be in the use-def
  tree, enabling more promotions to be performed. As a sink, any
  promoted incoming values need to be truncated before being used by
  the signed icmp.
    
  Differential Revision: https://reviews.llvm.org/D50067

llvm-svn: 339858
2018-08-16 10:05:39 +00:00
Simon Atanasyan a8ac4308aa [mips] Remove dead code from MipsPassConfig
Found by GCC's -Wunused-function.

Patch by Kim Gräsman.

Differential revision: https://reviews.llvm.org/D50612

llvm-svn: 339847
2018-08-16 08:43:17 +00:00
Max Kazantsev 72d7d649e3 [NFC] Remove const modifier to allow further development in LICM
llvm-svn: 339846
2018-08-16 08:30:15 +00:00
Max Kazantsev a7415874c9 [NFC] Add missing const modifier
llvm-svn: 339844
2018-08-16 06:28:04 +00:00
Craig Topper 9c1d9fdeaa [X86] Remove masking from the 512-bit padds and psubs intrinsics. Use select in IR instead.
llvm-svn: 339842
2018-08-16 06:20:24 +00:00
Craig Topper 9d6983c9fd [X86] Remove the unused masked 128 and 256-bit masked padds/psubs intrinsics.
Still need to remove masking from the 512-bit versions.

llvm-svn: 339841
2018-08-16 06:20:22 +00:00
Chandler Carruth 00c35c7794 [x86] Actually initialize the SLH pass with the x86 backend and use
a shorter name ('x86-slh') for the internal flags and pass name.

Without this, you can't use the -stop-after or -stop-before
infrastructure. I seem to have just missed this when originally adding
the pass.

The shorter name solves two problems. First, the flag names were ...
really long and hard to type/manage. Second, the pass name can't be the
exact same as the flag name used to enable this, and there are already
some users of that flag name so I'm avoiding changing it unnecessarily.

llvm-svn: 339836
2018-08-16 01:22:19 +00:00
Easwaran Raman aca738b742 [BFI] Use rounding while computing profile counts.
Summary:
Profile count of a block is computed by multiplying its block frequency
by entry count and dividing the result by entry block frequency. Do
rounded division in the last step and update test cases appropriately.

Reviewers: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50822

llvm-svn: 339835
2018-08-16 00:26:59 +00:00
George Burgess IV 3083105b81 [Metadata] Replace a SmallVector with an array; NFC
MDNode::get takes an ArrayRef, so these should be equivalent.

llvm-svn: 339824
2018-08-15 22:15:35 +00:00
Guozhi Wei 8c17f9a77d [CodeGenPrepare] Add BothExtension type to PromotedInsts
This patch fixes PR38125.

Instruction extension types are recorded in PromotedInsts, it can be used later in function canGetThrough. If an instruction has two users with different extension types, it will be inserted into PromotedInsts two times in function promoteOperandForOther. The second one overwrites the first one, and the final extension type is wrong, later causes problem in canGetThrough.

This patch changes the simple bool extension type to 2-bit enum type, add a BothExtension type in addition to zero/sign extension. When an user sees BothExtension for an instruction, it actually knows nothing about how that instruction is extended.

Differential Revision: https://reviews.llvm.org/D49512

llvm-svn: 339822
2018-08-15 22:08:26 +00:00
Matt Arsenault f533e6b0ed AMDGPU: Fold fneg into fmed3
llvm-svn: 339821
2018-08-15 21:46:27 +00:00
Matt Arsenault a816073764 AMDGPU: Improve extract_vector_elt reduction combine
Handle fmul, fsub and preserve flags.

Also really test minnum/maxnum reductions.
The existing tests were only checking from
minnum/maxnum matched from a fast math compare
and select which is not the same.

llvm-svn: 339820
2018-08-15 21:34:06 +00:00
Matt Arsenault b3a80e5397 AMDGPU: Implement llvm.amdgcn.icmp/fcmp for i16/f16
Also support these on targets without support for these,
since it will allow us to freely create these in instcombine.

llvm-svn: 339819
2018-08-15 21:25:20 +00:00
Craig Topper 08e082619a [X86] Improve AVX1 shuffle lowering for v8f32 shuffles where the low half comes from V1 and the high half comes from V2 and the halves do the same operation
To lower this we now create a new V1 containing the low half of both sources and a new V2 containing the upper half of both sources. Then we created a repeated lane shuffle of those new sources to create the final result.

This fixes PR35833

Differential Revison: https://reviews.llvm.org/D41794

llvm-svn: 339818
2018-08-15 21:21:52 +00:00
Matt Arsenault 9a389fbd79 AMDGPU: Stop producing icmp/fcmp intrinsics with invalid types
llvm-svn: 339815
2018-08-15 21:14:25 +00:00
Matt Arsenault 6c7ba82900 AMDGPU: Address todo for handling 1/(2 pi)
llvm-svn: 339814
2018-08-15 21:03:55 +00:00
Matt Arsenault 0f2c1cf429 DAG: Use getObjectOffset helper
llvm-svn: 339813
2018-08-15 21:03:44 +00:00
Matt Arsenault 22f01268fe DAG: Try to custom lower when promoting float operands
For some reason this wasn't done for floats like
integers.

llvm-svn: 339811
2018-08-15 20:34:54 +00:00
Lang Hames 942cb7b3f8 [MCJIT] Fix a case of Error::success() being passed to report_fatal_error.
MCJIT::getSymbolAddress was handling a non-fatal error condition of JITSymbol
as fatal. JITSymbol::operator bool returns false if no address is available
but no error is set. This can occur e.g. if the symbol name was not found.

Patch by Jascha Wetzel. Thanks Jascha!

llvm-svn: 339809
2018-08-15 20:11:21 +00:00
Vitaly Buka ed4239f482 Revert "[ARM] Allow signed icmps in ARMCodeGenPrepare"
use-after-poison in check-llvm under asan

This reverts commit r339755.

llvm-svn: 339806
2018-08-15 20:09:35 +00:00
Lang Hames 00fb14da27 [Support] Add a basic C API for llvm::Error.
Summary:
The C-API supports consuming errors, converting an error to a string error
message, and querying an error's type. Other LLVM C APIs that wish to use
llvm::Error can supply error-type-id checkers and custom
error-to-structured-type converters for any custom errors they provide.

Reviewers: bogner, zturner, labath, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50716

llvm-svn: 339802
2018-08-15 18:42:11 +00:00
Thomas Lively 5222cb601b [WebAssembly][NFC] Standardize SIMD multiclass format
Summary:
This CL changes the ExtractLane ISEL multiclass to more closely mirror
the structure of the splat and replace_lane multiclasses.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D50794

llvm-svn: 339801
2018-08-15 18:15:18 +00:00
Peter Collingbourne 62e4fc48a5 llvm-readobj: Fix addend in relocations for android packed format
If a relocation group doesn't have the RELOCATION_GROUP_HAS_ADDEND_FLAG set, then this implies the group's addend equals zero.
In this case android packed format won't encode an explicit addend delta, instead we need to set Addend, the "previous addend" variable, to zero by ourself.

Patch by Yi-Yo Chiang!

Differential Revision: https://reviews.llvm.org/D50601

llvm-svn: 339799
2018-08-15 17:58:22 +00:00
Thomas Lively 39fe480832 [WebAssembly] Test commit
Changes a comment and some whitespace to test commit access.

llvm-svn: 339798
2018-08-15 17:50:22 +00:00
Amara Emerson 070ac768ff [InstCombine] Fix IC trying to create a xor of pointer types.
rdar://42473741

Differential Revision: https://reviews.llvm.org/D50775

llvm-svn: 339796
2018-08-15 17:46:22 +00:00
Alina Sbirlea cc2e8ccc6f [MemorySSA] Expose the verify as a debug option.
Summary: Expose VerifyMemorySSA as a debug option. If set, passes will call the MSSA->verifyMemorySSA() after calling into the updater's APIs when MemorySSA should be valid.

Reviewers: george.burgess.iv

Subscribers: sanjoy, jlebar, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D50749

llvm-svn: 339795
2018-08-15 17:34:55 +00:00
Krzysztof Parzyszek 3b097b4d3e [RegisterCoalescer] Ensure that both registers have subranges if one does
llvm-svn: 339792
2018-08-15 17:04:58 +00:00
Krzysztof Parzyszek 88d267d094 [RegisterCoalescer] Reset VNInfo def when copying segments over
llvm-svn: 339788
2018-08-15 16:21:53 +00:00
Derek Schuff 82812fb986 [WebAssembly] SIMD replace_lane
Implement and test replace_lane instructions.

Patch by Thomas Lively

Differential Revision: https://reviews.llvm.org/D50750

llvm-svn: 339786
2018-08-15 16:18:51 +00:00
Krzysztof Parzyszek 46ce441df6 [RegAlloc] Check that subreg liveness tracking applies to given virtual reg
Subregister liveness applies selectively to register classes with certain
properties. Make sure that when it's enabled, it applies to a given virtual
register (in virtual register rewriter).

llvm-svn: 339784
2018-08-15 16:07:47 +00:00
Nemanja Ivanovic 5b9a4f8ee5 [PowerPC] Enhance the selection(ISD::VSELECT) of vector type
To make ISD::VSELECT available(legal) so long as there are altivec instruction,
otherwise it's default behavior is expanding.
Use xxsel to match vselect if vsx is open, or use vsel.

In order to do not write many patterns in td file, promote (for vector it's
bitcast) all other type into v4i32 and only pattern match vselect of v4i32 into
vsel or xxsel.

Patch by wuzish
Differential revision: https://reviews.llvm.org/D49531

llvm-svn: 339779
2018-08-15 15:30:36 +00:00
Krzysztof Parzyszek 2a119b9a98 [SystemZ] Replace subreg_r with subreg_h
Change
  subreg_r32  -> subreg_h32
  subreg_r64  -> subreg_h64
  subreg_hr32 -> subreg_hh32

The subregisters subreg_r32 and subreg_r64 were added to emphasize the
fact that modifying these subregisters may clobber the entire register.
This is not necessarily the case for subreg_h32, et al.

However, the ability to compose subreg_h64 with subreg_r32, and with
subreg_h32 and subreg_l32 at the same time makes the compositions be
treated as non-overlapping (leading to problems when tracking subreg
liveness). See D50468 for more details.

Differential Revision: https://reviews.llvm.org/D50725

llvm-svn: 339778
2018-08-15 15:21:23 +00:00
Marcello Maggioni e98aaf1d91 [GVN] Fix typo in IsValueFullyAvailableInBlock. NFC.
DenseMap insert() method return a pair<iterator, bool>
not pair<iterator, char>
Noticed it and thought I might just fix it ...

llvm-svn: 339777
2018-08-15 15:06:53 +00:00
Jonas Paulsson d5a9c2d551 [SystemZ] New CL option to enable subreg liveness
This option is needed to enable subreg liveness tracking during register
allocation.

Review: Ulrich Weigand
https://reviews.llvm.org/D50779

llvm-svn: 339776
2018-08-15 15:04:49 +00:00
Chijun Sima e8263f33d9 [SimplifyCFG] Remove pointer from SmallPtrSet before deletion
Summary:
Previously, `eraseFromParent()` calls `delete` which invalidates the value of the pointer. Copying the value of the pointer later is undefined behavior in C++11 and implementation-defined (which may cause a segfault on implementations having strict pointer safety) in C++14.

This patch removes the BasicBlock pointer from related SmallPtrSet before `delete` invalidates it in the SimplifyCFG pass.

Reviewers: kuhar, dmgreen, davide, trentxintong

Reviewed By: kuhar, dmgreen

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50717

llvm-svn: 339773
2018-08-15 13:56:21 +00:00
Sam Parker fabf7fe5f8 [ARM] TypeSize lower bound for ARMCodeGenPrepare
We only try to promote types with are smaller than 16-bits, but we
also need to check that the type is not less than 8-bits.

Differential Revision: https://reviews.llvm.org/D50769

llvm-svn: 339770
2018-08-15 13:29:50 +00:00
Nemanja Ivanovic 8b4bd09e22 [PowerPC] Don't run BV DAG Combine before legalization if it assumes legal types
When trying to combine a DAG that builds a vector out of sign-extensions of
vector extracts, the code assumes legal input types. Due to that, we have to
disable this combine prior to legalization.
In some cases, the DAG will look slightly different after legalization so
account for that in the matching code.

This is a fix for https://bugs.llvm.org/show_bug.cgi?id=38087

Differential Revision: https://reviews.llvm.org/D49080

llvm-svn: 339769
2018-08-15 12:58:13 +00:00
Simon Pilgrim f3b5943ffc Remove lambda default argument to fix gcc pedantic warning.
llvm-svn: 339767
2018-08-15 12:32:09 +00:00
Simon Pilgrim 4b2317ebfb [TargetLowering] Minor cleanup of TargetLowering::BuildSDIV. NFCI.
Pull out some types to match layout in TargetLowering::BuildUDIV. Early step towards adding non-uniform vector support.

llvm-svn: 339763
2018-08-15 11:11:05 +00:00
David Green 6cb6478739 [UnJ] Rename hasInvariantIterationCount to hasIterationCountInvariantInParent NFC
This hopefully describes the API of the function more precisely.

llvm-svn: 339762
2018-08-15 10:59:41 +00:00
Simon Pilgrim a4ba43d3d3 [TargetLowering] Minor refactor to TargetLowering::BuildUDIV to merge scalar/vector magic value collection. NFCI.
Use the same ISD::matchUnaryPredicate pattern that was used in D50392.

llvm-svn: 339758
2018-08-15 10:11:13 +00:00
Simon Pilgrim e8a906ba47 [DagCombiner] Don't bother adding to the work list if TLI.BuildSDIVPow2 failed. NFCI.
Matches the code in BuildSDIV/BuildUDIV

llvm-svn: 339757
2018-08-15 10:02:54 +00:00
Simon Pilgrim a272fa9b0c [TargetLowering] Add support for non-uniform vectors to BuildExactSDIV
This patch refactors the existing BuildExactSDIV implementation to support non-uniform constant vector denominators.

Differential Revision: https://reviews.llvm.org/D50392

llvm-svn: 339756
2018-08-15 09:35:12 +00:00
Sam Parker 6548cd3905 [ARM] Allow signed icmps in ARMCodeGenPrepare
Treat signed icmps as 'sinks', allowing them to be in the use-def
tree, enabling more promotions to be performed. As a sink, any
promoted incoming values need to be truncated before being used by
the signed icmp.

Differential Revision: https://reviews.llvm.org/D50067

llvm-svn: 339755
2018-08-15 08:23:03 +00:00
Sam Parker 7def86bbdb [ARM] Allow pointer values in ARMCodeGenPrepare
Add pointers to the list of allowed types, but don't try to promote
them. Also fixed a bug with the promotion of undef values, so a new
value is now created instead of mutating in place. We also now only
promote if there's an instruction in the use-def chains other than
the icmp, sinks and sources.

Differential Revision: https://reviews.llvm.org/D50054

llvm-svn: 339754
2018-08-15 07:52:35 +00:00
Max Kazantsev 5a10d127b9 [AliasSetTracker] Do not treat experimental_guard intrinsic as memory writing instruction
The `experimental_guard` intrinsic has memory write semantics to model the thread-exiting
logic, but does not do any actual writes to memory. Currently, `AliasSetTracker` treats it as a
normal memory write. As result, a loop-invariant load cannot be hoisted out of loop because
the guard may possibly alias with it.

This patch makes `AliasSetTracker` so that it doesn't treat guards as memory writes.

Differential Revision: https://reviews.llvm.org/D50497
Reviewed By: reames

llvm-svn: 339753
2018-08-15 06:21:02 +00:00
Max Kazantsev 530b8d1c3d [NFC] Refactoring of LoopSafetyInfo, step 1
Turn structure into class, encapsulate methods, add clarifying comments.

Differential Revision: https://reviews.llvm.org/D50693
Reviewed By: reames

llvm-svn: 339752
2018-08-15 05:55:43 +00:00
Max Kazantsev df58dd8418 [NFC] Add sanitizing assertion to ICF tracker
llvm-svn: 339751
2018-08-15 05:50:38 +00:00
Max Kazantsev 68290f838a [NFC][LICM] Make hoist method void
Method hoist always returns true. This patch makes it void.

Differential Revision: https://reviews.llvm.org/D50696
Reviewed By: hiraditya

llvm-svn: 339750
2018-08-15 02:49:12 +00:00
Craig Topper 633fe98e27 [X86] Change legacy SSE scalar fp to integer intrinsics to use specific ISD opcodes instead of keeping as intrinsics. Unify SSE and AVX512 isel patterns.
AVX512 added new versions of these intrinsics that take a rounding mode. If the rounding mode is 4 the new intrinsics are equivalent to the old intrinsics.

The AVX512 intrinsics were being lowered to ISD opcodes, but the legacy SSE intrinsics were left as intrinsics. This resulted in the AVX512 instructions needing separate patterns for the ISD opcodes and the legacy SSE intrinsics.

Now we convert SSE intrinsics and AVX512 intrinsics with rounding mode 4 to the same ISD opcode so we can share the isel patterns.

llvm-svn: 339749
2018-08-15 01:23:00 +00:00
Chandler Carruth 139b35192a [SDAG] Update the AVR backend for the SelectionDAG API changes in
r339740, fixing the build for this target.

llvm-svn: 339748
2018-08-15 01:22:50 +00:00
Evgeniy Stepanov a265a13bbe [hwasan] Add a basic API.
Summary:
Add user tag manipulation functions:
  __hwasan_tag_memory
  __hwasan_tag_pointer
  __hwasan_print_shadow (very simple and ugly, for now)

Reviewers: vitalybuka, kcc

Subscribers: kubamracek, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D50746

llvm-svn: 339746
2018-08-15 00:39:35 +00:00
Derek Schuff 4ec8bca13e [WebAssembly] SIMD Splats
Implement and test SIMD splat ops.

Patch by Thomas Lively

Differential Revision: https://reviews.llvm.org/D50741

llvm-svn: 339744
2018-08-15 00:30:27 +00:00
Chandler Carruth 66654b72c9 [SDAG] Remove the reliance on MI's allocation strategy for
`MachineMemOperand` pointers attached to `MachineSDNodes` and instead
have the `SelectionDAG` fully manage the memory for this array.

Prior to this change, the memory management was deeply confusing here --
The way the MI was built relied on the `SelectionDAG` allocating memory
for these arrays of pointers using the `MachineFunction`'s allocator so
that the raw pointer to the array could be blindly copied into an
eventual `MachineInstr`. This creates a hard coupling between how
`MachineInstr`s allocate their array of `MachineMemOperand` pointers and
how the `MachineSDNode` does.

This change is motivated in large part by a change I am making to how
`MachineFunction` allocates these pointers, but it seems like a layering
improvement as well.

This would run the risk of increasing allocations overall, but I've
implemented an optimization that should avoid that by storing a single
`MachineMemOperand` pointer directly instead of allocating anything.
This is expected to be a net win because the vast majority of uses of
these only need a single pointer.

As a side-effect, this makes the API for updating a `MachineSDNode` and
a `MachineInstr` reasonably different which seems nice to avoid
unexpected coupling of these two layers. We can map between them, but we
shouldn't be *surprised* at where that occurs. =]

Differential Revision: https://reviews.llvm.org/D50680

llvm-svn: 339740
2018-08-14 23:30:32 +00:00
Cameron McInally 00b0658aae [FPEnv] Scalarize StrictFP vector operations
Add a helper function to scalarize constrained FP operations as needed.

Differential Revision: https://reviews.llvm.org/D50720

llvm-svn: 339735
2018-08-14 22:13:11 +00:00
Eli Friedman 0d12e90bf5 [ARM] Make PerformSHLSimplify add nodes to the DAG worklist correctly.
Intentionally excluding nodes from the DAGCombine worklist is likely to
lead to weird optimizations and infinite loops, so it's generally a bad
idea.

To avoid the infinite loops, fix DAGCombine to use the
isDesirableToCommuteWithShift target hook before performing the
transforms in question, and implement the target hook in the ARM backend
disable the transforms in question.

Fixes https://bugs.llvm.org/show_bug.cgi?id=38530 . (I don't have a
reduced testcase for that bug. But we should have sufficient test
coverage for PerformSHLSimplify given that we're not playing weird
tricks with the worklist. I can try to bugpoint it if necessary,
though.)

Differential Revision: https://reviews.llvm.org/D50667

llvm-svn: 339734
2018-08-14 22:10:25 +00:00
Matt Morehouse 0f22fac274 [SanitizerCoverage] Add associated metadata to PC guards.
Summary:
Without this metadata LLD strips unused PC table entries
but won't strip unused guards.  This metadata also seems
to influence the linker to change the ordering in the PC
guard section to match that of the PC table section.

The libFuzzer runtime library depends on the ordering
of the PC table and PC guard sections being the same.  This
is not generally guaranteed, so we may need to redesign
PC tables/guards/counters in the future.

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: kcc, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D50483

llvm-svn: 339733
2018-08-14 22:04:34 +00:00
Anna Thomas 6a1dd77f5d NFC: Clarify comment in loop vectorization legality
Clarifying the comment about PSCEV and external IV users by referencing
the bug in question.

llvm-svn: 339722
2018-08-14 20:25:13 +00:00
Adrian Prantl 55f4262999 [DebugInfoMetadata] Added DIFlags interface in DIBasicType.
Flags in DIBasicType will be used to pass attributes used in
DW_TAG_base_type, such as DW_AT_endianity.

Patch by Chirag Patel!

Differential Revision: https://reviews.llvm.org/D49610

llvm-svn: 339714
2018-08-14 19:35:34 +00:00
Heejin Ahn c9c711a0ac [WebAssembly] Fix encoding of non-SIMD vector-typed instructions
Previously SIMD_I was the same as a normal instruction except for the
addition of a HasSIM128 predicate. However, rL339186 changed the
encoding of SIMD_I instructions to automatically contain the SIMD
prefix byte. This broke the encoding of non-SIMD vector-typed
instructions, which had instantiated SIMD_I. This CL corrects this
error.

Reviewers: aheejin

Subscribers: sunfish, jgravelle-google, sbc100, llvm-commits

Differential Revision: https://reviews.llvm.org/D50682

Patch by Thomas Lively (tlively)

llvm-svn: 339710
2018-08-14 19:03:36 +00:00
Zachary Turner 2bbb23ba3b [MS Demangler] Fix some minor formatting bugs.
1) We print __restrict twice on member pointers.  This is fixed
   and relevant tests are re-enabled.

2) Several tests were disabled because of printing slightly
   different output than undname.  These were confirmed to be
   bugs in undname, so we just re-enable the tests.

3) The test for printing reference temporaries is re-enabled.  This
   is a clang mangling extension, so we have some flexibility with
   how we demangle it.  The output currently looks fine, so we just
   re-enable the test with no fixes.

llvm-svn: 339708
2018-08-14 18:54:28 +00:00
Heejin Ahn a0fd9c3e9a [WebAssembly] SIMD extract_lane
Implement instruction selection for all versions of the extract_lane
instruction. Use explicit sext/zext to differentiate between
extract_lane_s and extract_lane_u for applicable types, otherwise
default to extract_lane_u.

Reviewers: aheejin

Subscribers: sunfish, jgravelle-google, sbc100, llvm-commits

Differential Revision: https://reviews.llvm.org/D50597

Patch by Thomas Lively (tlively)

llvm-svn: 339707
2018-08-14 18:53:27 +00:00
Andrea Di Biagio 9eaf5aa006 [Tablegen][MCInstPredicate] Removed redundant template argument from class TIIPredicate, and implemented verification rules for TIIPredicates.
This patch removes redundant template argument `TargetName` from TIIPredicate.
Tablegen can always infer the target name from the context. So we don't need to
force users of TIIPredicate to always specify it.

This allows us to better modularize the tablegen class hierarchy for the
so-called "function predicates". class FunctionPredicateBase has been added; it
is currently used as a building block for TIIPredicates. However, I plan to
reuse that class to model other function predicate classes too (i.e. not just
TIIPredicates). For example, this can be a first step towards implementing
proper support for dependency breaking instructions in tablegen.

This patch also adds a verification step on TIIPredicates in tablegen.
We cannot have multiple TIIPredicates with the same name. Otherwise, this will
cause build errors later on, when tablegen'd .inc files are included by cpp
files and then compiled.

Differential Revision: https://reviews.llvm.org/D50708

llvm-svn: 339706
2018-08-14 18:36:54 +00:00
Anna Thomas 60a1e4dddc [LV] Teach about non header phis that have uses outside the loop
Summary:
This patch teaches the loop vectorizer to vectorize loops with non
header phis that have have outside uses.  This is because the iteration
dependence distance for these phis can be widened upto VF (similar to
how we do for induction/reduction) if they do not have a cyclic
dependence with header phis. When identifying reduction/induction/first
order recurrence header phis, we already identify if there are any cyclic
dependencies that prevents vectorization.

The vectorizer is taught to extract the last element from the vectorized
phi and update the scalar loop exit block phi to contain this extracted
element from the vector loop.

This patch can be extended to vectorize loops where instructions other
than phis have outside uses.

Reviewers: Ayal, mkuper, mssimpso, efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50579

llvm-svn: 339703
2018-08-14 18:22:19 +00:00
Bruno Cardoso Lopes f446282aad Revert "[DebugInfo] Generate DWARF debug information for labels. (Fix leak problems)"
This reverts commit cb8c5e417d55141f3f079a8a876e786f44308336 / r339676.

This causing a test to fail in http://green.lab.llvm.org/green/job/clang-stage1-configure-RA/48406/

    LLVM :: DebugInfo/Generic/debug-label.ll

llvm-svn: 339700
2018-08-14 17:54:41 +00:00
Simon Pilgrim 2ce3d6e135 [X86][SSE] Avoid duplicate shuffle input sources in combineX86ShufflesRecursively
rL339686 added the case where a faux shuffle might have repeated shuffle inputs coming from either side of the OR().

This patch improves the insertion of the inputs into the source ops lists to account for this, as well as making it trivial to add support for shuffles with more than 2 inputs in the future.

llvm-svn: 339696
2018-08-14 17:22:37 +00:00
Alina Sbirlea 148c445475 [DomTree] Cleanup Update and LegalizeUpdate API moved to Support header.
Summary:
Clean-up following D50479.
Make Update and LegalizeUpdate refer to the utilities in Support/CFGUpdate.

Reviewers: kuhar

Subscribers: sanjoy, jlebar, mgrang, llvm-commits

Differential Revision: https://reviews.llvm.org/D50669

llvm-svn: 339694
2018-08-14 17:12:30 +00:00
Nirav Dave fbfe2ad9e0 [DAG] Avoid redundant chain transversal in store merge cycle check. NFCI.
Patch by Henric Karlsson.

llvm-svn: 339688
2018-08-14 16:20:43 +00:00
Simon Pilgrim ed55138247 [X86][SSE] Add shuffle combine support for OR(PSHUFB,PSHUFB) style patterns.
If each element is zero from one (or both) inputs then we can combine these into a single shuffle mask.

llvm-svn: 339686
2018-08-14 16:00:05 +00:00
Fedor Sergeev b55705f6e9 [Inliner] add inliner stats to new pm version of inliner
Increment existing NumInlined and NumDeleted stats in InlinerPass::run.

llvm-svn: 339682
2018-08-14 15:19:14 +00:00
Simon Pilgrim df9880f257 [X86][SSE] Generalize lowerVectorShuffleAsBlendOfPSHUFBs to work with any vXi8 type.
We still only use this for v16i8, but this cleans up the code to support v32i8/v64i8 sometime in the future.

llvm-svn: 339679
2018-08-14 14:00:14 +00:00
Hsiangkai Wang ccae278938 [DebugInfo] Generate DWARF debug information for labels. (Fix leak problems)
There are two forms for label debug information in DWARF format.

1. Labels in a non-inlined function:

DW_TAG_label
  DW_AT_name
  DW_AT_decl_file
  DW_AT_decl_line
  DW_AT_low_pc

2. Labels in an inlined function:

DW_TAG_label
  DW_AT_abstract_origin
  DW_AT_low_pc

We will collect label information from DBG_LABEL. Before every DBG_LABEL,
we will generate a temporary symbol to denote the location of the label.
The symbol could be used to get DW_AT_low_pc afterwards. So, we create a
mapping between 'inlined label' and DBG_LABEL MachineInstr in DebugHandlerBase.
The DBG_LABEL in the mapping is used to query the symbol before it.

The AbstractLabels in DwarfCompileUnit is used to process labels in inlined
functions.

We also keep a mapping between scope and labels in DwarfFile to help to
generate correct tree structure of DIEs.

It also generates label debug information under global isel.

Differential Revision: https://reviews.llvm.org/D45556

llvm-svn: 339676
2018-08-14 13:50:59 +00:00
Amara Emerson 30e61404a8 [GlobalISel][IRTranslator] Fix a bug in handling repeating struct types during argument lowering.
Differential Revision: https://reviews.llvm.org/D49442

llvm-svn: 339674
2018-08-14 12:04:25 +00:00
Simon Pilgrim 7bae71a209 Fix MSVC "compiler limit: blocks nested too deeply" error. NFCI.
MSVC only accepts if-else chains up to 127 blocks long. I've had to merge a number of intrinsic cases together to get back below this limit, resulting in some duplication of string matches; this shouldn't cause any notable increase in runtime (and even then only for old IR, nothing that clang currently emits).

llvm-svn: 339666
2018-08-14 10:04:14 +00:00
Tomasz Krupa e766e5f636 [X86] Constant folding of adds/subs intrinsics
Summary: This adds constant folding of signed add/sub with saturation intrinsics.

Reviewers: craig.topper, spatel, RKSimon, chandlerc, efriedma

Reviewed By: craig.topper

Subscribers: rnk, llvm-commits

Differential Revision: https://reviews.llvm.org/D50499

llvm-svn: 339659
2018-08-14 09:04:01 +00:00
Roger Ferrer Ibanez c8f4dbbc63 [RISCV] Fix incorrect use of MCInstBuilder
This is a fix for r339314.

MCInstBuilder uses the named parameter idiom and an 'operator MCInst&' to ease
the creation of MCInsts. As the object of MCInstBuilder owns the MCInst is
manipulating, the lifetime of the MCInst is bound to that of MCInstBuilder.

In r339314 I bound a reference to the MCInst in an initializer. The
temporary of MCInstBuilder (and also its MCInst) is destroyed at the end of
the declaration leading to a dangling reference.

Fix this by using MCInstBuilder inside an argument of a function call.
Temporaries in function calls are destroyed in the enclosing full expression,
so the the reference to MCInst is still valid when emitToStreamer executes.

llvm-svn: 339654
2018-08-14 08:30:42 +00:00
Chih-Mao Chen 5d94b25ffe Test commit: fix punctuation
llvm-svn: 339652
2018-08-14 08:08:39 +00:00
Tomasz Krupa 86a63889f3 [X86] Lowering addus/subus intrinsics to native IR
Summary: This revision improves previous version (rL330322) which has been reverted due to crashes.

This is the patch that lowers x86 intrinsics to native IR
in order to enable optimizations. The patch also includes folding
of previously missing saturation patterns so that IR emits the same
machine instructions as the intrinsics.

Reviewers: craig.topper, spatel, RKSimon

Reviewed By: craig.topper

Subscribers: mike.dvoretsky, DavidKreitzer, sroland, llvm-commits

Differential Revision: https://reviews.llvm.org/D46179

llvm-svn: 339650
2018-08-14 08:00:56 +00:00
Sjoerd Meijer 3c859b3ec3 [ARM] ParallelDSP: add option to enable/disable the pass
Differential Revision: https://reviews.llvm.org/D50511

llvm-svn: 339645
2018-08-14 07:43:49 +00:00
Teresa Johnson b0a1d3bdf1 [ThinLTO] Fix printing of WPD remarks
Summary:
When WPD is performed in a ThinLTO backend, the function may be created
if it isn't already in that module. Module::getOrInsertFunction may
add a bitcast, in which case the returned Constant is not a Function and
doesn't have a name. Invoke stripPointerCasts() on the returned value
where we access its name.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D49959

llvm-svn: 339640
2018-08-14 03:00:16 +00:00
Teresa Johnson c7816800d8 [ThinLTO] Handle optional args in assembly format for ConstVCalls
Summary:
The AsmWriter was only writing the Args for a ConstVCall if it was
non-empty, however, the LLParser was always expecting it. To aid
in making it optional, surround the ConstVCall VFuncId and Args in
parentheses when writing, then make the Args optional when reading.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D49960

llvm-svn: 339637
2018-08-14 01:49:33 +00:00
Reid Kleckner 40e7663b1f [BasicAA] Don't assume tail calls with byval don't alias allocas
Summary:
Calls marked 'tail' cannot read or write allocas from the current frame
because the current frame might be destroyed by the time they run.
However, a tail call may use an alloca with byval. Calling with byval
copies the contents of the alloca into argument registers or stack
slots, so there is no lifetime issue. Tail calls never modify allocas,
so we can return just ModRefInfo::Ref.

Fixes PR38466, a longstanding bug.

Reviewers: hfinkel, nlewycky, gbiv, george.burgess.iv

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D50679

llvm-svn: 339636
2018-08-14 01:24:35 +00:00
Wouter van Oortmerssen a7be375586 Revert "[WebAssembly] Added default stack-only instruction mode for MC."
This reverts commit 917a99b71ce21c975be7bfbf66f4040f965d9f3c.

llvm-svn: 339630
2018-08-13 23:12:49 +00:00
Jordan Rupprecht 97ea485041 [Support] NFC: Allow modifying access/modification times independently in sys::fs::setLastModificationAndAccessTime.
Summary:
Add an overload to sys::fs::setLastModificationAndAccessTime that allows setting last access and modification times separately. This will allow tools to use this API when they want to preserve both the access and modification times from an input file, which may be different.

Also note that both the POSIX (futimens/futimes) and Windows (SetFileTime) APIs take the two timestamps in the order of (1) access (2) modification time, so this renames the method to "setLastAccessAndModificationTime" to make it clear which timestamp is which.

For existing callers, the 1-arg overload just sets both timestamps to the same thing.

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50521

llvm-svn: 339628
2018-08-13 23:03:45 +00:00
Philip Reames 90bffb3eb9 [AST] Minor formatting cleanup [NFC]
llvm-svn: 339627
2018-08-13 22:34:14 +00:00
Philip Reames 0f396696d1 [AST] Cleanup code by using MemoryLocation utility [NFC]
Differential Revision: https://reviews.llvm.org/D50588

llvm-svn: 339625
2018-08-13 22:25:16 +00:00
Craig Topper cade635c77 [X86] Don't ignore 0x66 prefix on relative jumps in 64-bit mode. Fix opcode selection of relative jumps in 16-bit mode. Treat jno/jo like other jcc instructions.
The behavior in 64-bit mode is different between Intel and AMD CPUs. Intel ignores the 0x66 prefix. AMD does not. objump doesn't ignore the 0x66 prefix. Since LLVM aims to match objdump behavior, we should do the same.

While I was trying to fix this I had change brtarget16/32 to use ENCODING_IW/ID instead of ENCODING_Iv to get the 0x66+REX.W case to act sort of sanely. It's still wrong, but that's a problem for another day.

The change in encoding exposed the fact that 16-bit mode disassembly of relative jumps was creating JMP_4 with a 2 byte immediate. It should have been JMP_2. From just printing you can't tell the difference, but if you dumped the encoding it wouldn't have matched what we started with.

While fixing that, it exposed that jo/jno opcodes were missing from the switch that this patch deleted and there were no test cases for them.

Fixes PR38537.

llvm-svn: 339622
2018-08-13 22:06:28 +00:00
Roman Lebedev 3534874fbf [InstCombine] Re-land: Optimize redundant 'signed truncation check pattern'.
Summary:
This comes with `Implicit Conversion Sanitizer - integer sign change` (D50250):
```
signed char test(unsigned int x) { return x; }
```
`clang++ -fsanitize=implicit-conversion -S -emit-llvm -o - /tmp/test.cpp -O3`
* Old: {F6904292}
* With this patch: {F6904294}

General pattern:
  X & Y

Where `Y` is checking that all the high bits (covered by a mask `4294967168`)
are uniform, i.e.  `%arg & 4294967168`  can be either  `4294967168`  or  `0`
Pattern can be one of:
  %t = add        i32 %arg,    128
  %r = icmp   ult i32 %t,      256
Or
  %t0 = shl       i32 %arg,    24
  %t1 = ashr      i32 %t0,     24
  %r  = icmp  eq  i32 %t1,     %arg
Or
  %t0 = trunc     i32 %arg  to i8
  %t1 = sext      i8  %t0   to i32
  %r  = icmp  eq  i32 %t1,     %arg
This pattern is a signed truncation check.

And `X` is checking that some bit in that same mask is zero.
I.e. can be one of:
  %r = icmp sgt i32   %arg,    -1
Or
  %t = and      i32   %arg,    2147483648
  %r = icmp eq  i32   %t,      0

Since we are checking that all the bits in that mask are the same,
and a particular bit is zero, what we are really checking is that all the
masked bits are zero.
So this should be transformed to:
  %r = icmp ult i32 %arg, 128

The transform itself ended up being rather horrible, even though i omitted some cases.
Surely there is some infrastructure that can help clean this up that i missed?

https://rise4fun.com/Alive/3Ou

The initial commit (rL339610)
was reverted, since the first assert was being triggered.
The @positive_with_extra_and test now has coverage for that case.

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: RKSimon, erichkeane, vsk, llvm-commits

Differential Revision: https://reviews.llvm.org/D50465

llvm-svn: 339621
2018-08-13 21:54:37 +00:00
Sanjay Patel 15bff18c6f [SimplifyLibCalls] don't drop fast-math-flags on trig reflection folds (retry r339608)
Even though this code is below a function called optimizeFloatingPointLibCall(),
we apparently can't guarantee that we're dealing with FPMathOperators, so bail
out immediately if that's not true.

llvm-svn: 339618
2018-08-13 21:49:19 +00:00
Roman Lebedev 28a42c7706 Revert "[InstCombine] Optimize redundant 'signed truncation check pattern'."
At least one buildbot was able to actually trigger that assert
on the top of the function. Will investigate.

This reverts commit r339610.

llvm-svn: 339612
2018-08-13 20:46:22 +00:00
Roman Lebedev 4c4750771f [InstCombine] Optimize redundant 'signed truncation check pattern'.
Summary:
This comes with `Implicit Conversion Sanitizer - integer sign change` (D50250):
```
signed char test(unsigned int x) { return x; }
```
`clang++ -fsanitize=implicit-conversion -S -emit-llvm -o - /tmp/test.cpp -O3`
* Old: {F6904292}
* With this patch: {F6904294}

General pattern:
  X & Y

Where `Y` is checking that all the high bits (covered by a mask `4294967168`)
are uniform, i.e.  `%arg & 4294967168`  can be either  `4294967168`  or  `0`
Pattern can be one of:
  %t = add        i32 %arg,    128
  %r = icmp   ult i32 %t,      256
Or
  %t0 = shl       i32 %arg,    24
  %t1 = ashr      i32 %t0,     24
  %r  = icmp  eq  i32 %t1,     %arg
Or
  %t0 = trunc     i32 %arg  to i8
  %t1 = sext      i8  %t0   to i32
  %r  = icmp  eq  i32 %t1,     %arg
This pattern is a signed truncation check.

And `X` is checking that some bit in that same mask is zero.
I.e. can be one of:
  %r = icmp sgt i32   %arg,    -1
Or
  %t = and      i32   %arg,    2147483648
  %r = icmp eq  i32   %t,      0

Since we are checking that all the bits in that mask are the same,
and a particular bit is zero, what we are really checking is that all the
masked bits are zero.
So this should be transformed to:
  %r = icmp ult i32 %arg, 128

https://rise4fun.com/Alive/3Ou

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: RKSimon, erichkeane, vsk, llvm-commits

Differential Revision: https://reviews.llvm.org/D50465

llvm-svn: 339610
2018-08-13 20:33:08 +00:00
Sanjay Patel 66c6fe6534 revert r339608 - [SimplifyLibCalls] don't drop fast-math-flags on trig reflection folds
Can't set the builder flags without knowing this is an FPMathOperator. I'll add a test
for that and try again.

llvm-svn: 339609
2018-08-13 20:20:38 +00:00
Sanjay Patel 981f50919e [SimplifyLibCalls] don't drop fast-math-flags on trig reflection folds
llvm-svn: 339608
2018-08-13 20:14:27 +00:00
Sanjay Patel e45a83d447 [SimplifyLibCalls] add reflection fold for -sin(-x) (PR38458)
This is a very partial fix for the reported problem. I suspect
we do not get this fold in most motivating cases because most of
the time, the libcall would have been replaced by an intrinsic,
and that optimization is handled elsewhere...but maybe it should
be handled here?

llvm-svn: 339604
2018-08-13 19:24:41 +00:00
Scott Linder 35213793bc [CodeGen] Fix assert in SelectionDAG::computeKnownBits
Fix SelectionDAG::computeKnownBits asserting when handling EXTRACT_SUBVECTOR
when zero extending the demanded elements mask if it is already as long as the
source vector.

Differential Revision: https://reviews.llvm.org/D49574

llvm-svn: 339600
2018-08-13 18:44:21 +00:00
Andrea Di Biagio 7b77b14198 [X86][BtVer2] Use NoSchedPredicate to model default transitions in variant scheduling classes. NFC.
llvm-svn: 339589
2018-08-13 17:52:39 +00:00
Sanjay Patel ce4ddbe960 [SimplifyLibCalls] reduce code for optimizeCos; NFCI
llvm-svn: 339588
2018-08-13 17:40:49 +00:00
Simon Pilgrim 82edf8d329 [InstCombine] Limit simplifyAllocaArraySize constant folding to values that fit into a uint64_t
Fixes OSS-Fuzz: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=5223

llvm-svn: 339584
2018-08-13 16:50:20 +00:00
Erik Pilkington ac6a801cca [itanium demangler] Add llvm::itaniumFindTypesInMangledName()
This function calls a callback whenever a <type> is parsed.

This is necessary to implement FindAlternateFunctionManglings in LLDB, which
uses a similar hack in FastDemangle. Once that function has been updated to use
this version, FastDemangle can finally be removed.

Differential revision: https://reviews.llvm.org/D50586

llvm-svn: 339580
2018-08-13 16:37:47 +00:00
Evandro Menezes 5ecd6c1a46 [SLC] Expand simplification of pow() for vector types
Also consider vector constants when simplifying `pow()`.

Differential revision: https://reviews.llvm.org/D50035

llvm-svn: 339578
2018-08-13 16:12:37 +00:00
Krzysztof Parzyszek cce15c76d3 [Hexagon] Silence -Wuninitialized warning from GCC 5.4, NFC
Patch by Kim Gräsman.

Differential Revision: https://reviews.llvm.org/D50623

llvm-svn: 339576
2018-08-13 15:08:25 +00:00
Daniel Cederman dc3e4c6d95 Revert "[Sparc] Add support for the cycle counter available in GR740"
It breaks when using EXPENSIVE_CHECKS with the error message
"Bad machine code: Using an undefined physical register".

llvm-svn: 339570
2018-08-13 14:18:09 +00:00
Sid Manning 8d4a6615e1 Check for tied operands
Differential Revision: https://reviews.llvm.org/D50592

llvm-svn: 339567
2018-08-13 14:01:25 +00:00
Jonas Paulsson 5ffb27b166 [SystemZ] Increase the amount of inlining.
Implement getInliningThresholdMultiplier() and have it return 3.

Review: Ulrich Weigand
llvm-svn: 339563
2018-08-13 13:31:30 +00:00
Simon Pilgrim 26e3d3f1c8 [DAGCombiner] simplifyDivRem - add comment describing divide by undef/zero combine. NFC.
llvm-svn: 339561
2018-08-13 13:12:25 +00:00
Simon Pilgrim ee82a79041 [CGP] Fix GEP issue with out of range APInt constant values not fitting in int64_t
Test case reduced from https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=7173

llvm-svn: 339556
2018-08-13 12:10:09 +00:00
Daniel Cederman 1bfbc62022 [Sparc] Add support for the cycle counter available in GR740
Summary: The GR740 provides an up cycle counter in the
registers ASR22 and ASR23. As these registers can not be
read together atomically we only use the value of ASR23
for llvm.readcyclecounter(). The ASR23 register holds the
32 LSBs of the up-counter.

Reviewers: jyknight, venkatra

Reviewed By: jyknight

Subscribers: fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D48638

llvm-svn: 339551
2018-08-13 10:49:48 +00:00
Luke Geeson 4ce41d2bb7 [ARM] Added FP16 VREV Vector Instrinsic CodeGen support
llvm-svn: 339546
2018-08-13 08:37:41 +00:00
Max Kazantsev 5c490b49c3 [GuardWidening] Widen very likely non-taken br instructions
This is a second part of D49974 that handles widening of conditional branches that
have very likely `false` branch.

Differential Revision: https://reviews.llvm.org/D50040
Reviewed By: reames

llvm-svn: 339537
2018-08-13 07:58:19 +00:00
Craig Topper cacf12a149 [SelectionDAG] In PromoteFloatOp_BITCAST, insert a bitcast after the fp_to_fp16 in case the result type isn't a scalar integer.
This is another variation of PR38533. In this case, the result type of the bitcast is legal and 16-bits wide, but not a scalar integer. So we need to emit the convert to i16 and then bitcast it to the true result type. This new bitcast will be further type legalized if necessary.

llvm-svn: 339536
2018-08-13 06:53:49 +00:00
Craig Topper e42a159537 [SelectionDAG] In PromoteIntRes_BITCAST, when the input is TypePromoteFloat, make sure the output type is scalar. For vectors, use a store and load of temporary.
Previously if the result type was a vector, we emitted a FP_TO_FP16 with a vector result type which isn't valid.

This is basically the opposite case of the root cause of PR38533.

llvm-svn: 339535
2018-08-13 06:53:47 +00:00
Lei Liu 901a0a9588 Restore correct x86_64 EH encodings in kernel code model
Fixes PR37524.

The exception handling encodings for x86_64 in kernel code model
has been changed with r309884.  Restore it to correct ones.  These
encodings include PersonalityEncoding, LSDAEncoding and
TTypeEncoding.

Differential Revision: https://reviews.llvm.org/D50490

llvm-svn: 339534
2018-08-13 06:06:53 +00:00
Craig Topper 42e32117bb [SelectionDAG] In PromoteFloatRes_BITCAST, insert a bitcast before the fp16_to_fp in case the input type isn't an i16.
The bitcast can be further legalized as needed.

Fixes PR38533.

llvm-svn: 339533
2018-08-13 05:26:49 +00:00
Craig Topper 8caccc32b5 [InstCombine] Fix typo in comment. NFC
llvm-svn: 339532
2018-08-13 00:54:23 +00:00
Craig Topper 8bb49218bc [InstCombine] Replace call to haveNoCommonBitsSet in visitXor with just the special case that doesn't use computeKnownBits.
Summary: computeKnownBits is expensive. The cases that would be detected by the computeKnownBits portion of haveNoCommonBitsSet were already handled by the earlier call to SimplifyDemandedInstructionBits.

Reviewers: spatel, lebedev.ri

Reviewed By: lebedev.ri

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50604

llvm-svn: 339531
2018-08-13 00:38:27 +00:00
Craig Topper 484b342c68 [X86] Add constant folding for AVX512 versions of scalar floating point to integer conversion intrinsics.
Summary:
We've supported constant folding for sse versions for many years. This patch adds support for the avx512 versions including unsigned with the default rounding mode. We could probably do more with other roundings modes and SAE in the future.

The test cases are largely based on the sse.ll test cases. But I did add some test cases to ensure the unsigned versions don't accept negative values. Also checked the bounds of f64->i32 conversions to make sure unsigned has a larger positive range than signed.

Reviewers: RKSimon, spatel, chandlerc

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50553

llvm-svn: 339529
2018-08-12 22:09:54 +00:00
Matt Arsenault 1201301b94 DAG: Check no-signed-zeros instead of unsafe-fp-math
Addresses fixme, although this should still be checking individual
operand flags.

llvm-svn: 339525
2018-08-12 19:09:12 +00:00
David Bolvansky 01d98cc03f [InstCombine] Fold Select with binary op - non-commutative opcodes
Summary:
Basic version was merged - https://reviews.llvm.org/D49954

This adds support for FP & non-commutative opcodes

Precommited tests: https://reviews.llvm.org/rL338727

Reviewers: spatel, lebedev.ri

Reviewed By: spatel

Subscribers: jfb

Differential Revision: https://reviews.llvm.org/D50190

llvm-svn: 339520
2018-08-12 17:30:07 +00:00
Sanjay Patel dc185ee275 [InstCombine] fix/enhance fadd/fsub factorization
(X * Z) + (Y * Z) --> (X + Y) * Z
  (X * Z) - (Y * Z) --> (X - Y) * Z
  (X / Z) + (Y / Z) --> (X + Y) / Z
  (X / Z) - (Y / Z) --> (X - Y) / Z

The existing code that implemented these folds failed to 
optimize vectors, and it transformed code with multiple 
uses when it should not have.

llvm-svn: 339519
2018-08-12 15:48:26 +00:00
Benjamin Kramer bae6aab6fb [InstSimplify] Guard against large shift amounts.
These are always UB, but can happen for large integer inputs. Testing it
is very fragile as -simplifycfg will nuke the UB top-down.

llvm-svn: 339515
2018-08-12 11:43:03 +00:00
Matt Arsenault 13b0db9285 AMDGPU: Check NSZ MI flag when folding omod
I'm not sure the exact nsz flag combination that
is OK. I think as long as it's on either, this is OK.
For now just check it on the omod multiply.

llvm-svn: 339513
2018-08-12 08:44:25 +00:00
Matt Arsenault b5acec1f79 AMDGPU: Use splat vectors for undefs when folding canonicalize
If one of the elements is undef, use the canonicalized constant
from the other element instead of 0.

Splat vectors are more useful for other optimizations, such
as matching vector clamps. This was breaking on clamps
of half3 from the undef 4th component.

llvm-svn: 339512
2018-08-12 08:42:54 +00:00
Matt Arsenault 3ead7d7389 AMDGPU: Fix packing undef parts of build_vector
llvm-svn: 339511
2018-08-12 08:42:46 +00:00
Craig Topper 60177f1aee [TargetLowering] Simplify one of the special cases in SimplifyDemandedBits for XOR. NFCI
We were checking for all bits being Known by checking Known.Zero|Known.One, but if all the bits are known then the value should be a Constant and we can just check for that instead.

llvm-svn: 339509
2018-08-12 06:52:03 +00:00
Craig Topper d112206004 [TargetLowering] Use APInt::isSubsetOf to simplify some code. NFC
llvm-svn: 339508
2018-08-12 05:34:15 +00:00
Craig Topper ed8a114c86 [X86] Remove unnecessary AddedComplexity line. NFC
The use of the or_is_add predicate already gives enough of a complexity boost to get the patterns ordered properly.

llvm-svn: 339507
2018-08-12 03:22:18 +00:00
Chijun Sima ce698a5586 [Dominators] Remove the DeferredDominance class
Summary: After converting all existing passes to use the new DomTreeUpdater interface, there isn't any usage of the original DeferredDominance class. Thus, we can safely remove it from the codebase.

Reviewers: kuhar, brzycki, dmgreen, davide, grosser

Reviewed By: kuhar, brzycki

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D49747

llvm-svn: 339502
2018-08-11 08:12:07 +00:00
David Green f7111d1ece [UnJ] Improve explicit loop count checks
Try to improve the computed counts when it has been explicitly set by a pragma
or command line option. This moves the code around, so that first call to
computeUnrollCount to get a sensible count and override that if explicit unroll
and jam counts are specified.

Also added some extra debug messages for when unroll and jamming is disabled.

Differential Revision: https://reviews.llvm.org/D50075

llvm-svn: 339501
2018-08-11 07:37:31 +00:00
David Green 395b80cd3c [UnJ] Create a hasInvariantIterationCount function. NFC
Pulled out a separate function for some code that calculates
if an inner loop iteration count is invariant to it's outer
loop.

Differential Revision: https://reviews.llvm.org/D50063

llvm-svn: 339500
2018-08-11 06:57:28 +00:00
Craig Topper b3e3477649 [X86] Remove the AL/AX/EAX/RAX short immediate forms from the macro fusion shouldScheduleAdjacent. NFC
These instructions are only created by the backend during MCInst lowering.

llvm-svn: 339499
2018-08-11 06:42:51 +00:00
Craig Topper c6cf169940 [X86] Add the mem-reg form of CMP to the macro fusion shouldScheduleAdjacent.
Unlike the other arithmetic instructions the mem-reg form of compare is just a load and not a RMW operation. According to the Intel optimization manual, this form is also supported by macro fusion.

llvm-svn: 339498
2018-08-11 06:42:50 +00:00
Craig Topper 616eeb827d [X86] Remove ADD8mi and ADDmr from the macro fusion shouldScheduleAdjacent.
The are RMW of memory operations. They aren't eligible for macro fusion.

llvm-svn: 339497
2018-08-11 06:42:49 +00:00
Craig Topper 570d47a010 [X86] Change the MOV32ri64 pseudo instruction to def a GR64 directly instead of wrapping it in a SUBREG_TO_REG.
Now we switch to the subregister in expandPostRAPseudos where we already switched the opcode.

This simplifies a few isel patterns that used the pseudo directly. And magically seems to have improved our ability to CSE it in the undef-label.ll test.

llvm-svn: 339496
2018-08-11 05:33:00 +00:00
Richard Trieu 01f99f3cd6 Fix WebAssembly instruction printer after r339474
Treat the stack variants of control instructions the same as regular
instructions.  Otherwise, the vector ControlFlowStack will be the wrong
size and have out-of-bounds access.  This was detected by MemorySanitizer.

llvm-svn: 339495
2018-08-11 04:18:05 +00:00
Tom Stellard 8adc86a7dc AMDGPU/GlobalISel: Define instruction mapping for G_INSERT
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D49625

llvm-svn: 339491
2018-08-11 00:51:54 +00:00
JF Bastien fe258d9776 Re-commit "[NFC] More ConstantMerge refactoring"
My previous change moved some code upwards which caused an assert in debug mode
because the global value didn't necessarily have an initializer. Don't do that.

llvm-svn: 339485
2018-08-10 22:41:09 +00:00
Philip Reames 85afd1a9a0 [LICM] Hoist assumes out of loops
If we have an assume which is known to execute and whose operand is invariant, we can lift that into the pre-header. So long as we don't change which paths the assume executes on, this is a legal transformation. It's likely to be a useful canonicalization as other transforms only look for dominating assumes.

Differential Revision: https://reviews.llvm.org/D50364

llvm-svn: 339481
2018-08-10 22:21:56 +00:00
JF Bastien b99f131ffd Revert "[NFC] More ConstantMerge refactoring"
Sanitizers seem unhappy.

llvm-svn: 339480
2018-08-10 22:10:20 +00:00
Eli Friedman 6b84a48953 Fix unused lambda capture warning from r339472.
llvm-svn: 339479
2018-08-10 22:03:25 +00:00
JF Bastien 62fb8ea4e0 [NFC] More ConstantMerge refactoring
This makes my upcoming patch much easier to read.

llvm-svn: 339478
2018-08-10 21:58:00 +00:00
Wouter van Oortmerssen ab26bd0647 [WebAssembly] Added default stack-only instruction mode for MC.
Summary:
Moved Explicit Locals pass to last.
Made that pass obligatory.
Made it convert from register to stack based instructions, and removed the registers.
Fixes to related code that was expecting register based instructions.
Added the correct testing flag to all tests, depending on what the
format they were expecting so far.
Translated one test to stack format as example: reg-stackify-stack.ll

tested:
llvm-lit -v `find test -name WebAssembly`
unittests/MC/*

Reviewers: dschuff, sunfish

Subscribers: jfb, llvm-commits, aheejin, eraman, jgravelle-google, sbc100

Differential Revision: https://reviews.llvm.org/D50568

llvm-svn: 339474
2018-08-10 21:32:47 +00:00
Eli Friedman e1687a89e8 [ARM] Adjust AND immediates to make them cheaper to select.
LLVM normally prefers to minimize the number of bits set in an AND
immediate, but that doesn't always match the available ARM instructions.
In Thumb1 mode, prefer uxtb or uxth where possible; otherwise, prefer
a two-instruction sequence movs+ands or movs+bics.

Some potential improvements outlined in
ARMTargetLowering::targetShrinkDemandedConstant, but seems to work
pretty well already.

The ARMISelDAGToDAG fix ensures we don't generate an invalid UBFX
instruction due to a larger-than-expected mask. (It's orthogonal, in
some sense, but as far as I can tell it's either impossible or nearly
impossible to reproduce the bug without this change.)

According to my testing, this seems to consistently improve codesize by
a small amount by forming bic more often for ISD::AND with an immediate.

Differential Revision: https://reviews.llvm.org/D50030

llvm-svn: 339472
2018-08-10 21:21:53 +00:00
Zachary Turner 29ec67b62f [MS Demangler] Support extern "C" functions.
There are two cases we need to support with extern "C"
functions.  The first is the case of a '9' indicating that
the function has no prototype.  This occurs when we mangle
a symbol inside of an extern "C" function, but not the
function itself.

The second case is when we have an overloaded extern "C"
functions.  In this case we emit $$J0 to indicate this.
This patch adds support for both of these cases.

llvm-svn: 339471
2018-08-10 21:09:05 +00:00
Sanjay Patel 85e17bb195 [InstCombine] rearrange code for foldSelectBinOpIdentity; NFCI
This is a retry of rL339439 with a fix for the problem that
caused the original commit to be reverted at rL339446. 

That problem was that the compare can be integer while
the binop is FP or vice-versa, so we need to use the binop 
type when we ask for the identity constant.

A test to guard against the problem was added at rL339453.

llvm-svn: 339469
2018-08-10 20:30:35 +00:00
Zachary Turner 073620bc3b [MS Demangler] Demangle cv qualifiers on template args.
Before we wouldn't properly demangle something like
Foo<const int>.  Template args have a special escape sequence
'$$C' that is optional, but if it is present contains
qualifiers.  So we need to check for this and only if it
present, demangle qualifiers before demangling the type.

With this fix, we re-enable some tests that were previously
marked FIXME.

llvm-svn: 339465
2018-08-10 19:57:36 +00:00