Commit Graph

55610 Commits

Author SHA1 Message Date
Philip Reames c3c23e8cf2 [AST] Remove notion of volatile from alias sets [NFCI]
Volatility is not an aliasing property. We used to model volatile as if it had extremely conservative aliasing implications, but that hasn't been true for several years now. So, it doesn't make sense to be in AliasSet.

It also turns out the code is entirely a noop. Outside of the AST code to update it, there was only one user: load store promotion in LICM. L/S promotion doesn't need the check since it walks all the users of the address anyway. It already checks each load or store via !isUnordered which causes us to bail for volatile accesses. (Look at the lines immediately following the two remove asserts.)

There is the possibility of some small compile time impact here, but the only case which will get noticeably slower is a loop with a large number of loads and stores to the same address where only the last one we inspect is volatile. This is sufficiently rare it's not worth optimizing for..

llvm-svn: 340312
2018-08-21 17:59:11 +00:00
Yury Delendik 132fc5a861 Update DBG_VALUE register operand during LiveInterval operations
Summary:
Handling of DBG_VALUE in ConnectedVNInfoEqClasses::Distribute() was fixed in
PR16110. However DBG_VALUE register operands are not getting updated. This
patch properly resolves the value location.

Reviewers: MatzeB, vsk

Reviewed By: MatzeB

Subscribers: kparzysz, thegameg, vsk, MatzeB, dschuff, sbc100, jgravelle-google, aheejin, sunfish, llvm-commits

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D48994

llvm-svn: 340310
2018-08-21 17:48:28 +00:00
Craig Topper b172b8884a [BypassSlowDivision] Teach bypass slow division not to interfere with div by constant where constants have been constant hoisted, but not moved from their basic block
DAGCombiner doesn't pay attention to whether constants are opaque before doing the div by constant optimization. So BypassSlowDivision shouldn't introduce control flow that would make DAGCombiner unable to see an opaque constant. This can occur when a div and rem of the same constant are used in the same basic block. it will be hoisted, but not leave the block.

Longer term we probably need to look into the X86 immediate cost model used by constant hoisting and maybe not mark div/rem immediates for hoisting at all.

This fixes the case from PR38649.

Differential Revision: https://reviews.llvm.org/D51000

llvm-svn: 340303
2018-08-21 17:15:33 +00:00
Farhana Aleen 3528c80378 [AMDGPU] Support idot2 pattern.
Summary: Transform add (mul ((i32)S0.x, (i32)S1.x),

         add( mul ((i32)S0.y, (i32)S1.y), (i32)S3) => i/udot2((v2i16)S0, (v2i16)S1, (i32)S3)

Author: FarhanaAleen

Reviewed By: arsenm

Subscribers: llvm-commits, AMDGPU

Differential Revision: https://reviews.llvm.org/D50024

llvm-svn: 340295
2018-08-21 16:21:15 +00:00
Nicola Zaghen 8a012cbabf [InstCombine] Add new tests for icmp ugt/ult (add nuw X, C2), C
Differential Revision: https://reviews.llvm.org/D51040

llvm-svn: 340284
2018-08-21 15:27:32 +00:00
Simon Pilgrim 43cf2c20ab [X86] Add SSE2 and XOP udiv combine tests
llvm-svn: 340282
2018-08-21 15:21:45 +00:00
Sanjay Patel f3ae9cc33e [InstSimplify] use isKnownNeverNaN to fold more fcmp ord/uno
Remove duplicate tests from InstCombine that were added with
D50582. I left negative tests there to verify that nothing
in InstCombine tries to go overboard. If isKnownNeverNaN is
improved to handle the FP binops or other cases, we should
have coverage under InstSimplify, so we could remove more
duplicate tests from InstCombine at that time.

llvm-svn: 340279
2018-08-21 14:45:13 +00:00
Anna Thomas b02b0ad8c7 [LV] Vectorize loops where non-phi instructions used outside loop
Summary:
Follow up change to rL339703, where we now vectorize loops with non-phi
instructions used outside the loop. Note that the cyclic dependency
identification occurs when identifying reduction/induction vars.

We also need to identify that we do not allow users where the PSCEV information
within and outside the loop are different. This was the fix added in rL307837
for PR33706.

Reviewers: Ayal, mkuper, fhahn

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D50778

llvm-svn: 340278
2018-08-21 14:40:27 +00:00
Sanjay Patel 6bb09a4291 [InstSimplify] add tests for FP uno/ord with nnan; NFC
This is a slight modification of the tests from D50582;
change half of the predicates to 'uno' so we have coverage
for that side too. All of the positive tests can fold to a
constant (true/false), so that should happen in instsimplify.

llvm-svn: 340276
2018-08-21 13:33:13 +00:00
Anna Thomas 2d33ce7701 NFC: Add loop vectorizer tests showing various control flow within loop that skip iterations
llvm-svn: 340275
2018-08-21 13:02:09 +00:00
Tim Renouf bb5ee41ab4 [AMDGPU] Allow int types for MUBUF vdata
Summary:
Previously the new llvm.amdgcn.raw/struct.buffer.load/store intrinsics
only allowed float types for the data to be loaded or stored, which
sometimes meant the frontend needed to generate a bitcast. In this, the
new intrinsics copied the old buffer intrinsics.

This commit extends the new intrinsics to allow int types as well.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D50315

Change-Id: I8202af2d036455553681dcbb3d7d32ae273f8f85
llvm-svn: 340270
2018-08-21 11:08:12 +00:00
Tim Renouf 4f703f5e11 [AMDGPU] New buffer intrinsics
Summary:
This commit adds new intrinsics
  llvm.amdgcn.raw.buffer.load
  llvm.amdgcn.raw.buffer.load.format
  llvm.amdgcn.raw.buffer.load.format.d16
  llvm.amdgcn.struct.buffer.load
  llvm.amdgcn.struct.buffer.load.format
  llvm.amdgcn.struct.buffer.load.format.d16
  llvm.amdgcn.raw.buffer.store
  llvm.amdgcn.raw.buffer.store.format
  llvm.amdgcn.raw.buffer.store.format.d16
  llvm.amdgcn.struct.buffer.store
  llvm.amdgcn.struct.buffer.store.format
  llvm.amdgcn.struct.buffer.store.format.d16
  llvm.amdgcn.raw.buffer.atomic.*
  llvm.amdgcn.struct.buffer.atomic.*

with the following changes from the llvm.amdgcn.buffer.*
intrinsics:

* there are separate raw and struct versions: raw does not have an
  index arg and sets idxen=0 in the instruction, and struct always sets
  idxen=1 in the instruction even if the index is 0, to allow for the
  fact that gfx9 does bounds checking differently depending on whether
  idxen is set;

* there is a combined cachepolicy arg (glc+slc)

* there are now only two offset args: one for the offset that is
  included in bounds checking and swizzling, to be split between the
  instruction's voffset and immoffset fields, and one for the offset
  that is excluded from bounds checking and swizzling, to go into the
  instruction's soffset field.

The AMDISD::BUFFER_* SD nodes always have an index operand, all three
offset operands, combined cachepolicy operand, and an extra idxen
operand.

The obsolescent llvm.amdgcn.buffer.* intrinsics continue to work.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D50306

Change-Id: If897ea7dc34fcbf4d5496e98cc99a934f62fc205
llvm-svn: 340269
2018-08-21 11:07:10 +00:00
Tim Renouf 35484c9d50 [AMDGPU] New tbuffer intrinsics
Summary:
This commit adds new intrinsics
  llvm.amdgcn.raw.tbuffer.load
  llvm.amdgcn.struct.tbuffer.load
  llvm.amdgcn.raw.tbuffer.store
  llvm.amdgcn.struct.tbuffer.store

with the following changes from the llvm.amdgcn.tbuffer.* intrinsics:

* there are separate raw and struct versions: raw does not have an index
  arg and sets idxen=0 in the instruction, and struct always sets
  idxen=1 in the instruction even if the index is 0, to allow for the
  fact that gfx9 does bounds checking differently depending on whether
  idxen is set;

* there is a combined format arg (dfmt+nfmt)

* there is a combined cachepolicy arg (glc+slc)

* there are now only two offset args: one for the offset that is
  included in bounds checking and swizzling, to be split between the
  instruction's voffset and immoffset fields, and one for the offset
  that is excluded from bounds checking and swizzling, to go into the
  instruction's soffset field.

The AMDISD::TBUFFER_* SD nodes always have an index operand, all three
offset operands, combined format operand, combined cachepolicy operand,
and an extra idxen operand.

The tbuffer pseudo- and real instructions now also have a combined
format operand.

The obsolescent llvm.amdgcn.tbuffer.* and llvm.SI.tbuffer.store
intrinsics continue to work.

V2: Separate raw and struct intrinsics.
V3: Moved extract_glc and extract_slc defs to a more sensible place.
V4: Rebased on D49995.
V5: Only two separate offset args instead of three.
V6: Pseudo- and real instructions have joint format operand.
V7: Restored optionality of dfmt and nfmt in assembler.
V8: Addressed minor review comments.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D49026

Change-Id: If22ad77e349fac3a5d2f72dda53c010377d470d4
llvm-svn: 340268
2018-08-21 11:06:05 +00:00
Bjorn Pettersson d378a39603 Change how finalizeBundle selects debug location for the BUNDLE instruction
Summary:
Previously a BUNDLE instruction inherited the DebugLoc from the
first instruction in the bundle, even if that DebugLoc had no
DILocation. With this commit this is changed into selecting the
first DebugLoc that has a DILocation, by searching among the
bundled instructions.

The idea is to reduce amount of bundles that are lacking
debug locations.

Reviewers: #debug-info, JDevlieghere

Reviewed By: JDevlieghere

Subscribers: JDevlieghere, mattd, llvm-commits

Differential Revision: https://reviews.llvm.org/D50639

llvm-svn: 340267
2018-08-21 10:59:50 +00:00
Simon Pilgrim 8e15b43092 [X86] Add SSE2 sdiv combine tests
llvm-svn: 340264
2018-08-21 10:44:06 +00:00
Sam Parker 597811e7a7 [DAGCombiner] Reduce load widths of shifted masks
During combining, ReduceLoadWdith is used to combine AND nodes that
mask loads into narrow loads. This patch allows the mask to be a
shifted constant. This results in a narrow load which is then left
shifted to compensate for the new offset.

Differential Revision: https://reviews.llvm.org/D50432

llvm-svn: 340261
2018-08-21 10:26:59 +00:00
Simon Pilgrim 72b324de4d [TargetLowering] Add BuildSDiv support for division by one or negone.
This reduces most of the sdiv stages (the MULHS, shifts etc.) to just zero/identity values and use the numerator scale factor to multiply by +1/-1.

llvm-svn: 340260
2018-08-21 10:20:36 +00:00
Petar Jovanovic 3b953c37f8 [MIPS GlobalISel] Select bitwise instructions
Select bitwise instructions for i32.

Patch by Petar Avramovic.

Differential Revision: https://reviews.llvm.org/D50183

llvm-svn: 340258
2018-08-21 08:15:56 +00:00
Max Kazantsev 097ef69182 [LICM] Hoist guards with invariant conditions
This patch teaches LICM to hoist guards from the loop if they are guaranteed to execute and
if there are no side effects that could prevent that.

Differential Revision: https://reviews.llvm.org/D50501
Reviewed By: reames

llvm-svn: 340256
2018-08-21 08:11:31 +00:00
Bjorn Pettersson 880f291577 [RegisterCoalescer] Do not assert when trying to remat dead values
Summary:
RegisterCoalescer::reMaterializeTrivialDef used to assert that
the input register was live in. But as shown by the new
coalesce-dead-lanes.mir test case that seems to be a valid
scenario. We now return false instead of the assert, simply
avoiding to remat the dead def.

Normally a COPY of an undef value is eliminated by
eliminateUndefCopy(). Although we only do that when the
destination isn't a physical register. So the situation
above should be limited to the case when we copy an undef
value to a physical register.

Reviewers: kparzysz, wmi, tpr

Reviewed By: kparzysz

Subscribers: MatzeB, qcolombet, tpr, llvm-commits

Differential Revision: https://reviews.llvm.org/D50842

llvm-svn: 340255
2018-08-21 07:49:05 +00:00
Max Kazantsev f1dc867396 [NFC] Add some LICM tests
llvm-svn: 340254
2018-08-21 07:37:02 +00:00
Serguei Katkov 09ab506798 [IR Verifier] Do not allow bitcast of pointer to vector of pointers and vice versa.
LangRef for BitCast requires that
"The bit sizes of value and the destination type, ty2, must be identical".
Currently verifier allows BitCast of pointer to vector of pointers so that
the sizes are different.

This change fixes that.

Reviewers: arsenm
Reviewed By: arsenm
Subscribers: llvm-commits, wdng
Differential Revision: https://reviews.llvm.org/D50886

llvm-svn: 340249
2018-08-21 04:27:07 +00:00
Philip Reames a5a8546ac6 [AST] Mark invariant.starts as being readonly
These intrinsics are modelled as writing for control flow purposes, but they don't actually write to any location. Marking these - as we did for guards - allows LICM to hoist loads out of loops containing invariant.starts.

Differential Revision: https://reviews.llvm.org/D50861

llvm-svn: 340245
2018-08-21 00:55:35 +00:00
Philip Reames 578c64da0c [LICM] Add tests from D50786 [NFC]
Exercise more use of volatiles to illustrate that nothing changes as we tweak how we detect them.

llvm-svn: 340244
2018-08-21 00:42:07 +00:00
Philip Reames efdd0a426a [LICM][NFC] Add tests from D50730
Landing tests so corresponding change can show effects clearly.  see
D50730 [AST] Generalize argument specific aliasing

llvm-svn: 340243
2018-08-21 00:37:09 +00:00
Philip Reames 4009487e5c [LICM] More tests for D50925 [NFC]
This time, the corresponding cases where we can hoist (store-like) calls out of loops.

llvm-svn: 340242
2018-08-21 00:14:14 +00:00
Reid Kleckner 1d432ae284 Fix global_metadata_external_comdat.ll test
llvm-svn: 340240
2018-08-21 00:03:21 +00:00
Zachary Turner c175310a09 [MS Demangler] Demangle special operator 'dynamic initializer'.
This is encoded as __E and should print something like
"dynamic initializer for 'Foo'(void)"

This also adds support for dynamic atexit destructor, which is
basically identical but encoded as __F with slightly different
description.

llvm-svn: 340239
2018-08-20 23:59:21 +00:00
Zachary Turner 0002dd467d [MS Demangler] Anonymous namespace hashes can be backreferenced.
Previously we were not remembering the key values of anonymous
namespaces, but we need to do this.

llvm-svn: 340238
2018-08-20 23:58:58 +00:00
Zachary Turner 91c98a858c [MS Demangler] Properly demangle anonymous namespaces.
llvm-svn: 340237
2018-08-20 23:58:35 +00:00
Heejin Ahn 487992cc09 [WebAssembly] Revert type of wake count in atomic.wake to i32
Summary:
We decided to revert this from i64 to i32 in Nov 28 CG meeting. Fixes
PR38632.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D51010

llvm-svn: 340234
2018-08-20 23:49:29 +00:00
Philip Reames 529a590bce [LICM][Tests] Add tests for store hoisting [NFC]
https://reviews.llvm.org/D50925 will be rebased on top of this.

llvm-svn: 340233
2018-08-20 23:37:59 +00:00
Reid Kleckner 85a8c12db8 Re-land r334313 "[asan] Instrument comdat globals on COFF targets"
If we can use comdats, then we can make it so that the global metadata
is thrown away if the prevailing definition of the global was
uninstrumented. I have only tested this on COFF targets, but in theory,
there is no reason that we cannot also do this for ELF.

This will allow us to re-enable string merging with ASan on Windows,
reducing the binary size cost of ASan on Windows.

I tested this change with ASan+PGO, and I fixed an issue with the
__llvm_profile_raw_version symbol. With the old version of my patch, we
would attempt to instrument that symbol on ELF because it had a comdat
with external linkage. If we had been using the linker GC-friendly
metadata scheme, everything would have worked, but clang does not enable
it by default.

llvm-svn: 340232
2018-08-20 23:35:45 +00:00
Craig Topper bee74793a3 [InstCombine] Add splat vector constant support to foldICmpAddOpConst.
Differential Revision: https://reviews.llvm.org/D50946

llvm-svn: 340231
2018-08-20 23:04:25 +00:00
Michael Berg 0b838deddc extend binop folds for selects to include true and false binops flag intersection
Summary: This change address bug 38641

Reviewers: spatel, wristow

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D50996

llvm-svn: 340222
2018-08-20 22:26:58 +00:00
Zachary Turner 030ad37ef4 [llvm-objdump] Add ability to demangle COFF symbols.
llvm-svn: 340221
2018-08-20 22:18:21 +00:00
Craig Topper 9c57ba0dc3 [X86] Add test command line to expose PR38649.
Bypass slow division and constant hoisting are conspiring to break div+rem of large constants.

llvm-svn: 340217
2018-08-20 21:51:35 +00:00
Craig Topper 210ccfe3db [X86] Prevent lowerVectorShuffleByMerging128BitLanes from creating cycles
Due to some splat handling code in getVectorShuffle, its possible for NewV1/NewV2 to have their mask modified from what is requested. This can lead to cycles being created in the DAG.

This patch examines the returned mask and makes sure its different. Long term we may need to look closer at that splat code in getVectorShuffle, or add more splat awareness to getVectorShuffle.

Fixes PR38639

Differential Revision: https://reviews.llvm.org/D50981

llvm-svn: 340214
2018-08-20 21:08:35 +00:00
Craig Topper 7dcb2c4b0a [X86] Teach combineTruncatedArithmetic to handle some cases of ISD::SUB
We can safely avoid interfering with the subus combine if both inputs are freely truncatable. Either both extends, or an extend and a constant vector.

Differential Revision: https://reviews.llvm.org/D50878

llvm-svn: 340212
2018-08-20 20:57:35 +00:00
Craig Topper 08e7e04998 [X86] Pre-commit test cases for D50878.
llvm-svn: 340211
2018-08-20 20:57:32 +00:00
Krzysztof Parzyszek cc3f630252 Consistently use MemoryLocation::UnknownSize to indicate unknown access size
1. Change the software pipeliner to use unknown size instead of dropping
   memory operands. It used to do it before, but MachineInstr::mayAlias
   did not handle it correctly.
2. Recognize UnknownSize in MachineInstr::mayAlias.
3. Print and parse UnknownSize in MIR.

Differential Revision: https://reviews.llvm.org/D50339

llvm-svn: 340208
2018-08-20 20:37:57 +00:00
Vitaly Buka 30b5ed3eb7 Revert "AMDGPU: bump AS.MAX_COMMON_ADDRESS to 6 since 32-bit addr space"
As it introduces out of bound access.

This reverts commit r340172 and r340171

llvm-svn: 340202
2018-08-20 19:31:03 +00:00
Cameron McInally 94b9029be9 [FPEnv] Support constrained FREM intrinsic
Differential Revision: https://reviews.llvm.org/D50975

llvm-svn: 340201
2018-08-20 19:28:56 +00:00
Zachary Turner 66555a7bed [MS Demangler] Demangle member pointer template parameters.
llvm-svn: 340199
2018-08-20 19:15:35 +00:00
Aditya Nandakumar 2a08285cf3 Revert "Revert r339977: [GISel]: Add Opcodes for a few LLVM Intrinsics"
This reverts commit 7debc334e6421bb5251ef8f18e97166dfc7dd787.

I missed updating legalizer-info-validation.mir as I had assertions
turned off in my build and that specific test requires asserts. Fixed it
now.

llvm-svn: 340197
2018-08-20 18:43:19 +00:00
Simon Pilgrim 6ac905926f [TargetLowering] Disable BuildSDiv division by one or negone.
Fuzz tests have detected an issue, currently working on a fix.

llvm-svn: 340195
2018-08-20 18:23:54 +00:00
Sanjay Patel 3ce999fa41 [ConstantFolding] improve folding of binops with vector undef operand
A non-undef operand may still have undef constant elements, 
so we should always propagate the vector results per-lane.

llvm-svn: 340194
2018-08-20 18:19:02 +00:00
Sanjay Patel 7ff7bd9b3c [ConstantFolding] add tests for binops on vectors with undef elements; NFC
llvm-svn: 340190
2018-08-20 17:31:34 +00:00
Matt Arsenault 450fcc77a7 ValueTracking: Handle more instructions in isKnownNeverNaN
llvm-svn: 340187
2018-08-20 16:51:00 +00:00
Sanjay Patel 5ae83a21b5 [InstCombine] add tests for insertelement+binop; NFC
llvm-svn: 340184
2018-08-20 16:49:08 +00:00
Samuel Pitoiset c95ef77d37 AMDGPU: bump AS.MAX_COMMON_ADDRESS to 6 since 32-bit addr space
32-bit constant address space is declared as 6, so the
maximum number of address spaces is 6, not 5.

Fixes "LLVM ERROR: Pointer address space out of range".

v3: use static_assert()
v2: add a very simple test for 32-bit addr space

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106630
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
llvm-svn: 340171
2018-08-20 13:18:59 +00:00
Simon Pilgrim 5b78c9d58d [SelectionDAG] Add partial sign-bit support to ComputeNumSignBits for BITCAST nodes
Only adds support to the existing 'large element' scalar/vector to 'small element' vector bitcasts.

Handle the case where the sign bit extends to only part of the small elements.

llvm-svn: 340169
2018-08-20 13:05:48 +00:00
Simon Pilgrim 11bec5b80c [X86][SSE] Fix PACKSS bitcast test from rL340166
We need the signbits to extends to lower 16-bits of the even elements

llvm-svn: 340167
2018-08-20 11:47:15 +00:00
Simon Pilgrim cee9c64838 [X86][SSE] Add PACKSS test showing ComputeNumSignBits failure to handle a partial sign bits extension through a bitcast
llvm-svn: 340166
2018-08-20 11:10:12 +00:00
Simon Pilgrim 686090a45f [X86] Drop unnecessary exact qualifier from packss test
llvm-svn: 340165
2018-08-20 11:01:51 +00:00
Sander de Smalen 07db432265 [AArch64][SVE] Asm: Add SVE System registers
This patch adds system registers for controlling aspects of SVE:
- ZCR_EL1  (r/w)   visible at EL1 and EL0.
- ZCR_EL2  (r/w)   visible at EL2 and Non-secure EL1 and EL0.
- ZCR_EL3  (r/w)   visible at all exception levels.

and a system register identifying SVE:
- ID_AA64ZFR0_EL1  (r)  SVE Feature identifier.

Reviewers: SjoerdMeijer, samparker, pbarrio, fhahn, javed.absar

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D50885

llvm-svn: 340158
2018-08-20 09:16:59 +00:00
QingShan Zhang f8f9af7ba5 [PowerPC] Add a peephole post RA to transform the inst that fed by add
If the arch is P8, we will select XFLOAD to load the floating point, and then, expand it to vsx and non-vsx X-form instruction post RA. This patch is trying to convert the X-form to D-form if it meets the requirement that one operand of the x-form inst is the special Zero register, and another operand fed by add inst. i.e.
y = add imm, reg
LFDX. 0, y
-->
LFD imm(reg)

Reviewers: Nemanjai
Differential Revision: https://reviews.llvm.org/D49007

llvm-svn: 340149
2018-08-20 02:52:55 +00:00
Craig Topper 5f695cc1e9 [InstCombine] Add test cases for an icmp combine that is missing support for splat vector constants.
llvm-svn: 340144
2018-08-19 18:03:34 +00:00
Simon Pilgrim 5b936ec89e [SelectionDAG] Add basic demanded elements support to ComputeNumSignBits for BITCAST nodes
Only adds support to the existing 'large element' scalar/vector to 'small element' vector bitcasts.

The next step would be to support cases where the large elements aren't all sign bits, and determine the small element equivalent based on the demanded elements.

llvm-svn: 340143
2018-08-19 17:47:50 +00:00
Simon Pilgrim 0fd72ab44f [X86][SSE] Add PACKSS test showing ComputeNumSignBits failure to handle demanded elts through a bitcast
llvm-svn: 340139
2018-08-19 16:01:47 +00:00
Craig Topper 803912ea57 [X86] Fix an issue in the matching for ADDUS.
We were basically assuming only one operand of the compare could be an ADD node and using that to swap operands. But we can have a normal add followed by a saturing add.

This rewrites the canonicalization to just be based on the condition code.

llvm-svn: 340134
2018-08-19 04:26:31 +00:00
Craig Topper a85d7e927b [X86] Add a test case showing an issue in our addusw pattern matching.
We are unable to handle a normal add followed by a saturing add with certain operand orders on the icmp.

llvm-svn: 340133
2018-08-19 04:26:29 +00:00
Craig Topper 40c9559b74 [X86] Add support for using 512-bit PSUBUS to combineSelect.
The code already support 128 and 256 and even knows to split 256 for AVX1. So we really just needed to stop looking for specific VTs and subtarget features and just look for legal VTs with i8/i16 elements.

While there, add some curly braces around outer if statement bodies that contain only another if. It makes all the closing curly braces look more regular.

llvm-svn: 340128
2018-08-18 18:51:03 +00:00
Craig Topper b40a1d5f84 [X86] Add test cases to show missed opportunities to use 512-bit PSUBUS.
llvm-svn: 340127
2018-08-18 18:50:59 +00:00
Zachary Turner d9e925fca4 [MS Demangler] Resolve backreferences eagerly, not lazily.
A while back I submitted a patch to resolve backreferences
lazily, thinking this that it was not always possible to know
in advance what type you were looking at until you had completed
a full pass over the input, and therefore it would be impossible
to resolve backreferences eagerly.

This was mistaken though, and turned out to be an unrelated
problem.  In fact, the reverse is true.  You *must* resolve
backreferences eagerly.  This is because certain types of nested
mangled symbols do not share a backreference context with their
parent symbol, and as such, if you try to resolve them lazily
their backreference context will have been lost by the time you
finish demangling the entire input.  On the other hand, resolving
them eagerly appears to always work, and enables us to port
many more tests over.

llvm-svn: 340126
2018-08-18 18:49:48 +00:00
Hsiangkai Wang 68c706ceb7 [DebugInfo] In FastISel, convert llvm.dbg.label to DBG_LABEL MI.
Convert llvm.dbg.label(!label_metadata) to DBG_LABEL !label_metadata.

Differential Revision: https://reviews.llvm.org/D50622

llvm-svn: 340122
2018-08-18 14:55:34 +00:00
Craig Topper 911efbb926 [X86] Add a signed test case for PR38622. Use nounwind to reduce the output on the unsigned test case.
llvm-svn: 340121
2018-08-18 06:00:16 +00:00
Craig Topper cc5dbbf759 [DAGCombiner] Allow divide by constant optimization on opaque constants.
Summary:
I believe this restores the behavior we had before r339147.

Fixes PR38622.

Reviewers: RKSimon, chandlerc, spatel

Reviewed By: chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50936

llvm-svn: 340120
2018-08-18 05:52:42 +00:00
Philip Reames 96bc076c3a [AST] Clarify printing of unknown size locations [NFC]
Printing "unknown" is much more clear than an arbitrary large integer

llvm-svn: 340108
2018-08-17 23:17:31 +00:00
Jordan Rupprecht be8ebccaed [llvm-objcopy] Implement -G/--keep-global-symbol(s).
Summary:
Port GNU Objcopy -G/--keep-global-symbol(s).

This is slightly different than the already-implemented --globalize-symbol, which marks a symbol as global when copying. When --keep-global-symbol (alias -G) is used, *only* those symbols marked will stay global, and all other globals are demoted to local. (Also note that it doesn't *promote* a symbol to global). Additionally, there is a pluralized version of the flag --keep-global-symbols, which effectively applies --keep-global-symbol for every non-comment in a file.

Reviewers: jakehehrlich, jhenderson, alexshap

Reviewed By: jhenderson

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50589

llvm-svn: 340105
2018-08-17 22:34:48 +00:00
Philip Reames 26f6176f38 [AST][Tests] Clarify what each test is doing
llvm-svn: 340100
2018-08-17 21:58:26 +00:00
Philip Reames 9e313167cf [AST[Tests] Shorten tests using noalias params
llvm-svn: 340099
2018-08-17 21:45:57 +00:00
Philip Reames 079c92e201 [AST] Add tests for argmemonly calls [NFC]
First step towards building a test set to rebase D50730 on top of.  Starting with clone of memtransfer tests, more to come.

llvm-svn: 340095
2018-08-17 21:42:18 +00:00
Matt Arsenault ea4b476a30 ValueTracking: Add tests for isKnownNeverNaN
llvm-svn: 340090
2018-08-17 21:39:52 +00:00
Zachary Turner 4746aa7b8f [MS Demangler] Properly print all thunk types.
We were only printing the vtordisp thunk before as the previous
patch was more aimed at getting special operators working, one
of which was a thunk.  This patch gets all thunk types to print
properly, and adds a test for each one.

llvm-svn: 340088
2018-08-17 21:32:07 +00:00
Matt Arsenault 25e51540e1 DAG: Fix isKnownNeverNaN for basic non-sNaN cases
fadd/fsub/fmul need to worry about infinities as well
as fdiv.

llvm-svn: 340085
2018-08-17 21:19:22 +00:00
Zachary Turner 469f076356 [MS Demangler] Demangle all remaining types of operators.
This demangles all remaining special operators including thunks,
RTTI Descriptors, and local static guard variables.

llvm-svn: 340083
2018-08-17 21:18:05 +00:00
Jordan Rupprecht bb179a197c Fix windows buildbots by removing : from filenames
llvm-svn: 340071
2018-08-17 19:18:20 +00:00
Jordan Rupprecht cf67633e66 [llvm-objcopy] Add support for -I binary -B <arch>.
Summary:
The -I (--input-target) and -B (--binary-architecture) flags exist but are currently silently ignored. This adds support for -I binary for architectures i386, x86-64 (and alias i386:x86-64), arm, aarch64, sparc, and ppc (powerpc:common64). This is largely based on D41687.

This is done by implementing an additional subclass of Reader, BinaryReader, which works by interpreting the input file as contents for .data field, sets up a synthetic header, and adds additional sections/symbols (e.g. _binary__tmp_data_txt_start).

Reviewers: jakehehrlich, alexshap, jhenderson, javed.absar

Reviewed By: jhenderson

Subscribers: jyknight, nemanjai, kbarton, fedor.sergeev, jrtc27, kristof.beyls, paulsemel, llvm-commits

Differential Revision: https://reviews.llvm.org/D50343

llvm-svn: 340070
2018-08-17 18:51:11 +00:00
Vedant Kumar 8cd64580b7 Remove a hardcoded address in test/DebugInfo/X86/vla-multi.ll
This relaxes a test to make it less brittle.

llvm-svn: 340068
2018-08-17 18:39:19 +00:00
Simon Pilgrim 2f48122cc9 [X86][SSE] Lower constant vXi8 ISD::SRL/ISD::SRA using PMULLW
Extending the concept introduced in D49562, this patch lowers constant vXi8 ISD::SRL/ISD::SRA by zero/sign extending to vXi16 and using PMULLW and then truncating the high 8 bits of the result.

Differential Revision: https://reviews.llvm.org/D50781

llvm-svn: 340062
2018-08-17 18:03:11 +00:00
Evandro Menezes e219d384f9 [NFC] Expand test cases for simplifying pow()
In prepatration for the improvements that D49273 enables.

llvm-svn: 340060
2018-08-17 17:59:38 +00:00
Teresa Johnson cb9a82fc7b [ThinLTO] Add option for printing import failure reasons
Summary:
Adds the option for the printing of summary information about functions
considered but rejected for importing during the thin link.

Reviewers: davidxl

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D50881

llvm-svn: 340047
2018-08-17 16:53:47 +00:00
Zachary Turner 3461bfaa9c [MS Demangler] Rework the way operators are demangled.
Previously, some of the code for actually parsing mangled
operator names was more like formatting code in nature,
and was interspersed with the demangling code which builds
the AST.  This means that by the time we got to the printing
code, we had lost all information about what type of operator
we had, and all we were left with was a string that we just
had to print.  However, not all operators are actually even
operators.  it's basically just a catch-all mangling for
"special names", and for some of the other types it helps
to know when we're actually doing the printing what it is.

This patch changes the way things work by introducing an
OperatorInfo structure and corresponding enumeration.  When
we demangle we store the enumeration value and demangled
components separately.  This gives more flexibility during
printing.

In doing so, some demanglings of special names which we didn't
previously support come out of this for free, so we now demangle
those.

A few are more complex and are better left for a followup patch
though.

An exhaustive test of every possible operator code is included,
with the ones that don't yet work commented out.

llvm-svn: 340046
2018-08-17 16:14:05 +00:00
Hsiangkai Wang 2532ac880a [DebugInfo] Generate DWARF debug information for labels. (Fix leak problems)
There are two forms for label debug information in DWARF format.

1. Labels in a non-inlined function:

DW_TAG_label
  DW_AT_name
  DW_AT_decl_file
  DW_AT_decl_line
  DW_AT_low_pc

2. Labels in an inlined function:

DW_TAG_label
  DW_AT_abstract_origin
  DW_AT_low_pc

We will collect label information from DBG_LABEL. Before every DBG_LABEL,
we will generate a temporary symbol to denote the location of the label.
The symbol could be used to get DW_AT_low_pc afterwards. So, we create a
mapping between 'inlined label' and DBG_LABEL MachineInstr in DebugHandlerBase.
The DBG_LABEL in the mapping is used to query the symbol before it.

The AbstractLabels in DwarfCompileUnit is used to process labels in inlined
functions.

We also keep a mapping between scope and labels in DwarfFile to help to
generate correct tree structure of DIEs.

It also generates label debug information under global isel.

Differential Revision: https://reviews.llvm.org/D45556

llvm-svn: 340039
2018-08-17 15:22:04 +00:00
Stefan Pintilie 39869ccf51 [PowerPC] Generate lxsd instead of the ld->mtvsrd sequence for vector loads
This patch addresses:

- Implementation within PPCISelLowering.cpp to check if we should use direct
load into vector instructions (such as lxsd/lfd ) when the scalar_to_vector
function is used; which will allow us to catch as many cases of the
scalar_to_vector uses as possible to translate the ld->mtvsrd sequence into
lxsd.

- Test cases to exhibit the behaviour of emitting lxsd/lfd.

Patch by amyk

Differential revision: https://reviews.llvm.org/D49698

llvm-svn: 340037
2018-08-17 15:15:26 +00:00
Francis Visoiu Mistrih f006b491bd [x86] Fix test breaking on Darwin after r339962
* -march=x86-64 -> -mtriple=x86_64-unknown-linux to avoid _ prefixes to
symbols
* add -start-before to avoid running the whole codegen on the IR. I
assumed it is meant to be running after X86SpeculativeLoadHardening.

llvm-svn: 340034
2018-08-17 14:47:01 +00:00
Francis Visoiu Mistrih 8bff832534 [X86] Fix liveness information when expanding X86::EH_SjLj_LongJmp64
test/CodeGen/X86/shadow-stack.ll has the following machine verifier
errors:

```
*** Bad machine code: Using a killed virtual register ***
- function:    bar
- basic block: %bb.6 entry (0x7fdc81857818)
- instruction: %3:gr64 = MOV64rm killed %2:gr64, 1, $noreg, 8, $noreg
- operand 1:   killed %2:gr64

*** Bad machine code: Using a killed virtual register ***
- function:    bar
- basic block: %bb.6 entry (0x7fdc81857818)
- instruction: $rsp = MOV64rm killed %2:gr64, 1, $noreg, 16, $noreg
- operand 1:   killed %2:gr64

*** Bad machine code: Virtual register killed in block, but needed live out. ***
- function:    bar
- basic block: %bb.2 entry (0x7fdc818574f8)
Virtual register %2 is used after the block.
```

The fix here is to only copy the machine operand's register without the
kill flags for all the instructions except the very last one of the
sequence.

I had to insert dummy PHIs in the test case to force the NoPHI function
property to be set to false. More on this here: https://llvm.org/PR38439

Differential Revision: https://reviews.llvm.org/D50260

llvm-svn: 340033
2018-08-17 14:46:56 +00:00
Florian Hahn 9e50e915fa [NewGVN] Add tests for r340031.
llvm-svn: 340032
2018-08-17 14:39:53 +00:00
Florian Hahn 19f9e32f07 [InstrSimplify,NewGVN] Add option to ignore additional instr info when simplifying.
NewGVN uses InstructionSimplify for simplifications of leaders of
congruence classes. It is not guaranteed that the metadata or other
flags/keywords (like nsw or exact) of the leader is available for all members
in a congruence class, so we cannot use it for simplification.

This patch adds a InstrInfoQuery struct with a boolean field
UseInstrInfo (which defaults to true to keep the current behavior as
default) and a set of helper methods to get metadata/keywords for a
given instruction, if UseInstrInfo is true. The whole thing might need a
better name, to avoid confusion with TargetInstrInfo but I am not sure
what a better name would be.

The current patch threads through InstrInfoQuery to the required
places, which is messier then it would need to be, if
InstructionSimplify and ValueTracking would share the same Query struct.

The reason I added it as a separate struct is that it can be shared
between InstructionSimplify and ValueTracking's query objects. Also,
some places do not need a full query object, just the InstrInfoQuery.

It also updates some interfaces that do not take a Query object, but a
set of optional parameters to take an additional boolean UseInstrInfo.

See https://bugs.llvm.org/show_bug.cgi?id=37540.

Reviewers: dberlin, davide, efriedma, sebpop, hiraditya

Reviewed By: hiraditya

Differential Revision: https://reviews.llvm.org/D47143

llvm-svn: 340031
2018-08-17 14:39:04 +00:00
Krzysztof Parzyszek 39a979c838 [Hexagon] Expand vgather pseudos during packetization
This will allow packetizing the vgather expansion with other instructions.

llvm-svn: 340028
2018-08-17 14:24:24 +00:00
Alex Bradbury 3291f9aa81 [AtomicExpandPass] Widen partword atomicrmw or/xor/and before tryExpandAtomicRMW
This patch performs a widening transformation of bitwise atomicrmw 
{or,xor,and} and applies it prior to tryExpandAtomicRMW. This operates 
similarly to convertCmpXchgToIntegerType. For these operations, the i8/i16 
atomicrmw can be implemented in terms of the 32-bit atomicrmw by appropriately 
manipulating the operands. There is no functional change for the handling of 
partword or/xor, but the transformation for partword 'and' is new.

The advantage of performing this transformation early is that the same 
code-path can be used regardless of the approach used to expand the atomicrmw 
(AtomicExpansionKind). i.e. the same logic is used for 
AtomicExpansionKind::CmpXchg and can also be used by the intrinsic-based 
expansion in D47882.

Differential Revision: https://reviews.llvm.org/D48129

llvm-svn: 340027
2018-08-17 14:03:37 +00:00
Anna Thomas 1962621a7e [LICM] Add a diagnostic analysis for identifying alias information
Summary:
Currently, in LICM, we use the alias set tracker to identify if the
instruction (we're interested in hoisting) aliases with instruction that
modifies that memory location.

This patch adds an LICM alias analysis diagnostic tool that checks the
mod ref info of the instruction we are interested in hoisting/sinking,
with every instruction in the loop.  Because of O(N^2) complexity this
is now only a diagnostic tool to show the limitation we have with the
alias set tracker and is OFF by default.

Test cases show the difference with the diagnostic analysis tool, where
we're able to hoist out loads and readonly + argmemonly calls from the
loop, where the alias set tracker analysis is not able to hoist these
instructions out.

Reviewers: reames, mkazantsev, fedor.sergeev, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50854

llvm-svn: 340026
2018-08-17 13:44:00 +00:00
Sanjay Patel 411b86081e [ConstantFolding] add simplifications for funnel shift intrinsics
This is another step towards being able to canonicalize to the funnel shift 
intrinsics in IR (see D49242 for the initial patch). 
We should not have any loss of simplification power in IR between these and 
the equivalent IR constructs.

Differential Revision: https://reviews.llvm.org/D50848

llvm-svn: 340022
2018-08-17 13:23:44 +00:00
Luke Cheeseman 64dcdec60c [AArch64] - Generate pointer authentication instructions
- Generate pointer authentication instructions
- The functions instrumented depend on function attribtues:
  all (all functions instrumentent)
  non-leaf (only those that spill LR)
  none
- Function epilogues sign the LR before spilling to the stack and authenticate
  the LR once restored
- If the target is v8.3a or greater than can use the combined authenticate and
  return instruction

Differential revision: https://reviews.llvm.org/D49793

llvm-svn: 340018
2018-08-17 12:53:22 +00:00
Nemanja Ivanovic 39751276b0 [PowerPC] Generate Power9 extswsli extend sign and shift immediate instruction
Add a DAG combine for the PowerPC code generator to generate the Power9 extswsli
extend sign and shift immediate instruction.

Patch by RolandF.

Differential revision: https://reviews.llvm.org/D49879

llvm-svn: 340016
2018-08-17 12:35:44 +00:00
Simon Pilgrim 03e57521c0 [DAGCombiner] extractShiftForRotate - fix out of range shift issue
Don't just check for negative shift amounts.

Fixes OSS Fuzz #9935
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=9935

llvm-svn: 340015
2018-08-17 12:25:18 +00:00
Bernard Ogden b828bb2a15 [ARM/AArch64] Support FP16 +fp16fml instructions
Add +fp16fml feature for new FP16 instructions, which are a
mandatory part of FP16 from v8.4-A and an optional part of FP16
from v8.2-A. It doesn't seem to be possible to model this in
LLVM, but the relationship between the options is handled by
the related clang patch.

In keeping with what I think is the usual practice, the fp16fml
extension is accepted regardless of base architecture version.

Builds on/replaces Sjoerd Meijer's patch to add these instructions at
https://reviews.llvm.org/D49839.

Differential Revision: https://reviews.llvm.org/D50228

llvm-svn: 340013
2018-08-17 11:29:49 +00:00
Simon Pilgrim 5113b48798 [DAGCombine] Improve (sra (sra x, c1), c2) -> (sra x, (add c1, c2)) folding
Add support for cases where only some c1+c2 results exceed the max bitshift, clamping accordingly.

Differential Revision: https://reviews.llvm.org/D35722

llvm-svn: 340010
2018-08-17 10:52:49 +00:00
Daniel Cederman 0c597ca223 [Sparc] Get sret arg size from CallLoweringInfo.getArgs()
Summary:
Looking at the callee argument list, as is done now, might not work if
the function has been typecasted into one that is expected to return
a struct. This change also simplifies the code.

The isFP128ABICall() function can be removed as it is no longer needed.
The test in fp128.ll has been updated to verify this.

Reviewers: jyknight, venkatra

Reviewed By: jyknight

Subscribers: fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D48117

llvm-svn: 340008
2018-08-17 10:40:00 +00:00
Daniel Cederman 7d3e08ff8d [Sparc] Flush register windows for @llvm.returnaddress(1)
Summary: When @llvm.returnaddress is called with a value higher than 0
it needs to read from the call stack to get the return address. This
means that the register windows needs to be flushed to the stack to
guarantee that the data read is valid. For values higher than 1 this
is done indirectly by the call to getFRAMEADDR(), but not for the value 1.

Reviewers: jyknight, venkatra

Reviewed By: jyknight

Subscribers: fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D48636

llvm-svn: 340003
2018-08-17 09:18:31 +00:00
Max Kazantsev 7b78d3920c [MustExecute] Fix algorithmic bug in isGuaranteedToExecute. PR38514
The description of `isGuaranteedToExecute` does not correspond to its implementation.
According to description, it should return `true` if an instruction is executed under the
assumption that its loop is *entered*. However there is a sophisticated alrogithm inside
that tries to prove that the instruction is executed if the loop is *exited*, which is not the
same thing for infinite loops. There is an attempt to protect from dealing with infinite loops
by prohibiting loops without exit blocks, however an infinite loop can have exit blocks.

As result of that, MustExecute can falsely consider some blocks that are never entered as
mustexec, and LICM can hoist dangerous instructions out of them basing on this fact.
This may introduce UB to programs which did not contain it initially.

This patch removes the problematic algorithm and replaced it with a one which tries to
prove what is required in description.

Differential Revision: https://reviews.llvm.org/D50558
Reviewed By: reames

llvm-svn: 339984
2018-08-17 06:19:17 +00:00
Max Kazantsev cfa3e66b8e [NFC] Add tests to ensure that improvement of MustThrow analysis will not lead to problems in future
llvm-svn: 339983
2018-08-17 05:20:25 +00:00
Chandler Carruth b898b86f49 Revert r339977: [GISel]: Add Opcodes for a few LLVM Intrinsics
This is breaking ~all the bots.

llvm-svn: 339982
2018-08-17 04:47:16 +00:00
Aditya Nandakumar 973a557338 [GISel]: Add Opcodes for a few LLVM Intrinsics
https://reviews.llvm.org/D50401

Add opcodes for llvm.intrinsic.trunc, round, and update the IRTranslator
for the same.

Reviewed by: dsanders.

llvm-svn: 339977
2018-08-17 01:41:56 +00:00
David Blaikie 0e03047e85 DebugInfo: Remove command line (& target-based) disabling of pubnames in favor of metadata
Now that Clang disables NVPTX pubnames via metadata there's no need for
this fallback to target detection in the backend.

llvm-svn: 339970
2018-08-16 23:57:15 +00:00
Heejin Ahn e76fa9ecca [WebAssembly] CFG stackify support for exception handling
Summary:
This adds support for exception handling to CFGStackify pass. This only
adds TRY / END_TRY markers and DOES NOT yet fix unwind mismatches that
can be created by the linearization of the CFG into the structural wasm
format. The mismatch fix will be added by following patches.

In detail, this patch
- Added support for TRY / END_TRY markers to support EH
- Changed many static functions into class member functions as they take
too many arguments now
- Added several more bookeeping data structures
- Refactored routines that decide where to insert markers, because
without refactoring this got too complicated as we added support for new
kinds of markers (TRY/END_TRY).
- Rewrote rethrow instructions' BB arguments to relative depths in EH
pad stack.

Reviewers: dschuff, sunfish

Subscribers: sbc100, jgravelle-google, llvm-commits

Differential Revision: https://reviews.llvm.org/D48273

llvm-svn: 339967
2018-08-16 23:50:59 +00:00
Chandler Carruth 75ca6be1c1 [x86/MIR] Implement support for pre- and post-instruction symbols, as
well as MIR parsing support for `MCSymbol` `MachineOperand`s.

The only real way to test pre- and post-instruction symbol support is to
use them in operands, so I ended up implementing that within the patch
as well. I can split out the operand support if folks really want but it
doesn't really seem worth it.

The functional implementation of pre- and post-instruction symbols is
now *completely trivial*. Two tiny bits of code in the (misnamed)
AsmPrinter. It should be completely target independent as well. We emit
these exactly the same way as we emit basic block labels. Most of the
code here is to give full dumping, MIR printing, and MIR parsing support
so that we can write useful tests.

The MIR parsing of MC symbol operands still isn't 100%, as it forces the
symbols to be non-temporary and non-local symbols with names. However,
those names often can encode most (if not all) of the special semantics
desired, and unnamed symbols seem especially annoying to serialize and
de-serialize. While this isn't perfect or full support, it seems plenty
to write tests that exercise usage of these kinds of operands.

The MIR support for pre-and post-instruction symbols was quite
straightforward. I chose to print them out in an as-if-operand syntax
similar to debug locations as this seemed the cleanest way and let me
use nice introducer tokens rather than inventing more magic punctuation
like we use for memoperands.

However, supporting MIR-based parsing of these symbols caused me to
change the design of the symbol support to allow setting arbitrary
symbols. Without this, I don't see any reasonable way to test things
with MIR.

Differential Revision: https://reviews.llvm.org/D50833

llvm-svn: 339962
2018-08-16 23:11:05 +00:00
Sanjay Patel 8ba631d9c8 [InstCombine] add reflection fold for tan(-x)
This is a follow-up suggested with rL339604.
For tan(), we don't have a corresponding LLVM 
intrinsic -- unlike sin/cos -- so this is the 
only way/place that we can do this fold currently.

llvm-svn: 339958
2018-08-16 22:46:20 +00:00
Vedant Kumar ee6c233ae0 [InstrProf] Use atomic profile counter updates for TSan
Thread sanitizer instrumentation fails to skip all loads and stores to
profile counters. This can happen if profile counter updates are merged:

  %.sink = phi i64* ...
  %pgocount5 = load i64, i64* %.sink
  %27 = add i64 %pgocount5, 1
  %28 = bitcast i64* %.sink to i8*
  call void @__tsan_write8(i8* %28)
  store i64 %27, i64* %.sink

To suppress TSan diagnostics about racy counter updates, make the
counter updates atomic when TSan is enabled. If there's general interest
in this mode it can be surfaced as a clang/swift driver option.

Testing: check-{llvm,clang,profile}

rdar://40477803

Differential Revision: https://reviews.llvm.org/D50867

llvm-svn: 339955
2018-08-16 22:24:47 +00:00
Sanjay Patel 75714b598d [InstCombine] add tests for tan with negated arg; NFC
llvm-svn: 339953
2018-08-16 22:05:51 +00:00
Craig Topper 883ff69c93 [DAGCombiner] Don't reassociate operations that have the vector reduction flag set.
When nodes are reassociated the vector-reduction flag gets lost.

The test case is here is what would happen if you had a sum of absolute differences loop that started with a non-zero but contant sum and that loop was unrolled. The vectorizer will generate a constant vector for the initial value. And DAGCombiner reassociate tries to move it down the addition tree erasing the vector-reduction flag. Interestingly this moves constants the opposite direction of the reassociate IR pass.

I've chosen to just punt on the reassociate, but I suppose we could maybe preserve the flag if both nodes have it set.

Differential Revision: https://reviews.llvm.org/D50827

llvm-svn: 339946
2018-08-16 21:54:05 +00:00
Craig Topper bde2b43cb3 [X86] In EFLAGS copy pass, don't emit EXTRACT_SUBREG instructions since we're after peephole
Normally the peephole pass converts EXTRACT_SUBREG to COPY instructions. But we're after peephole so we can't rely on it to clean these up.

To fix this, the eflags pass now emits a COPY with a subreg input.

I also noticed that in 32-bit mode we need to constrain the input to the copy to ensure the subreg is valid. Otherwise we'll fail verify-machineinstrs

Differential Revision: https://reviews.llvm.org/D50656

llvm-svn: 339945
2018-08-16 21:54:02 +00:00
Reid Kleckner 602c0dafdd [MC] Improve COFF associative section lookup
Handle the case when the symbol is private. Private symbols are not in
the COFF object file symbol table, so they aren't inserted into
SymbolMap. We can't look up the section of the symbol that way. Instead,
get the MCSection from the MCSymbol and map that to the object file
section.

Print a better error message when the symbol has no section, like when
the symbol is undefined.

Fixes PR38607

llvm-svn: 339942
2018-08-16 21:34:41 +00:00
David Blaikie 66cf14d06b DebugInfo: Add metadata support for disabling DWARF pub sections
In cases where the debugger load time is a worthwhile tradeoff (or less
costly - such as loading from a DWP instead of a variety of DWOs
(possibly over a high-latency/distributed filesystem)) against object
file size, it can be reasonable to disable pubnames and corresponding
gdb-index creation in the linker.

A backend-flag version of this was implemented for NVPTX in
D44385/r327994 - which was fine for NVPTX which wouldn't mix-and-match
CUs. Now that it's going to be a user-facing option (likely powered by
"-gno-pubnames", the same as GCC) it should be encoded in the
DICompileUnit so it can vary per-CU.

After this, likely the NVPTX support should be migrated to the metadata
& the previous flag implementation should be removed.

Reviewers: aprantl

Differential Revision: https://reviews.llvm.org/D50213

llvm-svn: 339939
2018-08-16 21:29:55 +00:00
Michael Berg ed89d069f4 add a missed case for binary op FMF propagation under select folds
llvm-svn: 339938
2018-08-16 20:59:45 +00:00
Philip Reames 684fa57ef7 [MemLoc] Fix a bug causing any use of invariant.end to crash in LICM
The fix is fairly simple, but is says something unpleasant about the usage and testing of invariant.start/end scopes that this went undetected.  To put this in perspective, *any* invariant.end in a loop flowing through LICM crashed.  I haven't bothered to figure out just how far back this goes, but it's not caused by any of the recent changes.  We're probably talking months if not years.  

llvm-svn: 339936
2018-08-16 20:48:55 +00:00
Krzysztof Parzyszek bb1aede865 [SystemZ] Require asserts in subregliveness-06.mir
The option -misched=shuffle is only available with !NDEBUG builds.

llvm-svn: 339931
2018-08-16 20:12:15 +00:00
Peter Collingbourne 3da2ffb826 Add missing test file from r339799.
llvm-svn: 339927
2018-08-16 19:29:01 +00:00
Craig Topper 3dfc5af178 [X86] Pre-commit test case for D50827.
llvm-svn: 339926
2018-08-16 19:27:43 +00:00
Krzysztof Parzyszek 9af86a5e01 [MachineVerifier] Check if predecessor is jointly dominated by undefs
Each use of a value should be jointly dominated by the union of defs and
undefs. It can happen that it will only be jointly dominated by undefs,
and that is still legal. Make sure that the verifier is aware of that.

llvm-svn: 339924
2018-08-16 19:13:28 +00:00
Eli Friedman 73e8a784e6 [SelectionDAG] Improve the legalisation lowering of UMULO.
There is no way in the universe, that doing a full-width division in
software will be faster than doing overflowing multiplication in
software in the first place, especially given that this same full-width
multiplication needs to be done anyway.

This patch replaces the previous implementation with a direct lowering
into an overflowing multiplication algorithm based on half-width
operations.

Correctness of the algorithm was verified by exhaustively checking the
output of this algorithm for overflowing multiplication of 16 bit
integers against an obviously correct widening multiplication. Baring
any oversights introduced by porting the algorithm to DAG, confidence in
correctness of this algorithm is extremely high.

Following table shows the change in both t = runtime and s = space. The
change is expressed as a multiplier of original, so anything under 1 is
“better” and anything above 1 is worse.

+-------+-----------+-----------+-------------+-------------+
| Arch  | u64*u64 t | u64*u64 s | u128*u128 t | u128*u128 s |
+-------+-----------+-----------+-------------+-------------+
|   X64 |     -     |     -     |    ~0.5     |    ~0.64    |
|  i686 |   ~0.5    |   ~0.6666 |    ~0.05    |    ~0.9     |
| armv7 |     -     |   ~0.75   |      -      |    ~1.4     |
+-------+-----------+-----------+-------------+-------------+

Performance numbers have been collected by running overflowing
multiplication in a loop under `perf` on two x86_64 (one Intel Haswell,
other AMD Ryzen) based machines. Size numbers have been collected by
looking at the size of function containing an overflowing multiply in
a loop.

All in all, it can be seen that both performance and size has improved
except in the case of armv7 where code size has regressed for 128-bit
multiply. u128*u128 overflowing multiply on 32-bit platforms seem to
benefit from this change a lot, taking only 5% of the time compared to
original algorithm to calculate the same thing.

The final benefit of this change is that LLVM is now capable of lowering
the overflowing unsigned multiply for integers of any bit-width as long
as the target is capable of lowering regular multiplication for the same
bit-width. Previously, 128-bit overflowing multiply was the widest
possible.

Patch by Simonas Kazlauskas!

Differential Revision: https://reviews.llvm.org/D50310

llvm-svn: 339922
2018-08-16 18:39:39 +00:00
Jordan Rupprecht d1767dc56f [llvm-strip] Add support for -p/--preserve-dates
Summary: [llvm-strip] Preserve access/modification timestamps when -p is used.

Reviewers: jakehehrlich, jhenderson, alexshap

Reviewed By: jhenderson

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50744

llvm-svn: 339921
2018-08-16 18:29:40 +00:00
Krzysztof Parzyszek 17143f6111 [RegisterCoalescer] Shrink to uses if needed after removeCopyByCommutingDef
llvm-svn: 339912
2018-08-16 18:02:59 +00:00
Simon Pilgrim 87d0039a45 [TargetLowering] Add support for non-uniform vectors to BuildSDIV
This patch refactors the existing TargetLowering::BuildSDIV base implementation to support non-uniform constant vector denominators.

This is the last patch necessary to close PR36545

Differential Revision: https://reviews.llvm.org/D50765

llvm-svn: 339908
2018-08-16 17:44:33 +00:00
Reid Kleckner bd5d71229d [codeview] Use push_macro to avoid conflicts instead of a prefix
Summary:
This prefix was added in r333421, and it changed our dumper output to
say things like "CVRegEAX" instead of just "EAX". That's a functional
change that I'd rather avoid.

I tested GCC, Clang, and MSVC, and all of them support #pragma
push_macro. They don't issue warnings whem the macro is not defined
either.

I don't have a Mac so I can't test the real termios.h header, but I
looked at the termios.h sources online and looked for other conflicts.
I saw only the CR* macros, so those are the ones we work around.

Reviewers: zturner, JDevlieghere

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D50851

llvm-svn: 339907
2018-08-16 17:34:31 +00:00
Matt Arsenault 7121bed210 AMDGPU: Custom lower fexp
This will allow the library to just use __builtin_expf directly
without expanding this itself. Note f64 still won't work because
there is no exp instruction for it.

llvm-svn: 339902
2018-08-16 17:07:52 +00:00
Simon Pilgrim 8b9e545477 [X86][SSE] Add sdiv by nonuniform constant vector test containing -1/+1 and all-bits style constants
llvm-svn: 339901
2018-08-16 17:07:41 +00:00
Evandro Menezes 42422b33cf [NFC] Fix typo in test cases
llvm-svn: 339900
2018-08-16 17:03:22 +00:00
Nirav Dave 7fd992a755 [MC][X86] Enhance X86 Register expression handling to more closely match GCC.
Allow the comparison of x86 registers in the evaluation of assembler
directives. This generalizes and simplifies the extension from r334022
to catch another case found in the Linux kernel.

Reviewers: rnk, void

Reviewed By: rnk

Subscribers: hiraditya, nickdesaulniers, llvm-commits

Differential Revision: https://reviews.llvm.org/D50795

llvm-svn: 339895
2018-08-16 16:31:14 +00:00
Zachary Turner 970fdc3236 [MS Demangler] Demangle string literals.
When demangling string literals, Microsoft's undname
simply prints 'string'.  This patch implements string
literal demangling while doing a bit better than this
by decoding as much of the string as possible and
trying to faithfully reproduce the original string
literal definition.

This is a bit tricky because the different character
types char, char16_t, and char32_t are not uniquely
identified by the mangling, so we have to use a
heuristic to try to guess the character type.  But
it works pretty well, and many tests are added to
illustrate the behavior.

Differential Revision: https://reviews.llvm.org/D50806

llvm-svn: 339892
2018-08-16 16:17:36 +00:00
Zachary Turner 83313f8f54 [MS Demangler] Don't fail on MD5-mangled names.
When we have an MD5 mangled name, we shouldn't choke and say
that it's an invalid name.  Even though it's impossible to demangle,
we should just output the original name.

llvm-svn: 339891
2018-08-16 16:17:17 +00:00
Sanjay Patel 0ea8d8b951 [ConstantFolding] add tests for funnel shift intrinsics; NFC
No functionality for this yet.

llvm-svn: 339889
2018-08-16 16:10:42 +00:00
Evandro Menezes c05c7e11bb [InstCombine] Expand the simplification of pow(x, 0.5) to sqrt(x)
Expand the number of cases when `pow(x, 0.5)` is simplified into `sqrt(x)`
by considering the math semantics with more granularity.

Differential revision: https://reviews.llvm.org/D50036

llvm-svn: 339887
2018-08-16 15:58:08 +00:00
Sanjay Patel 039f556f44 [InstCombine] move vector compare before same-shuffled ops
This is a step towards fixing PR37463:
https://bugs.llvm.org/show_bug.cgi?id=37463

llvm-svn: 339875
2018-08-16 12:52:17 +00:00
George Rimar d2f90ea337 [yaml2obj] - Allow to use numeric sh_link (Link) value for sections.
That change allows using numeric values for Link field.
It is consistent with the code for another fields in this method.

llvm-svn: 339873
2018-08-16 12:44:17 +00:00
George Rimar 17257bb0b5 [yaml2elf] - Use check-next in test.
Its a follow up for rL339870.

llvm-svn: 339872
2018-08-16 12:40:27 +00:00
Sam Parker 0d51197051 [ARM] Ignore GEPs in ARMCodeGenPrepare
While searching through the use-def tree, ignore GetElementPtrInst
instructions because they don't need promoting and neither do their
indices. Otherwise, the wide indices prevent the transformation from
happening.

Differential Revision: https://reviews.llvm.org/D50762

llvm-svn: 339871
2018-08-16 12:24:40 +00:00
George Rimar 7f2df7df45 [yaml2elf] - Simplify code, add a test. NFC.
This simplifies the code allowing to set the sh_info
for relocations sections. And adds a missing test.

llvm-svn: 339870
2018-08-16 12:23:22 +00:00
Sam Parker 0e2f0bd48e [ARM] Allow zext in ARMCodeGenPrepare
Treat zext instructions as roots, like we do for truncs.

Differential Revision: https://reviews.llvm.org/D50759

llvm-svn: 339868
2018-08-16 11:54:09 +00:00
Alex Bradbury fdc4647ca3 [RISCV][MC] Don't fold symbol differences if requiresDiffExpressionRelocations is true
When emitting the difference between two symbols, the standard behavior is 
that the difference will be resolved to an absolute value if both of the 
symbols are offsets from the same data fragment. This is undesirable on 
architectures such as RISC-V where relaxation in the linker may cause the 
computed difference to become invalid. This caused an issue when compiling to 
object code, where the size of a function in the debug information was already 
calculated even though it could change as a consequence of relaxation in the 
subsequent linking stage.

This patch inhibits the resolution of symbol differences to absolute values 
where the target's AsmBackend has declared that it does not want these to be 
folded.

Differential Revision: https://reviews.llvm.org/D45773
Patch by Edward Jones.

llvm-svn: 339864
2018-08-16 11:26:37 +00:00
Sam Parker 13567dbbd8 [ARM] Allow signed icmps in ARMCodeGenPrepare
Originally committed in r339755 which was reverted in r339806 due to
an asan issue. The issue was caused by my assumption that operands to
a CallInst mapped to the FunctionType Params. CallInsts are now
handled by iterating over their ArgOperands instead of Operands.
    
Original Message:
  Treat signed icmps as 'sinks', allowing them to be in the use-def
  tree, enabling more promotions to be performed. As a sink, any
  promoted incoming values need to be truncated before being used by
  the signed icmp.
    
  Differential Revision: https://reviews.llvm.org/D50067

llvm-svn: 339858
2018-08-16 10:05:39 +00:00
Craig Topper 9c1d9fdeaa [X86] Remove masking from the 512-bit padds and psubs intrinsics. Use select in IR instead.
llvm-svn: 339842
2018-08-16 06:20:24 +00:00
Craig Topper 9d6983c9fd [X86] Remove the unused masked 128 and 256-bit masked padds/psubs intrinsics.
Still need to remove masking from the 512-bit versions.

llvm-svn: 339841
2018-08-16 06:20:22 +00:00
Craig Topper 054b8cce2d [X86] Correct some bad FileCheck prefixes in tests. Add test cases for v64i8 padd/psub saturation intrinsics.
For some reason we had the 128/256-bit tests, but no the 512-bit tests.

llvm-svn: 339840
2018-08-16 06:20:19 +00:00
Chandler Carruth 00c35c7794 [x86] Actually initialize the SLH pass with the x86 backend and use
a shorter name ('x86-slh') for the internal flags and pass name.

Without this, you can't use the -stop-after or -stop-before
infrastructure. I seem to have just missed this when originally adding
the pass.

The shorter name solves two problems. First, the flag names were ...
really long and hard to type/manage. Second, the pass name can't be the
exact same as the flag name used to enable this, and there are already
some users of that flag name so I'm avoiding changing it unnecessarily.

llvm-svn: 339836
2018-08-16 01:22:19 +00:00
Easwaran Raman aca738b742 [BFI] Use rounding while computing profile counts.
Summary:
Profile count of a block is computed by multiplying its block frequency
by entry count and dividing the result by entry block frequency. Do
rounded division in the last step and update test cases appropriately.

Reviewers: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50822

llvm-svn: 339835
2018-08-16 00:26:59 +00:00
Guozhi Wei 8c17f9a77d [CodeGenPrepare] Add BothExtension type to PromotedInsts
This patch fixes PR38125.

Instruction extension types are recorded in PromotedInsts, it can be used later in function canGetThrough. If an instruction has two users with different extension types, it will be inserted into PromotedInsts two times in function promoteOperandForOther. The second one overwrites the first one, and the final extension type is wrong, later causes problem in canGetThrough.

This patch changes the simple bool extension type to 2-bit enum type, add a BothExtension type in addition to zero/sign extension. When an user sees BothExtension for an instruction, it actually knows nothing about how that instruction is extended.

Differential Revision: https://reviews.llvm.org/D49512

llvm-svn: 339822
2018-08-15 22:08:26 +00:00
Matt Arsenault f533e6b0ed AMDGPU: Fold fneg into fmed3
llvm-svn: 339821
2018-08-15 21:46:27 +00:00
Matt Arsenault a816073764 AMDGPU: Improve extract_vector_elt reduction combine
Handle fmul, fsub and preserve flags.

Also really test minnum/maxnum reductions.
The existing tests were only checking from
minnum/maxnum matched from a fast math compare
and select which is not the same.

llvm-svn: 339820
2018-08-15 21:34:06 +00:00
Matt Arsenault b3a80e5397 AMDGPU: Implement llvm.amdgcn.icmp/fcmp for i16/f16
Also support these on targets without support for these,
since it will allow us to freely create these in instcombine.

llvm-svn: 339819
2018-08-15 21:25:20 +00:00
Craig Topper 08e082619a [X86] Improve AVX1 shuffle lowering for v8f32 shuffles where the low half comes from V1 and the high half comes from V2 and the halves do the same operation
To lower this we now create a new V1 containing the low half of both sources and a new V2 containing the upper half of both sources. Then we created a repeated lane shuffle of those new sources to create the final result.

This fixes PR35833

Differential Revison: https://reviews.llvm.org/D41794

llvm-svn: 339818
2018-08-15 21:21:52 +00:00
Matt Arsenault 9a389fbd79 AMDGPU: Stop producing icmp/fcmp intrinsics with invalid types
llvm-svn: 339815
2018-08-15 21:14:25 +00:00
Matt Arsenault 6c7ba82900 AMDGPU: Address todo for handling 1/(2 pi)
llvm-svn: 339814
2018-08-15 21:03:55 +00:00
Vitaly Buka ed4239f482 Revert "[ARM] Allow signed icmps in ARMCodeGenPrepare"
use-after-poison in check-llvm under asan

This reverts commit r339755.

llvm-svn: 339806
2018-08-15 20:09:35 +00:00
Peter Collingbourne 62e4fc48a5 llvm-readobj: Fix addend in relocations for android packed format
If a relocation group doesn't have the RELOCATION_GROUP_HAS_ADDEND_FLAG set, then this implies the group's addend equals zero.
In this case android packed format won't encode an explicit addend delta, instead we need to set Addend, the "previous addend" variable, to zero by ourself.

Patch by Yi-Yo Chiang!

Differential Revision: https://reviews.llvm.org/D50601

llvm-svn: 339799
2018-08-15 17:58:22 +00:00
Amara Emerson 070ac768ff [InstCombine] Fix IC trying to create a xor of pointer types.
rdar://42473741

Differential Revision: https://reviews.llvm.org/D50775

llvm-svn: 339796
2018-08-15 17:46:22 +00:00
Sanjay Patel 49a8280f43 [AArch64] add tests for poor vector intrinsic lowering via legalization (PR38527); NFC
These correspond to the x86 tests added with rL339790 / rL339791, but I widened
the non-fsin tests to v3f32 to show the problem because AArch supports v2f32 ops. 

llvm-svn: 339793
2018-08-15 17:06:21 +00:00
Krzysztof Parzyszek 3b097b4d3e [RegisterCoalescer] Ensure that both registers have subranges if one does
llvm-svn: 339792
2018-08-15 17:04:58 +00:00
Sanjay Patel 712d42f53d [x86] add fabs test for vector intrinsic to potential libcall bug; NFC
This is a negative test for x86 because it has custom lowering for fabs.

llvm-svn: 339791
2018-08-15 16:56:09 +00:00
Sanjay Patel f9afee479f [x86] add tests for poor vector intrinsic lowering via legalization (PR38527); NFC
llvm-svn: 339790
2018-08-15 16:35:50 +00:00
Krzysztof Parzyszek 88d267d094 [RegisterCoalescer] Reset VNInfo def when copying segments over
llvm-svn: 339788
2018-08-15 16:21:53 +00:00
Derek Schuff 82812fb986 [WebAssembly] SIMD replace_lane
Implement and test replace_lane instructions.

Patch by Thomas Lively

Differential Revision: https://reviews.llvm.org/D50750

llvm-svn: 339786
2018-08-15 16:18:51 +00:00
Krzysztof Parzyszek 46ce441df6 [RegAlloc] Check that subreg liveness tracking applies to given virtual reg
Subregister liveness applies selectively to register classes with certain
properties. Make sure that when it's enabled, it applies to a given virtual
register (in virtual register rewriter).

llvm-svn: 339784
2018-08-15 16:07:47 +00:00
Krzysztof Parzyszek 4e06beb820 [SystemZ] Add testcase for r339778
llvm-svn: 339780
2018-08-15 15:43:13 +00:00
Nemanja Ivanovic 5b9a4f8ee5 [PowerPC] Enhance the selection(ISD::VSELECT) of vector type
To make ISD::VSELECT available(legal) so long as there are altivec instruction,
otherwise it's default behavior is expanding.
Use xxsel to match vselect if vsx is open, or use vsel.

In order to do not write many patterns in td file, promote (for vector it's
bitcast) all other type into v4i32 and only pattern match vselect of v4i32 into
vsel or xxsel.

Patch by wuzish
Differential revision: https://reviews.llvm.org/D49531

llvm-svn: 339779
2018-08-15 15:30:36 +00:00
George Rimar 942e8ed19d [yaml2obj] - Teach yaml2obj to produce SHT_GROUP section with a custom Info field.
This allows to set custom Info field value for SHT_GROUP sections.

It is useful to allow this because we would be able to replace at least one binary
object committed in LLD and replace it with the yaml2obj based test.

Differential revision: https://reviews.llvm.org/D50776

llvm-svn: 339772
2018-08-15 13:55:22 +00:00
Sam Parker fabf7fe5f8 [ARM] TypeSize lower bound for ARMCodeGenPrepare
We only try to promote types with are smaller than 16-bits, but we
also need to check that the type is not less than 8-bits.

Differential Revision: https://reviews.llvm.org/D50769

llvm-svn: 339770
2018-08-15 13:29:50 +00:00
Nemanja Ivanovic 8b4bd09e22 [PowerPC] Don't run BV DAG Combine before legalization if it assumes legal types
When trying to combine a DAG that builds a vector out of sign-extensions of
vector extracts, the code assumes legal input types. Due to that, we have to
disable this combine prior to legalization.
In some cases, the DAG will look slightly different after legalization so
account for that in the matching code.

This is a fix for https://bugs.llvm.org/show_bug.cgi?id=38087

Differential Revision: https://reviews.llvm.org/D49080

llvm-svn: 339769
2018-08-15 12:58:13 +00:00
Andrea Di Biagio a03f2a77f8 [llvm-mca] Fix PR38575: Avoid an invalid implicit truncation of a processor resource mask (an uint64_t value) to unsigned.
This patch fixes a regression introduced at revision 338702.

A processor resource mask was incorrectly implicitly truncated to an unsigned
quantity. Later on, the truncated mask was used to initialize an element of a
vector of processor resource descriptors.
On targets with more than 32 processor resources, some elements of the vector
are left uninitialized. As a consequence, this bug might have eventually caused
a crash due to null dereference in the Scheduler.

This patch fixes PR38575, and adds a test for it.

llvm-svn: 339768
2018-08-15 12:53:38 +00:00
George Rimar 5290af8ad9 [yaml2obj] - Teach tool to produce SHT_GROUP section with a custom type.
Currently, it is possible to use yaml2obj for producing SHT_GROUP sections
of type GRP_COMDAT. For LLD test case I need to produce an object with
a broken (different from GRP_COMDAT) type.

The patch teaches tool to do such things.

Differential revision: https://reviews.llvm.org/D50761

llvm-svn: 339764
2018-08-15 11:43:00 +00:00
Simon Pilgrim 51cee894da [X86][SSE] Add sdiv by nonuniform constant vector tests
Tests cover each TargetLowering::BuildSDIV path separately plus combos

llvm-svn: 339761
2018-08-15 10:59:29 +00:00
Aleksandr Urakov eb3735e425 [X86] Add sibling-call test cases
This commit adds new sibling-call test cases, so it will be possible to see
how these test cases will be changed after applying D45653.
See D45653 for details.

llvm-svn: 339760
2018-08-15 10:54:06 +00:00
Simon Pilgrim a272fa9b0c [TargetLowering] Add support for non-uniform vectors to BuildExactSDIV
This patch refactors the existing BuildExactSDIV implementation to support non-uniform constant vector denominators.

Differential Revision: https://reviews.llvm.org/D50392

llvm-svn: 339756
2018-08-15 09:35:12 +00:00
Sam Parker 6548cd3905 [ARM] Allow signed icmps in ARMCodeGenPrepare
Treat signed icmps as 'sinks', allowing them to be in the use-def
tree, enabling more promotions to be performed. As a sink, any
promoted incoming values need to be truncated before being used by
the signed icmp.

Differential Revision: https://reviews.llvm.org/D50067

llvm-svn: 339755
2018-08-15 08:23:03 +00:00
Sam Parker 7def86bbdb [ARM] Allow pointer values in ARMCodeGenPrepare
Add pointers to the list of allowed types, but don't try to promote
them. Also fixed a bug with the promotion of undef values, so a new
value is now created instead of mutating in place. We also now only
promote if there's an instruction in the use-def chains other than
the icmp, sinks and sources.

Differential Revision: https://reviews.llvm.org/D50054

llvm-svn: 339754
2018-08-15 07:52:35 +00:00
Max Kazantsev 5a10d127b9 [AliasSetTracker] Do not treat experimental_guard intrinsic as memory writing instruction
The `experimental_guard` intrinsic has memory write semantics to model the thread-exiting
logic, but does not do any actual writes to memory. Currently, `AliasSetTracker` treats it as a
normal memory write. As result, a loop-invariant load cannot be hoisted out of loop because
the guard may possibly alias with it.

This patch makes `AliasSetTracker` so that it doesn't treat guards as memory writes.

Differential Revision: https://reviews.llvm.org/D50497
Reviewed By: reames

llvm-svn: 339753
2018-08-15 06:21:02 +00:00
Derek Schuff 4ec8bca13e [WebAssembly] SIMD Splats
Implement and test SIMD splat ops.

Patch by Thomas Lively

Differential Revision: https://reviews.llvm.org/D50741

llvm-svn: 339744
2018-08-15 00:30:27 +00:00
Heejin Ahn 283e1c11bd [WebAssembly] Delete a specific push number from test expectations
Summary:
This shouldn't have been a specific number but rather a regex. This was
a part of rL339474 which got reverted.

Reviewers: aardappel

Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D50728

llvm-svn: 339736
2018-08-14 22:14:51 +00:00
Cameron McInally 00b0658aae [FPEnv] Scalarize StrictFP vector operations
Add a helper function to scalarize constrained FP operations as needed.

Differential Revision: https://reviews.llvm.org/D50720

llvm-svn: 339735
2018-08-14 22:13:11 +00:00
Adrian Prantl 55f4262999 [DebugInfoMetadata] Added DIFlags interface in DIBasicType.
Flags in DIBasicType will be used to pass attributes used in
DW_TAG_base_type, such as DW_AT_endianity.

Patch by Chirag Patel!

Differential Revision: https://reviews.llvm.org/D49610

llvm-svn: 339714
2018-08-14 19:35:34 +00:00
Sanjay Patel b1546da0e8 [InstCombine] fix typos in tests; NFC
See D50036.

llvm-svn: 339713
2018-08-14 19:13:07 +00:00
Heejin Ahn c15a87848b [WebAssembly] SIMD encoding tests
Modifies existing SIMD tests to also check that SIMD instructions are
lowered to the expected bytes. This CL depends on D50597.

Reviewers: aheejin

Subscribers: sunfish, jgravelle-google, sbc100, llvm-commits

Differential Revision: https://reviews.llvm.org/D50660

Patch by Thomas Lively (tlively)

llvm-svn: 339712
2018-08-14 19:10:50 +00:00
Sanjay Patel 73b7e9f65e [InstCombine] add tests for pow->sqrt; NFC
D50036 should fix the missed optimizations.

llvm-svn: 339711
2018-08-14 19:05:37 +00:00
Zachary Turner 2bbb23ba3b [MS Demangler] Fix some minor formatting bugs.
1) We print __restrict twice on member pointers.  This is fixed
   and relevant tests are re-enabled.

2) Several tests were disabled because of printing slightly
   different output than undname.  These were confirmed to be
   bugs in undname, so we just re-enable the tests.

3) The test for printing reference temporaries is re-enabled.  This
   is a clang mangling extension, so we have some flexibility with
   how we demangle it.  The output currently looks fine, so we just
   re-enable the test with no fixes.

llvm-svn: 339708
2018-08-14 18:54:28 +00:00
Heejin Ahn a0fd9c3e9a [WebAssembly] SIMD extract_lane
Implement instruction selection for all versions of the extract_lane
instruction. Use explicit sext/zext to differentiate between
extract_lane_s and extract_lane_u for applicable types, otherwise
default to extract_lane_u.

Reviewers: aheejin

Subscribers: sunfish, jgravelle-google, sbc100, llvm-commits

Differential Revision: https://reviews.llvm.org/D50597

Patch by Thomas Lively (tlively)

llvm-svn: 339707
2018-08-14 18:53:27 +00:00
Anna Thomas 60a1e4dddc [LV] Teach about non header phis that have uses outside the loop
Summary:
This patch teaches the loop vectorizer to vectorize loops with non
header phis that have have outside uses.  This is because the iteration
dependence distance for these phis can be widened upto VF (similar to
how we do for induction/reduction) if they do not have a cyclic
dependence with header phis. When identifying reduction/induction/first
order recurrence header phis, we already identify if there are any cyclic
dependencies that prevents vectorization.

The vectorizer is taught to extract the last element from the vectorized
phi and update the scalar loop exit block phi to contain this extracted
element from the vector loop.

This patch can be extended to vectorize loops where instructions other
than phis have outside uses.

Reviewers: Ayal, mkuper, mssimpso, efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50579

llvm-svn: 339703
2018-08-14 18:22:19 +00:00
Bruno Cardoso Lopes f446282aad Revert "[DebugInfo] Generate DWARF debug information for labels. (Fix leak problems)"
This reverts commit cb8c5e417d55141f3f079a8a876e786f44308336 / r339676.

This causing a test to fail in http://green.lab.llvm.org/green/job/clang-stage1-configure-RA/48406/

    LLVM :: DebugInfo/Generic/debug-label.ll

llvm-svn: 339700
2018-08-14 17:54:41 +00:00
Simon Pilgrim 2ce3d6e135 [X86][SSE] Avoid duplicate shuffle input sources in combineX86ShufflesRecursively
rL339686 added the case where a faux shuffle might have repeated shuffle inputs coming from either side of the OR().

This patch improves the insertion of the inputs into the source ops lists to account for this, as well as making it trivial to add support for shuffles with more than 2 inputs in the future.

llvm-svn: 339696
2018-08-14 17:22:37 +00:00
David Bolvansky ba74d1c4ea [NFC] Tests for select with binop fold - FP opcodes
llvm-svn: 339692
2018-08-14 17:03:47 +00:00
Simon Pilgrim ed55138247 [X86][SSE] Add shuffle combine support for OR(PSHUFB,PSHUFB) style patterns.
If each element is zero from one (or both) inputs then we can combine these into a single shuffle mask.

llvm-svn: 339686
2018-08-14 16:00:05 +00:00
Simon Pilgrim 52c88a7c0e [X86][SSE] Add shuffle combine tests for OR(PSHUFB,PSHUFB) style patterns.
We generate these shuffle patterns but we fail to combine them.

llvm-svn: 339684
2018-08-14 15:21:26 +00:00
Sanjay Patel c8e3943e89 [InstCombine] regenerate checks; NFC
llvm-svn: 339683
2018-08-14 15:21:13 +00:00
Sanjay Patel 19c7e7dab4 [InstCombine] regenerate checks; NFC
llvm-svn: 339681
2018-08-14 15:18:52 +00:00
Hsiangkai Wang ccae278938 [DebugInfo] Generate DWARF debug information for labels. (Fix leak problems)
There are two forms for label debug information in DWARF format.

1. Labels in a non-inlined function:

DW_TAG_label
  DW_AT_name
  DW_AT_decl_file
  DW_AT_decl_line
  DW_AT_low_pc

2. Labels in an inlined function:

DW_TAG_label
  DW_AT_abstract_origin
  DW_AT_low_pc

We will collect label information from DBG_LABEL. Before every DBG_LABEL,
we will generate a temporary symbol to denote the location of the label.
The symbol could be used to get DW_AT_low_pc afterwards. So, we create a
mapping between 'inlined label' and DBG_LABEL MachineInstr in DebugHandlerBase.
The DBG_LABEL in the mapping is used to query the symbol before it.

The AbstractLabels in DwarfCompileUnit is used to process labels in inlined
functions.

We also keep a mapping between scope and labels in DwarfFile to help to
generate correct tree structure of DIEs.

It also generates label debug information under global isel.

Differential Revision: https://reviews.llvm.org/D45556

llvm-svn: 339676
2018-08-14 13:50:59 +00:00
Amara Emerson 30e61404a8 [GlobalISel][IRTranslator] Fix a bug in handling repeating struct types during argument lowering.
Differential Revision: https://reviews.llvm.org/D49442

llvm-svn: 339674
2018-08-14 12:04:25 +00:00
Tomasz Krupa e766e5f636 [X86] Constant folding of adds/subs intrinsics
Summary: This adds constant folding of signed add/sub with saturation intrinsics.

Reviewers: craig.topper, spatel, RKSimon, chandlerc, efriedma

Reviewed By: craig.topper

Subscribers: rnk, llvm-commits

Differential Revision: https://reviews.llvm.org/D50499

llvm-svn: 339659
2018-08-14 09:04:01 +00:00
Tomasz Krupa 86a63889f3 [X86] Lowering addus/subus intrinsics to native IR
Summary: This revision improves previous version (rL330322) which has been reverted due to crashes.

This is the patch that lowers x86 intrinsics to native IR
in order to enable optimizations. The patch also includes folding
of previously missing saturation patterns so that IR emits the same
machine instructions as the intrinsics.

Reviewers: craig.topper, spatel, RKSimon

Reviewed By: craig.topper

Subscribers: mike.dvoretsky, DavidKreitzer, sroland, llvm-commits

Differential Revision: https://reviews.llvm.org/D46179

llvm-svn: 339650
2018-08-14 08:00:56 +00:00
Max Kazantsev 837418f3f9 [NFC] Add comprehensive test of AliasSetTracker with guards
llvm-svn: 339643
2018-08-14 06:37:39 +00:00
Teresa Johnson b0a1d3bdf1 [ThinLTO] Fix printing of WPD remarks
Summary:
When WPD is performed in a ThinLTO backend, the function may be created
if it isn't already in that module. Module::getOrInsertFunction may
add a bitcast, in which case the returned Constant is not a Function and
doesn't have a name. Invoke stripPointerCasts() on the returned value
where we access its name.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D49959

llvm-svn: 339640
2018-08-14 03:00:16 +00:00
Teresa Johnson c7816800d8 [ThinLTO] Handle optional args in assembly format for ConstVCalls
Summary:
The AsmWriter was only writing the Args for a ConstVCall if it was
non-empty, however, the LLParser was always expecting it. To aid
in making it optional, surround the ConstVCall VFuncId and Args in
parentheses when writing, then make the Args optional when reading.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D49960

llvm-svn: 339637
2018-08-14 01:49:33 +00:00
Reid Kleckner 40e7663b1f [BasicAA] Don't assume tail calls with byval don't alias allocas
Summary:
Calls marked 'tail' cannot read or write allocas from the current frame
because the current frame might be destroyed by the time they run.
However, a tail call may use an alloca with byval. Calling with byval
copies the contents of the alloca into argument registers or stack
slots, so there is no lifetime issue. Tail calls never modify allocas,
so we can return just ModRefInfo::Ref.

Fixes PR38466, a longstanding bug.

Reviewers: hfinkel, nlewycky, gbiv, george.burgess.iv

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D50679

llvm-svn: 339636
2018-08-14 01:24:35 +00:00
Wouter van Oortmerssen a7be375586 Revert "[WebAssembly] Added default stack-only instruction mode for MC."
This reverts commit 917a99b71ce21c975be7bfbf66f4040f965d9f3c.

llvm-svn: 339630
2018-08-13 23:12:49 +00:00
Craig Topper cade635c77 [X86] Don't ignore 0x66 prefix on relative jumps in 64-bit mode. Fix opcode selection of relative jumps in 16-bit mode. Treat jno/jo like other jcc instructions.
The behavior in 64-bit mode is different between Intel and AMD CPUs. Intel ignores the 0x66 prefix. AMD does not. objump doesn't ignore the 0x66 prefix. Since LLVM aims to match objdump behavior, we should do the same.

While I was trying to fix this I had change brtarget16/32 to use ENCODING_IW/ID instead of ENCODING_Iv to get the 0x66+REX.W case to act sort of sanely. It's still wrong, but that's a problem for another day.

The change in encoding exposed the fact that 16-bit mode disassembly of relative jumps was creating JMP_4 with a 2 byte immediate. It should have been JMP_2. From just printing you can't tell the difference, but if you dumped the encoding it wouldn't have matched what we started with.

While fixing that, it exposed that jo/jno opcodes were missing from the switch that this patch deleted and there were no test cases for them.

Fixes PR38537.

llvm-svn: 339622
2018-08-13 22:06:28 +00:00
Roman Lebedev 3534874fbf [InstCombine] Re-land: Optimize redundant 'signed truncation check pattern'.
Summary:
This comes with `Implicit Conversion Sanitizer - integer sign change` (D50250):
```
signed char test(unsigned int x) { return x; }
```
`clang++ -fsanitize=implicit-conversion -S -emit-llvm -o - /tmp/test.cpp -O3`
* Old: {F6904292}
* With this patch: {F6904294}

General pattern:
  X & Y

Where `Y` is checking that all the high bits (covered by a mask `4294967168`)
are uniform, i.e.  `%arg & 4294967168`  can be either  `4294967168`  or  `0`
Pattern can be one of:
  %t = add        i32 %arg,    128
  %r = icmp   ult i32 %t,      256
Or
  %t0 = shl       i32 %arg,    24
  %t1 = ashr      i32 %t0,     24
  %r  = icmp  eq  i32 %t1,     %arg
Or
  %t0 = trunc     i32 %arg  to i8
  %t1 = sext      i8  %t0   to i32
  %r  = icmp  eq  i32 %t1,     %arg
This pattern is a signed truncation check.

And `X` is checking that some bit in that same mask is zero.
I.e. can be one of:
  %r = icmp sgt i32   %arg,    -1
Or
  %t = and      i32   %arg,    2147483648
  %r = icmp eq  i32   %t,      0

Since we are checking that all the bits in that mask are the same,
and a particular bit is zero, what we are really checking is that all the
masked bits are zero.
So this should be transformed to:
  %r = icmp ult i32 %arg, 128

The transform itself ended up being rather horrible, even though i omitted some cases.
Surely there is some infrastructure that can help clean this up that i missed?

https://rise4fun.com/Alive/3Ou

The initial commit (rL339610)
was reverted, since the first assert was being triggered.
The @positive_with_extra_and test now has coverage for that case.

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: RKSimon, erichkeane, vsk, llvm-commits

Differential Revision: https://reviews.llvm.org/D50465

llvm-svn: 339621
2018-08-13 21:54:37 +00:00
Roman Lebedev 93f7e7f03e [NFC][InstCombine] Add a test for D50465 that used to assert
This is valid to fold, too.
https://rise4fun.com/Alive/0lz

llvm-svn: 339619
2018-08-13 21:49:33 +00:00
Sanjay Patel 15bff18c6f [SimplifyLibCalls] don't drop fast-math-flags on trig reflection folds (retry r339608)
Even though this code is below a function called optimizeFloatingPointLibCall(),
we apparently can't guarantee that we're dealing with FPMathOperators, so bail
out immediately if that's not true.

llvm-svn: 339618
2018-08-13 21:49:19 +00:00
Roman Lebedev 28a42c7706 Revert "[InstCombine] Optimize redundant 'signed truncation check pattern'."
At least one buildbot was able to actually trigger that assert
on the top of the function. Will investigate.

This reverts commit r339610.

llvm-svn: 339612
2018-08-13 20:46:22 +00:00
Roman Lebedev 4c4750771f [InstCombine] Optimize redundant 'signed truncation check pattern'.
Summary:
This comes with `Implicit Conversion Sanitizer - integer sign change` (D50250):
```
signed char test(unsigned int x) { return x; }
```
`clang++ -fsanitize=implicit-conversion -S -emit-llvm -o - /tmp/test.cpp -O3`
* Old: {F6904292}
* With this patch: {F6904294}

General pattern:
  X & Y

Where `Y` is checking that all the high bits (covered by a mask `4294967168`)
are uniform, i.e.  `%arg & 4294967168`  can be either  `4294967168`  or  `0`
Pattern can be one of:
  %t = add        i32 %arg,    128
  %r = icmp   ult i32 %t,      256
Or
  %t0 = shl       i32 %arg,    24
  %t1 = ashr      i32 %t0,     24
  %r  = icmp  eq  i32 %t1,     %arg
Or
  %t0 = trunc     i32 %arg  to i8
  %t1 = sext      i8  %t0   to i32
  %r  = icmp  eq  i32 %t1,     %arg
This pattern is a signed truncation check.

And `X` is checking that some bit in that same mask is zero.
I.e. can be one of:
  %r = icmp sgt i32   %arg,    -1
Or
  %t = and      i32   %arg,    2147483648
  %r = icmp eq  i32   %t,      0

Since we are checking that all the bits in that mask are the same,
and a particular bit is zero, what we are really checking is that all the
masked bits are zero.
So this should be transformed to:
  %r = icmp ult i32 %arg, 128

https://rise4fun.com/Alive/3Ou

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: RKSimon, erichkeane, vsk, llvm-commits

Differential Revision: https://reviews.llvm.org/D50465

llvm-svn: 339610
2018-08-13 20:33:08 +00:00
Sanjay Patel 66c6fe6534 revert r339608 - [SimplifyLibCalls] don't drop fast-math-flags on trig reflection folds
Can't set the builder flags without knowing this is an FPMathOperator. I'll add a test
for that and try again.

llvm-svn: 339609
2018-08-13 20:20:38 +00:00
Sanjay Patel 981f50919e [SimplifyLibCalls] don't drop fast-math-flags on trig reflection folds
llvm-svn: 339608
2018-08-13 20:14:27 +00:00
Anna Thomas cce7c24af1 NFC: Add a test to LV showing that reduction is not possible when reduction var is reset in the loop
Added a test case to reduction showing where it's illegal to identify
vectorize a loop.
Resetting the reduction var during loop iterations disallows us from
widening the dependency cycle to VF, thereby making it illegal to
vectorize the loop.

llvm-svn: 339605
2018-08-13 19:55:25 +00:00
Sanjay Patel e45a83d447 [SimplifyLibCalls] add reflection fold for -sin(-x) (PR38458)
This is a very partial fix for the reported problem. I suspect
we do not get this fold in most motivating cases because most of
the time, the libcall would have been replaced by an intrinsic,
and that optimization is handled elsewhere...but maybe it should
be handled here?

llvm-svn: 339604
2018-08-13 19:24:41 +00:00
Roman Lebedev 2da1ef5b9e [InstCombine][NFC] Tests for 'signed truncation check' optimization
See D50465 for the actual opt itself.

Differential Revision: https://reviews.llvm.org/D50464

llvm-svn: 339602
2018-08-13 18:51:09 +00:00
Scott Linder 35213793bc [CodeGen] Fix assert in SelectionDAG::computeKnownBits
Fix SelectionDAG::computeKnownBits asserting when handling EXTRACT_SUBVECTOR
when zero extending the demanded elements mask if it is already as long as the
source vector.

Differential Revision: https://reviews.llvm.org/D49574

llvm-svn: 339600
2018-08-13 18:44:21 +00:00
Sanjay Patel e33062369e [InstCombine] add more tests for trig reflections; NFC (PR38458)
llvm-svn: 339598
2018-08-13 18:34:32 +00:00
Simon Pilgrim 82edf8d329 [InstCombine] Limit simplifyAllocaArraySize constant folding to values that fit into a uint64_t
Fixes OSS-Fuzz: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=5223

llvm-svn: 339584
2018-08-13 16:50:20 +00:00
Sanjay Patel d379f39e18 [InstCombine] auto-generate full checks and add cos intrinsic test; NFC
llvm-svn: 339579
2018-08-13 16:29:01 +00:00
Evandro Menezes 5ecd6c1a46 [SLC] Expand simplification of pow() for vector types
Also consider vector constants when simplifying `pow()`.

Differential revision: https://reviews.llvm.org/D50035

llvm-svn: 339578
2018-08-13 16:12:37 +00:00
Daniel Cederman dc3e4c6d95 Revert "[Sparc] Add support for the cycle counter available in GR740"
It breaks when using EXPENSIVE_CHECKS with the error message
"Bad machine code: Using an undefined physical register".

llvm-svn: 339570
2018-08-13 14:18:09 +00:00
Sid Manning 8d4a6615e1 Check for tied operands
Differential Revision: https://reviews.llvm.org/D50592

llvm-svn: 339567
2018-08-13 14:01:25 +00:00
Simon Pilgrim 4aaf48013d [X86] Add tests showing missing div/rem 0, X -> 0 combines
llvm-svn: 339562
2018-08-13 13:29:54 +00:00
Simon Pilgrim ee82a79041 [CGP] Fix GEP issue with out of range APInt constant values not fitting in int64_t
Test case reduced from https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=7173

llvm-svn: 339556
2018-08-13 12:10:09 +00:00
Daniel Cederman 1bfbc62022 [Sparc] Add support for the cycle counter available in GR740
Summary: The GR740 provides an up cycle counter in the
registers ASR22 and ASR23. As these registers can not be
read together atomically we only use the value of ASR23
for llvm.readcyclecounter(). The ASR23 register holds the
32 LSBs of the up-counter.

Reviewers: jyknight, venkatra

Reviewed By: jyknight

Subscribers: fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D48638

llvm-svn: 339551
2018-08-13 10:49:48 +00:00
Luke Geeson 4ce41d2bb7 [ARM] Added FP16 VREV Vector Instrinsic CodeGen support
llvm-svn: 339546
2018-08-13 08:37:41 +00:00
Max Kazantsev 5c490b49c3 [GuardWidening] Widen very likely non-taken br instructions
This is a second part of D49974 that handles widening of conditional branches that
have very likely `false` branch.

Differential Revision: https://reviews.llvm.org/D50040
Reviewed By: reames

llvm-svn: 339537
2018-08-13 07:58:19 +00:00
Craig Topper cacf12a149 [SelectionDAG] In PromoteFloatOp_BITCAST, insert a bitcast after the fp_to_fp16 in case the result type isn't a scalar integer.
This is another variation of PR38533. In this case, the result type of the bitcast is legal and 16-bits wide, but not a scalar integer. So we need to emit the convert to i16 and then bitcast it to the true result type. This new bitcast will be further type legalized if necessary.

llvm-svn: 339536
2018-08-13 06:53:49 +00:00
Craig Topper e42a159537 [SelectionDAG] In PromoteIntRes_BITCAST, when the input is TypePromoteFloat, make sure the output type is scalar. For vectors, use a store and load of temporary.
Previously if the result type was a vector, we emitted a FP_TO_FP16 with a vector result type which isn't valid.

This is basically the opposite case of the root cause of PR38533.

llvm-svn: 339535
2018-08-13 06:53:47 +00:00
Lei Liu 901a0a9588 Restore correct x86_64 EH encodings in kernel code model
Fixes PR37524.

The exception handling encodings for x86_64 in kernel code model
has been changed with r309884.  Restore it to correct ones.  These
encodings include PersonalityEncoding, LSDAEncoding and
TTypeEncoding.

Differential Revision: https://reviews.llvm.org/D50490

llvm-svn: 339534
2018-08-13 06:06:53 +00:00
Craig Topper 42e32117bb [SelectionDAG] In PromoteFloatRes_BITCAST, insert a bitcast before the fp16_to_fp in case the input type isn't an i16.
The bitcast can be further legalized as needed.

Fixes PR38533.

llvm-svn: 339533
2018-08-13 05:26:49 +00:00
Craig Topper 484b342c68 [X86] Add constant folding for AVX512 versions of scalar floating point to integer conversion intrinsics.
Summary:
We've supported constant folding for sse versions for many years. This patch adds support for the avx512 versions including unsigned with the default rounding mode. We could probably do more with other roundings modes and SAE in the future.

The test cases are largely based on the sse.ll test cases. But I did add some test cases to ensure the unsigned versions don't accept negative values. Also checked the bounds of f64->i32 conversions to make sure unsigned has a larger positive range than signed.

Reviewers: RKSimon, spatel, chandlerc

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50553

llvm-svn: 339529
2018-08-12 22:09:54 +00:00
Matt Arsenault 3763f307bd AMDGPU: Cleanup min/max legacy tests
Also add some more tests in preparation for
a future patch.

llvm-svn: 339526
2018-08-12 19:29:53 +00:00
Matt Arsenault 1201301b94 DAG: Check no-signed-zeros instead of unsafe-fp-math
Addresses fixme, although this should still be checking individual
operand flags.

llvm-svn: 339525
2018-08-12 19:09:12 +00:00
David Bolvansky cd57242587 [NFC] Fixed build, updated tests
llvm-svn: 339524
2018-08-12 18:32:53 +00:00
David Bolvansky ddfe408f9a [NFC] Renamed test file
llvm-svn: 339523
2018-08-12 17:43:27 +00:00
David Bolvansky 01d98cc03f [InstCombine] Fold Select with binary op - non-commutative opcodes
Summary:
Basic version was merged - https://reviews.llvm.org/D49954

This adds support for FP & non-commutative opcodes

Precommited tests: https://reviews.llvm.org/rL338727

Reviewers: spatel, lebedev.ri

Reviewed By: spatel

Subscribers: jfb

Differential Revision: https://reviews.llvm.org/D50190

llvm-svn: 339520
2018-08-12 17:30:07 +00:00
Sanjay Patel dc185ee275 [InstCombine] fix/enhance fadd/fsub factorization
(X * Z) + (Y * Z) --> (X + Y) * Z
  (X * Z) - (Y * Z) --> (X - Y) * Z
  (X / Z) + (Y / Z) --> (X + Y) / Z
  (X / Z) - (Y / Z) --> (X - Y) / Z

The existing code that implemented these folds failed to 
optimize vectors, and it transformed code with multiple 
uses when it should not have.

llvm-svn: 339519
2018-08-12 15:48:26 +00:00
Sanjay Patel ce104b6c16 [InstCombine] move/add tests for fadd/fsub factorization; NFC
llvm-svn: 339518
2018-08-12 15:06:15 +00:00
Matt Arsenault 13b0db9285 AMDGPU: Check NSZ MI flag when folding omod
I'm not sure the exact nsz flag combination that
is OK. I think as long as it's on either, this is OK.
For now just check it on the omod multiply.

llvm-svn: 339513
2018-08-12 08:44:25 +00:00
Matt Arsenault b5acec1f79 AMDGPU: Use splat vectors for undefs when folding canonicalize
If one of the elements is undef, use the canonicalized constant
from the other element instead of 0.

Splat vectors are more useful for other optimizations, such
as matching vector clamps. This was breaking on clamps
of half3 from the undef 4th component.

llvm-svn: 339512
2018-08-12 08:42:54 +00:00
Matt Arsenault 3ead7d7389 AMDGPU: Fix packing undef parts of build_vector
llvm-svn: 339511
2018-08-12 08:42:46 +00:00
David Green f7111d1ece [UnJ] Improve explicit loop count checks
Try to improve the computed counts when it has been explicitly set by a pragma
or command line option. This moves the code around, so that first call to
computeUnrollCount to get a sensible count and override that if explicit unroll
and jam counts are specified.

Also added some extra debug messages for when unroll and jamming is disabled.

Differential Revision: https://reviews.llvm.org/D50075

llvm-svn: 339501
2018-08-11 07:37:31 +00:00
Craig Topper 570d47a010 [X86] Change the MOV32ri64 pseudo instruction to def a GR64 directly instead of wrapping it in a SUBREG_TO_REG.
Now we switch to the subregister in expandPostRAPseudos where we already switched the opcode.

This simplifies a few isel patterns that used the pseudo directly. And magically seems to have improved our ability to CSE it in the undef-label.ll test.

llvm-svn: 339496
2018-08-11 05:33:00 +00:00
Tom Stellard 69bf876b49 [gold] Fix Tests cases on i686
Reviewers: tejohnson

Reviewed By: tejohnson

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50583

llvm-svn: 339492
2018-08-11 01:08:34 +00:00
Tom Stellard 8adc86a7dc AMDGPU/GlobalISel: Define instruction mapping for G_INSERT
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D49625

llvm-svn: 339491
2018-08-11 00:51:54 +00:00
Philip Reames 85afd1a9a0 [LICM] Hoist assumes out of loops
If we have an assume which is known to execute and whose operand is invariant, we can lift that into the pre-header. So long as we don't change which paths the assume executes on, this is a legal transformation. It's likely to be a useful canonicalization as other transforms only look for dominating assumes.

Differential Revision: https://reviews.llvm.org/D50364

llvm-svn: 339481
2018-08-10 22:21:56 +00:00
Wouter van Oortmerssen ab26bd0647 [WebAssembly] Added default stack-only instruction mode for MC.
Summary:
Moved Explicit Locals pass to last.
Made that pass obligatory.
Made it convert from register to stack based instructions, and removed the registers.
Fixes to related code that was expecting register based instructions.
Added the correct testing flag to all tests, depending on what the
format they were expecting so far.
Translated one test to stack format as example: reg-stackify-stack.ll

tested:
llvm-lit -v `find test -name WebAssembly`
unittests/MC/*

Reviewers: dschuff, sunfish

Subscribers: jfb, llvm-commits, aheejin, eraman, jgravelle-google, sbc100

Differential Revision: https://reviews.llvm.org/D50568

llvm-svn: 339474
2018-08-10 21:32:47 +00:00
Eli Friedman e1687a89e8 [ARM] Adjust AND immediates to make them cheaper to select.
LLVM normally prefers to minimize the number of bits set in an AND
immediate, but that doesn't always match the available ARM instructions.
In Thumb1 mode, prefer uxtb or uxth where possible; otherwise, prefer
a two-instruction sequence movs+ands or movs+bics.

Some potential improvements outlined in
ARMTargetLowering::targetShrinkDemandedConstant, but seems to work
pretty well already.

The ARMISelDAGToDAG fix ensures we don't generate an invalid UBFX
instruction due to a larger-than-expected mask. (It's orthogonal, in
some sense, but as far as I can tell it's either impossible or nearly
impossible to reproduce the bug without this change.)

According to my testing, this seems to consistently improve codesize by
a small amount by forming bic more often for ISD::AND with an immediate.

Differential Revision: https://reviews.llvm.org/D50030

llvm-svn: 339472
2018-08-10 21:21:53 +00:00
Zachary Turner 29ec67b62f [MS Demangler] Support extern "C" functions.
There are two cases we need to support with extern "C"
functions.  The first is the case of a '9' indicating that
the function has no prototype.  This occurs when we mangle
a symbol inside of an extern "C" function, but not the
function itself.

The second case is when we have an overloaded extern "C"
functions.  In this case we emit $$J0 to indicate this.
This patch adds support for both of these cases.

llvm-svn: 339471
2018-08-10 21:09:05 +00:00
Sanjay Patel 0b62b01129 [InstCombine] add tests for fsub factorization; NFC
The tests show that;
1. The fold doesn't fire for vectors, but it should.
2. The fold fires regardless of uses, but it shouldn't.

llvm-svn: 339470
2018-08-10 21:00:27 +00:00