Commit Graph

133482 Commits

Author SHA1 Message Date
Craig Topper 3043093822 [CallSite removal][CodeGen] Replace ImmutableCallSite with CallBase in isInTailCallPosition. 2020-04-13 23:04:57 -07:00
Mircea Trofin 4aae4e3f48 [llvm][NFC] CallSite removal from inliner-related files
Summary: This removes CallSite from inliner files. Some dependencies where thus affected.

Reviewers: dblaikie, davidxl, craig.topper

Subscribers: arsenm, jvesely, nhaehnle, eraman, hiraditya, aheejin, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77991
2020-04-13 21:28:58 -07:00
Craig Topper 2f60fbce6c [X86] Use a more realisitic cost for truncate v16i64->v16i8 with avx512f.
Still not great and we could probably codegen this better, but
11 was clearly ridiculous.
2020-04-13 21:09:43 -07:00
Craig Topper 535a566a01 [X86] Split AVX512 getCastInstrCost into tables that require useAVX512Regs() and those that just operate on 256 or smaller vectors.
Use useAVX512Regs() to skip lookups instead of using type legalization
action.
2020-04-13 21:09:42 -07:00
Craig Topper 071c64d68d [X86] Add a more accurate truncate cost for v8i64->v8i8 2020-04-13 21:09:41 -07:00
Fangrui Song d1a677cd33 [VE] Adapt D77995 CallSite removal 2020-04-13 19:54:49 -07:00
Matt Arsenault f48fe2c36e GlobalISel: Fix casted unmerge of G_CONCAT_VECTORS
This was assuming a scalarizing unmerge, and would fail assert if the
unmerge was to smaller vector types.
2020-04-13 22:03:05 -04:00
Mehdi Amini 384ca190ae Revert "Move ModuleSummaryAnalysis from libAnalysis to libObject to break the dependency from Analysis to Object"
This reverts commit 10df1563d6.

Some buildbots are broken.
2020-04-14 00:27:08 +00:00
Christopher Tetreault eab73dfed9 [SVE] Change return type of getNumElements to unsigned
Reviewers: efriedma, sdesmalen, craig.topper, dexonsmith

Reviewed By: efriedma, sdesmalen

Subscribers: tschuett, hiraditya, rkruppe, psnobl, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, Joonsoo, grosul1, frgossen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77763
2020-04-13 16:24:18 -07:00
Matt Arsenault 0ba40d4ccf AMDGPU/GlobalISel: Combines for V_CVT_F32_UBYTE[0-3]
Ports the existing DAG combines, minus the simplify demanded bits
which seems to have no equivalent now. Without these, this isn't
particularly helpful in most of the IR sample cases.
2020-04-13 19:18:19 -04:00
Mehdi Amini 10df1563d6 Move ModuleSummaryAnalysis from libAnalysis to libObject to break the dependency from Analysis to Object
ModuleSummaryAnalysis is the only file in libAnalysis that brings a
dependency on the CodeGen layer from libAnalysis, moving it breaks this
dependency.

Differential Revision: https://reviews.llvm.org/D77994
2020-04-13 23:12:11 +00:00
Austin Kerbow a69b3e010c [AMDGPU][GlobalISel] Fix div_scale in FDIV lowering
Differential Revision: https://reviews.llvm.org/D78004
2020-04-13 15:54:49 -07:00
Heejin Ahn ba40896f99 [WebAssembly] Fix try placement in fixing unwind mismatches
Summary:
In CFGStackify, `fixUnwindMismatches` function fixes unwind destination
mismatches created by `try` marker placement. For example,
```
try
  ...
  call @qux  ;; This should throw to the caller!
catch
  ...
end
```
When `call @qux` is supposed to throw to the caller, it is possible that
it is wrapped inside a `catch` so in case it throws it ends up unwinding
there incorrectly. (Also it is possible `call @qux` is supposed to
unwind to another `catch` within the same function.)

To fix this, we wrap this inner `call @qux` with a nested
`try`-`catch`-`end` sequence, and within the nested `catch` body, branch
to the right destination:
```
block $l0
  try
    ...
    try                 ;; new nested try
      call @qux
    catch               ;; new nested catch
      local.set n       ;; store exnref to a local
      br $l0
    end
  catch
    ...
  end
end
local.get n             ;; retrieve exnref back
rethrow                 ;; rethrow to the caller
```

The previous algorithm placed the nested `try` right before the `call`.
But it is possible that there are stackified instructions before the
call from which the call takes arguments.
```
try
  ...
  i32.const 5
  call @qux  ;; This should throw to the caller!
catch
  ...
end
```

In this case we have to place `try` before those stackified
instructions.
```
block $l0
  try
    ...
    try                 ;; this should go *before* 'i32.const 5'
      i32.const 5
      call @qux
    catch
      local.set n
      br $l0
    end
  catch
    ...
  end
end
local.get n
rethrow
```

We correctly handle this in the first normal `try` placement phase
(`placeTryMarker` function), but failed to handle this in this
`fixUnwindMismatches`.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77950
2020-04-13 15:50:01 -07:00
Benjamin Kramer f1542efd97 [CHR] Clean up some code and reduce copying. NFCI. 2020-04-13 23:11:20 +02:00
Craig Topper 113f37a1f9 [CallSite removal][TargetLowering] Replace ImmutableCallSite with CallBase
Differential Revision: https://reviews.llvm.org/D77995
2020-04-13 13:50:15 -07:00
Lang Hames 255cc202ea [Support] Add missing files from e823068306. 2020-04-13 13:30:45 -07:00
Sam McCall dffbeffa39 [Support] Fix CMakeLists after e823068306 2020-04-13 22:14:44 +02:00
Lang Hames e823068306 [Support] Add support RTTI support for open class hierarchies.
This patch extracts the RTTI part of llvm::ErrorInfo into its own class
(RTTIExtends) so that it can be used in other non-error hierarchies, and makes
it compatible with the existing LLVM RTTI function templates (isa, cast,
dyn_cast, dyn_cast_or_null) by adding the classof method.

Differential Revision: https://reviews.llvm.org/D39111
2020-04-13 12:52:44 -07:00
Christopher Tetreault 3297e9b7c3 Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: rriddle, sdesmalen, efriedma

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77259
2020-04-13 12:29:43 -07:00
Rahman Lavaee 05192e585c Extend BasicBlock sections to allow specifying clusters of basic blocks in the same section.
Differential Revision: https://reviews.llvm.org/D76954
2020-04-13 12:19:59 -07:00
Rahman Lavaee 4ddf7ab454 Revert "Extend BasicBlock sections to allow specifying clusters of basic blocks"
This reverts commit 0d4ec16d3d Because
tests were not added to the commit.
2020-04-13 12:19:59 -07:00
Rahman Lavaee 0d4ec16d3d Extend BasicBlock sections to allow specifying clusters of basic blocks
in the same section.

This allows specifying BasicBlock clusters like the following example:
!foo
!!0 1 2
!!4
This places basic blocks 0, 1, and 2 in one section in this order, and
places basic block #4 in a single section of its own.
2020-04-13 11:46:11 -07:00
Benjamin Kramer ec228d722c [InstCombine] Use SmallBitVector for convienently checking if all bits are set 2020-04-13 20:37:15 +02:00
Vedant Kumar 4831f4b7bd [InstCombine] Fix debug variance issue in tryToMoveFreeBeforeNullTest
Fix an issue where the presence of debug info could disable an
optimization in tryToMoveFreeBeforeNullTest.
2020-04-13 10:55:17 -07:00
Vedant Kumar 122a6bfb07 [Debugify] Strip added metadata in the -debugify-each pipeline
Summary:
Share logic to strip debugify metadata between the IR and MIR level
debugify passes. This makes it simpler to hunt for bugs by diffing IR
with vs. without -debugify-each turned on.

As a drive-by, fix an issue causing CallGraphNodes to become invalid
when a dead llvm.dbg.value prototype is deleted.

Reviewers: dsanders, aprantl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77915
2020-04-13 10:55:17 -07:00
Craig Topper 68eb08646c [CallSite removal][GlobalISel] Use CallBase instead of CallSite in lowerCall and translateCallBase.
Differential Revision: https://reviews.llvm.org/D78001
2020-04-13 10:31:30 -07:00
Matt Arsenault e6605a209c DAG: Fix wrong legality check for ISD::FMAD
Since 1725f28841, this should check
isFMADLegalForFAddFSub rather than the the plain isOperationLegal.

This would assert in a subset of cases due to an oddity in how FMAD is
selected. We will allow FMA formation pre-legalize, but not FMAD even
in cases where it would be valid.

The current hook requires passing in the root fadd/fsub. However, in
this distributed case, this would be far more complicated to pass in
the relevant operand. AMDGPU doesn't get any value from the node, and
only needs the type and is the only implementor, so I'm not sure why
we have this complexity. Just rename and expand the assert to avoid
the more complicated checks spread through the distribution logic.
2020-04-13 10:25:39 -07:00
Craig Topper 6dbf1a1229 [X86] Move X86ShuffleDecode.cpp/h into MCTargetDesc and remove X86Utils library. NFC
The shuffle decoding is used by X86ISelLowering and
MCTargetDesc/X86InstComments. The latter used to be in a
separate InstPrinter library. The Utils library existed to allow
InstPrinter and CodeGen to share the shuffle decoding. Since
X86InstComments now lives in the MCTargetDesc, which CodeGen
already depends on, we can sink the shuffle decoding there as well.

Differential Revision: https://reviews.llvm.org/D77980
2020-04-13 10:14:08 -07:00
Jay Foad bc78baec4c [X86] Improve combineVectorShiftImm
Summary:
Fold (shift (shift X, C2), C1) -> (shift X, (C1 + C2)) for logical as
well as arithmetic shifts. This is needed to prevent regressions from
an upcoming funnel shift expansion change.

While we're here, fold (VSRAI -1, C) -> -1 too.

Reviewers: RKSimon, craig.topper

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77300
2020-04-13 15:54:55 +01:00
Simon Pilgrim 401cbe373b [X86][AVX] Attempt to scale masked shuffles to match the root type
Improve the chances of folding the writemask into the combined shuffle by scaling a wider shuffle mask to match the root's original type.

This creates a few minor issues with variable shuffles, preventing combines of shuffles because of the more limited support binary shuffle types. In most cases we're probably better off combining the shuffles and losing the writemask fold, but this isn't always going to be true.
2020-04-13 14:57:25 +01:00
Simon Pilgrim ad57286232 CodeMetrics.h - include and forward declaration cleanup. NFC.
Remove SmallPtrSet include, replace with forward declaration and include SmallPtrSet.h in CodeMetrics.cpp directly.
Remove unused llvm::DataLayout/Instruction forward declarations.
2020-04-13 13:09:39 +01:00
Simon Pilgrim fdd9ff9700 [X86][AVX] Create splitVectorIntBinary helper.
Removes duplicate code from split256IntArith/split512IntArith.
2020-04-13 13:09:38 +01:00
Gil Rapaport 41ed5d856c [LV] Clean up vectorizeInterleaveGroup (NFCI)
Pass from the calling recipe the interleave group itself instead of passing the
group's insertion position and having the function query CM for its interleave
group and making sure that given instruction is the insertion point of.

Differential Revision: https://reviews.llvm.org/D78002
2020-04-13 13:15:06 +03:00
Tyker 813f438baa [AssumeBundles] adapt Assumption cache to assume bundles
Summary: change assumption cache to store an assume along with an index to the operand bundle containing the knowledge.

Reviewers: jdoerfert, hfinkel

Reviewed By: jdoerfert

Subscribers: hiraditya, mgrang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77402
2020-04-13 12:04:51 +02:00
Benjamin Kramer 06408451bf Revert "[SCCP] Use SimplifyBinOp for non-integer constant/expressions & overdef."
This reverts commit 1a02aaeaa4. Crashes on
the following test case:

$ cat crash.ll
source_filename = "__compute_module"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-grtev4-linux-gnu"

@0 = private unnamed_addr constant [24 x i8] c"\00\00\C0\7F\00\00\C0\7F\09\85\08?\ED\C94\FE~\EB/\F3\90\CF\BA\C1"
@1 = private unnamed_addr constant [24 x i8] c"\00\00\C0\7F\A3\A0\0FA\00\00\C0\7F\00\00\C0\7F\00\00\00\00\02\9AA\00"

define void @IgammaSpecialValues.448() {
entry:
  br label %fusion.26.loop_header.dim.0

fusion.26.loop_header.dim.0:                      ; preds = %fusion.26.loop_header.dim.0, %entry
  %fusion.26.invar_address.dim.0.0 = phi i64 [ 0, %entry ], [ %invar.inc17, %fusion.26.loop_header.dim.0 ]
  %0 = getelementptr inbounds [6 x float], [6 x float]* bitcast ([24 x i8]* @0 to [6 x float]*), i64 0, i64 %fusion.26.invar_address.dim.0.0
  %1 = load float, float* %0
  %2 = fmul float %1, 0.000000e+00
  %3 = getelementptr inbounds [6 x float], [6 x float]* bitcast ([24 x i8]* @1 to [6 x float]*), i64 0, i64 %fusion.26.invar_address.dim.0.0
  %4 = load float, float* %3
  %5 = fneg float %4
  %6 = fadd float %2, %5
  %invar.inc17 = add nuw nsw i64 %fusion.26.invar_address.dim.0.0, 1
  br label %fusion.26.loop_header.dim.0
}

$ opt -ipsccp -S < crash.ll
opt: llvm/include/llvm/Analysis/ValueLattice.h:251: bool llvm::ValueLatticeElement::markConstant(llvm::Constant *, bool): Assertion `getConstant() == V && "Marking constant with different value"' failed.
2020-04-13 11:23:26 +02:00
Florian Hahn 18138e0252 [VPlan] Introduce VPWidenSelectRecipe (NFC).
Widening a selects depends on whether the condition is loop invariant or
not. Rather than checking during codegen-time, the information can be
recorded at the VPlan construction time.

This was suggested as part of D76992, to reduce the reliance on
accessing the original underlying IR values.

Reviewers: gilr, rengolin, Ayal, hsaito

Reviewed By: gilr

Differential Revision: https://reviews.llvm.org/D77869
2020-04-13 08:35:28 +01:00
Craig Topper f06cf9da89 [CallSite removal][CodeGen] Use CallBase instead of CallSite in getNoopInput in Analysis.cpp. NFC 2020-04-13 00:20:12 -07:00
Craig Topper 5889c5a814 [CallSite removal][CodeGen] Use CallBase instead of ImmutableCallSite in TargetFrameLoweringInfo. NFC 2020-04-13 00:20:12 -07:00
Craig Topper e59162960c [CallSite removal][CodeGen] Use CallBase instead of ImmutableCallSite in IntrinsicLowering. NFC 2020-04-13 00:19:27 -07:00
Craig Topper 83208cdd57 [CallSite removal][CodeGen] Use CallBase instead of ImmutableCallSite in WinEHPrepare. NFC 2020-04-13 00:19:27 -07:00
Craig Topper 42487eafa6 [CallSite removal][CodeGen] Use CallBase instead of ImmutableCallSite in SwiftErrorValueTracking. NFC 2020-04-13 00:19:27 -07:00
Fangrui Song 835c2aa7a6 [MC] Reorganize and improve macro tests
* Reorganize tests and add coverage
* Improve diagnostic testing
* Make assert() tests more relevant
* Rename tests to macro-* or altmacro-*

This is not NFC because a (previously untested) diagnostic message is changed.
2020-04-12 22:54:01 -07:00
Austin Kerbow eab9a4f119 [AMDGPU] Don't assert on partial exec copy
After Machine CSE and coalescing we can end up with copies of exec to
subregister SGPRs.

Differential Revision: https://reviews.llvm.org/D77992
2020-04-12 21:14:36 -07:00
Craig Topper dbb272b0a3 [CallSite removal][FastISel] Use CallBase instead of CallSite in fastLowerCall. 2020-04-12 18:02:24 -07:00
Chris Lattner 89c8ffd542 NFC: Clean up the implementation of StringPool a bit, and remove dependence on some "implicitly MallocAllocator" based methods on StringMapEntry. This allows reducing the #includes in StringMapEntry.h.
Summary:
StringPool has many caveats and isn't used in the monorepo.  I will
propose removing it as a patch separate from this refactoring patch.

Reviewers: rriddle

Subscribers: hiraditya, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77976
2020-04-12 16:37:17 -07:00
Eli Friedman cfb844265a [GlobalOpt] Explicitly set alignment of bool load/store operations. 2020-04-12 16:03:12 -07:00
Huihui Zhang 4bde7c5986 [NFC] Use VectorType::isScalable to align with ongoing VectorType refactor. 2020-04-12 15:39:13 -07:00
Craig Topper 42fc7852f5 [X86] Print k-mask in FMA3 comments. 2020-04-12 13:16:53 -07:00
Craig Topper 95192f548d [CallSite removal][TargetLowering] Use CallBase instead of CallSite in TargetLowering::ParseConstraints interface.
Differential Revision: https://reviews.llvm.org/D77929
2020-04-12 11:26:25 -07:00
Jonathan Roelofs 41f13f1f64 reland: [DAG] Fix PR45049: LegalizeTypes crash
Sometimes LegalizeTypes knows about common subexpressions before SelectionDAG
does, leading to accidental SDValue removal before its reference count was
truly zero.

Differential Revision: https://reviews.llvm.org/D76994

Reviewed-By: bjope

Fixes: https://bugs.llvm.org/show_bug.cgi?id=45049

Reverted in 3ce77142a6 because the previous patch
broke the expensive-checks bots. The new patch removes the broken check.
2020-04-12 09:52:17 -06:00
Mircea Trofin d2f1cd5d97 [llvm][NFC] Refactor uses of CallSite to CallBase - call promotion
Summary:
Updated CallPromotionUtils and impacted sites. Parameters that are
expected to be non-null, and return values that are guranteed non-null,
were replaced with CallBase references rather than pointers.

Left FIXME in places where more changes are facilitated by CallBase, but
aren't CallSites: Instruction* parameters or return values, for example,
where the contract that they are actually CallBase values.

Reviewers: davidxl, dblaikie, wmi

Reviewed By: dblaikie

Subscribers: arsenm, jvesely, nhaehnle, eraman, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77930
2020-04-12 08:27:29 -07:00
Chris Lattner 617b08ff9b Refactor StringMap.h, splitting StringMapEntry out to its own header.
Summary:
StringMapEntry.h can have lower dependencies, than StringMap.h, which
is useful for public headers that want to expose inline methods on
StringMapEntry<> but don't need to expose all of StringMap.h.  One
example of this is mlir's Identifier.h, another example is the existing
LLVM StringPool.h.

StringPool also could use a cleanup, I'll deal with that in a follow-on
patch.

Reviewers: rriddle

Subscribers: hiraditya, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77963
2020-04-12 08:25:17 -07:00
Sanjay Patel d04db4825a [x86] use vector instructions to lower FP->int->FP casts
As discussed in PR36617:
https://bugs.llvm.org/show_bug.cgi?id=36617#c13
...we can avoid the likely slow round-trip from XMM to GPR to XMM
by using the vector versions of the convert instructions.

Based on experimental results from recent Intel/AMD chips, we don't
need to worry about triggering denorm stalls while operating on
garbage data in the high lanes with convert instructions, so this is
expected to always be as good or better perf than the scalar
instruction equivalent. FP exceptions are also not a concern because
strict code should not be using the regular SDAG opcodes.

Differential Revision: https://reviews.llvm.org/D77895
2020-04-12 10:26:43 -04:00
Sanjay Patel c23cbefd9d [VectorUtils] add IR-level analysis for widening of shuffle mask
This is similar to the recent move/addition of "scaleShuffleMask" (D76508),
but there are a couple of differences:

1. The existing x86 helper (canWidenShuffleElements) always tries to
   divide-by-2, so it gets called iteratively and wouldn't handle the
   general case of non-pow-2 length.
2. The existing x86 code handles "SM_SentinelZero" - we don't have
   that in IR, but this code should be safe to use with that or other
   special (negative) values.

The motivation is to enable shuffle folds in instcombine/vector-combine
that are similar to D76844 and D76727, but in the reverse-bitcast direction.
Those patterns are visible in the tests for D40633.

Differential Revision: https://reviews.llvm.org/D77881
2020-04-12 10:14:19 -04:00
Simon Pilgrim 2b74755ec5 TrigramIndex.h - remove unnecessary StringMap.h include. NFC
Include StringRef.h inside TrigramIndex.cpp as thats the only part of StringMap.h that is actually required.
2020-04-12 14:30:52 +01:00
Florian Hahn ae1e353a25 [VPlan] Turn classes with all public members into structs (NFC).
struct should be used when all members are public:
 https://llvm.org/docs/CodingStandards.html#use-of-class-and-struct-keywords

Reviewers: gilr, rengolin, Ayal, hsaito

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D77865
2020-04-12 11:03:39 +01:00
Simon Pilgrim 40581a0a2b [X86] Use isAnyZero shuffle mask helper where possible. NFC. 2020-04-12 10:57:29 +01:00
Craig Topper 5b42399029 [CallSite removal][FastISel] Remove uses of CallSite.
Differential Revision: https://reviews.llvm.org/D77933
2020-04-11 20:52:45 -07:00
Craig Topper d3465e0691 [X86] Enable shuffle combining for AVX512 unless the root is used by a vselect
A lot of vectorized code doesn't use masks so we shouldn't penalize them by not doing shuffle combining on avx512 targets.

I've added support for VALIGNQ/VALIGND and 512-bit SHUF128 to prevent some regressions. I also prevented recombining 256-bit SHUF128 to PERM2X128. We may not need to add 256-bit SHUF128 support, but I don't think I found any cases requiring that in my testing.

Differential Revision: https://reviews.llvm.org/D77928
2020-04-11 20:05:10 -07:00
Matt Arsenault 96819011ca AMDGPU/GlobalISel: Fix RegBankSelect for v2s16 shifts
These need to be promoted and scalarized for the SALU.
2020-04-11 20:55:33 -04:00
David Blaikie 7a45aeacf3 Revert "llvm-dwarfdump: Report errors when failing to parse loclist/debug_loc entries"
Broke an LLDB build bot & I can't seem to build LLDB locally to fix
forward...
http://lab.llvm.org:8011/builders/lldb-x64-windows-ninja/builds/15567/steps/test/logs/stdio

This reverts commit 416fa7720e.
2020-04-11 16:54:49 -07:00
Matt Arsenault ac8d51a3c6 AMDGPU/GlobalISel: Legalize 16-bit shift amounts to s16
The current selector depends on 16-bit shifts using 16-bit shift
amount types, but really it should accept either for all types.
2020-04-11 18:12:26 -04:00
Craig Topper d1da1b53ff [X86] Cleanup ISD::BRIND handling code in X86DAGToDAGISel::Select. NFC
-Drop llvm:: on MVT::i32
-Use getValueType instead of getSimpleValueType for an equality
check just cause its shorter and doesn't matter.
-Don't create a const SDValue & since its cheap to copy.
-Remove explicit case from MVT enum to EVT.
-Add message to assert.
2020-04-11 15:01:05 -07:00
Craig Topper 21a7d08e72 [X86] Move code that replaces ISD::VSELECT with X86ISD::BLENDV from X86DAGToDAGISel::Select to PreprocessISelDAG 2020-04-11 15:01:05 -07:00
Craig Topper 806763efcf [CallSite removal][SelectionDAGBuilder] Use CallBase instead of ImmutableCallSite in visitPatchpoint.
Differential Revision: https://reviews.llvm.org/D77932
2020-04-11 13:07:31 -07:00
Matt Arsenault 1747ba25b2 GlobalISel: Fix typo in assert message 2020-04-11 16:02:26 -04:00
Matt Arsenault c5497e5399 AMDGPU/GlobalISel: Fix legalizing <3 x s16> vselects 2020-04-11 15:59:51 -04:00
Hongtao Yu 11455a7905 [CodeGen] Allow partial tail duplication in Machine Block Placement.
Summary: A count profile may affect tail duplication's heuristic causing a block to be duplicated in only a part of its predecessors. This is not allowed in the Machine Block Placement pass where an assert will go off. I'm removing the assert and making the optimization bail out when such case happens.

Reviewers: wenlei, davidxl, Carrot

Reviewed By: Carrot

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77748
2020-04-11 12:20:31 -07:00
Fangrui Song 0a55d3f557 [MC] Default MCAsmInfo::UseIntegratedAssembler to true 2020-04-11 10:13:52 -07:00
Fangrui Song d2e5157c1f [MC] Add UseIntegratedAssembler = false. NFC 2020-04-11 10:13:49 -07:00
Benjamin Kramer 52dcbcbfe0 Simplify string joins. NFCI. 2020-04-11 17:20:11 +02:00
Matt Arsenault cf29333f40 AMDGPU/GlobalISel: Work around forming illegal zextload after legalize
Selection would fail after the post legalize combiner put an illegal
zextload back together.

The base combiner has parameter to only allow legal operations, but
they appear to not be used. I also don't see a nice way to remove a
single entry from all_combines, so just hack around this.
2020-04-11 10:52:58 -04:00
Sanjay Patel 1318ddbc14 [VectorUtils] rename scaleShuffleMask to narrowShuffleMaskElts; NFC
As proposed in D77881, we'll have the related widening operation,
so this name becomes too vague.

While here, change the function signature to take an 'int' rather
than 'size_t' for the scaling factor, add an assert for overflow of
32-bits, and improve the documentation comments.
2020-04-11 10:05:49 -04:00
Benjamin Kramer e590bd6b92 [argpromote] Use formatv to simplify code. NFCI. 2020-04-11 14:54:32 +02:00
Benjamin Kramer 5ef2cb3df4 [FormatVariadic] Reduce allocations
- Move Adapters array to the stack, we know the size precisely
- Parse format string on demand into a SmallVector. In theory this could
  lead to parsing it multiple times, but I couldn't find a single instance
  of that in LLVM.
- Make more of the implementation details private.
2020-04-11 14:54:32 +02:00
Nemanja Ivanovic 512600e3c0 [PowerPC] Handle f16 as a storage type only
The PPC back end currently crashes (fails to select) with f16 input. This patch
expands it on subtargets prior to ISA 3.0 (Power9) and uses the HW conversions
on Power9.

Fixes https://bugs.llvm.org/show_bug.cgi?id=39865

Differential revision: https://reviews.llvm.org/D68237
2020-04-11 07:34:47 -05:00
Florian Hahn 719846c469 [VPlan] Drop redundant private: at beginning of class defs (NFC).
Default visibility for classes is private, so the private: at the top of
various class definitions is redundant.

Reviewers: gilr, rengolin, Ayal, hsaito

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D77810
2020-04-11 13:27:10 +01:00
Simon Pilgrim 89f6ca05b7 CodeGen/EdgeBundles - move Twine.h include down into EdgeBundles.cpp. NFC.
EdgeBundles.h has no use for it.
2020-04-11 12:21:04 +01:00
Craig Topper 9c1842d8af Change FastISel::CallLoweringInfo::CS to be an ImmutableCallSite instead of a pointer. NFCI.
This is the same as what was done to the CallLoweringInfo in
TargetLowering.h in r309159.

This is just a step on the way to replacing this with CallBase.
2020-04-10 23:45:36 -07:00
Shengchen Kan 5d73f79c54 [X86][MC] Make -x86-pad-max-prefix-size compatible with --mc-relax-all
Summary: We allow non-relaxable instructions emitted into relaxable Fragment when we prefix padding branch. So we need to check if the instruction need relaxation before relaxing it.  Without this patch, it currently triggers a `report_fatal_error` in `llvm::MCAsmBackend::relaxInstruction` when we prefix padding branch along with `--mc-relax-all`.

Reviewers: LuoYuanke, reames, MaskRay

Reviewed By: MaskRay

Subscribers: MaskRay, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77851
2020-04-11 11:30:15 +08:00
Craig Topper f49f6cf91e [CallSite removal][SelectionDAGBuilder] Remove most CallSite usage from visitInlineAsm.
I only left it at the interface to ParseConstraints since that
needs updates to other callers in different files. I'll do that
as a follow up.

Differential Revision: https://reviews.llvm.org/D77892
2020-04-10 19:23:33 -07:00
Nemanja Ivanovic 04eae39617 [PowerPC] Another folow-up fix for 6c4b40def7
There was another issue introduced by this commit that the OP
initially missed. Namely, for functions that are free to use
R2 as a callee-saved register, we emit a TOC expression based
on the address of the GEP label without emitting the GEP label.
Since we only emit such expressions for the large code model, this
issue only surfaced there.

I have confirmed that with this fix, the kernel build is successful
with target "all".
2020-04-10 21:09:59 -05:00
Scott Constable 0505181006 [X86] Fix to X86LoadValueInjectionRetHardeningPass for possible segfault
`MBB.back()` could segfault if `MBB.empty()`. Fixed by checking for `MBB.empty()` in the loop.

Differential Revision: https://reviews.llvm.org/D77584
2020-04-10 18:28:08 -07:00
Mehdi Amini ed03d9485e Revert "[TLI] Per-function fveclib for math library used for vectorization"
This reverts commit 60c642e74b.

This patch is making the TLI "closed" for a predefined set of VecLib
while at the moment it is extensible for anyone to customize when using
LLVM as a library.
Reverting while we figure out a way to re-land it without losing the
generality of the current API.

Differential Revision: https://reviews.llvm.org/D77925
2020-04-11 01:05:01 +00:00
Matt Arsenault 49ae0fc2f0 GlobalISel: Fix incorrect lowering G_FCOPYSIGN
In the basic case, this was reading the sign from the wrong operand.
2020-04-10 21:00:25 -04:00
Huihui Zhang 6e7eeb44b3 [GVN] Fix VNCoercion for Scalable Vector.
Summary:
For VNCoercion, skip scalable vector when analysis rely on fixed size,
otherwise call TypeSize::getFixedSize() explicitly.

Add unit tests to check funtionality of GVN load elimination for scalable type.

Reviewers: sdesmalen, efriedma, spatel, fhahn, reames, apazos, ctetreau

Reviewed By: efriedma

Subscribers: bjope, hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76944
2020-04-10 17:49:07 -07:00
David Blaikie 416fa7720e llvm-dwarfdump: Report errors when failing to parse loclist/debug_loc entries
This probably isn't ideal - the error was being printed specifically
inline with the dumping that was more legible - but then the error
wasn't reported to stderr and didn't produce a non-zero exit code.

Probably the error message could be improved by adding more context now
that it isn't printed in-situ of the DIE dumping as much.
2020-04-10 17:28:09 -07:00
Eric Christopher 45dca04395 Exclude bitcast and ext/trunc signbit optimization on ppc_fp128
Revision a1c05fe <https://reviews.llvm.org/rGa1c05fe20f3def1f1be9f50d2adefc6b6f1578ad>
removed bitcast from the list of problematic transformations, however:

  %97 = fptrunc ppc_fp128 %2 to double            // we need to check ppc_fp128 here to prevent the transformation
  %98 = bitcast double %97 to i64                 // a1c05fe checks ppc_fp128 at here
  %99 = icmp slt i64 %98, 0
  %100 = zext i1 %99 to i8
  store i8 %100, i8* %7, align 1

so this patch does that. I'm also disabling it in the presence of extend just in case.

I verified separately that the hash of -std::infinity and std::infinity don't match now.

Differential Revision: https://reviews.llvm.org/D77911
2020-04-10 17:07:55 -07:00
Huihui Zhang 6c989d0248 [BasicAA] Fix aliasGEP/DecomposeGEPExpression for scalable type.
Summary:
Don't attempt to analyze the decomposed GEP for scalable type.
GEP index scale is not compile-time constant for scalable type.
Be conservative, return MayAlias.

Explicitly call TypeSize::getFixedSize() to assert on places where
scalable type doesn't make sense.

Add unit tests to check functionality of -basicaa for scalable type.

This patch is needed for D76944.

Reviewers: sdesmalen, efriedma, spatel, bjope, ctetreau

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77828
2020-04-10 16:58:26 -07:00
Sam Clegg 16206ee07d [WebAssembly] Minor cleanup to WebAssemblySubtarget. NFC.
Pretty much all other platforms pass CPU string as arg0 of
initializeSubtargetDependencies.

Differential Revision: https://reviews.llvm.org/D77894
2020-04-10 16:47:39 -07:00
Daniel Sanders f71350f05a Add -debugify-and-strip-all to add debug info before a pass and remove it after
Summary:
This allows us to test each backend pass under the presence
of debug info using pre-existing tests. The tests should not
fail as a result of this so long as it's true that debug info
does not affect CodeGen.

In practice, a few tests are sensitive to this:
* Tests that check the pass structure (e.g. O0-pipeline.ll)
* Tests that check --debug output. Specifically instruction
  dumps containing MMO's (e.g. prelegalizercombiner-extends.ll)
* Tests that contain debugify metadata as mir-strip-debug will
  remove it (e.g. fastisel-debugvalue-undef.ll)
* Tests with partial debug info (e.g.
  patchable-function-entry-empty.mir had debug info but no
  !llvm.dbg.cu)
* Tests that check optimization remarks overly strictly (e.g.
  prologue-epilogue-remarks.mir)
* Tests that would inject the pass in an unsafe region (e.g.
  seqpairspill.mir would inject between register alloc and
  virt reg rewriter)
In all cases, the checks can either be updated or
--debugify-and-strip-all-safe=0 can be used to avoid being
affected by something like llvm-lit -Dllc='llc --debugify-and-strip-all-safe'

I tested this without the lost debug locations verifier to
confirm that AArch64 behaviour is unaffected (with the fixes
in this patch) and with it to confirm it finds the problems
without the additional RUN lines we had before.

Depends on D77886, D77887, D77747

Reviewers: aprantl, vsk, bogner

Subscribers: qcolombet, kristof.beyls, hiraditya, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77888
2020-04-10 16:36:07 -07:00
Lang Hames 59ed45b483 [ORC] Add an OrcV2 C API function for configuring TargetMachines. 2020-04-10 15:51:29 -07:00
Mircea Trofin da9bcdaad9 [llvm][NFC] Inliner.cpp: ensure InlineHistory ID is always initialized;
Summary:
The inline history is associated with a call site. There are two locations
we fetch inline history. In one, we fetch it together with the call
site. In the other, we initialize it under certain conditions, use it
later under same conditions (different if check), and otherwise is
uninitialized. Although currently there is no uninitialized use, the
code is more challenging to maintain correctly, than if the value were
always initialized.

Changed to the upfront initialization pattern already present in this
file.

Reviewers: davidxl, dblaikie

Subscribers: eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77877
2020-04-10 15:28:53 -07:00
Daniel Sanders dfca98d6a8 [mir-strip-debug] Optionally preserve debug info that wasn't from debugify/mir-debugify
Summary:
A few tests start out with debug info and expect it to reach
the output. For these tests we shouldn't strip the debug info

Reviewers: aprantl, vsk, bogner

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77886
2020-04-10 15:24:14 -07:00
Christopher Tetreault 889f6606ed Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: stoklund, sdesmalen, efriedma

Reviewed By: sdesmalen

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77272
2020-04-10 14:53:43 -07:00
Christopher Tetreault 40ed21bb71 Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: dexonsmith, sdesmalen, efriedma

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77276
2020-04-10 14:18:47 -07:00
Christopher Tetreault 2a922da3a9 Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: dexonsmith, sdesmalen, efriedma

Reviewed By: sdesmalen

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77274
2020-04-10 13:58:11 -07:00
Daniel Sanders c162bc2aed Make TargetPassConfig and llc add pre/post passes the same way. NFC
Summary:
At the moment, any changes we make to the passes that can be
injected before/after others (e.g. -verify-machineinstrs and
-print-after-all) have to be duplicated in both
TargetPassConfig (for normal execution, -start-before/
-stop-before/etc) and llc (for -run-pass). Unify this pass
injection into addMachinePrePass/addMachinePostPass that both
TargetPassConfig and llc can use.

Reviewers: vsk, aprantl, bogner

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77887
2020-04-10 13:46:53 -07:00
Craig Topper b8a108140d [CallSite removal][X86] Remove uses of CallSite from X86WinEHState.cpp
Differential Revision: https://reviews.llvm.org/D77862
2020-04-10 11:34:06 -07:00
Marcello Maggioni ea11f4726f Split LiveRangeCalc in LiveRangeCalc/LiveIntervalCalc. NFC
Summary:
Refactor LiveRangeCalc such that it is now split into two classes

The objective is to split all the "register specific" logic away
from LiveRangeCalc.
The two new classes created are:

- LiveRangeCalc - is meant as a generic class to compute and modify
  live ranges in a generic way. This class should deal only with
  SlotIndices and VNInfo objects.

- LiveIntervalCals - is meant to be equivalent to the old LiveRangeCalc.
  It computes the liveness virtual registers tracked by a LiveInterval
  object.

With this refactoring LiveRangeCalc can be used to implement tracking of
liveness of LiveRanges that represent other things than just registers.

Subscribers: MatzeB, qcolombet, mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76584
2020-04-10 11:26:21 -07:00
Sumanth Gundapaneni a04ab2ec08 [Pipeliner] Fix the bug in pragma that disables the pipeliner.
Differential Revision: https://reviews.llvm.org/D76303.
2020-04-10 12:52:16 -05:00
Matt Morehouse bef187c750 Implement `-fsanitize-coverage-whitelist` and `-fsanitize-coverage-blacklist` for clang
Summary:
This commit adds two command-line options to clang.
These options let the user decide which functions will receive SanitizerCoverage instrumentation.
This is most useful in the libFuzzer use case, where it enables targeted coverage-guided fuzzing.

Patch by Yannis Juglaret of DGA-MI, Rennes, France

libFuzzer tests its target against an evolving corpus, and relies on SanitizerCoverage instrumentation to collect the code coverage information that drives corpus evolution. Currently, libFuzzer collects such information for all functions of the target under test, and adds to the corpus every mutated sample that finds a new code coverage path in any function of the target. We propose instead to let the user specify which functions' code coverage information is relevant for building the upcoming fuzzing campaign's corpus. To this end, we add two new command line options for clang, enabling targeted coverage-guided fuzzing with libFuzzer. We see targeted coverage guided fuzzing as a simple way to leverage libFuzzer for big targets with thousands of functions or multiple dependencies. We publish this patch as work from DGA-MI of Rennes, France, with proper authorization from the hierarchy.

Targeted coverage-guided fuzzing can accelerate bug finding for two reasons. First, the compiler will avoid costly instrumentation for non-relevant functions, accelerating fuzzer execution for each call to any of these functions. Second, the built fuzzer will produce and use a more accurate corpus, because it will not keep the samples that find new coverage paths in non-relevant functions.

The two new command line options are `-fsanitize-coverage-whitelist` and `-fsanitize-coverage-blacklist`. They accept files in the same format as the existing `-fsanitize-blacklist` option <https://clang.llvm.org/docs/SanitizerSpecialCaseList.html#format>. The new options influence SanitizerCoverage so that it will only instrument a subset of the functions in the target. We explain these options in detail in `clang/docs/SanitizerCoverage.rst`.

Consider now the woff2 fuzzing example from the libFuzzer tutorial <https://github.com/google/fuzzer-test-suite/blob/master/tutorial/libFuzzerTutorial.md>. We are aware that we cannot conclude much from this example because mutating compressed data is generally a bad idea, but let us use it anyway as an illustration for its simplicity. Let us use an empty blacklist together with one of the three following whitelists:

```
  # (a)
  src:*
  fun:*

  # (b)
  src:SRC/*
  fun:*

  # (c)
  src:SRC/src/woff2_dec.cc
  fun:*
```

Running the built fuzzers shows how many instrumentation points the compiler adds, the fuzzer will output //XXX PCs//. Whitelist (a) is the instrument-everything whitelist, it produces 11912 instrumentation points. Whitelist (b) focuses coverage to instrument woff2 source code only, ignoring the dependency code for brotli (de)compression; it produces 3984 instrumented instrumentation points. Whitelist (c) focuses coverage to only instrument functions in the main file that deals with WOFF2 to TTF conversion, resulting in 1056 instrumentation points.

For experimentation purposes, we ran each fuzzer approximately 100 times, single process, with the initial corpus provided in the tutorial. We let the fuzzer run until it either found the heap buffer overflow or went out of memory. On this simple example, whitelists (b) and (c) found the heap buffer overflow more reliably and 5x faster than whitelist (a). The average execution times when finding the heap buffer overflow were as follows: (a) 904 s, (b) 156 s, and (c) 176 s.

We explain these results by the fact that WOFF2 to TTF conversion calls the brotli decompression algorithm's functions, which are mostly irrelevant for finding bugs in WOFF2 font reconstruction but nevertheless instrumented and used by whitelist (a) to guide fuzzing. This results in longer execution time for these functions and a partially irrelevant corpus. Contrary to whitelist (a), whitelists (b) and (c) will execute brotli-related functions without instrumentation overhead, and ignore new code paths found in them. This results in faster bug finding for WOFF2 font reconstruction.

The results for whitelist (b) are similar to the ones for whitelist (c). Indeed, WOFF2 to TTF conversion calls functions that are mostly located in SRC/src/woff2_dec.cc. The 2892 extra instrumentation points allowed by whitelist (b) do not tamper with bug finding, even though they are mostly irrelevant, simply because most of these functions do not get called. We get a slightly faster average time for bug finding with whitelist (b), which might indicate that some of the extra instrumentation points are actually relevant, or might just be random noise.

Reviewers: kcc, morehouse, vitalybuka

Reviewed By: morehouse, vitalybuka

Subscribers: pratyai, vitalybuka, eternalsakura, xwlin222, dende, srhines, kubamracek, #sanitizers, lebedev.ri, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D63616
2020-04-10 10:44:03 -07:00
Fangrui Song a7aaaf7016 [MC][RISCV] Make .reloc support arbitrary relocation types
Similar to D76746 (ARM), D76754 (AArch64) and llvmorg-11-init-6967-g152d14da64c (x86)

Differential Revision: https://reviews.llvm.org/D77018
2020-04-10 10:43:53 -07:00
Matt Arsenault 4593e4131a AMDGPU: Teach toolchain to link rocm device libs
Currently the library is separately linked, but this isn't correct to
implement fast math flags correctly. Each module should get the
version of the library appropriate for its combination of fast math
and related flags, with the attributes propagated into its functions
and internalized.

HIP already maintains the list of libraries, but this is not used for
OpenCL. Unfortunately, HIP uses a separate --hip-device-lib argument,
despite both languages using the same bitcode library. Eventually
these two searches need to be merged.

An additional problem is there are 3 different locations the libraries
are installed, depending on which build is used. This also needs to be
consolidated (or at least the search logic needs to deal with this
unnecessary complexity).
2020-04-10 13:37:32 -04:00
Craig Topper a6732069ee [CallSite removal][X86] Remove unneeded use of CallSite. NFC
We already have a CallInst, we can just get the calling convention from it.
2020-04-10 10:27:21 -07:00
Kevin P. Neal 7f38812d5b [FPEnv][AArch64] Platform-specific builtin constrained FP enablement
When constrained floating point is enabled the AArch64-specific builtins don't use constrained intrinsics in some cases. Fix that.

Neon is part of this patch, so ARM is affected as well.

Differential Revision: https://reviews.llvm.org/D77074
2020-04-10 13:02:00 -04:00
Fangrui Song 7f36cb1f1a [AArch64InstPrinter] Change printAlignedLabel to print the target address in hexadecimal form
Similar to D76580 (x86) and D76591 (PPC).

```
// llvm-objdump -d output (before)
10000: 08 00 00 94                   bl      #32
10004: 08 00 00 94                   bl      #32

// llvm-objdump -d output (after)
10000: 08 00 00 94                   bl      0x10020
10004: 08 00 00 94                   bl      0x10024

// GNU objdump -d. The lack of 0x is not ideal due to ambiguity.
10000:       94000008        bl      10020 <bar+0x18>
10004:       94000008        bl      10024 <bar+0x1c>
```

The new output makes it easier to find the jump target.

Differential Revision: https://reviews.llvm.org/D77853
2020-04-10 09:21:09 -07:00
Simon Pilgrim 1824ae0f42 [X86] Remove defunct EmitLoweredAtomicFP declaration. NFC. 2020-04-10 17:05:07 +01:00
Simon Pilgrim dd84a2f77a [X86] Remove defunct emitFMA3Instr declaration. NFC. 2020-04-10 17:05:06 +01:00
Christopher Tetreault 65b8b643b4 Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: sdesmalen, efriedma, jonpa

Reviewed By: sdesmalen

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77265
2020-04-10 08:43:32 -07:00
Simon Pilgrim a88cc20456 ProfileSummaryInfo.h - remove unnecessary includes. NFC
Remove a number of includes that aren't necessary (nor are we relying on the remaining includes to provide the declarations), we just needed a llvm::Instruction forward declaration.

This exposed a couple of source files that were implicitly replying on the includes for their use of llvm::SmallSet or std::set, requiring local includes to be added there instead.
2020-04-10 16:25:48 +01:00
Stanislav Mekhanoshin 44920e8566 [AMDGPU] Disable sub-dword scralar loads IR widening
These will be widened in the DAG. In the meanwhile early
widening prevents otherwise possible vectorization of
such loads.

Differential Revision: https://reviews.llvm.org/D77835
2020-04-10 08:20:49 -07:00
Mircea Trofin f62335b534 [llvm][NFC] Style fixes in Inliner.cpp
Summary:
Function names: camel case, lower case first letter.
Variable names: start with upper letter. For iterators that were 'i',
renamed with a descriptive name, as 'I' is 'Instruction&'.

Lambda captures simplification.

Opportunistic boolean return simplification.

Reviewers: davidxl, dblaikie

Subscribers: eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77837
2020-04-10 08:04:39 -07:00
Ilya Leoshkevich 3bc439bdff [MSan] Add instrumentation for SystemZ
Summary:
This patch establishes memory layout and adds instrumentation. It does
not add runtime support and does not enable MSan, which will be done
separately.

Memory layout is based on PPC64, with the exception that XorMask
is not used - low and high memory addresses are chosen in a way that
applying AndMask to low and high memory produces non-overlapping
results.

VarArgHelper is based on AMD64. It might be tempting to share some
code between the two implementations, but we need to keep in mind that
all the ABI similarities are coincidental, and therefore any such
sharing might backfire.

copyRegSaveArea() indiscriminately copies the entire register save area
shadow, however, fragments thereof not filled by the corresponding
visitCallSite() invocation contain irrelevant data. Whether or not this
can lead to practical problems is unclear, hence a simple TODO comment.
Note that the behavior of the related copyOverflowArea() is correct: it
copies only the vararg-related fragment of the overflow area shadow.

VarArgHelper test is based on the AArch64 one.

s390x ABI requires that arguments are zero-extended to 64 bits. This is
particularly important for __msan_maybe_warning_*() and
__msan_maybe_store_origin_*() shadow and origin arguments, since non
zeroed upper parts thereof confuse these functions. Therefore, add ZExt
attribute to the corresponding parameters.

Add ZExt attribute checks to msan-basic.ll. Since with
-msan-instrumentation-with-call-threshold=0 instrumentation looks quite
different, introduce the new CHECK-CALLS check prefix.

Reviewers: eugenis, vitalybuka, uweigand, jonpa

Reviewed By: eugenis

Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits, stefansf, Andreas-Krebbel

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76624
2020-04-10 16:53:49 +02:00
Christopher Tetreault 3bebf02861 Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: sdesmalen, rriddle, efriedma

Reviewed By: sdesmalen

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77262
2020-04-10 07:47:19 -07:00
Jessica Clarke 49e20c4c9e [RISCV] Consume error from parsing attributes section
Summary:
We don't consume the error from getBuildAttributes, so an assertions
build crashes with "Program aborted due to an unhandled Error:".
Explicitly consume it like the ARM version in that case.

Reviewers: asb, jhenderson, MaskRay, HsiangKai

Reviewed By: MaskRay

Subscribers: kristof.beyls, hiraditya, simoncook, kito-cheng, shiva0217, rogfer01, rkruppe, psnobl, benna, Jim, lenary, s.egerton, sameer.abuasal, luismarques, evandro, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77841
2020-04-10 15:05:53 +01:00
Simon Pilgrim 91bc50c0d7 [CostModel][X86] Improve InsertElement costs for sub-128bit vectors
If we're inserting into v2i8/v4i8/v8i8/v2i16/v4i16 style sub-128bit vectors ensure we don't use the SK_PermuteTwoSrc cost of the legalized value type - this is a followup to rG12c629ec6c59 which added equivalent sub-128bit shuffle costs
2020-04-10 14:55:46 +01:00
Florian Hahn 1a02aaeaa4 [SCCP] Use SimplifyBinOp for non-integer constant/expressions & overdef.
For non-integer constants/expressions and overdefined, I think we can
just use SimplifyBinOp to do common folds. By just passing a context
with the DL, SimplifyBinOp should not try to get additional information
from looking at definitions.

For overdefined values, it should be enough to just pass the original
operand.

Note: The comment before the `if (isconstant(V1State)...` was wrong
originally: isConstant() also matches integer ranges with a single
element. It is correct now.

Reviewers: efriedma, davide, mssimpso, aartbik

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D76459
2020-04-10 11:02:57 +01:00
Mehdi Amini bbeeb35c1f Revert "[DomTree] Replace ChildrenGetter with GraphTraits over GraphDiff."
This reverts commit 0445c64998.

MLIR Build is broken by this change at the moment.
2020-04-10 07:44:06 +00:00
Alina Sbirlea 0445c64998 [DomTree] Replace ChildrenGetter with GraphTraits over GraphDiff.
This replaces the ChildrenGetter inside the DominatorTree with
GraphTraits over a GraphDiff object, an object which encapsulated the
view of the previous CFG.
This also simplifies the extentions in clang which use DominatorTree, as
GraphDiff also filters nullptrs.

Re-land a90374988e after moving CFGDiff.h
to Support.

Differential Revision: https://reviews.llvm.org/D77341
2020-04-10 07:38:53 +00:00
Michael Liao b54b4ecac3 Fix `-Wextra` warning. NFC. 2020-04-10 03:22:02 -04:00
Mehdi Amini 57d2d48399 Revert "[DomTree] Replace ChildrenGetter with GraphTraits over GraphDiff."
This reverts commit a90374988e and 5da1671bf8.

A new dependency is introduced here from Support to IR which seems like
a layering violation. It also breaks the MLIR build at the moment.
2020-04-10 06:27:59 +00:00
Kai Luo b7d5229d78 [PowerPC] Update alignment for ReuseLoadInfo in LowerFP_TO_INTForReuse
In LowerFP_TO_INTForReuse, when emitting `stfiwx`, alignment of 4 is
set for the `MachineMemOperand`, but RLI(ReuseLoadInfo)'s alignment is
not updated for following loads.

It's related to failed alignment check reported in
https://bugs.llvm.org/show_bug.cgi?id=45297

Differential Revision: https://reviews.llvm.org/D77624
2020-04-10 05:49:19 +00:00
John McCall 8423a6f363 Rename OptimalLayout to OptimizedStructLayout at Chris's request. 2020-04-10 00:14:20 -04:00
David Blaikie e0fd87cc64 llvm-dwarfdump: Return non-zero on error
Makes it easier to test "this doesn't produce an error" (& indeed makes
that the implied default so we don't accidentally write tests that have
silent/sneaky errors as well as the positive behavior we're testing for)

Though the support for applying relocations is patchy enough that a
bunch of tests treat lack of relocation application as more of a warning
than an error - so rather than me trying to figure out how to add
support for a bunch of relocation types, let's degrade that to a warning
to match the usage (& indeed, it's sort of more of a tool warning anyway
- it's not that the DWARF is wrong, just that the tool can't fully cope
with it - and it's not like the tool won't dump the DWARF, it just won't
follow/render certain relocations - I guess in the most general case it
might try to render an unrelocated value & instead render something
bogus... but mostly seems to be about interesting relocations used in
eh_frame (& honestly it might be nice if we were lazier about doing this
relocation resolution anyway - if you're not dumping eh_frame, should we
really be erroring about the relocations in it?))
2020-04-09 20:53:58 -07:00
Nemanja Ivanovic 7f3787c0f2 [PowerPC] Bail out of redundant LI elimination on an implicit kill
The transformation currently does not differentiate between explicit
and implicit kills. However, it is not valid to later simply clear
an implicit kill flag since the kill could be due to a call or return.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=45374
2020-04-09 22:17:29 -05:00
Serguei Katkov 4275eb1331 Re-land [Codegen/Statepoint] Allow usage of registers for non gc deopt values.
The change introduces the usage of physical registers for non-gc deopt values.
This require runtime support to know how to take a value from register.
By default usage is off and can be switched on by option.

The change also introduces additional fix-up patch which forces the spilling
of caller saved registers (clobbered after the call) and re-writes statepoint
to use spill slots instead of caller saved registers.

Reviewers: reames, danstrushin
Reviewed By: dantrushin
Subscribers: mgorny, hiraditya, mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D77797
2020-04-10 10:13:39 +07:00
Max Kazantsev 4e87823026 [LoopLoadElim] Fix crash by always checking simplify form
Loop simplify form should always be checked because logic of
propagateStoredValueToLoadUsers relies on it (in particular, it
requires preheader).

Reviewed By: Fedor Sergeev, Florian Hahn
Differential Revision: https://reviews.llvm.org/D77775
2020-04-10 09:23:28 +07:00
Heejin Ahn b647de9925 [WebAssembly] Use dummy debug info in Emscripten SjLj
Summary:
D74269 added debug info to newly created instructions, including calls
to `malloc` and `free`, by taking debug info from existing instructions
around, whose debug info may or may not be empty.

But there are cases debug info is required by the IR verifier: when both
the caller and the callee functions have DISubprograms, meaning we
already have declarations to `malloc` or `free` with a DISubprogram
attached, newly created calls to `malloc` and `free` should have
non-empty debug info. This patch creates a non-empty dummy debug info in
this case to those calls to make the IR verifier pass.

Fixes https://bugs.llvm.org/show_bug.cgi?id=45461.

Reviewers: dschuff

Subscribers: aprantl, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77784
2020-04-09 18:44:50 -07:00
Wenlei He 60c642e74b [TLI] Per-function fveclib for math library used for vectorization
Summary:
Encode `-fveclib` setting as per-function attribute so it can threaded through to LTO backends. Accordingly per-function TLI now reads
the attributes and select available vector function list based on that. Now we also populate function list for all supported vector
libraries for the shared per-module `TargetLibraryInfoImpl`, so each function can select its available vector list independently but without
duplicating the vector function lists. Inlining between incompatbile vectlib attributed is also prohibited now.

Subscribers: hiraditya, dexonsmith, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77632
2020-04-09 18:26:38 -07:00
Stefan Pintilie 5b18b6e9a8 [PowerPC][Future] Fix for 6c4b40def7
This is a fix for the previous patch 6c4b40def7.
In some cases it may be possible to have the compiler produce st_other=1 without
the compiler using mcpu=future which should not be the case. This patch adds a
guard to make sure that if we are using st_other=1 then we are also compiling
for future CPU.
2020-04-10 01:12:11 +00:00
Alina Sbirlea a90374988e [DomTree] Replace ChildrenGetter with GraphTraits over GraphDiff.
Summary:
This replaces the ChildrenGetter inside the DominatorTree with
GraphTraits over a GraphDiff object, an object which encapsulated the
view of the previous CFG.
This also simplifies the extentions in clang which use DominatorTree, as
GraphDiff also filters nullptrs.

Reviewers: kuhar, dblaikie, NutshellySima

Subscribers: hiraditya, cfe-commits, llvm-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77341
2020-04-09 18:08:39 -07:00
Lang Hames 37bcf2df01 [ORC] Require JITDylib to be specified when adding IR and objects in the C API. 2020-04-09 17:59:26 -07:00
Craig Topper 5625e6ab37 [X86] Improve min/max reduction costs.
This is similar to what I recently did for getArithmeticReductionCost.

I'm trying to account for the narrowing from 512->256->128 as we go.

I've also added a new helper method getMinMaxCost that tries to
handle the cases where we have native min/max instructions and
fall back to cmp+select when we don't.

Differential Revision: https://reviews.llvm.org/D76634
2020-04-09 17:28:50 -07:00
Nemanja Ivanovic 5fe2809447 [PowerPC] Don't assert on SELECT_CC with i1 type
When we try to select a SELECT_CC on Power9, we check if it can be matched to a
SETB instruction. In that function, we assert that the output type is i32/i64.
This is unnecessary as it is perfectly reasonable to have an i1 SELECT_CC.
Change that from an assert to an early exit condition.
Fixes: https://bugs.llvm.org/show_bug.cgi?id=45448
2020-04-09 19:27:32 -05:00
Amara Emerson e99169f1c2 [AArch64][GlobalISel] CallLowering: Don't generate new copies each time we need
to store to a stack location for outgoing args.

During call arg lowering we shouldn't be modifying SP so cache the SP copy
vreg for subsequent uses.

Gives a 0.2% geomean code size improvement on CTMark.

Differential Revision: https://reviews.llvm.org/D77838
2020-04-09 17:08:56 -07:00
Francesco Petrogalli c846d2682b [llvm][Codegen] Make `getVectorTypeBreakdownMVT` work with scalable types.
Reviewers: efriedma, andwar, sdesmalen

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77434
2020-04-10 00:48:27 +01:00
Christopher Tetreault 9f87d951fc Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: mcrosier, efriedma, sdesmalen

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77269
2020-04-09 16:43:29 -07:00
Lang Hames 0d5f15f700 [ORC] Add C API support for adding object files to an LLJIT instance. 2020-04-09 16:18:46 -07:00
Lang Hames 1cd8493e69 [ORC] Expand the OrcV2 C API bindings.
Adds basic support for LLJITBuilder and DynamicLibrarySearchGenerator. This
allows C API clients to configure LLJIT to expose process symbols to JIT'd
code. An example of this is added in
llvm/examples/OrcV2CBindingsReflectProcessSymbols.
2020-04-09 16:18:46 -07:00
Daniel Sanders a79b2fc44b Add pass to strip debug info from MIR
Summary:
Removes:
* All LLVM-IR level debug info using StripDebugInfo()
* All debugify metadata
* 'Debug Info Version' module flag
* All (valid*) DEBUG_VALUE MachineInstrs
* All DebugLocs from MachineInstrs

This is a more complete solution than the previous MIRPrinter
option that just causes it to neglect to print debug-locations.

* The qualifier 'valid' is used here because AArch64 emits
  an invalid one and tests depend on it

Reviewers: vsk, aprantl, bogner

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77747
2020-04-09 15:44:38 -07:00
Mircea Trofin 655aa1ae4a [llvm][NFC] Replace CallSite with CallBase in Inliner
Summary:
*Almost* all uses are replaced. Left FIXMEs for the two sites that
require refactoring outside of Inliner, to scope this patch.

Subscribers: eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77817
2020-04-09 15:01:58 -07:00
Christopher Tetreault 19cc9b9ded Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: efriedma, sdesmalen, rriddle

Reviewed By: sdesmalen

Subscribers: hiraditya, dantrushin, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77261
2020-04-09 14:59:14 -07:00
Christopher Tetreault 00a1032412 Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: rriddle, sdesmalen, efriedma

Reviewed By: sdesmalen

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77260
2020-04-09 13:35:41 -07:00
James Y Knight 5e7b98fe75 Fix an unused-variable warning in Release mode. 2020-04-09 16:34:55 -04:00
Christopher Tetreault e634f482ea Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: arsenm, efriedma, sdesmalen

Reviewed By: arsenm

Subscribers: wdng, arsenm, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77268
2020-04-09 13:11:37 -07:00
Zequan Wu eccfa35d53 Fix lifetime call in landingpad blocking Simplifycfg pass
Fix lifetime call in landingpad blocks simplifycfg from removing the
landingpad.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D77188
2020-04-09 13:07:32 -07:00
Christopher Tetreault e1e131ea5e Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: grosbach, efriedma, sdesmalen

Reviewed By: efriedma

Subscribers: hiraditya, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77271
2020-04-09 12:52:44 -07:00
Christopher Tetreault b96558f5e5 Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: sunfish, sdesmalen, efriedma

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77273
2020-04-09 12:41:28 -07:00
Stefan Pintilie 64868cbfcf [PowerPC][Future] Fix for 75828ef615
Used unsigned long where uint64_t should have been used by mistake.
Fixed in this patch.
2020-04-09 19:33:12 +00:00
Simon Pilgrim 12c629ec6c [CostModel][X86] Add shuffle costs for some common sub-128bit vectors
v2i8/v4i8/v8i8 + v2i16/v4i16 all show up in vectorizer code and by just using the legalized types (v16i8/v8i16) we're highly exaggerating the actual cost of the shuffle.
2020-04-09 19:57:06 +01:00
Paolo Savini fae40bd5a1 [RISCV] Add MC layer support for proposed Bit Manipulation extension (version 0.92)
This adds the instruction encoding and mnenomics for the proposed
RISC-V Bit Manipulation extension (version 0.92). It is implemented with
each category of instruction as its own target feature, with the 'b'
extension feature enabling all options. Since this extension is not yet
ratified, all target features are prefixed with 'experimental-' to note
their status.

Differential Revision: https://reviews.llvm.org/D65649
2020-04-09 18:04:22 +01:00
jasonliu 085689d44c [PPC][AIX] Implement variadic function handling in LowerFormalArguments_AIX
Summary:
This patch adds support for handling of variadic functions for AIX.
This includes ensuring that use and consume correct type of
va_list (char *va_list) for AIX.

Authored by: ZarkoCA

Reviewers: cebowleratibm, sfertile, jasonliu

Reviewed by: jasonliu

Differential Revision: https://reviews.llvm.org/D76130
2020-04-09 16:49:44 +00:00
Stefan Pintilie 75828ef615 [PowerPC][Future] Initial support for PCRel addressing for constant pool loads
Add initial support for PC Relative addressing for constant pool loads.
This includes adding a new relocation for @pcrel and adding a new PowerPC flag
to identify PC relative addressing.

Differential Revision: https://reviews.llvm.org/D74486
2020-04-09 11:17:23 -05:00
Kazushi (Jam) Marukawa 015dee1ac8 [VE] Support (m)0 and (m)1 operands
Summary:
VE has special operands to represent 0b000...000111...111 (`(m)0`) and
0b111...111000...000 (`(m)1`) bit sequences.  This patch supports those
operands not only in machine instructions but also in DAG lowering.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D77769
2020-04-09 18:09:00 +02:00
Craig Topper 5a55363dc4 [X86] Remove redundant VMOVDDUPZ128rmk/VMOVDDUPZ128rmkz isel patterns.
These patterns are identical to the pattern for the instruction.
2020-04-09 09:06:58 -07:00
Gil Rapaport e2a1867880 [LV] Add VPValue operands to VPBlendRecipe (NFCI)
InnerLoopVectorizer's code called during VPlan execution still relies on
original IR's def-use relations to decide which vector code to generate,
limiting VPlan transformations ability to modify def-use relations and still
have ILV generate the vector code.
This commit introduces VPValues for VPBlendRecipe to use as the values to
blend. The recipe is generated with VPValues wrapping the phi's incoming values
of the scalar phi. This reduces ingredient def-use usage by ILV as a step
towards full VPlan-based def-use relations.

Differential Revision: https://reviews.llvm.org/D77539
2020-04-09 18:48:33 +03:00
Mircea Trofin b4924f01a4 [llvm][nfc] InstructionCostDetail encapsulation
Ensured initialized fields; encapsulad delta calulations and evaluation
of threshold having had changed; assertion for CostThresholdMap
dereference, to indicate design intent.

Differential Revision: https://reviews.llvm.org/D77762
2020-04-09 08:21:18 -07:00
Ayal Zaks 1678489234 [LV] FoldTail w/o Primary Induction
Introduce a new VPWidenCanonicalIVRecipe to generate a canonical vector
induction for use in fold-tail-with-masking, if a primary induction is absent.

The canonical scalar IV having start = 0 and step = VF*UF, created during code
-gen to control the vector loop, is widened into a canonical vector IV having
start = {<Part*VF, Part*VF+1, ..., Part*VF+VF-1> for 0 <= Part < UF} and
step = <VF*UF, VF*UF, ..., VF*UF>.

Differential Revision: https://reviews.llvm.org/D77635
2020-04-09 17:45:23 +03:00
Simon Cook 2df6a02fd7 [RISCV] Implement evaluateBranch
This implements the instruction analysis required to print branch
targets as part of llvm-objdump's disassembly.

Note, this only handles those branches which can be analyzed in a single
instruction, a future patch will handle multiple-instruction patterns,
such as AUIPC/LUI+JALR instruction pairs.

Differential Revision: https://reviews.llvm.org/D77567
2020-04-09 15:11:55 +01:00
Shengchen Kan 2477cec2ac [NFC][X86] Refine code in X86AsmBackend
Summary: Move code to a better place, rename function, etc

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77778
2020-04-09 21:31:52 +08:00
Sanjay Patel 812970edda [InstCombine] replace undef in vector constant for safe shift transform (PR45447)
As noted in PR45447, we have a vector-constant-with-undef-element transform bug:
https://bugs.llvm.org/show_bug.cgi?id=45447

We replace undefs with a safe constant (0 or -1) based on the (non-)negative
predicate constraint.

So this is correct:
http://volta.cs.utah.edu:8080/z/WZE36H
...but this is not:
http://volta.cs.utah.edu:8080/z/boj8gJ

Previously, we were relying on getSafeVectorConstantForBinop() in the related fold (D76800).
But that's making an assumption about what qualifies as "safe", and that assumption may
not always hold.

Differential Revision: https://reviews.llvm.org/D77739
2020-04-09 08:00:46 -04:00
Anton Bikineev 9e1ccec8d5 tsan: don't instrument __attribute__((naked)) functions
Naked functions are required to not have compiler generated
prologues/epilogues, hence no instrumentation is needed for them.

Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=45400

Differential Revision: https://reviews.llvm.org/D77477
2020-04-09 13:47:47 +02:00
Pavel Labath b761a6484d [DWARF] Detect extraction errors in DWARFFormValue::extractValue
Summary:
Although the function had a bool return value, it was always returning
true. Presumably this is because the main type of errors one can
encounter here is running off the end of the stream, and until very
recently, the DataExtractor class made it very difficult to detect that.

The situation has changed now, and we can easily detect errors here,
which this patch does.

Reviewers: dblaikie, aprantl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77308
2020-04-09 13:41:02 +02:00
Serguei Katkov 44f0d7f136 Revert "[Codegen/Statepoint] Allow usage of registers for non gc deopt values."
This reverts commit a0275705bb.

It causes buildbot failures building LLVM with BUILD_SHARED_LIBS due to a linker error.
2020-04-09 18:24:47 +07:00
Florian Hahn a7efe06af0 [LV] Assert no DbgInfoIntrinsic calls are passed to widening (NFC).
When building a VPlan, BasicBlock::instructionsWithoutDebug() is used to
iterate over the instructions in a block. This means that no recipes
should be created for debug info intrinsics already and we can turn the
early exit into an assertion.

Reviewers: Ayal, gilr, rengolin, aprantl

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D77636
2020-04-09 11:37:32 +01:00
Serguei Katkov a0275705bb [Codegen/Statepoint] Allow usage of registers for non gc deopt values.
The change introduces the usage of physical registers for non-gc deopt values.
This require runtime support to know how to take a value from register.
By default usage is off and can be switched on by option.

The change also introduces additional fix-up patch which forces the spilling
of caller saved registers (clobbered after the call) and re-writes statepoint
to use spill slots instead of caller saved registers.

Reviewers: reames, dantrushin
Reviewed By: reames, dantrushin
Subscribers: mgorny, hiraditya, mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D77371
2020-04-09 16:57:35 +07:00
Jay Foad bf730e1686 [CodeGen] Fix a simple FIXME. NFC. 2020-04-09 10:54:03 +01:00
Jay Foad 9c7bd94ce8 Fix typo in comment 2020-04-09 10:36:00 +01:00
Jay Foad 4970a1deca [AMDGPU] Remove outdated comment 2020-04-09 10:36:00 +01:00
Florian Hahn 9997ee23ed [VPlan] Add & use VPValue operands for VPWidenCallRecipe (NFC).
This patch adds VPValue versions for the arguments of the call to
VPWidenCallRecipe and uses them during code-generation.

Similar to D76373 this reduces ingredient def-use usage by ILV as
a step towards full VPlan-based def-use relations.

Reviewers: Ayal, gilr, rengolin

Reviewed By: gilr

Differential Revision: https://reviews.llvm.org/D77655
2020-04-09 10:23:26 +01:00
Jay Foad c63aed890e [KnownBits] Move AND, OR and XOR logic into KnownBits
Summary:
There are at least three clients for KnownBits calculations:
ValueTracking, SelectionDAG and GlobalISel. To reduce duplication the
common logic should be moved out of these clients and into KnownBits
itself.

This patch does this for AND, OR and XOR calculations by implementing
and using appropriate operator overloads KnownBits::operator& etc.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74060
2020-04-09 10:10:37 +01:00
Jay Foad 94cc9eccf6 [ValueTracking] Simplify KnownBits construction
Use the simpler BitWidth constructor instead of the copy constructor to
make it clear when we don't actually need to copy an existing KnownBits
value. Split out from D74539. NFC.
2020-04-09 09:27:22 +01:00
Serge Pavlov c7ff5b38f2 [FPEnv] Use single enum to represent rounding mode
Now compiler defines 5 sets of constants to represent rounding mode.
These are:

1. `llvm::APFloatBase::roundingMode`. It specifies all 5 rounding modes
defined by IEEE-754 and is used in `APFloat` implementation.

2. `clang::LangOptions::FPRoundingModeKind`. It specifies 4 of 5 IEEE-754
rounding modes and a special value for dynamic rounding mode. It is used
in clang frontend.

3. `llvm::fp::RoundingMode`. Defines the same values as
`clang::LangOptions::FPRoundingModeKind` but in different order. It is
used to specify rounding mode in in IR and functions that operate IR.

4. Rounding mode representation used by `FLT_ROUNDS` (C11, 5.2.4.2.2p7).
Besides constants for rounding mode it also uses a special value to
indicate error. It is convenient to use in intrinsic functions, as it
represents platform-independent representation for rounding mode. In this
role it is used in some pending patches.

5. Values like `FE_DOWNWARD` and other, which specify rounding mode in
library calls `fesetround` and `fegetround`. Often they represent bits
of some control register, so they are target-dependent. The same names
(not values) and a special name `FE_DYNAMIC` are used in
`#pragma STDC FENV_ROUND`.

The first 4 sets of constants are target independent and could have the
same numerical representation. It would simplify conversion between the
representations. Also now `clang::LangOptions::FPRoundingModeKind` and
`llvm::fp::RoundingMode` do not contain the value for IEEE-754 rounding
direction `roundTiesToAway`, although it is supported natively on
some targets.

This change defines all the rounding mode type via one `llvm::RoundingMode`,
which also contains rounding mode for IEEE rounding direction `roundTiesToAway`.

Differential Revision: https://reviews.llvm.org/D77379
2020-04-09 13:26:47 +07:00
Pratyai Mazumder e8d1c6529b [SanitizerCoverage] sancov/inline-bool-flag instrumentation.
Summary:
New SanitizerCoverage feature `inline-bool-flag` which inserts an
atomic store of `1` to a boolean (which is an 8bit integer in
practice) flag on every instrumented edge.

Implementation-wise it's very similar to `inline-8bit-counters`
features. So, much of wiring and test just follows the same pattern.

Reviewers: kcc, vitalybuka

Reviewed By: vitalybuka

Subscribers: llvm-commits, hiraditya, jfb, cfe-commits, #sanitizers

Tags: #clang, #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D77244
2020-04-08 22:43:52 -07:00
Vitaly Buka 8b1a6c0a57 [NFC][SanitizerCoverage] Simplify alignment calculation
This reverts commit e42f2a0cd8b8007c816d0e63f5000c444e29105e.
2020-04-08 22:43:52 -07:00
WangTianQing a3dc949000 [X86] Add TSXLDTRK instructions.
Summary: For more details about these instructions, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference

Reviewers: craig.topper, RKSimon, LuoYuanke

Reviewed By: craig.topper

Subscribers: mgorny, hiraditya, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77205
2020-04-09 13:17:29 +08:00
Johannes Doerfert cb0ecc5c33 [CallGraphUpdater] Remove dead constants before replacing a function
Dead constants might be left when a function is replaced, we can
gracefully handle this case and avoid complexity for the users who would
see an assertion otherwise.
2020-04-08 22:52:46 -05:00
Lang Hames 5877d6f5f4 [ORC] Make mangling convenience methods part of the public API of LLJIT.
This saves clients from having to manually construct a MangleAndInterner.
2020-04-08 20:20:13 -07:00
Matt Arsenault 0aa0d70067 MIR: Use Register 2020-04-08 22:07:26 -04:00
Sam Clegg 7baad0c53c [WebAssembly][MC] Use StringRef over std::string pointer
This is followup based on feedback on 5be42f36f5.
See: https://reviews.llvm.org/D77627.

Differential Revision: https://reviews.llvm.org/D77674
2020-04-08 18:28:08 -07:00
Craig Topper f3d3cec648 [InstCombine] Avoid a call to deprecated version of CreateCall.
Passing a Value * to CreateCall has to call getPointerElementType
to find the type of the pointer.

In this case we can rely on the fact that Intrinsic::getDeclaration
returns a Function * and use that version of CreateCall.
2020-04-08 17:41:16 -07:00
Johannes Doerfert 0985554b70 [Attributor][NFC] Split AbstractAttributes out of Attributor.cpp
Attributor.cpp became quite big and we need to start provide structure.
The Attributor code is now in Attributor.cpp and the classes derived
from AbstractAttribute are in AttributorAttributes.cpp. Minor changes
were required but no intended functional changes.

We also minimized includes as part of this.

Reviewed By: baziotis

Differential Revision: https://reviews.llvm.org/D76873
2020-04-08 19:02:14 -05:00
Christopher Tetreault fe69eb1196 Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: espindola, efriedma, sdesmalen

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77275
2020-04-08 16:29:36 -07:00
Christopher Tetreault 49fd24fe9e Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: hfinkel, efriedma, sdesmalen

Reviewed By: efriedma

Subscribers: wuzish, nemanjai, hiraditya, kbarton, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77266
2020-04-08 16:10:55 -07:00
Christopher Tetreault 155740cc33 Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: sdesmalen, rriddle, efriedma

Reviewed By: sdesmalen

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77263
2020-04-08 15:15:41 -07:00
Amara Emerson befc788cfa GlobalISel: Add a setInstrAndDebugLoc(MachineInstr&) convenience helper to MachineIRBuilder. NFC.
This saves doing two separate calls to set the Instr and DebugLoc from an existing MI.
2020-04-08 14:38:33 -07:00
Matt Arsenault e49e33b610 CodeGen: Use Register in MachineInstrBuilder 2020-04-08 17:03:53 -04:00
Kirill Naumov 8b67853a83 [CFGPrinter] Adding heat coloring to CFGPrinter
This patch introduces the heat coloring of the Control Flow Graph which is based
on the relative "hotness" of each BB. The patch is a part of sequence of three
patches, related to graphs Heat Coloring.

Reviewers: rcorcs, apilipenko, davidxl, sfertile, fedor.sergeev, eraman, bollu

Differential Revision: https://reviews.llvm.org/D77161
2020-04-08 19:59:51 +00:00
Matt Arsenault c42cc7fd24 CodeGen: Use Register in MachineSSAUpdater 2020-04-08 14:29:01 -04:00
Artem Belevich a9627b7ea7 [CUDA] Add partial support for recent CUDA versions.
Generate PTX using newer versions of PTX and allow using sm_80 with CUDA-11.
None of the new features of CUDA-10.2+ have been implemented yet, so using these
versions will still produce a warning.

Differential Revision: https://reviews.llvm.org/D77670
2020-04-08 11:19:44 -07:00
Vedant Kumar 48e65fc630 MachineFunction: Copy call site info when duplicating insts
Summary:
Preserve call site info for duplicated instructions. We copy over the
call site info in CloneMachineInstrBundle to avoid repeated calls to
copyCallSiteInfo in CloneMachineInstr.

(Alternatively, we could copy call site info higher up the stack, e.g.
into TargetInstrInfo::duplicate, or even into individual backend passes.
However, I don't see how that would be safer or more general than the
current approach.)

Reviewers: aprantl, djtodoro, dstenb

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77685
2020-04-08 11:06:14 -07:00
Matt Arsenault 586769cce2 DAG: Use Register 2020-04-08 13:44:31 -04:00
Sean Fertile d0b57b41f4 [PowerPC][AIX][NFC] Replace deprecated getByValAlign call.
Replace call to deprecated 'getByValAlign()' with
'getNonZeroByValAlign()'.
2020-04-08 13:27:39 -04:00
Matt Arsenault dcce3ef1d2 FastISel: Partially use Register
Doesn't try to convert the cases that depend on generated code.
2020-04-08 12:10:58 -04:00
Matt Arsenault 7a46e36d51 CodeGen: Use Register more in CallLowering
Some of these MCPhysReg uses should probably be MCRegister, but right
now this would require more invasive changes.
2020-04-08 12:10:58 -04:00
Matt Arsenault ca0ace7298 CodeGen: Use Register in MachineBasicBlock 2020-04-08 12:10:58 -04:00
Matt Arsenault 84aa58cbe2 CodeGen: Use Register in TargetLowering 2020-04-08 12:10:58 -04:00
Kirill Naumov 0125db9ab2 [TimePasses] Small fix in "-time-passes" flag that makes it more stable
Adds StringMap for TimingData.

Differential Revision: https://reviews.llvm.org/D76946
Reviewed By: fedor.sergeev
2020-04-08 15:59:45 +00:00
Sean Fertile 8abfd2c3bb [PowerPC][AIX] Enable passing byval formal arguments in multiple registers.
Any or all the argument registers can be used to pass a byval formal
argument, with the limitation that the argument must fit in the
available registers (ie: is not split between registers and stack).

Differential Revision: https://reviews.llvm.org/D76902
2020-04-08 11:16:33 -04:00
Florian Hahn bbbec71609 [DSE.MSSA] Only use callCapturesBefore for calls.
callCapturesBefore always returns ModRef , if UseInst isn't a call. As
we only call it if we already know Mod is set, this only destroys the
Must bit for non-calls.
2020-04-08 15:12:33 +01:00
Florian Hahn a6353fdf3b [DSE,MSSA] Hoist getMemoryAccess call (NFC). 2020-04-08 15:10:05 +01:00
Alexey Lapshin 0ed2170dc4 [DWARFLinker][dsymutil] followup for 88c2137b6d
That patch is a followup for "Move DwarfStreamer into DWARFLinker".
It fixes build with LLVM_LINK_LLVM_DYLIB.
2020-04-08 16:46:52 +03:00
Stefan Pintilie 6c4b40def7 [PowerPC][Future] Add Support For Functions That Do Not Use A TOC.
On PowerPC most functions require a valid TOC pointer.

This is the case because either the function itself needs to use this
pointer to access the TOC or because other functions that are called
from that function expect a valid TOC pointer in the register R2.
The main exception to this is leaf functions that do not access the TOC
since they are guaranteed not to need a valid TOC pointer.

This patch introduces a feature that will allow more functions to not
require a valid TOC pointer in R2.

Differential Revision: https://reviews.llvm.org/D73664
2020-04-08 08:07:35 -05:00
Sanjay Patel a1c05fe20f [InstCombine] exclude bitcast of ppc_fp128 in icmp signbit fold
Based on the post-commit comments for rG0f56bbc, there might
be a problem with this transform:

(bitcast (fpext/fptrunc X)) to iX) < 0 --> (bitcast X to iY) < 0

...and the ppc_fp128 data type, so conservatively bypass if we
are bitcasting a ppc_fp128.

We might be able to account for endian or other differences to
enable this for PowerPC again if that is useful.

Differential Revision: https://reviews.llvm.org/D77642
2020-04-08 08:56:19 -04:00
Simon Pilgrim 66c18c729d [X86][SSE] Combine PTEST(AND(X,Y),AND(X,Y)) -> PTEST(X,Y) and ANDN equivalents
Tests derived from PR42035 examples
2020-04-08 12:42:22 +01:00
Jeremy Morse c77887e4d1 [DebugInfo][NFC] Early-exit when analyzing for single-location variables
This is a performance patch that hoists two conditions in DwarfDebug's
validThroughout to avoid a linear-scan of all instructions in a block. We
now exit early if validThrougout will never return true for the variable
location.

The first added clause filters for the two circumstances where
validThroughout will return true. The second added clause should be
identical to the one that's deleted from after the linear-scan.

Differential Revision: https://reviews.llvm.org/D77639
2020-04-08 12:27:11 +01:00
Shengchen Kan 916044d819 [X86][MC] Support enhanced relaxation for branch align
Summary:
Since D75300 has been landed, I want to support enhanced relaxation when we need to align branches and allow prefix padding. "Enhanced Relaxtion" means we allow an instruction that could not be traditionally relaxed to be emitted into RelaxableFragment so that we increase its length by adding prefixes for optimization.

The motivation is straightforward, RelaxFragment is mostly for relative jumps and we can not increase the length of jumps when we need to align them, so if we need to achieve D75300's purpose (reducing the bytes of nops) when need to align jumps, we have to make more instructions "relaxable".

Reviewers: reames, MaskRay, craig.topper, LuoYuanke, jyknight

Reviewed By: reames

Subscribers: hiraditya, llvm-commits, annita.zhang

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76286
2020-04-08 19:08:19 +08:00
Mikael Holmen 893df2032d [IfConversion] Disallow TrueBB == FalseBB for valid diamonds
Summary:
This fixes PR45302.

Previously the case

     BB1
     / \
    |   |
   TBB FBB
    |   |
     \ /
     BB2

was treated as a valid diamond also when TBB and FBB was the same basic
block. This then lead to a failed assertion in IfConvertDiamond.

Since TBB == FBB is quite a degenerated case of a diamond, we now
don't treat it as a valid diamond anymore, and thus we will avoid the
trouble of making IfConvertDiamond handle it correctly.

Reviewers: efriedma, kparzysz

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D77651
2020-04-08 12:50:36 +02:00
Anna Welker 89e1248d7b [ARM][MVE] Optimise offset addresses of gathers/scatters
This patch adds an analysis of the offset addresses used by gathers
and scatters to the MVEGatherScatterLowering pass to find
multiplications and additions that are loop invariant and thus can
be moved into the loop preheader, avoiding to execute them each time.

Differential Revision: https://reviews.llvm.org/D76681
2020-04-08 11:46:57 +01:00
Max Kazantsev 7adb9e06fd [LoopLoadElim] Add test showing that LoopLoadElim doesn't work correctly with new PM 2020-04-08 17:32:03 +07:00
Dominik Montada 35950fea8d [GlobalISel] support narrow G_IMPLICIT_DEF for DstSize % NarrowSize != 0
Summary:
When narrowing G_IMPLICIT_DEF where the original size is not a multiple
of the narrow size, emit a smaller G_IMPLICIT_DEF and use G_ANYEXT.

To prevent a potential endless loop in the legalizer, the condition
to combine G_ANYEXT(G_IMPLICIT_DEF) is changed from isInstUnsupported
to !isInstLegal, since in this case the combine is only valid if
consequent legalization of the newly combined G_IMPLICIT_DEF does not
introduce G_ANYEXT due to narrowing.

Although this legalization for G_IMPLICIT_DEF would also be valid for
the general case, it actually caused a lot of code regressions when
tried due to superfluous COPYs and combines not getting hit anymore.

Reviewers: dsanders, aemerson, volkan, arsenm, aditya_nandakumar

Reviewed By: arsenm

Subscribers: jvesely, nhaehnle, kerbowa, wdng, rovka, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76598
2020-04-08 11:00:07 +02:00
Kazushi (Jam) Marukawa aa034867f1 [VE] Simplify definitions of uimm6 and simm7
Summary: To prepare continuous changes, simplify uimm6 and simm7 operands.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D77700
2020-04-08 09:53:42 +02:00
Igor Kudrin af11c556db [DebugInfo] Fix reading DWARFv5 type units in DWP.
In DWARFv5, type units are stored in .debug_info sections, along with
compilation units, and they are distinguished by the unit_type field
in the header, not by the name of the section. It is impossible to
associate the correct index section of a DWP file with the unit before
the unit's header is read. This patch fixes reading DWARFv5 type units
by parsing the header first and then applying the index entry according
to the actual unit type.

Differential Revision: https://reviews.llvm.org/D77552
2020-04-08 12:50:58 +07:00
Stanislav Mekhanoshin f96810ff34 [AMDGPU] Expand vector trunc stores from i16 to i8
Differential Revision: https://reviews.llvm.org/D77693
2020-04-07 21:47:45 -07:00
Johannes Doerfert a19eb1de72 [OpenMP] Add match_{all,any,none} declare variant selector extensions.
By default, all traits in the OpenMP context selector have to match for
it to be acceptable. Though, we sometimes want a single property out of
multiple to match (=any) or no match at all (=none). We offer these
choices as extensions via
  `implementation={extension(match_{all,any,none})}`
to the user. The choice will affect the entire context selector not only
the traits following the match property.

The first user will be D75788. There we can replace
```
  #pragma omp begin declare variant match(device={arch(nvptx64)})
  #define __CUDA__

  #include <__clang_cuda_cmath.h>

  // TODO: Hack until we support an extension to the match clause that allows "or".
  #undef __CLANG_CUDA_CMATH_H__

  #undef __CUDA__
  #pragma omp end declare variant

  #pragma omp begin declare variant match(device={arch(nvptx)})
  #define __CUDA__

  #include <__clang_cuda_cmath.h>

  #undef __CUDA__
  #pragma omp end declare variant
```
with the much simpler
```
  #pragma omp begin declare variant match(device={arch(nvptx, nvptx64)}, implementation={extension(match_any)})
  #define __CUDA__

  #include <__clang_cuda_cmath.h>

  #undef __CUDA__
  #pragma omp end declare variant
```

Reviewed By: mikerice

Differential Revision: https://reviews.llvm.org/D77414
2020-04-07 23:33:24 -05:00
Kazu Hirata 91eb442fde [JumpThreading] NFC: Simplify ComputeValueKnownInPredecessorsImpl
Summary:
ComputeValueKnownInPredecessorsImpl is the main folding mechanism in
JumpThreading.cpp.  To avoid potential infinite recursion while
chasing use-def chains, it uses:

  DenseSet<std::pair<Value *, BasicBlock *>> &RecursionSet

to keep track of Value-BB pairs that we've processed.

Now, when ComputeValueKnownInPredecessorsImpl recursively calls
itself, it always passes BB as is, so the second element is always BB.

This patch simplifes the function by dropping "BasicBlock *" from
RecursionSet.

Reviewers: wmi, efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77699
2020-04-07 18:37:36 -07:00
Eli Friedman 565b56a72c [NFC] Clean up uses of LoadInst constructor. 2020-04-07 16:28:53 -07:00
Daniel Sanders 1adeeabb79 Add MIR-level debugify with only locations support for now
Summary:
Re-used the IR-level debugify for the most part. The MIR-level code then
adds locations to the MachineInstrs afterwards based on the LLVM-IR debug
info.

It's worth mentioning that the resulting locations make little sense as
the range of line numbers used in a Function at the MIR level exceeds that
of the equivelent IR level function. As such, MachineInstrs can appear to
originate from outside the subprogram scope (and from other subprogram
scopes). However, it doesn't seem worth worrying about as the source is
imaginary anyway.

There's a few high level goals this pass works towards:
* We should be able to debugify our .ll/.mir in the lit tests without
  changing the checks and still pass them. I.e. Debug info should not change
  codegen. Combining this with a strip-debug pass should enable this. The
  main issue I ran into without the strip-debug pass was instructions with MMO's and
  checks on both the instruction and the MMO as the debug-location is
  between them. I currently have a simple hack in the MIRPrinter to
  resolve that but the more general solution is a proper strip-debug pass.
* We should be able to test that GlobalISel does not lose debug info. I
  recently found that the legalizer can be unexpectedly lossy in seemingly
  simple cases (e.g. expanding one instr into many). I have a verifier
  (will be posted separately) that can be integrated with passes that use
  the observer interface and will catch location loss (it does not verify
  correctness, just that there's zero lossage). It is a little conservative
  as the line-0 locations that arise from conflicts do not track the
  conflicting locations but it can still catch a fair bit.

Depends on D77439, D77438

Reviewers: aprantl, bogner, vsk

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77446
2020-04-07 16:25:13 -07:00
Fangrui Song 624654fd64 [VE] Migrate to the getMachineMemOperand overload using llvm::Align
Just delete the deprecated overload because nothing uses it.
2020-04-07 16:04:54 -07:00
Matt Arsenault 6011627f51 CodeGen: More conversions to use Register 2020-04-07 18:54:36 -04:00
Fangrui Song d2ef8c1f2c [ThinLTO] Drop dso_local if a GlobalVariable satisfies isDeclarationForLinker()
dso_local leads to direct access even if the definition is not within this compilation unit (it is
still in the same linkage unit). On ELF, such a relocation (e.g. R_X86_64_PC32) referencing a
STB_GLOBAL STV_DEFAULT object can cause a linker error in a -shared link.

If the linkage is changed to available_externally, the dso_local flag should be dropped, so that no
direct access will be generated.

The current behavior is benign, because -fpic does not assume dso_local
(clang/lib/CodeGen/CodeGenModule.cpp:shouldAssumeDSOLocal).
If we do that for -fno-semantic-interposition (D73865), there will be an
R_X86_64_PC32 linker error without this patch.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D74751
2020-04-07 15:46:01 -07:00
Fangrui Song 2f8fb4d1cd [VE] Adapt aa26dd9858 and 2481f26ac3 2020-04-07 15:45:19 -07:00
Wei Mi b49eac71ad Recommit [SampleFDO] Add flag for partial profile.
Fix the error of show-prof-info.test on some platforms without zlib.

The common profile usage is to collect profile from a target and then use the profile to guide the optimized build for the same target. There are some cases that no profile can be collected for a target. In those cases, although no full profile is available, it is possible to have some partial profile collected from other targets to optimize common libraries and utilities. A flag is needed to tell the partial profile from the full profile apart, so compiler can use different strategy for them.

Differential Revision: https://reviews.llvm.org/D77426
2020-04-07 14:28:25 -07:00
Stanislav Mekhanoshin 96e51ed005 [AMDGPU] Implement copyPhysReg for 16 bit subregs
Differential Revision: https://reviews.llvm.org/D74937
2020-04-07 14:22:46 -07:00
Matt Arsenault 2481f26ac3 CodeGen: Use Register in TargetFrameLowering 2020-04-07 17:07:44 -04:00
Nikita Popov fe8abbf442 [BPI] Clear handles when releasing memory (NFC)
This reduces max-rss of sqlite compilation by 2.5%.
2020-04-07 22:51:01 +02:00
Matt Arsenault aa26dd9858 CodeGen: Use Register in more places 2020-04-07 15:59:40 -04:00
Wei Mi c5da949ae8 Revert "[SampleFDO] Add flag for partial profile." show-prof-info.test breaks on some platforms.
This reverts commit e3ba652a14.
2020-04-07 12:54:51 -07:00
Wei Mi e3ba652a14 [SampleFDO] Add flag for partial profile.
The common profile usage is to collect profile from a target and then use the profile to guide the optimized build for the same target. There are some cases that no profile can be collected for a target. In those cases, although no full profile is available, it is possible to have some partial profile collected from other targets to optimize common libraries and utilities. A flag is needed to tell the partial profile from the full profile apart, so compiler can use different strategy for them.

Differential Revision: https://reviews.llvm.org/D77426
2020-04-07 12:17:56 -07:00
Nemanja Ivanovic ecd8435483 [NFC][PowerPC] Fix register class for patterns using XXPERMDIs
There are a few patterns where we use a superclass for inputs to this
instruction rather than the correct class. This can sometimes lead to
unncessary copies.
2020-04-07 14:06:08 -05:00
Graham Sellers a19a56f6a1 [AMDGPU] Extend constant folding for logical operations
This patch extends existing constant folding in logical operations to
handle S_XNOR, S_NAND, S_NOR, S_ANDN2, S_ORN2, V_LSHL_ADD_U32 and
V_AND_OR_B32. Also added a couple of tests for existing folds.
2020-04-07 14:37:16 -04:00
Craig Topper c41685b16f [SelectionDAG] Make getZeroExtendInReg take a vector VT if the operand VT is a vector.
This removes a call to getScalarType from a bunch of call sites.
It also makes the behavior consistent with SIGN_EXTEND_INREG.

Differential Revision: https://reviews.llvm.org/D77631
2020-04-07 11:34:08 -07:00
Alexey Lapshin 88c2137b6d [DWARFLinker][dsymutil][NFC] Move DwarfStreamer into DWARFLinker.
For implementing "remove obsolete debug info in lld", it is neccesary
to have DWARF generation code implementation. dsymutil uses DwarfStreamer
for that purpose. DwarfStreamer uses AsmPrinter. It is considered OK
to use AsmPrinter based code in lld(D74169). This patch moves
DwarfStreamer implementation into DWARFLinker, so that it could be reused
from lld.

Generally, a better place for such a common DWARF generation code would be
not DWARFLinker but an additional separate library. Such a library could
contain a single version of DWARF generation routines and could also
be independent of AsmPrinter. At the current moment, DwarfStreamer
does not pretend to be such a general implementation of DWARF generation.
So I decided to put it into DWARFLinker since it is the only user
of DwarfStreamer.

Testing: it passes "check-all" lit testing. MD5 checksum for clang .dSYM
bundle matches for the dsymutil with/without that patch.

Reviewed By: JDevlieghere

Differential revision: https://reviews.llvm.org/D77169
2020-04-07 21:21:54 +03:00
Eli Friedman e9ac757f79 [AArch64] Don't expand memcmp in strict align mode.
7aecf232 fixed the bug where we would miscompile, but we still generate
a crazy amount of code. Turn off the expansion until someone implements
an appropriate heuristic.

Differential Revision: https://reviews.llvm.org/D77599
2020-04-07 10:53:36 -07:00
Matt Arsenault f596ab4066 AMDGPU: Use early return 2020-04-07 13:48:00 -04:00
Sam Clegg 5be42f36f5 [WebAssembly][MC] Fix leak of std::string members in MCSymbolWasm
Summary: Fixes: https://bugs.llvm.org/show_bug.cgi?id=45452

Subscribers: dschuff, jgravelle-google, hiraditya, aheejin, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77627
2020-04-07 10:38:43 -07:00
Stanislav Mekhanoshin 12a324393d [AMDGPU] Limit endcf-collapase to simple if
We can only collapse adjacent SI_END_CF if outer statement
belongs to a simple SI_IF, otherwise correct mask is not in the
register we expect, but is an argument of an S_XOR instruction.

Even if SI_IF is simple it might be lowered using S_XOR because
lowering is dependent on a basic block layout. It is not
considered simple if instruction consuming its output is
not an SI_END_CF. Since that SI_END_CF might have already been
lowered to an S_OR isSimpleIf() check may return false.

This situation is an opportunity for a further optimization
of SI_IF lowering, but that is a separate optimization. In the
meanwhile move SI_END_CF post the lowering when we already know
how the rest of the CFG was lowered since a non-simple SI_IF
case still needs to be handled.

Differential Revision: https://reviews.llvm.org/D77610
2020-04-07 10:27:23 -07:00
Matt Arsenault b281138a1b DAG: Use the correct getPointerTy in a few places
These should not be assuming address space 0. Calling getPointerTy is
generally the wrong thing to do, since you should already know the
type from the incoming IR.
2020-04-07 12:45:41 -04:00
Nikita Popov 259649a519 [RDA] Avoid full reprocessing of blocks in loops (NFCI)
RDA sometimes needs to visit blocks twice, to take into account
reaching defs coming in along loop back edges. Currently it handles
repeated visitation the same way as usual, which means that it will
scan through all instructions and their reg unit defs again. Not
only is this very inefficient, it also means that all reaching defs
in loops are going to be inserted twice.

We can do much better than this. The only thing we need to handle
is a new reaching def from a predecessor, which either needs to be
prepended to the reaching definitions (if there was no reaching def
from a predecessor), or needs to replace an existing predecessor
reaching def, if it is more recent. Since D77508 we only store the
most recent predecessor reaching def, so that's the only one that
may need updating.

This also has the nice side-effect that reaching definitions are
now automatically sorted and unique, so drop the llvm::sort() call
in favor of an assertion.

Differential Revision: https://reviews.llvm.org/D77511
2020-04-07 17:55:37 +02:00
Nikita Popov 76e987b372 [RDA] Don't pass down TraversedMBB (NFC)
Only pass the MachineBasicBlock itself down to helper methods,
they don't need to know about traversal. Move the debug print
into the main method.
2020-04-07 17:53:04 +02:00
Nikita Popov 361c29d7ba [RDA] Avoid inserting duplicate reaching defs (NFCI)
An instruction may define the same reg unit multiple times,
avoid inserting the same reaching def multiple times in that case.

Also print the reg unit, rather than the super-register, in the
debug code.
2020-04-07 17:50:38 +02:00
David Tenty b9245f14b7 [NFC][PowerPC] Cleanup 64-bit and Darwin CalleeSavedRegs
Summary:
- Remove the no longer used Darwin CalleeSavedRegs
- Combine the SVR464 callee saved regs and AIX64 since the two are (and should be) identical into PPC64
- Update tests for 64-bit CSR change

Reviewers: sfertile, ZarkoCA, cebowleratibm, jasonliu, #powerpc

Reviewed By: sfertile

Subscribers: wuzish, nemanjai, hiraditya, kbarton, shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77235
2020-04-07 11:49:10 -04:00
Simon Pilgrim e3b6059776 [X86][SSE] combineX86ShufflesConstants - early out for zeroable vectors (PR45443)
Shuffle combining can insert zero byte sized elements into the shuffle mask, which combineX86ShufflesConstants will attempt to fold without taking into account whether the byte-sized type is legal (e.g. AVX512F only targets).

If we have a full-zeroable vector then we should just return a zero version of the root type, otherwise if the type isn't valid we should bail.

Fixes PR45443
2020-04-07 14:45:29 +01:00
Keith Walker 01dc10774e [ARM] unwinding .pad instructions missing in execute-only prologue
If the stack pointer is altered for local variables and we are generating
Thumb2 execute-only code the .pad directive is missing.

Usually the size of the adjustment is stored in a PC-relative location
and loaded into a register which is then added to the stack pointer.
However when we are generating execute-only code code the size of the
adjustment is instead generated using the MOVW/MOVT instruction pair.

As a by product of handling the execute-only case this also fixes an
existing issue that in the none execute-only case the .pad directive was
generated against the load of the constant to a register instruction,
instead of the instruction which adds the register to the stack pointer.

Differential Revision: https://reviews.llvm.org/D76849
2020-04-07 11:51:59 +01:00
Florian Hahn 6aabb109be [SCCP] Use ranges for predicate info conditions.
This patch updates the code that deals with conditions from predicate
info to make use of constant ranges.

For ssa_copy instructions inserted by PredicateInfo, we have 2 ranges:
1. The range of the original value.
2. The range imposed by the linked condition.

1. is known, 2. can be determined using makeAllowedICmpRegion. The
intersection of those ranges is the range for the copy.

With this patch, we get a nice increase in the number of instructions
eliminated by both SCCP and IPSCCP for some benchmarks:

For MultiSource, SPEC2000 & SPEC2006:

Tests: 237
Same hash: 170 (filtered out)
Remaining: 67
Metric: sccp.NumInstRemoved
Program                                        base    patch   diff
 test-suite...Source/Benchmarks/sim/sim.test    10.00   71.00  610.0%
 test-suite...CFP2000/177.mesa/177.mesa.test   361.00  1626.00 350.4%
 test-suite...encode/alacconvert-encode.test   141.00  602.00  327.0%
 test-suite...decode/alacconvert-decode.test   141.00  602.00  327.0%
 test-suite...CI_Purple/SMG2000/smg2000.test   1639.00 4093.00 149.7%
 test-suite...peg2/mpeg2dec/mpeg2decode.test    75.00  163.00  117.3%
 test-suite...T2006/401.bzip2/401.bzip2.test   358.00  513.00  43.3%
 test-suite...rks/FreeBench/pifft/pifft.test    11.00   15.00  36.4%
 test-suite...langs-C/unix-tbl/unix-tbl.test     4.00    5.00  25.0%
 test-suite...lications/sqlite3/sqlite3.test   541.00  667.00  23.3%
 test-suite.../CINT2000/254.gap/254.gap.test   243.00  299.00  23.0%
 test-suite...ks/Prolangs-C/agrep/agrep.test    25.00   29.00  16.0%
 test-suite...marks/7zip/7zip-benchmark.test   1135.00 1304.00 14.9%
 test-suite...lications/ClamAV/clamscan.test   1105.00 1268.00 14.8%
 test-suite...urce/Applications/lua/lua.test   398.00  436.00   9.5%

Metric: sccp.IPNumInstRemoved
Program                                        base   patch   diff
 test-suite...C/CFP2000/179.art/179.art.test     1.00   3.00  200.0%
 test-suite...006/447.dealII/447.dealII.test   429.00 1056.00 146.2%
 test-suite...nch/fourinarow/fourinarow.test     3.00   7.00  133.3%
 test-suite...CI_Purple/SMG2000/smg2000.test   818.00 1748.00 113.7%
 test-suite...ks/McCat/04-bisect/bisect.test     3.00   5.00  66.7%
 test-suite...CFP2000/177.mesa/177.mesa.test   165.00 255.00  54.5%
 test-suite...ediabench/gsm/toast/toast.test    18.00  27.00  50.0%
 test-suite...telecomm-gsm/telecomm-gsm.test    18.00  27.00  50.0%
 test-suite...ks/Prolangs-C/agrep/agrep.test    24.00  35.00  45.8%
 test-suite...TimberWolfMC/timberwolfmc.test    43.00  62.00  44.2%
 test-suite...encode/alacconvert-encode.test    46.00  66.00  43.5%
 test-suite...decode/alacconvert-decode.test    46.00  66.00  43.5%
 test-suite...langs-C/unix-tbl/unix-tbl.test    12.00  17.00  41.7%
 test-suite...peg2/mpeg2dec/mpeg2decode.test    31.00  41.00  32.3%
 test-suite.../CINT2000/254.gap/254.gap.test   117.00 154.00  31.6%

Reviewers: efriedma, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D76611
2020-04-07 11:09:18 +01:00
Serguei Katkov b7e3759e17 [DAG] Consolidate require spill slot logic in lambda. NFC.
Move the logic whether lowering of deopt value requires a spill slot in
a separate lambda.

Reviewers: reames, dantrushin
Reviewed By: dantrushin
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D77629
2020-04-07 16:43:47 +07:00
Peter Smith 14c1e98754 [ARM] Remove condition that could never be true
From Arm v8 Architecture Reference Manual F5.1.84 LDREXD
The ldrexd instruction in Arm state has the following conditions:

t = UInt(Rt); t2 = t + 1; n = UInt(Rn);
if Rt<0> == '1' || t2 == 15 || n == 15 then UNPREDICTABLE;

In when Rt is odd or if Rt is 14 (making t2 15).

In the implementation when the pair is the UNPREDICTABLE R14_R15 we
would ideally return SOFT_FAIL. We can't because there is no R14_R15
value for us to return so we fail early returning FAIL.

The early return for registers outside the bounds of the table means
the check for Rt == 14 (0xE) redundant which causes a static analyzer
to flag the condition as never being true.

To fix the warning I've removed the check and replaced with a comment
explaining the difference with the specification.

Fixes pr41660

Differential Revision: https://reviews.llvm.org/D77463
2020-04-07 09:50:56 +01:00
Simon Tatham aab9e9de4d [Support,Windows] Tolerate failure of CryptGenRandom
Summary:
In `Unix/Process.inc`, we seed a random number generator from
`/dev/urandom` if possible, but if not, we're happy to fall back to
ordinary pseudorandom strategies, like the current time and PID.

The corresponding function on Windows calls `CryptGenRandom`, but it
//doesn't// have a fallback if that strategy fails. But `CryptGenRandom`
//can// fail, if a cryptography provider isn't properly initialized, or
occasionally (by our observation) simply intermittently.

If it's reasonable on Unix to implement traditional pseudorandom-number
seeding as a fallback, then it's surely reasonable to do the same on
Windows. So this patch adds a last-ditch use of ordinary rand(), using
much the same strategy as the Unix fallback code.

Reviewers: hans, sammccall

Reviewed By: hans

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77553
2020-04-07 09:18:12 +01:00
Pierre-vh 4fc59a468f Revert "[CodeGen][SelectionDAG] Flip Booleans More Often"
This reverts commit 23342bdcc8.
2020-04-07 09:09:10 +01:00
Pierre-vh 23342bdcc8 [CodeGen][SelectionDAG] Flip Booleans More Often
Differential Revision: https://reviews.llvm.org/D77201
2020-04-07 08:19:57 +01:00
Sam Clegg f0bbf3d086 [WebAssembly] EmscriptenEHSjLj: Mark more functions as imported
These should have been part of https://reviews.llvm.org/D77192

Differential Revision: https://reviews.llvm.org/D77358
2020-04-06 21:27:31 -07:00
Xiang1 Zhang 01a32f2bd3 Enable IBT(Indirect Branch Tracking) in JIT with CET(Control-flow Enforcement Technology)
Do not commit the llvm/test/ExecutionEngine/MCJIT/cet-code-model-lager.ll because it will
cause build bot fail(not suitable for window 32 target).

Summary:
This patch comes from H.J.'s 2bd54ce7fa

**This patch fix the failed llvm unit tests which running on CET machine. **(e.g. ExecutionEngine/MCJIT/MCJITTests)

The reason we enable IBT at "JIT compiled with CET" is mainly that:  the JIT don't know the its caller program is CET enable or not.
If JIT's caller program is non-CET, it is no problem JIT generate CET code or not.
But if JIT's caller program is CET enabled,  JIT must generate CET code or it will cause Control protection exceptions.

I have test the patch at llvm-unit-test and llvm-test-suite at CET machine. It passed.
and H.J. also test it at building and running VNCserver(Virtual Network Console), it works too.
(if not apply this patch, VNCserver will crash at CET machine.)

Reviewers: hjl.tools, craig.topper, LuoYuanke, annita.zhang, pengfei

Reviewed By: LuoYuanke

Subscribers: tstellar, efriedma, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76900
2020-04-07 09:48:47 +08:00
Jun Ma 46bff786bc [Coroutines] Remove alignment check in shouldBeMustTail
Differential Revision: https://reviews.llvm.org/D77362
2020-04-07 09:07:34 +08:00
Eli Friedman 3f13ee8a00 [NFC] Modernize misc. uses of Align/MaybeAlign APIs.
Use the current getAlign() APIs where it makes sense, and use Align
instead of MaybeAlign when we know the value is non-zero.
2020-04-06 17:53:04 -07:00
Eli Friedman 68b03aee1a Remove SequentialType from the type heirarchy.
Now that we have scalable vectors, there's a distinction that isn't
getting captured in the original SequentialType: some vectors don't have
a known element count, so counting the number of elements doesn't make
sense.

In some cases, there's a better way to express the commonality using
other methods. If we're dealing with GEPs, there's GEP methods; if we're
dealing with a ConstantDataSequential, we can query its element type
directly.

In the relatively few remaining cases, I just decided to write out
the type checks. We're talking about relatively few places, and I think
the abstraction doesn't really carry its weight. (See thread "[RFC]
Refactor class hierarchy of VectorType in the IR" on llvmdev.)

Differential Revision: https://reviews.llvm.org/D75661
2020-04-06 17:03:49 -07:00
Davide Italiano 8115e08b05 [MachineCSE] Don't carry the wrong location when hoisting
PR: 45425
<rdar://problem/61359768>

Differential Revision:  https://reviews.llvm.org/D77604
2020-04-06 16:36:22 -07:00
Daniel Sanders f27cea721e Add way to omit debug-location from MIR output
Summary:
In lieu of a proper pass that strips debug info, add a way
to omit debug-locations from the MIR output so that
instructions with MMO's continue to match CHECK's when
mir-debugify is used

Reviewers: aprantl, bogner, vsk

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77575
2020-04-06 16:22:01 -07:00
Nick Desaulniers 41ba80182c [CallSite Removal] a CallBase is never an IndirectCall for isInlineAsm
Summary:
Thanks to Bill Wendling (void) for the report and steps to reproduce.  It looks
like this was missed during r350508's cleanup of the CallSite split into
CallBase, CallInst, and CallBrInst.

This was exposed by running pgo on a callbr, which was creating a ptrtoint to
the inline asm thinking it was an indirect call. The relevant callchain looks
like:

    IndirectCallPromotionPlugin::run()
    -> PGOIndirectCallVisitor::findIndirectCalls()
      -> PGOIndirectCallVisitor::visitCallBase()
        -> CallBase::isIndirectCall()

Reviewers: void, chandlerc

Reviewed By: void

Subscribers: hiraditya, llvm-commits, craig.topper, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77600
2020-04-06 16:14:46 -07:00
Vedant Kumar 5f185a8999 [AddressSanitizer] Fix for wrong argument values appearing in backtraces
Summary:
In some cases, ASan may insert instrumentation before function arguments
have been stored into their allocas. This causes two issues:

1) The argument value must be spilled until it can be stored into the
   reserved alloca, wasting a stack slot.

2) Until the store occurs in a later basic block, the debug location
   will point to the wrong frame offset, and backtraces will show an
   uninitialized value.

The proposed solution is to move instructions which initialize allocas
for arguments up into the entry block, before the position where ASan
starts inserting its instrumentation.

For the motivating test case, before the patch we see:

```
 | 0033: movq %rdi, 0x68(%rbx)  |   | DW_TAG_formal_parameter     |
 | ...                          |   |   DW_AT_name ("a")          |
 | 00d1: movq 0x68(%rbx), %rsi  |   |   DW_AT_location (RBX+0x90) |
 | 00d5: movq %rsi, 0x90(%rbx)  |   |       ^ not correct ...     |
```

and after the patch we see:

```
 | 002f: movq %rdi, 0x70(%rbx)  |   | DW_TAG_formal_parameter     |
 |                              |   |   DW_AT_name ("a")          |
 |                              |   |   DW_AT_location (RBX+0x70) |
```

rdar://61122691

Reviewers: aprantl, eugenis

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77182
2020-04-06 15:59:25 -07:00
Daniel Sanders 35b7b0851b Allow MachineFunction to obtain non-const Function (to enable MIR-level debugify)
Summary:
To debugify MIR, we need to be able to create metadata and to do that, we
need a non-const Module. However, MachineFunction only had a const reference
to the Function preventing this.

Reviewers: aprantl, bogner

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77439
2020-04-06 15:19:21 -07:00
Daniel Sanders 15f7bc7857 Add option to limit Debugify to locations (omitting variables)
Summary:
It can be helpful to test behaviour w.r.t locations without having DEBUG_VALUE
around. In particular, because DEBUG_VALUE has the potential to change CodeGen
behaviour (e.g. hasOneUse() vs hasOneNonDbgUse()) while locations generally
don't.

Reviewers: aprantl, bogner

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77438
2020-04-06 15:04:55 -07:00
David Blaikie 5aead592f0 X86ISelLowering: Minor refactor to avoid redundant initialization while ensuring compiler warnings can hopefully still prove initialization
Based on post-commit review/discussion in fabe52a7412b
2020-04-06 14:25:52 -07:00
Konstantin Pyzhov 72e8754916 [AMDGPU] Disable 'Skip Uniform Regions' optimization by default for AMDGPU.
Reviewers: sameerds, dstuttard

Differential Revision: https://reviews.llvm.org/D77228
2020-04-06 09:05:58 -04:00
Leonard Chan a0222ac1f9 [AsmPrinter] Do not define local aliases for global objects in a comdat
A global symbol that is defined in a comdat should not generate an alias since
call sites that would've referred to that symbol will refer to their own
independent local aliases rather than the surviving global comdat one. This
could result in something that looks like:

```
ld.lld: error: relocation refers to a discarded section: .text._ZN3fbl8internal18NullFunctionTargetIvJjjPjEED1Ev.stub
>>> defined in user-x64-clang/obj/system/ulib/minfs/libminfs.a(minfs._sources.file.cc.o)
>>> section group signature: _ZN3fbl8internal18NullFunctionTargetIvJjjPjEED1Ev.stub
>>> prevailing definition is in user-x64-clang/obj/system/ulib/minfs/libminfs.a(minfs._sources.vnode.cc.o)
>>> referenced by function.h:169 (../../zircon/system/ulib/fbl/include/fbl/function.h:169)
>>>               minfs._sources.file.cc.o:(minfs::File::AllocateAndCommitData(std::__2::unique_ptr<minfs::Transaction, std::__2::default_delete<minfs::Transaction> >)) in archive user-x64-clang/obj/system/ulib/minfs/libminfs.a
```

We ran into this when experimenting with a new C++ ABI for fuchsia
(refer to D72959) which takes relative offsets between comdat'd functions
which is why the normal C++ user wouldn't run into this.

Differential Revision: https://reviews.llvm.org/D77429
2020-04-06 13:48:05 -07:00
Nick Desaulniers 5bc291be71 [SelectionDAG] fix predecessor list for INLINEASM_BRs' parent
Summary:
A bug report mentioned that LLVM was producing jumps off the end of a
function when using "asm goto with outputs". Further digging pointed to
MachineBasicBlocks that had their address taken and were indirect
targets of INLINEASM_BR being removed by BranchFolder, because their
 predecessor list was empty, so they appeared to have no entry.

This was a cascading failure caused earlier, during Pre-RA instruction
scheduling. We have a few special cases in Pre-RA instruction scheduling
where we split a MachineBasicBlock in two.  This requires careful
handing of predecessor and successor lists for a MachineBasicBlock that
was split, and careful handing of PHI MachineInstrs that referred to the
MachineBasicBlock before it was split.

The clue that led to this fix was the observation that many callers of
MachineBasicBlock::splice() frequently call
MachineBasicBlock::transferSuccessorsAndUpdatePHIs() to update their PHI
nodes after a splice. We don't want to reuse that method, as we have
custom successor transferring logic for this block split.

This patch fixes 2 pre-existing bugs, and adds tests.

The first bug was that MachineBasicBlock::splice() correctly handles
updating most successors and predecessors; we don't need to do anything
more than removing the previous fallthrough block from the first half of
the split block post splice. Previously, we were updating the successor
list incorrectly (updating successors updates predecessors).

The second bug was that PHI nodes that needed registers from the first
half of the split block were not having entries populated.  The register
live out information was correct, and the FuncInfo->PHINodesToUpdate was
correct. Specifically, the check in SelectionDAGISel::FinishBasicBlock:

    for (unsigned i = 0, e = FuncInfo->PHINodesToUpdate.size(); i != e; ++i) {
      MachineInstrBuilder PHI(*MF, FuncInfo->PHINodesToUpdate[i].first);
      if (!FuncInfo->MBB->isSuccessor(PHI->getParent()))
        continue;
      PHI.addReg(FuncInfo->PHINodesToUpdate[i].second).addMBB(FuncInfo->MBB);

was `continue`ing because FuncInfo->MBB tracks the second half of
the post-split block; no one was updating PHI entries for the first half
of the post-split block.

SelectionDAGBuilder::UpdateSplitBlock() already expects to perform
special handling for MachineBasicBlocks that were split post calls to
ScheduleDAGSDNodes::EmitSchedule(), so I'm confident that it's both
correct for ScheduleDAGSDNodes::EmitSchedule() to return the second half
of the split block `CopyBB` which updates `FuncInfo->MBB` (ie. the
current MachineBasicBlock being processed), and perform special handling
for this in SelectionDAGBuilder::UpdateSplitBlock().

Reviewers: void, craig.topper, efriedma

Reviewed By: void, efriedma

Subscribers: hfinkel, fhahn, MatzeB, efriedma, hiraditya, llvm-commits, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76961
2020-04-06 13:46:39 -07:00
Matt Arsenault 869f05c834 AMDGPU: Remove dead paths for requiresUniformRegister
The extracts from control flow intrinsics are already properly handled
by divergence analysis. The inline asm case isn't dead, but has also
never really worked correctly so leave it as-is for now.
2020-04-06 16:15:10 -04:00
Francesco Petrogalli 53b7abdd23 [llvm][CodeGen] Avoid implicit cast of TypeSize to integer in `initActions`.
Reviewers: sdesmalen, efriedma

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77317
2020-04-06 19:46:11 +01:00
Masoud Ataei jaliseh 9ed0612cca Add InjectTLIMappings pass to new pass manager
This pass is created in d6de5f12d4 and tested
for new and legacy pass manager but never added to new pass manager pipeline.
I am adding it to new pass manager pipeline.

This pass is get used in Vector Function Database (VFDatabase) and without
this pass in new pass manager pipeline, none of the vector libraries are work
ing with new pass manager.

Related passes:
66c120f025
https://reviews.llvm.org/D74944

Differential revision: https://reviews.llvm.org/D75354
2020-04-06 13:16:48 -05:00
Craig Topper 07ed1fb597 [SelectionDAGBuilder] Fix ISD::FREEZE creation for structs with fields of different types.
The previous code used the type of the first field for the VT
passed to getNode for every field.

I've based the implementation here off what is done in visitSelect
as it removes the need to special case aggregates.

Differential Revision: https://reviews.llvm.org/D77093
2020-04-06 11:03:40 -07:00
Konstantin Pyzhov 51dc028314 Revert e1730cfeb3 2020-04-06 05:56:11 -04:00
Kirill Naumov 3f995ce8b5 [CFGPrinter][CallPrinter][polly] Adding distinct structure for CFGDOTInfo
The patch introduces the system to distinctively store the information
needed for the Control Flow Graph as well as the instrumentary needed for
the follow-up changes: BlockFrequencyInfo and BranchProbabilityInfo.
The patch is a part of sequence of three patches, related to graphs Heat Coloring.

Reviewers: rcorcs, apilipenko, davidxl, sfertile, fedor.sergeev, eraman, bollu

Differential Revision: https://reviews.llvm.org/D76820
2020-04-06 17:42:54 +00:00
Konstantin Pyzhov e1730cfeb3 [AMDGPU] Disable 'Skip Uniform Regions' optimization by default for AMDGPU.
Reviewers: sameerds, dstuttard

Differential Revision: https://reviews.llvm.org/D77228
2020-04-06 05:10:37 -04:00
Fangrui Song a5d375e0cb [AArch64] Allow logical immediates to have all-1 in top bits
So that constant expressions like the following are permitted:

and w0, w0, #~(0xfe<<24)
and w1, w1, #~(0xff<<24)

The behavior matches GNU as (opcodes/aarch64-opc.c:aarch64_logical_immediate_p).

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D75885
2020-04-06 09:56:04 -07:00
Florian Hahn 7aba6a0333 [LV] Fix value that could be read uninitialized.
This should fix
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap-msan/builds/18569
2020-04-06 17:54:50 +01:00
Nikita Popov e8b83f7ddc [RDA] Only store most recent reaching def from predecessors (NFCI)
When entering a basic block, RDA inserts reaching definitions coming
from predecessor blocks (which will be negative numbers) in a rather
peculiar way. If you have incoming reaching definitions -4, -3, -2, -1,
it will insert those. If you have incoming reaching definitions
-1, -2, -3, -4, it will insert -1, -1, -1, -1, as the max is taken
at each step. That's probably not what was intended...

However, RDA only actually cares about the most recent reaching
definition from a predecessor (to calculate clearance), so this
ends up working fine as far as behavior is concerned. It does
waste memory on unnecessary reaching definitions though.

This patch changes the implementation to first compute the most
recent reaching definition in one loop, and then insert only that
one in a separate loop.

Differential Revision: https://reviews.llvm.org/D77508
2020-04-06 18:39:09 +02:00
Nikita Popov 8d75df1438 [RDA] Don't adjust ReachingDefDefaultVal (NFCI)
At the end of a basic block, RDA adjusts all the reaching defs it
found to be relative to the end of the basic block, rather than the
start of it. However, it also does this to registers which don't
have a reaching def, indicated by ReachingDefDefaultVal. This means
that code checking against ReachingDefDefaultVal will not skip them,
and may insert them into the reaching definition list. This is
ultimately harmless, but causes unnecessary work and is logically
not right.

Differential Revision: https://reviews.llvm.org/D77506
2020-04-06 18:36:29 +02:00
Sanjay Patel fbb1b43f13 [ValueTracking] enhance matching of umin/umax with 'not' operands
The cmyk test is based on the known regression that resulted from:
rGf2fbdf76d8d0

This improves on the equivalent signed min/max change:
rG867f0c3c4d8c

The underlying icmp equivalence is:
  ~X pred ~Y --> Y pred X

For an icmp with constant, canonicalization results in a swapped pred:
  ~X < C -->  X > ~C
2020-04-06 11:51:59 -04:00
Matt Arsenault 8a5f0dafd4 AMDGPU/GlobalISel: Select llvm.amdgcn.div.scale 2020-04-06 11:50:19 -04:00
Matt Arsenault e87ec66762 AMDGPU/GlobalISel: Fix llvm.amdgcn.div.fmas.ll 2020-04-06 11:50:16 -04:00
Jay Foad ddd2f4b96f [AMDGPU] Fix inaccurate comments 2020-04-06 16:44:08 +01:00
Florian Hahn 90be3c24a7 [VPlan] Introduce new VPWidenCallRecipe (NFC).
This patch moves calls to their own recipe, to simplify the transition
to VPUser for operands of VPWidenRecipe, as discussed in D76992.

Subsequently additional information can be added to the recipe rather
than computing it during the execute step.

Reviewers: rengolin, Ayal, gilr, hsaito

Reviewed By: gilr

Differential Revision: https://reviews.llvm.org/D77467
2020-04-06 16:07:37 +01:00
Chris Bowler d6ea82d11c [AIX][PPC] Implement by-val caller arguments in multiple registers
Differential Revision: https://reviews.llvm.org/D76380
2020-04-06 11:06:51 -04:00
Guillaume Chatelet 808286342a [Alignment][NFC] Assume AlignmentFromAssumptions::getNewAlignment is always set.
Summary:
In D77454 we explain that `LoadInst` and `StoreInst` always have their alignment defined.
This allows to work backward here and to infer that `getNewAlignment` does not need to return `0` in case of failure.
Returning `1` also works since it needs to be greater than the Load/Store alignment which is a least `1`.

This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77538
2020-04-06 14:54:57 +00:00
diggerlin a26a441b99 [llvm-objdump][XCOFF] Use symbol index+symbol name + storage mapping class as label for -D
SUMMARY:

For the llvm-objdump -D, the symbol name is used as a label in the disassembly for the specific address (when a symbol address is equal to the virtual address in the dump).

In XCOFF, multiple symbols may have the same name, being differentiated by their storage mapping class. It is helpful to print the QualName and not just the name when forming the output label for a csect symbol. The symbol index further removes any ambiguity caused by duplicate names.

To maintain compatibility with the binutils objdump, the XCOFF-specific --symbol-description option is added to enable the enhanced format.

Reviewers: hubert.reinterpretcast, James Henderson, Jason Liu ,daltenty
Subscribers: wuzish, nemanjai, hiraditya

Differential Revision: https://reviews.llvm.org/D72973
2020-04-06 10:10:10 -04:00
Benjamin Kramer 880ec421dd [MC] Use a byte_swap in emitIntValue instead of doing it in a loop. NFCI. 2020-04-06 15:51:24 +02:00
Florian Hahn 6babae74c7 [Matrix] Update load/storeMatrix to take indices as Value* (NFC).
This allows using the functions to be used with loop dependent indices.
2020-04-06 14:48:48 +01:00
Matt Arsenault cbf719b568 AMDGPU: Use DAG patterns for div_fmas 2020-04-06 09:28:30 -04:00
Matt Arsenault 79b29d6df7 AMDGPU: Remove DisableInst feature
I'm not sure why these were bothering to check the instruction
profile, since those profiles should only be used with these
instruction classes.
2020-04-06 09:27:44 -04:00
Matt Arsenault 70726cec5b DAG: Combine extract_vector_elt of concat_vectors
Fixes extra canonicalize regressions when legalizing
vector fminnum/fmaxnum.
2020-04-06 09:26:29 -04:00
Hans Wennborg 64c2312750 Revert 43f031d312 "Enable IBT(Indirect Branch Tracking) in JIT with CET(Control-flow Enforcement Technology)"
ExecutionEngine/MCJIT/cet-code-model-lager.ll is failing on 32-bit
windows, see llvm-commits thread for fef2dab.

This reverts commit 43f031d312
and the follow-ups fef2dab100 and
6a800f6f62.
2020-04-06 15:05:25 +02:00
Sourabh Singh Tomar 5d7e9adce2 [DWARF5] Added support for emission of debug_macro section.
Summary:
This patch adds support for emission of following DWARFv5 macro forms
in .debug_macro section.

1. DW_MACRO_start_file
2. DW_MACRO_end_file
3. DW_MACRO_define_strp
4. DW_MACRO_undef_strp.

Reviewed By: dblaikie, ikudrin

Differential Revision: https://reviews.llvm.org/D72828
2020-04-06 17:45:10 +05:30
Pavel Labath 9154a6398e [llvm/Support] Make more DataExtractor methods error-aware
Summary:
This patch adds the optional Error argument, and the Cursor variants to
more DataExtractor methods. The functions now behave the same way as
other error-aware functions (they set the error when they fail, and
don't do anything if the error is already set).

I have merged the LEB128 implementations via a template (similarly to
how fixed-size functions are handled) to reduce code duplication.

Depends on D77304.

Reviewers: dblaikie, aprantl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77306
2020-04-06 14:14:11 +02:00
Pavel Labath a16fffa3f6 [Support] Make DataExtractor string functions error-aware
Summary:
This patch adds an optional Error argument to DataExtractor functions
for string extraction, and makes them behave like other DataExtractor
functions (set the error if extraction fails, don't do anything if the
error is already set).

I have merged the StringRef and C string versions of the functions to
reduce code duplication.

Reviewers: dblaikie, MaskRay

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77307
2020-04-06 14:14:11 +02:00
Guillaume Chatelet ff858d7781 [Alignment][NFC] Add DebugStr and operator*
Summary:
This is a roll forward of D77394 minus AlignmentFromAssumptions (which needs to be addressed separately)
Differences from D77394:
 - DebugStr() now prints the alignment value or `None` and no more `Align(x)` or `MaybeAlign(x)`
   - This is to keep Warning message consistent (CodeGen/SystemZ/alloca-04.ll)
 - Removed a few unneeded headers from Alignment (since it's included everywhere it's better to keep the dependencies to a minimum)

Reviewers: courbet

Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77537
2020-04-06 12:09:45 +00:00
Guillaume Chatelet 39cfba9e33 [Alignment][NFC] Remove deprecated functions introduced in 10.0.0
Summary:
24 March 2020: LLVM 10.0.0 is out.
I gathered all deprecated function introduced between 9 and 10 and cleaned them up so they will be removed from 11.

> git log -p -S LLVM_ATTRIBUTE_DEPRECATED llvmorg-9.0.0..llvmorg-10.0.0

Reviewers: courbet

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77409
2020-04-06 12:07:18 +00:00
Simon Pilgrim 9bc5b1a489 [X86][SSE] combineVectorSignBitsTruncation - remove minimum vector length limitations
truncateVectorWithPACK has its own vector length controls, so we can rely on those directly. This helps some existing truncation to subvector tests, which were being combined later during shuffle lowering at which point the sign/zero bit detection had become obscured preventing lowerShuffleWithPACK working as well as it could.
2020-04-06 12:45:23 +01:00
Benjamin Kramer 232eff55f6 [LTO] Replace hand-rolled endian conversion with support::endian. NFCI. 2020-04-06 13:23:27 +02:00
Benjamin Kramer e64e516790 [RuntimeDyld] Replace hand-rolled endian conversion with support::endian. NFCI. 2020-04-06 13:22:53 +02:00
Benjamin Kramer 9a9bc23672 [llvm-bcanalyzer] Simplify code. NFCI. 2020-04-06 12:50:50 +02:00