Commit Graph

1113 Commits

Author SHA1 Message Date
Sander de Smalen 55d18b3cc2 [TTI] Return a TypeSize from getRegisterBitWidth.
This patch changes the interface to take a RegisterKind, to indicate
whether the register bitwidth of a scalar register, fixed-width vector
register, or scalable vector register must be returned.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D98874
2021-03-24 14:45:13 +00:00
Mikael Holmen 6d22ba48ea [NVPTX] Fix warning, remove extra ";" [NFC]
gcc complained with
../lib/Target/NVPTX/NVPTXLowerArgs.cpp:203:2: warning: extra ';' [-Wpedantic]
  203 | };
      |  ^
2021-03-19 09:26:14 +01:00
Artem Belevich 50c7504a93 [NVPTX] Avoid temp copy of byval kernel parameters.
Avoid making a temporary copy of byval argument if all accesses are loads and
therefore the pointer to the parameter can not escape.

This avoids excessive global memory accesses when each kernel makes its own
copy.

Differential revision: https://reviews.llvm.org/D98469
2021-03-15 14:27:22 -07:00
Stephen Tozer f677413071 Reapply "[DebugInfo] Add new instruction and DIExpression operator for variadic debug values"
Rewrites test to use correct architecture triple; fixes incorrect
reference in SourceLevelDebugging doc; simplifies `spillReg` behaviour
so as to not be dependent on changes elsewhere in the patch stack.

This reverts commit d2000b45d0.
2021-03-05 12:32:05 +00:00
Stephen Tozer d2000b45d0 Revert "[DebugInfo] Add new instruction and DIExpression operator for variadic debug values"
This reverts commit d07f106f4a.
2021-03-04 11:59:21 +00:00
gbtozers d07f106f4a [DebugInfo] Add new instruction and DIExpression operator for variadic debug values
This patch adds a new instruction that can represent variadic debug values,
DBG_VALUE_VAR. This patch alone covers the addition of the instruction and a set
of basic code changes in MachineInstr and a few adjacent areas, but does not
correctly handle variadic debug values outside of these areas, nor does it
generate them at any point.

The new instruction is similar to the existing DBG_VALUE instruction, with the
following differences: the operands are in a different order, any number of
values may be used in the instruction following the Variable and Expression
operands (these are referred to in code as “debug operands”) and are indexed
from 0 so that getDebugOperand(X) == getOperand(X+2), and the Expression in a
DBG_VALUE_VAR must use the DW_OP_LLVM_arg operator to pass arguments into the
expression.

The new DW_OP_LLVM_arg operator is only valid in expressions appearing in a
DBG_VALUE_VAR; it takes a single argument and pushes the debug operand at the
index given by the argument onto the Expression stack. For example the
sub-expression `DW_OP_LLVM_arg, 0` has the meaning “Push the debug operand at
index 0 onto the expression stack.”

Differential Revision: https://reviews.llvm.org/D82363
2021-03-04 11:45:35 +00:00
Artem Belevich 2aa01ccec3 [CUDA, NVPTX] Allow targeting sm_86 GPUs.
The patch only plumbs through the option necessary for targeting sm_86 GPUs w/o
adding any new functionality.

Differential Revision: https://reviews.llvm.org/D95974
2021-02-09 11:01:10 -08:00
Arthur Eubanks e84a4650eb [NVPTX][NewPM] Re-enable NVVMReflectPass
Disabled alongside NVVMIntrRangePass in https://reviews.llvm.org/D96166,
but turns out NVVMIntrRangePass was the issue.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D96291
2021-02-08 13:58:17 -08:00
Arthur Eubanks 526c0955c0 [NVPTX][NewPM] Temporarily disable NVPTX passes in new PM pipeline
These passes are causing numerical discrepancies after being added to
the pipeline. Disable while investigating.

Reviewed By: rupprecht

Differential Revision: https://reviews.llvm.org/D96166
2021-02-05 11:31:07 -08:00
Tres Popp 170199f562 [llvm][nvptx] add atomicity to counter in ISelLowering
Previously uniqueCallSite could have race conditions between different
threads. Now it is accessed with an atomic RMW and will be unique
between different threads.

Differential Revision: https://reviews.llvm.org/D94784
2021-01-19 10:20:20 +01:00
Kazu Hirata 23b0ab2acb [llvm] Use the default value of drop_begin (NFC) 2021-01-18 10:16:36 -08:00
Kazu Hirata 19aacdb715 [llvm] Construct SmallVector with iterator ranges (NFC) 2021-01-16 09:40:53 -08:00
Arthur Eubanks 9ccf13c36d [NewPM][NVPTX] Port NVPTX opt passes
There are only two used in the IR optimization pipeline.
Port these and add them to the default pipeline.

Similar to https://reviews.llvm.org/D93863.

I added -mtriple to some tests since under the new PM, the passes are
only available when the TargetMachine is specified.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D93930
2021-01-07 15:12:35 -08:00
Matt Arsenault c9122ddef5 CodeGen: Refactor regallocator command line and target selection
Make the sequence of passes to select and rewrite instructions to
physical registers be a target callback. This is to prepare to allow
targets to split register allocation into multiple phases.
2021-01-07 13:13:25 -05:00
Paul C. Anagnostopoulos f4c39ccd22 [TableGen] Continue cleaning up .td files
This pass includes LLVM and MLIR files.

Differential Revision: https://reviews.llvm.org/D93864
2021-01-01 10:21:02 -05:00
Florian Hahn b2f4c5fddc
[AsmWriter] Factor out mnemonic generation to accessible getMnemonic.
This patch factors out the part of printInstruction that gets the
mnemonic string for a given MCInst. This is intended to be used
subsequently for the instruction-mix remarks to display the final
mnemonic (D90040).

Unfortunately making `getMnemonic` available to the AsmPrinter
seems to require making it virtual. Not sure if there's a way around
that with the current layering of the AsmPrinters.

Reviewed By: Paul-C-Anagnostopoulos

Differential Revision: https://reviews.llvm.org/D90039
2020-11-17 09:47:38 +00:00
serge-sans-paille 9218ff50f9 llvmbuildectomy - replace llvm-build by plain cmake
No longer rely on an external tool to build the llvm component layout.

Instead, leverage the existing `add_llvm_componentlibrary` cmake function and
introduce `add_llvm_component_group` to accurately describe component behavior.

These function store extra properties in the created targets. These properties
are processed once all components are defined to resolve library dependencies
and produce the header expected by llvm-config.

Differential Revision: https://reviews.llvm.org/D90848
2020-11-13 10:35:24 +01:00
Paul C. Anagnostopoulos eed768b700 [NVPTX] [TableGen] Use new features of TableGen to simplify and clarify.
Differential Revision: https://reviews.llvm.org/D90861
2020-11-06 09:20:19 -05:00
Yuriy Chernyshov 99e64623ec Do not construct std::string from nullptr
While I am trying to forbid such usages systematically in https://reviews.llvm.org/D79427 / P2166R0 to C++ standard, this PR fixes this (definitelly incorrect) usage in llvm.

This code is unreachable, so it could not cause any harm

Reviewed By: nikic, dblaikie

Differential Revision: https://reviews.llvm.org/D87697
2020-11-05 15:23:26 -08:00
Sander de Smalen d57bba7cf8 [SVE] Return StackOffset for TargetFrameLowering::getFrameIndexReference.
To accommodate frame layouts that have both fixed and scalable objects
on the stack, describing a stack location or offset using a pointer + uint64_t
is not sufficient. For this reason, we've introduced the StackOffset class,
which models both the fixed- and scalable sized offsets.

The TargetFrameLowering::getFrameIndexReference is made to return a StackOffset,
so that this can be used in other interfaces, such as to eliminate frame indices
in PEI or to emit Debug locations for variables on the stack.

This patch is purely mechanical and doesn't change the behaviour of how
the result of this function is used for fixed-sized offsets. The patch adds
various checks to assert that the offset has no scalable component, as frame
offsets with a scalable component are not yet supported in various places.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D90018
2020-11-05 11:02:18 +00:00
Paul C. Anagnostopoulos d56cd4291e [TableGen] Add !interleave operator to concatenate a list of values with delimiters
Add a test. Use it in some TableGen files.

Differential Revision: https://reviews.llvm.org/D90469
2020-11-04 09:23:54 -05:00
Paul C. Anagnostopoulos ef6f6d1c1a [TableGen] Eliminate uses of true and false in .td files.
They occurred in one NVPTX file and some test files.

Differential Revision: https://reviews.llvm.org/D90513
2020-10-31 10:54:33 -04:00
David Sherwood 47f2dc7e5f [SVE][NFC] Replace some TypeSize comparisons in non-AArch64 Targets
In most of lib/Target we know that we are not dealing with scalable
types so it's perfectly fine to replace TypeSize comparison operators
with their fixed width equivalents, making use of getFixedSize()
and so on.

Differential Revision: https://reviews.llvm.org/D89101
2020-10-15 09:01:21 +01:00
Justin Lebar e9ac1869a8
Preserve param alignment in NVPTXLowerArgs pass.
NVPTXLowerArgs works as follows.

  * Create a regular alloca with alignment identical to arg.
  * Copy arg from param space (and ASC'ing it from generic AS first) to
    the alloca (it's still in generic AS).
  * Replace loads of arg with loads of alloca.

The bug here is that we did not preserve the arg's alignment when
loading from the alloca.

The impact of this bug is that sometimes param loads would be lowered as
a series of u8 loads, because we're incorrectly assuming everything has
alignment 1.

Differential Revision: https://reviews.llvm.org/D89404
2020-10-14 11:15:30 -07:00
Simon Pilgrim 8f34216ece NVPTXTargetMachine.h - remove unused includes. NFCI. 2020-09-29 16:41:59 +01:00
Craig Topper c7a0b2684f [X86][MC][Target] Initial backend support a tune CPU to support -mtune
This patch implements initial backend support for a -mtune CPU controlled by a "tune-cpu" function attribute. If the attribute is not present X86 will use the resolved CPU from target-cpu attribute or command line.

This patch adds MC layer support a tune CPU. Each CPU now has two sets of features stored in their GenSubtargetInfo.inc tables . These features lists are passed separately to the Processor and ProcessorModel classes in tablegen. The tune list defaults to an empty list to avoid changes to non-X86. This annoyingly increases the size of static tables on all target as we now store 24 more bytes per CPU. I haven't quantified the overall impact, but I can if we're concerned.

One new test is added to X86 to show a few tuning features with mismatched tune-cpu and target-cpu/target-feature attributes to demonstrate independent control. Another new test is added to demonstrate that the scheduler model follows the tune CPU.

I have not added a -mtune to llc/opt or MC layer command line yet. With no attributes we'll just use the -mcpu for both. MC layer tools will always follow the normal CPU for tuning.

Differential Revision: https://reviews.llvm.org/D85165
2020-08-14 15:31:50 -07:00
Kazu Hirata 60434989e5 Use llvm::is_contained where appropriate (NFC)
Use llvm::is_contained where appropriate (NFC)

Reviewed By: kazu

Differential Revision: https://reviews.llvm.org/D85083
2020-08-01 21:51:06 -07:00
Vitaly Buka b0eb40ca39 [NFC] Remove unused GetUnderlyingObject paramenter
Depends on D84617.

Differential Revision: https://reviews.llvm.org/D84621
2020-07-31 02:10:03 -07:00
Vitaly Buka 89051ebace [NFC] GetUnderlyingObject -> getUnderlyingObject
I am going to touch them in the next patch anyway
2020-07-30 21:08:24 -07:00
Sebastian Neubauer 2a6c871596 [InstCombine] Move target-specific inst combining
For a long time, the InstCombine pass handled target specific
intrinsics. Having target specific code in general passes was noted as
an area for improvement for a long time.

D81728 moves most target specific code out of the InstCombine pass.
Applying the target specific combinations in an extra pass would
probably result in inferior optimizations compared to the current
fixed-point iteration, therefore the InstCombine pass resorts to newly
introduced functions in the TargetTransformInfo when it encounters
unknown intrinsics.
The patch should not have any effect on generated code (under the
assumption that code never uses intrinsics from a foreign target).

This introduces three new functions:
TargetTransformInfo::instCombineIntrinsic
TargetTransformInfo::simplifyDemandedUseBitsIntrinsic
TargetTransformInfo::simplifyDemandedVectorEltsIntrinsic

A few target specific parts are left in the InstCombine folder, where
it makes sense to share code. The largest left-over part in
InstCombineCalls.cpp is the code shared between arm and aarch64.

This allows to move about 3000 lines out from InstCombine to the targets.

Differential Revision: https://reviews.llvm.org/D81728
2020-07-22 15:59:49 +02:00
Artem Belevich bf66003a4f [MC,NVPTX] Add MCAsmPrinter support for unsigned-only data directives.
PTX does not support negative values in .bNN data directives and we must
typecast such values to unsigned before printing them.

MCAsmInfo can now specify whether such casting is necessary for particular
target.

Differential Revision: https://reviews.llvm.org/D83423
2020-07-20 16:24:41 -07:00
Sidharth Baveja e541e1b757 [NFC] Separate Peeling Properties into its own struct (re-land after minor fix)
Summary:
This patch separates the peeling specific parameters from the UnrollingPreferences,
and creates a new struct called PeelingPreferences. Functions which used the
UnrollingPreferences struct for peeling have been updated to use the PeelingPreferences struct.

Author: sidbav (Sidharth Baveja)

Reviewers: Whitney (Whitney Tsang), Meinersbur (Michael Kruse), skatkov (Serguei Katkov), ashlykov (Arkady Shlykov), bogner (Justin Bogner), hfinkel (Hal Finkel), anhtuyen (Anh Tuyen Tran), nikic (Nikita Popov)

Reviewed By: Meinersbur (Michael Kruse)

Subscribers: fhahn (Florian Hahn), hiraditya (Aditya Kumar), llvm-commits, LLVM

Tag: LLVM

Differential Revision: https://reviews.llvm.org/D80580
2020-07-10 18:39:30 +00:00
Nikita Popov 0b39d2d752 Revert "[NFC] Separate Peeling Properties into its own struct"
This reverts commit 0369dc98f9.

Many failing tests.
2020-07-08 21:43:32 +02:00
Sidharth Baveja 0369dc98f9 [NFC] Separate Peeling Properties into its own struct
Summary:
This patch makes the peeling properties of the loop accessible by other loop transformations.

Author: sidbav (Sidharth Baveja)

Reviewers: Whitney (Whitney Tsang), Meinersbur (Michael Kruse), skatkov (Serguei Katkov), ashlykov (Arkady Shlykov), bogner (Justin Bogner), hfinkel (Hal Finkel)

Reviewed By: Meinersbur (Michael Kruse)

Subscribers: fhahn (Florian Hahn), hiraditya (Aditya Kumar), llvm-commits, LLVM

Tag: LLVM

Differential Revision: https://reviews.llvm.org/D80580
2020-07-08 18:59:59 +00:00
Anh Tuyen Tran 6965af43e6 Revert "[NFC] Separate Peeling Properties into its own struct"
This reverts commit fead250b43.
2020-07-08 18:58:05 +00:00
Anh Tuyen Tran fead250b43 [NFC] Separate Peeling Properties into its own struct
Summary:
This patch makes the peeling properties of the loop accessible by other loop transformations.

Author: sidbav (Sidharth Baveja)

Reviewers: Whitney (Whitney Tsang), Meinersbur (Michael Kruse), skatkov (Serguei Katkov), ashlykov (Arkady Shlykov), bogner (Justin Bogner), hfinkel (Hal Finkel)

Reviewed By: Meinersbur (Michael Kruse)

Subscribers: fhahn (Florian Hahn), hiraditya (Aditya Kumar), llvm-commits, LLVM

Tag: LLVM

Differential Revision: https://reviews.llvm.org/D80580
2020-07-08 18:56:03 +00:00
Guillaume Chatelet 87e2751cf0 [Alignment][NFC] Use proper getter to retrieve alignment from ConstantInt and ConstantSDNode
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D83082
2020-07-03 08:06:43 +00:00
Guillaume Chatelet 4f5133a4dc [Alignment][NFC] Migrate AArch64, ARM, Hexagon, MSP and NVPTX backends to Align
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82749
2020-06-30 07:56:17 +00:00
Guillaume Chatelet 1507fc1506 [Alignment][NFC] Migrate TTI::isLegalToVectorize{Load,Store}Chain to Align
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82653
2020-06-26 14:14:27 +00:00
tatz.j@northeastern.edu af5e61bf4f [NVPTX] Fix for NVPTX module asm regression
Currently module asm ends up emitted twice and at the wrong place in the PTX.
This patch moves module asm generation into emitStartOfAsmFile() which puts at
the correct location in the generated PTX.

Differential Revision: https://reviews.llvm.org/D82280
2020-06-24 11:17:09 -07:00
stozer 539381da26 [DebugInfo] Update MachineInstr to help support variadic DBG_VALUE instructions
Following on from this RFC[0] from a while back, this is the first patch towards
implementing variadic debug values.

This patch specifically adds a set of functions to MachineInstr for performing
operations specific to debug values, and replacing uses of the more general
functions where appropriate. The most prevalent of these is replacing
getOperand(0) with getDebugOperand(0) for debug-value-specific code, as the
operands corresponding to values will no longer be at index 0, but index 2 and
upwards: getDebugOperand(x) == getOperand(x+2). Similar replacements have been
added for the other operands, along with some helper functions to replace
oft-repeated code and operate on a variable number of value operands.

[0] http://lists.llvm.org/pipermail/llvm-dev/2020-February/139376.html<Paste>

Differential Revision: https://reviews.llvm.org/D81852
2020-06-22 16:01:12 +01:00
Craig Topper f96a7706d9 [Target] Use Align in TargetLoweringObjectFile::getSectionForConstant.
Differential Revision: https://reviews.llvm.org/D80363
2020-05-21 15:23:29 -07:00
Eli Friedman 4f04db4b54 AllocaInst should store Align instead of MaybeAlign.
Along the lines of D77454 and D79968.  Unlike loads and stores, the
default alignment is getPrefTypeAlign, to match the existing handling in
various places, including SelectionDAG and InstCombine.

Differential Revision: https://reviews.llvm.org/D80044
2020-05-16 14:53:16 -07:00
Eli Friedman 11aa3707e3 StoreInst should store Align, not MaybeAlign
This is D77454, except for stores.  All the infrastructure work was done
for loads, so the remaining changes necessary are relatively small.

Differential Revision: https://reviews.llvm.org/D79968
2020-05-15 12:26:58 -07:00
Eli Friedman accc6b5545 LoadInst should store Align, not MaybeAlign.
The fact that loads and stores can have the alignment missing is a
constant source of confusion: code that usually works can break down in
rare cases.  So fix the LoadInst API so the alignment is never missing.

To reduce the number of changes required to make this work, IRBuilder
and certain LoadInst constructors will grab the module's datalayout and
compute the alignment automatically.  This is the same alignment
instcombine would eventually apply anyway; we're just doing it earlier.
There's a minor risk that the way we're retrieving the datalayout
could break out-of-tree code, but I don't think that's likely.

This is the last in a series of patches, so most of the necessary
changes have already been merged.

Differential Revision: https://reviews.llvm.org/D77454
2020-05-14 13:19:21 -07:00
Sam Parker 40574fefe9 [NFC][CostModel] Add TargetCostKind to relevant APIs
Make the kind of cost explicit throughout the cost model which,
apart from making the cost clear, will allow the generic parts to
calculate better costs. It will also allow some backends to
approximate and correlate the different costs if they wish. Another
benefit is that it will also help simplify the cost model around
immediate and intrinsic costs, where we currently have multiple APIs.

RFC thread:
http://lists.llvm.org/pipermail/llvm-dev/2020-April/141263.html

Differential Revision: https://reviews.llvm.org/D79002
2020-05-05 10:35:54 +01:00
Craig Topper a58b62b4a2 [IR] Replace all uses of CallBase::getCalledValue() with getCalledOperand().
This method has been commented as deprecated for a while. Remove
it and replace all uses with the equivalent getCalledOperand().

I also made a few cleanups in here. For example, to removes use
of getElementType on a pointer when we could just use getFunctionType
from the call.

Differential Revision: https://reviews.llvm.org/D78882
2020-04-27 22:17:03 -07:00
Simon Pilgrim a3982491db [Pass] Ensure we don't include PassSupport.h or PassAnalysisSupport.h directly
Both PassSupport.h and PassAnalysisSupport.h are only supposed to be included via Pass.h.

Differential Revision: https://reviews.llvm.org/D78815
2020-04-26 12:58:20 +01:00
Simon Pilgrim 155190567c [NVPTX] Remove unused forward declarations. NFC. 2020-04-23 16:30:44 +01:00
Christopher Tetreault 2dea3f1298 [SVE] Add new VectorType subclasses
Summary:
Introduce new types for fixed width and scalable vectors.

Does not remove getNumElements yet so as to not break code during transition
period.

Reviewers: deadalnix, efriedma, sdesmalen, craig.topper, huntergr

Reviewed By: sdesmalen

Subscribers: jholewinski, arsenm, jvesely, nhaehnle, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, csigg, arpith-jacob, mgester, lucyrfox, liufengdb, kerbowa, Joonsoo, grosul1, frgossen, lldb-commits, tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm, #lldb

Differential Revision: https://reviews.llvm.org/D77587
2020-04-22 08:59:01 -07:00
Craig Topper d22989c34e [CallSite removal][Target] Replace CallSite with CallBase. NFC
In some cases just delete an unneeded include.
2020-04-21 23:29:36 -07:00
Craig Topper 113f37a1f9 [CallSite removal][TargetLowering] Replace ImmutableCallSite with CallBase
Differential Revision: https://reviews.llvm.org/D77995
2020-04-13 13:50:15 -07:00
Fangrui Song d2e5157c1f [MC] Add UseIntegratedAssembler = false. NFC 2020-04-11 10:13:49 -07:00
Artem Belevich a9627b7ea7 [CUDA] Add partial support for recent CUDA versions.
Generate PTX using newer versions of PTX and allow using sm_80 with CUDA-11.
None of the new features of CUDA-10.2+ have been implemented yet, so using these
versions will still produce a warning.

Differential Revision: https://reviews.llvm.org/D77670
2020-04-08 11:19:44 -07:00
Matt Arsenault 2481f26ac3 CodeGen: Use Register in TargetFrameLowering 2020-04-07 17:07:44 -04:00
Guillaume Chatelet c7468c1696 [Alignment][NFC] Use Align in SelectionDAG::getMemIntrinsicNode
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: jholewinski, nemanjai, hiraditya, kbarton, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77149
2020-04-01 09:32:05 +00:00
Matt Arsenault a8cc9047de CodeGen: Add -denormal-fp-math-f32 flag
Make the set of FP related attributes and command flags closer.
2020-03-27 14:00:39 -07:00
Matt Arsenault 830cfda19f Utils: Mostly convert memcpy expansion to use Align
The TTI hooks aren't converted. I also think the intrinsics should
have mandatory alignment and never return MaybeAlign.
2020-03-22 11:21:44 -04:00
Guillaume Chatelet d000655a8c [Alignment][NFC] Deprecate getMaxAlignment
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: jholewinski, arsenm, dschuff, jyknight, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76348
2020-03-18 14:48:45 +01:00
Guillaume Chatelet c3df69faa0 [Alignment][NFC] Deprecate getTransientStackAlignment
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: jholewinski, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76301
2020-03-18 09:02:48 +01:00
Nikita Popov 0e890cd4d4 [ConstantFolding] Always return something from ConstantFoldConstant
Spin-off from D75407. As described there, ConstantFoldConstant()
currently returns null for non-ConstantExpr/ConstantVector inputs,
but otherwise always returns non-null, independently of whether
any folding has happened or not.

This is confusing and makes consumer code more complicated.
I would expect either that ConstantFoldConstant() returns only if
it actually folded something, or that it always returns non-null.
I'm going to the latter possibility here, which appears to be more
useful considering existing usage.

Differential Revision: https://reviews.llvm.org/D75543
2020-03-04 18:24:47 +01:00
Fangrui Song 549b436beb [MC] De-capitalize MCStreamer::Emit{Bundle,Addrsig}* etc
So far, all non-COFF-related Emit* functions have been de-capitalized.
2020-02-15 09:11:48 -08:00
Fangrui Song bcd24b2d43 [AsmPrinter][MCStreamer] De-capitalize EmitInstruction and EmitCFI* 2020-02-13 22:08:55 -08:00
Fangrui Song 0dce409cee [AsmPrinter] De-capitalize Emit{Function,BasicBlock]* and Emit{Start,End}OfAsmFile 2020-02-13 13:22:49 -08:00
Frederic Bastien 019ab61e25 [NVPTX, LSV] Move the LSV optimization pass to later when the graph is cleaner
This allow it to recognize more loads as being consecutive when the load's address are complex at the start.

Differential Revision: https://reviews.llvm.org/D74444
2020-02-13 12:15:38 -08:00
Matt Arsenault a3c814d234 Separately track input and output denormal mode
AMDGPU and x86 at least both have separate controls for whether
denormal results are flushed on output, and for whether denormals are
implicitly treated as 0 as an input. The current DAGCombiner use only
really cares about the input treatment of denormals.
2020-02-04 12:59:21 -05:00
Guillaume Chatelet 333f2ad8b8 [Alignment][NFC] Use Align for getMemcpy/Memmove/Memset
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: arsenm, dschuff, jyknight, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73885
2020-02-03 17:13:19 +01:00
Benjamin Kramer adcd026838 Make llvm::StringRef to std::string conversions explicit.
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.

This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.

This doesn't actually modify StringRef yet, I'll do that in a follow-up.
2020-01-28 23:25:25 +01:00
Krzysztof Parzyszek 020041d99b Update spelling of {analyze,insert,remove}Branch in strings and comments
These names have been changed from CamelCase to camelCase, but there were
many places (comments mostly) that still used the old names.

This change is NFC.
2020-01-21 10:15:38 -06:00
Derek Schuff ff171acf84 [WebAssembly] Track frame registers through VReg and local allocation
This change has 2 components:

Target-independent: add a method getDwarfFrameBase to TargetFrameLowering. It
describes how the Dwarf frame base will be encoded.  That can be a register (the
default), the CFA (which replaces NVPTX-specific logic in DwarfCompileUnit), or
a DW_OP_WASM_location descriptr.

WebAssembly: Allow WebAssemblyFunctionInfo::getFrameRegister to return the
correct virtual register instead of FP32/SP32 after WebAssemblyReplacePhysRegs
has run.  Make WebAssemblyExplicitLocals store the local it allocates for the
frame register. Use this local information to implement getDwarfFrameBase

The result is that the DW_AT_frame_base attribute is correctly encoded for each
subprogram, and each param and local variable has a correct DW_AT_location that
uses DW_OP_fbreg to refer to the frame base.

This is a reland of rG3a05c3969c18 with fixes for the expensive-checks
and Windows builds

Differential Revision: https://reviews.llvm.org/D71681
2020-01-17 17:23:56 -08:00
Matt Arsenault a4451d88ee Consolidate internal denormal flushing controls
Currently there are 4 different mechanisms for controlling denormal
flushing behavior, and about as many equivalent frontend controls.

- AMDGPU uses the fp32-denormals and fp64-f16-denormals subtarget features
- NVPTX uses the nvptx-f32ftz attribute
- ARM directly uses the denormal-fp-math attribute
- Other targets indirectly use denormal-fp-math in one DAGCombine
- cl-denorms-are-zero has a corresponding denorms-are-zero attribute

AMDGPU wants a distinct control for f32 flushing from f16/f64, and as
far as I can tell the same is true for NVPTX (based on the attribute
name).

Work on consolidating these into the denormal-fp-math attribute, and a
new type specific denormal-fp-math-f32 variant. Only ARM seems to
support the two different flush modes, so this is overkill for the
other use cases. Ideally we would error on the unsupported
positive-zero mode on other targets from somewhere.

Move the logic for selecting the flush mode into the compiler driver,
instead of handling it in cc1. denormal-fp-math/denormal-fp-math-f32
are now both cc1 flags, but denormal-fp-math-f32 is not yet exposed as
a user flag.

-cl-denorms-are-zero, -fcuda-flush-denormals-to-zero and
-fno-cuda-flush-denormals-to-zero will be mapped to
-fp-denormal-math-f32=ieee or preserve-sign rather than the old
attributes.

Stop emitting the denorms-are-zero attribute for the OpenCL flag. It
has no in-tree users. The meaning would also be target dependent, such
as the AMDGPU choice to treat this as only meaning allow flushing of
f32 and not f16 or f64. The naming is also potentially confusing,
since DAZ in other contexts refers to instructions implicitly treating
input denormals as zero, not necessarily flushing output denormals to
zero.

This also does not attempt to change the behavior for the current
attribute. The LangRef now states that the default is ieee behavior,
but this is inaccurate for the current implementation. The clang
handling is slightly hacky to avoid touching the existing
denormal-fp-math uses. Fixing this will be left for a future patch.

AMDGPU is still using the subtarget feature to control the denormal
mode, but the new attribute are now emitted. A future change will
switch this and remove the subtarget features.
2020-01-17 20:09:53 -05:00
Derek Schuff 80906d9d16 Revert "[WebAssembly] Track frame registers through VReg and local allocation"
This reverts commit 3a05c3969c.
It breaks under expensive-checks and on Windows
2020-01-16 14:38:00 -08:00
Derek Schuff 3a05c3969c [WebAssembly] Track frame registers through VReg and local allocation
This change has 2 components:

Target-independent: add a method getDwarfFrameBase to TargetFrameLowering. It
describes how the Dwarf frame base will be encoded.  That can be a register (the
default), the CFA (which replaces NVPTX-specific logic in DwarfCompileUnit), or
a DW_OP_WASM_location descriptr.

WebAssembly: Allow WebAssemblyFunctionInfo::getFrameRegister to return the
correct virtual register instead of FP32/SP32 after WebAssemblyReplacePhysRegs
has run.  Make WebAssemblyExplicitLocals store the local it allocates for the
frame register. Use this local information to implement getDwarfFrameBase

The result is that the DW_AT_frame_base attribute is correctly encoded for each
subprogram, and each param and local variable has a correct DW_AT_location that
uses DW_OP_fbreg to refer to the frame base.

Differential Revision: https://reviews.llvm.org/D71681
2020-01-16 13:51:17 -08:00
Simon Pilgrim 7b15865225 Fix "pointer is null" static analyzer warning. NFCI.
Use cast<> instead of dyn_cast<> since the pointer is always dereferenced and cast<> will perform the null assertion for us.
2020-01-15 12:18:11 +00:00
Tom Stellard 0dbcb36394 CMake: Make most target symbols hidden by default
Summary:
For builds with LLVM_BUILD_LLVM_DYLIB=ON and BUILD_SHARED_LIBS=OFF
this change makes all symbols in the target specific libraries hidden
by default.

A new macro called LLVM_EXTERNAL_VISIBILITY has been added to mark symbols in these
libraries public, which is mainly needed for the definitions of the
LLVMInitialize* functions.

This patch reduces the number of public symbols in libLLVM.so by about
25%.  This should improve load times for the dynamic library and also
make abi checker tools, like abidiff require less memory when analyzing
libLLVM.so

One side-effect of this change is that for builds with
LLVM_BUILD_LLVM_DYLIB=ON and LLVM_LINK_LLVM_DYLIB=ON some unittests that
access symbols that are no longer public will need to be statically linked.

Before and after public symbol counts (using gcc 8.2.1, ld.bfd 2.31.1):
nm before/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l
36221
nm after/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l
26278

Reviewers: chandlerc, beanz, mgorny, rnk, hans

Reviewed By: rnk, hans

Subscribers: merge_guards_bot, luismarques, smeenai, ldionne, lenary, s.egerton, pzheng, sameer.abuasal, MaskRay, wuzish, echristo, Jim, hiraditya, michaelplatings, chapuni, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, kristina, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D54439
2020-01-14 19:46:52 -08:00
Fangrui Song 3d87d0b925 [MC] Add parameter `Address` to MCInstrPrinter::printInstruction
Follow-up of D72172.

Reviewed By: jhenderson, rnk

Differential Revision: https://reviews.llvm.org/D72180
2020-01-06 20:44:14 -08:00
Fangrui Song aa708763d3 [MC] Add parameter `Address` to MCInstPrinter::printInst
printInst prints a branch/call instruction as `b offset` (there are many
variants on various targets) instead of `b address`.

It is a convention to use address instead of offset in most external
symbolizers/disassemblers. This difference makes `llvm-objdump -d`
output unsatisfactory.

Add `uint64_t Address` to printInst(), so that it can pass the argument to
printInstruction(). `raw_ostream &OS` is moved to the last to be
consistent with other print* methods.

The next step is to pass `Address` to printInstruction() (generated by
tablegen from the instruction set description). We can gradually migrate
targets to print addresses instead of offsets.

In any case, downstream projects which don't know `Address` can pass 0 as
the argument.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D72172
2020-01-06 20:42:22 -08:00
Mark de Wever 31262d6722 [NVPTX] Fixes -Wrange-loop-analysis warnings
This avoids new warnings due to D68912 adds -Wrange-loop-analysis to -Wall.

Also removed the top-level const as requested by Aaron Ballman in similar
patches.

Differential Revision: https://reviews.llvm.org/D71812
2019-12-22 19:27:44 +01:00
Reid Kleckner 5d986953c8 [IR] Split out target specific intrinsic enums into separate headers
This has two main effects:
- Optimizes debug info size by saving 221.86 MB of obj file size in a
  Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of
  object file size.
- Incremental step towards decoupling target intrinsics.

The enums are still compact, so adding and removing a single
target-specific intrinsic will trigger a rebuild of all of LLVM.
Assigning distinct target id spaces is potential future work.

Part of PR34259

Reviewers: efriedma, echristo, MaskRay

Reviewed By: echristo, MaskRay

Differential Revision: https://reviews.llvm.org/D71320
2019-12-11 18:02:14 -08:00
David Green be7a107070 [ARM] Teach the Arm cost model that a Shift can be folded into other instructions
This attempts to teach the cost model in Arm that code such as:
  %s = shl i32 %a, 3
  %a = and i32 %s, %b
Can under Arm or Thumb2 become:
  and r0, r1, r2, lsl #3

So the cost of the shift can essentially be free. To do this without
trying to artificially adjust the cost of the "and" instruction, it
needs to get the users of the shl and check if they are a type of
instruction that the shift can be folded into. And so it needs to have
access to the actual instruction in getArithmeticInstrCost, which if
available is added as an extra parameter much like getCastInstrCost.

We otherwise limit it to shifts with a single user, which should
hopefully handle most of the cases. The list of instruction that the
shift can be folded into include ADC, ADD, AND, BIC, CMP, EOR, MVN, ORR,
ORN, RSB, SBC and SUB. This translates to Add, Sub, And, Or, Xor and
ICmp.

Differential Revision: https://reviews.llvm.org/D70966
2019-12-09 10:24:33 +00:00
Tom Stellard ab411801b8 [cmake] Explicitly mark libraries defined in lib/ as "Component Libraries"
Summary:
Most libraries are defined in the lib/ directory but there are also a
few libraries defined in tools/ e.g. libLLVM, libLTO.  I'm defining
"Component Libraries" as libraries defined in lib/ that may be included in
libLLVM.so.  Explicitly marking the libraries in lib/ as component
libraries allows us to remove some fragile checks that attempt to
differentiate between lib/ libraries and tools/ libraires:

1. In tools/llvm-shlib, because
llvm_map_components_to_libnames(LIB_NAMES "all") returned a list of
all libraries defined in the whole project, there was custom code
needed to filter out libraries defined in tools/, none of which should
be included in libLLVM.so.  This code assumed that any library
defined as static was from lib/ and everything else should be
excluded.

With this change, llvm_map_components_to_libnames(LIB_NAMES, "all")
only returns libraries that have been added to the LLVM_COMPONENT_LIBS
global cmake property, so this custom filtering logic can be removed.
Doing this also fixes the build with BUILD_SHARED_LIBS=ON
and LLVM_BUILD_LLVM_DYLIB=ON.

2. There was some code in llvm_add_library that assumed that
libraries defined in lib/ would not have LLVM_LINK_COMPONENTS or
ARG_LINK_COMPONENTS set.  This is only true because libraries
defined lib lib/ use LLVMBuild.txt and don't set these values.
This code has been fixed now to check if the library has been
explicitly marked as a component library, which should now make it
easier to remove LLVMBuild at some point in the future.

I have tested this patch on Windows, MacOS and Linux with release builds
and the following combinations of CMake options:

- "" (No options)
- -DLLVM_BUILD_LLVM_DYLIB=ON
- -DLLVM_LINK_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_BUILD_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_LINK_LLVM_DYLIB=ON

Reviewers: beanz, smeenai, compnerd, phosek

Reviewed By: beanz

Subscribers: wuzish, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, mgorny, mehdi_amini, sbc100, jgravelle-google, hiraditya, aheejin, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, dang, Jim, lenary, s.egerton, pzheng, sameer.abuasal, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70179
2019-11-21 10:48:08 -08:00
Matt Arsenault b696b9dba7 DAG: Add function context to isFMAFasterThanFMulAndFAdd
AMDGPU needs to know the FP mode for the function to answer this
correctly when this is removed from the subtarget.

AArch64 had to make this more complicated by using this from an IR
hook, so add an IR typed overload.
2019-11-19 19:25:26 +05:30
Graham Hunter 3f08ad611a [SVE][CodeGen] Scalable vector MVT size queries
* Implements scalable size queries for MVTs, split out from D53137.

* Contains a fix for FindMemType to avoid using scalable vector type
  to contain non-scalable types.

* Explicit casts for several places where implicit integer sign
  changes or promotion from 32 to 64 bits caused problems.

* CodeGenDAGPatterns will treat scalable and non-scalable vector types
  as different.

Reviewers: greened, cameron.mcinally, sdesmalen, rovka

Reviewed By: rovka

Differential Revision: https://reviews.llvm.org/D66871
2019-11-18 12:30:59 +00:00
Reid Kleckner 4c1a1d3cf9 Add missing includes needed to prune LLVMContext.h include, NFC
These are a pre-requisite to removing #include "llvm/Support/Options.h"
from LLVMContext.h: https://reviews.llvm.org/D70280
2019-11-14 15:23:15 -08:00
Simon Pilgrim 6ebc5089b2 Fix uninitialized variable warning. NFCI. 2019-11-13 14:40:20 +00:00
Matt Arsenault e6c9a9af39 Use MCRegister in copyPhysReg 2019-11-11 14:42:33 +05:30
Simon Pilgrim d5c4881a56 ManagedStringPool - pre-increment iterator. NFC. 2019-11-07 16:56:16 +00:00
Artem Belevich d9972f8482 [NVPTX] Added llvm.nvvm.mma.m8n8k4.* intrinsics
Differential Revision: https://reviews.llvm.org/D69324
2019-10-28 13:55:30 -07:00
Mirko Brkusanin 4b63ca1379 [Mips] Use appropriate private label prefix based on Mips ABI
MipsMCAsmInfo was using '$' prefix for Mips32 and '.L' for Mips64
regardless of -target-abi option. By passing MCTargetOptions to MCAsmInfo
we can find out Mips ABI and pick appropriate prefix.

Tags: #llvm, #clang, #lldb

Differential Revision: https://reviews.llvm.org/D66795
2019-10-23 12:24:35 +02:00
Guillaume Chatelet 17f5d2b1a5 [Alignment][NFC] Attributes use Align/MaybeAlign
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: jholewinski, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69278

llvm-svn: 375495
2019-10-22 09:51:06 +00:00
Guillaume Chatelet 882c43d703 [Alignment][NFC] Use Align for TargetFrameLowering/Subtarget
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68993

llvm-svn: 375084
2019-10-17 07:49:39 +00:00
Artem Belevich 5c6ab2a0b1 [NVPTX] Restructure shfl instrinsics and add variants that return a predicate.
Also, amend constraints for non-sync variants that are no longer
available on sm_70+ with PTX6.4+.

Differential Revision: https://reviews.llvm.org/D68892

llvm-svn: 374790
2019-10-14 16:53:34 +00:00
Alina Sbirlea 0fa07f4276 [LegacyPassManager] Deprecate the BasicBlockPass/Manager.
Summary:
The BasicBlockManager is potentially broken and should not be used.
Replace all uses of the BasicBlockPass with a FunctionBlockPass+loop on
blocks.

Reviewers: chandlerc

Subscribers: jholewinski, sanjoy.google, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68234

llvm-svn: 373254
2019-09-30 20:17:23 +00:00
Guillaume Chatelet ab11b9188d [Alignment][NFC] Remove AllocaInst::setAlignment(unsigned)
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: jholewinski, arsenm, jvesely, nhaehnle, eraman, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D68141

llvm-svn: 373207
2019-09-30 13:34:44 +00:00
Thomas Raoux 3c8c667235 [TargetLowering] Make allowsMemoryAccess methode virtual.
Rename old function to explicitly show that it cares only about alignment.
The new allowsMemoryAccess call the function related to alignment by default
and can be overridden by target to inform whether the memory access is legal or
not.

Differential Revision: https://reviews.llvm.org/D67121

llvm-svn: 372935
2019-09-26 00:16:01 +00:00
Guillaume Chatelet 1ae7905fc8 [Alignment][NFC] DataLayout migration to llvm::Align
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: jholewinski, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67914

llvm-svn: 372596
2019-09-23 12:41:36 +00:00
Simon Pilgrim a75b947014 [NVPTX] NVPTXLowerAggrCopies - Silence static analyzer dyn_cast<StoreInst> null dereference warning. NFCI.
llvm-svn: 372504
2019-09-22 13:14:30 +00:00
Simon Pilgrim 5a5f04afcb NVPTXAsmPrinter - Don't dereference a dyn_cast result. NFCI.
llvm-svn: 372166
2019-09-17 19:16:00 +00:00
Graham Hunter 1a9195d817 [SVE][MVT] Fixed-length vector MVT ranges
* Reordered MVT simple types to group scalable vector types
    together.
  * New range functions in MachineValueType.h to only iterate over
    the fixed-length int/fp vector types.
  * Stopped backends which don't support scalable vector types from
    iterating over scalable types.

Reviewers: sdesmalen, greened

Reviewed By: greened

Differential Revision: https://reviews.llvm.org/D66339

llvm-svn: 372099
2019-09-17 10:19:23 +00:00
Matt Arsenault b366329a34 DAG/GlobalISel: Correct type profile of bitcount ops
The result integer does not need to be the same width as the input.
AMDGPU, NVPTX, and Hexagon all have patterns working around the types
matching. GlobalISel defines these as being different type indexes.

llvm-svn: 371797
2019-09-13 00:11:14 +00:00
Guillaume Chatelet 1c18a9cb9e [LLVM][Alignment] Introduce Alignment In MachineFrameInfo
Summary:
This is patch is part of a serie to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: jfb

Subscribers: hiraditya, dexonsmith, llvm-commits, courbet

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65800

llvm-svn: 369531
2019-08-21 14:29:30 +00:00
Karl-Johan Karlsson 40da6be2bd [AsmPrinter] Remove const qualifier from EmitBasicBlockStart.
Overriders may want to modify state in it. AMDGPU wants
to, but has to make its members mutable in order to do so.

Besides, EmitBasicBlockEnd is not const, so why should
Start be?

Patch by Bevin Hansson.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D66341

llvm-svn: 369325
2019-08-20 05:13:57 +00:00
Daniel Sanders 0c47611131 Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM
Summary:
This clang-tidy check is looking for unsigned integer variables whose initializer
starts with an implicit cast from llvm::Register and changes the type of the
variable to llvm::Register (dropping the llvm:: where possible).

Partial reverts in:
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned&
MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register
PPCFastISel.cpp - No Register::operator-=()
PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned&
MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor

Manual fixups in:
ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned&
HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register
HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register.
PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned&

Depends on D65919

Reviewers: arsenm, bogner, craig.topper, RKSimon

Reviewed By: arsenm

Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65962

llvm-svn: 369041
2019-08-15 19:22:08 +00:00
Jonas Devlieghere 0eaee545ee [llvm] Migrate llvm::make_unique to std::make_unique
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.

llvm-svn: 369013
2019-08-15 15:54:37 +00:00
Benjamin Kramer 3d5360a439 Replace llvm::MutexGuard/UniqueLock with their standard equivalents
All supported platforms have <mutex> now, so we don't need our own
copies any longer. No functionality change intended.

llvm-svn: 368149
2019-08-07 10:57:25 +00:00
Guillaume Chatelet c97a3d15d2 [LLVM][Alignment] Introduce Alignment Type
Summary:
This is patch is part of a serie to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, jfb, jakehehrlich

Reviewed By: jfb

Subscribers: wuzish, jholewinski, arsenm, dschuff, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, s.egerton, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65514

llvm-svn: 367828
2019-08-05 11:02:05 +00:00
Daniel Sanders 2bea69bf65 Finish moving TargetRegisterInfo::isVirtualRegister() and friends to llvm::Register as started by r367614. NFC
llvm-svn: 367633
2019-08-01 23:27:28 +00:00
Michael Liao f3983cc14a [NVPTX] Fix PR41651
Summary:
- Use the passed `DL` directly as retrieving data layout from CS by
  checking the called function is not reliable. Under indirect function
  call, there is no called function.

Subscribers: jholewinski, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65468

llvm-svn: 367349
2019-07-30 19:52:01 +00:00
Bryant Wong 7ba838d29c Test commit. NFC.
Formatting fix.

llvm-svn: 365878
2019-07-12 08:25:59 +00:00
Benjamin Kramer fa1a4e4de5 [NVPTX] Use atomicrmw fadd instead of intrinsics
AutoUpgrade the old intrinsics to atomicrmw fadd.

llvm-svn: 365796
2019-07-11 17:11:25 +00:00
Matt Arsenault e3a676e9ad CodeGen: Introduce a class for registers
Avoids using a plain unsigned for registers throughoug codegen.
Doesn't attempt to change every register use, just something a little
more than the set needed to build after changing the return type of
MachineOperand::getReg().

llvm-svn: 364191
2019-06-24 15:50:29 +00:00
Simon Pilgrim 266f43964e [TargetLowering] Add allowsMemoryAccess(MachineMemOperand) helper wrapper. NFCI.
As suggested by @arsenm on D63075 - this adds a TargetLowering::allowsMemoryAccess wrapper that takes a Load/Store node's MachineMemOperand to handle the AddressSpace/Alignment arguments and will also implicitly handle the MachineMemOperand::Flags change in D63075.

llvm-svn: 363048
2019-06-11 11:00:23 +00:00
Tom Stellard 4b0b26199b Revert CMake: Make most target symbols hidden by default
This reverts r362990 (git commit 374571301d)

This was causing linker warnings on Darwin:

ld: warning: direct access in function 'llvm::initializeEvexToVexInstPassPass(llvm::PassRegistry&)'
from file '../../lib/libLLVMX86CodeGen.a(X86EvexToVex.cpp.o)' to global weak symbol
'void std::__1::__call_once_proxy<std::__1::tuple<void* (&)(llvm::PassRegistry&),
std::__1::reference_wrapper<llvm::PassRegistry>&&> >(void*)' from file '../../lib/libLLVMCore.a(Verifier.cpp.o)'
means the weak symbol cannot be overridden at runtime. This was likely caused by different translation
units being compiled with different visibility settings.

llvm-svn: 363028
2019-06-11 03:21:13 +00:00
Tom Stellard 374571301d CMake: Make most target symbols hidden by default
Summary:
For builds with LLVM_BUILD_LLVM_DYLIB=ON and BUILD_SHARED_LIBS=OFF
this change makes all symbols in the target specific libraries hidden
by default.

A new macro called LLVM_EXTERNAL_VISIBILITY has been added to mark symbols in these
libraries public, which is mainly needed for the definitions of the
LLVMInitialize* functions.

This patch reduces the number of public symbols in libLLVM.so by about
25%.  This should improve load times for the dynamic library and also
make abi checker tools, like abidiff require less memory when analyzing
libLLVM.so

One side-effect of this change is that for builds with
LLVM_BUILD_LLVM_DYLIB=ON and LLVM_LINK_LLVM_DYLIB=ON some unittests that
access symbols that are no longer public will need to be statically linked.

Before and after public symbol counts (using gcc 8.2.1, ld.bfd 2.31.1):
nm before/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l
36221
nm after/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l
26278

Reviewers: chandlerc, beanz, mgorny, rnk, hans

Reviewed By: rnk, hans

Subscribers: Jim, hiraditya, michaelplatings, chapuni, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, kristina, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D54439

llvm-svn: 362990
2019-06-10 22:12:56 +00:00
Dmitri Gribenko 2b369f83c5 Include what you use in NVPTX.h
Other files were not relying on these transitive includes, so I'm
submitting this change separately.

llvm-svn: 362403
2019-06-03 14:37:26 +00:00
Dmitri Gribenko 14c69fefe6 Include what you use in NVPTX.h
I also fixed all other files that were including NVPTX.h and were
relying on transitive includes.

llvm-svn: 362402
2019-06-03 14:26:50 +00:00
Petar Jovanovic e85bbf564d [DebugInfoMetadata] Refactor DIExpression::prepend constants (NFC)
Refactor DIExpression::With* into a flag enum in order to be less
error-prone to use (as discussed on D60866).

Patch by Djordje Todorovic.

Differential Revision: https://reviews.llvm.org/D61943

llvm-svn: 361137
2019-05-20 10:35:57 +00:00
Richard Trieu e8f83befd5 [NVPTX] Create a TargetInfo header. NFC
Move the declarations of getThe<Name>Target() functions into a new header in
TargetInfo and make users of these functions include this new header.
This fixes a layering problem.

llvm-svn: 360729
2019-05-14 23:56:18 +00:00
Richard Trieu 4b620fcf0f [NVPTX] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc.  Merging them together will fix this.  For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.

llvm-svn: 360500
2019-05-11 02:09:13 +00:00
Nick Desaulniers 7ab164c4a4 [AsmPrinter] refactor to support %c w/ GlobalAddress'
Summary:
Targets like ARM, MSP430, PPC, and SystemZ have complex behavior when
printing the address of a MachineOperand::MO_GlobalAddress. Move that
handling into a new overriden method in each base class. A virtual
method was added to the base class for handling the generic case.

Refactors a few subclasses to support the target independent %a, %c, and
%n.

The patch also contains small cleanups for AVRAsmPrinter and
SystemZAsmPrinter.

It seems that NVPTXTargetLowering is possibly missing some logic to
transform GlobalAddressSDNodes for
TargetLowering::LowerAsmOperandForConstraint to handle with "i" extended
inline assembly asm constraints.

Fixes:
- https://bugs.llvm.org/show_bug.cgi?id=41402
- https://github.com/ClangBuiltLinux/linux/issues/449

Reviewers: echristo, void

Reviewed By: void

Subscribers: void, craig.topper, jholewinski, dschuff, jyknight, dylanmckay, sdardis, nemanjai, javed.absar, sbc100, jgravelle-google, eraman, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, jrtc27, atanasyan, jsji, llvm-commits, kees, tpimh, nathanchance, peter.smith, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60887

llvm-svn: 359337
2019-04-26 18:45:04 +00:00
Artem Belevich 5fe85a003f [CUDA] Implemented _[bi]mma* builtins.
These builtins provide access to the new integer and
sub-integer variants of MMA (matrix multiply-accumulate) instructions
provided by CUDA-10.x on sm_75 (AKA Turing) GPUs.

Also added a feature for PTX 6.4. While Clang/LLVM does not generate
any PTX instructions that need it, we still need to pass it through to
ptxas in order to be able to compile code that uses the new 'mma'
instruction as inline assembly (e.g used by NVIDIA's CUTLASS library
https://github.com/NVIDIA/cutlass/blob/master/cutlass/arch/mma.h#L101)

Differential Revision: https://reviews.llvm.org/D60279

llvm-svn: 359248
2019-04-25 22:28:09 +00:00
Artem Belevich 16737538f4 PTX 6.3 extends `wmma` instruction to support s8/u8/s4/u4/b1 -> s32.
All of the new instructions are still handled mostly by tablegen. I've slightly
refactored the code to drive intrinsic/instruction generation from a master
list of supported variants, so all irregularities have to be implemented in one place only.

The test generation script wmma.py has been refactored in a similar way.

Differential Revision: https://reviews.llvm.org/D60015

llvm-svn: 359247
2019-04-25 22:27:57 +00:00
Artem Belevich 8d825b38ed [NVPTX] generate correct MMA instruction mnemonics with PTX63+.
PTX 6.3 requires using ".aligned" in the MMA instruction names.
In order to generate correct name, now we pass current
PTX version to each instruction as an extra constant operand
and InstPrinter adjusts its output accordingly.

Differential Revision: https://reviews.llvm.org/D59393

llvm-svn: 359246
2019-04-25 22:27:46 +00:00
Artem Belevich 7ecd82ce19 [NVPTX] Refactor generation of MMA intrinsics and instructions. NFC.
Generalized constructions of 'fragments' of MMA operations to provide
common primitives for construction of the ops. This will make it easier
to add new variants of the instructions that operate on integer types.

Use nested foreach loops which makes it possible to better control
naming of the intrinsics.

This patch does not affect LLVM's output, so there are no test changes.

Differential Revision: https://reviews.llvm.org/D59389

llvm-svn: 359245
2019-04-25 22:27:35 +00:00
Bjorn Pettersson 71e8c6f20f Add "const" in GetUnderlyingObjects. NFC
Summary:
Both the input Value pointer and the returned Value
pointers in GetUnderlyingObjects are now declared as
const.

It turned out that all current (in-tree) uses of
GetUnderlyingObjects were trivial to update, being
satisfied with have those Value pointers declared
as const. Actually, in the past several of the users
had to use const_cast, just because of ValueTracking
not providing a version of GetUnderlyingObjects with
"const" Value pointers. With this patch we get rid
of those const casts.

Reviewers: hfinkel, materi, jkorous

Reviewed By: jkorous

Subscribers: dexonsmith, jkorous, jholewinski, sdardis, eraman, hiraditya, jrtc27, atanasyan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61038

llvm-svn: 359072
2019-04-24 06:55:50 +00:00
Nick Desaulniers 3271ca01fe [NVPTXAsmPrinter] clean up dead code. NFC
Summary:
The printOperand function takes a default parameter, for which there are
zero call sites that explicitly pass such a parameter.  As such, there
is no case to support. This means that the method
printVecModifiedImmediate is purly dead code, and can be removed.

The eventual goal for some of these AsmPrinter refactoring is to have
printOperand be a virtual method; making it easier to print operands
from the base class for more generic Asm printing. It will help if all
printOperand methods have the same function signature (ie. no Modifier
argument when not needed).

Reviewers: echristo, tra

Reviewed By: echristo

Subscribers: jholewinski, hiraditya, llvm-commits, craig.topper, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60727

llvm-svn: 358527
2019-04-16 21:04:34 +00:00
Nick Desaulniers 5277b3ff25 [AsmPrinter] refactor to remove remove AsmVariant. NFC
Summary:
The InlineAsm::AsmDialect is only required for X86; no architecture
makes use of it and as such it gets passed around between arch-specific
and general code while being unused for all architectures but X86.

Since the AsmDialect is queried from a MachineInstr, which we also pass
around, remove the additional AsmDialect parameter and query for it deep
in the X86AsmPrinter only when needed/as late as possible.

This refactor should help later planned refactors to AsmPrinter, as this
difference in the X86AsmPrinter makes it harder to make AsmPrinter more
generic.

Reviewers: craig.topper

Subscribers: jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, llvm-commits, peter.smith, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60488

llvm-svn: 358101
2019-04-10 16:38:43 +00:00
Bixia Zheng 6c21ccd245 [NVPTX] Fix the codegen for llvm.round.
Summary:
Previously, we translate llvm.round to PTX cvt.rni, which rounds to the
even interger when the source is equidistant between two integers. This
is not correct as llvm.round should round away from zero. This change
replaces llvm.round with a round away from zero implementation through
target specific custom lowering.

Modify a few affected tests to not check for cvt.rni. Instead, we check
for the use of a few constants used in implementing round. We are also
adding CUDA runnable tests to check for the values produced by
llvm.round to test-suites/External/CUDA.

Reviewers: tra

Subscribers: jholewinski, sanjoy, jlebar, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59947

llvm-svn: 357407
2019-04-01 16:10:26 +00:00
Matt Arsenault cf55a657f0 CodeGen: Refactor regallocator command line and target selection
This will allow targets more flexibility to replace the
register allocator core passes. In a future commit,
AMDGPU will run the core register assignment passes
twice, and will also want to disallow using the
standard -regalloc option.

llvm-svn: 356506
2019-03-19 19:33:12 +00:00
Alexey Bataev a8b3eb46b5 [NVPTX][DEBUGINFO]Temp workaround for crash of ptxas: disable packed bytes in debug sections.
Summary:
This patch works around the bug in the ptxas tool with the processing of bytes
separated by the comma symbol. The emission of the packed string is
temporarily disabled.

Reviewers: tra

Subscribers: jholewinski, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59148

llvm-svn: 355740
2019-03-08 21:29:17 +00:00
Alexey Bataev 78fcb8381f [DEBUG_INFO][NVPTX]Emit empty .debug_loc section in presence of the debug option.
Summary:
If the LLVM module shows that it has debug info, but the file is
actually empty and the real debug info is not emitted, the ptxas tool
emits error 'Debug information not found in presence of .target debug'.
We need at leas one empty debug section to silence this message. Section
`.debug_loc` is not emitted for PTX and we can emit empty `.debug_loc`
section if `debug` option was emitted.

Reviewers: tra

Subscribers: jholewinski, aprantl, llvm-commits

Differential Revision: https://reviews.llvm.org/D57250

llvm-svn: 355719
2019-03-08 20:08:04 +00:00
Jordan Rupprecht 6387fa2715 [NFC] Fix typos: preceeding -> preceding
llvm-svn: 354715
2019-02-23 01:28:32 +00:00
James Y Knight 14359ef1b6 [opaque pointer types] Pass value type to LoadInst creation.
This cleans up all LoadInst creation in LLVM to explicitly pass the
value type rather than deriving it from the pointer's element-type.

Differential Revision: https://reviews.llvm.org/D57172

llvm-svn: 352911
2019-02-01 20:44:24 +00:00
Alexey Bataev 897129dc3f [DEBUGINFO, NVPTX] Enable support for the debug info on NVPTX target.
Enable full support for the debug info.

Differential revision: https://reviews.llvm.org/D46189

llvm-svn: 351974
2019-01-23 18:59:54 +00:00
Alexey Bataev 25624e2e5b Revert "[DEBUGINFO, NVPTX] Enable support for the debug info on NVPTX target."
This reverts commit r351972. Some pieces of the patch was not applied
correctly.

llvm-svn: 351973
2019-01-23 18:48:36 +00:00
Alexey Bataev fe0b356063 [DEBUGINFO, NVPTX] Enable support for the debug info on NVPTX target.
Enable full support for the debug info. Recommit to fix the emission of
the not required closing brace.

Differential revision: https://reviews.llvm.org/D46189

llvm-svn: 351972
2019-01-23 18:28:59 +00:00
Haojian Wu 15a77418a9 Revert "[DEBUGINFO, NVPTX] Enable support for the debug info on NVPTX target."
This reverts commit r351846.

This patch may generate illegal assembly code, see

```
$ ./bin/clang -cc1 -triple nvptx64-nvidia-cuda -aux-triple x86_64-grtev4-linux-gnu -S -disable-free -disable-llvm-verifier -discard-value-names -main-file-name new.cc -mrelocation-model pic -pic-level 2 -mthread-model posix -fmerge-all-constants -mdisable-fp-elim -relaxed-aliasing -no-integrated-as -mpie-copy-relocations -munwind-tables -fcuda-is-device -target-feature +ptx60 -target-cpu sm_35 -dwarf-column-info -debug-info-kind=line-directives-only -dwarf-version=2 -debugger-tuning=gdb -o empty.s -x cuda empty.cc
$  cat empty.s
//
// Generated by LLVM NVPTX Back-End
//

.version 6.0
.target sm_35
.address_size 64

	}
```

llvm-svn: 351966
2019-01-23 16:39:57 +00:00
Alexey Bataev 4e9db1beff [DEBUGINFO, NVPTX] Enable support for the debug info on NVPTX target.
Summary: Enable full support for the debug info.

Reviewers: echristo

Subscribers: jholewinski, aprantl, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D46189

llvm-svn: 351846
2019-01-22 17:43:37 +00:00
Alexey Bataev 9d5974a9fc [DEBUG_INFO, NVPTX] Fix relocation info.
Summary: Initial function labels must follow the debug location for the correct relocation info generation.

Reviewers: tra, jlebar, echristo

Subscribers: jholewinski, llvm-commits

Differential Revision: https://reviews.llvm.org/D45784

llvm-svn: 351843
2019-01-22 17:24:16 +00:00
Chandler Carruth 57b08b0944 Update more file headers across all of the LLVM projects in the monorepo
to reflect the new license. These used slightly different spellings that
defeated my regular expressions.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351648
2019-01-19 10:56:40 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Justin Lebar 49fac56ea3 [NVPTX] Allow libcalls that are defined in the current module.
The patch adds a possibility to make library calls on NVPTX.

An important thing about library functions - they must be defined within
the current module. This basically should guarantee that we produce a
valid PTX assembly (without calls to not defined functions). The one who
wants to use the libcalls is probably will have to link against
compiler-rt or any other implementation.

Currently, it's completely impossible to make library calls because of
error LLVM ERROR: Cannot select: i32 = ExternalSymbol '...'. But we can
lower ExternalSymbol to TargetExternalSymbol and verify if the function
definition is available.

Also, there was an issue with a DAG during legalisation. When we expand
instruction into libcall, the inner call-chain isn't being "integrated"
into outer chain. Since the last "data-flow" (call retval load) node is
located in call-chain earlier than CALLSEQ_END node, the latter becomes
a leaf and therefore a dead node (and is being removed quite fast).
Proposed here solution relies on another data-flow pseudo nodes
(ProxyReg) which purpose is only to keep CALLSEQ_END at legalisation and
instruction selection phases - we remove the pseudo instructions before
register scheduling phase.

Patch by Denys Zariaiev!

Differential Revision: https://reviews.llvm.org/D34708

llvm-svn: 350069
2018-12-26 19:12:31 +00:00
Justin Lebar 7f41fe3a58 [NVPTX] Reduce stack size in NVPTXAsmPrinter::doInitialization().
NVPTXAsmPrinter::doInitialization() was creating an NVPTXSubtarget on
the stack.  This object is huge, about 80kb.  Also it's slow to create.
And it's all redundant; we have one in NVPTXTargetMachine anyway!

llvm-svn: 349982
2018-12-22 01:30:37 +00:00
Artem Belevich 6d74bd638a [NVPTX] Lower instructions that expand into libcalls.
The change is an effort to split and refactor abandoned
D34708 into smaller parts.

Here the behaviour of unsupported instructions is changed
to match the behaviour of explicit intrinsics calls.
Currently LLVM crashes with:
> Assertion getInstruction() && "Not a call or invoke instruction!" failed.

With this patch LLVM produces a more sensible error message:
> Cannot select: ... i32 = ExternalSymbol'__foobar'

Author: Denys Zariaiev <denys.zariaiev@gmail.com>

Differential Revision: https://reviews.llvm.org/D55145

llvm-svn: 349213
2018-12-14 23:53:06 +00:00
Artem Belevich f802b9324a [NVPTX] do not rely on cached subtarget info.
If a module has function references, but no functions
themselves, we may end up never calling runOnMachineFunction
and therefore would never initialize nvptxSubtarget field
which would eventually cause a crash.

Instead of relying on nvptxSubtarget being initialized by
one of the methods, retrieve subtarget info directly.

Differential Revision: https://reviews.llvm.org/D55580

llvm-svn: 348952
2018-12-12 18:31:04 +00:00
David Green ca29c271d2 [Targets] Add errors for tiny and kernel codemodel on targets that don't support them
Adds fatal errors for any target that does not support the Tiny or Kernel
codemodels by rejigging the getEffectiveCodeModel calls.

Differential Revision: https://reviews.llvm.org/D50141

llvm-svn: 348585
2018-12-07 12:10:23 +00:00
Alexey Bataev 2e1a782189 [DEBUGINFO, NVPTX] Disable emission of ',debug' option if only debug directives are allowed.
Summary:
If the output of debug directives only is requested, we should drop
emission of ',debug' option from the target directive. Required for
supporting of nvprof profiler.

Reviewers: echristo

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D46061

llvm-svn: 348497
2018-12-06 16:25:35 +00:00
Alexey Bataev 64ad0ad5ed [DEBUGINFO, NVPTX]Emit last debugging directives.
Summary:
We may end up with not emitted debug directives at the end of the module
emission. Patch fixes this problem emitting those last directives the
end of the module emission.

Reviewers: echristo

Subscribers: jholewinski, llvm-commits

Differential Revision: https://reviews.llvm.org/D54320

llvm-svn: 348495
2018-12-06 16:02:09 +00:00
Artem Belevich e5664b1559 [NVPTX] Add lowering of i128 numbers as struct fields
Addition to D34555 - override VTs computation with ComputePTXValueVTs
for struct fields.

Author: Denys Zariaiev<denys.zariaiev@gmail.com>

Differential Revision: https://reviews.llvm.org/D55144

llvm-svn: 348057
2018-12-01 00:21:52 +00:00
Simon Pilgrim 66f42ea6e1 [SelectionDAG] Move (repeated) SDTIntShiftDOp double shift node def to common code. NFCI.
Prep work for PR39467.

llvm-svn: 347067
2018-11-16 17:50:59 +00:00
Alexey Bataev 93d018a916 Revert "[DEBUGINFO, NVPTX]DO not emit ',debug' option if no debug info or only debug directives are requested."
This reverts commit r345972. Need to update the description + possibly
to update the patch itself after discussion with Eric Christofer.

llvm-svn: 346508
2018-11-09 16:22:35 +00:00
Matthias Braun 7a75a91b5b MachineFunction: Store more specific reference to LLVMTargetMachine; NFC
MachineFunction can only be used in code using lib/CodeGen, hence we
can keep a more specific reference to LLVMTargetMachine rather than just
TargetMachine around.

Do the same for references in ScheduleDAG and RegUsageInfoCollector.

llvm-svn: 346183
2018-11-05 23:49:14 +00:00
Craig Topper 0b5f8169b0 [TargetLowering] Change TargetLoweringBase::getPreferredVectorAction to take an MVT instead of an EVT. NFC
The main caller of this already has an MVT and several targets called getSimpleVT inside without checking isSimple. This makes the simpleness explicit.

llvm-svn: 346180
2018-11-05 23:26:13 +00:00
Alexey Bataev 8831ef7a16 [DEBUGINFO, NVPTX]DO not emit ',debug' option if no debug info or only debug directives are requested.
Summary:
If the output of debug directives only is requested, we should drop
emission of ',debug' option from the target directive. Required for
supporting of nvprof profiler.

Reviewers: probinson, echristo, dblaikie

Subscribers: Hahnfeld, jholewinski, llvm-commits, JDevlieghere, aprantl

Differential Revision: https://reviews.llvm.org/D46061

llvm-svn: 345972
2018-11-02 13:47:47 +00:00
Alexey Bataev 0f2fe4f135 [DEBUG_INFO][NVPTX]Fix processing of DBG_VALUES.
Summary:
If the instruction in the eliminateFrameIndex function is a DBG_VALUE
instruction, it requires special processing. The frame register is set
to VRFrame and the offset is based on the object offset.
The code is similar to the code used in
lib/CodeGen/PrologEpilogInserter.cpp.

Reviewers: tra

Subscribers: jholewinski, llvm-commits

Differential Revision: https://reviews.llvm.org/D53657

llvm-svn: 345269
2018-10-25 14:27:27 +00:00
Thomas Lively 30f1d69115 [NFC] Rename minnan and maxnan to minimum and maximum
Summary:
Changes all uses of minnan/maxnan to minimum/maximum
globally. These names emphasize that the semantic difference between
these operations is more than just NaN-propagation.

Reviewers: arsenm, aheejin, dschuff, javed.absar

Subscribers: jholewinski, sdardis, wdng, sbc100, jgravelle-google, jrtc27, atanasyan, llvm-commits

Differential Revision: https://reviews.llvm.org/D53112

llvm-svn: 345218
2018-10-24 22:49:55 +00:00
Alexey Bataev c15c853c3a [DEBUGINFO, NVPTX] Try to pack bytes data into a single string.
Summary:
If the target does not support `.asciz` and `.ascii` directives, the
strings are represented as bytes and each byte is placed on the new line
as a separate byte directive `.b8 <data>`. NVPTX target allows to
represent the vector of the data of the same type as a vector, where
values are separated using `,` symbol: `.b8 <data1>,<data2>,...`. This
allows to reduce the size of the final PTX file. Ptxas tool includes ptx
files into the resulting binary object, so reducing the size of the PTX
file is important.

Reviewers: tra, jlebar, echristo

Subscribers: jholewinski, llvm-commits

Differential Revision: https://reviews.llvm.org/D45822

llvm-svn: 345142
2018-10-24 14:04:00 +00:00
Chandler Carruth edb12a838a [TI removal] Make variables declared as `TerminatorInst` and initialized
by `getTerminator()` calls instead be declared as `Instruction`.

This is the biggest remaining chunk of the usage of `getTerminator()`
that insists on the narrow type and so is an easy batch of updates.
Several files saw more extensive updates where this would cascade to
requiring API updates within the file to use `Instruction` instead of
`TerminatorInst`. All of these were trivial in nature (pervasively using
`Instruction` instead just worked).

llvm-svn: 344502
2018-10-15 10:04:59 +00:00
Artem Belevich 44ecb0e3c2 [CUDA] Added basic support for compiling with CUDA-10.0
llvm-svn: 342924
2018-09-24 23:10:44 +00:00
Benjamin Kramer 759e7d9819 [NVPTX] Implement isLegalToVectorizeLoadChain
This lets LSV nicely split up underaligned chains.

Differential Revision: https://reviews.llvm.org/D51306

llvm-svn: 340760
2018-08-27 17:29:43 +00:00
Benjamin Kramer d66dde5a98 [NVPTX] Remove ftz variants of cvt with rounding mode
These do not exist in ptxas, it refuses to compile them.

Differential Revision: https://reviews.llvm.org/D51042

llvm-svn: 340317
2018-08-21 18:44:25 +00:00
Chandler Carruth 66654b72c9 [SDAG] Remove the reliance on MI's allocation strategy for
`MachineMemOperand` pointers attached to `MachineSDNodes` and instead
have the `SelectionDAG` fully manage the memory for this array.

Prior to this change, the memory management was deeply confusing here --
The way the MI was built relied on the `SelectionDAG` allocating memory
for these arrays of pointers using the `MachineFunction`'s allocator so
that the raw pointer to the array could be blindly copied into an
eventual `MachineInstr`. This creates a hard coupling between how
`MachineInstr`s allocate their array of `MachineMemOperand` pointers and
how the `MachineSDNode` does.

This change is motivated in large part by a change I am making to how
`MachineFunction` allocates these pointers, but it seems like a layering
improvement as well.

This would run the risk of increasing allocations overall, but I've
implemented an optimization that should avoid that by storing a single
`MachineMemOperand` pointer directly instead of allocating anything.
This is expected to be a net win because the vast majority of uses of
these only need a single pointer.

As a side-effect, this makes the API for updating a `MachineSDNode` and
a `MachineInstr` reasonably different which seems nice to avoid
unexpected coupling of these two layers. We can map between them, but we
shouldn't be *surprised* at where that occurs. =]

Differential Revision: https://reviews.llvm.org/D50680

llvm-svn: 339740
2018-08-14 23:30:32 +00:00
Jonas Hahnfeld 20526bf483 [NVPTX] Select atomic loads and stores
According to PTX ISA .volatile has the same memory synchronization
semantics as .relaxed.sys, so it can be used to implement monotonic
atomic loads and stores. This is important for OpenMP's atomic
construct where
 - 'read's and 'write's are lowered to atomic loads and stores, and
 - an update of float or double types are lowered into a cmpxchg loop.
(Note that PTX could do better because it has atom.add.f{32,64} but
LLVM's atomicrmw instruction only allows integer types.)

Higher levels of atomicity (like acquire and release) need additional
synchronization properties which were added with PTX ISA 6.0 / sm_70.
So using these instructions still results in an error.

Differential Revision: https://reviews.llvm.org/D50391

llvm-svn: 339316
2018-08-09 07:45:49 +00:00
Artem Belevich 0a11b6366a [NVPTX] Handle __nvvm_reflect("__CUDA_ARCH").
Summary:
libdevice in recent CUDA versions relies on __nvvm_reflect() to select
GPU-specific bitcode. This patch addresses the requirement.

Reviewers: jlebar

Subscribers: jholewinski, sanjoy, hiraditya, bixia, llvm-commits

Differential Revision: https://reviews.llvm.org/D50207

llvm-svn: 338908
2018-08-03 18:05:24 +00:00
Fangrui Song f78650a8de Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}

llvm-svn: 338293
2018-07-30 19:41:25 +00:00
Alexey Bataev 4dd7558fab [DEBUGINFO, NVPTX] Emit correct debug information for local variables.
Summary:
NVPTX target dos not use register-based frame information. Instead it
relies on the artificial local_depot that is used instead of the frame
and the data for variables must be emitted relatively to this
local_depot.

Reviewers: tra, jlebar, echristo

Subscribers: jholewinski, aprantl, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D45963

llvm-svn: 338039
2018-07-26 16:29:52 +00:00
Ulrich Weigand c48aefb63b [TableGen] Support multi-alternative pattern fragments
A TableGen instruction record usually contains a DAG pattern that will
describe the SelectionDAG operation that can be implemented by this
instruction. However, there will be cases where several different DAG
patterns can all be implemented by the same instruction. The way to
represent this today is to write additional patterns in the Pattern
(or usually Pat) class that map those extra DAG patterns to the
instruction. This usually also works fine.

However, I've noticed cases where the current setup seems to require
quite a bit of extra (and duplicated) text in the target .td files.
For example, in the SystemZ back-end, there are quite a number of
instructions that can implement an "add-with-overflow" operation.
The same instructions also need to be used to implement just plain
addition (simply ignoring the extra overflow output). The current
solution requires creating extra Pat pattern for every instruction,
duplicating the information about which particular add operands
map best to which particular instruction.

This patch enhances TableGen to support a new PatFrags class, which
can be used to encapsulate multiple alternative patterns that may
all match to the same instruction.  It operates the same way as the
existing PatFrag class, except that it accepts a list of DAG patterns
to match instead of just a single one.  As an example, we can now define
a PatFrags to match either an "add-with-overflow" or a regular add
operation:

  def z_sadd : PatFrags<(ops node:$src1, node:$src2),
                        [(z_saddo node:$src1, node:$src2),
                         (add node:$src1, node:$src2)]>;

and then use this in the add instruction pattern:

  defm AR : BinaryRRAndK<"ar", 0x1A, 0xB9F8, z_sadd, GR32, GR32>;

These SystemZ target changes are implemented here as well.


Note that PatFrag is now defined as a subclass of PatFrags, which
means that some users of internals of PatFrag need to be updated.
(E.g. instead of using PatFrag.Fragment you now need to use
!head(PatFrag.Fragments).)


The implementation is based on the following main ideas:
- InlinePatternFragments may now replace each original pattern
  with several result patterns, not just one.
- parseInstructionPattern delays calling InlinePatternFragments
  and InferAllTypes.  Instead, it extracts a single DAG match
  pattern from the main instruction pattern.
- Processing of the DAG match pattern part of the main instruction
  pattern now shares most code with processing match patterns from
  the Pattern class.
- Direct use of main instruction patterns in InferFromPattern and
  EmitResultInstructionAsOperand is removed; everything now operates
  solely on DAG match patterns.


Reviewed by: hfinkel

Differential Revision: https://reviews.llvm.org/D48545

llvm-svn: 336999
2018-07-13 13:18:00 +00:00
Vedant Kumar b3091da3af Use Type::isIntOrPtrTy where possible, NFC
It's a bit neater to write T.isIntOrPtrTy() over `T.isIntegerTy() ||
T.isPointerTy()`.

I used Python's re.sub with this regex to update users:

  r'([\w.\->()]+)isIntegerTy\(\)\s*\|\|\s*\1isPointerTy\(\)'

llvm-svn: 336462
2018-07-06 20:17:42 +00:00
Benjamin Kramer 2f0dd1405e [NVPTX] Expand v2f16 INSERT_VECTOR_ELT
Vectorization can create them.

llvm-svn: 336227
2018-07-03 20:40:04 +00:00
Benjamin Kramer 23d8282047 [NVPTX] Delete dead code
No functionality change.

llvm-svn: 335913
2018-06-28 20:05:35 +00:00
Jonas Hahnfeld c7410ed47a [NVPTX] Ignore target-cpu and -features for inlining
We don't want to prevent inlining because of target-cpu and -features
attributes that were added to newer versions of LLVM/Clang: There are
no incompatible functions in PTX, ptxas will throw errors in such cases.

Differential Revision: https://reviews.llvm.org/D47691

llvm-svn: 334904
2018-06-17 09:55:20 +00:00
Benjamin Kramer f663eba561 [NVPTX] Delete dead code from the AsmPrinter.
llvm-svn: 333924
2018-06-04 16:12:33 +00:00
Amaury Sechet 8467411dad Set ADDE/ADDC/SUBE/SUBC to expand by default
Summary:
They've been deprecated in favor of UADDO/ADDCARRY or USUBO/SUBCARRY for a while.

Target that uses these opcodes are changed in order to ensure their behavior doesn't change.

Reviewers: efriedma, craig.topper, dblaikie, bkramer

Subscribers: jholewinski, arsenm, jyknight, sdardis, nemanjai, nhaehnle, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, llvm-commits

Differential Revision: https://reviews.llvm.org/D47422

llvm-svn: 333748
2018-06-01 13:21:33 +00:00
Eric Christopher 68f2218e1e Revert "Temporarily revert "[DEBUG] Initial adaptation of NVPTX target for debug info emission.""
This reapplies commits: r330271, r330592, r330779.

    [DEBUG] Initial adaptation of NVPTX target for debug info emission.

    Summary:
    Patch adds initial emission of the debug info for NVPTX target.
    Currently, only .file and .loc directives are emitted, everything else is
    commented out to not break the compilation of Cuda.

llvm-svn: 332689
2018-05-18 03:13:08 +00:00
Nicola Zaghen d34e60ca85 Rename DEBUG macro to LLVM_DEBUG.
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.

In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.

Differential Revision: https://reviews.llvm.org/D43624

llvm-svn: 332240
2018-05-14 12:53:11 +00:00
Artem Belevich 2f348ea1c7 [NVPTX] Added a feature to use short pointers for const/local/shared AS.
Const/local/shared address spaces are all < 4GB and we can always use
32-bit pointers to access them. This has substantial performance impact
on kernels that uses shared memory for intermediary results.

The feature is disabled by default.

Differential Revision: https://reviews.llvm.org/D46147

llvm-svn: 331941
2018-05-09 23:46:19 +00:00
Adrian Prantl 5f8f34e459 Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

  for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46290

llvm-svn: 331272
2018-05-01 15:54:18 +00:00
Eric Christopher 8024425801 Temporarily revert "[DEBUG] Initial adaptation of NVPTX target for debug info emission."
This appears to have some issues associated with the file directive output
causing multiple global symbols with the name "file" to be emitted into a
startup section. I'm investigating more specific causes and working with the
original author.

This reverts commit r330271.

Also Revert "[DEBUGINFO, NVPTX] Add the test for the debug info of the local"

This reverts commit r330592 and the follow up of 330779 as the testcase is dependent upon r330271.

llvm-svn: 331237
2018-05-01 00:10:13 +00:00
Benjamin Kramer 733c7fc55d [NVPTX] Turn on Loop/SLP vectorization
Since PTX has grown a <2 x half> datatype vectorization has become more
important. The late LoadStoreVectorizer intentionally only does loads
and stores, but now arithmetic has to be vectorized for optimal
throughput too.

This is still very limited, SLP vectorization happily creates <2 x half>
if it's a legal type but there's still a lot of register moving
happening to get that fed into a vectorized store. Overall it's a small
performance win by reducing the amount of arithmetic instructions.

I haven't really checked what the loop vectorizer does to PTX code, the
cost model there might need some more tweaks. I didn't see it causing
harm though.

Differential Revision: https://reviews.llvm.org/D46130

llvm-svn: 331035
2018-04-27 13:36:05 +00:00
Benjamin Kramer 7dd437710e [NVPTX] Make the legalizer expand shufflevector of <2 x half>
There's no direct instruction for this, but it's trivially implemented
with two movs. Without this the code generator just dies when
encountering a shufflevector.

Differential Revision: https://reviews.llvm.org/D46116

llvm-svn: 330948
2018-04-26 15:26:29 +00:00
Benjamin Kramer bd89647229 [NVPTX] Deduplicate code. No functionality change.
llvm-svn: 330933
2018-04-26 12:30:16 +00:00
Nico Weber 5d53aed419 Consistently sort add_subdirectory calls in lib/Target/*/CMakeLists.txt
llvm-svn: 330584
2018-04-23 12:49:34 +00:00
Artem Belevich 0ae8590354 [NVPTX, CUDA] Added support for m8n32k16 and m32n8k16 variants of wmma instructions.
The new instructions were added added for sm_70+ GPUs in CUDA-9.1.

Differential Revision: https://reviews.llvm.org/D45068

llvm-svn: 330296
2018-04-18 21:51:48 +00:00
Alexey Bataev 242706b8d1 [DEBUG] Initial adaptation of NVPTX target for debug info emission.
Summary:
Patch adds initial emission of the debug info for NVPTX target.
Currently, only .file and .loc directives are emitted, everything else is
commented out to not break the compilation of Cuda.

Reviewers: echristo, jlebar, tra, jholewinski

Subscribers: mgorny, aprantl, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D41827

llvm-svn: 330271
2018-04-18 16:13:41 +00:00
Artem Belevich 2f8efcf3ca [NVPTX] Removed 'satom' feature which is no longer used.
Differential Revision: https://reviews.llvm.org/D45061

llvm-svn: 329830
2018-04-11 17:51:33 +00:00
Artem Belevich 24e8a680e5 [NVPTX, CUDA] Improved feature constraints on NVPTX target builtins.
When NVPTX TARGET_BUILTIN specifies sm_XX or ptxYY as required feature,
consider those features available if we're compiling for GPU >= sm_XX or have
enabled PTX version >= ptxYY.

Differential Revision: https://reviews.llvm.org/D45061

llvm-svn: 329829
2018-04-11 17:51:19 +00:00
Artem Belevich f256decdc4 [NVPTX] add support for initializing fp16 arrays.
Previously HalfTy was not handled which would either trigger an assertion,
or result in array initialized with garbage.

Differential Revision: https://reviews.llvm.org/D45391

llvm-svn: 329463
2018-04-06 22:25:08 +00:00
Artem Belevich a28e598ebb [NVPTX] Fixed vectorized LDG for f16.
v2f16 is a special case in NVPTX. v4f16 may be loaded as a pair of v2f16
and that was not previously handled correctly by tryLDGLDU()

Differential Revision: https://reviews.llvm.org/D45339

llvm-svn: 329456
2018-04-06 21:10:24 +00:00
Nico Weber 1cbd096914 Sort targetgen calls in lib/Target/*/CMakeLists.
Makes it easier to see mistakes such as the one fixed in r329178 and makes
the different target CMakeLists more consistent.

Also remove some stale-looking comments from the Nios2 target cmakefile.

No intended behavior change.

llvm-svn: 329181
2018-04-04 12:37:44 +00:00
Jun Bum Lim 7ab1b32b5e [CodeGen]Add NoVRegs property on PostRASink and ShrinkWrap
Summary:
This change declare that PostRAMachineSinking and ShrinkWrap require NoVRegs
property, so now the MachineFunctionPass can enforce this check.
These passes are disabled in NVPTX & WebAssembly.

Reviewers: dschuff, jlebar, tra, jgravelle-google, MatzeB, sebpop, thegameg, mcrosier

Reviewed By: dschuff, thegameg

Subscribers: jholewinski, jfb, sbc100, aheejin, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D45183

llvm-svn: 329095
2018-04-03 18:17:34 +00:00
Tim Shen 8f9f026965 [NVPTX] Enable StructuredCFG for NVPTX
Summary:
Make NVPTX require structured CFG. Added a temporary flag to
"roll back" the behavior for easy deployment.

Combined with D45008, this fixes several internal Nvidia GPU test
failures that we suspect to be ptxas miscompiles (PR27738).

Reviewers: jlebar

Subscribers: jholewinski, sanjoy, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D45070

llvm-svn: 328885
2018-03-30 17:51:03 +00:00
Craig Topper 2fa1436206 [IR][CodeGen] Remove dependency on EVT from IR/Function.cpp. Move EVT to CodeGen layer.
Currently EVT is in the IR layer only because of Function.cpp needing a very small piece of the functionality of EVT::getEVTString(). The rest of EVT is used in codegen making CodeGen a better place for it.

The previous code converted a Type* to EVT and then called getEVTString. This was only expected to handle the primitive types from Type*. Since there only a few primitive types, we can just print them as strings directly.

Differential Revision: https://reviews.llvm.org/D45017

llvm-svn: 328806
2018-03-29 17:21:10 +00:00
David Blaikie 36a0f226b1 Fix layering by moving ValueTypes.h from CodeGen to IR
ValueTypes.h is implemented in IR already.

llvm-svn: 328397
2018-03-23 23:58:31 +00:00
David Blaikie 13e77db2df Fix layering of MachineValueType.h by moving it from CodeGen to Support
This is used by llvm tblgen as well as by LLVM Targets, so the only
common place is Support for now. (maybe we need another target for these
sorts of things - but for now I'm at least making them correct & we can
make them better if/when people have strong feelings)

llvm-svn: 328395
2018-03-23 23:58:25 +00:00
David Blaikie 6054e650ff Move TargetLoweringObjectFile from CodeGen to Target to fix layering
It's implemented in Target & include from other Target headers, so the
header should be in Target.

llvm-svn: 328392
2018-03-23 23:58:19 +00:00
Artem Belevich 30512869ff [NVPTX] Make tensor shape part of WMMA intrinsic's name.
This is needed for the upcoming implementation of the
new 8x32x16 and 32x8x16 variants of WMMA instructions
introduced in CUDA 9.1.

Differential Revision: https://reviews.llvm.org/D44719

llvm-svn: 328158
2018-03-21 21:55:02 +00:00
Artem Belevich 914d4babec [NVPTX] Make tensor load/store intrinsics overloaded.
This way we can support address-space specific variants without explicitly
encoding the space in the name of the intrinsic. Less intrinsics to deal with ->
less boilerplate.

Added a bit of tablegen magic to match/replace an intrinsics with a pointer
argument in particular address space with the space-specific instruction
variant.

Updated tests to use non-default address spaces.

Differential Revision: https://reviews.llvm.org/D43268

llvm-svn: 328006
2018-03-20 17:18:59 +00:00
Artem Belevich 7b14e7f041 [NVPTX] TblGen-ized lowering of WMMA intrinsics.
NFC.

Differential Revision: https://reviews.llvm.org/D43151

llvm-svn: 327672
2018-03-15 21:40:56 +00:00
Artem Belevich 8c9749b1dc [NVPTX] use pattern matching to lower int_nvvm_match_all_sync*.
Now that patterns can handle intrinsics returning multiple results,
use tablegen'ed pattern matching instead of custom lowering.

Differential Revision: https://reviews.llvm.org/D43890

llvm-svn: 326457
2018-03-01 18:28:45 +00:00
Justin Lebar faaf2d298e [NVPTX] Lower loads from global constants using ld.global.nc (aka LDG).
Summary:
After D43914, loads from global variables in addrspace(1) happen with
ld.global.  But since they're constants, even better would be to use
ld.global.nc, aka ldg.

Reviewers: tra

Subscribers: jholewinski, sanjoy, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D43915

llvm-svn: 326390
2018-02-28 23:58:05 +00:00