Commit Graph

946 Commits

Author SHA1 Message Date
Matthias Braun ca8210a952 X86InstrInfo: No need for liveness analysis in classifyLEAReg()
classifyLEAReg() deals with switching operands from 32bit to 64bit in
order to use a LEA64_32 instruction (for three address code goodness).
It currently performs a liveness analysis to determine the kill/undef
flag for the newly added operand. This should not be necessary:

- If the previous operand had a kill flag, then the 32bit part of the
  register gets killed, this will kill the super register as well.
- If the previous operand had an undef flag then we didn't care what
  value we read, just use the same flag on the new operand.
  (No matter what an operand with an undef flag won't affect liveness)

This makes the code independent of the presence of kill flags because it
avoids a call to MachineBasicBlock::computeRegisterLiveness().

Differential Revision: http://reviews.llvm.org/D22283

llvm-svn: 276222
2016-07-21 00:33:38 +00:00
Craig Topper a3c55f5915 [AVX512] Add EVEX versions of scalar ADD/SUB/MUL/DIV to load folding tables.
llvm-svn: 275775
2016-07-18 06:49:32 +00:00
Craig Topper 16a0744955 [AVX512] Add KADD/KAND/KOR/KXOR to X86InstrInfo::isAssociativeAndCommutative.
llvm-svn: 275771
2016-07-18 06:14:59 +00:00
Craig Topper 463f949a3a [X86] Add VPMULLW/D/Q instructions to X86InstrInfo::isAssociativeAndCommutative.
llvm-svn: 275770
2016-07-18 06:14:57 +00:00
Craig Topper 1af6cc00dc [X86] Add VPADD instructions to X86InstrInfo::isAssociativeAndCommutative.
llvm-svn: 275769
2016-07-18 06:14:54 +00:00
Craig Topper ba9b93d7f2 [X86] Add floating point packed logical ops to X86InstrInfo::isAssociativeAndCommutative.
llvm-svn: 275768
2016-07-18 06:14:50 +00:00
Craig Topper 3a99de4067 [X86] Add AVX512 instructions to X86InstrInfo::isAssociativeAndCommutative.
llvm-svn: 275767
2016-07-18 06:14:47 +00:00
Craig Topper fe5a6dc581 [X86] Add more AVX512 instructions to X86InstrInfo::isHighLatencyDef. Also add all packed fp division instructions.
llvm-svn: 275766
2016-07-18 06:14:45 +00:00
Craig Topper f7a06c29bc [X86] Add AVX512 load opcodes and a couple AVX load opcodes to X86InstrInfo::areLoadsFromSameBasePtr.
llvm-svn: 275765
2016-07-18 06:14:43 +00:00
Craig Topper 650a15e2b3 [X86] Add more opcodes to isFrameLoadOpcode/isFrameStoreOpcode. Mainly AVX-512 related.
llvm-svn: 275764
2016-07-18 06:14:39 +00:00
Craig Topper 5c913e84df [AVX512] Use VMOVAPSZ128rr/VMOVAPS256rr for VR128X/VR256X physreg moves when VLX is supported.
Ideally we would use VEX encoded moves instead of EVEX if the high 16 registers aren't referenced, but this a good first step.

llvm-svn: 275763
2016-07-18 06:14:34 +00:00
Craig Topper 53f3d1b4d0 [X86] Fix 80-column violations. NFC
llvm-svn: 275762
2016-07-18 06:14:26 +00:00
Justin Lebar 0af80cd6f0 [CodeGen] Take a MachineMemOperand::Flags in MachineFunction::getMachineMemOperand.
Summary:
Previously we took an unsigned.

Hooray for type-safety.

Reviewers: chandlerc

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D22282

llvm-svn: 275591
2016-07-15 18:26:59 +00:00
Jacques Pienaar 71c30a14b7 Rename AnalyzeBranch* to analyzeBranch*.
Summary: NFC. Rename AnalyzeBranch/AnalyzeBranchPredicate to analyzeBranch/analyzeBranchPredicate to follow LLVM coding style and be consistent with TargetInstrInfo's analyzeCompare and analyzeSelect.

Reviewers: tstellarAMD, mcrosier

Subscribers: mcrosier, jholewinski, jfb, arsenm, dschuff, jyknight, dsanders, nemanjai

Differential Revision: https://reviews.llvm.org/D22409

llvm-svn: 275564
2016-07-15 14:41:04 +00:00
Dean Michael Berris 52735fc435 XRay: Add entry and exit sleds
Summary:
In this patch we implement the following parts of XRay:

- Supporting a function attribute named 'function-instrument' which currently only supports 'xray-always'. We should be able to use this attribute for other instrumentation approaches.
- Supporting a function attribute named 'xray-instruction-threshold' used to determine whether a function is instrumented with a minimum number of instructions (IR instruction counts).
- X86-specific nop sleds as described in the white paper.
- A machine function pass that adds the different instrumentation marker instructions at a very late stage.
- A way of identifying which return opcode is considered "normal" for each architecture.

There are some caveats here:

1) We don't handle PATCHABLE_RET in platforms other than x86_64 yet -- this means if IR used PATCHABLE_RET directly instead of a normal ret, instruction lowering for that platform might do the wrong thing. We think this should be handled at instruction selection time to by default be unpacked for platforms where XRay is not availble yet.

2) The generated section for X86 is different from what is described from the white paper for the sole reason that LLVM allows us to do this neatly. We're taking the opportunity to deviate from the white paper from this perspective to allow us to get richer information from the runtime library.

Reviewers: sanjoy, eugenis, kcc, pcc, echristo, rnk

Subscribers: niravd, majnemer, atrick, rnk, emaste, bmakam, mcrosier, mehdi_amini, llvm-commits

Differential Revision: http://reviews.llvm.org/D19904

llvm-svn: 275367
2016-07-14 04:06:33 +00:00
Duncan P. N. Exon Smith 7b4c18e8f3 X86: Avoid implicit iterator conversions, NFC
Avoid implicit conversions from MachineInstrBundleIterator to
MachineInstr*, mainly by preferring MachineInstr& over MachineInstr* and
using range-based for loops.

llvm-svn: 275149
2016-07-12 03:18:50 +00:00
Craig Topper 516e14cd8e [AVX512] Use vpternlog with an immediate of 0xff to create 512-bit all one vectors.
llvm-svn: 275045
2016-07-11 05:36:48 +00:00
Craig Topper 8674849d6e [X86] Add the AVX512 SET0 pseudos to foldMemoryOperandImpl since they are marked for CanFoldAsLoad.
I don't really know how to test this.

llvm-svn: 275044
2016-07-11 05:36:41 +00:00
Duncan P. N. Exon Smith d26fdc83c9 CodeGen: Use MachineInstr& in LiveVariables API, NFC
Change all the methods in LiveVariables that expect non-null
MachineInstr* to take MachineInstr& and update the call sites.  This
clarifies the API, and designs away a class of iterator to pointer
implicit conversions.

llvm-svn: 274319
2016-07-01 01:51:32 +00:00
Duncan P. N. Exon Smith 9cfc75c214 CodeGen: Use MachineInstr& in TargetInstrInfo, NFC
This is mostly a mechanical change to make TargetInstrInfo API take
MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator)
when the argument is expected to be a valid MachineInstr.  This is a
general API improvement.

Although it would be possible to do this one function at a time, that
would demand a quadratic amount of churn since many of these functions
call each other.  Instead I've done everything as a block and just
updated what was necessary.

This is mostly mechanical fixes: adding and removing `*` and `&`
operators.  The only non-mechanical change is to split
ARMBaseInstrInfo::getOperandLatencyImpl out from
ARMBaseInstrInfo::getOperandLatency.  Previously, the latter took a
`MachineInstr*` which it updated to the instruction bundle leader; now,
the latter calls the former either with the same `MachineInstr&` or the
bundle leader.

As a side effect, this removes a bunch of MachineInstr* to
MachineBasicBlock::iterator implicit conversions, a necessary step
toward fixing PR26753.

Note: I updated WebAssembly, Lanai, and AVR (despite being
off-by-default) since it turned out to be easy.  I couldn't run tests
for AVR since llc doesn't link with it turned on.

llvm-svn: 274189
2016-06-30 00:01:54 +00:00
Rafael Espindola a99ccfce1a Drop support for creating $stubs.
They are created by ld64 since OS X 10.5.

llvm-svn: 274130
2016-06-29 14:59:50 +00:00
Dehao Chen 8cd84aaa6f Relax the clearance calculating for breaking partial register dependency.
Summary: LLVM assumes that large clearance will hide the partial register spill penalty. But in our experiment, 16 clearance is too small. As the inserted XOR is normally fairly cheap, we should have a higher clearance threshold to aggressively insert XORs that is necessary to break partial register dependency.

Reviewers: wmi, davidxl, stoklund, zansari, myatsina, RKSimon, DavidKreitzer, mkuper, joerg, spatel

Subscribers: davidxl, llvm-commits

Differential Revision: http://reviews.llvm.org/D21560

llvm-svn: 274068
2016-06-28 21:19:34 +00:00
Rafael Espindola f9e348bd59 Convert a few more comparisons to isPositionIndependent(). NFC.
llvm-svn: 273945
2016-06-27 21:33:08 +00:00
Igor Breger e59165ca63 [AVX512] [AVX512/AVX][Intrinsics] Fix Variable Bit Shift Right Arithmetic intrinsic lowering.
Differential Revision: http://reviews.llvm.org/D20897

llvm-svn: 273138
2016-06-20 07:05:43 +00:00
Benjamin Kramer 4ca41fd09e Run clang-tidy's performance-unnecessary-copy-initialization over LLVM.
No functionality change intended.

llvm-svn: 272516
2016-06-12 17:30:47 +00:00
Benjamin Kramer bdc4956bac Pass DebugLoc and SDLoc by const ref.
This used to be free, copying and moving DebugLocs became expensive
after the metadata rewrite. Passing by reference eliminates a ton of
track/untrack operations. No functionality change intended.

llvm-svn: 272512
2016-06-12 15:39:02 +00:00
Craig Topper 7a2993093e [X86] Bring consistent naming to the SSE/AVX and AVX512 PALIGNR instructions. Then add shuffle decode printing for the EVEX forms which is made easier by having the naming structure more similar to other instructions.
llvm-svn: 272249
2016-06-09 07:06:38 +00:00
Rafael Espindola 712f957cae Simplify handling of hidden stub.
Since r207518 they are printed exactly like non-hidden stubs on x86 and
since r207517 on ARM.

This means we can use a single set for all stubs in those platforms.

llvm-svn: 269776
2016-05-17 16:01:32 +00:00
David L Kreitzer e7c583e06f Fix for PR27750. Correctly handle the case where the fallthrough block and
target block are the same in getFallThroughMBB.

Differential Revision: http://reviews.llvm.org/D20288

llvm-svn: 269760
2016-05-17 12:47:46 +00:00
Quentin Colombet 220f7da488 [X86] Properly check that EAX is dead when copying EFLAGS.
This fixes a bug introduced in r267623, where we got smarter and avoided to save
EAX before using it. However, we failed to check if any of the subregister of
EAX were alive and thus, missed cases where we have to save EAX before using it.

The problem may happen on every X86/i386/... platform.

This fixes llvm.org/PR27624

llvm-svn: 269115
2016-05-10 20:49:46 +00:00
Jonas Paulsson 8e5b0c65cc [foldMemoryOperand()] Pass LiveIntervals to enable liveness check.
SystemZ (and probably other targets as well) can fold a memory operand
by changing the opcode into a new instruction that as a side-effect
also clobbers the CC-reg.

In order to do this, liveness of that reg must first be checked. When
LIS is passed, getRegUnit() can be called on it and the right
LiveRange is computed on demand.

Reviewed by Matthias Braun.
http://reviews.llvm.org/D19861

llvm-svn: 269026
2016-05-10 08:09:37 +00:00
Craig Topper 3e0c038a84 [X86][AVX512] Strengthen the assertions from r269001. We need VLX to use the 128/256-bit move opcodes for extended registers.
llvm-svn: 269019
2016-05-10 05:28:04 +00:00
Quentin Colombet ee5f36bd54 [X86][AVX512] Use the proper load/store for AVX512 registers.
When loading or storing AVX512 registers we were not using the AVX512
variant of the load and store for VR128 and VR256 like registers.
Thus, we ended up with the wrong encoding and actually were dropping the
high bits of the instruction. The result was that we load or store the
wrong register. The effect is visible only when we emit the object file
directly and disassemble it. Then, the output of the disassembler does
not match the assembly input.

This is related to llvm.org/PR27481.

llvm-svn: 269001
2016-05-10 01:09:14 +00:00
Craig Topper e5ce84a33c [AVX512] Add VLX 128/256-bit SET0 operations that encode to 128/256-bit EVEX encoded VPXORD so all 32 registers can be used.
llvm-svn: 268884
2016-05-08 21:33:53 +00:00
Matthias Braun d1aabb2813 livePhysRegs: Pass MBB by reference in addLive{Ins|Outs}(); NFC
The block must no be nullptr for the addLiveIns()/addLiveOuts()
function.

llvm-svn: 268340
2016-05-03 00:24:32 +00:00
Matthias Braun 24f26e6d91 LivePhysRegs: Automatically determine presence of pristine regs.
Remove the AddPristinesAndCSRs parameters from
addLiveIns()/addLiveOuts().

We need to respect pristine registers after prologue epilogue insertion,
Seeing that we got this wrong in at least two commits already, we should
rather pay the small price to query MachineFrameInfo for it.

There are three cases that did not set AddPristineAndCSRs to true even
after register allocation:
- ExecutionDepsFix: live-out registers are used as a hint that the
  register is used soon. This is not true for pristine registers so
  use the new addLiveOutsNoPristines() to maintain this behaviour.
- SystemZShortenInst: Not setting AddPristineAndCSRs to true looks like
  a bug, should do the right thing automatically now.
- StackMapLivenessAnalysis: Not adding pristine registers looks like a
  bug to me. Added a FIXME comment but maintain the current behaviour
  as a change may need to get coordinated with GC runtimes.

llvm-svn: 268336
2016-05-03 00:08:46 +00:00
David L Kreitzer 0fe4632bd7 Enable the X86 call frame optimization for the 64-bit targets that allow it.
Fixes PR27241.

Differential Revision: http://reviews.llvm.org/D19688

llvm-svn: 268227
2016-05-02 13:45:25 +00:00
Igor Breger 131008fbcb Change AVX512 braodcastsd/ss patterns interaction with spilling . New implementation take a scalar register and generate a vector without COPY_TO_REGCLASS (turn it into a VR128 register ) .The issue is that during register allocation we may spill a scalar value using 128-bit loads and stores, wasting cache bandwidth.
Differential Revision: http://reviews.llvm.org/D19579

llvm-svn: 268190
2016-05-01 08:40:00 +00:00
Craig Topper e012ede137 [X86] Reduce memory usage of MemOp2RegOp and RegOp2MemOp folding maps.
llvm-svn: 268164
2016-04-30 17:59:49 +00:00
Craig Topper 477649a4c0 [X86] Remove unused operand from a function and all its callers. NFC
llvm-svn: 267854
2016-04-28 05:58:46 +00:00
Quentin Colombet 2b3a4e787e [X86] Teach the expansion of copy instructions how to do proper liveness.
When the simple analysis provided by MachineBasicBlock::computeRegisterLiveness
fails, fall back on the LivePhysReg utility.

llvm-svn: 267623
2016-04-26 23:14:32 +00:00
Andrew Kaylor 2bee5ef462 Optimization bisect support in X86-specific passes
Differential Revision: http://reviews.llvm.org/D19439

llvm-svn: 267608
2016-04-26 21:44:24 +00:00
Mehdi Amini b550cb1750 [NFC] Header cleanup
Removed some unused headers, replaced some headers with forward class declarations.

Found using simple scripts like this one:
clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap'

Patch by Eugene Kosov <claprix@yandex.ru>

Differential Revision: http://reviews.llvm.org/D19219

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266595
2016-04-18 09:17:29 +00:00
Aaron Ballman ef0fe1eed8 Silencing warnings from MSVC 2015 Update 2. All of these changes silence "C4334 '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)". NFC.
llvm-svn: 264929
2016-03-30 21:30:00 +00:00
Hans Wennborg 4ae5119eeb X86: Use push-pop for materializing 8-bit immediates for minsize (take 2)
This is the same as r255936, with added logic for avoiding clobbering of the
red zone (PR26023).

Differential Revision: http://reviews.llvm.org/D18246

llvm-svn: 264375
2016-03-25 01:10:56 +00:00
Simon Pilgrim a6ba27fbde [X86][XOP] Fixed instruction postfixes to more closely match operands
Suggested by Sanjay in D18189 as the multiple folding options in XOP instructions can be tricky

llvm-svn: 264305
2016-03-24 16:31:30 +00:00
Cong Hou 94710840fb Allow X86::COND_NE_OR_P and X86::COND_NP_OR_E to be reversed.
Currently, AnalyzeBranch() fails non-equality comparison between floating points
on X86 (see https://llvm.org/bugs/show_bug.cgi?id=23875). This is because this
function can modify the branch by reversing the conditional jump and removing
unconditional jump if there is a proper fall-through. However, in the case of
non-equality comparison between floating points, this can turn the branch
"unanalyzable". Consider the following case:

jne.BB1
jp.BB1
jmp.BB2
.BB1:
...
.BB2:
...

AnalyzeBranch() will reverse "jp .BB1" to "jnp .BB2" and then "jmp .BB2" will be
removed:

jne.BB1
jnp.BB2
.BB1:
...
.BB2:
...

However, AnalyzeBranch() cannot analyze this branch anymore as there are two
conditional jumps with different targets. This may disable some optimizations
like block-placement: in this case the fall-through behavior is enforced even if
the fall-through block is very cold, which is suboptimal.

Actually this optimization is also done in block-placement pass, which means we
can remove this optimization from AnalyzeBranch(). However, currently
X86::COND_NE_OR_P and X86::COND_NP_OR_E are not reversible: there is no defined
negation conditions for them.

In order to reverse them, this patch defines two new CondCode X86::COND_E_AND_NP
and X86::COND_P_AND_NE. It also defines how to synthesize instructions for them.
Here only the second conditional jump is reversed. This is valid as we only need
them to do this "unconditional jump removal" optimization.


Differential Revision: http://reviews.llvm.org/D11393

llvm-svn: 264199
2016-03-23 21:45:37 +00:00
Chad Rosier c27a18f39f [TII] Allow getMemOpBaseRegImmOfs() to accept negative offsets. NFC.
http://reviews.llvm.org/D17967

llvm-svn: 263021
2016-03-09 16:00:35 +00:00
Craig Topper cf65c62737 [X86] Use MCPhysReg and uint16_t for static arrays of registers and opcodes respectively should reduce size tiny bit. NFC
llvm-svn: 262458
2016-03-02 04:42:31 +00:00
Duncan P. N. Exon Smith 6307eb5518 CodeGen: TII: Take MachineInstr& in predicate API, NFC
Change TargetInstrInfo API to take `MachineInstr&` instead of
`MachineInstr*` in the functions related to predicated instructions
(I'll try to come back later and get some of the rest).  All of these
functions require non-null parameters already, so references are more
clear.  As a bonus, this happens to factor away a host of implicit
iterator => pointer conversions.

No functionality change intended.

llvm-svn: 261605
2016-02-23 02:46:52 +00:00