Commit Graph

5412 Commits

Author SHA1 Message Date
Craig Topper 3828ce7eab [X86] Do something sensible when an expand load intrinsic is passed a 0 mask.
Previously we just returned undef, but really we should be returning the pass thru input. We also need to make sure we preserve the chain output that the original intrinsic node had to maintain connectivity in the DAG. So we should just return the incoming chain as the output chain.

llvm-svn: 333804
2018-06-01 22:59:07 +00:00
Simon Pilgrim ff0623cd29 [X86][SSE] Recognise splat rotations and expand back to shift ops.
Noticed while fixing PR37426, for splat rotations (rotation by an uniform value) its better to just expand back to shift ops than performing as a general non-uniform rotation.

llvm-svn: 333661
2018-05-31 15:47:17 +00:00
Simon Pilgrim c34395d889 [X86][AVX] Add peekThroughEXTRACT_SUBVECTORs helper (NFCI)
We often need this for AVX1 128-bit integer ops as they may have been split from a 256-bit source.

llvm-svn: 333660
2018-05-31 15:15:49 +00:00
Simon Pilgrim 346886bc0d [X86][SSE] Add support for detecting SUB(SPLAT_BV, SPLAT) cases for shift-rotate patterns.
This improves splat rotations (rotation by an uniform value), to avoid having to use the generic non-uniform shift code (extension to PR37426).

llvm-svn: 333641
2018-05-31 11:25:16 +00:00
Simon Pilgrim 5e9f459c62 [X86][SSE] Pulled out splat detection helper from LowerScalarVariableShift (NFCI)
Created the IsSplatValue helper from the splat detection code in LowerScalarVariableShift as a first NFC step towards improving support for splat rotations, which is an extension of PR37426.

llvm-svn: 333580
2018-05-30 19:16:59 +00:00
Gabor Buella 890e363e11 [X86] Lowering FMA intrinsics to native IR (LLVM part)
Support for Clang lowering of fused intrinsics. This patch:

1. Removes bindings to clang fma intrinsics.
2. Introduces new LLVM unmasked intrinsics with rounding mode:
     int_x86_avx512_vfmadd_pd_512
     int_x86_avx512_vfmadd_ps_512
     int_x86_avx512_vfmaddsub_pd_512
     int_x86_avx512_vfmaddsub_ps_512
     supported with a new intrinsic type (INTR_TYPE_3OP_RM).
3. Introduces new x86 fmaddsub/fmsubadd folding.
4. Introduces new tests for code emitted by sequentions introduced in Clang part.

Patch by tkrupa

Reviewers: craig.topper, sroland, spatel, RKSimon

Reviewed By: craig.topper, RKSimon

Differential Revision: https://reviews.llvm.org/D47443

llvm-svn: 333554
2018-05-30 15:25:16 +00:00
Craig Topper 5439b3d1e5 [X86] Fix a potential crash that occur after r333419.
The code could issue a truncate from a small type to larger type. We need to extend in that case instead.

llvm-svn: 333460
2018-05-29 20:04:10 +00:00
Matt Arsenault ab2b79cb97 DAG: Remove redundant version of getRegisterTypeForCallingConv
There seems to be no real reason to have these separate copies.
The existing implementations just copy each other for x86.
For Mips there is a subtle difference, which is just a bug
since it changes based on the context where which one was called.
Dropping this version, all tests pass. If I try to merge them
to match the removed version, a test fails.

llvm-svn: 333440
2018-05-29 17:42:26 +00:00
Alexander Ivchenko 96062eaa8e [X86] Scalar mask and scalar move optimizations
1. Introduction of mask scalar TableGen patterns.
2. Introduction of new scalar move TableGen patterns
   and refactoring of existing ones.
3. Folding of pattern created by introducing scalar
   masking in Clang header files.

Patch by tkrupa

Differential Revision: https://reviews.llvm.org/D47012

llvm-svn: 333419
2018-05-29 14:27:11 +00:00
Craig Topper a34f8731c7 [X86] Disable a DAG combine to allow packed AVX512DQ instructions to be consistently used for i64->float/double conversions.
Summary: We already get this right if the i64 didn't come from a load.

Reviewers: RKSimon

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47439

llvm-svn: 333393
2018-05-29 06:22:45 +00:00
Craig Topper 21aeddc3dc [X86] Remove masked vpermi2var/vpermt2var intrinsics and autoupgrade.
We have unmasked intrinsics now and wrap them with a select. This is a net reduction of 36 intrinsics from before the unmasked intrinsics were added.

llvm-svn: 333388
2018-05-29 05:22:05 +00:00
Craig Topper dcfcfdb0d1 [X86] Converge X86ISD::VPERMV3 and X86ISD::VPERMIV3 to a single opcode.
These do the same thing with the first and second sources swapped. They previously came from separate intrinsics that specified different masking behavior. But we can cover that with isel patterns and a single node.

This is a step towards reducing the number of intrinsics needed.

A bunch of tests change because we are now biased to choosing VPERMT over VPERMI when there is nothing to signal that commuting is beneficial.

llvm-svn: 333383
2018-05-28 19:33:11 +00:00
Craig Topper 26bc84860a [X86] Stop forcing X86VPermi2X node index operand to match destination type to make masking pattern matching easier. Add extra patterns with bitcasts instead.
This basically reverts r280696 in favor of using extra patterns as mentioned as an alternative in that commit message. For now I've only added the cases we have test cases for, but it should be easy to add more in the future.

This will help to convert VPERMI2PS/VPERMT2PS intrinsics to use a single ISD node opcode. And hopefully allow some intrinsics to be removed.

llvm-svn: 333365
2018-05-28 05:37:25 +00:00
Craig Topper 51eddb8749 [X86] Remove masking from avx512ifma intrinsics. Use a select instead.
This allows us to avoid having mask and maskz variant. Reducing from 12 intrinsics to 6.

llvm-svn: 333346
2018-05-26 18:55:19 +00:00
Simon Pilgrim b8c7c9c369 [X86][SSE] Pull out (AND (XOR X, -1), Y) matching into a helper function. NFC.
llvm-svn: 333201
2018-05-24 16:16:42 +00:00
Simon Pilgrim 8bd73573c3 Fix unused variable warnings. NFCI.
llvm-svn: 333195
2018-05-24 15:34:50 +00:00
Simon Pilgrim 0c72316a21 [X86][SSE] Pull out OR(AND(~MASK,X),AND(MASK,Y)) matching into a helper function. NFC.
First stage towards matching more variants of the bitselect pattern for combineLogicBlendIntoPBLENDV (PR37549)

llvm-svn: 333191
2018-05-24 15:12:48 +00:00
Roman Lebedev 7772de25d0 [DAGCombine][X86][AArch64] Masked merge unfolding: vector edition.
Summary:
This **appears** to be the last missing piece for the masked merge pattern handling in the backend.

This is [[ https://bugs.llvm.org/show_bug.cgi?id=37104 | PR37104 ]].

[[ https://bugs.llvm.org/show_bug.cgi?id=6773 | PR6773 ]] will introduce an IR canonicalization that is likely bad for the end assembly.
Previously, `andps`+`andnps` / `bsl` would be generated. (see `@out`)
Now, they would no longer be generated  (see `@in`), and we need to make sure that they are generated.

Differential Revision: https://reviews.llvm.org/D46528

llvm-svn: 332904
2018-05-21 21:41:02 +00:00
Craig Topper aad3aefaeb [X86] Remove masking from vpternlog intrinsics. Use a select in IR instead.
This removes 6 intrinsics since we no longer need separate mask and maskz intrinsics.

Differential Revision: https://reviews.llvm.org/D47124

llvm-svn: 332890
2018-05-21 20:58:09 +00:00
Simon Pilgrim a8869e68a9 [X86][SSE] Add an assert to ensure that rotation amount is converted to a scale
Missed in rL332832 where we added SSE v4i32 rotations for PR37426.

llvm-svn: 332844
2018-05-21 15:17:23 +00:00
Simon Pilgrim 5aa7cdfd70 [X86][SSE] Support v4i32 rotations (PR37426)
As suggested by Fabian on PR37426, we can use PMULUDQ to perform v4i32 vector rotations as the upper 32bits of the multiply will contain the 'wrapped' bits of the rotation.

v8i16/v16i8 rotations would be straightforward to add to lowerRotate in the future - ideally we'd mostly share code with the vector shifts lowering.

Differential Revision: https://reviews.llvm.org/D46954

llvm-svn: 332832
2018-05-21 09:45:59 +00:00
Craig Topper e4c045b7df [X86] Remove mask arguments from permvar builtins/intrinsics. Use a select in IR instead.
Someday maybe we'll use selects for all intrinsics.

llvm-svn: 332824
2018-05-20 23:34:04 +00:00
Craig Topper f94ed26ea9 [X86] Directly legalize v16i16/v8i16 vselect to vXi8 vselect to use VPBLENDVB
The intrinsic legalization for masked truncate uses ISD::TRUNCATE which can be constant folded by getNode. This prevents getVectorMaskingNode from seeing the ISD::TRUNCATE special case where it should emit X86ISD::SELECT instead of ISD::VSELECT. This causes a vselect with a v16i1 or v8i1 condition to be emitted during vector legalization. but vector legalization doesn't revisit nodes it creates. DAG combine will then promote this condition to match the result type. Then op legalization will try to legalize it, but the custom lowering hook returned SDValue(). But op legalization doesn't have an Expand for VSELECT because it expects vector legalization to have taken care of it. So the operation sticks around and fails in isel.

This patch adds a custom legalization hook to morph it to a vXi8 vselect instead.

This also simplifies the normal vXi16 vselect handling because vector legalization was normally expanding to AND/ANDN/OR and DAG combine was turning that into VBLENDVB. So we can skip a step by doing it directly.

Fixes PR37499

Differential Revision: https://reviews.llvm.org/D47025

llvm-svn: 332743
2018-05-18 17:48:06 +00:00
Simon Pilgrim 2e0f6c9b21 [X86][SSE] Reduce instruction/register usages for v4i32 vector shifts (PR37441)
As suggested by Fabian on PR37441, use PSHUFLW to extend shift amount types for use with PSRAD/PSRLD to reduce register pressure.

Some of this ideally would be done by combineTargetShuffle but its tricky to do as most of the shuffles are sharing inputs.

Differential Revision: https://reviews.llvm.org/D46959

llvm-svn: 332524
2018-05-16 20:52:52 +00:00
Craig Topper 67aa726f8c [X86][AVX512DQ] Use packed instructions for scalar FP<->i64 conversions on 32-bit targets
As i64 types are not legal on 32-bit targets, insert these into a suitable zero vector and use the packed vXi64<->FP conversion instructions instead.

Fixes PR3163.

Differential Revision: https://reviews.llvm.org/D43441

llvm-svn: 332498
2018-05-16 17:40:07 +00:00
Mikael Holmen e01131decf Remove unused variable introduced in r332336
The unused variable caused a compilation warning:

../lib/Target/X86/X86ISelLowering.cpp:34614:17: error: unused variable 'SMax' [-Werror,-Wunused-variable]
    if (SDValue SMax = MatchMinMax(SMin, ISD::SMAX, C1))
                ^
1 error generated.

llvm-svn: 332431
2018-05-16 06:36:11 +00:00
Artur Gainullin 243a3d56d8 [X86] Improve unsigned saturation downconvert detection.
Summary:
New unsigned saturation downconvert patterns detection was implemented in
X86 Codegen:

(truncate (smin (smax (x, C1), C2)) to dest_type),
where C1 >= 0 and C2 is unsigned max of destination type.

(truncate (smax (smin (x, C2), C1)) to dest_type)
where C1 >= 0, C2 is unsigned max of destination type and C1 <= C2.
These two patterns are equivalent to:

(truncate (umin (smax(x, C1), unsigned_max_of_dest_type)) to dest_type)

Reviewers: RKSimon

Subscribers: llvm-commits, a.elovikov

Differential Revision: https://reviews.llvm.org/D45315

llvm-svn: 332336
2018-05-15 10:24:12 +00:00
Sanjay Patel b4e7893ba8 [x86] fix fmaxnum/fminnum with nnan
With nnan, there's no need for the masked merge / blend
sequence (that probably costs much more than the min/max
instruction).

Somewhere between clang 5.0 and 6.0, we started producing
these intrinsics for fmax()/fmin() in C source instead of
libcalls or fcmp/select. The backend wasn't prepared for
that, so we regressed perf in those cases.

Note: it's possible that other targets have similar problems
as seen here. 

Noticed while investigating PR37403 and related bugs:
https://bugs.llvm.org/show_bug.cgi?id=37403

The IR FMF propagation cases still don't work. There's
a proposal that might fix those cases in D46563.

llvm-svn: 331992
2018-05-10 15:40:49 +00:00
Craig Topper b9a473d186 [X86] Combine (vXi1 (bitcast (-1)))) and (vXi1 (bitcast (0))) to all ones or all zeros vXi1 vector.
llvm-svn: 331847
2018-05-09 06:07:20 +00:00
Shiva Chen 801bf7ebbe [DebugInfo] Examine all uses of isDebugValue() for debug instructions.
Because we create a new kind of debug instruction, DBG_LABEL, we need to
check all passes which use isDebugValue() to check MachineInstr is debug
instruction or not. When expelling debug instructions, we should expel
both DBG_VALUE and DBG_LABEL. So, I create a new function,
isDebugInstr(), in MachineInstr to check whether the MachineInstr is
debug instruction or not.

This patch has no new test case. I have run regression test and there is
no difference in regression test.

Differential Revision: https://reviews.llvm.org/D45342

Patch by Hsiangkai Wang.

llvm-svn: 331844
2018-05-09 02:42:00 +00:00
Jessica Paquette ec37c640dd Revert "[X86][CET] Shadow stack fix for setjmp/longjmp"
This reverts commit 30962eca38ef02666ebcdded72a94f2cd0292d68.

This commit has been causing test asan failures on a build bot.

http://green.lab.llvm.org/green/job/clang-stage1-configure-RA/45108/

Original commit: https://reviews.llvm.org/D46181

llvm-svn: 331813
2018-05-08 22:00:57 +00:00
Jeremy Morse 4f799c027e [X86] Mark all byval parameters as aliased
This is a fix for PR30290: by marking all byval stack slots as being aliased,
the instruction scheduler is more conservative about rescheduling memory
accesses to such stack slots as an LLVM Value* might alias it. This fixes
errors such as in the patched test case, where reads and writes to a data
structure are illegally mixed.

This could be fixed better in the future with better analysis for the
instruction scheduler to know what Values alias what stack slots.

Differential Revision: https://reviews.llvm.org/D45022

llvm-svn: 331749
2018-05-08 09:18:01 +00:00
Alexander Ivchenko c47f799289 [X86][CET] Shadow stack fix for setjmp/longjmp
This patch adds a shadow stack fix when compiling
setjmp/longjmp with the shadow stack enabled. This
allows setjmp/longjmp to work correctly with CET.

Patch by mike.dvoretsky

Differential Revision: https://reviews.llvm.org/D46181

llvm-svn: 331748
2018-05-08 09:04:07 +00:00
Roman Lebedev cc42d08b1d [DagCombiner] Not all 'andn''s work with immediates.
Summary:
Split off from D46031.

In masked merge case, this degrades IPC by decreasing instruction count.
{F6108777}
The next patch should be able to recover and improve this.

This also affects the transform @spatel have added in D27489 / rL289738,
and the test coverage for X86 was missing.
But after i have added it, and looked at the changes in MCA, i'm somewhat confused.
{F6093591} {F6093592} {F6093593}
I'd say this regression is an improvement, since `IPC` increased in that case?

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: andreadb, llvm-commits, spatel

Differential Revision: https://reviews.llvm.org/D46493

llvm-svn: 331684
2018-05-07 21:52:11 +00:00
Craig Topper c882014f43 [X86] Fix copy/paste mistake in comment. NFC
llvm-svn: 331611
2018-05-07 00:47:02 +00:00
Craig Topper cb2abc7977 [X86] Enable reciprocal estimates for v16f32 vectors by using VRCP14PS/VRSQRT14PS
Summary:
The legacy VRCPPS/VRSQRTPS instructions aren't available in 512-bit versions. The new increased precision versions are. So we can use those to implement v16f32 reciprocal estimates.

For KNL CPUs we can probably use VRCP28PS/VRSQRT28PS and avoid the NR step altogether, but I leave that for a future patch.

Reviewers: spatel

Reviewed By: spatel

Subscribers: RKSimon, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D46498

llvm-svn: 331606
2018-05-06 17:48:21 +00:00
Adrian Prantl 5f8f34e459 Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

  for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46290

llvm-svn: 331272
2018-05-01 15:54:18 +00:00
Craig Topper d656410293 [X86] Make the STTNI flag intrinsics use the flags from pcmpestrm/pcmpistrm if the mask instrinsics are also used in the same basic block.
Summary:
Previously the flag intrinsics always used the index instructions even if a mask instruction also exists.

To fix fix this I've created a single ISD node type that returns index, mask, and flags. The SelectionDAG CSE process will merge all flavors of intrinsics with the same inputs to a s ingle node. Then during isel we just have to look at which results are used to know what instruction to generate. If both mask and index are used we'll need to emit two instructions. But for all other cases we can emit a single instruction.

Since I had to do manual isel anyway, I've removed the pseudo instructions and custom inserter code that was working around tablegen limitations with multiple implicit defs.

I've also renamed the recently added sse42.ll test case to sttni.ll since it focuses on that subset of the sse4.2 instructions.

Reviewers: chandlerc, RKSimon, spatel

Reviewed By: chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D46202

llvm-svn: 331091
2018-04-27 22:15:33 +00:00
Chandler Carruth 16429acacb [x86] Revert r330322 (& r330323): Lowering x86 adds/addus/subs/subus intrinsics
The LLVM commit introduces a crash in LLVM's instruction selection.

I filed http://llvm.org/PR37260 with the test case.

llvm-svn: 330997
2018-04-26 21:46:01 +00:00
Craig Topper 300e20d61c [X86] Form MUL_IMM for multiplies with 3/5/9 to encourage LEA formation over load folding.
Previously we only formed MUL_IMM when we split a constant. This blocked load folding on those cases. We should also form MUL_IMM for 3/5/9 to favor LEA over load folding.

Differential Revision: https://reviews.llvm.org/D46040

llvm-svn: 330850
2018-04-25 17:35:03 +00:00
Alexander Ivchenko 5717fbaf4c [X86] Replace action Promote with Expand for operation ISD::SINT_TO_FP
Summary:
If attribute "use-soft-float"="true" is set then X86ISelLowering.cpp sets
'Promote' action for ISD::SINT_TO_FP operation on type i32.

But 'Promote' action is not proper in this case since lib function
__floatsidf is available for casting from signed int to float type.
Thus Expand action is more suitable here.

The Expand action should be set for ISD::UINT_TO_FP for soft float as well.

If function attribute "use-soft-float"="true" is set then infinite looping
can happen in DAG combining, function visitSINT_TO_FP() replaces SINT_TO_FP
node with UINT_TO_FP node and function combineUIntToFP() replace vice versa in cycle.
The fix prevents it.

Patch by vrybalov

Differential Revision: https://reviews.llvm.org/D45572

llvm-svn: 330711
2018-04-24 12:57:51 +00:00
Craig Topper fe59bea07b [X86] Add DAG combine to turn (trunc (srl (mul ext, ext), 16) into PMULHW/PMULHUW.
Ultimately I want to use this to remove the intrinsics for these instructions.

llvm-svn: 330520
2018-04-21 18:39:21 +00:00
Gabor Buella 31fa8025ba [X86] WaitPKG instructions
Three new instructions:

umonitor - Sets up a linear address range to be
monitored by hardware and activates the monitor.
The address range should be a writeback memory
caching type.

umwait - A hint that allows the processor to
stop instruction execution and enter an
implementation-dependent optimized state
until occurrence of a class of events.

tpause - Directs the processor to enter an
implementation-dependent optimized state
until the TSC reaches the value in EDX:EAX.

Also modifying the description of the mfence
instruction, as the rep prefix (0xF3) was allowed
before, which would conflict with umonitor during
disassembly.

Before:
$ echo 0xf3,0x0f,0xae,0xf0 | llvm-mc -disassemble
.text
mfence

After:
$ echo 0xf3,0x0f,0xae,0xf0 | llvm-mc -disassemble
.text
umonitor        %rax

Reviewers: craig.topper, zvi

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D45253

llvm-svn: 330462
2018-04-20 18:42:47 +00:00
Alexander Ivchenko e8fed1546e Lowering x86 adds/addus/subs/subus intrinsics (llvm part)
This is the patch that lowers x86 intrinsics to native IR
in order to enable optimizations. The patch also includes folding
of previously missing saturation patterns so that IR emits the same
machine instructions as the intrinsics.

Patch by tkrupa

Differential Revision: https://reviews.llvm.org/D44785

llvm-svn: 330322
2018-04-19 12:13:30 +00:00
Keith Wyss 3d86823f3d [XRay] Typed event logging intrinsic
Summary:
Add an LLVM intrinsic for type discriminated event logging with XRay.
Similar to the existing intrinsic for custom events, but also accepts
a type tag argument to allow plugins to be aware of different types
and semantically interpret logged events they know about without
choking on those they don't.

Relies on a symbol defined in compiler-rt patch D43668. I may wait
to submit before I can see demo everything working together including
a still to come clang patch.

Reviewers: dberris, pelikan, eizan, rSerge, timshen

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45633

llvm-svn: 330219
2018-04-17 21:30:29 +00:00
Hiroshi Inoue ae17900997 [NFC] fix trivial typos in document and comments
"not not" -> "not" etc

llvm-svn: 330083
2018-04-14 08:59:00 +00:00
Craig Topper 254ed028a4 [X86] Remove the pmuldq/pmuldq intrinsics and replace with native IR.
This completes the work started in r329604 and r329605 when we changed clang to no longer use the intrinsics.

We lost some InstCombine SimplifyDemandedBit optimizations through this change as we aren't able to fold 'and', bitcast, shuffle very well.

llvm-svn: 329990
2018-04-13 06:07:18 +00:00
Sriraman Tallam d693093a65 GOTPCREL references must always use RIP.
With -fno-plt, global value references can use GOTPCREL and RIP must be used.

Differential Revision: https://reviews.llvm.org/D45460

llvm-svn: 329765
2018-04-10 22:50:05 +00:00
Chandler Carruth 0ca3bd0729 [x86] Model the direction flag (DF) separately from the rest of EFLAGS.
This cleans up a number of operations that only claimed te use EFLAGS
due to using DF. But no instructions which we think of us setting EFLAGS
actually modify DF (other than things like popf) and so this needlessly
creates uses of EFLAGS that aren't really there.

In fact, DF is so restrictive it is pretty easy to model. Only STD, CLD,
and the whole-flags writes (WRFLAGS and POPF) need to model this.

I've also somewhat cleaned up some of the flag management instruction
definitions to be in the correct .td file.

Adding this extra register also uncovered a failure to use the correct
datatype to hold X86 registers, and I've corrected that as necessary
here.

Differential Revision: https://reviews.llvm.org/D45154

llvm-svn: 329673
2018-04-10 06:40:51 +00:00
Chandler Carruth 19618fc639 [x86] Introduce a pass to begin more systematically fixing PR36028 and similar issues.
The key idea is to lower COPY nodes populating EFLAGS by scanning the
uses of EFLAGS and introducing dedicated code to preserve the necessary
state in a GPR. In the vast majority of cases, these uses are cmovCC and
jCC instructions. For such cases, we can very easily save and restore
the necessary information by simply inserting a setCC into a GPR where
the original flags are live, and then testing that GPR directly to feed
the cmov or conditional branch.

However, things are a bit more tricky if arithmetic is using the flags.
This patch handles the vast majority of cases that seem to come up in
practice: adc, adcx, adox, rcl, and rcr; all without taking advantage of
partially preserved EFLAGS as LLVM doesn't currently model that at all.

There are a large number of operations that techinaclly observe EFLAGS
currently but shouldn't in this case -- they typically are using DF.
Currently, they will not be handled by this approach. However, I have
never seen this issue come up in practice. It is already pretty rare to
have these patterns come up in practical code with LLVM. I had to resort
to writing MIR tests to cover most of the logic in this pass already.
I suspect even with its current amount of coverage of arithmetic users
of EFLAGS it will be a significant improvement over the current use of
pushf/popf. It will also produce substantially faster code in most of
the common patterns.

This patch also removes all of the old lowering for EFLAGS copies, and
the hack that forced us to use a frame pointer when EFLAGS copies were
found anywhere in a function so that the dynamic stack adjustment wasn't
a problem. None of this is needed as we now lower all of these copies
directly in MI and without require stack adjustments.

Lots of thanks to Reid who came up with several aspects of this
approach, and Craig who helped me work out a couple of things tripping
me up while working on this.

Differential Revision: https://reviews.llvm.org/D45146

llvm-svn: 329657
2018-04-10 01:41:17 +00:00