Commit Graph

17416 Commits

Author SHA1 Message Date
David Blaikie 27549023b0 Fix stuff... again.
llvm-svn: 219693
2014-10-14 17:11:59 +00:00
Eric Christopher 307c2cb26f Remove unnecessary TargetMachine.h includes.
llvm-svn: 219672
2014-10-14 07:22:08 +00:00
Eric Christopher 6062180203 Grab the subtarget and subtarget dependent variables off of
MachineFunction rather than TargetMachine.

llvm-svn: 219671
2014-10-14 07:22:00 +00:00
Eric Christopher b66367a891 Grab the subtarget and subtarget dependent variables off of
MachineFunction rather than TargetMachine.

llvm-svn: 219670
2014-10-14 07:17:23 +00:00
Eric Christopher 92b4bcbbee Instead of the TargetMachine cache the MachineFunction
and TargetRegisterInfo in the peephole optimizer. This
makes it easier to grab subtarget dependent variables off
of the MachineFunction rather than the TargetMachine.

llvm-svn: 219669
2014-10-14 07:17:20 +00:00
Eric Christopher eb9e87f6e3 Access subtarget specific variables off of the MachineFunction's
cached subtarget and not the TargetMachine.

llvm-svn: 219668
2014-10-14 07:00:33 +00:00
Eric Christopher 99556d77ef Access the subtarget off of the MachineFunction via the DAG
scheduler or via the SelectionDAG if available. Otherwise
grab the subtarget off of the MachineFunction by going up
the parent chain.

llvm-svn: 219666
2014-10-14 06:56:25 +00:00
Eric Christopher b65c7b919c Remove the use and member variable of the TargetMachine from
MachineLICM as we can get the same data off of the MachineFunction.

llvm-svn: 219663
2014-10-14 06:26:57 +00:00
Eric Christopher 20c98938bb Have MachineInstrBundle use the MachineFunction for subtarget
access rather than the TargetMachine.

llvm-svn: 219662
2014-10-14 06:26:55 +00:00
Eric Christopher d3fa440d08 Access the subtarget off of the MachineFunction rather than
through the TargetMachine.

llvm-svn: 219661
2014-10-14 06:26:53 +00:00
Eric Christopher 2a321f74f0 Remove the TargetMachine from DFAPacketizer since it was only
being used to grab subtarget specific things that we can grab
from the MachineFunction anyhow.

llvm-svn: 219650
2014-10-14 01:03:16 +00:00
Eric Christopher 1c5fce0ebb Migrate another set of getSubtargetImpl away.
llvm-svn: 219636
2014-10-13 21:57:44 +00:00
Adrian Prantl 049d21caea Add an assertion about the integrity of the iterator.
Broken parent scope pointers in inlined DIVariables can cause
ensureAbstractVariableIsCreated to insert new abstract scopes, thus
invalidating the iterator in this loop and leading to hard-to-debug
crashes. Useful when manually reducing IR for testcases.

llvm-svn: 219628
2014-10-13 20:44:58 +00:00
Adrian Prantl 13c58820f8 constify the getters in SDNodeDbgValue.
llvm-svn: 219627
2014-10-13 20:43:47 +00:00
Chad Rosier df82a33d42 Refactor debug statement and remove dead argument. NFC.
llvm-svn: 219626
2014-10-13 19:46:39 +00:00
Benjamin Kramer 7000ca3f55 Modernize old-style static asserts. NFC.
llvm-svn: 219588
2014-10-12 17:56:40 +00:00
David Blaikie 325c5757aa Revert "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself."
This invariant is violated (& the assertions fire) on some Objective C++
in the test-suite. Reverting while I investigate.

This reverts commit r219215.

llvm-svn: 219523
2014-10-10 18:46:21 +00:00
Hal Finkel 7a87f8a670 [MiSched] Fix a logic error in tryPressure()
Fixes a logic error in the MachineScheduler found by Steve Montgomery (and
confirmed by Andy). This has gone unfixed for months because the fix has been
found to introduce some small performance regressions. However, Andy has
recommended that, at this point, we fix this to avoid further dependence on the
incorrect behavior (and then follow-up separately on any regressions), and I
agree.

Fixes PR18883.

llvm-svn: 219512
2014-10-10 17:06:20 +00:00
David Blaikie 7d6f29d1ee Simplify a few uses of DwarfDebug::SPMap
llvm-svn: 219510
2014-10-10 16:59:52 +00:00
Timur Iskhodzhanov 2cf8a1ded8 Reorder functions in WinCodeViewLineTables.cpp [NFC]
This helps read the comments and understand the code in a natural order

llvm-svn: 219508
2014-10-10 16:05:32 +00:00
Benjamin Kramer 2c99e413ba Reduce double set lookups. NFC.
llvm-svn: 219505
2014-10-10 15:32:50 +00:00
Timur Iskhodzhanov 7edfc5948b Fix a small typo, NFC
llvm-svn: 219492
2014-10-10 12:52:58 +00:00
David Blaikie 4191cbce8c Sink the per-CU part of DwarfDebug::finishSubprogramDefinitions into DwarfCompileUnit.
llvm-svn: 219477
2014-10-10 06:39:29 +00:00
David Blaikie 58410f241e Sink most of DwarfDebug::constructAbstractSubprogramScopeDIE down into DwarfCompileUnit.
llvm-svn: 219476
2014-10-10 06:39:26 +00:00
David Blaikie 9ab48849ad Avoid unnecessary map lookup/insertion.
llvm-svn: 219466
2014-10-10 03:09:38 +00:00
Sanjay Patel 3d497cd778 Improve sqrt estimate algorithm (fast-math)
This patch changes the fast-math implementation for calculating sqrt(x) from:
y = 1 / (1 / sqrt(x))
to:
y = x * (1 / sqrt(x))

This has 2 benefits: less code / faster code and one less estimate instruction 
that may lose precision.

The only target that will be affected (until http://reviews.llvm.org/D5658 is approved)
is PPC. The difference in codegen for PPC is 2 less flops for a single-precision sqrtf
or vector sqrtf and 4 less flops for a double-precision sqrt. 
We also eliminate a constant load and extra register usage.

Differential Revision: http://reviews.llvm.org/D5682

llvm-svn: 219445
2014-10-09 21:26:35 +00:00
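
The identity behind r219445 can be sketched in plain C++. This is only an illustration, not the DAGCombiner code: `rsqrtEstimate` is a hypothetical stand-in for a hardware reciprocal-square-root estimate followed by Newton-Raphson refinement, and the 5% seed error is an assumption chosen for the demo.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a hardware reciprocal-sqrt estimate: start from a
// deliberately perturbed seed and refine it with Newton-Raphson steps.
inline float rsqrtEstimate(float x, int steps = 3) {
  float e = (1.0f / std::sqrt(x)) * 1.05f;  // rough seed, about 5% off (assumption)
  for (int i = 0; i < steps; ++i)
    e = e * (1.5f - 0.5f * x * e * e);      // e' = e * (1.5 - 0.5 * x * e^2)
  return e;
}

// Old form: sqrt(x) = 1 / (1 / sqrt(x)), which needs a second reciprocal.
inline float sqrtOld(float x) { return 1.0f / rsqrtEstimate(x); }

// New form: sqrt(x) = x * (1 / sqrt(x)), which needs only one extra multiply.
inline float sqrtNew(float x) { return x * rsqrtEstimate(x); }
```

Both forms agree for positive inputs. Note that x == 0 needs separate handling in either form, since 0 * infinity is NaN.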
Sanjay Patel 6d28da10e5 delete function names from comments
llvm-svn: 219444
2014-10-09 21:24:46 +00:00
David Blaikie 73cc705a37 Remove unused parameter
llvm-svn: 219440
2014-10-09 20:36:27 +00:00
David Blaikie 78b65b6f2c Sink DwarfDebug::createAndAddScopeChildren down into DwarfCompileUnit.
llvm-svn: 219437
2014-10-09 20:26:15 +00:00
David Blaikie 1d072348cf Sink DwarfDebug::constructSubprogramScopeDIE down into DwarfCompileUnit
llvm-svn: 219436
2014-10-09 20:21:36 +00:00
David Blaikie 8b2fdb83c5 Sink DwarfDebug::createScopeChildrenDIE down into DwarfCompileUnit.
llvm-svn: 219422
2014-10-09 18:24:28 +00:00
Lang Hames 8f31f448c5 [PBQP] Replace PBQPBuilder with composable constraints (PBQPRAConstraint).
This patch removes the PBQPBuilder class and its subclasses and replaces them
with a composable constraints class: PBQPRAConstraint. This allows constraints
that are only required for optimisation (e.g. coalescing, soft pairing) to be
mixed and matched.

This patch also introduces support for target writers to supply custom
constraints for their targets by overriding a TargetSubtargetInfo method:

std::unique_ptr<PBQPRAConstraints> getCustomPBQPConstraints() const;

This patch should have no effect on allocations.

llvm-svn: 219421
2014-10-09 18:20:51 +00:00
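
The "composable constraints" idea in r219421 can be sketched roughly as below. All names here (`Problem`, `Constraint`, `ConstraintList`, and the three concrete constraints) are hypothetical and chosen for illustration; they are not the actual PBQP classes.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the allocation problem being built.
struct Problem {
  std::vector<std::string> applied;  // records which constraints ran, in order
};

// Each constraint contributes its part of the problem independently.
struct Constraint {
  virtual ~Constraint() = default;
  virtual void apply(Problem &p) = 0;
};

struct SpillCosts : Constraint {
  void apply(Problem &p) override { p.applied.push_back("spill-costs"); }
};
struct Interference : Constraint {
  void apply(Problem &p) override { p.applied.push_back("interference"); }
};
struct Coalescing : Constraint {  // optimisation-only, can be left out
  void apply(Problem &p) override { p.applied.push_back("coalescing"); }
};

// A list of constraints is itself a constraint, so sets can be mixed freely.
struct ConstraintList : Constraint {
  std::vector<std::unique_ptr<Constraint>> items;
  void add(std::unique_ptr<Constraint> c) {
    if (c)  // a target may supply no custom constraint; skip null
      items.push_back(std::move(c));
  }
  void apply(Problem &p) override {
    for (auto &c : items)
      c->apply(p);
  }
};
```

In this sketch a target-specific hook would simply return another `std::unique_ptr<Constraint>` (or null) to be added to the list.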
David Blaikie 4a1a44e3bf Sink DwarfDebug.cpp::constructVariableDIE into DwarfCompileUnit.
llvm-svn: 219419
2014-10-09 17:56:39 +00:00
David Blaikie ee7df55306 Move DwarfUnit::constructVariableDIE down to DwarfCompileUnit, since it's only needed there.
llvm-svn: 219418
2014-10-09 17:56:36 +00:00
David Blaikie 0fbf8bdb08 Sink DwarfDebug::constructLexicalScopeDIE into DwarfCompileUnit
llvm-svn: 219414
2014-10-09 17:08:42 +00:00
David Blaikie a09bd0a15a Missing reformatting
llvm-svn: 219413
2014-10-09 17:08:38 +00:00
David Blaikie 01b48a84dc Sink DwarfDebug::constructInlinedScopeDIE into DwarfCompileUnit
This introduces access to the AbstractSPDies map from DwarfDebug so
DwarfCompileUnit can access it. Eventually this'll sink down to
DwarfFile, but it'll still be generically accessible - not much
encapsulation to provide it. (constructInlinedScopeDIE could stay
further up, in DwarfFile to avoid exposing this - but I don't think
that's particularly better)

llvm-svn: 219411
2014-10-09 16:50:53 +00:00
Eric Christopher edba30c434 Remove more calls to getSubtargetImpl from the schedulers and
remove cached or unnecessary TargetMachines.

llvm-svn: 219387
2014-10-09 06:28:06 +00:00
Eric Christopher 143f02c47d Remove unused argument to CreateTargetScheduleState and change
the TargetMachine to a TargetSubtargetInfo since everything
we wanted is off of that.

llvm-svn: 219382
2014-10-09 01:59:35 +00:00
Eric Christopher caf275126e Remove uses of getSubtargetImpl from ResourcePriorityQueue and
replace them with calls off of the MachineFunction.

llvm-svn: 219381
2014-10-09 01:59:31 +00:00
Eric Christopher 147c2ea05a Remove the uses of getSubtargetImpl from InstrEmitter and remove
the now unused TargetMachine variable.

llvm-svn: 219379
2014-10-09 01:35:29 +00:00
Eric Christopher 85de8f98a9 Use the subtarget on the dag to get TargetFrameLowering rather
than off the target machine.

llvm-svn: 219378
2014-10-09 01:35:27 +00:00
Eric Christopher 2ae2de7562 Remove uses of the TargetMachine from FunctionLoweringInfo
via caching TargetLowering and using the MachineFunction.

llvm-svn: 219375
2014-10-09 00:57:31 +00:00
David Blaikie de12375c96 Push DwarfDebug::attachRangesOrLowHighPC down into DwarfCompileUnit
llvm-svn: 219372
2014-10-09 00:21:42 +00:00
David Blaikie 524002004d Sink DwarfDebug::addScopeRangeList down into DwarfCompileUnit
(& add a few accessors/make a couple of things public for this - it's a
bit of a toss-up, but I think I prefer it this way, keeping some more of
the meaty code down in DwarfCompileUnit - if only to make for smaller
implementation files, etc)

I think we could simplify range handling a bit if we removed the range
lists from each unit and just put a single range list on DwarfDebug,
similar to address pooling.

llvm-svn: 219370
2014-10-09 00:11:39 +00:00
Eric Christopher 40cba91ad1 Remove unnecessary include.
llvm-svn: 219368
2014-10-08 23:38:40 +00:00
Eric Christopher f55d4714d2 Use both the cached TLI and the subtarget off of the DAG in
the DAG combiner.

llvm-svn: 219367
2014-10-08 23:38:39 +00:00
Eric Christopher 4e3d6ded99 Remove getSubtargetImpl calls from FastISel, we can get it from
the MachineFunction where it's already cached.

llvm-svn: 219366
2014-10-08 23:38:33 +00:00
David Blaikie e5feec502d Sink DwarfUnit::addSectionDelta into DwarfCompileUnit, the only place it's needed.
llvm-svn: 219364
2014-10-08 23:30:05 +00:00
David Blaikie 33702a31e8 Reformat some stuff I missed in recent previous commits
llvm-svn: 219356
2014-10-08 23:09:42 +00:00
David Blaikie 6c0ee4ece3 Sink and coalesce DwarfDebug.cpp::addSectionLabel and DwarfUnit::addSectionLabel down into DwarfCompileUnit::addSectionLabel
llvm-svn: 219351
2014-10-08 22:46:27 +00:00
Eric Christopher ffcbe9b048 Remove dead call to getTypeToTransformTo. The result is
unused.

llvm-svn: 219347
2014-10-08 22:25:45 +00:00
David Blaikie f76aeaec66 DebugInfo: The rest of pushing DwarfDebug::constructScopeDIE down into DwarfCompileUnit
Funnily enough, I copied it, but didn't actually remove the original in
r219345. Let's do that.

llvm-svn: 219346
2014-10-08 22:23:10 +00:00
David Blaikie 9c65b1355c Push DwarfDebug::constructScopeDIE down into DwarfCompileUnit
One of many steps to generalize subprogram emission to both the DWO and
non-DWO sections (to emit -gmlt-like data under fission). Once the
functions are pushed down into DwarfCompileUnit some of the data
structures will be pushed at least into DwarfFile so that they can be
unique per-file, allowing emission to both files independently.

llvm-svn: 219345
2014-10-08 22:20:02 +00:00
Eric Christopher 1e845f269b Remove a bunch of getSubtargetImpl calls since we already have
a cached TLI instance.

llvm-svn: 219342
2014-10-08 21:08:32 +00:00
Timur Iskhodzhanov 5fcaeebb72 Fix COFF section index relocation should be 16 bits, not 32
Original patch by Andrey Guskov!
http://reviews.llvm.org/D5651

llvm-svn: 219327
2014-10-08 18:01:49 +00:00
Eric Christopher 58a2461368 Use the TargetLowering information we already have on the
SelectionDAG in SelectionDAGBuilder rather than going through
the TargetMachine for lookup.

llvm-svn: 219292
2014-10-08 09:50:54 +00:00
Eric Christopher d2670a34c9 Grab the TargetRegisterInfo off of the subtarget from the
MachineFunction rather than a lookup on the TargetMachine
to avoid unnecessary lookups.

llvm-svn: 219291
2014-10-08 09:50:52 +00:00
Eric Christopher 000ef037d4 Replace calls to get the subtarget and TargetFrameLowering with
cached variables and a single call in the constructor.

llvm-svn: 219287
2014-10-08 08:46:34 +00:00
Eric Christopher 51bedaf223 Use cached subtarget rather than looking it up on the
TargetMachine again.

llvm-svn: 219285
2014-10-08 07:51:41 +00:00
Eric Christopher b17140de35 Cache TargetLowering on SelectionDAGISel and update previous
calls to getTargetLowering() with the cached variable.

llvm-svn: 219284
2014-10-08 07:32:17 +00:00
Eric Christopher 60eb343e83 Cache SelectionDAGISel TargetInstrInfo lookups on the class and
propagate. Also use the TargetSubtargetInfo and the MachineFunction
and move TargetRegisterInfo query closer to uses.

llvm-svn: 219273
2014-10-08 01:58:03 +00:00
Eric Christopher 896044822e Reset the target options and optimization level as the first
thing we do inside selection dag. This code needs to be
migrated to queries on the function rather than global
data, but this organizes things before we start grabbing
the subtarget.

llvm-svn: 219271
2014-10-08 01:58:01 +00:00
Eric Christopher 8d07f44560 Have the selection dag grab TargetLowering off of the subtarget
inside init rather than have it passed in as an argument.

llvm-svn: 219270
2014-10-08 01:57:58 +00:00
Eric Christopher d9636c1dcf Have SelectionDAG's subtarget TargetSelectionDAGInfo be set
during init rather than construction time.

llvm-svn: 219262
2014-10-08 00:32:59 +00:00
Sanjay Patel 25d3c1cf61 typos
llvm-svn: 219221
2014-10-07 17:38:33 +00:00
Sanjay Patel eb0cc1bbf3 typos
llvm-svn: 219220
2014-10-07 17:36:50 +00:00
David Blaikie ff669d1723 DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself.
Let me tell you a tale...

Originally committed in r211723 after discovering a nasty case of weird
scoping due to inlining, this was reverted in r211724 after it fired in
ASan/compiler-rt.

(minor diversion where I accidentally committed/reverted again in
r211871/r211873)

After further testing and fixing bugs in ArgumentPromotion (r211872) and
Inlining (r212065) it was recommitted in r212085. Reverted in r212089
after the sanitizer buildbots still showed problems.

Fixed another bug in ArgumentPromotion (r212128) found by this
assertion.

Recommitted in r212205, reverted in r212226 after it crashed some more
on sanitizer buildbots.

Fix clang some more in r212761.

Recommitted in r212776, reverted in r212793. ASan failures.
Recommitted in r213391, reverted in r213432, trying to reproduce flakey
ASan build failure.

Fixed bugs in r213805 (ArgPromo + DebugInfo), r213952
(LiveDebugVariables strips dbg_value intrinsics in functions not
described by debug info).

Recommitted in r214761, reverted in r214999, flakey failure on Windows
buildbot.

Fixed DeadArgElimination + DebugInfo bug in r219210.

Recommitting and hoping that's the last of it.

[That one burned down, fell over, then sank into the swamp.]

llvm-svn: 219215
2014-10-07 16:56:20 +00:00
Hal Finkel 9808595319 [DAGCombine] Remove SIGN_EXTEND-related inf-loop
The patch's author points out that, despite the function's documentation,
getSetCCResultType is only used to get the SETCC result type (with one
here-removed problematic exception). In one case, getSetCCResultType was being
used to get the predicate type to use for a SELECT node, and then
SIGN_EXTENDing (or truncating) to get the input predicate to match that type.
Unfortunately, this was happening inside visitSIGN_EXTEND, and creating new
SIGN_EXTEND nodes was causing an infinite loop. In addition, this behavior was
wrong if a target was not using ZeroOrNegativeOneBooleanContent. Lastly, the
extension/truncation seems unnecessary here: SELECT is defined as:

  Select(COND, TRUEVAL, FALSEVAL). If the type of the boolean COND is not i1
  then the high bits must conform to getBooleanContents.

So here we remove this use of getSetCCResultType and update
getSetCCResultType's documentation to reflect its actual uses.

Patch by deadal nix!

llvm-svn: 219141
2014-10-06 20:19:47 +00:00
Sanjay Patel 7bc9185ab5 Fast-math fold: x / (y * sqrt(z)) -> x * (rsqrt(z) / y)
The motivation is to recognize code such as this from /llvm/projects/test-suite/SingleSource/Benchmarks/BenchmarkGame/n-body.c:

float distance = sqrt(dx * dx + dy * dy + dz * dz);
float mag = dt / (distance * distance * distance);

Without this patch, we don't match the sqrt as a reciprocal sqrt, so for PPC the new testcase in this patch produces:

   addis 3, 2, .LCPI4_2@toc@ha
   lfs 4, .LCPI4_2@toc@l(3)
   addis 3, 2, .LCPI4_1@toc@ha
   lfs 0, .LCPI4_1@toc@l(3)
   fcmpu 0, 1, 4
   beq 0, .LBB4_2
# BB#1:
   frsqrtes 4, 1
   addis 3, 2, .LCPI4_0@toc@ha
   lfs 5, .LCPI4_0@toc@l(3)
   fnmsubs 13, 1, 5, 1
   fmuls 6, 4, 4
   fmadds 1, 13, 6, 5
   fmuls 1, 4, 1
   fres 4, 1                <--- reciprocal of reciprocal square root
   fnmsubs 1, 1, 4, 0
   fmadds 4, 4, 1, 4
.LBB4_2:
   fmuls 1, 4, 2
   fres 2, 1
   fnmsubs 0, 1, 2, 0
   fmadds 0, 2, 0, 2
   fmuls 1, 3, 0
   blr

After the patch, this simplifies to:

frsqrtes 0, 1
addis 3, 2, .LCPI4_1@toc@ha
fres 5, 2
lfs 4, .LCPI4_1@toc@l(3)
addis 3, 2, .LCPI4_0@toc@ha
lfs 7, .LCPI4_0@toc@l(3)
fnmsubs 13, 1, 4, 1
fmuls 6, 0, 0
fnmsubs 2, 2, 5, 7
fmadds 1, 13, 6, 4
fmadds 2, 5, 2, 5
fmuls 0, 0, 1
fmuls 0, 0, 2
fmuls 1, 3, 0
blr

Differential Revision: http://reviews.llvm.org/D5628

llvm-svn: 219139
2014-10-06 19:31:18 +00:00
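
The fold in r219139 rests on a simple algebraic identity, which can be checked numerically. The sketch below is plain C++ with hypothetical function names; `1.0 / std::sqrt(z)` stands in for the reciprocal-square-root estimate a target would supply.

```cpp
#include <cassert>
#include <cmath>

// Original expression: one sqrt, one multiply, one divide by the product.
inline double unfolded(double x, double y, double z) {
  return x / (y * std::sqrt(z));
}

// Folded form: x * (rsqrt(z) / y). Here rsqrt is computed exactly; under
// fast-math a target could replace it with an estimate sequence.
inline double folded(double x, double y, double z) {
  double rsqrt = 1.0 / std::sqrt(z);
  return x * (rsqrt / y);
}
```

The two forms are mathematically equal but may round differently in floating point, which is why the transformation is restricted to fast-math.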
Benjamin Kramer 6bf8af5de9 DbgValueHistoryCalculator: Store modified registers in a BitVector instead of std::set.
And iterate over the smaller map instead of the larger set first.  Reduces the time spent in
calculateDbgValueHistory by 30-40%.

llvm-svn: 219123
2014-10-06 15:31:04 +00:00
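
A rough sketch of the data-structure swap described in r219123, under the assumption that register numbers are small dense integers. `std::vector<bool>` is used here as a stand-in for a bit vector, and both struct names are hypothetical.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Tree-based tracking: each insert and lookup walks a balanced tree.
struct RegTree {
  std::set<unsigned> regs;
  void mark(unsigned r) { regs.insert(r); }
  bool test(unsigned r) const { return regs.count(r) != 0; }
};

// Bit-based tracking: one bit per register, constant-time mark and test.
struct RegBits {
  std::vector<bool> bits;
  explicit RegBits(unsigned numRegs) : bits(numRegs, false) {}
  void mark(unsigned r) { bits[r] = true; }
  bool test(unsigned r) const { return bits[r]; }
};
```

Both answer the same membership queries; the bit-based version trades a fixed up-front allocation for cheaper operations.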
David Blaikie febfafd13a DebugInfo: Sink constructImportedEntityDIE down into DwarfUnit from DwarfDebug.
It was just calling a bunch of DwarfUnit functions anyway, as can be
seen by the simplification of removing "TheCU" from all the function
calls in the implementation.

llvm-svn: 219103
2014-10-06 05:37:24 +00:00
Chandler Carruth daa1ff985c [x86, dag] Teach the DAG combiner to prune inputs to a vector_shuffle
that are unused.

This allows the combiner to delete math feeding shuffles where the math
isn't actually necessary. This improves some of the vperm2x128 tests
that regressed when the vector shuffle lowering started actually
generating vperm instructions rather than forcibly decomposing them.

Sadly, this isn't enough to get this *really* right because we still
form a completely unnecessary permutation. To fix that, we also need to
fold shuffles which just rearrange concatenated or inserted subvectors.

llvm-svn: 219086
2014-10-05 19:14:34 +00:00
David Blaikie 60b8662ea7 Remove unused map
This became unnecessary/unused in r208636

llvm-svn: 219085
2014-10-05 16:31:13 +00:00
Benjamin Kramer 2e52f02864 Make AAMDNodes ctor and operator bool (!!!) explicit, mop up bugs and weirdness exposed by it.
llvm-svn: 219068
2014-10-04 22:44:29 +00:00
Benjamin Kramer c6cc58e703 Remove unnecessary copying or replace it with moves in a bunch of places.
NFC.

llvm-svn: 219061
2014-10-04 16:55:56 +00:00
David Blaikie cda2aa823e Sink DwarfDebug::updateSubprogramScopeDIE into DwarfCompileUnit
This requires exposing some of the current function state from
DwarfDebug. I hope there's not too much of that to expose as I go
through all the functions, but it still seems nicer to expose singular
data down to multiple consumers, than have consumers expose raw mapping
data structures up to DwarfDebug for building subprograms.

Part of a series of refactoring to allow subprograms in both the
skeleton and dwo CUs under Fission.

llvm-svn: 219060
2014-10-04 16:24:00 +00:00
David Blaikie 8945219dc9 Reformatting accidentally left out of r219057
llvm-svn: 219059
2014-10-04 16:00:26 +00:00
David Blaikie 14499a7d68 Sink DwarfDebug::attachLowHighPC into DwarfCompileUnit
One of many things to sink down into DwarfCompileUnit to allow handling
of subprograms in both the skeleton and dwo CU under Fission.

llvm-svn: 219058
2014-10-04 15:58:47 +00:00
David Blaikie 37c5231051 Move DwarfCompileUnit from DwarfUnit.h to its own header (DwarfCompileUnit.h)
In preparation for sinking all the subprogram emission code down from
DwarfDebug into DwarfCompileUnit, this will avoid bloating
DwarfUnit.h/cpp greatly and make concerns a bit more clear/isolated.

(sinking this handling down is part of the work to handle emitting
minimal subprograms for -gmlt-like data into the skeleton CU under
fission)

llvm-svn: 219057
2014-10-04 15:49:50 +00:00
Duncan P. N. Exon Smith 176b691d32 Revert "Revert "DI: Fold constant arguments into a single MDString""
This reverts commit r218918, effectively reapplying r218914 after fixing
an Ocaml bindings test and an Asan crash.  The root cause of the latter
was a tightened-up check in `DILexicalBlock::Verify()`, so I'll file a
PR to investigate who requires the loose check (and why).

Original commit message follows.

--

This patch addresses the first stage of PR17891 by folding constant
arguments together into a single MDString.  Integers are stringified and
a `\0` character is used as a separator.

Part of PR17891.

Note: I've attached my testcases upgrade scripts to the PR.  If I've
just broken your out-of-tree testcases, they might help.

llvm-svn: 219010
2014-10-03 20:01:09 +00:00
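
The encoding described above (integers stringified, joined with a `\0` separator) can be sketched as follows. The helper names `foldFields` and `splitFields` are hypothetical and are not part of the LLVM API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stringify each integer and join the results with an embedded NUL character.
inline std::string foldFields(const std::vector<long long> &fields) {
  std::string out;
  for (std::size_t i = 0; i < fields.size(); ++i) {
    if (i != 0)
      out.push_back('\0');
    out += std::to_string(fields[i]);
  }
  return out;
}

// Split a folded string back into its NUL-separated pieces.
inline std::vector<std::string> splitFields(const std::string &s) {
  std::vector<std::string> parts;
  std::size_t start = 0;
  while (true) {
    std::size_t pos = s.find('\0', start);
    if (pos == std::string::npos) {
      parts.push_back(s.substr(start));  // last (or only) piece
      break;
    }
    parts.push_back(s.substr(start, pos - start));
    start = pos + 1;
  }
  return parts;
}
```

Because the separator is a NUL byte, the folded value must be handled as a length-counted string rather than a C string.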
Adam Nemet ff63a2dc51 [ISel] Keep matching state consistent when folding during X86 address match
In the X86 backend, matching an address is initiated by the 'addr' complex
pattern and its friends.  During this process we may reassociate and-of-shift
into shift-of-and (FoldMaskedShiftToScaledMask) to allow folding of the
shift into the scale of the address.

However as demonstrated by the testcase, this can trigger CSE of not only the
shift and the AND which the code is prepared for but also the underlying load
node.  In the testcase this node is sitting in the RecordedNode and MatchScope
data structures of the matcher and becomes a deleted node upon CSE.  Returning
from the complex pattern function, we try to access it again hitting an assert
because the node is no longer a load even though this was checked before.

Now obviously changing the DAG this late is bending the rules but I think it
makes sense somewhat.  Outside of addresses we prefer and-of-shift because it
may lead to smaller immediates (FoldMaskAndShiftToScale is an even better
example because it creates a non-canonical node).  We currently don't recognize
addresses during DAGCombiner where arguably this canonicalization should be
performed.  On the other hand, having this in the matcher allows us to cover
all the cases where an address can be used in an instruction.

I've also talked a little bit to Dan Gohman on llvm-dev who added the RAUW for
the new shift node in FoldMaskedShiftToScaledMask.  This RAUW is responsible
for initiating the recursive CSE on users
(http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-September/076903.html) but it
is not strictly necessary since the shift is hooked into the visited user.  Of
course it's safer to keep the DAG consistent at all times (e.g. for accurate
number of uses, etc.).

So rather than changing the fundamentals, I've decided to continue along the
previous patches and detect the CSE.  This patch installs a very targeted
DAGUpdateListener for the duration of a complex-pattern match and updates the
matching state accordingly.  (Previous patches used HandleSDNode to detect the
CSE but that's not practical here).  The listener is only installed on X86.

I tested that there is no measurable overhead due to this while running
through the spec2k BC files with llc.  The only thing we pay for is the
creation of the listener.  The callback never ever triggers in spec2k since
this is a corner case.

Fixes rdar://problem/18206171

llvm-svn: 219009
2014-10-03 20:00:34 +00:00
Benjamin Kramer e12a6bac32 Eliminate some deep std::vector copies. NFC.
llvm-svn: 218999
2014-10-03 18:33:16 +00:00
Renato Golin 4e31ae1051 Revert 202433 - Provide a target override for the latest regalloc heuristic
That commit was introduced in order to help investigate a problem in ARM
codegen breaking from commit 202304 (Add a limit to the heuristic that register
allocates instructions in local order). Recent analysis indicated that the
problem no longer exists, so I'm reverting this change.

See PR18996.

llvm-svn: 218981
2014-10-03 12:20:53 +00:00
Chandler Carruth 7425c8c279 Fix the threshold added in r186434 (a re-apply of r185393) and updated
to be a ManagedStatic in r218163 to not be a global variable written and
read to from within the innards of SpillPlacement.

This will fix a really scary race condition for anyone that has two
copies of LLVM running spill placement concurrently. Yikes!

This will also fix a really significant compile time hit that r218163
caused because the spill placement threshold read is actually in the
*very* hot path of this code. The memory fence on each read was showing
up as huge compile time regressions when spilling is responsible for
most of the compile time. For example, optimizing sanitized code showed
over 50% compile time regressions here. =/

llvm-svn: 218921
2014-10-02 22:23:14 +00:00
Duncan P. N. Exon Smith 786cd049fc Revert "DI: Fold constant arguments into a single MDString"
This reverts commit r218914 while I investigate some bots.

llvm-svn: 218918
2014-10-02 22:15:31 +00:00
Duncan P. N. Exon Smith 571f97bd90 DI: Fold constant arguments into a single MDString
This patch addresses the first stage of PR17891 by folding constant
arguments together into a single MDString.  Integers are stringified and
a `\0` character is used as a separator.

Part of PR17891.

Note: I've attached my testcases upgrade scripts to the PR.  If I've
just broken your out-of-tree testcases, they might help.

llvm-svn: 218914
2014-10-02 21:56:57 +00:00
Adrian Prantl 87b7eb9d0f Move the complex address expression out of DIVariable and into an extra
argument of the llvm.dbg.declare/llvm.dbg.value intrinsics.

Previously, DIVariable was a variable-length field that has an optional
reference to a Metadata array consisting of a variable number of
complex address expressions. In the case of OpPiece expressions this is
wasting a lot of storage in IR, because when an aggregate type is, e.g.,
SROA'd into all of its n individual members, the IR will contain n copies
of the DIVariable, all alike, only differing in the complex address
reference at the end.

By making the complex address into an extra argument of the
dbg.value/dbg.declare intrinsics, all of the pieces can reference the
same variable and the complex address expressions can be uniqued across
the CU, too.
Down the road, this will allow us to move other flags, such as
"indirection" out of the DIVariable, too.

The new intrinsics look like this:
declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr)
declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr)

This patch adds a new LLVM-local tag to DIExpressions, so we can detect
and pretty-print DIExpression metadata nodes.

What this patch doesn't do:

This patch does not touch the "Indirect" field in DIVariable; but moving
that into the expression would be a natural next step.

http://reviews.llvm.org/D4919
rdar://problem/17994491

Thanks to dblaikie and dexonsmith for reviewing this patch!

Note: I accidentally committed a bogus older version of this patch previously.
llvm-svn: 218787
2014-10-01 18:55:02 +00:00
Adrian Prantl b458dc2eee Revert r218778 while investigating buildbot breakage.
"Move the complex address expression out of DIVariable and into an extra"

llvm-svn: 218782
2014-10-01 18:10:54 +00:00
Adrian Prantl 25a7174e7a Move the complex address expression out of DIVariable and into an extra
argument of the llvm.dbg.declare/llvm.dbg.value intrinsics.

Previously, DIVariable was a variable-length field that has an optional
reference to a Metadata array consisting of a variable number of
complex address expressions. In the case of OpPiece expressions this is
wasting a lot of storage in IR, because when an aggregate type is, e.g.,
SROA'd into all of its n individual members, the IR will contain n copies
of the DIVariable, all alike, only differing in the complex address
reference at the end.

By making the complex address into an extra argument of the
dbg.value/dbg.declare intrinsics, all of the pieces can reference the
same variable and the complex address expressions can be uniqued across
the CU, too.
Down the road, this will allow us to move other flags, such as
"indirection" out of the DIVariable, too.

The new intrinsics look like this:
declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr)
declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr)

This patch adds a new LLVM-local tag to DIExpressions, so we can detect
and pretty-print DIExpression metadata nodes.

What this patch doesn't do:

This patch does not touch the "Indirect" field in DIVariable; but moving
that into the expression would be a natural next step.

http://reviews.llvm.org/D4919
rdar://problem/17994491

Thanks to dblaikie and dexonsmith for reviewing this patch!

llvm-svn: 218778
2014-10-01 17:55:39 +00:00
Jingyue Wu fd47fb9976 Revert r216862 due to a performance regression
Reported by Alexey Volkov in PR21115

llvm-svn: 218771
2014-10-01 15:22:13 +00:00
David Blaikie 32b0f365a2 Implement DW_TAG_subrange_type with DW_AT_count rather than DW_AT_upper_bound
This allows proper disambiguation of unbounded arrays and arrays of zero
bound ("struct foo { int x[]; };" and "struct foo { int x[0]; }"). GCC
instead produces an upper bound of -1 in the latter situation, but count
seems tidier. This way lower_bound is provided if it's not the language
default and count is provided if the count is known, otherwise it's
omitted. Simple.

If someone wants to look at rdar://problem/12566646 and see if this
change is acceptable to that bug/fix, that might be helpful (see the
empty-and-one-elem-array.ll test case which cites that radar).

llvm-svn: 218726
2014-10-01 00:56:55 +00:00
David Blaikie 6cca8109ab Omit DW_AT_inline under -gmlt to save a little more space.
llvm-svn: 218719
2014-09-30 23:29:16 +00:00
David Blaikie 1cae849c04 DebugInfo: Sink the code emitting DW_AT_APPLE_omit_frame_ptr down to a more common spot.
No functional change. Pre-emptive refactoring before I start pushing
some of this subprogram creation down into DWARFCompileUnit so I can
build different subprograms in the skeleton unit from the dwo unit for
adding -gmlt-like data to the skeleton.

llvm-svn: 218713
2014-09-30 22:32:49 +00:00
David Blaikie e1c79749ca Disable the -gmlt optimization implemented in r218129 under Darwin due to issues with dsymutil.
r218129 omits DW_TAG_subprograms which have no inlined subroutines when
emitting -gmlt data. This makes -gmlt very low cost for -O0 builds.

Darwin's dsymutil reasonably considers a CU empty if it has no
subprograms (which occurs with the above optimization in -O0 programs
without any force_inline function calls) and drops the line table, CU,
and everything in this situation, making backtraces impossible.

Until dsymutil is modified to account for this, disable this
optimization on Darwin to preserve the desired functionality.
(see r218545, which should be reverted after this patch, for other
discussion/details)

Footnote:
In the long term, it doesn't look like this scheme (of simplified debug
info to describe inlining to enable backtracing) is tenable, it is far
too size inefficient for optimized code (the DW_TAG_inlined_subprograms,
even once compressed, are nearly twice as large as the line table
itself (also compressed)) and we'll be considering things like Cary's
two level line table proposal to encode all this information directly in
the line table.

llvm-svn: 218702
2014-09-30 21:28:32 +00:00
Sanjay Patel ab7f460bca Use the target-specified iteration count to opt out of any further refinement of an estimate. NFC.
llvm-svn: 218700
2014-09-30 20:44:23 +00:00
Sanjay Patel 8fde95cb2b Split the estimate() interface into separate functions for each type. NFC.
It was hacky to use an opcode as a switch because it won't always match
(rsqrte != sqrte), and it looks like we'll need to add more special casing
per arch than I had hoped for. E.g., x86 will prefer a different NR estimate
implementation. ARM will want to use its 'step' instructions. There also
don't appear to be any new estimate instructions in any arch in a long,
long time. Altivec vloge and vexpte may have been the first and last in
that field...

llvm-svn: 218698
2014-09-30 20:28:48 +00:00
Andrea Di Biagio c7c524129b [DAG] Check in advance if a build_vector has a legal type before attempting to convert it into a shuffle.
Currently, the DAG Combiner only tries to convert type-legal build_vector nodes
into shuffles. This patch simply moves the logic that checks if a
build_vector has a legal value type up before we even start analyzing the
operands. This allows an immediate early exit from method
'visitBUILD_VECTOR' if the node type is known to be illegal.

No functional change intended.

llvm-svn: 218677
2014-09-30 15:30:22 +00:00
Matt Arsenault 93ffe58f90 Add MachineOperand::ChangeToFPImmediate and setFPImm
llvm-svn: 218579
2014-09-28 19:24:59 +00:00
James Molloy 463db9a77c [AArch64] Redundant store instructions should be removed as dead code
If a store is followed by another store of the same value to the same location, one of the two is a dead no-op and can be removed.

This problem is found in spec2006-197.parser.

For example,
  stur    w10, [x11, #-4]
  stur    w10, [x11, #-4]
Then one of the two stur instructions can be removed.

Patch by David Xu!

llvm-svn: 218569
2014-09-27 17:02:54 +00:00
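The redundant-store case in r218569 can be illustrated with a toy peephole. This is only a sketch under stated assumptions: the `Store` record and `dropRepeatedStores` are hypothetical names, not LLVM APIs, and the sketch only handles stores that are directly adjacent, whereas a real pass also has to reason about intervening instructions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a store instruction: value register, base register, offset.
struct Store {
  std::string valueReg;
  std::string baseReg;
  int offset;
};

// Two stores are identical if they write the same register to the same address.
bool sameStore(const Store &a, const Store &b) {
  return a.valueReg == b.valueReg && a.baseReg == b.baseReg &&
         a.offset == b.offset;
}

// Keep a store only if it differs from the store immediately before it.
std::vector<Store> dropRepeatedStores(const std::vector<Store> &in) {
  std::vector<Store> out;
  for (const Store &s : in)
    if (out.empty() || !sameStore(out.back(), s))
      out.push_back(s);
  return out;
}
```

Applied to the two `stur w10, [x11, #-4]` instructions from the message, the sketch keeps exactly one of them.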
Sanjay Patel bdf1e38856 Refactor reciprocal and reciprocal square root estimate into target-independent functions (part 2).
This is purely refactoring. No functional changes intended. PowerPC is the only target
that is currently using this interface.

The ultimate goal is to allow targets other than PowerPC (certainly X86 and AArch64) to turn this:

z = y / sqrt(x)

into:

z = y * rsqrte(x)

And:

z = y / x

into:

z = y * rcpe(x)

using whatever HW magic they can use. See http://llvm.org/bugs/show_bug.cgi?id=20900 .

There is one hook in TargetLowering to get the target-specific opcode for an estimate instruction
along with the number of refinement steps needed to make the estimate usable.

Differential Revision: http://reviews.llvm.org/D5484

llvm-svn: 218553
2014-09-26 23:01:47 +00:00
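The `y / sqrt(x)` to `y * rsqrte(x)` rewrite described in r218553, with "refinement steps" applied to the estimate, can be sketched as follows. The assumptions are significant: `roughRsqrt` is a software stand-in (the well-known bit-trick estimate) for a hardware estimate instruction, it assumes a 32-bit IEEE `float`, and all function names are hypothetical. None of this is the DAGCombiner code itself.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes 32-bit float");

// Stand-in for a hardware reciprocal-square-root estimate: accurate to a few percent.
float roughRsqrt(float x) {
  assert(x > 0.0f && "estimate is only meaningful for positive inputs");
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof bits);
  bits = 0x5f3759dfu - (bits >> 1);
  float e;
  std::memcpy(&e, &bits, sizeof e);
  return e;
}

// One Newton-Raphson refinement step for e ~ 1/sqrt(x).
float refineRsqrt(float x, float e) { return e * (1.5f - 0.5f * x * e * e); }

// Reference form: z = y / sqrt(x).
float divBySqrtExact(float y, float x) { return y / std::sqrt(x); }

// Rewritten form: z = y * rsqrte(x), with a chosen number of refinement steps.
float divBySqrtEstimated(float y, float x, int steps) {
  float e = roughRsqrt(x);
  for (int i = 0; i < steps; ++i)
    e = refineRsqrt(x, e);
  return y * e;
}
```

Each refinement step roughly squares the relative error, which is why the number of steps needed depends on how accurate the target's raw estimate is.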
David Xu 418da223dd Revert patch of r218493
llvm-svn: 218494
2014-09-26 02:28:03 +00:00
David Xu 64f661ee0b Redundant store instructions should be removed as dead code
llvm-svn: 218493
2014-09-26 02:02:09 +00:00
Eric Christopher 3976f78247 Move resetTargetOptions from taking a MachineFunction to a Function
since we are accessing the TargetMachine that we're a member
function of.

llvm-svn: 218489
2014-09-26 01:28:10 +00:00
Bruno Cardoso Lopes d04f7596e7 [MachineSink+PGO] Teach MachineSink to use BlockFrequencyInfo
Machine Sink uses loop depth information to select between successors BBs to
sink machine instructions into, where BBs within smaller loop depths are
preferable.  This patch adds support for choosing between successors by using
profile information from BlockFrequencyInfo instead, whenever the information
is available.

Tested it under SPEC2006 train (average of 30 runs for each program); ~1.5%
execution speedup on average on x86-64 darwin.

<rdar://problem/18021659>

llvm-svn: 218472
2014-09-25 23:14:26 +00:00
Tom Stellard 529efcf9d0 SelectionDAG: Remove #if NDEBUG from check for a post-isel hook
The InstrEmitter will skip the check of MI.hasPostISelHook()
before calling AdjustInstrPostInstrSelection() when NDEBUG
is not defined.

This was added in r140228, and I'm not sure if it is intentional or not,
but it is a likely source for bugs, because it means with
Release+Asserts builds you can forget to set the hasPostISelHook
flag on TableGen definitions and AdjustInstrPostInstrSelection() will
still be called.

llvm-svn: 218458
2014-09-25 18:59:22 +00:00
Robin Morisset 810739d174 Lower idempotent RMWs to fence+load
Summary:
I originally tried doing this specifically for X86 in the backend in D5091,
but it was rather brittle and generally running too late to be general.
Furthermore, other targets may want to implement similar optimizations.
So I reimplemented it at the IR-level, fitting it into AtomicExpandPass
as it interacts with that pass (which could not be cleanly done before
at the backend level).

This optimization relies on a new target hook, which is only used by X86
for now, as the correctness of the optimization on other targets remains
an open question. If it is found correct on other targets, it should be
trivial to enable for them.

Details of the optimization are discussed in D5091.

Test Plan: make check-all + a new test

Reviewers: jfb

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5422

llvm-svn: 218455
2014-09-25 17:27:43 +00:00
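At the source level, the idempotent-RMW lowering in r218455 corresponds roughly to the two functions below. This is a sketch in C++ `std::atomic` terms rather than LLVM IR, the function names are hypothetical, and the memory orderings are picked for illustration. Whether the second form may replace the first is target-dependent, as the message says; the details are in D5091.

```cpp
#include <atomic>
#include <cassert>

// An idempotent read-modify-write: OR with 0 never changes the stored value,
// so the operation is only used for its return value and its ordering effect.
int readViaIdempotentRMW(std::atomic<int> &v) {
  return v.fetch_or(0, std::memory_order_seq_cst);
}

// The shape the optimization aims for: a fence followed by a plain atomic load.
int readViaFenceAndLoad(std::atomic<int> &v) {
  std::atomic_thread_fence(std::memory_order_seq_cst);
  return v.load(std::memory_order_seq_cst);
}
```

Both functions return the current value and leave it unchanged; the difference the optimization cares about is that the second avoids a locked read-modify-write.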
Jiangning Liu 3b096172cf Clear PreferredExtendType in each function-specific FunctionLoweringInfo state.
llvm-svn: 218364
2014-09-24 03:22:56 +00:00
Robin Morisset 6dbbbc28b0 [X86] Make wide loads be managed by AtomicExpand
Summary:
AtomicExpand already had logic for expanding wide loads and stores on LL/SC
architectures, and for expanding wide stores on CmpXchg architectures, but
not for wide loads on CmpXchg architectures. This patch fills this hole,
and makes use of this new feature in the X86 backend.

Only one functional change: we now lose the SynchScope attribute.
It is regrettable, but I have another patch that I will submit soon that will
solve this for all of AtomicExpand (it seemed better to split it apart as it
is a different concern).

Test Plan: make check-all (lots of tests for this functionality already exist)

Reviewers: jfb

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5404

llvm-svn: 218332
2014-09-23 20:59:25 +00:00
Robin Morisset dedef3325f Add AtomicExpandPass::bracketInstWithFences, and use it whenever getInsertFencesForAtomic would trigger in SelectionDAGBuilder
Summary:
The goal is to eventually remove all the code related to getInsertFencesForAtomic
in SelectionDAGBuilder as it is wrong (designed for ARM, not really portable, works
mostly by accident because the backends are overly conservative), and repeats the
same logic that goes in emitLeading/TrailingFence.

In this patch, I make AtomicExpandPass insert the fences as it knows better
where to put them. Because this requires getting the fences and not just
passing an IRBuilder around, I had to change the return type of
emitLeading/TrailingFence.
This code only triggers on ARM for now. Because it is earlier in the pipeline
than SelectionDAGBuilder, it triggers and lowers atomic accesses to atomic so
SelectionDAGBuilder does not add barriers anymore on ARM.

If this patch is accepted I plan to implement emitLeading/TrailingFence for all
backends that setInsertFencesForAtomic(true), which will allow both making them
less conservative and simplifying SelectionDAGBuilder once they are all using
this interface.

This should not cause any functional change so the existing tests are used
and not modified.

Test Plan: make check-all, benefits from existing tests of atomics on ARM

Reviewers: jfb, t.p.northover

Subscribers: aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D5179

llvm-svn: 218329
2014-09-23 20:31:14 +00:00
Lang Hames d5f496d57c [MCJIT] Nuke MachineRelocation and MachineCodeEmitter. Now that the old JIT is
gone they're no longer needed.

llvm-svn: 218320
2014-09-23 18:08:47 +00:00
Sanjay Patel 6a42292795 Use SDValue bool operator to reduce code. No functional change.
llvm-svn: 218314
2014-09-23 16:24:20 +00:00
David Majnemer 597be2ded6 MC: ReadOnlyWithRel section kinds should map to rdata in COFF
Don't consider ReadOnlyWithRel as a writable section in COFF, they
really belong in .rdata.

llvm-svn: 218268
2014-09-22 20:39:23 +00:00
Sanjay Patel b67bd262ea Refactor reciprocal square root estimate into target-independent function; NFC.
This is purely a plumbing patch. No functional changes intended.

The ultimate goal is to allow targets other than PowerPC (certainly X86 and AArch64) to turn this:

z = y / sqrt(x)

into:

z = y * rsqrte(x)

using whatever HW magic they can use. See http://llvm.org/bugs/show_bug.cgi?id=20900 .

The first step is to add a target hook for RSQRTE, take the already target-independent code selfishly hoarded by PPC, and put it into DAGCombiner.

Next steps:

    The code in DAGCombiner::BuildRSQRTE() should be refactored further; tests that exercise that logic need to be added.
    Logic in PPCTargetLowering::BuildRSQRTE() should be hoisted into DAGCombiner.
    X86 and AArch64 overrides for TargetLowering.BuildRSQRTE() should be added.

Differential Revision: http://reviews.llvm.org/D5425

llvm-svn: 218219
2014-09-21 15:19:15 +00:00
Sanjay Patel d649235fc3 mop up: "Don’t duplicate function or class name at the beginning of the comment."
llvm-svn: 218218
2014-09-21 14:48:16 +00:00
Sanjay Patel 69df41e92e mop up: "Don’t duplicate function or class name at the beginning of the comment."
llvm-svn: 218194
2014-09-20 22:39:16 +00:00
David Majnemer b8dbebb31c MC: Treat ReadOnlyWithRel and ReadOnlyWithRelLocal as ReadOnly for COFF
A problem with our old behavior becomes observable under x86-64 COFF
when we need a read-only GV which has an initializer which is referenced
using a relocation: we would mark the section as writable.  Marking the
section as writable interferes with section merging.

This fixes PR21009.

llvm-svn: 218179
2014-09-20 07:31:46 +00:00
Peter Collingbourne 975726345c Fix crash with an insertvalue that produces an empty object.
llvm-svn: 218171
2014-09-20 00:10:47 +00:00
Chris Bieneman 1a98490ce5 Converting SpillPlacement's BlockFrequency threshold to a ManagedStatic to avoid static constructors and destructors.
llvm-svn: 218163
2014-09-19 22:46:28 +00:00
David Blaikie 3a7ce252cc Omit DW_TAG_subprograms for subprograms without inlined subroutines when producing -gmlt data
To reduce the size of -gmlt data, skip the subprograms without any
inlined subroutines. Since we've now got the ability to make these
determinations in the backend (funnily enough - we added the flag so we
wouldn't produce ranges under -gmlt, but with this change we use the
flag, but go back to producing ranges under -gmlt).

Instead, just produce CU ranges to inform the consumer which parts of
the code are described by this CU's line table. Tools could inspect the
line table directly to compute the range, but the CU ranges only seem to
be about 0.5% of object/executable size, so I'm not too worried about
teaching llvm-symbolizer that trick just yet - it's certainly a possible
piece of future work.

Update an llvm-symbolizer test just to demonstrate that this schema is
acceptable there (if it wasn't, the compiler-rt tests would catch this,
but good to have an in-llvm-tree test for llvm-symbolizer's behavior
here)

Building the clang binary with -gmlt with this patch reduces the total
size of object files by 5.1% (5.56% without ranges) without compression
and the executable by 4.37% (4.75% without ranges).

llvm-svn: 218129
2014-09-19 17:03:16 +00:00
Frederic Riss 9ba9efff56 Change DwarfCompileUnit::createGlobalVariable to getOrCreateGlobalVariable.
Summary:
This will allow requesting the creation of a forward declared variable
at its point of use (for imported declarations, this will be
DwarfDebug::constructImportedEntityDIE) rather than having to put the
forward decl in a retention list.

Note that getOrCreateGlobalVariable returns the actual definition DIE when the
routine creates a declaration and a definition DIE. If you agree this is the
right behavior, then I'll have a followup patch that registers the definition
in the DIE map instead of the declaration as it is today (this 'breaks' only
one test, where we test that the imported entity is the declaration). I'm
not sure what's best here, but it's easy enough for a consumer to follow the
DW_AT_specification link to get to the declaration, whereas it takes more
work to find the actual definition from a declaration DIE.

Reviewers: echristo, dblaikie, aprantl

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5381

llvm-svn: 218126
2014-09-19 15:12:03 +00:00
Hal Finkel 62ac736faa Optionally enable more-aggressive FMA formation in DAGCombine
The heuristic used by DAGCombine to form FMAs checks that the FMUL has only one
use, but this is overly-conservative on some systems. Specifically, if the FMA
and the FADD have the same latency (and the FMA does not compete for resources
with the FMUL any more than the FADD does), there is no need for the
restriction, and furthermore, forming the FMA leaving the FMUL can still allow
for higher overall throughput and decreased critical-path length.

Here we add a new TLI callback, enableAggressiveFMAFusion, false by default, to
elide the hasOneUse check. This is enabled for PowerPC by default, as most
PowerPC systems will benefit.

Patch by Olivier Sallenave, thanks!

llvm-svn: 218120
2014-09-19 11:42:56 +00:00
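The situation that r218120 targets, a multiply whose result has more than one use, can be pictured like this. This is a source-level sketch using `std::fma` rather than DAG nodes; the `Results` struct and both function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Results {
  double sum;    // a * b + c
  double scaled; // (a * b) * d, a second use of the product
};

// Conservative form: the product has two uses, so multiply and add stay separate.
Results separateMulAdd(double a, double b, double c, double d) {
  double prod = a * b;
  return {prod + c, prod * d};
}

// Aggressive form: fuse a * b + c anyway and keep the multiply for the other use.
Results fusedMulAdd(double a, double b, double c, double d) {
  double prod = a * b;
  return {std::fma(a, b, c), prod * d};
}
```

The fused form still executes the multiply, but the add no longer waits on it, which is the critical-path argument the message makes. In general the fused sum can differ from the separate one in the last bit, because `std::fma` rounds only once.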
Jiangning Liu ffbc690933 Optimize sext/zext insertion algorithm in back-end.
With this optimization, we no longer always insert zext for values crossing
basic blocks; instead we insert sext if the users of such a value prefer a
signed predicate.

llvm-svn: 218101
2014-09-19 05:30:35 +00:00
David Blaikie 03c3dbeb62 Omit DW_AT_frame_base under -gmlt for size
llvm-svn: 218100
2014-09-19 04:55:05 +00:00
David Blaikie 0b9438b1c1 Describe the -gmlt optimization committed in the previous revision.
llvm-svn: 218099
2014-09-19 04:47:46 +00:00
David Blaikie 73b65d236c Omit all the extra static attributes on subprograms in -gmlt
This omission will be done in a fancier manner once we're dealing with
"put gmlt in the skeleton CUs under fission" - it'll have to be
conditional on the kind of CU we're emitting into (skeleton or gmlt).

llvm-svn: 218098
2014-09-19 04:30:36 +00:00
Hans Wennborg c0f0c511db Fix an it's vs. its typo.
llvm-svn: 218093
2014-09-19 01:14:56 +00:00
Frederic Riss 0baab0cded Revert part of r218041.
The patch moved some logic around in an attempt to generate potentially more
DW_AT_declaration attributes. The patch was flawed though and it stopped
generating the attribute in some cases.

llvm-svn: 218060
2014-09-18 16:41:04 +00:00
Frederic Riss be26dfb595 Always emit DW_AT_declaration attribute when the variable isn't a definition.
Summary:
This doesn't show up today as we don't emit declaration-only variables. This
will be tested when the followup patches implementing import of forward
declared entities land in clang.

Reviewers: echristo, dblaikie, aprantl

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5382

llvm-svn: 218041
2014-09-18 09:38:23 +00:00
Eric Christopher d85ffb1fc0 Add a new pass FunctionTargetTransformInfo. This pass serves as a
shim between the TargetTransformInfo immutable pass and the Subtarget
via the TargetMachine and Function. Migrate a single call from
BasicTargetTransformInfo as an example and provide shims where TargetMachine
begins taking a Function to determine the subtarget.

No functional change.

llvm-svn: 218004
2014-09-18 00:34:14 +00:00
Robin Morisset 25c8e318e4 [X86] Use the generic AtomicExpandPass instead of X86AtomicExpandPass
This required a new hook called hasLoadLinkedStoreConditional to know whether
to expand atomics to LL/SC (ARM, AArch64, in a future patch Power) or to
CmpXchg (X86).

Apart from that, the new code in AtomicExpandPass is mostly moved from
X86AtomicExpandPass. The main result of this patch is to get rid of that
pass, which had lots of code duplicated with AtomicExpandPass.

llvm-svn: 217928
2014-09-17 00:06:58 +00:00
Quentin Colombet ac55b15bf4 [CodeGenPrepare][AddressingModeMatcher] The promotion mechanism was expecting
instructions when truncate, sext, or zext were created. Fix that.

llvm-svn: 217926
2014-09-16 22:36:07 +00:00
Owen Anderson bfc80a45a7 Add back a fallback case for targets that do not or cannot implement getNoopForMachoTarget().
llvm-svn: 217899
2014-09-16 20:28:00 +00:00
Hal Finkel cc4f31d3d7 Fix BasicTTI::getCmpSelInstrCost to deal with illegal vector types
The default implementation of getCmpSelInstrCost, which provides the cost of
icmp/fcmp/select instructions, did not deal sensibly with illegal vector types
that were scalarized. We'd ask for the legalization cost of the vector type,
which would return something like (4, f64) given an input of <4 x double>, and
we'd then check the TLI status of the ISD opcode on that scalar type. This would
result in querying (ISD::VSELECT, f64), for example. Amusingly enough,
ISD::VSELECT on scalar types is marked as Legal by default (as with most other
operations), and most backends never change this because VSELECT is never
generated on scalars. However, seeing the resulting operation as Legal, we'd
neglect to add the scalarization cost before returning. The result is that we'd
grossly under-estimate the cost of cmps/selects on illegal vector types.

Now, if type legalization clearly results in scalarization, we skip the early
return and add the scalarization cost.

llvm-svn: 217859
2014-09-16 04:35:50 +00:00
David Blaikie ba656e1d7c DebugInfo: Add comment describing the need to disable address pool usage in skeleton units.
Post commit review from Eric Christopher.

llvm-svn: 217842
2014-09-15 22:41:25 +00:00
Sanjay Patel d4f4c4e416 Replace repeated null checks with an assert. NFC.
Without a vector to hold the created ops, these 
functions don't have any use.

llvm-svn: 217831
2014-09-15 21:52:51 +00:00
Juergen Ributzka d111d29f90 [FastISel] Move optimizeCmpPredicate to FastISel base class. NFC.
Make the optimizeCmpPredicate function available to all targets.

llvm-svn: 217822
2014-09-15 20:47:13 +00:00
Sanjay Patel bb29221129 Replace dead links to "Hacker's Delight" with general references. NFC.
llvm-svn: 217814
2014-09-15 19:47:44 +00:00
Rafael Espindola 6865d6f08a Fix a lot of confusion around inserting nops on empty functions.
On MachO, and MachO only, we cannot have a truly empty function since that
breaks the linker logic for atomizing the section.

When we are emitting a frame pointer, the presence of an unreachable will
create a cfi instruction pointing past the last instruction. This is perfectly
fine. The FDE information encodes the pc range it applies to. If some tool
cannot handle this, we should explicitly say which bug we are working around
and only work around it when it is actually relevant (not for ELF for example).

Given the unreachable we could omit the .cfi_def_cfa_register, but then
again, we could also omit the entire function prologue if we wanted to.

llvm-svn: 217801
2014-09-15 18:32:58 +00:00
Quentin Colombet 9dcb724d31 [CodeGenPrepare][AddressingModeMatcher] Fix a think-o for the sext(zext) -> zext promotion
introduced in r217629.
We were returning the old sext instead of the new zext as the promoted instruction!

Thanks Joerg Sonnenberger for the test case.

llvm-svn: 217800
2014-09-15 18:26:58 +00:00
Yaron Keren 66b0cebf7f In DwarfEHPrepare, after all passes are run, RewindFunction may be a dangling
pointer to a dead function. To make sure it's valid, doFinalization nullptrs
RewindFunction just like the constructor and so it will be found on next run.

llvm-svn: 217737
2014-09-14 20:36:28 +00:00
Owen Anderson e68ca8d4ba Allow targets to custom legalize vector insertion and extraction.
llvm-svn: 217711
2014-09-12 22:16:11 +00:00
Owen Anderson ec4f873d34 Remove an unnecessary restriction. MIsNeedChainEdge() should be checked even when scheduler AliasAnalysis is not
enabled.  A good chunk of MIsNeedChainEdge() is logic that is valid and should be applied even for targets
that are not using alias analysis.

llvm-svn: 217706
2014-09-12 21:17:55 +00:00
Benjamin Kramer 6d527ef9d6 Legalizer: Use the scalar bit width when promoting bit counting instrs on
vectors.

E.g., when promoting ctlz from <2 x i32> to <2 x i64> we have to fix up
the result by 32 bits, not 64. PR20917.

llvm-svn: 217671
2014-09-12 12:50:27 +00:00
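The fix-up amount in r217671 can be checked on a single lane. This is a sketch: `ctlz64` is a portable loop standing in for the promoted operation, and `ctlz32ViaPromotion` is a hypothetical name. One reading of the bug is that whole-vector widths (64 and 128 bits, a difference of 64) were used where per-element widths (32 and 64 bits, a difference of 32) are needed.

```cpp
#include <cassert>
#include <cstdint>

// Portable count-leading-zeros on a 64-bit value; returns 64 for zero.
unsigned ctlz64(std::uint64_t v) {
  unsigned n = 0;
  for (std::uint64_t bit = 1ULL << 63; bit != 0 && (v & bit) == 0; bit >>= 1)
    ++n;
  return n;
}

// ctlz of one i32 lane, computed on the promoted i64 lane and then corrected.
unsigned ctlz32ViaPromotion(std::uint32_t lane) {
  const unsigned origBits = 32;     // scalar width before promotion
  const unsigned promotedBits = 64; // scalar width after promotion
  unsigned wide = ctlz64(static_cast<std::uint64_t>(lane)); // zero-extend first
  return wide - (promotedBits - origBits); // subtract the 32 extra zero bits
}
```

Zero-extension adds exactly 32 leading zero bits to every lane, so 32 is the amount to subtract regardless of how many lanes the vector has.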
Quentin Colombet b2c5c6dde3 [CodeGenPrepare] Teach the addressing mode matcher how to promote zext.
I.e., teach it about 'sext (zext a to ty) to ty2' => zext a to ty2.

llvm-svn: 217629
2014-09-11 21:22:14 +00:00
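The identity behind r217629, `sext (zext a to ty) to ty2` being the same as `zext a to ty2`, can be seen with fixed-width integers. This is a sketch with concrete widths chosen as assumptions (i8 for `a`, i16 for `ty`, i32 for `ty2`) and hypothetical function names.

```cpp
#include <cassert>
#include <cstdint>

// sext (zext a to i16) to i32
std::int32_t sextOfZext(std::uint8_t a) {
  std::uint16_t z = a; // zext: the top bit of z is always clear
  return static_cast<std::int32_t>(static_cast<std::int16_t>(z)); // sext
}

// zext a to i32
std::int32_t zextOnly(std::uint8_t a) {
  return static_cast<std::int32_t>(static_cast<std::uint32_t>(a));
}
```

Because the zext leaves the sign bit of the intermediate value clear, the following sext can only replicate a zero bit, which is exactly what a single wider zext does.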
David Blaikie 6741bb09bb Remove the unused string section symbol parameter from DwarfFile::emitStrings
And since it /looked/ like the DwarfStrSectionSym was unused, I tried
removing it - but then it turned out that DwarfStringPool was
reconstructing the same label (and expecting it to have already been
emitted) and uses that.

So I kept it around, but wanted to pass it in to users - since it seemed
a bit silly for DwarfStringPool to have it passed in and returned but
itself have no use for it. The only two users don't handle strings in
both .dwo and .o files so they only ever need the one symbol - no need
to keep it (and have an unused symbol) in the DwarfStringPool used for
fission/.dwo.

Refactor a bunch of accelerator table usage to remove duplication so I
didn't have to touch 4-5 callers.

llvm-svn: 217628
2014-09-11 21:12:48 +00:00
Matt Arsenault 8239eaab99 Add DAG combine for shl + add of constants.
Do
 (shl (add x, c1), c2) -> (add (shl x, c2), c1 << c2)

This is already done for multiplies, but since multiplies
by powers of two are turned into shifts, we also need
to handle it here.

This might want checks for isLegalAddImmediate to avoid
transforming an add of a legal immediate into one that isn't.

llvm-svn: 217610
2014-09-11 17:34:19 +00:00
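The combine in r217610 relies on shifts distributing over addition in wrapping arithmetic. Below is a small sketch of both forms on 32-bit unsigned values; the function names are hypothetical and the code models the arithmetic only, not the DAG combine.

```cpp
#include <cassert>
#include <cstdint>

// Original form: (shl (add x, c1), c2)
std::uint32_t shlOfAdd(std::uint32_t x, std::uint32_t c1, unsigned c2) {
  assert(c2 < 32 && "shift amount must be smaller than the bit width");
  return (x + c1) << c2;
}

// Combined form: (add (shl x, c2), c1 << c2), where c1 << c2 folds to a constant
std::uint32_t addOfShl(std::uint32_t x, std::uint32_t c1, unsigned c2) {
  assert(c2 < 32 && "shift amount must be smaller than the bit width");
  return (x << c2) + (c1 << c2);
}
```

The two agree even when the add overflows, since both sides are computed modulo 2^32. The caveat in the message still applies: `c1 << c2` may be a larger immediate than `c1` and may no longer be legal for the target's add instruction.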
Sanjay Patel 7bd228a82e Combine fmul vector FP constants when unsafe math is allowed.
This is an extension of the change made with r215820:
http://llvm.org/viewvc/llvm-project?view=revision&revision=215820

That patch allowed combining of splatted vector FP constants that are multiplied.

This patch allows combining non-uniform vector FP constants too by relaxing the
check on the type of vector. Also, canonicalize a vector fmul in the
same way that we already do for scalars - if only one operand of the fmul is a
constant, make it operand 1. Otherwise, we miss potential folds.

This fold is also done by -instcombine, but it's possible that extra
fmuls may have been generated during lowering.

Differential Revision: http://reviews.llvm.org/D5254

llvm-svn: 217599
2014-09-11 15:45:27 +00:00
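The vector fmul combine in r217599 can be sketched element-wise. This assumes the fold in question has the shape `(x * c1) * c2` to `x * (c1 * c2)`; `Vec4` and the function names are hypothetical. The constants here are non-uniform (not splats), which is the case this patch adds.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Vec4 = std::array<float, 4>;

// Element-wise multiply of two 4-lane vectors.
Vec4 mul(const Vec4 &a, const Vec4 &b) {
  Vec4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = a[i] * b[i];
  return r;
}

// Before the combine: two multiplies on the variable operand.
Vec4 unfolded(const Vec4 &x, const Vec4 &c1, const Vec4 &c2) {
  return mul(mul(x, c1), c2);
}

// After the combine: the constants are multiplied once, leaving a single fmul on x.
Vec4 folded(const Vec4 &x, const Vec4 &c1, const Vec4 &c2) {
  return mul(x, mul(c1, c2));
}
```

Floating-point multiplication is not associative in general, so the two forms can round differently; that is why the fold is only allowed under unsafe math.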
David Xu f7aff68fe3 Build correct vector filled with undef nodes
llvm-svn: 217570
2014-09-11 05:10:28 +00:00
Adrian Prantl 1383d6f808 Cleanup: Use the appropriate API for accessing the DIVariable of a
DBG_VALUE intrinsic.

llvm-svn: 217533
2014-09-10 18:52:29 +00:00