Commit Graph

24 Commits

Author SHA1 Message Date
Yuanfang Chen a41c8b8fd5 [ADT] support fixed-width output with `utohexstr`
Will use it to output a hash value that needs fixed-width.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D118427
2022-01-28 10:07:54 -08:00
Ben Vanik 2fcffcd0e8 [ADT] Simplifying hex string parsing so it runs faster in debug modes.
This expands the lookup table statically and avoids routing through methods that
contain asserts (like StringRef/std::string element accessors and drop_front)
such that performance is more predictable across compilation environments. This
was primarily driven by slow debug mode performance but has a large benefit in
release builds as well.

```
ssd_mobilenet_v2_face_float (42MB .mlir)
  Debug/MSVC (old):  5.22s
  Debug/MSVC (new):  0.16s
Release/MSVC (old):  0.81s
Release/MSVC (new):  0.02s

huggingface_minilm (536MB .mlir)
  Debug/MSVC (old): 65.31s
  Debug/MSVC (new):  2.03s
Release/MSVC (old):  9.93s
Release/MSVC (new):  0.27s
```

Now in debug the time is split evenly between lexString, tryGetFromHex, and
element attrs hashing, with the next step to making it faster being to combine
the work (incremental hashing during conversion, etc) - but this is at least in
the right order of magnitude and retains the original API surface.

I have not profiled a build with clang but this is strictly less code and simpler
data structures so I'd expect improvements there as well.

This also fixes a bug where 0xFF bytes in the input would read out of bounds.

Reviewed By: dblaikie, stellaraccident

Differential Revision: https://reviews.llvm.org/D112105
2021-11-03 20:31:20 -07:00
Michał Górny 66e06cc8cb [llvm] [ADT] Update llvm::Split() per Pavel Labath's suggestions
Optimize the iterator comparison logic to compare Current.data()
pointers.  Use std::tie for assignments from std::pair.  Replace
the custom class with a function returning iterator_range.

Differential Revision: https://reviews.llvm.org/D110535
2021-10-22 12:27:46 +02:00
Michał Górny f4b71e3479 [llvm] [ADT] Add a range/iterator-based Split()
Add a llvm::Split() implementation that can be used via range-for loop,
e.g.:

    for (StringRef x : llvm::Split("foo,bar,baz", ','))
      ...

The implementation uses an additional SplittingIterator class that
uses StringRef::split() internally.

Differential Revision: https://reviews.llvm.org/D110496
2021-09-27 10:43:09 +02:00
Simon Pilgrim 4295c222a8 StringExtrasTest.cpp - add missing newline at the end of file. NFCI. 2021-06-11 14:32:35 +01:00
Simon Pilgrim 61cdaf66fe [ADT] Remove APInt/APSInt toString() std::string variants
<string> is currently the highest impact header in a clang+llvm build:

https://commondatastorage.googleapis.com/chromium-browser-clang/llvm-include-analysis.html

One of the most common places this is being included is the APInt.h header, which needs it for an old toString() implementation that returns std::string - an inefficient method compared to the SmallString versions that it actually wraps.

This patch replaces these APInt/APSInt methods with a pair of llvm::toString() helpers inside StringExtras.h, adjusts users accordingly and removes the <string> from APInt.h - I was hoping that more of these users could be converted to use the SmallString methods, but it appears that most end up creating a std::string anyhow. I avoided trying to use the raw_ostream << operators as well as I didn't want to lose having the integer radix explicit in the code.

Differential Revision: https://reviews.llvm.org/D103888
2021-06-11 13:19:15 +01:00
Simon Pilgrim 955d88992a [ADT] Consistently use StringExtrasTest for the test suite filter. NFCI.
Noticed while updating D103888 - some of the tests were using "StringExtras" for the test_suite_name instead of the expected "StringExtrasTest"
2021-06-11 12:00:54 +01:00
Kazu Hirata 8fd8ff1f67 [StringExtras] Rename SubsequentDelim to ListSeparator
This patch renames SubsequentDelim to ListSeparator to clarify the
purpose of the class.

Differential Revision: https://reviews.llvm.org/D94649
2021-01-15 21:00:56 -08:00
Kazu Hirata 407b1e65a4 [StringExtras] Add a helper class for comma-separated lists
This patch introduces a helper class SubsequentDelim to simplify loops
that generate a comma-separated lists.

For example, consider the following loop, taken from
llvm/lib/CodeGen/MachineBasicBlock.cpp:

    for (auto I = pred_begin(), E = pred_end(); I != E; ++I) {
      if (I != pred_begin())
        OS << ", ";
      OS << printMBBReference(**I);
    }

The new class allows us to rewrite the loop as:

    SubsequentDelim SD;
    for (auto I = pred_begin(), E = pred_end(); I != E; ++I)
      OS << SD << printMBBReference(**I);

where SD evaluates to the empty string for the first time and ", " for
subsequent iterations.

Unlike interleaveComma, defined in llvm/include/llvm/ADT/STLExtras.h,
SubsequentDelim can accommodate a wider variety of loops, including:

- those that conditionally skip certain items,
- those that need iterators to call getSuccProbability(I), and
- those that iterate over integer ranges.

As an example, this patch cleans up MachineBasicBlock::print.

Differential Revision: https://reviews.llvm.org/D94377
2021-01-10 14:32:02 -08:00
River Riddle 1095419b10 [llvm][StringExtras] Add a fail-able version of `fromHex`
This revision adds a fail-able/checked version of `fromHex` that fails when the input string contains a non-hex character. This removes the need for users to have a separate check for if the string contains all hex digits. This becomes very costly for large hex strings given that checking if a string contains only hex digits is effectively the same as just converting it in the first place.

Context: In MLIR we use hex strings to represent very large constants in the textual format of the IR. These changes lead to a large decrease in compile time when parsing these constants (2 seconds -> 1 second).

Differential Revision: https://reviews.llvm.org/D90265
2020-10-28 16:58:06 -07:00
Thomas Preud'homme 8ca7d2a1ee [unittest, ADT] Add unit tests for itostr & utostr
Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D82300
2020-06-23 14:48:37 +01:00
Sam McCall b283ae7af8 [ADT] Add locale-independent isSpace() to StringExtras. NFC
Use this in clangd, will follow up with replacements for isspace where
locale-dependent is clearly not intended.
2020-05-02 15:20:05 +02:00
River Riddle 229e392b4e [llvm][StringExtras] Merge StringExtras from MLIR into LLVM
Summary:
This revision adds two utilities currently present in MLIR to LLVM StringExtras:

* convertToSnakeFromCamelCase
Convert a string from a camel case naming scheme, to a snake case scheme

* convertToCamelFromSnakeCase
Convert a string from a snake case naming scheme, to a camel case scheme

Differential Revision: https://reviews.llvm.org/D78167
2020-04-14 18:57:22 -07:00
Reid Kleckner 67d440b949 Print quoted backslashes in LLVM IR as \\ instead of \5C
This improves readability of Windows path string literals in LLVM IR.
The LLVM assembler has supported \\ in IR strings for a long time, but
the lexer doesn't tolerate escaped quotes, so they have to be printed as
\22 for now.

llvm-svn: 374415
2019-10-10 18:31:57 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Petr Hosek 72dc29a848 [ADT] Support converting to lowercase string in toHex
This is useful in certain use-cases such as D51833.

Differential Revision: https://reviews.llvm.org/D51835

llvm-svn: 341852
2018-09-10 19:34:44 +00:00
Michael Kruse 6f1da6e345 [ADT] Replace std::isprint by llvm::isPrint.
The standard library functions ::isprint/std::isprint have platform-
and locale-dependent behavior which makes LLVM's output less
predictable. In particular, regression tests my fail depending on the
implementation of these functions.

Implement llvm::isPrint in StringExtras.h with a standard behavior and
replace all uses of ::isprint/std::isprint by a call it llvm::isPrint.
The function is inlined and does not look up language settings so it
should perform better than the standard library's version.

Such a replacement has already been done for isdigit, isalpha, isxdigit
in r314883. gtest does the same in gtest-printers.cc using the following
justification:

    // Returns true if c is a printable ASCII character.  We test the
    // value of c directly instead of calling isprint(), which is buggy on
    // Windows Mobile.
    inline bool IsPrintableAscii(wchar_t c) {
      return 0x20 <= c && c <= 0x7E;
    }

Similar issues have also been encountered by Julia:
https://github.com/JuliaLang/julia/issues/7416

I noticed the problem myself when on Windows isprint('\t') started to
evaluate to true (see https://stackoverflow.com/questions/51435249) and
thus caused several unit tests to fail. The result of isprint doesn't
seem to be well-defined even for ASCII characters. Therefore I suggest
to replace isprint by a platform-independent version.

Differential Revision: https://reviews.llvm.org/D49680

llvm-svn: 338034
2018-07-26 15:31:41 +00:00
Jonas Devlieghere 745918ff87 [ADT] Make escaping fn conform to coding guidelines
As noted by Adrian on llvm-commits, PrintHTMLEscaped and PrintEscaped in
StringExtras did not conform to the LLVM coding guidelines. This commit
rectifies that.

llvm-svn: 333669
2018-05-31 17:01:42 +00:00
Jonas Devlieghere 50603518a0 [ADT] Add unit test for PrintHTMLEscaped
Add unit tests for PrintHTMLEscaped which was added in r333565.

llvm-svn: 333590
2018-05-30 20:47:18 +00:00
Francis Visoiu Mistrih 14bd3b9f21 [Support] Add unit test for printLowerCase
Add test case for the function added in r319171.

llvm-svn: 319177
2017-11-28 16:11:56 +00:00
Simon Pilgrim 171ba4a699 Fix double->float truncation warning on MSVC
llvm-svn: 306101
2017-06-23 13:53:55 +00:00
Pavel Labath ec000f42fa [ADT] Add llvm::to_float
Summary:
The function matches the interface of llvm::to_integer, but as we are
calling out to a C library function, I let it take a Twine argument, so
we can avoid a string copy at least in some cases.

I add a test and replace a couple of existing uses of strtod with this
function.

Reviewers: zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34518

llvm-svn: 306096
2017-06-23 12:55:02 +00:00
Zachary Turner e46b4498b8 [StringExtras] Add a fromHex to complement toHex.
We already have a function toHex that will convert a string like
"\xFF\xFF" to the string "FFFF", but we do not have one that goes
the other way - i.e. to convert a textual string representing a
sequence of hexadecimal characters into the corresponding actual
bytes.  This patch adds such a function.

llvm-svn: 301356
2017-04-25 20:21:35 +00:00
Zachary Turner 0e31a38418 Add llvm::join_items to StringExtras.
llvm::join_items is similar to llvm::join, which produces a string
by concatenating a sequence of values together separated by a
given separator.  But it differs in that the arguments to
llvm::join() are same-type members of a container, whereas the
arguments to llvm::join_items are arbitrary types passed into
a variadic template.  The only requirement on parameters to
llvm::join_items (including for the separator themselves) is
that they be implicitly convertible to std::string or have
an overload of std::string::operator+

Differential Revision: https://reviews.llvm.org/D24880

llvm-svn: 282502
2016-09-27 16:37:30 +00:00