Commit Graph

217 Commits

Author SHA1 Message Date
Kadir Cetinkaya 36f77e20d9
Revert "Revert "[clang][Lex] Fix a crash on malformed string literals""
This reverts commit feea7ef23c.
Drops the test case, see https://reviews.llvm.org/D135161#3839510
2022-10-06 11:41:18 +02:00
Kadir Cetinkaya feea7ef23c
Revert "[clang][Lex] Fix a crash on malformed string literals"
This reverts commit 36a200208f.
2022-10-05 10:37:32 +02:00
Kadir Cetinkaya 36a200208f
[clang][Lex] Fix a crash on malformed string literals
Differential Revision: https://reviews.llvm.org/D135161
2022-10-05 09:55:50 +02:00
Fangrui Song 3f18f7c007 [clang] LLVM_FALLTHROUGH => [[fallthrough]]. NFC
With C++17 there is no Clang pedantic warning or MSVC C5051.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D131346
2022-08-08 09:12:46 -07:00
Corentin Jabot 6882ca9aff [Clang] Adjust extension warnings for delimited sequences
WG21 approved delimited escape sequences and named escape
sequences.
Adjust the extension warnings accordingly, and update
the release notes.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D129664
2022-07-14 07:50:58 +02:00
Corentin Jabot a9a60f20e6 [Clang] Rename StringLiteral::isAscii() => isOrdinary() [NFC]
"Ascii" StringLiteral instances are actually narrow strings
that are UTF-8 encoded and do not have an encoding prefix.
(UTF8 StringLiteral are also UTF-8 encoded strings, but with
the u8 prefix.

To avoid possible confusion both with actuall ASCII strings,
and with future works extending the set of literal encodings
supported by clang, this rename StringLiteral::isAscii() to
isOrdinary(), matching C++ standard terminology.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D128762
2022-06-29 18:28:51 +02:00
Corentin Jabot c92056d038 [Clang][C++23] P2071 Named universal character escapes
Implements [[ https://wg21.link/p2071r1  | P2071 Named Universal Character Escapes ]] - as an extension in all language mode, the patch  not warn in c++23 mode will be done later once this paper is plenary approved (in July).

We add

 * A code generator that transforms `UnicodeData.txt` and `NameAliases.txt` to a space efficient data structure that can be queried in `O(NameLength)`
 * A set of functions in `Unicode.h` to query that data, including

   * A function to find an exact match of a given Unicode character name
   * A function to perform a loose (ignoring case, space, underscore, medial hyphen) matching
   * A function returning the best matching codepoint for a given string per edit distance

 * Support of `\N{}` escape sequences in String and character Literals, with loose and typos diagnostics/fixits
 * Support of `\N{}` as UCN with loose matching diagnostics/fixits.

Loose matching is considered an error to match closely the semantics of P2071.

The generated data contributes to 280kB of data to the binaries.

`UnicodeData.txt` and `NameAliases.txt`  are not committed to the repository in this patch, and regenerating the data is a manual process.

Reviewed By: tahonermann

Differential Revision: https://reviews.llvm.org/D123064
2022-06-25 19:03:33 +02:00
Sam McCall 817550919e [Lex] Don't assert when decoding invalid UCNs.
Currently if a lexically-valid UCN encodes an invalid codepoint, then we
diagnose that, and then hit an assertion while trying to decode it.

Since there isn't anything preventing us reaching this state, remove the
assertion. expandUCNs("X\UAAAAAAAAY") will produce "XY".

Differential Revision: https://reviews.llvm.org/D125059
2022-05-06 08:51:42 +02:00
Aaron Ballman 9e3e85ac6e Silence -Wlogical-op-parentheses and fix a logic bug while doing so 2022-03-14 10:13:39 -04:00
Aaron Ballman 8cba72177d Implement literal suffixes for _BitInt
WG14 adopted N2775 (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2775.pdf)
at our Feb 2022 meeting. This paper adds a literal suffix for
bit-precise types that automatically sizes the bit-precise type to be
the smallest possible legal _BitInt type that can represent the literal
value. The suffix chosen is wb (for a signed bit-precise type) which
can be combined with the u suffix (for an unsigned bit-precise type).

The preprocessor continues to operate as-if all integer types were
intmax_t/uintmax_t, including bit-precise integer types. It is a
constraint violation if the bit-precise literal is too large to fit
within that type in the context of the preprocessor (when still using
a pp-number preprocessing token), but it is not a constraint violation
in other circumstances. This allows you to make bit-precise integer
literals that are wider than what the preprocessor currently supports
in order to initialize variables, etc.
2022-03-14 09:24:19 -04:00
Daan De Meyer 5a6dac66db LiteralSupport: Don't assert() on invalid input
When using clangd, it's possible to trigger assertions in
NumericLiteralParser and CharLiteralParser when switching git branches.
This commit removes the initial asserts on invalid input and replaces
those asserts with the error handling mechanism from those respective
classes instead. This allows clangd to gracefully recover without
crashing.

See https://github.com/clangd/clangd/issues/888 for more information
on the clangd crashes.
2021-11-17 23:51:30 +00:00
Kazu Hirata dccfaddc6b [clang] Use StringRef::contains (NFC) 2021-10-21 08:58:19 -07:00
Jay Foad d933adeaca [APInt] Stop using soft-deprecated constructors and methods in clang. NFC.
Stop using APInt constructors and methods that were soft-deprecated in
D109483. This fixes all the uses I found in clang.

Differential Revision: https://reviews.llvm.org/D110808
2021-10-04 09:38:11 +01:00
Aaron Ballman 38d09080c9 Removing a default constructor argument; NFC
The argument is always used with its default value, so remove the
argument entirely.
2021-09-27 09:41:28 -04:00
Corentin Jabot 274adcb866 Implement delimited escape sequences.
\x{XXXX} \u{XXXX} and \o{OOOO} are accepted in all languages mode
in characters and string literals.

This is a feature proposed for both C++ (P2290R1) and C (N2785). The
papers have been seen by both committees but are not yet adopted into
either standard. However, they do have support from both committees.
2021-09-15 09:54:49 -04:00
Corentin Jabot bdeda959ab Make wide multi-character character literals ill-formed
This implements P2362, which has not yet been approved by the
C++ committee, but because wide-multi character literals are
implementation defined, clang might not have to wait for WG21.

This change is also being applied in C mode as the behavior is
implementation-defined in C as well and there's no benefit to
having different rules between the languages.

The other part of P2362, making non-representable character
literals ill-formed, is already implemented by clang
2021-08-20 11:10:53 -04:00
Jan Svoboda aa245ddd46 [clang][lex] NFC: Add explicit cast to silence -Wsign-compare 2021-07-22 12:21:12 +02:00
Anton Bikineev dc7ebd2cb0 [C++2b] Support size_t literals
This adds support for C++2b's z/uz suffixes for size_t literals (P0330).
2021-03-31 13:36:23 +00:00
Jan Svoboda 23cc8ebf59 [clang][lex] Speculative fix for buffer overrun on raw string parse
This attempts to fix a (non-deterministic) buffer overrun when parsing raw string literals during modular build.

Similar fix to 4e5b5c36f4.

Reviewed By: beccadax

Differential Revision: https://reviews.llvm.org/D94950
2021-03-15 15:13:47 +01:00
Chuyang Chen 8fa45e1fd5 Convert diagnostics about multi-character literals from extension to warning
This addresses PR46797.
2020-10-06 08:47:17 -04:00
Dmitri Gribenko 3cca818efa Refactored NumericLiteralParser to not require a Preprocessor
Summary:
We would like to use NumericLiteralParser in the implementation of the
syntax tree builder, and plumbing a preprocessor there seems
inconvenient and superfluous.

Reviewers: eduucaldas

Reviewed By: eduucaldas

Subscribers: gribozavr2, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D83480
2020-07-09 17:33:58 +02:00
Leonard Chan ef37444058 [Lexer] Fix invalid suffix diagnostic for fixed-point literals
Committing on behalf of nagart, who authored this patch.

Differential Revision: https://reviews.llvm.org/D80412
2020-05-27 16:16:56 -07:00
Aaron Ballman 6a30894391 C++2a -> C++20 in some identifiers; NFC. 2020-04-21 15:37:19 -04:00
Reid Kleckner 76221c734e Remove llvm::Error include form Diagnostic.h
Saves ~400 related LLVM ADT. llvm/ADT/Error.h takes 90ms to parse.

$ diff -u <(sort thedeps-before.txt) <(sort thedeps-after.txt) \
    | grep '^[-+] ' | sort | uniq -c | sort -nr
    403 -    /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/Support/Error.h
    403 -    /usr/local/google/home/rnk/llvm-project/llvm/include/llvm-c/Error.h
    397 -    /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/Support/Format.h
    397 -    /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/Support/Debug.h
    377 -    /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/ADT/StringExtras.h
    158 -    /usr/local/google/home/rnk/llvm-project/llvm/include/llvm-c/ExternC.h
    138 -    /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/Support/ErrorOr.h
     13 -    /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/Support/raw_ostream.h
     13 -    /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/ADT/SmallString.h
      5 -    /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/ADT/Twine.h
2020-04-06 10:42:17 -07:00
Ehud Katz 24b326cc61 [APFloat] Fix checked error assert failures
`APFLoat::convertFromString` returns `Expected` result, which must be
"checked" if the LLVM_ENABLE_ABI_BREAKING_CHECKS preprocessor flag is
set.
To mark an `Expected` result as "checked" we must consume the `Error`
within.
In many cases, we are only interested in knowing if an error occured,
without the need to examine the error info. This is achieved, easily,
with the `errorToBool()` API.
2020-01-09 09:42:32 +02:00
Ehud Katz f3f7dc3d29 [APFloat] Fix compilation warnings 2020-01-06 11:30:40 +02:00
Ehud Katz c5fb73c5d1 [APFloat] Add recoverable string parsing errors to APFloat
Implementing the APFloat part in PR4745.

Differential Revision: https://reviews.llvm.org/D69770
2020-01-06 10:09:01 +02:00
Yaxun Liu d442500f5d [CUDA][HIP] Do not diagnose use of _Float16
r352221 caused regressions in CUDA/HIP since device function may use _Float16 whereas host does not support it.
In this case host compilation should not diagnose usage of _Float16 in device functions or variables.

For now just do not diagnose _Float16 for CUDA/HIP. In the future we should have more precise check.

Differential Revision: https://reviews.llvm.org/D57369

llvm-svn: 352488
2019-01-29 13:20:23 +00:00
Erich Keane e3f289c969 Remove F16 literal support based on Float16 support.
Float16 support was disabled recently on many platforms, however that
commit still allowed literals of Float16 type to work.  This commit
removes those based on the same logic as Float16 disable.

Change-Id: I72243048ae2db3dc47bd3d699843e3edf9c395ea
llvm-svn: 352229
2019-01-25 18:36:20 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Reid Kleckner 4dc0b1ac60 Fix clang -Wimplicit-fallthrough warnings across llvm, NFC
This patch should not introduce any behavior changes. It consists of
mostly one of two changes:
1. Replacing fall through comments with the LLVM_FALLTHROUGH macro
2. Inserting 'break' before falling through into a case block consisting
   of only 'break'.

We were already using this warning with GCC, but its warning behaves
slightly differently. In this patch, the following differences are
relevant:
1. GCC recognizes comments that say "fall through" as annotations, clang
   doesn't
2. GCC doesn't warn on "case N: foo(); default: break;", clang does
3. GCC doesn't warn when the case contains a switch, but falls through
   the outer case.

I will enable the warning separately in a follow-up patch so that it can
be cleanly reverted if necessary.

Reviewers: alexfh, rsmith, lattner, rtrieu, EricWF, bollu

Differential Revision: https://reviews.llvm.org/D53950

llvm-svn: 345882
2018-11-01 19:54:45 +00:00
Fangrui Song 6907ce2f8f Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}

llvm-svn: 338291
2018-07-30 19:24:48 +00:00
Erich Keane a67eb91047 Enable C++2a Chrono Literals
C++2a via http://wg21.link/p0355 permits the library
literals of 'd' and 'y'. This patch enables them in the
Lexer so that they can be properly parsed.

Note that 'd' gets confused with the hex character, so
modifications to how octal, binary, and decimal numbers are
parsed were required. Since this is simply making previously
invalid code legal, this should be fine.

Hex still greedily parses the 'd' as a hexit, since it would
a: violate [lex.ext]p1
b: break existing code.

Differential Revision: https://reviews.llvm.org/D49504

llvm-svn: 337454
2018-07-19 13:36:57 +00:00
Leonard Chan e5597ad9f3 [Fixed Point Arithmetic] Fix for bug where integer literals could be treated as fixed point literals
This addresses a bug brought up in https://bugs.llvm.org/show_bug.cgi?id=38161 where integer literals could be treated as fixed point types and throw errors related to fixed point types when the 'k' or 'r' suffix used. The fix also addresses the second issue brought up with the assertion by not treating integers as fixed point types in the first place.

Integers that have suffixes 'k' and 'r' now throw the error `invalid suffix 'k/r' on integer constant`.

A few more tests were also added to ensure that fixed point types, and any errors/warnings related to them, are limited to C for now.

Prior discussion also at https://reviews.llvm.org/D46915.

Differential Revision: https://reviews.llvm.org/D49327

llvm-svn: 337289
2018-07-17 14:58:49 +00:00
Erich Keane f702b029f4 PR15730/PR16986 Allow dependently typed vector_size types.
As listed in the above PRs, vector_size doesn't allow
dependent types/values. This patch introduces a new
DependentVectorType to handle a VectorType that has a dependent
size or type.

In the future, ALL the vector-types should be able to create one
of these to handle dependent types/sizes as well. For example,
DependentSizedExtVectorType could likely be switched to just use
this instead, though that is left as an exercise for the future.


Differential Revision: https://reviews.llvm.org/D49045

llvm-svn: 337036
2018-07-13 19:46:04 +00:00
Leonard Chan db01c3adc6 [Fixed Point Arithmetic] Fixed Point Precision Bits and Fixed Point Literals
This diff includes the logic for setting the precision bits for each primary fixed point type in the target info and logic for initializing a fixed point literal.

Fixed point literals are declared using the suffixes

```
hr: short _Fract
uhr: unsigned short _Fract
r: _Fract
ur: unsigned _Fract
lr: long _Fract
ulr: unsigned long _Fract
hk: short _Accum
uhk: unsigned short _Accum
k: _Accum
uk: unsigned _Accum
```
Errors are also thrown for illegal literal values

```
unsigned short _Accum u_short_accum = 256.0uhk;   // expected-error{{the integral part of this literal is too large for this unsigned _Accum type}}
```

Differential Revision: https://reviews.llvm.org/D46915

llvm-svn: 335148
2018-06-20 17:19:40 +00:00
Adrian Prantl 9fc8faf9e6 Remove \brief commands from doxygen comments.
This is similar to the LLVM change https://reviews.llvm.org/D46290.

We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46320

llvm-svn: 331834
2018-05-09 01:00:01 +00:00
Volodymyr Sapsai 579f0b307c [Lex] Fix handling numerical literals ending with ' and signed exponent.
For input `0'e+1` lexer tokenized as numeric constant only `0'e`. Later
NumericLiteralParser skipped 0 and ' as digits and parsed `e+1` as valid
exponent going past the end of the token. Because it didn't mark numeric
literal as having an error, it continued parsing and tried to expandUCNs
with StringRef of length -2.

The fix is not to parse exponent when we reached the end of token.

Discovered by OSS-Fuzz:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=4588

rdar://problem/36076719

Reviewers: rsmith, t.p.northover

Reviewed By: rsmith

Subscribers: cfe-commits, jkorous-apple

Differential Revision: https://reviews.llvm.org/D41834

llvm-svn: 324419
2018-02-06 22:39:25 +00:00
Aaron Ballman c351fba69e Now that C++17 is official (https://www.iso.org/standard/68564.html), start changing the C++1z terminology over to C++17. NFC intended, these are all mechanical changes.
llvm-svn: 319688
2017-12-04 20:27:34 +00:00
Sjoerd Meijer cc623ad071 Recommit "Add _Float16 as a C/C++ source language type"
This is a recommit of r312781; in some build configurations
variable names are omitted, so changed the new regression
test accordingly.

llvm-svn: 312794
2017-09-08 15:15:00 +00:00
Sjoerd Meijer 9aeedde7ff Revert "Add _Float16 as a C/C++ source language type"
The clang-with-lto-ubuntu bot didn't like the new regression
test, revert while I investigate the issue.

llvm-svn: 312784
2017-09-08 10:20:52 +00:00
Sjoerd Meijer ab36f33db8 Add _Float16 as a C/C++ source language type
This adds _Float16 as a source language type, which is a 16-bit floating point
type defined in C11 extension ISO/IEC TS 18661-3.

In follow up patches documentation and more tests will be added.

Differential Revision: https://reviews.llvm.org/D33719

llvm-svn: 312781
2017-09-08 09:42:32 +00:00
Richard Smith b115e5dda2 Rename cxx1z -> cxx17 across all diagnostic IDs.
llvm-svn: 310805
2017-08-13 23:37:29 +00:00
Tim Northover 9710338a56 Reapply Sema: allow imaginary constants via GNU extension if UDL overloads not present.
C++14 added user-defined literal support for complex numbers so that you
can write something like "complex<double> val = 2i". However, there is
an existing GNU extension supporting this syntax and interpreting the
result as a _Complex type.

This changes parsing so that such literals are interpreted in terms of
C++14's operators if an overload is present but otherwise falls back to
the original GNU extension.

(We now have more robust diagnostics for implicit conversions so the
libc++ test that caused the original revert still passes).

llvm-svn: 310478
2017-08-09 14:56:48 +00:00
Tim Northover d1fafc8b05 Revert "Lexer: always allow imaginary constants in GNU mode."
This reverts r310423. It was committed by mistake, I intended to commit the
improved diagnostics for implicit conversions instead.

llvm-svn: 310426
2017-08-08 23:17:51 +00:00
Tim Northover 8fbb4c40b3 Lexer: always allow imaginary constants in GNU mode.
llvm-svn: 310423
2017-08-08 22:03:54 +00:00
Galina Kistanova d321611bfb Added LLVM_FALLTHROUGH to address warning: this statement may fall through + formatted. NFC.
llvm-svn: 304642
2017-06-03 06:25:29 +00:00
Tim Northover 9d891185ad Revert "Sema: allow imaginary constants via GNU extension if UDL overloads not present."
This reverts commit r303697. It broke libc++ tests that were specifically
checking incompatibility in C++14 mode.

llvm-svn: 303813
2017-05-24 22:18:35 +00:00
Simon Pilgrim 2f5139340e Fix 'set but not used' [-Wunused-but-set-variable] warning
llvm-svn: 303734
2017-05-24 10:38:09 +00:00
Tim Northover 6b5eceac2e Sema: allow imaginary constants via GNU extension if UDL overloads not present.
C++14 added user-defined literal support for complex numbers so that you can
write something like "complex<double> val = 2i". However, there is an existing
GNU extension supporting this syntax and interpreting the result as a _Complex
type.

This changes parsing so that such literals are interpreted in terms of C++14's
operators if an overload is present but otherwise falls back to the original
GNU extension.

llvm-svn: 303694
2017-05-23 21:41:49 +00:00